Skip to content

Semantic Search

UQL provides first-class vector similarity search, enabling AI-powered semantic queries out of the box. Works across PostgreSQL (pgvector), MariaDB, SQLite (sqlite-vec), and MongoDB Atlas ($vectorSearch).

Define a vector field with type: 'vector' and dimensions. Optionally, add a vector index for efficient approximate nearest-neighbor (ANN) search.

You write
import { Entity, Id, Field, Index } from 'uql-orm';
@Entity()
@Index(['embedding'], { type: 'hnsw', distance: 'cosine', m: 16, efConstruction: 64 })
export class Article {
@Id() id?: number;
@Field() title?: string;
@Field() category?: string;
@Field({ type: 'vector', dimensions: 1536 })
embedding?: number[];
}

Use $sort on a vector field with $vector and an optional $distance metric:

You write
const results = await querier.findMany(Article, {
$select: { id: true, title: true },
$sort: { embedding: { $vector: queryEmbedding, $distance: 'cosine' } },
$limit: 10,
});
Generated SQL (PostgreSQL)
SELECT "id", "title" FROM "Article"
ORDER BY "embedding" <=> $1::vector
LIMIT 10
Generated SQL (MariaDB)
SELECT `id`, `title` FROM `Article`
ORDER BY VEC_DISTANCE_COSINE(`embedding`, ?)
LIMIT 10
Generated SQL (SQLite)
SELECT `id`, `title` FROM `Article`
ORDER BY vec_distance_cosine(`embedding`, ?)
LIMIT 10

For MongoDB, UQL translates into a $vectorSearch aggregation pipeline:

Generated Pipeline (MongoDB Atlas)
[
{
"$vectorSearch": {
"index": "embedding_index",
"path": "embedding",
"queryVector": [/* queryEmbedding */],
"numCandidates": 100,
"limit": 10
}
}
]

Vector search composes naturally with $where and regular $sort fields:

You write
const results = await querier.findMany(Article, {
$where: { category: 'science' },
$sort: { embedding: { $vector: queryVec, $distance: 'cosine' }, title: 'asc' },
$limit: 10,
});
Generated SQL (PostgreSQL)
SELECT * FROM "Article"
WHERE "category" = $1
ORDER BY "embedding" <=> $2::vector, "title" ASC
LIMIT 10
Generated SQL (MariaDB)
SELECT * FROM `Article`
WHERE `category` = ?
ORDER BY VEC_DISTANCE_COSINE(`embedding`, ?), `title` ASC
LIMIT 10
Generated SQL (SQLite)
SELECT * FROM `Article`
WHERE `category` = ?
ORDER BY vec_distance_cosine(`embedding`, ?), `title` ASC
LIMIT 10

For MongoDB, $where is merged into the $vectorSearch.filter for optimal pre-filtering, and secondary sorts become a separate $sort stage:

Generated Pipeline (MongoDB Atlas)
[
{
"$vectorSearch": {
"index": "embedding_index",
"path": "embedding",
"queryVector": [/* queryVec */],
"numCandidates": 100,
"limit": 10,
"filter": { "category": "science" }
}
},
{ "$sort": { "title": 1 } }
]

MetricPostgres OperatorMariaDB FunctionSQLite FunctionMongoDB AtlasUse Case
cosine<=>VEC_DISTANCE_COSINEvec_distance_cosine✅ (index-defined)Text embeddings (OpenAI, Cohere)
l2<->VEC_DISTANCE_EUCLIDEANvec_distance_L2✅ (index-defined)Image search, spatial data
inner<#>✅ (index-defined)Maximum inner product
l1<+>Manhattan distance
hamming<~>vec_distance_hammingBinary embeddings

If omitted, $distance defaults to 'cosine'. You can also set a default per-field:

@Field({ type: 'vector', dimensions: 1536, distance: 'l2' })
embedding?: number[];

Queries on this field will use l2 unless overridden with $distance at query time.


Project the computed distance as a named field in the result with $project:

You write
import type { WithDistance } from 'uql-orm';
const results = await querier.findMany(Article, {
$select: { id: true, title: true },
$sort: { embedding: { $vector: queryVec, $distance: 'cosine', $project: 'similarity' } },
$limit: 10,
}) as WithDistance<Article, 'similarity'>[];
results.forEach((r) => console.log(r.title, r.similarity));
Generated SQL (PostgreSQL)
SELECT "id", "title", "embedding" <=> $1::vector AS "similarity" FROM "Article"
ORDER BY "similarity"
LIMIT 10
Generated SQL (MariaDB)
SELECT `id`, `title`, VEC_DISTANCE_COSINE(`embedding`, ?) AS `similarity` FROM `Article`
ORDER BY `similarity`
LIMIT 10
Generated SQL (SQLite)
SELECT `id`, `title`, vec_distance_cosine(`embedding`, ?) AS `similarity` FROM `Article`
ORDER BY `similarity`
LIMIT 10

For MongoDB, $project adds a $meta: 'vectorSearchScore' projection:

Generated Pipeline (MongoDB Atlas)
[
{ "$vectorSearch": { "index": "embedding_index", "path": "embedding", "queryVector": ["..."], "numCandidates": 100, "limit": 10 } },
{ "$project": { "id": true, "title": true, "similarity": { "$meta": "vectorSearchScore" } } }
]

UQL supports three vector storage types — use the one that best fits your model and performance needs:

TypeSQL (Postgres)StorageMax DimensionsUse Case
'vector'VECTOR(n)32-bit float2,000Standard embeddings (OpenAI, etc.)
'halfvec'HALFVEC(n)16-bit float4,00050% storage savings, near-identical accuracy
'sparsevec'SPARSEVEC(n)Sparse1,000,000SPLADE, BM25-style sparse retrieval
Examples
@Field({ type: 'vector', dimensions: 1536 }) // OpenAI ada-002
embedding?: number[];
@Field({ type: 'halfvec', dimensions: 1536 }) // Same model, half storage
embedding?: number[];
@Field({ type: 'sparsevec', dimensions: 30000 }) // SPLADE sparse
sparseEmbedding?: number[];

Define vector indexes with @Index() for efficient approximate nearest-neighbor (ANN) search:

Index TypePostgresMariaDBMongoDB AtlasNotes
hnswUSING hnsw with operator classesBest accuracy, higher memory
ivfflatUSING ivfflat with lists paramFaster build, large datasets
vector✅ Inline VECTOR INDEXMariaDB’s native vector index
vectorSearch✅ Atlas vector search indexMongoDB’s managed ANN index
Postgres HNSW
@Index(['embedding'], { type: 'hnsw', distance: 'cosine', m: 16, efConstruction: 64 })
Postgres IVFFlat
@Index(['embedding'], { type: 'ivfflat', distance: 'l2', lists: 100 })
MariaDB
@Index(['embedding'], { type: 'vector', distance: 'cosine', m: 8 })
MongoDB Atlas
@Index(['embedding'], { type: 'vectorSearch', name: 'my_search_index' })