Code Summary: Optimizing Vector Search with Views
The following summarizes the code used to create standard views on MongoDB collections and build vector search indexes on them.
Prerequisites
- MongoDB Atlas Cluster
- Python
- The MongoDB Shell
Usage
Connect and Switch to the sample_mflix Database:
The following opens a mongosh session using the provided connection string, then switches to the sample_mflix database.
mongosh <connection-string>
use sample_mflix
Create a Filtered View of Movies with Embeddings:
The following creates a MongoDB view called documents_with_embeddings on top of the embedded_movies collection, filtering out any documents where the plot_embedding field is missing.
db.createView(
  "documents_with_embeddings",
  "embedded_movies",
  [
    {
      $match: {
        $expr: {
          $ne: [
            { $type: "$plot_embedding" },
            "missing"
          ]
        }
      }
    }
  ]
)
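The following is a minimal sketch of one way to check that the view excludes the intended documents, by comparing counts on the source collection and the view. The function name summarizeEmbeddingView is hypothetical and not part of the original code. It assumes an object that exposes getCollection(name).countDocuments(filter) synchronously, as the mongosh db object does.

```javascript
// Sketch: compare document counts to confirm the view's filter.
// `database` is any object exposing getCollection(name).countDocuments(filter),
// such as the mongosh `db` global after `use sample_mflix`.
function summarizeEmbeddingView(database) {
  const source = database.getCollection("embedded_movies");
  const view = database.getCollection("documents_with_embeddings");

  const total = source.countDocuments({});
  // { $exists: false } matches the documents whose $type is reported as "missing".
  const missing = source.countDocuments({ plot_embedding: { $exists: false } });
  const inView = view.countDocuments({});

  // The view should hold every source document except those without the field.
  return { total, missing, inView, consistent: total - missing === inView };
}

// In mongosh only: print the summary for the current database.
if (typeof db !== "undefined") {
  printjson(summarizeEmbeddingView(db));
}
```

If consistent is false, the view definition or the source data likely differs from what this summary assumes.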
Create a Vector Search Index on the Movies View:
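The following creates a vector search index called EmbeddingsIndex on the plot_embedding field of the documents_with_embeddings view. The index is configured for 2048-dimensional vectors and cosine similarity; numDimensions must match the length of the vectors stored in the indexed field.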
db.documents_with_embeddings.createSearchIndex({
  name: "EmbeddingsIndex",
  type: "vectorSearch",
  definition: {
    fields: [{
      type: "vector",
      numDimensions: 2048,
      path: "plot_embedding",
      similarity: "cosine"
    }]
  }
})
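The following is a hedged sketch of how a query against this index might be assembled with the $vectorSearch aggregation stage. The helper name buildPlotSearchPipeline, the default limit, and the rule of 20 candidates per result are assumptions made for illustration. The projected title and plot fields are assumed to exist in the source documents. Depending on the server version, the aggregation may need to run against the source collection rather than the view, so check the documentation for your deployment.

```javascript
// Values taken from the index definition in this summary.
const INDEX_NAME = "EmbeddingsIndex";
const VECTOR_PATH = "plot_embedding";
const NUM_DIMENSIONS = 2048;

// Hypothetical helper: builds an aggregation pipeline for a vector query.
function buildPlotSearchPipeline(queryVector, limit = 5) {
  if (!Array.isArray(queryVector) || queryVector.length !== NUM_DIMENSIONS) {
    throw new Error(`queryVector must be an array of ${NUM_DIMENSIONS} numbers`);
  }
  // Assumption: consider 20 candidates per requested result, capped at 10000.
  const numCandidates = Math.min(limit * 20, 10000);
  return [
    {
      $vectorSearch: {
        index: INDEX_NAME,
        path: VECTOR_PATH,
        queryVector,
        numCandidates,
        limit
      }
    },
    {
      // Return a few readable fields plus the similarity score.
      $project: {
        _id: 0,
        title: 1,
        plot: 1,
        score: { $meta: "vectorSearchScore" }
      }
    }
  ];
}

// Example use in mongosh, where `embedding` is a hypothetical 2048-element array
// produced by the same model that generated the stored vectors:
// db.documents_with_embeddings.aggregate(buildPlotSearchPipeline(embedding, 5))
```

Validating the vector length before sending the query surfaces a dimension mismatch early, on the client side.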
Create a Tenant-Scoped View:
The following creates a MongoDB view called tenantA_docs that filters the documents collection down to only records belonging to tenantA, isolating data for a specific tenant.
db.createView(
  "tenantA_docs",
  "documents",
  [
    {
      $match: {
        $expr: {
          $eq: ["$tenant_id", "tenantA"]
        }
      }
    }
  ]
)
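The following is a small sketch of how the same view definition could be generated for any tenant instead of being written by hand each time. The helper tenantViewSpec and the naming pattern <tenantId>_docs are assumptions that mirror the tenantA example above. They are not part of the original code.

```javascript
// Hypothetical helper: returns the arguments for db.createView for one tenant.
function tenantViewSpec(tenantId, sourceCollection = "documents") {
  if (typeof tenantId !== "string" || tenantId.length === 0) {
    throw new Error("tenantId must be a non-empty string");
  }
  return {
    viewName: `${tenantId}_docs`,
    source: sourceCollection,
    // Same $match with $expr shape as the tenantA view.
    pipeline: [
      {
        $match: {
          $expr: {
            $eq: ["$tenant_id", tenantId]
          }
        }
      }
    ]
  };
}

// Example use in mongosh:
// const spec = tenantViewSpec("tenantB");
// db.createView(spec.viewName, spec.source, spec.pipeline);
```

Generating the specification from one function keeps every tenant view structurally identical, which makes it easier to apply the same index definition to each.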
Create a Vector Search Index on the Tenant View:
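The following creates a 1024-dimensional dot product vector search index called tenantA_vector_index on the embedding field of the tenantA_docs view, scoping all vector searches to tenantA's documents only. The dotProduct similarity function expects vectors normalized to unit length, both when they are indexed and when they are queried.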
db.tenantA_docs.createSearchIndex({
  name: "tenantA_vector_index",
  type: "vectorSearch",
  definition: {
    fields: [{
      type: "vector",
      path: "embedding",
      numDimensions: 1024,
      similarity: "dotProduct"
    }]
  }
})
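Because this index uses dotProduct similarity, vectors are expected to have unit length. The following is a minimal sketch of that normalization step, assuming embeddings are plain arrays of numbers. The function names normalizeToUnitLength and dotProduct are hypothetical. Some embedding models already return normalized vectors, in which case this step is unnecessary.

```javascript
// Scale a vector so that its Euclidean length is 1.
function normalizeToUnitLength(vector) {
  const sumOfSquares = vector.reduce((sum, x) => sum + x * x, 0);
  const norm = Math.sqrt(sumOfSquares);
  if (norm === 0) {
    throw new Error("cannot normalize a zero vector");
  }
  return vector.map((x) => x / norm);
}

// Plain dot product, used here only to check the result.
function dotProduct(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// A unit-length vector has a dot product of approximately 1 with itself.
const unit = normalizeToUnitLength([3, 4]);
console.log(unit, dotProduct(unit, unit));
```

Applying the same normalization to stored embeddings and to query vectors keeps dot product scores comparable across documents.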