Code Summary: Implementing Quantization

This summary covers the code for two approaches to vector quantization: letting the vector search index quantize embeddings automatically, and ingesting vectors that were already quantized before storage.

Prerequisites

Usage

Sample Data Used:

The following defines a list of 12 reptile documents, each with an id, name, species, and description, covering a range of lizards, snakes, crocodilians, and turtles.

reptiles = [
    {
        "id": 1,
        "name": "Komodo Dragon",
        "species": "Varanus komodoensis",
        "description": "The Komodo dragon is the world's largest living lizard, native to the Indonesian islands of Komodo, Rinca, and Flores. It uses a venomous bite and powerful claws to take down prey as large as deer and buffalo."
    },
    {
        "id": 2,
        "name": "Green Iguana",
        "species": "Iguana iguana",
        "description": "The green iguana is a large, arboreal herbivore found throughout Central and South America. It is well adapted to life in the tree canopy, using its long tail for balance and its dewlap for thermoregulation and communication."
    },
    {
        "id": 3,
        "name": "Saltwater Crocodile",
        "species": "Crocodylus porosus",
        "description": "The saltwater crocodile is the largest living reptile on Earth, capable of exceeding 6 meters in length. It inhabits coastal brackish and freshwater regions of Southeast Asia and northern Australia, and is an apex predator with the most powerful bite of any animal."
    },
    {
        "id": 4,
        "name": "Ball Python",
        "species": "Python regius",
        "description": "The ball python is a non-venomous constrictor native to sub-Saharan Africa. Known for its docile temperament and habit of curling into a tight ball when threatened, it is one of the most popular pet snakes in the world."
    },
    {
        "id": 5,
        "name": "Blue-tongued Skink",
        "species": "Tiliqua scincoides",
        "description": "The blue-tongued skink is a ground-dwelling lizard native to Australia and New Guinea. It displays its vivid blue tongue as a warning signal to deter predators. Omnivorous by nature, it feeds on insects, fruits, and small vertebrates."
    },
    {
        "id": 6,
        "name": "Leatherback Sea Turtle",
        "species": "Dermochelys coriacea",
        "description": "The leatherback sea turtle is the largest of all living turtles and the heaviest non-crocodilian reptile. Unlike other sea turtles it lacks a hard shell, instead having a leathery carapace. It migrates vast oceanic distances and dives deeper than any other turtle."
    },
    {
        "id": 7,
        "name": "Chameleon",
        "species": "Chamaeleo chamaeleon",
        "description": "The common chameleon is renowned for its ability to change skin color for camouflage and social signaling. Its eyes move independently, giving it a near-360-degree field of view, and its projectile tongue can extend twice the length of its body to catch prey."
    },
    {
        "id": 8,
        "name": "Gila Monster",
        "species": "Heloderma suspectum",
        "description": "The Gila monster is one of only a few venomous lizard species in the world, native to the southwestern United States and northwestern Mexico. It moves slowly and stores fat in its thick tail, allowing it to survive months between meals."
    },
    {
        "id": 9,
        "name": "King Cobra",
        "species": "Ophiophagus hannah",
        "description": "The king cobra is the world's longest venomous snake, reaching up to 5.5 meters. Found across South and Southeast Asia, it preys almost exclusively on other snakes. Its venom is a potent neurotoxin capable of killing an elephant with a single bite."
    },
    {
        "id": 10,
        "name": "Panther Chameleon",
        "species": "Furcifer pardalis",
        "description": "The panther chameleon is native to Madagascar and is celebrated for its spectacular and rapid color changes. Males display vivid blues, greens, reds, and oranges to attract females and intimidate rivals. It is a popular species in herpetological research and the exotic pet trade."
    },
    {
        "id": 11,
        "name": "American Alligator",
        "species": "Alligator mississippiensis",
        "description": "The American alligator is a large crocodilian native to the southeastern United States. It plays a keystone role in wetland ecosystems by creating 'alligator holes' that retain water during droughts, providing habitat for other wildlife."
    },
    {
        "id": 12,
        "name": "Thorny Dragon",
        "species": "Moloch horridus",
        "description": "The thorny dragon is a small Australian lizard covered in conical spines that serve as both camouflage and defense against predators. It feeds almost exclusively on ants and absorbs water through capillary channels in its skin from any surface it contacts."
    },
]

Set Up the Environment:

The following imports the reptile dataset from data.py, loads environment variables from a .env file, and initializes a MongoDB client connected to the reptiles collection (using MONGODB_URI) and a Voyage AI client.

import os
import pymongo
import voyageai
from dotenv import load_dotenv
from data import reptiles

load_dotenv()

client = pymongo.MongoClient(os.environ["MONGODB_URI"])
db = client["mydb"]
coll = db["reptiles"]

vo = voyageai.Client()
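
Both clients above depend on configuration read from the environment: MONGODB_URI is looked up explicitly, and voyageai.Client() called without arguments looks for its API key in the environment (VOYAGE_API_KEY). As a minimal sketch, a check like the following could fail fast with a readable message before either client is constructed. The helper name require_env is hypothetical and not part of the original code.

```python
import os


def require_env(names, environ=None):
    """Return the requested variables, or raise an error naming every missing one.

    Hypothetical helper for illustration; `environ` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

It could be called as require_env(["MONGODB_URI", "VOYAGE_API_KEY"]) right after load_dotenv().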

Generate Vector Embeddings:

The following extracts each reptile's description, embeds them all in a single voyage-4 call, attaches each resulting vector to its source document, and inserts all 12 records into the reptiles collection.

descriptions = [r["description"] for r in reptiles]
resp = vo.embed(
    descriptions,
    model="voyage-4",
    input_type="document",
)
embeddings = resp.embeddings

docs = []
for reptile, vec in zip(reptiles, embeddings):
    docs.append({
        **reptile,
        "embedding": vec
    })
coll.insert_many(docs)
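
Because zip stops at the shorter of its inputs, a mismatch between the number of documents and the number of returned vectors would go unnoticed in the loop above. The following is a sketch of the same pairing step with two sanity checks added. The function name attach_embeddings and its parameters are assumptions made for illustration.

```python
def attach_embeddings(records, vectors, field="embedding", expected_dim=1024):
    """Pair each record with its vector, checking counts and dimensions first.

    Hypothetical helper: returns new dicts and leaves the inputs unmodified.
    """
    if len(records) != len(vectors):
        raise ValueError(
            f"{len(records)} records but {len(vectors)} vectors; refusing to zip"
        )
    docs = []
    for record, vec in zip(records, vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"record {record.get('id')!r}: expected {expected_dim} "
                f"dimensions, got {len(vec)}"
            )
        docs.append({**record, field: vec})
    return docs
```

The result could be passed to coll.insert_many in place of the docs list built above.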

Create a Vector Search Index with Automatic Quantization:

The following creates and registers a vector search index on the reptiles collection, configured for 1024-dimensional dot product search, with binary quantization applied automatically when the embeddings are indexed.

from pymongo.operations import SearchIndexModel

index_def = SearchIndexModel(
    name="vector_index_auto_binary",
    type="vectorSearch",
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 1024,
                "similarity": "dotProduct",
                "quantization": "binary"  # automatic binary quantization
            }
        ]
    }
)

coll.create_search_index(index_def)
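
To give a feel for what binary quantization trades away, the sketch below reduces each float component to a single bit and packs the bits into bytes. It is a conceptual illustration only: the threshold at zero, the function names, and the bit order are assumptions, and this is not a description of how the index performs quantization internally.

```python
def binarize(vector):
    """Map each component to one bit: 1 if positive, otherwise 0 (assumed threshold)."""
    return [1 if x > 0 else 0 for x in vector]


def pack_bits(bits):
    """Pack bits into bytes, most significant bit first, zero-padding the tail."""
    out = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        chunk = chunk + [0] * (8 - len(chunk))
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def hamming_distance(a, b):
    """Count the differing bits between two equal-length packed vectors."""
    if len(a) != len(b):
        raise ValueError("packed vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

A 1024-dimensional vector packed this way occupies 128 bytes, compared with 4096 bytes as 32-bit floats.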

Ingest Precomputed Quantized Vectors

BSON Binary Imports:

The following imports Binary and BinaryVectorDtype from the bson library, which are used to encode vectors into BSON's native binary format for compact storage in MongoDB.

from bson.binary import Binary, BinaryVectorDtype

Generate Embeddings with Configurable Dtype:

The following defines a helper function that embeds each text individually with the specified Voyage AI model and output dimension. The output dtype controls the numeric format of the returned vectors (e.g., float, int8, binary), and the function returns a list of all resulting embeddings.

def generate_embeddings(texts, model, dtype, output_dimension):
    embeddings = []
    for text in texts:
        embedding = vo.embed(
            texts=[text],
            model=model,
            output_dtype=dtype,
            output_dimension=output_dimension,
        ).embeddings[0]
        embeddings.append(embedding)
    return embeddings
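
The helper above issues one request per text. For larger datasets, grouping texts into batches would reduce the number of calls. The following sketch shows one way to do that while keeping the embedding call itself pluggable. The names chunked and embed_in_batches, and the default batch size of 8, are arbitrary choices for illustration.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(texts, embed_batch, batch_size=8):
    """Embed `texts` batch by batch, preserving input order.

    `embed_batch` is any callable that maps a list of texts to a list of
    vectors of the same length (hypothetical interface for this sketch).
    """
    vectors = []
    for batch in chunked(texts, batch_size):
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise ValueError("embed_batch returned a different number of vectors")
        vectors.extend(result)
    return vectors
```

With the Voyage AI client from the setup step, embed_batch could be a small function that calls vo.embed(texts=batch, model=model, output_dtype=dtype, output_dimension=output_dimension) and returns its .embeddings list.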

Convert a Vector to BSON Binary Format:

The following defines a helper function that wraps a vector in MongoDB's native BSON binary format using the specified dtype, preparing it for compact storage in MongoDB.

def generate_bson_vector(vector, vector_dtype):
    return Binary.from_vector(vector, vector_dtype)
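
The storage benefit of each vector dtype follows from simple arithmetic on the element width. The sketch below computes the raw payload size per vector for the three formats named after the BinaryVectorDtype members (FLOAT32, INT8, PACKED_BIT). It deliberately ignores the small amount of framing that BSON adds around the payload, and the function name payload_bytes is hypothetical.

```python
import struct


def payload_bytes(num_dimensions, dtype):
    """Return the raw vector payload size in bytes, excluding BSON framing."""
    element_sizes = {
        "float32": struct.calcsize("<f"),  # 4 bytes per component
        "int8": struct.calcsize("<b"),     # 1 byte per component
    }
    if dtype in element_sizes:
        return num_dimensions * element_sizes[dtype]
    if dtype == "packed_bit":
        return (num_dimensions + 7) // 8   # 1 bit per component, rounded up
    raise ValueError(f"unknown dtype: {dtype!r}")
```

For the 1024-dimensional vectors used here, this gives 4096 bytes for float32, 1024 bytes for int8, and 128 bytes for packed bits.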

Generate and Store INT8 BSON Embeddings:

The following generates 1024-dimensional int8 embeddings for each reptile description, converts them to BSON binary format, attaches each to its source document under embedding_int8, and inserts all 12 records into the reptiles collection.

int8_embeddings = generate_embeddings(
    descriptions, model="voyage-4", dtype="int8", output_dimension=1024
)

bson_int8_embeddings = [
    generate_bson_vector(emb, BinaryVectorDtype.INT8) for emb in int8_embeddings
]

docs = []
for reptile, bson_vec in zip(reptiles, bson_int8_embeddings):
    docs.append({**reptile, "embedding_int8": bson_vec})

coll.insert_many(docs)
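
Before encoding a vector with BinaryVectorDtype.INT8, it can help to confirm that every component is an integer in the signed 8-bit range, so that a bad value is reported with its position rather than surfacing later during encoding. The following is a sketch of such a check. The name check_int8_vector is an assumption for illustration.

```python
def check_int8_vector(vector, expected_dim=1024):
    """Return `vector` unchanged if it is a valid int8 vector, else raise ValueError."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vector)}")
    for position, value in enumerate(vector):
        # bool is a subclass of int in Python, so it is rejected explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not -128 <= value <= 127:
            raise ValueError(f"component {position} is not an int8 value: {value!r}")
    return vector
```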

Create a Vector Search Index for INT8 Embeddings:

The following creates and registers a vector search index on the embedding_int8 field, configured for 1024-dimensional dot product search without quantization, since the vectors are already pre-quantized to int8.

index_def_prequant = SearchIndexModel(
    name="vector_index_int8",
    type="vectorSearch",
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding_int8",
                "numDimensions": 1024,
                "similarity": "dotProduct"
            }
        ]
    }
)

coll.create_search_index(model=index_def_prequant)
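
A newly created search index is built in the background and is not usable until the build finishes. One common pattern is to poll list_search_indexes until the index reports queryable as true. The sketch below follows that pattern with a timeout. The function name, the default timings, and the injectable sleep parameter are assumptions for illustration, and the collection argument only needs to provide list_search_indexes.

```python
import time


def wait_until_queryable(collection, index_name, timeout_s=120.0, poll_s=5.0,
                         sleep=time.sleep):
    """Poll until the named search index reports queryable, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable") is True:
            return indexes[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"index {index_name!r} not queryable after {timeout_s}s")
        sleep(poll_s)
```

It could be called as wait_until_queryable(coll, "vector_index_int8") before running any queries against the index.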

Embed and Encode a Query Vector as INT8 BSON:

The following generates a 1024-dimensional int8 query embedding using voyage-4, then converts it to BSON binary format for use in a vector search against the embedding_int8 index.

# Illustrative query; replace with your own search text.
query_text = "venomous snakes"

query_vector = vo.embed(
    texts=[query_text],
    model="voyage-4",
    input_type="query",
    output_dtype="int8",
    output_dimension=1024,
).embeddings[0]

bson_query_vector = Binary.from_vector(query_vector, BinaryVectorDtype.INT8)
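
The query vector must use the same dtype and dimension as the indexed vectors because the search compares them component by component. As a rough illustration of what a dot product comparison over int8 vectors looks like, the sketch below ranks a few tiny made-up vectors against a query. The vectors and function names are invented for the example, and the score reported by the server may be normalized differently from the raw dot product shown here.

```python
def dot_product(a, b):
    """Return the integer dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def rank_by_dot(query, named_vectors, k):
    """Return the k (name, score) pairs with the highest dot product."""
    scored = [(name, dot_product(query, vec)) for name, vec in named_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional int8 vectors, invented purely for this illustration.
toy_query = [1, 0, -1]
toy_docs = {"a": [2, 0, -2], "b": [-1, 5, 1], "c": [1, 1, 0]}
top_two = rank_by_dot(toy_query, toy_docs, k=2)
```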

Build a Vector Search Pipeline for INT8 Embeddings:

The following defines an aggregation pipeline that runs a vector search against the vector_index_int8 index, retrieving the top 3 results from 50 candidates using the BSON-encoded int8 query vector.

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index_int8",
            "path": "embedding_int8",
            "queryVector": bson_query_vector,
            "numCandidates": 50,
            "limit": 3,
        }
    },
]

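# aggregate() returns a cursor; iterate it to read the matching documents.
results = coll.aggregate(pipeline)
for doc in results:
    print(doc["name"])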
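
The pipeline above returns whole documents, including the stored vector. A $project stage can trim the output and expose the similarity score through the vectorSearchScore metadata field. The following sketch wraps both stages in a small builder. The function name and its defaults are assumptions for illustration, and the check reflects the requirement that numCandidates be at least as large as limit.

```python
def build_vector_search_pipeline(query_vector, index, path, limit=3,
                                 num_candidates=50):
    """Build a $vectorSearch pipeline that returns name, species, and score."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be greater than or equal to limit")
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "name": 1,
                "species": 1,
                # Similarity score computed by the vector search stage.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With the objects defined earlier, it could be used as coll.aggregate(build_vector_search_pipeline(bson_query_vector, "vector_index_int8", "embedding_int8")).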