Most major database vendors, like MongoDB, are adding vector search capabilities to their products. It’s becoming a standard feature as demand for AI-powered applications grows.
MongoDB Vector Search enables semantic queries, letting you find results that match in meaning rather than just exact wording. You can locate support tickets describing the same issue even if they use different phrases, or combine precise keyword matches like "error 500" with conceptually similar terms such as "server failure" in a hybrid search.
It also supports personalization, such as recommending products, movies, or articles similar to what a user previously engaged with.
MongoDB Vector Search plays a central role in retrieval-augmented generation (RAG), where a language model retrieves relevant context from stored embeddings before producing an answer. For example, it can pull accurate API details from technical documentation instead of inventing them.
What is a Vector?
A vector is an ordered list of numbers—for example, `[0.12, -0.98, 4.45, 1, 0.44, ...]`—that encodes the meaning of a document, sentence, or image so that similar content is positioned close together in a high-dimensional space. Searching with vectors means comparing these numerical representations to find the most semantically relevant matches. For instance, the sentence “A hacker discovers a hidden reality controlled by machines” might be represented as the vector:
[0.12, -0.45, 0.33, 0.08, -0.27, 0.51]
Meanwhile, “A computer programmer learns the world is a simulation” becomes:
[0.10, -0.47, 0.31, 0.07, -0.29, 0.49]
Because their numbers are close, the system identifies the two as similar in meaning.
Another example is illustrated in the figure below. The query “Renewable Energy” is mapped in semantic space near clusters such as Wind Energy and Solar Power, reflecting strong semantic similarity. In contrast, unrelated clusters like MongoDB and Apache Lucene are positioned farther away. In vector space, distance encodes meaning: The closer two points are, the more related their semantic content, and the farther apart they are, the weaker the relationship.
MongoDB does not generate vectors but makes it easy to store, index, and search them with Atlas Vector Search, and soon, this capability will also be available in MongoDB Community Edition and Enterprise Advanced. You can create vectors using external models such as OpenAI or Hugging Face, then store and query them efficiently in MongoDB.
MongoDB recently acquired Voyage AI to bring high-quality embedding models and advanced capabilities like reranking and hybrid relevance scoring directly into the MongoDB Atlas platform in the future.
How Does MongoDB Implement Vector Search?
MongoDB implements vector search through a dedicated process called mongot, which indexes and executes both full-text and vector queries. It is built on Apache Lucene, leveraging KNNVectorField for vector storage and the Hierarchical Navigable Small World (HNSW) algorithm, which is engineered to execute large-scale approximate nearest-neighbor searches efficiently. This design enables low-latency, high-throughput semantic retrieval across MongoDB Atlas today (and on-prem soon). MongoDB exposes two execution modes, both shown in the sketch after this list:
- Approximate (ANN): Powered by HNSW; best when you need speed and scale with a small accuracy trade-off
- Exact (ENN): Exhaustive scan of all vectors; best when maximum precision matters (typically slower, better for smaller sets or final re-ranking)
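As a minimal sketch of the difference, here is how the two modes look as $vectorSearch stages. This assumes the plot_embedding_index index and the qv query vector created later in this tutorial; the exact: true option is how current Atlas releases request ENN, and when it is set, numCandidates is omitted:

// ANN (HNSW): evaluates numCandidates nearest neighbors per query; fast, slightly approximate
{ $vectorSearch: { index: "plot_embedding_index", path: "plot_embedding",
                   queryVector: qv, numCandidates: 200, limit: 10 } }

// ENN: exhaustive scan of every indexed vector; exact results, slower on large collections
{ $vectorSearch: { index: "plot_embedding_index", path: "plot_embedding",
                   queryVector: qv, exact: true, limit: 10 } }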
How Does mongot Work?
The mongot process handles search indexes independently from mongod, the main MongoDB process responsible for storage and query execution. mongot does not store BSON documents; it manages Lucene-based indexes.
Data is sent from mongod to mongot via an internal synchronization mechanism based on MongoDB Change Streams.
When documents are inserted or updated in a MongoDB collection, the fields defined in the index configuration are extracted and streamed to mongot, where they are transformed and written to Lucene segment files for indexing.
$vectorSearch Aggregation Pipeline Stage
Semantic similarity queries on vector data can be executed in MongoDB using the unified Query API and standard aggregation pipeline stages.
- When a pipeline includes a $vectorSearch stage, mongod parses the request and delegates the vector-specific portion to mongot.
- The mongot process runs the Lucene search (vector similarity, relevance scoring, or hybrid) and returns document IDs with scores.
- Results are merged back in mongod with the original BSON documents and passed through remaining filters or stages.
- In MongoDB Atlas, mongot runs as a separate process (either alongside mongod or on dedicated nodes) and is fully managed.
- Initially exclusive to MongoDB Atlas, vector search will soon be available in Community and Enterprise editions.
Creating Local MongoDB Atlas Clusters
To test Vector Search, you need an Atlas cluster. MongoDB Atlas is a fully managed, multi-cloud data platform that runs on AWS, GCP, and Azure, offering high availability, horizontal scaling, and integrated security. It provides features such as Vector Search, full-text search, global clusters, SQL querying, online archiving to S3, stream processing, and advanced security controls.
The Atlas CLI is a command-line tool for creating, managing, and interacting with MongoDB Atlas resources from your terminal. It can also spin up local, Atlas-like clusters that you can reach with mongosh without a cloud account, which makes it ideal for development, testing, and workshops.
If you do not have the Atlas CLI installed yet, follow the official installation instructions. Once it is installed, you can start a local Atlas cluster with a single command:
atlas deployments setup
- Choose the local option, accept the defaults, and specify a port (e.g., 27017).
- This spins up a MongoDB 8.0 replica set with Atlas-compatible features (a non-interactive variant of the setup command is sketched below).
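If you prefer to script the setup rather than answer prompts, the same deployment can be created non-interactively. The flags below reflect the Atlas CLI at the time of writing and the deployment name is only an example; run atlas deployments setup --help to confirm the options available in your version:

atlas deployments setup local813 --type local --port 27017 --force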
Now, you can list active deployments using:
atlas deployments list
The output of the command should display your cluster status as shown in the code snippet below:
NAME       TYPE    MDB VER   STATE
local813   LOCAL   8.0.11    IDLE
You can now start testing vector search locally. To connect to your new local Atlas deployment, run:
atlas deployments connect
You’ll be prompted to choose how you want to connect. For example:
? How would you like to connect to local813?
> mongosh - MongoDB Shell
compass - MongoDB Compass
vscode - MongoDB for VSCode
connectionString - Connection String
Selecting mongosh will launch an interactive session connected to your local MongoDB replica set. You can now run queries, create indexes, test aggregations, and explore features like Atlas Search and Vector Search.
Vector Embedding Workflow
After launching the cluster and connecting with mongosh, only the default system databases (admin, config, and local) are visible:
AtlasLocalDev local813 [direct: primary] test> show dbs
admin 256.00 KiB
config 232.00 KiB
local 588.00 KiB
AtlasLocalDev local813 [direct: primary] test>
These default databases are used internally by MongoDB and contain no user data or vector embeddings at this point.
An embedding model, such as OpenAI or Voyage AI, processes raw data and generates a high-dimensional vector that captures its meaning. MongoDB stores these vectors in a collection, and the $vectorSearch aggregation stage uses them to run semantic queries that match data based on meaning rather than exact keywords.
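As a minimal sketch of that workflow in Node.js (assuming the openai and mongodb npm packages, an OPENAI_API_KEY environment variable, and the local connection string that atlas deployments connect reports; the my_movies collection and the sample plot are only illustrative):

// embed_and_store.mjs: generate an embedding for a plot string and store it in MongoDB
import OpenAI from "openai";
import { MongoClient } from "mongodb";

const openai = new OpenAI();   // reads OPENAI_API_KEY from the environment
const client = new MongoClient("mongodb://localhost:55015/?directConnection=true");
await client.connect();

const plot = "A computer hacker learns the true nature of his reality.";

// 1. Ask the embedding model for a vector (text-embedding-ada-002 returns 1,536 numbers)
const res = await openai.embeddings.create({
  model: "text-embedding-ada-002",
  input: plot,
});
const vector = res.data[0].embedding;   // plain JavaScript array of numbers

// 2. Store the text together with its embedding so a vector index can cover it
await client.db("sample_mflix").collection("my_movies").insertOne({
  plot,
  plot_embedding: vector,
});

await client.close();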
Steps to Load Sample Embeddings
Instead of generating embeddings yourself, you can load a sample MongoDB dataset that already contains pre-generated vector embeddings using the mongorestore tool.
1. Download Sample Dataset
curl https://atlas-education.s3.amazonaws.com/sampledata.archive -o sampledata.archive
2. Find the Connection String for Your Local Atlas Cluster
atlas deployments connect --connectWith connectionString
This will return a connection string similar to:
mongodb://localhost:55015/?directConnection=true
3. Restore the Dataset using mongorestore
mongorestore --archive=sampledata.archive --uri "mongodb://localhost:55015/?directConnection=true"
4. Confirm the Dataset has been Loaded
- Reconnect to your local Atlas cluster.
- Run show dbs in mongosh to confirm that the sample_mflix database has been added.
sample_mflix
This database includes the embedded_movies collection with pre-generated vector embeddings from the MongoDB sample dataset.
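As an optional extra check (exact counts depend on the dataset version you downloaded), you can confirm that the collection is populated and that its documents carry the embedding field:

db.getSiblingDB("sample_mflix").embedded_movies.countDocuments()
db.getSiblingDB("sample_mflix").embedded_movies.countDocuments({ plot_embedding: { $exists: true } })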
The Matrix Scenario
When you load the sample dataset to MongoDB, one of the movies you’ll find in the embedded_movies collection is The Matrix.
You can check it with the find command:
db.getSiblingDB("sample_mflix").embedded_movies.find({ title: "The Matrix" })
Example Document:
The document includes standard fields like title, plot, and genres, plus two vector embeddings:
{
"title": "The Matrix",
"year": 1999,
"genres": ["Action", "Sci-Fi"],
"rated": "R",
"plot": "A computer hacker learns from mysterious rebels about the true nature of his reality and his role in the war against its controllers.",
"fullplot": "Thomas A. Anderson is a man living two lives...",
"imdb": { "rating": 8.7, "votes": 1080566 },
"metacritic": 73,
"languages": ["English"],
"writers": ["Andy Wachowski", "Lana Wachowski"],
"directors": ["Andy Wachowski", "Lana Wachowski"],
"cast": ["Keanu Reeves", "Laurence Fishburne", "Carrie-Anne Moss", "Hugo Weaving"],
"countries": ["USA", "Australia"],
"runtime": 136,
"released": "1999-03-31",
"awards": "Won 4 Oscars. Another 33 wins & 40 nominations.",
"poster": "https://m.media-amazon.com/images/M/...jpg",
"plot_embedding": [-0.0065, -0.0334, -0.0149, -0.0390, -0.0114, 0.0089, -0.0314, -0.01881, -0.0534, -0.0734, -0.016608...],
"plot_embedding_voyage_3_large": [-0.0376, 0.0339, -0.0164, -0.0154, -0.0134, -0.5164, -0.0371, -0.01881, -0.016608, 0.0920, 0.0474, ...]
}
Embedding Details
- plot_embedding → 1,536 dimensions from OpenAI’s text-embedding-ada-002
- plot_embedding_voyage_3_large → 2,048 dimensions from Voyage AI’s voyage-3-large
Using the Embedding: These embeddings encode meaning, not just words. You can use them so MongoDB finds movies with a similar concept, even when plots share no obvious keywords.
For this tutorial:
- Use The Matrix’s plot_embedding as your query vector.
- Since the embedding is already stored in the document, you just need to retrieve it and pass it to the $vectorSearch stage as a query parameter—no extra model calls required.
- Finally, create a vector index on the embedding field to perform semantic search efficiently. This is required for the $vectorSearch aggregation stage to work.
Creating Your Vector Search Index
Use the createSearchIndex command to define a vector index on the plot_embedding field. This enables fast similarity search over 1,536-dimensional vectors.
db.getSiblingDB("sample_mflix").embedded_movies.createSearchIndex({
name: "plot_embedding_index",
definition: {
mappings: {
dynamic: false,
fields: {
plot_embedding: {
type: "knnVector",
dimensions: 1536,
similarity: "cosine"
}
}
}
}
})
The plot_embedding field is indexed as a knnVector, a specialized vector field type designed for storing high-dimensional numeric data. Cosine similarity compares vectors by the angle between them: the smaller the angle, the higher the similarity, regardless of their magnitude.
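To make the metric concrete, here is a small self-contained JavaScript sketch (nothing MongoDB-specific; it simply reproduces the formula) applied to the two example vectors from earlier in this article:

// Cosine similarity: dot(a, b) / (||a|| * ||b||)
// 1 means "same direction" (very similar), 0 means unrelated, -1 means opposite
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineSimilarity(
  [0.12, -0.45, 0.33, 0.08, -0.27, 0.51],   // "A hacker discovers a hidden reality..."
  [0.10, -0.47, 0.31, 0.07, -0.29, 0.49]    // "A computer programmer learns the world is a simulation"
);  // ≈ 0.998, i.e. nearly identical in meaning

Atlas Vector Search does not return this raw value directly; the vectorSearchScore you will see in query results is normalized to the range 0 to 1 (for the cosine metric, (1 + cosine) / 2), so higher still means more similar.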
To confirm the index exists, run:
db.getSiblingDB("sample_mflix").embedded_movies.getSearchIndexes()
You should see output something like:
[
  {
    id: '68983b85c2c844543026fa6a',
    name: 'plot_embedding_index',
    type: 'search',
    status: 'READY',
    queryable: true,
    latestVersion: 0,
    latestDefinition: {
      mappings: {
        dynamic: false,
        fields: {
          plot_embedding: { type: 'knnVector', dimensions: 1536, similarity: 'cosine' }
        }
      }
    }
  }
]
Your index is in READY status, so you can now run queries against it.
Running Your First Vector Search Queries
$vectorSearch can search BSON Binary vector fields directly inside MongoDB because the index is built on that binary Float32 data.
However, when you run $vectorSearch from the MongoDB shell (mongosh) or from application code, you must pass the query vector as a plain JavaScript array of numbers—not as raw BSON binary.
MongoDB stores embeddings in documents as BSON Binary (Float32) because it’s compact and efficient for indexing. The vector search index uses this binary data internally without conversion. But the queryVector parameter is an input to the search operation—it isn’t read from the indexed data, it’s sent from your code.
This means you need to decode the BSON Binary into a standard JS array before passing it to $vectorSearch. As shown below, you fetch the plot_embedding BSON Binary for The Matrix, convert it to a Float32Array, and then convert that to a plain JavaScript array for $vectorSearch.
// Get The Matrix embedding from the document
const d = db.getSiblingDB("sample_mflix").embedded_movies.findOne(
{ title: "The Matrix" },
{ plot_embedding: 1, _id: 0 }
)
// Convert BSON Binary (Float32) -> Float32Array -> plain JS array
const qv = Array.from(d.plot_embedding.toFloat32Array())
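As a quick optional sanity check, the decoded array should match the dimensionality declared in the index definition. (If your copy of the dataset stores plot_embedding as a plain array rather than BSON Binary, d.plot_embedding can be passed to $vectorSearch directly without this conversion.)

qv.length   // expected: 1536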
Running Semantic Search
Pass qv into $vectorSearch as the query vector—the reference point used to compare against all indexed vectors. The search engine uses cosine similarity to measure how close each stored embedding is to qv, ranking results from most to least similar.
This returns movies that are conceptually close to The Matrix, even if they don’t share obvious keywords.
db.getSiblingDB("sample_mflix").embedded_movies.aggregate([
{
$vectorSearch: {
index: "plot_embedding_index",
path: "plot_embedding",
queryVector: qv,
numCandidates: 200,
limit: 10
}
},
{
$match: { title: { $ne: "The Matrix" } }
},
{
$project: {
title: 1,
year: 1,
genres: 1,
score: { $meta: "vectorSearchScore" },
_id: 0
}
}
])
Expected output:
[
{
genres: [ "Action", "Adventure", "Sci-Fi" ],
title: "TRON",
year: 1982,
score: 0.9550351500511169
},
{
genres: [ "Action", "Drama", "Mystery" ],
title: "Arrambam",
year: 2013,
score: 0.9546242952346802
},
{
year: 2001,
genres: [ "Action", "Crime", "Thriller" ],
title: "Swordfish",
score: 0.9543327689170837
},
{
year: 1995,
genres: [ "Action", "Crime", "Drama" ],
title: "The Net",
score: 0.9502608180046082
},
{
genres: [ "Action", "Drama" ],
title: "Tuff Turf",
year: 1985,
score: 0.9378551244735718
},
{
year: 2015,
genres: [ "Action", "Comedy", "Crime" ],
title: "Spy",
score: 0.9367037415504456
},
{
genres: [ "Action", "Sci-Fi" ],
title: "V: The Final Battle",
year: 1984,
score: 0.9352985620498657
},
{
genres: [ "Action", "Adventure", "Sci-Fi" ],
title: "Jumper",
year: 2008,
score: 0.9346113204956055
},
{
year: 2014,
genres: [ "Action", "Adventure", "Comedy" ],
title: "Kingsman: The Secret Service",
score: 0.9341350793838501
}
]
Explanation: This result shows $vectorSearch running on the 1,536-dim plot_embedding index with cosine similarity (numCandidates: 200, limit: 10), then excluding the seed movie and projecting the Lucene-backed vectorSearchScore.
Scores between roughly 0.93 and 0.96 indicate strong semantic proximity in embedding space, so titles like TRON, Swordfish, and The Net surface because their plots encode similar concepts (virtual worlds, hacking, surveillance), even if they share few literal keywords with The Matrix. Ranking is driven by vector similarity, not term frequency, which is why cross-year and cross-subgenre matches still rank highly.
Conclusion
MongoDB Vector Search lets you store, index, and query high-dimensional embeddings directly alongside application data. You can generate embeddings with external models like OpenAI and Voyage AI, store them in BSON Binary (Float32) format for compact storage and fast indexing, and query them via the $vectorSearch stage. Internally, MongoDB runs vector search in a separate mongot process built on Apache Lucene’s KNNVectorField, supporting both HNSW-based Approximate Nearest Neighbor (ANN) search for high speed and scalability, and Exact Nearest Neighbor (ENN) search for maximum precision at the cost of performance.
You can try this locally by spinning up an Atlas-compatible cluster with Atlas CLI and loading a sample dataset containing pre-computed embeddings using mongorestore. Once loaded, you can retrieve an embedding (e.g., The Matrix’s plot vector), convert it to a plain JavaScript array, and use it as a query vector to find conceptually similar documents. This approach powers semantic search, personalization, hybrid relevance, and retrieval-augmented generation without needing a separate vector database.