From in-memory nearest-neighbour to pgvector at scale
Index and query embeddings in Postgres with pgvector — the same extension that powers production RAG at scale — and compare it to a naive in-memory baseline.
You're on lesson 3 of 6 in the free RAG module.
What a vector store buys you
A vector store holds embeddings and answers a specific question fast: "give me the top-k vectors closest to this query vector." Regular databases answer exact-match queries (WHERE id = 42). Vector stores answer similarity queries.
You could skip the database entirely and keep vectors in a JavaScript array. Brute-force cosine similarity works fine below roughly 10,000 vectors. Beyond that, you hit four walls at once:
- Memory. 1M vectors × 1536 dims × 4 bytes ≈ 6 GB of raw float32 data, more than a typical Node process can comfortably keep in memory.
- Speed. Brute-force is O(n) — a linear scan of every vector on every query.
- Persistence. When the process restarts, your index is gone.
- Metadata. You want to filter by source, date, and section alongside similarity.
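For reference, the in-memory baseline described above can be sketched in a few lines of JavaScript. This is a minimal illustration, not a tuned implementation: the function names and the { id, embedding } record shape are assumptions made for this example, and rawVectorBytes simply reproduces the memory arithmetic from the list.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as "no similarity".
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k: score every stored vector, sort by score, keep the best k.
// `items` is a hypothetical array of { id, embedding } records.
function topK(query, items, k) {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Raw storage cost in bytes, before any index or per-object overhead.
function rawVectorBytes(count, dims, bytesPerFloat = 4) {
  return count * dims * bytesPerFloat;
}
```

Every call to topK touches every stored vector, which is the linear scan from the Speed bullet, and rawVectorBytes(1_000_000, 1536) gives the roughly 6 GB from the Memory bullet. Nothing here persists across restarts or filters on metadata.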
A vector store solves all four. pgvector adds a fifth win: it runs inside the Postgres you already operate — no extra service, no extra auth, no extra backup pipeline.
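To make the contrast with the in-memory approach concrete, here is a hedged sketch of what a similarity query with a metadata filter might look like against pgvector. The table name chunks, its columns, and the helper names are hypothetical and chosen only for illustration; the `<=>` operator is pgvector's cosine-distance operator, so smaller values mean closer vectors. The helper only builds the SQL text and parameters, in the { text, values } shape that a client such as node-postgres accepts, so it runs without a database.

```javascript
// Format a JS array as a pgvector text literal, e.g. [0.1, 0.2] -> "[0.1,0.2]".
function toVectorLiteral(embedding) {
  return "[" + embedding.join(",") + "]";
}

// Build a parameterised top-k query with a metadata filter.
// Assumed (hypothetical) schema: chunks(id, content, source, embedding vector(1536)).
function buildTopKQuery(queryEmbedding, source, k) {
  const text = `
    SELECT id, content, source,
           1 - (embedding <=> $1::vector) AS similarity
    FROM chunks
    WHERE source = $2
    ORDER BY embedding <=> $1::vector
    LIMIT $3`;
  // $1 is the query vector as text, $2 the metadata filter, $3 the k in top-k.
  return { text, values: [toVectorLiteral(queryEmbedding), source, k] };
}
```

The WHERE clause and the ORDER BY sit in the same statement, which is the point of the Metadata bullet: filtering and similarity ranking happen in one query rather than in application code.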
ANN (approximate nearest neighbour)