title: §5 Under the Hood: ANN Algorithms
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:01Z
updated_at: 2026-03-10T23:30:01Z

§5 Under the Hood: ANN Algorithms — Stub to fill

File: edu/src/vector-db.md, section ### 5. Under the Hood: ANN Algorithms

Replace this stub line with full content:

Exact nearest-neighbour search over millions of high-dimensional vectors is too slow [...] 🚧 Full content tracked in [nbd:6ec5ff].

This is a reading lesson: no Rust code. Target 600-800 words. Include the summary table below.

Learning objectives

  • Understand why exact KNN is impractical at scale (O(n·d) per query)
  • Understand how HNSW works conceptually (multi-level navigable graph, greedy search)
  • Understand how IVFFlat works conceptually (k-means clustering, inverted index)
  • Know the key tuning parameters for each and what they control
  • Understand the recall vs. latency trade-off
  • Know that sqlite-vec uses HNSW via libsql_vector_idx

Content to write

Why not exact search? Brute-force KNN computes the distance from the query to every stored vector: O(n·d) per query. At n = 1M vectors, d = 768 dimensions, and 1000 QPS this is ~768B (7.68 × 10^11) multiply-add operations per second, more than a single CPU can realistically sustain. ANN algorithms return approximate results in O(log n) or otherwise sub-linear time, at the cost of occasionally missing a few true nearest neighbours.
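
As a rough illustration of the cost argument above, the following sketch implements brute-force KNN in plain Python and reproduces the operation count. The function names `brute_force_knn` and `ops_per_second` are made up for this example, and Euclidean distance is assumed.

```python
import heapq
import math


def brute_force_knn(query, vectors, k):
    """Return indices of the k vectors closest to `query` (Euclidean distance).

    Every stored vector is visited once and each distance costs O(d),
    so a single query costs O(n * d).
    """
    distances = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, distances)]


def ops_per_second(n, d, qps):
    """Rough number of multiply-adds per second for brute-force search."""
    return n * d * qps


vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
nearest = brute_force_knn((1.0, 1.0), vectors, k=2)
load = ops_per_second(n=1_000_000, d=768, qps=1000)

print(nearest)  # [1, 3]
print(load)     # 768000000000
```

The work grows linearly with both n and d, which is the scaling that the index structures below are designed to avoid.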

HNSW — Hierarchical Navigable Small World. The dominant algorithm for in-memory ANN, used by sqlite-vec.

Intuition: imagine a multi-level skip list where each level is a proximity graph. The top level is sparse with long-range connections (fast coarse navigation). The bottom level is dense with short-range connections (precise local search). A query starts at the top, greedily moves to whichever neighbour is closest to the query, descends when stuck, and repeats down to the bottom level where the k nearest candidates are collected.
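
The descent described above can be sketched in a few lines. This is a simplified illustration, not the implementation used by any particular library: the three-layer graph is wired by hand rather than built by HNSW's insertion procedure, and the names `greedy_step`, `search_layer`, and `hnsw_search` are hypothetical.

```python
import heapq
import math


def greedy_step(query, entry, neighbours, points):
    """Walk one layer, moving to a closer neighbour until none improves."""
    current = entry
    current_dist = math.dist(query, points[current])
    improved = True
    while improved:
        improved = False
        for candidate in neighbours.get(current, ()):
            d = math.dist(query, points[candidate])
            if d < current_dist:
                current, current_dist = candidate, d
                improved = True
    return current


def search_layer(query, entry, neighbours, points, ef):
    """Best-first search on one layer, keeping the ef closest nodes seen."""
    visited = {entry}
    d0 = math.dist(query, points[entry])
    candidates = [(d0, entry)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry)]     # max-heap via negation: worst kept result on top
    while candidates:
        dist, node = heapq.heappop(candidates)
        if len(results) >= ef and dist > -results[0][0]:
            break  # nothing left that can beat the current worst result
        for nb in neighbours.get(node, ()):
            if nb in visited:
                continue
            visited.add(nb)
            d = math.dist(query, points[nb])
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    return sorted((-neg, n) for neg, n in results)


def hnsw_search(query, layers, points, entry, k, ef):
    """layers[0] is the dense bottom layer, layers[-1] the sparse top layer."""
    for neighbours in reversed(layers[1:]):
        entry = greedy_step(query, entry, neighbours, points)
    found = search_layer(query, entry, layers[0], points, max(ef, k))
    return [node for _, node in found[:k]]


# Hand-built toy index: eight points on a line, three layers.
points = {i: (float(i), 0.0) for i in range(8)}
bottom = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
middle = {0: [2], 2: [0, 4], 4: [2, 6], 6: [4]}
top = {0: [4], 4: [0]}

result = hnsw_search((5.2, 0.0), [bottom, middle, top], points, entry=0, k=2, ef=3)
print(result)  # [5, 6]
```

In this toy run the top layer jumps from node 0 to node 4, the middle layer refines to node 6, and the bottom layer collects the two nearest nodes, 5 and 6.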

Key parameters:

  • M: number of bidirectional connections per node. Higher M → better recall, more memory, slower inserts. Typical: 16.
  • ef_construction: candidate list size during index build. Higher → better index quality, slower build. Typical: 200.
  • ef_search: candidate list size during query. Higher → better recall, slower query. Often defaults to k.

HNSW supports incremental inserts with no full rebuild. The graph adds a memory overhead of roughly O(n·M) links at about 4 bytes each, on top of the vectors themselves.
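
To put the memory estimate in perspective, here is a back-of-the-envelope calculation. It assumes exactly M links per node at 4 bytes each and 4-byte float components; real implementations differ in detail (some allow more links on the bottom layer), so treat the numbers as an order of magnitude only. The helper names are hypothetical.

```python
def hnsw_link_bytes(n, m, bytes_per_link=4):
    """Approximate size of the graph edges: n nodes, m links each."""
    return n * m * bytes_per_link


def raw_vector_bytes(n, d, bytes_per_component=4):
    """Size of the stored vectors themselves (e.g. 32-bit floats)."""
    return n * d * bytes_per_component


links = hnsw_link_bytes(n=1_000_000, m=16)
raw = raw_vector_bytes(n=1_000_000, d=768)

print(f"graph links: {links / 1e6:.0f} MB")   # graph links: 64 MB
print(f"raw vectors: {raw / 1e9:.2f} GB")     # raw vectors: 3.07 GB
```

Under these assumptions the edges are small next to 768-dimensional vectors, but the overhead becomes proportionally larger as d shrinks or M grows.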

IVFFlat: Inverted File with flat (uncompressed) vector storage. A widely used approach for disk-based or GPU-accelerated ANN (available in Faiss and pgvector).

Intuition: cluster the dataset into nlist Voronoi cells using k-means. At query time, find the nprobe nearest cell centroids, then do exact search within those cells only — skipping the rest of the dataset.

Key parameters:

  • nlist: number of clusters. Typical: √n.
  • nprobe: number of clusters searched at query time. Higher → better recall, slower query.

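IVFFlat requires a training step (k-means over a representative sample) before data can be inserted. Vectors inserted later are assigned to the nearest existing centroid, so the index may need periodic retraining as the data distribution drifts. Lower memory than HNSW for the same n.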
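
The train, assign, and probe steps can be sketched as follows. This is a minimal pure-Python illustration under simplifying assumptions (plain Lloyd's k-means, Euclidean distance, 2-D random data); the function names are invented for the example and do not correspond to the Faiss or pgvector APIs.

```python
import math
import random


def nearest_centroid(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda c: math.dist(vec, centroids[c]))


def train_centroids(vectors, nlist, iterations=10, seed=0):
    """Training step: plain Lloyd's k-means producing nlist centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    for _ in range(iterations):
        cells = [[] for _ in range(nlist)]
        for v in vectors:
            cells[nearest_centroid(v, centroids)].append(v)
        for c, members in enumerate(cells):
            if members:  # keep the old centroid if a cell ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_inverted_lists(vectors, centroids):
    """Inverted index: for each cell, the ids of the vectors assigned to it."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[nearest_centroid(v, centroids)].append(i)
    return lists


def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Exact search restricted to the nprobe cells nearest the query."""
    order = sorted(range(len(centroids)),
                   key=lambda c: math.dist(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: math.dist(query, vectors[i]))
    return candidates[:k]


rng = random.Random(42)
vectors = [(rng.random(), rng.random()) for _ in range(200)]
nlist = 14  # roughly sqrt(200)
centroids = train_centroids(vectors, nlist)
lists = build_inverted_lists(vectors, centroids)

query = (0.5, 0.5)
fast = ivf_search(query, vectors, centroids, lists, k=5, nprobe=2)
full = ivf_search(query, vectors, centroids, lists, k=5, nprobe=nlist)
```

Setting nprobe equal to nlist scans every cell and therefore matches exact search; smaller values scan less data and may miss neighbours that sit just across a cell boundary.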

sqlite-vec uses HNSW. The libsql_vector_idx index type creates an HNSW index — which is why §6 can insert rows incrementally without a training step. The current API does not expose M or ef parameters; defaults are chosen for broad applicability.

Summary table.

| Property | HNSW | IVFFlat |
| --- | --- | --- |
| Query time | O(log n) | O(nprobe · n/nlist) |
| Insert | Incremental | Batch (requires training) |
| Memory | Higher (graph edges) | Lower |
| Recall@10 at defaults | ~0.95+ | ~0.90+ (depends on nprobe) |
| Used by | sqlite-vec, Qdrant, Weaviate | pgvector, Faiss |
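
The Recall@10 row refers to the fraction of the true ten nearest neighbours that the approximate index returns. A minimal sketch of that metric, with a hypothetical helper name and made-up id lists, might look like this:

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours recovered by the ANN result."""
    truth = set(exact_ids[:k])
    return len(truth.intersection(approx_ids[:k])) / k


# Illustrative ids only: the ANN result misses one true neighbour (id 9).
exact_ids = [3, 7, 1, 9, 4, 8, 2, 6, 5, 0]
approx_ids = [3, 7, 1, 4, 8, 2, 6, 5, 0, 11]

recall = recall_at_k(approx_ids, exact_ids, k=10)
print(recall)  # 0.9
```

Measuring this against brute-force results on a sample of queries is the usual way to see how a parameter such as ef_search or nprobe trades recall for latency.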