vibed/edu/.nbd/tickets/6ec5ff.md

+++
title = "§5 Under the Hood: ANN Algorithms"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++
## §5 Under the Hood: ANN Algorithms — Stub to fill

File: `edu/src/vector-db.md`, section `### 5. Under the Hood: ANN Algorithms`

Replace this stub line with full content:
> Exact nearest-neighbour search over millions of high-dimensional vectors is too slow [...] 🚧 Full content tracked in [nbd:6ec5ff].

This is a **reading lesson** — no Rust code. Target 600–800 words. Include the summary table below.

## Learning objectives

- Understand why exact KNN is impractical at scale (O(n·d) per query)
- Understand how HNSW works conceptually (multi-level navigable graph, greedy search)
- Understand how IVFFlat works conceptually (k-means clustering, inverted index)
- Know the key tuning parameters for each and what they control
- Understand the recall vs. latency trade-off
- Know that sqlite-vec uses HNSW via `libsql_vector_idx`

## Content to write

**Why not exact search?** Brute-force KNN computes the distance from the query to every stored vector: O(n·d) per query. At n=1M vectors, d=768 dimensions, and 1000 QPS this is 1M × 768 × 1000 ≈ 7.7 × 10¹¹ multiply-adds per second, plus roughly 3 TB/s of memory traffic if the vectors are float32. That is infeasible on a single CPU. ANN algorithms find approximate results in O(log n) or otherwise sub-linear time at the cost of occasionally missing a few true nearest neighbours.
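The cost argument above can be made concrete with a minimal Python sketch. The function names (`brute_force_knn`, `multiply_adds_per_second`) and the toy vectors are illustrative assumptions, and the cost model counts only one multiply-add per dimension per stored vector.

```python
import heapq
import math


def brute_force_knn(query, vectors, k):
    """Exact k-NN: one distance computation per stored vector, O(n*d) work."""
    scored = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)  # (distance, index) pairs, closest first


def multiply_adds_per_second(n, d, qps):
    """Rough cost model: one multiply-add per dimension, per vector, per query."""
    return n * d * qps


# Toy data, purely for illustration.
vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
nearest = brute_force_knn((1.0, 1.0), vectors, k=2)

# The figures used in the paragraph above.
cost = multiply_adds_per_second(n=1_000_000, d=768, qps=1_000)
```

Every query touches every vector, so the work grows linearly with both n and d. That linear scan is exactly what an ANN index tries to avoid.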

**HNSW — Hierarchical Navigable Small World.** The dominant algorithm for in-memory ANN, used by sqlite-vec.

Intuition: imagine a multi-level skip list where each level is a proximity graph. The top level is sparse with long-range connections (fast coarse navigation). The bottom level is dense with short-range connections (precise local search). A query starts at the top, greedily moves to whichever neighbour is closest to the query, descends when stuck, and repeats down to the bottom level where the k nearest candidates are collected.
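The descent described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not how any particular library implements HNSW: the graph is hand-built rather than constructed by inserts, the points are one-dimensional, and the search returns only the single closest node instead of maintaining a candidate list. The names `points`, `layers`, `greedy_step`, and `layered_search` are hypothetical.

```python
import math

# Hypothetical toy data: 1-D points keep the geometry easy to follow.
points = {0: (0.0,), 1: (2.0,), 2: (4.0,), 3: (6.0,), 4: (8.0,), 5: (10.0,)}

# layers[0] is the sparse top level; the last entry is the dense bottom level.
# Every node present on a level is also present on all levels below it.
layers = [
    {0: [5], 5: [0]},                                              # long-range links
    {0: [2], 2: [0, 5], 5: [2]},                                   # medium-range links
    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]},  # short-range links
]


def greedy_step(layer, start, query):
    """Move to the closest neighbour until no neighbour improves on the current node."""
    current = start
    while True:
        best = min(layer[current], key=lambda n: math.dist(points[n], query))
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current  # stuck: local minimum on this level
        current = best


def layered_search(query, entry_point=0):
    """Run the greedy walk on each level, top to bottom, reusing the last node found."""
    node = entry_point
    for layer in layers:
        node = greedy_step(layer, node, query)
    return node


result = layered_search((6.9,))  # walks 0 -> 5 on the top level, 5 -> 2, then 2 -> 3
```

The top level covers most of the distance in one hop, and the lower levels refine the answer. A real index replaces the final greedy walk with a search over a candidate list whose size is set by `ef_search`.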

Key parameters:
- `M`: maximum number of graph connections (neighbours) per node on each level. Higher M → better recall, more memory, slower inserts. Typical: 16.
- `ef_construction`: candidate list size during index build. Higher → better index quality, slower build. Typical: 200.
- `ef_search`: candidate list size during query; it must be at least k. Higher → better recall, slower query. Defaults vary by library (pgvector's `hnsw.ef_search` defaults to 40).

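HNSW supports incremental inserts with no full rebuild. The graph links add roughly n·M·4 bytes (assuming 4-byte neighbour IDs) on top of the n·d·4 bytes needed to store float32 vectors.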
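As a quick sanity check on the memory estimate, the arithmetic can be written out. This sketch assumes float32 vectors and 4-byte neighbour IDs, and it uses the simple n·M·4 approximation for the graph, so treat the edge figure as a rough lower bound. The function name is hypothetical.

```python
def hnsw_memory_bytes(n, d, m, bytes_per_float=4, bytes_per_id=4):
    """Return (vector_bytes, edge_bytes) under the simple n*M*4 edge estimate."""
    vector_bytes = n * d * bytes_per_float  # raw vector storage
    edge_bytes = n * m * bytes_per_id       # approximate graph link storage
    return vector_bytes, edge_bytes


vector_bytes, edge_bytes = hnsw_memory_bytes(n=1_000_000, d=768, m=16)
```

At d=768 the vectors themselves (about 3 GB) dominate the graph links (about 64 MB). The graph overhead matters more for low-dimensional or compressed vectors.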

**IVFFlat — Inverted File with flat (uncompressed) vector storage.** A common approach for disk-based or GPU-accelerated ANN (available in Faiss and in pgvector, alongside HNSW).

Intuition: cluster the dataset into `nlist` Voronoi cells using k-means. At query time, find the `nprobe` nearest cell centroids, then do exact search within those cells only — skipping the rest of the dataset.
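The two phases above (assign vectors to cells, then probe only the nearest cells) can be sketched in Python. To keep the example short, the centroids are hard-coded as if k-means had already run; the data and the names `nearest_centroids`, `inverted_lists`, and `ivf_search` are illustrative assumptions, not any library's API.

```python
import heapq
import math
from collections import defaultdict

# Pretend k-means output (nlist = 3) and a toy dataset.
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
vectors = [(0.5, 0.2), (1.0, 1.5), (9.0, 9.5), (10.5, 10.0), (0.5, 9.0), (4.0, 6.0)]


def nearest_centroids(vec, count):
    """Indices of the `count` centroids closest to `vec`."""
    ranked = sorted(range(len(centroids)), key=lambda c: math.dist(centroids[c], vec))
    return ranked[:count]


# Build: the inverted lists map each cell to the ids of the vectors assigned to it.
inverted_lists = defaultdict(list)
for vec_id, vec in enumerate(vectors):
    inverted_lists[nearest_centroids(vec, 1)[0]].append(vec_id)


def ivf_search(query, k, nprobe):
    """Exact search restricted to the `nprobe` cells nearest to the query."""
    candidates = [
        vec_id
        for cell in nearest_centroids(query, nprobe)
        for vec_id in inverted_lists[cell]
    ]
    return heapq.nsmallest(k, candidates, key=lambda i: math.dist(vectors[i], query))


one_probe = ivf_search((3.0, 4.0), k=1, nprobe=1)   # searches only the cell around (0, 0)
two_probes = ivf_search((3.0, 4.0), k=1, nprobe=2)  # also searches the cell around (0, 10)
```

With `nprobe=1` the search returns vector 1, but the true nearest neighbour is vector 5, which sits just across a cell boundary. Raising `nprobe` to 2 finds it, at the cost of scanning twice as many vectors.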

Key parameters:
- `nlist`: number of clusters. Typical: √n.
- `nprobe`: number of clusters searched at query time. Higher → better recall, slower query.

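IVFFlat requires a training step (k-means over a representative sample) before data can be inserted. Later inserts are assigned to the nearest existing centroid, so recall degrades as the data distribution drifts and periodic retraining is needed. Lower memory than HNSW for the same n, since there is no graph to store.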

**sqlite-vec uses HNSW.** The `libsql_vector_idx` index type creates an HNSW index — which is why §6 can insert rows incrementally without a training step. The current API does not expose M or ef parameters; defaults are chosen for broad applicability.

**Summary table.**

| Property | HNSW | IVFFlat |
|---|---|---|
| Query time (distance computations) | ~O(log n), empirically | O(nlist + nprobe · n/nlist) |
| Insert | Incremental | Batch (requires training) |
| Memory | Higher (graph edges) | Lower |
| Recall@10 at defaults (indicative, dataset-dependent) | ~0.95+ | ~0.90+ (depends on nprobe) |
| Used by | sqlite-vec, Qdrant, Weaviate | pgvector, Faiss |
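
The recall figures in the table are measured against exact search. As a minimal sketch, recall@k is the fraction of the true top-k neighbours that the approximate search also returns. The function name and the example ID lists are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


exact = [7, 2, 9, 4]   # ids from a brute-force search, closest first
approx = [7, 9, 5, 4]  # ids from an ANN index for the same query
score = recall_at_k(approx, exact, k=4)
```

Here the ANN result misses one of the four true neighbours, so recall@4 is 0.75. In practice the score is averaged over many queries while varying `ef_search` or `nprobe` to chart the recall vs. latency trade-off.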