vibed/edu/.nbd/tickets/4c961f.md

+++
title = "§9 Generating Embeddings in Rust"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++
## §9 Generating Embeddings in Rust — Stub to fill

File: `edu/src/vector-db.md`, section `### 9. Generating Embeddings in Rust`

Replace this stub line with full section content:
> Before you can search by meaning, you need a way to convert text into vectors. [...] 🚧 Full content tracked in [nbd:4c961f].

This is a **reading lesson with short code snippets** — not a full exercise. The reader should finish with working embedding code they can use in §10–§12. Target 400–600 words.

## Learning objectives

- Know two approaches to generating embeddings in Rust: local model (fastembed) and HTTP API
- Understand how to call fastembed to embed strings locally with no API key
- Understand how to call an OpenAI-compatible embeddings endpoint
- Know which approach to use for the exercises and why (fastembed — offline, deterministic)
- Understand how embedding dimension affects the `F32_BLOB(d)` column type

## Content to write

**Option A — fastembed-rs (local, recommended for exercises).** No API key required, works offline, CPU-only, deterministic results.

Cargo.toml addition:
```toml
fastembed = "4"
```

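Basic usage (BGE-Small-EN-v1.5, 384 dimensions; the ONNX model, roughly 130 MB, is downloaded once on first run and cached, by default in a `.fastembed_cache` directory under the working directory unless `with_cache_dir` is set):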
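```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the model; the weights are downloaded on first run.
    let model = TextEmbedding::try_new(
        InitOptions::new(EmbeddingModel::BGESmallENV15)
            .with_show_download_progress(true),
    )?;

    // Embed both strings in one batched call.
    let docs = vec!["hello world", "Rust is fast"];
    let embeddings: Vec<Vec<f32>> = model.embed(docs, None)?;
    assert_eq!(embeddings[0].len(), 384);
    Ok(())
}
```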

Batch embedding (passing many strings in one call) is more efficient than embedding one string at a time, so embed the whole corpus with a single `model.embed()` call. The second argument is an optional batch size; `None` uses the library default.

**Option B — HTTP API (OpenAI-compatible).** Higher quality models, requires API key and network access.

Cargo.toml additions:
```toml
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```

Show a minimal async function that POSTs to the embeddings endpoint and returns `Vec<Vec<f32>>`. Include the request/response struct definitions with serde derives. API key from `std::env::var("OPENAI_API_KEY")`.

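**Choosing between them.** Recommend fastembed for §10–§12 because it needs no API key, has no network dependency after the one-time model download, gives deterministic results (important for exercises), and is fast on CPU (on the order of 100 ms per small batch; the exact figure depends on hardware, batch size, and text length). Use the HTTP approach when you need a specific production-grade model or already have one deployed.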

**Dimensionality note.** The `F32_BLOB(d)` column type must match the model's output dimension exactly; you cannot mix dimensions in one column. Change the DDL from the toy `F32_BLOB(3)` used in §6–§8 to `F32_BLOB(384)` for BGE-Small-EN-v1.5 (all-MiniLM-L6-v2 is also 384-dimensional), `F32_BLOB(768)` for BGE-Base-EN-v1.5, or `F32_BLOB(1536)` for OpenAI text-embedding-3-small. If you change the dimension, the vector index must also be dropped and recreated.
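One way to keep the column type and the model in sync is to derive both from a single constant and check every vector before it is inserted. The sketch below uses only the standard library; the table and column names (`docs`, `body`, `embedding`) and the helper names are hypothetical and should be adapted to the schema from §6–§8.

```rust
/// Output dimension of the embedding model in use (384 for BGE-Small-EN-v1.5).
const EMBEDDING_DIM: usize = 384;

/// Builds the table DDL for a given dimension.
/// The table and column names are placeholders for illustration.
fn create_table_sql(dim: usize) -> String {
    format!(
        "CREATE TABLE IF NOT EXISTS docs (\
         id INTEGER PRIMARY KEY, \
         body TEXT NOT NULL, \
         embedding F32_BLOB({dim}))"
    )
}

/// Rejects vectors whose length does not match the column dimension.
fn check_dim(embedding: &[f32], expected: usize) -> Result<(), String> {
    if embedding.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "expected {expected} dimensions, got {}",
            embedding.len()
        ))
    }
}
```

Calling `check_dim(&vector, EMBEDDING_DIM)?` before each insert surfaces a model or schema mismatch as a readable error in Rust code, before the mismatched vector reaches the database.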