+++
title = "§9 Generating Embeddings in Rust"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++
§9 Generating Embeddings in Rust — Stub to fill
File: edu/src/vector-db.md, section ### 9. Generating Embeddings in Rust
Replace this stub line with full section content:
Before you can search by meaning, you need a way to convert text into vectors. [...] 🚧 Full content tracked in [nbd:4c961f].
This is a reading lesson with short code snippets — not a full exercise. The reader should finish with working embedding code they can use in §10–§12. Target 400–600 words.
Learning objectives
- Know two approaches to generating embeddings in Rust: local model (fastembed) and HTTP API
- Understand how to call fastembed to embed strings locally with no API key
- Understand how to call an OpenAI-compatible embeddings endpoint
- Know which approach to use for the exercises and why (fastembed — offline, deterministic)
- Understand how embedding dimension affects the F32_BLOB(d) column type
Content to write
Option A — fastembed-rs (local, recommended for exercises). No API key required, works offline, CPU-only, deterministic results.
Cargo.toml addition:
fastembed = "4"
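Basic usage (BGE-Small-EN-v1.5, 384 dimensions, ~130MB model downloaded once on first run; fastembed caches it in a .fastembed_cache directory under the current working directory by default, configurable with InitOptions::with_cache_dir):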
use fastembed::{TextEmbedding, InitOptions, EmbeddingModel};
let model = TextEmbedding::try_new(
    InitOptions::new(EmbeddingModel::BGESmallENV15)
        .with_show_download_progress(true),
)?;
let docs = vec!["hello world", "Rust is fast"];
let embeddings: Vec<Vec<f32>> = model.embed(docs, None)?;
// embeddings[0].len() == 384
Batch embedding (passing multiple strings at once) is more efficient than embedding one at a time — use a single model.embed() call for the whole corpus.
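For a corpus too large to pass comfortably in one call, the same batching idea can be sketched without any embedding library. The helper below is hypothetical: `embed_in_batches` and its `embed_batch` closure are illustrative names, and the closure stands in for a real call such as `model.embed(batch.to_vec(), None)`. It is a sketch under those assumptions, not fastembed's own batching logic.

```rust
/// Embeds `texts` in chunks of at most `batch_size`, preserving input order.
/// `embed_batch` is a stand-in (assumption) for a real backend call; it must
/// return exactly one vector per input string.
pub fn embed_in_batches<F, E>(
    texts: &[&str],
    batch_size: usize,
    mut embed_batch: F,
) -> Result<Vec<Vec<f32>>, E>
where
    F: FnMut(&[&str]) -> Result<Vec<Vec<f32>>, E>,
{
    // `chunks` panics on a chunk size of zero, so clamp to at least one.
    let size = batch_size.max(1);
    let mut all = Vec::with_capacity(texts.len());
    for batch in texts.chunks(size) {
        // One backend call per batch instead of one call per string.
        all.extend(embed_batch(batch)?);
    }
    Ok(all)
}
```

With five strings and a batch size of two, the closure runs three times and the output keeps the original order, so row `i` of the result still corresponds to `texts[i]`.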
Option B — HTTP API (OpenAI-compatible). Higher quality models, requires API key and network access.
Cargo.toml additions:
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
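Show a minimal async function that POSTs to the /v1/embeddings endpoint (JSON body with model and input fields) and returns Vec<Vec<f32>>, reading each vector from data[i].embedding in the response. Include the request/response struct definitions with serde derives. API key from std::env::var("OPENAI_API_KEY"), sent as a Bearer token in the Authorization header. The function needs an async runtime; add tokio to Cargo.toml if the project does not already depend on it.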
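The API-key handling can be sketched on its own using only the standard library. The function name `bearer_header` is hypothetical. It takes the result of `std::env::var("OPENAI_API_KEY")` as a parameter so that a missing or empty key fails early with a readable message, before any request is sent.

```rust
use std::env::VarError;

/// Builds the value of the `Authorization` header for a bearer-token API.
/// Accepts the result of `std::env::var(...)` rather than reading the
/// environment itself, so the logic can be exercised in isolation.
pub fn bearer_header(key: Result<String, VarError>) -> Result<String, String> {
    match key {
        // Reject keys that are present but blank.
        Ok(k) if k.trim().is_empty() => Err("OPENAI_API_KEY is set but empty".to_string()),
        Ok(k) => Ok(format!("Bearer {}", k.trim())),
        // Covers both "not present" and "not valid Unicode".
        Err(e) => Err(format!("OPENAI_API_KEY: {e}")),
    }
}
```

A caller would typically write `bearer_header(std::env::var("OPENAI_API_KEY"))?` and pass the returned string as the `Authorization` header of the POST request.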
Choosing between them. Recommend fastembed for §10–§12 because: no API key, no network dependency after the first model download, deterministic results (important for exercises), and fast CPU inference (roughly sub-100ms for a small batch of short strings, depending on hardware). Use the HTTP approach when you need a specific production-grade model or have one already deployed.
Dimensionality note. The F32_BLOB(d) column type must match the model's output dimension exactly; you cannot mix dimensions in one column. Change the DDL from the toy F32_BLOB(3) used in §6–§8 to F32_BLOB(384) for BGE-Small (all-MiniLM-L6-v2 is also 384-dimensional), F32_BLOB(768) for a 768-dimensional model such as BGE-Base-EN-v1.5, or F32_BLOB(1536) for OpenAI text-embedding-3-small. If you change the dimension, the index must also be dropped and recreated.
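One way to keep the column type and the model in step is to derive the type from the dimension and to check every vector before inserting it. The sketch below uses only the standard library; `f32_blob_type` and `check_dimension` are hypothetical helper names, not part of any database or embedding crate.

```rust
/// Renders the column type for a `dim`-dimensional embedding,
/// for example `F32_BLOB(384)`.
pub fn f32_blob_type(dim: usize) -> String {
    format!("F32_BLOB({dim})")
}

/// Verifies that every embedding has exactly `expected` components.
/// Returns an error naming the first vector whose length differs.
pub fn check_dimension(embeddings: &[Vec<f32>], expected: usize) -> Result<(), String> {
    for (i, e) in embeddings.iter().enumerate() {
        if e.len() != expected {
            return Err(format!(
                "embedding {i} has {} dimensions, expected {expected}",
                e.len()
            ));
        }
    }
    Ok(())
}
```

Calling `check_dimension(&embeddings, 384)` before writing to an `F32_BLOB(384)` column turns a model mismatch into an explicit error in Rust rather than a failure at insert time.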