+++
title = "§9 Generating Embeddings in Rust"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++
## §9 Generating Embeddings in Rust: Stub to fill

File: `edu/src/vector-db.md`, section `### 9. Generating Embeddings in Rust`

Replace this stub line with full section content:
> Before you can search by meaning, you need a way to convert text into vectors. [...] 🚧 Full content tracked in [nbd:4c961f].

This is a **reading lesson with short code snippets**, not a full exercise. The reader should finish with working embedding code they can use in §10–§12. Target 400–600 words.

## Learning objectives

- Know two approaches to generating embeddings in Rust: a local model (fastembed) and an HTTP API
- Understand how to call fastembed to embed strings locally with no API key
- Understand how to call an OpenAI-compatible embeddings endpoint
- Know which approach to use for the exercises and why (fastembed: offline, deterministic)
- Understand how the embedding dimension determines the `F32_BLOB(d)` column type

## Content to write

**Option A: fastembed-rs (local, recommended for exercises).** No API key required, works offline, runs on CPU (no GPU needed), deterministic results.

Cargo.toml addition:
```toml
fastembed = "4"
```

Basic usage (BGE-Small-EN-v1.5, 384 dimensions, ~130MB model downloaded once on first run and cached in `.fastembed_cache/` under the working directory by default; the location can be changed with `InitOptions::with_cache_dir`):
```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Downloads the model on first run, then loads it from the local cache.
    let model = TextEmbedding::try_new(
        InitOptions::new(EmbeddingModel::BGESmallENV15)
            .with_show_download_progress(true),
    )?;

    // One call embeds the whole batch; `None` uses the default batch size.
    let docs = vec!["hello world", "Rust is fast"];
    let embeddings: Vec<Vec<f32>> = model.embed(docs, None)?;
    assert_eq!(embeddings[0].len(), 384);
    Ok(())
}
```

Batch embedding (passing multiple strings at once) is more efficient than embedding one at a time, so use a single `model.embed()` call for the whole corpus.

**Option B: HTTP API (OpenAI-compatible).** Higher-quality models, but requires an API key and network access.

Cargo.toml additions:
```toml
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
# Async runtime needed by reqwest; skip if the project already depends on tokio.
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
```

Show a minimal async function that POSTs to the embeddings endpoint and returns `Vec<Vec<f32>>`. Include the request/response struct definitions with serde derives. Read the API key from `std::env::var("OPENAI_API_KEY")`.

**Choosing between them.** Recommend fastembed for §10–§12 because it needs no API key and no network access, gives deterministic results (important for exercises), and is fast on CPU (typically under 100 ms per small batch, depending on hardware). Use the HTTP approach when you need a specific production-grade model or already have one deployed.

**Dimensionality note.** The `F32_BLOB(d)` column type must match the model's output dimension exactly; you cannot mix dimensions in one column. Change the DDL from the toy `F32_BLOB(3)` used in §6–§8 to `F32_BLOB(384)` for BGE-Small (all-MiniLM-L6-v2 is also 384), `F32_BLOB(768)` for a 768-dimension model such as BGE-Base-EN-v1.5, or `F32_BLOB(1536)` for OpenAI text-embedding-3-small. If you change the dimension, the vector index must also be dropped and recreated.
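To make the dimension rule concrete, here is a small sketch that derives the DDL from a single constant and rejects vectors of the wrong length before they reach the database. The table name `docs`, its columns, and the helper names are hypothetical, and the `[x,y,z]` text form assumes a libSQL-style `vector32(...)` constructor for `F32_BLOB` columns.

```rust
/// Output dimension of BGE-Small-EN-v1.5, the model recommended for the exercises.
const EMBEDDING_DIM: usize = 384;

/// Builds a CREATE TABLE statement whose vector column is sized to `dim`.
/// The table and column names are hypothetical.
fn create_table_sql(dim: usize) -> String {
    format!(
        "CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, body TEXT, embedding F32_BLOB({dim}))"
    )
}

/// Renders an embedding as `[x,y,z]` text, refusing vectors whose length
/// does not match the column dimension.
fn to_vector_literal(embedding: &[f32], dim: usize) -> Result<String, String> {
    if embedding.len() != dim {
        return Err(format!(
            "expected {dim} dimensions, got {}",
            embedding.len()
        ));
    }
    let parts: Vec<String> = embedding.iter().map(|x| x.to_string()).collect();
    Ok(format!("[{}]", parts.join(",")))
}
```

The resulting string would then be bound as a query parameter, for example in `INSERT INTO docs (body, embedding) VALUES (?1, vector32(?2))`, assuming that syntax matches the database used in §6–§8. Keeping the dimension in one constant means switching models only requires changing `EMBEDDING_DIM` and recreating the table and index.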