---
# edu-c98s
title: §9 Generating Embeddings in Rust
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:00Z
updated_at: 2026-03-10T23:30:00Z
---

## §9 Generating Embeddings in Rust — Stub to fill

File: `edu/src/vector-db.md`, section `### 9. Generating Embeddings in Rust`

Replace this stub line with full section content:
> Before you can search by meaning, you need a way to convert text into vectors. [...] 🚧 Full content tracked in [nbd:4c961f].

This is a **reading lesson with short code snippets**, not a full exercise. The reader should finish with working embedding code they can use in §10–§12. Target 400–600 words.

## Learning objectives

- Know two approaches to generating embeddings in Rust: local model (fastembed) and HTTP API
- Understand how to call fastembed to embed strings locally with no API key
- Understand how to call an OpenAI-compatible embeddings endpoint
- Know which approach to use for the exercises and why (fastembed: offline, deterministic)
- Understand how embedding dimension affects the `F32_BLOB(d)` column type

## Content to write

**Option A: fastembed-rs (local, recommended for exercises).** No API key required, works offline, CPU-only, deterministic results.

Cargo.toml addition:
```toml
fastembed = "4"
```

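Basic usage with BGE-Small-EN-v1.5 (384 dimensions). The model, roughly 130 MB, is downloaded once on first run and cached on disk. fastembed-rs caches it in a `.fastembed_cache` directory under the current working directory by default (configurable through `InitOptions`); verify the location for the version you pin: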
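```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load BGE-Small-EN-v1.5; the model files are fetched on the first run.
    let model = TextEmbedding::try_new(
        InitOptions::new(EmbeddingModel::BGESmallENV15)
            .with_show_download_progress(true),
    )?;

    // Embed the whole batch in one call; `None` keeps the default batch size.
    let docs = vec!["hello world", "Rust is fast"];
    let embeddings: Vec<Vec<f32>> = model.embed(docs, None)?;

    // One 384-dimensional vector per input string.
    assert_eq!(embeddings[0].len(), 384);
    Ok(())
}
```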
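
Batch embedding (passing multiple strings at once) is more efficient than embedding one string at a time: use a single `model.embed()` call for the whole corpus rather than one call per document. The second argument is an optional internal batch size; `None` keeps the default.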
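As a rough illustration of the single-call pattern, the sketch below makes one embedding call for the whole corpus and then pairs each vector with its source text. The names `embed_corpus` and `toy_embed` are hypothetical and introduced only for this example; the closure parameter stands in for a real batch embedder such as `model.embed(docs, None)`, and the sketch assumes the embedder returns one vector per input in input order.

```rust
/// Embeds every document with ONE call to `embed`, then pairs each vector
/// with the text it came from. `embed` is a stand-in for a real batch
/// embedder (assumption: one vector per input, in input order).
fn embed_corpus<F>(docs: &[&str], embed: F) -> Result<Vec<(String, Vec<f32>)>, String>
where
    F: FnOnce(Vec<&str>) -> Result<Vec<Vec<f32>>, String>,
{
    // Single call for the whole corpus instead of one call per document.
    let vectors = embed(docs.to_vec())?;
    if vectors.len() != docs.len() {
        return Err(format!(
            "expected {} embeddings, got {}",
            docs.len(),
            vectors.len()
        ));
    }
    Ok(docs.iter().map(|d| d.to_string()).zip(vectors).collect())
}

/// Toy embedder used only for demonstration: maps each string to a
/// 2-dimensional vector (byte length, number of spaces). Not a real model.
fn toy_embed(docs: Vec<&str>) -> Result<Vec<Vec<f32>>, String> {
    Ok(docs
        .iter()
        .map(|d| vec![d.len() as f32, d.matches(' ').count() as f32])
        .collect())
}
```

Calling `embed_corpus(&["hello world", "Rust is fast"], toy_embed)` yields two `(text, vector)` pairs. With a real model, the closure would wrap the `model.embed` call and convert its error into a `String`.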

**Option B: HTTP API (OpenAI-compatible).** Higher quality models, requires API key and network access.

Cargo.toml additions:
```toml
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```

Show a minimal async function that POSTs to the embeddings endpoint (`/v1/embeddings` on OpenAI-compatible servers) and returns `Vec<Vec<f32>>`. Include the request and response struct definitions with serde derives. Read the API key from `std::env::var("OPENAI_API_KEY")`. The async reqwest client needs a Tokio runtime, so add `tokio` to Cargo.toml if earlier sections have not already done so.

**Choosing between them.** Recommend fastembed for §10–§12 because it needs no API key, has no network dependency after the initial model download, gives deterministic results (important for exercises), and typically embeds a small batch in under 100 ms on CPU. Use the HTTP approach when you need a specific production-grade model or have one already deployed.

**Dimensionality note.** The `F32_BLOB(d)` column type must match the model's output dimension exactly; you cannot mix dimensions in one column. Change the DDL from the toy `F32_BLOB(3)` used in §6–§8 to `F32_BLOB(384)` for BGE-Small-EN-v1.5 (all-MiniLM-L6-v2 is also 384-dimensional), `F32_BLOB(768)` for a 768-dimensional model such as BGE-Base-EN-v1.5, or `F32_BLOB(1536)` for OpenAI text-embedding-3-small. If you change the dimension, the index must also be dropped and recreated.
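One way to catch a dimension mismatch before it reaches the database is to derive the DDL from a single constant and check every vector against it. The sketch below is illustrative only: the table name `docs`, the column names, and both helper functions are hypothetical and not part of the course code.

```rust
/// Output dimension of the embedding model in use (384 for BGE-Small-EN-v1.5).
const EMBEDDING_DIM: usize = 384;

/// Builds the CREATE TABLE statement so the column width always tracks
/// the model dimension. Table and column names are placeholders.
fn create_table_sql(dim: usize) -> String {
    format!(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding F32_BLOB({}))",
        dim
    )
}

/// Rejects any embedding whose length differs from the column dimension,
/// reporting the index of the first offending vector.
fn check_dimensions(embeddings: &[Vec<f32>], dim: usize) -> Result<(), String> {
    match embeddings.iter().position(|e| e.len() != dim) {
        Some(i) => Err(format!(
            "embedding {} has {} dimensions, expected {}",
            i,
            embeddings[i].len(),
            dim
        )),
        None => Ok(()),
    }
}
```

Running `check_dimensions(&embeddings, EMBEDDING_DIM)` before any insert turns a silent schema mismatch into an explicit error. Switching models then means changing `EMBEDDING_DIM` in one place and recreating the table and index.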