---
# edu-c98s
title: §9 Generating Embeddings in Rust
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:30:00Z
updated_at: 2026-03-10T23:30:00Z
---
## §9 Generating Embeddings in Rust — Stub to fill
File: `edu/src/vector-db.md`, section `### 9. Generating Embeddings in Rust`
Replace this stub line with full section content:
> Before you can search by meaning, you need a way to convert text into vectors. [...] 🚧 Full content tracked in [nbd:4c961f].
This is a **reading lesson with short code snippets**, not a full exercise. The reader should finish with working embedding code they can use in §10–§12. Target 400–600 words.
## Learning objectives
- Know two approaches to generating embeddings in Rust: local model (fastembed) and HTTP API
- Understand how to call fastembed to embed strings locally with no API key
- Understand how to call an OpenAI-compatible embeddings endpoint
- Know which approach to use for the exercises and why (fastembed — offline, deterministic)
- Understand how embedding dimension affects the `F32_BLOB(d)` column type
## Content to write
**Option A — fastembed-rs (local, recommended for exercises).** No API key required, works offline, CPU-only, deterministic results.
Cargo.toml addition:
```toml
fastembed = "4"
```
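Basic usage (BGE-Small-EN-v1.5, 384 dimensions, ~130MB model downloaded once on first run and cached locally; fastembed-rs defaults to a `.fastembed_cache` directory under the working directory, configurable with `with_cache_dir`):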
```rust
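use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Downloads the model on first run, then loads it from the local cache.
    let model = TextEmbedding::try_new(
        InitOptions::new(EmbeddingModel::BGESmallENV15)
            .with_show_download_progress(true),
    )?;

    // Embed the whole batch in one call; `None` uses the default batch size.
    let docs = vec!["hello world", "Rust is fast"];
    let embeddings: Vec<Vec<f32>> = model.embed(docs, None)?;
    // embeddings[0].len() == 384
    println!("{} vectors of {} dims", embeddings.len(), embeddings[0].len());
    Ok(())
}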
```
Batch embedding (passing multiple strings at once) is more efficient than embedding one at a time — use a single `model.embed()` call for the whole corpus.
**Option B — HTTP API (OpenAI-compatible).** Higher quality models, requires API key and network access.
Cargo.toml additions:
```toml
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```
Show a minimal async function that POSTs to the embeddings endpoint and returns `Vec<Vec<f32>>`. Include the request/response struct definitions with serde derives. API key from `std::env::var("OPENAI_API_KEY")`.
**Choosing between them.** Recommend fastembed for §10–§12 because it needs no API key, has no network dependency after the first model download, gives deterministic results (important for exercises), and is fast on CPU (roughly sub-100ms for a small batch on typical hardware). Use the HTTP approach when you need a specific production-grade model or have one already deployed.
**Dimensionality note.** The `F32_BLOB(d)` column type must match the model's output dimension exactly; you cannot mix dimensions in one column. Change the DDL from the toy `F32_BLOB(3)` used in §6–§8 to `F32_BLOB(384)` for BGE-Small (all-MiniLM-L6-v2 is also 384), `F32_BLOB(768)` for a 768-dimension model such as BGE-Base-EN-v1.5, or `F32_BLOB(1536)` for OpenAI text-embedding-3-small. If you change the dimension, the index must also be dropped and recreated.
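One way to keep the column type and the model in step is to derive the DDL from a single dimension value and check every vector against it before inserting. The sketch below uses only the standard library; the helper names `vector_table_ddl` and `check_dimension`, and the table and column names, are hypothetical placeholders, not part of any library.

```rust
/// Build the DDL for a table whose vector column holds `dim` floats.
/// Table and column names are illustrative placeholders.
fn vector_table_ddl(table: &str, dim: usize) -> String {
    format!(
        "CREATE TABLE IF NOT EXISTS {table} (\n    id INTEGER PRIMARY KEY,\n    body TEXT NOT NULL,\n    embedding F32_BLOB({dim})\n);"
    )
}

/// Reject a vector whose length differs from the column dimension.
fn check_dimension(vector: &[f32], dim: usize) -> Result<(), String> {
    if vector.len() == dim {
        Ok(())
    } else {
        Err(format!("expected {dim} dimensions, got {}", vector.len()))
    }
}
```

For example, `vector_table_ddl("docs", 384)` yields a statement containing `F32_BLOB(384)`, and `check_dimension(&v, 384)` returns an error for a 3-element toy vector left over from the earlier sections.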