+++
title = "§9 Generating Embeddings in Rust"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++

§9 Generating Embeddings in Rust — Stub to fill

File: edu/src/vector-db.md, section ### 9. Generating Embeddings in Rust

Replace this stub line with full section content:

Before you can search by meaning, you need a way to convert text into vectors. [...] 🚧 Full content tracked in [nbd:4c961f].

This is a reading lesson with short code snippets, not a full exercise. The reader should finish with working embedding code they can use in §10–§12. Target 400–600 words.

Learning objectives

  • Know two approaches to generating embeddings in Rust: local model (fastembed) and HTTP API
  • Understand how to call fastembed to embed strings locally with no API key
  • Understand how to call an OpenAI-compatible embeddings endpoint
  • Know which approach to use for the exercises and why (fastembed — offline, deterministic)
  • Understand how embedding dimension affects the F32_BLOB(d) column type

Content to write

Option A — fastembed-rs (local, recommended for exercises). No API key required, works offline, CPU-only, deterministic results.

Cargo.toml addition:

fastembed = "4"

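Basic usage (BGE-Small-EN-v1.5, 384 dimensions, ~130MB model downloaded once on first run; fastembed-rs caches it in a .fastembed_cache directory under the current working directory by default, not ~/.cache/huggingface/hub/, and InitOptions::with_cache_dir can change the location):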

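use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load BGE-Small-EN-v1.5; the model files are downloaded on first use.
    let model = TextEmbedding::try_new(
        InitOptions::new(EmbeddingModel::BGESmallENV15)
            .with_show_download_progress(true),
    )?;

    // Embed the whole batch in one call; `None` selects the default batch size.
    let docs = vec!["hello world", "Rust is fast"];
    let embeddings: Vec<Vec<f32>> = model.embed(docs, None)?;

    // One 384-dimensional vector per input string.
    assert_eq!(embeddings.len(), 2);
    assert_eq!(embeddings[0].len(), 384);
    Ok(())
}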

Batch embedding (passing multiple strings at once) is more efficient than embedding one at a time — use a single model.embed() call for the whole corpus.
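The single-call advice can be wrapped in a small helper that also catches a mismatched result early. The following is a minimal sketch, not part of fastembed: `embed_corpus` is a hypothetical name, and the `embed_batch` closure stands in for a call such as `model.embed(texts, None)`, so the helper itself needs only the standard library.

```rust
/// Hypothetical wrapper: embed the whole corpus with ONE batch call and
/// confirm the backend returned exactly one vector per input text.
/// `embed_batch` is a stand-in for the real embedding call.
fn embed_corpus<F, E>(texts: &[&str], mut embed_batch: F) -> Result<Vec<Vec<f32>>, String>
where
    F: FnMut(&[&str]) -> Result<Vec<Vec<f32>>, E>,
    E: std::fmt::Display,
{
    // Nothing to embed, so skip the backend call entirely.
    if texts.is_empty() {
        return Ok(Vec::new());
    }

    // A single call for the full corpus, rather than one call per string.
    let vectors = embed_batch(texts).map_err(|e| format!("embedding failed: {e}"))?;

    if vectors.len() != texts.len() {
        return Err(format!(
            "expected {} vectors, got {}",
            texts.len(),
            vectors.len()
        ));
    }

    // Every vector produced by one model should share the same dimension.
    let dim = vectors[0].len();
    if vectors.iter().any(|v| v.len() != dim) {
        return Err(format!("vectors do not all have dimension {dim}"));
    }

    Ok(vectors)
}
```

In the lesson, the closure body would be the real embedding call; passing a fake closure keeps the helper easy to test without downloading a model.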

Option B — HTTP API (OpenAI-compatible). Higher quality models, requires API key and network access.

Cargo.toml additions:

reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"

Show a minimal async function that POSTs to the embeddings endpoint and returns Vec<Vec<f32>>. Include the request/response struct definitions with serde derives. API key from std::env::var("OPENAI_API_KEY").
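One detail worth covering alongside that function: each element of the response's `data` array carries an `index` field next to its `embedding`, so the vectors can be put back in input order before they are returned. The sketch below handles only that post-processing step and uses the standard library alone. `EmbeddingItem` and `into_ordered_vectors` are hypothetical names, and in the real function the struct would also derive `serde::Deserialize`.

```rust
/// One element of the `data` array in an embeddings response.
/// Field names follow the OpenAI-style response shape; the serde derive
/// is omitted here so the sketch compiles without external crates.
#[derive(Debug, Clone, PartialEq)]
struct EmbeddingItem {
    index: usize,
    embedding: Vec<f32>,
}

/// Hypothetical helper: turn response items into vectors ordered like
/// the input strings, rejecting missing or duplicated indices.
fn into_ordered_vectors(
    mut items: Vec<EmbeddingItem>,
    expected: usize,
) -> Result<Vec<Vec<f32>>, String> {
    if items.len() != expected {
        return Err(format!(
            "expected {expected} embeddings, got {}",
            items.len()
        ));
    }

    // Sort by the server-reported position of each input.
    items.sort_by_key(|item| item.index);

    // After sorting, indices must be exactly 0, 1, 2, ... with no gaps.
    for (pos, item) in items.iter().enumerate() {
        if item.index != pos {
            return Err(format!("missing or duplicate index at position {pos}"));
        }
    }

    Ok(items.into_iter().map(|item| item.embedding).collect())
}
```

The async function would deserialize the response body, pass its `data` items and the number of inputs to this helper, and return the resulting `Vec<Vec<f32>>`.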

Choosing between them. Recommend fastembed for §10–§12 because it needs no API key, has no network dependency once the model is downloaded, gives deterministic results (important for exercises), and embeds a small batch in under 100ms on CPU (hardware dependent). Use the HTTP approach when you need a specific production-grade model or already have one deployed.

Dimensionality note. The F32_BLOB(d) column type must match the model's output dimension exactly; you cannot mix dimensions in one column. Change the DDL from the toy F32_BLOB(3) used in §6–§8 to F32_BLOB(384) for BGE-Small (all-MiniLM-L6-v2 is also 384-dimensional), F32_BLOB(768) for BGE-Base-EN-v1.5, or F32_BLOB(1536) for OpenAI text-embedding-3-small. If you change the dimension, the index must also be dropped and recreated.
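One way to keep the column type and the model in sync is to derive the DDL from the dimension and check each vector before inserting it. This is a minimal sketch with hypothetical names throughout: the helpers `create_table_sql` and `check_dim`, and the columns `id`, `body`, and `embedding`, are illustrative and not taken from the earlier sections.

```rust
/// Hypothetical helper: build the table DDL for a given embedding dimension,
/// e.g. 384 for BGE-Small or 1536 for text-embedding-3-small.
/// The column names are assumptions made for illustration.
fn create_table_sql(table: &str, dim: usize) -> String {
    format!(
        "CREATE TABLE {table} (id INTEGER PRIMARY KEY, body TEXT NOT NULL, embedding F32_BLOB({dim}))"
    )
}

/// Hypothetical guard: reject a vector whose length does not match the
/// dimension declared in the column type, before it reaches the database.
fn check_dim(embedding: &[f32], dim: usize) -> Result<(), String> {
    if embedding.len() == dim {
        Ok(())
    } else {
        Err(format!(
            "embedding has {} dimensions, but the column is F32_BLOB({dim})",
            embedding.len()
        ))
    }
}
```

Calling `check_dim` on every vector before the insert turns a silent schema mismatch into an explicit error message that names both dimensions.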