+++
title = "§7 Exercise 1: Storing and Retrieving Vectors"
priority = 5
status = "todo"
ticket_type = "task"
dependencies = []
+++
§7 Exercise 1 — Storing and Retrieving Vectors — Stub to fill
File: edu/src/vector-db.md, section ### 7. Exercise 1 — Storing and Retrieving Vectors
Replace this stub line with the full exercise:
Goal: Insert a small set of labelled vectors [...] 🚧 Full content tracked in [nbd:081a55].
Follow the exercise format from edu/src/markov.md: Goal, Setup, Starter Code skeleton, numbered Steps, Reference Solution in <details><summary>Show full solution</summary>.
Prerequisites (established in §6)
The reader has the vec-demo project with libsql = "0.9" and tokio as dependencies. The main function opens a local database via Builder::new_local("vectors.db").build().await?, obtains a connection from it with connect(), and has already created the items table (id INTEGER PRIMARY KEY, label TEXT NOT NULL, embedding F32_BLOB(3) NOT NULL) and the vector index on embedding.
Goal
Insert 6 labelled 3-dimensional vectors, then SELECT all rows and print each label alongside its deserialized Vec<f32>.
Vectors to use
| id | label | embedding |
|---|---|---|
| 1 | "cat" | [0.9, 0.1, 0.2] |
| 2 | "dog" | [0.8, 0.2, 0.3] |
| 3 | "car" | [0.1, 0.9, 0.1] |
| 4 | "truck" | [0.2, 0.8, 0.2] |
| 5 | "python" | [0.15, 0.1, 0.95] |
| 6 | "rust" | [0.1, 0.05, 0.9] |
Hand-crafted so animals cluster near [high, low, low], vehicles near [low, high, low], and programming languages near [low, low, high]. The §8 KNN exercise uses these clusters to verify correct nearest-neighbour results.
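The clustering claim can be sanity-checked without a database. The sketch below assumes cosine similarity as the measure (the metric §8 actually uses is not stated in this ticket), and the names ROWS, cosine_similarity and nearest are illustrative, not part of the exercise.

```rust
/// The six exercise vectors, in table order (index 0 = "cat").
const ROWS: [(&str, [f32; 3]); 6] = [
    ("cat", [0.9, 0.1, 0.2]),
    ("dog", [0.8, 0.2, 0.3]),
    ("car", [0.1, 0.9, 0.1]),
    ("truck", [0.2, 0.8, 0.2]),
    ("python", [0.15, 0.1, 0.95]),
    ("rust", [0.1, 0.05, 0.9]),
];

/// Cosine similarity between two equal-length, non-zero vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Index of the row most similar to `rows[query]`, excluding the query itself.
fn nearest(rows: &[(&str, [f32; 3])], query: usize) -> usize {
    let mut best = (usize::MAX, f32::NEG_INFINITY);
    for (i, (_, v)) in rows.iter().enumerate() {
        if i == query {
            continue;
        }
        let sim = cosine_similarity(&rows[query].1, v);
        if sim > best.1 {
            best = (i, sim);
        }
    }
    best.0
}

fn main() {
    // Each label should pair with the other member of its cluster.
    for (i, (label, _)) in ROWS.iter().enumerate() {
        println!("{label} -> {}", ROWS[nearest(&ROWS, i)].0);
    }
}
```

Under this assumption every label's nearest neighbour is the other member of its pair, which is the property the §8 exercise depends on.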
Steps to cover
Step 1 — Formatting a vector for INSERT. Explain that vector(?) in SQL accepts a JSON array string. Show how to format a Vec<f32> in Rust:
fn vec_to_json(v: &[f32]) -> String {
format!("[{}]", v.iter().map(|x| x.to_string()).collect::<Vec<_>>().join(","))
}
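A short usage sketch for the helper above. It also shows one caveat worth mentioning in the exercise text: non-finite values format as text that is not a valid JSON number. The guard is_valid_embedding is a hypothetical helper added for illustration, with the length 3 taken from F32_BLOB(3).

```rust
fn vec_to_json(v: &[f32]) -> String {
    format!("[{}]", v.iter().map(|x| x.to_string()).collect::<Vec<_>>().join(","))
}

/// Hypothetical guard: exactly three finite components, matching F32_BLOB(3).
fn is_valid_embedding(v: &[f32]) -> bool {
    v.len() == 3 && v.iter().all(|x| x.is_finite())
}

fn main() {
    let cat = [0.9_f32, 0.1, 0.2];
    assert!(is_valid_embedding(&cat));
    println!("{}", vec_to_json(&cat)); // prints [0.9,0.1,0.2]

    // NaN formats as "NaN", which is not a JSON number, so reject it up front.
    let bad = [f32::NAN, 0.0, 1.0];
    assert!(!is_valid_embedding(&bad));
    println!("{}", vec_to_json(&bad)); // prints [NaN,0,1]
}
```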
Step 2 — Inserting rows. Use INSERT OR IGNORE so re-running the program is idempotent:
INSERT OR IGNORE INTO items (id, label, embedding) VALUES (?, ?, vector(?))
Loop over a Vec<(i64, &str, Vec<f32>)> and call conn.execute for each row, passing id, label, and the JSON string as parameters.
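The data side of this step can be sketched without the database. The block below only prepares the three values bound per row. The names INSERT_SQL, seed_rows and bind_values are illustrative, and the actual conn.execute call is left to the exercise, since it needs the libsql crate and an async runtime.

```rust
/// The statement from Step 2; its three `?` placeholders are id, label, JSON text.
const INSERT_SQL: &str =
    "INSERT OR IGNORE INTO items (id, label, embedding) VALUES (?, ?, vector(?))";

fn vec_to_json(v: &[f32]) -> String {
    format!("[{}]", v.iter().map(|x| x.to_string()).collect::<Vec<_>>().join(","))
}

/// The six rows from the "Vectors to use" table.
fn seed_rows() -> Vec<(i64, &'static str, Vec<f32>)> {
    vec![
        (1, "cat", vec![0.9, 0.1, 0.2]),
        (2, "dog", vec![0.8, 0.2, 0.3]),
        (3, "car", vec![0.1, 0.9, 0.1]),
        (4, "truck", vec![0.2, 0.8, 0.2]),
        (5, "python", vec![0.15, 0.1, 0.95]),
        (6, "rust", vec![0.1, 0.05, 0.9]),
    ]
}

/// Turn each row into the (id, label, JSON string) triple to bind.
fn bind_values(rows: &[(i64, &str, Vec<f32>)]) -> Vec<(i64, String, String)> {
    rows.iter()
        .map(|(id, label, emb)| (*id, label.to_string(), vec_to_json(emb)))
        .collect()
}

fn main() {
    for (id, label, json) in bind_values(&seed_rows()) {
        // In the exercise, these three values are the parameters passed
        // to conn.execute together with INSERT_SQL.
        println!("{INSERT_SQL} <- ({id}, {label:?}, {json:?})");
    }
}
```

Because the ids are fixed and the statement uses INSERT OR IGNORE, a second run hits the primary-key conflict on every row and inserts nothing.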
Step 3 — Selecting and deserialising. Query all rows:
SELECT id, label, vector_extract(embedding) FROM items ORDER BY id
vector_extract returns a JSON array string (e.g. "[0.9,0.1,0.2]"). Add serde_json = "1" to Cargo.toml and parse it: serde_json::from_str::<Vec<f32>>(&json_str)?.
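To make the expected text shape concrete, here is a dependency-free stand-in for the parsing step. It is not the serde_json approach the exercise asks for, only a std-only sketch that assumes the "[0.9,0.1,0.2]" format quoted above. The name parse_vector_text is hypothetical.

```rust
/// Parse text such as "[0.9,0.1,0.2]" into a Vec<f32>.
/// Std-only stand-in for serde_json::from_str::<Vec<f32>>.
fn parse_vector_text(s: &str) -> Result<Vec<f32>, String> {
    // Require the surrounding brackets, then work on what is between them.
    let inner = s
        .trim()
        .strip_prefix('[')
        .and_then(|rest| rest.strip_suffix(']'))
        .ok_or_else(|| format!("not a bracketed array: {s:?}"))?;

    if inner.trim().is_empty() {
        return Ok(Vec::new());
    }

    // Split on commas and parse each component, tolerating whitespace.
    inner
        .split(',')
        .map(|part| {
            part.trim()
                .parse::<f32>()
                .map_err(|e| format!("bad element {part:?}: {e}"))
        })
        .collect()
}

fn main() {
    match parse_vector_text("[0.9,0.1,0.2]") {
        Ok(v) => println!("{v:?}"), // prints [0.9, 0.1, 0.2]
        Err(e) => eprintln!("parse failed: {e}"),
    }
}
```

The serde_json call remains the better fit for the written exercise, since it handles the full JSON number grammar and reports structured errors.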
Step 4 — Print results. Format output as:
1 cat [0.9, 0.1, 0.2]
2 dog [0.8, 0.2, 0.3]
...
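One way to produce this output is the Debug formatter for slices, which already inserts ", " between elements. The function name format_row is illustrative.

```rust
/// Format one result row as "<id> <label> <embedding>".
fn format_row(id: i64, label: &str, embedding: &[f32]) -> String {
    // {:?} on a slice of f32 renders as "[0.9, 0.1, 0.2]".
    format!("{id} {label} {embedding:?}")
}

fn main() {
    println!("{}", format_row(1, "cat", &[0.9, 0.1, 0.2])); // prints 1 cat [0.9, 0.1, 0.2]
    println!("{}", format_row(2, "dog", &[0.8, 0.2, 0.3])); // prints 2 dog [0.8, 0.2, 0.3]
}
```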
Cargo.toml additions
Add serde_json = "1" for JSON array parsing.