vibed/edu/.beans/edu-pdeo--7-exercise-1-stor...

title: §7 Exercise 1: Storing and Retrieving Vectors
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:29:59Z
updated_at: 2026-03-10T23:29:59Z

§7 Exercise 1 — Storing and Retrieving Vectors — Stub to fill

File: edu/src/vector-db.md, section ### 7. Exercise 1 — Storing and Retrieving Vectors

Replace this stub line with the full exercise:

Goal: Insert a small set of labelled vectors [...] 🚧 Full content tracked in [nbd:081a55].

Follow the exercise format from edu/src/markov.md: Goal, Setup, Starter Code skeleton, numbered Steps, Reference Solution in <details><summary>Show full solution</summary>.

Prerequisites (established in §6)

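Reader has the vec-demo project with libsql = "0.9" and tokio. The main function opens a local database via Builder::new_local("vectors.db").build().await?, obtains a connection from it with connect()?, and has already created the items table (id INTEGER PRIMARY KEY, label TEXT NOT NULL, embedding F32_BLOB(3) NOT NULL) and its vector index. Note that libSQL's built-in vector index (created with libsql_vector_idx) is DiskANN-based rather than HNSW, so the exercise text should refer to it as the vector index.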

Goal

Insert 6 labelled 3-dimensional vectors, then SELECT all rows and print each label alongside its deserialized Vec<f32>.

Vectors to use

| id | label    | embedding         |
|----|----------|-------------------|
| 1  | "cat"    | [0.9, 0.1, 0.2]   |
| 2  | "dog"    | [0.8, 0.2, 0.3]   |
| 3  | "car"    | [0.1, 0.9, 0.1]   |
| 4  | "truck"  | [0.2, 0.8, 0.2]   |
| 5  | "python" | [0.15, 0.1, 0.95] |
| 6  | "rust"   | [0.1, 0.05, 0.9]  |

Hand-crafted so animals cluster near [high, low, low], vehicles near [low, high, low], and programming languages near [low, low, high]. The §8 KNN exercise uses these clusters to verify correct nearest-neighbour results.
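The clustering claim can be checked offline before any database is involved. The following is a minimal sketch that assumes cosine similarity as the metric (this stub does not say which distance §8 uses); the helper names `cosine`, `nearest`, and `items` are hypothetical and not part of the exercise.

```rust
/// Cosine similarity between two equal-length vectors.
/// Assumption: cosine is the metric of interest; §8 may use a different one.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Label of the item most similar to `items[i]`, excluding the item itself.
fn nearest<'a>(items: &[(&'a str, [f32; 3])], i: usize) -> &'a str {
    let mut best: Option<(f32, &'a str)> = None; // (similarity, label)
    for (j, (label, v)) in items.iter().enumerate() {
        if j == i {
            continue;
        }
        let s = cosine(&items[i].1, v);
        if best.map_or(true, |(best_s, _)| s > best_s) {
            best = Some((s, *label));
        }
    }
    best.expect("need at least two items").1
}

/// The six hand-crafted vectors from the table above.
fn items() -> Vec<(&'static str, [f32; 3])> {
    vec![
        ("cat", [0.9, 0.1, 0.2]),
        ("dog", [0.8, 0.2, 0.3]),
        ("car", [0.1, 0.9, 0.1]),
        ("truck", [0.2, 0.8, 0.2]),
        ("python", [0.15, 0.1, 0.95]),
        ("rust", [0.1, 0.05, 0.9]),
    ]
}

fn main() {
    let items = items();
    for i in 0..items.len() {
        println!("{:<7} -> {}", items[i].0, nearest(&items, i));
    }
}
```

Under this metric each label's nearest neighbour is its cluster partner (cat and dog, car and truck, python and rust), which is the property the §8 exercise relies on.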

Steps to cover

Step 1 — Formatting a vector for INSERT. Explain that vector(?) in SQL accepts a JSON array string. Show how to format a Vec<f32> in Rust:

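/// Formats a slice of f32 as a compact JSON array string, e.g. "[0.9,0.1,0.2]".
/// Non-finite values (NaN, infinity) would not produce valid JSON.
fn vec_to_json(v: &[f32]) -> String {
    let parts: Vec<String> = v.iter().map(|x| x.to_string()).collect();
    format!("[{}]", parts.join(","))
}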
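To show what the helper produces, and where it can go wrong, here is a small usage sketch. The `vec_to_json_checked` wrapper is a hypothetical addition, not something the exercise requires; it only illustrates that NaN and infinite components do not serialise to valid JSON.

```rust
/// Same formatting as the Step 1 helper: comma-separated values in brackets.
fn vec_to_json(v: &[f32]) -> String {
    format!("[{}]", v.iter().map(|x| x.to_string()).collect::<Vec<_>>().join(","))
}

/// Hypothetical guard: returns None if any component is NaN or infinite,
/// because those would be written as "NaN" or "inf", which is not valid JSON.
fn vec_to_json_checked(v: &[f32]) -> Option<String> {
    if v.iter().all(|x| x.is_finite()) {
        Some(vec_to_json(v))
    } else {
        None
    }
}

fn main() {
    println!("{}", vec_to_json(&[0.9, 0.1, 0.2])); // prints [0.9,0.1,0.2]
    println!("{}", vec_to_json(&[1.0, 2.0])); // prints [1,2] (whole floats drop the ".0")
    println!("{}", vec_to_json(&[f32::NAN])); // prints [NaN], which is not valid JSON
    println!("{:?}", vec_to_json_checked(&[f32::NAN])); // prints None
}
```

For the six vectors in this exercise the guard is unnecessary, since every component is a finite literal.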

Step 2 — Inserting rows. Use INSERT OR IGNORE so re-running the program is idempotent:

INSERT OR IGNORE INTO items (id, label, embedding) VALUES (?, ?, vector(?))

Loop over a Vec<(i64, &str, Vec<f32>)> and call conn.execute for each row, passing id, label, and the JSON string as parameters.
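The loop described above might be prepared as in the following sketch. It builds the dataset and the three bind values per row using only the standard library; the actual database call appears as a comment because it needs the libsql crate and an async runtime. The names `rows`, `insert_params`, and `INSERT_SQL` are hypothetical, and the commented `conn.execute` line assumes a libsql `Connection` named `conn` from §6.

```rust
/// Same formatting as the Step 1 helper.
fn vec_to_json(v: &[f32]) -> String {
    format!("[{}]", v.iter().map(|x| x.to_string()).collect::<Vec<_>>().join(","))
}

/// The statement from Step 2, with three positional placeholders.
const INSERT_SQL: &str =
    "INSERT OR IGNORE INTO items (id, label, embedding) VALUES (?, ?, vector(?))";

/// The six rows from the table, in the shape Step 2 describes.
fn rows() -> Vec<(i64, &'static str, Vec<f32>)> {
    vec![
        (1, "cat", vec![0.9, 0.1, 0.2]),
        (2, "dog", vec![0.8, 0.2, 0.3]),
        (3, "car", vec![0.1, 0.9, 0.1]),
        (4, "truck", vec![0.2, 0.8, 0.2]),
        (5, "python", vec![0.15, 0.1, 0.95]),
        (6, "rust", vec![0.1, 0.05, 0.9]),
    ]
}

/// One (id, label, json) triple per row, in bind order.
fn insert_params() -> Vec<(i64, &'static str, String)> {
    rows()
        .into_iter()
        .map(|(id, label, v)| (id, label, vec_to_json(&v)))
        .collect()
}

fn main() {
    for (id, label, json) in insert_params() {
        // In the vec-demo project, this is where the row would be written, e.g.:
        // conn.execute(INSERT_SQL, libsql::params![id, label, json]).await?;
        println!("{} <- ({}, {:?}, {:?})", INSERT_SQL, id, label, json);
    }
}
```

Keeping the parameter preparation separate from the database call makes it easy to unit-test the formatting without opening vectors.db.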

Step 3 — Selecting and deserialising. Query all rows:

SELECT id, label, vector_extract(embedding) FROM items ORDER BY id

vector_extract returns a JSON array string (e.g. "[0.9,0.1,0.2]"). Add serde_json = "1" to Cargo.toml and parse it: serde_json::from_str::<Vec<f32>>(&json_str)?.

Step 4 — Print results. Format output as:

1  cat     [0.9, 0.1, 0.2]
2  dog     [0.8, 0.2, 0.3]
...
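One way to produce this layout is sketched below. The column widths (3 for the id, 8 for the label) are inferred from the sample output rather than specified anywhere, and `format_row` is a hypothetical helper name.

```rust
/// Formats one result row: left-aligned id (width 3), left-aligned label
/// (width 8), then the vector in Rust's Debug form, which yields "[a, b, c]".
/// The widths are an assumption inferred from the sample output.
fn format_row(id: i64, label: &str, v: &[f32]) -> String {
    format!("{:<3}{:<8}{:?}", id, label, v)
}

fn main() {
    println!("{}", format_row(1, "cat", &[0.9, 0.1, 0.2])); // 1  cat     [0.9, 0.1, 0.2]
    println!("{}", format_row(5, "python", &[0.15, 0.1, 0.95])); // 5  python  [0.15, 0.1, 0.95]
}
```

Using the Debug form for the vector gives the comma-and-space separators shown in the expected output without any manual joining.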

Cargo.toml additions

Add serde_json = "1" for JSON array parsing.