---
# edu-pdeo
title: '§7 Exercise 1: Storing and Retrieving Vectors'
status: completed
type: task
priority: normal
created_at: 2026-03-10T23:29:59Z
updated_at: 2026-03-10T23:29:59Z
---
## §7 Exercise 1 — Storing and Retrieving Vectors — Stub to fill
File: `edu/src/vector-db.md`, section `### 7. Exercise 1 — Storing and Retrieving Vectors`
Replace this stub line with the full exercise:
> **Goal:** Insert a small set of labelled vectors [...] 🚧 Full content tracked in [nbd:081a55].
Follow the exercise format from `edu/src/markov.md`: Goal, Setup, Starter Code skeleton, numbered Steps, Reference Solution in `<details><summary>Show full solution</summary>`.
## Prerequisites (established in §6)
The reader has the `vec-demo` project with `libsql = "0.9"` and `tokio`. The `main` function opens a local connection via `Builder::new_local("vectors.db").build().await?` and has already created the `items` table (`id INTEGER PRIMARY KEY, label TEXT NOT NULL, embedding F32_BLOB(3) NOT NULL`) and its vector index.
## Goal
Insert six labelled 3-dimensional vectors, then `SELECT` all rows and print each label alongside its deserialised `Vec<f32>`.
## Vectors to use
| id | label | embedding |
|---|---|---|
| 1 | "cat" | [0.9, 0.1, 0.2] |
| 2 | "dog" | [0.8, 0.2, 0.3] |
| 3 | "car" | [0.1, 0.9, 0.1] |
| 4 | "truck" | [0.2, 0.8, 0.2] |
| 5 | "python" | [0.15, 0.1, 0.95] |
| 6 | "rust" | [0.1, 0.05, 0.9] |
The vectors are hand-crafted so that animals cluster near [high, low, low], vehicles near [low, high, low], and programming languages near [low, low, high]. The §8 KNN exercise uses these clusters to verify correct nearest-neighbour results.
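As a sanity check on the table, the sketch below finds each label's nearest neighbour in plain Rust, with no database involved. It assumes cosine distance as the metric (§8 may use a different one), and `cosine_distance`, `nearest_label`, and `demo_rows` are illustrative names that do not appear elsewhere in the exercise.

```rust
/// Cosine distance: 1 - (a . b) / (|a| * |b|). Smaller means more similar.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}

/// The six labelled vectors from the table (ids omitted for brevity).
fn demo_rows() -> Vec<(&'static str, [f32; 3])> {
    vec![
        ("cat", [0.9, 0.1, 0.2]),
        ("dog", [0.8, 0.2, 0.3]),
        ("car", [0.1, 0.9, 0.1]),
        ("truck", [0.2, 0.8, 0.2]),
        ("python", [0.15, 0.1, 0.95]),
        ("rust", [0.1, 0.05, 0.9]),
    ]
}

/// Returns the label of the closest *other* row, or None if the label is unknown.
fn nearest_label(rows: &[(&'static str, [f32; 3])], query_label: &str) -> Option<&'static str> {
    let (_, query) = rows.iter().find(|(l, _)| *l == query_label)?;
    rows.iter()
        .filter(|(l, _)| *l != query_label)
        .min_by(|a, b| {
            cosine_distance(query, &a.1).total_cmp(&cosine_distance(query, &b.1))
        })
        .map(|(l, _)| *l)
}
```

Under this metric, `nearest_label(&demo_rows(), "cat")` yields `Some("dog")`, and the vehicle and language pairs match each other in the same way.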
## Steps to cover
**Step 1 — Formatting a vector for INSERT.** Explain that the `vector(?)` SQL function accepts a JSON array string such as `[0.9,0.1,0.2]` and converts it to the column's blob format. Show how to format a `Vec<f32>` in Rust:
```rust
fn vec_to_json(v: &[f32]) -> String {
    format!("[{}]", v.iter().map(|x| x.to_string()).collect::<Vec<_>>().join(","))
}
```
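One edge case the exercise could mention: `f32::to_string` renders `NaN` and infinities as `NaN` and `inf`, which are not valid JSON numbers. The following is a minimal sketch of a stricter variant; `vec_to_json_checked` is a hypothetical name, and the exercise's six vectors never hit this path.

```rust
/// Hypothetical stricter variant of `vec_to_json`:
/// returns None if any component is NaN or infinite.
fn vec_to_json_checked(v: &[f32]) -> Option<String> {
    if v.iter().any(|x| !x.is_finite()) {
        return None;
    }
    // Same formatting as `vec_to_json`: comma-separated, no spaces.
    let parts: Vec<String> = v.iter().map(|x| x.to_string()).collect();
    Some(format!("[{}]", parts.join(",")))
}
```

For the row "cat" this returns `Some("[0.9,0.1,0.2]")`, identical to what the unchecked helper produces.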
**Step 2 — Inserting rows.** Use `INSERT OR IGNORE` so re-running the program is idempotent:
```sql
INSERT OR IGNORE INTO items (id, label, embedding) VALUES (?, ?, vector(?))
```
Loop over a `Vec<(i64, &str, Vec<f32>)>` and call `conn.execute` for each row, passing id, label, and the JSON string as parameters.
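The data side of this step can be sketched without touching the database. The block below builds the six rows and the `(id, label, json)` triple each `INSERT` would receive; `demo_rows` and `insert_params` are illustrative names, and the `conn.execute` call in the trailing comment is an assumed call shape that should be checked against the `libsql` 0.9 docs.

```rust
fn vec_to_json(v: &[f32]) -> String {
    format!("[{}]", v.iter().map(|x| x.to_string()).collect::<Vec<_>>().join(","))
}

/// The six rows from the "Vectors to use" table.
fn demo_rows() -> Vec<(i64, &'static str, Vec<f32>)> {
    vec![
        (1, "cat", vec![0.9, 0.1, 0.2]),
        (2, "dog", vec![0.8, 0.2, 0.3]),
        (3, "car", vec![0.1, 0.9, 0.1]),
        (4, "truck", vec![0.2, 0.8, 0.2]),
        (5, "python", vec![0.15, 0.1, 0.95]),
        (6, "rust", vec![0.1, 0.05, 0.9]),
    ]
}

/// Builds the positional parameters for each INSERT, in column order.
fn insert_params(rows: &[(i64, &str, Vec<f32>)]) -> Vec<(i64, String, String)> {
    rows.iter()
        .map(|(id, label, emb)| (*id, label.to_string(), vec_to_json(emb)))
        .collect()
}

// Inside the async `main`, each triple would then be bound to the statement,
// for example (assumed call shape, not verified against libsql 0.9):
//   conn.execute(INSERT_SQL, libsql::params![id, label, json]).await?;
```

Keeping the parameter construction in a pure function makes it easy to unit-test separately from the connection.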
**Step 3 — Selecting and deserialising.** Query all rows:
```sql
SELECT id, label, vector_extract(embedding) FROM items ORDER BY id
```
`vector_extract` returns a JSON array string (e.g. `"[0.9,0.1,0.2]"`). Add `serde_json = "1"` to Cargo.toml and parse it: `serde_json::from_str::<Vec<f32>>(&json_str)?`.
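To show the shape of the data being parsed, here is a dependency-free sketch that does roughly what the `serde_json::from_str` call does for this one input form. It is an illustration only, not a replacement for `serde_json`: `parse_f32_array` is a hypothetical helper, and it is looser than a real JSON parser (for instance, it accepts `NaN`).

```rust
/// Hypothetical stand-in for `serde_json::from_str::<Vec<f32>>`:
/// parses a flat array string such as "[0.9,0.1,0.2]".
fn parse_f32_array(s: &str) -> Result<Vec<f32>, String> {
    // Strip the surrounding brackets, or fail if they are missing.
    let inner = s
        .trim()
        .strip_prefix('[')
        .and_then(|t| t.strip_suffix(']'))
        .ok_or_else(|| format!("expected [..], got {s:?}"))?;
    if inner.trim().is_empty() {
        return Ok(Vec::new());
    }
    // Parse each comma-separated token; the first failure aborts the whole parse.
    inner
        .split(',')
        .map(|tok| {
            tok.trim()
                .parse::<f32>()
                .map_err(|e| format!("bad number {tok:?}: {e}"))
        })
        .collect()
}
```

Parsing the string produced in Step 1 should give back the original vector, which makes a convenient round-trip test for the exercise.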
**Step 4 — Print results.** Format output as:
```
1 cat [0.9, 0.1, 0.2]
2 dog [0.8, 0.2, 0.3]
...
```
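One way to produce this layout is Rust's `Debug` formatting for slices, which already prints `[0.9, 0.1, 0.2]` with a space after each comma. A minimal sketch, with `format_row` as an illustrative helper name:

```rust
/// Formats one result row as "<id> <label> <embedding>".
/// `{:?}` on a slice of f32 yields "[0.9, 0.1, 0.2]".
fn format_row(id: i64, label: &str, embedding: &[f32]) -> String {
    format!("{} {} {:?}", id, label, embedding)
}
```

In the exercise, `println!("{}", format_row(id, &label, &embedding))` inside the row loop would print one line per row.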
## Cargo.toml additions
Add `serde_json = "1"` for JSON array parsing.