+++
title = "§11 Exercise 4: Recommendation Engine"
priority = 5
status = "done"
ticket_type = "task"
dependencies = []
+++

§11 Exercise 4 — Recommendation Engine — Stub to fill

File: edu/src/vector-db.md, section ### 11. Exercise 4 — Recommendation Engine

Replace this stub line with the full exercise:

Goal: Implement item-based collaborative filtering using vector similarity. [...] 🚧 Full content tracked in [nbd:e8be9a].

Follow the exercise format from edu/src/markov.md.

Goal

Build an item-based recommendation engine. Store item feature vectors in Turso; then, given a target item, find its k most similar items with a KNN query, excluding the query item itself from the results.

Approach

Use hand-crafted 5-dimensional feature vectors for a product catalogue (no fastembed dependency needed — keeps focus on the recommendation logic). Dimensions represent affinity scores for: [electronics, clothing, sports, food, books].

Catalogue (10 items)

id	name	embedding
1	"Laptop"	[0.95, 0.0, 0.1, 0.0, 0.2]
2	"Mechanical Keyboard"	[0.85, 0.0, 0.0, 0.0, 0.1]
3	"USB-C Hub"	[0.9, 0.0, 0.0, 0.0, 0.0]
4	"Running Shoes"	[0.0, 0.6, 0.9, 0.0, 0.0]
5	"Yoga Mat"	[0.0, 0.2, 0.95, 0.0, 0.0]
6	"Water Bottle"	[0.1, 0.1, 0.7, 0.0, 0.0]
7	"T-Shirt"	[0.0, 0.95, 0.1, 0.0, 0.0]
8	"Cookbook"	[0.0, 0.0, 0.0, 0.6, 0.9]
9	"Protein Bar"	[0.0, 0.0, 0.3, 0.95, 0.0]
10	"Novel"	[0.0, 0.0, 0.0, 0.1, 0.95]
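
Before wiring up the database, the expected clusters can be sanity-checked in memory. The following is a minimal sketch, not part of the exercise itself: it hard-codes the catalogue above and ranks neighbours by exact cosine distance. The names `CATALOGUE`, `cosine_distance` and `brute_force_neighbours` are hypothetical helpers introduced here for illustration.

```rust
/// The ten catalogue items from the table: (id, name, embedding).
/// Dimensions: [electronics, clothing, sports, food, books].
static CATALOGUE: [(i64, &str, [f32; 5]); 10] = [
    (1, "Laptop", [0.95, 0.0, 0.1, 0.0, 0.2]),
    (2, "Mechanical Keyboard", [0.85, 0.0, 0.0, 0.0, 0.1]),
    (3, "USB-C Hub", [0.9, 0.0, 0.0, 0.0, 0.0]),
    (4, "Running Shoes", [0.0, 0.6, 0.9, 0.0, 0.0]),
    (5, "Yoga Mat", [0.0, 0.2, 0.95, 0.0, 0.0]),
    (6, "Water Bottle", [0.1, 0.1, 0.7, 0.0, 0.0]),
    (7, "T-Shirt", [0.0, 0.95, 0.1, 0.0, 0.0]),
    (8, "Cookbook", [0.0, 0.0, 0.0, 0.6, 0.9]),
    (9, "Protein Bar", [0.0, 0.0, 0.3, 0.95, 0.0]),
    (10, "Novel", [0.0, 0.0, 0.0, 0.1, 0.95]),
];

/// Cosine distance (1 - cosine similarity) between two equal-length vectors.
/// Returns NaN if either vector has zero length (norm of 0).
fn cosine_distance(a: &[f32], b: &[f32]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| (*x as f64) * (*y as f64)).sum();
    let norm_a: f64 = a.iter().map(|x| (*x as f64).powi(2)).sum::<f64>().sqrt();
    let norm_b: f64 = b.iter().map(|x| (*x as f64).powi(2)).sum::<f64>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}

/// Exact (brute-force) k nearest neighbours of `item_id`, excluding the item itself.
/// Panics if `item_id` is not in the catalogue.
fn brute_force_neighbours(item_id: i64, k: usize) -> Vec<(&'static str, f64)> {
    let (_, _, query) = CATALOGUE
        .iter()
        .find(|(id, _, _)| *id == item_id)
        .expect("unknown item id");
    let mut scored: Vec<(&'static str, f64)> = CATALOGUE
        .iter()
        .filter(|(id, _, _)| *id != item_id)
        .map(|(_, name, emb)| (*name, cosine_distance(query, emb)))
        .collect();
    // Smallest distance first, i.e. most similar first.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

Calling `brute_force_neighbours(1, 2)` ranks Mechanical Keyboard ahead of USB-C Hub for Laptop, which gives a baseline to compare the index-backed results against.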

Steps to cover

Step 1 — Schema. Table products (id INTEGER PRIMARY KEY, name TEXT NOT NULL, embedding F32_BLOB(5) NOT NULL) with a libsql_vector_idx index on the embedding column. Note that libSQL's vector index is DiskANN-based, not HNSW.
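
One possible shape for the schema statements, kept as Rust string constants so they can be passed to the connection during setup. This is a sketch under assumptions: the index name `products_idx` and the helper `schema_statements` are hypothetical, and the statements have not been run against a live Turso database here.

```rust
/// DDL for the products table; F32_BLOB(5) stores a 5-dimensional f32 vector.
const CREATE_TABLE_SQL: &str = "CREATE TABLE IF NOT EXISTS products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    embedding F32_BLOB(5) NOT NULL
)";

/// Vector index over the embedding column.
/// The index name `products_idx` is an assumption made for this sketch.
const CREATE_INDEX_SQL: &str =
    "CREATE INDEX IF NOT EXISTS products_idx ON products (libsql_vector_idx(embedding))";

/// Statements in the order they need to run: table first, then the index.
fn schema_statements() -> [&'static str; 2] {
    [CREATE_TABLE_SQL, CREATE_INDEX_SQL]
}
```

In the setup code each statement would presumably be executed in order, for example with `conn.execute(sql, ()).await?` from the libsql crate.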

Step 2 — Insert items. Same pattern as Exercise 1: format Vec<f32> as JSON, INSERT OR IGNORE.
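
Exercise 1 is not reproduced in this ticket, so the following is only a guess at the pattern it refers to: a small helper that renders a vector as a JSON array string, plus an insert statement that takes that string as a parameter. `vector_to_json` and `INSERT_SQL` are hypothetical names.

```rust
/// Render a vector as a JSON array string, e.g. "[0.95,0,0.1,0,0.2]".
/// Non-finite values (NaN, infinity) would not produce valid JSON,
/// so callers should only pass finite numbers.
fn vector_to_json(v: &[f32]) -> String {
    let parts: Vec<String> = v.iter().map(|x| x.to_string()).collect();
    format!("[{}]", parts.join(","))
}

/// Insert statement: ?1 = id, ?2 = name, ?3 = JSON text of the embedding.
/// INSERT OR IGNORE makes re-running the program idempotent for existing ids.
const INSERT_SQL: &str =
    "INSERT OR IGNORE INTO products (id, name, embedding) VALUES (?1, ?2, vector32(?3))";
```

Note that Rust prints `0.0_f32` as `0`, so the Laptop embedding is rendered as `[0.95,0,0.1,0,0.2]`; this is still a valid JSON array of numbers.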

Step 3 — Recommend function. Write a helper:

async fn recommend(
    conn: &libsql::Connection,
    item_id: i64,
    k: usize,
) -> Result<Vec<(String, f64)>, Box<dyn std::error::Error>>

SELECT vector_extract(embedding) FROM products WHERE id = ? to fetch the query item's embedding as a JSON string.
Pass that JSON string to vector_top_k with k+1 as the limit, leaving room to exclude the query item.
JOIN the returned row ids back to products to get product names, and compute distances with vector_distance_cos.
Filter out the row where products.id = item_id.
Return the top k (name, distance) pairs, ordered by ascending distance.
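
The query and post-processing described in these steps might look roughly like the sketch below. Several things are assumptions: the index name `products_idx`, the `knn` alias, the parameter numbering, and the helper names `fetch_limit` and `finalize`. The SQL has not been verified against a live database. For testability, this sketch applies the exclusion filter in Rust on the fetched rows rather than in a WHERE clause; a `WHERE p.id <> ?` in the query would be an equivalent alternative.

```rust
/// KNN query: ?1 = query embedding as JSON text, ?2 = number of neighbours to fetch.
/// vector_top_k yields row ids (column `id`), which are joined back to products.
/// The alias `knn` avoids ambiguity with products.id.
const RECOMMEND_SQL: &str = "\
SELECT p.id, p.name, vector_distance_cos(p.embedding, vector32(?1)) AS distance
FROM vector_top_k('products_idx', vector32(?1), ?2) AS knn
JOIN products AS p ON p.rowid = knn.id
ORDER BY distance";

/// Ask for one extra neighbour, since the query item usually matches itself.
fn fetch_limit(k: usize) -> usize {
    k + 1
}

/// Drop the query item from the fetched rows and keep the k closest remaining ones.
/// `rows` holds (id, name, distance) tuples as read from the query above.
fn finalize(rows: Vec<(i64, String, f64)>, item_id: i64, k: usize) -> Vec<(String, f64)> {
    let mut kept: Vec<(String, f64)> = rows
        .into_iter()
        .filter(|(id, _, _)| *id != item_id)
        .map(|(_, name, distance)| (name, distance))
        .collect();
    // Ascending distance: most similar item first.
    kept.sort_by(|a, b| a.1.total_cmp(&b.1));
    kept.truncate(k);
    kept
}
```

Because the index search is approximate, the query item is not guaranteed to appear among the k+1 rows; filtering first and truncating afterwards handles both cases.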

Step 4 — Print recommendations for three items.

"Laptop" → expect Mechanical Keyboard, USB-C Hub (electronics cluster)
"Running Shoes" → expect Yoga Mat, Water Bottle (sports cluster)
"Cookbook" → expect Novel, Protein Bar (food/books cluster)

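Output format: "Customers who liked Laptop also liked: Mechanical Keyboard (0.023), USB-C Hub (0.041)". The distances shown are placeholders for the format; print each cosine distance to three decimal places.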
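
A small formatting helper could produce that line from the recommend results. This is a sketch; `format_recommendations` is a hypothetical name, and the three-decimal precision is inferred from the example string.

```rust
/// Build the output line for one item from its (name, distance) recommendations.
/// Distances are printed with three decimal places.
fn format_recommendations(item: &str, recs: &[(String, f64)]) -> String {
    let listed: Vec<String> = recs
        .iter()
        .map(|(name, distance)| format!("{name} ({distance:.3})"))
        .collect();
    format!("Customers who liked {item} also liked: {}", listed.join(", "))
}
```

Keeping the formatting separate from the query makes it straightforward to test the output string without a database connection.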

Reference solution

Full main.rs inside <details>. The recommend function should be clearly separated from the setup boilerplate. The recommendation query pattern (SELECT embedding → feed as query to vector_top_k) is the key technique to highlight.
