vibed/edu/.beans/archive/edu-7lu6--write-12-exercise...

title: Write §12: Exercise 4 — replace rollout with the value network
status: completed
type: task
priority: normal
created_at: 2026-03-13T20:03:17Z
updated_at: 2026-03-16T01:31:54Z
parent: edu-coqp

Exercise: substitute random rollout in MCTS with a neural-network value estimate; compare strength before and after.
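
The swap the exercise asks for can be sketched as a single leaf-evaluation function with two backends. This is a minimal illustration, not the §12 code: the toy game (take 1 to 3 stones, whoever takes the last stone wins), the names `rollout_value`, `stub_value_net`, and `evaluate_leaf`, and the use of an exact-value stub in place of a trained network are all assumptions made for the example.

```python
import random
from typing import Callable, List, Optional

# Hypothetical toy game: a state is the number of stones left.
# The player to move takes 1-3 stones; whoever takes the last stone wins.
State = int


def legal_moves(stones: State) -> List[int]:
    return [m for m in (1, 2, 3) if m <= stones]


def rollout_value(stones: State, rng: random.Random) -> float:
    """Random playout: +1 if the player to move at `stones` ends up winning, else -1."""
    root_to_move = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return 1.0 if root_to_move else -1.0
        root_to_move = not root_to_move
    return -1.0  # no stones left at the start: the opponent already took the last one


def stub_value_net(stones: State) -> float:
    """Stand-in for a trained value network (assumption): the exact value of this toy game."""
    return -1.0 if stones % 4 == 0 else 1.0


def evaluate_leaf(
    stones: State,
    value_fn: Optional[Callable[[State], float]] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Value of a leaf from the perspective of the player to move.

    With `value_fn` set, the network estimate replaces the rollout entirely.
    """
    if value_fn is not None:
        return value_fn(stones)
    if rng is None:
        rng = random.Random(0)
    return rollout_value(stones, rng)
```

Calling `evaluate_leaf(s)` gives the noisy rollout estimate and `evaluate_leaf(s, stub_value_net)` gives the network-style estimate, so the rest of the search can stay unchanged while the two variants are compared.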

Summary of Changes

Wrote full content for §12: Exercise 4 covering PUCT formula, replacing random rollouts with value network evaluation, adding policy priors to MCTS nodes, modified MCTS code, and pure vs network-guided MCTS comparison.
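
For orientation, the PUCT selection rule mentioned above is commonly written as Q(s, a) + c_puct · P(s, a) · sqrt(N(s)) / (1 + N(s, a)), where P is the policy prior stored on the child node. The sketch below shows one way to compute it; the `Node` fields, the default `c_puct = 1.5`, and the convention that `value_sum` is stored from the parent's perspective are assumptions for illustration and may differ from the section's actual code.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Node:
    prior: float                 # P(s, a): policy prior assigned when the parent was expanded
    visit_count: int = 0         # N(s, a)
    value_sum: float = 0.0       # sum of backed-up values (assumed: parent's perspective)
    children: Dict[int, "Node"] = field(default_factory=dict)

    def q(self) -> float:
        """Mean backed-up value; 0.0 for an unvisited node."""
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Q plus a prior-weighted exploration bonus that shrinks as the child is visited."""
    exploration = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.q() + exploration


def select_child(parent: Node, c_puct: float = 1.5) -> Tuple[int, Node]:
    """Return the (action, child) pair with the highest PUCT score."""
    return max(parent.children.items(), key=lambda kv: puct_score(parent, kv[1], c_puct))


# Usage: an unvisited child with a high prior outranks a visited child with a modest Q.
root = Node(prior=1.0, visit_count=4)
root.children[0] = Node(prior=0.8)
root.children[1] = Node(prior=0.2, visit_count=3, value_sum=1.5)
action, _ = select_child(root)
```

In pure MCTS every child would effectively share the same prior; storing a network-supplied prior per node is what lets the search concentrate visits on moves the policy considers promising.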