| title | status | type | priority | created_at | updated_at | parent |
|---|---|---|---|---|---|---|
| Write §12: Exercise 4 — replace rollout with the value network | completed | task | normal | 2026-03-13T20:03:17Z | 2026-03-16T01:31:54Z | edu-coqp |
Exercise: replace the random rollout in MCTS with a neural-network value estimate, then compare playing strength before and after.
Summary of Changes
Wrote the full content for §12 (Exercise 4). It covers the PUCT formula, replacing random rollouts with value-network evaluation, adding policy priors to MCTS nodes, the modified MCTS code, and a comparison of pure MCTS against network-guided MCTS.
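
For orientation, the two changes named above can be sketched roughly as follows. This is not the code from §12; it is a minimal illustration under stated assumptions: `Node`, `puct_score`, `select_child`, `expand_and_evaluate`, and the `network(state)` callable returning `(priors, value)` are all hypothetical names, and `c_puct = 1.5` is an arbitrary exploration constant. Values are assumed to be stored from the perspective of the player choosing at the parent, so the sign flipping needed for two-player games is left out.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                  # P(s, a): policy prior for the move leading here
    visits: int = 0               # N(s, a): visit count
    value_sum: float = 0.0        # sum of values backed up through this node
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        """Mean backed-up value Q(s, a); 0.0 for an unvisited node."""
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """PUCT: Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a))."""
    exploration = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.q() + exploration


def select_child(parent: Node, c_puct: float = 1.5):
    """Return the (action, child) pair with the highest PUCT score."""
    return max(parent.children.items(),
               key=lambda item: puct_score(parent, item[1], c_puct))


def expand_and_evaluate(node: Node, state, network) -> float:
    """Expand a leaf and return the network's value estimate.

    This takes the place of a random rollout: instead of playing the game
    out to the end, the leaf is scored by the (hypothetical) network, which
    is assumed to return a dict of move priors and a scalar value.
    """
    priors, value = network(state)
    for action, p in priors.items():
        node.children[action] = Node(prior=p)
    return value
```

In a full search loop, the value returned by `expand_and_evaluate` would be backed up along the selected path by incrementing `visits` and adding to `value_sum` on each node, exactly where a rollout result would have been used in pure MCTS.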