---
# edu-7lu6
title: 'Write §12: Exercise 4 — replace rollout with the value network'
status: completed
type: task
priority: normal
created_at: 2026-03-13T20:03:17Z
updated_at: 2026-03-16T01:31:54Z
parent: edu-coqp
---

Exercise: substitute random rollout in MCTS with a neural-network value estimate; compare strength before and after.

## Summary of Changes

Wrote full content for §12: Exercise 4 covering PUCT formula, replacing random rollouts with value network evaluation, adding policy priors to MCTS nodes, modified MCTS code, and pure vs network-guided MCTS comparison.
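
For reference, a minimal sketch of the change the exercise asks for. It is not the code written for §12. It is an illustration under stated assumptions: `Node`, `puct_score`, `select_child`, `expand_and_evaluate`, and `backup` are hypothetical names, `network` is assumed to be any callable returning `(priors, value)` with `priors` a dict from action to probability and `value` given from the perspective of the player to move, and `c_puct=1.5` is an arbitrary exploration constant. The selection score is the common PUCT form Q + c_puct · P · sqrt(N_parent) / (1 + N_child).

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """MCTS node that stores a policy prior alongside visit statistics."""
    prior: float                      # P(s, a) from the policy head (assumed input)
    visit_count: int = 0              # N(s, a)
    value_sum: float = 0.0            # sum of backed-up values, from the mover's view
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        # Mean value; unvisited nodes default to 0.0 (an assumption, not a rule).
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Q + U, where U grows with the prior and shrinks with child visits."""
    u = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.q() + u


def select_child(node: Node, c_puct: float = 1.5):
    """Return the (action, child) pair with the highest PUCT score."""
    return max(node.children.items(), key=lambda kv: puct_score(node, kv[1], c_puct))


def expand_and_evaluate(node: Node, state, network) -> float:
    """Replace the random rollout: one network call gives priors and a value."""
    priors, value = network(state)    # hypothetical interface, see lead-in
    for action, p in priors.items():
        node.children[action] = Node(prior=p)
    return value                      # value for the player to move at `state`


def backup(path, value: float) -> None:
    """Propagate the leaf value to the root, flipping sign at each ply.

    Each node stores values from the perspective of the player who moved
    into it, so the leaf itself receives -value.
    """
    for node in reversed(path):
        value = -value
        node.visit_count += 1
        node.value_sum += value
```

Compared with rollout-based MCTS, the only structural differences in this sketch are the `prior` field on each node and the single `network(state)` call where the random playout used to be; selection and backup keep the same shape.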