Proving Wins and Losses

Standard MCTS gives statistical estimates: "this move wins 73% of playouts." MCTS-Solver converts those estimates into proofs. Once every child of a node is proven, the node itself is proven. The search stops guessing and starts knowing.

You will learn to:

  • Implement terminal_value() and terminal_score() for game outcomes
  • Enable MCTS-Solver and Score-Bounded search
  • Read and interpret proven values and score bounds after search

Prerequisites: Two-Player Games.

Two independent features

This tutorial covers two separate features that can be used independently:

  1. MCTS-Solver (first half) — proves positions as Win/Loss/Draw. Enable with solver_enabled() = true. Start here.
  2. Score-Bounded MCTS (second half) — tracks minimax score bounds. Enable with score_bounded_enabled() = true. Skip this on first read if you don't need exact scores.

Most games only need the solver. Score bounds are useful when you care about margin of victory, not just win/loss.

How the solver proves nodes

Proven values propagate up the tree after each playout. The rules follow minimax logic:

  • Any child proven Loss -- the parent is proven Win. The parent's player picks the child where the opponent loses.
  • All children proven Win -- the parent is proven Loss. Every move leads to the opponent winning. No escape.
  • All children proven, at least one Draw -- the parent is proven Draw. The best available outcome is a tie.
  • Terminal node -- terminal_value() returns the outcome directly. This is where proofs originate.

Values are always from the current player's perspective. A child's Loss means the child's mover loses, which means the parent's mover wins.
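The four rules above can be sketched as a single standalone function. This is plain Rust, independent of the library — the `Proven` enum and `backup` helper are illustrative names, not the library's API:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Proven {
    Win,
    Loss,
    Draw,
    Unknown,
}

/// Compute a parent's proven value from its children's proven values.
/// Each child value is from the child mover's (the opponent's) perspective.
fn backup(children: &[Proven]) -> Proven {
    // Any child proven Loss for the opponent: the parent is a proven Win.
    if children.contains(&Proven::Loss) {
        return Proven::Win;
    }
    // An unproven child could still turn out to be a Loss for the
    // opponent, so the parent cannot be proven yet.
    if children.contains(&Proven::Unknown) {
        return Proven::Unknown;
    }
    // All children proven: a Draw among them caps the parent at Draw;
    // otherwise every move wins for the opponent and the parent is a Loss.
    if children.contains(&Proven::Draw) {
        Proven::Draw
    } else {
        Proven::Loss
    }
}

fn main() {
    assert_eq!(backup(&[Proven::Win, Proven::Loss]), Proven::Win);
    assert_eq!(backup(&[Proven::Win, Proven::Win]), Proven::Loss);
    assert_eq!(backup(&[Proven::Win, Proven::Draw]), Proven::Draw);
    assert_eq!(backup(&[Proven::Win, Proven::Unknown]), Proven::Unknown);
    println!("backup rules ok");
}
```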

Implementing terminal_value

The solver needs to know the outcome at terminal states. Implement terminal_value() on your GameState:

#[derive(Clone, Debug)]
struct Nim {
    stones: u8,
    current: Player,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Player {
    P1,
    P2,
}

#[derive(Clone, Debug, PartialEq)]
enum NimMove {
    Take1,
    Take2,
}

impl std::fmt::Display for NimMove {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        match self {
            NimMove::Take1 => write!(f, "Take 1"),
            NimMove::Take2 => write!(f, "Take 2"),
        }
    }
}

impl GameState for Nim {
    type Move = NimMove;
    type Player = Player;
    type MoveList = Vec<NimMove>;

    fn current_player(&self) -> Player {
        self.current
    }

    fn available_moves(&self) -> Vec<NimMove> {
        match self.stones {
            0 => vec![],
            1 => vec![NimMove::Take1],
            _ => vec![NimMove::Take1, NimMove::Take2],
        }
    }

    fn make_move(&mut self, mov: &NimMove) {
        match mov {
            NimMove::Take1 => self.stones -= 1,
            NimMove::Take2 => self.stones -= 2,
        }
        self.current = match self.current {
            Player::P1 => Player::P2,
            Player::P2 => Player::P1,
        };
    }

    fn terminal_value(&self) -> Option<ProvenValue> {
        if self.stones == 0 {
            // Previous player took the last stone and won.
            // Current player (who can't move) lost.
            Some(ProvenValue::Loss)
        } else {
            None
        }
    }
}

The key method is terminal_value(). When no stones remain, the current player (who cannot move) has lost. The function returns Some(ProvenValue::Loss). For non-terminal states, return None.

Enabling the solver

Turn on MCTS-Solver by returning true from solver_enabled() in your MCTS config:

#[derive(Default)]
struct NimMCTS;

impl MCTS for NimMCTS {
    type State = Nim;
    type Eval = NimEval;
    type NodeData = ();
    type ExtraThreadData = ();
    type TreePolicy = UCTPolicy;
    type TranspositionTable = ();

    fn solver_enabled(&self) -> bool {
        true
    }
}

That single method is all it takes. The tree policy, backpropagation, and best-move selection all adapt automatically.

Running the solver

The solver proves positions as it searches. Call root_proven_value() to read the result:

fn main() {
    println!("=== MCTS-Solver: Nim ===\n");
    println!("Rules: take 1 or 2 stones. Last stone wins.");
    println!("Theory: position is losing iff stones % 3 == 0.\n");

    for stones in 1u8..=9 {
        let mut mcts = MCTSManager::new(
            Nim {
                stones,
                current: Player::P1,
            },
            NimMCTS,
            NimEval,
            UCTPolicy::new(1.0),
            (),
        );
        mcts.playout_n(500);

        let proven = mcts.root_proven_value();
        let theory = if stones % 3 == 0 { "Loss" } else { "Win " };
        let best = mcts
            .best_move()
            .map(|m| format!("{m}"))
            .unwrap_or_else(|| "-".into());
        let nodes = mcts.tree().num_nodes();

        println!(
            "Stones={stones} Proven={proven:?} Theory={theory} Best={best:6} Nodes={nodes}"
        );
    }
}

Expected output:

=== MCTS-Solver: Nim ===

Rules: take 1 or 2 stones. Last stone wins.
Theory: position is losing iff stones % 3 == 0.

Stones=1 Proven=Win Theory=Win Best=Take 1 Nodes=2
Stones=2 Proven=Win Theory=Win Best=Take 2 Nodes=3
Stones=3 Proven=Loss Theory=Loss Best=Take 1 Nodes=6
Stones=4 Proven=Win Theory=Win Best=Take 1 Nodes=10
Stones=5 Proven=Win Theory=Win Best=Take 2 Nodes=10
Stones=6 Proven=Loss Theory=Loss Best=Take 1 Nodes=25
Stones=7 Proven=Win Theory=Win Best=Take 1 Nodes=31
Stones=8 Proven=Win Theory=Win Best=Take 2 Nodes=54
Stones=9 Proven=Loss Theory=Loss Best=Take 1 Nodes=86

root_proven_value() returns one of four values:

Value                  Meaning
ProvenValue::Win       Current player has a forced win
ProvenValue::Loss      Current player loses with best play
ProvenValue::Draw      Best achievable outcome is a tie
ProvenValue::Unknown   Not yet proven (search still running)

Nim theory says a position is losing if and only if stones % 3 == 0. The solver proves this without knowing the theory -- it discovers the game-theoretic truth through search alone.
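The theory is easy to verify independently with a brute-force recursion (standalone Rust, unrelated to the library):

```rust
/// A Nim position is a win for the player to move iff some legal move
/// leaves the opponent in a losing position.
fn nim_is_win(stones: u8) -> bool {
    // No stones left: the current player cannot move and has lost.
    if stones == 0 {
        return false;
    }
    // Try taking 1 or 2 stones (never more than remain).
    (1..=2u8)
        .filter(|&take| take <= stones)
        .any(|take| !nim_is_win(stones - take))
}

fn main() {
    // Confirms the pattern: losing exactly when stones % 3 == 0.
    for stones in 0u8..=20 {
        assert_eq!(nim_is_win(stones), stones % 3 != 0);
    }
    println!("theory confirmed for 0..=20 stones");
}
```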

Solving Tic-Tac-Toe

Tic-Tac-Toe is the ideal solver showcase. With only ~5,478 distinct game states, MCTS-Solver proves the entire game tree within a few thousand playouts:

  • From an empty board: Proven Draw (both players can force a draw with optimal play)
  • After X plays center: Proven Draw (O can always equalize)
  • After X plays corner, O plays non-center: Proven Win for X (X has a forced win)

Try it in the Playground → — play as X and watch MCTS prove each position. The solver badge shows Win/Loss/Draw for the current position, and each move shows its proven value. Notice how MCTS stops searching once a position is fully resolved.

This is exactly what happens in endgame play in production game engines — the solver kicks in when the remaining tree is small enough to prove, saving computation and guaranteeing optimal moves.


The rest of this tutorial covers Score-Bounded MCTS — an advanced feature for games with numeric scores. If you only need Win/Loss/Draw classification, skip ahead to Tutorial 5: Stochastic Games.


Score-Bounded MCTS

The solver proves win, loss, or draw -- but some games have scores, not just outcomes. By how much does the winner win? Score-Bounded MCTS answers that by tracking a [lower, upper] interval on the minimax value (the optimal score achievable when both players play perfectly) at each node. These bounds tighten as the search explores more of the game; when lower == upper, the exact minimax value is known.
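The bound backup follows minimax: at each node the bounds are the max (for the maximizing player) or min (for the minimizing player) over the children's bounds, with unexplored children contributing an open interval. A standalone sketch of the maximizing case — the `Bounds` struct here is an illustrative stand-in, not the library's `ScoreBounds`:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Bounds {
    lower: i32,
    upper: i32,
}

/// At a maximizing node, both bounds are the max over the children;
/// a minimizing node would take the min instead.
fn backup_max(children: &[Bounds]) -> Bounds {
    Bounds {
        lower: children.iter().map(|b| b.lower).max().unwrap(),
        upper: children.iter().map(|b| b.upper).max().unwrap(),
    }
}

fn main() {
    let unexplored = Bounds { lower: i32::MIN, upper: i32::MAX };
    let proven_six = Bounds { lower: 6, upper: 6 };

    // One proven child, one unexplored: the maximizer is guaranteed
    // at least 6, but the upper bound stays open.
    let partial = backup_max(&[proven_six, unexplored]);
    assert_eq!(partial, Bounds { lower: 6, upper: i32::MAX });

    // All children proven: the interval collapses to the minimax value.
    let done = backup_max(&[proven_six, Bounds { lower: 2, upper: 2 }]);
    assert_eq!(done, Bounds { lower: 6, upper: 6 });
    println!("bounds backup ok");
}
```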

Terminal scores

Instead of (or in addition to) terminal_value(), implement terminal_score() to return exact scores at terminal nodes:

/// A two-player game with a fixed depth-2 tree.
///
/// Root (P1, maximizer) picks branch A, B, or C.
/// Then P2 (minimizer) picks a response. Each leaf has a known score
/// (from P1's perspective):
///
/// ```text
///            Root (P1)
///           /    |    \
///          A     B     C
///        (P2)  (P2)  (P2)
///        / \   / \   / \
///       2   5 1   3 8   6
/// ```
///
/// Minimax: P2 minimizes, P1 maximizes.
///   A -> P2 picks min(2, 5) = 2
///   B -> P2 picks min(1, 3) = 1
///   C -> P2 picks min(8, 6) = 6
///   Root -> P1 picks max(2, 1, 6) = 6 → best move is C
#[derive(Clone, Debug)]
struct ScoreGame {
    depth: u8,
    /// Which branch P1 chose (set after depth 0).
    branch: Option<Branch>,
    /// Terminal score from P1's perspective (set at depth 2).
    score: Option<i32>,
    current: Player,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Player {
    P1,
    P2,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Branch {
    A,
    B,
    C,
}

/// Moves encode both P1's branch choice and P2's response index.
#[derive(Clone, Debug, PartialEq)]
enum ScoreMove {
    /// P1 picks a branch.
    Pick(Branch),
    /// P2 picks the i-th response within the chosen branch.
    Respond(u8),
}

impl std::fmt::Display for ScoreMove {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        match self {
            ScoreMove::Pick(b) => write!(f, "{b:?}"),
            ScoreMove::Respond(i) => write!(f, "R{i}"),
        }
    }
}

/// Terminal scores for each (branch, response) pair, from P1's perspective.
fn terminal_scores(branch: Branch, response: u8) -> i32 {
    match (branch, response) {
        (Branch::A, 0) => 2,
        (Branch::A, 1) => 5,
        (Branch::B, 0) => 1,
        (Branch::B, 1) => 3,
        (Branch::C, 0) => 8,
        (Branch::C, 1) => 6,
        _ => unreachable!(),
    }
}

impl GameState for ScoreGame {
    type Move = ScoreMove;
    type Player = Player;
    type MoveList = Vec<ScoreMove>;

    fn current_player(&self) -> Player {
        self.current
    }

    fn available_moves(&self) -> Vec<ScoreMove> {
        match self.depth {
            0 => vec![
                ScoreMove::Pick(Branch::A),
                ScoreMove::Pick(Branch::B),
                ScoreMove::Pick(Branch::C),
            ],
            1 => {
                // Each branch has exactly 2 responses.
                vec![ScoreMove::Respond(0), ScoreMove::Respond(1)]
            }
            _ => vec![],
        }
    }

    fn make_move(&mut self, mov: &ScoreMove) {
        match mov {
            ScoreMove::Pick(b) => {
                self.branch = Some(*b);
                self.depth = 1;
                self.current = Player::P2;
            }
            ScoreMove::Respond(i) => {
                let branch = self.branch.unwrap();
                self.score = Some(terminal_scores(branch, *i));
                self.depth = 2;
                self.current = Player::P1;
            }
        }
    }

    // Return the exact minimax score from the current player's perspective.
    // At depth 2 the current player is P1, and scores are already from P1's view.
    fn terminal_score(&self) -> Option<i32> {
        if self.depth == 2 {
            self.score
        } else {
            None
        }
    }

    // Note: terminal_value() is NOT implemented. The library cross-derives
    // proven values (Win/Loss/Draw) from terminal_score() automatically.
}

Notice that terminal_value() is not implemented here. The library cross-derives proven values from scores automatically:

  • Positive score -- ProvenValue::Win
  • Negative score -- ProvenValue::Loss
  • Zero -- ProvenValue::Draw

You can implement either terminal_value(), terminal_score(), or both. If you provide both, the library checks consistency in debug builds.
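The sign rule is simple enough to state as a function. A standalone sketch — `Proven` here is an illustrative stand-in for the library's ProvenValue:

```rust
#[derive(Debug, PartialEq)]
enum Proven {
    Win,
    Loss,
    Draw,
}

/// Cross-derive a proven value from a terminal score: the score is from
/// the current player's perspective, so its sign decides the outcome.
fn proven_from_score(score: i32) -> Proven {
    match score {
        s if s > 0 => Proven::Win,
        s if s < 0 => Proven::Loss,
        _ => Proven::Draw,
    }
}

fn main() {
    assert_eq!(proven_from_score(6), Proven::Win);
    assert_eq!(proven_from_score(-3), Proven::Loss);
    assert_eq!(proven_from_score(0), Proven::Draw);
    println!("cross-derivation ok");
}
```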

Score-Bounded MCTS requires both score_bounded_enabled() and solver_enabled() to return true:

#[derive(Default)]
struct ScoreMCTS;

impl MCTS for ScoreMCTS {
    type State = ScoreGame;
    type Eval = ScoreEval;
    type NodeData = ();
    type ExtraThreadData = ();
    type TreePolicy = UCTPolicy;
    type TranspositionTable = ();

    fn visits_before_expansion(&self) -> u64 {
        0
    }

    fn score_bounded_enabled(&self) -> bool {
        true
    }

    fn solver_enabled(&self) -> bool {
        true
    }
}

Overriding visits_before_expansion() to 0 expands every node on its first visit. This helps the solver prove small, fully enumerable trees faster.

Reading score bounds

After running playouts, read the converged bounds and proven value from the root:

let bounds = mcts.root_score_bounds();
println!("Score bounds: [{}, {}]", bounds.lower, bounds.upper);
println!("Converged: {}", bounds.is_proven());

let proven = mcts.root_proven_value();
println!("Proven value: {proven:?}");

root_score_bounds() returns a ScoreBounds { lower, upper } struct. When lower == upper, the exact minimax value is known, and is_proven() returns true. For the score game above, the bounds converge to [6, 6] -- P1's optimal play through branch C yields a minimax value of 6.

Child-level bounds are also available through root_child_stats(), so you can inspect how each move's score interval tightened during search.

Solver in the Playground

Tic-Tac-Toe has solver enabled — positions are proven in real time. Connect Four does not — the tree is too deep for browser-speed proofs, so it relies on heuristic evaluation. This illustrates the practical boundary: solver is powerful but needs a manageable game tree.

What's next

The solver proves outcomes in deterministic games. Tutorial 5 adds randomness with chance nodes -- dice rolls, card draws, and other stochastic transitions.