# Your First Search
You'll build a complete MCTS search from scratch. The game: increment or decrement a counter toward a target of 100. Two moves per turn, one obvious best choice. Simple enough to verify, complex enough to demonstrate every part of the API.
You will learn to:

- Implement `GameState` for a single-player game
- Implement `Evaluator` with state evaluation and player-perspective reward
- Configure and run MCTS search, and interpret the results

Prerequisites: What is MCTS? and a Rust toolchain.
## Add the dependency

```toml
[dependencies]
mcts = "0.3"
```
## Define the game
Every MCTS game implements the GameState trait: what moves are available, whose turn it is, and how moves change the state.
```rust
use mcts::*;
use mcts::tree_policy::*;
use mcts::transposition_table::*;

#[derive(Clone)]
struct CountingGame(i64);

#[derive(Clone, Debug)]
enum Move {
    Add,
    Sub,
}

impl GameState for CountingGame {
    type Move = Move;
    type Player = ();
    type MoveList = Vec<Self::Move>;

    fn current_player(&self) -> Self::Player {}

    fn available_moves(&self) -> Vec<Self::Move> {
        if self.0 == 100 {
            vec![]
        } else {
            vec![Move::Add, Move::Sub]
        }
    }

    fn make_move(&mut self, mov: &Self::Move) {
        match *mov {
            Move::Add => self.0 += 1,
            Move::Sub => self.0 -= 1,
        }
    }
}
```
CountingGame wraps a single i64 counter. Each turn, the player can Add (increment) or Sub (decrement). When the counter reaches 100, available_moves() returns an empty vec -- the game is over.
current_player() returns () because this is a single-player optimization problem. There is no opponent.
make_move() mutates the state in place. MCTS clones the state internally before calling this, so the original is preserved.
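Because the search clones before mutating, these methods are easy to sanity-check by hand. A standalone sketch that duplicates the game logic (without the `mcts` dependency, so it compiles on its own):

```rust
// Standalone sketch of the counting game's state logic, mirroring the
// trait methods above so they can be exercised directly.
#[derive(Clone)]
struct CountingGame(i64);

#[derive(Clone, Debug, PartialEq)]
enum Move {
    Add,
    Sub,
}

impl CountingGame {
    fn available_moves(&self) -> Vec<Move> {
        if self.0 == 100 {
            vec![] // terminal: no moves means the game is over
        } else {
            vec![Move::Add, Move::Sub]
        }
    }

    fn make_move(&mut self, mov: &Move) {
        match *mov {
            Move::Add => self.0 += 1,
            Move::Sub => self.0 -= 1,
        }
    }
}

fn main() {
    // Clone before mutating, as the search does: the original is untouched.
    let original = CountingGame(99);
    let mut child = original.clone();
    child.make_move(&Move::Add);
    println!("original = {}, child = {}", original.0, child.0);
    // The child reached 100, so it has no moves left.
    println!("child moves: {:?}", child.available_moves());
}
```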
## Evaluate positions
The Evaluator trait tells MCTS how good a position is and how to score moves.
```rust
struct MyEvaluator;

impl Evaluator<MyMCTS> for MyEvaluator {
    type StateEvaluation = i64;

    fn evaluate_new_state(
        &self,
        state: &CountingGame,
        moves: &Vec<Move>,
        _: Option<SearchHandle<MyMCTS>>,
    ) -> (Vec<()>, i64) {
        (vec![(); moves.len()], state.0)
    }

    fn interpret_evaluation_for_player(&self, evaln: &i64, _player: &()) -> i64 {
        *evaln
    }

    fn evaluate_existing_state(
        &self,
        _: &CountingGame,
        evaln: &i64,
        _: SearchHandle<MyMCTS>,
    ) -> i64 {
        *evaln
    }
}
```
Three methods:
- `evaluate_new_state()` returns a pair: per-move evaluations (here `()` for each move, meaning no prior bias) and a state evaluation (the counter value itself). Higher counter = closer to 100 = better.
- `interpret_evaluation_for_player()` converts the state evaluation into a reward for the given player. With a single player, this just returns the evaluation directly.
- `evaluate_existing_state()` handles re-evaluation when the search revisits a node through different move paths that reach the same game state (called transpositions). Here, it returns the same value.
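To see why the raw counter value works as an evaluation, here is a standalone one-step lookahead sketch (the `evaluate` and `eval_after` helpers are illustrative, not crate API): cloning the state, applying each move, and comparing evaluations always favors `Add`.

```rust
// Sketch: the counter value as an evaluation, compared across the two moves.
#[derive(Clone)]
struct CountingGame(i64);

fn evaluate(state: &CountingGame) -> i64 {
    state.0 // higher counter = closer to 100 = better
}

// Evaluate the state reached by shifting the counter by `delta`
// (+1 for Add, -1 for Sub), without touching the original.
fn eval_after(state: &CountingGame, delta: i64) -> i64 {
    let mut next = state.clone();
    next.0 += delta;
    evaluate(&next)
}

fn main() {
    let state = CountingGame(42);
    let add_score = eval_after(&state, 1); // Move::Add
    let sub_score = eval_after(&state, -1); // Move::Sub
    println!("Add -> {}, Sub -> {}", add_score, sub_score);
    assert!(add_score > sub_score); // Add is better from every position
}
```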
## Configure the search
The MCTS trait wires the game, evaluator, tree policy, and transposition table together into a single configuration type.
```rust
#[derive(Default)]
struct MyMCTS;

impl MCTS for MyMCTS {
    type State = CountingGame;
    type Eval = MyEvaluator;
    type NodeData = ();
    type ExtraThreadData = ();
    type TreePolicy = UCTPolicy;
    type TranspositionTable = ApproxTable<Self>;

    fn virtual_loss(&self) -> i64 {
        500
    }

    fn cycle_behaviour(&self) -> CycleBehaviour<Self> {
        CycleBehaviour::UseCurrentEvalWhenCycleDetected
    }
}
```
MyMCTS is a zero-sized type that exists solely to connect the associated types.
- `TreePolicy = UCTPolicy` -- the tree policy (the algorithm that decides which child to explore next; here, UCT from Tutorial 1).
- `TranspositionTable = ApproxTable` -- detect when different move sequences reach the same counter value.
- `virtual_loss()` returns 500. During parallel search (covered in the how-to guide), a thread temporarily penalizes a node it's exploring so other threads avoid duplicating work. The value should be larger than any realistic evaluation.
- `cycle_behaviour()` tells the search what to do when different move paths loop back to the same position. `UseCurrentEvalWhenCycleDetected` stops expanding and uses the node's current evaluation.
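To make the virtual-loss mechanic concrete, here is a standalone sketch using the standard UCT score, avg reward + C * sqrt(ln(parent visits) / child visits). How the crate applies the penalty internally is an assumption here, modeled as subtracting the loss from the node's total reward while a thread holds it:

```rust
// UCT score for a child node (standard formula, not crate internals).
fn uct(total_reward: f64, visits: f64, parent_visits: f64, c: f64) -> f64 {
    total_reward / visits + c * (parent_visits.ln() / visits).sqrt()
}

fn main() {
    let c = 5.0;
    // A child with 100 visits and a healthy average reward of 90.
    let normal = uct(9000.0, 100.0, 1000.0, c);
    // The same child while another thread explores it: a virtual loss of 500
    // is assumed subtracted from its total reward (with one extra visit),
    // so other threads see a worse score and look elsewhere.
    let penalized = uct(9000.0 - 500.0, 101.0, 1000.0, c);
    println!("normal = {normal:.2}, penalized = {penalized:.2}");
    assert!(penalized < normal);
}
```

Because the penalty (500) dwarfs any realistic evaluation (at most 100 here), the penalized node drops out of contention until the exploring thread finishes and the loss is removed.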
## Run it
```rust
fn main() {
    let game = CountingGame(0);
    let mut mcts = MCTSManager::new(
        game,
        MyMCTS,
        MyEvaluator,
        UCTPolicy::new(5.0),
        ApproxTable::new(1024),
    );
    mcts.playout_n(100000);
    let pv: Vec<_> = mcts
        .principal_variation_states(10)
        .into_iter()
        .map(|x| x.0)
        .collect();
    println!("Principal variation: {:?}", pv);
    println!("Evaluation of moves:");
    mcts.tree().debug_moves();
}
```
MCTSManager::new() takes five arguments: the initial game state, the MCTS configuration, the evaluator, the tree policy (with exploration constant C=5.0), and the transposition table (1024-slot approximate table).
playout_n(100_000) runs 100,000 iterations. Each iteration walks the four phases: select, expand, simulate, backpropagate.
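The four phases can be sketched end to end on the counting game. This is a simplified standalone model, not the crate's implementation: no transposition table, no virtual loss, and the evaluation (the counter value) stands in for simulation.

```rust
// One playout = select down by best UCT, expand a leaf, evaluate it,
// and backpropagate the reward along the path.
struct Node {
    state: i64, // counter value at this node
    visits: f64,
    total_reward: f64,
    children: Vec<Node>,
}

impl Node {
    fn new(state: i64) -> Self {
        Node { state, visits: 0.0, total_reward: 0.0, children: vec![] }
    }

    fn uct(&self, parent_visits: f64, c: f64) -> f64 {
        if self.visits == 0.0 {
            f64::INFINITY // unvisited children are tried first
        } else {
            self.total_reward / self.visits
                + c * (parent_visits.ln() / self.visits).sqrt()
        }
    }
}

fn playout(node: &mut Node, c: f64) -> f64 {
    let reward = if node.children.is_empty() {
        // Expand: one child per move (Add, Sub), unless terminal at 100.
        if node.state != 100 {
            node.children.push(Node::new(node.state + 1)); // Add
            node.children.push(Node::new(node.state - 1)); // Sub
        }
        // Evaluate: the counter value itself, as in MyEvaluator.
        node.state as f64
    } else {
        // Select: descend into the child with the best UCT score.
        let parent_visits = node.visits;
        let best = node
            .children
            .iter_mut()
            .max_by(|a, b| {
                a.uct(parent_visits, c)
                    .partial_cmp(&b.uct(parent_visits, c))
                    .unwrap()
            })
            .unwrap();
        playout(best, c)
    };
    // Backpropagate: update statistics on the way back up.
    node.visits += 1.0;
    node.total_reward += reward;
    reward
}

fn main() {
    let mut root = Node::new(0);
    for _ in 0..10_000 {
        playout(&mut root, 5.0);
    }
    // The Add child (state 1) should dominate the Sub child (state -1).
    let (add, sub) = (&root.children[0], &root.children[1]);
    println!("Add visits = {}, Sub visits = {}", add.visits, sub.visits);
}
```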
principal_variation_states(10) extracts the principal variation (PV) -- the best sequence of moves found by search, following the most-visited child at each level. Think of it as "what MCTS thinks will happen if both sides play optimally." This returns up to 10 states along that path.
debug_moves() prints statistics for each child of the root: visit count, average reward, and move name. You'll see Add with far more visits and a higher average reward than Sub.
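Following the most-visited child is simple enough to sketch against a hypothetical tree (the `Node` struct below is illustrative, not the crate's internal type):

```rust
// Principal variation extraction: walk from the root, always taking the
// most-visited child, and record the states along the way.
struct Node {
    state: i64, // counter value at this node
    visits: u64,
    children: Vec<Node>,
}

fn principal_variation(root: &Node, limit: usize) -> Vec<i64> {
    let mut pv = vec![root.state];
    let mut node = root;
    while pv.len() < limit {
        match node.children.iter().max_by_key(|c| c.visits) {
            Some(best) => {
                pv.push(best.state);
                node = best;
            }
            None => break, // leaf: nothing expanded below here
        }
    }
    pv
}

fn main() {
    // Tiny hand-built tree: Add (heavily visited) vs Sub (barely visited).
    let root = Node {
        state: 0,
        visits: 100,
        children: vec![
            Node {
                state: 1,
                visits: 93,
                children: vec![
                    Node { state: 2, visits: 80, children: vec![] },
                    Node { state: 0, visits: 5, children: vec![] },
                ],
            },
            Node { state: -1, visits: 7, children: vec![] },
        ],
    };
    println!("{:?}", principal_variation(&root, 10)); // prints [0, 1, 2]
}
```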
Expected output (exact numbers vary between runs):
```text
Principal variation: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Evaluation of moves:
Add [99995 visits] [99.89701485074254 avg reward]
Sub [7 visits] [-1 avg reward]
```
The search overwhelmingly prefers Add — nearly all 100,000 playouts go through it. The principal variation climbs 0, 1, 2, 3, ... toward 100. MCTS found the obvious optimal strategy through sampling alone.
## The full picture
The complete example is at examples/counting_game.rs in the repository. Run it:
```sh
cargo run --example counting_game
```
## What's next
The counting game has one player and one obvious strategy. In Two-Player Games, you'll add an opponent and adversarial evaluation.