HACKING

   1 This is brief developer-oriented overview in Pachi structure.
   2
   3 Pachi is completely Go-specific (c.f. Fuego; though e.g. atari go support
   4 should be easy to add), but fairly modular. It has been built with focus
   5 on MonteCarlo-based play, but it can in principle be used for other
   6 play engines as well.
   7
   8
   9 Basic architecture
  10 ==================
  11
  12 Pachi consists of the following components:
  13
  14
  15   +------+    +--------+    +---------+
  16   | core | -- | engine | -- | playout |
  17   +------+    +--------+    +---------+
  18                      |        |
  19                   +-------------+
  20                   | aux library |
  21                   +-------------+
  22
  23 * "core" takes care of the program's lifetime, GTP interface and basic
  24   fast Go board implementation
  25
  26         zzgo.c          global initialization and the main loop
  27         version.h       current version information
  28         debug.h         debugging infrastructure
  29         random.[ch]     fast random number generator
  30         gtp.[ch]        GTP protocol interface
  31         stone.[ch]      one board point coloring definition
  32         move.[ch]       one board move definition
  33         board.[ch]      board definition and basic interface
  34
  35 * "aux library" provides extra functions like static tactical evaluation
  36   and pattern matching; it is somewhat interwound with "core" component
  37
  38         tactics.[ch]    extended interfaces for the go board
  39         mq.h            "move queue" data structure
  40         stats.h         "move statistics" data structure
  41         probdist.[ch]   "probability distribution" data structure
  42         ownermap.[ch]   simulation-based finalpos. "owner map" data structure
  43         pattern3.[ch]   fast 3x3 spatial pattern matcher
  44         pattern.[ch]    general multi-feature pattern matcher
  45
  46 * "engine" receives notifications about opponent moves and is asked
  47   to generate a move to play on given board
  48
  49         engine.h        abstract engine interface
  50         random/         example "random move generator" engine
  51         replay/         example "playout move generator" engine
  52         montecarlo/     simple treeless Monte Carlo engine, quite bitrotten
  53         uct/            the main UCT-player engine, see below
  54         patternscan/    auxiliary engine for harvesting patterns from
  55                                 existing games
  56
  57 * "playout" policy is asked to generate moves to play during the Monte Carlo
  58   simulations, and to provide rough evaluation of moves feasibility for
  59   the engine
  60
  61         playout.[ch]    abstract playout policy interface,
  62                                 Monte Carlo simulation execution
  63         playout/light   uniformly random playout policy
  64         playout/moggy   rule-based "Mogo-like" playout policy
  65         playout/elo     probdist-based "CrazyStone-like" playout policy
  66
  67 * Also, several ways of testing Pachi are provided:
  68
  69         t-unit/         interface for writing unit-tests for specific
  70                                 functionality, mainly tactics
  71         t-play/         interface for testing performance by playing games
  72                                 against a fixed opponent (e.g. GNUGo)
  73
  74
  75 UCT architecture
  76 ================
  77
  78 The UCT engine has non-trivial structure by itself:
  79
  80   +-------------+    +-----+     +-------------------+
  81   | node policy | -- | UCT | --- | node prior-hinter |
  82   +-------------+    +-----+     +-------------------+
  83                         |           |
  84                    +---------+      |
  85                    | playout | -----'
  86                    +---------+
  87
  88 * "UCT" is the core of the engine
  89
  90         uct.[ch]        engine initialization, public interface
  91         internal.h      internal state and data structures
  92         tree.[ch]       minimax move tree with success statistics
  93         walk.[ch]       filling the tree by walking it many times
  94                                 and running MC simulations from leaves
  95
  96 * "node prior-hinter" assigns newly created nodes preliminary success
  97   statistics ("prior values") to focus the search better
  98
  99         prior.[ch]      variety of methods for setting the priors
 100
 101 * "node policy" mainly chooses the current node's child to descend
 102   through during the tree walk, based on the already recorded statistics;
 103   it must balance exploration and exploitation well during the selection
 104
 105         policy/ucb1     the old-school original simple policy
 106         policy/ucb1amaf the AMAF/RAVE-based policy gathering statistics rapidly
 107
 108
 109 Board Implementation
 110 ====================
 111
 112 The infrastructure is optimized for speed to make it well suited
 113 for bruteforce engines, however tradeoffs are made to make it useful
 114 for heavier MonteCarlo playouts as well (e.g. real liberties are
 115 tracked instead of pseudoliberties). If you are looking for raw
 116 light playout speed, libEGO is better choice.
 117
 118 Ruleset
 119 -------
 120
 121 While the Pachi engines generally play according to Chinese rules,
 122 internally, Pachi uses Tromp-Taylor rules because they are simple,
 123 fast and universal; they are very close to the New Zealand rules.
 124 That means, it simply counts the number of stones and one-point eyes
 125 of each color on the board, plus komi and handicap correction.
 126
 127 Tromp-Taylor rules also mean that multi-stone suicide is allowed! If you
 128 do not like that (basically if you want to pretend it plays according
 129 to Chinese rules), you need to rule that out in your engine, currently.
 130 The provided engines DO avoid multi-stone suicide (but the UCT engine
 131 will never play it itself).
 132
 133 Tromp-Taylor rules have positional superko; the board implementation
 134 will set a flag if it is violated, but play the move anyway. You need
 135 to enforce the superko rule in your engine.
 136
 137
 138 GTP Implementation
 139 ==================
 140
 141 ...is a very sad hack. ENSURE that only trusted parties talk to Pachi's
 142 GTP interface, as it is totally non-resilient to any kind of overflow
 143 or bad input attacks and allowing arbitrary input to be entered within
 144 is a major security hole. Yes, this needs to be cleaned up. Also, currently
 145 engines cannot plug in their own commands and there is no GoGui interface.
 146
 147 Pachi supports only few GTP commands now. Most importantly, it does not
 148 support the undo command and it does not support time-keeping.
 149 The final_status_list command requires engine support.
 150
 151
 152 General Pattern Matcher
 153 =======================
 154
 155 Pachi has in-development general pattern matcher that can match various
 156 sets of features (spatial and others), inspired by the CrazyStone pattern
 157 model. Please see pattern.h for detailed description of the pattern concept
 158 and recognized features.
 159
 160 To harvest patterns, use 'zzgo -e patternscan' (see patternscan/patternscan.c
 161 for available options).  The output of the pattern scanner are two data
 162 structures: The matched patterns
 163
 164         (feature1:payload feature2:payload ...)
 165
 166 and spatial dictionary. "Spatial" feature represents a particular
 167 configuration of stones in a circle around the move-to-play; each
 168 configuration has its own record in the dictionary and the spatial
 169 feature references only the id in the dictionary; so you need to keep
 170 both the patterns and the "patterns.spat" file.  Normally, 'patternscan'
 171 will only match already existing dictionary entries, but you
 172 can pass it "gen_spat_dict" so that it appends all newly found spatial
 173 features to the dictionary - use "spat_threshold" to limit recording
 174 only to frequently occuring spatial features; to start the dictionary
 175 from scratch, simply remove any existing "patterns.spat" file.
 176
 177 There are few pre-made scripts to make the initialization of the pattern
 178 matcher easy:
 179
 180 * pattern_byplayer.sh: Sorts out patterns from given SGF collection by
 181   player names, one player per file in a dedicated directory. This is
 182   useful if you want to use the patterns to e.g. recognize games of a
 183   player by characteristic patterns. Spatial dictionary is autogenerated
 184   in full.
 185
 186 * pattern_spatial_gen.sh: Initializes spatial dictionary by spatial features
 187   found at least N times in given SGF collection.  This is useful for
 188   further gathering of general pattern statistics while keeping the amount
 189   of spatial features manageable.
 190
 191 * pattern_spatial_show.pl ID: Shows spatial pattern of given id in 2D plane.
 192
 193 * pattern_mm.sh: Combines patternsacn engine and the MM tool (see below),
 194   producing gamma values for harvested patterns.
 195
 196 Minorization-majorization (CrazyStone patterns)
 197 -----------------------------------------------
 198
 199 The pattern harvester can be used together with the MM tool by Remi Coulom:
 200
 201         http://remi.coulom.free.fr/Amsterdam2007/mm.tar.bz2
 202
 203 This tool will compute relative strength of individual features for teaming
 204 them up and using the outcoming probability distribution for generating moves.
 205 There is a script that will take you from SGF game collection to gamma values
 206 in single shot - "pattern_mm.sh".
 207
 208 The resulting "patterns.gamma" file contains mapping from feature instances
 209 to gamma floats, representing the features strength; note that it is totally
 210 meaningless without the accompanying "patterns.spat" file generated by the
 211 pattern_gather script. To make Pachi use the gamma values for tree bias and
 212 in MC playouts, use the "elo" playout policy - but note that it's still in
 213 heavy development.