HACKING

   1 This is brief developer-oriented overview in Pachi structure.
   2
   3 Pachi is completely Go-specific (c.f. Fuego; though e.g. atari go support
   4 should be easy to add), but fairly modular. It has been built with focus
   5 on MonteCarlo-based play, but it can in principle be used for other
   6 play engines as well.
   7
   8
   9 Basic architecture
  10 ==================
  11
  12 Pachi consists of the following components:
  13
  14
  15   +------+    +--------+    +---------+
  16   | core | -- | engine | -- | playout |
  17   +------+    +--------+    +---------+
  18                      |        |
  19                   +-------------+
  20                   | aux library |
  21                   +-------------+
  22
  23 * "core" takes care of the program's lifetime, GTP interface and basic
  24   fast Go board implementation
  25
  26         zzgo.c          global initialization and the main loop
  27         version.h       current version information
  28         debug.h         debugging infrastructure
  29         random.[ch]     fast random number generator
  30         gtp.[ch]        GTP protocol interface
  31         timeinfo.[ch]   Time-keeping information
  32         stone.[ch]      one board point coloring definition
  33         move.[ch]       one board move definition
  34         board.[ch]      board definition and basic interface
  35
  36 * "aux library" provides extra functions like static tactical evaluation
  37   and pattern matching; it is somewhat interwound with "core" component
  38
  39         tactics.[ch]    extended interfaces for the go board
  40         mq.h            "move queue" data structure
  41         stats.h         "move statistics" data structure
  42         probdist.[ch]   "probability distribution" data structure
  43         ownermap.[ch]   simulation-based finalpos. "owner map" data structure
  44         pattern3.[ch]   fast 3x3 spatial pattern matcher
  45         pattern.[ch]    general multi-feature pattern matcher
  46         network.[ch]    Network interface (useful for distributed engine)
  47
  48 * "engine" receives notifications about opponent moves and is asked
  49   to generate a move to play on given board
  50
  51         engine.h        abstract engine interface
  52         random/         example "random move generator" engine
  53         replay/         example "playout move generator" engine
  54         montecarlo/     simple treeless Monte Carlo engine, quite bitrotten
  55         uct/            the main UCT-player engine, see below
  56         patternscan/    auxiliary engine for harvesting patterns from
  57                                 existing games
  58         distributed/    "meta-engine" for distributed play by orchestrating
  59                                 several UCT engines on different computers
  60
  61 * "playout" policy is asked to generate moves to play during the Monte Carlo
  62   simulations, and to provide rough evaluation of moves feasibility for
  63   the engine
  64
  65         playout.[ch]    abstract playout policy interface,
  66                                 Monte Carlo simulation execution
  67         playout/light   uniformly random playout policy
  68         playout/moggy   rule-based "Mogo-like" playout policy
  69         playout/elo     probdist-based "CrazyStone-like" playout policy
  70
  71 * Also, several ways of testing Pachi are provided:
  72
  73         t-unit/         interface for writing unit-tests for specific
  74                                 functionality, mainly tactics
  75         t-play/         interface for testing performance by playing games
  76                                 against a fixed opponent (e.g. GNUGo)
  77
  78
  79 UCT architecture
  80 ================
  81
  82 The UCT engine has non-trivial structure by itself:
  83
  84   +-------------+    +-----+     +-------------------+
  85   | node policy | -- | UCT | --- | node prior-hinter |
  86   +-------------+    +-----+     +-------------------+
  87                         |           |
  88                    +---------+      |
  89                    | playout | -----'
  90                    +---------+
  91
  92 * "UCT" is the core of the engine
  93
  94         uct.[ch]        engine initialization, public interface
  95         internal.h      internal state and data structures
  96         tree.[ch]       minimax move tree with success statistics
  97         walk.[ch]       filling the tree by walking it many times
  98                                 and running MC simulations from leaves
  99         slave.[ch]      engine interface for the distributed engine
 100
 101 * "node prior-hinter" assigns newly created nodes preliminary success
 102   statistics ("prior values") to focus the search better
 103
 104         prior.[ch]      variety of methods for setting the priors
 105
 106 * "node policy" mainly chooses the current node's child to descend
 107   through during the tree walk, based on the already recorded statistics;
 108   it must balance exploration and exploitation well during the selection
 109
 110         policy/ucb1     the old-school original simple policy
 111         policy/ucb1amaf the AMAF/RAVE-based policy gathering statistics rapidly
 112
 113 * "dynkomi driver" dynamically determines self-imposed extra virtual komi
 114
 115         dynkomi.[ch]
 116
 117
 118 Board Implementation
 119 ====================
 120
 121 The infrastructure is optimized for speed to make it well suited
 122 for bruteforce engines, however tradeoffs are made to make it useful
 123 for heavier MonteCarlo playouts as well (e.g. real liberties are
 124 tracked instead of pseudoliberties). If you are looking for raw
 125 light playout speed, libEGO is better choice.
 126
 127 In general, arbitrary board sizes are supported; however, board sizes
 128 smaller than 9x9 have not been tested and board sizes larger than 25x25
 129 are not supported by GTP - also, in theory some buffer overflows might
 130 happen with board sizes larger than 19x19. The engine parameters are
 131 primarily tuned for 19x19 play.
 132
 133 Ruleset
 134 -------
 135
 136 While the Pachi engines generally play according to Chinese rules,
 137 internally, Pachi uses Tromp-Taylor rules because they are simple,
 138 fast and universal; they are very close to the New Zealand rules.
 139 That means, it simply counts the number of stones and one-point eyes
 140 of each color on the board, plus komi and handicap correction.
 141
 142 Tromp-Taylor rules also mean that multi-stone suicide is allowed! If you
 143 do not like that (basically if you want to pretend it plays according
 144 to Chinese rules), you need to rule that out in your engine, currently.
 145 The provided engines DO avoid multi-stone suicide, though it is allowed
 146 in the playouts for performance reasons (perhaps we should re-visit that
 147 decision in light of heavy playouts).
 148
 149 Tromp-Taylor rules have positional superko; the board implementation
 150 will set a flag if it is violated, but play the move anyway. You need
 151 to enforce the superko rule in your engine.
 152
 153
 154 GTP Implementation
 155 ==================
 156
 157 ...is a very sad hack. ENSURE that only trusted parties talk to Pachi's
 158 GTP interface, as it is totally non-resilient to any kind of overflow
 159 or bad input attacks and allowing arbitrary input to be entered within
 160 is a major security hole. Yes, this needs to be cleaned up. Also, currently
 161 engines cannot plug in their own commands and there is no GoGui interface.
 162
 163 Pachi supports only few GTP commands now. Most importantly, it does not
 164 support the undo command.  The final_status_list command requires engine
 165 support.
 166
 167
 168 General Pattern Matcher
 169 =======================
 170
 171 Pachi has in-development general pattern matcher that can match various
 172 sets of features (spatial and others), inspired by the CrazyStone pattern
 173 model. Please see pattern.h for detailed description of the pattern concept
 174 and recognized features.
 175
 176 To harvest patterns, use 'zzgo -e patternscan' (see patternscan/patternscan.c
 177 for available options).  The output of the pattern scanner are two data
 178 structures: The matched patterns
 179
 180         (feature1:payload feature2:payload ...)
 181
 182 and spatial dictionary. "Spatial" feature represents a particular
 183 configuration of stones in a circle around the move-to-play; each
 184 configuration has its own record in the dictionary and the spatial
 185 feature references only the id in the dictionary; so you need to keep
 186 both the patterns and the "patterns.spat" file.  Normally, 'patternscan'
 187 will only match already existing dictionary entries, but you
 188 can pass it "gen_spat_dict" so that it appends all newly found spatial
 189 features to the dictionary - use "spat_threshold" to limit recording
 190 only to frequently occuring spatial features; to start the dictionary
 191 from scratch, simply remove any existing "patterns.spat" file.
 192
 193 There are few pre-made scripts to make the initialization of the pattern
 194 matcher easy:
 195
 196 * pattern_byplayer.sh: Sorts out patterns from given SGF collection by
 197   player names, one player per file in a dedicated directory. This is
 198   useful if you want to use the patterns to e.g. recognize games of a
 199   player by characteristic patterns. Spatial dictionary is autogenerated
 200   in full.
 201
 202 * pattern_spatial_gen.sh: Initializes spatial dictionary by spatial features
 203   found at least N times in given SGF collection.  This is useful for
 204   further gathering of general pattern statistics while keeping the amount
 205   of spatial features manageable.
 206
 207 * pattern_spatial_show.pl ID: Shows spatial pattern of given id in 2D plane.
 208
 209 * pattern_mm.sh: Combines patternsacn engine and the MM tool (see below),
 210   producing gamma values for harvested patterns.
 211
 212 Minorization-majorization (CrazyStone patterns)
 213 -----------------------------------------------
 214
 215 The pattern harvester can be used together with the MM tool by Remi Coulom:
 216
 217         http://remi.coulom.free.fr/Amsterdam2007/mm.tar.bz2
 218
 219 This tool will compute relative strength of individual features for teaming
 220 them up and using the outcoming probability distribution for generating moves.
 221 There is a script that will take you from SGF game collection to gamma values
 222 in single shot - "pattern_mm.sh".
 223
 224 The resulting "patterns.gamma" file contains mapping from feature instances
 225 to gamma floats, representing the features strength; note that it is totally
 226 meaningless without the accompanying "patterns.spat" file generated by the
 227 pattern_gather script. This file is used for "full-scale" pattern matching
 228 used for tree node priors. Another "fast" pattern feature set is used for
 229 generation of in-playout probability distribution - gamma values for these
 230 features are stored in "patterns.gammaf".
 231
 232 To make Pachi use the gamma values for tree bias and in MC playouts, use the
 233 "elo" playout policy - but note that it's still in heavy development.
 234
 235
 236 Plugin API
 237 ==========
 238
 239 The UCT engine allows external plugins to be loaded and provide external
 240 knowledge for various heuristics - currently just biasing the MCTS priors,
 241 but more can be added easily. The plugins should follow the API in
 242 <uct/plugin.h> - see that file for details.