HACKING

   1 This is brief developer-oriented overview in Pachi structure.
   2
   3 Pachi is completely Go-specific (c.f. Fuego; though e.g. atari go support
   4 should be easy to add), but fairly modular. It has been built with focus
   5 on MonteCarlo-based play, but it can in principle be used for other
   6 play engines as well.
   7
   8
   9 Basic architecture
  10 ==================
  11
  12 Pachi consists of the following components:
  13
  14
  15   +------+    +--------+    +---------+    +------------------+
  16   | core | -- | engine | -- | playout | -- | tactical library |
  17   +------+    +--------+    +---------+    +------------------+
  18                      |        |
  19                   +-------------+
  20                   | aux library |
  21                   +-------------+
  22
  23 * "core" takes care of the program's lifetime, GTP interface and basic
  24   fast Go board implementation
  25
  26         pachi.c         global initialization and the main loop
  27         version.h       current version information
  28         debug.h         debugging infrastructure
  29         random.[ch]     fast random number generator
  30         gtp.[ch]        GTP protocol interface
  31         network.[ch]    Network interface (useful for distributed engine)
  32         timeinfo.[ch]   Time-keeping information
  33         stone.[ch]      one board point coloring definition
  34         move.[ch]       one board move definition
  35         board.[ch]      board definition and basic interface
  36
  37 * "aux library" provides extra functions like static tactical evaluation
  38   and pattern matching; it is somewhat interwound with "core" component
  39
  40         mq.h            "move queue" data structure
  41         stats.h         "move statistics" data structure
  42         probdist.[ch]   "probability distribution" data structure
  43         ownermap.[ch]   simulation-based finalpos. "owner map" data structure
  44         pattern3.[ch]   fast 3x3 spatial pattern matcher
  45
  46 * "tactical library" provides extended interfaces for the go board,
  47   most important non-trivial tactical information
  48
  49         tactics/
  50
  51 * "engine" receives notifications about opponent moves and is asked
  52   to generate a move to play on given board
  53
  54         engine.h        abstract engine interface
  55         random/         example "random move generator" engine
  56         replay/         example "playout move generator" engine
  57         montecarlo/     simple treeless Monte Carlo engine, quite bitrotten
  58         uct/            the main UCT-player engine, see below
  59         distributed/    "meta-engine" for distributed play by orchestrating
  60                                 several UCT engines on different computers
  61         joseki/         auxiliary engine for generating joseki patterns
  62
  63 * "playout" policy is asked to generate moves to play during the Monte Carlo
  64   simulations, and to provide rough evaluation of moves feasibility for
  65   the engine
  66
  67         playout.[ch]    abstract playout policy interface,
  68                                 Monte Carlo simulation execution
  69         playout/light   uniformly random playout policy
  70         playout/moggy   rule-based "Mogo-like" playout policy
  71
  72 * Also, several ways of testing Pachi are provided:
  73
  74         t-unit/         interface for writing unit-tests for specific
  75                                 functionality, mainly tactics
  76         t-play/         interface for testing performance by playing games
  77                                 against a fixed opponent (e.g. GNUGo)
  78
  79
  80 UCT architecture
  81 ================
  82
  83 The UCT engine (the proper name should be MCTS as it does not have that
  84 much common with classic UCT now) has non-trivial structure by itself:
  85
  86   +-------------+    +-----+     +-------------------+
  87   | node policy | -- | UCT | --- | node prior-hinter |
  88   +-------------+    +-----+     +-------------------+
  89                         |           |
  90                    +---------+      |
  91                    | playout | -----'
  92                    +---------+
  93
  94 * "UCT" is the core of the engine
  95
  96         uct.[ch]        engine initialization, public interface
  97         internal.h      internal state and data structures
  98         tree.[ch]       minimax move tree with success statistics
  99         walk.[ch]       filling the tree by walking it many times
 100                                 and running MC simulations from leaves
 101         slave.[ch]      engine interface for the distributed engine
 102
 103 * "node prior-hinter" assigns newly created nodes preliminary success
 104   statistics ("prior values") to focus the search better
 105
 106         prior.[ch]      variety of methods for setting the priors
 107
 108 * "node policy" mainly chooses the current node's child to descend
 109   through during the tree walk, based on the already recorded statistics;
 110   it must balance exploration and exploitation well during the selection
 111
 112         policy/ucb1     the old-school original simple policy
 113         policy/ucb1amaf the AMAF/RAVE-based policy gathering statistics rapidly
 114
 115 * "dynkomi driver" dynamically determines self-imposed extra virtual komi
 116
 117         dynkomi.[ch]
 118
 119
 120 Board Implementation
 121 ====================
 122
 123 The infrastructure is optimized for speed to make it well suited
 124 for bruteforce engines, however tradeoffs are made to make it useful
 125 for heavier MonteCarlo playouts as well (e.g. real liberties are
 126 tracked instead of pseudoliberties). If you are looking for raw
 127 light playout speed, libEGO is better choice.
 128
 129 In general, arbitrary board sizes are supported; however, board sizes
 130 smaller than 9x9 have not been tested and board sizes larger than 25x25
 131 are not supported by GTP - also, in theory some buffer overflows might
 132 happen with board sizes larger than 19x19. The engine parameters are
 133 primarily tuned for 19x19 play.
 134
 135 Ruleset
 136 -------
 137
 138 While the Pachi engines generally play according to Chinese rules,
 139 internally, Pachi uses Tromp-Taylor rules because they are simple,
 140 fast and universal; they are very close to the New Zealand rules.
 141 That means, it simply counts the number of stones and one-point eyes
 142 of each color on the board, plus komi and handicap correction.
 143
 144 Tromp-Taylor rules also mean that multi-stone suicide is allowed! If you
 145 do not like that (basically if you want to pretend it plays according
 146 to Chinese rules), you need to rule that out in your engine, currently.
 147 The provided engines DO avoid multi-stone suicide, though it is allowed
 148 in the playouts for performance reasons (perhaps we should re-visit that
 149 decision in light of heavy playouts).
 150
 151 Tromp-Taylor rules have positional superko; the board implementation
 152 will set a flag if it is violated, but play the move anyway. You need
 153 to enforce the superko rule in your engine.
 154
 155
 156 GTP Implementation
 157 ==================
 158
 159 ...is a very sad hack. ENSURE that only trusted parties talk to Pachi's
 160 GTP interface, as it is totally non-resilient to any kind of overflow
 161 or bad input attacks and allowing arbitrary input to be entered within
 162 is a major security hole. Yes, this needs to be cleaned up. Also, currently
 163 engines cannot plug in their own commands and there is no GoGui interface.
 164
 165 Pachi supports only few GTP commands now. Most importantly, it does not
 166 support the undo command.  The final_status_list command requires engine
 167 support.
 168
 169
 170 Plugin API
 171 ==========
 172
 173 The UCT engine allows external plugins to be loaded and provide external
 174 knowledge for various heuristics - currently just biasing the MCTS priors,
 175 but more can be added easily. The plugins should follow the API in
 176 <uct/plugin.h> - see that file for details.
 177
 178
 179 Joseki Patterns
 180 ===============
 181
 182 Joseki patterns can be generated from variations of a SGF file. Josekis
 183 are matched per-quadrant, no other move must be within the quadrant. See
 184 joseki/README for details on usage of the auxiliary engine and a full
 185 recipe for making Pachi play according to joseki sequences extracted from
 186 the Kogo dictionary.
 187
 188
 189 Opening Book
 190 ============
 191
 192 The UCT engine can "pre-read" the starting board position and
 193 dump the core of the built tree to a file, loading it later. This is
 194 called a 'tbook' (as in "tree book") and can be generated using the
 195 tools/gentbook.sh script. The newly generated file is automatically
 196 used by the UCT engine when found.
 197
 198 Alternatively, there is a support for directly used opening book
 199 (so-called fbook, a.k.a. "forced book" or "fuseki book"). The book
 200 is stored in a text file in Fuego-compatible format and can be loaded
 201 using the ./pachi -f parameter. A naive way to build such a book
 202 based on shell-based, twogtp-based UCT is available through the
 203 tools/autobook/ framework.
 204
 205
 206 Local Trees
 207 ===========
 208
 209 This is a mostly unique idea in Pachi, currently in development;
 210 it does not work too well yet, but Pasky has great hopes for it in
 211 the future.
 212
 213 A local tree is a tree comprising of (CFG-topologically) local
 214 sequences, built in parallel with the real game tree; there is
 215 a separate tree for black-first and white-first sequences. E.g if
 216 the game tree (on empty board) consists of D4-C6-D6-Q16-O17-D7,
 217 the coresponding local tree sequences are (black) D4-C6-D6, (white)
 218 Q16-O17 and (white) D7. The simulation result is then recorded as
 219 usual in the game tree, but also in the corresponding local trees.
 220
 221 The goal of this is to dynamically build a cache of various
 222 resolutions of local situations. This is to overcome cases where
 223 the proper resolution of a given situation is easily found in
 224 the tree, but improper heuristical bisaes keep muddying the
 225 evaluation down.
 226
 227 Exactly what kind of values to store in the local tree nodes is
 228 still an open question. The original approach has been to simply
 229 use simulation results directly or biased on base "value temperature";
 230 now, we are experimenting with survival rate of the "base stone"
 231 of the branch (the first move in the branch sequence). We also intend
 232 to integrate criticality information with local tree data.
 233
 234 The local trees are descended in parallel with the main tree.
 235 The values stored in local trees are added to the RAVE term
 236 from the main tree during node selection. We also wish to use
 237 the information from local trees in the simulations, but this is
 238 still subject to research.
 239
 240 To enable local trees usage, pass local_tree=1 to the UCT engine.
 241 There are many further knobs to twiddle, too.