This repository includes various tools used to analyze the playing style of Go players:

- the neural network (NN) training and running code (see subfolder `gnet')
- a simple Python library used for
    - creating neural network training files from pattern files
    - doing PCA analysis of pattern files

Python gostyle library and example programs:
    gostyle.py - library code
    make_train_set.py - script that makes a training set for a NN
    make_input_vector.py - creates input vectors for a trained NN
    pca.py - performs a PCA analysis on vectors made from pattern files and prints the result in a gnuplot-friendly format
    data_about_players.py - contains data about players and their strategies

Suppose you want to generate input vectors from player pattern files (patfiles) and output vectors based on expert knowledge.
1. Create an InputVectorGenerator object:
    >>> i = InputVectorGenerator(main_pat_filename, num_features)
2. Then you can print some input vectors. The pattern files (e.g. for 'Superman') must exist.
    >>> for name in ['Cho-Chikun', 'Superman', 'Toya Koyo']:
    ...     print(i(name))
3. Now you realize you want to do PCA:
    >>> list_of_input_vectors = [i(name) for name in ['Cho-Chikun', 'Superman', 'Toya Koyo']]
    >>> pca = PCA(list_of_input_vectors, reduce=True)
    >>> reduced_list_of_vectors = pca.process_list_of_vectors(list_of_input_vectors)
4. Now we generate output vectors:
    >>> style_dict = {'Cho-Chikun': [1, 1], 'Superman': [666, 666], 'Toya Koyo': [1, 5]}
    >>> output_vectors = [style_dict[name] for name in ['Cho-Chikun', 'Superman', 'Toya Koyo']]
5. You could also write the following. It does the same thing, but lets you switch to a different output vector generator later while keeping the rest of your code unchanged:
    >>> style_dict = {'Cho-Chikun': [1, 1], 'Superman': [666, 666], 'Toya Koyo': [1, 5]}
    >>> o = PlanarOutputVectorGenerator(style_dict)
    >>> output_vectors = [o(name) for name in ['Cho-Chikun', 'Superman', 'Toya Koyo']]

See the densely commented code for other examples.
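
The difference between steps 4 and 5 is only in how the lookup is wrapped. The sketch below illustrates the idea with a hypothetical class, DictOutputVectorGenerator, written purely for illustration. It is not the PlanarOutputVectorGenerator from gostyle.py, whose actual behaviour may differ.

```python
class DictOutputVectorGenerator(object):
    """Hypothetical stand-in for illustration, NOT a class from gostyle.py.

    Wraps a {player name: output vector} mapping in a callable object, so the
    calling code only ever writes generator(name).
    """

    def __init__(self, vectors_by_name):
        # Copy the mapping so later changes made by the caller do not leak in.
        self.vectors_by_name = dict(vectors_by_name)

    def __call__(self, name):
        # Raises KeyError for an unknown player, like a plain dict lookup.
        # Returns a copy so callers cannot modify the stored vector.
        return list(self.vectors_by_name[name])


names = ['Cho-Chikun', 'Toya Koyo']
styles = {'Cho-Chikun': [1, 1], 'Toya Koyo': [1, 5]}

o = DictOutputVectorGenerator(styles)
output_vectors = [o(name) for name in names]
```

Because the list comprehension only relies on `o(name)`, any other callable with the same interface could be dropped in without touching the rest of the code.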

If you want to understand what is going on:

We try to determine players' strategies and characteristics by inspecting pattern files. A pattern file is a file that
has patterns in it :-) A pattern is a set of features (such as the distance from the border, the shape of stones on the goban, ...) describing a single move.
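
To make this more concrete, here is a rough sketch of how per-player pattern counts could be turned into a numeric vector. Everything in it is an assumption made for illustration: the '<count> <pattern>' line format, the function names, and the choice of relative frequency as the feature value. The real pattern files and the InputVectorGenerator in gostyle.py may work differently.

```python
from collections import Counter


def parse_pattern_counts(lines):
    """Parse lines of the ASSUMED form '<count> <pattern>' into a Counter."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        count, pattern = line.split(None, 1)
        counts[pattern] += int(count)
    return counts


def frequency_vector(counts, feature_patterns):
    """Relative frequency of each chosen pattern among all counted patterns."""
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(feature_patterns)
    # A pattern the player never used gets frequency 0.0.
    return [counts[p] / float(total) for p in feature_patterns]


# Made-up data: the pattern names are opaque placeholders, not real patterns.
player_lines = ["6 patA", "3 patB", "1 patC"]
features = ["patA", "patB", "patD"]

vector = frequency_vector(parse_pattern_counts(player_lines), features)
```

The resulting vector has one entry per chosen pattern, so every player is described by a vector of the same length regardless of how many games were processed.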

PREPARING THE DATASET:

1. Assemble a list of players you are interested in, one name per line,
   in some file, e.g. players.list.

2. Get a large SGF collection with an index file listing games by players.

   We use GoGoD 2008 Winter as our reference database.

3. Get the Pachi source tree and build the binary yourself. Copy over these
   files: zzgo sgf2gtp.pl patterns.spat pattern_byplayer.sh

4. Create a directory where you will stash raw lists of patterns for each
   player encountered - e.g. rawpats/.

5. For each game in the SGF collection where at least one of the players
   is interesting, run ./pattern_byplayer.sh rawpats/ $SGFFILE. You can
   ignore warnings/errors sometimes printed for individual files.

   If importing from GoGoD, you can use:

       ./data_gogod.sh $PATH_TO_GOGOD players.list rawpats/

   See its header for usage details.

6. Create a directory where the pattern summaries by frequency for
   interesting players will go - e.g. pats/.

7. Fill up pats/ from rawpats/ based on players.list. You can use the pre-made
   script: run ./data_summarize.sh players.list rawpats/ pats/.

8. pats/ now contains the dataset that can be analyzed by other tools.
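
The game selection in step 5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the (sgf_path, black_name, white_name) tuples stand in for whatever the index file of your SGF collection provides, and the helper names are hypothetical. The sketch only builds the command lines and does not run anything.

```python
def read_players(lines):
    """Return the set of player names, one per non-empty line."""
    return set(line.strip() for line in lines if line.strip())


def pattern_commands(games, players, rawpats_dir="rawpats/"):
    """Build one pattern_byplayer.sh command per game with an interesting player.

    `games` is an iterable of (sgf_path, black_name, white_name) tuples. This
    shape is an ASSUMPTION; the index file format is not described here.
    """
    commands = []
    for sgf_path, black, white in games:
        # Keep the game if at least one of the two players is interesting.
        if black in players or white in players:
            commands.append(["./pattern_byplayer.sh", rawpats_dir, sgf_path])
    return commands


# Made-up example data standing in for players.list and the game index.
players = read_players(["Cho-Chikun\n", "\n", "Toya Koyo\n"])
games = [
    ("games/0001.sgf", "Cho-Chikun", "Somebody Else"),
    ("games/0002.sgf", "Somebody Else", "Another Player"),
    ("games/0003.sgf", "Another Player", "Toya Koyo"),
]
commands = pattern_commands(games, players)
```

Each entry of `commands` is an argument list that could be handed to `subprocess.call` from the directory containing the copied Pachi files. With real data, `read_players` would be given the open players.list file instead of a literal list.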