From eefc999dd473d0c3e902af9a88fcee89a30736d8 Mon Sep 17 00:00:00 2001 From: Joe Moudrik Date: Tue, 28 May 2013 12:32:58 +0200 Subject: [PATCH] clanek++ --- clanek_go_congress/clanek.bib | 14 ++-- clanek_go_congress/clanek.tex | 161 +++++++++++++++++++++++++++++++++++++----- 2 files changed, 150 insertions(+), 25 deletions(-) diff --git a/clanek_go_congress/clanek.bib b/clanek_go_congress/clanek.bib index d296b22..2f2deaa 100644 --- a/clanek_go_congress/clanek.bib +++ b/clanek_go_congress/clanek.bib @@ -1,12 +1,10 @@ % Encoding: UTF-8 -@ONLINE{GoAncientChina, - author = "John Fairbairn", - year = "1995", - title = "Go in Ancient China", - url = "http://www.pandanet.co.jp/English/essay/goancientchina.html", - cited = "15.\,1.\,2013", - owner = "jm" +@TECHREPORT{style_quest, + author = "{M}oud{\v{r}\'{i}}k, {J}osef and {B}audi{\v{s}}, {P}etr", + title = "Style Consensus: Style of Professional Players, Judged by Strong Players", + year = "2013", + month = "May", } @MastersThesis{Moudrik13, @@ -14,9 +12,9 @@ title = "Meta-learning methods for analyzing Go playing trends", school = "Charles University, Faculty of Mathematics and Physics", address = "Prague, Czech Republic", + note = "Currently in preparation.", year = "2013", } - @ONLINE{GoAncientChina, author = "John Fairbairn", year = "1995", diff --git a/clanek_go_congress/clanek.tex b/clanek_go_congress/clanek.tex index a92c8e5..0cd23bb 100644 --- a/clanek_go_congress/clanek.tex +++ b/clanek_go_congress/clanek.tex @@ -59,9 +59,13 @@ algorithms, they can be used to predict arbitrary relevant target variables. We apply this methodology to predict strength and playing style (e.g. territoriality or aggressivity) of a player and make our predictor available as an online tool, a part of the GoStyle project. -By inspecting the dependencies between the evaluations and the target variable, -we are able to tell which patterns are bad or good (in case of strength as the -target variable), or which moves e.g. constitute the territorial style of play. +%% No, na tohle neni v clanku misto, pze to ma mit jen 8 stranek +% navic bych tyhle veci chtel zverejnit i samy o sobe, nejak dukladnejc, +% +%By inspecting the dependencies between the evaluations and the target variable, +%we are able to tell which patterns are bad or good (in case of strength as the +%target variable), or which moves e.g. constitute the territorial style of play. +%% We propose a number of possible applications including seeding real-work ranks of internet players, aiding in Go study and tuning of Go-playing programs, or contribution to Go-theoretical discussion on the scope of ``playing style''. @@ -82,7 +86,7 @@ capture enemy stones. We assume basic familiarity with the game. Since the game has a worldwide popularity, there exist large collections of Go game records, both for amateur players and professionals -\citep{KGS,GoGoD,gokifu}. +e.g. \citep{KGS,GoGoD}. So far, not much has been done in analysing these records using computers. There are programs that serve as tools to study the opening phase of the game by giving simple statistics of next move from professional @@ -134,7 +138,7 @@ sub-vectors we call \emph{features} -- examples include the aforementioned local patterns or statistics of sente and gote sequences. These will be detailed in the rest of this section. Some of the explanations are simplified to fit the size of -the paper, please see \citep{Moudrik13} for precise details. +the paper, please see \citep{Moudrik13} for precise details and algorithms. \subsection{Raw Game Processing} Firstly, we need to specify how do we process the games. @@ -152,15 +156,17 @@ Pachi outputs a list of key-value pairs regarding the current move: \item \textbf{atari flag} --- whether the move put enemy stones in atari, \item \textbf{atari escape flag} --- whether the move saved own stones from atari, \item \textbf{capture} --- number of enemy stones the move captured, - \item \textbf{contiguity to last move} --- the gridcular distance (presented - below) from the last move, - \item \textbf{board edge distance} --- the distance from the nearest edge of the board, + \item \textbf{contiguity to last move} --- the gridcular + distance \footnotemark{\ref{grid}} from the last move, + \item \textbf{board edge distance} --- the distance from + the nearest edge of the board, \item \textbf{spatial pattern} --- configuration of stones around the played move. \end{itemize} We use this information to compute the higher level features given below. The spatial pattern pictures positions of stones around the current move up to a certain distance.\footnote{ + \label{grid} The distance is given by the {\em gridcular} metric $d(x,y) = |\delta x| + |\delta y| + \max(|\delta x|, |\delta y|)$, which produces a circle-like structure on the Go board square grid \cite{SpatPat}. @@ -173,8 +179,8 @@ spatial patterns (together with both atari flags). The list of the $N$ most freq played patterns is computed beforehand from the whole database of games. Given a set of of colored games $GC$ we then count how many times was each of the $N$ -patterns played -- thus obtaining a vector $c, |c| = 400$ of counts. -Like this, however, the particular counts $c_i$ increase proportionally to +patterns played -- thus obtaining a vector $c$ of counts ($|c| = 400$). +With simple occurences count however, particular counts $c_i$ increase proportionally to number of games in $GC$. To maintain invariancy under the number of games in the sample, a normalization is needed. We do this by dividing the $c$ by $|GC|$, though other schemes are possible, see \citep{Moudrik13}. @@ -187,8 +193,11 @@ for human players, we restricted ourselves to what we call $\omega$-local (sente and gote) sequences. The simplification has a clear assumption -- the responses to a sente move are always local. We say, that a move is $\omega$-local (with respect to the previous move) if its gridcular distance from previous move -is smaller than a fixed number $\omega$. Of course, this assumption might not always -hold, but the feature proves to be useful nonetheless. +is smaller than a fixed number $\omega$ (in this work, we used $\omega=10$\footnote{ + Of course, different values are possible, see \citep{Moudrik13} for details. +}). +Of course, this assumption might not always hold, but +the feature proves to be useful nonetheless. We than partition each game into $\omega$-local sequences (that is, each move in the sequence is $\omega$-local with respect to its directly previous move) and observe @@ -322,27 +331,145 @@ error estimation to be robust. \subsubsection*{Cross-Validation} -Cross-validation is a standard statistical technique for estimation of parameters. +Cross-validation is a standard statistical technique for robust estimation of parameters. The idea is to split the data into $k$ disjunct subsets (called \emph{folds}), and then iteratively compose the training and testing sets and measure errors. -In each of the $k$ iterations, $k$-th fold is chosen as the testing data, and -all the remaining $k-1$ folds form the training data. The division into the folds is -done randomly, and so that the folds have approximately the -same size. Refer to~\citep{crossval} to learn more. In this work, we have used 10-fold +%In each of the $k$ iterations, $k$-th fold is chosen as the testing data, and +%all the remaining $k-1$ folds form the training data. The division into the folds is +%done randomly, and so that the folds have approximately the +%same size. +Refer to~\citep{crossval} to learn more. In this work, we have used 10-fold cross validation. \section{Experiments and Results} \label{sec:expe} +\subsection{Strength} +One of two major domains we have tested our framework on is the prediction of player +strengths. +\subsubsection*{Dataset} +We have collected a large sample of games from the publicly available +archives of the Kiseido Go server~\citep{KGSArchives}. +The sample consists of over 100 000 records of games in the \emph{.sgf} format~\citep{SGF}. + +For each rank $r$ in the range of 6-dan to 20-kyu, we gathered a +list of players $P_r$ of the particular rank. To avoid biases caused by +different strategies, the sample only consists of games played on $19 \times 19$ goban without +handicap stones. +The set of colored games $GC_p$ for a~player $p \in P_r$ consists of the games player $p$ +played when he had the rank $r$. We only use the $GC_p$ if the number of +games is not smaller than 10 games; if the sample is larger than 50 games, we +randomly choose a subset of the sample (the size of subset is uniformly randomly +chosen from interval $\langle 10, 50\rangle$).\footnote{ + By cutting the number of games to a fixed number (say 50) for large + samples, we would create an artificial disproportion in sizes of $GC_p$, + which could introduce bias into the process. +} + +For each of the 26 ranks, we gathered 120 such $GC_p$. +The target variable $y$ we learn directly corresponds to the ranks: +$y=20$ for rank of 20-kyu, $y=1$ for 1-kyu, $y=0$ for 1-dan, $y=-5$ +for 6-dan, other values similarly. (With increasing strength, the $y$ +decreases.) + +\subsubsection*{Results} + +\subsection{Style} + +The second domain is the prediction of different aspects of player styles. + +\subsubsection*{Dataset} +The collection of games in this dataset comes from the Games of Go on Disk database by \citet{GoGoD}. +This database contains more than 70 000 games, spanning from the ancient times +to the present. + +We chose a small subset of well known players (mainly from the 20th century) and +asked some experts (professional and strong amateur players) +to evaluate these players using a questionnaire. The experts (Alexander +Dinerchtein 3-pro, Motoki Noguchi 7-dan, +Vladim\'{i}r Dan\v{e}k 5-dan and V\'{i}t Brunner 4-dan) +were asked to value the players on four scales, each ranging from 1 to 10. + +%\begin{table}[h!] +\begin{center} +%\caption{Styles} +\begin{tabular}{|c|c|c|} +\hline +\textbf{Style} & \textbf{1} & \textbf{10}\\ \hline +Territoriality & Moyo & Territory \\ +Orthodoxity & Classic & Novel \\ +Aggressivity& Calm & Fighting \\ +Thickness & Safe & Shinogi \\ \hline +\end{tabular} +\end{center} +%\caption[Definition of the style scales]{ +%The definition of the style scales. +%} +%\label{tab:style_def} +%\end{table} + +The scales try to reflect +some of the traditionally perceived playing styles.\footnote{ + Refer to~\citet{GoGoD:styles}, or~\citet{senseis:styles} to grasp the concept deeper. +} +For example, the first scale (\emph{territoriality}) +stresses whether a player prefers safe, yet inherently smaller territory (number 10 on the scale), +or roughly sketched large territory (\emph{moyo}, 1 on the scale), which is however insecure. +For each of the selected professionals, we took 192 of his games from the GoGoD database +at random. We divided these games (at random) into 12 colored sets $GC$ of 16 games. + +The target variable (for each of the four styles) $y$ is given by average of the answers of +the experts. Results of the questionnaire are published in~\citep{style_quest}. + +\subsubsection*{Results} + \section{Discussion} \label{sec:disc} +The results in both the domains showed, that our evaluations are useful in predicting +different kinds of player attributes. This might have a number of possible applications. + +So far, we have utilized some of our findings in an online web +application\footnote{\url{http://gostyle.j2m.cz/webapp.html}}. Based on data submitted +by an user, it computes his evaluation and predicts his playing style +and recommends relevant professional players to review. + +Other possible applications include helping the ranking algorithms to converge faster --- +usually, the ranking of a player is determined from his opponents' ranking by looking +at numbers of wins and losses (e.g. by computing an ELO rating). Our methods might improve this +by including the domain knowledge. +Similarly, a computer Go program can quickly classify the level of its +human opponent based on the evaluation from their previous games +and auto-adjust its difficulty settings accordingly +to provide more even games for beginners. + +Also, it is possible to study dependencies between single elements of the evaluation vector +and the target variable $y$ directly. By pinpointing e.g. the patterns +that correlate most strongly with small strength (players who play them are weak), we can +warn the user not to play these. We have made some initial research into this in~\citep{Moudrik13}, +we do not present these results here, because of space constraints. + \section{Conclusion} \label{sec:conc} +This article presents a method for evaluating a player based on a sample of his games. +These summary evaluations turn out to be useful in many cases --- they allow us to predict +different player attributes (such as strength, or playing style) with reasonable accuracy. +We hope, that the applications of these findings can help to improve both human and computer +understanding in the game of Go. \section{Implementation} \label{sec:impl} +The code used in this work +is released online as a part of GoStyle project~\citep{GoStyleWeb}. +The majority of the source code is implemented in +the Python programming language~\citep{Python27}. + +The machine learnin part was realized using the +Orange Datamining suite~\citep{curk05}, with the exception of +the Fast Artificial Neural Network library FANN~\citep{Nissen2003}. +We used the Pachi Go engine~\citep{Pachi} for the raw game processing. + \bibliographystyle{abbrvnat} \bibliography{clanek} -- 2.11.4.GIT