From ee8bb25fc4673a4726c9a77adaab84f0041382ac Mon Sep 17 00:00:00 2001
From: hellboy
Date: Mon, 8 Mar 2010 23:10:39 +0100
Subject: [PATCH] gostyle.tex: text bugfixes

---
 tex/gostyle.tex | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/tex/gostyle.tex b/tex/gostyle.tex
index c4a14e5..2174f0b 100644
--- a/tex/gostyle.tex
+++ b/tex/gostyle.tex
@@ -364,8 +364,8 @@ we have gathered some expert-based information about various traditionally
 perceived style aspects. Three high-level Go players
 (Alexander Dinerstein 3-pro, Motoki Noguchi 7-dan and Vit Brunner 4-dan)
 have judged style of several Go
-professionals (chosen for both being well-known within the community
-and having large number of played games in our collection).
+professionals -- we call them \emph{reference playerse} -- chosen for both
+being well-known within the community and having large number of played games in our collection.
 This expert-based knowledge allows us to predict styles of unknown players
 based on the similarity of their pattern vectors, as well as
 discover correlations between
@@ -391,16 +391,21 @@ Thickness & Safe & Shinogi \\ \hline
 \end{tabular}
 \end{center}
 %\end{table}
+\vspace{4mm}
 
-Averaging and rescaling the expert based evaluation yields a set of
-\emph{reference style vectors} $\vec s_r$.
-%-- each with a \emph{pattern vector} $\vec p_i$ and \emph{style vector} $\vec s_i$.
+Averaging this expert based evaluation yields
+\emph{reference style vector} $\vec s_r$ (of dimension $4$) for each player $r$
+from the set of \emph{reference players} $R$.
+%-- each with a \emph{pattern vector} $\vec p_i$ and \emph{style vector} $\vec s_i$.
 
 
 \section{Data Extraction}
 \label{pattern-vectors}
+In addition to the explicit expert knowledge, we use the data obtained by...
 
-As the input of our method, we assume a~collection of game records\footnote{We
+TODO rozvest uvod, nemuze se zacinat jenom As the input...
+
+As the input, we assume a~collection of game records\footnote{We
 use the SGF format (TODO) in our implementation.} organized by player names.
 We use two collections; the first one is GoGoD Winter 2009 (TODO) containing 42000 (TODO)
 professional games, dating from the early Go history 1500 years ago to the present.
@@ -410,14 +415,18 @@ The other source is Go Teaching Ladder reviews (TODO). These include 7600 games
 of players spanning over all strength levels; we use this collection
 for finding correlations between moves of players of the same strength rank.
 
-In order to generate the required compact description of most played moves,
-for each player, we extract a~generic description from each move
-played by the player, then take the most occuring $n$ patterns across all players%
-\footnote{We use $n=500$ in our analysis.} and assign each player a~{\em pattern vector}
-$\vec p$ where each dimension corresponds to the number of occurences of
-one given pattern normalized to range $[0,1]$.
+In order to generate the required compact description of most frequently played moves,
+we construct a set of $n$ most occuring patterns (\emph{top patterns})
+across all players and games from the database\footnote{We use $n=500$ in our analysis.}.
+For each player, we then count how many times was each of those $n$ patterns played
+during all his games and finally assign him a~{\em pattern vector} $\vec p$ of dimension $n$, with each
+dimension corresponding to the relative number of occurences of a given pattern
+(with respect to player's most played \emph{top pattern}). Using relative numbers of occurences ensures that
+each dimension of player's \emph{pattern vector} is scaled to range $[0,1]$ and
+therefore even players with different number of games in the database have comparable \emph{pattern vectors}.
 
 \subsection{Pattern Features}
+TODO sladit aby to navazovalo na predchozi odstavec
 
 Of course a big question is how to compose the pattern descriptions.
 There are some tradeoffs in play - overly general descriptions carry too few
@@ -427,7 +436,7 @@ not statistically significant.
 
 We have chosen an intuitive and simple approach inspired by pattern features
 used when computing ELO ratings for candidate patterns in Computer Go play.
-\cite{ELO} Each pattern is combination of several {\em pattern features}
+\cite{ELO} Each pattern is a~combination of several {\em pattern features}
 matched at the position of the played move. We use these features:
 
 \begin{itemize}
@@ -464,7 +473,7 @@ analysis finds orthogonal vector components that have biggest variance. Reversin
 indicates which patterns correlate with each style. Additionally, PCA can be used as a vector-preprocessing
 for methods that are negatively sensitive to \emph{pattern vector} component correlations.
 
-A second method -- Kohonen's networks -- is based on the theory of self-organizing maps of neurons that
+A~second method -- Kohonen's networks -- is based on the theory of self-organizing maps of neurons that
 compete against each other for representation of the input space. Because neurons in the network are
 organized in a two-dimensional plane, the trained network virtually spreads vectors to the 2D plane,
 allowing for simple visualization.
@@ -650,7 +659,7 @@ is then constructed as:
 T = T_\mathit{base} \cup \mathit{SomeFiniteSubset}(T_\mathit{ext})
 \end{equation}
 
-The network is trained as shown in the following pseudocode in Algorithm \ref{alg:tnn}.
+The network is trained as shown in Algorithm \ref{alg:tnn}.
 
 \begin{algorithm}
 \caption{Training Neural Network}
@@ -693,6 +702,7 @@ TODO libfann
 
 
 \section{Style Components Analysis}
+
 
 
 \section{Strength Estimation Analysis}
-- 
2.11.4.GIT