From ee8bb25fc4673a4726c9a77adaab84f0041382ac Mon Sep 17 00:00:00 2001
From: hellboy <j.moudrik@gmail.com>
Date: Mon, 8 Mar 2010 23:10:39 +0100
Subject: [PATCH] gostyle.tex: text bugfixes

---
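A minimal sketch of the pattern-vector construction reworded in the Data
Extraction hunk below, assuming each player's games have already been reduced
to a flat list of pattern identifiers. The names `top_patterns`,
`pattern_vector` and `moves_by_player` are hypothetical and do not come from
the paper's implementation; the sketch only illustrates scaling by the
player's most played top pattern.

```python
from collections import Counter


def top_patterns(moves_by_player, n):
    """Return the n most frequent patterns across all players and games."""
    totals = Counter()
    for patterns in moves_by_player.values():
        totals.update(patterns)
    return [pattern for pattern, _ in totals.most_common(n)]


def pattern_vector(player_patterns, top):
    """Scale a player's top-pattern counts by that player's most played top pattern."""
    counts = Counter(player_patterns)
    peak = max((counts[p] for p in top), default=0)
    if peak == 0:
        # Assumption: a player who never played any top pattern gets a zero vector.
        return [0.0] * len(top)
    return [counts[p] / peak for p in top]
```

Because every component is divided by the player's own maximum, players with
very different numbers of recorded games still get vectors in $[0,1]$.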
 tex/gostyle.tex | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/tex/gostyle.tex b/tex/gostyle.tex
index c4a14e5..2174f0b 100644
--- a/tex/gostyle.tex
+++ b/tex/gostyle.tex
@@ -364,8 +364,8 @@ we have gathered some expert-based information about various
 traditionally perceived style aspects.
 Three high-level Go players (Alexander Dinerstein 3-pro, Motoki Noguchi
 7-dan and Vit Brunner 4-dan) have judged style of several Go
-professionals (chosen for both being well-known within the community
-and having large number of played games in our collection).
+professionals -- we call them \emph{reference players} -- chosen for both
+being well-known within the community and having a~large number of played games in our collection.
 
 This expert-based knowledge allows us to predict styles of unknown players based on
 the similarity of their pattern vectors, as well as discover correlations between
@@ -391,16 +391,21 @@ Thickness & Safe & Shinogi \\ \hline
 \end{tabular}
 \end{center}
 %\end{table}
+\vspace{4mm}
 
-Averaging and rescaling the expert based evaluation yields a set of
-\emph{reference style vectors} $\vec s_r$. 
-%-- each with a \emph{pattern vector} $\vec p_i$ and \emph{style vector} $\vec s_i$.
+Averaging this expert-based evaluation yields a~\emph{reference style vector}
+$\vec s_r$ (of dimension $4$) for each player $r$
+from the set of \emph{reference players} $R$.
 
+%-- each with a \emph{pattern vector} $\vec p_i$ and \emph{style vector} $\vec s_i$.
 
 \section{Data Extraction}
 \label{pattern-vectors}
+In addition to the explicit expert knowledge, we use the data obtained by...
 
-As the input of our method, we assume a~collection of game records\footnote{We
+TODO expand the introduction, it cannot begin with just As the input...
+
+As the input, we assume a~collection of game records\footnote{We
 use the SGF format (TODO) in our implementation.} organized by player names.
 We use two collections; the first one is GoGoD Winter 2009 (TODO) containing 42000 (TODO)
 professional games, dating from the early Go history 1500 years ago to the present.
@@ -410,14 +415,18 @@ The other source is Go Teaching Ladder reviews (TODO). These include 7600 games
 of players spanning over all strength levels; we use this collection
 for finding correlations between moves of players of the same strength rank.
 
-In order to generate the required compact description of most played moves,
-for each player, we extract a~generic description from each move
-played by the player, then take the most occuring $n$ patterns across all players%
-\footnote{We use $n=500$ in our analysis.} and assign each player a~{\em pattern vector}
-$\vec p$ where each dimension corresponds to the number of occurences of
-one given pattern normalized to range $[0,1]$.
+In order to generate the required compact description of the most frequently played moves,
+we construct a~set of the $n$ most frequently occurring patterns (\emph{top patterns})
+across all players and games from the database.\footnote{We use $n=500$ in our analysis.}
+For each player, we then count how many times each of those $n$ patterns was played
+in all of the player's games and finally assign the player a~{\em pattern vector} $\vec p$ of dimension $n$, with each
+dimension corresponding to the relative number of occurrences of a~given pattern
+(with respect to the player's most played \emph{top pattern}). Using relative numbers of occurrences ensures that
+each dimension of the player's \emph{pattern vector} is scaled to the range $[0,1]$ and
+therefore even players with different numbers of games in the database have comparable \emph{pattern vectors}.
 
 \subsection{Pattern Features}
+TODO harmonize so that this follows on from the previous paragraph
 
 Of course a big question is how to compose the pattern descriptions.
 There are some tradeoffs in play - overly general descriptions carry too few
@@ -427,7 +436,7 @@ not statistically significant.
 
 We have chosen an intuitive and simple approach inspired by pattern features
 used when computing ELO ratings for candidate patterns in Computer Go play.
-\cite{ELO} Each pattern is combination of several {\em pattern features}
+\cite{ELO} Each pattern is a~combination of several {\em pattern features}
 matched at the position of the played move. We use these features:
 
 \begin{itemize}
@@ -464,7 +473,7 @@ analysis finds orthogonal vector components that have biggest variance. Reversin
 indicates which patterns correlate with each style. Additionally, PCA can be used as a vector-preprocessing
 for methods that are negatively sensitive to \emph{pattern vector} component correlations.
 
-A second method -- Kohonen's networks -- is based on the theory of self-organizing maps of neurons that
+A~second method -- Kohonen's networks -- is based on the theory of self-organizing maps of neurons that
 compete against each other for representation of the input space. Because neurons in the network are
 organized in a two-dimensional plane, the trained network virtually spreads vectors to the 2D plane,
 allowing for simple visualization.
@@ -650,7 +659,7 @@ is then constructed as:
 T = T_\mathit{base} \cup \mathit{SomeFiniteSubset}(T_\mathit{ext})
 \end{equation}
 
-The network is trained as shown in the following pseudocode in Algorithm \ref{alg:tnn}.
+The network is trained as shown in Algorithm \ref{alg:tnn}.
 
 \begin{algorithm}
 \caption{Training Neural Network}
@@ -693,6 +702,7 @@ TODO libfann
 \section{Style Components Analysis}
 
 
+
 \section{Strength Estimation Analysis}
 
 
-- 
2.11.4.GIT