From bce7226ff25ddd110137ea4fd0927c4d881da444 Mon Sep 17 00:00:00 2001
From: hellboy <j.moudrik@gmail.com>
Date: Mon, 15 Mar 2010 02:05:09 +0100
Subject: [PATCH] gostyle.tex: strength estimation for different number of
 games

---
 tex/gostyle.tex | 57 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 46 insertions(+), 11 deletions(-)

diff --git a/tex/gostyle.tex b/tex/gostyle.tex
index 8ea42ec..dd23c80 100644
--- a/tex/gostyle.tex
+++ b/tex/gostyle.tex
@@ -793,7 +793,7 @@ The sociomap has been visualised using the Team Profile Analyzer \cite{TPA}
 which is part of the Sociomap suite \cite{SociomapSite}.
 
 
-\section{Strength Estimator}
+\section{Strength Estimation}
 
 \begin{figure*}[!t]
 \centering
@@ -818,12 +818,12 @@ Multiple independent real-world ranking scales exist
 the difference between scales can be up to several ranks and the rank
 distributions also differ. \cite{RankComparison}
 
+\subsection{Data used}
 As the source game collection, we use Go Teaching Ladder reviews archive%
 \footnote{The reviews contain comments and variations --- we consider only the main
 variation with the actual played game.}
 \cite{GTL} --- this collection contains 7700 games of players with strength ranging
-from 30-kyu to 4-dan; we consider only even games with clear rank information,
-and then randomly separate 770 games as a testing set.
+from 30-kyu to 4-dan; we consider only even games with clear rank information.
 Since the rank information is provided by the users and may not be consistent,
 we are forced to take a simplified look at the ranks,
 discarding the differences between various systems and thus somewhat
@@ -831,6 +831,7 @@ increasing error in our model.\footnote{Since our results seem satisfying,
 we did not pursue to try another collection;
 one could e.g. look at game archives of some Go server.}
 
+\subsection{PCA analysis}
 First, we have created a single pattern vector for each rank, from 30-kyu to 4-dan;
 we have performed PCA analysis on the pattern vectors, achieving near-perfect
 rank correspondence in the first PCA dimension%
@@ -847,16 +848,49 @@ reasonably satisfying accuracy by itself.%
 \footnote{Extended vector normalization (sec. \ref{xnorm})
 produced noticeably less clear-cut results.}
 
-To further enhance the strength estimator accuracy,
-we have tried to train a NN classifier on our train set, consisting
-of one $(\vec p, {\rm rank})$ pair per player --- we use the pattern vector
-for activation of input neurons and rank number as result of the output
-neuron. We then proceeded to test the NN on per-player pattern vectors built
-from the games in the test set, yielding MSE of TODO with TODO games per player
-on average.
+\subsection{Strength classifier}
+To further enhance the usability of the strength estimator,
+we have tried a $k$-NN classifier on the testing set, consisting
+of one $(\vec p, {\rm rank})$ pair per player. The testing set was
+randomly selected as $10\%$ of the game database.
+We want to find the smallest number of games from which a~player's strength can be reasonably estimated.
+The player files within each rank in the database were
+merged to increase the file sizes and to make the pattern distribution more reliable.
+
+TODO mention this already at the beginning?
+Moreover, we discarded \emph{ldist} and \emph{lldist} from the pattern files, since they increase
+the granularity of patterns (note that we are dealing with small input files, so the benefit of
+having more features, which increase variability, is negligible in this case).
+
+The results for different file sizes are shown in Table~\ref{table-str-class}. Smaller files
+require fewer games per player; however, the error grows as the files get smaller.
+The error is listed both as MSE and as standard deviation $\sigma$ in percent (meaning
+the difference from the real rank on average).
+
+TODO is the file size OK?? (depends on the representation..)
+TODO if these are only games, rework the text above as well
+\begin{table}[!t]
+% increase table row spacing, adjust to taste
+\renewcommand{\arraystretch}{1.3}
+\caption{TODO}
+\label{table-str-class}
+\centering
+\begin{tabular}{|c|c|c|}
+\hline
+File size ($\sim$ number of games) & MSE & $\sigma \%$ \\ \hline
+$500\,\mathrm{kB}$ ($\sim 85$) & $0.007$ & $4\%$ \\
+$250\,\mathrm{kB}$ ($\sim 43$) & $0.029$ & $8\%$ \\
+$100\,\mathrm{kB}$ ($\sim 17$) & $0.081$ & $14\%$ \\
+$50\,\mathrm{kB}$ ($\sim 9$) & $0.131$ & $18\%$ \\
+$10\,\mathrm{kB}$ ($\sim 2$) & $0.187$ & $22\%$ \\ \hline
+\end{tabular}
+\end{table}
 
+Finally, we used an $8$-fold cross validation (see section \ref{crossval}) on one-file-per-rank files,
+yielding MSE $0.085$ (rank encoded linearly in $[-1,1]$), which is equivalent to
+a~standard deviation of $15\%$.
 
-\section{Style Estimator}
+\section{Style Estimation}
 \label{styleest}
 
 As a~second case study for our pattern analysis,
@@ -1316,6 +1350,7 @@ data. The network's output was afterwards rescaled back to allow for MSE compari
 All input (pattern) vectors were preprocessed using PCA, reducing the input dimension from $400$ to $23$.
 
 \subsubsection{Cross-validation}
+\label{crossval}
 To compare and evaluate all methods, we have performed $5$-fold cross validation
 and compared each method's performance with a~random classifier.
 In the $5$-fold cross-validation, we randomly divide the training set
-- 
2.11.4.GIT