From 0b0cfbacdb606b7d4d103e5aef80b2fc0e3d6adf Mon Sep 17 00:00:00 2001
From: hellboy
Date: Sat, 6 Mar 2010 18:31:31 +0100
Subject: [PATCH] tex

---
 tex/gostyle.tex | 67 +++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 46 insertions(+), 21 deletions(-)

diff --git a/tex/gostyle.tex b/tex/gostyle.tex
index a04dc36..6ee7c83 100644
--- a/tex/gostyle.tex
+++ b/tex/gostyle.tex
@@ -442,37 +442,65 @@ TODO PCA: We use it either on it own, or as a~pre-processing making data suitabl
 TODO rozdelit na algo/results??
 
 \subsection{Principal Component Analysis}
-Principal Component Analysis \emph{PCA} \cite{Jolliffe1986} is a~method used to reduce the dimensions of input
-vectors while preserving as much variabiality as possible.
+Principal Component Analysis (\emph{PCA}) \cite{Jolliffe1986} is a~method we use to reduce the dimensions of
+\emph{pattern vectors} while preserving as much information as possible.
 
-Shortly, PCA is an eigenvalue decomposition of a~covariance matrix of centered vectors.
-It can be thought of as a~mapping $o$ from $n$-dimensional vector space to
-a~reduced $m$-dimensional vector space. The base of this reduced vector space comprises eigenvectors
-of input vectors' covariance matrix. See (TODO) for details.
+In short, PCA is an eigenvalue decomposition of the covariance matrix of the centered \emph{pattern vectors}.
+It can be thought of as a~mapping $o$ from an $n$-dimensional vector space to a~reduced $m$-dimensional vector space.
+The basis of this reduced vector space comprises $m$ eigenvectors of the original vectors' covariance matrix,
+namely the eigenvectors with the largest eigenvalues.
+Ordered by decreasing eigenvalues, these eigenvectors form the rows of the transformation matrix $W$.
 
-\begin{enumerate}
-\item Center the input vectors (per element TODO clarify)
-\item Compute the covariance matrix of input data
-\item Compute eigenvalues and eigenvectors of the covariance matrix
-\item $\qquad$ $m$ first eigenvectors (with largest eigenvalues) form a~base of the reduced vector space
-\item $\qquad$ we may interpret these vectors as a~rows of a~projection matrix of projection $o$
-\end{enumerate}
-TODO prepsat pomoci math
+Finally, we represent each reduced \emph{pattern vector} as a vector of coefficients with respect to this eigenvector basis.
+For each original \emph{pattern vector} $\vec p_i$, we obtain its new representation $\vec r_i$ as shown
+in the following equation:
+\begin{equation}
+\vec r_i = W \vec p_i
+\end{equation}
+The whole process is described in Algorithm~\ref{alg:pca}.
+
+\begin{algorithm}
+\caption{PCA -- Principal Component Analysis}
+\begin{algorithmic}[1]
+\label{alg:pca}
+\REQUIRE{$m > 0$, set of players $R$ with \emph{pattern vectors} $\vec p_r$}
+\STATE $\vec \mu \leftarrow \frac{1}{|R|} \sum_{r \in R}{\vec p_r}$
+\FOR{$r \in R$}
+\STATE $\vec p_r \leftarrow \vec p_r - \vec \mu$
+\ENDFOR
+\FOR{$(i,j) \in \{1,\dots,n\} \times \{1,\dots,n\}$}
+\STATE $Cov[i,j] \leftarrow \frac{1}{|R|} \sum_{r \in R}{\vec p_{ri} \vec p_{rj}}$
+\ENDFOR
+\STATE Compute the eigenvalue decomposition of the matrix $Cov$
+\STATE Take the $m$ largest eigenvalues
+\STATE The corresponding eigenvectors, ordered by decreasing eigenvalues, form the rows of the matrix $W$
+\FOR{$r \in R$}
+\STATE $\vec r_r \leftarrow W \vec p_r$
+\ENDFOR
+\end{algorithmic}
+\end{algorithm}
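+
+% Editorial sketch (illustrative addition): the paragraph below merely restates
+% the reduction in matrix notation; the symbols $X$, $U$ and $\Lambda$ are
+% introduced here only and do not appear elsewhere in the text.
+Equivalently, if the centered \emph{pattern vectors} form the columns of a matrix $X$,
+then $Cov = \frac{1}{|R|} X X^T$, its eigenvalue decomposition is $Cov = U \Lambda U^T$,
+and $W$ consists of the first $m$ rows of $U^T$ (the eigenvectors belonging to the $m$
+largest eigenvalues), so that $\vec r_i = W \vec p_i$ as above.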
 
 \subsection{?? Kohonen Maps ??}
 
 \subsection{k-nearest Neighbors Classifier}
 K-nearest neigbors is an essential classification technique.
 We use it to approximate player's \emph{style vector} $\vec S$, assuming that his \emph{pattern vector} $\vec P$ is known.
-To achieve this, we utilize \emph{reference style vectors} (see chapter \ref{style-vectors}).
+To achieve this, we utilize \emph{reference style vectors} (see section \ref{style-vectors}).
 
-The idea is to approximate $\vec S$ as a weighted average of \emph{style vectors}
-$\vec s_i$ for $k$ players with \emph{pattern vectors} $\vec p_i$ closest to $\vec P$.
+The idea is based on the assumption that similarities in players' \emph{pattern vectors}
+correlate with similarities in players' \emph{style vectors}. We approximate $\vec S$
+as a weighted average of the \emph{style vectors}
+$\vec s_i$ of the $k$ players whose \emph{pattern vectors} $\vec p_i$ are closest to $\vec P$.
+This is illustrated in Algorithm~\ref{alg:knn}.
+Note that the weight is a function of the distance; it is not explicitly defined in Algorithm~\ref{alg:knn}.
+During our research, an exponentially decreasing weight has proven to be sufficient.
 
 \begin{algorithm}
 \caption{k-Nearest Neighbors}
 \begin{algorithmic}
+\label{alg:knn}
+\REQUIRE{pattern vector $\vec P$, $k > 0$, set of reference players $R$}
 \FORALL{r $\in$ R }
 \STATE $D[r] \leftarrow EuclideanDistance(\vec p_r, \vec P)$
 \ENDFOR
@@ -484,12 +512,9 @@ $\vec s_i$ for $k$ players with \emph{pattern vectors} $\vec p_i$ closest to $\v
 \end{algorithmic}
 \end{algorithm}
 
-
-
-
-
 \subsection{Neural Network Classifier}
 
+
 \subsection{Implementation}
-- 
2.11.4.GIT
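
Editorial note, not part of the patch: below is a minimal Python sketch of the
pipeline the patched section describes -- PCA reduction of the pattern vectors
followed by a distance-weighted k-nearest-neighbors estimate of the style vector.
All identifiers and the decay constant alpha are illustrative assumptions; the text
only states that an exponentially decreasing weight proved sufficient.

import numpy as np

def pca_reduce(pattern_vectors, m):
    """Map n-dimensional pattern vectors to their m-dimensional PCA representation."""
    X = np.asarray(pattern_vectors, dtype=float)   # shape (|R|, n), one row per player
    X = X - X.mean(axis=0)                         # center: subtract the mean vector mu
    cov = (X.T @ X) / X.shape[0]                   # covariance matrix Cov
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigendecomposition of the symmetric Cov
    top = np.argsort(eigvals)[::-1][:m]            # indices of the m largest eigenvalues
    W = eigvecs[:, top].T                          # rows of W = chosen eigenvectors
    return X @ W.T, W                              # reduced vectors r_i = W p_i

def knn_style_vector(P, reference, k, alpha=1.0):
    """Weighted average of the style vectors of the k reference players whose
    pattern vectors are closest to P; weights decrease exponentially with distance."""
    patterns = np.array([p for p, _ in reference])  # reference = [(pattern_vec, style_vec), ...]
    styles = np.array([s for _, s in reference])
    dists = np.linalg.norm(patterns - P, axis=1)    # Euclidean distances D[r]
    nearest = np.argsort(dists)[:k]                 # k nearest reference players
    weights = np.exp(-alpha * dists[nearest])       # assumed exponential weight function
    return weights @ styles[nearest] / weights.sum()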