From 535479a1ac60c575064adf18e5ac3ad8479373fd Mon Sep 17 00:00:00 2001
From: Petr Baudis
Date: Sun, 13 Jun 2010 19:10:32 +0200
Subject: [PATCH] tex: Add also more abstract PCA description

---
 tex/gostyle.tex | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/tex/gostyle.tex b/tex/gostyle.tex
index 6e33b81..d2bcf9b 100644
--- a/tex/gostyle.tex
+++ b/tex/gostyle.tex
@@ -524,15 +524,24 @@
 The first two methods {\em (analytic)} rely purely on a single data set
 and serve to show internal structure and correlations within the data set.
 Principal Component Analysis \cite{Jolliffe1986}
-finds orthogonal vector components that have the largest variance.
-Reversing the process%
+finds orthogonal vector components that \rv{represent} the largest variance
+\rv{of values within the data set.
+That is, PCA produces vectors that capture
+the overall variability within the data set --- the first vector representing
+the ``primary axis'' of the data set, the subsequent vectors the less
+significant axes; each vector has an associated number that
+measures its share of the overall data set variance: $1.0$ would mean
+that all points within the data set lie on this vector, while a value
+close to zero would mean that removing this dimension would have little
+effect on the overall shape of the data set.}
+Reversing the PCA process%
 \footnote{Looking at dependencies of a single orthogonal vector component
 in the original vector space.}
 can indicate which patterns correlate with each component.
 Additionally, PCA can be used as vector preprocessing for methods
 that are negatively sensitive to pattern vector component correlations.
 
-The~second method of Sociomaps \cite{Sociomaps} \cite{TeamProf} creates
+\rv{On the other hand,} Sociomaps \cite{Sociomaps,TeamProf} produce a
 spatial representation of the data set elements (e.g. players) based on
 similarity of their data set features; we can then project other
 information on the map to illustrate its connection to the data set.%
@@ -606,7 +615,7 @@
 to reduce the dimensions of the pattern vectors while preserving as
 much information as possible, assuming inter-dependencies between pattern
 vector dimensions are linear.
-Briefly, PCA is an eigenvalue decomposition of a~covariance matrix of centered pattern vectors,
+\rv{Technically}, PCA is an eigenvalue decomposition of a~covariance matrix of centered pattern vectors,
 producing a~linear mapping $o$ from $n$-dimensional vector space
 to a~reduced $m$-dimensional vector space.
 The $m$ eigenvectors of the original vectors' covariance matrix
-- 
2.11.4.GIT
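
As a worked sketch of the decomposition the second hunk summarizes: $N$,
$n$, $m$ and the mapping $o$ follow the patched text, while $\mathbf{x}_i$,
$C$, $U$ and $\Lambda$ are illustrative symbols that do not appear in
gostyle.tex.

\[
C = \frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_i \mathbf{x}_i^{\top}
\qquad \text{(covariance matrix of the $N$ centered pattern vectors)}
\]
\[
C = U \Lambda U^{\top}, \qquad
\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n), \quad
\lambda_1 \ge \dots \ge \lambda_n \ge 0
\]
\[
o(\mathbf{x}) = U_m^{\top} \mathbf{x}
\qquad \text{($U_m$ collects the $m$ leading eigenvectors)}
\]

The ``associated number'' of the $j$-th component described in the first
hunk is then its share of the total variance,
$\lambda_j / \sum_{k=1}^{n} \lambda_k$: it equals $1.0$ exactly when all
centered points lie on that single axis, and a value close to zero means
dropping the axis barely changes the overall shape of the data set.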