From 1667dc3a649155939d195b109ebd4bcc860aede3 Mon Sep 17 00:00:00 2001
From: AJ Rossini
Date: Wed, 22 Jul 2009 19:20:24 +0200
Subject: [PATCH] initial doc for why/what dataframes are in CLS.

Signed-off-by: AJ Rossini
---
 Doc/dataframes.txt | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 Doc/dataframes.txt

diff --git a/Doc/dataframes.txt b/Doc/dataframes.txt
new file mode 100644
index 0000000..52855ab
--- /dev/null
+++ b/Doc/dataframes.txt
@@ -0,0 +1,28 @@
+ -*- mode: org -*-
+
+* Introduction
+
+ Dataframes are a central object within the S language. They are
+ extensions of matrices, which are a specialization of arrays, and so
+ could themselves be considered a specialization of arrays. Dataframes,
+ however, provide a link between statistical data and the
+ corresponding numerical data which, for the resulting statistical
+ procedures, is manipulated through numerical linear algebra.
+
+ We have a virtual dataframe class, which inherits from the
+ MATRIX-LIKE class in LISP-MATRIX. LISP-MATRIX constructs a
+ framework for numerical linear algebra using a range of "storage
+ back-ends", currently lisp-arrays (using Tamas Papp's
+ Foreign-Friendly-Array (FFA) package) and foreign arrays (using Rif
+ Rifkin's foreign-numerical-value (FNV) package).
+
+ Future plans currently include implementing a listoflist backend for
+ both matrices and dataframes, as well as a GSLL backend for matrices
+ (not clear if usable for dataframes).
+
+* Dataframe implementation
+
+ The DATAFRAME-LIKE class directly generalizes the MATRIX-LIKE class
+ by adding statistically-relevant typing to columns, along with case
+ ids (row names) and column ids (variable names). We need to
+ construct a common nomenclature for this.
-- 
2.11.4.GIT