\chapter{Module data}
\label{module:data}

\section{Reading a table from a file}

The module datafile contains the class \verb|datafile|, which reads a
table from a file. To use it, construct an instance and pass the
filename as the parameter, e.g. \verb|datafile("testdata")|. The file
is parsed, and its lines are split into the columns of the table, by
matching regular expressions. These expressions are named arguments of
the constructor and can therefore be replaced. Further keyword
arguments allow some of the data points to be skipped. All of these
arguments are listed in the following table:

\medskip
\begin{tabularx}{\linewidth}{l>{\raggedright\arraybackslash}X}
argument name&description\\
\hline
\texttt{commentpattern}&matches the start of a comment line; default: \texttt{re.compile(r"(\#+|!+|\%+)\textbackslash s*")}\\
\texttt{stringpattern}&matches a string column; default: \texttt{re.compile(r"\textbackslash"(.*?)\textbackslash"(\textbackslash s+|\$)")}\\
\texttt{columnpattern}&matches any other column; default: \texttt{re.compile(r"(.*?)(\textbackslash s+|\$)")}\\
\texttt{skiphead}&number of data lines to skip at the beginning; default: \texttt{0}\\
\texttt{skiptail}&number of data lines to skip at the end; default: \texttt{0}\\
\texttt{every}&take only every \texttt{every}-th data line into account; default: \texttt{1}
\end{tabularx}
\medskip
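
To get a feeling for the three default patterns, they can be tried out
with the \verb|re| module alone. The following lines are only a sketch:
the sample lines and the variable names \verb|rest|,
\verb|string_entry| and \verb|plain_entry| are made up for
illustration, and nothing here calls into \PyX{}.

```python
import re

# Default patterns as listed in the table above.
commentpattern = re.compile(r"(#+|!+|%+)\s*")
stringpattern = re.compile(r"\"(.*?)\"(\s+|$)")
columnpattern = re.compile(r"(.*?)(\s+|$)")

# A comment line: the match covers the comment characters and the
# whitespace after them, so the rest of the line starts at m.end().
line = "# x y"
m = commentpattern.match(line)
rest = line[m.end():]

# A quoted column may contain whitespace; group 1 excludes the quotes.
string_entry = stringpattern.match('"two words" 3.5').group(1)

# Any other column ends at the next whitespace or at the end of the line.
plain_entry = columnpattern.match("3.5 7").group(1)
```

In each case the first group (or, for comments, the text after the
match) is the part that is processed further.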

The input file is read line by line, and leading and trailing
whitespace is stripped from each line first. Then the line is checked
against the comment pattern. If it matches and no data has been read
so far, the rest of the line is analysed like a table line and the
result is interpreted as the column titles; a comment line that
follows data is just thrown away. As each comment line preceding the
data overwrites the titles found so far, the last non-empty comment
line before the data finally determines the column titles.
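
The title rule can be sketched in a few lines. The helper
\verb|find_titles| is hypothetical and simplifies one point: it splits
the comment text at whitespace instead of applying the column
patterns.

```python
import re

commentpattern = re.compile(r"(#+|!+|%+)\s*")

def find_titles(lines):
    """Return the titles from the last non-empty comment line that
    precedes the first data line (hypothetical helper)."""
    titles = []
    for line in lines:
        line = line.strip()
        match = commentpattern.match(line)
        if match:
            # Simplification: split the rest of the line at whitespace.
            entries = line[match.end():].split()
            if entries:
                titles = entries
        elif line:
            # First data line reached: later comments are ignored.
            break
    return titles

titles = find_titles(["# made by a script", "#", "# x y", "1 2", "# late"])
```

Here the empty comment line does not overwrite anything, and the
comment after the first data line is never looked at.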

It remains to explain how data lines are read. A list of entries, one
for each column, is created from a given line. A line resulting in an
empty list (e.g. an empty line) is just ignored. As shown in the table
above, there is a special string column pattern. When it matches, it
forces the column to be interpreted as a string. Otherwise
\verb|datafile| tries to convert the column into a float
automatically, except in the title line. When the conversion fails,
the entry is kept as a string.
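
A minimal sketch of this splitting, assuming the default patterns and
a hypothetical helper named \verb|split_line|, might look as follows.
It is not the code used by \verb|datafile|, only an illustration of
the order of the steps.

```python
import re

stringpattern = re.compile(r"\"(.*?)\"(\s+|$)")
columnpattern = re.compile(r"(.*?)(\s+|$)")

def split_line(line):
    """Split one data line into a list of floats and strings
    (hypothetical helper, not the PyX implementation)."""
    line = line.strip()
    entries = []
    while line:
        match = stringpattern.match(line)
        if match:
            # Quoted column: always kept as a string.
            entries.append(match.group(1))
        else:
            match = columnpattern.match(line)
            try:
                entries.append(float(match.group(1)))
            except ValueError:
                # Conversion failed: keep the string.
                entries.append(match.group(1))
        # Continue behind the column and its trailing whitespace.
        line = line[match.end():]
    return entries

row = split_line('1 2.5e3 "12" n/a')
```

The quoted \verb|"12"| stays a string although it looks like a
number, while the unquoted \verb|n/a| stays a string only because the
conversion fails.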

The default string pattern allows columns to contain whitespace. It
matches whenever a column starts with a quote (\verb|"|) and then
searches for the end of the string, which is another quote
immediately followed by whitespace or the end of the line. Hence a
quote within a string is just ignored and no kind of escaping is
needed. The only disadvantage is that you cannot describe a string
which contains a quote directly followed by whitespace. However, you
can always replace the string pattern to fit your special needs.
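
Both cases can be checked directly with the default string pattern.
The two sample lines are invented for this illustration.

```python
import re

stringpattern = re.compile(r"\"(.*?)\"(\s+|$)")

# An inner quote that is not followed by whitespace is part of the string.
inner = stringpattern.match('"the "x"-axis label" 2.0').group(1)

# An inner quote followed by whitespace ends the string too early.
early = stringpattern.match('"he said "hi" loudly" 2.0').group(1)
```

In the second line the string ends after \verb|hi|, and the rest of
the line is split into further columns.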

Finally, the number of columns is fixed to the maximal number found in
any line of the file, and lines with fewer entries are filled up with
\verb|None|. The titles list is also cut to this maximal number of
columns.
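
As a sketch of this last step, with a hypothetical helper
\verb|pad_rows| working on plain Python lists:

```python
def pad_rows(rows, titles):
    """Pad all rows with None to the length of the longest row and cut
    the titles to the same length (hypothetical helper)."""
    maxcolumns = max((len(row) for row in rows), default=0)
    padded = [row + [None] * (maxcolumns - len(row)) for row in rows]
    return padded, titles[:maxcolumns]

rows, titles = pad_rows([[1.0, 2.0, 3.0], [4.0]], ["x", "y", "z", "extra"])
```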

\section{Accessing columns}

The method \verb|getcolumnno| takes a column description as its
parameter. If the description matches exactly one entry in the titles
list, the index of this entry is returned. Otherwise the parameter
should be an integer, and it is checked whether this integer is a
valid column index. As for other Python indices, a column number may
be negative, counting the columns from the end. When an error occurs,
the exception \verb|ColumnError| is raised. Please note that the
datafile inserts a first column with the index 0, which contains the
line number (starting at 1 and counting only data lines). Examples are
\verb|getcolumnno(1)| or \verb|getcolumnno("title")|.

The method \verb|getcolumn| takes the same argument as the method
\verb|getcolumnno| described above, but returns a list with the
values of that column.
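
The lookup rules can be sketched with free functions on plain lists.
Everything below is a stand-in: the class \verb|ColumnError| is
redefined locally, the functions take the titles and rows as explicit
arguments instead of being methods, and the titles list is assumed to
hold one entry per column with \verb|None| for the line number column.

```python
class ColumnError(Exception):
    """Local stand-in for the exception named in the text."""

def getcolumnno(titles, column):
    """Resolve a title or an integer index to a column number."""
    if titles.count(column) == 1:
        return titles.index(column)
    if isinstance(column, int) and -len(titles) <= column < len(titles):
        return column
    raise ColumnError(column)

def getcolumn(titles, rows, column):
    """Return the values of one column as a list."""
    number = getcolumnno(titles, column)
    return [row[number] for row in rows]

titles = [None, "x", "y"]            # column 0 holds the line number
rows = [[1, 0.5, 2.0], [2, 1.5, 4.0]]
xvalues = getcolumn(titles, rows, "x")
lastvalues = getcolumn(titles, rows, -1)
```

Whether a negative index is returned unchanged or normalised is not
stated in the text; the sketch returns it unchanged.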

\section{Mathematics on columns}

The method \verb|addcolumn| appends a new column. It takes a string as
the first parameter, which is interpreted as an expression. When the
expression contains an equal sign (\verb|=|), everything to the left
of the last equal sign becomes the title of the new column. If no
equal sign is found, the title is set to \verb|None|. The part to the
right of the last equal sign is interpreted as a mathematical
expression. A list of functions, predefined variables and operators
can be found in appendix~\ref{mathtree}. The list of available
functions and predefined variables can be extended by a dictionary
passed as the keyword argument \verb|context| to the \verb|addcolumn|
method.
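
Splitting at the last equal sign corresponds to what
\verb|str.rpartition| does. The helper \verb|split_expression| below
is hypothetical, and stripping the surrounding whitespace is an
assumption that the text does not confirm.

```python
def split_expression(expression):
    """Return (title, formula) for an addcolumn-style expression
    (hypothetical helper)."""
    title, sign, formula = expression.rpartition("=")
    if not sign:
        # No equal sign at all: the new column has no title.
        return None, formula.strip()
    return title.strip(), formula.strip()

with_title = split_expression("av = (x+y)/2")
two_signs = split_expression("a=b=x+1")
without_title = split_expression("x+1")
```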

The expression may contain variable names. These names are
interpreted in the following way:
\begin{itemize}
\item A name can be a column title, but only if the title is a valid
variable name (i.e. it starts with a letter or an underscore and
contains only letters, digits and underscores).
\item A name can start with the dollar symbol (\verb|$|); the integer
following it refers directly to a column number.
\end{itemize}
The data referenced by variables in the expression need to be
floats; otherwise the result for that data line will be \verb|None|.
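
The two kinds of names can be illustrated by a small evaluator for a
single data line. This is a rough sketch under several assumptions:
Python's \verb|eval| stands in for the expression parser of \PyX{},
only \verb|sqrt| and \verb|pi| are predefined, only non-negative
column numbers are handled after the dollar symbol, and the helper
\verb|evaluate| and the internal names \verb|_colN| are invented.

```python
import math
import re

def evaluate(expression, titles, row, context=None):
    """Evaluate expression for one data row; return None when a
    referenced entry is not a float (hypothetical helper)."""
    names = {}

    def column_ref(match):
        # Replace $N by an internal name that is a valid identifier.
        name = "_col%s" % match.group(1)
        names[name] = row[int(match.group(1))]
        return name

    expression = re.sub(r"\$(\d+)", column_ref, expression)
    # Identifiers that equal a column title refer to that column.
    for name in re.findall(r"[A-Za-z_]\w*", expression):
        if name in titles and name not in names:
            names[name] = row[titles.index(name)]
    if not all(isinstance(value, float) for value in names.values()):
        return None
    namespace = {"__builtins__": {}, "sqrt": math.sqrt, "pi": math.pi}
    namespace.update(context or {})
    namespace.update(names)
    return eval(expression, namespace)

titles = [None, "x", "y"]
length = evaluate("sqrt(x**2+$2**2)", titles, [1, 3.0, 4.0])
missing = evaluate("x+$2", titles, [2, 3.0, "n/a"])
```

The second call refers to a string entry, so the result for that data
line is \verb|None|.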

\section{Reading data from a sectioned config file}

The class \verb|sectionfile| provides a reader for files in the
ConfigFile format (see the description of the module \verb|ConfigFile|
from the pyx standard library).

\section{Own datafile readers}

Other datafile readers should be derived from the class \verb|data|.
The methods \verb|getcolumnno|, \verb|getcolumn|, and
\verb|addcolumn| are then immediately available, and the cooperation
with other parts of \PyX{} is assured. All that has to be done is to
call the inherited constructor, supplying at least a sequence of data
points as the \verb|data| keyword argument. A data point is itself a
sequence of floats and/or strings. Additionally, a sequence of column
titles (strings) may be given in the \verb|titles| argument.
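
The pattern can be sketched as follows. To keep the example
self-contained, the class \verb|data| below is a minimal stand-in that
models only the two constructor arguments named above; in real code
the class from \PyX{} would be used instead. The reader
\verb|csvfile| and the helper \verb|convert| are hypothetical.

```python
class data:
    """Minimal stand-in for the PyX class of the same name; only the
    constructor arguments named in the text are modelled."""
    def __init__(self, data, titles=None):
        self.data = data
        self.titles = titles

def convert(entry):
    """Return entry as a float if possible, else as a stripped string."""
    entry = entry.strip()
    try:
        return float(entry)
    except ValueError:
        return entry

class csvfile(data):
    """Hypothetical reader for comma-separated text with a title line."""
    def __init__(self, text):
        lines = [line.split(",") for line in text.splitlines() if line.strip()]
        titles = [title.strip() for title in lines[0]]
        points = [[convert(entry) for entry in line] for line in lines[1:]]
        # The only required step: call the inherited constructor.
        super().__init__(data=points, titles=titles)

reader = csvfile("x,label\n1,a\n2.5,b\n")
```

All parsing happens before the inherited constructor is called, so the
base class only ever sees finished data points and titles.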