2 Time-stamp: <2010-11-07 11:54:36 tony>
3 Creation: <2008-09-08 08:06:30 tony>
8 Author: AJ Rossini <blindglobe@gmail.com>
9 Copyright: (c) 2007-2010, AJ Rossini <blindglobe@gmail.com>. BSD.
10 Purpose: Stuff that needs to be made working sits inside the
13 This file contains the current challenges to solve,
14 including a description of the setup and the work to
15 solve. Solutions welcome.
17 What is this talk of 'release'? Klingons do not make software
18 'releases'. Our software 'escapes', leaving a bloody trail of
19 designers and quality assurance people in its wake.
23 ** (Internal) Package and (External) System Hierarchy
26 *** Singletons (primary building blocks)
28 These are packages as well as
| asdf       | common system loader                           |
| xarray     | common access structure to array-like          |
|            | (matrix, vector) structures.                   |
| cls-config | initialization of Lisp state, variables, etc., |
|            | localization to the particular Lisp.           |
| lift       | unit-testing                                   |
| cffi       | foreign function library                       |
43 *** Dependency structure
| lisp-matrix     | general purpose matrix package, linking to lapack |      |
|                 | for numerics. Depends on:                         |      |
|                 | cl-blapack                                        | cffi |
| cls-dataframe   | in the same spirit as lisp-matrix, a means to     |      |
|                 | create tables. Perhaps better called datatables?  |      |
| cls-probability | depends on gsll, cl-variates, cl-? initially,     |      |
69 Usually, we need to load it before going on.
72 (asdf:oos 'asdf:load-op :cls)
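On newer ASDF versions, `asdf:load-system` is shorthand for the `load-op` call above. A minimal sketch, assuming an ASDF that provides `asdf:load-system` and `asdf:find-system`; the helper `missing-systems` is hypothetical and not part of CLS. It checks that a system is visible to ASDF before trying to load it:

#+begin_src lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :asdf))

(defun missing-systems (names)
  "Return the members of NAMES that ASDF cannot locate (hypothetical helper)."
  ;; FIND-SYSTEM with a second argument of NIL returns NIL instead of
  ;; signalling an error when the system is not found.
  (remove-if (lambda (name) (asdf:find-system name nil)) names))

;; Check first, then load only when the system is visible to ASDF.
(let ((missing (missing-systems '(:cls))))
  (if missing
      (format t "~&Cannot find: ~{~a~^, ~}~%" missing)
      (asdf:load-system :cls)))
#+end_src

Evaluate the forms one at a time, or `load` the file as source, so that the `asdf` package exists before its symbols are read.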
75 - State "DONE" from "CURR" [2010-10-12 Tue 13:48] \\
76 setup is mostly complete
77 - State "CURR" from "TODO" [2010-10-12 Tue 13:47]
78 - State "TODO" from "" [2010-10-12 Tue 13:47]
(defun init-CLS (&key (compile nil))
  "Load the systems CLS depends on; with :COMPILE true, force-compile them instead."
  (let ((packagesToLoad (list ;; core system
                              :lift :lisp-matrix :cls
                              ;; :cl-cairo2-x11 :iterate
                              :cl-pdf :cl-typesetting
                              :asdf-system-connections :xarray
                              :metatilities-base :anaphora :tinaa
                              :cl-ppcre :cl-markdown :docudown
                              ;; version and validate CLOS objects
                              ;; :versioned-objects :validations
                              ;; :cl-glu :cl-glut :cl-glut-examples
                              )))
    (mapcar #'(lambda (x)
                (if compile
                    (asdf:oos 'asdf:compile-op x :force T)
                    (asdf:oos 'asdf:load-op x)))
            packagesToLoad)))
118 | | #<PACKAGE "COMMON-LISP-USER"> |
120 ** CURR [#A] Testing: unit, regression, examples. [0/3]
121 - State "CURR" from "TODO" [2010-10-12 Tue 13:51]
122 - State "TODO" from "" [2010-10-12 Tue 13:51]
123 Testing consists of unit tests, which internally verify subsets of
124 code, regression tests, and functional tests (in increasing order
126 *** CURR [#B] Unit tests
127 - State "CURR" from "TODO" [2010-11-04 Thu 18:33]
128 - State "CURR" from "TODO" [2010-10-12 Tue 13:48]
129 - State "TODO" from "" [2010-10-12 Tue 13:48]
Unit tests have been started using LIFT. We still need to consider
the other systems that provide testing, as people add them to the
mix of libraries that we need, along with examples of how to use them.
135 (in-package :lisp-stat-unittests)
136 (run-tests :suite 'lisp-stat-ut)
137 ;; => tests = 78, failures = 7, errors = 20
138 (asdf:oos 'asdf:test-op 'cls)
139 ;; which runs (describe (run-tests :suite 'lisp-stat-ut))
141 and check documentation to see if it is useful.
144 (in-package :lisp-stat-unittests)
146 (describe 'lisp-stat-ut)
147 (documentation 'lisp-stat-ut 'type)
149 ;; FIXME: Example: currently not relevant, yet
150 ;; (describe (lift::run-test :test-case 'lisp-stat-unittests::create-proto
151 ;; :suite 'lisp-stat-unittests::lisp-stat-ut-proto))
153 (describe (lift::run-tests :suite 'lisp-stat-ut-dataframe))
154 (lift::run-tests :suite 'lisp-stat-ut-dataframe)
156 (describe (lift::run-test
157 :test-case 'lisp-stat-unittests::create-proto
158 :suite 'lisp-stat-unittests::lisp-stat-ut-proto))
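For reference, a minimal LIFT suite looks roughly like the following. This is a sketch, assuming LIFT is installed where ASDF can find it; the suite name `arith-ut` and its two tests are hypothetical and not part of `lisp-stat-unittests`. Evaluate the forms one at a time (or `load` the file as source) so that the `lift` package exists before its symbols are read.

#+begin_src lisp
(require :asdf)
(asdf:load-system :lift)

;; A suite with no parent suites and no slots.
(lift:deftestsuite arith-ut () ())

;; ENSURE-SAME compares the value of a form against an expected value.
(lift:addtest (arith-ut)
  addition-is-exact
  (lift:ensure-same (+ 1 2) 3))

;; ENSURE passes when its form evaluates to a true value.
(lift:addtest (arith-ut)
  sqrt-of-four
  (lift:ensure (= 2 (sqrt 4))))

;; Run the suite and print a summary of the result object.
(describe (lift:run-tests :suite 'arith-ut))
#+end_src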
161 *** TODO [#B] Regression Tests
162 - State "TODO" from "" [2010-10-12 Tue 13:54]
164 *** TODO [#B] Functional Tests
165 - State "TODO" from "" [2010-10-12 Tue 13:54]
167 ** TODO [#B] Functional Examples that need to work [0/2]
168 - State "TODO" from "" [2010-10-12 Tue 13:55]
These examples should be working forms within CLS that demonstrate
functionality we need.
173 *** TODO [#B] Scoping with datasets
174 - State "TODO" from "" [2010-11-04 Thu 18:46]
The following needs to work; resampling and similar synthetic data
approaches (bootstrapping, imputation) ought to use a similar
syntax.
179 #+srcname: DataSetNameScoping
181 (in-package :ls-user)
183 ;; Syntax examples using lexical scope, closures, and bindings to
184 ;; ensure a clean communication of results
185 (with-data dataset ((dsvarname1 [usevarname1])
186 (dsvarname2 [usevarname2]))
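One way such a form could be built on ordinary lexical binding is sketched below. This is an illustration only, not the CLS `with-data`: the macro name `with-data-sketch` and the plist representation of a dataset (a column keyword followed by a list of values) are assumptions made to keep the example self-contained.

#+begin_src lisp
(in-package :cl-user)

(defmacro with-data-sketch (dataset bindings &body body)
  "Evaluate BODY with each local name in BINDINGS bound to a column of DATASET.
Each binding is (LOCAL-NAME [COLUMN-KEY]); COLUMN-KEY is a keyword and
defaults to the keyword with the same name as LOCAL-NAME."
  (let ((ds (gensym "DATASET")))
    `(let ((,ds ,dataset))
       (let ,(loop for (var key) in bindings
                   collect `(,var (getf ,ds ,(or key
                                                 (intern (symbol-name var)
                                                         :keyword)))))
         ,@body))))

(defparameter *measurements*
  (list :height '(160 170 180) :weight '(60 70 80))
  "Hypothetical dataset: a plist of column keyword and column values.")

;; H is an explicit alias for :HEIGHT; WEIGHT picks up :WEIGHT by default.
(with-data-sketch *measurements* ((h :height) (weight))
  (mapcar #'- h weight))
;; => (100 100 100)
#+end_src

Because the bindings are plain lexical variables, results computed in the body cannot leak names back into the dataset, which is the "clean communication of results" the comment above asks for.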
190 *** TODO [#B] Dataframe variable typing
191 - State "TODO" from "" [2010-11-04 Thu 18:48]
193 #+srcname: DFvarTyping
195 (in-package :ls-user)
196 (defparameter *df-test*
197 (make-instance 'dataframe-array
:storage #2A ((a "test0" 0 0d0) ; literal array: write A, since 'A reads as (QUOTE A)
204 :case-labels (list "0" "1" 2 "3" "4")
205 :var-labels (list "symbol" "string" "integer" "double-float")
206 :var-types (list 'symbol 'string 'integer 'double-float)))
208 ;; with SBCL, ints become floats? Need to adjust output
209 ;; representation appropriately..
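(defun check-var (df colnum)
  (let ((nobs (xdim (dataset df) 0)))
    (dotimes (i nobs)
      ;; CHECK-TYPE does not evaluate its type argument, so use TYPEP;
      ;; the declared type is indexed by column, not by row.
      (assert (typep (xref df i colnum) (elt (var-types df) colnum))))))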
217 (xdim (dataset *df-test*) 1)
218 (xdim (dataset *df-test*) 0)
220 (check-var *df-test* 0)
223 (xref *df-test* 1 1))
225 (check-type (xref *df-test* 1 1)
226 string) ;; => nil, so good.
227 (check-type (xref *df-test* 1 1)
228 vector) ;; => nil, so good.
229 (check-type (xref *df-test* 1 1)
230 real) ;; => simple-error type thrown, so good.
232 ;; How to nest errors within errors?
233 (check-type (check-type (xref *df-test* 1 1) real) ;; => error thrown, so good.
239 (integerp (xref *df-test* 1 2))
240 (floatp (xref *df-test* 1 2))
241 (integerp (xref *df-test* 1 3))
242 (type-of (xref *df-test* 1 3))
243 (floatp (xref *df-test* 1 3))
245 (type-of (vector 1 1d0))
251 (xref *df-test* 1 '*)
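Note that `check-type` is a macro whose type argument is not evaluated, so a type looked up at run time (for example from `var-types`) has to go through `typep` instead. A minimal sketch on a plain 2D array; the data and the helper name `column-type-violations` are hypothetical, and the `dataframe-array` class is not used:

#+begin_src lisp
(in-package :cl-user)

(defparameter *rows*
  #2A((a "test0" 0 0d0)
      (b "test1" 1 1d0))
  "Hypothetical data. Literal array elements are not evaluated, so the
symbols are written A and B; 'A would read as the list (QUOTE A).")

(defparameter *column-types* '(symbol string integer double-float))

(defun column-type-violations (array types col)
  "Return the row indices whose entry in column COL is not of type (elt TYPES COL)."
  (loop for row below (array-dimension array 0)
        unless (typep (aref array row col) (elt types col))
          collect row))

;; Column 2 holds integers, as declared.
(column-type-violations *rows* *column-types* 2)
;; => NIL

;; Declaring column 3 as INTEGER flags both rows, since it holds double-floats.
(column-type-violations *rows* '(symbol string integer integer) 3)
;; => (0 1)
#+end_src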
254 ** CURR [#A] Random Numbers
255 - State "CURR" from "TODO" [2010-11-05 Fri 15:41]
256 - State "TODO" from "" [2010-10-14 Thu 00:12]
Need to select a probability system (probability
259 functions, random numbers). Goal is to have a general framework
260 for representing probability functions, functionals on
261 probabilities, and reproducible random streams based on such
263 *** CURR [#B] CL-VARIATES system evaluation [2/3]
264 - State "CURR" from "TODO" [2010-11-05 Fri 15:40]
265 - State "TODO" from "" [2010-10-12 Tue 14:16]
267 CL-VARIATES is a system developed by Gary W King. It uses streams
268 with seeds, and is hence reproducible. (Random comment: why do CL
269 programmers as a class ignore computational reproducibility?)
270 **** DONE [#B] load and verify
271 - State "DONE" from "CURR" [2010-11-04 Thu 18:59] \\
272 load, init, and verify performance.
273 - State "CURR" from "TODO" [2010-11-04 Thu 18:58]
274 - State "TODO" from "" [2010-11-04 Thu 18:58]
276 #+srcname: Loading-CL-VARIATES
278 (in-package :cl-user)
279 (asdf:oos 'asdf:load-op 'cl-variates)
280 (asdf:oos 'asdf:load-op 'cl-variates-test)
284 #+srcname: CL-VARIATES-UNITTESTS
286 (in-package :cl-variates-test)
288 (run-tests :suite 'cl-variates-test)
289 (describe (run-tests :suite 'cl-variates-test))
292 **** DONE [#B] Examples of use
293 - State "DONE" from "CURR" [2010-11-05 Fri 15:39] \\
294 basic example of reproducible draws from the uniform and normal random
296 - State "CURR" from "TODO" [2010-11-05 Fri 15:39]
297 - State "TODO" from "" [2010-11-04 Thu 19:01]
299 #+srcname: CL-VARIATES-EXAMPLE-USE
301 (in-package :cl-variates-user)
303 (defparameter state (make-random-number-generator))
304 (setf (random-seed state) 44)
306 (loop for i from 1 to 10 collect
307 (random-range state 0 10))
308 ;; => (1 5 1 0 7 1 2 2 8 10)
309 (setf (random-seed state) 44)
310 (loop for i from 1 to 10 collect
311 (random-range state 0 10))
312 ;; => (1 5 1 0 7 1 2 2 8 10)
314 (setf (random-seed state) 44)
316 (loop for i from 1 to 10 collect
317 (normal-random state 0 1))
319 ;; (-1.2968656102820426 0.40746363934173213 -0.8594712469518473 0.8795681301148328
320 ;; 1.0731526250004264 -0.8161629082481728 0.7001813608754809 0.1078045427044097
321 ;; 0.20750134211656893 -0.14501914108452274)
323 (setf (random-seed state) 44)
324 (loop for i from 1 to 10 collect
325 (normal-random state 0 1))
327 ;; (-1.2968656102820426 0.40746363934173213 -0.8594712469518473 0.8795681301148328
328 ;; 1.0731526250004264 -0.8161629082481728 0.7001813608754809 0.1078045427044097
329 ;; 0.20750134211656893 -0.14501914108452274)
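For comparison, standard Common Lisp has no portable way to seed `random` from an integer, but a `random-state` object can be copied and replayed within a session. A minimal sketch using only standard functions (this is not the CL-VARIATES API; `*saved-state*` and `draws` are names made up for the example):

#+begin_src lisp
(in-package :cl-user)

(defparameter *saved-state* (make-random-state t)
  "A freshly initialized state, kept so that it can be replayed.")

(defun draws (n state)
  "Collect N integers in [0, 10) from STATE, advancing STATE."
  (loop repeat n collect (random 10 state)))

;; MAKE-RANDOM-STATE with a state argument returns a copy, so both calls
;; start from the same point and produce the same sequence.
(equal (draws 10 (make-random-state *saved-state*))
       (draws 10 (make-random-state *saved-state*)))
;; => T
#+end_src

The printed form of a `random-state` is implementation-dependent, so this replays draws within one implementation only; an integer seed, as in CL-VARIATES, is easier to record alongside results.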
332 **** CURR [#B] Full example of general usage
333 - State "CURR" from "TODO" [2010-11-05 Fri 15:40]
334 - State "TODO" from "" [2010-11-05 Fri 15:40]
336 What we want to do here is describe the basic available API that
337 is present. So while the previous work describes what the
338 *** TODO [#B] CL-RANDOM system evaluation
339 - State "TODO" from "" [2010-11-05 Fri 15:40]
342 1. no seed setting for random numbers
343 2. contamination of a probability support with optimization and
348 2. nice design for generics.
350 *** TODO [#B] Native CLS (from XLS)
351 - State "TODO" from "" [2010-11-05 Fri 15:40]
353 ** TODO [#B] Numerical Linear Algebra
354 - State "TODO" from "" [2010-10-14 Thu 00:12]
356 *** TODO [#B] LLA evaluation
357 - State "TODO" from "" [2010-10-12 Tue 14:13]
358 ;;; experiments with LLA
359 (in-package :cl-user)
360 (asdf:oos 'asdf:load-op 'lla)
361 (in-package :lla-user)
363 *** CURR [#B] Lisp-Matrix system evaluation
364 - State "CURR" from "TODO" [2010-10-12 Tue 14:13]
365 - State "TODO" from "" [2010-10-12 Tue 14:13]
367 *** TODO [#B] LispLab system evaluation
368 - State "TODO" from "" [2010-10-12 Tue 14:13]
370 ** TODO [#B] Statistical Procedures to implement
371 - State "TODO" from "" [2010-10-14 Thu 00:12]
374 (in-package :cls-user)
379 ;; population design eval and opt
387 number of samples/cost of lab analysis and collection
391 (defun pfim (&key model ( constraints ( summary-function )
393 (list num-subjects num-times list-times))))
Each individual has a design \psi_i:
number of samples n_i and sampling times t_{i1}, ..., t_{i n_i};
individuals can differ
403 individual-level model
(=model y_i (+ (f \theta_i \psi_i) \epsilon_i))
(=var \epsilon_i \sigma_between \sigma_within)
;; Information Matrix for pop design
411 (defparameter IM (sum (i 1 N) (MF \psi_i \phi_i)))
For nonlinear structural models, expand around RE=0
Cramér-Rao: MF^{-1} is the lower bound for the estimation variance.
420 - smallest SE, but is a matrix, so
421 - criteria for matrix comparison
422 -- D-opt, (power (determinant MF) (/ 1 P))
425 find design maxing D opt, (power (determinant MF) (/ 1 P))
-- continuous vars for sampling times within interval or set
-- number of groups for categorical vars
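The D-criterion noted above, det(MF)^(1/P), can be computed directly for a small information matrix. A sketch in plain Common Lisp, assuming MF is a square 2D array and P is its dimension; `determinant` and `d-criterion` are written here for illustration and are not lisp-matrix or CLS functions, and the example matrices are made up:

#+begin_src lisp
(in-package :cl-user)

(defun determinant (matrix)
  "Determinant of a square 2D array by Gaussian elimination with partial pivoting."
  (let* ((n (array-dimension matrix 0))
         (a (make-array (list n n) :initial-element 0d0))
         (det 1d0))
    ;; Work on a double-float copy so MATRIX is left untouched.
    (dotimes (i n)
      (dotimes (j n)
        (setf (aref a i j) (coerce (aref matrix i j) 'double-float))))
    (dotimes (k n det)
      (let ((p k))
        ;; Pick the row with the largest entry in column K as the pivot.
        (loop for i from (1+ k) below n
              when (> (abs (aref a i k)) (abs (aref a p k)))
                do (setf p i))
        (when (zerop (aref a p k))
          (return 0d0))                 ; singular matrix
        (unless (= p k)
          (dotimes (j n)
            (rotatef (aref a k j) (aref a p j)))
          (setf det (- det)))           ; a row swap flips the sign
        (setf det (* det (aref a k k)))
        ;; Eliminate column K below the pivot.
        (loop for i from (1+ k) below n
              do (let ((factor (/ (aref a i k) (aref a k k))))
                   (loop for j from k below n
                         do (decf (aref a i j) (* factor (aref a k j))))))))))

(defun d-criterion (mf)
  "D-optimality criterion det(MF)^(1/P), where P is the number of parameters."
  (expt (determinant mf) (/ 1 (array-dimension mf 0))))

;; det = 36 and P = 2, so the criterion is close to 6.
(d-criterion #2A((4 0) (0 9)))

;; A design with a larger criterion is preferred.
(> (d-criterion #2A((4 0) (0 9)))
   (d-criterion #2A((2 1) (1 2))))
;; => T
#+end_src

Taking the P-th root puts designs with different numbers of parameters on a comparable scale.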
429 Stat in Med 2009, expansion around post-hoc RE est, not necessarily zero.
431 Example binary covariate C
434 (if (= i reference-class)
439 (=model (log \theta) ( ))
446 PFIM provides for a given design and values of \beta:
448 SE/RSE for \beta of each class of each covar
449 eval influence of design on SE(\beta)
inter-occasion variability (IOV)
- patients sampled more than once, H occasions
454 - additional vars to estimate
458 ;;; comparison criteria
460 functional of conc/time curve which is used for comparison, i.e.
461 (AUC conc/time-curve)
462 (Cmax conc/time-curve)
463 (Tmax conc/time-curve)
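For a sampled profile, these functionals are simple to compute. A sketch, assuming the curve is given as parallel lists of sampling times and concentrations and that AUC is taken by the linear trapezoidal rule; the function signatures and the sample numbers are made up for the example:

#+begin_src lisp
(in-package :cl-user)

(defun auc (times concs)
  "Area under the concentration/time curve by the linear trapezoidal rule."
  (loop for (t0 t1) on times
        for (c0 c1) on concs
        while t1
        sum (* 1/2 (- t1 t0) (+ c0 c1))))

(defun cmax (concs)
  "Largest observed concentration."
  (reduce #'max concs))

(defun tmax (times concs)
  "Sampling time at which the largest concentration was first observed."
  (loop with best-time = (first times)
        with best-conc = (first concs)
        for time in (rest times)
        for conc in (rest concs)
        when (> conc best-conc)
          do (setf best-time time
                   best-conc conc)
        finally (return best-time)))

(defparameter *times* '(0 1 2 4) "Hypothetical sampling times.")
(defparameter *concs* '(0 4 3 1) "Hypothetical concentrations.")

(auc *times* *concs*)   ; => 19/2
(cmax *concs*)          ; => 4
(tmax *times* *concs*)  ; => 1
#+end_src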
(defun conc/time-curve (time) ; T is a constant in CL and cannot name a parameter
(let ((conc (exp (* time \beta1))))
476 (url-get "www.pfim.biostat.fr")
479 ;;; Thinking of generics...
480 (information-matrix model parameters)
481 (information-matrix variance-matrix)
482 (information-matrix model data)
483 (information-matrix list-of-individual-IMs)
486 (defun IM (loglikelihood parameters times)
487 "Does double work. Sum up the resulting IMs to form a full IM."
488 (let ((IM (make-matrix (length parameters)
490 :initial-value 0.0d0)))
491 (dolist (parameterI parameters)
492 (dolist (parameterJ parameters)
494 (differentiate (differentiate loglikelihood parameterI) parameterJ))))))
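`differentiate` is not defined in these notes. As a stand-in, the observed information (the negative Hessian of the log-likelihood) can be approximated numerically. A sketch using central finite differences, assuming the log-likelihood is a function of a list of real parameters; the function names, the step size, and the toy data are illustrative only:

#+begin_src lisp
(in-package :cl-user)

(defun observed-information (loglik params &optional (h 1d-4))
  "Negative Hessian of LOGLIK at PARAMS (a list of reals), by central differences."
  (let* ((p (length params))
         (im (make-array (list p p) :initial-element 0d0)))
    (flet ((shifted (i di j dj)
             ;; Evaluate LOGLIK with parameter I moved by DI and parameter J by DJ.
             (let ((x (copy-list params)))
               (incf (nth i x) di)
               (incf (nth j x) dj)
               (funcall loglik x))))
      (dotimes (i p im)
        (dotimes (j p)
          (setf (aref im i j)
                (- (/ (+ (shifted i h j h)
                         (- (shifted i h j (- h)))
                         (- (shifted i (- h) j h))
                         (shifted i (- h) j (- h)))
                      (* 4 h h)))))))))

(defparameter *toy-data* '(1d0 2d0 3d0) "Hypothetical observations.")

(defun normal-mean-loglik (params)
  "Log-likelihood, up to a constant, of *TOY-DATA* as N(mu, 1); PARAMS is (mu)."
  (let ((mu (first params)))
    (- (* 1/2 (loop for x in *toy-data* sum (expt (- x mu) 2))))))

;; For a normal mean with unit variance the information equals the sample
;; size, so this is close to 3.
(aref (observed-information #'normal-mean-loglik '(2d0)) 0 0)
#+end_src

Per-individual matrices computed this way can then be summed elementwise to form the full information matrix, as the docstring above suggests.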
496 *** difference between empirical, fisherian, and ...? information.
497 *** Example of Integration with CL-GENOMIC
498 - State "TODO" from "" [2010-10-12 Tue 14:03]
500 CL-GENOMIC is a very interesting data-structure strategy for
501 manipulating sequence data.
505 (in-package :cl-user)
506 (asdf:oos 'asdf:compile-op :ironclad)
507 (asdf:oos 'asdf:load-op :cl-genomic)
509 (in-package :bio-sequence)
510 (make-dna "agccg") ;; fine
511 (make-aa "agccg") ;; fine
512 (make-aa "agc9zz") ;; error expected
515 ** TODO [#B] Documentation and Examples [0/3]
516 - State "TODO" from "" [2010-10-14 Thu 00:12]
518 *** TODO [#B] Docudown
519 - State "TODO" from "" [2010-11-05 Fri 15:34]
522 - State "TODO" from "" [2010-11-05 Fri 15:34]
524 *** TODO [#B] CLPDF, and literate data analysis
525 - State "TODO" from "" [2010-11-05 Fri 15:34]