1 %;; arxana.tex -*- mode: Emacs-Lisp; -*-
2 %;; Copyright (C) 2005-2009 Joe Corneli <holtzermann17@gmail.com>
3 %;; DISOWNED! THIS FILE IS PUBLIC DOMAIN. DO WHAT YOU WILL!
5 % (progn
6 % (find-file "~/arxana.tex")
7 % (save-excursion
8 % (goto-char (point-max))
9 % (let ((beg (progn (search-backward "\\begin{verbatim}")
10 % (match-end 0)))
11 % (end (progn (search-forward "\\end{verbatim}")
12 % (match-beginning 0))))
13 % (eval-region beg end)
14 % (lit-process))))
16 %%% Commentary:
18 %% To load: remove %'s above and evaluate with C-x C-e.
20 %% Alternatively, run this:
% head -n 14 arxana.tex | sed -e "/%/s///" > arxana-loader.el
22 %% on the command line to produce something you can use
23 %% to load Arxana when you start Emacs:
24 % emacs -l arxana-loader.el
26 %% Or put the expression in your ~/.emacs (perhaps wrapped
27 %% in function like `eval-arxana').
29 %% Or search for a similar form below and evaluate there!
31 %% Q. Where exactly are we supposed to store the most
32 %% up-to-date Arxana files when they are ready to go?
34 %% A. Copy them into /usr/lib/sbcl/site-systems/arxana/
35 %% and that should be enough. Make sure that arxana.asd
36 %% is in that directory and that you have a symbolic link,
37 %% made via
39 %% ln -s ./arxana/arxana.asd .
41 %% in the directory /usr/lib/sbcl/site-systems/
42 %% -- Make sure to load once as root to generate new fasls.
44 %% Q. How to run the remote slime after that?
46 %% A. Make sure that Emacs `slime-protocol-version' matches
47 %% Common Lisp's `swank::*swank-wire-protocol-version*', then,
48 %% like this:
50 %% ssh -L 4005:127.0.0.1:4005 joe@li23-125.members.linode.com
51 %% linode$ sbcl
52 %% M-x slime-connect RET RET
54 %%% Code:
56 \documentclass{article}
58 \usepackage{amsmath}
59 \usepackage{amsthm}
60 \usepackage{verbatim}
62 \newcommand{\meta}[1]{$\langle${\it #1}$\rangle$}
64 \theoremstyle{definition}
65 \newtheorem{nota}{Note}[section]
67 \parindent = 1.2em
69 \newenvironment{notate}[1]
70 {\begin{nota}[{\bf {\em #1}}]}%
71 {\end{nota}}
73 \makeatletter
74 \newenvironment{elisp}
75 {\let\ORGverbatim@font\verbatim@font
76 \def\verbatim@font{\ttfamily\scshape}%
77 \verbatim}
78 {\endverbatim
79 \let\verbatim@font\ORGverbatim@font}
80 \makeatother
82 \makeatletter
83 \newenvironment{common}[1]
84 {\let\ORGverbatim@font\verbatim@font
85 \def\verbatim@font{\ttfamily\scshape}%
86 \verbatim}
87 {\endverbatim
88 \let\verbatim@font\ORGverbatim@font}
89 \makeatother
91 \makeatletter
92 \newenvironment{idea}
93 {\let\ORGverbatim@font\verbatim@font
94 \def\verbatim@font{\ttfamily\slshape}%
95 \verbatim}
96 {\endverbatim
97 \let\verbatim@font\ORGverbatim@font}
98 \makeatother
100 \begin{document}
102 \title{\emph{Arxana}}
104 \author{Joseph Corneli\thanks{Copyright (C) 2005-2010
105 Joseph Corneli {\tt <holtzermann17@gmail.com>}\newline
106 $\longrightarrow$ transferred to the public domain.}}
107 \date{Last revised: \today}
109 \maketitle
111 \abstract{A tool for building hackable semantic hypertext
112 platforms. Source code and mailing lists are at {\tt
113 http://common-lisp.net/project/arxana}.}
115 \tableofcontents
117 \section{Introduction}
119 \begin{notate}{What is ``Arxana''?} \label{arxana}
120 \emph{Arxana} is the name of a ``next generation''
121 hypertext system that emphasizes annotation. Every object
122 in this system is annotatable. Because of this, I
123 sometimes call Arxana's core ``the scholium system'', but
124 the name ``Arxana'' better reflects our aim: to explore
125 the mysterious world of links, attachments,
126 correspondences, and side-effects.
127 \end{notate}
129 \begin{notate}{The idea} \label{theoretical-context}
130 A scholia-based document model for commons-based peer
131 production will inform the development of our
system.\footnote{{\tt
http://www.metascholar.org/events/2005/freeculture/viewabstract.php?id=19}
% alternate:
% http://br.endernet.org/~akrowne/planetmath/papers/corneli\_fcdl/corneli-krowne.pdf
\label{corneli-krowne}}

138 In this model, texts are made up of smaller texts until
139 you get to atomic texts; user actions are built in the
140 same way. Multiple users should interact with a shared
141 persistent data-store, through functional annotation, not
142 destructive modification. We should pursue the
143 asynchronous interaction model until we arrive at live,
144 synchronous, settings, where we facilitate real-time
145 computer-mediated interactions between users, and between
146 users and running hackable programs.
147 \end{notate}
149 \begin{notate}{The data model} \label{data-model}
150 Start by storing a collection of \emph{strings}. Now add
151 in \emph{pairs} and \emph{triples} which point at 2 and 3
152 objects respectively. (We can extend to n-tuples if that
153 turns out to be convenient.) Finally, we will maintain a
154 collection of \emph{lists}, each of which points at an
155 unlimited number of objects.
156 \end{notate}
158 \begin{notate}{History}
159 Thinking about how to improve existing systems for
160 peer-based collaboration in 2004, I designed a simple
161 version of the scholium system that treated textual
162 commentary and markup as scholia.\footnote{{\tt
163 http://wiki.planetmath.org/AsteroidMeta/old\_draft\_of\_scholium\_system}}
164 In 2006, I put together a single-user version of this
165 system that ran exclusively under Emacs.\footnote{{\tt
166 http://metameso.org/files/sbdm4cbpp.tex} \label{old-version}}
167 The current system is an almost-completely rewritten
168 variant, bringing in a shared database and various other
169 enhancements to support multi-user interaction.
170 \end{notate}
172 \begin{notate}{A brisk review of the programming literature} \label{prog-lit-review}
173 Many years before I started working on this project, there
174 was something called the Emacs HyperText
175 System.\footnote{{\tt
176 http://www.aue.aau.dk/\~{}kock/Publications/HyperBase/}}
What we're doing here updates that idea for modern
database methods, uses a more interesting data storage
format, and also considers multiple front-ends to the same
database (for example, a web interface).

182 Contemporary Emacs-based hypertext creation systems
183 include Muse and Emacs Wiki.\footnote{{\tt
184 http://mwolson.org/projects/EmacsMuse.html}}$^,$\footnote{{\tt
185 http://mwolson.org/projects/EmacsWiki.html}} The
186 browsing side features old standbys, Info and
187 Emacs/w3m\footnote{Not to be confused with Emacs-w3m,
188 which is not entirely ``Emacs-based''.}. These packages
provide ways to author or view what we should now
call ``traditional'' hypertext documents.

Another legacy tool worth mentioning is
193 HyperCard\footnote{{\tt
194 http://en.wikipedia.org/wiki/HyperCard}}. This system
195 was oriented around the idea of using hypertext to create
196 software, a vision we share, but like just about everyone
197 else working in the field at the time, it used
198 uni-directional links.
200 Hypertext \emph{nouveau} is based on semantic triples.
201 The Semantic Web standard provides one specification of
202 the features we can expect from triples.\footnote{{\tt
203 http://www.w3.org/TR/2004/REC-rdf-primer-20040210/}}
204 Triples provide a framework for knowledge representation
205 with more depth and flexibility than the popular
206 ``tagging'' methodology. For example, suitable
207 collections of triples implement AI-style ``frames''. The
208 idea of using triples to organize archival material is
209 generating some interest as Semantic Web ideas
210 spread.\footnote{Cf. recent museum and library
211 conferences}$^,$\footnote{Even among academic computer
212 scientists! (Josh Grochow, p.c.)}
214 An abstractly similar project to Arxana with some grand
215 goals is being developed by Chris Hanson at MIT under the
216 name ``Web-scale Environments for Deduction
217 Systems''.\footnote{{\tt
218 http://publications.csail.mit.edu/abstracts/abstracts07/cph2/cph2.html}}
Another technically similar project is Freebase, a
hand-rolled database of open content, organized on
frame-based, triple-driven principles. The developer of the Freebase
223 graphd database has some interesting things to say about
224 old and new ways of handling triples.\footnote{{\tt
225 http://blog.freebase.com/2008/04/09/a-brief-tour-of-graphd/}}
226 \end{notate}
228 \begin{notate}{Fitting in}
229 My current development goal is to use this system to
230 create a more flexible multiuser interaction platform than
231 those currently available to web-based collaborative
232 projects (such as PlanetMath\footnote{{\tt
233 http://planetmath.org}}). As an intermediate stage,
234 I'm using Arxana to help organize material for a book I'm
235 writing. Arxana's theoretical generality, active
236 development status, detailed documentation, and
237 superlatively liberal terms of use may make it an
238 attractive option for you to try as well!
239 \end{notate}
241 \begin{notate}{What you get}
242 Arxana has an Emacs frontend, a Common Lisp middle-end,
243 and a SQL backend. If you want to do some work, any one
244 of these components can be swapped out and replaced with
245 the engine of your choice. I've released all of the
246 implementation work on this system into the public domain,
247 and it runs on an entirely free/libre/open source software
248 platform.
249 \end{notate}
251 \begin{notate}{Acknowledgements}
252 Ted Nelson's ``Literary Machines'' and Marvin Minsky's
253 ``Society of Mind'' are cornerstones in the historical and
254 social contextualization of this work. Alfred Korzybski's
255 ``Science and Sanity'' and Gilles Deleuze's ``The Logic of
256 Sense'' provided grounding and encouragement. \TeX\ and
257 GNU Emacs have been useful not just in prototyping this
258 system, but also as exemplary projects in the genre I'm
259 aiming for. John McCarthy's Elephant 2000 was an
260 inspiring thing to look at and think about\footnote{{\tt
261 http://www-formal.stanford.edu/jmc/elephant/elephant.html}}, and of course Lisp has been a vital ingredient.
263 Thanks also to everyone who's talked about this project
264 with me!
265 \end{notate}
267 \section{Using the program}
269 \begin{notate}{Dependencies} \label{dependencies}
270 Our interface is embedded in Emacs. Backend processing is
271 done with Common Lisp. We are currently using the
272 PostgreSQL database. These packages should be available
273 to you through the usual channels. (I've been using SBCL,
274 but any Lisp should do; please make sure you are using a
275 contemporary Emacs version.)
277 We will connect Emacs to Lisp via Slime\footnote{{\tt
278 http://common-lisp.net/project/slime/}}, and Lisp to
279 PostgreSQL via CLSQL.\footnote{{\tt http://clsql.b9.com/}}
280 CLSQL also talks directly to the Sphinx search engine,
281 which we use for text-based search.\footnote{{\tt
282 http://www.sphinxsearch.com/}} Once all of these
283 things are installed and working together, you should be
284 able to begin to use Arxana.
286 Setting up all of these packages can be a somewhat
287 time-consuming and confusing task, especially if you
288 haven't done it before! See Appendix \ref{appendix-setup}
289 for help.
290 \end{notate}
292 \begin{notate}{Export code and set up the interface}
293 If you are looking at the source version of this document
294 in Emacs, evaluate the following s-expression (type
295 \emph{C-x C-e} with the cursor positioned just after its
296 final parenthesis). This exports the Common Lisp
297 components of the program to suitable files for subsequent
298 use, and prepares the Emacs environment. (The code that
299 does this is in Appendix \ref{appendix-lit}.)
300 \end{notate}
302 \begin{idea}
303 (save-excursion
304 (let ((beg (search-forward "\\begin{verbatim}"))
305 (end (progn (search-forward "\\end{verbatim}")
306 (match-beginning 0))))
307 (eval-region beg end)
308 (lit-process)))
309 \end{idea}
311 \begin{notate}{To load Common Lisp components at run-time} \label{load-at-runtime}
312 Link {\tt arxana.asd} somewhere where Lisp can find it.
313 Then run commands like these in your Lisp; if you like,
314 you can place all of this stuff in your config file to
315 automatically load Arxana when Lisp starts. The final
316 form is only necessary if you plan to use CLSQL's special
317 syntax on the Lisp command-line.
318 \end{notate}
320 \begin{idea}
321 (asdf:operate 'asdf:load-op 'clsql)
322 (asdf:operate 'asdf:load-op 'arxana)
323 (in-package arxana)
324 (connect-to-database)
325 (locally-enable-sql-reader-syntax)
326 \end{idea}
328 \begin{notate}{To connect Emacs to Lisp}
329 Either run {\tt M-x slime RET} to start and connect to
330 Lisp locally, or {\tt M-x slime-connect RET RET} after you
331 have opened a remote connection to your remote server with
332 a command like this: {\tt ssh -L 4005:127.0.0.1:4005
333 <username>@<host>} and started Lisp and the Swank server
334 on the remote machine. To have Swank start automatically
335 when you start Lisp, put commands like this in your config
336 file.
337 \end{notate}
339 \begin{idea}
340 (asdf:operate 'asdf:load-op 'swank)
341 (setf swank:*use-dedicated-output-stream* nil)
342 (setf swank:*communication-style* :fd-handler)
343 (swank:create-server :dont-close t)
344 \end{idea}
346 \begin{notate}{To define database structures}
347 If you haven't yet defined the basic database structures,
348 make sure to load them now! (Using {\tt tabledefs.lisp},
349 or the SQL code in Section \ref{sql-code})
350 \end{notate}
\begin{notate}{Importing this document into the system}
353 You can browse this document inside Arxana: after loading
354 the code, run \emph{M-x autoimport-arxana}.
355 \end{notate}
357 \section{SQL tables} \label{sql-code}
359 \begin{notate}{Objects and codes} \label{objects-and-codes}
360 Every object in the system is identified by an ordered
361 pair: a \emph{code} and a \emph{reference}. The codes say
362 which table contains the indicated object, and references
provide that object's id. To refer to a specific element
of a list or n-tuple, a third number, that element's
\emph{offset}, is required. The codes are as follows:
367 \begin{center}
368 \begin{tabular}{|l|l|}
369 \hline
370 0 & list \\ \hline
371 1 & string \\ \hline
372 2 & pair \\ \hline
373 3 & triple \\ \hline
374 \end{tabular}
375 \end{center}
376 \end{notate}
378 \begin{idea}
CREATE TABLE strings (
  id SERIAL PRIMARY KEY,
  text TEXT NOT NULL UNIQUE
);

CREATE TABLE pairs (
  id SERIAL PRIMARY KEY,
  code1 INT NOT NULL,
  ref1 INT NOT NULL,
  code2 INT NOT NULL,
  ref2 INT NOT NULL,
  UNIQUE (code1, ref1,
          code2, ref2)
);

CREATE TABLE triples (
  id SERIAL PRIMARY KEY,
  code1 INT NOT NULL,
  ref1 INT NOT NULL,
  code2 INT NOT NULL,
  ref2 INT NOT NULL,
  code3 INT NOT NULL,
  ref3 INT NOT NULL,
  UNIQUE (code1, ref1,
          code2, ref2,
          code3, ref3)
);
406 \end{idea}
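
\begin{notate}{Sketch: coordinates in miniature}
The following is only an illustrative sketch, not part of
Arxana: it mimics the tables above with an in-memory hash
table, so that the (code ref) addressing scheme of Note
\ref{objects-and-codes} can be tried out without a
database.  The function names `store' and `fetch', and
the two variables, are hypothetical.
\end{notate}

\begin{idea}
;; Hypothetical in-memory stand-in for the SQL tables.
;; Codes follow Note "Objects and codes":
;; 1 = string, 2 = pair, 3 = triple.
(defvar *rows* (make-hash-table :test #'equal)
  "Maps coordinates (code ref) to stored contents.")

(defvar *last-id* (make-hash-table)
  "Maps a code to the last id issued for that code.")

(defun store (code contents)
  "Store CONTENTS under CODE and return its coordinates."
  (let* ((id (1+ (gethash code *last-id* 0)))
         (coords (list code id)))
    (setf (gethash code *last-id*) id
          (gethash coords *rows*) contents)
    coords))

(defun fetch (coords)
  "Return the contents stored at COORDS, or nil."
  (values (gethash coords *rows*)))

;; A triple holds three coordinates, not three strings.
(let* ((s (store 1 "Arxana"))
       (p (store 1 "is a"))
       (o (store 1 "hypertext system"))
       (tr (store 3 (list s p o))))
  (mapcar #'fetch (fetch tr)))
;; => ("Arxana" "is a" "hypertext system")
\end{idea}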
408 \begin{notate}{A list of lists}\label{models-of-theories}
409 As a central place to manage our collections, we first
410 create a list of lists. The `heading' is the list's name,
411 and its `header' is metadata.
412 \end{notate}
414 \begin{idea}
CREATE TABLE lists (
  id SERIAL PRIMARY KEY,
  heading INT UNIQUE REFERENCES strings(id),
  header INT REFERENCES strings(id)
);
420 \end{idea}
422 \begin{notate}{Lists on demand}\label{models-of-theories}
423 Whenever we want to create a new list, we first add to the
424 `lists' table, and then create a new table ``listk''
425 (where k is equal to the new maximum id on `lists').
426 \end{notate}
428 \begin{idea}
CREATE TABLE listk (
  offset SERIAL PRIMARY KEY,
  code INT NOT NULL,
  ref INT NOT NULL
);
434 \end{idea}
436 \begin{notate}{Side-note on containers via triples} \label{containers-using-triples}
437 To model a basic container, we can just use triples like
438 ``(A in B)''. This is useful, but the elements of B are
439 of course unordered. In Section \ref{importing}, we make
440 extensive use of triples like (B 1 $\alpha$), (B 2
441 $\beta$), etc., to indicate that B's first component is
442 $\alpha$, second component is $\beta$, and so on; so we
443 can make ordered list-like containers as well.
445 This is an example of the difference in expressive power
446 of tags (which only provide a sense of unordered
447 containment in ``virtual baskets'') and triples (which
448 here are seen to at least provide the additional sense of
449 ordered containment in ``virtual filing cabinets'',
450 although they have much more in store for us); cf. Note
451 \ref{prog-lit-review}.
However useful models based on these two principles may
be, the user could easily be overloaded by looking at lots
of different containers encoded in raw triples, all at
once.
457 \end{notate}
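
\begin{notate}{Sketch: reading an ordered container out of triples}
To make the ``virtual filing cabinet'' idea concrete, here
is a small self-contained sketch.  It is not Arxana code:
the triples are written as plain Lisp lists of the form
(container position element) rather than as database
rows, and `ordered-contents' is a hypothetical name.
\end{notate}

\begin{idea}
;; Hypothetical example data: (container position element).
(defparameter *triples*
  '(("B" 2 "beta") ("A" 1 "x")
    ("B" 1 "alpha") ("B" 3 "gamma")))

(defun ordered-contents (container triples)
  "Return CONTAINER's elements, ordered by position."
  (let ((members (remove-if-not
                  (lambda (triple)
                    (equal (first triple) container))
                  triples)))
    ;; `sort' is destructive, so sort a fresh copy.
    (mapcar #'third
            (sort (copy-list members) #'<
                  :key #'second))))

(ordered-contents "B" *triples*)
;; => ("alpha" "beta" "gamma")
\end{idea}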
459 \begin{notate}{Sense of containment}
460 Note that every element of a list is in the list in the
461 same ``sense'' -- for example, we can't instantly
462 distinguish elements that are ``halfway in'' from those
463 that are ``all the way in'', the same way we could with
464 pure triples.
465 \end{notate}
467 %% \begin{notate}{References into theories}
468 %% Since at the moment we have less than 10 basic codes, we
469 %% can uniquely reference contents of theory $k$ with ordered
470 %% pairs $10k+\mathit{basic\ code}$ and $\mathit{reference}$.
471 %% \end{notate}
473 \begin{notate}{Uniqueness of strings and triples} \label{unique-things}
An attempt to create duplicate contents in a string or
triple generates a warning. This saves storage, given
476 possible repetitive use -- and avoids confusion. We can,
477 however, reference duplicate ``copies'' on the lists.
478 \end{notate}
480 \begin{notate}{Change} \label{change}
481 Notice also that since neither strings nor triples
482 ``change'', we have to account for change in other ways.
In particular, the contents of lists can change. (We may
subsequently add some metadata to indicate that certain
lists are ``locked'', or that they can only be changed by
adding, etc., so that their contents can be cited stably
and reliably.)
488 \end{notate}
490 %% \begin{notate}{Each place contains one object} \label{places}
491 %% It is obvious from the table definition that I want each
492 %% place to contain precisely one thing; perhaps it is less
493 %% obvious why I want to use a database table to maintain
494 %% this relationship between ``places'' and ``things''. This
495 %% is largely a matter of convenience, but in particular it
496 %% makes it easy for places to change.
497 %% \end{notate}
499 \begin{notate}{Provenance and other metadata} \label{provenance}
500 We could of course add much more structure to the
501 database, starting with simple adjustments like adding
502 provenance metadata or versioning into the records for
503 each stored thing. For the time being, I assume that such
504 metadata will appear in the application or content layer,
as triples. (The exceptions are the ``headings'' and
``headers'' associated with lists.)
507 \end{notate}
509 \section{Common Lisp-side}
511 \subsection{Preliminaries}
513 \subsubsection*{System definition}
515 \begin{common}{arxana.asd}
516 (defsystem "arxana"
517 :version "1"
518 :author "Joe Corneli <holtzermann17@gmail.com>"
519 :licence "Public Domain"
520 :components
521 ((:file "packages")
522 (:file "utilities" :depends-on ("packages"))
523 (:file "database" :depends-on ("utilities"))
524 (:file "queries" :depends-on ("packages"))))
525 \end{common}
527 \subsubsection*{Package definition}
529 \begin{common}{packages.lisp}
530 (defpackage :arxana
531 (:use #:cl #:clsql #:clsql-sys))
532 \end{common}
534 \subsubsection*{Utilities}
536 \begin{notate}{Useful things} \label{useful}
537 These definitions are either necessary or useful for
working with the database and manipulating triple-centric
539 and/or theory-situated data. The implementation of
540 theories given here is inspired by Lisp's streams. This
541 is perhaps the most gnarly part of the code; the pay-off
542 of doing things the way we do them here is that
543 subsequently theories can sit ``transparently'' over other
544 structures.
545 \end{notate}
547 \begin{common}{utilities.lisp}
548 (in-package arxana)
549 (locally-enable-sql-reader-syntax)
551 ;; (defun connect-to-database ()
552 ;; (connect `("localhost" "joe" "joe" "")
553 ;; :database-type :postgresql-socket))
555 (defun connect-to-database ()
556 (connect `("localhost" "joe" "joe" "joe")
557 :database-type :mysql))
559 (defmacro select-one (&rest args)
560 `(car (select ,@args :flatp t)))
562 (defmacro select-flat (&rest args)
563 `(select ,@args :flatp t))
565 (defun resolve-ambiguity (stuff)
566 (first stuff))
568 (defun isolate-components (content i j)
569 (list (nth (1- i) content)
570 (nth (1- j) content)))
572 (defun isolate-beginning (triple)
573 (isolate-components (cdr triple) 1 2))
575 (defun isolate-middle (triple)
576 (isolate-components (cdr triple) 3 4))
578 (defun isolate-end (triple)
579 (isolate-components (cdr triple) 5 6))
581 (defvar *read-from-heading* nil)
583 (defvar *write-to-heading* nil)
584 \end{common}
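
\begin{notate}{Sketch: what the `isolate' functions expect}
Since each `isolate' function drops the first element of
its argument before indexing, I am assuming here that a
triple arrives from the database as a seven-element row
with the id in front.  Under that assumption, a quick
check looks like this (the definitions are repeated only
so that the example stands alone; the row is made up).
\end{notate}

\begin{idea}
;; Copied from utilities.lisp so the example stands alone.
(defun isolate-components (content i j)
  (list (nth (1- i) content)
        (nth (1- j) content)))

(defun isolate-beginning (triple)
  (isolate-components (cdr triple) 1 2))

(defun isolate-middle (triple)
  (isolate-components (cdr triple) 3 4))

(defun isolate-end (triple)
  (isolate-components (cdr triple) 5 6))

;; Assumed row layout:
;; (id code1 ref1 code2 ref2 code3 ref3)
(let ((row '(17 1 4 1 5 3 2)))
  (list (isolate-beginning row)
        (isolate-middle row)
        (isolate-end row)))
;; => ((1 4) (1 5) (3 2))
\end{idea}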
586 \begin{notate}{On `datatype'}
587 Just translate coordinates into their primary dimension.
(How should this change to accommodate codes 4, 5, 6,
589 possibly etc.?)
590 \end{notate}
592 \begin{common}{utilities.lisp}
593 (defun datatype (data)
594 (cond ((eq (car data) 0)
595 "strings")
596 ((eq (car data) 1)
597 "places")
598 ((eq (car data) 2)
599 "triples")
600 ((eq (car data) 3)
601 "theories")))
603 (locally-disable-sql-reader-syntax)
604 \end{common}
606 \begin{notate}{Resolving ambiguity}
Often a query will return more than one item when we are
only truly prepared to deal with one.  In order to handle
this sort of ambiguity, it would be great to have either a
non-interactive notifier that says that some ambiguity has
been dealt with, or an interactive tool that will let the
user decide which of the ambiguous options to choose.  For
now, we provide the simplest non-interactive tool: just
choose the first item from a possibly ambiguous list of
items.
616 \end{notate}
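
\begin{notate}{Sketch: a non-interactive notifier}
One possible shape for the ``notifier'' variant is
sketched below.  This is an assumption about how it might
look, not something the system currently does; the name
`resolve-ambiguity-noisily' is hypothetical.  It still
picks the first item, but signals a warning whenever
something was discarded.
\end{notate}

\begin{idea}
(defun resolve-ambiguity-noisily (stuff)
  "Return the first element of STUFF.
Signal a warning when other candidates were discarded."
  (when (rest stuff)
    (warn "Ambiguity: kept ~S, discarded ~D other item~:P."
          (first stuff) (length (rest stuff))))
  (first stuff))

;; Warns, then returns 3:
;;   (resolve-ambiguity-noisily '(3 7 9))
;; Returns 3 silently:
;;   (resolve-ambiguity-noisily '(3))
\end{idea}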
618 \begin{notate}{Using a different database}
619 See Note \ref{backend-variant} for instructions on changes
620 you will want to make if you use a different database.
621 \end{notate}
623 \begin{notate}{Use of the ``count'' function}
624 The SQL count function is thought to be inefficient with
625 some backends; workarounds exist. (And it's considered to
626 be efficient with MySQL.)
627 \end{notate}
629 \begin{notate}{Abstraction} \label{abstraction}
630 While it might be in some ways ``nice'' to allow people to
631 chain together ever-more-abstract references to elements
632 from other theories, I actually think it is better to
633 demand that there just be \emph{one} layer of abstraction
634 (since we can then quickly translate back and forth,
635 rather than running through a chain of translations).
637 This does not imply that we cannot have a theory
638 superimposed over another theory (or over multiple
639 theories) that draws input from throughout a massively
640 distributed interlaced system -- rather, just that we
641 assume we will need to translate to ``base coordinates''
642 when building such structures. However, we'll certainly
643 want to explore the possibilities for running links
644 between theories (abstractly similar in some sense to
645 pointing at a component of a triple, but here there's no
646 uniform beg, mid, end scheme to refer to).
647 \end{notate}
649 \subsection{Main table definitions}
651 \begin{notate}{Defining tables from within Lisp}
652 This is Lisp code to define the permanent SQL tables
653 described in Section \ref{sql-code}.
654 \end{notate}
656 \begin{common}{tabledefs.lisp}
657 ;; (execute-command "CREATE TABLE strings (
658 ;; id SERIAL PRIMARY KEY,
659 ;; text TEXT NOT NULL UNIQUE
660 ;; );")
662 (execute-command "CREATE TABLE strings (
663 id SERIAL PRIMARY KEY,
664 text TEXT,
665 UNIQUE INDEX (text(255))
666 );")
668 (execute-command "CREATE TABLE places (
669 id SERIAL PRIMARY KEY,
670 code INT NOT NULL,
671 ref INT NOT NULL
672 );")
674 (execute-command "CREATE TABLE triples (
675 id SERIAL PRIMARY KEY,
676 code1 INT NOT NULL,
677 ref1 INT NOT NULL,
678 code2 INT NOT NULL,
679 ref2 INT NOT NULL,
680 code3 INT NOT NULL,
681 ref3 INT NOT NULL,
682 UNIQUE (code1, ref1,
683 code2, ref2,
684 code3, ref3)
685 );")
687 (execute-command "CREATE TABLE theories (
688 id SERIAL PRIMARY KEY,
689 name INT UNIQUE REFERENCES strings(id)
690 );")
691 \end{common}
\begin{notate}{Eliminating tables}
694 In case you ever need to redefine these tables, you can
695 run code like this first, to delete the existing copies.
696 (Additional tables are added whenever a theory is created;
697 code for deleting theories or their contents will appear
698 in Section \ref{processing-theories}.)
699 \end{notate}
701 \begin{idea}
702 (dolist (view (list-views)) (drop-view view))
703 (execute-command "DROP TABLE strings")
704 (execute-command "DROP TABLE triples")
705 (execute-command "DROP TABLE places")
706 (execute-command "DROP TABLE theories")
707 \end{idea}
709 \subsection{Modifying the database}
711 \begin{common}{database.lisp}
712 (in-package arxana)
713 (locally-enable-sql-reader-syntax)
714 \end{common}
716 \subsection*{Processing strings}
718 \begin{notate}{On `string-to-id'}
719 Return the id of `text', if present, otherwise nil.
721 There was a segmentation fault with clisp here at one
722 point, maybe because I hadn't gotten the clsql sql reader
723 syntax loaded up properly. Note that calling the code
724 without the function wrapper did not produce the same
725 segfault.
726 \end{notate}
728 \begin{common}{database.lisp}
(defun string-to-id (text)
  ;; `select-one' returns the bare id (or nil), rather
  ;; than a list of rows.
  (select-one [id]
              :from [strings]
              :where [= [text] text]))
733 \end{common}
735 \begin{notate}{On `add-string'} \label{add-string}
736 Add the argument `text' to the list of strings. If the string
737 is successfully created, its coordinates are returned.
738 Otherwise, and in particular, if the request was to create
739 a duplicate, nil is returned.
741 Should this give a message ``Adding \meta{text} to the
strings table'' when the string is added by an indirect
743 function call, such as through `massage'?
744 (Note \ref{massage}.)
745 \end{notate}
747 \begin{common}{database.lisp}
(defun add-string (text)
  (handler-case
      (progn (insert-records :into [strings]
                             :attributes '(text)
                             :values `(,text))
             `(1 ,(string-to-id text)))
    (sql-database-data-error ()
      (warn "\"~a\" already exists."
            text))))
757 \end{common}
759 \begin{notate}{Error handling bug}
760 The function `add-string' (Note \ref{add-string}) exhibits
761 the first of several error handling calls designed to
762 ensure uniqueness (Note \ref{unique-things}).
763 Experimentally, this works, but I'm observing that, at
764 least sometimes, if the user tries to add an item that's
765 already present in the database, the index tied to the
766 associated table increases even though the item isn't
767 added. This is annoying. I haven't checked whether this
768 happens on all possible installations of the underlying
769 software.
770 \end{notate}
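
\begin{notate}{Sketch: why ids can skip, and one way around it}
If the backend is PostgreSQL, the behaviour described
above is at least consistent with how sequences work
there: a value drawn from a sequence is not given back
when the insert that drew it fails.  The toy model below
imitates that, and contrasts it with looking the item up
before inserting.  Everything here is hypothetical and
runs in memory; note also that, with several writers, a
look-up followed by an insert is not atomic, so the error
handler would still be needed.
\end{notate}

\begin{idea}
(defstruct toy-table
  (next-id 1)
  (rows (make-hash-table :test #'equal)))

(defun toy-insert (table text)
  "Insert TEXT; an id is used up even for a duplicate."
  (let ((id (toy-table-next-id table)))
    (incf (toy-table-next-id table))
    (if (gethash text (toy-table-rows table))
        nil                       ; uniqueness violation
        (setf (gethash text (toy-table-rows table))
              id))))

(defun toy-insert-carefully (table text)
  "Look TEXT up first; insert only when it is absent."
  (or (gethash text (toy-table-rows table))
      (toy-insert table text)))

(let ((a (make-toy-table))
      (b (make-toy-table)))
  (dolist (s '("x" "x" "y"))
    (toy-insert a s)
    (toy-insert-carefully b s))
  (list (gethash "y" (toy-table-rows a))
        (gethash "y" (toy-table-rows b))))
;; => (3 2)
\end{idea}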
772 \subsection*{Parsing general input}
774 \begin{notate}{On `massage'} \label{massage}
User input to functions like `add-triple' can be strings,
integers (which the function ``serializes'' as the string
versions of themselves), or \emph{coordinates} -- lists
of the form (code ref).
779 This function converts all of these input forms into the
780 last one! It takes an optional argument `addstr' which,
781 if supplied, says to add string data to the database if it
782 wasn't there already.
783 \end{notate}
785 \begin{common}{database.lisp}
(defun massage (data &optional addstr)
  (cond
    ((integerp data)
     (massage (format nil "~a" data) addstr))
    ((stringp data)
     (let ((id (string-to-id data)))
       (if id
           ;; Code 1 marks a string, as in `add-string'.
           (list 1 id)
           (when addstr
             (add-string data)))))
    ((and (listp data)
          (equal (length data) 2))
     data)
    (t nil)))
800 \end{common}
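
\begin{notate}{Sketch: `massage' without a database}
To see the three input forms side by side, here is a
stand-alone imitation of `massage' that keeps its strings
in a hash table instead of SQL.  The `toy-' names are
hypothetical, and I am assuming code 1 for strings, as in
the table of Note \ref{objects-and-codes}.
\end{notate}

\begin{idea}
(defvar *strings* (make-hash-table :test #'equal)
  "Hypothetical stand-in for the strings table.")

(defun toy-string-to-id (text)
  (values (gethash text *strings*)))

(defun toy-add-string (text)
  (let ((id (1+ (hash-table-count *strings*))))
    (setf (gethash text *strings*) id)
    (list 1 id)))

(defun toy-massage (data &optional addstr)
  (cond ((integerp data)
         (toy-massage (format nil "~a" data) addstr))
        ((stringp data)
         (let ((id (toy-string-to-id data)))
           (cond (id (list 1 id))
                 (addstr (toy-add-string data)))))
        ((and (listp data) (= (length data) 2))
         data)))

(list (toy-massage "foo")     ; unknown, not added
      (toy-massage "foo" t)   ; added
      (toy-massage "foo")     ; now found
      (toy-massage 42 t)      ; integer, serialized
      (toy-massage '(3 7)))   ; coordinates pass through
;; => (NIL (1 1) (1 1) (1 2) (3 7))
\end{idea}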
803 \subsection*{Processing triples}
805 \begin{notate}{On `triple-to-id'}
806 Return the id of the triple (beg mid end),
807 if present, otherwise nil.
808 \end{notate}
810 \begin{common}{database.lisp}
(defun triple-to-id (beg mid end)
  (let ((b (massage beg))
        (m (massage mid))
        (e (massage end)))
    ;; `select-one' returns the bare id (or nil).
    (select-one [id]
                :from [triples]
                :where [and [= [code1] (first b)]
                            [= [ref1] (second b)]
                            [= [code2] (first m)]
                            [= [ref2] (second m)]
                            [= [code3] (first e)]
                            [= [ref3] (second e)]])))
823 \end{common}
825 \begin{notate}{On `add-triple'} \label{add-triple}
826 Elements of triples are parsed by `massage'
827 (Note \ref{massage}). If the triple
828 is successfully created, its coordinates are returned.
829 Otherwise, and in particular, if the request was to create
830 a duplicate, nil is returned.
831 \end{notate}
833 \begin{common}{database.lisp}
834 (defun add-triple (beg mid end)
835 "Add a triple comprised of BEG MID and END."
836 (let ((b (massage beg t))
837 (m (massage mid t))
838 (e (massage end t)))
839 (when (and b m e)
840 (handler-case
841 (progn
842 (insert-records
843 :into [triples] :attributes '(code1 ref1
844 code2 ref2
845 code3 ref3)
846 :values `(,(first b) ,(second b)
847 ,(first m) ,(second m)
848 ,(first e) ,(second e)))
849 `(2 ,(triple-to-id b m e)))
850 (sql-database-data-error ()
851 (warn "\"~a\" already entered as [~a ~a ~a]."
852 (list beg mid end) b m e))))))
853 \end{common}
855 \subsection*{Processing theories} \label{processing-theories}
857 \begin{notate}{Things to do with theories}
858 For the record, we want to be able to create a theory, add
859 elements to that theory, remove or change elements in the
860 theory, and, for convenience, zap everything in a theory.
861 Perhaps we will also want functions to remove the tables
862 associated with a theory as well, swap the position of two
863 theories, or change the name of a theory. We will also
864 want to be able to export and import theories, so they can
865 be ``beamed'' between installations. At appropriate
866 places in the Emacs interface, we'll need to set
867 `*write-to-heading*' and `*read-from-heading*'.
868 \end{notate}
870 \begin{notate}{What can go in a theory} \label{what-can-go-in}
871 Notice that there is no rule that says that a triple or
872 place that's part of a theory needs to point only at
873 strings that are in the same theory.
874 \end{notate}
876 \begin{notate}{On `list-to-id'}
Return the id of the list with the given `heading', if
present, otherwise nil.
879 \end{notate}
881 \begin{common}{database.lisp}
(defun list-to-id (heading)
  (let ((string-id (string-to-id heading)))
    ;; `select-one' returns the bare id (or nil).
    (select-one [id]
                :from [lists]
                :where [= [heading] string-id])))
887 \end{common}
\begin{notate}{On `add-list'} \label{add-theory}
Add a list to the `lists' table, and create the table
that will hold the new list's contents.  (Lists have
headings that are strings -- it seems a little funny to
always have to translate submitted strings to ids for
lookup, but this is what we do.)
\end{notate}
897 \begin{common}{database.lisp}
(defun add-list (heading)
  (let ((string-id (second (massage heading t))))
    (handler-case
        (progn (insert-records :into [lists]
                               :attributes '(heading)
                               :values `(,string-id))
               (let ((k (list-to-id heading)))
                 ;; The contents of list k live in `listk'.
                 (execute-command
                  (format nil "CREATE TABLE list~A (
                                 offset SERIAL PRIMARY KEY,
                                 code INT NOT NULL,
                                 ref INT NOT NULL
                               );" k))
                 `(0 ,k)))
      (sql-database-data-error ()
        (warn "The list \"~a\" already exists."
              heading)))))
916 \end{common}
918 \begin{notate}{On `get-lists'}
919 Find all lists that contain `symbol'.
920 \end{notate}
922 \begin{common}{database.lisp}
923 (defun get-lists (symbol)
924 (let* ((data (massage symbol))
925 (type (datatype data))
926 (id (second data))
927 (n (caar
928 (query "select count(*) from lists")))
929 results)
930 (loop for k from 1 upto n
931 do (let ((present
932 (query (concatenate
933 'string
                         ;; tables are named "lists<k>";
                         ;; see `add-list'.
                         "select offset from lists"
935 (format nil "~A" k)
936 " where ((code = "
937 (format nil "~A" type)
938 ") and (ref = "
939 (format nil "~A" id)
940 "))"))))
941 (when present
942 ;; bit of a problem if there are multiple
943 ;; entries of that item on the given
944 ;; list.
945 (setq results (cons (list 0 k present)
946 results)))))
947 results))
948 \end{common}
950 \begin{notate}{On `save-to-list'}
951 Record `symbol' on list named `name'.
952 \end{notate}
954 \begin{common}{database.lisp}
(defun save-to-list (symbol name)
  (let* ((data (massage symbol t))
         (type (datatype data))
         (string-id (string-to-id name))
         (k (select-one [id]
                        :from [lists]
                        :where [= [name] string-id]))
         ;; per-list tables are named "lists<k>" and hold
         ;; (code, ref) pairs; see `add-list'.
         (tablek (format nil "lists~A" k)))
    (insert-records :into (sql-expression :table tablek)
                    :attributes '(code ref)
                    :values `(,type ,(second data)))))
967 \end{common}
969 \subsection*{Lookup by id or coordinates}
971 \begin{notate}{The data format that's best for Lisp} \label{what-is-best-for-lisp}
It is a reasonable question to ask whether or not an
item's id should be considered part of that item's
defining data when that data is no longer in the database.
For the functions defined here, the id is an input, and so
by default I'm not including it in the output, because it
is already known. However, for functions like
`triples-given-beginning' (see Note
\ref{graph-like-data}), the id is \emph{not} part of the
known data, and so it is returned. Therefore I am
providing the `retain-id' flag here, for cases where
output should be consistent with that of these other
functions.
984 \end{notate}
986 \begin{common}{database.lisp}
987 (defun string-lookup (id &optional retain-id)
988 (let ((ret (select [text]
989 :from [strings]
990 :where [= [id] id])))
991 (if retain-id
992 (list id ret)
993 ret)))
995 (defun triple-lookup (id &optional retain-id)
996 (let ((ret (select [code1] [ref1]
997 [code2] [ref2]
998 [code3] [ref3]
999 :from [triples]
1000 :where [= [id] id])))
1001 (if retain-id
1002 (cons id ret)
1003 ret)))
1005 (defun list-lookup (id &optional retain-id)
1006 (let ((ret (select [name]
1007 :from [lists]
1008 :where [= [id] id])))
1009 (if retain-id
1010 (list id ret)
1011 ret)))
1012 \end{common}
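\begin{notate}{Example: the `retain-id' convention}
The following is a minimal sketch of the `retain-id'
convention, and is not part of Arxana itself. It assumes a
hypothetical in-memory alist, `*example-strings*', in place
of the `strings' table, so that it can be run without a
database. Note that the real `select' returns a list of
rows, so the actual output of `string-lookup' is nested
more deeply than what is shown here.
\begin{verbatim}
;; Hypothetical stand-in for the `strings' table.
(defparameter *example-strings*
  '((1 . "book") (2 . "is a")))

(defun example-string-lookup (id &optional retain-id)
  "Look up the text stored under ID in *EXAMPLE-STRINGS*.
With RETAIN-ID, return the id together with the text."
  (let ((text (cdr (assoc id *example-strings*))))
    (if retain-id
        (list id text)
        text)))

(example-string-lookup 2)    ; => "is a"
(example-string-lookup 2 t)  ; => (2 "is a")
\end{verbatim}
\end{notate}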
1014 \begin{notate}{Succinct idioms for following pointers}
1015 Here are some variants on the functions above which save
1016 us from needing to extract the id of the item from its
1017 coordinates.
1018 \end{notate}
1020 \begin{common}{database.lisp}
1021 (defun string-contents (coords)
1022 (string-lookup (second coords)))
1024 (defun place-contents (coords)
1025 (place-lookup (second coords)))
1027 (defun triple-contents (coords)
1028 (triple-lookup (second coords)))
1029 \end{common}
1031 \begin{notate}{Switchboard} \label{switchboard}
1032 Even more succinctly, one function that can get
1033 the object indicated by any set of coordinates.
1034 \end{notate}
1036 \begin{common}{database.lisp}
(defun switchboard (coords)
  ;; `eql' rather than `eq': `eq' is not guaranteed to
  ;; compare numbers reliably.
  (cond ((eql (first coords) 0)
         (string-contents coords))
        ((eql (first coords) 1)
         (place-contents coords))
        ((eql (first coords) 2)
         (triple-contents coords))))
1044 \end{common}
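\begin{notate}{Example: following coordinates in memory}
To illustrate what `switchboard' is for, here is a small
sketch that dispatches on the code of a (code, ref) pair.
It is only an illustration: the tables `*demo-strings*' and
`*demo-triples*' and the function `demo-switchboard' are
hypothetical, and a triple is simplified to a list of three
coordinate pairs.
\begin{verbatim}
;; Hypothetical in-memory tables, keyed by id.
(defparameter *demo-strings*
  '((1 . "Ender's Game") (2 . "is a") (3 . "book")))
(defparameter *demo-triples*
  '((1 . ((0 1) (0 2) (0 3)))))

(defun demo-switchboard (coords)
  "Return the object that COORDS, a (code ref) pair,
points at: code 0 is a string, code 2 is a triple."
  (destructuring-bind (code ref) coords
    (case code
      (0 (cdr (assoc ref *demo-strings*)))
      (2 (cdr (assoc ref *demo-triples*)))
      (t nil))))

(demo-switchboard '(0 3))
;; => "book"
(mapcar #'demo-switchboard (demo-switchboard '(2 1)))
;; => ("Ender's Game" "is a" "book")
\end{verbatim}
With a function of this kind, callers never need to know
which lookup function belongs to which code.
\end{notate}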
1046 \begin{notate}{Anti-pasti}
1047 The readability of this code could perhaps be improved if
1048 we used functions like `switchboard' more frequently.
1049 (More to the point, it seems it's not currently used.) In
1050 particular, it would be nice if we could sweep idioms like
1051 \verb+`(2 ,(car triple))+ under the rug.
1052 \end{notate}
1054 \begin{common}{database.lisp}
1055 (locally-disable-sql-reader-syntax)
1056 \end{common}
1058 \subsection{Queries} \label{queries}
1060 \begin{notate}{The use of views} \label{use-of-views}
1061 It is easy enough to select those triples which match
1062 simple data, e.g., those triples which have the same
1063 beginning, middle, or end, or any combination of these.
1064 It is a little more complicated to find items that match
1065 criteria specified by several different triples; for
1066 example, to \emph{find all the books by Arthur C. Clarke
1067 that are also works of fiction}.
1069 Suppose our collection of triples contains a portion as
1070 follows:
1071 \begin{center}
\begin{tabular}{lll}
Profiles of the Future & is a & book \\
2001: A Space Odyssey & is a & book \\
Ender's Game & is a & book \\
Profiles of the Future & has genre & non-fiction \\
2001: A Space Odyssey & has genre & fiction \\
Ender's Game & has genre & fiction \\
Profiles of the Future & has author & Arthur C. Clarke \\
2001: A Space Odyssey & has author & Arthur C. Clarke \\
Ender's Game & has author & Orson Scott Card
\end{tabular}
1082 \end{center}
One way to solve the given problem would be to find those
items that \emph{are written by Arthur C. Clarke} (*
``has author'' ``Arthur C. Clarke''), that \emph{are books}
(* ``is a'' ``book''), and \emph{that are classified as
fiction} (* ``has genre'' ``fiction''). We are looking
for items that match \emph{all} of these conditions.

Our implementation strategy is: collect the items matching
each criterion into a view, then join these views. (See
the function `satisfy-conditions', Note
\ref{satisfy-conditions}.)
1096 If we end up working with large queries and a lot of data,
1097 this use of views may not be an efficient way to go -- but
1098 we'll cross that bridge when we come to it.
1099 \end{notate}
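\begin{notate}{Example: joining views in memory}
The strategy of Note \ref{use-of-views} can be tried out
without a database. The sketch below is an illustration
under simplifying assumptions: triples are plain lists of
three strings rather than rows of coordinates, each
``view'' is just a list of matching beginnings, and the
names `*example-triples*', `example-view' and
`example-join' are hypothetical.
\begin{verbatim}
(defparameter *example-triples*
  '(("Profiles of the Future" "is a" "book")
    ("2001: A Space Odyssey" "is a" "book")
    ("Ender's Game" "is a" "book")
    ("Profiles of the Future" "has genre" "non-fiction")
    ("2001: A Space Odyssey" "has genre" "fiction")
    ("Ender's Game" "has genre" "fiction")
    ("Profiles of the Future" "has author"
     "Arthur C. Clarke")
    ("2001: A Space Odyssey" "has author"
     "Arthur C. Clarke")
    ("Ender's Game" "has author" "Orson Scott Card")))

(defun example-view (middle end)
  "Beginnings of all triples of the form (* MIDDLE END)."
  (loop for (b m e) in *example-triples*
        when (and (string= m middle) (string= e end))
          collect b))

(defun example-join (&rest views)
  "Items that are present in every one of VIEWS."
  (reduce (lambda (a b)
            (intersection a b :test #'string=))
          views))

(example-join
 (example-view "has author" "Arthur C. Clarke")
 (example-view "is a" "book")
 (example-view "has genre" "fiction"))
;; => ("2001: A Space Odyssey")
\end{verbatim}
\end{notate}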
1101 \begin{notate}{Search queries}
1102 In Note \ref{sphinx-setup} et seq., we give some
1103 instructions on how to set up the Sphinx search engine to
1104 work with Arxana. However, a much tighter integration of
1105 Sphinx into Arxana is possible, and will be coming soon.
1106 \end{notate}
1108 \begin{common}{queries.lisp}
1109 (in-package arxana)
1110 (locally-enable-sql-reader-syntax)
1111 \end{common}
1113 \subsection*{Printing}
1115 \begin{notate}{On `print-system-object'} \label{print-system-object}
The function `print-system-object' bears some resemblance
to `massage', but is for printing instead, and therefore
has to be recursive (because triples and places can point
to other system objects, printing can be a long and drawn
out ordeal).
1121 \end{notate}
1123 \begin{common}{queries.lisp}
1124 (defun print-system-object (data &optional components)
1125 (cond
1126 ;; just return strings
1127 ((stringp data)
1128 data)
1129 ;; printing from coordinates (code, ref)
1130 ((and (listp data)
1131 (equal (length data) 2))
1132 ;; we'll need some hack to deal with
1133 ;; elements-of-theories, which, right now, are two
1134 ;; elements long but are not (code, ref) pairs but
1135 ;; rather (local_id, ref) pairs, or maybe actually if
1136 ;; we take context into consideration, they're
1137 ;; actually (k, table, local_id, ref) quadruplets.
1138 ;; Obviously with *that* data we can translate to
1139 ;; (code, ref). On the other hand, if we *don't*
1140 ;; take it into consideration, we probably can't do
1141 ;; much of anything. So we should be careful to be
1142 ;; aware of just what sort of information we're
1143 ;; passing around.
1144 (cond ((equal (first data) 0)
1145 (string-lookup (second data)))
1146 ((equal (first data) 1)
1147 (print-system-object
1148 (place-lookup (second data) t)))
1149 ((equal (first data) 2)
1150 (let ((triple (triple-lookup (second data) t)))
1151 (if components
1152 (list
1153 (print-beginning triple)
1154 (print-middle triple)
1155 (print-end triple))
1156 (concatenate
1157 'string
1158 (format nil "T~a[" (second data))
1159 (print-beginning triple) "."
1160 (print-middle triple) "."
1161 (print-end triple) "]"))))
1162 ((equal (first data) 3)
1163 (concatenate 'string "List printing not implemented yet."))))
1164 ;; place
1165 ((and (listp data)
1166 (equal (length data) 3))
1167 (concatenate 'string
1168 (format nil "P~a|" (first data))
1169 (print-system-object (cdr data)) "|"))
1170 ;; triple
1171 ((and (listp data)
1172 (equal (length data) 7))
1173 (if components
1174 (list
1175 (print-beginning data)
1176 (print-middle data)
1177 (print-end data))
1178 (concatenate
1179 'string
1180 (format nil "T~a[" (first data))
1181 (print-beginning data) "."
1182 (print-middle data) "."
1183 (print-end data) "]")))
1184 (t nil)))
1186 (defun print-beginning (triple)
1187 (print-system-object (isolate-beginning triple)))
1189 (defun print-middle (triple)
1190 (print-system-object (isolate-middle triple)))
1192 (defun print-end (triple)
1193 (print-system-object (isolate-end triple)))
1194 \end{common}
1196 \begin{notate}{Depth}
1197 If we are going to have complicated recursive references,
1198 our printer, and anything else that gives the system some
1199 semantics, should come with some sort of ``layers'' switch
1200 that can be used to limit the amount of recursion we do in
1201 any given computation.
1202 \end{notate}
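\begin{notate}{Example: a ``layers'' switch}
One way such a switch might look is sketched below. This
is an assumption about a possible design, not a description
of the current printer: `depth-print' and its tables are
hypothetical, only strings (code 0) and places (code 1) are
handled, and the ``P'' notation is borrowed from
`print-system-object'. The optional `layers' argument
counts how many places we are still willing to open.
\begin{verbatim}
(defparameter *depth-strings* '((1 . "a")))
(defparameter *depth-places*
  '((1 . (1 2))    ; place 1 holds place 2
    (2 . (1 1))    ; place 2 holds place 1: a cycle
    (3 . (0 1))))  ; place 3 holds string 1

(defun depth-print (coords &optional (layers 2))
  "Render COORDS, opening at most LAYERS nested places."
  (destructuring-bind (code ref) coords
    (cond ((eql code 0)
           (cdr (assoc ref *depth-strings*)))
          ((eql code 1)
           (if (plusp layers)
               (format nil "P~a|~a|" ref
                       (depth-print
                        (cdr (assoc ref *depth-places*))
                        (1- layers)))
               ;; out of layers: elide the contents
               (format nil "P~a|...|" ref))))))

(depth-print '(1 3))    ; => "P3|a|"
(depth-print '(1 1) 2)  ; => "P1|P2|P1|...|||"
\end{verbatim}
Without the counter, the second call would never return,
since places 1 and 2 contain one another.
\end{notate}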
1204 \begin{notate}{Printing objects as they appear in Lisp} \label{printing-objects-in-lisp}
1205 With the following functions we provide facilities for
1206 printing an object, either from its id or from the
1207 expanded form of the data that represents it in Lisp.
1208 (This is one good reason to have one standard form for
1209 this data; compare Note \ref{what-is-best-for-lisp}.
1210 These functions assume that the id \emph{is} part of
1211 what's printed, so if using functions like `triple-lookup'
1212 to retrieve data for printing, you'll have to graft the id
1213 back on before printing with these functions.)
1214 \end{notate}
1216 \begin{notate}{Printing theories}
1217 We'll want to both print all of the content of a theory,
1218 and print \emph{from} the theory in a more limited way.
1219 (Perhaps we get the second item for free, already?)
1220 \end{notate}
1222 \begin{common}{queries.lisp}
1223 (defun print-string (string &optional components)
1224 (print-system-object string components))
1226 (defun print-place (place &optional components)
1227 (print-system-object place components))
1229 (defun print-triple (triple &optional components)
1230 (print-system-object triple components))
1232 (defun print-string-from-id (id &optional components)
1233 (print-system-object (list 0 id) components))
1235 (defun print-place-from-id (id &optional components)
1236 (print-system-object (list 1 id) components))
1238 (defun print-triple-from-id (id &optional components)
1239 (print-system-object (list 2 id) components))
1240 \end{common}
1242 \begin{notate}{Printing some stuff but not other stuff} \label{printing-some}
These functions are good for printing lists as they come
out of the database. See Note \ref{strings-and-ids} on
printing strings.
1246 \end{notate}
1248 \begin{common}{queries.lisp}
1249 (defun print-strings (strings)
1250 (mapcar 'second strings))
1252 (defun print-places (places &optional components)
1253 (mapcar (lambda (item)
1254 (print-system-object item components))
1255 places))
1257 (defun print-triples (triples &optional components)
1258 (mapcar (lambda (item)
1259 (print-system-object item components))
1260 triples))
1262 (defun print-theories (theories &optional components)
1263 (mapcar (lambda (item)
1264 (print-system-object item components))
1265 theories))
1266 \end{common}
1268 \begin{notate}{Printing everything in each table} \label{printing-everything}
1269 These functions collect human-readable versions of
1270 everything in each table. Notice that `all-strings' is
1271 written differently.
1272 \end{notate}
1274 \begin{common}{queries.lisp}
1275 (defun all-strings ()
1276 (mapcar 'second (select [*] :from [strings])))
1278 (defun all-places ()
1279 (mapcar 'print-system-object
1280 (select [*] :from [places])))
1282 (defun all-triples ()
1283 (mapcar 'print-system-object
1284 (select [*] :from [triples])))
1286 (defun all-theories ()
1287 (mapcar 'print-system-object
1288 (select [*] :from [theories])))
1289 \end{common}
1291 \begin{notate}{Printing on particular dimensions}
One possible upgrade to the printing functions would be to
provide a built-in way to ``curry'' the printout -- for
example, to print just the source nodes from a list of
triples. However, it should of course also be possible to
do processing like this in Lisp after the printout has
been made (the point is that it is presumably more
efficient to retrieve and format only the data we're
actually looking for).
1300 \end{notate}
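\begin{notate}{Example: projecting one dimension}
As a sketch of the Lisp-side alternative, the hypothetical
function `example-project' below pulls a single slot out of
each triple in a list. It assumes that triples arrive as
seven-element rows of the form (id code1 ref1 code2 ref2
code3 ref3), which is the shape that
`print-system-object' expects.
\begin{verbatim}
(defun example-project (triples slot)
  "Coordinates found in SLOT of each of TRIPLES.
SLOT is one of :beginning, :middle or :end."
  (let ((start (ecase slot
                 (:beginning 1)
                 (:middle 3)
                 (:end 5))))
    (mapcar (lambda (triple)
              (subseq triple start (+ start 2)))
            triples)))

(example-project '((7 0 1 0 2 0 3)
                   (8 0 4 0 2 0 3))
                 :beginning)
;; => ((0 1) (0 4))
\end{verbatim}
The resulting coordinates could then be handed to the
printing functions, so that only the source nodes are
formatted.
\end{notate}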
1302 \begin{notate}{Strings and ids} \label{strings-and-ids}
Unlike other objects, strings don't get printed with their
ids. We should probably provide an \emph{option} to print
with ids (this could be helpful for subsequent work with
the strings in question; on the other hand, since strings
are being kept unique, we can immediately exchange a
string and its id, so I'm not sure if it's necessary to
have an explicit ``option'').
1310 \end{notate}
1312 \subsection*{Functions that establish basic graph structure}
1314 \begin{notate}{Thinking about graph-like data} \label{graph-like-data}
Here we have in mind one or more objects (e.g. a
particular source and sink) that are associated with
potentially any number of triples (e.g. all the possible
middles running between these two identified objects).
These functions establish various forms of locality or
neighborhood within the data.

The results of such queries can be optionally cached in a
view, which is useful for further processing
(cf. Note \ref{satisfy-conditions}).
1326 These functions take input in the form of strings and/or
1327 coordinates (cf. Note \ref{massage}).
1328 \end{notate}
1330 \begin{common}{queries.lisp}
1331 (defun triples-given-beginning (node &optional view)
1332 "Get triples outbound from the given NODE. Optional
1333 argument VIEW causes the results to be selected into a
1334 view with that name."
1335 (let ((data (massage node))
        (window (or view "internal-view"))
1337 ret)
1338 (when data
1339 (create-view
1340 window
1341 :as (select [*]
1342 :from [triples]
1343 :where [and [= [code1] (first data)]
1344 [= [ref1] (second data)]]))
1345 (setq ret (select [*] :from window))
1346 (unless view
1347 (drop-view window))
1348 ret)))
1350 (defun triples-given-end (node &optional view)
1351 "Get triples inbound into NODE. Optional argument VIEW
1352 causes the results to be selected into a view with
1353 that name."
1354 (let ((data (massage node))
        (window (or view "internal-view"))
1356 ret)
1357 (when data
1358 (create-view
1359 window
1360 :as (select [*]
1361 :from [triples]
1362 :where [and [= [code3] (first data)]
1363 [= [ref3] (second data)]]))
1364 (setq ret (select [*] :from window))
1365 (unless view
1366 (drop-view window))
1367 ret)))
1369 (defun triples-given-middle (edge &optional view)
1370 "Get the triples that run along EDGE. Optional argument
1371 VIEW causes the results to be selected into a view
1372 with that name."
1373 (let ((data (massage edge))
        (window (or view "internal-view"))
1375 ret)
1376 (when data
1377 (create-view
1378 window
1379 :as (select [*]
1380 :from [triples]
1381 :where [and [= [code2] (first data)]
1382 [= [ref2] (second data)]]))
1383 (setq ret (select [*] :from window))
1384 (unless view
1385 (drop-view window))
1386 ret)))
1388 (defun triples-given-middle-and-end (edge node &optional
1389 view)
1390 "Get the triples that run along EDGE into NODE.
1391 Optional argument VIEW causes the results to be
1392 selected into a view with that name."
1393 (let ((edgedata (massage edge))
1394 (nodedata (massage node))
        (window (or view "internal-view"))
1396 ret)
1397 (when (and edgedata nodedata)
1398 (create-view
1399 window
1400 :as (select [*]
1401 :from [triples]
1402 :where [and [= [code2] (first edgedata)]
1403 [= [ref2] (second edgedata)]
1404 [= [code3] (first nodedata)]
1405 [= [ref3] (second nodedata)]]))
1406 (setq ret (select [*] :from window))
1407 (unless view
1408 (drop-view window))
1409 ret)))
1411 (defun triples-given-beginning-and-middle (node edge
1412 &optional view)
1413 "Get the triples that run from NODE along EDGE.
1414 Optional argument VIEW causes the results to be selected
1415 into a view with that name."
1416 (let ((nodedata (massage node))
1417 (edgedata (massage edge))
        (window (or view "internal-view"))
1419 ret)
1420 (when (and nodedata edgedata)
1421 (create-view
1422 window
1423 :as (select [*]
1424 :from [triples]
1425 :where [and [= [code1] (first nodedata)]
1426 [= [ref1] (second nodedata)]
1427 [= [code2] (first edgedata)]
1428 [= [ref2] (second edgedata)]]))
1429 (setq ret (select [*] :from window))
1430 (unless view
1431 (drop-view window))
1432 ret)))
1434 (defun triples-given-beginning-and-end (node1 node2
1435 &optional view)
1436 "Get the triples that run from NODE1 to NODE2. Optional
1437 argument VIEW causes the results to be selected
1438 into a view with that name."
1439 (let ((node1data (massage node1))
1440 (node2data (massage node2))
        (window (or view "internal-view"))
1442 ret)
1443 (when (and node1data node2data)
1444 (create-view
1445 window
1446 :as (select [*]
1447 :from [triples]
1448 :where [and [= [code1] (first node1data)]
1449 [= [ref1] (second node1data)]
1450 [= [code3] (first node2data)]
1451 [= [ref3] (second node2data)]]))
1452 (setq ret (select [*] :from window))
1453 (unless view
1454 (drop-view window))
1455 ret)))
;; This one uses `select-one' instead of `select'
1458 (defun triple-exact-match (node1 edge node2 &optional
1459 view)
1460 "Get the triples that run from NODE1 along EDGE to
1461 NODE2. Optional argument VIEW causes the results to be
1462 selected into a view with that name."
1463 (let ((node1data (massage node1))
1464 (edgedata (massage edge))
1465 (node2data (massage node2))
        (window (or view "internal-view"))
1467 ret)
1468 (when (and node1data edgedata node2data)
1469 (create-view
1470 window
1471 :as (select [*]
1472 :from [triples]
1473 :where [and [= [code1] (first node1data)]
1474 [= [ref1] (second node1data)]
1475 [= [code2] (first edgedata)]
1476 [= [ref2] (second edgedata)]
1477 [= [code3] (first node2data)]
1478 [= [ref3] (second node2data)]]))
1479 (setq ret (select-one [*] :from window))
1480 (unless view
1481 (drop-view window))
1482 ret)))
1483 \end{common}
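\begin{notate}{Example: one matcher for all the patterns}
The seven functions above differ only in which slots they
constrain. The following sketch shows the same idea in
memory with a single function. It is an illustration only:
`*example-rows*' and `example-triples-matching' are
hypothetical, coordinates are passed in directly rather
than through `massage', and a slot given as `t' is treated
as a wildcard.
\begin{verbatim}
(defparameter *example-rows*
  '((1 0 10 0 20 0 30)
    (2 0 10 0 21 0 31)
    (3 0 11 0 20 0 30))
  "Rows of the form (id code1 ref1 code2 ref2 code3 ref3).")

(defun example-triples-matching (beginning middle end)
  "Rows of *EXAMPLE-ROWS* whose slots equal the given
coordinates.  A slot given as T matches anything."
  (flet ((slot-ok (coords row offset)
           (or (eq coords t)
               (equal coords
                      (subseq row offset (+ offset 2))))))
    (remove-if-not
     (lambda (row)
       (and (slot-ok beginning row 1)
            (slot-ok middle row 3)
            (slot-ok end row 5)))
     *example-rows*)))

;; like `triples-given-beginning'
(example-triples-matching '(0 10) t t)
;; => ((1 0 10 0 20 0 30) (2 0 10 0 21 0 31))

;; like `triples-given-middle-and-end'
(example-triples-matching t '(0 20) '(0 30))
;; => ((1 0 10 0 20 0 30) (3 0 11 0 20 0 30))
\end{verbatim}
\end{notate}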
1485 \begin{notate}{Becoming flexible about a string's status}
One possible upgrade would be to provide versions of these
functions that will flexibly accept either a string or a
``placed string'' as input (since frequently we're
interested in content of that sort; see Note
\ref{importing-sketch}).
1491 \end{notate}
1493 \subsection*{Finding places that satisfy some property}
1495 \begin{notate}{On `get-places-subject-to-constraint'}
Like `get-places' (Note \ref{get-places}), but this
time takes an extra condition of the form (A B C)
where exactly one of A, B, and C is `nil'. We substitute
each candidate place for this `nil', to see if a
triple matching that criterion exists.
1501 \end{notate}
1503 \begin{common}{queries.lisp}
1504 (defun get-places-subject-to-constraint (symbol condition)
1505 (let ((candidate-places (get-places symbol))
1506 accepted-places)
1507 (dolist (place candidate-places)
1508 (let ((filled-condition
1509 (map 'list (lambda (elt) (or elt
1510 `(1 ,place)))
1511 condition)))
1512 (when (apply 'triple-relaxed-match
1513 filled-condition)
1514 (setq accepted-places
1515 (cons place accepted-places)))))
1516 accepted-places))
1517 \end{common}
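\begin{notate}{Example: filling in the open slot}
The key step in `get-places-subject-to-constraint' is the
substitution of a candidate place for the `nil' in the
condition. The sketch below isolates that step; the name
`example-fill-condition' is hypothetical, and the place
id 5 is made up for the illustration.
\begin{verbatim}
(defun example-fill-condition (condition place)
  "Replace the NIL slot of CONDITION with the coordinates
of PLACE, that is, with the list (1 PLACE)."
  (mapcar (lambda (elt) (or elt (list 1 place)))
          condition))

(example-fill-condition '("Ender's Game" "is a" nil) 5)
;; => ("Ender's Game" "is a" (1 5))
\end{verbatim}
The filled condition is what gets applied to
`triple-relaxed-match'.
\end{notate}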
1519 \subsection*{Logic}
1521 \begin{notate}{Caution: compatibility with theories?}
1522 For the moment, I'm not sure how compatible this function
1523 is with the theories apparatus we've established, or with
1524 the somewhat vaguer notion of trans-theory questions or
1525 concerns. Global queries should work just fine, but
1526 theory-local questions may need some work. Before getting
1527 into compatibility of these questions with the theory
1528 apparatus, I want to make sure that apparatus is working
1529 properly. Note that the questions here do rely on
1530 functions for graph-like thinking (Note
1531 \ref{graph-like-data} et seq.), and it would certainly
1532 make sense to port to ``subgraphs'' as represented by
1533 theories.
1534 \end{notate}
1536 \begin{notate}{On `satisfy-conditions'} \label{satisfy-conditions}
1537 This function finds the items which match constraints.
1538 Constraints take the form (A B C), where precisely one of
1539 A, B, or C should be `nil', and any of the others can be
1540 either input suitable for `massage', or
1541 `t'. The `nil' entry stands for the object we're
1542 interested in. Any `t' entries are wildcards.
1544 The first thing that happens as the function runs is that
1545 views are established exhibiting each group of triples
1546 satisfying each predicate. The names of these views are
1547 then massaged into a large SQL query. (It is important to
1548 ``typeset'' all of this correctly for our SQL `query'.)
1549 Finally, once that query has been run, we clean up,
1550 dropping all of the views we created.
1551 \end{notate}
1553 \begin{common}{queries.lisp}
1554 (defun satisfy-conditions (constraints)
1555 (let* ((views (generate-views constraints))
1556 (formatted-list-of-views (format-views
1557 views))
1558 (where-condition (generate-where-condition
1559 views
1560 constraints))
1561 (ret
1562 ;; Let's see what the query is, first of all.
1563 (query
1564 (concatenate
1565 'string
1566 "select v1.id, v1.code1, v1.ref1, "
1567 "v1.code2, v1.ref2, "
1568 "v1.code3, v1.ref3 "
1569 "from "
1570 formatted-list-of-views
1571 "where "
1572 where-condition
1573 ";"))))
1574 (mapc (lambda (name) (drop-view name)) views)
1575 ret))
1576 \end{common}
1578 \begin{notate}{Subroutines for `satisfy-conditions'}
1579 The functions below produce bits and pieces of the SQL
1580 query that `satisfy-conditions' submits. The point of the
1581 `generate-views' is to create a series of views centered
1582 on the term(s) we're interested in (the `nil' slots in
1583 each submitted constraint). With
1584 `generate-where-condition', we insist that all of these
1585 interesting terms should, in fact, be equal to one
1586 another.
1587 \end{notate}
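\begin{notate}{Example: the shape of the `where' clause}
To make the intended output concrete, here is a compact
sketch of the string that `generate-where-condition' is
meant to build. It is not the implementation used below:
`example-slot' and `example-where' are hypothetical names,
the views are assumed to be called v1, v2, and so on (as
`generate-views' names them), and at least two constraints
are assumed.
\begin{verbatim}
(defun example-slot (constraint)
  "Index (1, 2 or 3) of the NIL slot in CONSTRAINT."
  (1+ (position nil constraint)))

(defun example-where (constraints)
  "WHERE clause equating the NIL slot of v1 with the NIL
slot of each later view v2, v3, ..."
  (let ((c (example-slot (first constraints))))
    (format
     nil "~{~a~^ and ~}"
     (loop for constraint in (rest constraints)
           for i from 2
           for ci = (example-slot constraint)
           collect
           (format
            nil
            "(v1.code~a = v~a.code~a) and (v1.ref~a = v~a.ref~a)"
            c i ci c i ci)))))

(example-where '((nil "has author" "Arthur C. Clarke")
                 (nil "is a" "book")))
;; => "(v1.code1 = v2.code1) and (v1.ref1 = v2.ref1)"
\end{verbatim}
Each additional constraint contributes one more pair of
equalities, joined to the rest by ``and''.
\end{notate}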
1589 \begin{notate}{On `generate-views'}
1590 In a `cond' form, for each constraint we must select the
1591 appropriate function to generate the view; at the very end
1592 of the cond form, we spit out the viewname (for `mapcar'
1593 to add to the list of views).
1594 \end{notate}
1596 \begin{common}{queries.lisp}
1597 (defun generate-views (constraints)
1598 (let ((counter 0))
1599 (mapcar
1600 (lambda (constraint)
1601 (setq counter (1+ counter))
1602 (let ((viewname (format nil "v~a" counter)))
1603 (cond
1604 ;; A * ? or A ? *
1605 ((or (and (eq (second constraint) t)
1606 (eq (third constraint) nil))
1607 (and (eq (second constraint) nil)
1608 (eq (third constraint) t)))
1609 (triples-given-beginning
1610 (first constraint)
1611 viewname))
1612 ;; * B ? or ? B *
1613 ((or (and (eq (first constraint) t)
1614 (eq (third constraint) nil))
1615 (and (eq (first constraint) nil)
1616 (eq (third constraint) t)))
1617 (triples-given-middle
1618 (second constraint)
1619 viewname))
1620 ;; * ? C or ? * C
1621 ((or (and (eq (first constraint) t)
1622 (eq (second constraint) nil))
1623 (and (eq (first constraint) nil)
1624 (eq (second constraint) t)))
1625 (triples-given-end
1626 (third constraint)
1627 viewname))
1628 ;; ? B C
1629 ((eq (first constraint) nil)
1630 (triples-given-middle-and-end
1631 (second constraint)
1632 (third constraint)
1633 viewname))
          ;; A ? C
          ((eq (second constraint) nil)
           (triples-given-beginning-and-end
            (first constraint)
            (third constraint)
            viewname))
          ;; A B ?
          ((eq (third constraint) nil)
           (triples-given-beginning-and-middle
            (first constraint)
            (second constraint)
            viewname)))
1646 viewname))
1647 constraints)))
1649 (defun format-views (views)
1650 (let ((formatted-list-of-views ""))
1651 (mapc (lambda (view)
1652 (setq formatted-list-of-views
1653 (concatenate
1654 'string
1655 formatted-list-of-views
1656 (format nil "~a," view))))
1657 (butlast views))
1658 (setq formatted-list-of-views
1659 (concatenate
1660 'string
1661 formatted-list-of-views
1662 (format nil "~a " (car (last views)))))
1663 formatted-list-of-views))
(defun generate-where-condition (views conditions)
  (let ((where-condition "")
        (c (select-component (first conditions))))
    ;; there should be one less "=" condition than there
    ;; are things to compare; until we get to the last
    ;; view, everything is joined together by an `and'.
    ;; -- this needs to consider (map over) both `views'
    ;; and `conditions'.
    (loop
       ;; views 2 .. n-1; the last view is handled below
       for i from 1 upto (- (length views) 2)
       do
       (let ((compi (select-component (nth i conditions)))
             (viewi (nth i views)))
         (setq
          where-condition
          (concatenate
           'string
           where-condition
           "(v1.code" c " = " viewi ".code" compi ") and "
           "(v1.ref" c " = " viewi ".ref" compi ") and "))))
    ;; the last view closes the condition, with no
    ;; trailing `and'.
    (let ((viewn (nth (1- (length views)) views))
          (compn (select-component
                  (nth (1- (length views)) conditions))))
      (setq
       where-condition
       (concatenate
        'string
        where-condition
        "(v1.code" c " = " viewn ".code" compn ") and "
        "(v1.ref" c " = " viewn ".ref" compn ")")))
    where-condition))
1699 (defun select-component (condition)
1700 (cond ((eq (first condition) nil) "1")
1701 ((eq (second condition) nil) "2")
1702 ((eq (third condition) nil) "3")))
1703 \end{common}
1705 \begin{common}{queries.lisp}
1706 (locally-disable-sql-reader-syntax)
1707 \end{common}
1709 \begin{notate}{Even more complicated logic}
1710 In order to conveniently manage complex queries, it would
1711 be nice if we could store the results of earlier queries
1712 into views, so that we can combine several such views for
1713 further processing.
1714 \end{notate}
1716 \section{Emacs-side} \label{emacs-side}
1718 \subsection{The interface to Common Lisp}
1720 \begin{notate}{On `Defun'} \label{defun-interface}
A way to define Elisp functions whose bodies are evaluated
by Common Lisp. Trust me, this is a good idea. Besides,
it exhibits some fascinating backquote and comma tricks.
But be careful: this definition of `Defun' did not work on
Emacs version 21.

If we want to be able to feed in a standard arglist to
Common Lisp (with optional elements and so forth), we'd
have to define how these arguments are handled here!
1730 \end{notate}
1732 \begin{elisp}
1733 (defmacro Defun (name arglist &rest body)
1734 (declare (indent defun))
1735 `(defun ,name ,arglist
1736 (let* ((outbound-string
1737 (translate-emacs-syntax-to-common-syntax
1738 (format "%S"
1739 (append
1740 (list
1741 (append (list 'lambda ',arglist)
1742 ',body))
1743 (mapcar
1744 (lambda (arg) `',arg)
1745 (list
1746 ,@(remove-if
1747 (lambda (testelt)
1748 (eq testelt
1749 '&optional))
1750 arglist)))))))
1751 (returned-string
1752 (second
1753 ;; we now specify the right package!
1754 (slime-eval
1755 (list 'swank:eval-and-grab-output
1756 outbound-string)
1757 :arxana))))
1758 (process-slime-output returned-string))))
1759 \end{elisp}
1761 \begin{notate}{On `process-slime-output'}
1762 This should downcase all constituent symbols, but for
1763 expediency I'm just downcasing `NIL' at the moment. Will
1764 come back for more testing and downcasing shortly. (I
1765 suspect the general case is just about as easy as what
1766 happens here.)
1767 \end{notate}
1769 \begin{elisp}
(defun process-slime-output (str)
  (condition-case nil
      (let ((read-value (read str)))
        ;; a lone symbol is downcased; in anything else we
        ;; only replace `NIL' for now.
        (if (symbolp read-value)
            (read (downcase str))
          (nsubst nil 'NIL read-value)))
    (error str)))
1777 \end{elisp}
1779 \begin{elisp}
1780 (defun translate-emacs-syntax-to-common-syntax (str)
1781 (with-temp-buffer
1782 (insert str)
1783 (dolist (swap '(("(\\` " "`")
1784 ("(\\\, " ",")))
1785 (goto-char (point-min))
1786 (while (search-forward (first swap) nil t)
1787 (goto-char (match-beginning 0))
1788 (forward-sexp)
1789 (delete-char -1)
1790 (goto-char (match-beginning 0))
1791 (delete-region (match-beginning 0)
1792 (match-end 0))
1793 (insert (second swap))))
1794 (buffer-substring-no-properties (point-min)
1795 (point-max))))
1796 \end{elisp}
1798 \begin{notate}{Interactive `Defun'}
1799 Note, an improved version of this macro would allow me to
1800 specify that some Defuns are interactive and some are not.
1801 This could be done by examining the submitted body, and
1802 adjusting the defun if its car is an `interactive' form.
1803 Most of the Defuns will be things that people will want to
1804 use interactively, so making this change would probably be
a good idea. What I'm doing in the meantime is just
writing two functions each time I need to make an
interactive function that accesses Common Lisp data!
1808 \end{notate}
1810 \begin{notate}{Common Lisp evaluation of code chunks}
1811 Another potentially beneficial and simple approach is to
1812 write a form like `progn' that evaluates its contents on
1813 Common Lisp. This saves us from having to rewrite all of
1814 the `defun' facilities into `Defun' (e.g. interactivity).
But... the problem with \emph{this} is that Common Lisp
doesn't know the names of all the variables that are
defined in Emacs! I'm not sure how to get all of the
values of these variables substituted \emph{first}, before
the call to Common Lisp is made.
1820 \end{notate}
1822 \begin{notate}{Debugging `Defun'}
In order to make debugging go easier, it might be nice to
have an option to make the code that is supposed to be
evaluated by Defun actually \emph{print} on the REPL
instead of being processed through an invisible back-end.
There could be a couple of different ways to do that. One
would be to simulate just what a user might do; the other
would be a happy medium between that and what we're doing
now: just put our computery auto-generated code on the
REPL and evaluate it. (To some extent, I think the
*slime-events* buffer captures this information, but it is
not particularly easy to read.)
1834 \end{notate}
1836 \begin{notate}{Interactive Common Lisp?}
1837 Suppose we set up some kind of interactive environment in
1838 Common Lisp; how would we go about passing this
1839 environment along to a user interacting via Emacs? (Note
1840 that SLIME's presentation of the debugging loop is one
1841 good example.)
1842 \end{notate}
1844 \subsection{Database interaction} \label{interaction}
1846 \begin{notate}{The `article' function} \label{the-article-function}
1847 You can use this function to create an article with a
1848 given name and contents. If you like you can put it in a
1849 list.
1850 \end{notate}
1852 \begin{elisp}
1853 (Defun article (name contents &optional heading)
1854 (let ((coordinates (add-triple name
1855 "has content"
1856 contents)))
    (when heading (add-triple coordinates "in" heading))
1858 (when place (if (numberp place)
1859 (put-in-place coordinates place)
1860 (put-in-place coordinates)))
1861 coordinates))
1862 \end{elisp}
1864 \begin{notate}{The `scholium' function} \label{the-scholium-function}
1865 You can use this function to link annotations to objects.
As with the `article' function, you can optionally
file the connection under a given heading (cf. Note
\ref{the-article-function}).
1869 \end{notate}
1871 \begin{elisp}
(Defun scholium (beginning link end &optional heading place)
  (let ((coordinates (add-triple beginning
                                 link
                                 end)))
    ;; File the new triple under HEADING, if one was given.
    (when heading (add-triple coordinates "in" heading))
    ;; A numeric PLACE is passed on as the place id; any
    ;; other non-nil value asks for a fresh place.
    (when place (if (numberp place)
                    (put-in-place coordinates place)
                    (put-in-place coordinates)))
    coordinates))
1881 \end{elisp}
1883 \begin{notate}{Uses of coordinates}
1884 Note that, if desired, you can feed input of the form
1885 '(\meta{code} \meta{ref}) into `article' and `scholium'.
It's convenient to do any further processing of the object
we've created while we still have hold of the coordinates
returned by `add-triple' (cf. Note
1889 \ref{import-code-continuations} for an example).
1890 \end{notate}
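
\begin{notate}{Example: triples about triples}
The following is only a toy sketch of the idea behind
`article' and `scholium', written in plain Emacs Lisp so
that it can be evaluated without the Common Lisp backend.
The names `toy-triples', `toy-add-triple' and `toy-article'
are hypothetical stand-ins, not part of Arxana. In
particular, the real `add-triple' returns database
coordinates, whereas here the ``coordinates'' are assumed
to be just an integer index. The point is that the value
returned for one triple can be used as the beginning of
another triple.
\end{notate}

\begin{idea}
;; Hypothetical toy names; not part of Arxana.
(defvar toy-triples nil
  "Toy store: a list of (ID BEGINNING MIDDLE END) entries.")

(defun toy-add-triple (beginning middle end)
  "Record a triple and return its ID (our toy coordinates)."
  (let ((id (1+ (length toy-triples))))
    (push (list id beginning middle end) toy-triples)
    id))

(defun toy-article (name contents &optional heading)
  "Store CONTENTS under NAME; optionally file it in HEADING."
  (let ((coordinates
         (toy-add-triple name "has content" contents)))
    (when heading
      ;; The first triple's ID begins a second triple.
      (toy-add-triple coordinates "in" heading))
    coordinates))

;; Starting from an empty store, this records triple 1 and
;; then triple 2, whose beginning is the ID of triple 1.
(toy-article "Introduction" "Some text." "my-document")
\end{idea}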
1892 \begin{notate}{Finding all the members of a list by type?}
1893 We just narrow according to type.
1894 \end{notate}
1896 \begin{notate}{On `get-article'} \label{get-article}
Get the contents of the article named `name'. The optional
argument `heading' lets us find the place under the given
heading that holds the name, and use that place instead of
the name itself.

We do not yet deal well with the ambiguous case in which
several places corresponding to the given name appear
under the same heading.

Note also that out of the data returned by
`triples-given-beginning-and-middle', we should pick the
(hopefully just) ONE that corresponds to the given heading.
1910 This means we need to pick over the list of triples
1911 returned here, and test each one to see if it is in our
1912 heading. As to WHY there might be more than one ``has
1913 content'' for a place that we know to be in our
1914 heading... I'm not sure. I guess we can go with the
1915 assumption that there is just one, for now.
1916 \end{notate}
1918 \begin{elisp}
1919 (Defun get-article (name &optional heading)
1920 (let* ((place-pseudonyms
1921 (if heading
1922 (get-places-subject-to-constraint
1923 name `(nil "in" ,heading))
1924 (get-places name)))
1925 (goes-by (cond
1926 ((eq (length place-pseudonyms) 1)
1927 `(1 ,(car place-pseudonyms)))
1928 ((triple-exact-match
1929 name "in" heading)
1930 name)
1931 ((not heading) name)
1932 (t nil))))
1933 (when goes-by
1934 ;; it might be nice to also return `goes-by'
1935 ;; so we can access the appropriate place again.
1936 (third (print-triple
1937 (resolve-ambiguity
1938 (triples-given-beginning-and-middle
1939 goes-by "has content"))
1940 t)))))
1941 \end{elisp}
1943 \begin{notate}{On `get-names'} \label{get-names}
This function simply gets the names of articles that have
content -- in other words, the beginning of every triple
built around the ``has content'' relation.
1947 \end{notate}
1949 \begin{elisp}
1950 (Defun get-names (&optional heading)
1951 (let ((conditions (list (list nil "has content" t))))
1952 (when heading
1953 (setq conditions
1954 (append conditions
1955 (list (list nil "in" heading)))))
1956 (mapcar
1957 (lambda (place-or-string)
1958 (cond
1959 ;; place case
1960 ((eq (first place-or-string) 1)
1961 (print-system-object
1962 (place-lookup (second place-or-string))))
1963 ;; string case
1964 ((eq (first place-or-string) 0)
1965 (print-system-object place-or-string))))
1966 (mapcar
1967 (lambda (triple)
1968 (isolate-beginning triple))
1969 (satisfy-conditions conditions)))))
1970 \end{elisp}
1972 \begin{notate}{Contrasting cases} \label{contrasting-cases}
1973 Consider the difference between
1974 \begin{quote}
1975 (? ``has author'' ``Arthur C. Clarke'') \\
1976 (? ``has genre'' ``fiction'')
1977 \end{quote}
and
1979 \begin{quote}
1980 (\emph{name} ``has content'' *) \\
1981 (\emph{name} ``in'' ``heading'')
1982 \end{quote}
1983 where, in the latter case, we know \emph{who} we're
1984 talking about, and we just want to limit the list of items
1985 generated by the ``*'' by the second condition. This
1986 should help illustrate the difference between `get-names'
1987 (which is making a general query) and `get-article' (which
1988 already knows the name of a specific article), and the
1989 logic that they use.
1990 \end{notate}
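
\begin{notate}{Example: general versus specific queries}
The contrast above can be sketched with a toy in-memory
list of triples. This is not how `get-names' or
`get-article' are implemented (they query the database);
the names `toy-facts', `toy-holds-p', `toy-general-query'
and `toy-specific-query' and the sample data are
assumptions made purely for illustration. The general
query has to search for beginnings that satisfy every
condition, whereas the specific query starts from a known
name and only collects matching ends.
\end{notate}

\begin{idea}
;; Hypothetical toy data and names; not part of Arxana.
(defvar toy-facts
  '(("2001" "has author" "Arthur C. Clarke")
    ("2001" "has genre" "fiction")
    ("Profiles of the Future" "has author" "Arthur C. Clarke")
    ("Profiles of the Future" "has genre" "essays")))

(defun toy-holds-p (beginning middle end)
  "Non-nil if (BEGINNING MIDDLE END) is in `toy-facts'."
  (member (list beginning middle end) toy-facts))

(defun toy-general-query (conditions)
  "Return each beginning satisfying all (MIDDLE END) CONDITIONS."
  (let (answers)
    (dolist (fact toy-facts)
      (let ((candidate (car fact)))
        (when (and (not (member candidate answers))
                   (let ((ok t))
                     (dolist (c conditions ok)
                       (unless (toy-holds-p candidate
                                            (car c) (cadr c))
                         (setq ok nil)))))
          (push candidate answers))))
    (nreverse answers)))

(defun toy-specific-query (name middle)
  "Return the ends of all triples starting with NAME and MIDDLE."
  (let (ends)
    (dolist (fact toy-facts (nreverse ends))
      (when (and (equal (car fact) name)
                 (equal (cadr fact) middle))
        (push (nth 2 fact) ends)))))

;; General: which items are fiction by Clarke?
(toy-general-query '(("has author" "Arthur C. Clarke")
                     ("has genre" "fiction")))
;; => ("2001")

;; Specific: we already know the name.
(toy-specific-query "2001" "has genre")
;; => ("fiction")
\end{idea}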
1992 \begin{notate}{Placing items from Emacs} \label{place-item}
1993 We periodically need to place items from within Emacs.
1994 The function `place-item' is a wrapper for `put-in-place'
1995 that makes this possible (it also provides the user with
1996 an extra option, namely to put the place itself under a
1997 given heading).
1999 Notice that when the symbol is placed in some pre-existing
2000 place (which can only happen when `id' is not nil), that
2001 place may already be under some other heading. We will ignore
2002 this case for now (since it seems that putting objects
2003 into \emph{new} places will be the preferred action), but
2004 later we will have to look at what to do in this other
2005 case.
2006 \end{notate}
2008 \begin{elisp}
2009 (Defun place-item (symbol &optional id heading)
2010 (let ((coordinates (put-in-place symbol id)))
2011 (when heading (add-triple coordinates "in" heading))
2012 coordinates))
2013 \end{elisp}
2015 \begin{notate}{Automatic classifications} \label{classifications}
2016 It will presumably make sense to offer increasingly
2017 ``automatic'' classifications for new objects. At this
2018 point, we've set things up so that the user can optionally
2019 supply the name of \emph{one} heading that their new object
2020 is a part of.
2022 It may make more sense to allow an `\&rest theories'
2023 argument, and add the triple to all of the specified
2024 theories. This would require modifying `Defun' to
2025 accommodate the `\&rest' idiom; see Note
2026 \ref{defun-interface}.
2027 \end{notate}
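
\begin{notate}{Example: classifying under several headings}
A minimal sketch of the `\&rest' idea, in plain Emacs Lisp.
The function `toy-classify' is hypothetical: it only
\emph{builds} the triples that would be added, one per
heading, and assumes nothing about how `Defun' would pass a
`\&rest' argument through to Common Lisp.
\end{notate}

\begin{idea}
;; Hypothetical helper; not part of Arxana.
(defun toy-classify (coordinates &rest headings)
  "Return one (COORDINATES \"in\" HEADING) triple per heading."
  (mapcar (lambda (heading)
            (list coordinates "in" heading))
          headings))

(toy-classify 7 "arxana.tex" "drafts")
;; => ((7 "in" "arxana.tex") (7 "in" "drafts"))
\end{idea}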
2029 \begin{notate}{Postconditions and provenance}
2030 After adding something to the database, we may want to do
2031 something extra; perhaps generating provenance
2032 information, perhaps checking or enforcing database
2033 consistency, or perhaps running a hook that causes some
2034 update in the frontend (cf. Note \ref{provenance}).
2035 Provisions of this sort will come later, as will
2036 short-hand convenience functions for making particularly
2037 common complex entries.
2038 \end{notate}
2040 \subsection{Importing \LaTeX\ documents} \label{importing}
2042 \begin{notate}{Importing sketch} \label{importing-sketch}
2043 The code in this section imports a document as a
2044 collection of (sub-)sections and notes. It gathers the
2045 sections, sub-sections, and notes recursively and records
2046 their content in a tree whose nodes are places (Note
2047 \ref{places}) and whose links express the ``component-of''
2048 relation described in Note \ref{order-of-order}.
2050 This representation lets us see the geometric,
2051 hierarchical, structure of the document we've imported.
2052 It exemplifies a general principle, that geometric data
2053 should be represented by relationships between places, not
2054 direct relationships between strings. This is because
2055 ``the same'' string often appears in ``different'' places
2056 in any given document (e.g. a paper's many sub-sections
2057 titled ``Introduction'' will not all have the same
2058 content).
2060 What goes into the places is in some sense arbitrary. The
2061 key is that whatever is \emph{in} or \emph{attached} to
2062 these places must tell us everything we need to know about
2063 the part of the document associated with that place
2064 (e.g. in the case of a note, its title and contents).
2065 That's over and above the \emph{structural} links which
2066 say how the places relate to one another. Finally, all of
2067 these places and structural links will be added to a
2068 heading that represents the document as a whole.
2070 A natural convention we'll use will be to put the name
2071 of any document component that's associated with a given
2072 place into that place, and add all other information as
2073 annotations.
2074 \end{notate}
2076 \begin{notate}{Ordered versus unordered data} \label{ordered-vs-unordered}
2077 The code in this section is an example of one way to work
2078 with ordered data (i.e. \LaTeX\ documents are not just
2079 hierarchical, but the elements at each level of the
2080 hierarchy are also ordered).
Since \emph{many} artifacts are hierarchical (e.g. Lisp
2083 code), we should try to be compatible with \emph{native}
2084 methods for working with order (in the case of Lisp, feed
2085 the code into a Lisp processor and use CDR and CAR, etc.).
2087 We \emph{can} use triples such as (``rank'' ``1''
2088 ``Fred'') and (``rank'' ``2'' ``Barney'') to talk about
2089 order. There may be some SQL techniques that would help.
2090 (FYI, order can be handled very explicitly in Elephant!)
2092 In order to account for \emph{different} orderings, we
2093 need one more piece of data -- some explicit treatment of
2094 where the order \emph{is}; in other words, theories.
2095 (This table illustrates the fact that a heading is not so
2096 different from ``an additional triple''; indeed, the only
2097 reason to make them different is to have the extra
2098 convenience of having their elements be numbered.)
2100 \begin{center}
2101 \begin{tabular}{|lll|l|}
2102 \hline
2103 rank & 1 & Fred & Friday \\
2104 rank & 2 & Barney & Friday \\
2105 rank & 1 & Barney & Saturday \\
2106 rank & 2 & Fred & Saturday \\
2107 \hline
2108 \end{tabular}
2109 \end{center}
2110 \end{notate}
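
\begin{notate}{Example: one set of items, two orderings}
The table above can be written down directly as data. The
following toy sketch (with the hypothetical names
`toy-rankings' and `toy-order-under') stores each row as a
triple plus a heading, and recovers the order that holds
under a given heading. It assumes ranks are stored as
numbers; in the database they would more likely be
strings.
\end{notate}

\begin{idea}
;; Hypothetical toy data and names; not part of Arxana.
(defvar toy-rankings
  '(("rank" 1 "Fred"   "Friday")
    ("rank" 2 "Barney" "Friday")
    ("rank" 1 "Barney" "Saturday")
    ("rank" 2 "Fred"   "Saturday"))
  "Rows of the table: three triple slots plus a heading.")

(defun toy-order-under (heading)
  "Return the names ranked under HEADING, in rank order."
  (let ((rows (delq nil
                    (mapcar
                     (lambda (row)
                       (when (equal (nth 3 row) heading)
                         row))
                     toy-rankings))))
    (mapcar (lambda (row) (nth 2 row))
            (sort rows
                  (lambda (a b)
                    (< (nth 1 a) (nth 1 b)))))))

(toy-order-under "Friday")    ; => ("Fred" "Barney")
(toy-order-under "Saturday")  ; => ("Barney" "Fred")
\end{idea}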
2112 \begin{notate}{The order of order} \label{order-of-order}
2113 The triples (``rank'' ``1'' ``Fred'') and (``rank'' ``2''
2114 ``Barney'') mentioned in Note \ref{ordered-vs-unordered}
2115 are easy enough to read and understand; it might be more
2116 natural in some ways for us to say (``Fred'' ``rank''
2117 ``1'') -- Fred has rank 1. In this section, we're
2118 concerned with talking about the ordered parts of a
2119 document, and ($A$ $n$ $B$) seems like an intuitive way to
2120 say ``$A$'s $n$th component is $B$''.
2121 \end{notate}
2123 \begin{notate}{It's not overdoing it, right?}
2124 When importing \emph{this} document, we see links like the
2125 following. I hope that's not ``overdoing it''. (Take a
2126 look at Note \ref{get-article} and Note \ref{get-names} to
2127 see how we go about getting information out of the
2128 database.) We could get rid of one link if theories were
2129 database objects (cf. Note
2130 \ref{theories-as-database-objects}).
2131 \end{notate}
2133 \begin{idea}
2134 "T557[P135|Web interface|.in.arxana.tex]"
2135 "T558[Future plans.9.P135|Web interface|]"
2136 "T559[T558[Future plans.9.P135|Web interface|].in.arxana.tex]"
2137 \end{idea}
2139 \begin{notate}{Importing in general} \label{importing-generally}
2140 We will eventually have a collection of parsers to get
2141 various kinds of documents into the system in various
2142 different ways (Note \ref{parsing}). For now, this
2143 section gives a simple way to get some sorts of
2144 \LaTeX\ documents into the system, namely documents
2145 structured along the same lines as the document you're
2146 reading now!
2148 An interesting approach to parsing \emph{math} documents
2149 has been undertaken in the \LaTeX ML
2150 project.\footnote{{\tt http://dlmf.nist.gov/LaTeXML/}}
2151 Eventually it would be nice to get that level of detail
here, too! Emacspeak is another example of a
2153 \LaTeX\ parser that deals with large-scale textual
2154 structures as well as smaller bits and
2155 pieces.\footnote{{\tt
2156 http://www.cs.cornell.edu/home/raman/aster/aster-thesis.ps}}
2158 It would probably be useful to put together some parsers
2159 for HTML and wiki code soon.
2160 \end{notate}
2162 \begin{notate}{On `import-buffer'}
2163 This function imports \LaTeX\ documents, taking care of
2164 the non-recursive aspects of this operation. It imports
2165 frontmatter (everything up to the first
\verb+\section+), but assumes ``backmatter'' is
2167 trivial, and does not import it. The imported material is
2168 classified as a ``document'' with the same name as the
2169 imported buffer.
2170 \end{notate}
2172 \begin{elisp}
2173 (defun import-buffer (&optional buffername)
2174 (save-excursion
2175 (set-buffer (get-buffer (or buffername
2176 (current-buffer))))
2177 (goto-char (point-min))
2178 (search-forward-regexp "\\\\begin{document}")
2179 (search-forward-regexp "\\\\section")
2180 (goto-char (match-beginning 0))
2181 ;; other links will be made in the "heading of this
2182 ;; document", but here we make a broader assertion.
2183 (scholium buffername "is a" "document")
2184 (scholium buffername
2185 "has frontmatter"
2186 (buffer-substring-no-properties
2187 (point-min)
2188 (point))
2189 buffername)
2190 ;;; These should maybe be scholia attached to
2191 ;; root-coords (below), but for some reason that
2192 ;; wasn't working so well -- investigate later --
2193 ;; maybe it just wasn't good to run after running
2194 ;; `import-within'.
2195 (let* ((root-coords (place-item buffername nil
2196 buffername))
2197 (levels
2198 '("section" "subsection" "subsubsection"))
2199 (current-parent buffername)
2200 (level-end nil)
2201 (sections (import-within levels))
2202 (index 0))
2203 (while sections
2204 (let ((coords (car sections)))
2205 (setq index (1+ index))
2206 (scholium root-coords
2207 index
2208 coords
2209 buffername))
2210 (setq sections (cdr sections))))))
2211 \end{elisp}
2213 \begin{notate}{On `import-within'}
2214 Recurse through levels of sectioning to import
2215 \LaTeX\ code.
2217 It would be good if we could do something about sections
2218 that contain neither subsections nor notes (for example, a
2219 preface), or, more generally, about text that is not
contained in any environment (possibly text appearing before
2221 any section). We'll save things like this for another
2222 editing round!
2224 For the moment, we've decided to build the document
2225 hierarchy with links that are blind to whether the $k$th
2226 component of a section is a note or a subsection.
2227 Children that are notes are attached in the subroutine
2228 `import-notes' and those that are sections are attached in
2229 `import-within'. Users can find out what type of object
2230 they are looking at based on whether or not it ``has
2231 content''.
2233 Incidentally, when looking for the end of an importing
2234 level, `nil' is an OK result -- if this is the \emph{last}
2235 section at this level \emph{and} there is no subsequent
2236 section at a higher level.
2237 \end{notate}
2239 \begin{elisp}
2240 (defun import-within (levels)
2241 (let ((this-level (car levels))
2242 (next-level (car (cdr levels))) answer)
2243 (while (re-search-forward
2244 (concat
2245 "^\\\\" this-level "{\\([^}\n]*\\)}"
2246 "\\( +\\\\label{\\)?"
2247 "\\([^}\n]*\\)?")
2248 level-end t)
2249 (let* ((name (match-string-no-properties 1))
2250 (at (place-item name nil buffername))
2251 (level-end
2252 (or (save-excursion
2253 (search-forward-regexp
2254 (concat "^\\\\" this-level "{.*")
2255 level-end t))
2256 level-end))
2257 (notes-end
2258 (if next-level
2259 (or (progn (point)
2260 (save-excursion
2261 (search-forward-regexp
2262 (concat "^\\\\"
2263 next-level "{.*")
2264 level-end t)))
2265 level-end)
2266 level-end))
2267 (index (let ((current-parent at))
2268 (import-notes notes-end)))
2269 (subsections (let ((current-parent at))
2270 (import-within (cdr levels)))))
2271 (while subsections
2272 (let ((coords (car subsections)))
2273 (setq index (1+ index))
2274 (scholium at
2275 index
2276 coords
2277 buffername)
2278 (setq subsections (cdr subsections))))
2279 (setq answer (cons at answer))))
2280 (reverse answer)))
2281 \end{elisp}
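
\begin{notate}{Example: matching section headings}
To see what the sectioning regexp picks up, here is a
small self-contained sketch that applies the same pattern
to a string instead of the current buffer. The function
`toy-section-names' is hypothetical and does not touch the
database; it just collects each heading's name together
with whatever the third group matched (the label, when one
follows on the same line).
\end{notate}

\begin{idea}
;; Hypothetical helper; not part of Arxana.
(defun toy-section-names (latex level)
  "Return (NAME . LABEL) pairs for LEVEL headings in LATEX."
  (let (found)
    (with-temp-buffer
      (insert latex)
      (goto-char (point-min))
      (while (re-search-forward
              (concat
               "^\\\\" level "{\\([^}\n]*\\)}"
               "\\( +\\\\label{\\)?"
               "\\([^}\n]*\\)?")
              nil t)
        (push (cons (match-string-no-properties 1)
                    (match-string-no-properties 3))
              found)))
    (nreverse found)))

(toy-section-names
 (concat "\\section{Introduction} \\label{intro}\n"
         "Some text.\n"
         "\\section{Future plans} \\label{plans}\n")
 "section")
;; => (("Introduction" . "intro") ("Future plans" . "plans"))
\end{idea}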
2283 \begin{notate}{On `import-notes'} \label{import-notes}
2284 We're going to make the daring assumption that the
2285 ``textual'' portions of incoming \LaTeX\ documents are
2286 contained in ``Notes''. That assumption is true, at
2287 least, for the current document. The function returns the
number of notes imported, so that
2289 `import-within' knows where to start counting this
2290 section's non-note children.
2292 Would this same function work to import all notes from a
2293 buffer without examining its sectioning structure? Not
2294 quite, but close! (Could be a fun exercise to fix this.)
2295 \end{notate}
2297 \begin{elisp}
2298 (defun import-notes (end)
2299 (let ((index 0))
2300 (while (re-search-forward (concat "\\\\begin{notate}"
2301 "{\\([^}\n]*\\)}"
2302 "\\( +\\\\label{\\)?"
2303 "\\([^}\n]*\\)?")
2304 end t)
2305 (let* ((name
2306 (match-string-no-properties 1))
2307 (tag (match-string-no-properties 3))
2308 (beg
2309 (progn (next-line 1)
2310 (line-beginning-position)))
2311 (end
2312 (progn (search-forward-regexp
2313 "\\\\end{notate}")
2314 (match-beginning 0)))
2315 (coords (place-item name nil buffername)))
2316 (setq index (1+ index))
2317 (scholium current-parent
2318 index
2319 coords
2320 buffername)
2321 ;; not in the heading
2322 (scholium coords
2323 "has content"
2324 (buffer-substring-no-properties
2325 beg end))
2326 (import-code-continuations coords)))
2327 index))
2328 \end{elisp}
2330 \begin{notate}{On `import-code-continuations'} \label{import-code-continuations}
2331 This runs within the scope of `import-notes', to turn the
2332 series of Lisp chunks or other code snippets that follow a
2333 given note into a scholium attached to that note. Each
2334 separate snippet becomes its own annotation.
2336 The ``conditional regexps'' used here only work with Emacs
2337 version 23 or higher.
2339 I'm noticing a problem with the way the `looking-at'
2340 form behaves. It matches the expression in question,
2341 but then the match-end is reported as one character
less than it is supposed to be. Maybe `looking-at' is
2343 just not as good as `re-search-forward'? But it's
2344 what seems easiest to use.
2345 \end{notate}
2347 \begin{elisp}
(defun import-code-continuations (coords)
  (let ((possible-environments
         ;; explicitly numbered group 1
         "\\(?1:lisp\\|idea\\|common\\)"))
    (while (looking-at
            (concat "\n*?\\\\begin{"
                    possible-environments
                    "}"))
      (let* ((beg (match-end 0))
             (environment (match-string 1))
             (end (progn (search-forward-regexp
                          (concat "\\\\end{"
                                  environment
                                  "}"))
                         (match-beginning 0)))
             ;; the snippet's text, between its delimiters
             (content (buffer-substring-no-properties
                       beg
                       end)))
        (scholium (scholium coords
                            "has attachment"
                            content)
                  "has type"
                  environment)))))
2370 \end{elisp}
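
\begin{notate}{Example: an explicitly numbered regexp group}
In Emacs regexp syntax, an explicitly numbered group is
written \verb+\(?1:...\)+. The following sketch checks, in
a temporary buffer, that `looking-at' with such a group
reports the name of the environment that starts at point.
It is only an illustration of the regexp; nothing is sent
to the database, and the sample buffer contents are made
up.
\end{notate}

\begin{idea}
;; Illustration only; the buffer contents are made up.
(with-temp-buffer
  (insert "\n\\begin{common}\n(+ 1 2)\n")
  (goto-char (point-min))
  (when (looking-at
         (concat "\n*?\\\\begin{"
                 "\\(?1:lisp\\|idea\\|common\\)"
                 "}"))
    (match-string 1)))
;; => "common"
\end{idea}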
2372 \begin{notate}{On `autoimport-arxana'} \label{autoimport-arxana}
2373 This just calls `import-buffer', and imports this document
2374 into the system.
2375 \end{notate}
2377 \begin{elisp}
2378 (defun autoimport-arxana ()
2379 (interactive)
2380 (import-buffer "arxana.tex"))
2381 \end{elisp}
2383 \begin{notate}{Importing textual links}
2384 Of course, it would be good to import the links that users
2385 make between articles, since then we can quickly navigate
2386 from an article to the various articles that cite that
2387 article, as well as follow the usual forward-directional
2388 links. Indeed, we should be able to browse each article
2389 within a ``neighborhood'' of other related articles.
2390 (We'll need to import labels as well, of course.)
2391 \end{notate}
2393 \subsection{Browsing database contents} \label{browsing}
2395 \begin{notate}{Browsing sketch} \label{browsing-sketch}
2396 This section facilitates browsing of documents represented
2397 with structures like those created in Section
2398 \ref{importing}, and sets the ground for browsing other
2399 sorts of contents (e.g. collections of tasks, as in
2400 Section \ref{managing-tasks}).
2402 In order to facilitate general browsing, it is not enough
2403 to simply use `get-article' (Note \ref{get-article}) and
2404 `get-names' (Note \ref{get-names}), although these
2405 functions provide our defaults. We must provide the means
2406 to find and display different things differently -- for
2407 example, a section's table of contents will typically
2408 be displayed differently from its actual contents.
2410 Indeed, the ability to display and select elements of
2411 document sections (Note \ref{display-section}) is
2412 basically the core browsing deliverable. In the process
2413 we develop a re-usable article selector (Note
2414 \ref{selector}; cf. Note \ref{browsing-tasks}). This in
2415 turn relies on a flexible function for displaying
2416 different kinds of articles (Note \ref{display-article}).
2417 \end{notate}
2419 \begin{notate}{On `display-article'} \label{display-article}
2420 This function takes in the name of the article to display.
2421 Furthermore, it takes optional arguments `retriever' and
2422 `formatter', which tell it how to look up and/or format
2423 the information for display, respectively.
2425 Thus, either we make some statement up front (choosing our
2426 `formatter' based on what we already know about the
2427 article), or we decide what to display after making some
2428 investigation of information attached to the article, some
2429 of which may be retrieved and displayed (this requires
2430 that we specify a suitable `retriever' and a complementary
2431 `formatter').
2433 For example, the major mode in which to display the
2434 article's contents could be stored as a scholium attached
2435 to the article; or we might maintain some information
2436 about ``areas'' of the database that would tell us up
front which mode is associated with the current area.
2438 (The default is to simply insert the data with no markup
2439 whatsoever.)
2441 Observe that this works when no heading argument is given,
2442 because in that case `get-article' looks for \emph{all}
2443 place pseudonyms. (But of course that won't work well
2444 when we have multiple theories containing things with the
2445 same names, so we should get used to using the heading
2446 argument.)
2448 (The business about requiring the data to be a sequence
2449 before engaging in further formatting is, of course, just
2450 a matter of expediency for making things work with the
2451 current dataset.)
2452 \end{notate}
2454 \begin{elisp}
2455 (defun display-article
2456 (name &optional heading retriever formatter)
2457 (interactive "Mname: ")
2458 (let* ((data (if retriever
2459 (funcall retriever name heading)
2460 (get-article name heading))))
2461 (when (and data (sequencep data))
2462 (save-excursion
2463 (if formatter
2464 (funcall formatter data heading)
2465 (pop-to-buffer (get-buffer-create
2466 "*Arxana Display*"))
2467 (delete-region (point-min) (point-max))
2468 (insert "NAME: " name "\n\n")
2469 (insert data)
2470 (goto-char (point-min)))))))
2471 \end{elisp}
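
\begin{notate}{Example: retrievers and formatters}
The division of labour between `retriever' and `formatter'
can be sketched without any database at all. In this toy
version (the name `toy-display' and the default behaviours
are assumptions for illustration), the retriever maps a
name to some data and the formatter maps that data to a
string; either one can be omitted to fall back on a
default.
\end{notate}

\begin{idea}
;; Hypothetical helper; not part of Arxana.
(defun toy-display (name &optional retriever formatter)
  "Return a display string for NAME.
RETRIEVER maps NAME to data; FORMATTER maps data to a string."
  (let ((data (if retriever
                  (funcall retriever name)
                (concat "contents of " name))))
    (if formatter
        (funcall formatter data)
      data)))

(toy-display "Intro")
;; => "contents of Intro"

(toy-display "Intro" nil #'upcase)
;; => "CONTENTS OF INTRO"

(toy-display "Intro"
             (lambda (name) (list name "a" "b"))
             (lambda (data)
               (mapconcat #'identity data ", ")))
;; => "Intro, a, b"
\end{idea}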
2473 \begin{notate}{An interactive article selector} \label{selector}
2474 The function `get-names' (Note \ref{get-names}) and
2475 similar functions can give us a collection of articles.
2476 The next few functions provide an interactive
2477 functionality for moving through this collection to find
2478 the article we want to look at.
2480 We define a ``display style'' that the article selector
2481 uses to determine how to display various articles. These
2482 display styles are specified by text properties attached
2483 to each option the selector provides. Similarly, when
2484 we're working within a given heading, the relevant heading
2485 is also specified as a text property.
2487 At selection time, these text properties are checked to
2488 determine which information to pass along to
2489 `display-article'.
2490 \end{notate}
2492 \begin{elisp}
2493 (defvar display-style '((nil . (nil nil))))
2495 (defun thing-name-at-point ()
2496 (buffer-substring-no-properties
2497 (line-beginning-position)
2498 (line-end-position)))
2500 (defun get-display-type ()
2501 (get-text-property (line-beginning-position)
2502 'arxana-display-type))
2504 (defun get-relevant-heading ()
2505 (get-text-property (line-beginning-position)
2506 'arxana-relevant-heading))
2508 (defun arxana-list-select ()
2509 (interactive)
2510 (apply 'display-article
2511 (thing-name-at-point)
2512 (get-relevant-heading)
2513 (cdr (assoc (get-display-type)
2514 display-style))))
2516 (define-derived-mode arxana-list-mode fundamental-mode
2517 "arxana-list" "Arxana List Mode.
2519 \\{arxana-list-mode-map}")
2521 (define-key arxana-list-mode-map (kbd "RET")
2522 'arxana-list-select)
2523 \end{elisp}
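
\begin{notate}{Example: reading display types off a listing}
Here is a small sketch of the text-property mechanism used
by the selector, run in a temporary buffer with made-up
entries. The first line is marked as a section by putting
the `arxana-display-type' property on its first character;
the second line is left unmarked. Walking the buffer line
by line then recovers each entry's display type, which is
how the selector would decide what to pass along.
\end{notate}

\begin{idea}
;; Illustration only; the entries are made up.
(with-temp-buffer
  (insert "Introduction\n")
  (insert "Some note\n")
  ;; Mark the first character of line 1 as a section.
  (put-text-property 1 2 'arxana-display-type 'section)
  (goto-char (point-min))
  (let (types)
    (while (not (eobp))
      (push (cons (buffer-substring-no-properties
                   (line-beginning-position)
                   (line-end-position))
                  (get-text-property
                   (line-beginning-position)
                   'arxana-display-type))
            types)
      (forward-line 1))
    (nreverse types)))
;; => (("Introduction" . section) ("Some note"))
\end{idea}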
2525 \begin{notate}{On `pick-a-name'} \label{pick-a-name}
2526 Here `generate' is the name of a function to call to
2527 generate a list of items to display, and `format' is a
2528 function to put these items (including any mark-up) into
the buffer from which individual items can then be
2530 selected.
2532 One simple way to get a list of names to display would be
2533 to reuse a list that we had already produced (this would
2534 save querying the database each time). We could, in fact,
2535 store a history list of lists of names that had been
2536 displayed previously (cf. Note \ref{local-storage}).
2538 We'll eventually want versions of `generate' that provide
2539 various useful views into the data, e.g., listing all of
2540 the elements of a given section (Note
2541 \ref{display-section}).
2543 Finding all the elements that match a given search term,
2544 whether that's just normal text search or some kind of
structured search, would be worthwhile too. Upgrading the
2546 display to e.g. color-code listed elements according to
2547 their type would be another nice feature to add.
2548 \end{notate}
2550 \begin{elisp}
2551 (defun pick-a-name (&optional generate format heading)
2552 (interactive)
2553 (let ((items (if generate
2554 (funcall generate)
2555 (get-names heading))))
2556 (when items
2557 (set-buffer (get-buffer-create "*Arxana Articles*"))
2558 (toggle-read-only -1)
2559 (delete-region (point-min)
2560 (point-max))
2561 (if format
2562 (funcall format items)
2563 (mapc (lambda (item) (insert item "\n")) items))
2564 (toggle-read-only t)
2565 (arxana-list-mode)
2566 (goto-char (point-min))
2567 (pop-to-buffer (get-buffer "*Arxana Articles*")))))
2568 \end{elisp}
2570 \begin{notate}{On `display-section'} \label{display-section}
2571 When browsing a document, if you select a section, you
2572 should display a list of that section's constituent
2573 elements, be they notes or subsections. The question
2574 comes up: when you go to display something, how do you
2575 know whether you're looking at the name of a section, or
2576 the name of an article?
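When you get the section's contents out of the database
(Note \ref{get-section-contents}), each item comes with an
indication of whether it is a note (i.e., whether it has
content). The formatter (Note \ref{format-section-contents})
records this as a text property, which the article
selector (Note \ref{selector}) checks at selection time.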
2580 \end{notate}
2582 \begin{elisp}
(defun display-section (name heading)
  ;; read both arguments when called interactively
  (interactive
   (list (read-string
          (concat "name (default " (buffer-name) "): ")
          nil nil (buffer-name))
         (read-string
          (concat "heading (default " (buffer-name) "): ")
          nil nil (buffer-name))))
  ;; should this pop to the Articles window?
  (pick-a-name `(lambda ()
                  (get-section-contents
                   ,name ,heading))
               `(lambda (items)
                  (format-section-contents
                   items ,heading))))
2597 (add-to-list 'display-style
2598 '(section . (display-section
2599 nil)))
2600 \end{elisp}
2602 \begin{notate}{On `get-section-contents'} \label{get-section-contents}
2603 Sent by `display-section' (Note \ref{display-section})
2604 to `pick-a-name' as a generator for the table of contents
2605 of the section with the given name in the given heading.
This function first finds the triples that begin with the
(placed) name of the section, then checks to see which of
these are in the heading of the document we're examining
(in other words, which of these links represent structural
information about that document). It also looks at the
items found at the end of these links to see if they are
sections or notes (``noteness'' is determined by them
having content). The links are then sorted by their
middles (which give the order of these components
within the section we're examining). After this ordering
information has been used for sorting, it is deleted, and
we're left with just a list of names in the appropriate
order together with an indication of their noteness.
2620 \end{notate}
2622 \begin{elisp}
2623 (Defun get-section-contents (name heading)
2624 (let (contents)
2625 (dolist (triple (triples-given-beginning
2626 `(1 ,(resolve-ambiguity
2627 (get-places name)))))
2628 (when (triple-exact-match
2629 `(2 ,(car triple)) "in" heading)
2630 (let* ((number (print-middle triple))
2631 (site (isolate-end triple))
2632 (noteness
2633 (when (triples-given-beginning-and-middle
2634 site "has content")
2635 t)))
2636 (setq contents
2637 (cons (list number
2638 (print-system-object
2639 (place-contents site))
2640 noteness)
2641 contents)))))
2642 (mapcar 'cdr
2643 (sort contents
2644 (lambda (component1 component2)
2645 (< (parse-integer (car component1))
2646 (parse-integer (car component2))))))))
2647 \end{elisp}
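
\begin{notate}{Example: sorting components and dropping the index}
The last step of `get-section-contents' can be tried out
on its own. This Emacs Lisp sketch (the name
`toy-order-contents' and the sample entries are
hypothetical) sorts (number name noteness) entries
numerically and then strips the number. It assumes that
the numbers arrive as strings, and uses `string-to-number'
where the Common Lisp code uses `parse-integer'. Comparing
numerically matters: as strings, ``10'' would sort before
``2''.
\end{notate}

\begin{idea}
;; Hypothetical helper and data; not part of Arxana.
(defun toy-order-contents (contents)
  "Sort (NUMBER NAME NOTENESS) entries; drop each NUMBER."
  (mapcar #'cdr
          (sort (copy-sequence contents)
                (lambda (a b)
                  (< (string-to-number (car a))
                     (string-to-number (car b)))))))

(toy-order-contents '(("10" "Future plans" nil)
                      ("2" "Overview" t)
                      ("1" "Introduction" t)))
;; => (("Introduction" t) ("Overview" t) ("Future plans" nil))
\end{idea}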
2649 \begin{notate}{On `format-section-contents'} \label{format-section-contents}
A formatter for document contents, used by
`display-section' (Note \ref{display-section}) as input
for `pick-a-name' (Note \ref{pick-a-name}).

Instead of just printing the items one by one,
like the default formatter in `pick-a-name' does,
this version adds appropriate text properties, which
we determine based on the second component of
each of the `items' to format.
2659 \end{notate}
2661 \begin{elisp}
2662 (defun format-section-contents (items heading)
2663 ;; just replicating the default and building on that.
2664 (mapc (lambda (item)
2665 (insert (car item))
2666 (let* ((beg (line-beginning-position))
2667 (end (1+ beg)))
2668 (unless (second item)
2669 (put-text-property beg end
2670 'arxana-display-type
2671 'section))
2672 (put-text-property beg end
2673 'arxana-relevant-heading
2674 heading))
2675 (insert "\n"))
2676 items))
2677 \end{elisp}
2679 \begin{notate}{On `display-document'} \label{display-document}
2680 When browsing a document, you should first display its
2681 top-level table of contents. (Most typically, a list of
2682 all of that document's major sections.) In order to do
this, we must find the triples that begin at the node
2684 representing this document \emph{and} that are in the
2685 heading of this document. This boils down to treating the
2686 document's root as if it was a section and using the
2687 function `display-section' (Note \ref{display-section}).
2688 \end{notate}
2690 \begin{elisp}
2691 (defun display-document (name)
2692 (interactive (list (read-string
2693 (concat
2694 "name (default "
2695 (buffer-name) "): ")
2696 nil nil (buffer-name))))
2697 (display-section name name))
2698 \end{elisp}
2700 \begin{notate}{Work with `heading' argument}
2701 We should make sure that if we know the heading we're
2702 working with (e.g. the name of the document we're
2703 browsing) that this information gets communicated in the
2704 background of the user interaction with the article
2705 selector.
2706 \end{notate}
2708 \begin{notate}{Selecting from a hierarchical display} \label{hierarchical-display}
2709 A fancier ``article selector'' would be able to display
2710 several sections with nice indenting to show their
2711 hierarchical order.
2712 \end{notate}
2714 \begin{notate}{Browser history tricks} \label{history-tricks}
2715 I want to put together (or put back together) something
2716 similar to the multihistoried browser that I had going in
2717 the previous version of Arxana and my Emacs/Lynx-based web
2718 browser, Nero\footnote{{\tt http://metameso.org/~joe/nero.el}}.
2719 The basic features are:
2720 (1) forward, back, and up inside the structure of a given
2721 document; (2) switch between tabs. More advanced features
2722 might include: (3) forward and back globally across all
2723 tabs; (4) explicit understanding of paths that loop.
2725 These sorts of features are independent of the exact
2726 details of what's printed to the screen each time
2727 something is displayed. So, for instance, you could flip
2728 between section manifests a la Note \ref{display-section},
2729 or between hierarchical displays a la Note
2730 \ref{hierarchical-display}, or some combination; the key
2731 thing is just to keep track in some sensible way of
2732 whatever's been displayed!
2733 \end{notate}
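
\begin{notate}{Sketch: a single-track history}
As a starting point for features (1) and (2) of Note
\ref{history-tricks}, a history can be kept as two stacks
around the current item. The following is only a sketch
under stated assumptions: every name beginning with
`sketch-' is hypothetical and appears nowhere else in
Arxana, and an ``item'' is assumed to be whatever is needed
to redisplay something (a name, say, or a function together
with its arguments). A tabbed browser could keep one such
set of variables per tab.
\end{notate}

\begin{idea}
;; Hypothetical helpers; not part of Arxana proper.
(defvar sketch-history-back nil
  "Items visited before the current one, most recent first.")
(defvar sketch-history-current nil
  "The item currently displayed, or nil.")
(defvar sketch-history-forward nil
  "Items we have backed out of, nearest first.")

(defun sketch-history-visit (item)
  "Record that ITEM is now displayed.
Visiting something new discards the forward list."
  (when sketch-history-current
    (push sketch-history-current sketch-history-back))
  (setq sketch-history-current item
        sketch-history-forward nil)
  item)

(defun sketch-history-go-back ()
  "Step back one item and return it; return nil at the start."
  (when sketch-history-back
    (push sketch-history-current sketch-history-forward)
    (setq sketch-history-current
          (pop sketch-history-back))))

(defun sketch-history-go-forward ()
  "Step forward one item and return it; return nil at the end."
  (when sketch-history-forward
    (push sketch-history-current sketch-history-back)
    (setq sketch-history-current
          (pop sketch-history-forward))))
\end{idea}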
2735 \begin{notate}{Local storage for browsing purposes} \label{local-storage}
2736 Right now, in order to browse the contents of the
2737 database, you need to query the database every time. It
2738 might be handy to offer the option to cache names of
2739 things locally, and only sync with the database from time
2740 to time. Indeed, the same principle could apply in
2741 various places; however, it may also be somewhat
2742 complicated to set up. Using two systems for storage, one
2743 local and one permanent, is certainly more heavy-duty than
2744 just using one permanent storage system and the local
2745 temporary display. However, one thing in favor of local
storage systems is that that's what I used in the
previous prototype of Arxana -- so some code already
2748 exists for local storage! (Caching the list of
2749 \emph{names} we just made a selection from would be one
2750 simple expedient, see Note \ref{pick-a-name}.)
2751 \end{notate}
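
\begin{notate}{Sketch: caching a list of names}
The ``simple expedient'' mentioned in Note
\ref{local-storage} might look roughly like the following.
This is a sketch, not existing Arxana code: the variable
and function names are hypothetical, and the function that
actually queries the database is passed in as an argument
so that the sketch does not depend on any particular query.
\end{notate}

\begin{idea}
;; Hypothetical helpers; not part of Arxana proper.
(defvar sketch-name-cache nil
  "Either nil or a cons (FETCH-TIME . NAMES).")

(defun sketch-cached-names (fetch-fn max-age)
  "Return a list of names, querying only when the cache is stale.
FETCH-FN is called with no arguments to get a fresh list.
The cached list is reused until it is more than MAX-AGE
seconds old."
  (when (or (null sketch-name-cache)
            (> (- (float-time) (car sketch-name-cache))
               max-age))
    (setq sketch-name-cache
          (cons (float-time) (funcall fetch-fn))))
  (cdr sketch-name-cache))

;; Possible usage, assuming `get-tasks' queries the database:
;; (sketch-cached-names #'get-tasks 300)
\end{idea}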
2753 \begin{notate}{Hang onto absolute references}
2754 Since `get-article' (Note \ref{get-article}) translates
2755 strings into their ``place pseudonyms'', we may want to
2756 hang onto those pseudonyms, because they are, in fact, the
2757 absolute references to the objects we end up working with.
2758 In particular, they should probably go into the
2759 text-property background of the article selector, so it
2760 will know right away what to select!
2761 \end{notate}
2763 \subsection{Exporting \LaTeX\ documents$^*$}
2765 \begin{notate}{Roundtripping}
2766 The easiest test is: can we import a document into the
2767 system and then export it again, and find it unchanged?
2768 \end{notate}
2770 \begin{notate}{Data format}
2771 We should be able to \emph{stably} import and export a
2772 document, as well as export any modifications to the
2773 document that were generated within Arxana. This means
2774 that the exporting functions will have to read the data
2775 format that the importing functions use, \emph{and} that
2776 any functions that edit document contents (or structure)
2777 will also have to use the same format. Furthermore,
2778 \emph{browsing} functions will have to be somewhat aware
2779 of this format. So, this is a good time to ask -- did we
2780 use a good format?
2781 \end{notate}
2783 \subsection{Editing database contents$^*$} \label{editing}
2785 \begin{notate}{Roundtripping, with changes}
2786 Here, we should import a document into the system and then
2787 make some simple changes, and after exporting, check with
2788 diff to make sure the changes are correct.
2789 \end{notate}
2791 \begin{notate}{Re-importing}
2792 One nice feature would be a function to ``re-import'' a
2793 document that has changed outside of the system, and make
changes in the system's version wherever changes appeared
2795 in the source version.
2796 \end{notate}
2798 \begin{notate}{Editing document structure}
2799 The way we have things set up currently, it is one thing
2800 to make a change to a document's textual components, and
2801 another to change its structure. Both types of changes
2802 must, of course, be supported.
2803 \end{notate}
2805 \section{Applications}
2807 \subsection{Managing tasks} \label{managing-tasks}
2809 \begin{notate}{What are tasks?}
2810 Each task tends to have a \emph{name}, a
2811 \emph{description}, a collection of \emph{prerequisite
2812 tasks}, a description of other \emph{material
2813 dependencies}, a \emph{status}, some \emph{justification
2814 of that status}, a \emph{creation date}, and an
2815 \emph{estimated time of completion}. There might actually
2816 be several ``estimated times of completion'', since the
2817 estimate would tend to improve over time. To really
2818 understand a task, one should keep track of revisions like
2819 this.
2820 \end{notate}
2822 \begin{notate}{On `store-task-data'} \label{store-task-data}
2823 Here, we're just filling in a frame. Since ``filling in a
2824 frame'' seems like the sort of operation that might happen
2825 over and over again in different contexts, to save space,
2826 it would probably be nice to have a macro (or similar)
2827 that would do a more general version of what this function
2828 does.
2829 \end{notate}
2831 \begin{elisp}
2832 (Defun store-task-data
2833 (name description prereqs materials status
2834 justification submitted eta)
2835 (add-triple name "is a" "task")
2836 (add-triple name "description" description)
2837 (add-triple name "prereqs" prereqs)
2838 (add-triple name "materials" materials)
2839 (add-triple name "status" status)
2840 (add-triple name "status justification" justification)
2841 (add-triple name "date submitted" submitted)
2842 (add-triple name "estimated time of completion" eta))
2843 \end{elisp}
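
\begin{notate}{Sketch: filling in an arbitrary frame}
The ``more general version'' suggested in Note
\ref{store-task-data} need not be a macro; a plain function
over a list of slots may be enough. The sketch below is
hypothetical (the name `sketch-store-frame' is invented
here), and it takes the storing function as an argument
rather than assuming anything about `add-triple' beyond its
three-argument calling convention used above.
\end{notate}

\begin{idea}
;; Hypothetical helper; not part of Arxana proper.
(defun sketch-store-frame (name type slots store-fn)
  "Store NAME as a TYPE, then store each of its SLOTS.
SLOTS is a list of (SLOT . FILLER) pairs.  STORE-FN is
called with three arguments (beginning, middle, end), once
for the type and once per slot."
  (funcall store-fn name "is a" type)
  (dolist (pair slots)
    (funcall store-fn name (car pair) (cdr pair))))

;; Possible usage, assuming `add-triple' as used above:
;; (sketch-store-frame "write tests" "task"
;;                     '(("description" . "Add a test suite")
;;                       ("status" . "tabled"))
;;                     #'add-triple)
\end{idea}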
2845 \begin{notate}{On `generate-task-data'} \label{generate-task-data}
2846 This is a simple function to create a new task matching
2847 the description above.
2848 \end{notate}
2850 \begin{elisp}
(defun generate-task-data ()
  (interactive)
  (let ((name (read-string "Name: "))
        (description (read-string "Description: "))
        (prereqs (read-string
                  "Task(s) this task depends on: "))
        (materials (read-string "Material dependencies: "))
        (status (completing-read
                 "Status (tabled, in progress, completed): "
                 '("tabled" "in progress" "completed")))
        (justification (read-string "Why this status? "))
        (submitted
         (read-string
          (concat "Date submitted (default "
                  (substring (current-time-string) 0 10)
                  "): ")
          nil nil (substring (current-time-string) 0 10)))
        (eta
         (read-string "Estimated date of completion: ")))
    (store-task-data name description prereqs materials
                     status
                     justification submitted eta)))
2873 \end{elisp}
2875 \begin{notate}{Possible enhancements to `generate-task-data'}
2876 In order to make this function very nice, it would be good
2877 to allow ``completing read'' over known tasks when filling
2878 in the prerequisites. Indeed, it might be especially nice
2879 to offer a type of completing read that is similar in some
2880 sense to the tab-completion you get when completing a file
2881 name, i.e., quickly completing certain sub-strings of the
2882 final string (in this case, these substrings would
2883 correspond to task areas we are progressively zooming down
2884 into).
2886 As for the task description, rather than forcing the user
2887 to type the description into the minibuffer, it might be
2888 nice to pop up a separate buffer instead (a la the
2889 Emacs/w3m textarea). If we had a list of all the known
2890 tasks, we could offer completing-read over the names of
2891 existing tasks to generate the list of `prereqs'. It
2892 might be nice to systematize date data, so we could more
2893 easily e.g. sort and display task info ``by date''.
2894 (Perhaps we should be working with predefined database
2895 types for dates and so on; but see Note
2896 \ref{choice-of-database}.)
2898 Also, before storing the task, it might be nice to offer
2899 the user the chance to review the data they entered.
2900 \end{notate}
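
\begin{notate}{Sketch: completion over tasks, and sortable dates}
Two of the enhancements above are easy to try out with
standard Emacs facilities. The sketch below uses
`completing-read-multiple' to read several task names with
completion, and `format-time-string' to produce dates in
year-month-day form, which sort correctly as plain strings.
The function names are hypothetical, and the list of known
tasks is passed in as an argument (in Arxana it could
plausibly come from `get-tasks', defined below).
\end{notate}

\begin{idea}
;; Hypothetical helpers; not part of Arxana proper.
(defun sketch-read-prereqs (known-tasks)
  "Read zero or more task names, completing over KNOWN-TASKS.
Return the chosen names as one comma-separated string."
  (mapconcat #'identity
             (completing-read-multiple
              "Task(s) this task depends on: "
              known-tasks)
             ", "))

(defun sketch-date-string (&optional time)
  "Return TIME (default: now) as a YYYY-MM-DD string."
  (format-time-string "%Y-%m-%d" time))

(defun sketch-sort-by-date (tasks)
  "Return TASKS, a list of (NAME . DATE) pairs, oldest first.
DATE is assumed to be in the YYYY-MM-DD form produced by
`sketch-date-string'."
  (sort (copy-sequence tasks)
        (lambda (a b) (string< (cdr a) (cdr b)))))
\end{idea}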
2902 \begin{notate}{On `get-filler'} \label{get-filler}
2903 Just a wrapper for `triples-given-beginning-and-middle'.
2904 (Maybe add `heading' as an option here.)
2905 \end{notate}
2907 \begin{elisp}
2908 (Defun get-filler (frame slot)
2909 (third (first
2910 (print-triples
2911 (triples-given-beginning-and-middle frame
2912 slot)))))
2913 \end{elisp}
2915 \begin{notate}{On `get-task'} \label{get-task}
2916 Uses `get-filler' (Note \ref{get-filler}) to assemble the
2917 elements of a task's frame.
2918 \end{notate}
2920 \begin{elisp}
2921 (Defun get-task (name)
2922 (when (triple-exact-match name "is a" "task")
2923 (list (get-filler name "description")
2924 (get-filler name "prereqs")
2925 (get-filler name "materials")
2926 (get-filler name "status")
2927 (get-filler name "status justification")
2928 (get-filler name "date submitted")
2929 (get-filler name
2930 "estimated time of completion"))))
2931 \end{elisp}
2933 \begin{notate}{On `review-task'} \label{review-task}
2934 This is a function to review a task by name.
2935 \end{notate}
2937 \begin{elisp}
(defun review-task (name)
  (interactive "MName: ")
  (let ((task-data (get-task name)))
    (if task-data
        (display-task task-data)
      (message "No data."))))

;; Note: `name' is not an argument of `display-task'; it is
;; picked up through dynamic binding from the caller (e.g.
;; `review-task').
(defun display-task (data)
  (save-excursion
    (pop-to-buffer (get-buffer-create
                    "*Arxana Display*"))
    (delete-region (point-min) (point-max))
    (insert "NAME: " name "\n\n")
    (insert "DESCRIPTION: " (first data) "\n\n")
    (insert "TASKS THIS TASK DEPENDS ON: "
            (second data) "\n\n")
    (insert "MATERIAL DEPENDENCIES: "
            (third data) "\n\n")
    (insert "STATUS: " (fourth data) "\n\n")
    (insert "WHY THIS STATUS?: " (fifth data) "\n\n")
    (insert "DATE SUBMITTED: " (sixth data) "\n\n")
    (insert "ESTIMATED TIME OF COMPLETION: "
            (seventh data) "\n\n")
    (goto-char (point-min))
    (fill-individual-paragraphs (point-min) (point-max))))
2963 \end{elisp}
2965 \begin{notate}{Possible enhancements to `review-task'}
2966 Breaking this down into a function to select the task and
2967 another function to display the task would be nice. Maybe
2968 we should have a generic function for selecting any object
2969 ``by name'', and then special-purpose functions for
2970 displaying objects with different properties.
2972 Using text properties, we could set up a ``field-editing
2973 mode'' that would enable you to select a particular field
2974 and edit it independently of the others. Another more
2975 complex editing mode would \emph{know} which fields the
2976 user had edited, and would store all edits back to the
2977 database properly. See Section \ref{editing} for more on
2978 editing.
2979 \end{notate}
2981 \begin{notate}{Browsing tasks} \label{browsing-tasks}
2982 The function `pick-a-name' (Note \ref{pick-a-name}) takes
2983 two functions, one that finds the names to choose from,
2984 and the other that says how to present these names. We
2985 can therefore build `pick-a-task' on top of `pick-a-name'.
2986 \end{notate}
2988 \begin{elisp}
2989 (Defun get-tasks ()
2990 (mapcar #'first
2991 (print-triples
2992 (triples-given-middle-and-end "is a" "task")
2993 t)))
2995 (defun pick-a-task ()
2996 (interactive)
2997 (pick-a-name
2998 'get-tasks
2999 (lambda (items)
3000 (mapc (lambda (item)
3001 (let ((pos (line-beginning-position)))
3002 (insert item)
3003 (put-text-property pos (1+ pos)
3004 'arxana-display-type
3005 'task)
3006 (insert "\n"))) items))))
3008 (add-to-list 'display-style
3009 '(task . (get-task display-task)))
3010 \end{elisp}
3012 \begin{notate}{Working with theories}
3013 Presumably, like other related functions, `get-tasks'
3014 should take a heading argument.
3015 \end{notate}
3017 \begin{notate}{Check display style}
3018 Check if this works, and make style consistent between
3019 this usage and earlier usage.
3020 \end{notate}
3022 \begin{notate}{Example tasks}
3023 It might be fun to add some tasks associated with
3024 improving Arxana, just to show that it can be done...
3025 maybe along with a small importer to show how importing
3026 something without a whole lot of structure can be easy.
3027 \end{notate}
3029 \subsection{Other ideas$^*$}
3031 \begin{notate}{A browser within a browser} \label{browser-within}
3032 All the stuff we're doing with triples can be superimposed
3033 over the existing web and existing web interfaces, by, for
3034 example, writing a web browser as a web app, and in this
3035 ``browser within a browser'' offer the ability to annotate
3036 and rewrite other people's web pages, produce 3rd-party
3037 redirects, and so forth, sharing these mods with other
3038 subscribers to the service. (Already websites such as the
3039 short-lived scrum.diddlyumptio.us have offered limited
3040 versions of ``web annotation'', but, so far, what one can
3041 do with such services seems quite weak compared with
3042 what's possible.)
3043 \end{notate}
3045 \begin{notate}{Improvements to the PlanetMath backend}
3046 From one point of view, the SQL tables are the main thing
3047 in Noosphere. We could say that getting the things out of
3048 SQL and storing new things there is what Noosphere mainly
3049 does. Following this line of thought, anything that
3050 adjusts these tables will do just as well, e.g., it
3051 shouldn't be terribly hard to develop an email-based
3052 front-end. But rather than making Arxana work with the
3053 Noosphere relational table system, it is probably
3054 advantageous to translate the data from these tables into
3055 the scholium system.
3056 \end{notate}
3058 \begin{notate}{A new communication platform}
3059 One of the premier applications I have in mind is a new
way to handle communications in an online forum. I have
3061 previously called this ``subchanneling'', but really,
3062 joining channels is just as important.
3063 \end{notate}
3065 \begin{notate}{Some tutorials}
3066 It would be interesting to write a tutorial for Common
3067 Lisp or just about any other topic with this system. For
3068 example, some little ``worksheets'' or ``gymnasia'' that
3069 will help solidify user knowledge in topics on which
3070 questions keep appearing.
3071 \end{notate}
3073 \section{Topics of philosophical interest}
3075 \begin{notate}{Research and development}
3076 In Note \ref{theoretical-context}, I mentioned a model
3077 that could apply in many contexts; it is an essentially
3078 metaphysical conception. I'm pretty sure that the data
3079 model of Note \ref{data-model} provides a general-enough
3080 framework to represent anything we might find ``out
3081 there''. However, even if this is the case, questions as
3082 to \emph{efficient} means of working with such data still
3083 abound (cf. Note \ref{models-of-theories}, Note
3084 \ref{use-of-views}).
3086 I propose that along with \emph{development} of Arxana as
3087 a useful system for \emph{doing} ``commons-based peer
3088 production'' should come a \emph{research} programme for
3089 understanding in much greater detail what ``commons-based
3090 peer production'' \emph{is}. Eventually we may want to
3091 change the name of the subject of study to reflect still
3092 more general ideas of resource use.
3094 While the ``frontend'' of this research project is
3095 anthropological, the ``backend'' is much closer to
3096 artificial intelligence. On this level, the project is
3097 about understanding \emph{effective} means for solving
3098 human problems. Often this will involve decomposing
3099 events and processes into constituent elements, making
3100 increasingly detailed treatments along the lines described
3101 in Note \ref{arxana}.
3102 \end{notate}
3104 \begin{notate}{The relationship between text and commentary}
3105 Text under revision might be marked up by a copyeditor: in
3106 cases like these, the interpretation is clear. However,
3107 what about marginalia with looser interpretations? These
3108 seem to become part of the copy of the text they are
3109 attached to. What about steering processes applied to a
3110 given course of action? How about the relationship of
3111 thoughts or words to perception and action? How can we
3112 lower the barrier between conception and action, while
3113 still maintaining some purchase on wisdom?
3115 You see, a lot of issues in life have to do with overlays,
3116 multi-tracking, interchange between different systems; and
3117 in these terms, a lot of philosophy reduces to ``media
3118 awareness'' which extends into more and more immediate
3119 contexts (Note \ref{theoretical-context}).
3120 \end{notate}
3122 \begin{notate}{Heuristic flow}
3123 Continuing the notion above: one does not need a
3124 fully-developed ``heading'' of work in order to do work --
3125 instead, one wants some straightforward heuristics that
3126 will enable the desired work to get done. So, even
3127 supposing the work is ``heading building'', it can progress
3128 without becoming overwhelmed in abstractions -- because
3129 theories and heuristics are different things.
3130 \end{notate}
3132 \begin{notate}{Limits of simple languages} \label{simple-languages}
3133 Triples are frequently ``subject, verb, object''
3134 statements, although with the annotation features, we can
3135 modify any part of any such statement; for example, we
3136 can apply an adverb to a given verb.
3138 ``Tags'', of course, already provide ``subject,
3139 predicate'' relationships. It will be interesting to
3140 examine the degree to which human languages can be mapped
3141 down into these sorts of simple languages. What features
3142 are needed to make such languages \emph{useful}? (Lisp's
3143 `car' and `cdr' seem related to the idea of making
3144 predicates useful.)
3146 How are triples and predicates ``enough''? What, if
3147 anything, do they lack? The difference between triples
3148 and predicates illustrates the issue. How should we
3149 characterize Arxana's additions to Lisp?
3150 \end{notate}
3152 \begin{notate}{Higher dimensions}
3153 Why stop with three components? Why not have $(A, B, C,
3154 D, T)$ represent a semantic relationship between all of
3155 $A$, $B$, $C$, and $D$ (in heading $T$, of course)?
3156 Actually, there is no reason to stop apart from the fact
3157 that I want to explore simple languages (Note
3158 \ref{simple-languages}). In real life, things are not as
3159 simple, and we should be ready to deal with the
3160 complexities! (Cf., for example, Note \ref{pointing}).
3161 \end{notate}
3163 \section{Future plans}
3165 \begin{notate}{Development pathways}
3166 To the extent that it's possible, I'd like to maintain a
3167 succinct non-linear roadmap in which tasks are outlined
3168 and prioritized, and some procedural details are made
3169 concrete. Whenever relevant this map should point into
3170 the current document. I'll begin by revising the plans
3171 I've used so far!\footnote{{\tt
3172 http://metameso.org/files/plan-arxana.pdf}} Over the
3173 next several months, I'd like to see these plans develop
3174 into a genuine production machine, and see the machine
3175 begin to stabilize its operations.
3176 \end{notate}
3178 \begin{notate}{Theories as database objects} \label{theories-as-database-objects}
3179 We're just beginning to treat theories as database
3180 objects; I expect there will be more work to do to make
3181 this work really well. We'll want to make some test
3182 cases, like building a ``theory of chess'', or even just
3183 describing a particular chess board; cf. Note
3184 \ref{partial-image}.
3185 \end{notate}
3187 \begin{notate}{Search engine/elements} \label{search-engine}
One of the features that came very easily in the Emacs-only
3189 prototype was textual search. With the strings stored in
3190 a database, Sphinx seems to be the most suitable search
3191 engine to use. It is tempting to try to make our own
3192 inverted index using triples, so that text-based search
3193 can be even more directly integrated with semantic search.
3194 (Since the latest version(s) of Sphinx can act to some
3195 extent like a MySQL database, we almost have a direct
3196 connection in the backend, but since Sphinx is not
3197 \emph{the same} database, one would at least need some
3198 glue code to effect joins and so forth.)
3200 More to the point, it is important for this project that
3201 the scholia-based document model be transparently extended
3202 down to the level of words and characters. It may be
3203 helpful to think about text as \emph{always being}
3204 hypertext; a document as a heading; and a word in the
3205 inverted index as a frame.
3206 \end{notate}
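
\begin{notate}{Sketch: an inverted index made of triples}
To make the idea in Note \ref{search-engine} concrete, here
is a minimal sketch of how the words of a text could be
turned into triples, with each word acting as the beginning
of a triple that points at the document it occurs in. The
function name and the predicate ``occurs in'' are
assumptions made for illustration; nothing here is stored,
the triples are simply returned as a list.
\end{notate}

\begin{idea}
;; Hypothetical helper; not part of Arxana proper.
(defun sketch-index-triples (doc text)
  "Return one (WORD \"occurs in\" DOC) list per distinct word.
TEXT is downcased first; any run of characters other than
letters and digits separates words."
  (let (seen triples)
    (dolist (word (split-string (downcase text)
                                "[^[:alnum:]]+" t))
      (unless (member word seen)
        (push word seen)
        (push (list word "occurs in" doc) triples)))
    (nreverse triples)))

;; Possible usage:
;; (sketch-index-triples "note-1" "The cat; the hat.")
\end{idea}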
3208 \begin{notate}{Pointing at database elements and other things} \label{pointing}
3209 We will want to be able to point at other tables and at
3210 other sorts of objects and make use of their contents.
3211 The plan is that our triples will provide a sort of guide
3212 or backbone superimposed over a much larger data system.
3213 \end{notate}
3215 \begin{notate}{Feature-chase}
3216 There are lots of different features that could be
3217 explored, for example: multi-dimensional history lists; a
3218 useful treatment of ``clusions''; MS Word-like colorful
3219 annotations; etc. Many of these features are already
3220 prototyped.\footnote{See footnote \ref{old-version}.}
3221 \end{notate}
3223 \begin{notate}{Regression testing}
3224 Along with any major feature chase, we should provide
3225 and maintain a regression testing suite.
3226 \end{notate}
3228 \begin{notate}{Deleting and changing things}
3229 How will we deal with unlinking, disassociating,
3230 forgetting, entropy, and the like? Changes can perhaps
3231 be modeled by an insertion following a deletion, and,
3232 as noted, we'll need effective ways to represent and
3233 manage change (Note \ref{change}).
3234 \end{notate}
3236 \begin{notate}{Tutorial}
3237 Right now the system is simple enough to be pretty much
3238 self-explanatory, but if it becomes much more complicated,
3239 it might be helpful to put together a simple guide to some
3240 likely-to-be-interesting features.
3241 \end{notate}
3243 \begin{notate}{Computing possible paths and connections}
3244 If we can find all the \emph{direct} paths from one node
3245 to another using `triples-given-beginning-and-end', can we
inject some algorithms for finding longer, indirect paths
3247 into the system, and find ways to make them useful?
3249 Similarly, we can satisfy local conditions (Note
3250 \ref{satisfy-conditions}), but we'll want to deal with
3251 increasingly ``non-local'' conditions (even just using the
3252 logical operator ``or'', instead of ``and'', for example).
3253 \end{notate}
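
\begin{notate}{Sketch: breadth-first search for indirect paths}
One simple candidate for finding indirect paths is
breadth-first search, which returns a path using the fewest
possible triples. The sketch below works on an in-memory
list of triples, each written as a three-element list, and
is meant only as an illustration: the function name is
hypothetical, and a real version would presumably fetch the
outgoing triples of each node from the database instead of
scanning a list.
\end{notate}

\begin{idea}
;; Hypothetical helper; not part of Arxana proper.
(defun sketch-find-path (triples start goal)
  "Return a shortest list of triples leading from START to GOAL.
TRIPLES is a list of (BEGINNING MIDDLE END) lists.  Return
nil if there is no path, or if START equals GOAL."
  ;; Each queue entry is (NODE . REVERSED-PATH-TO-NODE).
  (let ((queue (list (list start)))
        (visited (list start))
        result)
    (while (and queue (not result))
      (let* ((entry (pop queue))
             (node (car entry))
             (path (cdr entry)))
        (dolist (tr triples)
          (when (and (not result)
                     (equal (nth 0 tr) node)
                     (not (member (nth 2 tr) visited)))
            (let ((new-path (cons tr path)))
              (if (equal (nth 2 tr) goal)
                  (setq result (reverse new-path))
                (push (nth 2 tr) visited)
                (setq queue
                      (append queue
                              (list (cons (nth 2 tr)
                                          new-path))))))))))
    result))
\end{idea}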
3255 \begin{notate}{Monster Mountain}
3256 In Summer 2007, we checked out the Monster Mountain MUD
3257 server\footnote{{\tt http://code.google.com/p/mmtn/}},
3258 which would enable several users to interact with one
3259 LISP, instead of just one database. This would have a
3260 number of advantages, particularly for exploring
3261 ``scholiumific programming'', but also towards fulfilling
3262 the user-to-user interaction objective stated in Note
3263 \ref{theoretical-context}. I plan to explore this after
3264 the primary goal of multi-user interaction with the
3265 database has been solidly completed.
3266 \end{notate}
3268 \begin{notate}{Web interface}
3269 A finished web interface may take a considerable amount of
3270 work (if the complexity of an interesting Emacs interface
3271 is any indication), but the basics shouldn't be hard to
3272 put together soon.
3273 \end{notate}
3275 \begin{notate}{Parsing input} \label{parsing}
3276 Complicated objects specified in long-hand (e.g. triples
3277 pointing to triples) can be read by a relatively simple
3278 parser -- which we'll have to write! The simplest goal
3279 for the parser would be to be able to distinguish between
3280 a triple and a string -- presumably that much isn't hard.
3281 And of course, building complexes of triples that
3282 represent statements from natural language is a good
3283 long-term goal. (Right now, our granularity level is set
3284 much higher.)
3285 \end{notate}
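
\begin{notate}{Sketch: telling triples from strings}
The ``simplest goal'' of Note \ref{parsing} can be sketched
by leaning on the Lisp reader. The sketch assumes, purely
for illustration, that the long-hand notation is ordinary
Lisp syntax: a string is written in double quotes and a
triple is a three-element list whose elements are strings
or further triples. Both function names are hypothetical.
\end{notate}

\begin{idea}
;; Hypothetical helpers; not part of Arxana proper.
(defun sketch-parse-object (form)
  "Classify FORM as a string or a (possibly nested) triple.
Return (string S) or (triple B M E), where B, M and E are
parsed recursively.  Signal an error for anything else."
  (cond ((stringp form) (list 'string form))
        ((and (listp form) (= (length form) 3))
         (cons 'triple
               (mapcar #'sketch-parse-object form)))
        (t (error "Not a string or a triple: %S" form))))

(defun sketch-parse-string (input)
  "Read one Lisp form from the string INPUT and classify it."
  (sketch-parse-object (car (read-from-string input))))

;; Possible usage:
;; (sketch-parse-string "(\"a\" \"b\" (\"c\" \"d\" \"e\"))")
\end{idea}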
3287 \begin{notate}{Choice of database} \label{choice-of-database}
3288 I expect Elephant\footnote{{\tt
3289 http://common-lisp.net/project/elephant/}} may become
3290 our preferred database at some point in the future; we are
3291 currently awaiting changes to Elephant that make nested
3292 queries possible and efficient. Some core queries related
3293 to managing a database of semantic links with the current
3294 Elephant were constructed by Ian Eslick, Elephant's
3295 maintainer.\footnote{{\tt
3296 http://planetx.cc.vt.edu/\~{}jcorneli/arxana/variant-4.lisp}}
3298 On the other hand, it might be reasonable to use an Emacs
3299 database and redo the whole thing to work in Emacs
3300 (again), e.g. for single-user applications or users who
3301 want to work offline a lot of the time.
3302 \end{notate}
3304 \begin{notate}{Different kinds of theories}
3305 Theories or variants thereof are of course already popular
3306 in other knowledge representation contexts.\footnote{{\tt
3307 http://www.cyc.com/cycdoc/vocab/mt-expansion-vocab.html}}$^{,}$\footnote{{\tt
3308 http://www.stanford.edu/\~{}kdevlin/HHL\_SituationTheory.pdf}}
3309 We'll want to adopt some useful techniques for knowledge
3310 management as soon as the core systems are ready.
3312 Various notions of a mathematical theory
3313 exist.\footnote{{\tt
3314 http://planetmath.org/encyclopedia/Theory.html}} It
3315 would be nice to be able to assign specific logic to
3316 theories in Arxana, following the ``little theories''
3317 design of e.g. IMPS.\footnote{{\tt
3318 http://imps.mcmaster.ca/manual/node13.html}}
3319 \end{notate}
3321 \section{Conclusion} \label{conclusion}
3323 \begin{notate}{Ending and beginning again}
3324 This is the end of the Arxana system itself; the
3325 appendices provide some ancillary tools, and some further
3326 discussion. Contributions that support the development of
3327 the Arxana project are welcome.
3328 \end{notate}
3330 \appendix
3332 \section{Appendix: Auto-setup} \label{appendix-setup}
3334 \begin{notate}{Setting up auto-setup}
This section provides code for satisfying dependencies and
setting up the program. This code assumes that you are
using a Debian/APT-based system (but things are not so
different using, say, Fedora or Fink; writing a
3339 multi-package-manager-friendly installer shouldn't be
3340 hard). Of course, feel free to set things up differently
3341 if you have something else in mind!
3342 \end{notate}
3344 \begin{elisp}
3345 (defalias 'set-up 'shell-command)
3347 (defun alternative-set-up (string)
3348 (save-excursion
3349 (pop-to-buffer (get-buffer-create "*Arxana Help*"))
3350 (goto-char (point-max))
3351 (insert string "\n")))
3353 (defun set-up-arxana-environment ()
3354 (interactive)
3355 (if (y-or-n-p
3356 "Run commands (y) (or just show instructions)? ")
3357 (fset 'set-up 'shell-command)
3358 (fset 'set-up 'alternative-set-up))
3359 (when (y-or-n-p "Install dependencies? ")
3360 (set-up "mkdir ~/arxana")
    (set-up "cd ~/arxana"))
3363 (when (y-or-n-p "Download latest Arxana? ")
3364 (set-up "wget http://metameso.org/files/arxana.tex"))
3366 (unless (y-or-n-p "Is your emacs good enough?... ")
3367 (set-up
3368 (concat "cvs -z3 -d"
3369 ":pserver:anonymous@cvs.savannah.gnu.org:"
3370 "/sources/emacs co emacs"))
3371 (set-up "mv emacs ~")
3372 (set-up "cd ~/emacs")
3373 (set-up "./configure && make bootstrap")
3374 (set-up "cd ~/arxana"))
3376 (defvar pac-man nil)
3378 (cond ((y-or-n-p
3379 "Do you use an apt-based package manager? ")
3380 (setq pac-man "apt-get"))
3381 (t (message
3382 "OK, get Lisp and SQL on your own, then!")))
3384 (when pac-man
3385 (when (y-or-n-p "Install Common Lisp? ")
3386 (set-up (concat pac-man " install sbcl")))
3388 (when (y-or-n-p "Install Postgresql? ")
3389 (set-up (concat pac-man " install postgresql"))
3390 (when (y-or-n-p "Help setting up PostgreSQL? ")
3391 (save-excursion
3392 (pop-to-buffer (get-buffer-create "*Arxana Help*"))
3393 (insert "As superuser (root),
3394 edit /etc/postgresql/7.4/main/pg_hba.conf
3395 make sure it says this:
3396 host all all 127.0.0.1 255.255.255.255 trust
3397 then edit /etc/postgresql/7.4/main/postgresql.conf
3398 and make it say
3399 tcpip_socket = true
3400 then restart:
3401 /etc/init.d/postgresql-7.4 restart
3402 su postgres
3403 createuser username
3404 exit
3405 as username, run
3406 createdb -U username\n")))))
3408 (when (y-or-n-p "Install SLIME...? ")
3409 (set-up (concat "cvs -d :pserver:anonymous"
3410 ":anonymous@common-lisp.net:"
3411 "/project/slime/cvsroot co slime"))
3412 (set-up
3413 (concat "echo \";; Added to ~/.emacs for Arxana:\n\n"
3414 "(add-to-list 'load-path \"~/slime/\")\n"
3415 "(setq inferior-lisp-program \"/usr/bin/sbcl\")\n"
3416 "(require 'slime)\n"
3417 "(slime-setup '(slime-repl))\n\n\""
3418 "| cat - ~/.emacs > ~/updated.emacs &&"
3419 "mv ~/updated.emacs ~/.emacs")))
3421 (when (y-or-n-p "Set up Common Lisp environment? ")
3422 (set-up "mkdir ~/.sbcl")
3423 (set-up "mkdir ~/.sbcl/site")
3424 (set-up "mkdir ~/.sbcl/systems")
3425 (set-up "cd ~/.sbcl/site")
3426 (set-up (concat "wget http://files.b9.com/"
3427 "clsql/clsql-latest.tar.gz"))
3428 (set-up "tar -zxf clsql-4.0.3.tar.gz")
3429 (set-up (concat "wget http://files.b9.com/"
3430 "uffi/uffi-latest.tar.gz"))
3431 (set-up "tar -zxf uffi-1.6.0.tar.gz")
3432 (set-up (concat "wget http://files.b9.com/"
3433 "md5/md5-1.8.5.tar.gz"))
3434 (set-up "tar -zxf md5-1.8.5.tar.gz")
3435 (set-up "cd ~/.sbcl/systems")
3436 (set-up "ln -s ../site/md5-1.8.5/md5.asd .")
3437 (set-up "ln -s ../site/uffi-1.6.0/uffi.asd .")
3438 (set-up "ln -s ../site/clsql-4.0.3/clsql.asd .")
3439 (set-up "ln -s ../site/clsql-4.0.3/clsql-uffi.asd .")
3440 (set-up (concat "ln -s ../site/clsql-4.0.3/"
3441 "clsql-postgresql-socket.asd ."))
3442 (set-up "ln -s ~/arxana/arxana.asd ."))
3444 (when (y-or-n-p "Modify ~/.sbclrc so CL always starts Arxana? ")
3445 (set-up
3446 (concat "echo \";; Added to ~/.sbclrc for Arxana:\n\n"
3447 "(require 'asdf)\n\n"
3448 "(asdf:operate 'asdf:load-op 'swank)\n"
3449 "(setf swank:*use-dedicated-output-stream* nil)\n"
3450 "(setf swank:*communication-style* :fd-handler)\n"
3451 "(swank:create-server :port 4006 :dont-close t)\n\n"
3452 "(asdf:operate 'asdf:load-op 'clsql)\n"
3453 "(asdf:operate 'asdf:load-op 'arxana)\n"
3454 "(in-package arxana)\n"
3455 "(connect-to-database)\n"
3456 "(locally-enable-sql-reader-syntax)\n\n\""
3457 "| cat ~/.sbclrc - > ~/updated.sbclrc &&"
3458 "mv ~/updated.sbclrc ~/.sbclrc")))
3460 (when (y-or-n-p "Install Monster Mountain? ")
3461 (set-up "cd ~/.sbcl/systems")
3462 (set-up (concat
3463 "darcs get http://common-lisp.net/project/"
3464 "bordeaux-threads/darcs/bordeaux-threads/"))
3465 (set-up (concat
3466 "svn checkout svn://common-lisp.net/project/"
3467 "usocket/svn/usocket/trunk usocket-svn"))
3468 ;; I've had problems with this approach to setting cclan
3469 ;; mirror...
3470 (set-up
3471 (concat
3472 "wget \"http://ww.telent.net/cclan-choose-mirror"
3473 "?M=http%3A%2F%2Fthingamy.com%2Fcclan%2F\""))
3474 (set-up (concat "wget http://ww.telent.net/cclan/"
3475 "split-sequence.tar.gz"))
3476 (set-up "tar -zxf split-sequence.tar.gz")
3477 (set-up
3478 (concat "svn checkout http://mmtn.googlecode.com/"
3479 "svn/trunk/ mmtn-read-only"))
3480 (set-up
3481 "ln -s ~/bordeaux-threads/bordeaux-threads.asd .")
3482 (set-up "ln -s ~/usocket-svn/usocket.asd .")
3483 (set-up "ln -s ~/split-sequence/split-sequence.asd .")
3484 (set-up "ln -s ~/mmtn/src/mmtn.asd .")))
3485 \end{elisp}
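
\begin{notate}{Sketch: running a command in a given directory}
One caveat about `set-up-arxana-environment': each call to
`shell-command' starts a fresh shell, so a ``cd'' issued in
one call has no effect on the next call. A possible
workaround, sketched below with a hypothetical helper name,
is to build a single command string that changes directory
and then runs the command. Because the result is just a
string, it should work equally well whether `set-up' runs
commands or merely prints them as instructions.
\end{notate}

\begin{idea}
;; Hypothetical helper; not part of Arxana proper.
(defun sketch-command-in (dir command)
  "Return a shell command string that runs COMMAND inside DIR.
DIR is expanded first, so a leading ~ is resolved by Emacs
rather than by the shell."
  (concat "cd "
          (shell-quote-argument (expand-file-name dir))
          " && " command))

;; Possible usage:
;; (set-up (sketch-command-in
;;          "~/arxana"
;;          "wget http://metameso.org/files/arxana.tex"))
\end{idea}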
3487 \begin{notate}{Postgresql on Fedora}
3488 There are some slightly different instructions for
3489 installing postgresql on Fedora; the above will be
3490 changed to include them, but for now, check them
3491 out on the
3492 web.\footnote{{\tt http://www.flmnh.ufl.edu/linux/install\_postgresql.htm}}
3493 \end{notate}
3495 \begin{notate}{Using MySQL and CLISP instead} \label{backend-variant}
3496 Since my OS X box seems to have a variety of confusing
3497 PostgreSQL systems already installed (which I'm not sure
3498 how to configure), and CLISP is easy to install with fink,
I thought I'd try a different setup for simplicity and
variety.

In order to make it work, I enabled the root user on Mac
OS X per instructions on the web; installed and configured
mysql; used a slight modification of the strings table
described previously; downloaded and installed
cffi\footnote{{\tt
http://common-lisp.net/project/cffi/releases/cffi\_latest.tar.gz}};
changed the definition of `connect-to-database' in
Arxana's utilities.lisp; doctored up my \~{}/.clisprc.lisp;
and changed how I started Lisp. Details below.
3511 \end{notate}
\begin{idea}
;; on the shell prompt
sudo apt-get install mysql
sudo mysqld_safe --user=mysql &
sudo daemonic enable mysql
sudo mysqladmin -u root password root
mysql --user=root --password=root -D test
create database joe; grant all on joe.* to joe@localhost
identified by 'joe';

;; in tabledefs.lisp
(execute-command "CREATE TABLE strings (
   id SERIAL PRIMARY KEY,
   text TEXT,
   UNIQUE INDEX (text(255))
   );")

;; in ~/asdf-registry/ or whatever you've designated as
;; your asdf:*central-registry*
ln -s ~/cffi_0.10.4/cffi-uffi-compat.asd .
ln -s ~/cffi_0.10.4/cffi.asd .

;; In utilities.lisp
(defun connect-to-database ()
  (connect `("localhost" "joe" "joe" "joe")
           :database-type :mysql))

;; In ~/.clisprc.lisp
(asdf:operate 'asdf:load-op 'clsql)
(push "/sw/lib/mysql/"
      CLSQL-SYS:*FOREIGN-LIBRARY-SEARCH-PATHS*)

;; From SLIME prompt, and not in ~/.clisprc.lisp
(in-package #:arxana)
(connect-to-database)
(locally-enable-sql-reader-syntax)
\end{idea}

\begin{notate}{Installing Sphinx}
Here are some tips on how to install and configure
Sphinx.
\end{notate}

\begin{idea}
;; Fedora/Postgresql flavor
yum install postgresql-devel
./configure --without-mysql \
            --with-pgsql \
            --with-pgsql-libs=/usr/lib/pgsql/ \
            --with-pgsql-includes=/usr/include/pgsql

;; Fink/MySQL flavor
./configure --with-mysql \
            --with-mysql-includes=/sw/include/mysql \
            --with-mysql-libs=/sw/lib/mysql
\end{idea}

\begin{notate}{Getting Sphinx set up} \label{sphinx-setup}
Here are some instructions I've used to get Sphinx set
up.
\end{notate}

\begin{notate}{Create a sphinx.conf}
I want a very minimal sphinx.conf; this one seems to work.
(We should probably set this up so that it gets written
to a file when Arxana is set up.)
\end{notate}

\begin{idea}
## Copy this to /usr/local/etc/sphinx.conf when you want
## to use it.

source strings
{
  type = mysql
  sql_host = localhost
  sql_user = joe
  sql_pass = joe
  sql_db = joe
  sql_query = SELECT id, text FROM strings
}

## index definition

index strings
{
  source = strings
  path = /Users/planetmath/sphinx/search-testing
  morphology = none
}

## indexer settings

indexer
{
  mem_limit = 32M
}

## searchd settings

searchd
{
  listen = 3312
  listen = localhost:3307:mysql41
  log = /Users/planetmath/sphinx/searchd.log
  query_log = /Users/planetmath/sphinx/searchd_query.log
  read_timeout = 5
  max_children = 30
  pid_file = /Users/planetmath/sphinx/searchd.pid
  max_matches = 1000
}
\end{idea}

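\begin{notate}{Sketch: writing sphinx.conf at setup time}
The suggestion made earlier, that the configuration be
written to a file when Arxana is set up, might look
roughly like the following in Emacs Lisp. This is only a
sketch: `arxana-sphinx-conf-string' and
`arxana-write-sphinx-conf' are hypothetical names that do
not appear elsewhere in Arxana, and the settings simply
mirror the minimal configuration given in this section.
The target file has to be writable by the user running
Emacs.
\end{notate}

\begin{idea}
;; Sketch only: both function names are hypothetical.
(defun arxana-sphinx-conf-string (user pass db dir)
  "Return the text of a minimal sphinx.conf.
USER, PASS and DB are the MySQL user, password and
database; DIR holds the index, the logs and the pid file."
  (let ((dir (directory-file-name (expand-file-name dir))))
    (format "source strings
{
  type = mysql
  sql_host = localhost
  sql_user = %s
  sql_pass = %s
  sql_db = %s
  sql_query = SELECT id, text FROM strings
}

index strings
{
  source = strings
  path = %s/search-testing
  morphology = none
}

indexer
{
  mem_limit = 32M
}

searchd
{
  listen = 3312
  listen = localhost:3307:mysql41
  log = %s/searchd.log
  query_log = %s/searchd_query.log
  read_timeout = 5
  max_children = 30
  pid_file = %s/searchd.pid
  max_matches = 1000
}
" user pass db dir dir dir dir)))

(defun arxana-write-sphinx-conf (file user pass db dir)
  "Write a minimal sphinx.conf to FILE.
The other arguments are passed on to
`arxana-sphinx-conf-string'."
  (with-temp-file file
    (insert (arxana-sphinx-conf-string user pass db dir))))

;; Example call, using the values from the listing:
;; (arxana-write-sphinx-conf "/usr/local/etc/sphinx.conf"
;;   "joe" "joe" "joe" "/Users/planetmath/sphinx")
\end{idea}
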
\begin{notate}{Working from the command line}
Then you can run commands like these.
\end{notate}

\begin{idea}
/usr/local/bin/indexer strings
/usr/local/bin/search "but, then"

% mysql -h 127.0.0.1 -P 3307
mysql> SELECT * FROM strings WHERE MATCH('but, then');
\end{idea}

\begin{notate}{Integrating this with Lisp}
Since we can talk to Sphinx via the MySQL protocol, it
seems reasonable that we should be able to talk to it
from CLSQL, too. With a little fussing to get the format
right, I found something that works!
\end{notate}

\begin{idea}
(connect `("127.0.0.1" "" "" "" "3307") :database-type :mysql)
(mapcar (lambda (elt) (floor (car elt)))
        (query "select * from strings where match('text')"))
\end{idea}

\begin{notate}{Some added difficulty with Postgresql}
When I try to index things on the server, I get an
error, as below. The question is a good one... I'm
not sure \emph{how} postgresql is set up on the server,
actually...
\end{notate}

\begin{idea}
ERROR: index 'strings': sql_connect: could not connect to server:
Connection refused
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 5432?
\end{idea}

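\begin{notate}{Sketch: is anything listening on that port?}
The error message leaves two possibilities open: either
the server is not running, or it is running but not
accepting TCP/IP connections on port 5432. One quick way
to tell these apart from a process list is to try
opening the port directly. The function below is only a
sketch, and `arxana-port-open-p' is a hypothetical name
that is not part of Arxana. It says nothing about
\emph{why} a port is closed.
\end{notate}

\begin{idea}
;; Sketch only: `arxana-port-open-p' is a hypothetical name.
(defun arxana-port-open-p (host port)
  "Return t if a TCP connection to HOST on PORT succeeds.
Return nil if the connection attempt signals an error."
  (condition-case nil
      (let ((proc (open-network-stream
                   "arxana-port-check" nil host port)))
        ;; We only wanted to know whether it would open.
        (delete-process proc)
        t)
    (error nil)))

;; Example call:
;; (arxana-port-open-p "localhost" 5432)
\end{idea}
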
\section{Appendix: A simple literate programming system} \label{appendix-lit}

\begin{notate}{The literate programming system used in this paper}
This code defines functions that grab all the Lisp
portions of this document, evaluate the Emacs Lisp
sections in Emacs, and save the Common Lisp sections in
suitable files.\footnote{Cf. {\tt
http://mmm-mode.sourceforge.net/}} It requires
that the \LaTeX\ be written in a certain consistent way:
Emacs Lisp goes in {\tt elisp} environments, Common Lisp
goes in {\tt common} environments whose argument names
the target file, and the environment delimiters start at
the beginning of a line. The function assumes that this
document is the current buffer.

\begin{verbatim}
(defvar lit-code-beginning-regexp
  "^\\\\begin{elisp}\\|^\\\\begin{common}{\\([^}\n]*\\)}")

(defvar lit-code-end-regexp
  "^\\\\end{elisp}\\|^\\\\end{common}")

(defun lit-process ()
  (interactive)
  (save-excursion
    (let ((to-buffer "*Lit Code*")
          (from-buffer (buffer-name (current-buffer)))
          (start-buffers (buffer-list)))
      (set-buffer (get-buffer-create to-buffer))
      (erase-buffer)
      (set-buffer (get-buffer-create from-buffer))
      (goto-char (point-min))
      ;; Collect each code chunk into a buffer: Emacs Lisp
      ;; into "*Lit Code*", Common Lisp into one buffer per
      ;; target file.
      (while (re-search-forward
              lit-code-beginning-regexp nil t)
        (let* ((file (match-string 1))
               (beg (match-end 0))
               (end (save-excursion
                      (search-forward-regexp
                       lit-code-end-regexp nil t)
                      (match-beginning 0)))
               (match (buffer-substring-no-properties
                       beg end)))
          (let ((to-buffer
                 (if file
                     (concat "*Lit Code*: " file)
                   "*Lit Code*")))
            (save-excursion
              (set-buffer (get-buffer-create
                           to-buffer))
              (insert match)))))
      ;; Evaluate the Emacs Lisp buffer and write the others
      ;; to files.  `set-difference' comes from the cl
      ;; package.
      (dolist
          (buffer (set-difference (buffer-list)
                                  start-buffers))
        (save-excursion
          (set-buffer buffer)
          (if (string= (buffer-name buffer)
                       "*Lit Code*")
              (eval-buffer)
            ;; 12 is the length of "*Lit Code*: ".
            (write-region (point-min)
                          (point-max)
                          (concat "~/arxana/"
                                  (substring
                                   (buffer-name
                                    buffer)
                                   12)))))
        (kill-buffer buffer)))))
\end{verbatim}
\end{notate}

\begin{notate}{Emacs-export?}
It wouldn't be hard to export the Elisp sections so
that those who wanted to could ditch the literate
wrapper.
\end{notate}

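\begin{notate}{Sketch: exporting the Elisp sections}
To give an idea of what such an export might involve,
here is a minimal sketch. It assumes the same convention
as `lit-process', namely that Emacs Lisp lives in {\tt
elisp} environments whose delimiters start at the
beginning of a line. The name `lit-export-elisp' is
hypothetical; nothing else in this document defines or
calls it. It is placed in an {\tt idea} environment so
that `lit-process' does not evaluate it.
\end{notate}

\begin{idea}
;; Sketch only: `lit-export-elisp' is a hypothetical name.
(defun lit-export-elisp (file)
  "Write every elisp environment in this buffer to FILE.
The chunks are written in document order."
  (interactive "FExport Emacs Lisp to file: ")
  (let ((chunks nil))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "^\\\\begin{elisp}" nil t)
        (let ((beg (match-end 0)))
          ;; Skip an environment that is never closed.
          (when (re-search-forward "^\\\\end{elisp}" nil t)
            (push (buffer-substring-no-properties
                   beg (match-beginning 0))
                  chunks)))))
    (with-temp-file file
      (dolist (chunk (nreverse chunks))
        (insert chunk)))))

;; Example call, with this document as the current buffer:
;; (lit-export-elisp "~/arxana/arxana.el")
\end{idea}
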
\begin{notate}{Bidirectional updating}
Eventually it would be nice to have a code repository set
up, and make it so that changes to the code can get
snarfed up here.
\end{notate}

\begin{notate}{A literate style}
Ideally, each function will have its own Note to introduce
it, and will not be called before it has been defined. I
sometimes make exceptions to this rule: for example,
functions used to form recursions may appear with no
further introduction, and may be called before they are
defined.
\end{notate}

\section{Appendix: Hypertext platforms} \label{appendix-hyper}

\begin{notate}{The hypertextual canon} \label{canon}
There is a core library of texts that come up in
discussions of hypertext.
\begin{itemize}
% \item (Plato)
\item The Rosetta stone
\item The Talmud (Judah haNasi, Rav Ashi, and many others)
\item Monadology (Wilhelm Leibniz)
\item The Life and Opinions of Tristram Shandy, Gentleman
  (Laurence Sterne)
\item Middlemarch (George Eliot)
% \item The Gay Science (Friedrich Nietzsche)
% \item (Wittgenstein)
% \item (Alan Turing)
\item The Nova Trilogy (William S. Burroughs)
\item The Logic of Sense (Gilles Deleuze)
% \item Open Creation and its Enemies (Asger Jorn)
\item Labyrinths (Jorge Luis Borges)
\item Literary Machines (Ted Nelson)
% \item Simulacra and Simulation (Jean Baudrillard)
\item Lila (Robert M. Pirsig)
% \item \TeX: the program (Donald Knuth)
\item Dirk Gently's Holistic Detective Agency
  (Douglas Adams)
\item Pussy, King of the Pirates (Kathy Acker)
% \item Rachel Blau DuPlessis,
% \item Emily Dickinson
% \item Gertrude Stein
% \item Zora Neale Hurston
\end{itemize}
At the same time, it is somewhat ironic that none of the
items on this list are themselves hypertexts in the
contemporary sense of the word. It's also a bit funny
that certain other works (even some by the same authors)
aren't on this list. Perhaps we begin to get a sense of
what's going on in this quote from Kathleen
Burnett:\footnote{{\tt http://www.iath.virginia.edu/pmc/text-only/issue.193/burnett.193}}
\begin{quote}
``Multiplicity, as a hypertextual principle, recognizes a
multiplicity of relationships beyond the canonical
(hierarchical). Thus, the traditional concept of
literary authorship comes under attack from two
quarters--as connectivity blurs the boundary between
author and reader, multiplicity problematizes the
hierarchy that is canonicity.''
\end{quote}
It seems quite telling that non-hypertextual canons remain
mostly-non-hypertextual even today, despite the existence
of catalogs, indexes, and online access.\footnote{{\tt
http://www.gutenberg.org/wiki/Category:Bookshelf}}
\end{notate}

\begin{notate}{A geek's guide to literature}
This title is a riff on Slavoj \v{Z}i\v{z}ek's ``The
Pervert's Guide to Cinema''. Taking Note \ref{canon} as a
jumping-off point, why don't we make a survey of
historical texts from the point of view of an aficionado
of hypertext! Just what does one have to do to ``get on
the list''? Just what is ``the hypertextual
perspective''? And, if \v{Z}i\v{z}ek is correct and we're
to look for the hyperreal in the world of cinematic
fictions -- what's left over for the world of literature?
(Or mathematics?)
\end{notate}

\begin{notate}{The number 3}
This is the number of things present if we count carefully
the items $A$, $B$, and a connection $C$ between them.
[Picture of $A\xrightarrow{C} B$.]

(Or even: given $A$ and $B$, we use Wittgenstein counting,
and \emph{intuit} that $C$ exists as the collection $\{A,
B\}$; after all,
some connection must exist precisely because we were
presented with $A$ and $B$ together -- and lest the
connections proliferate infinitely, we lump them all
together as one. [Picture of $A$, $B$,
with the \emph{frame} labeled $C$.])
\end{notate}

\begin{notate}{Surfaces}
Deleuze talks about a theory of surfaces associated with
verbs and events. His surfaces represent the evanescence
of events in time, and of their descriptions in language.
An event is seen as a vanishingly-thin boundary between
one state of being and another.

Certainly, a statement that is true \emph{now} may not be
true five minutes from now. It is easier to think and
talk about things that are coming up and things that have
already happened. ``Living in the moment'' is regarded as
special or even ``Zen''.

We can begin to put these musings on a more solid
mathematical basis. We first examine two types of
\emph{interfaces}:
\begin{enumerate}
\item $A\xrightarrow{C} B$, $A\xrightarrow{D} B$,
  $A\xrightarrow{E} B$
(the interface of $A$ and $B$ across $C$, $D$, and $E$);
\item $A\xrightarrow{C} B$, $D\xrightarrow{C} E$,
  $F\xrightarrow{C} G$
(the interface of various terms across $C$).
\end{enumerate}
\end{notate}

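\begin{notate}{Sketch: the two interfaces as queries}
One way to make the two kinds of interface concrete is to
treat each arrow as a (source link target) list and to ask
two different questions of a collection of arrows. The
sketch below is illustrative only: `demo-arrows',
`demo-links-between' and `demo-pairs-across' are
hypothetical names, and the list-of-lists representation
is an assumption made for the example, not Arxana's
storage format. Note that, as in the enumeration, $D$
serves as a link in one arrow and as a term in another.
\end{notate}

\begin{idea}
;; Sketch only: every `demo-' name is hypothetical.
(defvar demo-arrows
  '((A C B) (A D B) (A E B) (D C E) (F C G))
  "Arrows written as (SOURCE LINK TARGET).")

(defun demo-links-between (a b arrows)
  "Return the links of all ARROWS that run from A to B.
This is the interface of A and B across several links."
  (let (result)
    (dolist (arrow arrows (nreverse result))
      (when (and (eq (nth 0 arrow) a)
                 (eq (nth 2 arrow) b))
        (push (nth 1 arrow) result)))))

(defun demo-pairs-across (c arrows)
  "Return (SOURCE . TARGET) for each of ARROWS with link C.
This is the interface of various terms across one link."
  (let (result)
    (dolist (arrow arrows (nreverse result))
      (when (eq (nth 1 arrow) c)
        (push (cons (nth 0 arrow) (nth 2 arrow))
              result)))))

;; (demo-links-between 'A 'B demo-arrows)
;;   => (C D E)
;; (demo-pairs-across 'C demo-arrows)
;;   => ((A . B) (D . E) (F . G))
\end{idea}
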
\begin{notate}{Comic books}
No geek's guide to literature would be complete without
putting comics in a hallowed place. [Framed picture of
$A$, $B$ next to framed
picture of $A$, $B$, $a$.] What happened?
$\ddot{\smile}$
\end{notate}

\begin{notate}{Intersecting triples}
Diagrammatically, it is tempting to portray
$(ACB)_{\mathrm{mid}}DE$ as if it were closely related to
$A(CDE)_{\mathrm{beg}}B$, despite the fact that they are
notationally very different. I'll have to think more
about what this means.
\end{notate}

\section{Appendix: Computational Linguistics} \label{appendix-linguistics}

\begin{notate}{What is this?}
It might be reasonable to make annotating sentences part
of our writeup on hypertext platforms -- but I'm putting
it here for now. If hypertext is what deals with language
artifacts on the ``bulky'' level (saying, for example,
that a subsection is part of a section, and so on), then
computational linguistics is what deals with the finer
levels. However, the distinction is in some ways
arbitrary, and many of the techniques should be at least
vaguely similar.
\end{notate}

\begin{notate}{Annotation sensibilities}\label{sense}
We will want to be able to make at least two different
kinds of annotations of verbs. For example, given the
statement
\begin{itemize}
\item[$S$.] (``Who'' ``is on'' ``first''),
\end{itemize}
I'd like to be able to say
\begin{itemize}
\item[I.] (``is on'' ``means'' ``the position of a base runner in baseball'').
\end{itemize}
However, I'd also like to be able to say
\begin{itemize}
\item[II.] (``is on'' ``because'' ``he was walked'').
\end{itemize}
Annotation I is meant to apply to the term ``is on''
itself (in a context that might be more general than just
this one sentence). If Who is also on steroids, that's
another matter -- as this type of annotation helps make
clear!

Annotation II is meant to apply to the term ``is on''
\emph{as it
appears in sentence $S$}. In particular, Annotation II
seems to work best in a context in which we've already
accepted the ontological status of the verb-phrase ``is
on first''.

Whereas Annotation I should presumably exist before
statement $S$ is ever made (and it certainly helps make
that statement make sense), Annotation II is most properly
understood with reference to the fully-formed statement
$S$. However, Annotation II is different from a statement
like ($S$ ``has truth value'' $F$) in that it looks into
the guts of $S$.
\end{notate}

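\begin{notate}{Sketch: global and local annotations as data}
The difference between Annotations I and II can be made
concrete by looking at what stands in the subject slot of
each triple. In the sketch below, Annotation I has the
bare term as its subject, while Annotation II has a
``place'', that is, a statement together with a position
inside it. Everything here is an assumption made for
illustration: the `demo-' names are hypothetical, and
tagging a place with the symbol `place' is just one
possible encoding, not the one Arxana's backend uses.
\end{notate}

\begin{idea}
;; Sketch only: every `demo-' name is hypothetical.

;; Statement S as a plain triple.
(defvar demo-S '("Who" "is on" "first"))

;; Annotation I is about the bare term "is on".
(defvar demo-annotation-I
  '("is on" "means"
    "the position of a base runner in baseball"))

;; Annotation II is about a place: position 1 (the middle
;; slot, counting from 0) of statement S.
(defvar demo-annotation-II
  (list (list 'place demo-S 1) "because" "he was walked"))

(defun demo-local-p (annotation)
  "Return non-nil if ANNOTATION is about a place."
  (let ((subject (car annotation)))
    (and (consp subject) (eq (car subject) 'place))))

(defun demo-subject-term (annotation)
  "Return the term ANNOTATION is about, global or local."
  (let ((subject (car annotation)))
    (if (demo-local-p annotation)
        ;; Look into the guts of the statement.
        (nth (nth 2 subject) (nth 1 subject))
      subject)))

;; (demo-local-p demo-annotation-I)        => nil
;; (demo-local-p demo-annotation-II)       => t
;; (demo-subject-term demo-annotation-II)  => "is on"
\end{idea}
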
\begin{notate}{Comparison of places and ontological status} \label{places-and-onto-status}
The difference between (I) a ``global'' annotation, and
(II) the annotation of a specific sentence is analogous to
the difference between (a) relationships between objects
without a place, and (b) relationships between objects in
specific places. (Cf. Note \ref{sense}: ``global''
statements are of course made ``local'' by the theories
that scope them.)

For example, in a descriptive ontology of research
documents, I might make the ``placeless'' statement,
\begin{itemize}
\item[a.] (``Introduction'' ``names'' ``a section'')
\end{itemize}
On the other hand, the statement
\begin{itemize}
\item[b.] (``Introduction'' ``has subject'' ``American
History''),
\end{itemize}
seems likely to be about a specific Introduction. (And
somewhere in the backend, this triple should be expressed
in terms of places!)
\end{notate}

\begin{notate}{Semantics}
In a sentence like
\begin{quote}
(((``I'' ``saw'' ``myself'')$_{\mathrm{mid}}$ ``as if''
``through a glass'')$_{\mathrm{beg}}$ ``but'' ``darkly'')
\end{quote}
first of all, there may be different parenthesizations,
and second of all, the semantics of links like ``as if''
and ``but'' may shape, to some extent, the ways in
which we parenthesize.
\end{notate}

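\begin{notate}{Sketch: two parenthesizations of one sentence}
To illustrate the first point, here are two nested-list
readings of the same sentence, together with a helper that
recovers the words in reading order. The sketch ignores
the ``mid'' and ``beg'' subscripts entirely, so it only
shows that distinct parenthesizations can share one
surface string. The second parenthesization is my own
example, and all of the `demo-' names are hypothetical.
\end{notate}

\begin{idea}
;; Sketch only: every `demo-' name is hypothetical.
(defvar demo-reading-1
  '((("I" "saw" "myself") "as if" "through a glass")
    "but" "darkly")
  "The parenthesization shown in the quotation.")

(defvar demo-reading-2
  '(("I" "saw" "myself")
    "as if"
    ("through a glass" "but" "darkly"))
  "An alternative parenthesization, for comparison.")

(defun demo-flatten (tree)
  "Return the strings in TREE, a nested list, in order."
  (cond ((null tree) nil)
        ((consp tree)
         (append (demo-flatten (car tree))
                 (demo-flatten (cdr tree))))
        (t (list tree))))

;; Both readings flatten to the same seven items:
;; (equal (demo-flatten demo-reading-1)
;;        (demo-flatten demo-reading-2))
;;   => t
\end{idea}
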
\section{Appendix: Resource use} \label{appendix-resources}

\begin{notate}{Free culture in action}
I thought it worthwhile to include this quote from
a joint paper with Aaron Krowne:\footnote{See Footnote
 \ref{corneli-krowne}.}
\begin{quote}
``[F]ree content typically
manifests aspects of a common resource as well as an
open access resource; while anyone can do essentially
whatever they wish with the content offline, in its
online life, the content is managed in a
socially-mediated way. In particular, rights to
\emph{in situ} modification tend to be strictly
controlled. [...] By finding new ways to support
freedom of speech within CBPP documents, we embrace
subjectivity as a way to enhance the content of an
intersubjectively valued corpus. In the context of
`hackable' media and maintenance protocols, the
semantics with which scholia are handled can be improved
upon indefinitely on a user-by-user basis and a
resource-wide basis. This is free culture in action.''
\end{quote}
\end{notate}

\begin{notate}{Learning}
The learner, confronted with a learning resource, or the
consumer of any other information resource (or indeed,
practically any resource whatsoever) may want a chance to
respond to the questions ``was this what you were looking
for?'' and ``did you find this helpful?''. In some cases,
an independent answer to those questions could be
generated (e.g. if a student is seen to come up with a
correct answer, or not).
\end{notate}

\begin{notate}{Connections}
A useful communication goal is to expose some of the
connections between disparate resources. Some existing
connections may be far more explicit than others. It's
important to facilitate the making and explicating of
connections by ``third parties'' (Note
\ref{browser-within}). The search for connections between
ostensibly unrelated things is a key part of both
creativity and learning. In addition, connecting with
what others are doing is an important part of being a
social animal.
\end{notate}

\begin{notate}{Boundaries}
Notice that the departmentalization of knowledge is
similar to any regime that oversees and administers
boundaries. In addition to bridging different areas,
learning often involves pushing one's boundaries and
getting out of one's comfort zone. The ``sociological
imagination'' involves seeing oneself as part of something
bigger; this goes along with the idea of a discourse that
lowers or transcends the boundaries between participants.
Imagination of any form can challenge myopic patterns of
resource use, although there are also myopic fictions
which neglect to look at what's going on in reality!
\end{notate}

\end{document}