1 %;; arxana.tex -*- mode: Emacs-Lisp; -*-
2 %;; Copyright (C) 2005-2009 Joe Corneli <holtzermann17@gmail.com>
4 %;; This program is free software: you can redistribute it and/or modify
5 %;; it under the terms of the GNU Affero General Public License as published by
6 %;; the Free Software Foundation, either version 3 of the License, or
7 %;; (at your option) any later version.
8 %;;
9 %;; This program is distributed in the hope that it will be useful,
10 %;; but WITHOUT ANY WARRANTY; without even the implied warranty of
11 %;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 %;; GNU Affero General Public License for more details.
13 %;;
14 %;; You should have received a copy of the GNU Affero General Public License
15 %;; along with this program. If not, see <http://www.gnu.org/licenses/>.
17 % (progn
18 % (find-file "~/arxana.tex")
19 % (save-excursion
20 % (goto-char (point-max))
21 % (let ((beg (progn (search-backward "\\begin{verbatim}")
22 % (match-end 0)))
23 % (end (progn (search-forward "\\end{verbatim}")
24 % (match-beginning 0))))
25 % (eval-region beg end)
26 % (lit-process))))
28 %%% Commentary:
30 %% To load: remove %'s above and evaluate with C-x C-e.
32 %% Alternatively, run this:
33 % head -n 13 arxana.tex | sed -e "/%/s///" > arxana-loader.el
34 %% on the command line to produce something you can use
35 %% to load Arxana when you start Emacs:
36 % emacs -l arxana-loader.el
%% Or put the expression in your ~/.emacs (perhaps wrapped
%% in a function like `eval-arxana').
41 %% Or search for a similar form below and evaluate there!
43 %% Q. Where exactly are we supposed to store the most
44 %% up-to-date Arxana files when they are ready to go?
46 %% A. Copy them into /usr/lib/sbcl/site-systems/arxana/
47 %% and that should be enough. Make sure that arxana.asd
48 %% is in that directory and that you have a symbolic link,
49 %% made via
51 %% ln -s ./arxana/arxana.asd .
53 %% in the directory /usr/lib/sbcl/site-systems/
54 %% -- Make sure to load once as root to generate new fasls.
56 %% Q. How to run the remote slime after that?
58 %% A. Make sure that Emacs `slime-protocol-version' matches
59 %% Common Lisp's `swank::*swank-wire-protocol-version*', then,
60 %% like this:
62 %% ssh -L 4005:127.0.0.1:4005 joe@li23-125.members.linode.com
63 %% linode$ sbcl
64 %% M-x slime-connect RET RET
66 %%% Code:
68 \documentclass{article}
70 \usepackage{amsmath}
71 \usepackage{amsthm}
72 \usepackage{verbatim}
74 \newcommand{\meta}[1]{$\langle${\it #1}$\rangle$}
76 \theoremstyle{definition}
77 \newtheorem{nota}{Note}[section]
79 \parindent = 1.2em
81 \newenvironment{notate}[1]
82 {\begin{nota}[{\bf {\em #1}}]}%
83 {\end{nota}}
85 \makeatletter
86 \newenvironment{elisp}
87 {\let\ORGverbatim@font\verbatim@font
88 \def\verbatim@font{\ttfamily\scshape}%
89 \verbatim}
90 {\endverbatim
91 \let\verbatim@font\ORGverbatim@font}
92 \makeatother
94 \makeatletter
95 \newenvironment{common}[1]
96 {\let\ORGverbatim@font\verbatim@font
97 \def\verbatim@font{\ttfamily\scshape}%
98 \verbatim}
99 {\endverbatim
100 \let\verbatim@font\ORGverbatim@font}
101 \makeatother
103 \makeatletter
104 \newenvironment{idea}
105 {\let\ORGverbatim@font\verbatim@font
106 \def\verbatim@font{\ttfamily\slshape}%
107 \verbatim}
108 {\endverbatim
109 \let\verbatim@font\ORGverbatim@font}
110 \makeatother
112 \begin{document}
114 \title{\emph{Arxana}}
116 \author{Joseph Corneli\thanks{Copyright (C) 2005-2010
117 Joseph Corneli {\tt <holtzermann17@gmail.com>}\newline
118 $\longrightarrow$ transferred to the public domain.}}
119 \date{Last revised: \today}
121 \maketitle
123 \abstract{A tool for building hackable semantic hypertext
124 platforms. Source code and mailing lists are at {\tt
125 http://common-lisp.net/project/arxana}.}
127 \tableofcontents
129 \section{Introduction}
131 \begin{notate}{What is ``Arxana''?} \label{arxana}
132 \emph{Arxana} is the name of a ``next generation''
133 hypertext system that emphasizes annotation. Every object
134 in this system is annotatable. Because of this, I
135 sometimes call Arxana's core ``the scholium system'', but
136 the name ``Arxana'' better reflects our aim: to explore
137 the mysterious world of links, attachments,
138 correspondences, and side-effects.
139 \end{notate}
141 \begin{notate}{The idea} \label{theoretical-context}
142 A scholia-based document model for commons-based peer
143 production will inform the development of our
system.\footnote{{\tt
http://www.metascholar.org/events/2005/freeculture/viewabstract.php?id=19}
% alternate:
% http://br.endernet.org/~akrowne/planetmath/papers/corneli\_fcdl/corneli-krowne.pdf
\label{corneli-krowne}}

150 In this model, texts are made up of smaller texts until
151 you get to atomic texts; user actions are built in the
152 same way. Multiple users should interact with a shared
153 persistent data-store, through functional annotation, not
154 destructive modification. We should pursue the
155 asynchronous interaction model until we arrive at live,
156 synchronous, settings, where we facilitate real-time
157 computer-mediated interactions between users, and between
158 users and running hackable programs.
159 \end{notate}
161 \begin{notate}{The data model} \label{data-model}
162 Start by storing a collection of \emph{strings}. Now add
163 in \emph{pairs} and \emph{triples} which point at 2 and 3
164 objects respectively. (We can extend to n-tuples if that
165 turns out to be convenient.) Finally, we will maintain a
166 collection of \emph{lists}, each of which points at an
167 unlimited number of objects.
168 \end{notate}
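\begin{notate}{Sketch: the data model in memory}
The four collections of Note \ref{data-model} can be
pictured with a minimal in-memory sketch. This is an
illustration only and not part of Arxana: the names
`make-collection', `store', `store-add' and `store-demo'
are hypothetical, and pointers are written here as
(kind id) lists with keyword kinds purely for readability.
\end{notate}

\begin{idea}
;; Hypothetical illustration; none of these names exist in Arxana.
(defun make-collection ()
  (make-array 0 :adjustable t :fill-pointer t))

(defstruct store
  (strings (make-collection))   ; atomic texts
  (pairs   (make-collection))   ; each entry points at 2 objects
  (triples (make-collection))   ; each entry points at 3 objects
  (lists   (make-collection)))  ; each entry points at any number

(defun store-add (collection item)
  "Append ITEM to COLLECTION and return its 1-based id."
  (vector-push-extend item collection)
  (length collection))

(defun store-demo ()
  "Build one triple over three strings, then a list pointing at both kinds."
  (let* ((s (make-store))
         (a (list :string (store-add (store-strings s) "Arxana")))
         (b (list :string (store-add (store-strings s) "is a")))
         (c (list :string (store-add (store-strings s) "hypertext system")))
         (tr (list :triple (store-add (store-triples s) (list a b c)))))
    ;; A list may point at strings and triples alike.
    (store-add (store-lists s) (list a tr))
    tr))
\end{idea}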
170 \begin{notate}{History}
171 Thinking about how to improve existing systems for
172 peer-based collaboration in 2004, I designed a simple
173 version of the scholium system that treated textual
174 commentary and markup as scholia.\footnote{{\tt
175 http://wiki.planetmath.org/AsteroidMeta/old\_draft\_of\_scholium\_system}}
176 In 2006, I put together a single-user version of this
177 system that ran exclusively under Emacs.\footnote{{\tt
178 http://metameso.org/files/sbdm4cbpp.tex} \label{old-version}}
179 The current system is an almost-completely rewritten
180 variant, bringing in a shared database and various other
181 enhancements to support multi-user interaction.
182 \end{notate}
184 \begin{notate}{A brisk review of the programming literature} \label{prog-lit-review}
185 Many years before I started working on this project, there
186 was something called the Emacs HyperText
187 System.\footnote{{\tt
188 http://www.aue.aau.dk/\~{}kock/Publications/HyperBase/}}
What we're doing here updates that idea with modern
database methods, uses a more interesting data storage
format, and also considers multiple front-ends to the same
database (for example, a web interface).

194 Contemporary Emacs-based hypertext creation systems
195 include Muse and Emacs Wiki.\footnote{{\tt
196 http://mwolson.org/projects/EmacsMuse.html}}$^,$\footnote{{\tt
197 http://mwolson.org/projects/EmacsWiki.html}} The
198 browsing side features old standbys, Info and
199 Emacs/w3m\footnote{Not to be confused with Emacs-w3m,
200 which is not entirely ``Emacs-based''.}. These packages
provide ways to author or view what we should now
call ``traditional'' hypertext documents.

Another legacy tool worth mentioning is
205 HyperCard\footnote{{\tt
206 http://en.wikipedia.org/wiki/HyperCard}}. This system
207 was oriented around the idea of using hypertext to create
208 software, a vision we share, but like just about everyone
209 else working in the field at the time, it used
210 uni-directional links.
212 Hypertext \emph{nouveau} is based on semantic triples.
213 The Semantic Web standard provides one specification of
214 the features we can expect from triples.\footnote{{\tt
215 http://www.w3.org/TR/2004/REC-rdf-primer-20040210/}}
216 Triples provide a framework for knowledge representation
217 with more depth and flexibility than the popular
218 ``tagging'' methodology. For example, suitable
219 collections of triples implement AI-style ``frames''. The
220 idea of using triples to organize archival material is
221 generating some interest as Semantic Web ideas
222 spread.\footnote{Cf. recent museum and library
223 conferences}$^,$\footnote{Even among academic computer
224 scientists! (Josh Grochow, p.c.)}
226 An abstractly similar project to Arxana with some grand
227 goals is being developed by Chris Hanson at MIT under the
228 name ``Web-scale Environments for Deduction
229 Systems''.\footnote{{\tt
230 http://publications.csail.mit.edu/abstracts/abstracts07/cph2/cph2.html}}
Another technically similar project is Freebase, a
hand-rolled database of open content, organized on
frame-based, triple-driven principles. The developer of the Freebase
235 graphd database has some interesting things to say about
236 old and new ways of handling triples.\footnote{{\tt
237 http://blog.freebase.com/2008/04/09/a-brief-tour-of-graphd/}}
238 \end{notate}
240 \begin{notate}{Fitting in}
241 My current development goal is to use this system to
242 create a more flexible multiuser interaction platform than
243 those currently available to web-based collaborative
244 projects (such as PlanetMath\footnote{{\tt
245 http://planetmath.org}}). As an intermediate stage,
246 I'm using Arxana to help organize material for a book I'm
247 writing. Arxana's theoretical generality, active
248 development status, detailed documentation, and
249 superlatively liberal terms of use may make it an
250 attractive option for you to try as well!
251 \end{notate}
253 \begin{notate}{What you get}
254 Arxana has an Emacs frontend, a Common Lisp middle-end,
255 and a SQL backend. If you want to do some work, any one
256 of these components can be swapped out and replaced with
257 the engine of your choice. I've released all of the
258 implementation work on this system into the public domain,
259 and it runs on an entirely free/libre/open source software
260 platform.
261 \end{notate}
263 \begin{notate}{Acknowledgements}
264 Ted Nelson's ``Literary Machines'' and Marvin Minsky's
265 ``Society of Mind'' are cornerstones in the historical and
266 social contextualization of this work. Alfred Korzybski's
267 ``Science and Sanity'' and Gilles Deleuze's ``The Logic of
268 Sense'' provided grounding and encouragement. \TeX\ and
269 GNU Emacs have been useful not just in prototyping this
270 system, but also as exemplary projects in the genre I'm
271 aiming for. John McCarthy's Elephant 2000 was an
272 inspiring thing to look at and think about\footnote{{\tt
273 http://www-formal.stanford.edu/jmc/elephant/elephant.html}}, and of course Lisp has been a vital ingredient.
275 Thanks also to everyone who's talked about this project
276 with me!
277 \end{notate}
279 \section{Using the program}
281 \begin{notate}{Dependencies} \label{dependencies}
282 Our interface is embedded in Emacs. Backend processing is
283 done with Common Lisp. We are currently using the
284 PostgreSQL database. These packages should be available
285 to you through the usual channels. (I've been using SBCL,
286 but any Lisp should do; please make sure you are using a
287 contemporary Emacs version.)
289 We will connect Emacs to Lisp via Slime\footnote{{\tt
290 http://common-lisp.net/project/slime/}}, and Lisp to
291 PostgreSQL via CLSQL.\footnote{{\tt http://clsql.b9.com/}}
292 CLSQL also talks directly to the Sphinx search engine,
293 which we use for text-based search.\footnote{{\tt
294 http://www.sphinxsearch.com/}} Once all of these
295 things are installed and working together, you should be
296 able to begin to use Arxana.
298 Setting up all of these packages can be a somewhat
299 time-consuming and confusing task, especially if you
300 haven't done it before! See Appendix \ref{appendix-setup}
301 for help.
302 \end{notate}
304 \begin{notate}{Export code and set up the interface}
305 If you are looking at the source version of this document
306 in Emacs, evaluate the following s-expression (type
307 \emph{C-x C-e} with the cursor positioned just after its
308 final parenthesis). This exports the Common Lisp
309 components of the program to suitable files for subsequent
310 use, and prepares the Emacs environment. (The code that
311 does this is in Appendix \ref{appendix-lit}.)
312 \end{notate}
\begin{idea}
(save-excursion
  (let ((beg (search-forward "\\begin{verbatim}"))
        (end (progn (search-forward "\\end{verbatim}")
                    (match-beginning 0))))
    (eval-region beg end)
    (lit-process)))
\end{idea}
323 \begin{notate}{To load Common Lisp components at run-time} \label{load-at-runtime}
324 Link {\tt arxana.asd} somewhere where Lisp can find it.
325 Then run commands like these in your Lisp; if you like,
326 you can place all of this stuff in your config file to
327 automatically load Arxana when Lisp starts. The final
328 form is only necessary if you plan to use CLSQL's special
329 syntax on the Lisp command-line.
330 \end{notate}
\begin{idea}
(asdf:operate 'asdf:load-op 'clsql)
(asdf:operate 'asdf:load-op 'arxana)
(in-package arxana)
(connect-to-database)
(locally-enable-sql-reader-syntax)
\end{idea}
340 \begin{notate}{To connect Emacs to Lisp}
341 Either run {\tt M-x slime RET} to start and connect to
342 Lisp locally, or {\tt M-x slime-connect RET RET} after you
343 have opened a remote connection to your remote server with
344 a command like this: {\tt ssh -L 4005:127.0.0.1:4005
345 <username>@<host>} and started Lisp and the Swank server
346 on the remote machine. To have Swank start automatically
347 when you start Lisp, put commands like this in your config
348 file.
349 \end{notate}
\begin{idea}
(asdf:operate 'asdf:load-op 'swank)
(setf swank:*use-dedicated-output-stream* nil)
(setf swank:*communication-style* :fd-handler)
(swank:create-server :dont-close t)
\end{idea}
358 \begin{notate}{To define database structures}
359 If you haven't yet defined the basic database structures,
360 make sure to load them now! (Using {\tt tabledefs.lisp},
361 or the SQL code in Section \ref{sql-code})
362 \end{notate}
364 \begin{notate}{Importing this document into system}
365 You can browse this document inside Arxana: after loading
366 the code, run \emph{M-x autoimport-arxana}.
367 \end{notate}
369 \section{SQL tables} \label{sql-code}
371 \begin{notate}{Objects and codes} \label{objects-and-codes}
372 Every object in the system is identified by an ordered
373 pair: a \emph{code} and a \emph{reference}. The codes say
374 which table contains the indicated object, and references
provide that object's id. To identify a specific element of a
list or n-tuple, a third number, that element's \emph{offset},
is required. The codes are as follows:
379 \begin{center}
380 \begin{tabular}{|l|l|}
381 \hline
382 0 & list \\ \hline
383 1 & string \\ \hline
384 2 & pair \\ \hline
385 3 & triple \\ \hline
386 \end{tabular}
387 \end{center}
388 \end{notate}
\begin{idea}
CREATE TABLE strings (
  id SERIAL PRIMARY KEY,
  text TEXT NOT NULL UNIQUE
);

CREATE TABLE pairs (
  id SERIAL PRIMARY KEY,
  code1 INT NOT NULL,
  ref1 INT NOT NULL,
  code2 INT NOT NULL,
  ref2 INT NOT NULL,
  UNIQUE (code1, ref1,
          code2, ref2)
);

CREATE TABLE triples (
  id SERIAL PRIMARY KEY,
  code1 INT NOT NULL,
  ref1 INT NOT NULL,
  code2 INT NOT NULL,
  ref2 INT NOT NULL,
  code3 INT NOT NULL,
  ref3 INT NOT NULL,
  UNIQUE (code1, ref1,
          code2, ref2,
          code3, ref3)
);
\end{idea}
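\begin{notate}{Sketch: coordinates as Lisp data}
The (code, reference) addressing scheme of Note
\ref{objects-and-codes} can be written down as a pair of
small helpers. This is a sketch under the assumption that
coordinates are plain lists; the names `*code-kinds*',
`make-coordinates' and `coordinates-kind' are hypothetical
and do not appear elsewhere in Arxana.
\end{notate}

\begin{idea}
;; Hypothetical helpers; the names are not part of Arxana.
(defparameter *code-kinds* #("list" "string" "pair" "triple")
  "Kind of object for each code: 0 list, 1 string, 2 pair, 3 triple.")

(defun make-coordinates (code ref &optional offset)
  "Return (CODE REF), or (CODE REF OFFSET) for an element of a
list or n-tuple."
  (check-type code (integer 0 3))
  (check-type ref (integer 1))
  (if offset
      (list code ref offset)
      (list code ref)))

(defun coordinates-kind (coordinates)
  "Return the kind of object that COORDINATES address."
  (aref *code-kinds* (first coordinates)))
\end{idea}

\begin{notate}{Using the coordinate helpers}
With these definitions, {\tt (make-coordinates 3 7)}
addresses the triple whose id is 7, and {\tt
(make-coordinates 0 2 5)} addresses the element at
offset 5 of list 2.
\end{notate}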
420 \begin{notate}{A list of lists}\label{models-of-theories}
421 As a central place to manage our collections, we first
422 create a list of lists. The `heading' is the list's name,
423 and its `header' is metadata.
424 \end{notate}
\begin{idea}
CREATE TABLE lists (
  id SERIAL PRIMARY KEY,
  heading INT UNIQUE REFERENCES strings(id),
  header INT REFERENCES strings(id)
);
\end{idea}
434 \begin{notate}{Lists on demand}\label{models-of-theories}
435 Whenever we want to create a new list, we first add to the
436 `lists' table, and then create a new table ``listk''
437 (where k is equal to the new maximum id on `lists').
438 \end{notate}
\begin{idea}
CREATE TABLE listk (
  offset SERIAL PRIMARY KEY,
  code INT NOT NULL,
  ref INT NOT NULL
);
\end{idea}
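\begin{notate}{Sketch: generating the table for list $k$}
Since the name ``listk'' depends on the id of the new
list, the statement above has to be generated at run time.
A minimal sketch, assuming the naming scheme described in
the preceding note; `list-table-name' and `list-table-ddl'
are hypothetical names, and the functions only build
strings (they do not talk to the database).
\end{notate}

\begin{idea}
;; Hypothetical helpers; they only build strings.
(defun list-table-name (k)
  "Return the name of the table that holds the elements of list K."
  (check-type k (integer 1))
  (format nil "list~D" k))

(defun list-table-ddl (k)
  "Return the CREATE TABLE statement for the table backing list K."
  (format nil "CREATE TABLE ~A (~{~%  ~A~^,~}~%);"
          (list-table-name k)
          '("offset SERIAL PRIMARY KEY"
            "code INT NOT NULL"
            "ref INT NOT NULL")))
\end{idea}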
448 \begin{notate}{Side-note on containers via triples} \label{containers-using-triples}
449 To model a basic container, we can just use triples like
450 ``(A in B)''. This is useful, but the elements of B are
451 of course unordered. In Section \ref{importing}, we make
452 extensive use of triples like (B 1 $\alpha$), (B 2
453 $\beta$), etc., to indicate that B's first component is
454 $\alpha$, second component is $\beta$, and so on; so we
455 can make ordered list-like containers as well.
457 This is an example of the difference in expressive power
458 of tags (which only provide a sense of unordered
459 containment in ``virtual baskets'') and triples (which
460 here are seen to at least provide the additional sense of
461 ordered containment in ``virtual filing cabinets'',
462 although they have much more in store for us); cf. Note
463 \ref{prog-lit-review}.
465 As useful as models based on these two principles are in
466 principle, the user could easily be overloaded by looking
467 at lots of different containers encoded in raw triples,
468 all at once.
469 \end{notate}
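\begin{notate}{Sketch: reading containers out of triples}
The two kinds of containment described in Note
\ref{containers-using-triples} can be illustrated on
triples held as plain Lisp lists. This is a sketch only:
`unordered-members' and `ordered-contents' are
hypothetical names, and the triples are written as
(beginning middle end) lists rather than as database rows.
\end{notate}

\begin{idea}
;; Hypothetical helpers working on in-memory triples.
(defun unordered-members (container triples)
  "Return every X for which TRIPLES contains (X \"in\" CONTAINER)."
  (loop for (beg mid end) in triples
        when (and (equal mid "in") (equal end container))
          collect beg))

(defun ordered-contents (container triples)
  "Return the components of CONTAINER in positional order.
A triple (CONTAINER n X) with integer n is read as
`the nth component of CONTAINER is X'."
  (let ((hits (loop for tr in triples
                    when (and (equal (first tr) container)
                              (integerp (second tr)))
                      collect tr)))
    ;; HITS is a fresh list, so sorting it leaves TRIPLES untouched.
    (mapcar #'third (sort hits #'< :key #'second))))
\end{idea}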
471 \begin{notate}{Sense of containment}
472 Note that every element of a list is in the list in the
473 same ``sense'' -- for example, we can't instantly
474 distinguish elements that are ``halfway in'' from those
475 that are ``all the way in'', the same way we could with
476 pure triples.
477 \end{notate}
479 %% \begin{notate}{References into theories}
480 %% Since at the moment we have less than 10 basic codes, we
481 %% can uniquely reference contents of theory $k$ with ordered
482 %% pairs $10k+\mathit{basic\ code}$ and $\mathit{reference}$.
483 %% \end{notate}
485 \begin{notate}{Uniqueness of strings and triples} \label{unique-things}
An attempt to create a string or triple whose contents
duplicate an existing one generates a warning. This saves storage, given
488 possible repetitive use -- and avoids confusion. We can,
489 however, reference duplicate ``copies'' on the lists.
490 \end{notate}
492 \begin{notate}{Change} \label{change}
493 Notice also that since neither strings nor triples
494 ``change'', we have to account for change in other ways.
In particular, the contents of lists can change. (We may
subsequently add some metadata to indicate that certain lists
are ``locked'', or that they can only be changed by
adding, etc., so that their contents can be cited stably
and reliably.)
500 \end{notate}
502 %% \begin{notate}{Each place contains one object} \label{places}
503 %% It is obvious from the table definition that I want each
504 %% place to contain precisely one thing; perhaps it is less
505 %% obvious why I want to use a database table to maintain
506 %% this relationship between ``places'' and ``things''. This
507 %% is largely a matter of convenience, but in particular it
508 %% makes it easy for places to change.
509 %% \end{notate}
511 \begin{notate}{Provenance and other metadata} \label{provenance}
512 We could of course add much more structure to the
513 database, starting with simple adjustments like adding
514 provenance metadata or versioning into the records for
515 each stored thing. For the time being, I assume that such
516 metadata will appear in the application or content layer,
as triples. (The exceptions are the ``headings'' and
``headers'' associated with lists.)
519 \end{notate}
521 \section{Common Lisp-side}
523 \subsection{Preliminaries}
525 \subsubsection*{System definition}
\begin{common}{arxana.asd}
(defsystem "arxana"
  :version "1"
  :author "Joe Corneli <holtzermann17@gmail.com>"
  :licence "Public Domain"
  :components
  ((:file "packages")
   (:file "utilities" :depends-on ("packages"))
   (:file "database" :depends-on ("utilities"))
   (:file "queries" :depends-on ("packages"))))
\end{common}
539 \subsubsection*{Package definition}
\begin{common}{packages.lisp}
(defpackage :arxana
  (:use #:cl #:clsql #:clsql-sys))
\end{common}
546 \subsubsection*{Utilities}
548 \begin{notate}{Useful things} \label{useful}
These definitions are either necessary or useful for
working with the database and manipulating triple-centric
551 and/or theory-situated data. The implementation of
552 theories given here is inspired by Lisp's streams. This
553 is perhaps the most gnarly part of the code; the pay-off
554 of doing things the way we do them here is that
555 subsequently theories can sit ``transparently'' over other
556 structures.
557 \end{notate}
\begin{common}{utilities.lisp}
(in-package arxana)
(locally-enable-sql-reader-syntax)

;; (defun connect-to-database ()
;;   (connect `("localhost" "joe" "joe" "")
;;            :database-type :postgresql-socket))

(defun connect-to-database ()
  (connect `("localhost" "joe" "joe" "joe")
           :database-type :mysql))

(defmacro select-one (&rest args)
  `(car (select ,@args :flatp t)))

(defmacro select-flat (&rest args)
  `(select ,@args :flatp t))

(defun resolve-ambiguity (stuff)
  (first stuff))

(defun isolate-components (content i j)
  (list (nth (1- i) content)
        (nth (1- j) content)))

(defun isolate-beginning (triple)
  (isolate-components (cdr triple) 1 2))

(defun isolate-middle (triple)
  (isolate-components (cdr triple) 3 4))

(defun isolate-end (triple)
  (isolate-components (cdr triple) 5 6))

(defvar *read-from-heading* nil)

(defvar *write-to-heading* nil)
\end{common}
598 \begin{notate}{On `datatype'}
Just translate coordinates into their primary dimension.
(How should this change to accommodate codes 4, 5, 6,
possibly etc.?)
602 \end{notate}
\begin{common}{utilities.lisp}
(defun datatype (data)
  ;; Codes follow the table in the section on SQL tables:
  ;; 0 list, 1 string, 2 pair, 3 triple.
  (case (car data)
    (0 "lists")
    (1 "strings")
    (2 "pairs")
    (3 "triples")))

(locally-disable-sql-reader-syntax)
\end{common}
618 \begin{notate}{Resolving ambiguity}
619 Often it will eventuate that there will be more than one
620 item returned when we are only truly prepared to deal with
621 one item. In order to handle this sort of ambiguity, it
622 would be great to have either a non-interactive notifier
623 that says that some ambiguity has been dealt with, or an
624 interactive tool that will let the user decide which of
625 the ambiguous options to choose from. For now, we provide
626 the simplest non-interactive tool: just choose the first
627 item from a possibly ambiguous list of items.
628 \end{notate}
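\begin{notate}{Sketch: a non-interactive notifier}
The first wish above, a non-interactive notifier, could
look something like the following. This is a sketch, not
the function Arxana currently uses: the name
`resolve-ambiguity-noisily' is hypothetical, and it makes
the same choice as `resolve-ambiguity' while signalling a
warning whenever alternatives are discarded.
\end{notate}

\begin{idea}
;; Hypothetical variant of `resolve-ambiguity'.
(defun resolve-ambiguity-noisily (stuff)
  "Return the first item of STUFF, warning if others were discarded."
  (when (rest stuff)
    (warn "Ambiguity resolved: kept ~S, discarded ~D alternative~:P."
          (first stuff) (length (rest stuff))))
  (first stuff))
\end{idea}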
630 \begin{notate}{Using a different database}
631 See Note \ref{backend-variant} for instructions on changes
632 you will want to make if you use a different database.
633 \end{notate}
635 \begin{notate}{Use of the ``count'' function}
636 The SQL count function is thought to be inefficient with
637 some backends; workarounds exist. (And it's considered to
638 be efficient with MySQL.)
639 \end{notate}
641 \begin{notate}{Abstraction} \label{abstraction}
642 While it might be in some ways ``nice'' to allow people to
643 chain together ever-more-abstract references to elements
644 from other theories, I actually think it is better to
645 demand that there just be \emph{one} layer of abstraction
646 (since we can then quickly translate back and forth,
647 rather than running through a chain of translations).
649 This does not imply that we cannot have a theory
650 superimposed over another theory (or over multiple
651 theories) that draws input from throughout a massively
652 distributed interlaced system -- rather, just that we
653 assume we will need to translate to ``base coordinates''
654 when building such structures. However, we'll certainly
655 want to explore the possibilities for running links
656 between theories (abstractly similar in some sense to
657 pointing at a component of a triple, but here there's no
658 uniform beg, mid, end scheme to refer to).
659 \end{notate}
661 \subsection{Main table definitions}
663 \begin{notate}{Defining tables from within Lisp}
664 This is Lisp code to define the permanent SQL tables
665 described in Section \ref{sql-code}.
666 \end{notate}
\begin{common}{tabledefs.lisp}
;; (execute-command "CREATE TABLE strings (
;;    id SERIAL PRIMARY KEY,
;;    text TEXT NOT NULL UNIQUE
;; );")

(execute-command "CREATE TABLE strings (
   id SERIAL PRIMARY KEY,
   text TEXT,
   UNIQUE INDEX (text(255))
);")

(execute-command "CREATE TABLE places (
   id SERIAL PRIMARY KEY,
   code INT NOT NULL,
   ref INT NOT NULL
);")

(execute-command "CREATE TABLE triples (
   id SERIAL PRIMARY KEY,
   code1 INT NOT NULL,
   ref1 INT NOT NULL,
   code2 INT NOT NULL,
   ref2 INT NOT NULL,
   code3 INT NOT NULL,
   ref3 INT NOT NULL,
   UNIQUE (code1, ref1,
           code2, ref2,
           code3, ref3)
);")

(execute-command "CREATE TABLE theories (
   id SERIAL PRIMARY KEY,
   name INT UNIQUE REFERENCES strings(id)
);")
\end{common}
\begin{notate}{Eliminating tables}
706 In case you ever need to redefine these tables, you can
707 run code like this first, to delete the existing copies.
708 (Additional tables are added whenever a theory is created;
709 code for deleting theories or their contents will appear
710 in Section \ref{processing-theories}.)
711 \end{notate}
\begin{idea}
(dolist (view (list-views)) (drop-view view))
(execute-command "DROP TABLE strings")
(execute-command "DROP TABLE triples")
(execute-command "DROP TABLE places")
(execute-command "DROP TABLE theories")
\end{idea}
721 \subsection{Modifying the database}
\begin{common}{database.lisp}
(in-package arxana)
(locally-enable-sql-reader-syntax)
\end{common}
728 \subsection*{Processing strings}
730 \begin{notate}{On `string-to-id'}
731 Return the id of `text', if present, otherwise nil.
733 There was a segmentation fault with clisp here at one
734 point, maybe because I hadn't gotten the clsql sql reader
735 syntax loaded up properly. Note that calling the code
736 without the function wrapper did not produce the same
737 segfault.
738 \end{notate}
\begin{common}{database.lisp}
(defun string-to-id (text)
  ;; `select-one' unwraps the result, so we return a bare id or nil.
  (select-one [id]
              :from [strings]
              :where [= [text] text]))
\end{common}
747 \begin{notate}{On `add-string'} \label{add-string}
748 Add the argument `text' to the list of strings. If the string
749 is successfully created, its coordinates are returned.
750 Otherwise, and in particular, if the request was to create
751 a duplicate, nil is returned.
Should this give a message ``Adding \meta{text} to the
strings table'' when the string is added by an indirect
function call, such as through `massage'?
(Note \ref{massage}.)
757 \end{notate}
\begin{common}{database.lisp}
(defun add-string (text)
  (handler-case
      (progn (insert-records :into [strings]
                             :attributes '(text)
                             :values `(,text))
             `(1 ,(string-to-id text)))
    (sql-database-data-error ()
      (warn "\"~a\" already exists."
            text))))
\end{common}
771 \begin{notate}{Error handling bug}
772 The function `add-string' (Note \ref{add-string}) exhibits
773 the first of several error handling calls designed to
774 ensure uniqueness (Note \ref{unique-things}).
775 Experimentally, this works, but I'm observing that, at
776 least sometimes, if the user tries to add an item that's
777 already present in the database, the index tied to the
778 associated table increases even though the item isn't
779 added. This is annoying. I haven't checked whether this
780 happens on all possible installations of the underlying
781 software.
782 \end{notate}
784 \subsection*{Parsing general input}
786 \begin{notate}{On `massage'} \label{massage}
User input to functions like `add-triple' can take the form
of strings, integers (which the function
``serializes'' as the string versions of themselves), or
\emph{coordinates} -- lists of the form (code ref).
791 This function converts all of these input forms into the
792 last one! It takes an optional argument `addstr' which,
793 if supplied, says to add string data to the database if it
794 wasn't there already.
795 \end{notate}
\begin{common}{database.lisp}
(defun massage (data &optional addstr)
  (cond
    ((integerp data)
     (massage (format nil "~a" data) addstr))
    ((stringp data)
     (let ((id (string-to-id data)))
       (if id
           (list 1 id)              ; code 1 marks a string
           (when addstr
             (add-string data)))))
    ((and (listp data)
          (equal (length data) 2))
     data)
    (t nil)))
\end{common}
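\begin{notate}{Sketch: `massage' without a database}
To see the normalization performed by `massage'
(Note \ref{massage}) in isolation, here is a sketch in
which a hash table stands in for the strings table. All
names beginning with `sketch-' are hypothetical, and the
sketch assumes that strings carry code 1, as in Note
\ref{objects-and-codes}.
\end{notate}

\begin{idea}
;; Hypothetical in-memory stand-ins; not part of Arxana.
(defvar *sketch-strings* (make-hash-table :test #'equal)
  "Stand-in for the strings table: maps text to id.")

(defun sketch-string-to-id (text)
  "Return the id of TEXT, or nil if it has not been stored."
  (values (gethash text *sketch-strings*)))

(defun sketch-add-string (text)
  "Store TEXT under the next free id and return its coordinates."
  (let ((id (1+ (hash-table-count *sketch-strings*))))
    (setf (gethash text *sketch-strings*) id)
    (list 1 id)))

(defun sketch-massage (data &optional addstr)
  "Convert a string, integer, or (code ref) list into coordinates."
  (cond ((integerp data)
         (sketch-massage (format nil "~D" data) addstr))
        ((stringp data)
         (let ((id (sketch-string-to-id data)))
           (cond (id (list 1 id))
                 (addstr (sketch-add-string data)))))
        ((and (listp data) (= (length data) 2))
         data)))
\end{idea}

\begin{notate}{Reading the sketch}
An unknown string yields nil unless the optional argument
is supplied, an integer is first turned into its printed
form, and anything that already looks like coordinates is
passed through unchanged.
\end{notate}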
815 \subsection*{Processing triples}
817 \begin{notate}{On `triple-to-id'}
818 Return the id of the triple (beg mid end),
819 if present, otherwise nil.
820 \end{notate}
\begin{common}{database.lisp}
(defun triple-to-id (beg mid end)
  (let ((b (massage beg))
        (m (massage mid))
        (e (massage end)))
    ;; `select-one' unwraps the result, so we return a bare id or nil.
    (select-one [id]
                :from [triples]
                :where [and [= [code1] (first b)]
                            [= [ref1] (second b)]
                            [= [code2] (first m)]
                            [= [ref2] (second m)]
                            [= [code3] (first e)]
                            [= [ref3] (second e)]])))
\end{common}
837 \begin{notate}{On `add-triple'} \label{add-triple}
838 Elements of triples are parsed by `massage'
839 (Note \ref{massage}). If the triple
840 is successfully created, its coordinates are returned.
841 Otherwise, and in particular, if the request was to create
842 a duplicate, nil is returned.
843 \end{notate}
\begin{common}{database.lisp}
(defun add-triple (beg mid end)
  "Add a triple comprised of BEG MID and END."
  (let ((b (massage beg t))
        (m (massage mid t))
        (e (massage end t)))
    (when (and b m e)
      (handler-case
          (progn
            (insert-records
             :into [triples] :attributes '(code1 ref1
                                           code2 ref2
                                           code3 ref3)
             :values `(,(first b) ,(second b)
                       ,(first m) ,(second m)
                       ,(first e) ,(second e)))
            `(3 ,(triple-to-id b m e))) ; code 3 marks a triple
        (sql-database-data-error ()
          (warn "\"~a\" already entered as [~a ~a ~a]."
                (list beg mid end) b m e))))))
\end{common}
867 \subsection*{Processing theories} \label{processing-theories}
869 \begin{notate}{Things to do with theories}
870 For the record, we want to be able to create a theory, add
871 elements to that theory, remove or change elements in the
872 theory, and, for convenience, zap everything in a theory.
873 Perhaps we will also want functions to remove the tables
874 associated with a theory as well, swap the position of two
875 theories, or change the name of a theory. We will also
876 want to be able to export and import theories, so they can
877 be ``beamed'' between installations. At appropriate
878 places in the Emacs interface, we'll need to set
879 `*write-to-heading*' and `*read-from-heading*'.
880 \end{notate}
882 \begin{notate}{What can go in a theory} \label{what-can-go-in}
883 Notice that there is no rule that says that a triple or
884 place that's part of a theory needs to point only at
885 strings that are in the same theory.
886 \end{notate}
\begin{notate}{On `list-to-id'}
Return the id of the list with the given `heading', if present,
otherwise nil.
\end{notate}
\begin{common}{database.lisp}
(defun list-to-id (heading)
  (let ((string-id (string-to-id heading)))
    ;; `select-one' unwraps the result, so we return a bare id or nil.
    (select-one [id]
                :from [lists]
                :where [= [heading] string-id])))
\end{common}
\begin{notate}{On `add-list'} \label{add-theory}
Add a list to the `lists' table, and create the table
``listk'' that will hold its elements.
(Lists have headings that are strings -- it seems a
little funny to always have to translate submitted
strings to ids for lookup, but this is what we do.)
\end{notate}
909 \begin{common}{database.lisp}
910 (defun add-list (heading)
911 (let ((string-id (second (massage heading t))))
912 (handler-case
913 (progn (insert :into [lists]
914 :attributes '(heading)
915 :values `(,string-id))
916 (let ((k (theory-to-id heading)))
917 (execute-command
918 (format nil "CREATE TABLE lists~A (
919 offset SERIAL PRIMARY KEY,
920 code INT NOT NULL,
921 ref INT NOT NULL
922 );" k))
923 `(0 ,k)))
924 (sql-database-data-error
926 (warn "The list \"~a\" already exists."
927 heading)))))
928 \end{common}
930 \begin{notate}{On `get-lists'}
931 Find all lists that contain `symbol'.
932 \end{notate}
934 \begin{common}{database.lisp}
(defun get-lists (symbol)
  (let* ((data (massage symbol))
         (type (datatype data))
         (id (second data))
         (n (caar
             (query "select count(*) from lists")))
         results)
    (loop for k from 1 upto n
          do (let ((present
                     ;; per-list tables are named lists<k>,
                     ;; matching the CREATE TABLE in `add-list'
                     (query (concatenate
                             'string
                             "select offset from lists"
                             (format nil "~A" k)
                             " where ((code = "
                             (format nil "~A" type)
                             ") and (ref = "
                             (format nil "~A" id)
                             "))"))))
               (when present
                 ;; bit of a problem if there are multiple
                 ;; entries of that item on the given
                 ;; list.
                 (setq results (cons (list 0 k present)
                                     results)))))
    results))
960 \end{common}
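
\begin{notate}{Sketch: building the membership query}
The query string that `get-lists' assembles from nested
calls to `concatenate' and `format' can also be produced by
a single `format' call.  The following is only a sketch: the
helper name `membership-query' is hypothetical, and it
assumes that the per-list tables are named `lists' followed
by the list id, as in the CREATE TABLE statement issued by
`add-list'.  It builds the string but does not touch the
database.
\end{notate}

\begin{verbatim}
;; Hypothetical helper, not part of Arxana: build the
;; per-list membership query with a single FORMAT call.
(defun membership-query (k code ref)
  "Return SQL that looks for the item (CODE, REF) in list K."
  (format nil
          "select offset from lists~A where ((code = ~A) and (ref = ~A))"
          k code ref))

(membership-query 3 0 17)
;; => "select offset from lists3 where ((code = 0) and (ref = 17))"
\end{verbatim}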
962 \begin{notate}{On `save-to-list'}
963 Record `symbol' on list named `name'.
964 \end{notate}
966 \begin{common}{database.lisp}
(defun save-to-list (symbol name)
  (let* ((data (massage symbol t))
         (string-id (string-to-id name))
         ;; the `lists' table keys its rows by `heading'
         (k (select-one [id]
                        :from [lists]
                        :where [= [heading] string-id]))
         ;; per-list tables are named lists<k>; see `add-list'
         (tablek (format nil "lists~A" k)))
    ;; each per-list table stores (code, ref) coordinates
    (insert-records :into (sql-expression :table tablek)
                    :attributes '(code ref)
                    :values `(,(first data) ,(second data)))))
979 \end{common}
981 \subsection*{Lookup by id or coordinates}
\begin{notate}{The data format that's best for Lisp} \label{what-is-best-for-lisp}
It is reasonable to ask whether an
item's id should be considered part of that item's
defining data when that data is no longer in the database.
For the functions defined here, the id is an input, and so
by default I'm not including it in the output,
because it is already known.  However, for functions like
`triples-given-beginning' (see Note
\ref{graph-like-data}), the id is \emph{not} part of the
known data, and so it is returned.  Therefore I am
providing the `retain-id' flag here, for cases where
output should be consistent with that of these other
functions.
\end{notate}
998 \begin{common}{database.lisp}
999 (defun string-lookup (id &optional retain-id)
1000 (let ((ret (select [text]
1001 :from [strings]
1002 :where [= [id] id])))
1003 (if retain-id
1004 (list id ret)
1005 ret)))
1007 (defun triple-lookup (id &optional retain-id)
1008 (let ((ret (select [code1] [ref1]
1009 [code2] [ref2]
1010 [code3] [ref3]
1011 :from [triples]
1012 :where [= [id] id])))
1013 (if retain-id
1014 (cons id ret)
1015 ret)))
(defun list-lookup (id &optional retain-id)
  ;; the `lists' table stores the list's name in `heading'
  (let ((ret (select [heading]
                     :from [lists]
                     :where [= [id] id])))
    (if retain-id
        (list id ret)
        ret)))
1024 \end{common}
1026 \begin{notate}{Succinct idioms for following pointers}
1027 Here are some variants on the functions above which save
1028 us from needing to extract the id of the item from its
1029 coordinates.
1030 \end{notate}
1032 \begin{common}{database.lisp}
1033 (defun string-contents (coords)
1034 (string-lookup (second coords)))
1036 (defun place-contents (coords)
1037 (place-lookup (second coords)))
1039 (defun triple-contents (coords)
1040 (triple-lookup (second coords)))
1041 \end{common}
1043 \begin{notate}{Switchboard} \label{switchboard}
1044 Even more succinctly, one function that can get
1045 the object indicated by any set of coordinates.
1046 \end{notate}
1048 \begin{common}{database.lisp}
1049 (defun switchboard (coords)
1050 (cond ((eq (first coords) 0)
1051 (string-contents coords))
1052 ((eq (first coords) 1)
1053 (place-contents coords))
1054 ((eq (first coords) 2)
1055 (triple-contents coords))))
1056 \end{common}
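
\begin{notate}{Sketch: dispatching on the type code}
The dispatch performed by `switchboard' can be seen in
isolation in the following sketch, which runs without a
database.  The three `demo-' lookup functions are stand-ins
invented for illustration; the real lookups query the
tables.  As in the definition above, an unknown type code
yields nil.
\end{notate}

\begin{verbatim}
;; Stand-in lookups so the sketch runs without a database.
(defun demo-string-lookup (id) (list :string id))
(defun demo-place-lookup (id) (list :place id))
(defun demo-triple-lookup (id) (list :triple id))

(defun demo-switchboard (coords)
  "Dispatch on the type code of COORDS, a (code ref) pair."
  (destructuring-bind (code ref) coords
    (case code
      (0 (demo-string-lookup ref))
      (1 (demo-place-lookup ref))
      (2 (demo-triple-lookup ref)))))

(demo-switchboard '(2 41))   ; => (:TRIPLE 41)
(demo-switchboard '(9 41))   ; => NIL
\end{verbatim}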
1058 \begin{notate}{Anti-pasti}
1059 The readability of this code could perhaps be improved if
1060 we used functions like `switchboard' more frequently.
1061 (More to the point, it seems it's not currently used.) In
1062 particular, it would be nice if we could sweep idioms like
1063 \verb+`(2 ,(car triple))+ under the rug.
1064 \end{notate}
1066 \begin{common}{database.lisp}
1067 (locally-disable-sql-reader-syntax)
1068 \end{common}
1070 \subsection{Queries} \label{queries}
1072 \begin{notate}{The use of views} \label{use-of-views}
1073 It is easy enough to select those triples which match
1074 simple data, e.g., those triples which have the same
1075 beginning, middle, or end, or any combination of these.
1076 It is a little more complicated to find items that match
1077 criteria specified by several different triples; for
1078 example, to \emph{find all the books by Arthur C. Clarke
1079 that are also works of fiction}.
1081 Suppose our collection of triples contains a portion as
1082 follows:
\begin{center}
\begin{tabular}{lll}
Profiles of the Future & is a & book \\
2001: A Space Odyssey & is a & book \\
Ender's Game & is a & book \\
Profiles of the Future & has genre & non-fiction \\
2001: A Space Odyssey & has genre & fiction \\
Ender's Game & has genre & fiction \\
Profiles of the Future & has author & Arthur C. Clarke \\
2001: A Space Odyssey & has author & Arthur C. Clarke \\
Ender's Game & has author & Orson Scott Card
\end{tabular}
\end{center}

One way to solve the given problem would be to find those
items that \emph{are written by Arthur C. Clarke} (* ``has
author'' ``Arthur C. Clarke''), that \emph{are books}
(* ``is a'' ``book''), and \emph{that are classified as
fiction} (* ``has genre'' ``fiction'').  We are looking
for items that match \emph{all} of these conditions.

Our implementation strategy is: collect the items matching
each criterion into a view, then join these views.  (See
the function `satisfy-conditions', Note
\ref{satisfy-conditions}.)

1108 If we end up working with large queries and a lot of data,
1109 this use of views may not be an efficient way to go -- but
1110 we'll cross that bridge when we come to it.
1111 \end{notate}
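
\begin{notate}{Sketch: the view-and-join strategy in memory}
The strategy of Note \ref{use-of-views} can be tried out
without a database.  In the following sketch, each ``view''
is simply the list of beginnings of the triples that match
one criterion, and the ``join'' is the intersection of those
lists.  All of the `demo-' names are invented for
illustration, and the triples are held as plain strings
rather than as coordinates.
\end{notate}

\begin{verbatim}
(defparameter *demo-triples*
  '(("Profiles of the Future" "is a" "book")
    ("2001: A Space Odyssey" "is a" "book")
    ("Ender's Game" "is a" "book")
    ("Profiles of the Future" "has genre" "non-fiction")
    ("2001: A Space Odyssey" "has genre" "fiction")
    ("Ender's Game" "has genre" "fiction")
    ("Profiles of the Future" "has author" "Arthur C. Clarke")
    ("2001: A Space Odyssey" "has author" "Arthur C. Clarke")
    ("Ender's Game" "has author" "Orson Scott Card")))

(defun demo-subjects (middle end)
  "Beginnings of all triples (* MIDDLE END): one view."
  (loop for (b m e) in *demo-triples*
        when (and (string= m middle) (string= e end))
          collect b))

(defun demo-join (&rest views)
  "Items present in every one of VIEWS: the join."
  (reduce (lambda (a b) (intersection a b :test #'string=))
          views))

(demo-join (demo-subjects "has author" "Arthur C. Clarke")
           (demo-subjects "is a" "book")
           (demo-subjects "has genre" "fiction"))
;; => ("2001: A Space Odyssey")
\end{verbatim}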
1113 \begin{notate}{Search queries}
1114 In Note \ref{sphinx-setup} et seq., we give some
1115 instructions on how to set up the Sphinx search engine to
1116 work with Arxana. However, a much tighter integration of
1117 Sphinx into Arxana is possible, and will be coming soon.
1118 \end{notate}
1120 \begin{common}{queries.lisp}
1121 (in-package arxana)
1122 (locally-enable-sql-reader-syntax)
1123 \end{common}
1125 \subsection*{Printing}
\begin{notate}{On `print-system-object'} \label{print-system-object}
The function `print-system-object' bears some resemblance
to `massage', but is for printing instead,
and therefore has to be recursive: because triples and
places can point to other system objects, printing can be
a long and drawn-out ordeal.
\end{notate}
1135 \begin{common}{queries.lisp}
1136 (defun print-system-object (data &optional components)
1137 (cond
1138 ;; just return strings
1139 ((stringp data)
1140 data)
1141 ;; printing from coordinates (code, ref)
1142 ((and (listp data)
1143 (equal (length data) 2))
1144 ;; we'll need some hack to deal with
1145 ;; elements-of-theories, which, right now, are two
1146 ;; elements long but are not (code, ref) pairs but
1147 ;; rather (local_id, ref) pairs, or maybe actually if
1148 ;; we take context into consideration, they're
1149 ;; actually (k, table, local_id, ref) quadruplets.
1150 ;; Obviously with *that* data we can translate to
1151 ;; (code, ref). On the other hand, if we *don't*
1152 ;; take it into consideration, we probably can't do
1153 ;; much of anything. So we should be careful to be
1154 ;; aware of just what sort of information we're
1155 ;; passing around.
1156 (cond ((equal (first data) 0)
1157 (string-lookup (second data)))
1158 ((equal (first data) 1)
1159 (print-system-object
1160 (place-lookup (second data) t)))
1161 ((equal (first data) 2)
1162 (let ((triple (triple-lookup (second data) t)))
1163 (if components
1164 (list
1165 (print-beginning triple)
1166 (print-middle triple)
1167 (print-end triple))
1168 (concatenate
1169 'string
1170 (format nil "T~a[" (second data))
1171 (print-beginning triple) "."
1172 (print-middle triple) "."
1173 (print-end triple) "]"))))
1174 ((equal (first data) 3)
1175 (concatenate 'string "List printing not implemented yet."))))
1176 ;; place
1177 ((and (listp data)
1178 (equal (length data) 3))
1179 (concatenate 'string
1180 (format nil "P~a|" (first data))
1181 (print-system-object (cdr data)) "|"))
1182 ;; triple
1183 ((and (listp data)
1184 (equal (length data) 7))
1185 (if components
1186 (list
1187 (print-beginning data)
1188 (print-middle data)
1189 (print-end data))
1190 (concatenate
1191 'string
1192 (format nil "T~a[" (first data))
1193 (print-beginning data) "."
1194 (print-middle data) "."
1195 (print-end data) "]")))
1196 (t nil)))
1198 (defun print-beginning (triple)
1199 (print-system-object (isolate-beginning triple)))
1201 (defun print-middle (triple)
1202 (print-system-object (isolate-middle triple)))
1204 (defun print-end (triple)
1205 (print-system-object (isolate-end triple)))
1206 \end{common}
1208 \begin{notate}{Depth}
1209 If we are going to have complicated recursive references,
1210 our printer, and anything else that gives the system some
1211 semantics, should come with some sort of ``layers'' switch
1212 that can be used to limit the amount of recursion we do in
1213 any given computation.
1214 \end{notate}
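
\begin{notate}{Sketch: a ``layers'' switch}
One way such a switch might look is sketched below.  This
is not Arxana's printer: `demo-print' is a hypothetical
function that works on nested Lisp lists standing in for
triples, and the marker used for elided parts is an
arbitrary choice.  Each recursive call uses up one layer,
and anything deeper than the allowance is elided.
\end{notate}

\begin{verbatim}
(defun demo-print (object &optional (layers 2))
  "Render OBJECT, a string or a nested (beginning middle end)
list, descending at most LAYERS levels."
  (cond ((stringp object) object)
        ;; out of layers: elide instead of recursing further
        ((<= layers 0) "(elided)")
        (t (format nil "[~{~a~^.~}]"
                   (mapcar (lambda (part)
                             (demo-print part (1- layers)))
                           object)))))

(demo-print '("a" "likes" ("b" "likes" ("c" "likes" "d"))) 2)
;; => "[a.likes.[b.likes.(elided)]]"
\end{verbatim}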
1216 \begin{notate}{Printing objects as they appear in Lisp} \label{printing-objects-in-lisp}
1217 With the following functions we provide facilities for
1218 printing an object, either from its id or from the
1219 expanded form of the data that represents it in Lisp.
1220 (This is one good reason to have one standard form for
1221 this data; compare Note \ref{what-is-best-for-lisp}.
1222 These functions assume that the id \emph{is} part of
1223 what's printed, so if using functions like `triple-lookup'
1224 to retrieve data for printing, you'll have to graft the id
1225 back on before printing with these functions.)
1226 \end{notate}
1228 \begin{notate}{Printing theories}
1229 We'll want to both print all of the content of a theory,
1230 and print \emph{from} the theory in a more limited way.
1231 (Perhaps we get the second item for free, already?)
1232 \end{notate}
1234 \begin{common}{queries.lisp}
1235 (defun print-string (string &optional components)
1236 (print-system-object string components))
1238 (defun print-place (place &optional components)
1239 (print-system-object place components))
1241 (defun print-triple (triple &optional components)
1242 (print-system-object triple components))
1244 (defun print-string-from-id (id &optional components)
1245 (print-system-object (list 0 id) components))
1247 (defun print-place-from-id (id &optional components)
1248 (print-system-object (list 1 id) components))
1250 (defun print-triple-from-id (id &optional components)
1251 (print-system-object (list 2 id) components))
1252 \end{common}
\begin{notate}{Printing some stuff but not other stuff} \label{printing-some}
These functions are good for printing lists as they come
out of the database.  See Note \ref{strings-and-ids} on
printing strings.
\end{notate}
1260 \begin{common}{queries.lisp}
1261 (defun print-strings (strings)
1262 (mapcar 'second strings))
1264 (defun print-places (places &optional components)
1265 (mapcar (lambda (item)
1266 (print-system-object item components))
1267 places))
1269 (defun print-triples (triples &optional components)
1270 (mapcar (lambda (item)
1271 (print-system-object item components))
1272 triples))
1274 (defun print-theories (theories &optional components)
1275 (mapcar (lambda (item)
1276 (print-system-object item components))
1277 theories))
1278 \end{common}
1280 \begin{notate}{Printing everything in each table} \label{printing-everything}
1281 These functions collect human-readable versions of
1282 everything in each table. Notice that `all-strings' is
1283 written differently.
1284 \end{notate}
1286 \begin{common}{queries.lisp}
1287 (defun all-strings ()
1288 (mapcar 'second (select [*] :from [strings])))
1290 (defun all-places ()
1291 (mapcar 'print-system-object
1292 (select [*] :from [places])))
1294 (defun all-triples ()
1295 (mapcar 'print-system-object
1296 (select [*] :from [triples])))
1298 (defun all-theories ()
1299 (mapcar 'print-system-object
1300 (select [*] :from [theories])))
1301 \end{common}
\begin{notate}{Printing on particular dimensions}
One possible upgrade to the printing functions would be to
provide a built-in way to ``curry'' the printout; for
example, to print just the source nodes from a list of
triples.  Of course, it should also be possible to
do processing like this in Lisp after the printout has been
made; the point is that it is presumably more efficient
to retrieve and format only the data we're actually
looking for.
\end{notate}
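
\begin{notate}{Sketch: projecting onto the source nodes}
As an illustration of printing on one dimension, the
following sketch extracts just the beginning coordinates
from a list of triple rows.  It assumes that each row has
the seven-element layout (id code1 ref1 code2 ref2 code3
ref3); the function name `demo-beginnings' and the sample
rows are invented for illustration.
\end{notate}

\begin{verbatim}
(defun demo-beginnings (rows)
  "Project each seven-element triple row onto its beginning,
returned as a (code ref) pair."
  (mapcar (lambda (row) (list (second row) (third row)))
          rows))

(demo-beginnings '((7 0 1 0 2 0 3) (8 0 4 0 2 0 5)))
;; => ((0 1) (0 4))
\end{verbatim}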
\begin{notate}{Strings and ids} \label{strings-and-ids}
Unlike other objects, strings don't get printed with their
ids.  We should probably provide an \emph{option} to print
with ids (this could be helpful for subsequent work with
the strings in question; on the other hand, since strings
are being kept unique, we can immediately exchange a
string and its id, so I'm not sure if it's necessary to
have an explicit ``option'').
\end{notate}
1324 \subsection*{Functions that establish basic graph structure}
\begin{notate}{Thinking about graph-like data} \label{graph-like-data}
Here we have in mind one or more objects (e.g. a
particular source and sink) that are associated with
potentially any number of triples (e.g. all the possible
middles running between these two identified objects).
These functions establish various forms of locality or
neighborhood within the data.

The results of such queries can optionally be cached in a
view, which is useful for further processing
(cf. Note \ref{satisfy-conditions}).

These functions take input in the form of strings and/or
coordinates (cf. Note \ref{massage}).
\end{notate}
1342 \begin{common}{queries.lisp}
1343 (defun triples-given-beginning (node &optional view)
1344 "Get triples outbound from the given NODE. Optional
1345 argument VIEW causes the results to be selected into a
1346 view with that name."
1347 (let ((data (massage node))
1348 (window (or view "interal-view"))
1349 ret)
1350 (when data
1351 (create-view
1352 window
1353 :as (select [*]
1354 :from [triples]
1355 :where [and [= [code1] (first data)]
1356 [= [ref1] (second data)]]))
1357 (setq ret (select [*] :from window))
1358 (unless view
1359 (drop-view window))
1360 ret)))
1362 (defun triples-given-end (node &optional view)
1363 "Get triples inbound into NODE. Optional argument VIEW
1364 causes the results to be selected into a view with
1365 that name."
1366 (let ((data (massage node))
1367 (window (or view "interal-view"))
1368 ret)
1369 (when data
1370 (create-view
1371 window
1372 :as (select [*]
1373 :from [triples]
1374 :where [and [= [code3] (first data)]
1375 [= [ref3] (second data)]]))
1376 (setq ret (select [*] :from window))
1377 (unless view
1378 (drop-view window))
1379 ret)))
1381 (defun triples-given-middle (edge &optional view)
1382 "Get the triples that run along EDGE. Optional argument
1383 VIEW causes the results to be selected into a view
1384 with that name."
1385 (let ((data (massage edge))
1386 (window (or view "interal-view"))
1387 ret)
1388 (when data
1389 (create-view
1390 window
1391 :as (select [*]
1392 :from [triples]
1393 :where [and [= [code2] (first data)]
1394 [= [ref2] (second data)]]))
1395 (setq ret (select [*] :from window))
1396 (unless view
1397 (drop-view window))
1398 ret)))
1400 (defun triples-given-middle-and-end (edge node &optional
1401 view)
1402 "Get the triples that run along EDGE into NODE.
1403 Optional argument VIEW causes the results to be
1404 selected into a view with that name."
1405 (let ((edgedata (massage edge))
1406 (nodedata (massage node))
1407 (window (or view "interal-view"))
1408 ret)
1409 (when (and edgedata nodedata)
1410 (create-view
1411 window
1412 :as (select [*]
1413 :from [triples]
1414 :where [and [= [code2] (first edgedata)]
1415 [= [ref2] (second edgedata)]
1416 [= [code3] (first nodedata)]
1417 [= [ref3] (second nodedata)]]))
1418 (setq ret (select [*] :from window))
1419 (unless view
1420 (drop-view window))
1421 ret)))
1423 (defun triples-given-beginning-and-middle (node edge
1424 &optional view)
1425 "Get the triples that run from NODE along EDGE.
1426 Optional argument VIEW causes the results to be selected
1427 into a view with that name."
1428 (let ((nodedata (massage node))
1429 (edgedata (massage edge))
1430 (window (or view "interal-view"))
1431 ret)
1432 (when (and nodedata edgedata)
1433 (create-view
1434 window
1435 :as (select [*]
1436 :from [triples]
1437 :where [and [= [code1] (first nodedata)]
1438 [= [ref1] (second nodedata)]
1439 [= [code2] (first edgedata)]
1440 [= [ref2] (second edgedata)]]))
1441 (setq ret (select [*] :from window))
1442 (unless view
1443 (drop-view window))
1444 ret)))
1446 (defun triples-given-beginning-and-end (node1 node2
1447 &optional view)
1448 "Get the triples that run from NODE1 to NODE2. Optional
1449 argument VIEW causes the results to be selected
1450 into a view with that name."
1451 (let ((node1data (massage node1))
1452 (node2data (massage node2))
1453 (window (or view "interal-view"))
1454 ret)
1455 (when (and node1data node2data)
1456 (create-view
1457 window
1458 :as (select [*]
1459 :from [triples]
1460 :where [and [= [code1] (first node1data)]
1461 [= [ref1] (second node1data)]
1462 [= [code3] (first node2data)]
1463 [= [ref3] (second node2data)]]))
1464 (setq ret (select [*] :from window))
1465 (unless view
1466 (drop-view window))
1467 ret)))
;; This one uses `select-one' instead of `select'
1470 (defun triple-exact-match (node1 edge node2 &optional
1471 view)
1472 "Get the triples that run from NODE1 along EDGE to
1473 NODE2. Optional argument VIEW causes the results to be
1474 selected into a view with that name."
1475 (let ((node1data (massage node1))
1476 (edgedata (massage edge))
1477 (node2data (massage node2))
1478 (window (or view "interal-view"))
1479 ret)
1480 (when (and node1data edgedata node2data)
1481 (create-view
1482 window
1483 :as (select [*]
1484 :from [triples]
1485 :where [and [= [code1] (first node1data)]
1486 [= [ref1] (second node1data)]
1487 [= [code2] (first edgedata)]
1488 [= [ref2] (second edgedata)]
1489 [= [code3] (first node2data)]
1490 [= [ref3] (second node2data)]]))
1491 (setq ret (select-one [*] :from window))
1492 (unless view
1493 (drop-view window))
1494 ret)))
1495 \end{common}
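
\begin{notate}{Sketch: one matcher for all the patterns}
The functions above differ only in which of the three slots
they constrain.  The following in-memory sketch makes that
explicit with a single matcher.  It is an illustration, not
a replacement: `demo-match' and `*demo-graph*' are invented
names, triples are plain lists of strings, and here nil
marks an unconstrained slot (unlike in `satisfy-conditions',
where nil marks the unknown and `t' is the wildcard).
\end{notate}

\begin{verbatim}
(defun demo-match (pattern triples)
  "Return the TRIPLES matching PATTERN, a (beginning middle end)
list in which NIL stands for anything."
  (remove-if-not
   (lambda (triple)
     (every (lambda (want have)
              (or (null want) (equal want have)))
            pattern triple))
   triples))

(defparameter *demo-graph*
  '(("a" "likes" "b") ("a" "knows" "c") ("b" "likes" "c")))

;; like triples-given-beginning
(demo-match '("a" nil nil) *demo-graph*)
;; => (("a" "likes" "b") ("a" "knows" "c"))

;; like triples-given-middle-and-end
(demo-match '(nil "likes" "c") *demo-graph*)
;; => (("b" "likes" "c"))
\end{verbatim}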
1497 \begin{notate}{Becoming flexible about a string's status}
1498 One possible upgrade would be to provide versions of these
1499 functions that will flexibly accept either a string or a
1500 ``placed string'' as input (since frequently we're
1501 interested in content of that sort; see
1502 \ref{importing-sketch}).
1503 \end{notate}
1505 \subsection*{Finding places that satisfy some property}
\begin{notate}{On `get-places-subject-to-constraint'}
Like `get-places' (Note \ref{get-places}), but this
time takes an extra condition of the form (A B C)
where one of A, B, and C is `nil'.  We substitute each
of the candidate places for this `nil', to see if a
triple matching that criterion exists.
\end{notate}
1515 \begin{common}{queries.lisp}
1516 (defun get-places-subject-to-constraint (symbol condition)
1517 (let ((candidate-places (get-places symbol))
1518 accepted-places)
1519 (dolist (place candidate-places)
1520 (let ((filled-condition
1521 (map 'list (lambda (elt) (or elt
1522 `(1 ,place)))
1523 condition)))
1524 (when (apply 'triple-relaxed-match
1525 filled-condition)
1526 (setq accepted-places
1527 (cons place accepted-places)))))
1528 accepted-places))
1529 \end{common}
1531 \subsection*{Logic}
1533 \begin{notate}{Caution: compatibility with theories?}
1534 For the moment, I'm not sure how compatible this function
1535 is with the theories apparatus we've established, or with
1536 the somewhat vaguer notion of trans-theory questions or
1537 concerns. Global queries should work just fine, but
1538 theory-local questions may need some work. Before getting
1539 into compatibility of these questions with the theory
1540 apparatus, I want to make sure that apparatus is working
1541 properly. Note that the questions here do rely on
1542 functions for graph-like thinking (Note
1543 \ref{graph-like-data} et seq.), and it would certainly
1544 make sense to port to ``subgraphs'' as represented by
1545 theories.
1546 \end{notate}
1548 \begin{notate}{On `satisfy-conditions'} \label{satisfy-conditions}
1549 This function finds the items which match constraints.
1550 Constraints take the form (A B C), where precisely one of
1551 A, B, or C should be `nil', and any of the others can be
1552 either input suitable for `massage', or
1553 `t'. The `nil' entry stands for the object we're
1554 interested in. Any `t' entries are wildcards.
1556 The first thing that happens as the function runs is that
1557 views are established exhibiting each group of triples
1558 satisfying each predicate. The names of these views are
1559 then massaged into a large SQL query. (It is important to
1560 ``typeset'' all of this correctly for our SQL `query'.)
1561 Finally, once that query has been run, we clean up,
1562 dropping all of the views we created.
1563 \end{notate}
1565 \begin{common}{queries.lisp}
1566 (defun satisfy-conditions (constraints)
1567 (let* ((views (generate-views constraints))
1568 (formatted-list-of-views (format-views
1569 views))
1570 (where-condition (generate-where-condition
1571 views
1572 constraints))
1573 (ret
1574 ;; Let's see what the query is, first of all.
1575 (query
1576 (concatenate
1577 'string
1578 "select v1.id, v1.code1, v1.ref1, "
1579 "v1.code2, v1.ref2, "
1580 "v1.code3, v1.ref3 "
1581 "from "
1582 formatted-list-of-views
1583 "where "
1584 where-condition
1585 ";"))))
1586 (mapc (lambda (name) (drop-view name)) views)
1587 ret))
1588 \end{common}
\begin{notate}{Subroutines for `satisfy-conditions'}
The functions below produce bits and pieces of the SQL
query that `satisfy-conditions' submits.  The point of
`generate-views' is to create a series of views centered
on the term(s) we're interested in (the `nil' slots in
each submitted constraint).  With
`generate-where-condition', we insist that all of these
interesting terms should, in fact, be equal to one
another.
\end{notate}
1601 \begin{notate}{On `generate-views'}
1602 In a `cond' form, for each constraint we must select the
1603 appropriate function to generate the view; at the very end
1604 of the cond form, we spit out the viewname (for `mapcar'
1605 to add to the list of views).
1606 \end{notate}
1608 \begin{common}{queries.lisp}
(defun generate-views (constraints)
  (let ((counter 0))
    (mapcar
     (lambda (constraint)
       (setq counter (1+ counter))
       (let ((viewname (format nil "v~a" counter)))
         (cond
           ;; A * ? or A ? *
           ((or (and (eq (second constraint) t)
                     (eq (third constraint) nil))
                (and (eq (second constraint) nil)
                     (eq (third constraint) t)))
            (triples-given-beginning
             (first constraint)
             viewname))
           ;; * B ? or ? B *
           ((or (and (eq (first constraint) t)
                     (eq (third constraint) nil))
                (and (eq (first constraint) nil)
                     (eq (third constraint) t)))
            (triples-given-middle
             (second constraint)
             viewname))
           ;; * ? C or ? * C
           ((or (and (eq (first constraint) t)
                     (eq (second constraint) nil))
                (and (eq (first constraint) nil)
                     (eq (second constraint) t)))
            (triples-given-end
             (third constraint)
             viewname))
           ;; ? B C
           ((eq (first constraint) nil)
            (triples-given-middle-and-end
             (second constraint)
             (third constraint)
             viewname))
           ;; A ? C -- the middle is unknown, so match on
           ;; the beginning and the end
           ((eq (second constraint) nil)
            (triples-given-beginning-and-end
             (first constraint)
             (third constraint)
             viewname))
           ;; A B ? -- the end is unknown, so match on the
           ;; beginning and the middle
           ((eq (third constraint) nil)
            (triples-given-beginning-and-middle
             (first constraint)
             (second constraint)
             viewname)))
         viewname))
     constraints)))

(defun format-views (views)
  (let ((formatted-list-of-views ""))
    (mapc (lambda (view)
            (setq formatted-list-of-views
                  (concatenate
                   'string
                   formatted-list-of-views
                   (format nil "~a," view))))
          (butlast views))
    (setq formatted-list-of-views
          (concatenate
           'string
           formatted-list-of-views
           (format nil "~a " (car (last views)))))
    formatted-list-of-views))

(defun generate-where-condition (views conditions)
  (let ((where-condition "")
        (c (select-component (first conditions))))
    ;; Each view after the first is compared with the first
    ;; view.  Every comparison except the one for the last
    ;; view is followed by an `and'.  We walk `views' and
    ;; `conditions' in parallel, by index.
    (loop
      for i from 1 upto (- (length views) 2)
      do
      (let ((compi (select-component (nth i conditions)))
            (viewi (nth i views)))
        (setq
         where-condition
         (concatenate
          'string
          where-condition
          (concatenate
           'string
           "(v1.code" c " = " viewi ".code" compi ") and "
           "(v1.ref" c " = " viewi ".ref" compi ") and ")))))
    ;; the last view closes the condition, with no trailing `and'
    (let ((viewn (nth (1- (length views)) views))
          (compn (select-component
                  (nth (1- (length views)) conditions))))
      (setq
       where-condition
       (concatenate
        'string
        where-condition
        "(v1.code" c " = " viewn ".code" compn ") and "
        "(v1.ref" c " = " viewn ".ref" compn ")")))
    where-condition))

(defun select-component (condition)
  (cond ((eq (first condition) nil) "1")
        ((eq (second condition) nil) "2")
        ((eq (third condition) nil) "3")))
1715 \end{common}
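
\begin{notate}{Sketch: the strings the subroutines should produce}
To check the query text by eye, it can help to have a
compact reference for what `format-views' and
`generate-where-condition' are meant to emit.  The following
sketch builds the same kind of strings with list-iterating
`format' directives.  The `demo-' names are hypothetical,
and `demo-where' takes the slot numbers directly (as the
strings that `select-component' returns) rather than the
constraints themselves.  With a single view it returns the
empty string, a case that would need separate handling.
\end{notate}

\begin{verbatim}
(defun demo-from-list (views)
  "Comma-separate VIEWS and add the trailing space."
  (format nil "~{~a~^,~} " views))

(defun demo-where (views components)
  "Equate the unknown slot of the first view with the unknown
slot of every other view.  COMPONENTS gives, for each view, the
slot number as a string: 1, 2 or 3."
  (let ((c (first components)))
    (format nil "~{~a~^ and ~}"
            (loop for view in (rest views)
                  for comp in (rest components)
                  collect (format nil "(v1.code~a = ~a.code~a) and ~
                                       (v1.ref~a = ~a.ref~a)"
                                  c view comp c view comp)))))

(demo-from-list '("v1" "v2" "v3"))
;; => "v1,v2,v3 "

(demo-where '("v1" "v2") '("1" "1"))
;; => "(v1.code1 = v2.code1) and (v1.ref1 = v2.ref1)"
\end{verbatim}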
1717 \begin{common}{queries.lisp}
1718 (locally-disable-sql-reader-syntax)
1719 \end{common}
1721 \begin{notate}{Even more complicated logic}
1722 In order to conveniently manage complex queries, it would
1723 be nice if we could store the results of earlier queries
1724 into views, so that we can combine several such views for
1725 further processing.
1726 \end{notate}
1728 \section{Emacs-side} \label{emacs-side}
1730 \subsection{The interface to Common Lisp}
\begin{notate}{On `Defun'} \label{defun-interface}
A way to define Elisp functions whose bodies are evaluated
by Common Lisp.  Trust me, this is a good idea.  Besides,
it exhibits some fascinating backquote and comma tricks.
But be careful: this definition of `Defun' did not work on
Emacs version 21.

If we want to be able to feed in a standard arglist to
Common Lisp (with optional elements and so forth), we'd
have to define how these arguments are handled here!
\end{notate}
1744 \begin{elisp}
1745 (defmacro Defun (name arglist &rest body)
1746 (declare (indent defun))
1747 `(defun ,name ,arglist
1748 (let* ((outbound-string
1749 (translate-emacs-syntax-to-common-syntax
1750 (format "%S"
1751 (append
1752 (list
1753 (append (list 'lambda ',arglist)
1754 ',body))
1755 (mapcar
1756 (lambda (arg) `',arg)
1757 (list
1758 ,@(remove-if
1759 (lambda (testelt)
1760 (eq testelt
1761 '&optional))
1762 arglist)))))))
1763 (returned-string
1764 (second
1765 ;; we now specify the right package!
1766 (slime-eval
1767 (list 'swank:eval-and-grab-output
1768 outbound-string)
1769 :arxana))))
1770 (process-slime-output returned-string))))
1771 \end{elisp}
1773 \begin{notate}{On `process-slime-output'}
1774 This should downcase all constituent symbols, but for
1775 expediency I'm just downcasing `NIL' at the moment. Will
1776 come back for more testing and downcasing shortly. (I
1777 suspect the general case is just about as easy as what
1778 happens here.)
1779 \end{notate}
1781 \begin{elisp}
1782 (defun process-slime-output (str)
1783 (condition-case nil
1784 (let ((read-value (read str)))
1785 (if (symbolp read-value)
1786 (read (downcase str)))
1787 (nsubst nil 'NIL read-value))
1788 (error str)))
1789 \end{elisp}
1791 \begin{elisp}
1792 (defun translate-emacs-syntax-to-common-syntax (str)
1793 (with-temp-buffer
1794 (insert str)
1795 (dolist (swap '(("(\\` " "`")
1796 ("(\\\, " ",")))
1797 (goto-char (point-min))
1798 (while (search-forward (first swap) nil t)
1799 (goto-char (match-beginning 0))
1800 (forward-sexp)
1801 (delete-char -1)
1802 (goto-char (match-beginning 0))
1803 (delete-region (match-beginning 0)
1804 (match-end 0))
1805 (insert (second swap))))
1806 (buffer-substring-no-properties (point-min)
1807 (point-max))))
1808 \end{elisp}
1810 \begin{notate}{Interactive `Defun'}
1811 Note, an improved version of this macro would allow me to
1812 specify that some Defuns are interactive and some are not.
1813 This could be done by examining the submitted body, and
1814 adjusting the defun if its car is an `interactive' form.
1815 Most of the Defuns will be things that people will want to
1816 use interactively, so making this change would probably be
1817 a good idea. What I'm doing in the mean time is just
1818 writing 2 functions each time I need to make an
1819 interactive function that accesses Common Lisp data!
1820 \end{notate}
\begin{notate}{Common Lisp evaluation of code chunks}
Another potentially beneficial and simple approach is to
write a form like `progn' that evaluates its contents on
Common Lisp.  This saves us from having to rewrite all of
the `defun' facilities into `Defun' (e.g. interactivity).
But... the problem with \emph{this} is that Common Lisp
doesn't know the names of all the variables that are
defined in Emacs!  I'm not sure how to get all of the
values of these variables substituted \emph{first}, before
the call to Common Lisp is made.
\end{notate}
\begin{notate}{Debugging `Defun'}
In order to make debugging go easier, it might be nice to
have an option to make the code that is supposed to be
evaluated by Defun actually \emph{print} on the REPL
instead of being processed through an invisible back-end.
There are a couple of different ways to do that.  One
would be to simulate just what a user might do; the other
would be a happy medium between that and what we're doing
now: just put our computery auto-generated code on the
REPL and evaluate it.  (To some extent, I think the
*slime-events* buffer captures this information, but it is
not particularly easy to read.)
\end{notate}
1848 \begin{notate}{Interactive Common Lisp?}
1849 Suppose we set up some kind of interactive environment in
1850 Common Lisp; how would we go about passing this
1851 environment along to a user interacting via Emacs? (Note
1852 that SLIME's presentation of the debugging loop is one
1853 good example.)
1854 \end{notate}
1856 \subsection{Database interaction} \label{interaction}
1858 \begin{notate}{The `article' function} \label{the-article-function}
You can use this function to create an article with a
given name and contents. If you like, you can also file
it under a heading.
1862 \end{notate}
\begin{elisp}
(Defun article (name contents &optional heading)
  (let ((coordinates (add-triple name
                                 "has content"
                                 contents)))
    ;; file the new triple under `heading' when one is given
    (when heading (add-triple coordinates "in" heading))
    ;; NB: `place' is not among this function's arguments, so
    ;; it must be bound elsewhere for this step to run.
    (when place (if (numberp place)
                    (put-in-place coordinates place)
                    (put-in-place coordinates)))
    coordinates))
\end{elisp}
1876 \begin{notate}{The `scholium' function} \label{the-scholium-function}
1877 You can use this function to link annotations to objects.
As with the `article' function, you can optionally
file the connection under a given heading (cf. Note
1880 \ref{the-article-function}).
1881 \end{notate}
\begin{elisp}
(Defun scholium (beginning link end &optional heading)
  (let ((coordinates (add-triple beginning
                                 link
                                 end)))
    ;; file the new triple under `heading' when one is given
    (when heading (add-triple coordinates "in" heading))
    ;; NB: `place' is not among this function's arguments, so
    ;; it must be bound elsewhere for this step to run.
    (when place (if (numberp place)
                    (put-in-place coordinates place)
                    (put-in-place coordinates)))
    coordinates))
\end{elisp}
1895 \begin{notate}{Uses of coordinates}
1896 Note that, if desired, you can feed input of the form
1897 '(\meta{code} \meta{ref}) into `article' and `scholium'.
It's convenient to do any further processing of the object
we've created while we still have hold of the coordinates
returned by `add-triple' (cf. Note
1901 \ref{import-code-continuations} for an example).
1902 \end{notate}
1904 \begin{notate}{Finding all the members of a list by type?}
1905 We just narrow according to type.
1906 \end{notate}
1908 \begin{notate}{On `get-article'} \label{get-article}
Get the contents of the article named `name'. The optional
argument `heading' lets us find the place under the given
heading that holds the name, and use that place instead of
the name itself.

We do not yet deal well with the ambiguous case in which
several places that correspond to the given name appear
under the same heading.

Note also that out of the data returned by
`triples-given-beginning-and-middle', we should pick the
(hopefully just) ONE that corresponds to the given heading.
1922 This means we need to pick over the list of triples
1923 returned here, and test each one to see if it is in our
1924 heading. As to WHY there might be more than one ``has
1925 content'' for a place that we know to be in our
1926 heading... I'm not sure. I guess we can go with the
1927 assumption that there is just one, for now.
1928 \end{notate}
1930 \begin{elisp}
1931 (Defun get-article (name &optional heading)
1932 (let* ((place-pseudonyms
1933 (if heading
1934 (get-places-subject-to-constraint
1935 name `(nil "in" ,heading))
1936 (get-places name)))
1937 (goes-by (cond
1938 ((eq (length place-pseudonyms) 1)
1939 `(1 ,(car place-pseudonyms)))
1940 ((triple-exact-match
1941 name "in" heading)
1942 name)
1943 ((not heading) name)
1944 (t nil))))
1945 (when goes-by
1946 ;; it might be nice to also return `goes-by'
1947 ;; so we can access the appropriate place again.
1948 (third (print-triple
1949 (resolve-ambiguity
1950 (triples-given-beginning-and-middle
1951 goes-by "has content"))
1952 t)))))
1953 \end{elisp}
1955 \begin{notate}{On `get-names'} \label{get-names}
This function simply gets the names of articles that have
content; in other words, the beginning of every triple
built around the ``has content'' relation.
1959 \end{notate}
1961 \begin{elisp}
1962 (Defun get-names (&optional heading)
1963 (let ((conditions (list (list nil "has content" t))))
1964 (when heading
1965 (setq conditions
1966 (append conditions
1967 (list (list nil "in" heading)))))
1968 (mapcar
1969 (lambda (place-or-string)
1970 (cond
1971 ;; place case
1972 ((eq (first place-or-string) 1)
1973 (print-system-object
1974 (place-lookup (second place-or-string))))
1975 ;; string case
1976 ((eq (first place-or-string) 0)
1977 (print-system-object place-or-string))))
1978 (mapcar
1979 (lambda (triple)
1980 (isolate-beginning triple))
1981 (satisfy-conditions conditions)))))
1982 \end{elisp}
1984 \begin{notate}{Contrasting cases} \label{contrasting-cases}
1985 Consider the difference between
1986 \begin{quote}
1987 (? ``has author'' ``Arthur C. Clarke'') \\
1988 (? ``has genre'' ``fiction'')
1989 \end{quote}
1991 \begin{quote}
1992 (\emph{name} ``has content'' *) \\
1993 (\emph{name} ``in'' ``heading'')
1994 \end{quote}
1995 where, in the latter case, we know \emph{who} we're
1996 talking about, and we just want to limit the list of items
1997 generated by the ``*'' by the second condition. This
1998 should help illustrate the difference between `get-names'
1999 (which is making a general query) and `get-article' (which
2000 already knows the name of a specific article), and the
2001 logic that they use.
2002 \end{notate}
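\begin{notate}{Sketch: a general query over toy triples}
The general kind of query can be sketched in plain Emacs
Lisp as follows. Everything here is an assumption made for
illustration: the facts are invented, `sketch-satisfies-p'
and `sketch-satisfy-conditions' are hypothetical names, and
the convention that `nil' marks the unknown subject while
`t' matches any ending is only a guess at how conditions
might be read.
\end{notate}

\begin{idea}
(require 'cl-lib)

(defvar sketch-facts
  '(("Book A" "has author" "Arthur C. Clarke")
    ("Book A" "has genre" "fiction")
    ("Book B" "has author" "Arthur C. Clarke")
    ("Book B" "has genre" "nonfiction"))
  "Toy triples of the form (BEGINNING MIDDLE END).")

(defun sketch-satisfies-p (subject condition)
  "Non-nil if some fact about SUBJECT matches CONDITION.
CONDITION is (nil MIDDLE END); an END of t matches anything."
  (cl-some (lambda (fact)
             (and (equal (nth 0 fact) subject)
                  (equal (nth 1 fact) (nth 1 condition))
                  (or (eq (nth 2 condition) t)
                      (equal (nth 2 fact) (nth 2 condition)))))
           sketch-facts))

(defun sketch-satisfy-conditions (conditions)
  "Return every subject that satisfies all CONDITIONS, in store order."
  (let (subjects)
    (dolist (fact sketch-facts)
      (let ((subject (car fact)))
        (when (and (not (member subject subjects))
                   (cl-every (lambda (condition)
                               (sketch-satisfies-p subject condition))
                             conditions))
          (push subject subjects))))
    (nreverse subjects)))

;; Only "Book A" meets both conditions.
(sketch-satisfy-conditions
 '((nil "has author" "Arthur C. Clarke")
   (nil "has genre" "fiction")))
\end{idea}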
2004 \begin{notate}{Placing items from Emacs} \label{place-item}
2005 We periodically need to place items from within Emacs.
2006 The function `place-item' is a wrapper for `put-in-place'
2007 that makes this possible (it also provides the user with
2008 an extra option, namely to put the place itself under a
2009 given heading).
2011 Notice that when the symbol is placed in some pre-existing
2012 place (which can only happen when `id' is not nil), that
2013 place may already be under some other heading. We will ignore
2014 this case for now (since it seems that putting objects
2015 into \emph{new} places will be the preferred action), but
2016 later we will have to look at what to do in this other
2017 case.
2018 \end{notate}
2020 \begin{elisp}
2021 (Defun place-item (symbol &optional id heading)
2022 (let ((coordinates (put-in-place symbol id)))
2023 (when heading (add-triple coordinates "in" heading))
2024 coordinates))
2025 \end{elisp}
2027 \begin{notate}{Automatic classifications} \label{classifications}
2028 It will presumably make sense to offer increasingly
2029 ``automatic'' classifications for new objects. At this
2030 point, we've set things up so that the user can optionally
2031 supply the name of \emph{one} heading that their new object
2032 is a part of.
It may make more sense to allow an `\&rest headings'
argument, and add the triple to all of the specified
headings. This would require modifying `Defun' to
2037 accommodate the `\&rest' idiom; see Note
2038 \ref{defun-interface}.
2039 \end{notate}
2041 \begin{notate}{Postconditions and provenance}
2042 After adding something to the database, we may want to do
2043 something extra; perhaps generating provenance
2044 information, perhaps checking or enforcing database
2045 consistency, or perhaps running a hook that causes some
2046 update in the frontend (cf. Note \ref{provenance}).
2047 Provisions of this sort will come later, as will
2048 short-hand convenience functions for making particularly
2049 common complex entries.
2050 \end{notate}
2052 \subsection{Importing \LaTeX\ documents} \label{importing}
2054 \begin{notate}{Importing sketch} \label{importing-sketch}
2055 The code in this section imports a document as a
2056 collection of (sub-)sections and notes. It gathers the
2057 sections, sub-sections, and notes recursively and records
2058 their content in a tree whose nodes are places (Note
2059 \ref{places}) and whose links express the ``component-of''
2060 relation described in Note \ref{order-of-order}.
2062 This representation lets us see the geometric,
2063 hierarchical, structure of the document we've imported.
2064 It exemplifies a general principle, that geometric data
2065 should be represented by relationships between places, not
2066 direct relationships between strings. This is because
2067 ``the same'' string often appears in ``different'' places
2068 in any given document (e.g. a paper's many sub-sections
2069 titled ``Introduction'' will not all have the same
2070 content).
2072 What goes into the places is in some sense arbitrary. The
2073 key is that whatever is \emph{in} or \emph{attached} to
2074 these places must tell us everything we need to know about
2075 the part of the document associated with that place
2076 (e.g. in the case of a note, its title and contents).
2077 That's over and above the \emph{structural} links which
2078 say how the places relate to one another. Finally, all of
2079 these places and structural links will be added to a
2080 heading that represents the document as a whole.
2082 A natural convention we'll use will be to put the name
2083 of any document component that's associated with a given
2084 place into that place, and add all other information as
2085 annotations.
2086 \end{notate}
2088 \begin{notate}{Ordered versus unordered data} \label{ordered-vs-unordered}
2089 The code in this section is an example of one way to work
2090 with ordered data (i.e. \LaTeX\ documents are not just
2091 hierarchical, but the elements at each level of the
2092 hierarchy are also ordered).
Since \emph{many} artifacts are hierarchical (e.g. Lisp
2095 code), we should try to be compatible with \emph{native}
2096 methods for working with order (in the case of Lisp, feed
2097 the code into a Lisp processor and use CDR and CAR, etc.).
2099 We \emph{can} use triples such as (``rank'' ``1''
2100 ``Fred'') and (``rank'' ``2'' ``Barney'') to talk about
2101 order. There may be some SQL techniques that would help.
2102 (FYI, order can be handled very explicitly in Elephant!)
2104 In order to account for \emph{different} orderings, we
2105 need one more piece of data -- some explicit treatment of
where the order \emph{is}; in other words, headings.
(The table below illustrates the fact that a heading is not so
2108 different from ``an additional triple''; indeed, the only
2109 reason to make them different is to have the extra
2110 convenience of having their elements be numbered.)
2112 \begin{center}
2113 \begin{tabular}{|lll|l|}
2114 \hline
2115 rank & 1 & Fred & Friday \\
2116 rank & 2 & Barney & Friday \\
2117 rank & 1 & Barney & Saturday \\
2118 rank & 2 & Fred & Saturday \\
2119 \hline
2120 \end{tabular}
2121 \end{center}
2122 \end{notate}
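\begin{notate}{Sketch: reading an order out of ranked rows}
The table above can be treated as data directly. The
following sketch (plain Emacs Lisp, with the hypothetical
names `sketch-ranks' and `sketch-order-in') keeps the
heading as a fourth column and recovers one ordering per
heading. It is meant only to illustrate the idea, not to
suggest how the database stores anything.
\end{notate}

\begin{idea}
(defvar sketch-ranks
  '(("rank" 1 "Fred" "Friday")
    ("rank" 2 "Barney" "Friday")
    ("rank" 1 "Barney" "Saturday")
    ("rank" 2 "Fred" "Saturday"))
  "Rows of the form (RELATION RANK NAME HEADING).")

(defun sketch-order-in (heading)
  "Return the names ranked under HEADING, lowest rank first."
  (let (rows)
    ;; collect the rows for this heading into a fresh list,
    ;; which is then safe to sort destructively
    (dolist (row sketch-ranks)
      (when (equal (nth 3 row) heading)
        (push row rows)))
    (mapcar (lambda (row) (nth 2 row))
            (sort rows (lambda (a b) (< (nth 1 a) (nth 1 b)))))))

(sketch-order-in "Friday")   ; => ("Fred" "Barney")
(sketch-order-in "Saturday") ; => ("Barney" "Fred")
\end{idea}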
2124 \begin{notate}{The order of order} \label{order-of-order}
2125 The triples (``rank'' ``1'' ``Fred'') and (``rank'' ``2''
2126 ``Barney'') mentioned in Note \ref{ordered-vs-unordered}
2127 are easy enough to read and understand; it might be more
2128 natural in some ways for us to say (``Fred'' ``rank''
2129 ``1'') -- Fred has rank 1. In this section, we're
2130 concerned with talking about the ordered parts of a
2131 document, and ($A$ $n$ $B$) seems like an intuitive way to
2132 say ``$A$'s $n$th component is $B$''.
2133 \end{notate}
2135 \begin{notate}{It's not overdoing it, right?}
2136 When importing \emph{this} document, we see links like the
2137 following. I hope that's not ``overdoing it''. (Take a
2138 look at Note \ref{get-article} and Note \ref{get-names} to
2139 see how we go about getting information out of the
2140 database.) We could get rid of one link if theories were
2141 database objects (cf. Note
2142 \ref{theories-as-database-objects}).
2143 \end{notate}
2145 \begin{idea}
2146 "T557[P135|Web interface|.in.arxana.tex]"
2147 "T558[Future plans.9.P135|Web interface|]"
2148 "T559[T558[Future plans.9.P135|Web interface|].in.arxana.tex]"
2149 \end{idea}
2151 \begin{notate}{Importing in general} \label{importing-generally}
2152 We will eventually have a collection of parsers to get
2153 various kinds of documents into the system in various
2154 different ways (Note \ref{parsing}). For now, this
2155 section gives a simple way to get some sorts of
2156 \LaTeX\ documents into the system, namely documents
2157 structured along the same lines as the document you're
2158 reading now!
2160 An interesting approach to parsing \emph{math} documents
2161 has been undertaken in the \LaTeX ML
2162 project.\footnote{{\tt http://dlmf.nist.gov/LaTeXML/}}
2163 Eventually it would be nice to get that level of detail
here, too! Emacspeak is another example of a
2165 \LaTeX\ parser that deals with large-scale textual
2166 structures as well as smaller bits and
2167 pieces.\footnote{{\tt
2168 http://www.cs.cornell.edu/home/raman/aster/aster-thesis.ps}}
2170 It would probably be useful to put together some parsers
2171 for HTML and wiki code soon.
2172 \end{notate}
2174 \begin{notate}{On `import-buffer'}
2175 This function imports \LaTeX\ documents, taking care of
2176 the non-recursive aspects of this operation. It imports
frontmatter (everything up to the first
\verb+\section+), but assumes ``backmatter'' is
2179 trivial, and does not import it. The imported material is
2180 classified as a ``document'' with the same name as the
2181 imported buffer.
2182 \end{notate}
\begin{elisp}
(defun import-buffer (&optional buffername)
  (save-excursion
    (set-buffer (get-buffer (or buffername
                                (current-buffer))))
    ;; default the name, so that the scholia below are
    ;; never attached to nil
    (setq buffername (buffer-name))
    (goto-char (point-min))
    (search-forward-regexp "\\\\begin{document}")
    (search-forward-regexp "\\\\section")
    (goto-char (match-beginning 0))
    ;; other links will be made in the "heading of this
    ;; document", but here we make a broader assertion.
    (scholium buffername "is a" "document")
    (scholium buffername
              "has frontmatter"
              (buffer-substring-no-properties
               (point-min)
               (point))
              buffername)
    ;; These should maybe be scholia attached to
    ;; root-coords (below), but for some reason that
    ;; wasn't working so well (investigate later);
    ;; maybe it just wasn't good to run after running
    ;; `import-within'.
    (let* ((root-coords (place-item buffername nil
                                    buffername))
           (levels
            '("section" "subsection" "subsubsection"))
           (current-parent buffername)
           (level-end nil)
           (sections (import-within levels))
           (index 0))
      (while sections
        (let ((coords (car sections)))
          (setq index (1+ index))
          (scholium root-coords
                    index
                    coords
                    buffername))
        (setq sections (cdr sections))))))
\end{elisp}
2225 \begin{notate}{On `import-within'}
2226 Recurse through levels of sectioning to import
2227 \LaTeX\ code.
2229 It would be good if we could do something about sections
2230 that contain neither subsections nor notes (for example, a
2231 preface), or, more generally, about text that is not
2232 contained in any environment (possibly that appears before
2233 any section). We'll save things like this for another
2234 editing round!
2236 For the moment, we've decided to build the document
2237 hierarchy with links that are blind to whether the $k$th
2238 component of a section is a note or a subsection.
2239 Children that are notes are attached in the subroutine
2240 `import-notes' and those that are sections are attached in
2241 `import-within'. Users can find out what type of object
2242 they are looking at based on whether or not it ``has
2243 content''.
2245 Incidentally, when looking for the end of an importing
2246 level, `nil' is an OK result -- if this is the \emph{last}
2247 section at this level \emph{and} there is no subsequent
2248 section at a higher level.
2249 \end{notate}
2251 \begin{elisp}
2252 (defun import-within (levels)
2253 (let ((this-level (car levels))
2254 (next-level (car (cdr levels))) answer)
2255 (while (re-search-forward
2256 (concat
2257 "^\\\\" this-level "{\\([^}\n]*\\)}"
2258 "\\( +\\\\label{\\)?"
2259 "\\([^}\n]*\\)?")
2260 level-end t)
2261 (let* ((name (match-string-no-properties 1))
2262 (at (place-item name nil buffername))
2263 (level-end
2264 (or (save-excursion
2265 (search-forward-regexp
2266 (concat "^\\\\" this-level "{.*")
2267 level-end t))
2268 level-end))
2269 (notes-end
2270 (if next-level
2271 (or (progn (point)
2272 (save-excursion
2273 (search-forward-regexp
2274 (concat "^\\\\"
2275 next-level "{.*")
2276 level-end t)))
2277 level-end)
2278 level-end))
2279 (index (let ((current-parent at))
2280 (import-notes notes-end)))
2281 (subsections (let ((current-parent at))
2282 (import-within (cdr levels)))))
2283 (while subsections
2284 (let ((coords (car subsections)))
2285 (setq index (1+ index))
2286 (scholium at
2287 index
2288 coords
2289 buffername)
2290 (setq subsections (cdr subsections))))
2291 (setq answer (cons at answer))))
2292 (reverse answer)))
2293 \end{elisp}
2295 \begin{notate}{On `import-notes'} \label{import-notes}
2296 We're going to make the daring assumption that the
2297 ``textual'' portions of incoming \LaTeX\ documents are
2298 contained in ``Notes''. That assumption is true, at
2299 least, for the current document. The function returns the
2300 count of the number of notes imported, so that
2301 `import-within' knows where to start counting this
2302 section's non-note children.
2304 Would this same function work to import all notes from a
2305 buffer without examining its sectioning structure? Not
2306 quite, but close! (Could be a fun exercise to fix this.)
2307 \end{notate}
2309 \begin{elisp}
2310 (defun import-notes (end)
2311 (let ((index 0))
2312 (while (re-search-forward (concat "\\\\begin{notate}"
2313 "{\\([^}\n]*\\)}"
2314 "\\( +\\\\label{\\)?"
2315 "\\([^}\n]*\\)?")
2316 end t)
2317 (let* ((name
2318 (match-string-no-properties 1))
2319 (tag (match-string-no-properties 3))
2320 (beg
2321 (progn (next-line 1)
2322 (line-beginning-position)))
2323 (end
2324 (progn (search-forward-regexp
2325 "\\\\end{notate}")
2326 (match-beginning 0)))
2327 (coords (place-item name nil buffername)))
2328 (setq index (1+ index))
2329 (scholium current-parent
2330 index
2331 coords
2332 buffername)
2333 ;; not in the heading
2334 (scholium coords
2335 "has content"
2336 (buffer-substring-no-properties
2337 beg end))
2338 (import-code-continuations coords)))
2339 index))
2340 \end{elisp}
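\begin{notate}{Sketch: collecting notes from a string}
As a small, self-contained illustration of the search
pattern used above, the following sketch pulls note titles
and bodies out of a string without touching the database.
The name `sketch-collect-notes' and the sample text are
hypothetical, and labels are simply skipped rather than
recorded.
\end{notate}

\begin{idea}
(defun sketch-collect-notes (text)
  "Return a list of (TITLE . BODY), one per notate environment in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let (notes)
      ;; The pattern is split in two so that this code is not
      ;; itself mistaken for the start of a note.
      (while (re-search-forward (concat "\\\\begin{notate}"
                                        "{\\([^}\n]*\\)}")
                                nil t)
        (let ((title (match-string-no-properties 1))
              (beg (progn (forward-line 1) (point)))
              (end (progn (search-forward "\\end{notate}")
                          (match-beginning 0))))
          (push (cons title (buffer-substring-no-properties beg end))
                notes)))
      (nreverse notes))))

;; The sample is assembled from pieces for the same reason.
;; => (("First" . "Hello.\n") ("Second" . "World.\n"))
(sketch-collect-notes
 (concat "\\begin" "{notate}{First} \\label{first}\n"
         "Hello.\n"
         "\\end{notate}\n"
         "\\begin" "{notate}{Second}\n"
         "World.\n"
         "\\end{notate}\n"))
\end{idea}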
2342 \begin{notate}{On `import-code-continuations'} \label{import-code-continuations}
2343 This runs within the scope of `import-notes', to turn the
2344 series of Lisp chunks or other code snippets that follow a
2345 given note into a scholium attached to that note. Each
2346 separate snippet becomes its own annotation.
The explicitly numbered regexp groups used here only work
with Emacs version 23 or higher.

I'm noticing a problem with the way the `looking-at'
form behaves. It matches the expression in question,
but then the match-end is reported as one character
less than it is supposed to be. Maybe `looking-at' is
just not as good as `re-search-forward'? But it's
what seems easiest to use.
2357 \end{notate}
\begin{elisp}
(defun import-code-continuations (coords)
  (let ((possible-environments
         ;; explicitly numbered group, so that the
         ;; environment name is always match 1
         "\\(?1:lisp\\|idea\\|common\\)"))
    (while (looking-at
            (concat "\n*?\\\\begin{"
                    possible-environments
                    "}"))
      (let* ((beg (match-end 0))
             (environment (match-string 1))
             (end (progn (search-forward-regexp
                          (concat "\\\\end{"
                                  environment
                                  "}"))
                         (match-beginning 0)))
             (content (buffer-substring-no-properties
                       beg
                       end)))
        (scholium (scholium coords
                            "has attachment"
                            content)
                  "has type"
                  environment)))))
\end{elisp}
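\begin{notate}{Sketch: trying out the environment pattern}
The pattern can be tried out in isolation. The sketch below
runs `looking-at' against a scratch string in a temporary
buffer and reads the environment name back from group 1.
The sample text is made up, and nothing is written to the
database.
\end{notate}

\begin{idea}
(with-temp-buffer
  ;; The sample text is assembled from pieces so that this
  ;; sketch does not contain a literal environment delimiter.
  (insert "\n\\begin" "{idea}\nsome text\n\\end" "{idea}\n")
  (goto-char (point-min))
  (when (looking-at
         (concat "\n*?\\\\begin{"
                 "\\(?1:lisp\\|idea\\|common\\)"
                 "}"))
    ;; Group 1 holds the environment name, here "idea".
    (match-string 1)))
\end{idea}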
2384 \begin{notate}{On `autoimport-arxana'} \label{autoimport-arxana}
2385 This just calls `import-buffer', and imports this document
2386 into the system.
2387 \end{notate}
2389 \begin{elisp}
2390 (defun autoimport-arxana ()
2391 (interactive)
2392 (import-buffer "arxana.tex"))
2393 \end{elisp}
2395 \begin{notate}{Importing textual links}
2396 Of course, it would be good to import the links that users
2397 make between articles, since then we can quickly navigate
2398 from an article to the various articles that cite that
2399 article, as well as follow the usual forward-directional
2400 links. Indeed, we should be able to browse each article
2401 within a ``neighborhood'' of other related articles.
2402 (We'll need to import labels as well, of course.)
2403 \end{notate}
2405 \subsection{Browsing database contents} \label{browsing}
2407 \begin{notate}{Browsing sketch} \label{browsing-sketch}
2408 This section facilitates browsing of documents represented
2409 with structures like those created in Section
2410 \ref{importing}, and sets the ground for browsing other
2411 sorts of contents (e.g. collections of tasks, as in
2412 Section \ref{managing-tasks}).
2414 In order to facilitate general browsing, it is not enough
2415 to simply use `get-article' (Note \ref{get-article}) and
2416 `get-names' (Note \ref{get-names}), although these
2417 functions provide our defaults. We must provide the means
2418 to find and display different things differently -- for
2419 example, a section's table of contents will typically
2420 be displayed differently from its actual contents.
2422 Indeed, the ability to display and select elements of
2423 document sections (Note \ref{display-section}) is
2424 basically the core browsing deliverable. In the process
2425 we develop a re-usable article selector (Note
2426 \ref{selector}; cf. Note \ref{browsing-tasks}). This in
2427 turn relies on a flexible function for displaying
2428 different kinds of articles (Note \ref{display-article}).
2429 \end{notate}
2431 \begin{notate}{On `display-article'} \label{display-article}
2432 This function takes in the name of the article to display.
2433 Furthermore, it takes optional arguments `retriever' and
2434 `formatter', which tell it how to look up and/or format
2435 the information for display, respectively.
2437 Thus, either we make some statement up front (choosing our
2438 `formatter' based on what we already know about the
2439 article), or we decide what to display after making some
2440 investigation of information attached to the article, some
2441 of which may be retrieved and displayed (this requires
2442 that we specify a suitable `retriever' and a complementary
2443 `formatter').
2445 For example, the major mode in which to display the
2446 article's contents could be stored as a scholium attached
2447 to the article; or we might maintain some information
2448 about ``areas'' of the database that would tell us up
front which mode is associated with the current area.
2450 (The default is to simply insert the data with no markup
2451 whatsoever.)
2453 Observe that this works when no heading argument is given,
2454 because in that case `get-article' looks for \emph{all}
2455 place pseudonyms. (But of course that won't work well
when we have multiple headings containing things with the
2457 same names, so we should get used to using the heading
2458 argument.)
2460 (The business about requiring the data to be a sequence
2461 before engaging in further formatting is, of course, just
2462 a matter of expediency for making things work with the
2463 current dataset.)
2464 \end{notate}
2466 \begin{elisp}
2467 (defun display-article
2468 (name &optional heading retriever formatter)
2469 (interactive "Mname: ")
2470 (let* ((data (if retriever
2471 (funcall retriever name heading)
2472 (get-article name heading))))
2473 (when (and data (sequencep data))
2474 (save-excursion
2475 (if formatter
2476 (funcall formatter data heading)
2477 (pop-to-buffer (get-buffer-create
2478 "*Arxana Display*"))
2479 (delete-region (point-min) (point-max))
2480 (insert "NAME: " name "\n\n")
2481 (insert data)
2482 (goto-char (point-min)))))))
2483 \end{elisp}
2485 \begin{notate}{An interactive article selector} \label{selector}
2486 The function `get-names' (Note \ref{get-names}) and
2487 similar functions can give us a collection of articles.
2488 The next few functions provide an interactive
2489 functionality for moving through this collection to find
2490 the article we want to look at.
2492 We define a ``display style'' that the article selector
2493 uses to determine how to display various articles. These
2494 display styles are specified by text properties attached
2495 to each option the selector provides. Similarly, when
2496 we're working within a given heading, the relevant heading
2497 is also specified as a text property.
2499 At selection time, these text properties are checked to
2500 determine which information to pass along to
2501 `display-article'.
2502 \end{notate}
2504 \begin{elisp}
2505 (defvar display-style '((nil . (nil nil))))
2507 (defun thing-name-at-point ()
2508 (buffer-substring-no-properties
2509 (line-beginning-position)
2510 (line-end-position)))
2512 (defun get-display-type ()
2513 (get-text-property (line-beginning-position)
2514 'arxana-display-type))
2516 (defun get-relevant-heading ()
2517 (get-text-property (line-beginning-position)
2518 'arxana-relevant-heading))
2520 (defun arxana-list-select ()
2521 (interactive)
2522 (apply 'display-article
2523 (thing-name-at-point)
2524 (get-relevant-heading)
2525 (cdr (assoc (get-display-type)
2526 display-style))))
2528 (define-derived-mode arxana-list-mode fundamental-mode
2529 "arxana-list" "Arxana List Mode.
2531 \\{arxana-list-mode-map}")
2533 (define-key arxana-list-mode-map (kbd "RET")
2534 'arxana-list-select)
2535 \end{elisp}
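\begin{notate}{Sketch: from text property to display style}
The lookup performed at selection time can be seen in
miniature in the following sketch, which runs in a
temporary buffer. The line of text and the local `styles'
alist are made up for the example; only the property name
and the shape of the alist entries are taken from the code
above.
\end{notate}

\begin{idea}
(with-temp-buffer
  (insert "Introduction\n")
  ;; mark the first character of the line, as a formatter might
  (put-text-property (point-min) (1+ (point-min))
                     'arxana-display-type 'section)
  (goto-char (point-min))
  (let ((type (get-text-property (line-beginning-position)
                                 'arxana-display-type))
        (styles '((nil . (nil nil))
                  (section . (display-section nil)))))
    ;; the cdr is the (RETRIEVER FORMATTER) pair to pass along
    (cdr (assq type styles))))
;; => (display-section nil)
\end{idea}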
2537 \begin{notate}{On `pick-a-name'} \label{pick-a-name}
2538 Here `generate' is the name of a function to call to
2539 generate a list of items to display, and `format' is a
2540 function to put these items (including any mark-up) into
the buffer from which individual items can then be
2542 selected.
2544 One simple way to get a list of names to display would be
2545 to reuse a list that we had already produced (this would
2546 save querying the database each time). We could, in fact,
2547 store a history list of lists of names that had been
2548 displayed previously (cf. Note \ref{local-storage}).
2550 We'll eventually want versions of `generate' that provide
2551 various useful views into the data, e.g., listing all of
2552 the elements of a given section (Note
2553 \ref{display-section}).
2555 Finding all the elements that match a given search term,
2556 whether that's just normal text search or some kind of
2557 structured search would be worthwhile too. Upgrading the
2558 display to e.g. color-code listed elements according to
2559 their type would be another nice feature to add.
2560 \end{notate}
\begin{elisp}
(defun pick-a-name (&optional generate format heading)
  (interactive)
  (let ((items (if generate
                   (funcall generate)
                 (get-names heading))))
    (when items
      (set-buffer (get-buffer-create "*Arxana Articles*"))
      ;; make the buffer writable while we refill it
      (setq buffer-read-only nil)
      (delete-region (point-min)
                     (point-max))
      (if format
          (funcall format items)
        (mapc (lambda (item) (insert item "\n")) items))
      (setq buffer-read-only t)
      (arxana-list-mode)
      (goto-char (point-min))
      (pop-to-buffer (get-buffer "*Arxana Articles*")))))
\end{elisp}
2582 \begin{notate}{On `display-section'} \label{display-section}
2583 When browsing a document, if you select a section, you
2584 should display a list of that section's constituent
2585 elements, be they notes or subsections. The question
2586 comes up: when you go to display something, how do you
2587 know whether you're looking at the name of a section, or
2588 the name of an article?
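When you get the section's contents out of the database
(Note \ref{get-section-contents}), each item comes back with
an indication of whether it has content: items that have
content are notes, and the rest are treated as sections.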
2592 \end{notate}
2594 \begin{elisp}
2595 (defun display-section (name heading)
2596 (interactive (list (read-string
2597 (concat
2598 "name (default "
2599 (buffer-name) "): ")
2600 nil nil (buffer-name))))
2601 ;; should this pop to the Articles window?
2602 (pick-a-name `(lambda ()
2603 (get-section-contents
2604 ,name ,heading))
2605 `(lambda (items)
2606 (format-section-contents
2607 items ,heading))))
2609 (add-to-list 'display-style
2610 '(section . (display-section
2611 nil)))
2612 \end{elisp}
2614 \begin{notate}{On `get-section-contents'} \label{get-section-contents}
2615 Sent by `display-section' (Note \ref{display-section})
2616 to `pick-a-name' as a generator for the table of contents
2617 of the section with the given name in the given heading.
This function first finds the triples that begin with the
(placed) name of the section, then checks to see which of
these are in the heading of the document we're examining
(in other words, which of these links represent structural
information about that document). It also looks at the
items found at the end of these links to see if they are
sections or notes (``noteness'' is determined by them
having content). The links are then sorted by their
middles (which give the order of these components
within the section we're examining). After this ordering
information has been used for sorting, it is deleted, and
we're left with just a list of names in the appropriate
order together with an indication of their noteness.
2632 \end{notate}
2634 \begin{elisp}
2635 (Defun get-section-contents (name heading)
2636 (let (contents)
2637 (dolist (triple (triples-given-beginning
2638 `(1 ,(resolve-ambiguity
2639 (get-places name)))))
2640 (when (triple-exact-match
2641 `(2 ,(car triple)) "in" heading)
2642 (let* ((number (print-middle triple))
2643 (site (isolate-end triple))
2644 (noteness
2645 (when (triples-given-beginning-and-middle
2646 site "has content")
2647 t)))
2648 (setq contents
2649 (cons (list number
2650 (print-system-object
2651 (place-contents site))
2652 noteness)
2653 contents)))))
2654 (mapcar 'cdr
2655 (sort contents
2656 (lambda (component1 component2)
2657 (< (parse-integer (car component1))
2658 (parse-integer (car component2))))))))
2659 \end{elisp}
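\begin{notate}{Sketch: sorting by the middle of each link}
The final sorting step can be tried on its own in Emacs
Lisp. In this sketch the sample entries are invented, and
`string-to-number' stands in on the Emacs side for the
`parse-integer' call used above. Comparing numbers rather
than strings matters as soon as a section has ten or more
components.
\end{notate}

\begin{idea}
;; Each entry is (MIDDLE NAME NOTENESS), as built above.
(let ((contents (list (list "2" "Importing" nil)
                      (list "10" "Browsing" nil)
                      (list "1" "Overview" t))))
  ;; sort numerically, then drop the ordering information
  (mapcar #'cdr
          (sort contents
                (lambda (a b)
                  (< (string-to-number (car a))
                     (string-to-number (car b)))))))
;; => (("Overview" t) ("Importing" nil) ("Browsing" nil))
\end{idea}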
2661 \begin{notate}{On `format-section-contents'} \label{format-section-contents}
A formatter for document contents, used by
`display-section' (Note \ref{display-section}) as input
for `pick-a-name' (Note \ref{pick-a-name}).

Instead of just printing the items one by one,
like the default formatter in `pick-a-name' does,
this version adds appropriate text properties, which
we determine based on the second component of each
of the `items' to format.
2671 \end{notate}
2673 \begin{elisp}
2674 (defun format-section-contents (items heading)
2675 ;; just replicating the default and building on that.
2676 (mapc (lambda (item)
2677 (insert (car item))
2678 (let* ((beg (line-beginning-position))
2679 (end (1+ beg)))
2680 (unless (second item)
2681 (put-text-property beg end
2682 'arxana-display-type
2683 'section))
2684 (put-text-property beg end
2685 'arxana-relevant-heading
2686 heading))
2687 (insert "\n"))
2688 items))
2689 \end{elisp}
2691 \begin{notate}{On `display-document'} \label{display-document}
2692 When browsing a document, you should first display its
2693 top-level table of contents. (Most typically, a list of
2694 all of that document's major sections.) In order to do
this, we must find the triples that begin at the node
representing this document \emph{and} that are in the
heading of this document. This boils down to treating the
document's root as if it were a section and using the
2699 function `display-section' (Note \ref{display-section}).
2700 \end{notate}
2702 \begin{elisp}
2703 (defun display-document (name)
2704 (interactive (list (read-string
2705 (concat
2706 "name (default "
2707 (buffer-name) "): ")
2708 nil nil (buffer-name))))
2709 (display-section name name))
2710 \end{elisp}
2712 \begin{notate}{Work with `heading' argument}
2713 We should make sure that if we know the heading we're
2714 working with (e.g. the name of the document we're
2715 browsing) that this information gets communicated in the
2716 background of the user interaction with the article
2717 selector.
2718 \end{notate}
2720 \begin{notate}{Selecting from a hierarchical display} \label{hierarchical-display}
2721 A fancier ``article selector'' would be able to display
2722 several sections with nice indenting to show their
2723 hierarchical order.
2724 \end{notate}
2726 \begin{notate}{Browser history tricks} \label{history-tricks}
2727 I want to put together (or put back together) something
2728 similar to the multihistoried browser that I had going in
2729 the previous version of Arxana and my Emacs/Lynx-based web
browser, Nero\footnote{{\tt http://metameso.org/\~{}joe/nero.el}}.
2731 The basic features are:
2732 (1) forward, back, and up inside the structure of a given
2733 document; (2) switch between tabs. More advanced features
2734 might include: (3) forward and back globally across all
2735 tabs; (4) explicit understanding of paths that loop.
2737 These sorts of features are independent of the exact
2738 details of what's printed to the screen each time
2739 something is displayed. So, for instance, you could flip
2740 between section manifests a la Note \ref{display-section},
2741 or between hierarchical displays a la Note
2742 \ref{hierarchical-display}, or some combination; the key
2743 thing is just to keep track in some sensible way of
2744 whatever's been displayed!
2745 \end{notate}
2747 \begin{notate}{Local storage for browsing purposes} \label{local-storage}
2748 Right now, in order to browse the contents of the
2749 database, you need to query the database every time. It
2750 might be handy to offer the option to cache names of
2751 things locally, and only sync with the database from time
2752 to time. Indeed, the same principle could apply in
2753 various places; however, it may also be somewhat
2754 complicated to set up. Using two systems for storage, one
2755 local and one permanent, is certainly more heavy-duty than
2756 just using one permanent storage system and the local
2757 temporary display. However, one thing in favor of local
storage systems is that that's what I used in the
2759 previous prototype of Arxana -- so some code already
2760 exists for local storage! (Caching the list of
2761 \emph{names} we just made a selection from would be one
2762 simple expedient, see Note \ref{pick-a-name}.)
2763 \end{notate}
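
\begin{notate}{Sketch: caching names locally}
The caching idea above could be sketched roughly as
follows.  This is only an illustration, not part of the
system: the names `arxana-name-cache' and `cached-names'
are hypothetical, and the fetching function is assumed to
be a symbol naming any function that queries the database
for a list of names.  The database is consulted on first
use and again only when a refresh is requested.
\end{notate}

\begin{idea}
;; Sketch only: `arxana-name-cache' and `cached-names' are
;; hypothetical names.  FETCH-FN should be a symbol naming a
;; function that queries the database for a list of names.
(defvar arxana-name-cache nil
  "Alist mapping a fetching function to its last result.")

(defun cached-names (fetch-fn &optional refresh)
  "Return names from FETCH-FN, querying only if needed.
The query runs on first use, or whenever REFRESH is non-nil."
  (let ((entry (assq fetch-fn arxana-name-cache)))
    (when (or refresh (null entry))
      (setq arxana-name-cache
            (delq entry arxana-name-cache))
      (setq entry (cons fetch-fn (funcall fetch-fn)))
      (push entry arxana-name-cache))
    (cdr entry)))
\end{idea}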
2765 \begin{notate}{Hang onto absolute references}
2766 Since `get-article' (Note \ref{get-article}) translates
2767 strings into their ``place pseudonyms'', we may want to
2768 hang onto those pseudonyms, because they are, in fact, the
2769 absolute references to the objects we end up working with.
2770 In particular, they should probably go into the
2771 text-property background of the article selector, so it
2772 will know right away what to select!
2773 \end{notate}
2775 \subsection{Exporting \LaTeX\ documents$^*$}
2777 \begin{notate}{Roundtripping}
2778 The easiest test is: can we import a document into the
2779 system and then export it again, and find it unchanged?
2780 \end{notate}
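
\begin{notate}{Sketch: a roundtrip check}
One way the roundtrip test might be automated is sketched
below.  The name `roundtrip-unchanged-p' is hypothetical,
and the importing and exporting functions are passed in as
arguments because their final names and calling
conventions are assumptions here: the importer is assumed
to take a file name to read, and the exporter a file name
to write.
\end{notate}

\begin{idea}
;; Sketch only: IMPORT-FN and EXPORT-FN are hypothetical
;; stand-ins for the importing and exporting functions.
(defun roundtrip-unchanged-p (file import-fn export-fn)
  "Return non-nil if importing then exporting FILE
reproduces its contents exactly."
  (let ((exported (make-temp-file "arxana-roundtrip")))
    (unwind-protect
        (progn
          (funcall import-fn file)
          (funcall export-fn exported)
          (string= (with-temp-buffer
                     (insert-file-contents file)
                     (buffer-string))
                   (with-temp-buffer
                     (insert-file-contents exported)
                     (buffer-string))))
      ;; Clean up the temporary export in every case.
      (delete-file exported))))
\end{idea}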
2782 \begin{notate}{Data format}
2783 We should be able to \emph{stably} import and export a
2784 document, as well as export any modifications to the
2785 document that were generated within Arxana. This means
2786 that the exporting functions will have to read the data
2787 format that the importing functions use, \emph{and} that
2788 any functions that edit document contents (or structure)
2789 will also have to use the same format. Furthermore,
2790 \emph{browsing} functions will have to be somewhat aware
2791 of this format. So, this is a good time to ask -- did we
2792 use a good format?
2793 \end{notate}
2795 \subsection{Editing database contents$^*$} \label{editing}
2797 \begin{notate}{Roundtripping, with changes}
2798 Here, we should import a document into the system and then
2799 make some simple changes, and after exporting, check with
2800 diff to make sure the changes are correct.
2801 \end{notate}
2803 \begin{notate}{Re-importing}
2804 One nice feature would be a function to ``re-import'' a
2805 document that has changed outside of the system, and make
changes in the system's version wherever changes appeared
2807 in the source version.
2808 \end{notate}
2810 \begin{notate}{Editing document structure}
2811 The way we have things set up currently, it is one thing
2812 to make a change to a document's textual components, and
2813 another to change its structure. Both types of changes
2814 must, of course, be supported.
2815 \end{notate}
2817 \section{Applications}
2819 \subsection{Managing tasks} \label{managing-tasks}
2821 \begin{notate}{What are tasks?}
2822 Each task tends to have a \emph{name}, a
2823 \emph{description}, a collection of \emph{prerequisite
2824 tasks}, a description of other \emph{material
2825 dependencies}, a \emph{status}, some \emph{justification
2826 of that status}, a \emph{creation date}, and an
2827 \emph{estimated time of completion}. There might actually
2828 be several ``estimated times of completion'', since the
2829 estimate would tend to improve over time. To really
2830 understand a task, one should keep track of revisions like
2831 this.
2832 \end{notate}
2834 \begin{notate}{On `store-task-data'} \label{store-task-data}
2835 Here, we're just filling in a frame. Since ``filling in a
2836 frame'' seems like the sort of operation that might happen
2837 over and over again in different contexts, to save space,
2838 it would probably be nice to have a macro (or similar)
2839 that would do a more general version of what this function
2840 does.
2841 \end{notate}
2843 \begin{elisp}
2844 (Defun store-task-data
2845 (name description prereqs materials status
2846 justification submitted eta)
2847 (add-triple name "is a" "task")
2848 (add-triple name "description" description)
2849 (add-triple name "prereqs" prereqs)
2850 (add-triple name "materials" materials)
2851 (add-triple name "status" status)
2852 (add-triple name "status justification" justification)
2853 (add-triple name "date submitted" submitted)
2854 (add-triple name "estimated time of completion" eta))
2855 \end{elisp}
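
\begin{notate}{Sketch: a general frame filler}
The more general version suggested in Note
\ref{store-task-data} might look something like the
following.  The name `store-frame' is hypothetical; the
only function it relies on is `add-triple', as used above.
A plain function taking an association list is used here
instead of a macro, which is a simplification rather than
a settled design.
\end{notate}

\begin{idea}
;; Sketch only: `store-frame' is a hypothetical name.
(defun store-frame (name type slots)
  "Record NAME as a TYPE, then store each slot.
SLOTS is a list of (SLOT . FILLER) pairs."
  (add-triple name "is a" type)
  (dolist (slot slots)
    (add-triple name (car slot) (cdr slot))))

;; A task frame could then be filled in like this:
;; (store-frame "write tutorial" "task"
;;              '(("description" . "Draft a short guide")
;;                ("status" . "tabled")))
\end{idea}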
2857 \begin{notate}{On `generate-task-data'} \label{generate-task-data}
2858 This is a simple function to create a new task matching
2859 the description above.
2860 \end{notate}
\begin{elisp}
(defun generate-task-data ()
  (interactive)
  (let ((name (read-string "Name: "))
        (description (read-string "Description: "))
        (prereqs (read-string
                  "Task(s) this task depends on: "))
        (materials (read-string "Material dependencies: "))
        (status (completing-read
                 "Status (tabled, in progress, completed): "
                 '("tabled" "in progress" "completed")))
        (justification (read-string "Why this status? "))
        (submitted
         (read-string
          (concat "Date submitted (default "
                  (substring (current-time-string) 0 10)
                  "): ")
          nil nil (substring (current-time-string) 0 10)))
        (eta
         (read-string "Estimated date of completion: ")))
    (store-task-data name description prereqs materials
                     status
                     justification submitted eta)))
\end{elisp}
2887 \begin{notate}{Possible enhancements to `generate-task-data'}
2888 In order to make this function very nice, it would be good
2889 to allow ``completing read'' over known tasks when filling
2890 in the prerequisites. Indeed, it might be especially nice
2891 to offer a type of completing read that is similar in some
2892 sense to the tab-completion you get when completing a file
2893 name, i.e., quickly completing certain sub-strings of the
2894 final string (in this case, these substrings would
2895 correspond to task areas we are progressively zooming down
2896 into).
2898 As for the task description, rather than forcing the user
2899 to type the description into the minibuffer, it might be
2900 nice to pop up a separate buffer instead (a la the
2901 Emacs/w3m textarea). If we had a list of all the known
2902 tasks, we could offer completing-read over the names of
2903 existing tasks to generate the list of `prereqs'. It
2904 might be nice to systematize date data, so we could more
2905 easily e.g. sort and display task info ``by date''.
2906 (Perhaps we should be working with predefined database
2907 types for dates and so on; but see Note
2908 \ref{choice-of-database}.)
2910 Also, before storing the task, it might be nice to offer
2911 the user the chance to review the data they entered.
2912 \end{notate}
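
\begin{notate}{Sketch: completing read over known tasks}
The first enhancement above could be approached along
these lines.  The name `read-task-prereqs' is
hypothetical; `get-tasks' is the function defined in Note
\ref{browsing-tasks}.  The selected names are joined into
a single string only because `store-task-data' currently
stores the prerequisites as one string; storing one triple
per prerequisite would be an alternative.
\end{notate}

\begin{idea}
;; Sketch only: `read-task-prereqs' is a hypothetical name.
(defun read-task-prereqs ()
  "Read prerequisite tasks, completing over known tasks.
Several names may be entered, separated by commas."
  (mapconcat #'identity
             (completing-read-multiple
              "Task(s) this task depends on: "
              (get-tasks))
             ", "))
\end{idea}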
2914 \begin{notate}{On `get-filler'} \label{get-filler}
2915 Just a wrapper for `triples-given-beginning-and-middle'.
2916 (Maybe add `heading' as an option here.)
2917 \end{notate}
2919 \begin{elisp}
2920 (Defun get-filler (frame slot)
2921 (third (first
2922 (print-triples
2923 (triples-given-beginning-and-middle frame
2924 slot)))))
2925 \end{elisp}
2927 \begin{notate}{On `get-task'} \label{get-task}
2928 Uses `get-filler' (Note \ref{get-filler}) to assemble the
2929 elements of a task's frame.
2930 \end{notate}
2932 \begin{elisp}
2933 (Defun get-task (name)
2934 (when (triple-exact-match name "is a" "task")
2935 (list (get-filler name "description")
2936 (get-filler name "prereqs")
2937 (get-filler name "materials")
2938 (get-filler name "status")
2939 (get-filler name "status justification")
2940 (get-filler name "date submitted")
2941 (get-filler name
2942 "estimated time of completion"))))
2943 \end{elisp}
2945 \begin{notate}{On `review-task'} \label{review-task}
2946 This is a function to review a task by name.
2947 \end{notate}
\begin{elisp}
(defun review-task (name)
  (interactive "MName: ")
  (let ((task-data (get-task name)))
    (if task-data
        (display-task task-data)
      (message "No data."))))

(defun display-task (data)
  ;; NAME is not an argument here: it is taken from the
  ;; caller's dynamic binding (e.g. in `review-task').
  (save-excursion
    (pop-to-buffer (get-buffer-create
                    "*Arxana Display*"))
    (delete-region (point-min) (point-max))
    (insert "NAME: " name "\n\n")
    (insert "DESCRIPTION: " (first data) "\n\n")
    (insert "TASKS THIS TASK DEPENDS ON: "
            (second data) "\n\n")
    (insert "MATERIAL DEPENDENCIES: "
            (third data) "\n\n")
    (insert "STATUS: " (fourth data) "\n\n")
    (insert "WHY THIS STATUS?: " (fifth data) "\n\n")
    (insert "DATE SUBMITTED: " (sixth data) "\n\n")
    (insert "ESTIMATED TIME OF COMPLETION: "
            (seventh data) "\n\n")
    (goto-char (point-min))
    (fill-individual-paragraphs (point-min) (point-max))))
\end{elisp}
2977 \begin{notate}{Possible enhancements to `review-task'}
2978 Breaking this down into a function to select the task and
2979 another function to display the task would be nice. Maybe
2980 we should have a generic function for selecting any object
2981 ``by name'', and then special-purpose functions for
2982 displaying objects with different properties.
2984 Using text properties, we could set up a ``field-editing
2985 mode'' that would enable you to select a particular field
2986 and edit it independently of the others. Another more
2987 complex editing mode would \emph{know} which fields the
2988 user had edited, and would store all edits back to the
2989 database properly. See Section \ref{editing} for more on
2990 editing.
2991 \end{notate}
2993 \begin{notate}{Browsing tasks} \label{browsing-tasks}
2994 The function `pick-a-name' (Note \ref{pick-a-name}) takes
2995 two functions, one that finds the names to choose from,
2996 and the other that says how to present these names. We
2997 can therefore build `pick-a-task' on top of `pick-a-name'.
2998 \end{notate}
3000 \begin{elisp}
3001 (Defun get-tasks ()
3002 (mapcar #'first
3003 (print-triples
3004 (triples-given-middle-and-end "is a" "task")
3005 t)))
3007 (defun pick-a-task ()
3008 (interactive)
3009 (pick-a-name
3010 'get-tasks
3011 (lambda (items)
3012 (mapc (lambda (item)
3013 (let ((pos (line-beginning-position)))
3014 (insert item)
3015 (put-text-property pos (1+ pos)
3016 'arxana-display-type
3017 'task)
3018 (insert "\n"))) items))))
3020 (add-to-list 'display-style
3021 '(task . (get-task display-task)))
3022 \end{elisp}
3024 \begin{notate}{Working with theories}
3025 Presumably, like other related functions, `get-tasks'
3026 should take a heading argument.
3027 \end{notate}
3029 \begin{notate}{Check display style}
3030 Check if this works, and make style consistent between
3031 this usage and earlier usage.
3032 \end{notate}
3034 \begin{notate}{Example tasks}
3035 It might be fun to add some tasks associated with
3036 improving Arxana, just to show that it can be done...
3037 maybe along with a small importer to show how importing
3038 something without a whole lot of structure can be easy.
3039 \end{notate}
3041 \subsection{Other ideas$^*$}
3043 \begin{notate}{A browser within a browser} \label{browser-within}
3044 All the stuff we're doing with triples can be superimposed
3045 over the existing web and existing web interfaces, by, for
3046 example, writing a web browser as a web app, and in this
3047 ``browser within a browser'' offer the ability to annotate
3048 and rewrite other people's web pages, produce 3rd-party
3049 redirects, and so forth, sharing these mods with other
3050 subscribers to the service. (Already websites such as the
3051 short-lived scrum.diddlyumptio.us have offered limited
3052 versions of ``web annotation'', but, so far, what one can
3053 do with such services seems quite weak compared with
3054 what's possible.)
3055 \end{notate}
3057 \begin{notate}{Improvements to the PlanetMath backend}
3058 From one point of view, the SQL tables are the main thing
3059 in Noosphere. We could say that getting the things out of
3060 SQL and storing new things there is what Noosphere mainly
3061 does. Following this line of thought, anything that
3062 adjusts these tables will do just as well, e.g., it
3063 shouldn't be terribly hard to develop an email-based
3064 front-end. But rather than making Arxana work with the
3065 Noosphere relational table system, it is probably
3066 advantageous to translate the data from these tables into
3067 the scholium system.
3068 \end{notate}
3070 \begin{notate}{A new communication platform}
3071 One of the premier applications I have in mind is a new
way to handle communications in an online forum.  I have
3073 previously called this ``subchanneling'', but really,
3074 joining channels is just as important.
3075 \end{notate}
3077 \begin{notate}{Some tutorials}
3078 It would be interesting to write a tutorial for Common
3079 Lisp or just about any other topic with this system. For
3080 example, some little ``worksheets'' or ``gymnasia'' that
3081 will help solidify user knowledge in topics on which
3082 questions keep appearing.
3083 \end{notate}
3085 \section{Topics of philosophical interest}
3087 \begin{notate}{Research and development}
3088 In Note \ref{theoretical-context}, I mentioned a model
3089 that could apply in many contexts; it is an essentially
3090 metaphysical conception. I'm pretty sure that the data
3091 model of Note \ref{data-model} provides a general-enough
3092 framework to represent anything we might find ``out
3093 there''. However, even if this is the case, questions as
3094 to \emph{efficient} means of working with such data still
3095 abound (cf. Note \ref{models-of-theories}, Note
3096 \ref{use-of-views}).
3098 I propose that along with \emph{development} of Arxana as
3099 a useful system for \emph{doing} ``commons-based peer
3100 production'' should come a \emph{research} programme for
3101 understanding in much greater detail what ``commons-based
3102 peer production'' \emph{is}. Eventually we may want to
3103 change the name of the subject of study to reflect still
3104 more general ideas of resource use.
3106 While the ``frontend'' of this research project is
3107 anthropological, the ``backend'' is much closer to
3108 artificial intelligence. On this level, the project is
3109 about understanding \emph{effective} means for solving
3110 human problems. Often this will involve decomposing
3111 events and processes into constituent elements, making
3112 increasingly detailed treatments along the lines described
3113 in Note \ref{arxana}.
3114 \end{notate}
3116 \begin{notate}{The relationship between text and commentary}
3117 Text under revision might be marked up by a copyeditor: in
3118 cases like these, the interpretation is clear. However,
3119 what about marginalia with looser interpretations? These
3120 seem to become part of the copy of the text they are
3121 attached to. What about steering processes applied to a
3122 given course of action? How about the relationship of
3123 thoughts or words to perception and action? How can we
3124 lower the barrier between conception and action, while
3125 still maintaining some purchase on wisdom?
3127 You see, a lot of issues in life have to do with overlays,
3128 multi-tracking, interchange between different systems; and
3129 in these terms, a lot of philosophy reduces to ``media
3130 awareness'' which extends into more and more immediate
3131 contexts (Note \ref{theoretical-context}).
3132 \end{notate}
3134 \begin{notate}{Heuristic flow}
3135 Continuing the notion above: one does not need a
3136 fully-developed ``heading'' of work in order to do work --
3137 instead, one wants some straightforward heuristics that
3138 will enable the desired work to get done. So, even
3139 supposing the work is ``heading building'', it can progress
3140 without becoming overwhelmed in abstractions -- because
3141 theories and heuristics are different things.
3142 \end{notate}
3144 \begin{notate}{Limits of simple languages} \label{simple-languages}
3145 Triples are frequently ``subject, verb, object''
3146 statements, although with the annotation features, we can
3147 modify any part of any such statement; for example, we
3148 can apply an adverb to a given verb.
3150 ``Tags'', of course, already provide ``subject,
3151 predicate'' relationships. It will be interesting to
3152 examine the degree to which human languages can be mapped
3153 down into these sorts of simple languages. What features
3154 are needed to make such languages \emph{useful}? (Lisp's
3155 `car' and `cdr' seem related to the idea of making
3156 predicates useful.)
3158 How are triples and predicates ``enough''? What, if
3159 anything, do they lack? The difference between triples
3160 and predicates illustrates the issue. How should we
3161 characterize Arxana's additions to Lisp?
3162 \end{notate}
3164 \begin{notate}{Higher dimensions}
3165 Why stop with three components? Why not have $(A, B, C,
3166 D, T)$ represent a semantic relationship between all of
3167 $A$, $B$, $C$, and $D$ (in heading $T$, of course)?
3168 Actually, there is no reason to stop apart from the fact
3169 that I want to explore simple languages (Note
3170 \ref{simple-languages}). In real life, things are not as
3171 simple, and we should be ready to deal with the
3172 complexities! (Cf., for example, Note \ref{pointing}).
3173 \end{notate}
3175 \section{Future plans}
3177 \begin{notate}{Development pathways}
3178 To the extent that it's possible, I'd like to maintain a
3179 succinct non-linear roadmap in which tasks are outlined
3180 and prioritized, and some procedural details are made
3181 concrete. Whenever relevant this map should point into
3182 the current document. I'll begin by revising the plans
3183 I've used so far!\footnote{{\tt
3184 http://metameso.org/files/plan-arxana.pdf}} Over the
3185 next several months, I'd like to see these plans develop
3186 into a genuine production machine, and see the machine
3187 begin to stabilize its operations.
3188 \end{notate}
3190 \begin{notate}{Theories as database objects} \label{theories-as-database-objects}
3191 We're just beginning to treat theories as database
3192 objects; I expect there will be more work to do to make
3193 this work really well. We'll want to make some test
3194 cases, like building a ``theory of chess'', or even just
3195 describing a particular chess board; cf. Note
3196 \ref{partial-image}.
3197 \end{notate}
3199 \begin{notate}{Search engine/elements} \label{search-engine}
One of the features that came very easily in the Emacs-only
3201 prototype was textual search. With the strings stored in
3202 a database, Sphinx seems to be the most suitable search
3203 engine to use. It is tempting to try to make our own
3204 inverted index using triples, so that text-based search
3205 can be even more directly integrated with semantic search.
3206 (Since the latest version(s) of Sphinx can act to some
3207 extent like a MySQL database, we almost have a direct
3208 connection in the backend, but since Sphinx is not
3209 \emph{the same} database, one would at least need some
3210 glue code to effect joins and so forth.)
3212 More to the point, it is important for this project that
3213 the scholia-based document model be transparently extended
3214 down to the level of words and characters. It may be
3215 helpful to think about text as \emph{always being}
3216 hypertext; a document as a heading; and a word in the
3217 inverted index as a frame.
3218 \end{notate}
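
\begin{notate}{Sketch: an inverted index as triples}
To make the idea of a home-made inverted index a little
more concrete, here is a rough sketch that turns a piece
of text into one triple per distinct word.  The function
name `inverted-index-triples' and the ``occurs in''
relation are hypothetical, and the sketch only builds a
list in memory; how such triples would actually be stored
and queried is left open.
\end{notate}

\begin{idea}
;; Sketch only: the function name and the "occurs in"
;; relation are hypothetical; nothing touches the database.
(defun inverted-index-triples (document text)
  "Return one (WORD \"occurs in\" DOCUMENT) triple for
each distinct word in TEXT, in order of first appearance."
  (let (seen triples)
    (dolist (word (split-string (downcase text)
                                "[^[:alnum:]]+" t))
      (unless (member word seen)
        (push word seen)
        (push (list word "occurs in" document) triples)))
    (nreverse triples)))
\end{idea}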
3220 \begin{notate}{Pointing at database elements and other things} \label{pointing}
3221 We will want to be able to point at other tables and at
3222 other sorts of objects and make use of their contents.
3223 The plan is that our triples will provide a sort of guide
3224 or backbone superimposed over a much larger data system.
3225 \end{notate}
3227 \begin{notate}{Feature-chase}
3228 There are lots of different features that could be
3229 explored, for example: multi-dimensional history lists; a
3230 useful treatment of ``clusions''; MS Word-like colorful
3231 annotations; etc. Many of these features are already
3232 prototyped.\footnote{See footnote \ref{old-version}.}
3233 \end{notate}
3235 \begin{notate}{Regression testing}
3236 Along with any major feature chase, we should provide
3237 and maintain a regression testing suite.
3238 \end{notate}
3240 \begin{notate}{Deleting and changing things}
3241 How will we deal with unlinking, disassociating,
3242 forgetting, entropy, and the like? Changes can perhaps
3243 be modeled by an insertion following a deletion, and,
3244 as noted, we'll need effective ways to represent and
3245 manage change (Note \ref{change}).
3246 \end{notate}
3248 \begin{notate}{Tutorial}
3249 Right now the system is simple enough to be pretty much
3250 self-explanatory, but if it becomes much more complicated,
3251 it might be helpful to put together a simple guide to some
3252 likely-to-be-interesting features.
3253 \end{notate}
3255 \begin{notate}{Computing possible paths and connections}
3256 If we can find all the \emph{direct} paths from one node
3257 to another using `triples-given-beginning-and-end', can we
inject some algorithms for finding longer, indirect paths
3259 into the system, and find ways to make them useful?
3261 Similarly, we can satisfy local conditions (Note
3262 \ref{satisfy-conditions}), but we'll want to deal with
3263 increasingly ``non-local'' conditions (even just using the
3264 logical operator ``or'', instead of ``and'', for example).
3265 \end{notate}
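
\begin{notate}{Sketch: finding indirect paths}
As a starting point for the question above, a
breadth-first search finds a shortest indirect path
between two nodes.  The sketch below is an assumption-laden
illustration: `find-path' is a hypothetical name, and the
triples are supplied as an in-memory list of (beginning
middle end) lists rather than fetched from the database
step by step.
\end{notate}

\begin{idea}
;; Sketch only: `find-path' is a hypothetical name, and
;; TRIPLES is an in-memory list of (BEGINNING MIDDLE END).
(defun find-path (triples from to)
  "Return a shortest list of nodes from FROM to TO, or nil."
  (let ((queue (list (list from)))
        (visited (list from))
        result)
    (while (and queue (not result))
      ;; Each queued path is stored newest node first.
      (let* ((path (pop queue))
             (node (car path)))
        (if (equal node to)
            (setq result (reverse path))
          (dolist (triple triples)
            (let ((next (nth 2 triple)))
              (when (and (equal (car triple) node)
                         (not (member next visited)))
                (push next visited)
                (setq queue
                      (append queue
                              (list (cons next path))))))))))
    result))
\end{idea}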
3267 \begin{notate}{Monster Mountain}
3268 In Summer 2007, we checked out the Monster Mountain MUD
3269 server\footnote{{\tt http://code.google.com/p/mmtn/}},
3270 which would enable several users to interact with one
3271 LISP, instead of just one database. This would have a
3272 number of advantages, particularly for exploring
3273 ``scholiumific programming'', but also towards fulfilling
3274 the user-to-user interaction objective stated in Note
3275 \ref{theoretical-context}. I plan to explore this after
3276 the primary goal of multi-user interaction with the
3277 database has been solidly completed.
3278 \end{notate}
3280 \begin{notate}{Web interface}
3281 A finished web interface may take a considerable amount of
3282 work (if the complexity of an interesting Emacs interface
3283 is any indication), but the basics shouldn't be hard to
3284 put together soon.
3285 \end{notate}
3287 \begin{notate}{Parsing input} \label{parsing}
3288 Complicated objects specified in long-hand (e.g. triples
3289 pointing to triples) can be read by a relatively simple
3290 parser -- which we'll have to write! The simplest goal
3291 for the parser would be to be able to distinguish between
3292 a triple and a string -- presumably that much isn't hard.
3293 And of course, building complexes of triples that
3294 represent statements from natural language is a good
3295 long-term goal. (Right now, our granularity level is set
3296 much higher.)
3297 \end{notate}
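
\begin{notate}{Sketch: telling triples from strings}
The simplest goal mentioned above might be met by
something like the following.  Both the name
`parse-longhand' and the long-hand syntax are assumptions
made for illustration: a triple is taken to be written as
a three-element Lisp list (whose elements may themselves
be triples), and any other input is treated as a plain
string.
\end{notate}

\begin{idea}
;; Sketch only: `parse-longhand' is a hypothetical name,
;; and the input syntax is an assumption.
(defun parse-longhand (input)
  "Classify the string INPUT as a triple or a string.
Return (triple A B C) or (string . INPUT)."
  (let ((form (ignore-errors
                (car (read-from-string input)))))
    (if (and (consp form)
             (ignore-errors (= (length form) 3)))
        (cons 'triple form)
      (cons 'string input))))
\end{idea}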
3299 \begin{notate}{Choice of database} \label{choice-of-database}
3300 I expect Elephant\footnote{{\tt
3301 http://common-lisp.net/project/elephant/}} may become
3302 our preferred database at some point in the future; we are
3303 currently awaiting changes to Elephant that make nested
3304 queries possible and efficient. Some core queries related
3305 to managing a database of semantic links with the current
3306 Elephant were constructed by Ian Eslick, Elephant's
3307 maintainer.\footnote{{\tt
3308 http://planetx.cc.vt.edu/\~{}jcorneli/arxana/variant-4.lisp}}
3310 On the other hand, it might be reasonable to use an Emacs
3311 database and redo the whole thing to work in Emacs
3312 (again), e.g. for single-user applications or users who
3313 want to work offline a lot of the time.
3314 \end{notate}
3316 \begin{notate}{Different kinds of theories}
3317 Theories or variants thereof are of course already popular
3318 in other knowledge representation contexts.\footnote{{\tt
3319 http://www.cyc.com/cycdoc/vocab/mt-expansion-vocab.html}}$^{,}$\footnote{{\tt
3320 http://www.stanford.edu/\~{}kdevlin/HHL\_SituationTheory.pdf}}
3321 We'll want to adopt some useful techniques for knowledge
3322 management as soon as the core systems are ready.
3324 Various notions of a mathematical theory
3325 exist.\footnote{{\tt
3326 http://planetmath.org/encyclopedia/Theory.html}} It
3327 would be nice to be able to assign specific logic to
3328 theories in Arxana, following the ``little theories''
3329 design of e.g. IMPS.\footnote{{\tt
3330 http://imps.mcmaster.ca/manual/node13.html}}
3331 \end{notate}
3333 \section{Conclusion} \label{conclusion}
3335 \begin{notate}{Ending and beginning again}
3336 This is the end of the Arxana system itself; the
3337 appendices provide some ancillary tools, and some further
3338 discussion. Contributions that support the development of
3339 the Arxana project are welcome.
3340 \end{notate}
3342 \appendix
3344 \section{Appendix: Auto-setup} \label{appendix-setup}
3346 \begin{notate}{Setting up auto-setup}
This section provides code for satisfying dependencies and
3348 setting up the program. This code assumes that you are
3349 using a Debian/APT-based system (but things are not so
3350 different using say, Fedora or Fink; writing a
3351 multi-package-manager-friendly installer shouldn't be
3352 hard). Of course, feel free to set things up differently
3353 if you have something else in mind!
3354 \end{notate}
3356 \begin{elisp}
3357 (defalias 'set-up 'shell-command)
3359 (defun alternative-set-up (string)
3360 (save-excursion
3361 (pop-to-buffer (get-buffer-create "*Arxana Help*"))
3362 (goto-char (point-max))
3363 (insert string "\n")))
3365 (defun set-up-arxana-environment ()
3366 (interactive)
3367 (if (y-or-n-p
3368 "Run commands (y) (or just show instructions)? ")
3369 (fset 'set-up 'shell-command)
3370 (fset 'set-up 'alternative-set-up))
3371 (when (y-or-n-p "Install dependencies? ")
3372 (set-up "mkdir ~/arxana")
3373 (set-up "cd arxana"))
3375 (when (y-or-n-p "Download latest Arxana? ")
3376 (set-up "wget http://metameso.org/files/arxana.tex"))
3378 (unless (y-or-n-p "Is your emacs good enough?... ")
3379 (set-up
3380 (concat "cvs -z3 -d"
3381 ":pserver:anonymous@cvs.savannah.gnu.org:"
3382 "/sources/emacs co emacs"))
3383 (set-up "mv emacs ~")
3384 (set-up "cd ~/emacs")
3385 (set-up "./configure && make bootstrap")
3386 (set-up "cd ~/arxana"))
3388 (defvar pac-man nil)
3390 (cond ((y-or-n-p
3391 "Do you use an apt-based package manager? ")
3392 (setq pac-man "apt-get"))
3393 (t (message
3394 "OK, get Lisp and SQL on your own, then!")))
3396 (when pac-man
3397 (when (y-or-n-p "Install Common Lisp? ")
3398 (set-up (concat pac-man " install sbcl")))
3400 (when (y-or-n-p "Install Postgresql? ")
3401 (set-up (concat pac-man " install postgresql"))
3402 (when (y-or-n-p "Help setting up PostgreSQL? ")
3403 (save-excursion
3404 (pop-to-buffer (get-buffer-create "*Arxana Help*"))
3405 (insert "As superuser (root),
3406 edit /etc/postgresql/7.4/main/pg_hba.conf
3407 make sure it says this:
3408 host all all 127.0.0.1 255.255.255.255 trust
3409 then edit /etc/postgresql/7.4/main/postgresql.conf
3410 and make it say
3411 tcpip_socket = true
3412 then restart:
3413 /etc/init.d/postgresql-7.4 restart
3414 su postgres
3415 createuser username
3416 exit
3417 as username, run
3418 createdb -U username\n")))))
3420 (when (y-or-n-p "Install SLIME...? ")
3421 (set-up (concat "cvs -d :pserver:anonymous"
3422 ":anonymous@common-lisp.net:"
3423 "/project/slime/cvsroot co slime"))
3424 (set-up
3425 (concat "echo \";; Added to ~/.emacs for Arxana:\n\n"
3426 "(add-to-list 'load-path \"~/slime/\")\n"
3427 "(setq inferior-lisp-program \"/usr/bin/sbcl\")\n"
3428 "(require 'slime)\n"
3429 "(slime-setup '(slime-repl))\n\n\""
3430 "| cat - ~/.emacs > ~/updated.emacs &&"
3431 "mv ~/updated.emacs ~/.emacs")))
3433 (when (y-or-n-p "Set up Common Lisp environment? ")
3434 (set-up "mkdir ~/.sbcl")
3435 (set-up "mkdir ~/.sbcl/site")
3436 (set-up "mkdir ~/.sbcl/systems")
3437 (set-up "cd ~/.sbcl/site")
3438 (set-up (concat "wget http://files.b9.com/"
3439 "clsql/clsql-latest.tar.gz"))
3440 (set-up "tar -zxf clsql-4.0.3.tar.gz")
3441 (set-up (concat "wget http://files.b9.com/"
3442 "uffi/uffi-latest.tar.gz"))
3443 (set-up "tar -zxf uffi-1.6.0.tar.gz")
3444 (set-up (concat "wget http://files.b9.com/"
3445 "md5/md5-1.8.5.tar.gz"))
3446 (set-up "tar -zxf md5-1.8.5.tar.gz")
3447 (set-up "cd ~/.sbcl/systems")
3448 (set-up "ln -s ../site/md5-1.8.5/md5.asd .")
3449 (set-up "ln -s ../site/uffi-1.6.0/uffi.asd .")
3450 (set-up "ln -s ../site/clsql-4.0.3/clsql.asd .")
3451 (set-up "ln -s ../site/clsql-4.0.3/clsql-uffi.asd .")
3452 (set-up (concat "ln -s ../site/clsql-4.0.3/"
3453 "clsql-postgresql-socket.asd ."))
3454 (set-up "ln -s ~/arxana/arxana.asd ."))
3456 (when (y-or-n-p "Modify ~/.sbclrc so CL always starts Arxana? ")
3457 (set-up
3458 (concat "echo \";; Added to ~/.sbclrc for Arxana:\n\n"
3459 "(require 'asdf)\n\n"
3460 "(asdf:operate 'asdf:load-op 'swank)\n"
3461 "(setf swank:*use-dedicated-output-stream* nil)\n"
3462 "(setf swank:*communication-style* :fd-handler)\n"
3463 "(swank:create-server :port 4006 :dont-close t)\n\n"
3464 "(asdf:operate 'asdf:load-op 'clsql)\n"
3465 "(asdf:operate 'asdf:load-op 'arxana)\n"
3466 "(in-package arxana)\n"
3467 "(connect-to-database)\n"
3468 "(locally-enable-sql-reader-syntax)\n\n\""
3469 "| cat ~/.sbclrc - > ~/updated.sbclrc &&"
3470 "mv ~/updated.sbclrc ~/.sbclrc")))
3472 (when (y-or-n-p "Install Monster Mountain? ")
3473 (set-up "cd ~/.sbcl/systems")
3474 (set-up (concat
3475 "darcs get http://common-lisp.net/project/"
3476 "bordeaux-threads/darcs/bordeaux-threads/"))
3477 (set-up (concat
3478 "svn checkout svn://common-lisp.net/project/"
3479 "usocket/svn/usocket/trunk usocket-svn"))
3480 ;; I've had problems with this approach to setting cclan
3481 ;; mirror...
3482 (set-up
3483 (concat
3484 "wget \"http://ww.telent.net/cclan-choose-mirror"
3485 "?M=http%3A%2F%2Fthingamy.com%2Fcclan%2F\""))
3486 (set-up (concat "wget http://ww.telent.net/cclan/"
3487 "split-sequence.tar.gz"))
3488 (set-up "tar -zxf split-sequence.tar.gz")
3489 (set-up
3490 (concat "svn checkout http://mmtn.googlecode.com/"
3491 "svn/trunk/ mmtn-read-only"))
3492 (set-up
3493 "ln -s ~/bordeaux-threads/bordeaux-threads.asd .")
3494 (set-up "ln -s ~/usocket-svn/usocket.asd .")
3495 (set-up "ln -s ~/split-sequence/split-sequence.asd .")
3496 (set-up "ln -s ~/mmtn/src/mmtn.asd .")))
3497 \end{elisp}
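
\begin{notate}{Sketch: running a command in a given directory}
Each call to `shell-command' starts a fresh shell in the
current buffer's `default-directory', so a command such as
``cd'' issued on its own does not carry over to the next
call.  One possible way to run a step in a particular
directory is sketched below; `set-up-in' is a hypothetical
helper and is not used by the code above.
\end{notate}

\begin{idea}
;; Sketch only: `set-up-in' is a hypothetical helper.
(defun set-up-in (directory command)
  "Run shell COMMAND with DIRECTORY as working directory."
  (let ((default-directory
          (file-name-as-directory
           (expand-file-name directory))))
    (shell-command command)))

;; For example:
;; (set-up-in "~/.sbcl/systems"
;;            "ln -s ~/arxana/arxana.asd .")
\end{idea}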
3499 \begin{notate}{Postgresql on Fedora}
3500 There are some slightly different instructions for
3501 installing postgresql on Fedora; the above will be
3502 changed to include them, but for now, check them
3503 out on the
3504 web.\footnote{{\tt http://www.flmnh.ufl.edu/linux/install\_postgresql.htm}}
3505 \end{notate}

\begin{notate}{Using MySQL and CLISP instead} \label{backend-variant}
Since my OS X box seems to have a variety of confusing
PostgreSQL systems already installed (which I'm not sure
how to configure), and CLISP is easy to install with fink,
I thought I'd try a different setup for simplicity and
variety.

In order to make it work, I enabled the root user on Mac OS X
per instructions on the web; installed and configured
mysql; used a slight modification of the strings table
described previously; downloaded and installed
cffi\footnote{{\tt
http://common-lisp.net/project/cffi/releases/cffi\_latest.tar.gz}};
changed the definition of `connect-to-database' in
Arxana's utilities.lisp; doctored up my
{\tt \~{}/.clisprc.lisp}; and changed how I started Lisp.
Details below.
\end{notate}

\begin{idea}
;; on the shell prompt
sudo apt-get install mysql
sudo mysqld_safe --user=mysql &
sudo daemonic enable mysql
sudo mysqladmin -u root password root
mysql --user=root --password=root -D test
create database joe; grant all on joe.* to joe@localhost
   identified by 'joe';

;; in tabledefs.lisp
(execute-command "CREATE TABLE strings (
   id SERIAL PRIMARY KEY,
   text TEXT,
   UNIQUE INDEX (text(255))
);")

;; in ~/asdf-registry/ or whatever you've designated as
;; your asdf:*central-registry*
ln -s ~/cffi_0.10.4/cffi-uffi-compat.asd .
ln -s ~/cffi_0.10.4/cffi.asd .

;; In utilities.lisp
(defun connect-to-database ()
  (connect `("localhost" "joe" "joe" "joe")
           :database-type :mysql))

;; In ~/.clisprc.lisp
(asdf:operate 'asdf:load-op 'clsql)
(push "/sw/lib/mysql/"
      CLSQL-SYS:*FOREIGN-LIBRARY-SEARCH-PATHS*)

;; From SLIME prompt, and not in ~/.clisprc.lisp
(in-package #:arxana)
(connect-to-database)
(locally-enable-sql-reader-syntax)
\end{idea}

\begin{notate}{Installing Sphinx}
Here are some tips on how to install and configure
Sphinx.
\end{notate}

\begin{idea}
;; Fedora/Postgresql flavor
yum install postgresql-devel
./configure --without-mysql \
            --with-pgsql \
            --with-pgsql-libs=/usr/lib/pgsql/ \
            --with-pgsql-includes=/usr/include/pgsql

;; Fink/MySQL flavor
./configure --with-mysql \
            --with-mysql-includes=/sw/include/mysql \
            --with-mysql-libs=/sw/lib/mysql
\end{idea}

\begin{notate}{Getting Sphinx set up} \label{sphinx-setup}
Here are some instructions I've used to get Sphinx set
up.
\end{notate}

\begin{notate}{Create a sphinx.conf}
I want a very minimal sphinx.conf; this one seems to work.
(We should probably set this up so that it gets written
to a file when Arxana is set up.)
\end{notate}

\begin{idea}
## Copy this to /usr/local/etc/sphinx.conf when you want
## to use it.

source strings
{
        type      = mysql
        sql_host  = localhost
        sql_user  = joe
        sql_pass  = joe
        sql_db    = joe
        sql_query = SELECT id, text FROM strings
}

## index definition

index strings
{
        source     = strings
        path       = /Users/planetmath/sphinx/search-testing
        morphology = none
}

## indexer settings

indexer
{
        mem_limit = 32M
}

## searchd settings

searchd
{
        listen       = 3312
        listen       = localhost:3307:mysql41
        log          = /Users/planetmath/sphinx/searchd.log
        query_log    = /Users/planetmath/sphinx/searchd_query.log
        read_timeout = 5
        max_children = 30
        pid_file     = /Users/planetmath/sphinx/searchd.pid
        max_matches  = 1000
}
\end{idea}

\begin{notate}{Working from the command line}
Then you can run commands like these.
\end{notate}

\begin{idea}
/usr/local/bin/indexer strings
/usr/local/bin/search "but, then"

% mysql -h 127.0.0.1 -P 3307
mysql> SELECT * FROM strings WHERE MATCH('but, then');
\end{idea}

\begin{notate}{Integrating this with Lisp}
Since we can talk to Sphinx via the MySQL
protocol, it seems reasonable that we should be able to talk to
it from CLSQL, too.  With a little fussing to get the format
right, I found something that works!
\end{notate}

\begin{idea}
(connect `("127.0.0.1" "" "" "" "3307") :database-type :mysql)
(mapcar (lambda (elt) (floor (car elt)))
        (query "select * from strings where match('text')"))
\end{idea}

\begin{notate}{Some added difficulty with Postgresql}
When I try to index things on the server, I get an
error, as below.  The question is a good one...  I'm
not sure \emph{how} postgresql is set up on the server,
actually...
\end{notate}

\begin{idea}
ERROR: index 'strings': sql_connect: could not connect to server:
Connection refused
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 5432?
\end{idea}
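
\begin{notate}{Probing the port from Emacs}
The error above amounts to ``nothing accepted a TCP connection
on port 5432''.  One way to check whether that is still the case
before rerunning the indexer is to probe the port from Emacs.
This is only a sketch: `demo-port-open-p' is a made-up helper,
not part of Arxana, and a successful probe shows only that
\emph{something} is listening, not that it is PostgreSQL or
that the credentials Sphinx uses will be accepted.
\end{notate}

\begin{idea}
;; Hypothetical helper (not part of Arxana).
(defun demo-port-open-p (host port)
  "Return t if a TCP connection to HOST on PORT succeeds."
  (condition-case nil
      (let ((proc (open-network-stream
                   "demo-probe" nil host port)))
        ;; We only wanted to know whether it opens.
        (delete-process proc)
        t)
    ;; A refused connection is signalled as an error.
    (error nil)))

;; Usage, with the host and port named in the error message:
(demo-port-open-p "localhost" 5432)
\end{idea}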

\section{Appendix: A simple literate programming system} \label{appendix-lit}

\begin{notate}{The literate programming system used in this paper}
This code defines functions that grab all the Lisp
portions of this document, evaluate the Emacs Lisp
sections in Emacs, and save the Common Lisp sections in
suitable files.\footnote{Cf. {\tt
http://mmm-mode.sourceforge.net/}}  It requires
that the \LaTeX\ be written in a certain consistent way.
The function assumes that this document is the current
buffer.

\begin{verbatim}
(require 'cl-lib)  ; provides `cl-set-difference'

(defvar lit-code-beginning-regexp
  "^\\\\begin{elisp}\\|^\\\\begin{common}{\\([^}\n]*\\)}")

(defvar lit-code-end-regexp
  "^\\\\end{elisp}\\|^\\\\end{common}")

(defun lit-process ()
  (interactive)
  (save-excursion
    (let ((to-buffer "*Lit Code*")
          (from-buffer (buffer-name (current-buffer)))
          (start-buffers (buffer-list)))
      (set-buffer (get-buffer-create to-buffer))
      (erase-buffer)
      (set-buffer (get-buffer-create from-buffer))
      (goto-char (point-min))
      ;; Copy each code section into a staging buffer: Emacs
      ;; Lisp goes to "*Lit Code*", Common Lisp goes to
      ;; "*Lit Code*: FILE".
      (while (re-search-forward
              lit-code-beginning-regexp nil t)
        (let* ((file (match-string 1))
               (beg (match-end 0))
               (end (save-excursion
                      (search-forward-regexp
                       lit-code-end-regexp nil t)
                      (match-beginning 0)))
               (match (buffer-substring-no-properties
                       beg end)))
          (let ((to-buffer
                 (if file
                     (concat "*Lit Code*: " file)
                   "*Lit Code*")))
            (save-excursion
              (set-buffer (get-buffer-create
                           to-buffer))
              (insert match)))))
      ;; Evaluate the Emacs Lisp, write out the Common Lisp
      ;; files, and kill the staging buffers.
      (dolist
          (buffer (cl-set-difference (buffer-list)
                                     start-buffers))
        (save-excursion
          (set-buffer buffer)
          (if (string= (buffer-name buffer)
                       "*Lit Code*")
              (eval-buffer)
            ;; Drop the 12-character "*Lit Code*: " prefix
            ;; to recover the file name.
            (write-region (point-min)
                          (point-max)
                          (concat "~/arxana/"
                                  (substring
                                   (buffer-name
                                    buffer)
                                   12)))))
        (kill-buffer buffer)))))
\end{verbatim}
\end{notate}

\begin{notate}{Emacs-export?}
It wouldn't be hard to export the Elisp sections so
that those who wanted to could ditch the literate
wrapper.
\end{notate}
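
\begin{notate}{A sketch of such an export}
As a rough illustration of the previous Note, the function
below gathers the contents of every {\tt elisp} environment
in the current buffer and writes them, in order, to a file.
The name `lit-export-elisp' is hypothetical and the function
is not part of Arxana; it finds the environment delimiters
much as `lit-process' does, but it neither evaluates
anything nor touches the Common Lisp sections.
\end{notate}

\begin{idea}
;; Hypothetical helper (not part of Arxana).
(defun lit-export-elisp (file)
  "Write the contents of all elisp environments to FILE."
  (interactive "FExport Elisp sections to file: ")
  (let ((chunks nil))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "^\\\\begin{elisp}" nil t)
        (let ((beg (match-end 0)))
          ;; Skip a section that is never closed.
          (when (re-search-forward "^\\\\end{elisp}" nil t)
            (push (buffer-substring-no-properties
                   beg (match-beginning 0))
                  chunks)))))
    ;; CHUNKS was built in reverse; restore document order.
    (with-temp-file file
      (dolist (chunk (nreverse chunks))
        (insert chunk "\n")))))

;; Usage, with this document as the current buffer (the
;; output file name is arbitrary):
;; (lit-export-elisp "~/arxana-elisp-only.el")
\end{idea}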

\begin{notate}{Bidirectional updating}
Eventually it would be nice to have a code repository set
up, and make it so that changes to the code can get
snarfed up here.
\end{notate}

\begin{notate}{A literate style}
Ideally, each function will have its own Note to introduce
it, and will not be called before it has been defined.  I
sometimes make an exception to this rule: for example,
functions used to form recursions may appear with no
further introduction, and may be called before they are
defined.
\end{notate}

\section{Appendix: Hypertext platforms} \label{appendix-hyper}

\begin{notate}{The hypertextual canon} \label{canon}
There is a core library of texts that come up in
discussions of hypertext.
\begin{itemize}
% \item (Plato)
\item The Rosetta Stone
\item The Talmud (Judah haNasi, Rav Ashi, and many others)
\item Monadology (Gottfried Wilhelm Leibniz)
\item The Life and Opinions of Tristram Shandy, Gentleman
  (Laurence Sterne)
\item Middlemarch (George Eliot)
% \item The Gay Science (Friedrich Nietzsche)
% \item (Wittgenstein)
% \item (Alan Turing)
\item The Nova Trilogy (William S. Burroughs)
\item The Logic of Sense (Gilles Deleuze)
% \item Open Creation and its Enemies (Asger Jorn)
\item Labyrinths (Jorge Luis Borges)
\item Literary Machines (Ted Nelson)
% \item Simulacra and Simulation (Jean Baudrillard)
\item Lila (Robert M. Pirsig)
% \item \TeX: the program (Donald Knuth)
\item Dirk Gently's Holistic Detective Agency
  (Douglas Adams)
\item Pussy, King of the Pirates (Kathy Acker)
% \item Rachel Blau DuPlessis,
% \item Emily Dickinson
% \item Gertrude Stein
% \item Zora Neale Hurston
\end{itemize}
At the same time, it is somewhat ironic that none of the
items on this list are themselves hypertexts in the
contemporary sense of the word.  It's also a bit funny
that certain other works (even some by the same authors)
aren't on this list.  Perhaps we begin to get a sense of
what's going on in this quote from Kathleen
Burnett:\footnote{{\tt http://www.iath.virginia.edu/pmc/text-only/issue.193/burnett.193}}
\begin{quote}
``Multiplicity, as a hypertextual principle, recognizes a
multiplicity of relationships beyond the canonical
(hierarchical).  Thus, the traditional concept of
literary authorship comes under attack from two
quarters--as connectivity blurs the boundary between
author and reader, multiplicity problematizes the
hierarchy that is canonicity.''
\end{quote}
It seems quite telling that non-hypertextual canons remain
mostly-non-hypertextual even today, despite the existence
of catalogs, indexes, and online access.\footnote{{\tt
http://www.gutenberg.org/wiki/Category:Bookshelf}}
\end{notate}

\begin{notate}{A geek's guide to literature}
This title is a riff on Slavoj \v{Z}i\v{z}ek's ``The
Pervert's Guide to Cinema''.  Taking Note \ref{canon} as a
jumping-off point, why don't we make a survey of
historical texts from the point of view of an aficionado
of hypertext!  Just what does one have to do to ``get on
the list''?  Just what is ``the hypertextual
perspective''?  And, if \v{Z}i\v{z}ek is correct and we're
to look for the hyperreal in the world of cinematic
fictions -- what's left over for the world of literature?
(Or mathematics?)
\end{notate}

\begin{notate}{The number 3}
This is the number of things present if we count carefully
the items $A$, $B$, and a connection $C$ between them.
[Picture of $A\xrightarrow{C} B$.]

(Or even: given $A$ and $B$, we use Wittgenstein counting,
and \emph{intuit} that $C$ exists as the collection
$\{A, B\}$; after all,
some connection must exist precisely because we were
presented with $A$ and $B$ together -- and lest the
connections proliferate infinitely, we lump them all
together as one.  [Picture of $A$, $B$,
with the \emph{frame} labeled $C$.])
\end{notate}

\begin{notate}{Surfaces}
Deleuze talks about a theory of surfaces associated with
verbs and events.  His surfaces represent the evanescence
of events in time, and of their descriptions in language.
An event is seen as a vanishingly-thin boundary between
one state of being and another.

Certainly, a statement that is true \emph{now} may not be
true five minutes from now.  It is easier to think and
talk about things that are coming up and things that have
already happened.  ``Living in the moment'' is regarded as
special or even ``Zen''.

We can begin to put these musings on a more solid
mathematical basis.  We first examine two types of
\emph{interfaces}:
\begin{enumerate}
\item $A\xrightarrow{C} B$, $A\xrightarrow{D} B$,
  $A\xrightarrow{E} B$
  (the interface of $A$ and $B$ across $C$, $D$, and $E$);
\item $A\xrightarrow{C} B$, $D\xrightarrow{C} E$,
  $F\xrightarrow{C} G$
  (the interface of various terms across $C$).
\end{enumerate}
\end{notate}
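
\begin{notate}{Interfaces as queries}
One way to make the two kinds of interface concrete is to
treat them as two queries over a list of triples.  The sketch
below assumes that a triple is a list {\tt (SOURCE LINK SINK)},
so that $A\xrightarrow{C} B$ is written {\tt ("A" "C" "B")};
that representation and the `demo-' function names are
illustrative assumptions, not Arxana's storage format.
\end{notate}

\begin{idea}
(require 'cl-lib)

;; Hypothetical helpers (not part of Arxana).
(defun demo-links-between (a b triples)
  "Return the links of those TRIPLES that run from A to B."
  (cl-loop for (source link sink) in triples
           when (and (equal source a) (equal sink b))
           collect link))

(defun demo-pairs-across (c triples)
  "Return (SOURCE . SINK) for those TRIPLES whose link is C."
  (cl-loop for (source link sink) in triples
           when (equal link c)
           collect (cons source sink)))

;; First kind: the interface of A and B across C, D, and E.
(demo-links-between
 "A" "B" '(("A" "C" "B") ("A" "D" "B") ("A" "E" "B")))
;; => ("C" "D" "E")

;; Second kind: the interface of various terms across C.
(demo-pairs-across
 "C" '(("A" "C" "B") ("D" "C" "E") ("F" "C" "G")))
;; => (("A" . "B") ("D" . "E") ("F" . "G"))
\end{idea}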

\begin{notate}{Comic books}
No geek's guide to literature would be complete without
putting comics in a hallowed place.  [Framed picture of
$A$, $B$ next to framed
picture of $A$, $B$, $a$.]  What happened?
$\ddot{\smile}$
\end{notate}

\begin{notate}{Intersecting triples}
Diagrammatically, it is tempting to portray
$(ACB)_{\mathrm{mid}}DE$ as if it were closely related to
$A(CDE)_{\mathrm{beg}}B$, despite the fact that they are
notationally very different.  I'll have to think more
about what this means.
\end{notate}

\section{Appendix: Computational Linguistics} \label{appendix-linguistics}

\begin{notate}{What is this?}
It might be reasonable to make annotating sentences part
of our writeup on hypertext platforms -- but I'm putting
it here for now.  If hypertext is what deals with language
artifacts on the ``bulky'' level (saying, for example,
that a subsection is part of a section, and so on), then
computational linguistics is what deals with the finer
levels.  However, the distinction is in some ways
arbitrary, and many of the techniques should be at least
vaguely similar.
\end{notate}

\begin{notate}{Annotation sensibilities}\label{sense}
We will want to be able to make at least two different
kinds of annotations of verbs.  For example, given the
statement
\begin{itemize}
\item[$S$.] (``Who'' ``is on'' ``first''),
\end{itemize}
I'd like to be able to say
\begin{itemize}
\item[I.] (``is on'' ``means'' ``the position of a base runner in baseball'').
\end{itemize}
However, I'd also like to be able to say
\begin{itemize}
\item[II.] (``is on'' ``because'' ``he was walked'').
\end{itemize}
Annotation I is meant to apply to the term ``is on''
itself (in a context that might be more general than just
this one sentence).  If Who is also on steroids, that's
another matter -- as this type of annotation helps make
clear!

Annotation II is meant to apply to the term ``is on''
\emph{as it
appears in sentence $S$}.  In particular, Annotation II
seems to work best in a context in which we've already
accepted the ontological status of the verb-phrase ``is
on first''.

Whereas Annotation I should presumably exist before
statement $S$ is ever made (and it certainly helps make
that statement make sense), Annotation II is most properly
understood with reference to the fully-formed statement
$S$.  However, Annotation II is different from a statement
like ($S$ ``has truth value'' $F$) in that it looks into
the guts of $S$.
\end{notate}
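
\begin{notate}{The two annotations as data}
The distinction can be sketched in Lisp by letting the
source of a triple be either a bare term or a reference to a
place inside another triple.  Everything here is an
assumption made for illustration: triples are lists
{\tt (SOURCE LINK SINK)}, a place reference is a list
{\tt (TRIPLE PLACE)} with {\tt PLACE} one of {\tt beg},
{\tt mid}, or {\tt end}, and the `demo-' names are invented.
\end{notate}

\begin{idea}
;; Hypothetical representation (not part of Arxana).
(defvar demo-statement '("Who" "is on" "first"))

;; Annotation I: about the term "is on" wherever it occurs.
(defvar demo-annotation-1
  '("is on" "means"
    "the position of a base runner in baseball"))

;; Annotation II: about the middle place of statement S only.
(defvar demo-annotation-2
  (list (list demo-statement 'mid)
        "because" "he was walked"))

(defun demo-annotation-scope (annotation)
  "Say whether ANNOTATION is `global' or `local'.
It is global if its source is a bare term, and local if its
source is a place inside some other triple."
  (if (stringp (car annotation)) 'global 'local))

(demo-annotation-scope demo-annotation-1)  ; => global
(demo-annotation-scope demo-annotation-2)  ; => local
\end{idea}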

\begin{notate}{Comparison of places and ontological status} \label{places-and-onto-status}
The difference between (I) a ``global'' annotation, and
(II) the annotation of a specific sentence is analogous to
the difference between (a) relationships between objects
without a place, and (b) relationships between objects in
specific places.  (Cf. Note \ref{sense}: ``global''
statements are of course made ``local'' by the theories
that scope them.)

For example, in a descriptive ontology of research
documents, I might make the ``placeless'' statement,
\begin{itemize}
\item[a.] (``Introduction'' ``names'' ``a section'')
\end{itemize}
On the other hand, the statement
\begin{itemize}
\item[b.] (``Introduction'' ``has subject'' ``American
  History''),
\end{itemize}
seems likely to be about a specific Introduction.  (And
somewhere in the backend, this triple should be expressed
in terms of places!)
\end{notate}

\begin{notate}{Semantics}
In a sentence like
\begin{quote}
(((``I'' ``saw'' ``myself'')$_{\mathrm{mid}}$ ``as if''
``through a glass'')$_{\mathrm{beg}}$ ``but'' ``darkly'')
\end{quote}
first of all, there may be different parenthesizations,
and second of all, the semantics of links like ``as if''
and ``but'' may shape, to some extent, the ways in
which we parenthesize.
\end{notate}
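
\begin{notate}{Following a parenthesization}
To see what a given parenthesization commits us to, one can
write the sentence as nested triples and ask which term each
subscripted group finally points at.  The sketch below
assumes that a subscripted group such as
(``I'' ``saw'' ``myself'')$_{\mathrm{mid}}$ is written as a
list {\tt (TRIPLE PLACE)}; this reading of the subscripts,
the place name {\tt end}, and `demo-resolve' itself are
assumptions made for illustration.
\end{notate}

\begin{idea}
;; Hypothetical helper (not part of Arxana).
(defun demo-resolve (term)
  "Follow (TRIPLE PLACE) references in TERM down to a string."
  (if (stringp term)
      term
    (let ((triple (car term))
          (index (cdr (assq (cadr term)
                            '((beg . 0) (mid . 1) (end . 2))))))
      (demo-resolve (nth index triple)))))

;; The sentence above, built from the inside out.
(let* ((inner  '("I" "saw" "myself"))
       (middle (list (list inner 'mid)
                     "as if" "through a glass"))
       (outer  (list (list middle 'beg)
                     "but" "darkly")))
  ;; What does "but darkly" finally attach to?
  (demo-resolve (car outer)))
;; => "saw"
\end{idea}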

\section{Appendix: Resource use} \label{appendix-resources}

\begin{notate}{Free culture in action}
I thought it worthwhile to include this quote from
a joint paper with Aaron Krowne:\footnote{See Footnote
\ref{corneli-krowne}.}
\begin{quote}
``[F]ree content typically
manifests aspects of a common resource as well as an
open access resource; while anyone can do essentially
whatever they wish with the content offline, in its
online life, the content is managed in a
socially-mediated way.  In particular, rights to
\emph{in situ} modification tend to be strictly
controlled. [...]  By finding new ways to support
freedom of speech within CBPP documents, we embrace
subjectivity as a way to enhance the content of an
intersubjectively valued corpus.  In the context of
``hackable'' media and maintenance protocols, the
semantics with which scholia are handled can be improved
upon indefinitely on a user-by-user basis and a
resource-wide basis.  This is free culture in action.''
\end{quote}
\end{notate}

\begin{notate}{Learning}
The learner, confronted with a learning resource, or the
consumer of any other information resource (or indeed,
practically any resource whatsoever) may want a chance to
respond to the questions ``was this what you were looking
for?'' and ``did you find this helpful?''.  In some cases,
an independent answer to these questions could be generated
(e.g. if a student is seen to come up with a correct
answer, or not).
\end{notate}

\begin{notate}{Connections}
A useful communication goal is to expose some of the
connections between disparate resources.  Some existing
connections may be far more explicit than others.  It's
important to facilitate the making and explicating of
connections by ``third parties'' (Note
\ref{browser-within}).  The search for connections between
ostensibly unrelated things is a key part of both
creativity and learning.  In addition, connecting with
what others are doing is an important part of being a
social animal.
\end{notate}

\begin{notate}{Boundaries}
Notice that the departmentalization of knowledge is
similar to any regime that oversees and administers
boundaries.  In addition to bridging different areas,
learning often involves pushing one's boundaries and
getting out of one's comfort zone.  The ``sociological
imagination'' involves seeing oneself as part of something
bigger; this goes along with the idea of a discourse that
lowers or transcends the boundaries between participants.
Imagination of any form can challenge myopic patterns of
resource use, although there are also myopic fictions
which neglect to look at what's going on in reality!
\end{notate}

\end{document}