
   1 %;; arxana.tex                   -*- mode: Emacs-Lisp; -*-
   2 %;; Copyright (C) 2005-2009 Joe Corneli <holtzermann17@gmail.com>
   3 %;; DISOWNED! THIS FILE IS PUBLIC DOMAIN. DO WHAT YOU WILL!
   4
   5 % (progn
   6 %   (find-file "~/arxana.tex")
   7 %   (save-excursion
   8 %     (goto-char (point-max))
   9 %     (let ((beg (progn (search-backward "\\begin{verbatim}")
  10 %                       (match-end 0)))
  11 %           (end (progn (search-forward "\\end{verbatim}")
  12 %                       (match-beginning 0))))
  13 %       (eval-region beg end)
  14 %       (lit-process))))
  15
  16 %%% Commentary:
  17
  18 %% To load: remove %'s above and evaluate with C-x C-e.
  19
  20 %% Alternatively, run this:
  21 % head -n 13 arxana.tex | sed -e "/%/s///" > arxana-loader.el
  22 %% on the command line to produce something you can use
  23 %% to load Arxana when you start Emacs:
  24 % emacs -l arxana-loader.el
  25
%% Or put the expression in your ~/.emacs (perhaps wrapped
%% in a function like `eval-arxana').
  28
  29 %% Or search for a similar form below and evaluate there!
  30
  31 %% Q.  Where exactly are we supposed to store the most
  32 %% up-to-date Arxana files when they are ready to go?
  33
  34 %% A.  Copy them into /usr/lib/sbcl/site-systems/arxana/
  35 %% and that should be enough.  Make sure that arxana.asd
  36 %% is in that directory and that you have a symbolic link,
  37 %% made via
  38
  39 %% ln -s ./arxana/arxana.asd .
  40
  41 %% in the directory /usr/lib/sbcl/site-systems/
  42 %% -- Make sure to load once as root to generate new fasls.
  43
  44 %% Q. How to run the remote slime after that?
  45
  46 %% A. Make sure that Emacs `slime-protocol-version' matches
  47 %% Common Lisp's `swank::*swank-wire-protocol-version*', then,
  48 %% like this:
  49
  50 %% ssh -L 4005:127.0.0.1:4005 joe@li23-125.members.linode.com
  51 %% linode$ sbcl
  52 %% M-x slime-connect RET RET
  53
  54 %%% Code:
  55
  56 \documentclass{article}
  57
  58 \usepackage{amsmath}
  59 \usepackage{amsthm}
  60 \usepackage{verbatim}
  61
  62 \newcommand{\meta}[1]{$\langle${\it #1}$\rangle$}
  63
  64 \theoremstyle{definition}
  65 \newtheorem{nota}{Note}[section]
  66
  67 \parindent = 1.2em
  68
  69 \newenvironment{notate}[1]
  70   {\begin{nota}[{\bf {\em #1}}]}%
  71   {\end{nota}}
  72
  73 \makeatletter
  74 \newenvironment{elisp}
  75   {\let\ORGverbatim@font\verbatim@font
  76    \def\verbatim@font{\ttfamily\scshape}%
  77    \verbatim}
  78   {\endverbatim
  79   \let\verbatim@font\ORGverbatim@font}
  80 \makeatother
  81
  82 \makeatletter
  83 \newenvironment{common}[1]
  84   {\let\ORGverbatim@font\verbatim@font
  85    \def\verbatim@font{\ttfamily\scshape}%
  86    \verbatim}
  87   {\endverbatim
  88   \let\verbatim@font\ORGverbatim@font}
  89 \makeatother
  90
  91 \makeatletter
  92 \newenvironment{idea}
  93   {\let\ORGverbatim@font\verbatim@font
  94    \def\verbatim@font{\ttfamily\slshape}%
  95    \verbatim}
  96   {\endverbatim
  97   \let\verbatim@font\ORGverbatim@font}
  98 \makeatother
  99
 100 \begin{document}
 101
 102 \title{\emph{Arxana}}
 103
 104 \author{Joseph Corneli\thanks{Copyright (C) 2005-2010
 105     Joseph Corneli {\tt <holtzermann17@gmail.com>}\newline
 106     $\longrightarrow$ transferred to the public domain.}}
 107 \date{Last revised: \today}
 108
 109 \maketitle
 110
 111 \abstract{A tool for building hackable semantic hypertext
 112   platforms.  Source code and mailing lists are at {\tt
 113     http://common-lisp.net/project/arxana}.}
 114
 115 \tableofcontents
 116
 117 \section{Introduction}
 118
 119 \begin{notate}{What is ``Arxana''?} \label{arxana}
 120 \emph{Arxana} is the name of a ``next generation''
 121 hypertext system that emphasizes annotation.  Every object
 122 in this system is annotatable.  Because of this, I
 123 sometimes call Arxana's core ``the scholium system'', but
 124 the name ``Arxana'' better reflects our aim: to explore
 125 the mysterious world of links, attachments,
 126 correspondences, and side-effects.
 127 \end{notate}
 128
 129 \begin{notate}{The idea} \label{theoretical-context}
 130 A scholia-based document model for commons-based peer
 131 production will inform the development of our
 132 system.\footnote{{\tt
 133 http://www.metascholar.org/events/2005/freeculture/viewabstract.php?id=19
 134 % alternate:
 135 % http://br.endernet.org/~akrowne/planetmath/papers/corneli\_fcdl/corneli-krowne.pdf
 136 \label{corneli-krowne}
 137 }}
 138 In this model, texts are made up of smaller texts until
 139 you get to atomic texts; user actions are built in the
 140 same way.  Multiple users should interact with a shared
 141 persistent data-store, through functional annotation, not
 142 destructive modification.  We should pursue the
 143 asynchronous interaction model until we arrive at live,
 144 synchronous, settings, where we facilitate real-time
 145 computer-mediated interactions between users, and between
 146 users and running hackable programs.
 147 \end{notate}
 148
 149 \begin{notate}{The data model} \label{data-model}
 150 Start by storing a collection of \emph{strings}.  Now add
 151 in \emph{pairs} and \emph{triples} which point at 2 and 3
 152 objects respectively.  (We can extend to n-tuples if that
 153 turns out to be convenient.)  Finally, we will maintain a
 154 collection of \emph{lists}, each of which points at an
 155 unlimited number of objects.
 156 \end{notate}
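
\begin{notate}{Sketch: the data model in memory}
To make the data model concrete, here is a minimal
in-memory sketch.  It is only an illustration under stated
assumptions: the names `store', `store-add' and `make-ref'
are hypothetical, and keyword tags stand in for the numeric
codes introduced in Section \ref{sql-code}.  Strings are
atomic; a triple points at three objects; a list points at
any number of objects, and may repeat them.
\end{notate}

\begin{idea}
;; Hypothetical in-memory stand-in for the database.
(defstruct store
  (strings (make-array 0 :adjustable t :fill-pointer t))
  (pairs   (make-array 0 :adjustable t :fill-pointer t))
  (triples (make-array 0 :adjustable t :fill-pointer t))
  (lists   (make-array 0 :adjustable t :fill-pointer t)))

(defun store-add (vector item)
  "Append ITEM to VECTOR and return its index."
  (vector-push-extend item vector))

(defun make-ref (kind index)
  "A reference to an object: which collection, which slot."
  (list kind index))

(defun demo-store ()
  "Build one triple and one list over three strings."
  (let* ((s (make-store))
         (a (make-ref :string
                      (store-add (store-strings s) "Arxana")))
         (b (make-ref :string
                      (store-add (store-strings s) "is a")))
         (c (make-ref :string
                      (store-add (store-strings s) "system")))
         (tr (make-ref :triple
                       (store-add (store-triples s)
                                  (list a b c))))
         ;; Lists may be of any length and may repeat items.
         (l (make-ref :list
                      (store-add (store-lists s)
                                 (list a tr c a)))))
    (list tr l)))

(demo-store)
;; => ((:TRIPLE 0) (:LIST 0))
\end{idea}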
 157
 158 \begin{notate}{History}
 159 Thinking about how to improve existing systems for
 160 peer-based collaboration in 2004, I designed a simple
 161 version of the scholium system that treated textual
 162 commentary and markup as scholia.\footnote{{\tt
 163     http://wiki.planetmath.org/AsteroidMeta/old\_draft\_of\_scholium\_system}}
 164 In 2006, I put together a single-user version of this
 165 system that ran exclusively under Emacs.\footnote{{\tt
 166     http://metameso.org/files/sbdm4cbpp.tex} \label{old-version}}
 167 The current system is an almost-completely rewritten
 168 variant, bringing in a shared database and various other
 169 enhancements to support multi-user interaction.
 170 \end{notate}
 171
 172 \begin{notate}{A brisk review of the programming literature} \label{prog-lit-review}
 173 Many years before I started working on this project, there
 174 was something called the Emacs HyperText
 175 System.\footnote{{\tt
 176     http://www.aue.aau.dk/\~{}kock/Publications/HyperBase/}}
What we're doing here updates that idea for modern database methods,
 178 uses a more interesting data storage format, and also
 179 considers multiple front-ends to the same database (for
 180 example, a web interface).
 181
 182 Contemporary Emacs-based hypertext creation systems
 183 include Muse and Emacs Wiki.\footnote{{\tt
 184     http://mwolson.org/projects/EmacsMuse.html}}$^,$\footnote{{\tt
 185     http://mwolson.org/projects/EmacsWiki.html}} The
 186 browsing side features old standbys, Info and
 187 Emacs/w3m\footnote{Not to be confused with Emacs-w3m,
 188   which is not entirely ``Emacs-based''.}.  These packages
provide ways to author or view what we should now
 190 call ``traditional'' hypertext documents.
 191
Another legacy tool worth mentioning is
 193 HyperCard\footnote{{\tt
 194     http://en.wikipedia.org/wiki/HyperCard}}.  This system
 195 was oriented around the idea of using hypertext to create
 196 software, a vision we share, but like just about everyone
 197 else working in the field at the time, it used
 198 uni-directional links.
 199
 200 Hypertext \emph{nouveau} is based on semantic triples.
 201 The Semantic Web standard provides one specification of
 202 the features we can expect from triples.\footnote{{\tt
 203     http://www.w3.org/TR/2004/REC-rdf-primer-20040210/}}
 204 Triples provide a framework for knowledge representation
 205 with more depth and flexibility than the popular
 206 ``tagging'' methodology.  For example, suitable
 207 collections of triples implement AI-style ``frames''.  The
 208 idea of using triples to organize archival material is
 209 generating some interest as Semantic Web ideas
 210 spread.\footnote{Cf. recent museum and library
 211   conferences}$^,$\footnote{Even among academic computer
 212   scientists! (Josh Grochow, p.c.)}
 213
A project abstractly similar to Arxana, with some grand
goals, is being developed by Chris Hanson at MIT under the
 216 name ``Web-scale Environments for Deduction
 217 Systems''.\footnote{{\tt
 218     http://publications.csail.mit.edu/abstracts/abstracts07/cph2/cph2.html}}
 219
Another technically similar project is Freebase, a
hand-rolled database of open content, organized on frame-based,
triple-driven principles.  The developer of the Freebase
 223 graphd database has some interesting things to say about
 224 old and new ways of handling triples.\footnote{{\tt
 225     http://blog.freebase.com/2008/04/09/a-brief-tour-of-graphd/}}
 226 \end{notate}
 227
 228 \begin{notate}{Fitting in}
 229 My current development goal is to use this system to
 230 create a more flexible multiuser interaction platform than
 231 those currently available to web-based collaborative
 232 projects (such as PlanetMath\footnote{{\tt
 233     http://planetmath.org}}).  As an intermediate stage,
 234 I'm using Arxana to help organize material for a book I'm
 235 writing.  Arxana's theoretical generality, active
 236 development status, detailed documentation, and
 237 superlatively liberal terms of use may make it an
 238 attractive option for you to try as well!
 239 \end{notate}
 240
 241 \begin{notate}{What you get}
 242 Arxana has an Emacs frontend, a Common Lisp middle-end,
 243 and a SQL backend.  If you want to do some work, any one
 244 of these components can be swapped out and replaced with
 245 the engine of your choice.  I've released all of the
 246 implementation work on this system into the public domain,
 247 and it runs on an entirely free/libre/open source software
 248 platform.
 249 \end{notate}
 250
 251 \begin{notate}{Acknowledgements}
 252 Ted Nelson's ``Literary Machines'' and Marvin Minsky's
 253 ``Society of Mind'' are cornerstones in the historical and
 254 social contextualization of this work.  Alfred Korzybski's
 255 ``Science and Sanity'' and Gilles Deleuze's ``The Logic of
 256 Sense'' provided grounding and encouragement.  \TeX\ and
 257 GNU Emacs have been useful not just in prototyping this
 258 system, but also as exemplary projects in the genre I'm
 259 aiming for.  John McCarthy's Elephant 2000 was an
 260 inspiring thing to look at and think about\footnote{{\tt
 261     http://www-formal.stanford.edu/jmc/elephant/elephant.html}}, and of course Lisp has been a vital ingredient.
 262
 263 Thanks also to everyone who's talked about this project
 264 with me!
 265 \end{notate}
 266
 267 \section{Using the program}
 268
 269 \begin{notate}{Dependencies} \label{dependencies}
 270 Our interface is embedded in Emacs.  Backend processing is
 271 done with Common Lisp.  We are currently using the
 272 PostgreSQL database.  These packages should be available
 273 to you through the usual channels.  (I've been using SBCL,
 274 but any Lisp should do; please make sure you are using a
 275 contemporary Emacs version.)
 276
 277 We will connect Emacs to Lisp via Slime\footnote{{\tt
 278     http://common-lisp.net/project/slime/}}, and Lisp to
 279 PostgreSQL via CLSQL.\footnote{{\tt http://clsql.b9.com/}}
 280 CLSQL also talks directly to the Sphinx search engine,
 281 which we use for text-based search.\footnote{{\tt
 282     http://www.sphinxsearch.com/}} Once all of these
 283 things are installed and working together, you should be
 284 able to begin to use Arxana.
 285
 286 Setting up all of these packages can be a somewhat
 287 time-consuming and confusing task, especially if you
 288 haven't done it before!  See Appendix \ref{appendix-setup}
 289 for help.
 290 \end{notate}
 291
 292 \begin{notate}{Export code and set up the interface}
 293 If you are looking at the source version of this document
 294 in Emacs, evaluate the following s-expression (type
 295 \emph{C-x C-e} with the cursor positioned just after its
 296 final parenthesis).  This exports the Common Lisp
 297 components of the program to suitable files for subsequent
 298 use, and prepares the Emacs environment.  (The code that
 299 does this is in Appendix \ref{appendix-lit}.)
 300 \end{notate}
 301
 302 \begin{idea}
 303 (save-excursion
 304   (let ((beg (search-forward "\\begin{verbatim}"))
 305         (end (progn (search-forward "\\end{verbatim}")
 306                     (match-beginning 0))))
 307     (eval-region beg end)
 308     (lit-process)))
 309 \end{idea}
 310
 311 \begin{notate}{To load Common Lisp components at run-time} \label{load-at-runtime}
 312 Link {\tt arxana.asd} somewhere where Lisp can find it.
 313 Then run commands like these in your Lisp; if you like,
 314 you can place all of this stuff in your config file to
 315 automatically load Arxana when Lisp starts.  The final
 316 form is only necessary if you plan to use CLSQL's special
 317 syntax on the Lisp command-line.
 318 \end{notate}
 319
 320 \begin{idea}
 321 (asdf:operate 'asdf:load-op 'clsql)
 322 (asdf:operate 'asdf:load-op 'arxana)
 323 (in-package arxana)
 324 (connect-to-database)
 325 (locally-enable-sql-reader-syntax)
 326 \end{idea}
 327
 328 \begin{notate}{To connect Emacs to Lisp}
 329 Either run {\tt M-x slime RET} to start and connect to
Lisp locally, or {\tt M-x slime-connect RET RET} after you
have opened a connection to your remote server with
 332 a command like this: {\tt ssh -L 4005:127.0.0.1:4005
 333   <username>@<host>} and started Lisp and the Swank server
 334 on the remote machine.  To have Swank start automatically
 335 when you start Lisp, put commands like this in your config
 336 file.
 337 \end{notate}
 338
 339 \begin{idea}
 340 (asdf:operate 'asdf:load-op 'swank)
 341 (setf swank:*use-dedicated-output-stream* nil)
 342 (setf swank:*communication-style* :fd-handler)
 343 (swank:create-server :dont-close t)
 344 \end{idea}
 345
 346 \begin{notate}{To define database structures}
 347 If you haven't yet defined the basic database structures,
 348 make sure to load them now!  (Using {\tt tabledefs.lisp},
 349 or the SQL code in Section \ref{sql-code})
 350 \end{notate}
 351
\begin{notate}{Importing this document into the system}
 353 You can browse this document inside Arxana: after loading
 354 the code, run \emph{M-x autoimport-arxana}.
 355 \end{notate}
 356
 357 \section{SQL tables} \label{sql-code}
 358
 359 \begin{notate}{Objects and codes} \label{objects-and-codes}
 360 Every object in the system is identified by an ordered
 361 pair: a \emph{code} and a \emph{reference}.  The codes say
 362 which table contains the indicated object, and references
provide that object's id.  To refer to a specific element of a
list or n-tuple, a third number, that element's \emph{offset},
is required.  The codes are as follows:
 366
 367 \begin{center}
 368 \begin{tabular}{|l|l|}
 369 \hline
 370 0 & list \\ \hline
 371 1 & string \\ \hline
 372 2 & pair \\ \hline
 373 3 & triple \\ \hline
 374 \end{tabular}
 375 \end{center}
 376 \end{notate}
 377
 378 \begin{idea}
 379 CREATE TABLE strings (
 380    id SERIAL PRIMARY KEY,
 381    text TEXT NOT NULL UNIQUE
 382 );
 383
 384 CREATE TABLE pairs (
 385    id SERIAL PRIMARY KEY,
 386    code1 INT NOT NULL,
 387    ref1 INT NOT NULL,
 388    code2 INT NOT NULL,
 389    ref2 INT NOT NULL,
 390    UNIQUE (code1, ref1,
 391            code2, ref2)
 392 );
 393
 394 CREATE TABLE triples (
 395    id SERIAL PRIMARY KEY,
 396    code1 INT NOT NULL,
 397    ref1 INT NOT NULL,
 398    code2 INT NOT NULL,
 399    ref2 INT NOT NULL,
 400    code3 INT NOT NULL,
 401    ref3 INT NOT NULL,
 402    UNIQUE (code1, ref1,
 403            code2, ref2,
 404            code3, ref3)
 405 );
 406 \end{idea}
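
\begin{notate}{Sketch: resolving a coordinate to a table}
The following sketch shows how a coordinate (code ref)
could be turned into a lookup against the tables above.
It assumes the table of codes given in Note
\ref{objects-and-codes}; the names `*code-tables*',
`coordinate-table' and `coordinate-query' are hypothetical
and are not part of the exported code.  It only builds the
SQL text, so it can be tried without a database.
\end{notate}

\begin{idea}
(defparameter *code-tables*
  '((0 . "lists") (1 . "strings")
    (2 . "pairs") (3 . "triples"))
  "Assumed mapping from codes to table names.")

(defun coordinate-table (coordinate)
  "Name of the table holding COORDINATE, or NIL."
  (cdr (assoc (first coordinate) *code-tables*)))

(defun coordinate-query (coordinate)
  "SQL text fetching the row COORDINATE points at, or NIL."
  (let ((table (coordinate-table coordinate)))
    (when table
      (format nil "SELECT * FROM ~a WHERE id = ~d;"
              table (second coordinate)))))

(coordinate-query '(3 42))
;; => "SELECT * FROM triples WHERE id = 42;"
(coordinate-query '(9 1))
;; => NIL
\end{idea}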
 407
 408 \begin{notate}{A list of lists}\label{models-of-theories}
 409 As a central place to manage our collections, we first
 410 create a list of lists.  The `heading' is the list's name,
 411 and its `header' is metadata.
 412 \end{notate}
 413
 414 \begin{idea}
 415 CREATE TABLE lists (
 416   id SERIAL PRIMARY KEY,
  heading INT UNIQUE REFERENCES strings(id),
  header INT REFERENCES strings(id)
 419 );
 420 \end{idea}
 421
 422 \begin{notate}{Lists on demand}\label{models-of-theories}
 423 Whenever we want to create a new list, we first add to the
 424 `lists' table, and then create a new table ``listk''
 425 (where k is equal to the new maximum id on `lists').
 426 \end{notate}
 427
 428 \begin{idea}
 429 CREATE TABLE listk (
 430    offset SERIAL PRIMARY KEY,
 431    code INT NOT NULL,
 432    ref INT NOT NULL
 433 );
 434 \end{idea}
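
\begin{notate}{Sketch: generating the definition of ``listk''}
Since the name of each new list table depends on k, the
statement above has to be generated.  Here is one way this
might be done; the function `list-table-definition' is a
hypothetical helper, not part of the exported code.  Its
result could be handed to CLSQL's `execute-command'.  Some
backends treat `offset' as a keyword, so that column name
may need quoting there.
\end{notate}

\begin{idea}
(defun list-table-definition (k)
  "Return the CREATE TABLE statement for list number K."
  (check-type k (integer 1))
  ;; A tilde at the end of a line makes `format' skip the
  ;; newline and the indentation that follows it.
  (format nil "CREATE TABLE list~d (~
               offset SERIAL PRIMARY KEY, ~
               code INT NOT NULL, ~
               ref INT NOT NULL);"
          k))

(list-table-definition 7)
;; => "CREATE TABLE list7 (offset SERIAL PRIMARY KEY,
;;     code INT NOT NULL, ref INT NOT NULL);"
;; (one line; wrapped here for display)
\end{idea}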
 435
 436 \begin{notate}{Side-note on containers via triples}  \label{containers-using-triples}
 437 To model a basic container, we can just use triples like
 438 ``(A in B)''.  This is useful, but the elements of B are
 439 of course unordered.  In Section \ref{importing}, we make
 440 extensive use of triples like (B 1 $\alpha$), (B 2
 441 $\beta$), etc., to indicate that B's first component is
 442 $\alpha$, second component is $\beta$, and so on; so we
 443 can make ordered list-like containers as well.
 444
 445 This is an example of the difference in expressive power
 446 of tags (which only provide a sense of unordered
 447 containment in ``virtual baskets'') and triples (which
 448 here are seen to at least provide the additional sense of
 449 ordered containment in ``virtual filing cabinets'',
 450 although they have much more in store for us); cf. Note
 451 \ref{prog-lit-review}.
 452
As useful as models based on these two principles are,
the user could easily be overloaded by looking
at lots of different containers encoded in raw triples,
all at once.
 457 \end{notate}
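
\begin{notate}{Sketch: baskets and filing cabinets}
The two kinds of containment described above can be
sketched over a plain Lisp list of triples.  The data and
the function names (`members', `ordered-contents') are
made up for illustration; nothing here touches the
database.
\end{notate}

\begin{idea}
(defparameter *toy-triples*
  '(("B" 2 "beta") ("A" "in" "B")
    ("B" 1 "alpha") ("B" 3 "gamma"))
  "Toy triples of the form (beg mid end).")

(defun members (container triples)
  "Unordered containment: every X with (X in CONTAINER)."
  (loop for (beg mid end) in triples
        when (and (equal mid "in") (equal end container))
          collect beg))

(defun ordered-contents (container triples)
  "Ordered containment: components of CONTAINER by position."
  (mapcar #'third
          (sort (loop for triple in triples
                      when (and (equal (first triple) container)
                                (integerp (second triple)))
                        collect triple)
                #'< :key #'second)))

(members "B" *toy-triples*)
;; => ("A")
(ordered-contents "B" *toy-triples*)
;; => ("alpha" "beta" "gamma")
\end{idea}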
 458
 459 \begin{notate}{Sense of containment}
 460 Note that every element of a list is in the list in the
 461 same ``sense'' -- for example, we can't instantly
 462 distinguish elements that are ``halfway in'' from those
 463 that are ``all the way in'', the same way we could with
 464 pure triples.
 465 \end{notate}
 466
 467 %% \begin{notate}{References into theories}
 468 %% Since at the moment we have less than 10 basic codes, we
 469 %% can uniquely reference contents of theory $k$ with ordered
 470 %% pairs $10k+\mathit{basic\ code}$ and $\mathit{reference}$.
 471 %% \end{notate}
 472
 473 \begin{notate}{Uniqueness of strings and triples} \label{unique-things}
An attempt to create a string or triple that duplicates
existing contents generates a warning.  This saves storage, given
 476 possible repetitive use -- and avoids confusion.  We can,
 477 however, reference duplicate ``copies'' on the lists.
 478 \end{notate}
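
\begin{notate}{Sketch: uniqueness without a database}
The intended behaviour can be imitated with a hash table.
This is only a sketch: `*string-ids*' and `intern-string'
are hypothetical names, and the code 1 for strings is
taken from Note \ref{objects-and-codes}.  A duplicate
produces a warning and nil; a list may nevertheless
mention the same coordinates more than once.
\end{notate}

\begin{idea}
(defvar *string-ids* (make-hash-table :test #'equal)
  "Stand-in for the strings table: maps text to id.")

(defun intern-string (text)
  "Return (1 id) if TEXT is new; warn and return NIL if not."
  (if (gethash text *string-ids*)
      (warn "\"~a\" already exists." text)
      (list 1 (setf (gethash text *string-ids*)
                    (1+ (hash-table-count *string-ids*))))))

;; Starting from an empty table:
(intern-string "scholium")   ; => (1 1)
(intern-string "scholium")   ; warns, => NIL
;; A list can still cite the same string twice:
(list '(1 1) '(1 1))
\end{idea}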
 479
 480 \begin{notate}{Change} \label{change}
 481 Notice also that since neither strings nor triples
 482 ``change'', we have to account for change in other ways.
In particular, the contents of lists can change.  (We may
subsequently add some metadata to indicate that certain
lists are ``locked'', or that they can only be changed by
adding, etc., so that their contents can be cited stably
and reliably.)
 488 \end{notate}
 489
 490 %% \begin{notate}{Each place contains one object} \label{places}
 491 %% It is obvious from the table definition that I want each
 492 %% place to contain precisely one thing; perhaps it is less
 493 %% obvious why I want to use a database table to maintain
 494 %% this relationship between ``places'' and ``things''.  This
 495 %% is largely a matter of convenience, but in particular it
 496 %% makes it easy for places to change.
 497 %% \end{notate}
 498
 499 \begin{notate}{Provenance and other metadata} \label{provenance}
 500 We could of course add much more structure to the
 501 database, starting with simple adjustments like adding
 502 provenance metadata or versioning into the records for
 503 each stored thing.  For the time being, I assume that such
 504 metadata will appear in the application or content layer,
as triples.  (The exceptions are the ``headings'' and
 506 ``headers'' associated with lists.)
 507 \end{notate}
 508
 509 \section{Common Lisp-side}
 510
 511 \subsection{Preliminaries}
 512
 513 \subsubsection*{System definition}
 514
 515 \begin{common}{arxana.asd}
 516 (defsystem "arxana"
 517     :version "1"
 518     :author "Joe Corneli <holtzermann17@gmail.com>"
 519     :licence "Public Domain"
 520     :components
 521     ((:file "packages")
 522      (:file "utilities" :depends-on ("packages"))
 523      (:file "database" :depends-on ("utilities"))
 524      (:file "queries" :depends-on ("packages"))))
 525 \end{common}
 526
 527 \subsubsection*{Package definition}
 528
 529 \begin{common}{packages.lisp}
 530 (defpackage :arxana
 531   (:use #:cl #:clsql #:clsql-sys))
 532 \end{common}
 533
 534 \subsubsection*{Utilities}
 535
 536 \begin{notate}{Useful things} \label{useful}
These definitions are either necessary or useful for
working with the database and manipulating triple-centric
 539 and/or theory-situated data.  The implementation of
 540 theories given here is inspired by Lisp's streams.  This
 541 is perhaps the most gnarly part of the code; the pay-off
 542 of doing things the way we do them here is that
 543 subsequently theories can sit ``transparently'' over other
 544 structures.
 545 \end{notate}
 546
 547 \begin{common}{utilities.lisp}
 548 (in-package arxana)
 549 (locally-enable-sql-reader-syntax)
 550
 551 ;; (defun connect-to-database ()
 552 ;;    (connect `("localhost" "joe" "joe" "")
 553 ;;             :database-type :postgresql-socket))
 554
 555 (defun connect-to-database ()
 556    (connect `("localhost" "joe" "joe" "joe")
 557             :database-type :mysql))
 558
 559 (defmacro select-one (&rest args)
 560   `(car (select ,@args :flatp t)))
 561
 562 (defmacro select-flat (&rest args)
 563   `(select ,@args :flatp t))
 564
 565 (defun resolve-ambiguity (stuff)
 566   (first stuff))
 567
 568 (defun isolate-components (content i j)
 569   (list (nth (1- i) content)
 570         (nth (1- j) content)))
 571
 572 (defun isolate-beginning (triple)
 573   (isolate-components (cdr triple) 1 2))
 574
 575 (defun isolate-middle (triple)
 576   (isolate-components (cdr triple) 3 4))
 577
 578 (defun isolate-end (triple)
 579   (isolate-components (cdr triple) 5 6))
 580
 581 (defvar *read-from-heading* nil)
 582
 583 (defvar *write-to-heading* nil)
 584 \end{common}
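
\begin{notate}{Sketch: splitting a row into coordinates}
The `isolate-' functions above each pick one (code ref)
coordinate out of a row of the form (id code1 ref1 code2
ref2 code3 ref3).  As an illustration, the hypothetical
function below returns all of the coordinates at once; its
first, second and third results correspond to what
`isolate-beginning', `isolate-middle' and `isolate-end'
would return for the same row.
\end{notate}

\begin{idea}
(defun row-coordinates (row)
  "Split ROW, (id code1 ref1 code2 ref2 ...), into coordinates."
  ;; Skip the id, then walk the rest two items at a time.
  (loop for (code ref) on (rest row) by #'cddr
        collect (list code ref)))

(row-coordinates '(7 1 10 1 11 1 12))
;; => ((1 10) (1 11) (1 12))
\end{idea}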
 585
 586 \begin{notate}{On `datatype'}
 587 Just translate coordinates into their primary dimension.
(How should this change to accommodate codes 4, 5, 6,
 589 possibly etc.?)
 590 \end{notate}
 591
 592 \begin{common}{utilities.lisp}
(defun datatype (data)
  ;; `case' compares with `eql', which is reliable for
  ;; integers; `eq' on numbers is implementation-dependent.
  (case (car data)
    (0 "strings")
    (1 "places")
    (2 "triples")
    (3 "theories")))
 602
 603 (locally-disable-sql-reader-syntax)
 604 \end{common}
 605
 606 \begin{notate}{Resolving ambiguity}
Often a query will return more than one
item when we are only prepared to deal with
one item.  In order to handle this sort of ambiguity, it
 610 would be great to have either a non-interactive notifier
 611 that says that some ambiguity has been dealt with, or an
 612 interactive tool that will let the user decide which of
 613 the ambiguous options to choose from.  For now, we provide
 614 the simplest non-interactive tool: just choose the first
 615 item from a possibly ambiguous list of items.
 616 \end{notate}
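
\begin{notate}{Sketch: a non-interactive notifier}
The non-interactive notifier mentioned above could be as
simple as the following.  This is a sketch, not part of
the exported code: `resolve-ambiguity-noisily' is a
hypothetical name, and it still just keeps the first item,
but it signals a warning whenever something was discarded.
\end{notate}

\begin{idea}
(defun resolve-ambiguity-noisily (stuff)
  "Return the first item of STUFF; warn if others were dropped."
  (when (rest stuff)
    (warn "Ambiguity: kept ~s, ignored ~d more."
          (first stuff) (length (rest stuff))))
  (first stuff))

(resolve-ambiguity-noisily '(5))       ; => 5, no warning
(resolve-ambiguity-noisily '(5 8 13))  ; warns, => 5
\end{idea}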
 617
 618 \begin{notate}{Using a different database}
 619 See Note \ref{backend-variant} for instructions on changes
 620 you will want to make if you use a different database.
 621 \end{notate}
 622
 623 \begin{notate}{Use of the ``count'' function}
 624 The SQL count function is thought to be inefficient with
 625 some backends; workarounds exist.  (And it's considered to
 626 be efficient with MySQL.)
 627 \end{notate}
 628
 629 \begin{notate}{Abstraction} \label{abstraction}
 630 While it might be in some ways ``nice'' to allow people to
 631 chain together ever-more-abstract references to elements
 632 from other theories, I actually think it is better to
 633 demand that there just be \emph{one} layer of abstraction
 634 (since we can then quickly translate back and forth,
 635 rather than running through a chain of translations).
 636
 637 This does not imply that we cannot have a theory
 638 superimposed over another theory (or over multiple
 639 theories) that draws input from throughout a massively
 640 distributed interlaced system -- rather, just that we
 641 assume we will need to translate to ``base coordinates''
 642 when building such structures.  However, we'll certainly
 643 want to explore the possibilities for running links
 644 between theories (abstractly similar in some sense to
 645 pointing at a component of a triple, but here there's no
 646 uniform beg, mid, end scheme to refer to).
 647 \end{notate}
 648
 649 \subsection{Main table definitions}
 650
 651 \begin{notate}{Defining tables from within Lisp}
 652 This is Lisp code to define the permanent SQL tables
 653 described in Section \ref{sql-code}.
 654 \end{notate}
 655
 656 \begin{common}{tabledefs.lisp}
 657 ;; (execute-command "CREATE TABLE strings (
 658 ;;    id SERIAL PRIMARY KEY,
 659 ;;    text TEXT NOT NULL UNIQUE
 660 ;; );")
 661
 662 (execute-command "CREATE TABLE strings (
 663    id SERIAL PRIMARY KEY,
 664    text TEXT,
 665    UNIQUE INDEX (text(255))
 666 );")
 667
 668 (execute-command "CREATE TABLE places (
 669    id SERIAL PRIMARY KEY,
 670    code INT NOT NULL,
 671    ref INT NOT NULL
 672 );")
 673
 674 (execute-command "CREATE TABLE triples (
 675    id SERIAL PRIMARY KEY,
 676    code1 INT NOT NULL,
 677    ref1 INT NOT NULL,
 678    code2 INT NOT NULL,
 679    ref2 INT NOT NULL,
 680    code3 INT NOT NULL,
 681    ref3 INT NOT NULL,
 682    UNIQUE (code1, ref1,
 683            code2, ref2,
 684            code3, ref3)
 685 );")
 686
 687 (execute-command "CREATE TABLE theories (
 688   id SERIAL PRIMARY KEY,
 689   name INT UNIQUE REFERENCES strings(id)
 690 );")
 691 \end{common}
 692
\begin{notate}{Eliminating tables}
 694 In case you ever need to redefine these tables, you can
 695 run code like this first, to delete the existing copies.
 696 (Additional tables are added whenever a theory is created;
 697 code for deleting theories or their contents will appear
 698 in Section \ref{processing-theories}.)
 699 \end{notate}
 700
 701 \begin{idea}
 702 (dolist (view (list-views)) (drop-view view))
 703 (execute-command "DROP TABLE strings")
 704 (execute-command "DROP TABLE triples")
 705 (execute-command "DROP TABLE places")
 706 (execute-command "DROP TABLE theories")
 707 \end{idea}
 708
 709 \subsection{Modifying the database}
 710
 711 \begin{common}{database.lisp}
 712 (in-package arxana)
 713 (locally-enable-sql-reader-syntax)
 714 \end{common}
 715
 716 \subsection*{Processing strings}
 717
 718 \begin{notate}{On `string-to-id'}
 719 Return the id of `text', if present, otherwise nil.
 720
 721 There was a segmentation fault with clisp here at one
 722 point, maybe because I hadn't gotten the clsql sql reader
 723 syntax loaded up properly.  Note that calling the code
 724 without the function wrapper did not produce the same
 725 segfault.
 726 \end{notate}
 727
 728 \begin{common}{database.lisp}
(defun string-to-id (text)
  ;; `select-one' (utilities.lisp) unwraps the one-column
  ;; result, so we return a bare id, or nil if absent.
  (select-one [id]
              :from [strings]
              :where [= [text] text]))
 733 \end{common}
 734
 735 \begin{notate}{On `add-string'} \label{add-string}
 736 Add the argument `text' to the list of strings.  If the string
 737 is successfully created, its coordinates are returned.
 738 Otherwise, and in particular, if the request was to create
 739 a duplicate, nil is returned.
 740
 741 Should this give a message ``Adding \meta{text} to the
strings table'' when the string is added by an indirect
 743 function call, such as through `massage'?
 744 (Note \ref{massage}.)
 745 \end{notate}
 746
 747 \begin{common}{database.lisp}
 748 (defun add-string (text)
 749   (handler-case
   (progn (insert-records :into [strings]
                          :attributes '(text)
                          :values `(,text))
 753           `(1 ,(string-to-id text)))
 754    (sql-database-data-error ()
 755      (warn "\"~a\" already exists."
 756            text))))
 757 \end{common}
 758
 759 \begin{notate}{Error handling bug}
 760 The function `add-string' (Note \ref{add-string}) exhibits
 761 the first of several error handling calls designed to
 762 ensure uniqueness (Note \ref{unique-things}).
 763 Experimentally, this works, but I'm observing that, at
 764 least sometimes, if the user tries to add an item that's
 765 already present in the database, the index tied to the
 766 associated table increases even though the item isn't
 767 added.  This is annoying.  I haven't checked whether this
 768 happens on all possible installations of the underlying
 769 software.
 770 \end{notate}
 771
 772 \subsection*{Parsing general input}
 773
 774 \begin{notate}{On `massage'} \label{massage}
User input to functions like `add-triple' can be strings,
integers (which this function
``serializes'' as the string versions of themselves), or
\emph{coordinates} -- lists of the form (code ref).
 779 This function converts all of these input forms into the
 780 last one!  It takes an optional argument `addstr' which,
 781 if supplied, says to add string data to the database if it
 782 wasn't there already.
 783 \end{notate}
 784
 785 \begin{common}{database.lisp}
 786 (defun massage (data &optional addstr)
 787   (cond
 788    ((integerp data)
 789     (massage (format nil "~a" data) addstr))
 790    ((stringp data)
 791     (let ((id (string-to-id data)))
 792       (if id
 793           (list 0 id)
 794           (when addstr
 795             (add-string data)))))
 796    ((and (listp data)
 797          (equal (length data) 2))
 798     data)
 799    (t nil)))
 800 \end{common}
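
\begin{notate}{Sketch: input normalization without a database}
To see the three input forms side by side, here is a
self-contained imitation of `massage' in which a hash
table stands in for the strings table.  The names
`*known-strings*' and `normalize-input' are hypothetical,
and the string code 1 is an assumption taken from Note
\ref{objects-and-codes}.
\end{notate}

\begin{idea}
(defvar *known-strings* (make-hash-table :test #'equal)
  "Stand-in for the strings table: maps text to id.")

(defun normalize-input (data &optional addstr)
  "Turn an integer, string or coordinate into a coordinate."
  (cond
    ;; Integers are handled as their printed form.
    ((integerp data)
     (normalize-input (format nil "~a" data) addstr))
    ;; Strings are looked up, and added only if ADDSTR.
    ((stringp data)
     (let ((id (or (gethash data *known-strings*)
                   (when addstr
                     (setf (gethash data *known-strings*)
                           (1+ (hash-table-count
                                *known-strings*)))))))
       (when id (list 1 id))))
    ;; Coordinates pass through unchanged.
    ((and (listp data) (= (length data) 2))
     data)
    (t nil)))

;; Starting from an empty table:
(normalize-input "scholium")     ; => NIL
(normalize-input "scholium" t)   ; => (1 1)
(normalize-input 42 t)           ; => (1 2)
(normalize-input '(3 5))         ; => (3 5)
\end{idea}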
 801
 802
 803 \subsection*{Processing triples}
 804
 805 \begin{notate}{On `triple-to-id'}
 806 Return the id of the triple (beg mid end),
 807 if present, otherwise nil.
 808 \end{notate}
 809
 810 \begin{common}{database.lisp}
 811 (defun triple-to-id (beg mid end)
 812   (let ((b (massage beg))
 813         (m (massage mid))
 814         (e (massage end)))
    ;; `select-one' returns a bare id, or nil if absent.
    (select-one [id]
                :from [triples]
                :where [and [= [code1] (first b)]
                            [= [ref1] (second b)]
                            [= [code2] (first m)]
                            [= [ref2] (second m)]
                            [= [code3] (first e)]
                            [= [ref3] (second e)]])))
 823 \end{common}
 824
 825 \begin{notate}{On `add-triple'} \label{add-triple}
 826 Elements of triples are parsed by `massage'
 827 (Note \ref{massage}).  If the triple
 828 is successfully created, its coordinates are returned.
 829 Otherwise, and in particular, if the request was to create
 830 a duplicate, nil is returned.
 831 \end{notate}
 832
 833 \begin{common}{database.lisp}
 834 (defun add-triple (beg mid end)
 835   "Add a triple comprised of BEG MID and END."
 836   (let ((b (massage beg t))
 837         (m (massage mid t))
 838         (e (massage end t)))
 839     (when (and b m e)
 840       (handler-case
 841        (progn
 842          (insert-records
 843           :into [triples] :attributes '(code1 ref1
 844                                         code2 ref2
 845                                         code3 ref3)
 846           :values `(,(first b) ,(second b)
 847                     ,(first m) ,(second m)
 848                     ,(first e) ,(second e)))
 849          `(2 ,(triple-to-id b m e)))
 850        (sql-database-data-error ()
 851          (warn "\"~a\" already entered as [~a ~a ~a]."
 852                (list beg mid end) b m e))))))
 853 \end{common}
 854
 855 \subsection*{Processing theories} \label{processing-theories}
 856
 857 \begin{notate}{Things to do with theories}
 858 For the record, we want to be able to create a theory, add
 859 elements to that theory, remove or change elements in the
 860 theory, and, for convenience, zap everything in a theory.
 861 Perhaps we will also want functions to remove the tables
 862 associated with a theory as well, swap the position of two
 863 theories, or change the name of a theory.  We will also
 864 want to be able to export and import theories, so they can
 865 be ``beamed'' between installations.  At appropriate
 866 places in the Emacs interface, we'll need to set
 867 `*write-to-heading*' and `*read-from-heading*'.
 868 \end{notate}
 869
 870 \begin{notate}{What can go in a theory} \label{what-can-go-in}
 871 Notice that there is no rule that says that a triple or
 872 place that's part of a theory needs to point only at
 873 strings that are in the same theory.
 874 \end{notate}
 875
 876 \begin{notate}{On `list-to-id'}
Return the id of the list (theory) with the given
`heading', if present; otherwise, nil.
 879 \end{notate}
 880
 881 \begin{common}{database.lisp}
 882 (defun list-to-id (heading)
 883   (let ((string-id (string-to-id heading)))
    ;; `select' returns a list of rows; `caar' extracts the
    ;; bare id (or nil when there is no match).
    (caar (select [id]
                  :from [lists]
                  :where [= [heading] string-id]))))
 887 \end{common}
 888
 889 \begin{notate}{On `add-theory'} \label{add-theory}
Add a theory to the `lists' table, and create the table
that will hold the elements of this theory.  (In the code
below the function is called `add-list'.)  Theories have
names that are strings; it seems a little funny to always
have to translate submitted strings to ids for lookup, but
this is what we do.
 895 \end{notate}
 896
 897 \begin{common}{database.lisp}
 898 (defun add-list (heading)
 899   (let ((string-id (second (massage heading t))))
 900     (handler-case
        (progn (insert-records :into [lists]
                               :attributes '(heading)
                               :values `(,string-id))
               (let ((k (list-to-id heading)))
 905                  (execute-command
 906                   (format nil "CREATE TABLE lists~A (
 907    offset SERIAL PRIMARY KEY,
 908    code INT NOT NULL,
 909    ref INT NOT NULL
 910 );" k))
 911                  `(0 ,k)))
 912       (sql-database-data-error
 913           ()
 914         (warn "The list \"~a\" already exists."
 915               heading)))))
 916 \end{common}
 917
 918 \begin{notate}{On `get-lists'}
 919 Find all lists that contain `symbol'.
 920 \end{notate}
 921
 922 \begin{common}{database.lisp}
 923 (defun get-lists (symbol)
 924   (let* ((data (massage symbol))
 925          (type (datatype data))
 926          (id (second data))
 927          (n (caar
 928              (query "select count(*) from lists")))
 929          results)
    (loop for k from 1 upto n
          do (let ((present
                    ;; `add-list' names the tables
                    ;; "lists<k>".
                    (query
                     (format nil "select offset ~
                                  from lists~A ~
                                  where ((code = ~A) ~
                                  and (ref = ~A))"
                             k type id))))
 941                (when present
 942                  ;; bit of a problem if there are multiple
 943                  ;; entries of that item on the given
 944                  ;; list.
 945                  (setq results (cons (list 0 k present)
 946                                      results)))))
 947     results))
 948 \end{common}
 949
 950 \begin{notate}{On `save-to-list'}
 951 Record `symbol' on list named `name'.
 952 \end{notate}
 953
 954 \begin{common}{database.lisp}
 955 (defun save-to-list (symbol name)
  (let* ((data (massage symbol t))
         (type (datatype data))
         (string-id (string-to-id name))
         (k (select-one [id]
                        :from [lists]
                        :where [= [heading] string-id]))
         ;; `add-list' names the table "lists<k>" and
         ;; gives it the columns `code' and `ref'.
         (tablek (format nil "lists~A" k)))
    (insert-records :into (sql-expression :table tablek)
                    :attributes '(code ref)
                    :values `(,type ,(second data)))))
 967 \end{common}
 968
 969 \subsection*{Lookup by id or coordinates}
 970
 971 \begin{notate}{The data format that's best for Lisp} \label{what-is-best-for-lisp}
It is reasonable to ask whether an item's id should be
considered part of that item's defining data once that
data is no longer in the database.  For the functions
defined here, the id is an input, and so by default I'm
not including it in the output, because it is already
known.  However, for functions like
 978 `triples-given-beginning' (See Note
 979 \ref{graph-like-data}), the id is \emph{not} part of the
 980 known data, and so it is returned.  Therefore I am
 981 providing the `retain-id' flag here, for cases where
 982 output should be consistent with that of these other
 983 functions.
 984 \end{notate}
 985
 986 \begin{common}{database.lisp}
 987 (defun string-lookup (id &optional retain-id)
 988   (let ((ret (select [text]
 989                      :from [strings]
 990                      :where [= [id] id])))
 991     (if retain-id
 992         (list id ret)
 993         ret)))
 994
 995 (defun triple-lookup (id &optional retain-id)
 996   (let ((ret (select [code1] [ref1]
 997                      [code2] [ref2]
 998                      [code3] [ref3]
 999                      :from [triples]
1000                      :where [= [id] id])))
1001     (if retain-id
1002         (cons id ret)
1003         ret)))
1004
1005 (defun list-lookup (id &optional retain-id)
  (let ((ret (select [heading]
1007                      :from [lists]
1008                      :where [= [id] id])))
1009     (if retain-id
1010         (list id ret)
1011         ret)))
1012 \end{common}
1013
1014 \begin{notate}{Succinct idioms for following pointers}
1015 Here are some variants on the functions above which save
1016 us from needing to extract the id of the item from its
1017 coordinates.
1018 \end{notate}
1019
1020 \begin{common}{database.lisp}
1021 (defun string-contents (coords)
1022   (string-lookup (second coords)))
1023
1024 (defun place-contents (coords)
1025   (place-lookup (second coords)))
1026
1027 (defun triple-contents (coords)
1028   (triple-lookup (second coords)))
1029 \end{common}
1030
1031 \begin{notate}{Switchboard} \label{switchboard}
1032 Even more succinctly, one function that can get
1033 the object indicated by any set of coordinates.
1034 \end{notate}
1035
1036 \begin{common}{database.lisp}
(defun switchboard (coords)
  ;; `case' compares with `eql', which, unlike `eq', is
  ;; reliable for integers.
  (case (first coords)
    (0 (string-contents coords))
    (1 (place-contents coords))
    (2 (triple-contents coords))))
1044 \end{common}
1045
1046 \begin{notate}{Anti-pasti}
1047 The readability of this code could perhaps be improved if
1048 we used functions like `switchboard' more frequently.
1049 (More to the point, it seems it's not currently used.)  In
1050 particular, it would be nice if we could sweep idioms like
1051 \verb+`(2 ,(car triple))+ under the rug.
1052 \end{notate}
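
\begin{notate}{Example: hiding coordinate construction}
One way to sweep such idioms under the rug might be a
pair of small helpers along the following lines.  This is
only a sketch: `coords' and `row-coords' are hypothetical
names, and the codes 0, 1 and 2 for strings, places and
triples are read off from `switchboard'.
\begin{verbatim}
(defun coords (kind id)
  "Build the (code ref) pair for an item of KIND with ID."
  (list (ecase kind (:string 0) (:place 1) (:triple 2))
        id))

(defun row-coords (kind row)
  "Coordinates of a database ROW that starts with its id."
  (coords kind (first row)))

(row-coords :triple '(7 0 1 0 2 0 3))   ; => (2 7)
\end{verbatim}
With something like this, \verb+`(2 ,(car triple))+ would
read \verb+(row-coords :triple triple)+.
\end{notate}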
1053
1054 \begin{common}{database.lisp}
1055 (locally-disable-sql-reader-syntax)
1056 \end{common}
1057
1058 \subsection{Queries} \label{queries}
1059
1060 \begin{notate}{The use of views} \label{use-of-views}
1061 It is easy enough to select those triples which match
1062 simple data, e.g., those triples which have the same
1063 beginning, middle, or end, or any combination of these.
1064 It is a little more complicated to find items that match
1065 criteria specified by several different triples; for
1066 example, to \emph{find all the books by Arthur C. Clarke
1067   that are also works of fiction}.
1068
1069 Suppose our collection of triples contains a portion as
1070 follows:
1071 \begin{center}
1072 \begin{tabular}{lll}
Profiles of the Future & is a & book \\
2001: A Space Odyssey & is a & book \\
Ender's Game & is a & book \\
Profiles of the Future & has genre & non-fiction \\
2001: A Space Odyssey & has genre & fiction \\
Ender's Game & has genre & fiction \\
Profiles of the Future & has author & Arthur C. Clarke \\
2001: A Space Odyssey & has author & Arthur C. Clarke \\
Ender's Game & has author & Orson Scott Card
1081 \end{tabular}
1082 \end{center}
1083
One way to solve the given problem would be to find those
items that \emph{are written by Arthur C. Clarke} (* ``has
author'' ``Arthur C. Clarke''), that \emph{are books}
(* ``is a'' ``book''), and \emph{that are classified as
  fiction} (* ``has genre'' ``fiction'').  We are looking
for items that match \emph{all} of these conditions.

Our implementation strategy is: collect the items matching
each criterion into a view, then join these views.  (See
the function `satisfy-conditions', Note
\ref{satisfy-conditions}.)
1095
1096 If we end up working with large queries and a lot of data,
1097 this use of views may not be an efficient way to go -- but
1098 we'll cross that bridge when we come to it.
1099 \end{notate}
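
\begin{notate}{Example: the Clarke query in plain Lisp}
To make the strategy concrete without a database, the
following sketch replays the example from the table in
plain Lisp.  Each call to `demo-subjects' plays the role
of one view, and `demo-satisfy' plays the role of the
join.  The names and the representation of triples as
lists of strings are assumptions made for illustration
only; the real implementation works on coordinates, in
SQL.
\begin{verbatim}
(defparameter *demo-triples*
  '(("Profiles of the Future" "is a" "book")
    ("2001: A Space Odyssey" "is a" "book")
    ("Ender's Game" "is a" "book")
    ("Profiles of the Future" "has genre" "non-fiction")
    ("2001: A Space Odyssey" "has genre" "fiction")
    ("Ender's Game" "has genre" "fiction")
    ("Profiles of the Future" "has author"
     "Arthur C. Clarke")
    ("2001: A Space Odyssey" "has author"
     "Arthur C. Clarke")
    ("Ender's Game" "has author" "Orson Scott Card")))

(defun demo-subjects (middle end)
  "Beginnings of all the triples matching (* MIDDLE END)."
  (loop for (b m e) in *demo-triples*
        when (and (string= m middle) (string= e end))
          collect b))

(defun demo-satisfy (criteria)
  "Items matching every (MIDDLE END) pair in CRITERIA."
  (reduce (lambda (found-so-far found)
            (intersection found-so-far found
                          :test #'string=))
          (mapcar (lambda (criterion)
                    (apply #'demo-subjects criterion))
                  criteria)))

(demo-satisfy '(("has author" "Arthur C. Clarke")
                ("is a" "book")
                ("has genre" "fiction")))
;; => ("2001: A Space Odyssey")
\end{verbatim}
\end{notate}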
1100
1101 \begin{notate}{Search queries}
1102 In Note \ref{sphinx-setup} et seq., we give some
1103 instructions on how to set up the Sphinx search engine to
1104 work with Arxana.  However, a much tighter integration of
1105 Sphinx into Arxana is possible, and will be coming soon.
1106 \end{notate}
1107
1108 \begin{common}{queries.lisp}
1109 (in-package arxana)
1110 (locally-enable-sql-reader-syntax)
1111 \end{common}
1112
1113 \subsection*{Printing}
1114
1115 \begin{notate}{On `print-system-object'} \label{print-system-object}
The function `print-system-object' bears some resemblance
to `massage', but is for printing instead, and therefore
has to be recursive.  (Because triples and places can
point to other system objects, printing can be a long and
drawn out ordeal.)
1121 \end{notate}
1122
1123 \begin{common}{queries.lisp}
1124 (defun print-system-object (data &optional components)
1125   (cond
1126     ;; just return strings
1127     ((stringp data)
1128      data)
1129     ;; printing from coordinates (code, ref)
1130     ((and (listp data)
1131           (equal (length data) 2))
1132      ;; we'll need some hack to deal with
1133      ;; elements-of-theories, which, right now, are two
1134      ;; elements long but are not (code, ref) pairs but
1135      ;; rather (local_id, ref) pairs, or maybe actually if
1136      ;; we take context into consideration, they're
1137      ;; actually (k, table, local_id, ref) quadruplets.
1138      ;; Obviously with *that* data we can translate to
1139      ;; (code, ref).  On the other hand, if we *don't*
1140      ;; take it into consideration, we probably can't do
1141      ;; much of anything.  So we should be careful to be
1142      ;; aware of just what sort of information we're
1143      ;; passing around.
1144      (cond ((equal (first data) 0)
1145             (string-lookup (second data)))
1146            ((equal (first data) 1)
1147             (print-system-object
1148              (place-lookup (second data) t)))
1149            ((equal (first data) 2)
1150             (let ((triple (triple-lookup (second data) t)))
1151               (if components
1152                   (list
1153                    (print-beginning triple)
1154                    (print-middle triple)
1155                    (print-end triple))
1156                   (concatenate
1157                    'string
1158                    (format nil "T~a[" (second data))
1159                    (print-beginning triple) "."
1160                    (print-middle triple) "."
1161                    (print-end triple) "]"))))
           ((equal (first data) 3)
            "List printing not implemented yet."))))
1164     ;; place
1165     ((and (listp data)
1166           (equal (length data) 3))
1167      (concatenate 'string
1168                   (format nil "P~a|" (first data))
1169                   (print-system-object (cdr data)) "|"))
1170     ;; triple
1171     ((and (listp data)
1172           (equal (length data) 7))
1173       (if components
1174           (list
1175            (print-beginning data)
1176            (print-middle data)
1177            (print-end data))
1178           (concatenate
1179            'string
1180            (format nil "T~a[" (first data))
1181            (print-beginning data) "."
1182            (print-middle data) "."
1183            (print-end data) "]")))
1184     (t nil)))
1185
1186 (defun print-beginning (triple)
1187   (print-system-object (isolate-beginning triple)))
1188
1189 (defun print-middle (triple)
1190   (print-system-object (isolate-middle triple)))
1191
1192 (defun print-end (triple)
1193   (print-system-object (isolate-end triple)))
1194 \end{common}
1195
1196 \begin{notate}{Depth}
1197 If we are going to have complicated recursive references,
1198 our printer, and anything else that gives the system some
1199 semantics, should come with some sort of ``layers'' switch
1200 that can be used to limit the amount of recursion we do in
1201 any given computation.
1202 \end{notate}
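
\begin{notate}{Example: a printer with a ``layers'' switch}
As a sketch of what such a switch might look like, the
toy printer below follows references only to a given
depth and prints a placeholder for anything deeper.  The
(:ref KEY) representation, `demo-lookup' and `demo-print'
are all hypothetical; nothing here touches the database.
\begin{verbatim}
(defun demo-lookup (key)
  "Hypothetical store: two items that refer to each other."
  (ecase key
    (1 (list "a" "likes" '(:ref 2)))
    (2 (list '(:ref 1) "is liked by" "b"))))

(defun demo-print (object &optional (layers 2))
  "Render OBJECT, following at most LAYERS levels of
references; deeper references are shown as #."
  (cond ((stringp object) object)
        ((<= layers 0) "#")
        (t (format nil "[~{~a~^.~}]"
                   (mapcar
                    (lambda (part)
                      (demo-print part (1- layers)))
                    (demo-lookup (second object)))))))

(demo-print '(:ref 1))     ; => "[a.likes.[#.is liked by.b]]"
(demo-print '(:ref 1) 1)   ; => "[a.likes.#]"
\end{verbatim}
Without the limit, the mutual reference between the two
items would make the recursion run forever.
\end{notate}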
1203
1204 \begin{notate}{Printing objects as they appear in Lisp} \label{printing-objects-in-lisp}
1205 With the following functions we provide facilities for
1206 printing an object, either from its id or from the
1207 expanded form of the data that represents it in Lisp.
(This is one good reason to have one standard form for
this data; compare Note \ref{what-is-best-for-lisp}.)
These functions assume that the id \emph{is} part of
what's printed, so if you use functions like
`triple-lookup' to retrieve data for printing, you'll have
to graft the id back on (for instance, by passing the
`retain-id' flag) before printing with these functions.
1214 \end{notate}
1215
1216 \begin{notate}{Printing theories}
1217 We'll want to both print all of the content of a theory,
1218 and print \emph{from} the theory in a more limited way.
1219 (Perhaps we get the second item for free, already?)
1220 \end{notate}
1221
1222 \begin{common}{queries.lisp}
1223 (defun print-string (string &optional components)
1224   (print-system-object string components))
1225
1226 (defun print-place (place &optional components)
1227   (print-system-object place components))
1228
1229 (defun print-triple (triple &optional components)
1230   (print-system-object triple components))
1231
1232 (defun print-string-from-id (id &optional components)
1233   (print-system-object (list 0 id) components))
1234
1235 (defun print-place-from-id (id &optional components)
1236   (print-system-object (list 1 id) components))
1237
1238 (defun print-triple-from-id (id &optional components)
1239   (print-system-object (list 2 id) components))
1240 \end{common}
1241
1242 \begin{notate}{Printing some stuff but not other stuff} \label{printing-some}
These functions are good for printing lists as they come
out of the database.  See Note \ref{strings-and-ids} on
printing strings.
1246 \end{notate}
1247
1248 \begin{common}{queries.lisp}
(defun print-strings (strings)
  (mapcar 'second strings))

(defun print-places (places &optional components)
  (mapcar (lambda (item)
            (print-system-object item components))
          places))

(defun print-triples (triples &optional components)
  (mapcar (lambda (item)
            (print-system-object item components))
          triples))

(defun print-theories (theories &optional components)
  (mapcar (lambda (item)
            (print-system-object item components))
          theories))
1266 \end{common}
1267
1268 \begin{notate}{Printing everything in each table} \label{printing-everything}
1269 These functions collect human-readable versions of
1270 everything in each table.  Notice that `all-strings' is
1271 written differently.
1272 \end{notate}
1273
1274 \begin{common}{queries.lisp}
1275 (defun all-strings ()
1276   (mapcar 'second (select [*] :from [strings])))
1277
1278 (defun all-places ()
1279   (mapcar 'print-system-object
1280           (select [*] :from [places])))
1281
1282 (defun all-triples ()
1283  (mapcar 'print-system-object
1284          (select [*] :from [triples])))
1285
1286 (defun all-theories ()
1287  (mapcar 'print-system-object
1288          (select [*] :from [theories])))
1289 \end{common}
1290
1291 \begin{notate}{Printing on particular dimensions}
One possible upgrade to the printing functions would be to
provide a built-in way to ``curry'' the printout, for
example, to print just the source nodes from a list of
triples.  Of course, it should also be possible to do
processing like this in Lisp after the printout has been
made; the point is that it is presumably more efficient to
retrieve and format only the data we're actually looking
for.
1300 \end{notate}
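
\begin{notate}{Example: projecting one dimension in Lisp}
The after-the-fact variant is easy to sketch.  Here we
assume that a triple row has the seven-element layout
(id code1 ref1 code2 ref2 code3 ref3), which is what the
seven-element branch of `print-system-object' suggests;
`demo-dimension' and `demo-project' are hypothetical
names.  Note that this does nothing for efficiency, since
the whole row has already been retrieved.
\begin{verbatim}
(defun demo-dimension (row which)
  "Coordinates of the WHICH slot of a 7-element triple ROW."
  (let ((start (ecase which
                 (:beginning 1) (:middle 3) (:end 5))))
    (subseq row start (+ start 2))))

(defun demo-project (rows which)
  "The WHICH slot of every triple row in ROWS."
  (mapcar (lambda (row) (demo-dimension row which))
          rows))

(demo-project '((7 0 1 0 2 0 3) (8 0 4 0 2 1 5))
              :beginning)
;; => ((0 1) (0 4))
\end{verbatim}
\end{notate}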
1301
1302 \begin{notate}{Strings and ids} \label{strings-and-ids}
1303 Unlike other objects, strings don't get printed with their
1304 ids.  We should probably provide an \emph{option} to print
1305 with ids (this could be helpful for subsequent work with
1306 the strings in question; on the other hand, since strings
1307 are being kept unique, we can immediately exchange a
string and its id, so I'm not sure if it's necessary to
1309 have an explicit ``option'').
1310 \end{notate}
1311
1312 \subsection*{Functions that establish basic graph structure}
1313
1314 \begin{notate}{Thinking about graph-like data} \label{graph-like-data}
Here we have in mind one or more objects (e.g., a
particular source and sink) that are associated with
potentially any number of triples (e.g., all the possible
middles running between these two identified objects).
These functions establish various forms of locality or
neighborhood within the data.

The results of such queries can optionally be cached in a
view, which is useful for further processing
(cf. Note \ref{satisfy-conditions}).
1325
1326 These functions take input in the form of strings and/or
1327 coordinates (cf. Note \ref{massage}).
1328 \end{notate}
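
\begin{notate}{Example: neighborhoods as partial matches}
It may help to see that the functions in this section are
all special cases of one idea: fix some of the three
slots and collect the triples that agree on them.  The
sketch below does this for an in-memory list of triples.
The name `demo-match' and its keyword arguments are
hypothetical, and no views are involved.
\begin{verbatim}
(defun demo-match (triples &key beginning middle end)
  "Triples in TRIPLES that agree with every slot supplied."
  (flet ((ok (wanted actual)
           (or (null wanted) (equal wanted actual))))
    (remove-if-not
     (lambda (triple)
       (destructuring-bind (b m e) triple
         (and (ok beginning b)
              (ok middle m)
              (ok end e))))
     triples)))

(defparameter *demo-graph*
  '(("a" "to" "b") ("a" "to" "c") ("c" "from" "a")))

(demo-match *demo-graph* :beginning "a" :middle "to")
;; => (("a" "to" "b") ("a" "to" "c"))
(demo-match *demo-graph* :end "a")
;; => (("c" "from" "a"))
\end{verbatim}
\end{notate}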
1329
1330 \begin{common}{queries.lisp}
1331 (defun triples-given-beginning (node &optional view)
1332   "Get triples outbound from the given NODE.  Optional
1333   argument VIEW causes the results to be selected into a
1334   view with that name."
1335   (let ((data (massage node))
        (window (or view "internal-view"))
1337         ret)
1338     (when data
1339       (create-view
1340        window
1341         :as (select [*]
1342              :from [triples]
1343              :where [and [= [code1] (first data)]
1344                          [= [ref1] (second data)]]))
1345       (setq ret (select [*] :from window))
1346       (unless view
1347         (drop-view window))
1348       ret)))
1349
1350 (defun triples-given-end (node &optional view)
  "Get triples inbound into NODE.  Optional argument VIEW
causes the results to be selected into a view with that
name."
  (let ((data (massage node))
        (window (or view "internal-view"))
1356         ret)
1357     (when data
1358       (create-view
1359        window
1360         :as (select [*]
1361              :from [triples]
1362              :where [and [= [code3] (first data)]
1363                          [= [ref3] (second data)]]))
1364       (setq ret (select [*] :from window))
1365       (unless view
1366         (drop-view window))
1367       ret)))
1368
1369 (defun triples-given-middle (edge &optional view)
  "Get the triples that run along EDGE.  Optional argument
VIEW causes the results to be selected into a view with
that name."
  (let ((data (massage edge))
        (window (or view "internal-view"))
1375         ret)
1376     (when data
1377       (create-view
1378        window
1379        :as (select [*]
1380             :from [triples]
1381             :where [and [= [code2] (first data)]
1382                         [= [ref2] (second data)]]))
1383       (setq ret (select [*] :from window))
1384       (unless view
1385         (drop-view window))
1386       ret)))
1387
(defun triples-given-middle-and-end (edge node
                                     &optional view)
  "Get the triples that run along EDGE into NODE.
Optional argument VIEW causes the results to be selected
into a view with that name."
  (let ((edgedata (massage edge))
        (nodedata (massage node))
        (window (or view "internal-view"))
1396         ret)
1397     (when (and edgedata nodedata)
1398       (create-view
1399        window
1400        :as (select [*]
1401             :from [triples]
1402             :where [and [= [code2] (first edgedata)]
1403                         [= [ref2] (second edgedata)]
1404                         [= [code3] (first nodedata)]
1405                         [= [ref3] (second nodedata)]]))
1406       (setq ret (select [*] :from window))
1407       (unless view
1408         (drop-view window))
1409       ret)))
1410
1411 (defun triples-given-beginning-and-middle (node edge
1412                                            &optional view)
1413   "Get the triples that run from NODE along EDGE.
1414 Optional argument VIEW causes the results to be selected
1415 into a view with that name."
1416   (let ((nodedata (massage node))
1417         (edgedata (massage edge))
        (window (or view "internal-view"))
1419         ret)
1420     (when (and nodedata edgedata)
1421       (create-view
1422        window
1423        :as (select [*]
1424             :from [triples]
1425             :where [and [= [code1] (first nodedata)]
1426                         [= [ref1] (second nodedata)]
1427                         [= [code2] (first edgedata)]
1428                         [= [ref2] (second edgedata)]]))
1429       (setq ret (select [*] :from window))
1430       (unless view
1431         (drop-view window))
1432       ret)))
1433
(defun triples-given-beginning-and-end (node1 node2
                                        &optional view)
  "Get the triples that run from NODE1 to NODE2.  Optional
argument VIEW causes the results to be selected into a
view with that name."
  (let ((node1data (massage node1))
        (node2data (massage node2))
        (window (or view "internal-view"))
1442         ret)
1443     (when (and node1data node2data)
1444       (create-view
1445        window
1446        :as (select [*]
1447             :from [triples]
1448             :where [and [= [code1] (first node1data)]
1449                         [= [ref1] (second node1data)]
1450                         [= [code3] (first node2data)]
1451                         [= [ref3] (second node2data)]]))
1452       (setq ret (select [*] :from window))
1453       (unless view
1454         (drop-view window))
1455       ret)))
1456
;; This one uses `select-one' instead of `select'
(defun triple-exact-match (node1 edge node2
                           &optional view)
  "Get the triple that runs from NODE1 along EDGE to
NODE2.  Optional argument VIEW causes the result to be
selected into a view with that name."
  (let ((node1data (massage node1))
        (edgedata (massage edge))
        (node2data (massage node2))
        (window (or view "internal-view"))
1467         ret)
1468     (when (and node1data edgedata node2data)
1469       (create-view
1470        window
1471        :as (select [*]
1472             :from [triples]
1473             :where [and [= [code1] (first node1data)]
1474                         [= [ref1] (second node1data)]
1475                         [= [code2] (first edgedata)]
1476                         [= [ref2] (second edgedata)]
1477                         [= [code3] (first node2data)]
1478                         [= [ref3] (second node2data)]]))
1479       (setq ret (select-one [*] :from window))
1480       (unless view
1481         (drop-view window))
1482       ret)))
1483 \end{common}
1484
1485 \begin{notate}{Becoming flexible about a string's status}
1486 One possible upgrade would be to provide versions of these
1487 functions that will flexibly accept either a string or a
1488 ``placed string'' as input (since frequently we're
1489 interested in content of that sort; see
1490 \ref{importing-sketch}).
1491 \end{notate}
1492
1493 \subsection*{Finding places that satisfy some property}
1494
1495 \begin{notate}{On `get-places-subject-to-constraint'}
Like `get-places' (Note \ref{get-places}), but this
time takes an extra condition of the form (A B C),
where one of A, B, and C is `nil'.  We substitute each
candidate place for this `nil', to see whether a triple
matching the resulting pattern exists.
1501 \end{notate}
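
\begin{notate}{Example: filling the open slot}
The substitution step can be sketched on its own, apart
from the database.  Both `demo-fill' and `demo-filter' are
hypothetical names, and the predicate passed as MATCHP
merely stands in for the lookup that the real function
performs.  Unlike the real function, which accumulates
its results with `cons' and so returns them in reverse,
this sketch keeps the candidates in their original order.
\begin{verbatim}
(defun demo-fill (condition candidate)
  "Put CANDIDATE into the `nil' slot of CONDITION."
  (mapcar (lambda (elt) (or elt candidate)) condition))

(defun demo-filter (candidates condition matchp)
  "Keep the CANDIDATES for which MATCHP accepts the filled
CONDITION, called as (MATCHP beginning middle end)."
  (remove-if-not
   (lambda (candidate)
     (apply matchp (demo-fill condition candidate)))
   candidates))

(demo-fill '("Alice" "knows" nil) '(1 5))
;; => ("Alice" "knows" (1 5))

(demo-filter '(1 2 3 4) '(nil "is" "even")
             (lambda (b m e)
               (declare (ignore m e))
               (evenp b)))
;; => (2 4)
\end{verbatim}
\end{notate}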
1502
1503 \begin{common}{queries.lisp}
1504 (defun get-places-subject-to-constraint (symbol condition)
1505   (let ((candidate-places (get-places symbol))
1506         accepted-places)
1507     (dolist (place candidate-places)
1508       (let ((filled-condition
1509              (map 'list (lambda (elt) (or elt
1510                                           `(1 ,place)))
1511                   condition)))
1512         (when (apply 'triple-relaxed-match
1513                      filled-condition)
1514           (setq accepted-places
1515                 (cons place accepted-places)))))
1516     accepted-places))
1517 \end{common}
1518
1519 \subsection*{Logic}
1520
1521 \begin{notate}{Caution: compatibility with theories?}
1522 For the moment, I'm not sure how compatible this function
1523 is with the theories apparatus we've established, or with
1524 the somewhat vaguer notion of trans-theory questions or
1525 concerns.  Global queries should work just fine, but
1526 theory-local questions may need some work.  Before getting
1527 into compatibility of these questions with the theory
1528 apparatus, I want to make sure that apparatus is working
1529 properly.  Note that the questions here do rely on
1530 functions for graph-like thinking (Note
1531 \ref{graph-like-data} et seq.), and it would certainly
1532 make sense to port to ``subgraphs'' as represented by
1533 theories.
1534 \end{notate}
1535
1536 \begin{notate}{On `satisfy-conditions'} \label{satisfy-conditions}
1537 This function finds the items which match constraints.
1538 Constraints take the form (A B C), where precisely one of
1539 A, B, or C should be `nil', and any of the others can be
1540 either input suitable for `massage', or
1541 `t'.  The `nil' entry stands for the object we're
1542 interested in.  Any `t' entries are wildcards.
1543
1544 The first thing that happens as the function runs is that
1545 views are established exhibiting each group of triples
1546 satisfying each predicate.  The names of these views are
1547 then massaged into a large SQL query.  (It is important to
1548 ``typeset'' all of this correctly for our SQL `query'.)
1549 Finally, once that query has been run, we clean up,
1550 dropping all of the views we created.
1551 \end{notate}
1552
1553 \begin{common}{queries.lisp}
1554 (defun satisfy-conditions (constraints)
1555   (let* ((views (generate-views constraints))
1556          (formatted-list-of-views (format-views
1557                                    views))
1558          (where-condition (generate-where-condition
1559                            views
1560                            constraints))
1561          (ret
1562           ;; Let's see what the query is, first of all.
1563           (query
1564            (concatenate
1565             'string
1566             "select v1.id, v1.code1, v1.ref1, "
1567                           "v1.code2, v1.ref2, "
1568                           "v1.code3, v1.ref3 "
1569             "from "
1570             formatted-list-of-views
1571             "where "
1572             where-condition
1573             ";"))))
1574     (mapc (lambda (name) (drop-view name)) views)
1575     ret))
1576 \end{common}
1577
1578 \begin{notate}{Subroutines for `satisfy-conditions'}
1579 The functions below produce bits and pieces of the SQL
query that `satisfy-conditions' submits.  The point of
`generate-views' is to create a series of views centered
1582 on the term(s) we're interested in (the `nil' slots in
1583 each submitted constraint).  With
1584 `generate-where-condition', we insist that all of these
1585 interesting terms should, in fact, be equal to one
1586 another.
1587 \end{notate}
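
\begin{notate}{Example: the shape of the generated query}
To see what kind of SQL text is being assembled, here is
a self-contained sketch that builds a comparable query
string without touching the database.  The names
`demo-open-slot' and `demo-sql' are hypothetical, and for
brevity the sketch selects \verb+v1.*+ rather than listing
the columns; it is meant to show the shape of the join,
not to replace the functions in this section.
\begin{verbatim}
(defun demo-open-slot (constraint)
  "Position (1, 2 or 3) of the `nil' entry of CONSTRAINT."
  (1+ (position nil constraint)))

(defun demo-sql (constraints)
  "SQL text joining one view per constraint on open slots."
  (let* ((views (loop for i from 1 to (length constraints)
                      collect (format nil "v~a" i)))
         (c (demo-open-slot (first constraints)))
         (clauses
           (loop for view in (rest views)
                 for constraint in (rest constraints)
                 collect
                 (let ((k (demo-open-slot constraint)))
                   (format nil "(v1.code~a = ~a.code~a) ~
                                and (v1.ref~a = ~a.ref~a)"
                           c view k c view k)))))
    ;; The where clause is omitted when there is only one
    ;; view and hence nothing to compare.
    (format nil "select v1.* from ~{~a~^, ~}~
                 ~@[ where ~{~a~^ and ~}~];"
            views clauses)))

(demo-sql '((nil "has author" "Arthur C. Clarke")
            (nil "is a" "book")))
;; => "select v1.* from v1, v2 where (v1.code1 = v2.code1)
;;     and (v1.ref1 = v2.ref1);"  (all on one line)
\end{verbatim}
Each additional constraint adds one view to the `from'
list and one more pair of equalities against v1.
\end{notate}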
1588
1589 \begin{notate}{On `generate-views'}
1590 In a `cond' form, for each constraint we must select the
1591 appropriate function to generate the view; at the very end
1592 of the cond form, we spit out the viewname (for `mapcar'
1593 to add to the list of views).
1594 \end{notate}
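
\begin{notate}{Example: which slots select the view}
The choice made in the `cond' form comes down to asking
which slots of the constraint are actually fixed, that
is, neither `nil' (the slot we are solving for) nor `t'
(a wildcard).  The fixed slots then name the appropriate
`triples-given-...' function.  The helper below is a
hypothetical illustration of that rule only.
\begin{verbatim}
(defun demo-fixed-slots (constraint)
  "Slots of CONSTRAINT that are neither `nil' nor `t'."
  (loop for slot in constraint
        for name in '(:beginning :middle :end)
        unless (member slot '(nil t))
          collect name))

(demo-fixed-slots '("A" t nil))     ; => (:BEGINNING)
(demo-fixed-slots '(nil "B" "C"))   ; => (:MIDDLE :END)
(demo-fixed-slots '("A" nil "C"))   ; => (:BEGINNING :END)
\end{verbatim}
\end{notate}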
1595
1596 \begin{common}{queries.lisp}
1597 (defun generate-views (constraints)
1598   (let ((counter 0))
1599     (mapcar
1600      (lambda (constraint)
1601        (setq counter (1+ counter))
1602        (let ((viewname (format nil "v~a" counter)))
1603          (cond
1604           ;; A * ? or A ? *
1605           ((or (and (eq (second constraint) t)
1606                     (eq (third constraint) nil))
1607                (and (eq (second constraint) nil)
1608                     (eq (third constraint) t)))
1609            (triples-given-beginning
1610             (first constraint)
1611             viewname))
1612           ;; * B ? or ? B *
1613           ((or (and (eq (first constraint) t)
1614                     (eq (third constraint) nil))
1615                (and (eq (first constraint) nil)
1616                     (eq (third constraint) t)))
1617            (triples-given-middle
1618             (second constraint)
1619             viewname))
1620           ;; * ? C or ? * C
1621           ((or (and (eq (first constraint) t)
1622                     (eq (second constraint) nil))
1623                (and (eq (first constraint) nil)
1624                     (eq (second constraint) t)))
1625            (triples-given-end
1626             (third constraint)
1627             viewname))
1628           ;; ? B C
1629           ((eq (first constraint) nil)
1630            (triples-given-middle-and-end
1631             (second constraint)
1632             (third constraint)
1633             viewname))
          ;; A ? C
          ((eq (second constraint) nil)
           (triples-given-beginning-and-end
            (first constraint)
            (third constraint)
            viewname))
          ;; A B ?
          ((eq (third constraint) nil)
           (triples-given-beginning-and-middle
            (first constraint)
            (second constraint)
            viewname)))
1646          viewname))
1647      constraints)))
1648
(defun format-views (views)
  ;; Comma-separated view names followed by one space,
  ;; e.g. "v1,v2,v3 ".
  (format nil "~{~a~^,~} " views))
1664
1665 (defun generate-where-condition (views conditions)
1666   (let ((where-condition "")
1667         (c (select-component (first conditions))))
1668     ;; there should be one less "=" condition than there
1669     ;; are things to compare; until we get to the last
1670     ;; view, everything is joined together by an `and'.
1671     ;; -- this needs to consider (map over) both `views'
1672     ;; and `conditions'.
1673     (loop
1674      for i from 1 upto (1- (length views))
1675      do
1676      (let ((compi (select-component (nth i conditions)))
1677            (viewi (nth i views)))
1678        (setq
1679         where-condition
1680         (concatenate
1681          'string
1682          where-condition
1683          (concatenate
1684           'string
1685           "(v1.code" c " = " viewi ".code" compi ") and "
1686           "(v1.ref" c " = " viewi ".ref" compi ") and ")))))
1687     (let ((viewn (nth (1- (length views)) views))
1688           (compn (select-component
1689                     (nth (length views) conditions))))
1690       (setq
1691        where-condition
1692        (concatenate
1693         'string
1694         where-condition
1695         "(v1.code" c " = " viewn ".code" compn ") and "
1696         "(v1.ref" c " = " viewn ".ref" compn ")")))
1697     where-condition))
1698
1699 (defun select-component (condition)
1700   (cond ((eq (first condition) nil) "1")
1701         ((eq (second condition) nil) "2")
1702         ((eq (third condition) nil) "3")))
1703 \end{common}
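
\begin{notate}{Example: shape of the `where' condition}
As a rough illustration of the comment inside
`generate-where-condition', the sketch below joins every
later view to the first one, which gives one fewer group
of equalities than there are views.  It is written in
Emacs Lisp so that it can be tried in a scratch buffer.
The name `example-where-condition' and the idea of passing
in ready-made component strings are assumptions made for
this example; they are not part of Arxana.
\end{notate}

\begin{idea}
(require 'cl-lib)

;; Hypothetical helper, not part of Arxana.  VIEWS is a
;; list of view names; COMPONENTS holds the matching "1",
;; "2" or "3" strings (what `select-component' would
;; return for the condition behind each view).
(defun example-where-condition (views components)
  (let ((v1 (car views))
        (c (car components)))
    (mapconcat
     (lambda (pair)
       (let ((view (car pair))
             (comp (cdr pair)))
         (format
          "(%s.code%s = %s.code%s) and (%s.ref%s = %s.ref%s)"
          v1 c view comp v1 c view comp)))
     ;; pair every later view with its own component
     (cl-mapcar #'cons (cdr views) (cdr components))
     " and ")))

(example-where-condition '("v1" "v2" "v3")
                         '("1" "1" "3"))
;; => "(v1.code1 = v2.code1) and (v1.ref1 = v2.ref1) and
;;     (v1.code1 = v3.code3) and (v1.ref1 = v3.ref3)"
;; (one string; wrapped here for display)
\end{idea}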
1704
1705 \begin{common}{queries.lisp}
1706 (locally-disable-sql-reader-syntax)
1707 \end{common}
1708
1709 \begin{notate}{Even more complicated logic}
1710 In order to conveniently manage complex queries, it would
1711 be nice if we could store the results of earlier queries
1712 into views, so that we can combine several such views for
1713 further processing.
1714 \end{notate}
1715
1716 \section{Emacs-side} \label{emacs-side}
1717
1718 \subsection{The interface to Common Lisp}
1719
1720 \begin{notate}{On `Defun'} \label{defun-interface}
1721 A way to define Elisp functions whose bodies are evaluated
1722 by Common Lisp.  Trust me, this is a good idea.  Besides,
it exhibits some fascinating backquote and comma tricks.
1724 But be careful: this definition of `Defun' did not work on
1725 Emacs version 21.
1726
1727 If we want to be able to feed in a standard arglist to
1728 Common Lisp (with optional elements and so forth), we'd
have to define how these arguments are handled here!
1730 \end{notate}
1731
1732 \begin{elisp}
1733 (defmacro Defun (name arglist &rest body)
1734   (declare (indent defun))
1735   `(defun ,name ,arglist
1736      (let* ((outbound-string
1737              (translate-emacs-syntax-to-common-syntax
1738               (format "%S"
1739                       (append
1740                        (list
1741                         (append (list 'lambda ',arglist)
1742                                 ',body))
1743                        (mapcar
1744                         (lambda (arg) `',arg)
1745                         (list
1746                          ,@(remove-if
1747                                  (lambda (testelt)
1748                                    (eq testelt
1749                                  '&optional))
1750                                  arglist)))))))
1751             (returned-string
1752              (second
1753               ;; we now specify the right package!
1754               (slime-eval
1755                (list 'swank:eval-and-grab-output
1756                      outbound-string)
1757                :arxana))))
1758        (process-slime-output returned-string))))
1759 \end{elisp}
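
\begin{notate}{Example: the form that `Defun' sends}
To make the backquote tricks a little less mysterious,
here is a sketch of the data that ends up in
`outbound-string': a lambda expression applied to the
quoted values of the arguments.  The helper
`example-outbound-form' is made up for this illustration
and takes the argument values explicitly; it does not talk
to SLIME at all.
\end{notate}

\begin{idea}
;; Hypothetical helper, not part of Arxana: builds the
;; same shape of form that `Defun' prints for Common Lisp.
(defun example-outbound-form (arglist body args)
  "Return ((lambda ARGLIST . BODY) 'ARG ...) for ARGS."
  (cons (append (list 'lambda arglist) body)
        (mapcar (lambda (arg) (list 'quote arg)) args)))

;; The printed text depends on `print-quoted': quoted
;; arguments appear either as (quote X) or as 'X.
(format "%S"
        (example-outbound-form
         '(name contents &optional heading)
         '((list name contents heading))
         '("Intro" "Some text" nil)))
\end{idea}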
1760
1761 \begin{notate}{On `process-slime-output'}
1762 This should downcase all constituent symbols, but for
1763 expediency I'm just downcasing `NIL' at the moment.  Will
1764 come back for more testing and downcasing shortly.  (I
1765 suspect the general case is just about as easy as what
1766 happens here.)
1767 \end{notate}
1768
1769 \begin{elisp}
(defun process-slime-output (str)
  (condition-case nil
      (let ((read-value (read str)))
        (if (symbolp read-value)
            ;; a lone symbol: re-read it in lower case
            (read (downcase str))
          ;; otherwise just swap the symbol `NIL' for nil
          (nsubst nil 'NIL read-value)))
    (error str)))
1777 \end{elisp}
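
\begin{notate}{Example: downcasing every symbol}
The general case mentioned in the note on
`process-slime-output' could look roughly like the
following sketch, which walks a tree and replaces each
symbol with its lower-case twin.  The function name
`example-downcase-symbols' is an assumption made for this
example, and the plain recursion on the cdr means it is
only suitable for modestly sized results.
\end{notate}

\begin{idea}
;; Hypothetical helper, not part of Arxana.
(defun example-downcase-symbols (tree)
  "Return a copy of TREE with all symbols downcased."
  (cond ((null tree) nil)
        ((symbolp tree)
         (intern (downcase (symbol-name tree))))
        ((consp tree)
         (cons (example-downcase-symbols (car tree))
               (example-downcase-symbols (cdr tree))))
        ;; strings, numbers, etc. pass through untouched
        (t tree)))

(example-downcase-symbols
 (read "(NIL \"Keep Me\" (FOO . 3))"))
;; => (nil "Keep Me" (foo . 3))
\end{idea}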
1778
1779 \begin{elisp}
1780 (defun translate-emacs-syntax-to-common-syntax (str)
1781   (with-temp-buffer
1782     (insert str)
1783     (dolist (swap '(("(\\` " "`")
1784                     ("(\\\, " ",")))
1785       (goto-char (point-min))
1786       (while (search-forward (first swap) nil t)
1787         (goto-char (match-beginning 0))
1788         (forward-sexp)
1789         (delete-char -1)
1790         (goto-char (match-beginning 0))
1791         (delete-region (match-beginning 0)
1792                        (match-end 0))
1793         (insert (second swap))))
1794     (buffer-substring-no-properties (point-min)
1795                                     (point-max))))
1796 \end{elisp}
1797
1798 \begin{notate}{Interactive `Defun'}
1799 Note, an improved version of this macro would allow me to
1800 specify that some Defuns are interactive and some are not.
1801 This could be done by examining the submitted body, and
1802 adjusting the defun if its car is an `interactive' form.
1803 Most of the Defuns will be things that people will want to
1804 use interactively, so making this change would probably be
a good idea.  What I'm doing in the meantime is just
writing two functions each time I need to make an
1807 interactive function that accesses Common Lisp data!
1808 \end{notate}
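
\begin{notate}{Example: spotting an `interactive' form}
The check described above, i.e. looking at the car of the
submitted body, might be sketched as follows.  The helper
name `example-split-interactive' is hypothetical; a
revised `Defun' could keep the first element on the Emacs
side and send only the rest to Common Lisp.
\end{notate}

\begin{idea}
;; Hypothetical helper, not part of Arxana.
(defun example-split-interactive (body)
  "Return (SPEC . REST); SPEC is nil if BODY has none."
  (if (eq (car-safe (car body)) 'interactive)
      (cons (car body) (cdr body))
    (cons nil body)))

(example-split-interactive
 '((interactive "sName: ") (get-article name)))
;; => ((interactive "sName: ") (get-article name))

(example-split-interactive '((get-article name)))
;; => (nil (get-article name))
\end{idea}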
1809
1810 \begin{notate}{Common Lisp evaluation of code chunks}
1811 Another potentially beneficial and simple approach is to
1812 write a form like `progn' that evaluates its contents on
1813 Common Lisp.  This saves us from having to rewrite all of
1814 the `defun' facilities into `Defun' (e.g. interactivity).
1815 But... the problem with \emph{this} is that Common Lisp
1816 doesn't know the names of all the variables that are
1817 defined in Emacs!  I'm not sure how to get all of the
values of these variables substituted \emph{first}, before
1819 the call to Common Lisp is made.
1820 \end{notate}
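
\begin{notate}{Example: substituting values first}
One naive way to get Emacs-side values substituted before
the call is to walk the form and replace each listed
variable with its quoted value.  The sketch below does
just that; the names `example-substitute-values' and
`example-current-heading' are made up, and the walker is
deliberately simple-minded (it only sees global values,
and it would also rewrite a listed symbol that appears in
function position or inside a quoted list).
\end{notate}

\begin{idea}
;; Made-up Emacs-side variable, used only in this example.
(defvar example-current-heading "arxana.tex")

;; Hypothetical helper, not part of Arxana.
(defun example-substitute-values (form vars)
  "Replace each symbol of FORM listed in VARS by its value."
  (cond ((and (symbolp form)
              (memq form vars)
              (boundp form))
         (list 'quote (symbol-value form)))
        ((consp form)
         (cons (example-substitute-values (car form) vars)
               (example-substitute-values (cdr form) vars)))
        (t form)))

(example-substitute-values
 '(get-names example-current-heading)
 '(example-current-heading))
;; => the list (get-names (quote "arxana.tex"))
\end{idea}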
1821
1822 \begin{notate}{Debugging `Defun'}
1823 In order to make debugging go easier, it might be nice to
1824 have an option to make the code that is supposed to be
1825 evaluated by Defun actually \emph{print} on the REPL
1826 instead of being processed through an invisible back-end.
1827 There could be a couple of different ways to do that, one
1828 would be to simulate just what a user might do, the other
1829 would be a happy medium between that and what we're doing
1830 now: just put our computery auto-generated code on the
1831 REPL and evaluate it.  (To some extent, I think the
1832 *slime-events* buffer captures this information, but it is
1833 not particularly easy to read.)
1834 \end{notate}
1835
1836 \begin{notate}{Interactive Common Lisp?}
1837 Suppose we set up some kind of interactive environment in
1838 Common Lisp; how would we go about passing this
1839 environment along to a user interacting via Emacs?  (Note
1840 that SLIME's presentation of the debugging loop is one
1841 good example.)
1842 \end{notate}
1843
1844 \subsection{Database interaction} \label{interaction}
1845
\begin{notate}{The `article' function} \label{the-article-function}
You can use this function to create an article with a
given name and contents.  Optionally, you can file it
under a heading and put it in a place.
\end{notate}

\begin{elisp}
(Defun article (name contents &optional heading place)
  (let ((coordinates (add-triple name
                                 "has content"
                                 contents)))
    ;; file the new triple under HEADING, if one was given
    (when heading (add-triple coordinates "in" heading))
    ;; a numeric PLACE names an existing place; any other
    ;; non-nil value asks for a fresh one
    (when place (if (numberp place)
                    (put-in-place coordinates place)
                  (put-in-place coordinates)))
    coordinates))
\end{elisp}
1863
\begin{notate}{The `scholium' function} \label{the-scholium-function}
You can use this function to link annotations to objects.
As with the `article' function, you can optionally file
the connection under a given heading and put it in a
place (cf. Note \ref{the-article-function}).
\end{notate}

\begin{elisp}
(Defun scholium (beginning link end &optional heading place)
  (let ((coordinates (add-triple beginning
                                 link
                                 end)))
    (when heading (add-triple coordinates "in" heading))
    (when place (if (numberp place)
                    (put-in-place coordinates place)
                  (put-in-place coordinates)))
    coordinates))
\end{elisp}
1882
1883 \begin{notate}{Uses of coordinates}
1884 Note that, if desired, you can feed input of the form
1885 '(\meta{code} \meta{ref}) into `article' and `scholium'.
It's convenient to do any further processing of the object
we've created while we still have hold of the coordinates
1888 returned by `add-triple' (cf. Note
1889 \ref{import-code-continuations} for an example).
1890 \end{notate}
1891
1892 \begin{notate}{Finding all the members of a list by type?}
1893 We just narrow according to type.
1894 \end{notate}
1895
1896 \begin{notate}{On `get-article'} \label{get-article}
Get the contents of the article named `name'.  Optional
argument `heading' lets us find the place under the given
heading that holds the name, and use that place instead of
the name itself.

We do not yet deal well with the ambiguous case in which
several places under the same heading correspond to the
given name.

Note also that out of the data returned by
`triples-given-beginning-and-middle', we should pick the
(hopefully just) ONE that corresponds to the given heading.
1909
1910 This means we need to pick over the list of triples
1911 returned here, and test each one to see if it is in our
1912 heading.  As to WHY there might be more than one ``has
1913 content'' for a place that we know to be in our
1914 heading... I'm not sure.  I guess we can go with the
1915 assumption that there is just one, for now.
1916 \end{notate}
1917
1918 \begin{elisp}
1919 (Defun get-article (name &optional heading)
1920   (let* ((place-pseudonyms
1921           (if heading
1922               (get-places-subject-to-constraint
1923                name `(nil "in" ,heading))
1924             (get-places name)))
1925          (goes-by (cond
1926                     ((eq (length place-pseudonyms) 1)
1927                      `(1 ,(car place-pseudonyms)))
1928                     ((triple-exact-match
1929                       name "in" heading)
1930                      name)
1931                     ((not heading) name)
1932                     (t nil))))
1933     (when goes-by
1934       ;; it might be nice to also return `goes-by'
1935       ;; so we can access the appropriate place again.
1936       (third (print-triple
1937               (resolve-ambiguity
1938                (triples-given-beginning-and-middle
1939                 goes-by "has content"))
1940               t)))))
1941 \end{elisp}
1942
1943 \begin{notate}{On `get-names'} \label{get-names}
This function simply gets the names of articles that have
content -- in other words, the beginning of every triple
built around the ``has content'' relation.
1947 \end{notate}
1948
1949 \begin{elisp}
1950 (Defun get-names (&optional heading)
1951   (let ((conditions (list (list nil "has content" t))))
1952     (when heading
1953       (setq conditions
1954             (append conditions
1955                     (list (list nil "in" heading)))))
1956     (mapcar
1957      (lambda (place-or-string)
1958        (cond
1959          ;; place case
1960          ((eq (first place-or-string) 1)
1961           (print-system-object
1962            (place-lookup (second place-or-string))))
1963          ;; string case
1964          ((eq (first place-or-string) 0)
1965           (print-system-object place-or-string))))
1966      (mapcar
1967       (lambda (triple)
1968         (isolate-beginning triple))
1969       (satisfy-conditions conditions)))))
1970 \end{elisp}
1971
1972 \begin{notate}{Contrasting cases} \label{contrasting-cases}
1973 Consider the difference between
1974 \begin{quote}
1975 (? ``has author'' ``Arthur C. Clarke'') \\
1976 (? ``has genre'' ``fiction'')
1977 \end{quote}
1978 and
1979 \begin{quote}
1980 (\emph{name} ``has content'' *) \\
1981 (\emph{name} ``in'' ``heading'')
1982 \end{quote}
1983 where, in the latter case, we know \emph{who} we're
1984 talking about, and we just want to limit the list of items
1985 generated by the ``*'' by the second condition.  This
1986 should help illustrate the difference between `get-names'
1987 (which is making a general query) and `get-article' (which
1988 already knows the name of a specific article), and the
1989 logic that they use.
1990 \end{notate}
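
\begin{notate}{Example: the two kinds of query}
The contrast can be tried out on a toy, in-memory list of
triples.  Everything in the sketch below is an assumption
made for illustration: the sample data, the `example-'
function names, and the simplification that both `nil' and
`t' in a pattern match anything.  The real functions go
through the database instead.
\end{notate}

\begin{idea}
(require 'cl-lib)

;; Made-up sample data, for illustration only.
(defvar example-triples
  '(("2001" "has author" "Arthur C. Clarke")
    ("2001" "has genre" "fiction")
    ("Profiles of the Future" "has author"
     "Arthur C. Clarke")
    ("Profiles of the Future" "has genre" "nonfiction")
    ("Dune" "has genre" "fiction")))

(defun example-match-p (pattern triple)
  "Non-nil if TRIPLE fits PATTERN (nil and t are wild)."
  (cl-every (lambda (p x)
              (or (null p) (eq p t) (equal p x)))
            pattern triple))

(defun example-query (pattern)
  "All sample triples that fit PATTERN."
  (cl-remove-if-not
   (lambda (triple) (example-match-p pattern triple))
   example-triples))

(defun example-satisfy-all (patterns)
  "Beginnings that fit every pattern in PATTERNS."
  (cl-remove-if-not
   (lambda (beginning)
     ;; fill the candidate in and re-check each pattern
     (cl-every (lambda (pattern)
                 (example-query
                  (cons beginning (cdr pattern))))
               patterns))
   (mapcar #'car (example-query (car patterns)))))

;; general query, in the style of `get-names'
(example-satisfy-all
 '((nil "has author" "Arthur C. Clarke")
   (nil "has genre" "fiction")))
;; => ("2001")

;; specific query, in the style of `get-article'
(mapcar (lambda (triple) (nth 2 triple))
        (example-query '("Dune" "has genre" t)))
;; => ("fiction")
\end{idea}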
1991
1992 \begin{notate}{Placing items from Emacs} \label{place-item}
1993 We periodically need to place items from within Emacs.
1994 The function `place-item' is a wrapper for `put-in-place'
1995 that makes this possible (it also provides the user with
1996 an extra option, namely to put the place itself under a
1997 given heading).
1998
1999 Notice that when the symbol is placed in some pre-existing
2000 place (which can only happen when `id' is not nil), that
2001 place may already be under some other heading.  We will ignore
2002 this case for now (since it seems that putting objects
2003 into \emph{new} places will be the preferred action), but
2004 later we will have to look at what to do in this other
2005 case.
2006 \end{notate}
2007
2008 \begin{elisp}
2009 (Defun place-item (symbol &optional id heading)
2010   (let ((coordinates (put-in-place symbol id)))
2011     (when heading (add-triple coordinates "in" heading))
2012     coordinates))
2013 \end{elisp}
2014
2015 \begin{notate}{Automatic classifications} \label{classifications}
2016 It will presumably make sense to offer increasingly
2017 ``automatic'' classifications for new objects.  At this
2018 point, we've set things up so that the user can optionally
2019 supply the name of \emph{one} heading that their new object
2020 is a part of.
2021
It may make more sense to allow an `\&rest headings'
argument, and add the triple to all of the specified
headings.  This would require modifying `Defun' to
2025 accommodate the `\&rest' idiom; see Note
2026 \ref{defun-interface}.
2027 \end{notate}
2028
2029 \begin{notate}{Postconditions and provenance}
2030 After adding something to the database, we may want to do
2031 something extra; perhaps generating provenance
2032 information, perhaps checking or enforcing database
2033 consistency, or perhaps running a hook that causes some
2034 update in the frontend (cf. Note \ref{provenance}).
2035 Provisions of this sort will come later, as will
2036 short-hand convenience functions for making particularly
2037 common complex entries.
2038 \end{notate}
2039
2040 \subsection{Importing \LaTeX\ documents} \label{importing}
2041
2042 \begin{notate}{Importing sketch} \label{importing-sketch}
2043 The code in this section imports a document as a
2044 collection of (sub-)sections and notes.  It gathers the
2045 sections, sub-sections, and notes recursively and records
2046 their content in a tree whose nodes are places (Note
2047 \ref{places}) and whose links express the ``component-of''
2048 relation described in Note \ref{order-of-order}.
2049
This representation lets us see the geometric,
hierarchical structure of the document we've imported.
2052 It exemplifies a general principle, that geometric data
2053 should be represented by relationships between places, not
2054 direct relationships between strings.  This is because
2055 ``the same'' string often appears in ``different'' places
2056 in any given document (e.g. a paper's many sub-sections
2057 titled ``Introduction'' will not all have the same
2058 content).
2059
2060 What goes into the places is in some sense arbitrary.  The
2061 key is that whatever is \emph{in} or \emph{attached} to
2062 these places must tell us everything we need to know about
2063 the part of the document associated with that place
2064 (e.g. in the case of a note, its title and contents).
2065 That's over and above the \emph{structural} links which
2066 say how the places relate to one another.  Finally, all of
2067 these places and structural links will be added to a
2068 heading that represents the document as a whole.
2069
2070 A natural convention we'll use will be to put the name
2071 of any document component that's associated with a given
2072 place into that place, and add all other information as
2073 annotations.
2074 \end{notate}
2075
2076 \begin{notate}{Ordered versus unordered data} \label{ordered-vs-unordered}
2077 The code in this section is an example of one way to work
2078 with ordered data (i.e. \LaTeX\ documents are not just
2079 hierarchical, but the elements at each level of the
2080 hierarchy are also ordered).
2081
Since \emph{many} artifacts are hierarchical (e.g. Lisp
2083 code), we should try to be compatible with \emph{native}
2084 methods for working with order (in the case of Lisp, feed
2085 the code into a Lisp processor and use CDR and CAR, etc.).
2086
2087 We \emph{can} use triples such as (``rank'' ``1''
2088 ``Fred'') and (``rank'' ``2'' ``Barney'') to talk about
2089 order.  There may be some SQL techniques that would help.
2090 (FYI, order can be handled very explicitly in Elephant!)
2091
2092 In order to account for \emph{different} orderings, we
2093 need one more piece of data -- some explicit treatment of
where the order \emph{is}; in other words, headings.
(The table below illustrates the fact that a heading is not so
2096 different from ``an additional triple''; indeed, the only
2097 reason to make them different is to have the extra
2098 convenience of having their elements be numbered.)
2099
2100 \begin{center}
2101 \begin{tabular}{|lll|l|}
2102 \hline
2103 rank & 1 & Fred & Friday \\
2104 rank & 2 & Barney & Friday \\
2105 rank & 1 & Barney & Saturday \\
2106 rank & 2 & Fred & Saturday \\
2107 \hline
2108 \end{tabular}
2109 \end{center}
2110 \end{notate}
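
\begin{notate}{Example: reading an order off the table}
The table can be treated as a list of triples, each tagged
with the heading it belongs to.  The sketch below recovers
the order that holds under one heading; the variable and
function names are invented for this example, and the rows
are simply the ones from the table.
\end{notate}

\begin{idea}
(require 'cl-lib)

;; The rows of the table: a triple plus its heading.
(defvar example-rankings
  '(("rank" 1 "Fred" "Friday")
    ("rank" 2 "Barney" "Friday")
    ("rank" 1 "Barney" "Saturday")
    ("rank" 2 "Fred" "Saturday")))

;; Hypothetical helper, not part of Arxana.
(defun example-order-under (heading)
  "Names ranked under HEADING, best rank first."
  (let ((rows (cl-remove-if-not
               (lambda (row) (equal (nth 3 row) heading))
               example-rankings)))
    (mapcar (lambda (row) (nth 2 row))
            ;; `sort' is destructive, so sort a copy
            (sort (copy-sequence rows)
                  (lambda (a b)
                    (< (nth 1 a) (nth 1 b)))))))

(example-order-under "Friday")
;; => ("Fred" "Barney")
(example-order-under "Saturday")
;; => ("Barney" "Fred")
\end{idea}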
2111
2112 \begin{notate}{The order of order} \label{order-of-order}
2113 The triples (``rank'' ``1'' ``Fred'') and (``rank'' ``2''
2114 ``Barney'') mentioned in Note \ref{ordered-vs-unordered}
2115 are easy enough to read and understand; it might be more
2116 natural in some ways for us to say (``Fred'' ``rank''
2117 ``1'') -- Fred has rank 1.  In this section, we're
2118 concerned with talking about the ordered parts of a
2119 document, and ($A$ $n$ $B$) seems like an intuitive way to
2120 say ``$A$'s $n$th component is $B$''.
2121 \end{notate}
2122
2123 \begin{notate}{It's not overdoing it, right?}
2124 When importing \emph{this} document, we see links like the
2125 following.  I hope that's not ``overdoing it''.  (Take a
2126 look at Note \ref{get-article} and Note \ref{get-names} to
2127 see how we go about getting information out of the
database.)  We could get rid of one link if headings were
2129 database objects (cf. Note
2130 \ref{theories-as-database-objects}).
2131 \end{notate}
2132
2133 \begin{idea}
2134 "T557[P135|Web interface|.in.arxana.tex]"
2135 "T558[Future plans.9.P135|Web interface|]"
2136 "T559[T558[Future plans.9.P135|Web interface|].in.arxana.tex]"
2137 \end{idea}
2138
2139 \begin{notate}{Importing in general} \label{importing-generally}
2140 We will eventually have a collection of parsers to get
2141 various kinds of documents into the system in various
2142 different ways (Note \ref{parsing}).  For now, this
2143 section gives a simple way to get some sorts of
2144 \LaTeX\ documents into the system, namely documents
2145 structured along the same lines as the document you're
2146 reading now!
2147
2148 An interesting approach to parsing \emph{math} documents
2149 has been undertaken in the \LaTeX ML
2150 project.\footnote{{\tt http://dlmf.nist.gov/LaTeXML/}}
2151 Eventually it would be nice to get that level of detail
here, too!  Emacspeak is another example of a
2153 \LaTeX\ parser that deals with large-scale textual
2154 structures as well as smaller bits and
2155 pieces.\footnote{{\tt
2156     http://www.cs.cornell.edu/home/raman/aster/aster-thesis.ps}}
2157
2158 It would probably be useful to put together some parsers
2159 for HTML and wiki code soon.
2160 \end{notate}
2161
2162 \begin{notate}{On `import-buffer'}
2163 This function imports \LaTeX\ documents, taking care of
2164 the non-recursive aspects of this operation.  It imports
frontmatter (everything up to the first
\verb+\section+), but assumes ``backmatter'' is
2167 trivial, and does not import it.  The imported material is
2168 classified as a ``document'' with the same name as the
2169 imported buffer.
2170 \end{notate}
2171
2172 \begin{elisp}
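(defun import-buffer (&optional buffername)
  ;; default to the current buffer's name, so that the
  ;; triples below are never given nil as a document name
  (setq buffername (or buffername (buffer-name)))
  (save-excursion
    (set-buffer (get-buffer buffername))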
2177     (goto-char (point-min))
2178     (search-forward-regexp "\\\\begin{document}")
2179     (search-forward-regexp "\\\\section")
2180     (goto-char (match-beginning 0))
2181     ;; other links will be made in the "heading of this
2182     ;; document", but here we make a broader assertion.
2183     (scholium buffername "is a" "document")
2184     (scholium buffername
2185               "has frontmatter"
2186               (buffer-substring-no-properties
2187                (point-min)
2188                (point))
2189               buffername)
2190     ;;; These should maybe be scholia attached to
2191     ;; root-coords (below), but for some reason that
2192     ;; wasn't working so well -- investigate later --
2193     ;; maybe it just wasn't good to run after running
2194     ;; `import-within'.
2195     (let* ((root-coords (place-item buffername nil
2196                                     buffername))
2197            (levels
2198             '("section" "subsection" "subsubsection"))
2199            (current-parent buffername)
2200            (level-end nil)
2201            (sections (import-within levels))
2202            (index 0))
2203       (while sections
2204         (let ((coords (car sections)))
2205           (setq index (1+ index))
2206           (scholium root-coords
2207                     index
2208                     coords
2209                     buffername))
2210         (setq sections (cdr sections))))))
2211 \end{elisp}
2212
2213 \begin{notate}{On `import-within'}
2214 Recurse through levels of sectioning to import
2215 \LaTeX\ code.
2216
2217 It would be good if we could do something about sections
2218 that contain neither subsections nor notes (for example, a
2219 preface), or, more generally, about text that is not
contained in any environment (possibly text that appears
before any section).  We'll save things like this for another
2222 editing round!
2223
2224 For the moment, we've decided to build the document
2225 hierarchy with links that are blind to whether the $k$th
2226 component of a section is a note or a subsection.
2227 Children that are notes are attached in the subroutine
2228 `import-notes' and those that are sections are attached in
2229 `import-within'.  Users can find out what type of object
2230 they are looking at based on whether or not it ``has
2231 content''.
2232
2233 Incidentally, when looking for the end of an importing
2234 level, `nil' is an OK result -- if this is the \emph{last}
2235 section at this level \emph{and} there is no subsequent
2236 section at a higher level.
2237 \end{notate}
2238
2239 \begin{elisp}
2240 (defun import-within (levels)
2241   (let ((this-level (car levels))
2242         (next-level (car (cdr levels))) answer)
2243     (while (re-search-forward
2244             (concat
2245              "^\\\\" this-level "{\\([^}\n]*\\)}"
2246              "\\( +\\\\label{\\)?"
2247              "\\([^}\n]*\\)?")
2248             level-end t)
2249       (let* ((name (match-string-no-properties 1))
2250              (at (place-item name nil buffername))
2251              (level-end
2252               (or (save-excursion
2253                     (search-forward-regexp
2254                      (concat "^\\\\" this-level "{.*")
2255                      level-end t))
2256                   level-end))
2257              (notes-end
2258               (if next-level
2259                   (or (progn (point)
2260                              (save-excursion
2261                                (search-forward-regexp
2262                                 (concat "^\\\\"
2263                                         next-level "{.*")
2264                                 level-end t)))
2265                       level-end)
2266                 level-end))
2267              (index (let ((current-parent at))
2268                       (import-notes notes-end)))
2269              (subsections (let ((current-parent at))
2270                             (import-within (cdr levels)))))
2271         (while subsections
2272           (let ((coords (car subsections)))
2273             (setq index (1+ index))
2274             (scholium at
2275                       index
2276                       coords
2277                       buffername)
2278             (setq subsections (cdr subsections))))
2279         (setq answer (cons at answer))))
2280     (reverse answer)))
2281 \end{elisp}
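
\begin{notate}{Example: what the heading regexp captures}
To see what the regular expression in `import-within'
picks out of a heading line, it can be run on a single
line in a temporary buffer.  The wrapper
`example-parse-heading' is a made-up name for this
illustration; the regexp itself is the one used above.
\end{notate}

\begin{idea}
;; Hypothetical helper, not part of Arxana.
(defun example-parse-heading (line level)
  "Return (NAME LABEL) for a LEVEL heading in LINE."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    (when (re-search-forward
           (concat
            "^\\\\" level "{\\([^}\n]*\\)}"
            "\\( +\\\\label{\\)?"
            "\\([^}\n]*\\)?")
           nil t)
      ;; group 1 is the title, group 3 the label text
      (list (match-string-no-properties 1)
            (match-string-no-properties 3)))))

(example-parse-heading
 "\\subsection{Database interaction} \\label{interaction}"
 "subsection")
;; => ("Database interaction" "interaction")
\end{idea}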
2282
2283 \begin{notate}{On `import-notes'} \label{import-notes}
2284 We're going to make the daring assumption that the
2285 ``textual'' portions of incoming \LaTeX\ documents are
2286 contained in ``Notes''.  That assumption is true, at
2287 least, for the current document.  The function returns the
number of notes imported, so that
2289 `import-within' knows where to start counting this
2290 section's non-note children.
2291
2292 Would this same function work to import all notes from a
2293 buffer without examining its sectioning structure?  Not
2294 quite, but close! (Could be a fun exercise to fix this.)
2295 \end{notate}
2296
2297 \begin{elisp}
2298 (defun import-notes (end)
2299   (let ((index 0))
2300     (while (re-search-forward (concat "\\\\begin{notate}"
2301                                       "{\\([^}\n]*\\)}"
2302                                       "\\( +\\\\label{\\)?"
2303                                       "\\([^}\n]*\\)?")
2304                               end t)
2305       (let* ((name
2306               (match-string-no-properties 1))
2307              (tag (match-string-no-properties 3))
             (beg
              ;; `forward-line' is the non-interactive
              ;; counterpart of `next-line'
              (progn (forward-line 1)
                     (point)))
2311              (end
2312               (progn (search-forward-regexp
2313                       "\\\\end{notate}")
2314                      (match-beginning 0)))
2315              (coords (place-item name nil buffername)))
2316         (setq index (1+ index))
2317         (scholium current-parent
2318                   index
2319                   coords
2320                   buffername)
2321         ;; not in the heading
2322         (scholium coords
2323                   "has content"
2324                   (buffer-substring-no-properties
2325                    beg end))
2326         (import-code-continuations coords)))
2327     index))
2328 \end{elisp}
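
\begin{notate}{Example: collecting notes without sections}
As a first step towards the exercise mentioned above, here
is a sketch that gathers every note in a piece of text
while ignoring the sectioning structure altogether.  It
returns plain (name . content) pairs instead of writing to
the database, and the name `example-collect-notes' is an
assumption made for this example.
\end{notate}

\begin{idea}
;; Hypothetical helper, not part of Arxana.
(defun example-collect-notes (text)
  "Return a list of (NAME . CONTENT) for notes in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((notes nil))
      (while (re-search-forward
              (concat "\\\\begin{notate}"
                      "{\\([^}\n]*\\)}")
              nil t)
        (let* ((name (match-string-no-properties 1))
               ;; content starts on the line after the title
               (beg (progn (forward-line 1) (point)))
               (end (progn (re-search-forward
                            "\\\\end{notate}")
                           (match-beginning 0))))
          (push (cons name
                      (buffer-substring-no-properties
                       beg end))
                notes)))
      (nreverse notes))))

;; The sample text is assembled with `concat' so that this
;; example is not itself mistaken for a note on import.
(example-collect-notes
 (concat "\\begin" "{notate}{First} \\label{first}\n"
         "Some text.\n"
         "\\end{notate}\n"))
;; => (("First" . "Some text.\n"))
\end{idea}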
2329
2330 \begin{notate}{On `import-code-continuations'} \label{import-code-continuations}
This runs within the scope of `import-notes', to turn the
series of Lisp chunks or other code snippets that follow a
given note into scholia attached to that note.  Each
separate snippet becomes its own annotation.

The explicitly numbered regexp group (\verb+\(?1:...\)+)
used here only works with Emacs version 23 or higher.

I'm noticing a problem with the way the `looking-at'
form behaves.  It matches the expression in question,
but then the match-end is reported as one character
less than it is supposed to be.  Maybe `looking-at' is
just not as good as `re-search-forward'?  But it's
what seems easiest to use.
2345 \end{notate}
2346
2347 \begin{elisp}
2348 (defun import-code-continuations (coords)
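  (let ((possible-environments
         ;; explicitly numbered group, so the environment
         ;; name is always available as match 1; `elisp' is
         ;; the environment this document uses for Emacs Lisp
         "\\(?1:elisp\\|lisp\\|idea\\|common\\)"))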
2351     (while (looking-at
2352             (concat "\n*?\\\\begin{"
2353                     possible-environments
2354                     "}"))
2355       (let* ((beg (match-end 0))
2356              (environment (match-string 1))
2357              (end (progn (search-forward-regexp
2358                           (concat "\\\\end{"
2359                                   environment
2360                                   "}"))
2361                          (match-beginning 0)))
2362              (content (buffer-substring-no-properties
2363                        beg
2364                        end)))
2365         (scholium (scholium coords
2366                             "has attachment"
2367                             content)
2368                   "has type"
2369                   environment)))))
2370 \end{elisp}
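
\begin{notate}{Example: matching a code environment}
The `looking-at' test can be tried in isolation on a short
string.  In the sketch below, the wrapper name
`example-environment-after-note' and the list of
environment names are assumptions made for illustration;
the point is only to show how an explicitly numbered group
hands back the environment name as match 1.
\end{notate}

\begin{idea}
;; Hypothetical helper, not part of Arxana.
(defun example-environment-after-note (text)
  "Name of the code environment that opens TEXT, or nil."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (looking-at
           (concat "\n*?\\\\begin{"
                   "\\(?1:elisp\\|lisp\\|idea\\|common\\)"
                   "}"))
      (match-string-no-properties 1))))

(example-environment-after-note
 "\n\\begin{common}\n(+ 1 2)\n\\end{common}\n")
;; => "common"

(example-environment-after-note "\nPlain text.\n")
;; => nil
\end{idea}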
2371
2372 \begin{notate}{On `autoimport-arxana'} \label{autoimport-arxana}
2373 This just calls `import-buffer', and imports this document
2374 into the system.
2375 \end{notate}
2376
2377 \begin{elisp}
2378 (defun autoimport-arxana ()
2379   (interactive)
2380   (import-buffer "arxana.tex"))
2381 \end{elisp}
2382
2383 \begin{notate}{Importing textual links}
2384 Of course, it would be good to import the links that users
2385 make between articles, since then we can quickly navigate
2386 from an article to the various articles that cite that
2387 article, as well as follow the usual forward-directional
2388 links.  Indeed, we should be able to browse each article
2389 within a ``neighborhood'' of other related articles.
2390 (We'll need to import labels as well, of course.)
2391 \end{notate}
2392
2393 \subsection{Browsing database contents} \label{browsing}
2394
2395 \begin{notate}{Browsing sketch} \label{browsing-sketch}
2396 This section facilitates browsing of documents represented
2397 with structures like those created in Section
2398 \ref{importing}, and sets the ground for browsing other
2399 sorts of contents (e.g. collections of tasks, as in
2400 Section \ref{managing-tasks}).
2401
2402 In order to facilitate general browsing, it is not enough
2403 to simply use `get-article' (Note \ref{get-article}) and
2404 `get-names' (Note \ref{get-names}), although these
2405 functions provide our defaults.  We must provide the means
2406 to find and display different things differently -- for
example, a section's table of contents will typically
2408 be displayed differently from its actual contents.
2409
2410 Indeed, the ability to display and select elements of
2411 document sections (Note \ref{display-section}) is
2412 basically the core browsing deliverable.  In the process
2413 we develop a re-usable article selector (Note
2414 \ref{selector}; cf. Note \ref{browsing-tasks}).  This in
2415 turn relies on a flexible function for displaying
2416 different kinds of articles (Note \ref{display-article}).
2417 \end{notate}
2418
2419 \begin{notate}{On `display-article'} \label{display-article}
2420 This function takes in the name of the article to display.
2421 Furthermore, it takes optional arguments `retriever' and
2422 `formatter', which tell it how to look up and/or format
2423 the information for display, respectively.
2424
2425 Thus, either we make some statement up front (choosing our
2426 `formatter' based on what we already know about the
2427 article), or we decide what to display after making some
2428 investigation of information attached to the article, some
2429 of which may be retrieved and displayed (this requires
2430 that we specify a suitable `retriever' and a complementary
2431 `formatter').
2432
2433 For example, the major mode in which to display the
2434 article's contents could be stored as a scholium attached
2435 to the article; or we might maintain some information
2436 about ``areas'' of the database that would tell us up
2437 front which mode is associated with the current area.
2438 (The default is to simply insert the data with no markup
2439 whatsoever.)
2440
2441 Observe that this works when no heading argument is given,
2442 because in that case `get-article' looks for \emph{all}
2443 place pseudonyms.  (But of course that won't work well
2444 when we have multiple theories containing things with the
2445 same names, so we should get used to using the heading
2446 argument.)
2447
2448 (The business about requiring the data to be a sequence
2449 before engaging in further formatting is, of course, just
2450 a matter of expediency for making things work with the
2451 current dataset.)
2452 \end{notate}
2453
2454 \begin{elisp}
2455 (defun display-article
2456   (name &optional heading retriever formatter)
2457   (interactive "Mname: ")
2458   (let* ((data (if retriever
2459                    (funcall retriever name heading)
2460                  (get-article name heading))))
2461     (when (and data (sequencep data))
2462       (save-excursion
2463         (if formatter
2464             (funcall formatter data heading)
2465           (pop-to-buffer (get-buffer-create
2466                           "*Arxana Display*"))
2467           (delete-region (point-min) (point-max))
2468           (insert "NAME: " name "\n\n")
2469           (insert data)
2470           (goto-char (point-min)))))))
2471 \end{elisp}
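
\begin{notate}{Sketch: a custom retriever and formatter}
To illustrate the optional arguments, here is a minimal
sketch, not part of the system proper.  The names
`sketch-retrieve-upcased' and `sketch-format-in-text-mode'
are hypothetical.  The only assumptions are the calling
conventions visible in `display-article' above: the
retriever is called with the name and the heading, and the
formatter is called with the retrieved data and the
heading.
\end{notate}

\begin{elisp}
;; Hypothetical retriever: stands in for a database lookup
;; by returning a string derived from NAME.  HEADING is
;; accepted but ignored.
(defun sketch-retrieve-upcased (name heading)
  (upcase name))

;; Hypothetical formatter: shows DATA in a buffer of its
;; own, in `text-mode', instead of inserting it with no
;; markup whatsoever.
(defun sketch-format-in-text-mode (data heading)
  (pop-to-buffer (get-buffer-create "*Arxana Sketch*"))
  (erase-buffer)
  (insert data)
  (text-mode)
  (goto-char (point-min)))

;; Usage:
;; (display-article "some article" nil
;;                  'sketch-retrieve-upcased
;;                  'sketch-format-in-text-mode)
\end{elisp}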
2472
2473 \begin{notate}{An interactive article selector} \label{selector}
2474 The function `get-names' (Note \ref{get-names}) and
2475 similar functions can give us a collection of articles.
2476 The next few functions provide an interactive
2477 functionality for moving through this collection to find
2478 the article we want to look at.
2479
2480 We define a ``display style'' that the article selector
2481 uses to determine how to display various articles.  These
2482 display styles are specified by text properties attached
2483 to each option the selector provides.  Similarly, when
2484 we're working within a given heading, the relevant heading
2485 is also specified as a text property.
2486
2487 At selection time, these text properties are checked to
2488 determine which information to pass along to
2489 `display-article'.
2490 \end{notate}
2491
2492 \begin{elisp}
2493 (defvar display-style '((nil . (nil nil))))
2494
2495 (defun thing-name-at-point ()
2496   (buffer-substring-no-properties
2497    (line-beginning-position)
2498    (line-end-position)))
2499
2500 (defun get-display-type ()
2501   (get-text-property (line-beginning-position)
2502                      'arxana-display-type))
2503
2504 (defun get-relevant-heading ()
2505   (get-text-property (line-beginning-position)
2506                      'arxana-relevant-heading))
2507
2508 (defun arxana-list-select ()
2509   (interactive)
2510   (apply 'display-article
2511          (thing-name-at-point)
2512          (get-relevant-heading)
2513          (cdr (assoc (get-display-type)
2514                      display-style))))
2515
2516 (define-derived-mode arxana-list-mode fundamental-mode
2517   "arxana-list" "Arxana List Mode.
2518
2519 \\{arxana-list-mode-map}")
2520
2521 (define-key arxana-list-mode-map (kbd "RET")
2522             'arxana-list-select)
2523 \end{elisp}
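
\begin{notate}{Sketch: how text properties pick a display style}
The lookup performed at selection time can be tried out in
isolation.  The following is a sketch under stated
assumptions: `sketch-display-style' is a hypothetical
stand-in for `display-style', and
`sketch-section-retriever' is a made-up symbol that is
never called.  The point is only to show that a line
carrying the `arxana-display-type' property is mapped to a
(retriever formatter) pair, and that a line without the
property falls back to the default entry.
\end{notate}

\begin{elisp}
;; Hypothetical stand-in for `display-style'.
(defvar sketch-display-style
  '((nil . (nil nil))
    (section . (sketch-section-retriever nil))))

;; Same lookup as in `arxana-list-select', minus the call
;; to `display-article'.
(defun sketch-dispatch-at-point ()
  (cdr (assoc (get-text-property (line-beginning-position)
                                 'arxana-display-type)
              sketch-display-style)))

;; Usage:
;; (with-temp-buffer
;;   (insert (propertize "Introduction"
;;                       'arxana-display-type 'section)
;;           "\n")
;;   (insert "A plain note\n")
;;   (goto-char (point-min))
;;   (list (sketch-dispatch-at-point)
;;         (progn (forward-line 1)
;;                (sketch-dispatch-at-point))))
;; => ((sketch-section-retriever nil) (nil nil))
\end{elisp}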
2524
2525 \begin{notate}{On `pick-a-name'} \label{pick-a-name}
2526 Here `generate' is the name of a function to call to
2527 generate a list of items to display, and `format' is a
2528 function to put these items (including any mark-up) into
2529 the buffer from which individual items can then be
2530 selected.
2531
2532 One simple way to get a list of names to display would be
2533 to reuse a list that we had already produced (this would
2534 save querying the database each time).  We could, in fact,
2535 store a history list of lists of names that had been
2536 displayed previously (cf. Note \ref{local-storage}).
2537
2538 We'll eventually want versions of `generate' that provide
2539 various useful views into the data, e.g., listing all of
2540 the elements of a given section (Note
2541 \ref{display-section}).
2542
2543 Finding all the elements that match a given search term,
2544 whether that's just normal text search or some kind of
2545 structured search, would be worthwhile too.  Upgrading the
2546 display to e.g. color-code listed elements according to
2547 their type would be another nice feature to add.
2548 \end{notate}
2549
2550 \begin{elisp}
2551 (defun pick-a-name (&optional generate format heading)
2552   (interactive)
2553   (let ((items (if generate
2554                    (funcall generate)
2555                  (get-names heading))))
2556     (when items
2557       (set-buffer (get-buffer-create "*Arxana Articles*"))
2558       (setq buffer-read-only nil)
2559       (delete-region (point-min)
2560                      (point-max))
2561       (if format
2562           (funcall format items)
2563         (mapc (lambda (item) (insert item "\n")) items))
2564       (setq buffer-read-only t)
2565       (arxana-list-mode)
2566       (goto-char (point-min))
2567       (pop-to-buffer (get-buffer "*Arxana Articles*")))))
2568 \end{elisp}
2569
2570 \begin{notate}{On `display-section'} \label{display-section}
2571 When browsing a document, if you select a section, you
2572 should display a list of that section's constituent
2573 elements, be they notes or subsections.  The question
2574 comes up: when you go to display something, how do you
2575 know whether you're looking at the name of a section, or
2576 the name of an article?
2577
2578 When you get the section's contents out of the database (Note
2579 \ref{get-section-contents}), each item carries a ``noteness'' flag that answers this.
2580 \end{notate}
2581
2582 \begin{elisp}
2583 (defun display-section (name &optional heading)
2584   (interactive (list (read-string
2585                       (concat
2586                        "name (default "
2587                        (buffer-name) "): ")
2588                       nil nil (buffer-name))))
2589   ;; should this pop to the Articles window?
2590   (pick-a-name `(lambda ()
2591                   (get-section-contents
2592                    ,name ,heading))
2593                `(lambda (items)
2594                   (format-section-contents
2595                    items ,heading))))
2596
2597 (add-to-list 'display-style
2598              '(section . (display-section
2599                           nil)))
2600 \end{elisp}
2601
2602 \begin{notate}{On `get-section-contents'} \label{get-section-contents}
2603 Sent by `display-section' (Note \ref{display-section})
2604 to `pick-a-name' as a generator for the table of contents
2605 of the section with the given name in the given heading.
2606
2607 This function first finds the triples that begin with the
2608 (placed) name of the section, then checks to see which of
2609 these are in the heading of the document we're examinining
2610 (in other words, which of these links represent structural
2611 information about that document).  It also looks at the
2612 items found at the end of these links to see if they are
2613 sections or notes (``noteness'' is determined by them
2614 having content).  The links are then sorted by their
2615 middles (which show the order that these components
2616 have in the section we're examining).  After this ordering
2617 information has been used for sorting, it is deleted, and
2618 we're left with just a list of names in the appropriate
2619 order together with an indication of their noteness.
2620 \end{notate}
2621
2622 \begin{elisp}
2623 (Defun get-section-contents (name heading)
2624   (let (contents)
2625     (dolist (triple (triples-given-beginning
2626                      `(1 ,(resolve-ambiguity
2627                            (get-places name)))))
2628       (when (triple-exact-match
2629              `(2 ,(car triple)) "in" heading)
2630         (let* ((number (print-middle triple))
2631                (site (isolate-end triple))
2632                (noteness
2633                 (when (triples-given-beginning-and-middle
2634                        site "has content")
2635                   t)))
2636           (setq contents
2637                 (cons (list number
2638                             (print-system-object
2639                              (place-contents site))
2640                             noteness)
2641                       contents)))))
2642     (mapcar 'cdr
2643             (sort contents
2644                   (lambda (component1 component2)
2645                     (< (parse-integer (car component1))
2646                        (parse-integer (car component2))))))))
2647 \end{elisp}
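
\begin{notate}{Sketch: the sort-and-strip step}
The last step of `get-section-contents' can be mimicked in
Emacs Lisp on plain lists, which may help when checking
the ordering logic without a database at hand.  This is
only a sketch: `sketch-order-contents' is a hypothetical
name, and `string-to-number' stands in for Common Lisp's
`parse-integer'.  Note that the comparison has to be
numeric, since comparing the middles as strings would put
``10'' before ``2''.
\end{notate}

\begin{elisp}
;; CONTENTS is a list of (NUMBER NAME NOTENESS), with
;; NUMBER given as a string.  Sort by NUMBER, then drop it.
(defun sketch-order-contents (contents)
  (mapcar 'cdr
          (sort (copy-sequence contents)
                (lambda (component1 component2)
                  (< (string-to-number (car component1))
                     (string-to-number (car component2)))))))

;; Usage:
;; (sketch-order-contents
;;  '(("2" "Methods" nil)
;;    ("10" "Appendix" nil)
;;    ("1" "Abstract" t)))
;; => (("Abstract" t) ("Methods" nil) ("Appendix" nil))
\end{elisp}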
2648
2649 \begin{notate}{On `format-section-contents'} \label{format-section-contents}
2650 A formatter for document contents, used by
2651 `display-section' (Note \ref{display-section}) as input
2652 for `pick-a-name' (Note \ref{pick-a-name}).
2653
2654 Instead of just printing the items one by one,
2655 like the default formatter in `pick-a-name' does,
2656 this version adds appropriate text properties, which
2657 we determine based on the second component of
2658 each of the `items' to format.
2659 \end{notate}
2660
2661 \begin{elisp}
2662 (defun format-section-contents (items heading)
2663   ;; just replicating the default and building on that.
2664   (mapc (lambda (item)
2665           (insert (car item))
2666           (let* ((beg (line-beginning-position))
2667                  (end (1+ beg)))
2668             (unless (second item)
2669               (put-text-property beg end
2670                                  'arxana-display-type
2671                                  'section))
2672             (put-text-property beg end
2673                                'arxana-relevant-heading
2674                                heading))
2675           (insert "\n"))
2676         items))
2677 \end{elisp}
2678
2679 \begin{notate}{On `display-document'} \label{display-document}
2680 When browsing a document, you should first display its
2681 top-level table of contents.  (Most typically, a list of
2682 all of that document's major sections.)  In order to do
2683 this, we must find the triples that begin at the node
2684 representing this document \emph{and} that are in the
2685 heading of this document.  This boils down to treating the
2686 document's root as if it were a section and using the
2687 function `display-section' (Note \ref{display-section}).
2688 \end{notate}
2689
2690 \begin{elisp}
2691 (defun display-document (name)
2692   (interactive (list (read-string
2693                       (concat
2694                        "name (default "
2695                        (buffer-name) "): ")
2696                       nil nil (buffer-name))))
2697   (display-section name name))
2698 \end{elisp}
2699
2700 \begin{notate}{Work with `heading' argument}
2701 We should make sure that if we know the heading we're
2702 working with (e.g. the name of the document we're
2703 browsing) that this information gets communicated in the
2704 background of the user interaction with the article
2705 selector.
2706 \end{notate}
2707
2708 \begin{notate}{Selecting from a hierarchical display} \label{hierarchical-display}
2709 A fancier ``article selector'' would be able to display
2710 several sections with nice indenting to show their
2711 hierarchical order.
2712 \end{notate}
2713
2714 \begin{notate}{Browser history tricks} \label{history-tricks}
2715 I want to put together (or put back together) something
2716 similar to the multihistoried browser that I had going in
2717 the previous version of Arxana and my Emacs/Lynx-based web
2718 browser, Nero\footnote{{\tt http://metameso.org/~joe/nero.el}}.
2719 The basic features are:
2720 (1) forward, back, and up inside the structure of a given
2721 document; (2) switch between tabs.  More advanced features
2722 might include: (3) forward and back globally across all
2723 tabs; (4) explicit understanding of paths that loop.
2724
2725 These sorts of features are independent of the exact
2726 details of what's printed to the screen each time
2727 something is displayed.  So, for instance, you could flip
2728 between section manifests a la Note \ref{display-section},
2729 or between hierarchical displays a la Note
2730 \ref{hierarchical-display}, or some combination; the key
2731 thing is just to keep track in some sensible way of
2732 whatever's been displayed!
2733 \end{notate}
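
\begin{notate}{Sketch: a single back/forward history}
Feature (1) above can be sketched with two stacks and a
current item.  This is an illustration only, not a
reconstruction of the history code in Nero or in the
previous version of Arxana; all of the `sketch-history-'
names are hypothetical.  The items stored can be any
description of what was displayed (a name, or a name
together with a heading), which is what keeps the history
independent of the details of the display.  A tabbed
browser would keep one such set of variables per tab.
\end{notate}

\begin{elisp}
(defvar sketch-history-back-stack nil
  "Items displayed before the current one, newest first.")
(defvar sketch-history-forward-stack nil
  "Items we moved back from, nearest first.")
(defvar sketch-history-current nil
  "The item currently displayed, or nil.")

;; Record that ITEM is now displayed.  Visiting something
;; new discards the forward stack, as in most browsers.
(defun sketch-history-visit (item)
  (when sketch-history-current
    (push sketch-history-current sketch-history-back-stack))
  (setq sketch-history-current item
        sketch-history-forward-stack nil)
  item)

;; Move back one step, if possible; return the current item.
(defun sketch-history-go-back ()
  (when sketch-history-back-stack
    (push sketch-history-current sketch-history-forward-stack)
    (setq sketch-history-current
          (pop sketch-history-back-stack)))
  sketch-history-current)

;; Move forward one step, if possible; return the current item.
(defun sketch-history-go-forward ()
  (when sketch-history-forward-stack
    (push sketch-history-current sketch-history-back-stack)
    (setq sketch-history-current
          (pop sketch-history-forward-stack)))
  sketch-history-current)
\end{elisp}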
2734
2735 \begin{notate}{Local storage for browsing purposes} \label{local-storage}
2736 Right now, in order to browse the contents of the
2737 database, you need to query the database every time.  It
2738 might be handy to offer the option to cache names of
2739 things locally, and only sync with the database from time
2740 to time.  Indeed, the same principle could apply in
2741 various places; however, it may also be somewhat
2742 complicated to set up.  Using two systems for storage, one
2743 local and one permanent, is certainly more heavy-duty than
2744 just using one permanent storage system and the local
2745 temporary display.  However, one thing in favor of local
2746 storage systems is that that's what I used in the
2747 previous prototype of Arxana -- so some code already
2748 exists for local storage!  (Caching the list of
2749 \emph{names} we just made a selection from would be one
2750 simple expedient, see Note \ref{pick-a-name}.)
2751 \end{notate}
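
\begin{notate}{Sketch: caching lists of names}
The ``simple expedient'' of caching names could look
something like the following.  This is a sketch under
assumptions, not the local storage code from the previous
prototype: `sketch-name-cache' and `sketch-cached-names'
are hypothetical names, and the fetching function is
passed in as an argument so that nothing here depends on
how the database is actually queried.
\end{notate}

\begin{elisp}
(defvar sketch-name-cache nil
  "Alist mapping a heading to the names last fetched for it.")

;; Return the names for HEADING.  FETCH, a function of one
;; argument (the heading), is called only when nothing is
;; cached yet or when REFRESH is non-nil.
(defun sketch-cached-names (heading fetch &optional refresh)
  (let ((entry (assoc heading sketch-name-cache)))
    (when (or refresh (null entry))
      (setq sketch-name-cache
            (delq entry sketch-name-cache))
      (setq entry (cons heading (funcall fetch heading)))
      (push entry sketch-name-cache))
    (cdr entry)))

;; Usage, assuming `get-names' accepts a heading:
;; (sketch-cached-names "my document" 'get-names)
;; (sketch-cached-names "my document" 'get-names t) ; re-sync
\end{elisp}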
2752
2753 \begin{notate}{Hang onto absolute references}
2754 Since `get-article' (Note \ref{get-article}) translates
2755 strings into their ``place pseudonyms'', we may want to
2756 hang onto those pseudonyms, because they are, in fact, the
2757 absolute references to the objects we end up working with.
2758 In particular, they should probably go into the
2759 text-property background of the article selector, so it
2760 will know right away what to select!
2761 \end{notate}
2762
2763 \subsection{Exporting \LaTeX\ documents$^*$}
2764
2765 \begin{notate}{Roundtripping}
2766 The easiest test is: can we import a document into the
2767 system and then export it again, and find it unchanged?
2768 \end{notate}
2769
2770 \begin{notate}{Data format}
2771 We should be able to \emph{stably} import and export a
2772 document, as well as export any modifications to the
2773 document that were generated within Arxana.  This means
2774 that the exporting functions will have to read the data
2775 format that the importing functions use, \emph{and} that
2776 any functions that edit document contents (or structure)
2777 will also have to use the same format.  Furthermore,
2778 \emph{browsing} functions will have to be somewhat aware
2779 of this format.  So, this is a good time to ask -- did we
2780 use a good format?
2781 \end{notate}
2782
2783 \subsection{Editing database contents$^*$} \label{editing}
2784
2785 \begin{notate}{Roundtripping, with changes}
2786 Here, we should import a document into the system and then
2787 make some simple changes, and after exporting, check with
2788 diff to make sure the changes are correct.
2789 \end{notate}
2790
2791 \begin{notate}{Re-importing}
2792 One nice feature would be a function to ``re-import'' a
2793 document that has changed outside of the system, and make
2794 changes in the system's version wherever changes appeared
2795 in the source version.
2796 \end{notate}
2797
2798 \begin{notate}{Editing document structure}
2799 The way we have things set up currently, it is one thing
2800 to make a change to a document's textual components, and
2801 another to change its structure.  Both types of changes
2802 must, of course, be supported.
2803 \end{notate}
2804
2805 \section{Applications}
2806
2807 \subsection{Managing tasks} \label{managing-tasks}
2808
2809 \begin{notate}{What are tasks?}
2810 Each task tends to have a \emph{name}, a
2811 \emph{description}, a collection of \emph{prerequisite
2812   tasks}, a description of other \emph{material
2813   dependencies}, a \emph{status}, some \emph{justification
2814   of that status}, a \emph{creation date}, and an
2815 \emph{estimated time of completion}.  There might actually
2816 be several ``estimated times of completion'', since the
2817 estimate would tend to improve over time.  To really
2818 understand a task, one should keep track of revisions like
2819 this.
2820 \end{notate}
2821
2822 \begin{notate}{On `store-task-data'} \label{store-task-data}
2823 Here, we're just filling in a frame.  Since ``filling in a
2824 frame'' seems like the sort of operation that might happen
2825 over and over again in different contexts, to save space,
2826 it would probably be nice to have a macro (or similar)
2827 that would do a more general version of what this function
2828 does.
2829 \end{notate}
2830
2831 \begin{elisp}
2832 (Defun store-task-data
2833   (name description prereqs materials status
2834         justification submitted eta)
2835   (add-triple name "is a" "task")
2836   (add-triple name "description" description)
2837   (add-triple name "prereqs" prereqs)
2838   (add-triple name "materials" materials)
2839   (add-triple name "status" status)
2840   (add-triple name "status justification" justification)
2841   (add-triple name "date submitted" submitted)
2842   (add-triple name "estimated time of completion" eta))
2843 \end{elisp}
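
\begin{notate}{Sketch: a general frame filler}
A more general version of `store-task-data' might take the
slots as data rather than as a fixed argument list.  The
following is a sketch only.  `sketch-fill-frame' is a
hypothetical name, and the function that actually stores
each triple is passed in as an argument; it defaults to
`add-triple', on the assumption that `add-triple' can be
called from wherever this code is evaluated.  A plain
function seems to be enough here, so no macro is used.
\end{notate}

\begin{elisp}
;; Store NAME as a frame of TYPE.  SLOTS is an alist of
;; (SLOT . FILLER) pairs.  ADDER is called with three
;; arguments for each triple to be stored.
(defun sketch-fill-frame (name type slots &optional adder)
  (let ((add (or adder 'add-triple)))
    (funcall add name "is a" type)
    (dolist (slot slots)
      (funcall add name (car slot) (cdr slot)))))

;; Usage:
;; (sketch-fill-frame
;;  "write docs" "task"
;;  '(("description" . "Document the browser")
;;    ("status" . "tabled")))
\end{elisp}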
2844
2845 \begin{notate}{On `generate-task-data'} \label{generate-task-data}
2846 This is a simple function to create a new task matching
2847 the description above.
2848 \end{notate}
2849
2850 \begin{elisp}
2851 (defun generate-task-data ()
2852   (interactive)
2853   (let ((name (read-string "Name: "))
2854         (description (read-string "Description: "))
2855         (prereqs (read-string
2856                   "Task(s) this task depends on: "))
2857         (materials (read-string "Material dependencies: "))
2858         (status (completing-read
2859                  "Status (tabled, in progress, completed): "
2860                  '("tabled" "in progress" "completed")))
2861         (justification (read-string "Why this status? "))
2862         (submitted
2863          (read-string
2864           (concat "Date submitted (default "
2865                   (substring (current-time-string) 0 10)
2866                   "): ")
2867           nil nil (substring (current-time-string) 0 10)))
2868         (eta
2869          (read-string "Estimated date of completion: ")))
2870     (store-task-data name description prereqs materials
2871                      status
2872                      justification submitted eta)))
2873 \end{elisp}
2874
2875 \begin{notate}{Possible enhancements to `generate-task-data'}
2876 In order to make this function very nice, it would be good
2877 to allow ``completing read'' over known tasks when filling
2878 in the prerequisites.  Indeed, it might be especially nice
2879 to offer a type of completing read that is similar in some
2880 sense to the tab-completion you get when completing a file
2881 name, i.e., quickly completing certain sub-strings of the
2882 final string (in this case, these substrings would
2883 correspond to task areas we are progressively zooming down
2884 into).
2885
2886 As for the task description, rather than forcing the user
2887 to type the description into the minibuffer, it might be
2888 nice to pop up a separate buffer instead (a la the
2889 Emacs/w3m textarea).  If we had a list of all the known
2890 tasks, we could offer completing-read over the names of
2891 existing tasks to generate the list of `prereqs'.  It
2892 might be nice to systematize date data, so we could more
2893 easily e.g. sort and display task info ``by date''.
2894 (Perhaps we should be working with predefined database
2895 types for dates and so on; but see Note
2896 \ref{choice-of-database}.)
2897
2898 Also, before storing the task, it might be nice to offer
2899 the user the chance to review the data they entered.
2900 \end{notate}
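
\begin{notate}{Sketch: completing read over known tasks}
The first enhancement could be approached with the
standard Emacs function `completing-read-multiple', which
reads several comma-separated entries with completion.
This is a sketch under assumptions: `sketch-read-prereqs'
is a hypothetical name, and the list of known tasks is
passed in by the caller (it might come from `get-tasks',
if that function's results are available on the Emacs
side).  The result is joined back into one string, since
`store-task-data' currently stores `prereqs' as a single
filler.
\end{notate}

\begin{elisp}
;; Read zero or more task names, completing over
;; KNOWN-TASKS (a list of strings), and return them as one
;; comma-separated string.
(defun sketch-read-prereqs (known-tasks)
  (mapconcat 'identity
             (completing-read-multiple
              "Task(s) this task depends on: "
              known-tasks)
             ", "))

;; Usage:
;; (sketch-read-prereqs '("import" "browse" "export"))
\end{elisp}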
2901
2902 \begin{notate}{On `get-filler'} \label{get-filler}
2903 Just a wrapper for `triples-given-beginning-and-middle'.
2904 (Maybe add `heading' as an option here.)
2905 \end{notate}
2906
2907 \begin{elisp}
2908 (Defun get-filler (frame slot)
2909   (third (first
2910           (print-triples
2911            (triples-given-beginning-and-middle frame
2912                                                slot)))))
2913 \end{elisp}
2914
2915 \begin{notate}{On `get-task'} \label{get-task}
2916 Uses `get-filler' (Note \ref{get-filler}) to assemble the
2917 elements of a task's frame.
2918 \end{notate}
2919
2920 \begin{elisp}
2921 (Defun get-task (name)
2922   (when (triple-exact-match name "is a" "task")
2923     (list (get-filler name "description")
2924           (get-filler name "prereqs")
2925           (get-filler name "materials")
2926           (get-filler name "status")
2927           (get-filler name "status justification")
2928           (get-filler name "date submitted")
2929           (get-filler name
2930                       "estimated time of completion"))))
2931 \end{elisp}
2932
2933 \begin{notate}{On `review-task'} \label{review-task}
2934 This is a function to review a task by name.
2935 \end{notate}
2936
2937 \begin{elisp}
2938 (defun review-task (name)
2939   (interactive "MName: ")
2940   (let ((task-data (get-task name)))
2941     (if task-data
2942         (display-task task-data)
2943       (message "No data."))))
2944
2945 (defun display-task (data &optional heading) ; `display-article' passes HEADING
2946   (save-excursion
2947     (pop-to-buffer (get-buffer-create
2948                     "*Arxana Display*"))
2949     (delete-region (point-min) (point-max))
2950     (insert "NAME: " name "\n\n")
2951     (insert "DESCRIPTION: " (first data) "\n\n")
2952     (insert "TASKS THIS TASK DEPENDS ON: "
2953             (second data) "\n\n")
2954     (insert "MATERIAL DEPENDENCIES: "
2955             (third data) "\n\n")
2956     (insert "STATUS: " (fourth data) "\n\n")
2957     (insert "WHY THIS STATUS?: " (fifth data) "\n\n")
2958     (insert "DATE SUBMITTED: " (sixth data) "\n\n")
2959     (insert "ESTIMATED TIME OF COMPLETION: "
2960             (seventh data) "\n\n")
2961     (goto-char (point-min))
2962     (fill-individual-paragraphs (point-min) (point-max))))
2963 \end{elisp}
2964
2965 \begin{notate}{Possible enhancements to `review-task'}
2966 Breaking this down into a function to select the task and
2967 another function to display the task would be nice.  Maybe
2968 we should have a generic function for selecting any object
2969 ``by name'', and then special-purpose functions for
2970 displaying objects with different properties.
2971
2972 Using text properties, we could set up a ``field-editing
2973 mode'' that would enable you to select a particular field
2974 and edit it independently of the others.  Another more
2975 complex editing mode would \emph{know} which fields the
2976 user had edited, and would store all edits back to the
2977 database properly.  See Section \ref{editing} for more on
2978 editing.
2979 \end{notate}
2980
2981 \begin{notate}{Browsing tasks} \label{browsing-tasks}
2982 The function `pick-a-name' (Note \ref{pick-a-name}) takes
2983 two functions, one that finds the names to choose from,
2984 and the other that says how to present these names.  We
2985 can therefore build `pick-a-task' on top of `pick-a-name'.
2986 \end{notate}
2987
2988 \begin{elisp}
2989 (Defun get-tasks ()
2990   (mapcar #'first
2991           (print-triples
2992            (triples-given-middle-and-end "is a" "task")
2993            t)))
2994
2995 (defun pick-a-task ()
2996   (interactive)
2997   (pick-a-name
2998    'get-tasks
2999    (lambda (items)
3000      (mapc (lambda (item)
3001              (let ((pos (line-beginning-position)))
3002                (insert item)
3003                (put-text-property pos (1+ pos)
3004                                   'arxana-display-type
3005                                   'task)
3006                (insert "\n"))) items))))
3007
3008 (add-to-list 'display-style
3009              '(task . (get-task display-task)))
3010 \end{elisp}
3011
3012 \begin{notate}{Working with theories}
3013 Presumably, like other related functions, `get-tasks'
3014 should take a heading argument.
3015 \end{notate}
3016
3017 \begin{notate}{Check display style}
3018 Check if this works, and make style consistent between
3019 this usage and earlier usage.
3020 \end{notate}
3021
3022 \begin{notate}{Example tasks}
3023 It might be fun to add some tasks associated with
3024 improving Arxana, just to show that it can be done...
3025 maybe along with a small importer to show how importing
3026 something without a whole lot of structure can be easy.
3027 \end{notate}
3028
3029 \subsection{Other ideas$^*$}
3030
3031 \begin{notate}{A browser within a browser} \label{browser-within}
3032 All the stuff we're doing with triples can be superimposed
3033 over the existing web and existing web interfaces, by, for
3034 example, writing a web browser as a web app, and in this
3035 ``browser within a browser'' offer the ability to annotate
3036 and rewrite other people's web pages, produce 3rd-party
3037 redirects, and so forth, sharing these mods with other
3038 subscribers to the service.  (Already websites such as the
3039 short-lived scrum.diddlyumptio.us have offered limited
3040 versions of ``web annotation'', but, so far, what one can
3041 do with such services seems quite weak compared with
3042 what's possible.)
3043 \end{notate}
3044
3045 \begin{notate}{Improvements to the PlanetMath backend}
3046 From one point of view, the SQL tables are the main thing
3047 in Noosphere.  We could say that getting the things out of
3048 SQL and storing new things there is what Noosphere mainly
3049 does.  Following this line of thought, anything that
3050 adjusts these tables will do just as well, e.g., it
3051 shouldn't be terribly hard to develop an email-based
3052 front-end.  But rather than making Arxana work with the
3053 Noosphere relational table system, it is probably
3054 advantageous to translate the data from these tables into
3055 the scholium system.
3056 \end{notate}
3057
3058 \begin{notate}{A new communication platform}
3059 One of the premier applications I have in mind is a new
3060 way to handle communications in an online forum.  I have
3061 previously called this ``subchanneling'', but really,
3062 joining channels is just as important.
3063 \end{notate}
3064
3065 \begin{notate}{Some tutorials}
3066 It would be interesting to write a tutorial for Common
3067 Lisp or just about any other topic with this system.  For
3068 example, some little ``worksheets'' or ``gymnasia'' that
3069 will help solidify user knowledge in topics on which
3070 questions keep appearing.
3071 \end{notate}
3072
3073 \section{Topics of philosophical interest}
3074
3075 \begin{notate}{Research and development}
3076 In Note \ref{theoretical-context}, I mentioned a model
3077 that could apply in many contexts; it is an essentially
3078 metaphysical conception.  I'm pretty sure that the data
3079 model of Note \ref{data-model} provides a general-enough
3080 framework to represent anything we might find ``out
3081 there''.  However, even if this is the case, questions as
3082 to \emph{efficient} means of working with such data still
3083 abound (cf. Note \ref{models-of-theories}, Note
3084 \ref{use-of-views}).
3085
3086 I propose that along with \emph{development} of Arxana as
3087 a useful system for \emph{doing} ``commons-based peer
3088 production'' should come a \emph{research} programme for
3089 understanding in much greater detail what ``commons-based
3090 peer production'' \emph{is}.  Eventually we may want to
3091 change the name of the subject of study to reflect still
3092 more general ideas of resource use.
3093
3094 While the ``frontend'' of this research project is
3095 anthropological, the ``backend'' is much closer to
3096 artificial intelligence.  On this level, the project is
3097 about understanding \emph{effective} means for solving
3098 human problems.  Often this will involve decomposing
3099 events and processes into constituent elements, making
3100 increasingly detailed treatments along the lines described
3101 in Note \ref{arxana}.
3102 \end{notate}
3103
3104 \begin{notate}{The relationship between text and commentary}
3105 Text under revision might be marked up by a copyeditor: in
3106 cases like these, the interpretation is clear.  However,
3107 what about marginalia with looser interpretations?  These
3108 seem to become part of the copy of the text they are
3109 attached to.  What about steering processes applied to a
3110 given course of action?  How about the relationship of
3111 thoughts or words to perception and action?  How can we
3112 lower the barrier between conception and action, while
3113 still maintaining some purchase on wisdom?
3114
3115 You see, a lot of issues in life have to do with overlays,
3116 multi-tracking, interchange between different systems; and
3117 in these terms, a lot of philosophy reduces to ``media
3118 awareness'' which extends into more and more immediate
3119 contexts (Note \ref{theoretical-context}).
3120 \end{notate}
3121
3122 \begin{notate}{Heuristic flow}
3123 Continuing the notion above: one does not need a
3124 fully-developed ``heading'' of work in order to do work --
3125 instead, one wants some straightforward heuristics that
3126 will enable the desired work to get done.  So, even
3127 supposing the work is ``heading building'', it can progress
3128 without becoming overwhelmed in abstractions -- because
3129 theories and heuristics are different things.
3130 \end{notate}
3131
3132 \begin{notate}{Limits of simple languages} \label{simple-languages}
3133 Triples are frequently ``subject, verb, object''
3134 statements, although with the annotation features, we can
3135 modify any part of any such statement; for example, we
3136 can apply an adverb to a given verb.
3137
3138 ``Tags'', of course, already provide ``subject,
3139 predicate'' relationships.  It will be interesting to
3140 examine the degree to which human languages can be mapped
3141 down into these sorts of simple languages.  What features
3142 are needed to make such languages \emph{useful}?  (Lisp's
3143 `car' and `cdr' seem related to the idea of making
3144 predicates useful.)
3145
3146 How are triples and predicates ``enough''?  What, if
3147 anything, do they lack?  The difference between triples
3148 and predicates illustrates the issue.  How should we
3149 characterize Arxana's additions to Lisp?
3150 \end{notate}
3151
3152 \begin{notate}{Higher dimensions}
3153 Why stop with three components?  Why not have $(A, B, C,
3154 D, T)$ represent a semantic relationship between all of
3155 $A$, $B$, $C$, and $D$ (in heading $T$, of course)?
3156 Actually, there is no reason to stop apart from the fact
3157 that I want to explore simple languages (Note
3158 \ref{simple-languages}).  In real life, things are not as
3159 simple, and we should be ready to deal with the
3160 complexities! (Cf., for example, Note \ref{pointing}).
3161 \end{notate}
3162
3163 \section{Future plans}
3164
3165 \begin{notate}{Development pathways}
3166 To the extent that it's possible, I'd like to maintain a
3167 succinct non-linear roadmap in which tasks are outlined
3168 and prioritized, and some procedural details are made
3169 concrete.  Whenever relevant this map should point into
3170 the current document.  I'll begin by revising the plans
3171 I've used so far!\footnote{{\tt
3172     http://metameso.org/files/plan-arxana.pdf}} Over the
3173 next several months, I'd like to see these plans develop
3174 into a genuine production machine, and see the machine
3175 begin to stabilize its operations.
3176 \end{notate}
3177
3178 \begin{notate}{Theories as database objects} \label{theories-as-database-objects}
3179 We're just beginning to treat theories as database
3180 objects; I expect there will be more work to do to make
3181 this work really well.  We'll want to make some test
3182 cases, like building a ``theory of chess'', or even just
3183 describing a particular chess board; cf. Note
3184 \ref{partial-image}.
3185 \end{notate}
3186
3187 \begin{notate}{Search engine/elements} \label{search-engine}
One of the features that came very easily in the Emacs-only
3189 prototype was textual search.  With the strings stored in
3190 a database, Sphinx seems to be the most suitable search
3191 engine to use.  It is tempting to try to make our own
3192 inverted index using triples, so that text-based search
3193 can be even more directly integrated with semantic search.
3194 (Since the latest version(s) of Sphinx can act to some
3195 extent like a MySQL database, we almost have a direct
3196 connection in the backend, but since Sphinx is not
3197 \emph{the same} database, one would at least need some
3198 glue code to effect joins and so forth.)
3199
3200 More to the point, it is important for this project that
3201 the scholia-based document model be transparently extended
3202 down to the level of words and characters.  It may be
3203 helpful to think about text as \emph{always being}
3204 hypertext; a document as a heading; and a word in the
3205 inverted index as a frame.
3206 \end{notate}
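
\begin{notate}{Sketch: an inverted index made of triples}
As a rough illustration of the inverted-index idea in Note
\ref{search-engine}, the following Emacs Lisp sketch records
one triple per distinct word of a string, and then looks
words up again.  Everything in it is an assumption made for
the sake of illustration: the names `sketch-index-triples'
and `sketch-search', the predicate ``occurs in'', and the
use of an in-memory list where the real system would use the
database.  It is set in an `idea' block, so the literate
loader neither evaluates nor exports it.
\end{notate}

\begin{idea}
;; Hypothetical sketch; not part of the Arxana sources.
(defun sketch-index-triples (id text)
  "Return one (WORD \"occurs in\" ID) triple per word of TEXT."
  (let (triples)
    (dolist (word (split-string (downcase text)
                                "[^[:alnum:]]+" t))
      (let ((triple (list word "occurs in" id)))
        ;; Record each distinct word only once per string.
        (unless (member triple triples)
          (push triple triples))))
    (nreverse triples)))

(defun sketch-search (word triples)
  "Return the ids of the strings in which WORD occurs."
  (let (ids)
    (dolist (triple triples (nreverse ids))
      (when (and (equal (nth 0 triple) (downcase word))
                 (equal (nth 1 triple) "occurs in"))
        (push (nth 2 triple) ids)))))

;; Index two strings, then search the combined index.
(let ((index (append
              (sketch-index-triples 7 "But, then, but...")
              (sketch-index-triples 8 "And then"))))
  (sketch-search "Then" index))
;; => (7 8)
\end{idea}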
3207
3208 \begin{notate}{Pointing at database elements and other things} \label{pointing}
3209 We will want to be able to point at other tables and at
3210 other sorts of objects and make use of their contents.
3211 The plan is that our triples will provide a sort of guide
3212 or backbone superimposed over a much larger data system.
3213 \end{notate}
3214
3215 \begin{notate}{Feature-chase}
3216 There are lots of different features that could be
3217 explored, for example: multi-dimensional history lists; a
3218 useful treatment of ``clusions''; MS Word-like colorful
3219 annotations; etc.  Many of these features are already
3220 prototyped.\footnote{See footnote \ref{old-version}.}
3221 \end{notate}
3222
3223 \begin{notate}{Regression testing}
3224 Along with any major feature chase, we should provide
3225 and maintain a regression testing suite.
3226 \end{notate}
3227
3228 \begin{notate}{Deleting and changing things}
3229 How will we deal with unlinking, disassociating,
3230 forgetting, entropy, and the like?  Changes can perhaps
3231 be modeled by an insertion following a deletion, and,
3232 as noted, we'll need effective ways to represent and
3233 manage change (Note \ref{change}).
3234 \end{notate}
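
\begin{notate}{Sketch: change as deletion plus insertion}
The suggestion above, that a change can be modeled by an
insertion following a deletion, might be sketched as
follows.  This is only an illustration over an in-memory
list of triples; the function names are hypothetical, and
nothing here addresses how the history of a change would be
kept.
\end{notate}

\begin{idea}
;; Hypothetical sketch; not part of the Arxana sources.
(defun sketch-delete-triple (triple triples)
  "Return a copy of TRIPLES without TRIPLE."
  (remove triple triples))

(defun sketch-insert-triple (triple triples)
  "Return TRIPLES with TRIPLE added, unless already present."
  (if (member triple triples)
      triples
    (cons triple triples)))

(defun sketch-change-triple (old new triples)
  "Replace OLD by NEW in TRIPLES: a deletion, then an insertion."
  (sketch-insert-triple new
                        (sketch-delete-triple old triples)))

(sketch-change-triple '("Who" "is on" "first")
                      '("Who" "is on" "second")
                      '(("Who" "is on" "first")
                        ("What" "is on" "second")))
;; => (("Who" "is on" "second") ("What" "is on" "second"))
\end{idea}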
3235
3236 \begin{notate}{Tutorial}
3237 Right now the system is simple enough to be pretty much
3238 self-explanatory, but if it becomes much more complicated,
3239 it might be helpful to put together a simple guide to some
3240 likely-to-be-interesting features.
3241 \end{notate}
3242
3243 \begin{notate}{Computing possible paths and connections}
3244 If we can find all the \emph{direct} paths from one node
3245 to another using `triples-given-beginning-and-end', can we
inject some algorithms for finding longer, indirect paths
3247 into the system, and find ways to make them useful?
3248
3249 Similarly, we can satisfy local conditions (Note
3250 \ref{satisfy-conditions}), but we'll want to deal with
3251 increasingly ``non-local'' conditions (even just using the
3252 logical operator ``or'', instead of ``and'', for example).
3253 \end{notate}
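
\begin{notate}{Sketch: searching for indirect paths}
One simple candidate for finding longer paths is a
breadth-first search.  The sketch below is written in Emacs
Lisp over a small in-memory list of triples, each of the
form (beginning middle end); it does not call
`triples-given-beginning-and-end' or touch the database, and
the names `sketch-triples', `sketch-triples-from' and
`sketch-find-path' are made up for the example.  It returns
one shortest chain of triples, or nil if the goal cannot be
reached.
\end{notate}

\begin{idea}
;; Hypothetical sketch; not part of the Arxana sources.
(defvar sketch-triples
  '(("a" "links-to" "b")
    ("b" "links-to" "c")
    ("c" "links-to" "d")
    ("a" "cites" "d")))

(defun sketch-triples-from (node triples)
  "Return the elements of TRIPLES whose beginning is NODE."
  (let (found)
    (dolist (triple triples (nreverse found))
      (when (equal (car triple) node)
        (push triple found)))))

(defun sketch-find-path (start goal triples)
  "Return a shortest chain of TRIPLES from START to GOAL, or nil."
  ;; Each queue entry is (NODE . REVERSED-PATH-OF-TRIPLES).
  (let ((queue (list (cons start nil)))
        (seen (list start))
        (result nil))
    (while (and queue (not result))
      (let* ((entry (car queue))
             (node (car entry))
             (path (cdr entry)))
        (setq queue (cdr queue))
        (dolist (triple (sketch-triples-from node triples))
          (let ((next (nth 2 triple)))
            (cond ((and (equal next goal) (not result))
                   (setq result
                         (reverse (cons triple path))))
                  ((not (member next seen))
                   (push next seen)
                   (setq queue
                         (append
                          queue
                          (list (cons next
                                      (cons triple
                                            path)))))))))))
    result))

(sketch-find-path "a" "c" sketch-triples)
;; => (("a" "links-to" "b") ("b" "links-to" "c"))
\end{idea}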
3254
3255 \begin{notate}{Monster Mountain}
3256 In Summer 2007, we checked out the Monster Mountain MUD
3257 server\footnote{{\tt http://code.google.com/p/mmtn/}},
3258 which would enable several users to interact with one
3259 LISP, instead of just one database.  This would have a
3260 number of advantages, particularly for exploring
3261 ``scholiumific programming'', but also towards fulfilling
3262 the user-to-user interaction objective stated in Note
3263 \ref{theoretical-context}. I plan to explore this after
3264 the primary goal of multi-user interaction with the
3265 database has been solidly completed.
3266 \end{notate}
3267
3268 \begin{notate}{Web interface}
3269 A finished web interface may take a considerable amount of
3270 work (if the complexity of an interesting Emacs interface
3271 is any indication), but the basics shouldn't be hard to
3272 put together soon.
3273 \end{notate}
3274
3275 \begin{notate}{Parsing input} \label{parsing}
3276 Complicated objects specified in long-hand (e.g. triples
3277 pointing to triples) can be read by a relatively simple
3278 parser -- which we'll have to write!  The simplest goal
3279 for the parser would be to be able to distinguish between
3280 a triple and a string -- presumably that much isn't hard.
3281 And of course, building complexes of triples that
3282 represent statements from natural language is a good
3283 long-term goal. (Right now, our granularity level is set
3284 much higher.)
3285 \end{notate}
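
\begin{notate}{Sketch: telling triples from strings}
The simplest goal mentioned in Note \ref{parsing} could be
approached as follows, under the assumption (mine, for this
example only) that long-hand input is written as Lisp data:
a string in double quotes, or a list of exactly three
components, each of which is again a string or a triple.
The Lisp reader then does most of the work.  The function
names are hypothetical.
\end{notate}

\begin{idea}
;; Hypothetical sketch; not part of the Arxana sources.
(defun sketch-parse (object)
  "Classify OBJECT as (string TEXT) or (triple BEG MID END).
The components of a triple are classified recursively."
  (cond ((stringp object)
         (list 'string object))
        ((and (consp object)
              (ignore-errors (= (length object) 3)))
         (cons 'triple (mapcar #'sketch-parse object)))
        (t
         (error "Neither a string nor a triple: %S" object))))

(defun sketch-parse-string (input)
  "Read one Lisp object from INPUT and classify it."
  (sketch-parse (car (read-from-string input))))

(sketch-parse-string "(\"A\" \"B\" (\"C\" \"D\" \"E\"))")
;; => (triple (string "A") (string "B")
;;            (triple (string "C") (string "D") (string "E")))
\end{idea}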
3286
3287 \begin{notate}{Choice of database} \label{choice-of-database}
3288 I expect Elephant\footnote{{\tt
3289     http://common-lisp.net/project/elephant/}} may become
3290 our preferred database at some point in the future; we are
3291 currently awaiting changes to Elephant that make nested
3292 queries possible and efficient.  Some core queries related
3293 to managing a database of semantic links with the current
3294 Elephant were constructed by Ian Eslick, Elephant's
3295 maintainer.\footnote{{\tt
3296     http://planetx.cc.vt.edu/\~{}jcorneli/arxana/variant-4.lisp}}
3297
3298 On the other hand, it might be reasonable to use an Emacs
3299 database and redo the whole thing to work in Emacs
3300 (again), e.g. for single-user applications or users who
3301 want to work offline a lot of the time.
3302 \end{notate}
3303
3304 \begin{notate}{Different kinds of theories}
3305 Theories or variants thereof are of course already popular
3306 in other knowledge representation contexts.\footnote{{\tt
3307     http://www.cyc.com/cycdoc/vocab/mt-expansion-vocab.html}}$^{,}$\footnote{{\tt
3308     http://www.stanford.edu/\~{}kdevlin/HHL\_SituationTheory.pdf}}
3309 We'll want to adopt some useful techniques for knowledge
3310 management as soon as the core systems are ready.
3311
3312 Various notions of a mathematical theory
3313 exist.\footnote{{\tt
3314     http://planetmath.org/encyclopedia/Theory.html}} It
3315 would be nice to be able to assign specific logic to
3316 theories in Arxana, following the ``little theories''
3317 design of e.g. IMPS.\footnote{{\tt
3318     http://imps.mcmaster.ca/manual/node13.html}}
3319 \end{notate}
3320
3321 \section{Conclusion} \label{conclusion}
3322
3323 \begin{notate}{Ending and beginning again}
3324 This is the end of the Arxana system itself; the
3325 appendices provide some ancillary tools, and some further
3326 discussion.  Contributions that support the development of
3327 the Arxana project are welcome.
3328 \end{notate}
3329
3330 \appendix
3331
3332 \section{Appendix: Auto-setup} \label{appendix-setup}
3333
3334 \begin{notate}{Setting up auto-setup}
This section provides code for satisfying dependencies and
3336 setting up the program.  This code assumes that you are
3337 using a Debian/APT-based system (but things are not so
3338 different using say, Fedora or Fink; writing a
3339 multi-package-manager-friendly installer shouldn't be
3340 hard).  Of course, feel free to set things up differently
3341 if you have something else in mind!
3342 \end{notate}
3343
3344 \begin{elisp}
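;; `shell-command' starts a new shell for every call, so a
;; bare "cd" would be forgotten by the next command.  This
;; wrapper keeps track of the working directory itself.
(defvar set-up-directory "~/"
  "Directory in which `run-set-up-command' runs commands.")

(defun run-set-up-command (string)
  (if (string-match "\\`cd[ \t]+\\(.+\\)\\'" string)
      (setq set-up-directory
            (file-name-as-directory
             (expand-file-name (match-string 1 string)
                               set-up-directory)))
    (let ((default-directory
            (file-name-as-directory
             (expand-file-name set-up-directory))))
      (shell-command string))))

(defalias 'set-up 'run-set-up-command)

(defun alternative-set-up (string)
  (save-excursion
    (pop-to-buffer (get-buffer-create "*Arxana Help*"))
    (goto-char (point-max))
    (insert string "\n")))

(defun set-up-arxana-environment ()
  (interactive)
  ;; Every run starts from the home directory.
  (setq set-up-directory "~/")
  (if (y-or-n-p
       "Run commands (y) (or just show instructions)? ")
      (fset 'set-up 'run-set-up-command)
    (fset 'set-up 'alternative-set-up))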
3359   (when (y-or-n-p "Install dependencies? ")
3360     (set-up "mkdir ~/arxana")
3361     (set-up "cd arxana"))
3362
3363   (when (y-or-n-p "Download latest Arxana? ")
3364     (set-up "wget http://metameso.org/files/arxana.tex"))
3365
3366   (unless (y-or-n-p "Is your emacs good enough?... ")
3367     (set-up
3368      (concat "cvs -z3 -d"
3369              ":pserver:anonymous@cvs.savannah.gnu.org:"
3370              "/sources/emacs co emacs"))
3371     (set-up "mv emacs ~")
3372     (set-up "cd ~/emacs")
3373     (set-up "./configure && make bootstrap")
3374     (set-up "cd ~/arxana"))
3375
3376   (defvar pac-man nil)
3377
3378   (cond ((y-or-n-p
3379           "Do you use an apt-based package manager? ")
3380          (setq pac-man "apt-get"))
3381         (t (message
3382             "OK, get Lisp and SQL on your own, then!")))
3383
3384   (when pac-man
3385     (when (y-or-n-p "Install Common Lisp? ")
3386       (set-up (concat pac-man " install sbcl")))
3387
3388     (when (y-or-n-p "Install Postgresql? ")
3389       (set-up (concat pac-man " install postgresql"))
3390       (when (y-or-n-p "Help setting up PostgreSQL? ")
3391         (save-excursion
3392           (pop-to-buffer (get-buffer-create "*Arxana Help*"))
3393           (insert "As superuser (root),
3394 edit /etc/postgresql/7.4/main/pg_hba.conf
3395 make sure it says this:
3396 host all all 127.0.0.1 255.255.255.255 trust
3397 then edit /etc/postgresql/7.4/main/postgresql.conf
3398 and make it say
3399 tcpip_socket = true
3400 then restart:
3401 /etc/init.d/postgresql-7.4 restart
3402 su postgres
3403 createuser username
3404 exit
3405 as username, run
3406 createdb -U username\n")))))
3407
3408   (when (y-or-n-p "Install SLIME...? ")
3409     (set-up (concat "cvs -d :pserver:anonymous"
3410                            ":anonymous@common-lisp.net:"
3411                            "/project/slime/cvsroot co slime"))
    (set-up
     (concat "echo "
             ;; Quote the text for the shell; otherwise the
             ;; double quotes inside it would be stripped.
             (shell-quote-argument
              (concat
               ";; Added to ~/.emacs for Arxana:\n\n"
               "(add-to-list 'load-path \"~/slime/\")\n"
               "(setq inferior-lisp-program \"/usr/bin/sbcl\")\n"
               "(require 'slime)\n"
               "(slime-setup '(slime-repl))\n\n"))
             " | cat - ~/.emacs > ~/updated.emacs && "
             "mv ~/updated.emacs ~/.emacs")))
3420
3421   (when (y-or-n-p "Set up Common Lisp environment? ")
3422     (set-up "mkdir ~/.sbcl")
3423     (set-up "mkdir ~/.sbcl/site")
3424     (set-up "mkdir ~/.sbcl/systems")
3425     (set-up "cd ~/.sbcl/site")
    ;; wget saves each tarball under its "latest" name; the
    ;; tarballs are assumed to unpack into the versioned
    ;; directories that are linked below.
    (set-up (concat "wget http://files.b9.com/"
                    "clsql/clsql-latest.tar.gz"))
    (set-up "tar -zxf clsql-latest.tar.gz")
    (set-up (concat "wget http://files.b9.com/"
                    "uffi/uffi-latest.tar.gz"))
    (set-up "tar -zxf uffi-latest.tar.gz")
3432     (set-up (concat "wget http://files.b9.com/"
3433                            "md5/md5-1.8.5.tar.gz"))
3434     (set-up "tar -zxf md5-1.8.5.tar.gz")
3435     (set-up "cd ~/.sbcl/systems")
3436     (set-up "ln -s ../site/md5-1.8.5/md5.asd .")
3437     (set-up "ln -s ../site/uffi-1.6.0/uffi.asd .")
3438     (set-up "ln -s ../site/clsql-4.0.3/clsql.asd .")
3439     (set-up "ln -s ../site/clsql-4.0.3/clsql-uffi.asd .")
3440     (set-up (concat "ln -s ../site/clsql-4.0.3/"
3441                            "clsql-postgresql-socket.asd ."))
3442     (set-up "ln -s ~/arxana/arxana.asd ."))
3443
3444   (when (y-or-n-p "Modify ~/.sbclrc so CL always starts Arxana? ")
3445     (set-up
3446      (concat "echo \";; Added to ~/.sbclrc for Arxana:\n\n"
3447              "(require 'asdf)\n\n"
3448              "(asdf:operate 'asdf:load-op 'swank)\n"
3449              "(setf swank:*use-dedicated-output-stream* nil)\n"
3450              "(setf swank:*communication-style* :fd-handler)\n"
3451              "(swank:create-server :port 4006 :dont-close t)\n\n"
3452              "(asdf:operate 'asdf:load-op 'clsql)\n"
3453              "(asdf:operate 'asdf:load-op 'arxana)\n"
3454              "(in-package arxana)\n"
3455              "(connect-to-database)\n"
3456              "(locally-enable-sql-reader-syntax)\n\n\""
3457              "| cat ~/.sbclrc - > ~/updated.sbclrc &&"
3458              "mv ~/updated.sbclrc ~/.sbclrc")))
3459
3460   (when (y-or-n-p "Install Monster Mountain? ")
3461     (set-up "cd ~/.sbcl/systems")
3462     (set-up (concat
3463                     "darcs get http://common-lisp.net/project/"
3464                     "bordeaux-threads/darcs/bordeaux-threads/"))
3465     (set-up (concat
3466                     "svn checkout svn://common-lisp.net/project/"
3467                     "usocket/svn/usocket/trunk usocket-svn"))
3468     ;; I've had problems with this approach to setting cclan
3469     ;; mirror...
3470     (set-up
3471      (concat
3472       "wget \"http://ww.telent.net/cclan-choose-mirror"
3473       "?M=http%3A%2F%2Fthingamy.com%2Fcclan%2F\""))
3474     (set-up (concat "wget http://ww.telent.net/cclan/"
3475                            "split-sequence.tar.gz"))
3476     (set-up "tar -zxf split-sequence.tar.gz")
3477     (set-up
3478      (concat "svn checkout http://mmtn.googlecode.com/"
3479              "svn/trunk/ mmtn-read-only"))
3480     (set-up
3481      "ln -s ~/bordeaux-threads/bordeaux-threads.asd .")
3482     (set-up "ln -s ~/usocket-svn/usocket.asd .")
3483     (set-up "ln -s ~/split-sequence/split-sequence.asd .")
3484     (set-up "ln -s ~/mmtn/src/mmtn.asd .")))
3485 \end{elisp}
3486
3487 \begin{notate}{Postgresql on Fedora}
3488 There are some slightly different instructions for
3489 installing postgresql on Fedora; the above will be
3490 changed to include them, but for now, check them
3491 out on the
3492 web.\footnote{{\tt http://www.flmnh.ufl.edu/linux/install\_postgresql.htm}}
3493 \end{notate}
3494
3495 \begin{notate}{Using MySQL and CLISP instead} \label{backend-variant}
3496 Since my OS X box seems to have a variety of confusing
3497 PostgreSQL systems already installed (which I'm not sure
3498 how to configure), and CLISP is easy to install with fink,
3499 I thought I'd try a different set up for simplicity and
3500 variety.
3501
In order to make it work, I enabled the root user on Mac
OS X per instructions on the web; installed and configured
mysql; used a slight modification of the strings table
described previously; downloaded and installed
cffi\footnote{{\tt
    http://common-lisp.net/project/cffi/releases/cffi\_latest.tar.gz}};
changed the definition of `connect-to-database' in
Arxana's utilities.lisp; doctored up my
\~{}/.clisprc.lisp; and changed how I started Lisp.
Details below.
3511 \end{notate}
3512
3513 \begin{idea}
3514 ;; on the shell prompt
3515 sudo apt-get install mysql
3516 sudo mysqld_safe --user=mysql &
3517 sudo daemonic enable mysql
3518 sudo mysqladmin -u root password root
3519 mysql --user=root --password=root -D test
3520 create database joe; grant all on joe.* to joe@localhost
3521 identified by 'joe'
3522
3523 ;; in tabledefs.lisp
3524 (execute-command "CREATE TABLE strings (
3525    id SERIAL PRIMARY KEY,
3526    text TEXT,
3527    UNIQUE INDEX (text(255))
3528 );")
3529
3530 ;; in ~/asdf-registry/ or whatever you've designated as
3531 ;; your asdf:*central-registry*
3532 ln -s ~/cffi_0.10.4/cffi-uffi-compat.asd .
3533 ln -s ~/cffi_0.10.4/cffi.asd .
3534
3535 ;; In utilities.lisp
3536 (defun connect-to-database ()
3537    (connect `("localhost" "joe" "joe" "joe")
3538             :database-type :mysql))
3539
3540 ;; In ~/.clisprc.lisp
3541 (asdf:operate 'asdf:load-op 'clsql)
3542 (push "/sw/lib/mysql/"
3543 CLSQL-SYS:*FOREIGN-LIBRARY-SEARCH-PATHS*)
3544
3545 ;; From SLIME prompt, and not in ~/.clisprc.lisp
3546 (in-package #:arxana)
3547 (connect-to-database)
3548 (locally-enable-sql-reader-syntax)
3549 \end{idea}
3550
3551 \begin{notate}{Installing Sphinx}
3552 Here are some tips on how to install and configure
3553 Sphinx.
3554 \end{notate}
3555
3556 \begin{idea}
3557 ;; Fedora/Postgresql flavor
3558 yum install postgresql-devel
3559 ./configure --without-mysql
3560   --with-pgsql
3561   --with-pgsql-libs=/usr/lib/pgsql/
3562   --with-pgsql-includes=/usr/include/pgsql
3563
3564 ;; Fink/MySQL flavor
3565 ./configure --with-mysql
3566   --with-mysql-includes=/sw/include/mysql
3567   --with-mysql-libs=/sw/lib/mysql
3568 \end{idea}
3569
3570 \begin{notate}{Getting Sphinx set up} \label{sphinx-setup}
3571 Here are some instructions I've used to get Sphinx set
3572 up.
3573 \end{notate}
3574
3575 \begin{notate}{Create a sphinx.conf}
I want a very minimal sphinx.conf; this seems to work.
(We should probably set this up so that it gets written
to a file when Arxana is set up.)
3579 \end{notate}
3580
3581 \begin{idea}
3582 ## Copy this to /usr/local/etc/sphinx.conf when you want
3583 ## to use it.
3584
3585 source strings
3586 {
3587  type            = mysql
3588  sql_host        = localhost
3589  sql_user        = joe
3590  sql_pass        = joe
3591  sql_db          = joe
3592  sql_query       = SELECT id, text FROM strings
3593 }
3594
3595 ## index definition
3596
3597 index strings
3598 {
3599  source          = strings
3600  path            = /Users/planetmath/sphinx/search-testing
3601  morphology      = none
3602 }
3603
3604 ## indexer settings
3605
3606 indexer
3607 {
3608  mem_limit       = 32M
3609 }
3610
3611 ## searchd settings
3612
3613 searchd
3614 {
3615  listen          = 3312
3616  listen          = localhost:3307:mysql41
3617  log             = /Users/planetmath/sphinx/searchd.log
3618  query_log       = /Users/planetmath/sphinx/searchd_query.log
3619  read_timeout    = 5
3620  max_children    = 30
3621  pid_file        = /Users/planetmath/sphinx/searchd.pid
3622  max_matches     = 1000
3623 }
3624 \end{idea}
3625
3626 \begin{notate}{Working from the command line}
3627 Then you can run commands like these.
3628 \end{notate}
3629
3630 \begin{idea}
3631 /usr/local/bin/indexer strings
3632 /usr/local/bin/search "but, then"
3633
3634 % mysql -h 127.0.0.1 -P 3307
3635 mysql> SELECT * FROM strings WHERE MATCH('but, then');
3636 \end{idea}
3637
3638 \begin{notate}{Integrating this with Lisp}
Since we can talk to Sphinx via the MySQL protocol, it seems
reasonable that we should be able to talk to it from CLSQL,
too.  With a little fussing to get the format right, I found
something that works!
3643 \end{notate}
3644
3645 \begin{idea}
3646 (connect `("127.0.0.1" "" "" "" "3307") :database-type :mysql)
3647 (mapcar (lambda (elt) (floor (car elt)))
3648   (query "select * from strings where match('text')"))
3649 \end{idea}
3650
3651 \begin{notate}{Some added difficulty with Postgresql}
3652 When I try to index things on the server, I get an
3653 error, as below.  The question is a good one... I'm
3654 not sure \emph{how} postgresql is set up on the server,
3655 actually...
3656 \end{notate}
3657
3658 \begin{idea}
3659 ERROR: index 'strings': sql_connect: could not connect to server:
3660 Connection refused
3661 Is the server running on host "localhost" and accepting
3662 TCP/IP connections on port 5432?
3663 \end{idea}
3664
3665 \section{Appendix: A simple literate programming system} \label{appendix-lit}
3666
3667 \begin{notate}{The literate programming system used in this paper}
3668 This code defines functions that grab all the Lisp
3669 portions of this document, evaluate the Emacs Lisp
3670 sections in Emacs, and save the Common Lisp sections in
3671 suitable files.\footnote{{\tt
3672     Cf. http://mmm-mode.sourceforge.net/}} It requires
3673 that the \LaTeX\ be written in a certain consistent way.
3674 The function assumes that this document is the current
3675 buffer.
3676
3677 \begin{verbatim}
3678 (defvar lit-code-beginning-regexp
3679   "^\\\\begin{elisp}\\|^\\\\begin{common}{\\([^}\n]*\\)}")
3680
3681 (defvar lit-code-end-regexp
3682   "^\\\\end{elisp}\\|^\\\\end{common}")
3683
3684 (defun lit-process ()
3685   (interactive)
3686   (save-excursion
3687     (let ((to-buffer "*Lit Code*")
3688           (from-buffer (buffer-name (current-buffer)))
3689           (start-buffers (buffer-list)))
3690       (set-buffer (get-buffer-create to-buffer))
3691       (erase-buffer)
3692       (set-buffer (get-buffer-create from-buffer))
3693       (goto-char (point-min))
3694       (while (re-search-forward
3695               lit-code-beginning-regexp nil t)
3696         (let* ((file (match-string 1))
3697                (beg (match-end 0))
3698                (end (save-excursion
3699                       (search-forward-regexp
3700                        lit-code-end-regexp nil t)
3701                       (match-beginning 0)))
3702                (match (buffer-substring-no-properties
3703                        beg end)))
3704           (let ((to-buffer
3705                  (if file
3706                      (concat "*Lit Code*: " file)
3707                    "*Lit Code*")))
3708             (save-excursion
3709               (set-buffer (get-buffer-create
3710                            to-buffer))
3711               (insert match)))))
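      ;; Visit only the buffers created above.  Plain Emacs
      ;; Lisp is used because `set-difference' is only
      ;; defined once the cl library has been loaded.
      (dolist
          (buffer (delq nil
                        (mapcar
                         (lambda (buf)
                           (unless (memq buf start-buffers)
                             buf))
                         (buffer-list))))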
3715         (save-excursion
3716           (set-buffer buffer)
3717           (if (string= (buffer-name buffer)
3718                        "*Lit Code*")
3719               (eval-buffer)
3720             (write-region (point-min)
3721                           (point-max)
3722                           (concat "~/arxana/"
3723                                   (substring
3724                                    (buffer-name
3725                                     buffer)
3726                                    12)))))
3727         (kill-buffer buffer)))))
3728 \end{verbatim}
3729 \end{notate}
3730
3731 \begin{notate}{Emacs-export?}
3732 It wouldn't be hard to export the Elisp sections so
3733 that those who wanted to could ditch the literate
3734 wrapper.
3735 \end{notate}
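
\begin{notate}{Sketch: exporting the Elisp sections}
One way the export suggested in the previous Note might
look is sketched here.  It assumes, as `lit-process' does,
that every Emacs Lisp section starts and ends with the elisp
environment markers at the beginning of a line.  The function
name `sketch-export-elisp' and its two file arguments are
hypothetical.
\end{notate}

\begin{idea}
;; Hypothetical sketch; not part of the Arxana sources.
(defun sketch-export-elisp (tex-file el-file)
  "Copy the body of every elisp environment in TEX-FILE to EL-FILE."
  (let ((chunks nil))
    (with-temp-buffer
      (insert-file-contents tex-file)
      (goto-char (point-min))
      (while (re-search-forward "^\\\\begin{elisp}" nil t)
        (let ((beg (match-end 0)))
          ;; Skip a section that is never closed.
          (when (re-search-forward "^\\\\end{elisp}" nil t)
            (push (buffer-substring-no-properties
                   beg (match-beginning 0))
                  chunks)))))
    ;; Write the sections in their original order.
    (with-temp-file el-file
      (dolist (chunk (nreverse chunks))
        (insert chunk)))))

;; Example call (the file names are placeholders):
;; (sketch-export-elisp "~/arxana.tex" "~/arxana/arxana.el")
\end{idea}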
3736
3737 \begin{notate}{Bidirectional updating}
3738 Eventually it would be nice to have a code repository set
3739 up, and make it so that changes to the code can get
3740 snarfed up here.
3741 \end{notate}
3742
3743 \begin{notate}{A literate style}
3744 Ideally, each function will have its own Note to introduce
3745 it, and will not be called before it has been defined.  I
sometimes make an exception to this rule; for example,
3747 functions used to form recursions may appear with no
3748 further introduction, and may be called before they are
3749 defined.
3750 \end{notate}
3751
3752 \section{Appendix: Hypertext platforms} \label{appendix-hyper}
3753
3754 \begin{notate}{The hypertextual canon} \label{canon}
3755 There is a core library of texts that come up in
3756 discussions of hypertext.
3757 \begin{itemize}
3758 % \item (Plato)
3759 \item The Rosetta stone
3760 \item The Talmud (Judah haNasi, Rav Ashi, and many others)
3761 \item Monadology (Wilhelm Leibniz)
\item The Life and Opinions of Tristram Shandy, Gentleman
  (Laurence Sterne)
3764 \item Middlemarch (George Eliot)
3765 % \item The Gay Science (Freidrich Nietzsche)
3766 % \item (Wittgenstein)
3767 % \item (Alan Turing)
3768 \item The Nova Trilogy (William S. Burroughs)
3769 \item The Logic of Sense (Gilles Deleuze)
3770 % \item Open Creation and its Enemies (Asger Jorn)
3771 \item Labyrinths (Jorge Luis Borges)
3772 \item Literary Machines (Ted Nelson)
3773 % \item Simulation and Simulacra (Jean Baudrillard)
3774 \item Lila (Robert M. Pirsig)
3775 % \item \TeX: the program (Donald Knuth)
3776 \item Dirk Gently's Holistic Detective Agency
3777   (Douglas Adams)
3778 \item Pussy, King of the Pirates (Kathy Acker)
3779 % \item Rachel Blau DuPlessis,
3780 % \item Emily Dickinson
3781 % \item Gertrude Stein
3782 % \item Zora Neale Hurston
3783 \end{itemize}
3784 At the same time, it is somewhat ironic that none of the
3785 items on this list are themselves hypertexts in the
3786 contemporary sense of the word.  It's also a bit funny
3787 that certain other works (even some by the same authors)
3788 aren't on this list.  Perhaps we begin to get a sense of
3789 what's going on in this quote from Kathleen
3790 Burnett:\footnote{{\tt http://www.iath.virginia.edu/pmc/text-only/issue.193/burnett.193}}
3791 \begin{quote}
3792 ``Multiplicity, as a hypertextual principle, recognizes a
3793   multiplicity of relationships beyond the canonical
3794   (hierarchical).  Thus, the traditional concept of
3795   literary authorship comes under attack from two
3796   quarters--as connectivity blurs the boundary between
3797   author and reader, multiplicity problematizes the
3798   hierarchy that is canonicity.''
3799 \end{quote}
3800 It seems quite telling that non-hypertextual canons remain
3801 mostly-non-hypertextual even today, despite the existence
3802 of catalogs, indexes, and online access.\footnote{{\tt
3803     http://www.gutenberg.org/wiki/Category:Bookshelf}}
3804 \end{notate}
3805
3806 \begin{notate}{A geek's guide to literature}
This title is a riff on Slavoj \v{Z}i\v{z}ek's ``The
Pervert's Guide to Cinema''.  Taking Note \ref{canon} as a
3809 jumping-off point, why don't we make a survey of
3810 historical texts from the point of view of an aficionado
3811 of hypertext!  Just what does one have to do to ``get on
3812 the list''?  Just what is ``the hypertextual
3813 perspective''?  And, if \v{Z}i\v{z}ek is correct and we're
3814 to look for the hyperreal in the world of cinematic
3815 fictions -- what's left over for the world of literature?
3816 (Or mathematics?)
3817 \end{notate}
3818
3819 \begin{notate}{The number 3}
3820 This is the number of things present if we count carefully
3821 the items $A$, $B$, and a connection $C$ between them.
3822 [Picture of $A\xrightarrow{C} B$.]
3823
3824 (Or even: given $A$ and $B$, we use Wittgenstein counting,
3825 and \emph{intuit} that $C$ exists as the collection $\{A,
3826 B\}$; after all,
3827   some connection must exist precisely because we were
3828   presented with $A$ and $B$ together -- and lest the
3829   connections proliferate infinitely, we lump them all
3830   together as one.  [Picture of $A$, $B$,
3831     with the \emph{frame} labeled $C$.])
3832 \end{notate}
3833
3834 \begin{notate}{Surfaces}
3835 Deleuze talks about a theory of surfaces associated with
3836 verbs and events.  His surfaces represent the evanescence
3837 of events in time, and of their descriptions in language.
3838 An event is seen as a vanishingly-thin boundary between
3839 one state of being and another.
3840
3841 Certainly, a statement that is true \emph{now} may not be
3842 true five minutes from now.  It is easier to think and
3843 talk about things that are coming up and things that have
3844 already happened.  ``Living in the moment'' is regarded as
3845 special or even ``Zen''.
3846
3847 We can begin to put these musings on a more solid
3848 mathematical basis.  We first examine two types of
3849 \emph{interfaces}:
3850 \begin{enumerate}
3851 \item $A\xrightarrow{C} B$, $A\xrightarrow{D} B$,
3852   $A\xrightarrow{E} B$
3853   (the interface of $A$ and $B$ across $C$, $D$, and $E$);
3854 \item $A\xrightarrow{C} B$, $D\xrightarrow{C} E$,
3855   $F\xrightarrow{C} G$
3856   (the interface of various terms across $C$).
3857 \end{enumerate}
3858 \end{notate}
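
\begin{notate}{Sketch: collecting the two kinds of interface}
The two types of interface listed above can be read off a
list of triples quite directly.  In this sketch a triple
$A\xrightarrow{C} B$ is written as the list (A C B); the
function names are hypothetical, and an in-memory list
stands in for the database.
\end{notate}

\begin{idea}
;; Hypothetical sketch; not part of the Arxana sources.
(defun sketch-interface-between (a b triples)
  "Return the middles of all TRIPLES that lead from A to B."
  (let (middles)
    (dolist (triple triples (nreverse middles))
      (when (and (equal (nth 0 triple) a)
                 (equal (nth 2 triple) b))
        (push (nth 1 triple) middles)))))

(defun sketch-interface-across (c triples)
  "Return (BEGINNING . END) pairs for TRIPLES with middle C."
  (let (pairs)
    (dolist (triple triples (nreverse pairs))
      (when (equal (nth 1 triple) c)
        (push (cons (nth 0 triple) (nth 2 triple))
              pairs)))))

(let ((triples '(("A" "C" "B") ("A" "D" "B") ("A" "E" "B")
                 ("D" "C" "E") ("F" "C" "G"))))
  (list (sketch-interface-between "A" "B" triples)
        (sketch-interface-across "C" triples)))
;; => (("C" "D" "E")
;;     (("A" . "B") ("D" . "E") ("F" . "G")))
\end{idea}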
3859
3860 \begin{notate}{Comic books}
3861 No geek's guide to literature would be complete without
3862 putting comics in a hallowed place.  [Framed picture of
3863   $A$, $B$ next to framed
3864   picture of $A$, $B$, $a$.]  What happened?
3865   $\ddot{\smile}$
3866 \end{notate}
3867
3868 \begin{notate}{Intersecting triples}
3869 Diagrammatically, it is tempting to portray
$(ACB)_{\mathrm{mid}}DE$ as if it were closely related to
3871 $A(CDE)_{\mathrm{beg}}B$, despite the fact that they are
3872 notationally very different.  I'll have to think more
3873 about what this means.
3874 \end{notate}
3875
3876 \section{Appendix: Computational Linguistics} \label{appendix-linguistics}
3877
3878 \begin{notate}{What is this?}
3879 It might be reasonable to make annotating sentences part
3880 of our writeup on hypertext platforms -- but I'm putting
3881 it here for now.  If hypertext is what deals with language
3882 artifacts on the ``bulky'' level (saying, for example,
3883 that a subsection is part of a section, and so on), then
3884 computational linguistics is what deals with the finer
3885 levels.  However, the distinction is in some ways
3886 arbitrary, and many of the techniques should be at least
3887 vaguely similar.
3888 \end{notate}
3889
3890 \begin{notate}{Annotation sensibilities}\label{sense}
3891 We will want to be able to make at least two different
3892 kinds of annotations of verbs.  For example, given the
3893 statement
3894 \begin{itemize}
3895 \item[$S$.] (``Who'' ``is on'' ``first''),
3896 \end{itemize}
3897 I'd like to be able to say
3898 \begin{itemize}
3899 \item[I.](``is on'' ``means'' ``the position of a base runner in baseball'').
3900 \end{itemize}
3901 However, I'd also like to be able to say
3902 \begin{itemize}
3903 \item[II.] (``is on'' ``because'' ``he was walked'').
3904 \end{itemize}
3905 Annotation I is meant to apply to the term ``is on''
3906 itself (in a context that might be more general than just
3907 this one sentence).  If Who is also on steroids, that's
3908 another matter -- as this type of annotation helps make
3909 clear!
3910
3911 Annotation II is meant to apply to the term ``is on''
3912 \emph{as it
3913   appears in sentence $S$}.  In particular, Annotation II
3914 seems to work best in a context in which we've already
3915 accepted the ontological status of the verb-phrase ``is
3916 on first''.
3917
3918 Whereas Annotation I should presumably exist before
3919 statement $S$ is ever made (and it certainly helps make
3920 that statement make sense), Annotation II is most properly
3921 understood with reference to the fully-formed statement
3922 $S$.  However, Annotation II is different from a statement
3923 like ($S$ ``has truth value'' $F$) in that it looks into
3924 the guts of $S$.
3925 \end{notate}
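
\begin{notate}{Sketch: global and sentence-bound annotations}
To make the distinction in Note \ref{sense} concrete, here
is one possible encoding, which is an assumption for
illustration and not how Arxana stores things.  Annotation I
begins with the bare term, while Annotation II begins with a
marker of the hypothetical form (mid TRIPLE), meaning ``the
middle of this particular triple''.  Both resolve to the
same term, but only the second remembers its sentence.
\end{notate}

\begin{idea}
;; Hypothetical sketch; not part of the Arxana sources.
(defvar sketch-statement-S '("Who" "is on" "first"))

;; Annotation I: the beginning is the bare term.
(defvar sketch-annotation-I
  '("is on" "means"
    "the position of a base runner in baseball"))

;; Annotation II: the beginning is the middle of S.
(defvar sketch-annotation-II
  (list (list 'mid sketch-statement-S)
        "because" "he was walked"))

(defun sketch-term (component)
  "Return the term that COMPONENT stands for."
  (cond ((stringp component) component)
        ((eq (car component) 'beg)
         (nth 0 (nth 1 component)))
        ((eq (car component) 'mid)
         (nth 1 (nth 1 component)))
        ((eq (car component) 'end)
         (nth 2 (nth 1 component)))))

(defun sketch-context (component)
  "Return the triple COMPONENT is tied to, or nil if global."
  (unless (stringp component)
    (nth 1 component)))

(sketch-term (car sketch-annotation-II))
;; => "is on"
(sketch-context (car sketch-annotation-I))
;; => nil
(sketch-context (car sketch-annotation-II))
;; => ("Who" "is on" "first")
\end{idea}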
3926
3927 \begin{notate}{Comparison of places and ontological status} \label{places-and-onto-status}
3928 The difference between (I) a ``global'' annotation, and
3929 (II) the annotation of a specific sentence is analogous to
3930 the difference between (a) relationships between objects
3931 without a place, and (b) relationships between objects in
3932 specific places.  (Cf. Note \ref{sense}: ``global''
3933 statements are of course made ``local'' by the theories
3934 that scope them.)
3935
3936 For example, in a descriptive ontology of research
3937 documents, I might make the ``placeless'' statement,
3938 \begin{itemize}
3939 \item[a.] (``Introduction'' ``names'' ``a section'')
3940 \end{itemize}
3941 On the other hand, the statement
3942 \begin{itemize}
3943 \item[b.] (``Introduction'' ``has subject'' ``American
3944   History''),
3945 \end{itemize}
3946 seems likely to be about a specific Introduction.  (And
3947 somewhere in the backend, this triple should be expressed
3948 in terms of places!)
3949 \end{notate}
3950
\begin{notate}{Semantics}
In a sentence like
\begin{quote}
(((``I'' ``saw'' ``myself'')$_{\mathrm{mid}}$ ``as if''
  ``through a glass'')$_{\mathrm{beg}}$ ``but'' ``darkly'')
\end{quote}
first of all, there may be different parenthesizations,
and second of all, the semantics of links like ``as if''
and ``but'' may shape, to some extent, the ways in
which we parenthesize.
\end{notate}

\section{Appendix: Resource use} \label{appendix-resources}

\begin{notate}{Free culture in action}
I thought it worthwhile to include this quote from
a joint paper with Aaron Krowne:\footnote{See Footnote
\ref{corneli-krowne}.}
\begin{quote}
``[F]ree content typically
  manifests aspects of a common resource as well as an
  open access resource; while anyone can do essentially
  whatever they wish with the content offline, in its
  online life, the content is managed in a
  socially-mediated way.  In particular, rights to
  \emph{in situ} modification tend to be strictly
  controlled. [...]  By finding new ways to support
  freedom of speech within CBPP documents, we embrace
  subjectivity as a way to enhance the content of an
  intersubjectively valued corpus.  In the context of
  ``hackable'' media and maintenance protocols, the
  semantics with which scholia are handled can be improved
  upon indefinitely on a user-by-user basis and a
  resource-wide basis.  This is free culture in action.''
\end{quote}
\end{notate}

\begin{notate}{Learning}
The learner, confronted with a learning resource, or the
consumer of any other information resource (or indeed,
practically any resource whatsoever) may want a chance to
respond to the questions ``was this what you were looking
for?'' and ``did you find this helpful?''.  In some cases,
an independent answer to those questions could be generated
(e.g. by observing whether or not a student comes up with
a correct answer).
\end{notate}

\begin{notate}{Connections}
A useful communication goal is to expose some of the
connections between disparate resources.  Some existing
connections may be far more explicit than others.  It's
important to facilitate the making and explicating of
connections by ``third parties'' (Note
\ref{browser-within}).  The search for connections between
ostensibly unrelated things is a key part of both
creativity and learning.  In addition, connecting with
what others are doing is an important part of being a
social animal.
\end{notate}

\begin{notate}{Boundaries}
Notice that the departmentalization of knowledge is
similar to any regime that oversees and administers
boundaries.  In addition to bridging different areas,
learning often involves pushing one's boundaries and
getting out of one's comfort zone.  The ``sociological
imagination'' involves seeing oneself as part of something
bigger; this goes along with the idea of a discourse that
lowers or transcends the boundaries between participants.
Imagination of any form can challenge myopic patterns of
resource use, although there are also myopic fictions
which neglect to look at what's going on in reality!
\end{notate}

\end{document}