latex/arxana-reboot.tex

   1 %;; arxana.tex                   -*- mode: Emacs-Lisp; -*-
   2 %;; Copyright (C) 2005-2009 Joe Corneli <holtzermann17@gmail.com>
   3
   4 %;; This program is free software: you can redistribute it and/or modify
   5 %;; it under the terms of the GNU Affero General Public License as published by
   6 %;; the Free Software Foundation, either version 3 of the License, or
   7 %;; (at your option) any later version.
   8 %;;
   9 %;; This program is distributed in the hope that it will be useful,
  10 %;; but WITHOUT ANY WARRANTY; without even the implied warranty of
  11 %;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  12 %;; GNU Affero General Public License for more details.
  13 %;;
  14 %;; You should have received a copy of the GNU Affero General Public License
  15 %;; along with this program.  If not, see <http://www.gnu.org/licenses/>.
  16
  17 % (progn
  18 %   (find-file "~/arxana.tex")
  19 %   (save-excursion
  20 %     (goto-char (point-max))
  21 %     (let ((beg (progn (search-backward "\\begin{verbatim}")
  22 %                       (match-end 0)))
  23 %           (end (progn (search-forward "\\end{verbatim}")
  24 %                       (match-beginning 0))))
  25 %       (eval-region beg end)
  26 %       (lit-process))))
  27
  28 %%% Commentary:
  29
  30 %% To load: remove %'s above and evaluate with C-x C-e.
  31
  32 %% Alternatively, run this:
  33 % head -n 26 arxana.tex | sed -e "/%/s///" > arxana-loader.el
  34 %% on the command line to produce something you can use
  35 %% to load Arxana when you start Emacs:
  36 % emacs -l arxana-loader.el
  37
  38 %% Or put the expression in your ~/.emacs (perhaps wrapped
  39 %% in function like `eval-arxana').
  40
  41 %% Or search for a similar form below and evaluate there!
  42
  43 %% Q.  Where exactly are we supposed to store the most
  44 %% up-to-date Arxana files when they are ready to go?
  45
  46 %% A.  Copy them into /usr/lib/sbcl/site-systems/arxana/
  47 %% and that should be enough.  Make sure that arxana.asd
  48 %% is in that directory and that you have a symbolic link,
  49 %% made via
  50
  51 %% ln -s ./arxana/arxana.asd .
  52
  53 %% in the directory /usr/lib/sbcl/site-systems/
  54 %% -- Make sure to load once as root to generate new fasls.
  55
  56 %% Q. How to run the remote slime after that?
  57
  58 %% A. Make sure that Emacs `slime-protocol-version' matches
  59 %% Common Lisp's `swank::*swank-wire-protocol-version*', then,
  60 %% like this:
  61
  62 %% ssh -L 4005:127.0.0.1:4005 joe@li23-125.members.linode.com
  63 %% linode$ sbcl
  64 %% M-x slime-connect RET RET
  65
  66 %%% Code:
  67
  68 \documentclass{article}
  69
  70 \usepackage{amsmath}
  71 \usepackage{amsthm}
  72 \usepackage{verbatim}
  73
  74 \newcommand{\meta}[1]{$\langle${\it #1}$\rangle$}
  75
  76 \theoremstyle{definition}
  77 \newtheorem{nota}{Note}[section]
  78
  79 \parindent = 1.2em
  80
  81 \newenvironment{notate}[1]
  82   {\begin{nota}[{\bf {\em #1}}]}%
  83   {\end{nota}}
  84
  85 \makeatletter
  86 \newenvironment{elisp}
  87   {\let\ORGverbatim@font\verbatim@font
  88    \def\verbatim@font{\ttfamily\scshape}%
  89    \verbatim}
  90   {\endverbatim
  91   \let\verbatim@font\ORGverbatim@font}
  92 \makeatother
  93
  94 \makeatletter
  95 \newenvironment{common}[1]
  96   {\let\ORGverbatim@font\verbatim@font
  97    \def\verbatim@font{\ttfamily\scshape}%
  98    \verbatim}
  99   {\endverbatim
 100   \let\verbatim@font\ORGverbatim@font}
 101 \makeatother
 102
 103 \makeatletter
 104 \newenvironment{idea}
 105   {\let\ORGverbatim@font\verbatim@font
 106    \def\verbatim@font{\ttfamily\slshape}%
 107    \verbatim}
 108   {\endverbatim
 109   \let\verbatim@font\ORGverbatim@font}
 110 \makeatother
 111
 112 \begin{document}
 113
 114 \title{\emph{Arxana}}
 115
 116 \author{Joseph Corneli\thanks{Copyright (C) 2005-2010
 117     Joseph Corneli {\tt <holtzermann17@gmail.com>}\newline
 118     $\longrightarrow$ transferred to the public domain.}}
 119 \date{Last revised: \today}
 120
 121 \maketitle
 122
 123 \abstract{A tool for building hackable semantic hypertext
 124   platforms.  Source code and mailing lists are at {\tt
 125     http://common-lisp.net/project/arxana}.}
 126
 127 \tableofcontents
 128
 129 \section{Introduction}
 130
 131 \begin{notate}{What is ``Arxana''?} \label{arxana}
 132 \emph{Arxana} is the name of a ``next generation''
 133 hypertext system that emphasizes annotation.  Every object
 134 in this system is annotatable.  Because of this, I
 135 sometimes call Arxana's core ``the scholium system'', but
 136 the name ``Arxana'' better reflects our aim: to explore
 137 the mysterious world of links, attachments,
 138 correspondences, and side-effects.
 139 \end{notate}
 140
 141 \begin{notate}{The idea} \label{theoretical-context}
 142 A scholia-based document model for commons-based peer
 143 production will inform the development of our
 144 system.\footnote{{\tt
 145 http://www.metascholar.org/events/2005/freeculture/viewabstract.php?id=19
 146 % alternate:
 147 % http://br.endernet.org/~akrowne/planetmath/papers/corneli\_fcdl/corneli-krowne.pdf
 148 \label{corneli-krowne}
 149 }}
 150 In this model, texts are made up of smaller texts until
 151 you get to atomic texts; user actions are built in the
 152 same way.  Multiple users should interact with a shared
 153 persistent data-store, through functional annotation, not
 154 destructive modification.  We should pursue the
 155 asynchronous interaction model until we arrive at live,
 156 synchronous, settings, where we facilitate real-time
 157 computer-mediated interactions between users, and between
 158 users and running hackable programs.
 159 \end{notate}
 160
 161 \begin{notate}{The data model} \label{data-model}
 162 Start by storing a collection of \emph{strings}.  Now add
 163 in \emph{pairs} and \emph{triples} which point at 2 and 3
 164 objects respectively.  (We can extend to n-tuples if that
 165 turns out to be convenient.)  Finally, we will maintain a
 166 collection of \emph{lists}, each of which points at an
 167 unlimited number of objects.
 168 \end{notate}
 169
 170 \begin{notate}{History}
 171 Thinking about how to improve existing systems for
 172 peer-based collaboration in 2004, I designed a simple
 173 version of the scholium system that treated textual
 174 commentary and markup as scholia.\footnote{{\tt
 175     http://wiki.planetmath.org/AsteroidMeta/old\_draft\_of\_scholium\_system}}
 176 In 2006, I put together a single-user version of this
 177 system that ran exclusively under Emacs.\footnote{{\tt
 178     http://metameso.org/files/sbdm4cbpp.tex} \label{old-version}}
 179 The current system is an almost-completely rewritten
 180 variant, bringing in a shared database and various other
 181 enhancements to support multi-user interaction.
 182 \end{notate}
 183
 184 \begin{notate}{A brisk review of the programming literature} \label{prog-lit-review}
 185 Many years before I started working on this project, there
 186 was something called the Emacs HyperText
 187 System.\footnote{{\tt
 188     http://www.aue.aau.dk/\~{}kock/Publications/HyperBase/}}
 189 What we're doing here updates that idea for modern database
 190 methods, uses a more interesting data storage format, and
 191 also considers multiple front-ends to the same database
 192 (for example, a web interface).
 193
 194 Contemporary Emacs-based hypertext creation systems
 195 include Muse and Emacs Wiki.\footnote{{\tt
 196     http://mwolson.org/projects/EmacsMuse.html}}$^,$\footnote{{\tt
 197     http://mwolson.org/projects/EmacsWiki.html}} The
 198 browsing side features old standbys, Info and
 199 Emacs/w3m\footnote{Not to be confused with Emacs-w3m,
 200   which is not entirely ``Emacs-based''.}.  These packages
 201 provide ways to author or view what we should now
 202 call ``traditional'' hypertext documents.
 203
 204 Another legacy tool worth mentioning is
 205 HyperCard\footnote{{\tt
 206     http://en.wikipedia.org/wiki/HyperCard}}.  This system
 207 was oriented around the idea of using hypertext to create
 208 software, a vision we share, but like just about everyone
 209 else working in the field at the time, it used
 210 uni-directional links.
 211
 212 Hypertext \emph{nouveau} is based on semantic triples.
 213 The Semantic Web standard provides one specification of
 214 the features we can expect from triples.\footnote{{\tt
 215     http://www.w3.org/TR/2004/REC-rdf-primer-20040210/}}
 216 Triples provide a framework for knowledge representation
 217 with more depth and flexibility than the popular
 218 ``tagging'' methodology.  For example, suitable
 219 collections of triples implement AI-style ``frames''.  The
 220 idea of using triples to organize archival material is
 221 generating some interest as Semantic Web ideas
 222 spread.\footnote{Cf. recent museum and library
 223   conferences}$^,$\footnote{Even among academic computer
 224   scientists! (Josh Grochow, p.c.)}
 225
 226 An abstractly similar project to Arxana with some grand
 227 goals is being developed by Chris Hanson at MIT under the
 228 name ``Web-scale Environments for Deduction
 229 Systems''.\footnote{{\tt
 230     http://publications.csail.mit.edu/abstracts/abstracts07/cph2/cph2.html}}
 231
 232 Another technically similar project is Freebase, a
 233 hand-rolled database of open content, organized on frame-based,
 234 triple-driven principles.  The developer of the Freebase
 235 graphd database has some interesting things to say about
 236 old and new ways of handling triples.\footnote{{\tt
 237     http://blog.freebase.com/2008/04/09/a-brief-tour-of-graphd/}}
 238 \end{notate}
 239
 240 \begin{notate}{Fitting in}
 241 My current development goal is to use this system to
 242 create a more flexible multiuser interaction platform than
 243 those currently available to web-based collaborative
 244 projects (such as PlanetMath\footnote{{\tt
 245     http://planetmath.org}}).  As an intermediate stage,
 246 I'm using Arxana to help organize material for a book I'm
 247 writing.  Arxana's theoretical generality, active
 248 development status, detailed documentation, and
 249 superlatively liberal terms of use may make it an
 250 attractive option for you to try as well!
 251 \end{notate}
 252
 253 \begin{notate}{What you get}
 254 Arxana has an Emacs frontend, a Common Lisp middle-end,
 255 and a SQL backend.  If you want to do some work, any one
 256 of these components can be swapped out and replaced with
 257 the engine of your choice.  I've released all of the
 258 implementation work on this system into the public domain,
 259 and it runs on an entirely free/libre/open source software
 260 platform.
 261 \end{notate}
 262
 263 \begin{notate}{Acknowledgements}
 264 Ted Nelson's ``Literary Machines'' and Marvin Minsky's
 265 ``Society of Mind'' are cornerstones in the historical and
 266 social contextualization of this work.  Alfred Korzybski's
 267 ``Science and Sanity'' and Gilles Deleuze's ``The Logic of
 268 Sense'' provided grounding and encouragement.  \TeX\ and
 269 GNU Emacs have been useful not just in prototyping this
 270 system, but also as exemplary projects in the genre I'm
 271 aiming for.  John McCarthy's Elephant 2000 was an
 272 inspiring thing to look at and think about\footnote{{\tt
 273     http://www-formal.stanford.edu/jmc/elephant/elephant.html}}, and of course Lisp has been a vital ingredient.
 274
 275 Thanks also to everyone who's talked about this project
 276 with me!
 277 \end{notate}
 278
 279 \section{Using the program}
 280
 281 \begin{notate}{Dependencies} \label{dependencies}
 282 Our interface is embedded in Emacs.  Backend processing is
 283 done with Common Lisp.  We are currently using the
 284 PostgreSQL database.  These packages should be available
 285 to you through the usual channels.  (I've been using SBCL,
 286 but any Lisp should do; please make sure you are using a
 287 contemporary Emacs version.)
 288
 289 We will connect Emacs to Lisp via Slime\footnote{{\tt
 290     http://common-lisp.net/project/slime/}}, and Lisp to
 291 PostgreSQL via CLSQL.\footnote{{\tt http://clsql.b9.com/}}
 292 CLSQL also talks directly to the Sphinx search engine,
 293 which we use for text-based search.\footnote{{\tt
 294     http://www.sphinxsearch.com/}} Once all of these
 295 things are installed and working together, you should be
 296 able to begin to use Arxana.
 297
 298 Setting up all of these packages can be a somewhat
 299 time-consuming and confusing task, especially if you
 300 haven't done it before!  See Appendix \ref{appendix-setup}
 301 for help.
 302 \end{notate}
 303
 304 \begin{notate}{Export code and set up the interface}
 305 If you are looking at the source version of this document
 306 in Emacs, evaluate the following s-expression (type
 307 \emph{C-x C-e} with the cursor positioned just after its
 308 final parenthesis).  This exports the Common Lisp
 309 components of the program to suitable files for subsequent
 310 use, and prepares the Emacs environment.  (The code that
 311 does this is in Appendix \ref{appendix-lit}.)
 312 \end{notate}
 313
 314 \begin{idea}
 315 (save-excursion
 316   (let ((beg (search-forward "\\begin{verbatim}"))
 317         (end (progn (search-forward "\\end{verbatim}")
 318                     (match-beginning 0))))
 319     (eval-region beg end)
 320     (lit-process)))
 321 \end{idea}
 322
 323 \begin{notate}{To load Common Lisp components at run-time} \label{load-at-runtime}
 324 Link {\tt arxana.asd} somewhere where Lisp can find it.
 325 Then run commands like these in your Lisp; if you like,
 326 you can place all of this stuff in your config file to
 327 automatically load Arxana when Lisp starts.  The final
 328 form is only necessary if you plan to use CLSQL's special
 329 syntax on the Lisp command-line.
 330 \end{notate}
 331
 332 \begin{idea}
 333 (asdf:operate 'asdf:load-op 'clsql)
 334 (asdf:operate 'asdf:load-op 'arxana)
 335 (in-package arxana)
 336 (connect-to-database)
 337 (locally-enable-sql-reader-syntax)
 338 \end{idea}
 339
 340 \begin{notate}{To connect Emacs to Lisp}
 341 Either run {\tt M-x slime RET} to start and connect to
 342 Lisp locally, or {\tt M-x slime-connect RET RET} after you
 343 have opened an SSH tunnel to your remote server with
 344 a command like this: {\tt ssh -L 4005:127.0.0.1:4005
 345   <username>@<host>} and started Lisp and the Swank server
 346 on the remote machine.  To have Swank start automatically
 347 when you start Lisp, put commands like these in your config
 348 file.
 349 \end{notate}
 350
 351 \begin{idea}
 352 (asdf:operate 'asdf:load-op 'swank)
 353 (setf swank:*use-dedicated-output-stream* nil)
 354 (setf swank:*communication-style* :fd-handler)
 355 (swank:create-server :dont-close t)
 356 \end{idea}
 357
 358 \begin{notate}{To define database structures}
 359 If you haven't yet defined the basic database structures,
 360 make sure to load them now!  (Using {\tt tabledefs.lisp},
 361 or the SQL code in Section \ref{sql-code})
 362 \end{notate}
 363
 364 \begin{notate}{Importing this document into the system}
 365 You can browse this document inside Arxana: after loading
 366 the code, run \emph{M-x autoimport-arxana}.
 367 \end{notate}
 368
 369 \section{SQL tables} \label{sql-code}
 370
 371 \begin{notate}{Objects and codes} \label{objects-and-codes}
 372 Every object in the system is identified by an ordered
 373 pair: a \emph{code} and a \emph{reference}.  The codes say
 374 which table contains the indicated object, and references
 375 provide that object's id.  To refer to a specific element of a list
 376 or n-tuple, a third number, that element's \emph{offset},
 377 is required.  The codes are as follows:
 378
 379 \begin{center}
 380 \begin{tabular}{|l|l|}
 381 \hline
 382 0 & list \\ \hline
 383 1 & string \\ \hline
 384 2 & pair \\ \hline
 385 3 & triple \\ \hline
 386 \end{tabular}
 387 \end{center}
 388 \end{notate}
 389
 390 \begin{idea}
 391 CREATE TABLE strings (
 392    id SERIAL PRIMARY KEY,
 393    text TEXT NOT NULL UNIQUE
 394 );
 395
 396 CREATE TABLE pairs (
 397    id SERIAL PRIMARY KEY,
 398    code1 INT NOT NULL,
 399    ref1 INT NOT NULL,
 400    code2 INT NOT NULL,
 401    ref2 INT NOT NULL,
 402    UNIQUE (code1, ref1,
 403            code2, ref2)
 404 );
 405
 406 CREATE TABLE triples (
 407    id SERIAL PRIMARY KEY,
 408    code1 INT NOT NULL,
 409    ref1 INT NOT NULL,
 410    code2 INT NOT NULL,
 411    ref2 INT NOT NULL,
 412    code3 INT NOT NULL,
 413    ref3 INT NOT NULL,
 414    UNIQUE (code1, ref1,
 415            code2, ref2,
 416            code3, ref3)
 417 );
 418 \end{idea}
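
\begin{notate}{Sketch: from coordinates to table names}
As a minimal sketch of the coding scheme in Note
\ref{objects-and-codes}, the following hypothetical helper
maps coordinates to the name of the table they point into.
The names `code-to-table' and `*code-tables*' are
illustrative assumptions, not part of Arxana's exported
code.  Note that the function `datatype' in the Lisp code
below uses a different assignment of codes.
\end{notate}

\begin{idea}
;; Illustrative only: these names are assumptions, not Arxana code.
(defparameter *code-tables*
  '((0 . "lists") (1 . "strings") (2 . "pairs") (3 . "triples"))
  "Codes and table names, as in the note on objects and codes.")

(defun code-to-table (coordinates)
  "Return the name of the table that COORDINATES point into.
COORDINATES is a list (code ref) or (code ref offset)."
  (cdr (assoc (first coordinates) *code-tables*)))

;; (code-to-table '(1 42))   returns "strings"
;; (code-to-table '(0 7 3))  returns "lists"
\end{idea}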
 419
 420 \begin{notate}{A list of lists}\label{models-of-theories}
 421 As a central place to manage our collections, we first
 422 create a list of lists.  The `heading' is the list's name,
 423 and its `header' is metadata.
 424 \end{notate}
 425
 426 \begin{idea}
 427 CREATE TABLE lists (
 428   id SERIAL PRIMARY KEY,
 429   heading INT REFERENCES strings(id) UNIQUE,
 430   header INT REFERENCES strings(id)
 431 );
 432 \end{idea}
 433
 434 \begin{notate}{Lists on demand}\label{lists-on-demand}
 435 Whenever we want to create a new list, we first add to the
 436 `lists' table, and then create a new table ``listk''
 437 (where k is equal to the new maximum id on `lists').
 438 \end{notate}
 439
 440 \begin{idea}
 441 CREATE TABLE listk (
 442    offset SERIAL PRIMARY KEY,
 443    code INT NOT NULL,
 444    ref INT NOT NULL
 445 );
 446 \end{idea}
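
\begin{notate}{Sketch: generating a ``listk'' definition}
Since the table name depends on k, the definition above has
to be generated at run-time.  The following is a minimal
sketch of one way to do that; `list-table-name' and
`list-table-definition' are hypothetical names, not part of
Arxana.  The sketch only builds the SQL text.  With a live
database connection, the result could presumably be passed
to CLSQL's `execute-command', which is used the same way in
{\tt tabledefs.lisp}.  One caveat: `offset' is a reserved
word in PostgreSQL, so the column name may need to be quoted
there.
\end{notate}

\begin{idea}
;; Illustrative only: these names are assumptions, not Arxana code.
(defun list-table-name (k)
  "Name of the table holding the elements of list number K."
  (format nil "list~d" k))

(defun list-table-definition (k)
  "SQL text that creates the element table for list number K."
  (format nil "CREATE TABLE ~a (
   offset SERIAL PRIMARY KEY,
   code INT NOT NULL,
   ref INT NOT NULL
);" (list-table-name k)))

;; (list-table-name 3)  returns "list3"
;; With a connection open, one might then run:
;; (execute-command (list-table-definition 3))
\end{idea}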
 447
 448 \begin{notate}{Side-note on containers via triples}  \label{containers-using-triples}
 449 To model a basic container, we can just use triples like
 450 ``(A in B)''.  This is useful, but the elements of B are
 451 of course unordered.  In Section \ref{importing}, we make
 452 extensive use of triples like (B 1 $\alpha$), (B 2
 453 $\beta$), etc., to indicate that B's first component is
 454 $\alpha$, second component is $\beta$, and so on; so we
 455 can make ordered list-like containers as well.
 456
 457 This is an example of the difference in expressive power
 458 of tags (which only provide a sense of unordered
 459 containment in ``virtual baskets'') and triples (which
 460 here are seen to at least provide the additional sense of
 461 ordered containment in ``virtual filing cabinets'',
 462 although they have much more in store for us); cf. Note
 463 \ref{prog-lit-review}.
 464
 465 Useful as these two kinds of container are in principle,
 466 the user could easily be overloaded by looking at many
 467 different containers encoded in raw triples, all at
 468 once.
 469 \end{notate}
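
\begin{notate}{Sketch: reading an ordered container from triples}
To make the ordered-container idea concrete, here is a
minimal sketch that works on plain Lisp lists rather than on
the database.  It assumes triples are given as lists of the
form (beg mid end); the function name `container-elements'
is hypothetical.  Triples such as (A in B), whose middle is
not an integer, are ignored.
\end{notate}

\begin{idea}
;; Illustrative only: not part of Arxana's exported code.
(defun container-elements (container triples)
  "Return the components of CONTAINER in positional order.
Only triples of the form (CONTAINER position element), with
POSITION an integer, are considered."
  (mapcar #'third
          ;; Copy before sorting, since `sort' is destructive.
          (sort (copy-list
                 (remove-if-not
                  (lambda (triple)
                    (and (equal (first triple) container)
                         (integerp (second triple))))
                  triples))
                #'< :key #'second)))

;; (container-elements "B" '(("B" 2 "beta")
;;                           ("A" "in" "B")
;;                           ("B" 1 "alpha")))
;; returns ("alpha" "beta")
\end{idea}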
 470
 471 \begin{notate}{Sense of containment}
 472 Note that every element of a list is in the list in the
 473 same ``sense'' -- for example, we can't instantly
 474 distinguish elements that are ``halfway in'' from those
 475 that are ``all the way in'', as we could with pure
 476 triples.
 477 \end{notate}
 478
 479 %% \begin{notate}{References into theories}
 480 %% Since at the moment we have less than 10 basic codes, we
 481 %% can uniquely reference contents of theory $k$ with ordered
 482 %% pairs $10k+\mathit{basic\ code}$ and $\mathit{reference}$.
 483 %% \end{notate}
 484
 485 \begin{notate}{Uniqueness of strings and triples} \label{unique-things}
 486 An attempt to create a string or triple with duplicate
 487 contents generates a warning.  This saves storage, given
 488 possible repetitive use -- and avoids confusion.  We can,
 489 however, reference duplicate ``copies'' on the lists.
 490 \end{notate}
 491
 492 \begin{notate}{Change} \label{change}
 493 Notice also that since neither strings nor triples
 494 ``change'', we have to account for change in other ways.
 495 In particular, the contents of lists can change.  (We may
 496 subsequently add some metadata to indicate that certain
 497 lists are ``locked'', or that they can only be changed by
 498 adding, etc., so that their contents can be cited stably
 499 and reliably.)
 500 \end{notate}
 501
 502 %% \begin{notate}{Each place contains one object} \label{places}
 503 %% It is obvious from the table definition that I want each
 504 %% place to contain precisely one thing; perhaps it is less
 505 %% obvious why I want to use a database table to maintain
 506 %% this relationship between ``places'' and ``things''.  This
 507 %% is largely a matter of convenience, but in particular it
 508 %% makes it easy for places to change.
 509 %% \end{notate}
 510
 511 \begin{notate}{Provenance and other metadata} \label{provenance}
 512 We could of course add much more structure to the
 513 database, starting with simple adjustments like adding
 514 provenance metadata or versioning into the records for
 515 each stored thing.  For the time being, I assume that such
 516 metadata will appear in the application or content layer,
 517 as triples.  (The exceptions are the ``headings'' and
 518 ``headers'' associated with lists.)
 519 \end{notate}
 520
 521 \section{Common Lisp-side}
 522
 523 \subsection{Preliminaries}
 524
 525 \subsubsection*{System definition}
 526
 527 \begin{common}{arxana.asd}
 528 (defsystem "arxana"
 529     :version "1"
 530     :author "Joe Corneli <holtzermann17@gmail.com>"
 531     :licence "Public Domain"
 532     :components
 533     ((:file "packages")
 534      (:file "utilities" :depends-on ("packages"))
 535      (:file "database" :depends-on ("utilities"))
 536      (:file "queries" :depends-on ("packages"))))
 537 \end{common}
 538
 539 \subsubsection*{Package definition}
 540
 541 \begin{common}{packages.lisp}
 542 (defpackage :arxana
 543   (:use #:cl #:clsql #:clsql-sys))
 544 \end{common}
 545
 546 \subsubsection*{Utilities}
 547
 548 \begin{notate}{Useful things} \label{useful}
 549 These definitions are either necessary or useful for
 550 working with the database and manipulating triple-centric
 551 and/or theory-situated data.  The implementation of
 552 theories given here is inspired by Lisp's streams.  This
 553 is perhaps the most gnarly part of the code; the pay-off
 554 of doing things the way we do them here is that
 555 subsequently theories can sit ``transparently'' over other
 556 structures.
 557 \end{notate}
 558
 559 \begin{common}{utilities.lisp}
 560 (in-package arxana)
 561 (locally-enable-sql-reader-syntax)
 562
 563 ;; (defun connect-to-database ()
 564 ;;    (connect `("localhost" "joe" "joe" "")
 565 ;;             :database-type :postgresql-socket))
 566
 567 (defun connect-to-database ()
 568    (connect `("localhost" "joe" "joe" "joe")
 569             :database-type :mysql))
 570
 571 (defmacro select-one (&rest args)
 572   `(car (select ,@args :flatp t)))
 573
 574 (defmacro select-flat (&rest args)
 575   `(select ,@args :flatp t))
 576
 577 (defun resolve-ambiguity (stuff)
 578   (first stuff))
 579
 580 (defun isolate-components (content i j)
 581   (list (nth (1- i) content)
 582         (nth (1- j) content)))
 583
 584 (defun isolate-beginning (triple)
 585   (isolate-components (cdr triple) 1 2))
 586
 587 (defun isolate-middle (triple)
 588   (isolate-components (cdr triple) 3 4))
 589
 590 (defun isolate-end (triple)
 591   (isolate-components (cdr triple) 5 6))
 592
 593 (defvar *read-from-heading* nil)
 594
 595 (defvar *write-to-heading* nil)
 596 \end{common}
 597
 598 \begin{notate}{On `datatype'}
 599 Just translate coordinates into their primary dimension.
 600 (How should this change to accommodate codes 4, 5, 6,
 601 possibly etc.?)
 602 \end{notate}
 603
 604 \begin{common}{utilities.lisp}
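 605 (defun datatype (data)
 606   "Return the name of the table indicated by the code of DATA."
 607   ;; `case' compares with `eql', which is reliable for integers;
 608   ;; `eq' on numbers is implementation-dependent.
 609   (case (car data)
 610     (0 "strings")
 611     (1 "places")
 612     (2 "triples")
 613     (3 "theories")))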
 614
 615 (locally-disable-sql-reader-syntax)
 616 \end{common}
 617
 618 \begin{notate}{Resolving ambiguity}
 619 Often a query will return more than one item when we are
 620 only prepared to deal with one.  To handle this sort of
 621 ambiguity, it would be great to have either a
 622 non-interactive notifier that says that some ambiguity has
 623 been dealt with, or an interactive tool that lets the user
 624 choose among the ambiguous options.  For now, we provide
 625 the simplest non-interactive tool, `resolve-ambiguity':
 626 just choose the first item from a possibly ambiguous list
 627 of items.
 628 \end{notate}
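
\begin{notate}{Sketch: a non-interactive notifier}
The non-interactive notifier mentioned above might look
something like the following.  This is only a sketch under
the assumption that a Lisp warning is an acceptable form of
notification; the name `resolve-ambiguity-noisily' is
hypothetical, and the function is not used elsewhere in the
system.  Like `resolve-ambiguity', it returns the first
item.
\end{notate}

\begin{idea}
;; Illustrative only: not part of Arxana's exported code.
(defun resolve-ambiguity-noisily (stuff)
  "Return the first item of STUFF, warning if others are dropped."
  (when (rest stuff)
    (warn "More than one candidate; keeping ~s and dropping ~d other(s)."
          (first stuff)
          (length (rest stuff))))
  (first stuff))

;; (resolve-ambiguity-noisily '(7))      returns 7, with no warning
;; (resolve-ambiguity-noisily '(7 8 9))  returns 7, after a warning
\end{idea}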
 629
 630 \begin{notate}{Using a different database}
 631 See Note \ref{backend-variant} for instructions on changes
 632 you will want to make if you use a different database.
 633 \end{notate}
 634
 635 \begin{notate}{Use of the ``count'' function}
 636 The SQL count function is thought to be inefficient with
 637 some backends; workarounds exist.  (And it's considered to
 638 be efficient with MySQL.)
 639 \end{notate}
 640
 641 \begin{notate}{Abstraction} \label{abstraction}
 642 While it might be in some ways ``nice'' to allow people to
 643 chain together ever-more-abstract references to elements
 644 from other theories, I actually think it is better to
 645 demand that there just be \emph{one} layer of abstraction
 646 (since we can then quickly translate back and forth,
 647 rather than running through a chain of translations).
 648
 649 This does not imply that we cannot have a theory
 650 superimposed over another theory (or over multiple
 651 theories) that draws input from throughout a massively
 652 distributed interlaced system -- rather, just that we
 653 assume we will need to translate to ``base coordinates''
 654 when building such structures.  However, we'll certainly
 655 want to explore the possibilities for running links
 656 between theories (abstractly similar in some sense to
 657 pointing at a component of a triple, but here there's no
 658 uniform beg, mid, end scheme to refer to).
 659 \end{notate}
 660
 661 \subsection{Main table definitions}
 662
 663 \begin{notate}{Defining tables from within Lisp}
 664 This is Lisp code to define the permanent SQL tables
 665 described in Section \ref{sql-code}.
 666 \end{notate}
 667
 668 \begin{common}{tabledefs.lisp}
 669 ;; (execute-command "CREATE TABLE strings (
 670 ;;    id SERIAL PRIMARY KEY,
 671 ;;    text TEXT NOT NULL UNIQUE
 672 ;; );")
 673
 674 (execute-command "CREATE TABLE strings (
 675    id SERIAL PRIMARY KEY,
 676    text TEXT,
 677    UNIQUE INDEX (text(255))
 678 );")
 679
 680 (execute-command "CREATE TABLE places (
 681    id SERIAL PRIMARY KEY,
 682    code INT NOT NULL,
 683    ref INT NOT NULL
 684 );")
 685
 686 (execute-command "CREATE TABLE triples (
 687    id SERIAL PRIMARY KEY,
 688    code1 INT NOT NULL,
 689    ref1 INT NOT NULL,
 690    code2 INT NOT NULL,
 691    ref2 INT NOT NULL,
 692    code3 INT NOT NULL,
 693    ref3 INT NOT NULL,
 694    UNIQUE (code1, ref1,
 695            code2, ref2,
 696            code3, ref3)
 697 );")
 698
 699 (execute-command "CREATE TABLE theories (
 700   id SERIAL PRIMARY KEY,
 701   name INT UNIQUE REFERENCES strings(id)
 702 );")
 703 \end{common}
 704
 705 \begin{notate}{Eliminating tables}
 706 In case you ever need to redefine these tables, you can
 707 run code like this first, to delete the existing copies.
 708 (Additional tables are added whenever a theory is created;
 709 code for deleting theories or their contents will appear
 710 in Section \ref{processing-theories}.)
 711 \end{notate}
 712
 713 \begin{idea}
 714 (dolist (view (list-views)) (drop-view view))
 715 (execute-command "DROP TABLE strings")
 716 (execute-command "DROP TABLE triples")
 717 (execute-command "DROP TABLE places")
 718 (execute-command "DROP TABLE theories")
 719 \end{idea}
 720
 721 \subsection{Modifying the database}
 722
 723 \begin{common}{database.lisp}
 724 (in-package arxana)
 725 (locally-enable-sql-reader-syntax)
 726 \end{common}
 727
 728 \subsection*{Processing strings}
 729
 730 \begin{notate}{On `string-to-id'}
 731 Return the id of `text', if present, otherwise nil.
 732
 733 There was a segmentation fault with clisp here at one
 734 point, maybe because I hadn't gotten the clsql sql reader
 735 syntax loaded up properly.  Note that calling the code
 736 without the function wrapper did not produce the same
 737 segfault.
 738 \end{notate}
 739
 740 \begin{common}{database.lisp}
 741 (defun string-to-id (text)
 742   (select-one [id]
 743               :from [strings]
 744               :where [= [text] text]))
 745 \end{common}
 746
 747 \begin{notate}{On `add-string'} \label{add-string}
 748 Add the argument `text' to the list of strings.  If the string
 749 is successfully created, its coordinates are returned.
 750 Otherwise, and in particular, if the request was to create
 751 a duplicate, nil is returned.
 752
 753 Should this give a message ``Adding \meta{text} to the
 754 strings table'' when the string is added by an indirect
 755 function call, such as through `massage'?
 756 (Note \ref{massage}.)
 757 \end{notate}
 758
 759 \begin{common}{database.lisp}
 760 (defun add-string (text)
 761   (handler-case
 762    (progn (insert-records :into [strings]
 763                           :attributes '(text)
 764                           :values `(,text))
 765           `(0 ,(string-to-id text))) ; 0 = string, cf. `massage'
 766    (sql-database-data-error ()
 767      (warn "\"~a\" already exists."
 768            text))))
 769 \end{common}
 770
 771 \begin{notate}{Error handling bug}
 772 The function `add-string' (Note \ref{add-string}) exhibits
 773 the first of several error handling calls designed to
 774 ensure uniqueness (Note \ref{unique-things}).
 775 Experimentally, this works, but I'm observing that, at
 776 least sometimes, if the user tries to add an item that's
 777 already present in the database, the index tied to the
 778 associated table increases even though the item isn't
 779 added.  This is annoying.  I haven't checked whether this
 780 happens on all possible installations of the underlying
 781 software.
 782 \end{notate}
 783
 784 \subsection*{Parsing general input}
 785
 786 \begin{notate}{On `massage'} \label{massage}
 787 User input to functions like `add-triple' can take the form
 788 of strings, integers (which the function ``serializes'' as
 789 the string versions of themselves), or \emph{coordinates}
 790 -- lists of the form (code ref).
 791 This function converts all of these input forms into the
 792 last one!  It takes an optional argument `addstr' which,
 793 if supplied, says to add string data to the database if it
 794 wasn't there already.
 795 \end{notate}
 796
 797 \begin{common}{database.lisp}
 798 (defun massage (data &optional addstr)
 799   (cond
 800    ((integerp data)
 801     (massage (format nil "~a" data) addstr))
 802    ((stringp data)
 803     (let ((id (string-to-id data)))
 804       (if id
 805           (list 0 id)
 806           (when addstr
 807             (add-string data)))))
 808    ((and (listp data)
 809          (equal (length data) 2))
 810     data)
 811    (t nil)))
 812 \end{common}
 813
 814
 815 \subsection*{Processing triples}
 816
 817 \begin{notate}{On `triple-to-id'}
 818 Return the id of the triple (beg mid end),
 819 if present, otherwise nil.
 820 \end{notate}
 821
 822 \begin{common}{database.lisp}
 823 (defun triple-to-id (beg mid end)
 824   (let ((b (massage beg))
 825         (m (massage mid))
 826         (e (massage end)))
 827     (select [id]
 828             :from [triples]
 829             :where [and [= [code1] (first b)]
 830                         [= [ref1] (second b)]
 831                         [= [code2] (first m)]
 832                         [= [ref2] (second m)]
 833                         [= [code3] (first e)]
 834                         [= [ref3] (second e)]])))
 835 \end{common}
 836
 837 \begin{notate}{On `add-triple'} \label{add-triple}
Elements of triples are parsed by `massage'
(Note \ref{massage}), which adds any strings that are not
yet in the database.  If the triple is successfully
created, its coordinates are returned.  Otherwise, and in
particular if the request was to create a duplicate, a
warning is issued and nil is returned.
 843 \end{notate}
 844
 845 \begin{common}{database.lisp}
 846 (defun add-triple (beg mid end)
 847   "Add a triple comprised of BEG MID and END."
 848   (let ((b (massage beg t))
 849         (m (massage mid t))
 850         (e (massage end t)))
 851     (when (and b m e)
 852       (handler-case
 853        (progn
 854          (insert-records
 855           :into [triples] :attributes '(code1 ref1
 856                                         code2 ref2
 857                                         code3 ref3)
 858           :values `(,(first b) ,(second b)
 859                     ,(first m) ,(second m)
 860                     ,(first e) ,(second e)))
 861          `(2 ,(triple-to-id b m e)))
 862        (sql-database-data-error ()
 863          (warn "\"~a\" already entered as [~a ~a ~a]."
 864                (list beg mid end) b m e))))))
 865 \end{common}
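
\begin{notate}{Sketch: rejecting duplicate triples}
The control flow of `add-triple' can be tried out without
a database.  In this minimal sketch a hash table plays the
role of the `triples' table and a home-made condition
plays the role of `sql-database-data-error'.  The names
`sketch-duplicate', `sketch-insert' and
`sketch-add-triple' are hypothetical, and the arguments
are assumed to be coordinates already.
\begin{verbatim}
(define-condition sketch-duplicate (error) ())

(defvar *sketch-triples* (make-hash-table :test #'equal))
(defvar *sketch-triple-count* 0)

(defun sketch-insert (key)
  "Store KEY under a fresh id, or signal `sketch-duplicate'."
  (when (gethash key *sketch-triples*)
    (error 'sketch-duplicate))
  (setf (gethash key *sketch-triples*)
        (incf *sketch-triple-count*)))

(defun sketch-add-triple (b m e)
  "Return (2 id), or warn and return nil on a duplicate."
  (handler-case
      (list 2 (sketch-insert (list b m e)))
    (sketch-duplicate ()
      (warn "Triple ~a already entered." (list b m e)))))

(sketch-add-triple '(0 1) '(0 2) '(0 3))  ; => (2 1)
(sketch-add-triple '(0 1) '(0 2) '(0 3))  ; => NIL, warns
\end{verbatim}
The nil in the second case comes for free, because `warn'
itself returns nil.
\end{notate}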
 866
 867 \subsection*{Processing theories} \label{processing-theories}
 868
 869 \begin{notate}{Things to do with theories}
 870 For the record, we want to be able to create a theory, add
 871 elements to that theory, remove or change elements in the
 872 theory, and, for convenience, zap everything in a theory.
 873 Perhaps we will also want functions to remove the tables
 874 associated with a theory as well, swap the position of two
 875 theories, or change the name of a theory.  We will also
 876 want to be able to export and import theories, so they can
 877 be ``beamed'' between installations.  At appropriate
 878 places in the Emacs interface, we'll need to set
 879 `*write-to-heading*' and `*read-from-heading*'.
 880 \end{notate}
 881
 882 \begin{notate}{What can go in a theory} \label{what-can-go-in}
 883 Notice that there is no rule that says that a triple or
 884 place that's part of a theory needs to point only at
 885 strings that are in the same theory.
 886 \end{notate}
 887
 888 \begin{notate}{On `list-to-id'}
Return the id of the list with the given `heading', if
present, otherwise nil.
 891 \end{notate}
 892
 893 \begin{common}{database.lisp}
 894 (defun list-to-id (heading)
 895   (let ((string-id (string-to-id heading)))
 896     (select [id]
 897             :from [lists]
 898             :where [= [heading] string-id])))
 899 \end{common}
 900
\begin{notate}{On `add-list'} \label{add-theory}
Add a list to the `lists' table, and create the table
that will hold the elements of this list (named `lists'
followed by the id of the new list).  (Lists have headings
that are strings; it seems a little funny to always have
to translate submitted strings to ids for lookup, but
this is what we do.)
 907 \end{notate}
 908
 909 \begin{common}{database.lisp}
 910 (defun add-list (heading)
 911   (let ((string-id (second (massage heading t))))
 912     (handler-case
 913         (progn (insert :into [lists]
 914                        :attributes '(heading)
 915                        :values `(,string-id))
               (let ((k (list-to-id heading)))
 917                  (execute-command
 918                   (format nil "CREATE TABLE lists~A (
 919    offset SERIAL PRIMARY KEY,
 920    code INT NOT NULL,
 921    ref INT NOT NULL
 922 );" k))
 923                  `(0 ,k)))
 924       (sql-database-data-error
 925           ()
 926         (warn "The list \"~a\" already exists."
 927               heading)))))
 928 \end{common}
 929
 930 \begin{notate}{On `get-lists'}
 931 Find all lists that contain `symbol'.
 932 \end{notate}
 933
 934 \begin{common}{database.lisp}
 935 (defun get-lists (symbol)
 936   (let* ((data (massage symbol))
 937          (type (datatype data))
 938          (id (second data))
 939          (n (caar
 940              (query "select count(*) from lists")))
 941          results)
 942     (loop for k from 1 upto n
 943           do (let ((present
                    (query (concatenate
                            'string
                            ;; `add-list' names these tables lists<k>
                            "select offset from lists"
 947                             (format nil "~A" k)
 948                             " where ((code = "
 949                             (format nil "~A" type)
 950                             ") and (ref = "
 951                             (format nil "~A" id)
 952                             "))"))))
 953                (when present
 954                  ;; bit of a problem if there are multiple
 955                  ;; entries of that item on the given
 956                  ;; list.
 957                  (setq results (cons (list 0 k present)
 958                                      results)))))
 959     results))
 960 \end{common}
 961
 962 \begin{notate}{On `save-to-list'}
 963 Record `symbol' on list named `name'.
 964 \end{notate}
 965
 966 \begin{common}{database.lisp}
(defun save-to-list (symbol name)
  (let* ((data (massage symbol t))
         (string-id (string-to-id name))
         ;; `add-list' stores the name of a list in the
         ;; `heading' column, and keeps its elements in a
         ;; table named lists<k> with columns code and ref.
         (k (select-one [id]
                        :from [lists]
                        :where [= [heading] string-id]))
         (tablek (format nil "lists~A" k)))
    (insert-records :into (sql-expression :table tablek)
                    :attributes '(code ref)
                    :values `(,(first data) ,(second data)))))
 979 \end{common}
 980
 981 \subsection*{Lookup by id or coordinates}
 982
 983 \begin{notate}{The data format that's best for Lisp} \label{what-is-best-for-lisp}
It is a reasonable question to ask whether or not an
 985 item's id should be considered part of that item's
 986 defining data when that data is no longer in the database.
 987 For the functions defined here, the id is an input, and so
 988 by default I'm not including it in the output here,
 989 because it is already known.  However, for functions like
 990 `triples-given-beginning' (See Note
 991 \ref{graph-like-data}), the id is \emph{not} part of the
 992 known data, and so it is returned.  Therefore I am
 993 providing the `retain-id' flag here, for cases where
 994 output should be consistent with that of these other
 995 functions.
 996 \end{notate}
 997
 998 \begin{common}{database.lisp}
 999 (defun string-lookup (id &optional retain-id)
1000   (let ((ret (select [text]
1001                      :from [strings]
1002                      :where [= [id] id])))
1003     (if retain-id
1004         (list id ret)
1005         ret)))
1006
1007 (defun triple-lookup (id &optional retain-id)
1008   (let ((ret (select [code1] [ref1]
1009                      [code2] [ref2]
1010                      [code3] [ref3]
1011                      :from [triples]
1012                      :where [= [id] id])))
1013     (if retain-id
1014         (cons id ret)
1015         ret)))
1016
1017 (defun list-lookup (id &optional retain-id)
  (let ((ret (select [heading]
1019                      :from [lists]
1020                      :where [= [id] id])))
1021     (if retain-id
1022         (list id ret)
1023         ret)))
1024 \end{common}
1025
1026 \begin{notate}{Succinct idioms for following pointers}
1027 Here are some variants on the functions above which save
1028 us from needing to extract the id of the item from its
1029 coordinates.
1030 \end{notate}
1031
1032 \begin{common}{database.lisp}
1033 (defun string-contents (coords)
1034   (string-lookup (second coords)))
1035
1036 (defun place-contents (coords)
1037   (place-lookup (second coords)))
1038
1039 (defun triple-contents (coords)
1040   (triple-lookup (second coords)))
1041 \end{common}
1042
1043 \begin{notate}{Switchboard} \label{switchboard}
Even more succinctly, here is one function that retrieves
the object indicated by any set of coordinates,
dispatching on the code.
1046 \end{notate}
1047
1048 \begin{common}{database.lisp}
(defun switchboard (coords)
  ;; Use `eql', since `eq' is not guaranteed to work on
  ;; integers in Common Lisp.
  (cond ((eql (first coords) 0)
         (string-contents coords))
        ((eql (first coords) 1)
         (place-contents coords))
        ((eql (first coords) 2)
         (triple-contents coords))))
1056 \end{common}
1057
1058 \begin{notate}{Anti-pasti}
1059 The readability of this code could perhaps be improved if
1060 we used functions like `switchboard' more frequently.
1061 (More to the point, it seems it's not currently used.)  In
1062 particular, it would be nice if we could sweep idioms like
1063 \verb+`(2 ,(car triple))+ under the rug.
1064 \end{notate}
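
\begin{notate}{Sketch: naming the coordinate idioms}
One way to sweep such idioms under the rug would be a few
tiny constructors.  The following is only a sketch; the
`sketch-' names are hypothetical, and the codes 0, 1 and 2
are the ones that `switchboard' dispatches on.
\begin{verbatim}
(defun sketch-string-coords (id) (list 0 id))
(defun sketch-place-coords (id) (list 1 id))
(defun sketch-triple-coords (id) (list 2 id))

(defun sketch-coords-of-triple (triple)
  "Coordinates of TRIPLE, a row whose first element is
its id."
  (sketch-triple-coords (first triple)))

(sketch-coords-of-triple '(5 0 1 0 2 0 3))   ; => (2 5)
\end{verbatim}
\end{notate}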
1065
1066 \begin{common}{database.lisp}
1067 (locally-disable-sql-reader-syntax)
1068 \end{common}
1069
1070 \subsection{Queries} \label{queries}
1071
1072 \begin{notate}{The use of views} \label{use-of-views}
1073 It is easy enough to select those triples which match
1074 simple data, e.g., those triples which have the same
1075 beginning, middle, or end, or any combination of these.
1076 It is a little more complicated to find items that match
1077 criteria specified by several different triples; for
1078 example, to \emph{find all the books by Arthur C. Clarke
1079   that are also works of fiction}.
1080
1081 Suppose our collection of triples contains a portion as
1082 follows:
1083 \begin{center}
1084 \begin{tabular}{lll}
Profiles of the Future & is a & book \\
2001: A Space Odyssey & is a & book \\
Ender's Game & is a & book \\
Profiles of the Future & has genre & non-fiction \\
2001: A Space Odyssey & has genre & fiction \\
Ender's Game & has genre & fiction \\
Profiles of the Future & has author & Arthur C. Clarke \\
2001: A Space Odyssey & has author & Arthur C. Clarke \\
Ender's Game & has author & Orson Scott Card
1093 \end{tabular}
1094 \end{center}
1095
One way to solve the given problem would be to find those
items that \emph{are written by Arthur C. Clarke} (* ``has
author'' ``Arthur C. Clarke''), that \emph{are books}
(* ``is a'' ``book''), and \emph{that are classified as
  fiction} (* ``has genre'' ``fiction'').  We are looking
for items that match \emph{all} of these conditions.
1102
Our implementation strategy is: collect the items matching
each criterion into a view, then join these views.  (See
the function `satisfy-conditions', Note
\ref{satisfy-conditions}.)
1107
1108 If we end up working with large queries and a lot of data,
1109 this use of views may not be an efficient way to go -- but
1110 we'll cross that bridge when we come to it.
1111 \end{notate}
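
\begin{notate}{Sketch: the Clarke query in memory}
The strategy can be tried out on the table above without
any SQL at all.  In this minimal sketch a list per
criterion plays the role of a view, and `intersection'
plays the role of the join.  The names `*sketch-library*',
`sketch-subjects' and `sketch-satisfy-all' are
hypothetical, and triples are kept as plain lists of
strings rather than as coordinates.
\begin{verbatim}
(defparameter *sketch-library*
  '(("Profiles of the Future" "is a" "book")
    ("2001: A Space Odyssey" "is a" "book")
    ("Ender's Game" "is a" "book")
    ("Profiles of the Future" "has genre" "non-fiction")
    ("2001: A Space Odyssey" "has genre" "fiction")
    ("Ender's Game" "has genre" "fiction")
    ("Profiles of the Future" "has author" "Arthur C. Clarke")
    ("2001: A Space Odyssey" "has author" "Arthur C. Clarke")
    ("Ender's Game" "has author" "Orson Scott Card")))

(defun sketch-subjects (middle end)
  "Beginnings of all triples of the form (* MIDDLE END)."
  (loop for (b m e) in *sketch-library*
        when (and (string= m middle) (string= e end))
          collect b))

(defun sketch-satisfy-all (conditions)
  "CONDITIONS is a list of (MIDDLE END) pairs; return the
beginnings that satisfy every one of them."
  (reduce (lambda (found condition)
            (intersection
             found
             (apply #'sketch-subjects condition)
             :test #'string=))
          (rest conditions)
          :initial-value
          (apply #'sketch-subjects (first conditions))))

(sketch-satisfy-all '(("has author" "Arthur C. Clarke")
                      ("is a" "book")
                      ("has genre" "fiction")))
;; => ("2001: A Space Odyssey")
\end{verbatim}
\end{notate}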
1112
1113 \begin{notate}{Search queries}
1114 In Note \ref{sphinx-setup} et seq., we give some
1115 instructions on how to set up the Sphinx search engine to
1116 work with Arxana.  However, a much tighter integration of
1117 Sphinx into Arxana is possible, and will be coming soon.
1118 \end{notate}
1119
1120 \begin{common}{queries.lisp}
1121 (in-package arxana)
1122 (locally-enable-sql-reader-syntax)
1123 \end{common}
1124
1125 \subsection*{Printing}
1126
1127 \begin{notate}{On `print-system-object'} \label{print-system-object}
The function `print-system-object' bears some resemblance
to `massage', but is for printing instead, and therefore
has to be recursive: because triples and places can point
to other system objects, printing can be a long and drawn
out ordeal.
1133 \end{notate}
1134
1135 \begin{common}{queries.lisp}
1136 (defun print-system-object (data &optional components)
1137   (cond
1138     ;; just return strings
1139     ((stringp data)
1140      data)
1141     ;; printing from coordinates (code, ref)
1142     ((and (listp data)
1143           (equal (length data) 2))
1144      ;; we'll need some hack to deal with
1145      ;; elements-of-theories, which, right now, are two
1146      ;; elements long but are not (code, ref) pairs but
1147      ;; rather (local_id, ref) pairs, or maybe actually if
1148      ;; we take context into consideration, they're
1149      ;; actually (k, table, local_id, ref) quadruplets.
1150      ;; Obviously with *that* data we can translate to
1151      ;; (code, ref).  On the other hand, if we *don't*
1152      ;; take it into consideration, we probably can't do
1153      ;; much of anything.  So we should be careful to be
1154      ;; aware of just what sort of information we're
1155      ;; passing around.
1156      (cond ((equal (first data) 0)
1157             (string-lookup (second data)))
1158            ((equal (first data) 1)
1159             (print-system-object
1160              (place-lookup (second data) t)))
1161            ((equal (first data) 2)
1162             (let ((triple (triple-lookup (second data) t)))
1163               (if components
1164                   (list
1165                    (print-beginning triple)
1166                    (print-middle triple)
1167                    (print-end triple))
1168                   (concatenate
1169                    'string
1170                    (format nil "T~a[" (second data))
1171                    (print-beginning triple) "."
1172                    (print-middle triple) "."
1173                    (print-end triple) "]"))))
1174            ((equal (first data) 3)
1175             (concatenate 'string "List printing not implemented yet."))))
1176     ;; place
1177     ((and (listp data)
1178           (equal (length data) 3))
1179      (concatenate 'string
1180                   (format nil "P~a|" (first data))
1181                   (print-system-object (cdr data)) "|"))
1182     ;; triple
1183     ((and (listp data)
1184           (equal (length data) 7))
1185       (if components
1186           (list
1187            (print-beginning data)
1188            (print-middle data)
1189            (print-end data))
1190           (concatenate
1191            'string
1192            (format nil "T~a[" (first data))
1193            (print-beginning data) "."
1194            (print-middle data) "."
1195            (print-end data) "]")))
1196     (t nil)))
1197
1198 (defun print-beginning (triple)
1199   (print-system-object (isolate-beginning triple)))
1200
1201 (defun print-middle (triple)
1202   (print-system-object (isolate-middle triple)))
1203
1204 (defun print-end (triple)
1205   (print-system-object (isolate-end triple)))
1206 \end{common}
1207
1208 \begin{notate}{Depth}
1209 If we are going to have complicated recursive references,
1210 our printer, and anything else that gives the system some
1211 semantics, should come with some sort of ``layers'' switch
1212 that can be used to limit the amount of recursion we do in
1213 any given computation.
1214 \end{notate}
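
\begin{notate}{Sketch: a printer with a `layers' switch}
Here is a minimal sketch of what such a switch might look
like, using a toy store in place of the database.  The
names `*sketch-store*' and `sketch-print' are
hypothetical; only strings and triples are handled, and
the output imitates the format used by
`print-system-object'.
\begin{verbatim}
;; Toy store: id -> (beginning middle end), where each
;; part is a string or the coordinates (2 id) of a triple.
(defparameter *sketch-store*
  '((1 . ("Ender's Game" "has author" "Orson Scott Card"))
    (2 . ((2 1) "is" "well known"))))

(defun sketch-print (data &optional (layers 1))
  "Render DATA, expanding at most LAYERS nested triples."
  (cond
    ((stringp data) data)
    ((<= layers 0)
     (format nil "T~a[...]" (second data)))
    (t
     (destructuring-bind (b m e)
         (cdr (assoc (second data) *sketch-store*))
       (format nil "T~a[~a.~a.~a]"
               (second data)
               (sketch-print b (1- layers))
               (sketch-print m (1- layers))
               (sketch-print e (1- layers)))))))

(sketch-print '(2 2) 1)
;; => "T2[T1[...].is.well known]"
(sketch-print '(2 2) 2)
;; => "T2[T1[Ender's Game.has author.Orson Scott Card]
;;     .is.well known]"  (one line; wrapped for display)
\end{verbatim}
Strings are always printed in full; only the expansion of
nested triples counts against the budget.
\end{notate}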
1215
1216 \begin{notate}{Printing objects as they appear in Lisp} \label{printing-objects-in-lisp}
1217 With the following functions we provide facilities for
1218 printing an object, either from its id or from the
1219 expanded form of the data that represents it in Lisp.
1220 (This is one good reason to have one standard form for
1221 this data; compare Note \ref{what-is-best-for-lisp}.
1222 These functions assume that the id \emph{is} part of
1223 what's printed, so if using functions like `triple-lookup'
1224 to retrieve data for printing, you'll have to graft the id
1225 back on before printing with these functions.)
1226 \end{notate}
1227
1228 \begin{notate}{Printing theories}
1229 We'll want to both print all of the content of a theory,
1230 and print \emph{from} the theory in a more limited way.
1231 (Perhaps we get the second item for free, already?)
1232 \end{notate}
1233
1234 \begin{common}{queries.lisp}
1235 (defun print-string (string &optional components)
1236   (print-system-object string components))
1237
1238 (defun print-place (place &optional components)
1239   (print-system-object place components))
1240
1241 (defun print-triple (triple &optional components)
1242   (print-system-object triple components))
1243
1244 (defun print-string-from-id (id &optional components)
1245   (print-system-object (list 0 id) components))
1246
1247 (defun print-place-from-id (id &optional components)
1248   (print-system-object (list 1 id) components))
1249
1250 (defun print-triple-from-id (id &optional components)
1251   (print-system-object (list 2 id) components))
1252 \end{common}
1253
1254 \begin{notate}{Printing some stuff but not other stuff} \label{printing-some}
These functions are good for printing lists as they come out of
1256 the database.  See Note \ref{strings-and-ids} on printing
1257 strings.
1258 \end{notate}
1259
1260 \begin{common}{queries.lisp}
(defun print-strings (strings)
  (mapcar 'second strings))

(defun print-places (places &optional components)
  (mapcar (lambda (item)
            (print-system-object item components))
          places))

(defun print-triples (triples &optional components)
  (mapcar (lambda (item)
            (print-system-object item components))
          triples))

(defun print-theories (theories &optional components)
  (mapcar (lambda (item)
            (print-system-object item components))
          theories))
1278 \end{common}
1279
1280 \begin{notate}{Printing everything in each table} \label{printing-everything}
1281 These functions collect human-readable versions of
1282 everything in each table.  Notice that `all-strings' is
1283 written differently.
1284 \end{notate}
1285
1286 \begin{common}{queries.lisp}
1287 (defun all-strings ()
1288   (mapcar 'second (select [*] :from [strings])))
1289
1290 (defun all-places ()
1291   (mapcar 'print-system-object
1292           (select [*] :from [places])))
1293
1294 (defun all-triples ()
1295  (mapcar 'print-system-object
1296          (select [*] :from [triples])))
1297
1298 (defun all-theories ()
1299  (mapcar 'print-system-object
1300          (select [*] :from [theories])))
1301 \end{common}
1302
1303 \begin{notate}{Printing on particular dimensions}
One possible upgrade to the printing functions would be to
provide a built-in way to ``curry'' the printout, for
example, to print just the source nodes from a list of
triples.  It should of course also be possible to do
processing like this in Lisp after the printout has been
made, but it is presumably more efficient to retrieve and
format only the data we're actually looking for.
1312 \end{notate}
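
\begin{notate}{Sketch: projecting onto one dimension}
As a minimal sketch of the Lisp-side version of this idea,
the following function picks out one slot from each triple
before anything is printed.  The name `sketch-project' is
hypothetical, and I am assuming that rows have the
seven-element shape that `print-system-object' treats as a
triple, namely (id code1 ref1 code2 ref2 code3 ref3).
\begin{verbatim}
(defun sketch-project (triples position)
  "Coordinates at POSITION (1, 2 or 3) of each row of
TRIPLES."
  (let ((start (1- (* 2 position))))
    (mapcar (lambda (row)
              (subseq row start (+ start 2)))
            triples)))

(sketch-project '((7 0 1 0 2 0 3) (8 0 4 0 2 2 7)) 1)
;; => ((0 1) (0 4))
(sketch-project '((7 0 1 0 2 0 3) (8 0 4 0 2 2 7)) 3)
;; => ((0 3) (2 7))
\end{verbatim}
The resulting coordinates can then be handed to the
printing functions one at a time.
\end{notate}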
1313
1314 \begin{notate}{Strings and ids} \label{strings-and-ids}
1315 Unlike other objects, strings don't get printed with their
1316 ids.  We should probably provide an \emph{option} to print
1317 with ids (this could be helpful for subsequent work with
1318 the strings in question; on the other hand, since strings
1319 are being kept unique, we can immediately exchange a
1320 string and it's id, so I'm not sure if it's necessary to
1321 have an explicit ``option'').
1322 \end{notate}
1323
1324 \subsection*{Functions that establish basic graph structure}
1325
1326 \begin{notate}{Thinking about graph-like data} \label{graph-like-data}
Here we have in mind one or more objects (e.g. a
particular source and sink) that are associated with
1329 potentially any number of triples (e.g. all the possible
1330 middles running between these two identified objects).
1331 These functions establish various forms of locality or
1332 neighborhood within the data.
1333
1334 The results of such queries can be optionally cached in a
1335 view, which is useful for further processing
1336 (cf. \ref{satisfy-conditions}).
1337
1338 These functions take input in the form of strings and/or
1339 coordinates (cf. Note \ref{massage}).
1340 \end{notate}
1341
1342 \begin{common}{queries.lisp}
1343 (defun triples-given-beginning (node &optional view)
1344   "Get triples outbound from the given NODE.  Optional
1345   argument VIEW causes the results to be selected into a
1346   view with that name."
1347   (let ((data (massage node))
        (window (or view "internal-view"))
1349         ret)
1350     (when data
1351       (create-view
1352        window
1353         :as (select [*]
1354              :from [triples]
1355              :where [and [= [code1] (first data)]
1356                          [= [ref1] (second data)]]))
1357       (setq ret (select [*] :from window))
1358       (unless view
1359         (drop-view window))
1360       ret)))
1361
1362 (defun triples-given-end (node &optional view)
1363   "Get triples inbound into NODE.  Optional argument VIEW
1364        causes the results to be selected into a view with
1365        that name."
1366   (let ((data (massage node))
        (window (or view "internal-view"))
1368         ret)
1369     (when data
1370       (create-view
1371        window
1372         :as (select [*]
1373              :from [triples]
1374              :where [and [= [code3] (first data)]
1375                          [= [ref3] (second data)]]))
1376       (setq ret (select [*] :from window))
1377       (unless view
1378         (drop-view window))
1379       ret)))
1380
1381 (defun triples-given-middle (edge &optional view)
1382   "Get the triples that run along EDGE.  Optional argument
1383        VIEW causes the results to be selected into a view
1384        with that name."
1385   (let ((data (massage edge))
        (window (or view "internal-view"))
1387         ret)
1388     (when data
1389       (create-view
1390        window
1391        :as (select [*]
1392             :from [triples]
1393             :where [and [= [code2] (first data)]
1394                         [= [ref2] (second data)]]))
1395       (setq ret (select [*] :from window))
1396       (unless view
1397         (drop-view window))
1398       ret)))
1399
1400 (defun triples-given-middle-and-end (edge node &optional
1401        view)
1402   "Get the triples that run along EDGE into NODE.
1403        Optional argument VIEW causes the results to be
1404        selected into a view with that name."
1405   (let ((edgedata (massage edge))
1406         (nodedata (massage node))
        (window (or view "internal-view"))
1408         ret)
1409     (when (and edgedata nodedata)
1410       (create-view
1411        window
1412        :as (select [*]
1413             :from [triples]
1414             :where [and [= [code2] (first edgedata)]
1415                         [= [ref2] (second edgedata)]
1416                         [= [code3] (first nodedata)]
1417                         [= [ref3] (second nodedata)]]))
1418       (setq ret (select [*] :from window))
1419       (unless view
1420         (drop-view window))
1421       ret)))
1422
1423 (defun triples-given-beginning-and-middle (node edge
1424                                            &optional view)
1425   "Get the triples that run from NODE along EDGE.
1426 Optional argument VIEW causes the results to be selected
1427 into a view with that name."
1428   (let ((nodedata (massage node))
1429         (edgedata (massage edge))
        (window (or view "internal-view"))
1431         ret)
1432     (when (and nodedata edgedata)
1433       (create-view
1434        window
1435        :as (select [*]
1436             :from [triples]
1437             :where [and [= [code1] (first nodedata)]
1438                         [= [ref1] (second nodedata)]
1439                         [= [code2] (first edgedata)]
1440                         [= [ref2] (second edgedata)]]))
1441       (setq ret (select [*] :from window))
1442       (unless view
1443         (drop-view window))
1444       ret)))
1445
1446 (defun triples-given-beginning-and-end (node1 node2
1447        &optional view)
1448   "Get the triples that run from NODE1 to NODE2.  Optional
1449        argument VIEW causes the results to be selected
1450        into a view with that name."
1451   (let ((node1data (massage node1))
1452         (node2data (massage node2))
        (window (or view "internal-view"))
1454         ret)
1455     (when (and node1data node2data)
1456       (create-view
1457        window
1458        :as (select [*]
1459             :from [triples]
1460             :where [and [= [code1] (first node1data)]
1461                         [= [ref1] (second node1data)]
1462                         [= [code3] (first node2data)]
1463                         [= [ref3] (second node2data)]]))
1464       (setq ret (select [*] :from window))
1465       (unless view
1466         (drop-view window))
1467       ret)))
1468
;; This one uses `select-one' instead of `select'
1470 (defun triple-exact-match (node1 edge node2 &optional
1471        view)
1472   "Get the triples that run from NODE1 along EDGE to
1473 NODE2.  Optional argument VIEW causes the results to be
1474 selected into a view with that name."
1475   (let ((node1data (massage node1))
1476         (edgedata (massage edge))
1477         (node2data (massage node2))
        (window (or view "internal-view"))
1479         ret)
1480     (when (and node1data edgedata node2data)
1481       (create-view
1482        window
1483        :as (select [*]
1484             :from [triples]
1485             :where [and [= [code1] (first node1data)]
1486                         [= [ref1] (second node1data)]
1487                         [= [code2] (first edgedata)]
1488                         [= [ref2] (second edgedata)]
1489                         [= [code3] (first node2data)]
1490                         [= [ref3] (second node2data)]]))
1491       (setq ret (select-one [*] :from window))
1492       (unless view
1493         (drop-view window))
1494       ret)))
1495 \end{common}
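
\begin{notate}{Sketch: one matcher for all seven patterns}
The functions above differ only in which of the three
slots they constrain.  As an illustration, and not as a
proposed replacement, here is a minimal in-memory sketch
of the same selection.  The names `sketch-match' and
`*sketch-data*' are hypothetical; triples are assumed to
be lists of three coordinates, and `t' is used as the
wildcard, as in Note \ref{satisfy-conditions}.
\begin{verbatim}
(defun sketch-match (pattern triples)
  "Members of TRIPLES that agree with PATTERN, a list
(BEG MID END) in which t matches anything."
  (remove-if-not
   (lambda (triple)
     (every (lambda (want have)
              (or (eq want t) (equal want have)))
            pattern triple))
   triples))

(defparameter *sketch-data*
  '(((0 1) (0 2) (0 3))
    ((0 1) (0 4) (0 5))
    ((0 6) (0 2) (0 3))))

;; cf. `triples-given-beginning'
(sketch-match '((0 1) t t) *sketch-data*)
;; => (((0 1) (0 2) (0 3)) ((0 1) (0 4) (0 5)))

;; cf. `triples-given-middle-and-end'
(sketch-match '(t (0 2) (0 3)) *sketch-data*)
;; => (((0 1) (0 2) (0 3)) ((0 6) (0 2) (0 3)))
\end{verbatim}
\end{notate}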
1496
1497 \begin{notate}{Becoming flexible about a string's status}
1498 One possible upgrade would be to provide versions of these
1499 functions that will flexibly accept either a string or a
1500 ``placed string'' as input (since frequently we're
1501 interested in content of that sort; see
1502 \ref{importing-sketch}).
1503 \end{notate}
1504
1505 \subsection*{Finding places that satisfy some property}
1506
1507 \begin{notate}{On `get-places-subject-to-constraint'}
Like `get-places' (Note \ref{get-places}), but this
time takes an extra condition of the form (A B C)
where one of A, B, and C is `nil'.  We substitute each
candidate place for this `nil' in turn, and keep the
place if a triple matching the filled-in condition exists.
1513 \end{notate}
1514
1515 \begin{common}{queries.lisp}
1516 (defun get-places-subject-to-constraint (symbol condition)
1517   (let ((candidate-places (get-places symbol))
1518         accepted-places)
1519     (dolist (place candidate-places)
1520       (let ((filled-condition
1521              (map 'list (lambda (elt) (or elt
1522                                           `(1 ,place)))
1523                   condition)))
1524         (when (apply 'triple-relaxed-match
1525                      filled-condition)
1526           (setq accepted-places
1527                 (cons place accepted-places)))))
1528     accepted-places))
1529 \end{common}
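
\begin{notate}{Sketch: filling in the `nil' slot}
The substitution step can be looked at on its own.  In
this small sketch the name `sketch-fill' is hypothetical,
and I am assuming, as the code above does, that places are
given by their ids and have code 1.
\begin{verbatim}
(defun sketch-fill (condition place)
  "Replace the nil slot of CONDITION by the coordinates
of PLACE, a place id."
  (mapcar (lambda (elt) (or elt (list 1 place)))
          condition))

(sketch-fill '("Ender's Game" "has author" nil) 12)
;; => ("Ender's Game" "has author" (1 12))
\end{verbatim}
\end{notate}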
1530
1531 \subsection*{Logic}
1532
1533 \begin{notate}{Caution: compatibility with theories?}
1534 For the moment, I'm not sure how compatible this function
1535 is with the theories apparatus we've established, or with
1536 the somewhat vaguer notion of trans-theory questions or
1537 concerns.  Global queries should work just fine, but
1538 theory-local questions may need some work.  Before getting
1539 into compatibility of these questions with the theory
1540 apparatus, I want to make sure that apparatus is working
1541 properly.  Note that the questions here do rely on
1542 functions for graph-like thinking (Note
1543 \ref{graph-like-data} et seq.), and it would certainly
1544 make sense to port to ``subgraphs'' as represented by
1545 theories.
1546 \end{notate}
1547
1548 \begin{notate}{On `satisfy-conditions'} \label{satisfy-conditions}
1549 This function finds the items which match constraints.
1550 Constraints take the form (A B C), where precisely one of
1551 A, B, or C should be `nil', and any of the others can be
1552 either input suitable for `massage', or
1553 `t'.  The `nil' entry stands for the object we're
1554 interested in.  Any `t' entries are wildcards.
1555
1556 The first thing that happens as the function runs is that
1557 views are established exhibiting each group of triples
1558 satisfying each predicate.  The names of these views are
1559 then massaged into a large SQL query.  (It is important to
1560 ``typeset'' all of this correctly for our SQL `query'.)
1561 Finally, once that query has been run, we clean up,
1562 dropping all of the views we created.
1563 \end{notate}
1564
1565 \begin{common}{queries.lisp}
1566 (defun satisfy-conditions (constraints)
1567   (let* ((views (generate-views constraints))
1568          (formatted-list-of-views (format-views
1569                                    views))
1570          (where-condition (generate-where-condition
1571                            views
1572                            constraints))
1573          (ret
1574           ;; Let's see what the query is, first of all.
1575           (query
1576            (concatenate
1577             'string
1578             "select v1.id, v1.code1, v1.ref1, "
1579                           "v1.code2, v1.ref2, "
1580                           "v1.code3, v1.ref3 "
1581             "from "
1582             formatted-list-of-views
1583             "where "
1584             where-condition
1585             ";"))))
1586     (mapc (lambda (name) (drop-view name)) views)
1587     ret))
1588 \end{common}
1589
1590 \begin{notate}{Subroutines for `satisfy-conditions'}
1591 The functions below produce bits and pieces of the SQL
query that `satisfy-conditions' submits.  The point of
1593 `generate-views' is to create a series of views centered
1594 on the term(s) we're interested in (the `nil' slots in
1595 each submitted constraint).  With
1596 `generate-where-condition', we insist that all of these
1597 interesting terms should, in fact, be equal to one
1598 another.
1599 \end{notate}
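
\begin{notate}{Sketch: the shape of the `where' clause}
To make the goal concrete, here is a minimal sketch that
builds such a condition as a string.  It is not the
implementation used below.  The names `sketch-component'
and `sketch-where' are hypothetical, and I am assuming
that the component of interest is identified by the
position (1, 2 or 3) of the `nil' slot, to match the
column names code1, ref1, and so on.
\begin{verbatim}
(defun sketch-component (constraint)
  "Position of the nil slot in CONSTRAINT, as a string."
  (format nil "~a" (1+ (position nil constraint))))

(defun sketch-where (views constraints)
  "Equate the sought slot of the first view with the
sought slot of each later view."
  (let ((v1 (first views))
        (c1 (sketch-component (first constraints))))
    (format nil "~{~a~^ and ~}"
            (loop for v in (rest views)
                  for c in (mapcar #'sketch-component
                                   (rest constraints))
                  collect (format nil
                                  "(~a.code~a = ~a.code~a)"
                                  v1 c1 v c)
                  collect (format nil
                                  "(~a.ref~a = ~a.ref~a)"
                                  v1 c1 v c)))))

(sketch-where '("v1" "v2" "v3")
              '((nil "has author" "Arthur C. Clarke")
                (nil "is a" "book")
                (nil "has genre" "fiction")))
;; => "(v1.code1 = v2.code1) and (v1.ref1 = v2.ref1) and
;;     (v1.code1 = v3.code1) and (v1.ref1 = v3.ref1)"
;;    (a single line; wrapped here for display)
\end{verbatim}
With $n$ views this yields $n-1$ pairs of equalities, each
comparing a later view with the first one.
\end{notate}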
1600
1601 \begin{notate}{On `generate-views'}
In a `cond' form, for each constraint we must select the
appropriate function to generate the view; after the
`cond' form, we return the name of the view (for `mapcar'
to add to the list of views).
1606 \end{notate}
1607
1608 \begin{common}{queries.lisp}
1609 (defun generate-views (constraints)
1610   (let ((counter 0))
1611     (mapcar
1612      (lambda (constraint)
1613        (setq counter (1+ counter))
1614        (let ((viewname (format nil "v~a" counter)))
1615          (cond
1616           ;; A * ? or A ? *
1617           ((or (and (eq (second constraint) t)
1618                     (eq (third constraint) nil))
1619                (and (eq (second constraint) nil)
1620                     (eq (third constraint) t)))
1621            (triples-given-beginning
1622             (first constraint)
1623             viewname))
1624           ;; * B ? or ? B *
1625           ((or (and (eq (first constraint) t)
1626                     (eq (third constraint) nil))
1627                (and (eq (first constraint) nil)
1628                     (eq (third constraint) t)))
1629            (triples-given-middle
1630             (second constraint)
1631             viewname))
1632           ;; * ? C or ? * C
1633           ((or (and (eq (first constraint) t)
1634                     (eq (second constraint) nil))
1635                (and (eq (first constraint) nil)
1636                     (eq (second constraint) t)))
1637            (triples-given-end
1638             (third constraint)
1639             viewname))
1640           ;; ? B C
1641           ((eq (first constraint) nil)
1642            (triples-given-middle-and-end
1643             (second constraint)
1644             (third constraint)
1645             viewname))
          ;; A ? C: the beginning and the end are known
          ((eq (second constraint) nil)
           (triples-given-beginning-and-end
            (first constraint)
            (third constraint)
            viewname))
          ;; A B ?: the beginning and the middle are known
          ((eq (third constraint) nil)
           (triples-given-beginning-and-middle
            (first constraint)
            (second constraint)
            viewname)))
1658          viewname))
1659      constraints)))
1660
1661 (defun format-views (views)
1662   (let ((formatted-list-of-views ""))
1663     (mapc (lambda (view)
1664             (setq formatted-list-of-views
1665                   (concatenate
1666                    'string
1667                    formatted-list-of-views
1668                    (format nil "~a," view))))
1669           (butlast views))
1670     (setq formatted-list-of-views
1671           (concatenate
1672            'string
1673            formatted-list-of-views
1674            (format nil "~a " (car (last views)))))
1675     formatted-list-of-views))
1676
1677 (defun generate-where-condition (views conditions)
1678   (let ((where-condition "")
1679         (c (select-component (first conditions))))
1680     ;; there should be one less "=" condition than there
1681     ;; are things to compare; until we get to the last
1682     ;; view, everything is joined together by an `and'.
1683     ;; -- this needs to consider (map over) both `views'
1684     ;; and `conditions'.
    ;; the views between the first and the last get a
    ;; trailing "and"; the last view is handled below.
    (loop
     for i from 1 upto (- (length views) 2)
     do
     (let ((compi (select-component (nth i conditions)))
           (viewi (nth i views)))
       (setq
        where-condition
        (concatenate
         'string
         where-condition
         (concatenate
          'string
          "(v1.code" c " = " viewi ".code" compi ") and "
          "(v1.ref" c " = " viewi ".ref" compi ") and ")))))
    (let ((viewn (nth (1- (length views)) views))
          (compn (select-component
                    (nth (1- (length views)) conditions))))
1702       (setq
1703        where-condition
1704        (concatenate
1705         'string
1706         where-condition
1707         "(v1.code" c " = " viewn ".code" compn ") and "
1708         "(v1.ref" c " = " viewn ".ref" compn ")")))
1709     where-condition))
1710
1711 (defun select-component (condition)
1712   (cond ((eq (first condition) nil) "1")
1713         ((eq (second condition) nil) "2")
1714         ((eq (third condition) nil) "3")))
1715 \end{common}
1716
1717 \begin{common}{queries.lisp}
1718 (locally-disable-sql-reader-syntax)
1719 \end{common}
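
\begin{notate}{A sketch of the where-condition}
As a rough illustration (not part of the system), here is
the kind of string we would expect
`generate-where-condition' to build.  The sketch is written
in Emacs Lisp so that it can be tried without a database;
the names `sketch-select-component' and
`sketch-where-condition' are hypothetical, and the sketch
assumes that every view after the first is compared with
`v1' on the component where its condition has a nil.
\end{notate}

\begin{idea}
(require 'cl-lib)

(defun sketch-select-component (condition)
  "Return \"1\", \"2\" or \"3\": the first nil slot of CONDITION."
  (cond ((null (nth 0 condition)) "1")
        ((null (nth 1 condition)) "2")
        ((null (nth 2 condition)) "3")))

(defun sketch-where-condition (views conditions)
  "Join each view after the first to the first view."
  (let ((c (sketch-select-component (car conditions))))
    (mapconcat
     (lambda (pair)
       (let ((view (car pair))
             (comp (sketch-select-component (cdr pair))))
         (format
          "(v1.code%s = %s.code%s) and (v1.ref%s = %s.ref%s)"
          c view comp c view comp)))
     ;; pair each later view with its own condition
     (cl-mapcar #'cons (cdr views) (cdr conditions))
     " and ")))

(sketch-where-condition
 '("v1" "v2")
 '((nil "has author" "Arthur C. Clarke")
   (nil "has genre" "fiction")))
;; => "(v1.code1 = v2.code1) and (v1.ref1 = v2.ref1)"
\end{idea}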
1720
1721 \begin{notate}{Even more complicated logic}
1722 In order to conveniently manage complex queries, it would
1723 be nice if we could store the results of earlier queries
1724 into views, so that we can combine several such views for
1725 further processing.
1726 \end{notate}
1727
1728 \section{Emacs-side} \label{emacs-side}
1729
1730 \subsection{The interface to Common Lisp}
1731
1732 \begin{notate}{On `Defun'} \label{defun-interface}
1733 A way to define Elisp functions whose bodies are evaluated
1734 by Common Lisp.  Trust me, this is a good idea.  Besides,
it exhibits some fascinating backquote and comma tricks.
1736 But be careful: this definition of `Defun' did not work on
1737 Emacs version 21.
1738
If we want to be able to feed in a standard arglist to
Common Lisp (with optional elements and so forth), we'd
have to define how these arguments are handled here!
1742 \end{notate}
1743
1744 \begin{elisp}
1745 (defmacro Defun (name arglist &rest body)
1746   (declare (indent defun))
1747   `(defun ,name ,arglist
1748      (let* ((outbound-string
1749              (translate-emacs-syntax-to-common-syntax
1750               (format "%S"
1751                       (append
1752                        (list
1753                         (append (list 'lambda ',arglist)
1754                                 ',body))
1755                        (mapcar
1756                         (lambda (arg) `',arg)
1757                         (list
1758                          ,@(remove-if
1759                                  (lambda (testelt)
1760                                    (eq testelt
1761                                  '&optional))
1762                                  arglist)))))))
1763             (returned-string
1764              (second
1765               ;; we now specify the right package!
1766               (slime-eval
1767                (list 'swank:eval-and-grab-output
1768                      outbound-string)
1769                :arxana))))
1770        (process-slime-output returned-string))))
1771 \end{elisp}
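
\begin{notate}{What `Defun' sends to Common Lisp}
To make the backquote tricks above a little more concrete,
here is a sketch of the form that ends up in
`outbound-string'.  The helper `sketch-outbound-form' is
hypothetical and leaves out both SLIME and the syntax
translation; it only shows that the body is wrapped in a
`lambda' which is then applied to the quoted arguments.
\end{notate}

\begin{idea}
(defun sketch-outbound-form (arglist body args)
  "Build ((lambda ARGLIST . BODY) 'ARG1 'ARG2 ...)."
  (cons (append (list 'lambda arglist) body)
        (mapcar (lambda (arg) (list 'quote arg)) args)))

;; Roughly what (Defun add (a b) (+ a b)) would send when
;; called as (add 1 2); `print-quoted' is bound so that
;; quoted forms print with the ' shorthand.
(let ((print-quoted t))
  (format "%S" (sketch-outbound-form
                '(a b) '((+ a b)) '(1 2))))
;; => "((lambda (a b) (+ a b)) '1 '2)"
\end{idea}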
1772
1773 \begin{notate}{On `process-slime-output'}
1774 This should downcase all constituent symbols, but for
1775 expediency I'm just downcasing `NIL' at the moment.  Will
1776 come back for more testing and downcasing shortly.  (I
1777 suspect the general case is just about as easy as what
1778 happens here.)
1779 \end{notate}
1780
1781 \begin{elisp}
1782 (defun process-slime-output (str)
1783   (condition-case nil
      (let ((read-value (read str)))
        (if (symbolp read-value)
            (read (downcase str))
          (nsubst nil 'NIL read-value)))
1788     (error str)))
1789 \end{elisp}
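
\begin{notate}{Why `NIL' needs special treatment}
The Emacs reader is case-sensitive, so the symbol `NIL'
printed by Common Lisp is read as an ordinary symbol rather
than as the empty list.  The following sketch, which
assumes only that `cl-lib' is available, shows the effect
and the substitution that repairs it.
\end{notate}

\begin{idea}
(require 'cl-lib)

;; `NIL' is just another symbol to the Emacs reader
(null (read "NIL"))             ; => nil
(null (read (downcase "NIL")))  ; => t

;; inside a list we substitute instead of downcasing
(cl-nsubst nil 'NIL (read "(FOO NIL (BAR NIL))"))
;; => (FOO nil (BAR nil))
\end{idea}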
1790
1791 \begin{elisp}
1792 (defun translate-emacs-syntax-to-common-syntax (str)
1793   (with-temp-buffer
1794     (insert str)
1795     (dolist (swap '(("(\\` " "`")
1796                     ("(\\\, " ",")))
1797       (goto-char (point-min))
1798       (while (search-forward (first swap) nil t)
1799         (goto-char (match-beginning 0))
1800         (forward-sexp)
1801         (delete-char -1)
1802         (goto-char (match-beginning 0))
1803         (delete-region (match-beginning 0)
1804                        (match-end 0))
1805         (insert (second swap))))
1806     (buffer-substring-no-properties (point-min)
1807                                     (point-max))))
1808 \end{elisp}
1809
1810 \begin{notate}{Interactive `Defun'}
1811 Note, an improved version of this macro would allow me to
1812 specify that some Defuns are interactive and some are not.
1813 This could be done by examining the submitted body, and
1814 adjusting the defun if its car is an `interactive' form.
1815 Most of the Defuns will be things that people will want to
1816 use interactively, so making this change would probably be
a good idea.  What I'm doing in the meantime is just
writing two functions each time I need to make an
interactive function that accesses Common Lisp data!
1820 \end{notate}
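
\begin{notate}{Spotting an `interactive' form}
A minimal sketch of the check suggested above.  The helper
name `sketch-split-interactive' is hypothetical; a revised
`Defun' could use something like it to keep the
`interactive' form on the Emacs side and send only the rest
of the body to Common Lisp.
\end{notate}

\begin{idea}
(defun sketch-split-interactive (body)
  "Return (INTERACTIVE-FORM . REST-OF-BODY) for BODY.
INTERACTIVE-FORM is nil if BODY does not start with one."
  (if (eq (car-safe (car body)) 'interactive)
      (cons (car body) (cdr body))
    (cons nil body)))

(sketch-split-interactive
 '((interactive "sName: ") (get-article name)))
;; => ((interactive "sName: ") (get-article name))
;; i.e. the car is the interactive form, the cdr the rest

(sketch-split-interactive '((get-article name)))
;; => (nil (get-article name))
\end{idea}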
1821
1822 \begin{notate}{Common Lisp evaluation of code chunks}
1823 Another potentially beneficial and simple approach is to
1824 write a form like `progn' that evaluates its contents on
1825 Common Lisp.  This saves us from having to rewrite all of
1826 the `defun' facilities into `Defun' (e.g. interactivity).
1827 But... the problem with \emph{this} is that Common Lisp
1828 doesn't know the names of all the variables that are
1829 defined in Emacs!  I'm not sure how to get all of the
values of these variables substituted \emph{first}, before
1831 the call to Common Lisp is made.
1832 \end{notate}
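
\begin{notate}{Substituting Emacs values first}
One possible way around this, offered only as a sketch: ask
the caller to name the Emacs variables the code depends on,
and wrap the code in a `let' that binds them to their
current values before it is printed and sent.  The helper
`sketch-close-over' and the variable `sketch-depth' are
hypothetical, and `symbol-value' does not see lexical
bindings, so this only covers global (special) variables.
\end{notate}

\begin{idea}
(defvar sketch-depth 3
  "Stands in for some variable known only to Emacs.")

(defun sketch-close-over (vars body)
  "Wrap BODY in a `let' binding VARS to their Emacs values."
  `(let ,(mapcar (lambda (var)
                   (list var
                         (list 'quote (symbol-value var))))
                 vars)
     ,@body))

(sketch-close-over '(sketch-depth) '((* 2 sketch-depth)))
;; => (let ((sketch-depth '3)) (* 2 sketch-depth))
\end{idea}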
1833
1834 \begin{notate}{Debugging `Defun'}
1835 In order to make debugging go easier, it might be nice to
1836 have an option to make the code that is supposed to be
1837 evaluated by Defun actually \emph{print} on the REPL
1838 instead of being processed through an invisible back-end.
There could be a couple of different ways to do that: one
would be to simulate just what a user might do; the other
would be a happy medium between that and what we're doing
now, namely to put our computery auto-generated code on the
REPL and evaluate it.  (To some extent, I think the
1844 *slime-events* buffer captures this information, but it is
1845 not particularly easy to read.)
1846 \end{notate}
1847
1848 \begin{notate}{Interactive Common Lisp?}
1849 Suppose we set up some kind of interactive environment in
1850 Common Lisp; how would we go about passing this
1851 environment along to a user interacting via Emacs?  (Note
1852 that SLIME's presentation of the debugging loop is one
1853 good example.)
1854 \end{notate}
1855
1856 \subsection{Database interaction} \label{interaction}
1857
1858 \begin{notate}{The `article' function} \label{the-article-function}
1859 You can use this function to create an article with a
given name and contents.  If you like, you can also file it
under a heading.
1862 \end{notate}
1863
1864 \begin{elisp}
(Defun article (name contents &optional heading place)
  (let ((coordinates (add-triple name
                                 "has content"
                                 contents)))
    ;; file the new triple under HEADING, if one was given
    (when heading (add-triple coordinates "in" heading))
    ;; PLACE: a place id (a number), or any other non-nil
    ;; value to request a fresh place
    (when place (if (numberp place)
                    (put-in-place coordinates place)
                  (put-in-place coordinates)))
    coordinates))
1874 \end{elisp}
1875
1876 \begin{notate}{The `scholium' function} \label{the-scholium-function}
1877 You can use this function to link annotations to objects.
1878 As with the `article' function, you can optionally
categorize the connection under a given heading (cf. Note
1880 \ref{the-article-function}).
1881 \end{notate}
1882
1883 \begin{elisp}
(Defun scholium (beginning link end &optional heading place)
  (let ((coordinates (add-triple beginning
                                 link
                                 end)))
    ;; file the new triple under HEADING, if one was given
    (when heading (add-triple coordinates "in" heading))
    ;; PLACE: a place id (a number), or any other non-nil
    ;; value to request a fresh place
    (when place (if (numberp place)
                    (put-in-place coordinates place)
                  (put-in-place coordinates)))
    coordinates))
1893 \end{elisp}
1894
1895 \begin{notate}{Uses of coordinates}
1896 Note that, if desired, you can feed input of the form
1897 '(\meta{code} \meta{ref}) into `article' and `scholium'.
It's convenient to do any further processing of the object
we've created while we still have hold of the coordinates
returned by `add-triple' (cf. Note
\ref{import-code-continuations} for an example).
1902 \end{notate}
1903
1904 \begin{notate}{Finding all the members of a list by type?}
1905 We just narrow according to type.
1906 \end{notate}
1907
1908 \begin{notate}{On `get-article'} \label{get-article}
Get the contents of the article named `name'.  Optional
argument `heading' lets us find the place under the given
heading that holds the name, and use that place instead of
the name itself.

We do not yet deal well with the ambiguous case in which
several places corresponding to the given name appear
under the same heading.

Note also that out of the data returned by
`triples-given-beginning-and-middle', we should pick the
(hopefully just) ONE that corresponds to the given heading.
1921
1922 This means we need to pick over the list of triples
1923 returned here, and test each one to see if it is in our
1924 heading.  As to WHY there might be more than one ``has
1925 content'' for a place that we know to be in our
1926 heading... I'm not sure.  I guess we can go with the
1927 assumption that there is just one, for now.
1928 \end{notate}
1929
1930 \begin{elisp}
1931 (Defun get-article (name &optional heading)
1932   (let* ((place-pseudonyms
1933           (if heading
1934               (get-places-subject-to-constraint
1935                name `(nil "in" ,heading))
1936             (get-places name)))
1937          (goes-by (cond
1938                     ((eq (length place-pseudonyms) 1)
1939                      `(1 ,(car place-pseudonyms)))
1940                     ((triple-exact-match
1941                       name "in" heading)
1942                      name)
1943                     ((not heading) name)
1944                     (t nil))))
1945     (when goes-by
1946       ;; it might be nice to also return `goes-by'
1947       ;; so we can access the appropriate place again.
1948       (third (print-triple
1949               (resolve-ambiguity
1950                (triples-given-beginning-and-middle
1951                 goes-by "has content"))
1952               t)))))
1953 \end{elisp}
1954
1955 \begin{notate}{On `get-names'} \label{get-names}
This function simply gets the names of articles that have
content -- in other words, the beginning of every triple
built around the ``has content'' relation.
1959 \end{notate}
1960
1961 \begin{elisp}
1962 (Defun get-names (&optional heading)
1963   (let ((conditions (list (list nil "has content" t))))
1964     (when heading
1965       (setq conditions
1966             (append conditions
1967                     (list (list nil "in" heading)))))
1968     (mapcar
1969      (lambda (place-or-string)
1970        (cond
1971          ;; place case
1972          ((eq (first place-or-string) 1)
1973           (print-system-object
1974            (place-lookup (second place-or-string))))
1975          ;; string case
1976          ((eq (first place-or-string) 0)
1977           (print-system-object place-or-string))))
1978      (mapcar
1979       (lambda (triple)
1980         (isolate-beginning triple))
1981       (satisfy-conditions conditions)))))
1982 \end{elisp}
1983
1984 \begin{notate}{Contrasting cases} \label{contrasting-cases}
1985 Consider the difference between
1986 \begin{quote}
1987 (? ``has author'' ``Arthur C. Clarke'') \\
1988 (? ``has genre'' ``fiction'')
1989 \end{quote}
1990 and
1991 \begin{quote}
1992 (\emph{name} ``has content'' *) \\
1993 (\emph{name} ``in'' ``heading'')
1994 \end{quote}
1995 where, in the latter case, we know \emph{who} we're
1996 talking about, and we just want to limit the list of items
1997 generated by the ``*'' by the second condition.  This
1998 should help illustrate the difference between `get-names'
1999 (which is making a general query) and `get-article' (which
2000 already knows the name of a specific article), and the
2001 logic that they use.
2002 \end{notate}
2003
2004 \begin{notate}{Placing items from Emacs} \label{place-item}
2005 We periodically need to place items from within Emacs.
2006 The function `place-item' is a wrapper for `put-in-place'
2007 that makes this possible (it also provides the user with
2008 an extra option, namely to put the place itself under a
2009 given heading).
2010
2011 Notice that when the symbol is placed in some pre-existing
2012 place (which can only happen when `id' is not nil), that
2013 place may already be under some other heading.  We will ignore
2014 this case for now (since it seems that putting objects
2015 into \emph{new} places will be the preferred action), but
2016 later we will have to look at what to do in this other
2017 case.
2018 \end{notate}
2019
2020 \begin{elisp}
2021 (Defun place-item (symbol &optional id heading)
2022   (let ((coordinates (put-in-place symbol id)))
2023     (when heading (add-triple coordinates "in" heading))
2024     coordinates))
2025 \end{elisp}
2026
2027 \begin{notate}{Automatic classifications} \label{classifications}
2028 It will presumably make sense to offer increasingly
2029 ``automatic'' classifications for new objects.  At this
2030 point, we've set things up so that the user can optionally
2031 supply the name of \emph{one} heading that their new object
2032 is a part of.
2033
It may make more sense to allow an `\&rest headings'
argument, and add the triple to all of the specified
headings.  This would require modifying `Defun' to
2037 accommodate the `\&rest' idiom; see Note
2038 \ref{defun-interface}.
2039 \end{notate}
2040
2041 \begin{notate}{Postconditions and provenance}
2042 After adding something to the database, we may want to do
2043 something extra; perhaps generating provenance
2044 information, perhaps checking or enforcing database
2045 consistency, or perhaps running a hook that causes some
2046 update in the frontend (cf. Note \ref{provenance}).
2047 Provisions of this sort will come later, as will
2048 short-hand convenience functions for making particularly
2049 common complex entries.
2050 \end{notate}
2051
2052 \subsection{Importing \LaTeX\ documents} \label{importing}
2053
2054 \begin{notate}{Importing sketch} \label{importing-sketch}
2055 The code in this section imports a document as a
2056 collection of (sub-)sections and notes.  It gathers the
2057 sections, sub-sections, and notes recursively and records
2058 their content in a tree whose nodes are places (Note
2059 \ref{places}) and whose links express the ``component-of''
2060 relation described in Note \ref{order-of-order}.
2061
2062 This representation lets us see the geometric,
2063 hierarchical, structure of the document we've imported.
2064 It exemplifies a general principle, that geometric data
2065 should be represented by relationships between places, not
2066 direct relationships between strings.  This is because
2067 ``the same'' string often appears in ``different'' places
2068 in any given document (e.g. a paper's many sub-sections
2069 titled ``Introduction'' will not all have the same
2070 content).
2071
2072 What goes into the places is in some sense arbitrary.  The
2073 key is that whatever is \emph{in} or \emph{attached} to
2074 these places must tell us everything we need to know about
2075 the part of the document associated with that place
2076 (e.g. in the case of a note, its title and contents).
2077 That's over and above the \emph{structural} links which
2078 say how the places relate to one another.  Finally, all of
2079 these places and structural links will be added to a
2080 heading that represents the document as a whole.
2081
2082 A natural convention we'll use will be to put the name
2083 of any document component that's associated with a given
2084 place into that place, and add all other information as
2085 annotations.
2086 \end{notate}
2087
2088 \begin{notate}{Ordered versus unordered data} \label{ordered-vs-unordered}
2089 The code in this section is an example of one way to work
2090 with ordered data (i.e. \LaTeX\ documents are not just
2091 hierarchical, but the elements at each level of the
2092 hierarchy are also ordered).
2093
2094 Since \emph{many} artifacts are hierachical (e.g. Lisp
2095 code), we should try to be compatible with \emph{native}
2096 methods for working with order (in the case of Lisp, feed
2097 the code into a Lisp processor and use CDR and CAR, etc.).
2098
2099 We \emph{can} use triples such as (``rank'' ``1''
2100 ``Fred'') and (``rank'' ``2'' ``Barney'') to talk about
2101 order.  There may be some SQL techniques that would help.
2102 (FYI, order can be handled very explicitly in Elephant!)
2103
2104 In order to account for \emph{different} orderings, we
2105 need one more piece of data -- some explicit treatment of
where the order \emph{is}; in other words, headings.
2107 (This table illustrates the fact that a heading is not so
2108 different from ``an additional triple''; indeed, the only
2109 reason to make them different is to have the extra
2110 convenience of having their elements be numbered.)
2111
2112 \begin{center}
2113 \begin{tabular}{|lll|l|}
2114 \hline
2115 rank & 1 & Fred & Friday \\
2116 rank & 2 & Barney & Friday \\
2117 rank & 1 & Barney & Saturday \\
2118 rank & 2 & Fred & Saturday \\
2119 \hline
2120 \end{tabular}
2121 \end{center}
2122 \end{notate}
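
\begin{notate}{Reading an ordering out of the table}
As a small illustration of the table above, the following
sketch keeps the rows as plain lists and recovers the
ordering that holds under one heading.  The names
`sketch-rank-rows' and `sketch-ordering' are hypothetical,
and nothing here touches the database.
\end{notate}

\begin{idea}
(require 'cl-lib)

(defvar sketch-rank-rows
  '(("rank" 1 "Fred"   "Friday")
    ("rank" 2 "Barney" "Friday")
    ("rank" 1 "Barney" "Saturday")
    ("rank" 2 "Fred"   "Saturday"))
  "The rows of the table, one list per row.")

(defun sketch-ordering (heading rows)
  "Return the names ranked under HEADING, in rank order."
  (mapcar
   (lambda (row) (nth 2 row))
   ;; copy before sorting, since `sort' is destructive
   (sort (copy-sequence
          (cl-remove-if-not
           (lambda (row) (equal (nth 3 row) heading))
           rows))
         (lambda (a b) (< (nth 1 a) (nth 1 b))))))

(sketch-ordering "Friday" sketch-rank-rows)
;; => ("Fred" "Barney")
(sketch-ordering "Saturday" sketch-rank-rows)
;; => ("Barney" "Fred")
\end{idea}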
2123
2124 \begin{notate}{The order of order} \label{order-of-order}
2125 The triples (``rank'' ``1'' ``Fred'') and (``rank'' ``2''
2126 ``Barney'') mentioned in Note \ref{ordered-vs-unordered}
2127 are easy enough to read and understand; it might be more
2128 natural in some ways for us to say (``Fred'' ``rank''
2129 ``1'') -- Fred has rank 1.  In this section, we're
2130 concerned with talking about the ordered parts of a
2131 document, and ($A$ $n$ $B$) seems like an intuitive way to
2132 say ``$A$'s $n$th component is $B$''.
2133 \end{notate}
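
\begin{notate}{Numbering components}
A sketch of how the ($A$ $n$ $B$) convention plays out for
an ordered list of children.  The helper
`sketch-component-triples' is hypothetical, and the names in
the example are made up; the importing code in this section
records the same kind of link with `scholium'.
\end{notate}

\begin{idea}
(defun sketch-component-triples (parent children)
  "Return one (PARENT N CHILD) triple per element of CHILDREN."
  (let ((index 0))
    (mapcar (lambda (child)
              (setq index (1+ index))
              (list parent index child))
            children)))

(sketch-component-triples
 "Some document"
 '("First section" "Second section"))
;; => (("Some document" 1 "First section")
;;     ("Some document" 2 "Second section"))
\end{idea}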
2134
2135 \begin{notate}{It's not overdoing it, right?}
2136 When importing \emph{this} document, we see links like the
2137 following.  I hope that's not ``overdoing it''.  (Take a
2138 look at Note \ref{get-article} and Note \ref{get-names} to
2139 see how we go about getting information out of the
2140 database.)  We could get rid of one link if theories were
2141 database objects (cf. Note
2142 \ref{theories-as-database-objects}).
2143 \end{notate}
2144
2145 \begin{idea}
2146 "T557[P135|Web interface|.in.arxana.tex]"
2147 "T558[Future plans.9.P135|Web interface|]"
2148 "T559[T558[Future plans.9.P135|Web interface|].in.arxana.tex]"
2149 \end{idea}
2150
2151 \begin{notate}{Importing in general} \label{importing-generally}
2152 We will eventually have a collection of parsers to get
2153 various kinds of documents into the system in various
2154 different ways (Note \ref{parsing}).  For now, this
2155 section gives a simple way to get some sorts of
2156 \LaTeX\ documents into the system, namely documents
2157 structured along the same lines as the document you're
2158 reading now!
2159
2160 An interesting approach to parsing \emph{math} documents
2161 has been undertaken in the \LaTeX ML
2162 project.\footnote{{\tt http://dlmf.nist.gov/LaTeXML/}}
2163 Eventually it would be nice to get that level of detail
2164 here, too!  Emacsspeak is another example of a
2165 \LaTeX\ parser that deals with large-scale textual
2166 structures as well as smaller bits and
2167 pieces.\footnote{{\tt
2168     http://www.cs.cornell.edu/home/raman/aster/aster-thesis.ps}}
2169
2170 It would probably be useful to put together some parsers
2171 for HTML and wiki code soon.
2172 \end{notate}
2173
2174 \begin{notate}{On `import-buffer'}
2175 This function imports \LaTeX\ documents, taking care of
2176 the non-recursive aspects of this operation.  It imports
frontmatter (everything up to the first
\verb+\section+), but assumes ``backmatter'' is
2179 trivial, and does not import it.  The imported material is
2180 classified as a ``document'' with the same name as the
2181 imported buffer.
2182 \end{notate}
2183
2184 \begin{elisp}
2185 (defun import-buffer (&optional buffername)
2186   (save-excursion
2187     (set-buffer (get-buffer (or buffername
2188                                 (current-buffer))))
2189     (goto-char (point-min))
2190     (search-forward-regexp "\\\\begin{document}")
2191     (search-forward-regexp "\\\\section")
2192     (goto-char (match-beginning 0))
2193     ;; other links will be made in the "heading of this
2194     ;; document", but here we make a broader assertion.
2195     (scholium buffername "is a" "document")
2196     (scholium buffername
2197               "has frontmatter"
2198               (buffer-substring-no-properties
2199                (point-min)
2200                (point))
2201               buffername)
2202     ;;; These should maybe be scholia attached to
2203     ;; root-coords (below), but for some reason that
2204     ;; wasn't working so well -- investigate later --
2205     ;; maybe it just wasn't good to run after running
2206     ;; `import-within'.
2207     (let* ((root-coords (place-item buffername nil
2208                                     buffername))
2209            (levels
2210             '("section" "subsection" "subsubsection"))
2211            (current-parent buffername)
2212            (level-end nil)
2213            (sections (import-within levels))
2214            (index 0))
2215       (while sections
2216         (let ((coords (car sections)))
2217           (setq index (1+ index))
2218           (scholium root-coords
2219                     index
2220                     coords
2221                     buffername))
2222         (setq sections (cdr sections))))))
2223 \end{elisp}
2224
2225 \begin{notate}{On `import-within'}
2226 Recurse through levels of sectioning to import
2227 \LaTeX\ code.
2228
2229 It would be good if we could do something about sections
2230 that contain neither subsections nor notes (for example, a
2231 preface), or, more generally, about text that is not
2232 contained in any environment (possibly that appears before
2233 any section).  We'll save things like this for another
2234 editing round!
2235
2236 For the moment, we've decided to build the document
2237 hierarchy with links that are blind to whether the $k$th
2238 component of a section is a note or a subsection.
2239 Children that are notes are attached in the subroutine
2240 `import-notes' and those that are sections are attached in
2241 `import-within'.  Users can find out what type of object
2242 they are looking at based on whether or not it ``has
2243 content''.
2244
2245 Incidentally, when looking for the end of an importing
2246 level, `nil' is an OK result -- if this is the \emph{last}
2247 section at this level \emph{and} there is no subsequent
2248 section at a higher level.
2249 \end{notate}
2250
2251 \begin{elisp}
2252 (defun import-within (levels)
2253   (let ((this-level (car levels))
2254         (next-level (car (cdr levels))) answer)
2255     (while (re-search-forward
2256             (concat
2257              "^\\\\" this-level "{\\([^}\n]*\\)}"
2258              "\\( +\\\\label{\\)?"
2259              "\\([^}\n]*\\)?")
2260             level-end t)
2261       (let* ((name (match-string-no-properties 1))
2262              (at (place-item name nil buffername))
2263              (level-end
2264               (or (save-excursion
2265                     (search-forward-regexp
2266                      (concat "^\\\\" this-level "{.*")
2267                      level-end t))
2268                   level-end))
2269              (notes-end
2270               (if next-level
2271                   (or (progn (point)
2272                              (save-excursion
2273                                (search-forward-regexp
2274                                 (concat "^\\\\"
2275                                         next-level "{.*")
2276                                 level-end t)))
2277                       level-end)
2278                 level-end))
2279              (index (let ((current-parent at))
2280                       (import-notes notes-end)))
2281              (subsections (let ((current-parent at))
2282                             (import-within (cdr levels)))))
2283         (while subsections
2284           (let ((coords (car subsections)))
2285             (setq index (1+ index))
2286             (scholium at
2287                       index
2288                       coords
2289                       buffername)
2290             (setq subsections (cdr subsections))))
2291         (setq answer (cons at answer))))
2292     (reverse answer)))
2293 \end{elisp}
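
\begin{notate}{What the sectioning regexp captures}
The regular expression in `import-within' is easier to
follow with an example.  The sketch below applies the same
pattern to a single heading line, taken from this document,
using `string-match' in place of a buffer search.
\end{notate}

\begin{idea}
(let ((line (concat "\\subsection{Database interaction}"
                    " \\label{interaction}"))
      (regexp (concat
               "^\\\\" "subsection" "{\\([^}\n]*\\)}"
               "\\( +\\\\label{\\)?"
               "\\([^}\n]*\\)?")))
  (when (string-match regexp line)
    (list (match-string 1 line)     ; the section name
          (match-string 3 line))))  ; the label
;; => ("Database interaction" "interaction")
\end{idea}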
2294
2295 \begin{notate}{On `import-notes'} \label{import-notes}
2296 We're going to make the daring assumption that the
2297 ``textual'' portions of incoming \LaTeX\ documents are
2298 contained in ``Notes''.  That assumption is true, at
least, for the current document.  The function returns the
number of notes imported, so that
2301 `import-within' knows where to start counting this
2302 section's non-note children.
2303
2304 Would this same function work to import all notes from a
2305 buffer without examining its sectioning structure?  Not
2306 quite, but close! (Could be a fun exercise to fix this.)
2307 \end{notate}
2308
2309 \begin{elisp}
2310 (defun import-notes (end)
2311   (let ((index 0))
2312     (while (re-search-forward (concat "\\\\begin{notate}"
2313                                       "{\\([^}\n]*\\)}"
2314                                       "\\( +\\\\label{\\)?"
2315                                       "\\([^}\n]*\\)?")
2316                               end t)
2317       (let* ((name
2318               (match-string-no-properties 1))
2319              (tag (match-string-no-properties 3))
             (beg
              ;; `forward-line' rather than `next-line',
              ;; which is meant for interactive use
              (progn (forward-line 1)
                     (line-beginning-position)))
2323              (end
2324               (progn (search-forward-regexp
2325                       "\\\\end{notate}")
2326                      (match-beginning 0)))
2327              (coords (place-item name nil buffername)))
2328         (setq index (1+ index))
2329         (scholium current-parent
2330                   index
2331                   coords
2332                   buffername)
2333         ;; not in the heading
2334         (scholium coords
2335                   "has content"
2336                   (buffer-substring-no-properties
2337                    beg end))
2338         (import-code-continuations coords)))
2339     index))
2340 \end{elisp}
2341
2342 \begin{notate}{On `import-code-continuations'} \label{import-code-continuations}
2343 This runs within the scope of `import-notes', to turn the
2344 series of Lisp chunks or other code snippets that follow a
2345 given note into a scholium attached to that note.  Each
2346 separate snippet becomes its own annotation.
2347
2348 The ``conditional regexps'' used here only work with Emacs
2349 version 23 or higher.
2350
2351 I'm noticing a problem with the way the `looking-at'
2352 form behaves.  It matches the expression in question,
but then the match-end is reported as one character
less than it is supposed to be.  Maybe `looking-at' is
2355 just not as good as `re-search-forward'?  But it's
2356 what seems easiest to use.
2357 \end{notate}
2358
2359 \begin{elisp}
2360 (defun import-code-continuations (coords)
  (let ((possible-environments
         ;; `\\(?1: ... \\)' is an explicitly numbered group
         "\\(?1:lisp\\|idea\\|common\\)"))
2363     (while (looking-at
2364             (concat "\n*?\\\\begin{"
2365                     possible-environments
2366                     "}"))
2367       (let* ((beg (match-end 0))
2368              (environment (match-string 1))
2369              (end (progn (search-forward-regexp
2370                           (concat "\\\\end{"
2371                                   environment
2372                                   "}"))
2373                          (match-beginning 0)))
2374              (content (buffer-substring-no-properties
2375                        beg
2376                        end)))
2377         (scholium (scholium coords
2378                             "has attachment"
2379                             content)
2380                   "has type"
2381                   environment)))))
2382 \end{elisp}
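
\begin{notate}{Explicitly numbered groups}
The ``conditional regexps'' mentioned above are presumably
Emacs's explicitly numbered groups, written
\verb+\(?1: ... \)+: whatever such a group matches is
available as match data number 1, regardless of how many
other groups precede it.  A small sketch, using
`string-match' on a string rather than `looking-at' in a
buffer:
\end{notate}

\begin{idea}
(let ((text "\\begin{idea}")
      (regexp (concat "\\\\begin{"
                      "\\(?1:lisp\\|idea\\|common\\)"
                      "}")))
  (when (string-match regexp text)
    (list (match-string 1 text)   ; the environment name
          (match-end 0))))        ; just past the closing brace
;; => ("idea" 12)
\end{idea}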
2383
2384 \begin{notate}{On `autoimport-arxana'} \label{autoimport-arxana}
2385 This just calls `import-buffer', and imports this document
2386 into the system.
2387 \end{notate}
2388
2389 \begin{elisp}
2390 (defun autoimport-arxana ()
2391   (interactive)
2392   (import-buffer "arxana.tex"))
2393 \end{elisp}
2394
2395 \begin{notate}{Importing textual links}
2396 Of course, it would be good to import the links that users
2397 make between articles, since then we can quickly navigate
2398 from an article to the various articles that cite that
2399 article, as well as follow the usual forward-directional
2400 links.  Indeed, we should be able to browse each article
2401 within a ``neighborhood'' of other related articles.
2402 (We'll need to import labels as well, of course.)
2403 \end{notate}
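
\begin{notate}{Finding the links in a note}
A first step in this direction might look like the
following sketch, which collects the targets of the
\verb+\ref+ commands in a piece of text.  The helper
`sketch-collect-refs' is hypothetical and ignores every
other kind of link.
\end{notate}

\begin{idea}
(defun sketch-collect-refs (text)
  "Return the labels named by \\ref commands in TEXT, in order."
  (let ((start 0)
        (refs nil))
    (while (string-match "\\\\ref{\\([^}\n]*\\)}" text start)
      (push (match-string 1 text) refs)
      (setq start (match-end 0)))
    (nreverse refs)))

(sketch-collect-refs
 "cf. Note \\ref{get-article} and Note \\ref{get-names}")
;; => ("get-article" "get-names")
\end{idea}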
2404
2405 \subsection{Browsing database contents} \label{browsing}
2406
2407 \begin{notate}{Browsing sketch} \label{browsing-sketch}
2408 This section facilitates browsing of documents represented
2409 with structures like those created in Section
2410 \ref{importing}, and sets the ground for browsing other
2411 sorts of contents (e.g. collections of tasks, as in
2412 Section \ref{managing-tasks}).
2413
2414 In order to facilitate general browsing, it is not enough
2415 to simply use `get-article' (Note \ref{get-article}) and
2416 `get-names' (Note \ref{get-names}), although these
2417 functions provide our defaults.  We must provide the means
2418 to find and display different things differently -- for
2419 example, a section's  table of contents will typically
2420 be displayed differently from its actual contents.
2421
2422 Indeed, the ability to display and select elements of
2423 document sections (Note \ref{display-section}) is
2424 basically the core browsing deliverable.  In the process
2425 we develop a re-usable article selector (Note
2426 \ref{selector}; cf. Note \ref{browsing-tasks}).  This in
2427 turn relies on a flexible function for displaying
2428 different kinds of articles (Note \ref{display-article}).
2429 \end{notate}
2430
2431 \begin{notate}{On `display-article'} \label{display-article}
2432 This function takes in the name of the article to display.
2433 Furthermore, it takes optional arguments `retriever' and
2434 `formatter', which tell it how to look up and/or format
2435 the information for display, respectively.
2436
2437 Thus, either we make some statement up front (choosing our
2438 `formatter' based on what we already know about the
2439 article), or we decide what to display after making some
2440 investigation of information attached to the article, some
2441 of which may be retrieved and displayed (this requires
2442 that we specify a suitable `retriever' and a complementary
2443 `formatter').
2444
2445 For example, the major mode in which to display the
2446 article's contents could be stored as a scholium attached
2447 to the article; or we might maintain some information
2448 about ``areas'' of the database that would tell us up
2449 front which mode is associated with the current area.
2450 (The default is to simply insert the data with no markup
2451 whatsoever.)
2452
2453 Observe that this works when no heading argument is given,
2454 because in that case `get-article' looks for \emph{all}
2455 place pseudonyms.  (But of course that won't work well
2456 when we have multiple theories containing things with the
2457 same names, so we should get used to using the heading
2458 argument.)
2459
2460 (The business about requiring the data to be a sequence
2461 before engaging in further formatting is, of course, just
2462 a matter of expediency for making things work with the
2463 current dataset.)
2464 \end{notate}
2465
2466 \begin{elisp}
2467 (defun display-article
2468   (name &optional heading retriever formatter)
2469   (interactive "Mname: ")
2470   (let* ((data (if retriever
2471                    (funcall retriever name heading)
2472                  (get-article name heading))))
2473     (when (and data (sequencep data))
2474       (save-excursion
2475         (if formatter
2476             (funcall formatter data heading)
2477           (pop-to-buffer (get-buffer-create
2478                           "*Arxana Display*"))
2479           (delete-region (point-min) (point-max))
2480           (insert "NAME: " name "\n\n")
2481           (insert data)
2482           (goto-char (point-min)))))))
2483 \end{elisp}
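
\begin{notate}{Sketch: a custom formatter for `display-article'}
As a rough illustration of the `formatter' argument, here is a
minimal sketch of a formatter that shows the data in `text-mode'
instead of inserting it with no markup.  The name
`example-format-as-text' is hypothetical and not part of Arxana;
the only thing assumed from the code above is that
`display-article' calls its formatter with the data and the
heading, in that order.
\end{notate}

\begin{elisp}
;; Hypothetical formatter (not part of Arxana): display DATA
;; in `text-mode'.  HEADING is accepted but unused here.
(defun example-format-as-text (data heading)
  (pop-to-buffer (get-buffer-create "*Arxana Display*"))
  (erase-buffer)
  (insert data)
  (text-mode)
  (goto-char (point-min)))

;; Possible use, assuming an article with this name exists:
;; (display-article "Browsing sketch" nil nil
;;                  'example-format-as-text)
\end{elisp}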
2484
2485 \begin{notate}{An interactive article selector} \label{selector}
2486 The function `get-names' (Note \ref{get-names}) and
2487 similar functions can give us a collection of articles.
2488 The next few functions provide an interactive way of
2489 moving through this collection to find
2490 the article we want to look at.
2491
2492 We define a ``display style'' that the article selector
2493 uses to determine how to display various articles.  These
2494 display styles are specified by text properties attached
2495 to each option the selector provides.  Similarly, when
2496 we're working within a given heading, the relevant heading
2497 is also specified as a text property.
2498
2499 At selection time, these text properties are checked to
2500 determine which information to pass along to
2501 `display-article'.
2502 \end{notate}
2503
2504 \begin{elisp}
2505 (defvar display-style '((nil . (nil nil))))
2506
2507 (defun thing-name-at-point ()
2508   (buffer-substring-no-properties
2509    (line-beginning-position)
2510    (line-end-position)))
2511
2512 (defun get-display-type ()
2513   (get-text-property (line-beginning-position)
2514                      'arxana-display-type))
2515
2516 (defun get-relevant-heading ()
2517   (get-text-property (line-beginning-position)
2518                      'arxana-relevant-heading))
2519
2520 (defun arxana-list-select ()
2521   (interactive)
2522   (apply 'display-article
2523          (thing-name-at-point)
2524          (get-relevant-heading)
2525          (cdr (assoc (get-display-type)
2526                      display-style))))
2527
2528 (define-derived-mode arxana-list-mode fundamental-mode
2529   "arxana-list" "Arxana List Mode.
2530
2531 \\{arxana-list-mode-map}")
2532
2533 (define-key arxana-list-mode-map (kbd "RET")
2534             'arxana-list-select)
2535 \end{elisp}
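
\begin{notate}{Sketch: how a display style is looked up}
To make the selection step concrete, here is a small
self-contained sketch of the lookup that `arxana-list-select'
performs.  The function `example-style-arguments' and its
`styles' parameter are hypothetical; in the code above the alist
is the global `display-style'.
\end{notate}

\begin{elisp}
;; Hypothetical helper: return the (RETRIEVER FORMATTER) list
;; that STYLES, an alist shaped like `display-style', gives
;; for TYPE.
(defun example-style-arguments (type styles)
  (cdr (assoc type styles)))

;; With only the default entry, a line that carries no
;; `arxana-display-type' property has type nil, so
;;   (example-style-arguments nil '((nil . (nil nil))))
;; gives (nil nil), and `display-article' falls back on its
;; default retriever and formatter.
\end{elisp}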
2536
2537 \begin{notate}{On `pick-a-name'} \label{pick-a-name}
2538 Here `generate' is the name of a function to call to
2539 generate a list of items to display, and `format' is a
2540 function to put these items (including any mark-up) into
2541 the buffer from which individual items can then be
2542 selected.
2543
2544 One simple way to get a list of names to display would be
2545 to reuse a list that we had already produced (this would
2546 save querying the database each time).  We could, in fact,
2547 store a history list of lists of names that had been
2548 displayed previously (cf. Note \ref{local-storage}).
2549
2550 We'll eventually want versions of `generate' that provide
2551 various useful views into the data, e.g., listing all of
2552 the elements of a given section (Note
2553 \ref{display-section}).
2554
2555 Finding all the elements that match a given search term,
2556 whether that's just normal text search or some kind of
2557 structured search, would be worthwhile too.  Upgrading the
2558 display to e.g. color-code listed elements according to
2559 their type would be another nice feature to add.
2560 \end{notate}
2561
2562 \begin{elisp}
2563 (defun pick-a-name (&optional generate format heading)
2564   (interactive)
2565   (let ((items (if generate
2566                    (funcall generate)
2567                  (get-names heading))))
2568     (when items
2569       (set-buffer (get-buffer-create "*Arxana Articles*"))
2570       (setq buffer-read-only nil)
2571       (delete-region (point-min)
2572                      (point-max))
2573       (if format
2574           (funcall format items)
2575         (mapc (lambda (item) (insert item "\n")) items))
2576       (setq buffer-read-only t)
2577       (arxana-list-mode)
2578       (goto-char (point-min))
2579       (pop-to-buffer (get-buffer "*Arxana Articles*")))))
2580 \end{elisp}
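
\begin{notate}{Sketch: reusing a list of names}
The idea of reusing a list we have already produced could look
roughly like the following.  This is only a sketch: the names
`example-name-history', `example-remember-names' and
`example-last-names' are hypothetical, and nothing here is wired
into `pick-a-name' yet.
\end{notate}

\begin{elisp}
;; Hypothetical history of the lists of names generated so
;; far, newest first.
(defvar example-name-history nil)

;; Call GENERATE (a function of no arguments), remember a
;; non-empty result, and return it.
(defun example-remember-names (generate)
  (let ((items (funcall generate)))
    (when items
      (push items example-name-history))
    items))

;; A generator that replays the newest remembered list
;; without querying the database.
(defun example-last-names ()
  (car example-name-history))

;; Possible use:
;; (example-remember-names (lambda () (get-names nil)))
;; (pick-a-name 'example-last-names)
\end{elisp}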
2581
2582 \begin{notate}{On `display-section'} \label{display-section}
2583 When browsing a document, if you select a section, you
2584 should display a list of that section's constituent
2585 elements, be they notes or subsections.  The question
2586 comes up: when you go to display something, how do you
2587 know whether you're looking at the name of a section, or
2588 the name of an article?
2589
2590 When you get the section's contents out of the database (Note
2591 \ref{get-section-contents}), each item comes with an indication of its ``noteness'', which settles this question.
2592 \end{notate}
2593
2594 \begin{elisp}
2595 (defun display-section (name &optional heading)
2596   (interactive (list (read-string
2597                       (concat
2598                        "name (default "
2599                        (buffer-name) "): ")
2600                       nil nil (buffer-name))))
2601   ;; should this pop to the Articles window?
2602   (pick-a-name `(lambda ()
2603                   (get-section-contents
2604                    ,name ,heading))
2605                `(lambda (items)
2606                   (format-section-contents
2607                    items ,heading))))
2608
2609 (add-to-list 'display-style
2610              '(section . (display-section
2611                           nil)))
2612 \end{elisp}
2613
2614 \begin{notate}{On `get-section-contents'} \label{get-section-contents}
2615 Sent by `display-section' (Note \ref{display-section})
2616 to `pick-a-name' as a generator for the table of contents
2617 of the section with the given name in the given heading.
2618
2619 This function first finds the triples that begin with the
2620 (placed) name of the section, then checks to see which of
2621 these are in the heading of the document we're examining
2622 (in other words, which of these links represent structural
2623 information about that document).  It also looks at the
2624 items found at the end of these links to see if they are
2625 sections or notes (``noteness'' is determined by them
2626 having content).  The links are then sorted by their
2627 middles (which show the order that these components
2628 have in the section we're examining).  After this ordering
2629 information has been used for sorting, it is deleted, and
2630 we're left with just a list of names in the appropriate
2631 order together with an indication of their noteness.
2632 \end{notate}
2633
2634 \begin{elisp}
2635 (Defun get-section-contents (name heading)
2636   (let (contents)
2637     (dolist (triple (triples-given-beginning
2638                      `(1 ,(resolve-ambiguity
2639                            (get-places name)))))
2640       (when (triple-exact-match
2641              `(2 ,(car triple)) "in" heading)
2642         (let* ((number (print-middle triple))
2643                (site (isolate-end triple))
2644                (noteness
2645                 (when (triples-given-beginning-and-middle
2646                        site "has content")
2647                   t)))
2648           (setq contents
2649                 (cons (list number
2650                             (print-system-object
2651                              (place-contents site))
2652                             noteness)
2653                       contents)))))
2654     (mapcar 'cdr
2655             (sort contents
2656                   (lambda (component1 component2)
2657                     (< (parse-integer (car component1))
2658                        (parse-integer (car component2))))))))
2659 \end{elisp}
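
\begin{notate}{Sketch: the sorting step in Emacs Lisp}
The final step of `get-section-contents' uses `parse-integer',
which is a Common Lisp function.  For comparison, here is a
sketch of the same sort-then-strip step in plain Emacs Lisp,
with `string-to-number' standing in for `parse-integer'.  The
name `example-order-contents' is hypothetical, and I am assuming
that the ordinals arrive as strings.
\end{notate}

\begin{elisp}
;; Hypothetical helper: CONTENTS is a list of
;; (NUMBER NAME NOTENESS) with NUMBER a string.  Sort by
;; NUMBER, then drop it, leaving (NAME NOTENESS) pairs.
(defun example-order-contents (contents)
  (mapcar #'cdr
          (sort (copy-sequence contents)
                (lambda (a b)
                  (< (string-to-number (car a))
                     (string-to-number (car b)))))))

;; (example-order-contents
;;  '(("2" "B" t) ("10" "C" nil) ("1" "A" t)))
;; gives (("A" t) ("B" t) ("C" nil)); the comparison is
;; numeric, so "10" sorts after "2".
\end{elisp}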
2660
2661 \begin{notate}{On `format-section-contents'} \label{format-section-contents}
2662 A formatter for document contents, used by
2663 `display-section' (Note \ref{display-section}) as input
2664 for `pick-a-name' (Note \ref{pick-a-name}).
2665
2666 Instead of just printing the items one by one,
2667 like the default formatter in `pick-a-name' does,
2668 this version adds appropriate text properties, which
2669 we determine based on the second component of
2670 each of the `items' to format.
2671 \end{notate}
2672
2673 \begin{elisp}
2674 (defun format-section-contents (items heading)
2675   ;; just replicating the default and building on that.
2676   (mapc (lambda (item)
2677           (insert (car item))
2678           (let* ((beg (line-beginning-position))
2679                  (end (1+ beg)))
2680             (unless (second item)
2681               (put-text-property beg end
2682                                  'arxana-display-type
2683                                  'section))
2684             (put-text-property beg end
2685                                'arxana-relevant-heading
2686                                heading))
2687           (insert "\n"))
2688         items))
2689 \end{elisp}
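
\begin{notate}{Sketch: reading the properties back}
The following self-contained sketch shows the round trip that
`format-section-contents' and `get-display-type' rely on: a
property placed on the first character of a line can be read
back from that line's beginning.  The function name is
hypothetical, and a temporary buffer stands in for the
selector's buffer.
\end{notate}

\begin{elisp}
;; Hypothetical demonstration: mark one line as a section,
;; then read the mark back the way `get-display-type' does.
(defun example-display-type-round-trip ()
  (with-temp-buffer
    (insert "Some section")
    (let ((beg (line-beginning-position)))
      (put-text-property beg (1+ beg)
                         'arxana-display-type 'section))
    (insert "\n")
    (goto-char (point-min))
    (get-text-property (line-beginning-position)
                       'arxana-display-type)))
\end{elisp}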
2690
2691 \begin{notate}{On `display-document'} \label{display-document}
2692 When browsing a document, you should first display its
2693 top-level table of contents.  (Most typically, a list of
2694 all of that document's major sections.)  In order to do
2695 this, we must find the triples that begin at the node
2696 representing this document \emph{and} that are in the
2697 heading of this document.  This boils down to treating the
2698 document's root as if it were a section and using the
2699 function `display-section' (Note \ref{display-section}).
2700 \end{notate}
2701
2702 \begin{elisp}
2703 (defun display-document (name)
2704   (interactive (list (read-string
2705                       (concat
2706                        "name (default "
2707                        (buffer-name) "): ")
2708                       nil nil (buffer-name))))
2709   (display-section name name))
2710 \end{elisp}
2711
2712 \begin{notate}{Work with `heading' argument}
2713 We should make sure that if we know the heading we're
2714 working with (e.g. the name of the document we're
2715 browsing) that this information gets communicated in the
2716 background of the user interaction with the article
2717 selector.
2718 \end{notate}
2719
2720 \begin{notate}{Selecting from a hierarchical display} \label{hierarchical-display}
2721 A fancier ``article selector'' would be able to display
2722 several sections with nice indenting to show their
2723 hierarchical order.
2724 \end{notate}
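
\begin{notate}{Sketch: an indented listing}
One way such a hierarchical display might be printed is
sketched below.  The function `example-insert-outline' is
hypothetical, and so is its input format: a list of nodes, each
a cons of a name and a list of child nodes.  Nothing in the
database code above produces this shape yet.
\end{notate}

\begin{elisp}
;; Hypothetical: insert TREE, a list of (NAME . CHILDREN)
;; nodes, at point, indenting two spaces per level.
(defun example-insert-outline (tree &optional depth)
  (let ((depth (or depth 0)))
    (dolist (node tree)
      (insert (make-string (* 2 depth) ?\s)
              (car node) "\n")
      (example-insert-outline (cdr node) (1+ depth)))))

;; Possible use, in a scratch buffer:
;; (example-insert-outline
;;  '(("Importing")
;;    ("Browsing" ("display-article") ("selector"))))
\end{elisp}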
2725
2726 \begin{notate}{Browser history tricks} \label{history-tricks}
2727 I want to put together (or put back together) something
2728 similar to the multihistoried browser that I had going in
2729 the previous version of Arxana and my Emacs/Lynx-based web
2730 browser, Nero\footnote{{\tt http://metameso.org/\textasciitilde{}joe/nero.el}}.
2731 The basic features are:
2732 (1) forward, back, and up inside the structure of a given
2733 document; (2) switch between tabs.  More advanced features
2734 might include: (3) forward and back globally across all
2735 tabs; (4) explicit understanding of paths that loop.
2736
2737 These sorts of features are independent of the exact
2738 details of what's printed to the screen each time
2739 something is displayed.  So, for instance, you could flip
2740 between section manifests a la Note \ref{display-section},
2741 or between hierarchical displays a la Note
2742 \ref{hierarchical-display}, or some combination; the key
2743 thing is just to keep track in some sensible way of
2744 whatever's been displayed!
2745 \end{notate}
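
\begin{notate}{Sketch: back and forward in one tab}
Here is a minimal sketch of feature (1), restricted to back and
forward within a single tab, using two stacks.  All of the names
below are hypothetical.  The history records opaque items, so it
does not care whether an item stands for a section manifest or a
hierarchical display.
\end{notate}

\begin{elisp}
;; Hypothetical state: the item on top of
;; `example-history-back' is the one currently displayed.
(defvar example-history-back nil)
(defvar example-history-forward nil)

;; Record ITEM as the current one; visiting something new
;; discards the forward stack.
(defun example-history-visit (item)
  (push item example-history-back)
  (setq example-history-forward nil)
  item)

;; Step back; return the new current item, or nil when
;; there is nothing earlier.
(defun example-history-go-back ()
  (when (cdr example-history-back)
    (push (pop example-history-back)
          example-history-forward)
    (car example-history-back)))

;; Step forward; return the new current item, or nil when
;; there is nothing later.
(defun example-history-go-forward ()
  (when example-history-forward
    (push (pop example-history-forward)
          example-history-back)
    (car example-history-back)))
\end{elisp}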
2746
2747 \begin{notate}{Local storage for browsing purposes} \label{local-storage}
2748 Right now, in order to browse the contents of the
2749 database, you need to query the database every time.  It
2750 might be handy to offer the option to cache names of
2751 things locally, and only sync with the database from time
2752 to time.  Indeed, the same principle could apply in
2753 various places; however, it may also be somewhat
2754 complicated to set up.  Using two systems for storage, one
2755 local and one permanent, is certainly more heavy-duty than
2756 just using one permanent storage system and the local
2757 temporary display.  However, one thing in favor of local
2758 storage systems is that that's what I used in the
2759 previous prototype of Arxana -- so some code already
2760 exists for local storage!  (Caching the list of
2761 \emph{names} we just made a selection from would be one
2762 simple expedient, see Note \ref{pick-a-name}.)
2763 \end{notate}
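
\begin{notate}{Sketch: a per-heading cache of names}
A very small version of the local cache might look like this.
It is a sketch under stated assumptions: the variable
`example-names-cache' and the function `example-cached-names'
are hypothetical, and `fetch' is assumed to be a function of one
argument, the heading, that returns a list of names.
\end{notate}

\begin{elisp}
;; Hypothetical cache mapping a heading to its list of names.
(defvar example-names-cache
  (make-hash-table :test 'equal))

;; Return the names for HEADING.  FETCH is called only on a
;; cache miss, or when REFRESH is non-nil (a manual ``sync'').
(defun example-cached-names (heading fetch &optional refresh)
  (let ((hit (gethash heading example-names-cache 'miss)))
    (if (or refresh (eq hit 'miss))
        (puthash heading (funcall fetch heading)
                 example-names-cache)
      hit)))

;; Possible use:
;; (example-cached-names "arxana.tex" 'get-names)
\end{elisp}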
2764
2765 \begin{notate}{Hang onto absolute references}
2766 Since `get-article' (Note \ref{get-article}) translates
2767 strings into their ``place pseudonyms'', we may want to
2768 hang onto those pseudonyms, because they are, in fact, the
2769 absolute references to the objects we end up working with.
2770 In particular, they should probably go into the
2771 text-property background of the article selector, so it
2772 will know right away what to select!
2773 \end{notate}
2774
2775 \subsection{Exporting \LaTeX\ documents$^*$}
2776
2777 \begin{notate}{Roundtripping}
2778 The easiest test is: can we import a document into the
2779 system and then export it again, and find it unchanged?
2780 \end{notate}
2781
2782 \begin{notate}{Data format}
2783 We should be able to \emph{stably} import and export a
2784 document, as well as export any modifications to the
2785 document that were generated within Arxana.  This means
2786 that the exporting functions will have to read the data
2787 format that the importing functions use, \emph{and} that
2788 any functions that edit document contents (or structure)
2789 will also have to use the same format.  Furthermore,
2790 \emph{browsing} functions will have to be somewhat aware
2791 of this format.  So, this is a good time to ask -- did we
2792 use a good format?
2793 \end{notate}
2794
2795 \subsection{Editing database contents$^*$} \label{editing}
2796
2797 \begin{notate}{Roundtripping, with changes}
2798 Here, we should import a document into the system and then
2799 make some simple changes, and after exporting, check with
2800 diff to make sure the changes are correct.
2801 \end{notate}
2802
2803 \begin{notate}{Re-importing}
2804 One nice feature would be a function to ``re-import'' a
2805 document that has changed outside of the system, and make
2806 changes in the system's version wherever changes appeared
2807 in the source version.
2808 \end{notate}
2809
2810 \begin{notate}{Editing document structure}
2811 The way we have things set up currently, it is one thing
2812 to make a change to a document's textual components, and
2813 another to change its structure.  Both types of changes
2814 must, of course, be supported.
2815 \end{notate}
2816
2817 \section{Applications}
2818
2819 \subsection{Managing tasks} \label{managing-tasks}
2820
2821 \begin{notate}{What are tasks?}
2822 Each task tends to have a \emph{name}, a
2823 \emph{description}, a collection of \emph{prerequisite
2824   tasks}, a description of other \emph{material
2825   dependencies}, a \emph{status}, some \emph{justification
2826   of that status}, a \emph{creation date}, and an
2827 \emph{estimated time of completion}.  There might actually
2828 be several ``estimated times of completion'', since the
2829 estimate would tend to improve over time.  To really
2830 understand a task, one should keep track of revisions like
2831 this.
2832 \end{notate}
2833
2834 \begin{notate}{On `store-task-data'} \label{store-task-data}
2835 Here, we're just filling in a frame.  Since ``filling in a
2836 frame'' seems like the sort of operation that might happen
2837 over and over again in different contexts, to save space,
2838 it would probably be nice to have a macro (or similar)
2839 that would do a more general version of what this function
2840 does.
2841 \end{notate}
2842
2843 \begin{elisp}
2844 (Defun store-task-data
2845   (name description prereqs materials status
2846         justification submitted eta)
2847   (add-triple name "is a" "task")
2848   (add-triple name "description" description)
2849   (add-triple name "prereqs" prereqs)
2850   (add-triple name "materials" materials)
2851   (add-triple name "status" status)
2852   (add-triple name "status justification" justification)
2853   (add-triple name "date submitted" submitted)
2854   (add-triple name "estimated time of completion" eta))
2855 \end{elisp}
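
\begin{notate}{Sketch: a general frame filler}
The more general version mentioned above could be an ordinary
function rather than a macro.  The sketch below is hypothetical
throughout: `example-fill-frame' is not part of Arxana, and it
takes the storing function as an argument, on the assumption
that something like `add-triple' can be called with a beginning,
a middle and an end.
\end{notate}

\begin{elisp}
;; Hypothetical helper: record that NAME ``is a'' TYPE, then
;; store each (SLOT . FILLER) pair in SLOTS.  STORE is called
;; as (STORE beginning middle end).
(defun example-fill-frame (name type slots store)
  (funcall store name "is a" type)
  (dolist (slot slots)
    (funcall store name (car slot) (cdr slot))))

;; Possible use, assuming `add-triple' accepts these
;; arguments when called through `funcall':
;; (example-fill-frame "Write exporter" "task"
;;                     '(("status" . "tabled"))
;;                     'add-triple)
\end{elisp}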
2856
2857 \begin{notate}{On `generate-task-data'} \label{generate-task-data}
2858 This is a simple function to create a new task matching
2859 the description above.
2860 \end{notate}
2861
2862 \begin{elisp}
2863 (defun generate-task-data ()
2864   (interactive)
2865   (let ((name (read-string "Name: "))
2866         (description (read-string "Description: "))
2867         (prereqs (read-string
2868                   "Task(s) this task depends on: "))
2869         (materials (read-string "Material dependencies: "))
2870         (status (completing-read
2871                  "Status (tabled, in progress, completed): "
2872                  '("tabled" "in progress" "completed")))
2873         (justification (read-string "Why this status? "))
2874         (submitted
2875          (read-string
2876           (concat "Date submitted (default "
2877                   (substring (current-time-string) 0 10)
2878                   "): ")
2879           nil nil (substring (current-time-string) 0 10)))
2880         (eta
2881          (read-string "Estimated date of completion: ")))
2882     (store-task-data name description prereqs materials
2883                      status
2884                      justification submitted eta)))
2885 \end{elisp}
2886
2887 \begin{notate}{Possible enhancements to `generate-task-data'}
2888 In order to make this function very nice, it would be good
2889 to allow ``completing read'' over known tasks when filling
2890 in the prerequisites.  Indeed, it might be especially nice
2891 to offer a type of completing read that is similar in some
2892 sense to the tab-completion you get when completing a file
2893 name, i.e., quickly completing certain sub-strings of the
2894 final string (in this case, these substrings would
2895 correspond to task areas we are progressively zooming down
2896 into).
2897
2898 As for the task description, rather than forcing the user
2899 to type the description into the minibuffer, it might be
2900 nice to pop up a separate buffer instead (a la the
2901 Emacs/w3m textarea).  If we had a list of all the known
2902 tasks, we could offer completing-read over the names of
2903 existing tasks to generate the list of `prereqs'.  It
2904 might be nice to systematize date data, so we could more
2905 easily e.g. sort and display task info ``by date''.
2906 (Perhaps we should be working with predefined database
2907 types for dates and so on; but see Note
2908 \ref{choice-of-database}.)
2909
2910 Also, before storing the task, it might be nice to offer
2911 the user the chance to review the data they entered.
2912 \end{notate}
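
\begin{notate}{Sketch: completing prerequisites, sortable dates}
Two of these enhancements can be sketched with functions that
ship with Emacs.  The helper names are hypothetical.  The first
uses `completing-read-multiple' to read several task names with
completion; the second uses `format-time-string' to produce
dates in a form that sorts correctly as plain strings.
\end{notate}

\begin{elisp}
;; Hypothetical: read comma-separated prerequisite names,
;; completing over KNOWN-TASKS (a list of strings).  Returns
;; a list of strings.
(defun example-read-prereqs (known-tasks)
  (completing-read-multiple
   "Task(s) this task depends on: " known-tasks))

;; Hypothetical: TIME (default: now) as YYYY-MM-DD.
(defun example-iso-date (&optional time)
  (format-time-string "%Y-%m-%d" time))

;; Possible use:
;; (example-read-prereqs (get-tasks))
;; (read-string "Date submitted: " (example-iso-date))
\end{elisp}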
2913
2914 \begin{notate}{On `get-filler'} \label{get-filler}
2915 Just a wrapper for `triples-given-beginning-and-middle'.
2916 (Maybe add `heading' as an option here.)
2917 \end{notate}
2918
2919 \begin{elisp}
2920 (Defun get-filler (frame slot)
2921   (third (first
2922           (print-triples
2923            (triples-given-beginning-and-middle frame
2924                                                slot)))))
2925 \end{elisp}
2926
2927 \begin{notate}{On `get-task'} \label{get-task}
2928 Uses `get-filler' (Note \ref{get-filler}) to assemble the
2929 elements of a task's frame.
2930 \end{notate}
2931
2932 \begin{elisp}
2933 (Defun get-task (name)
2934   (when (triple-exact-match name "is a" "task")
2935     (list (get-filler name "description")
2936           (get-filler name "prereqs")
2937           (get-filler name "materials")
2938           (get-filler name "status")
2939           (get-filler name "status justification")
2940           (get-filler name "date submitted")
2941           (get-filler name
2942                       "estimated time of completion"))))
2943 \end{elisp}
2944
2945 \begin{notate}{On `review-task'} \label{review-task}
2946 This is a function to review a task by name.
2947 \end{notate}
2948
2949 \begin{elisp}
2950 (defun review-task (name)
2951   (interactive "MName: ")
2952   (let ((task-data (get-task name)))
2953     (if task-data
2954         (display-task task-data)
2955       (message "No data."))))
2956
2957 (defun display-task (data &optional heading)
2958   (save-excursion
2959     (pop-to-buffer (get-buffer-create
2960                     "*Arxana Display*"))
2961     (delete-region (point-min) (point-max))
2962     (insert "NAME: " name "\n\n") ; `name' is bound by the caller
2963     (insert "DESCRIPTION: " (first data) "\n\n")
2964     (insert "TASKS THIS TASK DEPENDS ON: "
2965             (second data) "\n\n")
2966     (insert "MATERIAL DEPENDENCIES: "
2967             (third data) "\n\n")
2968     (insert "STATUS: " (fourth data) "\n\n")
2969     (insert "WHY THIS STATUS?: " (fifth data) "\n\n")
2970     (insert "DATE SUBMITTED: " (sixth data) "\n\n")
2971     (insert "ESTIMATED TIME OF COMPLETION: "
2972             (seventh data) "\n\n")
2973     (goto-char (point-min))
2974     (fill-individual-paragraphs (point-min) (point-max))))
2975 \end{elisp}
2976
2977 \begin{notate}{Possible enhancements to `review-task'}
2978 Breaking this down into a function to select the task and
2979 another function to display the task would be nice.  Maybe
2980 we should have a generic function for selecting any object
2981 ``by name'', and then special-purpose functions for
2982 displaying objects with different properties.
2983
2984 Using text properties, we could set up a ``field-editing
2985 mode'' that would enable you to select a particular field
2986 and edit it independently of the others.  Another more
2987 complex editing mode would \emph{know} which fields the
2988 user had edited, and would store all edits back to the
2989 database properly.  See Section \ref{editing} for more on
2990 editing.
2991 \end{notate}
2992
2993 \begin{notate}{Browsing tasks} \label{browsing-tasks}
2994 The function `pick-a-name' (Note \ref{pick-a-name}) takes
2995 two functions, one that finds the names to choose from,
2996 and the other that says how to present these names.  We
2997 can therefore build `pick-a-task' on top of `pick-a-name'.
2998 \end{notate}
2999
3000 \begin{elisp}
3001 (Defun get-tasks ()
3002   (mapcar #'first
3003           (print-triples
3004            (triples-given-middle-and-end "is a" "task")
3005            t)))
3006
3007 (defun pick-a-task ()
3008   (interactive)
3009   (pick-a-name
3010    'get-tasks
3011    (lambda (items)
3012      (mapc (lambda (item)
3013              (let ((pos (line-beginning-position)))
3014                (insert item)
3015                (put-text-property pos (1+ pos)
3016                                   'arxana-display-type
3017                                   'task)
3018                (insert "\n"))) items))))
3019
3020 (add-to-list 'display-style
3021              '(task . (get-task display-task)))
3022 \end{elisp}
3023
3024 \begin{notate}{Working with theories}
3025 Presumably, like other related functions, `get-tasks'
3026 should take a heading argument.
3027 \end{notate}
3028
3029 \begin{notate}{Check display style}
3030 Check if this works, and make style consistent between
3031 this usage and earlier usage.
3032 \end{notate}
3033
3034 \begin{notate}{Example tasks}
3035 It might be fun to add some tasks associated with
3036 improving Arxana, just to show that it can be done...
3037 maybe along with a small importer to show how importing
3038 something without a whole lot of structure can be easy.
3039 \end{notate}
3040
3041 \subsection{Other ideas$^*$}
3042
3043 \begin{notate}{A browser within a browser} \label{browser-within}
3044 All the stuff we're doing with triples can be superimposed
3045 over the existing web and existing web interfaces, by, for
3046 example, writing a web browser as a web app, and in this
3047 ``browser within a browser'' offer the ability to annotate
3048 and rewrite other people's web pages, produce 3rd-party
3049 redirects, and so forth, sharing these mods with other
3050 subscribers to the service.  (Already websites such as the
3051 short-lived scrum.diddlyumptio.us have offered limited
3052 versions of ``web annotation'', but, so far, what one can
3053 do with such services seems quite weak compared with
3054 what's possible.)
3055 \end{notate}
3056
3057 \begin{notate}{Improvements to the PlanetMath backend}
3058 From one point of view, the SQL tables are the main thing
3059 in Noosphere.  We could say that getting the things out of
3060 SQL and storing new things there is what Noosphere mainly
3061 does.  Following this line of thought, anything that
3062 adjusts these tables will do just as well, e.g., it
3063 shouldn't be terribly hard to develop an email-based
3064 front-end.  But rather than making Arxana work with the
3065 Noosphere relational table system, it is probably
3066 advantageous to translate the data from these tables into
3067 the scholium system.
3068 \end{notate}
3069
3070 \begin{notate}{A new communication platform}
3071 One of the premier applications I have in mind is a new
3072 way to handle communications in an online forum.  I have
3073 previously called this ``subchanneling'', but really,
3074 joining channels is just as important.
3075 \end{notate}
3076
3077 \begin{notate}{Some tutorials}
3078 It would be interesting to write a tutorial for Common
3079 Lisp or just about any other topic with this system.  For
3080 example, some little ``worksheets'' or ``gymnasia'' that
3081 will help solidify user knowledge in topics on which
3082 questions keep appearing.
3083 \end{notate}
3084
3085 \section{Topics of philosophical interest}
3086
3087 \begin{notate}{Research and development}
3088 In Note \ref{theoretical-context}, I mentioned a model
3089 that could apply in many contexts; it is an essentially
3090 metaphysical conception.  I'm pretty sure that the data
3091 model of Note \ref{data-model} provides a general-enough
3092 framework to represent anything we might find ``out
3093 there''.  However, even if this is the case, questions as
3094 to \emph{efficient} means of working with such data still
3095 abound (cf. Note \ref{models-of-theories}, Note
3096 \ref{use-of-views}).
3097
3098 I propose that along with \emph{development} of Arxana as
3099 a useful system for \emph{doing} ``commons-based peer
3100 production'' should come a \emph{research} programme for
3101 understanding in much greater detail what ``commons-based
3102 peer production'' \emph{is}.  Eventually we may want to
3103 change the name of the subject of study to reflect still
3104 more general ideas of resource use.
3105
3106 While the ``frontend'' of this research project is
3107 anthropological, the ``backend'' is much closer to
3108 artificial intelligence.  On this level, the project is
3109 about understanding \emph{effective} means for solving
3110 human problems.  Often this will involve decomposing
3111 events and processes into constituent elements, making
3112 increasingly detailed treatments along the lines described
3113 in Note \ref{arxana}.
3114 \end{notate}
3115
3116 \begin{notate}{The relationship between text and commentary}
3117 Text under revision might be marked up by a copyeditor: in
3118 cases like these, the interpretation is clear.  However,
3119 what about marginalia with looser interpretations?  These
3120 seem to become part of the copy of the text they are
3121 attached to.  What about steering processes applied to a
3122 given course of action?  How about the relationship of
3123 thoughts or words to perception and action?  How can we
3124 lower the barrier between conception and action, while
3125 still maintaining some purchase on wisdom?
3126
3127 You see, a lot of issues in life have to do with overlays,
3128 multi-tracking, interchange between different systems; and
3129 in these terms, a lot of philosophy reduces to ``media
3130 awareness'' which extends into more and more immediate
3131 contexts (Note \ref{theoretical-context}).
3132 \end{notate}
3133
3134 \begin{notate}{Heuristic flow}
3135 Continuing the notion above: one does not need a
3136 fully-developed ``heading'' of work in order to do work --
3137 instead, one wants some straightforward heuristics that
3138 will enable the desired work to get done.  So, even
3139 supposing the work is ``heading building'', it can progress
3140 without becoming overwhelmed by abstractions -- because
3141 theories and heuristics are different things.
3142 \end{notate}
3143
3144 \begin{notate}{Limits of simple languages} \label{simple-languages}
3145 Triples are frequently ``subject, verb, object''
3146 statements, although with the annotation features, we can
3147 modify any part of any such statement; for example, we
3148 can apply an adverb to a given verb.
3149
3150 ``Tags'', of course, already provide ``subject,
3151 predicate'' relationships.  It will be interesting to
3152 examine the degree to which human languages can be mapped
3153 down into these sorts of simple languages.  What features
3154 are needed to make such languages \emph{useful}?  (Lisp's
3155 `car' and `cdr' seem related to the idea of making
3156 predicates useful.)
3157
3158 How are triples and predicates ``enough''?  What, if
3159 anything, do they lack?  The difference between triples
3160 and predicates illustrates the issue.  How should we
3161 characterize Arxana's additions to Lisp?
3162 \end{notate}
3163
3164 \begin{notate}{Higher dimensions}
3165 Why stop with three components?  Why not have $(A, B, C,
3166 D, T)$ represent a semantic relationship between all of
3167 $A$, $B$, $C$, and $D$ (in heading $T$, of course)?
3168 Actually, there is no reason to stop apart from the fact
3169 that I want to explore simple languages (Note
3170 \ref{simple-languages}).  In real life, things are not as
3171 simple, and we should be ready to deal with the
3172 complexities! (Cf., for example, Note \ref{pointing}).
3173 \end{notate}
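
\begin{notate}{Sketch: many-place relationships from triples}
One way to stay with simple languages and still relate four
things at once is to mint a fresh node that stands for the
relationship, and to attach each participant to that node
with an ordinary triple.  What follows is only a sketch under
that assumption, written in Common Lisp: `reify-relation',
the ``place $n$'' labels, and the plain lists standing in for
database rows are all invented here for illustration and are
not part of Arxana.  Headings are left out entirely.
\end{notate}

\begin{idea}
;; Hypothetical sketch, not part of Arxana: encode a
;; many-place relationship as a bundle of ordinary triples.
(defvar *relation-counter* 0
  "Used to mint a fresh node name for each relationship.")

(defun reify-relation (name participants)
  "Return triples saying that NAME holds of PARTICIPANTS.
Each triple is a list (BEGINNING MIDDLE END)."
  (let ((node (format nil "~a-~d" name
                      (incf *relation-counter*))))
    (cons (list node "is a" name)
          (loop for p in participants
                for i from 1
                collect (list node
                              (format nil "place ~d" i)
                              p)))))

;; Example: one "is a" triple plus one triple per participant.
;; (reify-relation "between" '("A" "B" "C" "D"))
\end{idea}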
3174
3175 \section{Future plans}
3176
3177 \begin{notate}{Development pathways}
3178 To the extent that it's possible, I'd like to maintain a
3179 succinct non-linear roadmap in which tasks are outlined
3180 and prioritized, and some procedural details are made
3181 concrete.  Whenever relevant this map should point into
3182 the current document.  I'll begin by revising the plans
3183 I've used so far!\footnote{{\tt
3184     http://metameso.org/files/plan-arxana.pdf}} Over the
3185 next several months, I'd like to see these plans develop
3186 into a genuine production machine, and see the machine
3187 begin to stabilize its operations.
3188 \end{notate}
3189
3190 \begin{notate}{Theories as database objects} \label{theories-as-database-objects}
3191 We're just beginning to treat theories as database
3192 objects; I expect there will be more work to do to make
3193 this work really well.  We'll want to make some test
3194 cases, like building a ``theory of chess'', or even just
3195 describing a particular chess board; cf. Note
3196 \ref{partial-image}.
3197 \end{notate}
3198
3199 \begin{notate}{Search engine/elements} \label{search-engine}
3200 One of the features that came very easily in the Emacs-only
3201 prototype was textual search.  With the strings stored in
3202 a database, Sphinx seems to be the most suitable search
3203 engine to use.  It is tempting to try to make our own
3204 inverted index using triples, so that text-based search
3205 can be even more directly integrated with semantic search.
3206 (Since the latest version(s) of Sphinx can act to some
3207 extent like a MySQL database, we almost have a direct
3208 connection in the backend, but since Sphinx is not
3209 \emph{the same} database, one would at least need some
3210 glue code to effect joins and so forth.)
3211
3212 More to the point, it is important for this project that
3213 the scholia-based document model be transparently extended
3214 down to the level of words and characters.  It may be
3215 helpful to think about text as \emph{always being}
3216 hypertext; a document as a heading; and a word in the
3217 inverted index as a frame.
3218 \end{notate}
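
\begin{notate}{Sketch: an inverted index made of triples}
To make the idea of a home-grown inverted index a little more
concrete, here is a minimal sketch in Common Lisp.  It assumes
that each word is linked to the strings containing it by a
triple whose middle is ``occurs in''.  The function names, the
choice of middle, and the in-memory list standing in for the
triples table are all assumptions made for illustration; the
sketch does not touch Sphinx or the actual database.  Because
the index entries are ordinary triples, a textual hit could in
principle be combined with semantic conditions using the same
machinery.
\end{notate}

\begin{idea}
;; Hypothetical sketch, not part of Arxana: an inverted index
;; expressed as triples (WORD "occurs in" STRING-ID).
(defun sketch-split-words (text)
  "Split TEXT into lowercase words at non-alphanumerics."
  (let ((words '())
        (current '()))
    (flet ((finish ()
             (when current
               (push (coerce (reverse current) 'string)
                     words)
               (setf current '()))))
      (loop for ch across text
            do (if (alphanumericp ch)
                   (push (char-downcase ch) current)
                   (finish)))
      (finish))
    (nreverse words)))

(defun sketch-index-string (id text)
  "Return index triples for the string TEXT stored under ID."
  (mapcar (lambda (word) (list word "occurs in" id))
          (remove-duplicates (sketch-split-words text)
                             :test #'string=)))

(defun sketch-strings-containing (word index)
  "Return the ids of all strings INDEX lists under WORD."
  (loop for (w nil id) in index
        when (string= w word) collect id))
\end{idea}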
3219
3220 \begin{notate}{Pointing at database elements and other things} \label{pointing}
3221 We will want to be able to point at other tables and at
3222 other sorts of objects and make use of their contents.
3223 The plan is that our triples will provide a sort of guide
3224 or backbone superimposed over a much larger data system.
3225 \end{notate}
3226
3227 \begin{notate}{Feature-chase}
3228 There are lots of different features that could be
3229 explored, for example: multi-dimensional history lists; a
3230 useful treatment of ``clusions''; MS Word-like colorful
3231 annotations; etc.  Many of these features are already
3232 prototyped.\footnote{See footnote \ref{old-version}.}
3233 \end{notate}
3234
3235 \begin{notate}{Regression testing}
3236 Along with any major feature chase, we should provide
3237 and maintain a regression testing suite.
3238 \end{notate}
3239
3240 \begin{notate}{Deleting and changing things}
3241 How will we deal with unlinking, disassociating,
3242 forgetting, entropy, and the like?  Changes can perhaps
3243 be modeled by an insertion following a deletion, and,
3244 as noted, we'll need effective ways to represent and
3245 manage change (Note \ref{change}).
3246 \end{notate}
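
\begin{notate}{Sketch: change as deletion plus insertion}
The suggestion that a change is an insertion following a
deletion can be written down directly.  The Common Lisp below
is a minimal sketch under stated assumptions: the triples live
in an in-memory list rather than in the database, every
function name is made up for illustration, and the event log
is just one guess at how change might be recorded so that it
can be managed later.
\end{notate}

\begin{idea}
;; Hypothetical sketch, not part of Arxana.
(defvar *sketch-triples* '()
  "In-memory stand-in for the triples table.")
(defvar *sketch-log* '()
  "Record of insertions and deletions, most recent first.")

(defun sketch-insert (triple)
  "Add TRIPLE unless it is already present; log the event."
  (pushnew triple *sketch-triples* :test #'equal)
  (push (list :insert triple) *sketch-log*)
  triple)

(defun sketch-delete (triple)
  "Remove TRIPLE if present; log the event."
  (setf *sketch-triples*
        (remove triple *sketch-triples* :test #'equal))
  (push (list :delete triple) *sketch-log*)
  triple)

(defun sketch-change (old new)
  "Replace OLD by NEW: a deletion, then an insertion."
  (sketch-delete old)
  (sketch-insert new))
\end{idea}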
3247
3248 \begin{notate}{Tutorial}
3249 Right now the system is simple enough to be pretty much
3250 self-explanatory, but if it becomes much more complicated,
3251 it might be helpful to put together a simple guide to some
3252 likely-to-be-interesting features.
3253 \end{notate}
3254
3255 \begin{notate}{Computing possible paths and connections}
3256 If we can find all the \emph{direct} paths from one node
3257 to another using `triples-given-beginning-and-end', can we
3258 inject some algorithms for finding longer, indirect paths
3259 into the system, and find ways to make them useful?
3260
3261 Similarly, we can satisfy local conditions (Note
3262 \ref{satisfy-conditions}), but we'll want to deal with
3263 increasingly ``non-local'' conditions (even just using the
3264 logical operator ``or'', instead of ``and'', for example).
3265 \end{notate}
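
\begin{notate}{Sketch: indirect paths and ``or'' conditions}
As a starting point for the questions above, here is a minimal
sketch in Common Lisp of a breadth-first search for a shortest
indirect path, together with a filter that accepts a triple
when it meets any one of several conditions.  Breadth-first
search is a standard technique brought in for illustration; it
is not something the system currently does.  The function
names are hypothetical, and a plain list of (BEGINNING MIDDLE
END) lists stands in for the database.  Note that a search
from a node to itself returns the empty path, which this
sketch does not distinguish from ``no path''.
\end{notate}

\begin{idea}
;; Hypothetical sketch, not part of Arxana.  TRIPLES is a list
;; of (BEGINNING MIDDLE END) lists standing in for the database.
(defun sketch-triples-from (node triples)
  "Return the triples in TRIPLES whose beginning is NODE."
  (remove-if-not (lambda (tr) (equal (first tr) node))
                 triples))

(defun sketch-find-path (start goal triples)
  "Breadth-first search.  Return a shortest chain of triples
leading from START to GOAL, or NIL if there is none."
  (let ((queue (list (list start))) ; entries: (NODE . PATH)
        (seen (list start)))
    (loop while queue
          do (destructuring-bind (node . path) (pop queue)
               (when (equal node goal)
                 (return (reverse path)))
               (dolist (tr (sketch-triples-from node triples))
                 (let ((next (third tr)))
                   (unless (member next seen :test #'equal)
                     (push next seen)
                     (setf queue
                           (append
                            queue
                            (list (list* next tr path)))))))))))

(defun sketch-satisfying (triples &rest tests)
  "Return the triples passing at least one of TESTS (`or')."
  (remove-if-not
   (lambda (tr)
     (some (lambda (test) (funcall test tr)) tests))
   triples))
\end{idea}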
3266
3267 \begin{notate}{Monster Mountain}
3268 In Summer 2007, we checked out the Monster Mountain MUD
3269 server\footnote{{\tt http://code.google.com/p/mmtn/}},
3270 which would enable several users to interact with one
3271 LISP, instead of just one database.  This would have a
3272 number of advantages, particularly for exploring
3273 ``scholiumific programming'', but also towards fulfilling
3274 the user-to-user interaction objective stated in Note
3275 \ref{theoretical-context}. I plan to explore this after
3276 the primary goal of multi-user interaction with the
3277 database has been solidly completed.
3278 \end{notate}
3279
3280 \begin{notate}{Web interface}
3281 A finished web interface may take a considerable amount of
3282 work (if the complexity of an interesting Emacs interface
3283 is any indication), but the basics shouldn't be hard to
3284 put together soon.
3285 \end{notate}
3286
3287 \begin{notate}{Parsing input} \label{parsing}
3288 Complicated objects specified in long-hand (e.g. triples
3289 pointing to triples) can be read by a relatively simple
3290 parser -- which we'll have to write!  The simplest goal
3291 for the parser would be to be able to distinguish between
3292 a triple and a string -- presumably that much isn't hard.
3293 And of course, building complexes of triples that
3294 represent statements from natural language is a good
3295 long-term goal. (Right now, our granularity level is set
3296 much higher.)
3297 \end{notate}
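
\begin{notate}{Sketch: telling triples from strings}
The simplest goal named above can be sketched in a few lines
of Common Lisp, on the assumption that long-hand input is
written as Lisp strings and three-element lists, so that the
standard reader does the tokenizing.  That input format is an
assumption, not a decision the project has made, and the
function names are hypothetical.  Reading is done with
`*read-eval*' turned off so that the input cannot run code.
\end{notate}

\begin{idea}
;; Hypothetical sketch, not part of Arxana: classify long-hand
;; input as a string, a triple, or neither.
(defun sketch-triple-p (object)
  "True if OBJECT is a proper list of three parts, each of
which is a string or is itself a triple."
  (and (consp object)
       (null (cdr (last object)))
       (= (length object) 3)
       (every (lambda (part)
                (or (stringp part)
                    (sketch-triple-p part)))
              object)))

(defun sketch-classify (object)
  "Return :STRING, :TRIPLE or :UNKNOWN for OBJECT."
  (cond ((stringp object) :string)
        ((sketch-triple-p object) :triple)
        (t :unknown)))

(defun sketch-parse (text)
  "Read one object from TEXT; return its class and the object."
  (let* ((*read-eval* nil)
         (object (read-from-string text)))
    (values (sketch-classify object) object)))

;; Example input: a triple pointing to a triple.
;; (sketch-parse "(\"A\" \"B\" (\"C\" \"D\" \"E\"))")
\end{idea}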
3298
3299 \begin{notate}{Choice of database} \label{choice-of-database}
3300 I expect Elephant\footnote{{\tt
3301     http://common-lisp.net/project/elephant/}} may become
3302 our preferred database at some point in the future; we are
3303 currently awaiting changes to Elephant that make nested
3304 queries possible and efficient.  Some core queries related
3305 to managing a database of semantic links with the current
3306 Elephant were constructed by Ian Eslick, Elephant's
3307 maintainer.\footnote{{\tt
3308     http://planetx.cc.vt.edu/\~{}jcorneli/arxana/variant-4.lisp}}
3309
3310 On the other hand, it might be reasonable to use an Emacs
3311 database and redo the whole thing to work in Emacs
3312 (again), e.g. for single-user applications or users who
3313 want to work offline a lot of the time.
3314 \end{notate}
3315
3316 \begin{notate}{Different kinds of theories}
3317 Theories or variants thereof are of course already popular
3318 in other knowledge representation contexts.\footnote{{\tt
3319     http://www.cyc.com/cycdoc/vocab/mt-expansion-vocab.html}}$^{,}$\footnote{{\tt
3320     http://www.stanford.edu/\~{}kdevlin/HHL\_SituationTheory.pdf}}
3321 We'll want to adopt some useful techniques for knowledge
3322 management as soon as the core systems are ready.
3323
3324 Various notions of a mathematical theory
3325 exist.\footnote{{\tt
3326     http://planetmath.org/encyclopedia/Theory.html}} It
3327 would be nice to be able to assign specific logic to
3328 theories in Arxana, following the ``little theories''
3329 design of e.g. IMPS.\footnote{{\tt
3330     http://imps.mcmaster.ca/manual/node13.html}}
3331 \end{notate}
3332
3333 \section{Conclusion} \label{conclusion}
3334
3335 \begin{notate}{Ending and beginning again}
3336 This is the end of the Arxana system itself; the
3337 appendices provide some ancillary tools, and some further
3338 discussion.  Contributions that support the development of
3339 the Arxana project are welcome.
3340 \end{notate}
3341
3342 \appendix
3343
3344 \section{Appendix: Auto-setup} \label{appendix-setup}
3345
3346 \begin{notate}{Setting up auto-setup}
3347 This section provides code for satisfying dependencies and
3348 setting up the program.  This code assumes that you are
3349 using a Debian/APT-based system (but things are not so
3350 different using, say, Fedora or Fink; writing a
3351 multi-package-manager-friendly installer shouldn't be
3352 hard).  Of course, feel free to set things up differently
3353 if you have something else in mind!
3354 \end{notate}
3355
3356 \begin{elisp}
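(defalias 'set-up 'run-set-up)

(defvar set-up-directory "~/"
  "Directory in which `run-set-up' runs its commands.")

(defun run-set-up (string)
  "Run STRING with `shell-command', honoring earlier \"cd\"s.
Every `shell-command' starts a fresh shell, so a \"cd\" run on
its own would not affect the commands that follow it."
  (if (string-match "\\`cd +\\(.+\\)\\'" string)
      (setq set-up-directory
            (expand-file-name (match-string 1 string)
                              set-up-directory))
    (let ((default-directory
            (file-name-as-directory
             (expand-file-name set-up-directory))))
      (shell-command string))))

(defun alternative-set-up (string)
  (save-excursion
    (pop-to-buffer (get-buffer-create "*Arxana Help*"))
    (goto-char (point-max))
    (insert string "\n")))

(defun set-up-arxana-environment ()
  (interactive)
  (setq set-up-directory "~/")
  (if (y-or-n-p
       "Run commands (y) (or just show instructions)? ")
      (fset 'set-up 'run-set-up)
    (fset 'set-up 'alternative-set-up))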
3371   (when (y-or-n-p "Create the ~/arxana directory? ")
3372     (set-up "mkdir -p ~/arxana")
3373     (set-up "cd ~/arxana"))
3374
3375   (when (y-or-n-p "Download latest Arxana? ")
3376     (set-up "wget http://metameso.org/files/arxana.tex"))
3377
3378   (unless (y-or-n-p "Is your emacs good enough?... ")
3379     (set-up
3380      (concat "cvs -z3 -d"
3381              ":pserver:anonymous@cvs.savannah.gnu.org:"
3382              "/sources/emacs co emacs"))
3383     (set-up "mv emacs ~")
3384     (set-up "cd ~/emacs")
3385     (set-up "./configure && make bootstrap")
3386     (set-up "cd ~/arxana"))
3387
3388   (defvar pac-man nil)
3389
3390   (cond ((y-or-n-p
3391           "Do you use an apt-based package manager? ")
3392          (setq pac-man "apt-get"))
3393         (t (message
3394             "OK, get Lisp and SQL on your own, then!")))
3395
3396   (when pac-man
3397     (when (y-or-n-p "Install Common Lisp? ")
3398       (set-up (concat pac-man " install sbcl")))
3399
3400     (when (y-or-n-p "Install Postgresql? ")
3401       (set-up (concat pac-man " install postgresql"))
3402       (when (y-or-n-p "Help setting up PostgreSQL? ")
3403         (save-excursion
3404           (pop-to-buffer (get-buffer-create "*Arxana Help*"))
3405           (insert "As superuser (root),
3406 edit /etc/postgresql/7.4/main/pg_hba.conf
3407 make sure it says this:
3408 host all all 127.0.0.1 255.255.255.255 trust
3409 then edit /etc/postgresql/7.4/main/postgresql.conf
3410 and make it say
3411 tcpip_socket = true
3412 then restart:
3413 /etc/init.d/postgresql-7.4 restart
3414 su postgres
3415 createuser username
3416 exit
3417 as username, run
3418 createdb -U username\n")))))
3419
3420   (when (y-or-n-p "Install SLIME...? ")
3421     (set-up (concat "cvs -d :pserver:anonymous"
3422                            ":anonymous@common-lisp.net:"
3423                            "/project/slime/cvsroot co slime"))
3424     (set-up
3425      (concat "echo \";; Added to ~/.emacs for Arxana:\n\n"
3426              "(add-to-list 'load-path \\\"~/slime/\\\")\n"
3427              "(setq inferior-lisp-program \\\"/usr/bin/sbcl\\\")\n"
3428              "(require 'slime)\n"
3429              "(slime-setup '(slime-repl))\n\n\""
3430              "| cat - ~/.emacs > ~/updated.emacs &&"
3431              "mv ~/updated.emacs ~/.emacs")))
3432
3433   (when (y-or-n-p "Set up Common Lisp environment? ")
3434     (set-up "mkdir ~/.sbcl")
3435     (set-up "mkdir ~/.sbcl/site")
3436     (set-up "mkdir ~/.sbcl/systems")
3437     (set-up "cd ~/.sbcl/site")
3438     (set-up (concat "wget http://files.b9.com/"
3439                     "clsql/clsql-latest.tar.gz"))
3440     (set-up "tar -zxf clsql-latest.tar.gz")
3441     (set-up (concat "wget http://files.b9.com/"
3442                            "uffi/uffi-latest.tar.gz"))
3443     (set-up "tar -zxf uffi-latest.tar.gz")
3444     (set-up (concat "wget http://files.b9.com/"
3445                            "md5/md5-1.8.5.tar.gz"))
3446     (set-up "tar -zxf md5-1.8.5.tar.gz")
3447     (set-up "cd ~/.sbcl/systems")
3448     (set-up "ln -s ../site/md5-1.8.5/md5.asd .")
3449     (set-up "ln -s ../site/uffi-1.6.0/uffi.asd .")
3450     (set-up "ln -s ../site/clsql-4.0.3/clsql.asd .")
3451     (set-up "ln -s ../site/clsql-4.0.3/clsql-uffi.asd .")
3452     (set-up (concat "ln -s ../site/clsql-4.0.3/"
3453                            "clsql-postgresql-socket.asd ."))
3454     (set-up "ln -s ~/arxana/arxana.asd ."))
3455
3456   (when (y-or-n-p "Modify ~/.sbclrc so CL always starts Arxana? ")
3457     (set-up
3458      (concat "echo \";; Added to ~/.sbclrc for Arxana:\n\n"
3459              "(require 'asdf)\n\n"
3460              "(asdf:operate 'asdf:load-op 'swank)\n"
3461              "(setf swank:*use-dedicated-output-stream* nil)\n"
3462              "(setf swank:*communication-style* :fd-handler)\n"
3463              "(swank:create-server :port 4006 :dont-close t)\n\n"
3464              "(asdf:operate 'asdf:load-op 'clsql)\n"
3465              "(asdf:operate 'asdf:load-op 'arxana)\n"
3466              "(in-package arxana)\n"
3467              "(connect-to-database)\n"
3468              "(locally-enable-sql-reader-syntax)\n\n\""
3469              "| cat ~/.sbclrc - > ~/updated.sbclrc &&"
3470              "mv ~/updated.sbclrc ~/.sbclrc")))
3471
3472   (when (y-or-n-p "Install Monster Mountain? ")
3473     (set-up "cd ~/.sbcl/systems")
3474     (set-up (concat
3475                     "darcs get http://common-lisp.net/project/"
3476                     "bordeaux-threads/darcs/bordeaux-threads/"))
3477     (set-up (concat
3478                     "svn checkout svn://common-lisp.net/project/"
3479                     "usocket/svn/usocket/trunk usocket-svn"))
3480     ;; I've had problems with this approach to setting cclan
3481     ;; mirror...
3482     (set-up
3483      (concat
3484       "wget \"http://ww.telent.net/cclan-choose-mirror"
3485       "?M=http%3A%2F%2Fthingamy.com%2Fcclan%2F\""))
3486     (set-up (concat "wget http://ww.telent.net/cclan/"
3487                            "split-sequence.tar.gz"))
3488     (set-up "tar -zxf split-sequence.tar.gz")
3489     (set-up
3490      (concat "svn checkout http://mmtn.googlecode.com/"
3491              "svn/trunk/ mmtn-read-only"))
3492     (set-up
3493      "ln -s bordeaux-threads/bordeaux-threads.asd .")
3494     (set-up "ln -s usocket-svn/usocket.asd .")
3495     (set-up "ln -s split-sequence/split-sequence.asd .")
3496     (set-up "ln -s mmtn-read-only/src/mmtn.asd .")))
3497 \end{elisp}
3498
3499 \begin{notate}{Postgresql on Fedora}
3500 There are some slightly different instructions for
3501 installing postgresql on Fedora; the above will be
3502 changed to include them, but for now, check them
3503 out on the
3504 web.\footnote{{\tt http://www.flmnh.ufl.edu/linux/install\_postgresql.htm}}
3505 \end{notate}
3506
3507 \begin{notate}{Using MySQL and CLISP instead} \label{backend-variant}
3508 Since my OS X box seems to have a variety of confusing
3509 PostgreSQL systems already installed (which I'm not sure
3510 how to configure), and CLISP is easy to install with fink,
3511 I thought I'd try a different setup for simplicity and
3512 variety.
3513
3514 In order to make it work, I enabled the root user on Mac OS X
3515 per instructions on the web; installed and configured
3516 MySQL; used a slight modification of the strings table
3517 described previously; downloaded and installed
3518 cffi\footnote{{\tt
3519     http://common-lisp.net/project/cffi/releases/cffi\_latest.tar.gz}};
3520 changed the definition of `connect-to-database' in
3521 Arxana's utilities.lisp; doctored up my {\tt \~{}/.clisprc.lisp};
3522 and changed how I started Lisp.  Details below.
3523 \end{notate}
3524
3525 \begin{idea}
3526 ;; on the shell prompt
3527 sudo apt-get install mysql
3528 sudo mysqld_safe --user=mysql &
3529 sudo daemonic enable mysql
3530 sudo mysqladmin -u root password root
3531 mysql --user=root --password=root -D test
3532 create database joe; grant all on joe.* to joe@localhost
3533 identified by 'joe';
3534
3535 ;; in tabledefs.lisp
3536 (execute-command "CREATE TABLE strings (
3537    id SERIAL PRIMARY KEY,
3538    text TEXT,
3539    UNIQUE INDEX (text(255))
3540 );")
3541
3542 ;; in ~/asdf-registry/ or whatever you've designated as
3543 ;; your asdf:*central-registry*
3544 ln -s ~/cffi_0.10.4/cffi-uffi-compat.asd .
3545 ln -s ~/cffi_0.10.4/cffi.asd .
3546
3547 ;; In utilities.lisp
3548 (defun connect-to-database ()
3549    (connect `("localhost" "joe" "joe" "joe")
3550             :database-type :mysql))
3551
3552 ;; In ~/.clisprc.lisp
3553 (asdf:operate 'asdf:load-op 'clsql)
3554 (push "/sw/lib/mysql/"
3555 CLSQL-SYS:*FOREIGN-LIBRARY-SEARCH-PATHS*)
3556
3557 ;; From SLIME prompt, and not in ~/.clisprc.lisp
3558 (in-package #:arxana)
3559 (connect-to-database)
3560 (locally-enable-sql-reader-syntax)
3561 \end{idea}
3562
3563 \begin{notate}{Installing Sphinx}
3564 Here are some tips on how to install and configure
3565 Sphinx.
3566 \end{notate}
3567
3568 \begin{idea}
3569 ;; Fedora/Postgresql flavor
3570 yum install postgresql-devel
3571 ./configure --without-mysql \
3572   --with-pgsql \
3573   --with-pgsql-libs=/usr/lib/pgsql/ \
3574   --with-pgsql-includes=/usr/include/pgsql
3575
3576 ;; Fink/MySQL flavor
3577 ./configure --with-mysql \
3578   --with-mysql-includes=/sw/include/mysql \
3579   --with-mysql-libs=/sw/lib/mysql
3580 \end{idea}
3581
3582 \begin{notate}{Getting Sphinx set up} \label{sphinx-setup}
3583 Here are some instructions I've used to get Sphinx set
3584 up.
3585 \end{notate}
3586
3587 \begin{notate}{Create a sphinx.conf}
3588 I want a very minimal sphinx.conf; this seems to work.
3589 (We should probably set this up so that it gets written
3590 to a file when Arxana is set up.)
3591 \end{notate}
3592
3593 \begin{idea}
3594 ## Copy this to /usr/local/etc/sphinx.conf when you want
3595 ## to use it.
3596
3597 source strings
3598 {
3599  type            = mysql
3600  sql_host        = localhost
3601  sql_user        = joe
3602  sql_pass        = joe
3603  sql_db          = joe
3604  sql_query       = SELECT id, text FROM strings
3605 }
3606
3607 ## index definition
3608
3609 index strings
3610 {
3611  source          = strings
3612  path            = /Users/planetmath/sphinx/search-testing
3613  morphology      = none
3614 }
3615
3616 ## indexer settings
3617
3618 indexer
3619 {
3620  mem_limit       = 32M
3621 }
3622
3623 ## searchd settings
3624
3625 searchd
3626 {
3627  listen          = 3312
3628  listen          = localhost:3307:mysql41
3629  log             = /Users/planetmath/sphinx/searchd.log
3630  query_log       = /Users/planetmath/sphinx/searchd_query.log
3631  read_timeout    = 5
3632  max_children    = 30
3633  pid_file        = /Users/planetmath/sphinx/searchd.pid
3634  max_matches     = 1000
3635 }
3636 \end{idea}
3637
3638 \begin{notate}{Working from the command line}
3639 Then you can run commands like these.
3640 \end{notate}
3641
3642 \begin{idea}
3643 /usr/local/bin/indexer strings
3644 /usr/local/bin/search "but, then"
3645
3646 % mysql -h 127.0.0.1 -P 3307
3647 mysql> SELECT * FROM strings WHERE MATCH('but, then');
3648 \end{idea}
3649
3650 \begin{notate}{Integrating this with Lisp}
3651 Since we can talk to Sphinx via the MySQL
3652 protocol, it seems reasonable that we should be able to talk to
3653 it from CLSQL, too.  With a little fussing to get the format
3654 right, I found something that works!
3655 \end{notate}
3656
3657 \begin{idea}
3658 (connect `("127.0.0.1" "" "" "" "3307") :database-type :mysql)
3659 (mapcar (lambda (elt) (floor (car elt)))
3660   (query "select * from strings where match('text')"))
3661 \end{idea}
3662
3663 \begin{notate}{Some added difficulty with Postgresql}
3664 When I try to index things on the server, I get an
3665 error, as below.  The question is a good one... I'm
3666 not sure \emph{how} postgresql is set up on the server,
3667 actually...
3668 \end{notate}
3669
3670 \begin{idea}
3671 ERROR: index 'strings': sql_connect: could not connect to server:
3672 Connection refused
3673 Is the server running on host "localhost" and accepting
3674 TCP/IP connections on port 5432?
3675 \end{idea}
3676
3677 \section{Appendix: A simple literate programming system} \label{appendix-lit}
3678
3679 \begin{notate}{The literate programming system used in this paper}
3680 This code defines functions that grab all the Lisp
3681 portions of this document, evaluate the Emacs Lisp
3682 sections in Emacs, and save the Common Lisp sections in
3683 suitable files.\footnote{Cf.\ {\tt
3684     http://mmm-mode.sourceforge.net/}} It requires
3685 that the \LaTeX\ be written in a certain consistent way.
3686 The function assumes that this document is the current
3687 buffer.
3688
3689 \begin{verbatim}
(require 'cl)  ; provides `set-difference', used below

3690 (defvar lit-code-beginning-regexp
3691   "^\\\\begin{elisp}\\|^\\\\begin{common}{\\([^}\n]*\\)}")
3692
3693 (defvar lit-code-end-regexp
3694   "^\\\\end{elisp}\\|^\\\\end{common}")
3695
3696 (defun lit-process ()
3697   (interactive)
3698   (save-excursion
3699     (let ((to-buffer "*Lit Code*")
3700           (from-buffer (buffer-name (current-buffer)))
3701           (start-buffers (buffer-list)))
3702       (set-buffer (get-buffer-create to-buffer))
3703       (erase-buffer)
3704       (set-buffer (get-buffer-create from-buffer))
3705       (goto-char (point-min))
3706       (while (re-search-forward
3707               lit-code-beginning-regexp nil t)
3708         (let* ((file (match-string 1))
3709                (beg (match-end 0))
3710                (end (save-excursion
3711                       (search-forward-regexp
3712                        lit-code-end-regexp nil t)
3713                       (match-beginning 0)))
3714                (match (buffer-substring-no-properties
3715                        beg end)))
3716           (let ((to-buffer
3717                  (if file
3718                      (concat "*Lit Code*: " file)
3719                    "*Lit Code*")))
3720             (save-excursion
3721               (set-buffer (get-buffer-create
3722                            to-buffer))
3723               (insert match)))))
3724       (dolist
3725           (buffer (set-difference (buffer-list)
3726                                   start-buffers))
3727         (save-excursion
3728           (set-buffer buffer)
3729           (if (string= (buffer-name buffer)
3730                        "*Lit Code*")
3731               (eval-buffer)
3732             (write-region (point-min)
3733                           (point-max)
3734                           (concat "~/arxana/"
3735                                   (substring
3736                                    (buffer-name
3737                                     buffer)
3738                                    12)))))
3739         (kill-buffer buffer)))))
3740 \end{verbatim}
3741 \end{notate}
3742
3743 \begin{notate}{Emacs-export?}
3744 It wouldn't be hard to export the Elisp sections so
3745 that those who wanted to could ditch the literate
3746 wrapper.
3747 \end{notate}
3748
3749 \begin{notate}{Bidirectional updating}
3750 Eventually it would be nice to have a code repository set
3751 up, and make it so that changes to the code can get
3752 snarfed up here.
3753 \end{notate}
3754
3755 \begin{notate}{A literate style}
3756 Ideally, each function will have its own Note to introduce
3757 it, and will not be called before it has been defined.  I
3758 sometimes make an exception to this rule; for example,
3759 functions used to form recursions may appear with no
3760 further introduction, and may be called before they are
3761 defined.
3762 \end{notate}
3763
3764 \section{Appendix: Hypertext platforms} \label{appendix-hyper}
3765
3766 \begin{notate}{The hypertextual canon} \label{canon}
3767 There is a core library of texts that come up in
3768 discussions of hypertext.
3769 \begin{itemize}
3770 % \item (Plato)
3771 \item The Rosetta Stone
3772 \item The Talmud (Judah haNasi, Rav Ashi, and many others)
3773 \item Monadology (Gottfried Wilhelm Leibniz)
3774 \item The Life and Opinions of Tristram Shandy, Gentleman
3775   (Laurence Sterne)
3776 \item Middlemarch (George Eliot)
3777 % \item The Gay Science (Friedrich Nietzsche)
3778 % \item (Wittgenstein)
3779 % \item (Alan Turing)
3780 \item The Nova Trilogy (William S. Burroughs)
3781 \item The Logic of Sense (Gilles Deleuze)
3782 % \item Open Creation and its Enemies (Asger Jorn)
3783 \item Labyrinths (Jorge Luis Borges)
3784 \item Literary Machines (Ted Nelson)
3785 % \item Simulacra and Simulation (Jean Baudrillard)
3786 \item Lila (Robert M. Pirsig)
3787 % \item \TeX: the program (Donald Knuth)
3788 \item Dirk Gently's Holistic Detective Agency
3789   (Douglas Adams)
3790 \item Pussy, King of the Pirates (Kathy Acker)
3791 % \item Rachel Blau DuPlessis,
3792 % \item Emily Dickinson
3793 % \item Gertrude Stein
3794 % \item Zora Neale Hurston
3795 \end{itemize}
3796 At the same time, it is somewhat ironic that none of the
3797 items on this list are themselves hypertexts in the
3798 contemporary sense of the word.  It's also a bit funny
3799 that certain other works (even some by the same authors)
3800 aren't on this list.  Perhaps we begin to get a sense of
3801 what's going on in this quote from Kathleen
3802 Burnett:\footnote{{\tt http://www.iath.virginia.edu/pmc/text-only/issue.193/burnett.193}}
3803 \begin{quote}
3804 ``Multiplicity, as a hypertextual principle, recognizes a
3805   multiplicity of relationships beyond the canonical
3806   (hierarchical).  Thus, the traditional concept of
3807   literary authorship comes under attack from two
3808   quarters--as connectivity blurs the boundary between
3809   author and reader, multiplicity problematizes the
3810   hierarchy that is canonicity.''
3811 \end{quote}
3812 It seems quite telling that non-hypertextual canons remain
3813 mostly-non-hypertextual even today, despite the existence
3814 of catalogs, indexes, and online access.\footnote{{\tt
3815     http://www.gutenberg.org/wiki/Category:Bookshelf}}
3816 \end{notate}
3817
3818 \begin{notate}{A geek's guide to literature}
3819 This title is a riff on Slavoj \v{Z}i\v{z}ek's ``The
3820 Pervert's Guide to Cinema''.  Taking Note \ref{canon} as a
3821 jumping-off point, why don't we make a survey of
3822 historical texts from the point of view of an aficionado
3823 of hypertext!  Just what does one have to do to ``get on
3824 the list''?  Just what is ``the hypertextual
3825 perspective''?  And, if \v{Z}i\v{z}ek is correct and we're
3826 to look for the hyperreal in the world of cinematic
3827 fictions -- what's left over for the world of literature?
3828 (Or mathematics?)
3829 \end{notate}
3830
3831 \begin{notate}{The number 3}
3832 This is the number of things present if we count carefully
3833 the items $A$, $B$, and a connection $C$ between them.
3834 [Picture of $A\xrightarrow{C} B$.]
3835
3836 (Or even: given $A$ and $B$, we use Wittgenstein counting,
3837 and \emph{intuit} that $C$ exists as the collection $\{A,
3838 B\}$; after all,
3839   some connection must exist precisely because we were
3840   presented with $A$ and $B$ together -- and lest the
3841   connections proliferate infinitely, we lump them all
3842   together as one.  [Picture of $A$, $B$,
3843     with the \emph{frame} labeled $C$.])
3844 \end{notate}
3845
3846 \begin{notate}{Surfaces}
3847 Deleuze talks about a theory of surfaces associated with
3848 verbs and events.  His surfaces represent the evanescence
3849 of events in time, and of their descriptions in language.
3850 An event is seen as a vanishingly-thin boundary between
3851 one state of being and another.
3852
3853 Certainly, a statement that is true \emph{now} may not be
3854 true five minutes from now.  It is easier to think and
3855 talk about things that are coming up and things that have
3856 already happened.  ``Living in the moment'' is regarded as
3857 special or even ``Zen''.
3858
3859 We can begin to put these musings on a more solid
3860 mathematical basis.  We first examine two types of
3861 \emph{interfaces}:
3862 \begin{enumerate}
3863 \item $A\xrightarrow{C} B$, $A\xrightarrow{D} B$,
3864   $A\xrightarrow{E} B$
3865   (the interface of $A$ and $B$ across $C$, $D$, and $E$);
3866 \item $A\xrightarrow{C} B$, $D\xrightarrow{C} E$,
3867   $F\xrightarrow{C} G$
3868   (the interface of various terms across $C$).
3869 \end{enumerate}
3870 \end{notate}
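
\begin{notate}{Sketch: the two interfaces as queries}
Read as queries over a collection of triples, the two kinds of
interface above are two simple filters: one fixes the
beginning and end and collects the middles, the other fixes
the middle and collects the pairs it connects.  The Common
Lisp below is only a sketch; the function names are
hypothetical, and a plain list of (BEGINNING MIDDLE END) lists
stands in for the database.
\end{notate}

\begin{idea}
;; Hypothetical sketch, not part of Arxana.
(defun sketch-interface-between (a b triples)
  "First kind: the middles C, D, E, ... connecting A to B."
  (loop for (beg mid end) in triples
        when (and (equal beg a) (equal end b))
          collect mid))

(defun sketch-interface-across (c triples)
  "Second kind: the (BEGINNING . END) pairs connected by C."
  (loop for (beg mid end) in triples
        when (equal mid c)
          collect (cons beg end)))
\end{idea}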
3871
3872 \begin{notate}{Comic books}
3873 No geek's guide to literature would be complete without
3874 putting comics in a hallowed place.  [Framed picture of
3875   $A$, $B$ next to framed
3876   picture of $A$, $B$, $a$.]  What happened?
3877   $\ddot{\smile}$
3878 \end{notate}
3879
3880 \begin{notate}{Intersecting triples}
3881 Diagrammatically, it is tempting to portray
3882 $(ACB)_{\mathrm{mid}}DE$ as if it were closely related to
3883 $A(CDE)_{\mathrm{beg}}B$, despite the fact that they are
3884 notationally very different.  I'll have to think more
3885 about what this means.
3886 \end{notate}
3887
3888 \section{Appendix: Computational Linguistics} \label{appendix-linguistics}
3889
3890 \begin{notate}{What is this?}
3891 It might be reasonable to make annotating sentences part
3892 of our writeup on hypertext platforms -- but I'm putting
3893 it here for now.  If hypertext is what deals with language
3894 artifacts on the ``bulky'' level (saying, for example,
3895 that a subsection is part of a section, and so on), then
3896 computational linguistics is what deals with the finer
3897 levels.  However, the distinction is in some ways
3898 arbitrary, and many of the techniques should be at least
3899 vaguely similar.
3900 \end{notate}
3901
3902 \begin{notate}{Annotation sensibilities}\label{sense}
3903 We will want to be able to make at least two different
3904 kinds of annotations of verbs.  For example, given the
3905 statement
3906 \begin{itemize}
3907 \item[$S$.] (``Who'' ``is on'' ``first''),
3908 \end{itemize}
3909 I'd like to be able to say
3910 \begin{itemize}
3911 \item[I.] (``is on'' ``means'' ``the position of a base runner in baseball'').
3912 \end{itemize}
3913 However, I'd also like to be able to say
3914 \begin{itemize}
3915 \item[II.] (``is on'' ``because'' ``he was walked'').
3916 \end{itemize}
3917 Annotation I is meant to apply to the term ``is on''
3918 itself (in a context that might be more general than just
3919 this one sentence).  If Who is also on steroids, that's
3920 another matter -- as this type of annotation helps make
3921 clear!
3922
3923 Annotation II is meant to apply to the term ``is on''
3924 \emph{as it
3925   appears in sentence $S$}.  In particular, Annotation II
3926 seems to work best in a context in which we've already
3927 accepted the ontological status of the verb-phrase ``is
3928 on first''.
3929
3930 Whereas Annotation I should presumably exist before
3931 statement $S$ is ever made (and it certainly helps make
3932 that statement make sense), Annotation II is most properly
3933 understood with reference to the fully-formed statement
3934 $S$.  However, Annotation II is different from a statement
3935 like ($S$ ``has truth value'' $F$) in that it looks into
3936 the guts of $S$.
3937 \end{notate}
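
\begin{notate}{Sketch: annotating a term versus annotating a place}
The difference between Annotations I and II can be made
concrete by letting the beginning of a triple be either a bare
term or a \emph{place} inside another triple.  In the Common
Lisp sketch below a place is written as a tagged list naming a
triple and a position; that representation, like every name in
the code, is an assumption made for illustration and says
nothing about how places are stored in the backend.
\end{notate}

\begin{idea}
;; Hypothetical sketch, not part of Arxana.
(defparameter *sentence-s* '("Who" "is on" "first"))

(defun sketch-place (triple position)
  "Name the part of TRIPLE at POSITION (:beg, :mid or :end)."
  (list :place triple position))

(defun sketch-place-content (place)
  "Return the term found at PLACE."
  (destructuring-bind (tag triple position) place
    (declare (ignore tag))
    (ecase position
      (:beg (first triple))
      (:mid (second triple))
      (:end (third triple)))))

;; Annotation I: about the term itself, wherever it occurs.
(defparameter *annotation-1*
  (list "is on" "means"
        "the position of a base runner in baseball"))

;; Annotation II: about the term as it occurs in sentence S.
(defparameter *annotation-2*
  (list (sketch-place *sentence-s* :mid)
        "because" "he was walked"))
\end{idea}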
3938
3939 \begin{notate}{Comparison of places and ontological status} \label{places-and-onto-status}
3940 The difference between (I) a ``global'' annotation, and
3941 (II) the annotation of a specific sentence is analogous to
3942 the difference between (a) relationships between objects
3943 without a place, and (b) relationships between objects in
3944 specific places.  (Cf. Note \ref{sense}: ``global''
3945 statements are of course made ``local'' by the theories
3946 that scope them.)
3947
3948 For example, in a descriptive ontology of research
3949 documents, I might make the ``placeless'' statement,
3950 \begin{itemize}
3951 \item[a.] (``Introduction'' ``names'' ``a section'')
3952 \end{itemize}
3953 On the other hand, the statement
3954 \begin{itemize}
3955 \item[b.] (``Introduction'' ``has subject'' ``American
3956   History''),
3957 \end{itemize}
3958 seems likely to be about a specific Introduction.  (And
3959 somewhere in the backend, this triple should be expressed
3960 in terms of places!)
3961 \end{notate}
3962
3963 \begin{notate}{Semantics}
3964 In a sentence like
3965 \begin{quote}
3966 (((``I'' ``saw'' ``myself'')$_{\mathrm{mid}}$ ``as if''
3967   ``through a glass'')$_{\mathrm{beg}}$ ``but'' ``darkly'')
3968 \end{quote}
3969 first of all, there may be different parenthesizations,
3970 and second of all, the semantics of links like ``as if''
3971 and ``but'' may shape, to some extent, the ways in
which we parenthesize.
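
As an illustration of the first point (an alternative grouping
supplied here only as an example, not a claim about the
correct parse), the same words could also be arranged as
\begin{quote}
((``I'' ``saw'' ``myself'') ``as if'' (``through a glass''
  ``but'' ``darkly''))
\end{quote}
in which ``but'' joins ``through a glass'' to ``darkly''
before ``as if'' is applied, rather than after.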
3973 \end{notate}
3974
3975 \section{Appendix: Resource use} \label{appendix-resources}
3976
3977 \begin{notate}{Free culture in action}
3978 I thought it worthwhile to include this quote from
3979 a joint paper with Aaron Krowne:\footnote{See Footnote
3980 \ref{corneli-krowne}.}
3981 \begin{quote}
3982 ``[F]ree content typically
3983   manifests aspects of a common resource as well as an
3984   open access resource; while anyone can do essentially
3985   whatever they wish with the content offline, in its
3986   online life, the content is managed in a
3987   socially-mediated way.  In particular, rights to
3988   \emph{in situ} modification tend to be strictly
3989   controlled. [...]  By finding new ways to support
3990   freedom of speech within CBPP documents, we embrace
3991   subjectivity as a way to enhance the content of an
3992   intersubjectively valued corpus.  In the context of
3993   ``hackable'' media and maintenance protocols, the
3994   semantics with which scholia are handled can be improved
3995   upon indefinitely on a user-by-user basis and a
3996   resource-wide basis.  This is free culture in action.''
3997 \end{quote}
3998 \end{notate}
3999
4000 \begin{notate}{Learning}
4001 The learner, confronted with a learning resource, or the
4002 consumer of any other information resource (or indeed,
4003 practically any resource whatsoever) may want a chance to
4004 respond to the questions ``was this what you were looking
for?'' and ``did you find this helpful?''.  In some cases,
an independent answer to these questions could be generated
(e.g.\ by observing whether or not a student comes up with
a correct answer).
4009 \end{notate}
4010
4011 \begin{notate}{Connections}
4012 A useful communication goal is to expose some of the
4013 connections between disparate resources.  Some existing
4014 connections may be far more explicit than others.  It's
4015 important to facilitate the making and explicating of
4016 connections by ``third parties'' (Note
4017 \ref{browser-within}).  The search for connections between
4018 ostensibly unrelated things is a key part of both
4019 creativity and learning.  In addition, connecting with
4020 what others are doing is an important part of being a
4021 social animal.
4022 \end{notate}
4023
4024 \begin{notate}{Boundaries}
4025 Notice that the departmentalization of knowledge is
4026 similar to any regime that oversees and administers
4027 boundaries.  In addition to bridging different areas,
4028 learning often involves pushing one's boundaries and
4029 getting out of one's comfort zone.  The ``sociological
4030 imagination'' involves seeing oneself as part of something
4031 bigger; this goes along with the idea of a discourse that
4032 lowers or transcends the boundaries between participants.
4033 Imagination of any form can challenge myopic patterns of
4034 resource use, although there are also myopic fictions
4035 which neglect to look at what's going on in reality!
4036 \end{notate}
4037
4038 \end{document}