org/farm-2017.org

   1 #+TITLE:     Functional models of mathematical reasoning
   2 #+AUTHOR:    Joseph Corneli and Raymond Puzio
   3 #+EMAIL:     contact@planetmath.org
   4 #+DATE:      May 27, 2017 for Thursday 1 June, 2017 deadline
   5 #+DESCRIPTION: Outline and draft of 12 page paper on math text analysis for FARM workshop at ICFP 2017
   6 #+KEYWORDS: arxana, hypertext, inference anchoring theory
   7 #+LANGUAGE: en
   8 #+OPTIONS: H:3 num:nil toc:t \n:nil @:t ::t |:t ^:nil -:t f:t *:t <:t
   9 #+OPTIONS: TeX:t LaTeX:nil skip:nil d:nil todo:t pri:nil tags:not-in-toc
  10 #+INFOJS_OPT: view:nil toc:nil ltoc:t mouse:underline buttons:0 path:http://orgmode.org/org-info.js
  11 #+EXPORT_SELECT_TAGS: export
  12 #+EXPORT_EXCLUDE_TAGS: noexport
  13 #+LINK_UP:
  14 #+LINK_HOME:
  15 #+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://metameso.org/~joe/solarized-css/build/solarized-light.css" />
  16 #+BIBLIOGRAPHY: farm-refs plain
  17 #+LATEX_HEADER: \usepackage[backend=bibtex,sorting=none]{biblatex}
  18 #+LATEX_HEADER: \addbibresource{farm-refs}
  19 #+STARTUP: showeverything
  20
  21
  22 #+BEGIN_QUOTE
  23 Functional Programming has emerged as a mainstream software
  24 development paradigm, and its artistic and creative use is booming. A
  25 growing number of software toolkits, frameworks and environments for
  26 art, music and design now employ functional programming languages and
  27 techniques. FARM is a forum for exploration and critical evaluation of
  28 these developments, for example to consider potential benefits of
  29 greater consistency, tersity, and closer mapping to a problem domain.
  30
  31 FARM encourages submissions from across art, craft and design,
  32 including textiles, visual art, *music*, 3D sculpture, animation, GUIs,
  33 video *games*, 3D printing and architectural models, choreography,
  34 *poetry*, and even VLSI layouts, GPU configurations, or
  35 *mechanical engineering designs*.  *Theoretical foundations*, *language design*,
  36 *implementation issues*, and *applications* in industry or the arts are
  37 all within the scope of the workshop.  The language used need not be
  38 purely functional (“mostly functional” is fine), and may be manifested
  39 as a domain specific language or tool.  Moreover, submissions focusing
  40 on questions or issues about the use of functional programming are
  41 within the scope.
  42 #+END_QUOTE
  43
  44 * 0 Introduction
  45
  46 #+BEGIN_QUOTE
  47 ``To get to the moon, we needed crashes and endless fiddling.'' Gabriel and Goldman, "Mob Software: The Erotic Life of Code" \cite{gabriel2000mob}
  48 #+END_QUOTE
  49
  50 ** 0.1 CONTEXT OF APPLICATION
  51 - modelling mathematical dialogues, people don't just write their things in metamath
  52 - people may make different expository choices
  53 - mathematical text contains argumentation an narrative in addition to logic and calculations
  54 ** 0.2 FORCES
  55 - push to make everything formal (QED, Tarski, LCF)
  56 - push to make everything understandable (Alexander)
  57 ** 0.3 PROBLEM
  58 - each different user will have a different "way in", depending on their background knowledge ([[3.2][3.2]])
  59 - how to get behind the scenes to see why?
  60 - how to tailor a given formal content to a given reader?
  61 ** 0.4 SOLUTION (STRATEGY)
  62 - focus on expressivity
  63 - focus on representing process
  64 - formalization and efficiency are important, but will be left for later.
  65 ** 0.5 RATIONALE
  66 - objective: understand mathematics as people do it (literary criticism, rhetoric, argumentation)
  67 - objective: learn from these methods to build new mathematical texts
  68 - build on intuitive diagrams
  69 - build on the idea of parsing natural language (complementary to Mohan's work)
  70 ** 0.6 RESOLUTION of FORCES
  71 - outcome: sharing diagrams with people who can use them
  72 - potential: draw new diagrams
  73 - philosophical: bridging 'two cultures' (proof checkers and artists/designers/hackers)
  74
  75 * 1 THEORETICAL FOUNDATIONS Informal mathematical language is more general, and  the reasoning involved may be abductive, inductive, or heuristic. :RAY:
  76
  77 Even though a mathematical result and its proof involve strict logic,
  78 the process of formulating, motivating proving and explaining go
  79 beyond strict formality. (Pierce, Lautman)
  80
  81 # ** <<1.1>> 1.1 Examples of actual texts (different registers)
  82
  83 # *** TODO Pull up examples from real texts.
  84
  85 # - E.g. Artin's Galois theory book
  86 # - E.g. Gowers's "Eventually this is something a computer can solve"
  87
  88 # ** <<1.2>> 1.2 A formal language doesn't do anything with this ("Proofs and Prologues")
  89
  90 # - (Conclusion from inductive reasoning in 1.1.)
  91
  92 # ** Text
  93
  94 Since our purpose is to understand mathematical reasoning as practiced
  95 by mathematicians, we will begin by studying examples of mathematical
  96 texts.  We would like to characterize the use of language
  97 gramatically and rhetorically, describe the logic and argumentation
  98 structures used so as to develop our formalism for representing
  99 mathematical knowledge in an informed matter.  For the purposes of
 100 illustration, we will examine four short texts:
 101
 102 - *MPM* :: Mini-polymath 3, threads 3, 11, and 14 - selected portions
 103            of a collaborative online effort to solve a problem from
 104            the 2011 International Mathematics Olympiad.[fn::
 105            https://polymathprojects.org/2011/07/19/minipolymath3-project-2011-imo/]
 106
 107 - *GCP* :: Timothy Gowers's - What is the 500th digit of
 108            $(\sqrt{2}+\sqrt{3})^{2012}$? from his Maxwell Institute
 109            Lecture "Modelling the mathematical discovery process"
 110
 111 - *AGT* :: Section II.H, ``Normal Extensions'' of ``Galois Theory'' by
 112            Emil Artin.
 113
 114 - *HHS* :: Section 22, ``Adjoints'' of ``Introduction to Hilbert
 115            Space'' by P. R. Halmos.
 116
 117 One of the first features which leaps to the eye when reading these
 118 texts is that much of the text is written in a fashion which employs a
 119 precise technical vocabulary, logical idioms, and symbolic notation to
 120 state mathematical propositions.  We will refer to this as the /formal
 121 register/ of mathematical discourse.  Here are two typical instances
 122 from our sample texts:
 123
 124 - If, conversely, $\varphi$ is a bounded linear functional, then
 125   there exists a unique operator $A$ such that $\varphi (x,y) = (Ax,
 126   y)$ for all $x$ and $y$.  (HHS, statement of theorem 1)
 127
 128 - The intermediate field $B$ is a normal extension of $F$ if and only
 129   if the subgroup $G_B$ is a normal subgroup of $G$. (AGT, statement
 130   of theorem 16)
 131
 132 Semantically, such statements are assumed to be equivalent to symbolic
 133 expressions in formal logic (in whatever formal system one is
 134 working).  For the examples above, the following might be formal
 135 equivalents:
 136 \begin{align*}
 137   &\varphi \in \mathrm{BLF} \Rightarrow (\exists ! A) (\forall x,y)
 138   \varphi (x,y) = (Ax,y) \\
 139   &\mathrm{norm\_ext} (B, F) \Leftrightarrow \mathrm{norm\_subgp} (G_B,
 140   G)
 141 \end{align*}
 142
 143 While statements in the formal register are an important part of
 144 mathematical writing, a text consisting only of formal statements
 145 would be frustrating to read because, while it might contain all the
 146 technical information, it would offer no guidance to the reader as to
 147 where the statements came from, why they are interesting, and how to
 148 go about understanding them.[fn:lamport-reviewer:Lamport \cite[p.~20]{lamport2012write21st} quotes a referee who had read one of his structured proofs (see Section [[2.3][2.3, below]]): "The proofs ... are lengthy, and are presented in a style which I find very tedious. I think the readers ... are going to be more interested in understanding the techniques and how they can apply them, than they will be in reading the formal proofs. A problem with the proofs is that they do not clearly distinguish the trivial manipulations from the nontrivial or surprising parts. ... My feeling is that informal proof sketches ... to explain the crucial ideas in each result would be more appropriate."] To offer this
 149 guidance, a mathematical text will also contain expository statements
 150 such as the following:
 151
 152 + We begin the proper business of this section by showing that the
 153   connection between linear transformations and bilinear functionals
 154   goes quite deeper than these superficial remarks. (HHS, first
 155   paragraph)
 156
 157 + We come now to a theorem known as the /Fundamental Theorem of Galois
 158   Theory/ which gives the relation between the structure of a
 159   splitting field and its group of automorphisms.  (AGT, right before
 160   statement of theorem 16)
 161
 162 + Is it possible to prove that any point and any line will do? (MPM)
 163
 164 + Can we do this for $x+y$? For $e$? Rationals with small denominator?
 165   (GCP)
 166
 167 The /expository register/ has several salient features which work
 168 together and which distinguish it from the formal register.
 169
 170 While the subject matter of expository statements is also mathematical
 171 objects and propositions, the language is not only descriptive, but
 172 also narrative and argumentative.  For instance, the phrases ``we
 173 begin'' and ``we come now'' in the first two examples are indicative
 174 of a narrative structure; the former serves to introduce and motivate
 175 the theory of adjoint operators in terms of a few observations and the
 176 latter motivates the theorem by describing its purpose in general
 177 terms.  The third example is argumentative, asking whether an earlier
 178 proposal might be generalizable.
 179
 180 Whereas in formal logic, only the strictest deductive reasoning is
 181 allowed, mathematical exposition also makes use of inductive and
 182 abductive reasoning, and even looser reasoning by analogy.  For
 183 instance, the fourth example illustrates an inductive mode of
 184 reasoning in which the questioner is looking for a general result of
 185 which the problem posed would be a special case.  In the first
 186 example, an analogy is being set up between the remarks which precede
 187 that statement and the propositions which follow it.
 188
 189 A related point is that, whereas formal statements only make use of
 190 the truth values and predicates of formal logic, the expository
 191 register also makes use of looser judgements of plausibility and
 192 heuristics.  For instance, in the first example, the adjectives
 193 ``superficial'' and ``deeper'' appear.  Terms such as these are not
 194 formally definable but refer to approximate notions which inform
 195 heuristic choices.
 196
 197 Allowing such loose notions and non-deductive reasoning serves an
 198 important purpose because, quite often, the exact deduction of some
 199 result might be lengthy and/or opaque.  When that happens, it is
 200 helpful to the reader to augment the formal reasoning with informal
 201 reasoning which is shorter and easier to understand.  Moreover, it
 202 not infrequently happens that such informal reasoning can serve to
 203 guide the construction of a formal proof, as in GCP and MPM.
 204
 205 Finally, the expository register is metamathematical, discussing not
 206 only objects and propositions, but also inference and proofs.  For
 207 instance, the third example asks about the provability of a
 208 statement.  Furthermore, this is not just strict metamathematics, but
 209 also may involve approximate and heuristic elements.  In particular,
 210 when setting out to prove a theorem, one may first assess different
 211 possible proof strategies using heuristic notions such as difficult
 212 vesrus easy or by making analogies with techniques used to
 213 successfully prove similar results.
 214
 215 In this essay, our main focus will be to construct a theory of
 216 mathematical exposition which will account for the features noted
 217 above and provide a semantics for utterances in the expository
 218 register which can be implemented and queried algorithmically.  Our
 219 focus here will be on the representation and what can be done with it
 220 as opposed to parsing natural language.  This is a matter of
 221 specialization, not of trivializing the latter.  We also note that
 222 there has been progress by Mohan in parsing the formal register and
 223 will later note that there exist natural language parsers for related
 224 AI problems in order to suggest that it is plausible that one could
 225 build a parser.
 226
 227 ** Urban explorer metaphor
 228
 229 As a way of illustrating the formal and expository registers, we offer
 230 the following analogy.  Instead of mathematical objects and
 231 propositions, consider locations and buildings in a city.  Instead of
 232 inferences, roads and paths and, instead of theoroes, neighborhoods
 233 and districts.
 234
 235 Now consider guides to that city.  A basic feature of a guide is
 236 description --- ``The exchange building is built of red brick and has
 237 a gabled roof''; ``The memorial fountain is made of white marble and
 238 has a stream of water coming out of the lion's mouth into an octagonal
 239 basin filled with goldfish.''  This corresponds to the formal
 240 register.  The most utilitarian form of a guidance is a list of
 241 directions --- ``Go straight three blocks, turn left, go another two
 242 blocks.  In the middle of the third block, there will be a building
 243 with yellow clapboard siding.''  This kind of statement corresponds to
 244 formal mathematical proofs.
 245
 246 However, there is much more to city guides than just straightforward
 247 descriptions and directions.  They may contain value judgements ---
 248 ``For a more scenic view, take this detour''; ``To avoid nasty taxi
 249 traffic, go under the intersection through the tunnel.''  While a
 250 straightforward set of directions might suffice for a business person
 251 passing through town, for someone with more leisure a single route or
 252 tree-like transportation map may be dissatisfying since that person
 253 would like to learn about all the different ways around together with
 254 their historical, cultural, and aesthetic associations.  A newcomer
 255 might appreciate a book which not only says what there is and how to
 256 get there, but also provides a higher level guidance as to what are
 257 some prominant landmarks and important sites and have things laid out
 258 in some order that makes it possible to assimilate information
 259 gradually rather than be barraged with everything at once.  Three
 260 comrades going out for a night together might start out by having a
 261 debate on which way to go is the best, weighing different concerns
 262 such as travel time versus convenience.  This is like the expository
 263 register.
 264
 265 Since a city is not a tree \cite{alexander1964city}, any guide to it which lays things out
 266 in a straightforward tree-like fashion will necessarily leave things
 267 out and be of limited use.  Likewise, for math, in order to go beyond
 268 the superficial, we will want some more rhizomatic sort of
 269 description which enables one to wander along a dérive along quirky
 270 winding roads.
 271
 272 * 2 LANGUAGE DESIGN The presentation here will describe a strategy we have been developing for representing mathematical dialogues and other informal texts. :JOE:
 273
 274 Bundy has argued that finding the right representation is the key to
 275 successful reasoning \cite[p.~16]{Bundy20130194}.  Since we cannot
 276 simply build on formal theorem proving for our current purposes, we
 277 need to look further afield for foundations.  We are concerned in
 278 particular with building representations \emph{of} reasoning
 279 \cite{corneli2017towards}.  This helps guide our search for useful
 280 paradigms.  McCarthy \cite{mccarthy2008well} posited several high-level
 281 features of a "language of thought", highlighting in particular the
 282 ways in which such a language would not be like spoken language:
 283
 284 #+BEGIN_QUOTE
 285 - Much mental information is represented in parallel and is processed in parallel.
 286 - Reference to states of the senses and the body has to be done by something like pointers. Natural languages use descriptions, because one person can’t give another a pointer to his visual cortex. (A robot might tell another robot, “Look through my eyes, and you’ll see it.”)
 287 - We don’t think in terms of long sentences.
 288 - Much human thought is contiguous with the thought of the animals from which we evolved.
 289 - For robots, logic is appropriate, but a robot internal language may also include pointers.
 290 - A language of thought must operate on a shorter time scale than speech does. A batter needs to do at least some thinking about a pitched ball, and a fielder often needs to do quite a bit of thinking about where to throw the ball. Pointers to processes while they are operating may be important elements of its sentences.
 291 #+END_QUOTE
 292
 293 While we won't directly accommodate all of these features, but we can
 294 find languages that deal with several of them.
 295
 296 ** <<2.1>> 2.1 Inference Anchoring Theory + Content (IATC)
 297
 298 Our modelling approach builds on /argumentation theory/; in particular
 299 Inference Anchoring Theory (IAT), which was devised by Budzynska et al
 300 \cite{budzynska2014model} to model the pro-and-contra logical
 301 structure of dialogical arguments.  IAT has since been adapted to
 302 model mathematical arguments \cite{gabi-iatc, corneli2017towards}.
 303 The primary features of this adaptation are: (1) to introduce more
 304 explicit relationships between pieces of mathematical content; and (2)
 305 to describe a range of intermediate relations that model the way these
 306 objects fit together in discourse.  These include inferential
 307 structure, judgements of validity or usefulness, and reasoning
 308 tactics.  Later in this article we will describe "IAT+Content" (IATC)
 309 in more detail .
 310
 311 ** <<2.2>> 2.2 Conceptual Dependence (CD)
 312
 313 Conceptual Dependence was initially introduced as a tool for
 314 understanding natural language \cite{schank1972conceptual}.  However,
 315 it is also used to represent knowledge about actions
 316 \cite{lytinen1992conceptual,sowa2008conceptual}.  The basis of the
 317 theory is a set of /primitives/ which describe basic types of actions
 318 such as "=PTRANS= — transfer of location" or "=MBUILD= — construction
 319 of a thought or new information".  Each primitive comes with a set of
 320 /slots/ which accept objects of certain types.  By filling in the
 321 slots of a primitive, one obtains the most elementary form of a
 322 /conceptual dependency graph/.  By combining such basic graphs, one
 323 can build up more complicated CD graphs which describe actions in
 324 greater detail.  Later, we will show that CD primitives are
 325 analogous to IAT performatives, and what follows from this.
 326
 327 ** <<2.3>> 2.3 Structured proofs
 328
 329 Although we are not focusing on formal proofs, at the level of
 330 mathematical content we are talking about similar things: mathematical
 331 objects and mathematical propositions.  Formal mathematics models the
 332 same universe of discourse, but the discourse itself is not
 333 necessarily formal.  Accordingly, we can derive a useful sanity check from
 334 Lamport's notion of a ``structured proof'' \cite{lamport1995write}.
 335 In essence, a structured proof combine a summary (or ``proof sketch'')
 336 with a mode of writing out details so way that each step is made clear
 337 and each assumption and assertion is a label.  We emphasize that
 338 structured proofs are trees.  As first introduced, there were just a
 339 few keywords: =assume=, =prove=, =let=, and =case=.  In a more recent
 340 reassessment \cite{lamport2012write21st}, Lamport states "The way I
 341 now write proofs has profited by my recent experience designing and
 342 using a formal language for machine-checked proofs."  In line with
 343 this, additional keywords =new=, =suffices=, and =pick= are imported
 344 from his "Temporal Logic of Actions+" (TLA$^{+}$) language
 345 \cite{lamport1999specifying}, along with a shorthand for equational
 346 proofs.  More recently, a second version of this language,
 347 TLA$^{+2}$, has been made available, with new keywords (and now with
 348 lambda expressions) \cite{lamport2014tla2}.[fn:1] Lamport's structured
 349 proof is an interesting example of a ``missing link'' between informal
 350 and formal mathematics.  The TLA\(^+\) language seems in our opinion
 351 to be reasonably well aligned with typical mathematical style.  The
 352 history of its development helps show which features can be considered
 353 most essential.  We note that in principle it should be possible to
 354 coherently map to any formal mathematical language from our content
 355 layer.
 356
 357 [fn:1] In full, TLA\(^{+2}\)'s keywords are: =action=, =assumption=, =axiom=, =by=, =corollary=, =def=, =define=, =defs=, =have=, =hide=, =lambda=, =lemma=, =new=, =obvious=, =omitted=, =only=, =pick=, =proof=, =proposition=, =prove=, =qed=, =recursive=, =state=, =suffices=, =take=, =temporal=, =use=, =witness=.
 358
 359 ** <<2.4>> 2.4 The Lakatos Game (LG)
 360
 361 Lakatos \cite{lakatos1976proofs} developed an abstract, dialectical,
 362 model of mathematical creativity which has recently been formalised as
 363 a \emph{dialogue game} \cite{pease2017lakatos}.  In the "Lakatos
 364 Game", assertions are made and then adjusted: for instance, a
 365 statement may be sharpened so that it applies to a restricted class of
 366 objects ("Strategic Withdrawal"), or a putative counterexample may be
 367 shown not to be a counterexample at all ("Monster Barring"). Lakatos's
 368 theory is limited, first, because it commits almost exclusively to
 369 Hegelian dialectic.  Secondly, when we zoom in and look at concrete
 370 details of mathematical objects, it should be no surprise that there
 371 are a range of issues that an abstract model will not cover.  In
 372 short, as a representation of reasoning the Lakatos Game is at least
 373 somewhat descriptive, but not explicitly runnable.  We hope to
 374 overcome these limitations.
 375
 376 * 3 IMPLEMENTATION ISSUES (REQUIREMENTS) :JOE:
 377
 378 An informal proof is in general not a tree, even if a structured proof
 379 -- or the trace of the moves made in a dialogue game -- does rather
 380 satisfactorily /describe/ the proof as a tree.  In the limit of full
 381 formality, tree-shaped descriptions are machine checkable -- and by
 382 the Curry-Howard correspondence, proofs-as-programs are even runnable.
 383 However, by default these objects give very little information about
 384 the proof's (or program's) genesis.  Indeed, as we shall see, we soon
 385 run into problems even with a still-more-general network model.
 386
 387 ** <<3.1>> 3.1 Look, it's a graph; no it's a hypergraph!
 388
 389 Figure [[fig:gcp-implements]] is an excerpt from a larger diagram[fn:2]
 390 that analyses Gowers's walk-through of the solution to the problem
 391 "What is the 500th digit of $(\sqrt{2}+\sqrt{3})^{2012}$?" Figure
 392 [[fig:iatc-key]] provides a graphical key to the color-coding of IATC
 393 operations.  We focus in on this step of the proof:
 394 #+BEGIN_QUOTE
 395 And $(\sqrt{3}-\sqrt{2})^{2012}$ is a very small number. Maybe the final answer is "9"?
 396 #+END_QUOTE
 397
 398 #+CAPTION: GCP (detail)
 399 #+NAME: fig:gcp-implements
 400 [[./gowers-example-selection.png]]
 401
 402 #+CAPTION: IATC (key)
 403 #+NAME: fig:iatc-key
 404 [[./iatc-key.png]]
 405
 406 In IATC, utterances expand to performatives: here, in Figure
 407 [[fig:gcp-implements]], simply a list of assertions (two of which are only
 408 implied by the quoted utterance).  Performatives, in turn, expand to
 409 nodes that convey inferential structure (in pink), reasoning tactics
 410 (in blue), and judgements of validity or usefulness -- which often
 411 take the form of heuristics (in green), as well as content-level
 412 relations (colorless; as, for example, the relation "contains as
 413 summand").  Intermediate nodes typically have targets in the content
 414 layer.  For example, the assertion that
 415 "\((\sqrt{3}-\sqrt{2})^{2012}\)" =has_property= "is small" is typical
 416 in this regard.  However, the =implements= node poses some difficulty,
 417 since its outbound arrow does not have another node as a target.  What
 418 it means to say is that the subgraph implements a heuristic idea that
 419 was proposed earlier: "The trick might be: it is close to something we
 420 can compute."  Without the =implements= certificate, we would see a
 421 proliferation of heuristic suggestions without any sense of which ones
 422 are actually used.  And yet, it is clear that this is an assertion
 423 about a subgraph, and not a specific piece of logic.  Recall
 424 McCarthy's assertion about the language of thought: "Pointers to
 425 processes while they are operating may be important elements of its
 426 sentences."  Whereas a structured proof would presumably just give the
 427 indicated subgraph a name and push it down the hierarchy into a lemma,
 428 we can see from the diagram that such a representation strategy poses
 429 problems with locality: /vide/ the inbound links to =sums= (from a
 430 previous assertion) and to "some integer" (from a subsequent step in
 431 the reasoning).  These reflections suggest the need for /hypergraphs/,
 432 not just graphs.[fn:3] With a hypergraph, pointing to a subset of
 433 nodes is straightforward.  Actually, we want one of the standard
 434 generalizations of hypergraphs: the ability for an edge to point to another edge.
 435 This way we can lasso the "lemma" as an object of interest, and point to
 436 it, without the need to bracket it off from the rest of the proof.
 437
 438 [fn:2] http://metameso.org/ar/gowers-example.pdf.
 439 [fn:3] From Wikipedia: Formally, a hypergraph $H$ is a pair $H = (X,E)$ where $X$ is a set of elements called ``nodes'' or ``vertices'', and $E$ is a set of non-empty subsets of $X$ called ``hyperedges'' or ``edges''. (https://en.wikipedia.org/wiki/Hypergraph)
 440
 441 ** <<3.2>> 3.2 The search for the `quantum of progress'
 442
 443 There is a further issue here, which the =implements= certificate
 444 hints at.  Just how does the person working through a proof attempt
 445 decide which step to take next?  In another setting, Gowers provides
 446 some very helpful guidance.
 447
 448 The Polymath project aimed to answer the question "is massively
 449 collaborative mathematics possible?" This project was convened by
 450 Gowers in 2009, and he developed 15 rules to address the "questions of procedure"
 451 that he anticipated would arise in the project.[fn:: http://gowers.wordpress.com/questions-of-procedure/ ] The basic premise was that people would work together on a
 452 shared blog to discuss a given mathematical problem in the open,
 453 and jointly work out a solution.[fn:: Incidentally, the answer to the main motivating question was at least a tentative ``yes'' \cite{gowers2009massively}. New collaborative Polymath research projects continue to be run to the present day. Four Minipolymath projects (the third of which we refer to here as "MPM") have been convened in the same spirit, focusing on competition problems rather than research problems. However, these projects cannot be said to be "massive," if we use collaborations in other scientific disciplines as our standard.]
 454
 455 Gowers carefully explicated the imperative to share comments that are
 456 "not fully thought out."[fn:UMEM] A good contribution to the project
 457 would comprise what he called a "quantum of progress."  Continuing
 458 with this metaphor, as the KRR strategy we are developing comes more
 459 fully into focus, it should be clear that we need to better understand
 460 the "fields" that generate these quanta!  That is, in order to build a
 461 functional representation of mathematical reasoning, we need to model
 462 both what's there at any given step, and also how it evolves -- noting
 463 that such evolution is generally heuristic.
 464
 465 Here one can compare Gowers's work with Ganesalingam on a
 466 ``human-oriented theorem proving'' system \cite{Ganesalingam2016}.
 467 First, we will comment that this work motivated the presentation of
 468 GCP.  However, Ganesalingam and Gowers's system was only able to solve
 469 more straightforward "textbook" problems, in which the next step is
 470 always more or less "obvious".  Under the hood, there is a Logic of
 471 Computable Functions (LCF) style inference system.  Because the system
 472 generates proofs in natural language /and/ also makes its internal
 473 processing explicit, it is quite straightforward to diagram its
 474 operations in IATC (Figure [[fig:robotone-selection]]).[fn::
 475 http://metameso.org/ar/robotone-example.pdf.]  In Figure
 476 [[fig:robotone-selection]] the heuristics -- which in this case correspond
 477 to pre-programmed LCF moves -- are disconnected from both the natural
 478 language and content.  This could be rectified in a revision: the main
 479 point being that a careful study of the relationship between the
 480 underlying logic and the IATC representation would need to be
 481 developed.  Clearly, this applies to more informal proof systems as
 482 well. *Possibly comment here about how this variant is not "driven" by the natural language, but the NL is sort of an epiphenomenon.*
 483
 484 #+CAPTION: ROBOTONE example (detail)
 485 #+NAME: fig:robotone-selection
 486 [[./robotone-example-selection.png]]
 487
 488
 489 [fn:UMEM] Ursula Martin had commented ten years earlier, with regard to the emerging field of "experimental mathematics," that the growth of experimentation requires the "development of community standards as to experimental methodology," emphasizing not only technical matters, but also "the early sharing of insights" \cite{martin1999computers}.
 490
 491 * 4 APPLICATIONS Our latest efforts produce more detailed models of mathematical arguments. :JOE:
 492
 493 One example Sussman gives of modelling "`how to' as well as `what is'"
 494 \cite{sussman2005programming} is a LISP program that he wrote with
 495 Richard Stallman that can replicate the process of analyzing a circuit
 496 \cite{sussman2011programming}.  The program explains /why/ it comes up
 497 with the answers it does.  The "why question" is even more important
 498 (and interesting) when nontrivial ``creativity'' is involved in the
 499 reasoning process \cite{gucklesberger2017addressing}.  In the case of
 500 Ganesalingam and Gowers's ROBOTONE, described above, the answer will
 501 always be more or less the same: /because this is what LCF tells
 502 me to do./  In other words, the program has little or nothing in the
 503 way of context sensitivity.  This is, let's say, fine, given its
 504 limited scope of application, but it would not work for a system that
 505 was supposed to participate in mathematical dialogues, nor
 506 (presumably) for a system that was exploring some entirely new area of
 507 mathematics instead of "textbook" problems.  Here, we aim to describe
 508 a language that can (in principle) reason equally well about both
 509 monological "textbook" and dialogical Polymath-style discussions.
 510
 511 We are aided by the fact that CD has already been used to reason about
 512 informally-constructed scenarios. *For example, given a description of
 513 a restaurant...*
 514
 515 #+CAPTION: MPM example (detail)
 516 #+NAME: fig:mpm-selection
 517 [[./mpm-example-selection.png]]
 518
 519 Given the relation between goals, values, and performatives and the
 520 way these are similar to what goes on in game playing and story
 521 understanding systems, it should be possible to have the system look
 522 at a transcribed dialogue from MPM, for instance (compare Figure
 523 [[fig:mpm-selection]]),[fn::
 524 http://metameso.org/ar/minipolymath-example.pdf.] and
 525 answer questions like:
 526
 527 - "Why does Thomas agree with Anonymous?"
 528 - "Why does Gal says the suggestion is beautiful?"
 529 - "Why did Anonymous suggest `perhaps even the line does not matter'?"
 530
 531 The premise here is that the dynamics at work are based on
 532 performatives being chosen based upon current valuations, goals, and
 533 other contextual features.  We discuss both kinematics and dynamics
 534 below.
 535
 536 ** <<4.1>> 4.1 Kinematics (IATC+CD)
 537
 538 Lytinen \cite{lytinen1992conceptual} gives the following example which
 539 schematizes the sentence "John gave Mary a book" as an s-expression,
 540 rooted on the =ATRANS= primative:
 541
 542 #+BEGIN_SRC lisp
 543 (ATRANS ACTOR  (PERSON NAME JOHN)
 544         OBJECT (PHYS-OBJ TYPE BOOK)
 545         RECIP  (PERSON NAME MARY))
 546 #+END_SRC
 547
 548 We would like to do the same sort of thing here, using IATC
 549 performatives as our primatives.  Like the standard CD primatives,
 550 these performatives have slots, described as follows:[fn:: See \cite{gabi-iatc} for further details.]
 551
 552 - =Assert= (/s/ [, /a/ ]): Assert belief that statement /s/ is true, optionally because of /a/.
 553 - =Agree= (/s/ [, /a/ ]): Agree with a previous assertion.
 554 - =Challenge= (/s/ [, /a/ ]): Assert belief that statement /s/ is false, optionally because of /a/.
 555 - =Retract= (/s/ [, /a/ ]): Retract a previous assertion /s/, optionally because of /a/.
 556 - =Define= (/o/, /p/): Define (name) object /o/ by property /p/.
 557 - =Judge= (/s/): Apply a heuristic value judgement /s/ to some statement.
 558 - =Suggest= (/s/): Advance a tactical reasoning move, /s/.
 559 - =Query= (/s/): Ask for the truth value of /s/.
 560 - =QueryE= ({/pi/(/X/)} . /i/): Ask for the class of objects /X/ for which all of the properties /pi/ hold.
 561
 562 Similarly, the statements that are introduced via these performatives
 563 also have slots, for example:
 564
 565 - =has_property= (/o/, /p/): Object /o/ has property /p/.
 566 - =strategy= (/m/, /s/): Method /m/ may be used to prove statement /s/.
 567 - =beautiful= (/s/): Statement /s/ is mathematically elegant.
 568
 569 As we've seen in our examples we may want to point to a subgraph
 570 rather than a "statement", as with the =implements= relation in Figure
 571 [[fig:gcp-implements]].  Assertions that are fully unfolded generally bottom
 572 out in the content layer: assertions can be made directly about objects
 573 and propositions, or relationships between , as in Figure
 574 [[fig:gcp-implements]]'s =contains as summand= relation; content
 575 they can be addressed through other intermediate assertions, such as
 576 "\((\sqrt{3}-\sqrt{2})^{2012}\)" =has_property= "is small" in the same
 577 figure.  In effect, s-expressions parse statements in natural language
 578 and use the resulting structures to build graphs like the ones shown
 579 in our figures.  The assertions from Figure [[fig:gcp-implements]]
 580 (along with a few pieces of text that are off-screen in that image)
 581 may be represented thusly:
 582
 583 #+BEGIN_SRC lisp
 584 (Assert
 585  "contains as summand"
 586  "(sqrt(2)+sqrt(3))^2012+(sqrt(3)-sqrt(2))^2012"
 587  "(sqrt(3)-sqrt(2))^2012")
 588
 589 (Assert (has_property "(sqrt(3)-sqrt(2))^2012"
 590                       "is small"))
 591
 592 (Suggest (strategy "numbers that are very close
 593                     to integers have \"9\"
 594                     in many places of their
 595                     decimal expansion"))
 596
 597 (Assert (implements #SUBGRAPH
 598                     "the trick might be: it
 599                      is close to something
 600                      we can compute"))
 601 #+END_SRC
 602
 603 We defer the somewhat technical matter of indexing a specific subgraph
 604 to Section [[5.2]].
 605
 606 ** <<4.2>> 4.2 Dynamics
 607 # (Lytinen's stuff with goals and scripts)
 608
 609 Static graphical representations of mathematical reasoning do not by
 610 themselves live up to promise of /functional/ models.  We need to be
 611 able to reason about structure: answering "why" questions, as above,
 612 or introducing new reasoned statements into a dialogue.  Here we are
 613 inspired by Sowa and Majumdar's treatment of reasoning by analogy
 614 \cite{sowa2003analogical}.  For example, early in GCP, Gowers asks:
 615
 616 #+BEGIN_QUOTE
 617 Can we do this for $x+y$? For $e$? Rationals with small denominator?
 618 #+END_QUOTE
 619
 620 One linguistically tricky issue here is the resolution of the anaphora
 621 "this".  We can see that "this" refers to /computing the 500th digit
 622 of X/.  The anaphoristic relation is grounded in several proposed
 623 analogies:
 624
 625 #+BEGIN_SRC lisp
 626 (Assert
 627  (analogy
 628   "compute 500th digit of (sqrt(2)+sqrt(3))^2012"
 629   "compute 500th digit of (x+y)^2012"))
 630
 631 (Assert
 632  (analogy
 633   "compute 500th digit of (sqrt(2)+sqrt(3))^2012"
 634   "compute 500th digit of e^2012"))
 635
 636 (Assert
 637  (analogy
 638   "compute 500th digit of (sqrt(2)+sqrt(3))^2012"
 639   "compute 500th digit of r^2012,
 640    where r is a rational with small denominator"))
 641 #+END_SRC
 642
 643 We can see the analogies even more clearly when the content is
 644 represented in graphical form, i.e., when the constituent objects and
 645 functions have been fully expanded.  Figure [[fig:qa-fig]] shows an
 646 example from \cite{corneli2017towards} that describes an analogy
 647 between a known theorem for finite groups, and a structurally
 648 related conjecture for infinite groups.
 649
 650 #+CAPTION: Q&A example (lightly adapted from \cite[Fig. 1]{corneli2017towards})
 651 #+NAME: fig:qa-fig
 652 [[./revised-qa-question.png]]
 653
 654 The analogy between the two clusters is clear: in one case $G$ is a
 655 finite group, in the other $G$ is an infinite group.  Naturally, this
 656 is not the same "\(G\)".  Using a typed formalism we could be more
 657 specific about the difference between the two sides, while retaining a
 658 structural similarity on the two sides of the analogy:[fn:: Something along the lines of clojure.spec may be useful these purposes; it provides a "standard, expressive, powerful and integrated system for specification and testing"; see https://clojure.org/about/spec.]
 659
 660 #+BEGIN_SRC
 661 G: group           -> X: group
 662 G has_card finite  -> X has_card infinite
 663 #+END_SRC
 664
 665 In short, if we can expand the statements in sufficient detail, we can
 666 start to see the specific analogies between them.  We can then form
 667 hypotheses like "Anonymous suggests `perhaps even the line does not
 668 matter' by strict analogy with Nemanja's immediately previous
 669 assertion "we can start with any point and 'just' need to choose a
 670 second point through which we will draw a line."  Here, Nemanja has
 671 described a specific simplification, and Anonymous wonder if things
 672 can be simplified again.
 673
 674 Given sufficently detailed representations - sometimes requiring
 675 background theories (like group theory in the Q&A example, or plane
 676 geometry in MPM), we expect analogical reasoning to serve as the main
 677 explanatory mechanism.  Sowa and Majumdar describe three ways of doing
 678 analogical reasoning about conceptual graphs:
 679
 680 #+BEGIN_QUOTE
 681 The first compares two nodes by what they contain in themselves
 682 independent of any other nodes; the second compares nodes by their
 683 relationships to other nodes; and the third compares the mediating
 684 transformations that may be necessary to make the graphs comparable.
 685 #+END_QUOTE
 686
 687 Pushing the reasoning problem "back" to the natural language
 688 processing layer by suggesting that sufficiently rich representations
 689 are the main thing we need to proceed with reasoning may seem like it
 690 is dodging the issue.  However, there is in fact a (high-order)
 691 analogy between what goes on in parsing and reasoning.  Lytinen
 692 \cite{lytinen1992conceptual} summarises Schank and Birnbaum
 693 \cite{schank-and-birnbaum} on semantic parsing, again, referring to
 694 three aspects of the parser:
 695
 696 #+BEGIN_QUOTE
 697 ... its /control structure/, or the processes that occur during
 698 parsing; its /representational structures/, or the representations
 699 which it builds during the parsing process; and its /knowledge base/,
 700 or the set of rules which the parser draws on and which drive the
 701 parse of a text.
 702 #+END_QUOTE
 703
 704 Functional reasoning moves also have a control structure,
 705 representational structures, and a knowledge base, though they start
 706 from pre-parsed content.  Sowa and Majumdar refer to a "graph
 707 grammar".  We will show a example relevant to our setting in the
 708 following section.
 709
 710 * 5 MECHANICAL ENGINEERING DESIGNS A graphical formalism and a corresponding prototype implementation will be described. :RAY:
 711
 712 ** <<5.1>> 5.1 Import and improve technical stuff on Arxana (cf. Nemas doc)
 713 ** <<5.2>> 5.2 Example
 714
 715
 716 In order to store graphical representations of knowledge in memory and
 717 query them, we will make use of a system called /Arxana/ which we have
 718 been developing in LISP.  The basis of this system is higher-order
 719 nested semantic networks.  By ``higher order'', it is meant that links
 720 can point to other links.  By ``nested'', it is meant that the nodes
 721 which make up a network may contain other networks, which themselves
 722 may have nodes containing yet other networks, etc.  In addition, all
 723 the links are bidirectional and no fundamental distinction is made
 724 between links and nodes.
 725
 726 For convenience and flexibility, this package has been designed in a
 727 modular and extensible fashion.  Basic functionality is provided by a
 728 back-end which manages the data of the network.  Higher-level access
 729 to the data is mediated by means of a handful of primitive commands
 730 and, as long as these commands behave similarly, it does not matter to
 731 the end user how they are implemented or how the data is represented
 732 internally.
 733
 734 The basic unit from which we will construct our networks is called an
 735 ``article''.  Articles are data objects which serve to encode both
 736 links and nodes and are characterized by the following components:
 737
 738 - Identifier
 739 - Source
 740 - Sink
 741 - Content
 742
 743 The identifier comes in two forms, a unique numerical identifier which
 744 the computer assigns to it when it is entered into the database and
 745 human-readable aliases which may be assigned by the user.  Source and
 746 sink are pointers to other articles and content is a place in which to
 747 store a lisp object, which might be some text, or an expression, or
 748 maybe a number or maybe something else depending on what intends to
 749 represent with one's network and how one intends to use it.  In
 750 particular, the content of an article can even be another network
 751 constructed out of more articles.
 752
 753 Bidirectional linking means that the links are stored and accessed in
 754 a manner which makes it no more costly to find and traverse links in
 755 the backwards than the forward direction.  Storing content in articles
 756 is what allows the nodes of a hypergraph to serve as a distinct
 757 carrier of information and can be quite useful.  In the case of links,
 758 this room for extra information can be used to do things like, say,
 759 specify what part of a story is being considered in a link to a
 760 comment on a story.
 761
 762 In order to make this respresentation actionable, we will use two main
 763 facilities: hypergraph search and hypergraphs as programs.
 764
 765 Given a hypergraph, we can query it for subhypergraphs which match
 766 some suitable criterion.  A query consists of a hypergraph whose
 767 articles contain predicates and a match consists of a mapping of
 768 hypergraphs from the query onto a subhypergraph such that, for every
 769 article in the query, the predicate it contains is true of the
 770 contents of the corresponding database article.  Since the predicate
 771 can be any lisp function, perhaps one which is nested and recursive,
 772 this feature can be quite powerful.
 773
 774 By modelling various structures as hypergraphs, we can search for
 775 instances of them.  For instance, if we implement subcollections as
 776 cones, we can use hypergraph search to restrict searches to
 777 subcollections.  Or, if we represent logical inferences in terms of a
 778 hypergraph for the premise and a hypergraph for the conclusion which
 779 share subgraphs, then we can use hypergraph search to find and verify
 780 reasoning.
 781
 782 Since the content of a node is a lisp object, it can potentially be a
 783 piece of executable code.  To make the most of this possibility, we
 784 have a utility which runs this code in a context which is determined
 785 by links out of the node containing the code.
 786
 787 An important use of this facility is handling /clusions/.  The basic
 788 feature of clusions is that, rather than simply looking at the text of
 789 a node, one instead generates a text from that node by processing that
 790 text in its network context.  A classical example would be one where
 791 the text contains link anchors which mean "copy this portion of the
 792 texts from some other node".  In this case the processing would
 793 consist of looking up those references and inserting the portions in
 794 the appropriate places.  Furthermore, due to possibility of recursion
 795 described above, it is possible that the texts cited might themselves
 796 be in need of similar assembly.
 797
 798 This is also a mechanism which allows us to operationalize our
 799 graphical representations.  Given a node corresponding to a
 800 performative of predicate, as content we can take a routine which
 801 verifies that the predicate indeed holds of the content to which the
 802 node points.  Then running that node would carry out the verification.
 803
 804 * 6 RELATED WORK (GAMES, POETRY): parallels between dialogues and discursive Q&A, traditional proofs. :JOE:
 805
 806 No one would insist that a nonlinear film like /Amores Perros/ or
 807 /Pulp Fiction/ needs to be represented as a tree.  Propp-style
 808 theories of storytelling tell us how stories are "supposed" to be
 809 written, but in general these models don't stand up to scrutiny.  The
 810 title of Robbe-Grillet's short novel /Dans le labyrinthe/, in a way,
 811 says it all.
 812
 813 ** <<6.1>> 6.1 prior art: board games, story stuff; table that compares between.
 814 ** <<6.2>> 6.2 Alexander: "a city is not a tree"; compare Lamport
 815
 816 ** Text
 817
 818 In order to build a computational theory of the expository register,
 819 we will draw upon several well-studied topics in AI which deal with
 820 situations that have common features, namely the block world, board
 821 games, and story comprehension (Table [[tab:comparison]]).
 822
 823 #+CAPTION: Comparison between computation in different domains
 824 #+NAME: tab:comparison
 825 | /Level/               | *Block World*     | *Board Games*         | *Story Comprehension* |
 826 |-----------------------+-------------------+-----------------------+-----------------------|
 827 | universe of discourse | blocks on a table | game pieces on board  | everyday life         |
 828 | reasoning             | instructions      | rules & strategy      | analogy               |
 829 | value                 | consistency?      | prediction of winning | costs and benefits    |
 830 | role of disagreement  | none              | multiple strategies   | ethical dilemmas      |
 831
 832 According to David Moshman \cite{moshman2004inference},
 833 /inference/, /thinking/, and /reasoning/ are defined as follows:
 834
 835 #+BEGIN_QUOTE
 836 - *Inference*---going beyond the data---is elementary and ubiquitous.
 837 - *Thinking* is the deliberate application and coordination of one’s inferences to serve one’s purposes
 838 - *Reasoning* is epistemically self-constrained thinking [...] with the intent of conforming to what they deem to be appropriate inferential norms.
 839 #+END_QUOTE
 840
 841 Reasoning is particularly effective in social settings, according to
 842 Sperber & Mercier \cite{mercier2017enigma}.  Even if we are not
 843 interacting directly with others, "appropriate inferential norms" are
 844 social constructions. Indeed, Minsky saw thought as inherently social.
 845
 846 Quite contrary to the sentiment conveyed by the
 847 misattributed quote ``Mathematics is a game played according to
 848 certain simple rules with meaningless marks on paper,'' David Hilbert,
 849 an early and important proponent of mathematical formalism, emphasised
 850 the role of intuition and experience, and discussed conjectural
 851 thinking and the fallibility of mathematical reasoning
 852 \cite{hilbert1992natur,corry1997origins}.[fn:thomae]
 853
 854 [fn:thomae] The actual source of the quote appears to be Carl Johannes
 855 Thomae, /Elementare Theorie der analytische Functionen einer complexen
 856 Veraenderlichen/ (1898), p. 3: ``For the formalist, arithmetic is a game
 857 with signs, which are called empty. That means they have no other
 858 content (in the calculating game) than they are assigned by their
 859 behaviour with respect to certain rules of combination (rules of the
 860 game).'' It is worth noting that Thomae nevertheless continues on to
 861 remark on the difference between arithmetic and chess, insofar as
 862 ``numbers can be related to concrete manifolds by means of simple
 863 axioms.''  Cf. this discussion in the ``Foundations of Mathematics''
 864 mailing list:
 865 https://www.cs.nyu.edu/pipermail/fom/2005-April/008889.html.
 866
 867 # * 7 PART TWO In our planned next steps we plan to bring in explicit type-theoretic representations of mathematical objects within a service-oriented architecture. :onhold:
 868
 869 # ** <<7.1>> 7.1 There is a degree of formality in the specifications, but also informality in the heuristics.
 870
 871 # * 8 MOB PROGRAMMING :onhold:
 872
 873 # ** <<8.1>>
 874 # ** <<8.2>>
 875
 876 # * 9 RELATED WORK music from corneli, pease, stefanou survey ("Chapter 6"); NNexus project; Socratic :onhold:
 877
 878 # ** <<9.1>>
 879 # ** <<9.2>>
 880
 881 # * 10 CONCLUSION :onhold:
 882
 883 # ** <<10.1>>
 884 # ** <<10.2>>