manual/luatex-libraries.tex

   1 \environment luatex-style
   2 \environment luatex-logos
   3
   4 % HH: to be checked
   5
   6 \startcomponent luatex-libraries
   7
   8 \startchapter[reference=libraries,title={\LUATEX\ \LUA\ Libraries}]
   9
  10 The implied use of the built|-|in \LUA\ modules \type {epdf}, \type {fontloader},
  11 \type {mplib}, and \type {pdfscanner} is deprecated. If you want to use these,
  12 please start your source file with a proper \type {require} line. In the future,
  13 \LUATEX\ will switch to loading these modules on demand.
  14
  15 The interfacing between \TEX\ and \LUA\ is facilitated by a set of library
  16 modules. The \LUA\ libraries in this chapter are all defined and initialized by
  17 the \LUATEX\ executable. Together, they allow \LUA\ scripts to query and change a
  18 number of \TEX's internal variables, run various internal \TEX\ functions, and
  19 set up \LUATEX's hooks to execute \LUA\ code.
  20
  21 The following sections are in alphabetical order. For any callback (and
  22 manipulation of nodes) the following is true: you have a lot of freedom which
  23 also means that you can mess up the node lists and nodes themselves. So, a bit of
  24 defensive programming doesn't hurt. A crash can happen when you spoil things; when
  25 \LUATEX\ can recognize the issue, a panic exit will happen instead. Don't bother
  26 the team with such issues.
  27
  28 \section{The \type {callback} library}
  29
  30 This library has functions that register, find and list callbacks. Callbacks are
  31 \LUA\ functions that are called in well defined places. There are two kinds of
  32 callbacks: those that mix with existing functionality, and those that (when
  33 enabled) replace functionality. In most cases the second category is expected to
  34 behave similarly to the built|-|in functionality because in a next step specific
  35 data is expected. For instance, you can replace the hyphenation routine. The
  36 function gets a list that can be hyphenated (or not). The final list should be
  37 valid and is (normally) used for constructing a paragraph. Another function can
  38 replace the ligature builder and|/|or kerner. Doing something else is possible
  39 but in the end might not give the user the expected outcome.
  40
  41 The first thing you need to do is to register a callback:
  42
  43 \startfunctioncall
  44 id, error = callback.register (<string> callback_name, <function> func)
  45 id, error = callback.register (<string> callback_name, nil)
  46 id, error = callback.register (<string> callback_name, false)
  47 \stopfunctioncall
  48
  49 Here the \syntax {callback_name} is a predefined callback name, see below. The
  50 function returns the internal \type {id} of the callback or \type {nil}, if the
  51 callback could not be registered. In the latter case, \type {error} contains an
  52 error message, otherwise it is \type {nil}.
  53
  54 \LUATEX\ internalizes the callback function in such a way that it does not matter
  55 if you redefine a function accidentally.
  56
  57 Callback assignments are always global. You can use the special value \type {nil}
  58 instead of a function for clearing the callback.
  59
  60 For some minor speed gain, you can assign the boolean \type {false} to the
  61 non|-|file related callbacks. Doing so will prevent \LUATEX\ from executing
  62 whatever it would execute by default (when no callback function is registered at
  63 all). Be warned: this may cause all sorts of grief unless you know {\em exactly}
  64 what you are doing!
  65
  66 Currently, callbacks are not dumped into the format file.
  67
  68 \startfunctioncall
  69 <table> info = callback.list()
  70 \stopfunctioncall
  71
  72 The keys in the table are the known callback names, the value is a boolean where
  73 \type {true} means that the callback is currently set (active).
  74
  75 \startfunctioncall
  76 <function> f = callback.find (callback_name)
  77 \stopfunctioncall
  78
  79 If the callback is not set, \type {callback.find} returns \type {nil}.
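
As a rough sketch of how these three functions fit together, the code below
registers a function, checks the result, lists the active callbacks and clears
the registration again. The name \type {passthrough} is made up for this
illustration, and the guard merely skips the calls when the code is loaded
outside \LUATEX.

\starttyping
-- Hypothetical handler: returning nil leaves the input line untouched.
local function passthrough(buffer)
    return nil
end

if callback then -- the library only exists inside LuaTeX
    local id, err = callback.register("process_input_buffer", passthrough)
    if not id then
        print("registration failed: " .. tostring(err))
    end

    -- callback.list() maps every known callback name to a boolean.
    for name, active in pairs(callback.list()) do
        if active then
            print("active callback: " .. name)
        end
    end

    -- callback.find returns nil when nothing is registered.
    local current = callback.find("process_input_buffer")
    print("currently set: " .. tostring(current ~= nil))

    -- Passing nil clears the callback again.
    callback.register("process_input_buffer", nil)
end
\stoptyping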
  80
  81 \subsection{File discovery callbacks}
  82
  83 The behavior documented in this subsection is considered stable in the sense that
  84 there will not be backward|-|incompatible changes any more.
  85
  86 \subsubsection{\type {find_read_file} and \type {find_write_file}}
  87
  88 Your callback function should have the following conventions:
  89
  90 \startfunctioncall
  91 <string> actual_name = function (<number> id_number, <string> asked_name)
  92 \stopfunctioncall
  93
  94 Arguments:
  95
  96 \startitemize
  97
  98 \sym{id_number}
  99
 100 This number is zero for the log or \type {\input} files. For \TEX's \type {\read}
 101 or \type {\write} the number is incremented by one, so \type {\read0} becomes~1.
 102
 103 \sym{asked_name}
 104
 105 This is the user|-|supplied filename, as found by \type {\input}, \type {\openin}
 106 or \type {\openout}.
 107
 108 \stopitemize
 109
 110 Return value:
 111
 112 \startitemize
 113
 114 \sym{actual_name}
 115
 116 This is the filename used. For the very first file that is read in by \TEX, you
 117 have to make sure you return an \type {actual_name} that has an extension and
 118 that is suitable for use as \type {jobname}. If you don't, you will have to
 119 manually fix the name of the log file and output file after \LUATEX\ is finished,
 120 and the name of any format file that is written will be mangled. That is because
 121 these file names depend on the jobname.
 122
 123 You have to return \type {nil} if the file cannot be found.
 124
 125 \stopitemize
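
A minimal sketch of such a function, assuming a small lookup table that is
consulted before the regular search. The table \type {overrides}, its single
entry and the function name are invented for this example; only \type
{kpse.find_file} and the callback name are existing interfaces.

\starttyping
-- Hypothetical lookup table; both the key and the path are placeholders.
local overrides = { ["draft.tex"] = "./drafts/draft.tex" }

local function find_read(id_number, asked_name)
    local name = overrides[asked_name]
    if not name and kpse then
        -- Fall back to the regular search; "tex" is the kpse file type.
        name = kpse.find_file(asked_name, "tex")
    end
    return name -- nil signals that the file cannot be found
end

if callback then -- only inside LuaTeX
    callback.register("find_read_file", find_read)
end
\stoptyping

Keep in mind the remark above about the very first file: whatever is returned
for it also determines the jobname.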
 126
 127 \subsubsection{\type {find_font_file}}
 128
 129 Your callback function should have the following conventions:
 130
 131 \startfunctioncall
 132 <string> actual_name = function (<string> asked_name)
 133 \stopfunctioncall
 134
 135 The \type {asked_name} is an \OTF\ or \TFM\ font metrics file.
 136
 137 Return \type {nil} if the file cannot be found.
 138
 139 \subsubsection{\type {find_output_file}}
 140
 141 Your callback function should have the following conventions:
 142
 143 \startfunctioncall
 144 <string> actual_name = function (<string> asked_name)
 145 \stopfunctioncall
 146
 147 The \type {asked_name} is the \PDF\ or \DVI\ file for writing.
 148
 149 \subsubsection{\type {find_format_file}}
 150
 151 Your callback function should have the following conventions:
 152
 153 \startfunctioncall
 154 <string> actual_name = function (<string> asked_name)
 155 \stopfunctioncall
 156
 157 The \type {asked_name} is a format file for reading (the format file for writing
 158 is always opened in the current directory).
 159
 160 \subsubsection{\type {find_vf_file}}
 161
 162 Like \type {find_font_file}, but for virtual fonts. This applies to both \ALEPH's
 163 \OVF\ files and traditional Knuthian \VF\ files.
 164
 165 \subsubsection{\type {find_map_file}}
 166
 167 Like \type {find_font_file}, but for map files.
 168
 169 \subsubsection{\type {find_enc_file}}
 170
 171 Like \type {find_font_file}, but for enc files.
 172
 173 \subsubsection{\type {find_sfd_file}}
 174
 175 Like \type {find_font_file}, but for subfont definition files.
 176
 177 \subsubsection{\type {find_pk_file}}
 178
 179 Like \type {find_font_file}, but for pk bitmap files. This callback takes two
 180 arguments: \type {name} and \type {dpi}. In your callback you can decide to
 181 look for:
 182
 183 \starttyping
 184 <base res>dpi/<fontname>.<actual res>pk
 185 \stoptyping
 186
 187 but other strategies are possible. It is up to you to find a \quote {reasonable}
 188 bitmap file to go with that specification.
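
The naming scheme above could be tried as follows. This is only a sketch: it
assumes that the base and actual resolution are both equal to the \type {dpi}
argument and that the file lives relative to the current directory, neither of
which is prescribed by \LUATEX. The function names are hypothetical.

\starttyping
-- Build <dpi>dpi/<name>.<dpi>pk, assuming base and actual resolution match.
local function pk_candidate(name, dpi)
    return string.format("%ddpi/%s.%dpk", dpi, name, dpi)
end

local function find_pk(name, dpi)
    local candidate = pk_candidate(name, dpi)
    local handle = io.open(candidate, "rb")
    if handle then
        handle:close()
        return candidate
    end
    return nil -- not found
end

if callback then -- only inside LuaTeX
    callback.register("find_pk_file", find_pk)
end
\stoptyping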
 189
 190 \subsubsection{\type {find_data_file}}
 191
 192 Like \type {find_font_file}, but for embedded files (\type {\pdfobj file '...'}).
 193
 194 \subsubsection{\type {find_opentype_file}}
 195
 196 Like \type {find_font_file}, but for \OPENTYPE\ font files.
 197
 198 \subsubsection{\type {find_truetype_file} and \type {find_type1_file}}
 199
 200 Your callback function should have the following conventions:
 201
 202 \startfunctioncall
 203 <string> actual_name = function (<string> asked_name)
 204 \stopfunctioncall
 205
 206 The \type {asked_name} is a font file. This callback is called while \LUATEX\ is
 207 building its internal list of needed font files, so the actual timing may
 208 surprise you. Your return value is later fed back into the matching \type
 209 {read_file} callback.
 210
 211 Strangely enough, \type {find_type1_file} is also used for \OPENTYPE\ (\OTF)
 212 fonts.
 213
 214 \subsubsection{\type {find_image_file}}
 215
 216 Your callback function should have the following conventions:
 217
 218 \startfunctioncall
 219 <string> actual_name = function (<string> asked_name)
 220 \stopfunctioncall
 221
 222 The \type {asked_name} is an image file. Your return value is used to open a file
 223 from the hard disk, so make sure you return something that is considered the name
 224 of a valid file by your operating system.
 225
 226 \subsection[iocallback]{File reading callbacks}
 227
 228 The behavior documented in this subsection is considered stable in the sense that
 229 there will not be backward|-|incompatible changes any more.
 230
 231 \subsubsection{\type {open_read_file}}
 232
 233 Your callback function should have the following conventions:
 234
 235 \startfunctioncall
 236 <table> env = function (<string> file_name)
 237 \stopfunctioncall
 238
 239 Argument:
 240
 241 \startitemize
 242
 243 \sym{file_name}
 244
 245 The filename returned by a previous \type {find_read_file} or the return value of
 246 \type {kpse.find_file()} if there was no such callback defined.
 247
 248 \stopitemize
 249
 250 Return value:
 251
 252 \startitemize
 253
 254 \sym{env}
 255
 256 This is a table containing one required and one optional callback function for
 257 this file. The required field is \type {reader}; the associated function will be
 258 called once for each new line to be read. The optional field is \type {close};
 259 its function will be called once when \LUATEX\ is done with the file.
 260
 261 \LUATEX\ never looks at the rest of the table, so you can use it to store your
 262 private per|-|file data. Both callback functions will receive the table as
 263 their only argument.
 264
 265 \stopitemize
 266
 267 \subsubsubsection{\type {reader}}
 268
 269 \LUATEX\ will run this function whenever it needs a new input line from the file.
 270
 271 \startfunctioncall
 272 function(<table> env)
 273     return <string> line
 274 end
 275 \stopfunctioncall
 276
 277 Your function should return either a string or \type {nil}. The value \type {nil}
 278 signals that the end of file has occurred, and will make \TEX\ call the optional
 279 \type {close} function next.
 280
 281 \subsubsubsection{\type {close}}
 282
 283 \LUATEX\ will run this optional function when it decides to close the file.
 284
 285 \startfunctioncall
 286 function(<table> env)
 287 end
 288 \stopfunctioncall
 289
 290 Your function should not return any value.
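
Putting the pieces together, a line|-|by|-|line reader based on the standard
\LUA\ \type {io} library might look like the sketch below. The field \type
{handle} is private data stored in the table, and returning \type {nil} when the
file cannot be opened is an assumption of this example, not something specified
above.

\starttyping
local function open_read(file_name)
    local handle = io.open(file_name, "r")
    if not handle then
        return nil -- assumption: no table means the file is not usable
    end
    return {
        handle = handle, -- private per-file data, ignored by LuaTeX
        reader = function(env)
            return env.handle:read("*l") -- nil at end of file
        end,
        close = function(env)
            env.handle:close() -- the wrapper itself returns nothing
        end,
    }
end

if callback then -- only inside LuaTeX
    callback.register("open_read_file", open_read)
end
\stoptyping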
 291
 292 \subsubsection{General file readers}
 293
 294 There is a set of callbacks for the loading of binary data files. These all use
 295 the same interface:
 296
 297 \startfunctioncall
 298 function(<string> name)
 299     return <boolean> success, <string> data, <number> data_size
 300 end
 301 \stopfunctioncall
 302
 303 The \type {name} will normally be a full path name as it is returned by either
 304 one of the file discovery callbacks or the internal version of \type
 305 {kpse.find_file()}.
 306
 307 \startitemize
 308
 309 \sym{success}
 310
 311 Return \type {false} when a fatal error occurred (e.g.\ when the file cannot be
 312 found, after all).
 313
 314 \sym{data}
 315
 316 The bytes comprising the file.
 317
 318 \sym{data_size}
 319
 320 The length of the \type {data}, in bytes.
 321
 322 \stopitemize
 323
 324 Return an empty string and zero if the file was found but there was a
 325 reading problem.
 326
 327 The list of functions is as follows:
 328
 329 \starttabulate[|l|p|]
 330 \NC \type {read_font_file}     \NC ofm or tfm files \NC \NR
 331 \NC \type {read_vf_file}       \NC virtual fonts \NC \NR
 332 \NC \type {read_map_file}      \NC map files \NC \NR
 333 \NC \type {read_enc_file}      \NC encoding files \NC \NR
 334 \NC \type {read_sfd_file}      \NC subfont definition files \NC \NR
 335 \NC \type {read_pk_file}       \NC pk bitmap files \NC \NR
 336 \NC \type {read_data_file}     \NC embedded files (\type {\pdfobj file ...}) \NC \NR
 337 \NC \type {read_truetype_file} \NC \TRUETYPE\ font files \NC \NR
 338 \NC \type {read_type1_file}    \NC \TYPEONE\ font files \NC \NR
 339 \NC \type {read_opentype_file} \NC \OPENTYPE\ font files \NC \NR
 340 \stoptabulate
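
The shared interface can be illustrated with one function that serves any of
these callbacks. This is a sketch using the standard \LUA\ \type {io} library;
the function name is hypothetical and \type {read_map_file} is picked only as
an example.

\starttyping
local function read_binary(name)
    local handle = io.open(name, "rb")
    if not handle then
        return false -- fatal: the file cannot be opened after all
    end
    local data = handle:read("*a")
    handle:close()
    if not data then
        return true, "", 0 -- found, but there was a reading problem
    end
    return true, data, #data
end

if callback then -- only inside LuaTeX
    callback.register("read_map_file", read_binary)
end
\stoptyping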
 341
 342 \subsection{Data processing callbacks}
 343
 344 \subsubsection{\type {process_input_buffer}}
 345
 346 This callback allows you to change the contents of the line input buffer just
 347 before \LUATEX\ actually starts looking at it.
 348
 349 \startfunctioncall
 350 function(<string> buffer)
 351     return <string> adjusted_buffer
 352 end
 353 \stopfunctioncall
 354
 355 If you return \type {nil}, \LUATEX\ will pretend your callback never
 356 happened. You can gain a small amount of processing time from that.
 357
 358 This callback does not replace any internal code.
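
For instance, a filter that strips trailing whitespace could be sketched as
follows. The function name is made up; \type {nil} is returned for lines that
need no change so that \LUATEX\ can skip the replacement.

\starttyping
local function strip_trailing_space(buffer)
    local stripped = buffer:gsub("%s+$", "")
    if stripped == buffer then
        return nil -- nothing changed, keep the original line
    end
    return stripped
end

if callback then -- only inside LuaTeX
    callback.register("process_input_buffer", strip_trailing_space)
end
\stoptyping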
 359
 360 \subsubsection{\type {process_output_buffer}}
 361
 362 This callback allows you to change the contents of the line output buffer just
 363 before \LUATEX\ actually starts writing it to a file as the result of a \type
 364 {\write} command. It is only called for output to an actual file (that is,
 365 excluding the log, the terminal, and \type {\write18} calls).
 366
 367 \startfunctioncall
 368 function(<string> buffer)
 369     return <string> adjusted_buffer
 370 end
 371 \stopfunctioncall
 372
 373 If you return \type {nil}, \LUATEX\ will pretend your callback never
 374 happened. You can gain a small amount of processing time from that.
 375
 376 This callback does not replace any internal code.
 377
 378 \subsubsection{\type {process_jobname}}
 379
 380 This callback allows you to change the jobname given by \type {\jobname} in \TEX\
 381 and \type {tex.jobname} in \LUA. It does not affect the internal job name or the
 382 name of the output or log files.
 383
 384 \startfunctioncall
 385 function(<string> jobname)
 386     return <string> adjusted_jobname
 387 end
 388 \stopfunctioncall
 389
 390 The only argument is the actual job name; you should not use \type {tex.jobname}
 391 inside this function or infinite recursion may occur. If you return \type {nil},
 392 \LUATEX\ will pretend your callback never happened.
 393
 394 This callback does not replace any internal code.
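
A small sketch: the function below appends a suffix to the reported jobname.
The suffix \type {-draft} and the function name are arbitrary choices for this
example. Note that the function only uses its argument and never consults
\type {tex.jobname}.

\starttyping
local function tag_jobname(jobname)
    return jobname .. "-draft" -- hypothetical suffix
end

if callback then -- only inside LuaTeX
    callback.register("process_jobname", tag_jobname)
end
\stoptyping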
 395
 396 % \subsubsection{\type {token_filter}}
 397 %
 398 % This callback allows you to replace the way \LUATEX\ fetches lexical tokens.
 399 %
 400 % \startfunctioncall
 401 % function()
 402 %     return <table> token
 403 % end
 404 % \stopfunctioncall
 405 %
 406 % The calling convention for this callback is a bit more complicated than for most
 407 % other callbacks. The function should either return a \LUA\ table representing a
 408 % valid to|-|be|-|processed token or tokenlist, or something else like \type {nil}
 409 % or an empty table.
 410 %
 411 % If your \LUA\ function does not return a table representing a valid token, it
 412 % will be immediately called again, until it eventually does return a useful token
 413 % or tokenlist (or until you reset the callback value to nil). See the description
 414 % of \type {token} for some handy functions to be used in conjunction with this
 415 % callback.
 416 %
 417 % If your function returns a single usable token, then that token will be processed
 418 % by \LUATEX\ immediately. If the function returns a token list (a table consisting
 419 % of a list of consecutive token tables), then that list will be pushed to the
 420 % input stack at a completely new token list level, with its token type set to
 421 % \quote {inserted}. In either case, the returned token(s) will not be fed back
 422 % into the callback function.
 423 %
 424 % Setting this callback to \type {false} has no effect (because otherwise nothing
 425 % would happen, forever).
 426
 427 \subsection{Node list processing callbacks}
 428
 429 The description of nodes and node lists is in~\in{chapter}[nodes].
 430
 431 \subsubsection{\type {buildpage_filter}}
 432
 433 This callback is called whenever \LUATEX\ is ready to move stuff to the main
 434 vertical list. You can use this callback to do specialized manipulation of the
 435 page building stage like imposition or column balancing.
 436
 437 \startfunctioncall
 438 function(<string> extrainfo)
 439 end
 440 \stopfunctioncall
 441
 442 The string \type {extrainfo} gives some additional information about what \TEX's
 443 state is with respect to the \quote {current page}. The possible values are:
 444
 445 \starttabulate[|lT|p|]
 446 \NC \ssbf value     \NC \bf explanation                           \NC \NR
 447 \NC alignment       \NC a (partial) alignment is being added      \NC \NR
 448 \NC after_output    \NC an output routine has just finished       \NC \NR
 449 \NC box             \NC a typeset box is being added              \NC \NR
 450 %NC pre_box         \NC interline material is being added         \NC \NR
 451 %NC adjust          \NC \type {\vadjust} material is being added  \NC \NR
 452 \NC new_graf        \NC the beginning of a new paragraph          \NC \NR
 453 \NC vmode_par       \NC \type {\par} was found in vertical mode   \NC \NR
 454 \NC hmode_par       \NC \type {\par} was found in horizontal mode \NC \NR
 455 \NC insert          \NC an insert is added                        \NC \NR
 456 \NC penalty         \NC a penalty (in vertical mode)              \NC \NR
 457 \NC before_display  \NC immediately before a display starts       \NC \NR
 458 \NC after_display   \NC a display is finished                     \NC \NR
 459 \NC end             \NC \LUATEX\ is terminating (it's all over)   \NC \NR
 460 \stoptabulate
 461
 462 This callback does not replace any internal code.
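
To get a feeling for when the callback fires, you could simply count the
values of \type {extrainfo}. The sketch below does that; the table \type {seen}
and both function names are invented for the example.

\starttyping
local seen = {}

-- Count how often each reason is reported.
local function tally_buildpage(extrainfo)
    seen[extrainfo] = (seen[extrainfo] or 0) + 1
end

-- Print the counts, for instance at the end of the run.
local function report_tally()
    for reason, count in pairs(seen) do
        print(reason, count)
    end
end

if callback then -- only inside LuaTeX
    callback.register("buildpage_filter", tally_buildpage)
end
\stoptyping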
 463
 464 \subsubsection{\type {pre_linebreak_filter}}
 465
 466 This callback is called just before \LUATEX\ starts converting a list of nodes
 467 into a stack of \type {\hbox}es, after the addition of \type {\parfillskip}.
 468
 469 \startfunctioncall
 470 function(<node> head, <string> groupcode)
 471     return true | false | <node> newhead
 472 end
 473 \stopfunctioncall
 474
 475 The string called \type {groupcode} identifies the nodelist's context within
 476 \TEX's processing. The range of possibilities is given in the table below, but
 477 not all of those can actually appear in \type {pre_linebreak_filter}; some are
 478 for the \type {hpack_filter} and \type {vpack_filter} callbacks that are
 479 explained later in this section.
 480
 481 \starttabulate[|lT|p|]
 482 \NC \ssbf value   \NC \bf explanation                                 \NC \NR
 483 \NC <empty>       \NC main vertical list                              \NC \NR
 484 \NC hbox          \NC \type {\hbox} in horizontal mode                \NC \NR
 485 \NC adjusted_hbox \NC \type {\hbox} in vertical mode                  \NC \NR
 486 \NC vbox          \NC \type {\vbox}                                   \NC \NR
 487 \NC vtop          \NC \type {\vtop}                                   \NC \NR
 488 \NC align         \NC \type {\halign} or \type {\valign}              \NC \NR
 489 \NC disc          \NC discretionaries                                 \NC \NR
 490 \NC insert        \NC packaging an insert                             \NC \NR
 491 \NC vcenter       \NC \type {\vcenter}                                \NC \NR
 492 \NC local_box     \NC \type {\localleftbox} or \type {\localrightbox} \NC \NR
 493 \NC split_off     \NC top of a \type {\vsplit}                        \NC \NR
 494 \NC split_keep    \NC remainder of a \type {\vsplit}                  \NC \NR
 495 \NC align_set     \NC alignment cell                                  \NC \NR
 496 \NC fin_row       \NC alignment row                                   \NC \NR
 497 \stoptabulate
 498
 499 As for all the callbacks that deal with nodes, the return value can be one of
 500 three things:
 501
 502 \startitemize
 503 \startitem
 504     boolean \type {true} signals successful processing
 505 \stopitem
 506 \startitem
 507     \type {<node>} signals that the \quote {head} node should be replaced by the
 508     returned node
 509 \stopitem
 510 \startitem
 511     boolean \type {false} signals that the \quote {head} node list should be
 512     ignored and flushed from memory
 513 \stopitem
 514 \stopitemize
 515
 516 This callback does not replace any internal code.
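
As an illustration of the return conventions, the following sketch inspects the
list without changing it and therefore returns \type {true}. It relies on \type
{node.id} and \type {node.traverse_id} from the \type {node} library; the
function name is hypothetical.

\starttyping
local function count_glyphs(head, groupcode)
    local glyph_id = node.id("glyph")
    local count = 0
    for _ in node.traverse_id(glyph_id, head) do
        count = count + 1
    end
    print(string.format("paragraph (%s): %d glyphs", groupcode, count))
    return true -- the list is left as it was
end

if callback then -- only inside LuaTeX
    callback.register("pre_linebreak_filter", count_glyphs)
end
\stoptyping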
 517
 518 \subsubsection{\type {linebreak_filter}}
 519
 520 This callback replaces \LUATEX's line breaking algorithm.
 521
 522 \startfunctioncall
 523 function(<node> head, <boolean> is_display)
 524     return <node> newhead
 525 end
 526 \stopfunctioncall
 527
 528 The returned node is the head of the list that will be added to the main vertical
 529 list. The boolean argument is true if this paragraph is interrupted by a
 530 following math display.
 531
 532 If you return something that is not a \type {<node>}, \LUATEX\ will apply the
 533 internal linebreak algorithm on the list that starts at \type {<head>}.
 534 Otherwise, the \type {<node>} you return is supposed to be the head of a list of
 535 nodes that are all allowed in vertical mode, and at least one of those has to
 536 represent a hbox. Failure to do so will result in a fatal error.
 537
 538 Setting this callback to \type {false} is possible, but dangerous, because it is
 539 possible you will end up in an unfixable \quote {deadcycles loop}.
 540
 541 \subsubsection{\type {append_to_vlist_filter}}
 542
 543 This callback is called whenever \LUATEX\ adds a box to a vertical list:
 544
 545 \startfunctioncall
 546 function(<node> box, <string> locationcode, <number> prevdepth,
 547     <boolean> mirrored)
 548     return list, prevdepth
 549 end
 550 \stopfunctioncall
 551
 552 It is ok to return nothing, in which case you also need to flush the box or deal
 553 with it yourself. The prevdepth is also optional. Locations are \type {box},
 554 \type {alignment}, \type {equation}, \type {equation_number} and \type
 555 {post_linebreak}.
 556
 557 \subsubsection{\type {post_linebreak_filter}}
 558
 559 This callback is called just after \LUATEX\ has converted a list of nodes into a
 560 stack of \type {\hbox}es.
 561
 562 \startfunctioncall
 563 function(<node> head, <string> groupcode)
 564     return true | false | <node> newhead
 565 end
 566 \stopfunctioncall
 567
 568 This callback does not replace any internal code.
 569
 570 \subsubsection{\type {hpack_filter}}
 571
 572 This callback is called when \TEX\ is ready to start boxing some horizontal mode
 573 material. Math items and line boxes are ignored at the moment.
 574
 575 \startfunctioncall
 576 function(<node> head, <string> groupcode, <number> size,
 577          <string> packtype [, <string> direction])
 578     return true | false | <node> newhead
 579 end
 580 \stopfunctioncall
 581
 582 The \type {packtype} is either \type {additional} or \type {exactly}. If \type
 583 {additional}, then the \type {size} is a \type {\hbox spread ...} argument. If
 584 \type {exactly}, then the \type {size} is a \type {\hbox to ...}. In both cases,
 585 the number is in scaled points.
 586
 587 The \type {direction} is either one of the three-letter direction specifier
 588 strings, or \type {nil}.
 589
 590 This callback does not replace any internal code.
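
The meaning of \type {size} and \type {packtype} can be made visible with a
reporting function like the sketch below. Both function names are made up; the
conversion assumes the usual 65536 scaled points per point.

\starttyping
-- Describe the box specification in TeX terms.
local function pack_summary(size, packtype)
    local points = size / 65536 -- scaled points to points
    if packtype == "exactly" then
        return string.format("to %.1fpt", points)
    end
    return string.format("spread %.1fpt", points)
end

local function report_hpack(head, groupcode, size, packtype, direction)
    print("hbox (" .. groupcode .. ") " .. pack_summary(size, packtype))
    return true -- the list is left as it was
end

if callback then -- only inside LuaTeX
    callback.register("hpack_filter", report_hpack)
end
\stoptyping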
 591
 592 \subsubsection{\type {vpack_filter}}
 593
 594 This callback is called when \TEX\ is ready to start boxing some vertical mode
 595 material. Math displays are ignored at the moment.
 596
 597 This function is very similar to the \type {hpack_filter}. Besides the fact
 598 that it is called at different moments, there is an extra variable that matches
 599 \TEX's \type {\maxdepth} setting.
 600
 601 \startfunctioncall
 602 function(<node> head, <string> groupcode, <number> size,
 603          <string> packtype, <number> maxdepth [, <string> direction])
 604     return true | false | <node> newhead
 605 end
 606 \stopfunctioncall
 607
 608 This callback does not replace any internal code.
 609
 610 \subsubsection{\type {hpack_quality}}
 611
 612 This callback can be used to intercept the overfull messages that can result from
 613 packing a horizontal list (as happens in the par builder). The function takes a
 614 few arguments:
 615
 616 \startfunctioncall
 617 function(<string> incident, <number> detail, <node> head, <number> first,
 618          <number> last)
 619     return <node> whatever
 620 end
 621 \stopfunctioncall
 622
 623 The incident is one of \type {overfull}, \type {underfull}, \type {loose} or
 624 \type {tight}. The detail is either the amount of overflow in case of \type
 625 {overfull}, or the badness otherwise. The head is the list that is constructed
 626 (when protrusion or expansion is enabled, this is an intermediate list).
 627 Optionally you can return a node, for instance an overfull rule indicator. That
 628 node will be appended to the list (just like \TEX's own rule would).
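
A sketch of such an indicator: for overfull boxes a rule node is returned, for
all other incidents nothing. The width of 5pt and the function name are
arbitrary choices for this example; \type {node.new} comes from the \type {node}
library.

\starttyping
local function mark_overfull(incident, detail, head, first, last)
    if incident ~= "overfull" then
        return nil -- no indicator for the other incidents
    end
    local rule = node.new("rule")
    rule.width = 5 * 65536 -- 5pt in scaled points, an arbitrary choice
    return rule
end

if callback then -- only inside LuaTeX
    callback.register("hpack_quality", mark_overfull)
end
\stoptyping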
 629
 630 \subsubsection{\type {vpack_quality}}
 631
 632 This callback can be used to intercept the overfull messages that can result from
 633 packing a vertical list (as happens in the page builder). The function takes a
 634 few arguments:
 635
 636 \startfunctioncall
 637 function(<string> incident, <number> detail, <node> head, <number> first,
 638          <number> last)
 639 end
 640 \stopfunctioncall
 641
 642 The incident is one of \type {overfull}, \type {underfull}, \type {loose} or
 643 \type {tight}. The detail is either the amount of overflow in case of \type
 644 {overfull}, or the badness otherwise. The head is the list that is constructed.
 645
 646 \subsubsection{\type {process_rule}}
 647
 648 This is an experimental callback. It can be used with rules of subtype~4
 649 (user). The callback gets three arguments: the node, the width and the
 650 height. The callback can use \type {pdf.print} to write code to the \PDF\
 651 file, but be careful not to mess up the final result. No checking is done.
 652
 653 \subsubsection{\type {pre_output_filter}}
 654
 655 This callback is called when \TEX\ is ready to start boxing box 255 for \type
 656 {\output}.
 657
 658 \startfunctioncall
 659 function(<node> head, <string> groupcode, <number> size, <string> packtype,
 660         <number> maxdepth [, <string> direction])
 661     return true | false | <node> newhead
 662 end
 663 \stopfunctioncall
 664
 665 This callback does not replace any internal code.
 666
 667 \subsubsection{\type {hyphenate}}
 668
 669 \startfunctioncall
 670 function(<node> head, <node> tail)
 671 end
 672 \stopfunctioncall
 673
 674 No return values. This callback has to insert discretionary nodes in the node
 675 list it receives.
 676
 677 Setting this callback to \type {false} will prevent the internal discretionary
 678 insertion pass.
 679
 680 \subsubsection{\type {ligaturing}}
 681
 682 \startfunctioncall
 683 function(<node> head, <node> tail)
 684 end
 685 \stopfunctioncall
 686
 687 No return values. This callback has to apply ligaturing to the node list it
 688 receives.
 689
 690 You don't have to worry about return values because the \type {head} node that is
 691 passed on to the callback is guaranteed not to be a \type {glyph_node} (if need be, a
 692 temporary node will be prepended), and therefore it cannot be affected by the
 693 mutations that take place. After the callback, the internal value of the \quote
 694 {tail of the list} will be recalculated.
 695
 696 The \type {next} of \type {head} is guaranteed to be non-nil.
 697
 698 The \type {next} of \type {tail} is guaranteed to be nil, and therefore the
 699 second callback argument can often be ignored. It is provided for orthogonality,
 700 and because it can sometimes be handy when special processing has to take place.
 701
 702 Setting this callback to \type {false} will prevent the internal ligature
 703 creation pass.
 704
 705 You must not ruin the node list. For instance, the head normally is a local par node,
 706 and the tail a glue. Messing things up too much can push \LUATEX\ into panic mode.
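
If you only want to observe this pass, a wrapper around the built|-|in
ligature builder is the safest starting point. The sketch below assumes that
\type {node.ligaturing} from the \type {node} library does the actual work; the
counter and the function name are hypothetical.

\starttyping
local calls = 0

local function counted_ligaturing(head, tail)
    calls = calls + 1
    node.ligaturing(head, tail) -- delegate to the built-in pass
    -- no return value, as required for this callback
end

if callback then -- only inside LuaTeX
    callback.register("ligaturing", counted_ligaturing)
end
\stoptyping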
 707
 708 \subsubsection{\type {kerning}}
 709
 710 \startfunctioncall
 711 function(<node> head, <node> tail)
 712 end
 713 \stopfunctioncall
 714
 715 No return values. This callback has to apply kerning between the nodes in the
 716 node list it receives. See \type {ligaturing} for calling conventions.
 717
 718 Setting this callback to \type {false} will prevent the internal kern insertion
 719 pass.
 720
 721 You must not ruin the node list. For instance, the head normally is a local par node,
 722 and the tail a glue. Messing things up too much can push \LUATEX\ into panic mode.
 723
 724 \subsubsection{\type {mlist_to_hlist}}
 725
 726 This callback replaces \LUATEX's math list to node list conversion algorithm.
 727
 728 \startfunctioncall
 729 function(<node> head, <string> display_type, <boolean> need_penalties)
 730     return <node> newhead
 731 end
 732 \stopfunctioncall
 733
 734 The returned node is the head of the list that will be added to the vertical or
 735 horizontal list. The string argument is either \quote {text} or \quote {display}
 736 depending on the current math mode. The boolean argument is \type {true} if
 737 penalties have to be inserted in this list, \type {false} otherwise.
 738
 739 Setting this callback to \type {false} is bad: it will almost certainly result in
 740 an endless loop.
 741
 742 \subsection{Information reporting callbacks}
 743
 744 \subsubsection{\type {pre_dump}}
 745
 746 \startfunctioncall
 747 function()
 748 end
 749 \stopfunctioncall
 750
 751 This function is called just before dumping to a format file starts. It does not
 752 replace any code and there are neither arguments nor return values.
 753
 754 \subsubsection{\type {start_run}}
 755
 756 \startfunctioncall
 757 function()
 758 end
 759 \stopfunctioncall
 760
 761 This callback replaces the code that prints \LUATEX's banner. Note that for
 762 successful use, this callback has to be set in the \LUA\ initialization script,
 763 otherwise it will be seen only after the run has already started.
 764
 765 \subsubsection{\type {stop_run}}
 766
 767 \startfunctioncall
 768 function()
 769 end
 770 \stopfunctioncall
 771
 772 This callback replaces the code that prints \LUATEX's statistics and \quote
 773 {output written to} messages.
 774
 775 \subsubsection{\type {start_page_number}}
 776
 777 \startfunctioncall
 778 function()
 779 end
 780 \stopfunctioncall
 781
 782 Replaces the code that prints the \type {[} and the page number at the beginning of
 783 \type {\shipout}. This callback will also override the printing of box information
 784 that normally takes place when \type {\tracingoutput} is positive.
 785
 786 \subsubsection{\type {stop_page_number}}
 787
 788 \startfunctioncall
 789 function()
 790 end
 791 \stopfunctioncall
 792
 793 Replaces the code that prints the \type {]} at the end of \type {\shipout}.
 794
 795 \subsubsection{\type {show_error_hook}}
 796
 797 \startfunctioncall
 798 function()
 799 end
 800 \stopfunctioncall
 801
 802 This callback is run from inside the \TEX\ error function, and the idea is to
 803 allow you to do some extra reporting on top of what \TEX\ already does (none of
 804 the normal actions are removed). You may find some of the values in the \type
 805 {status} table useful.
 806
 807 This callback does not replace any internal code.
 808
 809 \iffalse % this has been retracted for the moment
 810
 811     \startitemize
 812
 813     \sym{message}
 814
 815     is the formal error message \TEX\ has given to the user. (the line after the
 816     \type {'!'}).
 817
 818     \sym{indicator}
 819
 820     is either a filename (when it is a string) or a location indicator (a number)
 821     that can mean lots of different things like a token list id or a \type {\read}
 822     number.
 823
 824     \sym{lineno}
 825
 826     is the current line number.
 827     \stopitemize
 828
 829     This is an investigative item for 'testing the water' only. The final goal is the
 830     total replacement of \TEX's error handling routines, but that needs lots of
 831     adjustments in the web source because \TEX\ deals with errors in a somewhat
 832     haphazard fashion. This is why the exact definition of \type {indicator} is not
 833     given here.
 834
 835 \fi
 836
 837 \subsubsection{\type {show_error_message}}
 838
 839 \startfunctioncall
 840 function()
 841 end
 842 \stopfunctioncall
 843
 844 This callback replaces the code that prints the error message. The usual
 845 interaction after the message is not affected.
 846
 847 \subsubsection{\type {show_lua_error_hook}}
 848
 849 \startfunctioncall
 850 function()
 851 end
 852 \stopfunctioncall
 853
This callback replaces the code that prints the extra \LUA\ error message.
 855
 856 \subsubsection{\type {start_file}}
 857
 858 \startfunctioncall
 859 function(category,filename)
 860 end
 861 \stopfunctioncall
 862
This callback replaces the code that prints \LUATEX's message when a file is
opened, like \type {(filename} for regular files. The category is a number:
 865
 866 \starttabulate[|||]
 867 \NC 1 \NC a normal data file, like a \TEX\ source \NC \NR
 868 \NC 2 \NC a font map coupling font names to resources \NC \NR
 869 \NC 3 \NC an image file (\type {png}, \type {pdf}, etc) \NC \NR
 870 \NC 4 \NC an embedded font subset \NC \NR
 871 \NC 5 \NC a fully embedded font \NC \NR
 872 \stoptabulate
 873
 874 \subsubsection{\type {stop_file}}
 875
 876 \startfunctioncall
 877 function(category)
 878 end
 879 \stopfunctioncall
 880
This callback replaces the code that prints \LUATEX's message when a file is
closed, like the \type {)} for regular files.
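
The file callbacks can be combined to get a more verbose report than the usual
parentheses. In this sketch the label strings are hypothetical names chosen for
the five categories listed for \type {start_file}:

\starttyping
-- hypothetical labels, indexed by category number
local labels = { "data", "map", "image", "subset", "font" }

callback.register("start_file", function(category, filename)
    local label = labels[category] or "unknown"
    texio.write_nl("<" .. label .. ": " .. filename)
end)

callback.register("stop_file", function(category)
    texio.write(">")
end)
\stoptyping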
 883
 884 \subsection{PDF-related callbacks}
 885
 886 \subsubsection{\type {finish_pdffile}}
 887
 888 \startfunctioncall
 889 function()
 890 end
 891 \stopfunctioncall
 892
 893 This callback is called when all document pages are already written to the \PDF\
 894 file and \LUATEX\ is about to finalize the output document structure. Its
 895 intended use is final update of \PDF\ dictionaries such as \type {/Catalog} or
 896 \type {/Info}. The callback does not replace any code. There are neither
 897 arguments nor return values.
 898
 899 \subsubsection{\type {finish_pdfpage}}
 900
 901 \startfunctioncall
 902 function(shippingout)
 903 end
 904 \stopfunctioncall
 905
This callback is called after the \PDF\ page stream has been assembled and before
the page object gets finalized.
 908
 909 \subsection{Font-related callbacks}
 910
 911 \subsubsection{\type {define_font}}
 912
 913 \startfunctioncall
 914 function(<string> name, <number> size, <number> id)
 915     return <table> font | <number> id
 916 end
 917 \stopfunctioncall
 918
 919 The string \type {name} is the filename part of the font specification, as given
 920 by the user.
 921
 922 The number \type {size} is a bit special:
 923
 924 \startitemize[packed]
 925 \startitem
 926     If it is positive, it specifies an \quote{at size} in scaled points.
 927 \stopitem
 928 \startitem
 929     If it is negative, its absolute value represents a \quote {scaled} setting
 930     relative to the designsize of the font.
 931 \stopitem
 932 \stopitemize
 933
 934 The \type {id} is the internal number assigned to the font.
 935
The internal structure of the \type {font} table that is to be returned is
explained in \in {chapter} [fonts]. That table is saved internally, so you can
put extra fields in the table for your later \LUA\ code to use. Alternatively,
the return value can be the id of a previously defined font. This is useful if
a previous definition can be reused instead of creating a whole new font
structure.
 941
 942 Setting this callback to \type {false} is pointless as it will prevent font
 943 loading completely but will nevertheless generate errors.
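
A minimal sketch of such a callback, assuming that only traditional \TFM\ fonts
need to be supported, hands the work over to \type {font.read_tfm}, which
interprets positive and negative sizes in the same way. The field \type
{my_note} is a hypothetical extra field, added only to show that the table may
carry private data.

\starttyping
callback.register("define_font", function(name, size, id)
    -- read_tfm accepts the same positive/negative size convention
    local f = font.read_tfm(name, size)
    if f then
        f.my_note = "loaded by the define_font sketch"  -- hypothetical field
    end
    return f
end)
\stoptyping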
 944
 945 \section{The \type {epdf} library}
 946
The \type {epdf} library provides \LUA\ bindings to many \PDF\ access functions
that are defined by the poppler \PDF\ viewer library (written in C$+{}+$ by
Kristian H\o gsberg, based on xpdf by Derek Noonburg). Within \LUATEX\ (and
\PDFTEX), xpdf functionality has long been used to embed \PDF\ files.
The \type {epdf} library lets you scrutinize an external \PDF\ file. It
gives access to its document structure, e.g., catalog, cross-reference table,
individual pages, objects, annotations, info, and metadata. The \LUATEX\ team is
evaluating the possibility of reducing the binding to basic low level \PDF\
primitives and delegating the complete set of functions to an external shared
object module.
 957
The \type {epdf} library is still in alpha state: \PDF\ access is currently
read|-|only. It is not yet possible to alter a \PDF\ file or to assemble it from
scratch, and it is unlikely that we will support that at all. Many function
bindings are still missing. At some point we might also decide to limit the
interface to a reasonable subset.
 963
For a start, a \PDF\ file is opened by \type {epdf.open()} with a file name, e.g.:
 965
 966 \starttyping
 967 doc = epdf.open("foo.pdf")
 968 \stoptyping
 969
This normally returns a \type {PDFDoc} userdata variable. If the file could not
be opened successfully, the value \type {nil} is returned instead of a fatal
error being raised.
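
Because a failed open is reported by \type {nil} and not by an error, the result
should be tested before it is used. A sketch, with the same illustrative file
name:

\starttyping
local doc = epdf.open("foo.pdf")
if doc == nil then
    texio.write_nl("epdf: cannot open foo.pdf")
else
    texio.write_nl("epdf: foo.pdf opened")
end
\stoptyping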
 973
 974 All Lua functions in the \type {epdf} library are named after the poppler
 975 functions listed in the poppler header files for the various classes, e.g., files
 976 \type {PDFDoc.h}, \type {Dict.h}, and \type {Array.h}. These files can be found
 977 in the poppler subdirectory within the \LUATEX\ sources. Which functions are
 978 already implemented in the \type {epdf} library can be found in the \LUATEX\
 979 source file \type {lepdflib.cc}. For using the \type {epdf} library, knowledge of
 980 the \PDF\ file architecture is indispensable.
 981
There are many different userdata types defined by the \type {epdf} library.
Currently these are \type {AnnotBorderStyle}, \type {AnnotBorder}, \type
{Annots}, \type {Annot}, \type {Array}, \type {Attribute}, \type {Catalog}, \type
{Dict}, \type {EmbFile}, \type {GString}, \type {LinkDest}, \type {Links}, \type
{Link}, \type {ObjectStream}, \type {Object}, \type {PDFDoc}, \type
{PDFRectangle}, \type {Page}, \type {Ref}, \type {Stream}, \type {StructElement},
\type {StructTreeRoot}, \type {TextSpan}, \type {XRefEntry} and \type {XRef}.
 989
All these userdata names and the \LUA\ access functions closely resemble the
class naming from the poppler header files, including the choice of mixed upper
and lower case letters. The \LUA\ function calls use object|-|oriented syntax,
e.g., the following calls return the \type {Page} object for page~1:
 994
 995 \starttyping
 996 pageref = doc:getCatalog():getPageRef(1)
 997 pageobj = doc:getXRef():fetch(pageref.num, pageref.gen)
 998 \stoptyping
 999
But writing such chained calls is risky, as an intermediate function may return
\type {nil} on error; therefore the result of each call should be checked (e.g.,
against \type {nil}) before the next call is made. If a non-object item is
requested (e.g., a \type {Dict} item by calling \type {page:getPieceInfo()},
cf.~\type {Page.h}) but is not available, the \LUA\ functions return \type {nil}
(without error). If a function should return an \type {Object} that does not
exist, a \type {Null} object is returned instead (also without error; this is
in|-|line with poppler behavior).
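
A more defensive variant of the chained calls could look like the following
sketch. The helper name \type {first_page_object} is hypothetical; the methods
are the ones from the listings in this section.

\starttyping
-- hypothetical helper: returns the object of page 1, or nil
local function first_page_object(doc)
    if not doc then return nil end
    local catalog = doc:getCatalog()
    if not catalog then return nil end
    local pageref = catalog:getPageRef(1)
    if not pageref then return nil end
    local xref = doc:getXRef()
    if not xref then return nil end
    local obj = xref:fetch(pageref.num, pageref.gen)
    -- a missing object shows up as a Null object, not as nil
    if obj == nil or obj:isNull() then return nil end
    return obj
end

local pageobj = first_page_object(epdf.open("foo.pdf"))
\stoptyping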
1008
1009 All library objects have a \type {__gc} metamethod for garbage collection. The
1010 \type {__tostring} metamethod gives the type name for each object.
1011
1012 All object constructors:
1013
1014 \startfunctioncall
1015 <PDFDoc>       = epdf.open(<string> PDF filename)
1016 <Annot>        = epdf.Annot(<XRef>, <Dict>, <Catalog>, <Ref>)
1017 <Annots>       = epdf.Annots(<XRef>, <Catalog>, <Object>)
1018 <Array>        = epdf.Array(<XRef>)
1019 <Attribute>    = epdf.Attribute(<Type>,<Object>)| epdf.Attribute(<string>, <int>, <Object>)
1020 <Dict>         = epdf.Dict(<XRef>)
1021 <Object>       = epdf.Object()
1022 <PDFRectangle> = epdf.PDFRectangle()
1023 \stopfunctioncall
1024
1025 The functions \type {StructElement_Type}, \type {Attribute_Type} and \type
1026 {AttributeOwner_Type} return a hash table \type {{<string>,<integer>}}.
1027
1028 \type {Annot} methods:
1029
1030 \startfunctioncall
1031 <boolean>     = <Annot>:isOK()
1032 <Object>      = <Annot>:getAppearance()
1033 <AnnotBorder> = <Annot>:getBorder()
1034 <boolean>     = <Annot>:match(<Ref>)
1035 \stopfunctioncall
1036
1037 \type {AnnotBorderStyle} methods:
1038
1039 \startfunctioncall
1040 <number> = <AnnotBorderStyle>:getWidth()
1041 \stopfunctioncall
1042
1043 \type {Annots} methods:
1044
1045 \startfunctioncall
1046 <integer> = <Annots>:getNumAnnots()
1047 <Annot>   = <Annots>:getAnnot(<integer>)
1048 \stopfunctioncall
1049
1050 \type {Array} methods:
1051
1052 \startfunctioncall
1053             <Array>:incRef()
1054             <Array>:decRef()
1055 <integer> = <Array>:getLength()
1056             <Array>:add(<Object>)
1057 <Object>  = <Array>:get(<integer>)
1058 <Object>  = <Array>:getNF(<integer>)
1059 <string>  = <Array>:getString(<integer>)
1060 \stopfunctioncall
1061
1062 \type {Attribute} methods:
1063
1064 \startfunctioncall
1065 <boolean>  = <Attribute>:isOk()
1066 <integer>  = <Attribute>:getType()
1067 <integer>  = <Attribute>:getOwner()
1068 <string>   = <Attribute>:getTypeName()
1069 <string>   = <Attribute>:getOwnerName()
1070 <Object>   = <Attribute>:getValue()
<Object>   = <Attribute>:getDefaultValue()
1072 <string>   = <Attribute>:getName()
1073 <integer>  = <Attribute>:getRevision()
1074              <Attribute>:setRevision(<unsigned integer>)
<boolean>  = <Attribute>:isHidden()
1076              <Attribute>:setHidden(<boolean>)
1077 <string>   = <Attribute>:getFormattedValue()
1078 <string>   = <Attribute>:setFormattedValue(<string>)
1079 \stopfunctioncall
1080
1081 \type {Catalog} methods:
1082
1083 \startfunctioncall
1084 <boolean>  = <Catalog>:isOK()
1085 <integer>  = <Catalog>:getNumPages()
1086 <Page>     = <Catalog>:getPage(<integer>)
1087 <Ref>      = <Catalog>:getPageRef(<integer>)
1088 <string>   = <Catalog>:getBaseURI()
1089 <string>   = <Catalog>:readMetadata()
1090 <Object>   = <Catalog>:getStructTreeRoot()
1091 <integer>  = <Catalog>:findPage(<integer> object number, <integer> object generation)
1092 <LinkDest> = <Catalog>:findDest(<string> name)
1093 <Object>   = <Catalog>:getDests()
1094 <integer>  = <Catalog>:numEmbeddedFiles()
1095 <EmbFile>  = <Catalog>:embeddedFile(<integer>)
1096 <integer>  = <Catalog>:numJS()
1097 <string>   = <Catalog>:getJS(<integer>)
1098 <Object>   = <Catalog>:getOutline()
1099 <Object>   = <Catalog>:getAcroForm()
1100 \stopfunctioncall
1101
1102 \type {EmbFile} methods:
1103
1104 \startfunctioncall
1105 <string>   = <EmbFile>:name()
1106 <string>   = <EmbFile>:description()
1107 <integer>  = <EmbFile>:size()
1108 <string>   = <EmbFile>:modDate()
1109 <string>   = <EmbFile>:createDate()
1110 <string>   = <EmbFile>:checksum()
1111 <string>   = <EmbFile>:mimeType()
1112 <Object>   = <EmbFile>:streamObject()
1113 <boolean>  = <EmbFile>:isOk()
1114 \stopfunctioncall
1115
1116 \type {Dict} methods:
1117
1118 \startfunctioncall
1119             <Dict>:incRef()
1120             <Dict>:decRef()
1121 <integer> = <Dict>:getLength()
1122             <Dict>:add(<string>, <Object>)
1123             <Dict>:set(<string>, <Object>)
1124             <Dict>:remove(<string>)
1125 <boolean> = <Dict>:is(<string>)
1126 <Object>  = <Dict>:lookup(<string>)
1127 <Object>  = <Dict>:lookupNF(<string>)
1128 <integer> = <Dict>:lookupInt(<string>, <string>)
1129 <string>  = <Dict>:getKey(<integer>)
1130 <Object>  = <Dict>:getVal(<integer>)
1131 <Object>  = <Dict>:getValNF(<integer>)
1132 <boolean> = <Dict>:hasKey(<string>)
1133 \stopfunctioncall
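
The index based accessors make it possible to walk over a dictionary. The sketch
below collects the keys of a \type {Dict} together with the type name of each
value. It assumes that indices start at~1, as is usual in \LUA; the helper name
\type {list_keys} is hypothetical.

\starttyping
-- hypothetical helper: returns an array of "key (type)" strings
local function list_keys(dict)
    local keys = { }
    for i = 1, dict:getLength() do      -- assumption: 1-based indices
        local key   = dict:getKey(i)
        local value = dict:getVal(i)
        local kind  = value and value:getTypeName() or "?"
        keys[#keys + 1] = tostring(key) .. " (" .. kind .. ")"
    end
    return keys
end

local doc  = epdf.open("foo.pdf")
local info = doc and doc:getDocInfo()
if info and info:isDict() then
    for _, line in ipairs(list_keys(info:getDict())) do
        texio.write_nl(line)
    end
end
\stoptyping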
1134
1135 \type {Link} methods:
1136
1137 \startfunctioncall
1138 <boolean>  = <Link>:isOK()
1139 <boolean>  = <Link>:inRect(<number>, <number>)
1140 \stopfunctioncall
1141
1142 \type {LinkDest} methods:
1143
1144 \startfunctioncall
1145 <boolean>  = <LinkDest>:isOK()
1146 <integer>  = <LinkDest>:getKind()
1147 <string>   = <LinkDest>:getKindName()
1148 <boolean>  = <LinkDest>:isPageRef()
1149 <integer>  = <LinkDest>:getPageNum()
1150 <Ref>      = <LinkDest>:getPageRef()
1151 <number>   = <LinkDest>:getLeft()
1152 <number>   = <LinkDest>:getBottom()
1153 <number>   = <LinkDest>:getRight()
1154 <number>   = <LinkDest>:getTop()
1155 <number>   = <LinkDest>:getZoom()
1156 <boolean>  = <LinkDest>:getChangeLeft()
1157 <boolean>  = <LinkDest>:getChangeTop()
1158 <boolean>  = <LinkDest>:getChangeZoom()
1159 \stopfunctioncall
1160
1161 \type {Links} methods:
1162
1163 \startfunctioncall
1164 <integer>  = <Links>:getNumLinks()
1165 <Link>     = <Links>:getLink(<integer>)
1166 \stopfunctioncall
1167
1168 \type {Object} methods:
1169
1170 \startfunctioncall
1171             <Object>:initBool(<boolean>)
1172             <Object>:initInt(<integer>)
1173             <Object>:initReal(<number>)
1174             <Object>:initString(<string>)
1175             <Object>:initName(<string>)
1176             <Object>:initNull()
1177             <Object>:initArray(<XRef>)
1178             <Object>:initDict(<XRef>)
1179             <Object>:initStream(<Stream>)
1180             <Object>:initRef(<integer> object number, <integer> object generation)
1181             <Object>:initCmd(<string>)
1182             <Object>:initError()
1183             <Object>:initEOF()
1184 <Object>  = <Object>:fetch(<XRef>)
1185 <integer> = <Object>:getType()
1186 <string>  = <Object>:getTypeName()
1187 <boolean> = <Object>:isBool()
1188 <boolean> = <Object>:isInt()
1189 <boolean> = <Object>:isReal()
1190 <boolean> = <Object>:isNum()
1191 <boolean> = <Object>:isString()
1192 <boolean> = <Object>:isName()
1193 <boolean> = <Object>:isNull()
1194 <boolean> = <Object>:isArray()
1195 <boolean> = <Object>:isDict()
1196 <boolean> = <Object>:isStream()
1197 <boolean> = <Object>:isRef()
1198 <boolean> = <Object>:isCmd()
1199 <boolean> = <Object>:isError()
1200 <boolean> = <Object>:isEOF()
1201 <boolean> = <Object>:isNone()
1202 <boolean> = <Object>:getBool()
1203 <integer> = <Object>:getInt()
1204 <number>  = <Object>:getReal()
1205 <number>  = <Object>:getNum()
1206 <string>  = <Object>:getString()
1207 <string>  = <Object>:getName()
1208 <Array>   = <Object>:getArray()
1209 <Dict>    = <Object>:getDict()
1210 <Stream>  = <Object>:getStream()
1211 <Ref>     = <Object>:getRef()
1212 <integer> = <Object>:getRefNum()
1213 <integer> = <Object>:getRefGen()
1214 <string>  = <Object>:getCmd()
1215 <integer> = <Object>:arrayGetLength()
1216           = <Object>:arrayAdd(<Object>)
1217 <Object>  = <Object>:arrayGet(<integer>)
1218 <Object>  = <Object>:arrayGetNF(<integer>)
1219 <integer> = <Object>:dictGetLength(<integer>)
1220           = <Object>:dictAdd(<string>, <Object>)
1221           = <Object>:dictSet(<string>, <Object>)
1222 <Object>  = <Object>:dictLookup(<string>)
1223 <Object>  = <Object>:dictLookupNF(<string>)
<string>  = <Object>:dictGetKey(<integer>)
<Object>  = <Object>:dictGetVal(<integer>)
<Object>  = <Object>:dictGetValNF(<integer>)
1227 <boolean> = <Object>:streamIs(<string>)
1228           = <Object>:streamReset()
1229 <integer> = <Object>:streamGetChar()
1230 <integer> = <Object>:streamLookChar()
1231 <integer> = <Object>:streamGetPos()
1232           = <Object>:streamSetPos(<integer>)
1233 <Dict>    = <Object>:streamGetDict()
1234 \stopfunctioncall
1235
1236 \type {Page} methods:
1237
1238 \startfunctioncall
1239 <boolean>      = <Page>:isOk()
1240 <integer>      = <Page>:getNum()
1241 <PDFRectangle> = <Page>:getMediaBox()
1242 <PDFRectangle> = <Page>:getCropBox()
1243 <boolean>      = <Page>:isCropped()
1244 <number>       = <Page>:getMediaWidth()
1245 <number>       = <Page>:getMediaHeight()
1246 <number>       = <Page>:getCropWidth()
1247 <number>       = <Page>:getCropHeight()
1248 <PDFRectangle> = <Page>:getBleedBox()
1249 <PDFRectangle> = <Page>:getTrimBox()
1250 <PDFRectangle> = <Page>:getArtBox()
1251 <integer>      = <Page>:getRotate()
1252 <string>       = <Page>:getLastModified()
1253 <Dict>         = <Page>:getBoxColorInfo()
1254 <Dict>         = <Page>:getGroup()
1255 <Stream>       = <Page>:getMetadata()
1256 <Dict>         = <Page>:getPieceInfo()
1257 <Dict>         = <Page>:getSeparationInfo()
1258 <Dict>         = <Page>:getResourceDict()
1259 <Object>       = <Page>:getAnnots()
1260 <Links>        = <Page>:getLinks(<Catalog>)
1261 <Object>       = <Page>:getContents()
1262 \stopfunctioncall
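
As an illustration of the \type {Catalog} and \type {Page} methods, the next
sketch collects the media box dimensions of all pages. The helper name \type
{page_sizes} is hypothetical, and the values are assumed to be in the units of
the \PDF\ file itself.

\starttyping
-- hypothetical helper: returns { [pagenumber] = { width = .., height = .. } }
local function page_sizes(doc)
    local sizes = { }
    local catalog = doc:getCatalog()
    if not catalog then return sizes end
    for i = 1, catalog:getNumPages() do
        local page = catalog:getPage(i)
        if page then
            sizes[i] = {
                width  = page:getMediaWidth(),
                height = page:getMediaHeight(),
            }
        end
    end
    return sizes
end
\stoptyping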
1263
1264 \type {PDFDoc} methods:
1265
1266 \startfunctioncall
1267 <boolean>  = <PDFDoc>:isOk()
1268 <integer>  = <PDFDoc>:getErrorCode()
1269 <string>   = <PDFDoc>:getErrorCodeName()
1270 <string>   = <PDFDoc>:getFileName()
1271 <XRef>     = <PDFDoc>:getXRef()
1272 <Catalog>  = <PDFDoc>:getCatalog()
1273 <number>   = <PDFDoc>:getPageMediaWidth()
1274 <number>   = <PDFDoc>:getPageMediaHeight()
1275 <number>   = <PDFDoc>:getPageCropWidth()
1276 <number>   = <PDFDoc>:getPageCropHeight()
1277 <integer>  = <PDFDoc>:getNumPages()
1278 <string>   = <PDFDoc>:readMetadata()
1279 <Object>   = <PDFDoc>:getStructTreeRoot()
1280 <integer>  = <PDFDoc>:findPage(<integer> object number, <integer> object generation)
1281 <Links>    = <PDFDoc>:getLinks(<integer>)
1282 <LinkDest> = <PDFDoc>:findDest(<string>)
1283 <boolean>  = <PDFDoc>:isEncrypted()
1284 <boolean>  = <PDFDoc>:okToPrint()
1285 <boolean>  = <PDFDoc>:okToChange()
1286 <boolean>  = <PDFDoc>:okToCopy()
1287 <boolean>  = <PDFDoc>:okToAddNotes()
1288 <boolean>  = <PDFDoc>:isLinearized()
1289 <Object>   = <PDFDoc>:getDocInfo()
1290 <Object>   = <PDFDoc>:getDocInfoNF()
1291 <integer>  = <PDFDoc>:getPDFMajorVersion()
1292 <integer>  = <PDFDoc>:getPDFMinorVersion()
1293 \stopfunctioncall
1294
1295 \type {PDFRectangle} methods:
1296
1297 \startfunctioncall
1298 <boolean>  = <PDFRectangle>:isValid()
1299 \stopfunctioncall
1300
1301 %\type {Ref} methods:
1302 %
1303 %\startfunctioncall
1304 %\stopfunctioncall
1305
1306 \type {Stream} methods:
1307
1308 \startfunctioncall
1309 <integer>  = <Stream>:getKind()
1310 <string>   = <Stream>:getKindName()
1311            = <Stream>:reset()
1312            = <Stream>:close()
1313 <integer>  = <Stream>:getChar()
1314 <integer>  = <Stream>:lookChar()
1315 <integer>  = <Stream>:getRawChar()
1316 <integer>  = <Stream>:getUnfilteredChar()
1317            = <Stream>:unfilteredReset()
1318 <integer>  = <Stream>:getPos()
1319 <boolean>  = <Stream>:isBinary()
1320 <Stream>   = <Stream>:getUndecodedStream()
1321 <Dict>     = <Stream>:getDict()
1322 \stopfunctioncall
1323
1324 \type {StructElement} methods:
1325
1326 \startfunctioncall
1327 <string>         = <StructElement>:getTypeName()
1328 <integer>        = <StructElement>:getType()
1329 <boolean>        = <StructElement>:isOk()
1330 <boolean>        = <StructElement>:isBlock()
1331 <boolean>        = <StructElement>:isInline()
1332 <boolean>        = <StructElement>:isGrouping()
1333 <boolean>        = <StructElement>:isContent()
1334 <boolean>        = <StructElement>:isObjectRef()
1335 <integer>        = <StructElement>:getMCID()
1336 <Ref>            = <StructElement>:getObjectRef()
1337 <Ref>            = <StructElement>:getParentRef()
1338 <boolean>        = <StructElement>:hasPageRef()
1339 <Ref>            = <StructElement>:getPageRef()
1340 <StructTreeRoot> = <StructElement>:getStructTreeRoot()
1341 <string>         = <StructElement>:getID()
1342 <string>         = <StructElement>:getLanguage()
1343 <integer>        = <StructElement>:getRevision()
1344                    <StructElement>:setRevision(<unsigned integer>)
1345 <string>         = <StructElement>:getTitle()
1346 <string>         = <StructElement>:getExpandedAbbr()
1347 <integer>        = <StructElement>:getNumChildren()
1348 <StructElement>  = <StructElement>:getChild()
                 = <StructElement>:appendChild(<StructElement>)
<integer>        = <StructElement>:getNumAttributes()
<Attribute>      = <StructElement>:getAttribute(<integer>)
<string>         = <StructElement>:appendAttribute(<Attribute>)
<Attribute>      = <StructElement>:findAttribute(<Attribute::Type>, <boolean>, <Attribute::Owner>)
1354 <string>         = <StructElement>:getAltText()
1355 <string>         = <StructElement>:getActualText()
1356 <string>         = <StructElement>:getText(<boolean>)
1357 <table>          = <StructElement>:getTextSpans()
1358 \stopfunctioncall
1359
1360 \type {StructTreeRoot} methods:
1361
1362 \startfunctioncall
1363 <StructElement> = <StructTreeRoot>:findParentElement
1364 <PDFDoc>        = <StructTreeRoot>:getDoc
1365 <Dict>          = <StructTreeRoot>:getRoleMap
1366 <Dict>          = <StructTreeRoot>:getClassMap
1367 <integer>       = <StructTreeRoot>:getNumChildren
1368 <StructElement> = <StructTreeRoot>:getChild
1369                   <StructTreeRoot>:appendChild
1371 \stopfunctioncall
1372
\type {TextSpan} has only one method:

\startfunctioncall
<string> = <TextSpan>:getText()
1377 \stopfunctioncall
1378
1379 \type {XRef} methods:
1380
1381 \startfunctioncall
1382 <boolean>  = <XRef>:isOk()
1383 <integer>  = <XRef>:getErrorCode()
1384 <boolean>  = <XRef>:isEncrypted()
1385 <boolean>  = <XRef>:okToPrint()
1386 <boolean>  = <XRef>:okToPrintHighRes()
1387 <boolean>  = <XRef>:okToChange()
1388 <boolean>  = <XRef>:okToCopy()
1389 <boolean>  = <XRef>:okToAddNotes()
1390 <boolean>  = <XRef>:okToFillForm()
1391 <boolean>  = <XRef>:okToAccessibility()
1392 <boolean>  = <XRef>:okToAssemble()
1393 <Object>   = <XRef>:getCatalog()
1394 <Object>   = <XRef>:fetch(<integer> object number, <integer> object generation)
1395 <Object>   = <XRef>:getDocInfo()
1396 <Object>   = <XRef>:getDocInfoNF()
1397 <integer>  = <XRef>:getNumObjects()
1398 <integer>  = <XRef>:getRootNum()
1399 <integer>  = <XRef>:getRootGen()
1400 <integer>  = <XRef>:getSize()
1401 <Object>   = <XRef>:getTrailerDict()
1402 \stopfunctioncall
1403
1404 There is an experimental function \type {epdf.openMemStream} that takes three
1405 arguments:
1406
1407 \starttabulate
\NC \type {stream}  \NC this is (in low level \LUA\ speak) a light userdata
                        object, i.e.\ a pointer to a sequence of bytes \NC \NR
\NC \type {length}  \NC this is the length of the stream in bytes \NC \NR
\NC \type {name}    \NC this is a unique identifier that is used for hashing the
                        stream, so that multiple uses of the same stream don't
                        take more memory \NC \NR
1413 \stoptabulate
1414
1415 Instead of a light userdata stream you can also pass a \LUA\ string, in which
1416 case the given length is (at most) the string length.
1417
1418 The returned object can be used in the \type {img} library instead of a filename.
Both the memory stream and its use in the image library are experimental and can
change. In case you wonder where this can be used: when you use the swiglib
1421 library for graphic magick, it can return such a userdata object. This permits
1422 conversion in memory and passing the result directly to the backend. This might
1423 save some runtime in one|-|pass workflows. This feature is currently not meant
1424 for production.
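
A sketch of the string variant, under the assumption that the whole file fits in
memory: the file name and the identifier are made up, and only the three
arguments described above are used.

\starttyping
local f = io.open("image.pdf", "rb")    -- hypothetical file name
if f then
    local data = f:read("*a")
    f:close()
    -- stream (here a Lua string), length in bytes, unique identifier
    local memdoc = epdf.openMemStream(data, #data, "image-pdf-in-memory")
end
\stoptyping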
1425
1426 \section{The \type {font} library}
1427
The font library provides the interface into the internals of the font system,
and it also contains helper functions to load traditional \TEX\ font metrics
formats. Other font loading functionality is provided by the \type {fontloader}
library that will be discussed in the next section.
1432
1433 \subsection{Loading a \TFM\ file}
1434
1435 The behavior documented in this subsection is considered stable in the sense that
1436 there will not be backward-incompatible changes any more.
1437
1438 \startfunctioncall
1439 <table> fnt = font.read_tfm(<string> name, <number> s)
1440 \stopfunctioncall
1441
The number \type {s} is a bit special:
1443
1444 \startitemize
1445 \startitem
1446     If it is positive, it specifies an \quote {at size} in scaled points.
1447 \stopitem
1448 \startitem
1449     If it is negative, its absolute value represents a \quote {scaled}
1450     setting relative to the designsize of the font.
1451 \stopitem
1452 \stopitemize
1453
1454 The internal structure of the metrics font table that is returned is explained in
1455 \in {chapter} [fonts].
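
The two size conventions can be sketched as follows. The example assumes that
the file \type {cmr10.tfm} can be located, and it uses the fact that one point
equals 65536 scaled points.

\starttyping
-- positive: an "at size" in scaled points (here: at 10pt)
local at10 = font.read_tfm("cmr10", 10 * 65536)

-- negative: a "scaled" value relative to the design size (here: scaled 1200)
local scaled = font.read_tfm("cmr10", -1200)

if at10 then
    texio.write_nl("loaded " .. tostring(at10.name) ..
                   " at " .. tostring(at10.size) .. "sp")
end
\stoptyping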
1456
1457 \subsection{Loading a \VF\ file}
1458
1459 The behavior documented in this subsection is considered stable in the sense that
1460 there will not be backward-incompatible changes any more.
1461
1462 \startfunctioncall
1463 <table> vf_fnt = font.read_vf(<string> name, <number> s)
1464 \stopfunctioncall
1465
1466 The meaning of the number \type {s} and the format of the returned table are
1467 similar to the ones in the \type {read_tfm()} function.
1468
1469 \subsection{The fonts array}
1470
1471 The whole table of \TEX\ fonts is accessible from \LUA\ using a virtual array.
1472
1473 \starttyping
1474 font.fonts[n] = { ... }
1475 <table> f = font.fonts[n]
1476 \stoptyping
1477
1478 See \in {chapter} [fonts] for the structure of the tables. Because this is a
1479 virtual array, you cannot call \type {pairs} on it, but see below for the \type
1480 {font.each} iterator.
1481
1482 The two metatable functions implementing the virtual array are:
1483
1484 \startfunctioncall
1485 <table> f = font.getfont(<number> n)
1486 font.setfont(<number> n, <table> f)
1487 \stopfunctioncall
1488
Note that at the moment, each access to \type {font.fonts} or call to \type
{font.getfont} creates a \LUA\ table for the whole font. This process can be
quite slow. In a later version of \LUATEX, this interface will change (it will
start using userdata objects instead of actual tables).
1493
Also note the following: assignments can only be made to fonts that have already
been defined in \TEX, but have not been accessed {\it at all\/} since that
definition. This limits the usability of the write access to \type {font.fonts}
quite a lot; a less stringent ruleset will likely be implemented later.
1498
1499 \subsection{Checking a font's status}
1500
1501 You can test for the status of a font by calling this function:
1502
1503 \startfunctioncall
1504 <boolean> f = font.frozen(<number> n)
1505 \stopfunctioncall
1506
1507 The return value is one of \type {true} (unassignable), \type {false} (can be
1508 changed) or \type {nil} (not a valid font at all).
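
Since there are three possible outcomes, it makes sense to test them separately
before trying to assign to a font. In this sketch the helper name \type
{try_replace} is hypothetical:

\starttyping
-- hypothetical helper: assign table f to font n when that is allowed
local function try_replace(n, f)
    local state = font.frozen(n)
    if state == nil then
        return false, "not a valid font"
    elseif state then
        return false, "font can no longer be assigned to"
    end
    font.setfont(n, f)
    return true
end
\stoptyping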
1509
1510 \subsection{Defining a font directly}
1511
1512 You can define your own font into \type {font.fonts} by calling this function:
1513
1514 \startfunctioncall
1515 <number> i = font.define(<table> f)
1516 \stopfunctioncall
1517
1518 The return value is the internal id number of the defined font (the index into
1519 \type {font.fonts}). If the font creation fails, an error is raised. The table
1520 is a font structure, as explained in \in {chapter} [fonts].
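
A minimal sketch that loads a \TFM\ file, defines it as a new font and makes it
the current font. It assumes that \type {cmr10.tfm} can be located; the size is
chosen for illustration only.

\starttyping
local f = font.read_tfm("cmr10", 12 * 65536)   -- cmr10 at 12pt
if f then
    local id = font.define(f)   -- raises an error when creation fails
    font.current(id)            -- switch to the new font
    texio.write_nl("defined font with id " .. id)
end
\stoptyping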
1521
1522 \subsection{Projected next font id}
1523
1524 \startfunctioncall
1525 <number> i = font.nextid()
1526 \stopfunctioncall
1527
This returns the font id number that would be returned by a \type {font.define}
call if it were executed at this spot in the code flow. This is useful for
virtual fonts that need to reference themselves.
1531
1532 \subsection{Font id}
1533
1534 \startfunctioncall
1535 <number> i = font.id(<string> csname)
1536 \stopfunctioncall
1537
This returns the font id associated with the \type {csname} string, or $-1$ if
\type {csname} is not defined.
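
The special value $-1$ should be tested before the id is used. The sketch below
looks up \type {tenrm}, a font command that is assumed to exist because plain
\TEX\ defines it; other formats may use different names.

\starttyping
local id = font.id("tenrm")   -- assumption: \tenrm exists, as in plain TeX
if id == -1 then
    texio.write_nl("\\tenrm is not a defined font")
else
    texio.write_nl("\\tenrm has font id " .. id)
end
\stoptyping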
1540
1541 \subsection{Currently active font}
1542
1543 \startfunctioncall
1544 <number> i = font.current()
1545 font.current(<number> i)
1546 \stopfunctioncall
1547
1548 This gets or sets the currently used font number.
1549
1550 \subsection{Maximum font id}
1551
1552 \startfunctioncall
1553 <number> i = font.max()
1554 \stopfunctioncall
1555
1556 This is the largest used index in \type {font.fonts}.
1557
1558 \subsection{Iterating over all fonts}
1559
1560 \startfunctioncall
1561 for i,v in font.each() do
1562   ...
1563 end
1564 \stopfunctioncall
1565
1566 This is an iterator over each of the defined \TEX\ fonts. The first returned
1567 value is the index in \type {font.fonts}, the second the font itself, as a \LUA\
1568 table. The indices are listed incrementally, but they do not always form an array
1569 of consecutive numbers: in some cases there can be holes in the sequence.
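
For example, the following sketch writes the id, name and size of every defined
font to the terminal and log. The \type {name} and \type {size} fields are those
of the font table described in \in {chapter} [fonts].

\starttyping
for id, f in font.each() do
    texio.write_nl(string.format("%4d  %s  %ssp",
        id, tostring(f.name), tostring(f.size)))
end
\stoptyping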
1570
1571 \section{The \type {fontloader} library}
1572
1573 \subsection{Getting quick information on a font}
1574
1575 \startfunctioncall
1576 <table> info = fontloader.info(<string> filename)
1577 \stopfunctioncall
1578
1579 This function returns either \type {nil}, or a \type {table}, or an array of
1580 small tables (in the case of a TrueType collection). The returned table(s) will
1581 contain some fairly interesting information items from the font(s) defined by the
1582 file:
1583
1584 \starttabulate[|lT|l|p|]
1585 \NC \ssbf key    \NC \bf type \NC \bf explanation \NC \NR
1586 \NC fontname     \NC string   \NC the \POSTSCRIPT\ name of the font\NC \NR
1587 \NC fullname     \NC string   \NC the formal name of the font\NC \NR
1588 \NC familyname   \NC string   \NC the family name this font belongs to\NC \NR
1589 \NC weight       \NC string   \NC a string indicating the color value of the font\NC \NR
1590 \NC version      \NC string   \NC the internal font version\NC \NR
1591 \NC italicangle  \NC float    \NC the slant angle\NC \NR
1592 \NC units_per_em \NC number   \NC 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC \NR
1593 \NC pfminfo      \NC table    \NC (see \in{section}[fontloaderpfminfotable])\NC \NR
1594 \stoptabulate
1595
Getting information through this function is (sometimes much) more efficient than
loading the font properly, and is therefore handy when you want to create a
dictionary of available fonts based on the contents of a directory.
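
Because the shape of the result depends on the file, a caller has to handle all
three cases. A sketch, in which the helper name \type {font_names} is
hypothetical and a single font is recognized by the presence of its \type
{fontname} key:

\starttyping
-- hypothetical helper: returns an array with the fontname(s) in a file
local function font_names(filename)
    local info  = fontloader.info(filename)
    local names = { }
    if info == nil then
        return names                      -- not a usable font file
    elseif info.fontname then
        names[1] = info.fontname          -- a single font
    else
        for i, entry in ipairs(info) do   -- a collection
            names[i] = entry.fontname
        end
    end
    return names
end
\stoptyping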
1599
\subsection{Loading an \OPENTYPE\ or \TRUETYPE\ file}

If you want to use an \OPENTYPE\ font, you have to get the metric information
from somewhere. Using the \type {fontloader} library, the simplest way to get
that information is as follows:
1604
1605 \starttyping
1606 function load_font (filename)
1607   local metrics = nil
1608   local font = fontloader.open(filename)
1609   if font then
1610      metrics = fontloader.to_table(font)
1611      fontloader.close(font)
1612   end
1613   return metrics
1614 end
1615
1616 myfont = load_font('/opt/tex/texmf/fonts/data/arial.ttf')
1617 \stoptyping
1618
1619 The main function call is
1620
1621 \startfunctioncall
1622 <userdata> f, <table> w = fontloader.open(<string> filename)
1623 <userdata> f, <table> w = fontloader.open(<string> filename, <string> fontname)
1624 \stopfunctioncall
1625
The first return value is a userdata representation of the font. The second
return value is a table containing any warnings and errors reported by fontloader
while opening the font. In normal typesetting, you would probably ignore the
second return value, but it can be useful for debugging purposes.
1630
For \TRUETYPE\ collections (when the file name ends in \type {ttc}) and \DFONT\
collections, you have to use a second string argument to specify which font you
want from the collection. Use the \type {fontname} strings that are returned by
\type {fontloader.info} for that.
1635
1636 To turn the font into a table, \type {fontloader.to_table} is used on the font
1637 returned by \type {fontloader.open}.
1638
1639 \startfunctioncall
1640 <table> f = fontloader.to_table(<userdata> font)
1641 \stopfunctioncall
1642
1643 This table cannot be used directly by \LUATEX\ and should be turned into another
1644 one as described in~\in {chapter} [fonts]. Do not forget to store the \type
1645 {fontname} value in the \type {psname} field of the metrics table to be returned
1646 to \LUATEX, otherwise the font inclusion backend will not be able to find the
1647 correct font in the collection.
1648
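A minimal sketch of the first step of such a conversion is given here. The helper
name and the sample values are hypothetical, only a handful of descriptive fields
are copied, and the choice of which source field feeds which target field (other
than \type {psname}) is an assumption; the complete layout expected by \LUATEX\
is the one described in~\in {chapter} [fonts].

\starttyping
-- Start a metrics table from a table shaped like the result of
-- fontloader.to_table. Characters and parameters are left empty here
-- and still have to be filled in.
local function start_metrics(raw, filename)
    return {
        name         = raw.fullname or raw.fontname,
        fullname     = raw.fullname,
        psname       = raw.fontname,  -- lets the backend find the font
        filename     = filename,
        units_per_em = raw.units_per_em,
        type         = "real",
        characters   = { },
    }
end

-- Made-up data standing in for a loaded font.
local raw = { fontname = "Demo-Bold", fullname = "Demo Bold", units_per_em = 1000 }
local metrics = start_metrics(raw, "demo.ttc")
\stoptyping
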
1649 See \in {section} [fontloadertables] for details on the userdata object returned
1650 by \type {fontloader.open()} and the layout of the \type {metrics} table returned
1651 by \type {fontloader.to_table()}.
1652
1653 The font file is parsed and partially interpreted by the font loading routines
1654 from \FONTFORGE. The file format can be \OPENTYPE, \TRUETYPE, \TRUETYPE\
1655 Collection, \CFF, or \TYPEONE.
1656
1657 There are a few advantages to this approach compared to reading the actual font
1658 file ourselves:
1659
1660 \startitemize
1661
1662 \startitem
1663     The font is automatically re|-|encoded, so that the \type {metrics} table for
1664     \TRUETYPE\ and \OPENTYPE\ fonts is using \UNICODE\ for the character indices.
1665 \stopitem
1666
1667 \startitem
1668     Many features are pre|-|processed into a format that is easier to handle than
1669     just the bare tables would be.
1670 \stopitem
1671
1672 \startitem
1673     \POSTSCRIPT|-|based \OPENTYPE\ fonts do not store the character height and
1674     depth in the font file, so the character boundingbox has to be calculated in
1675     some way.
1676 \stopitem
1677
1678 \startitem
1679     In the future, it may be interesting to allow \LUA\ scripts access to
1680     the font program itself, perhaps even creating or changing the font.
1681 \stopitem
1682
1683 \stopitemize
1684
1685 A loaded font is discarded with:
1686
1687 \startfunctioncall
1688 fontloader.close(<userdata> font)
1689 \stopfunctioncall
1690
1691 \subsection{Applying a \quote{feature file}}
1692
1693 You can apply a \quote{feature file} to a loaded font:
1694
1695 \startfunctioncall
1696 <table> errors = fontloader.apply_featurefile(<userdata> font, <string> filename)
1697 \stopfunctioncall
1698
1699 A \quote {feature file} is a textual representation of the features in an
1700 \OPENTYPE\ font. See
1701
1702 \starttyping
1703 http://www.adobe.com/devnet/opentype/afdko/topic_feature_file_syntax.html
1704 \stoptyping
1705
1706 and
1707
1708 \starttyping
1709 http://fontforge.sourceforge.net/featurefile.html
1710 \stoptyping
1711
1712 for a more detailed description of feature files.
1713
1714 If the function fails, the return value is a table containing any errors reported
1715 by fontloader while applying the feature file. On success, \type {nil} is
1716 returned.
1717
1718 \subsection{Applying an \quote{\AFM\ file}}
1719
1720 You can apply an \quote {\AFM\ file} to a loaded font:
1721
1722 \startfunctioncall
1723 <table> errors = fontloader.apply_afmfile(<userdata> font, <string> filename)
1724 \stopfunctioncall
1725
1726 An \AFM\ file is a textual representation of (some of) the meta information
1727 in a \TYPEONE\ font. See
1728
1729 \starttyping
1730 ftp://ftp.math.utah.edu/u/ma/hohn/linux/postscript/5004.AFM_Spec.pdf
1731 \stoptyping
1732
1733 for more information about \AFM\ files.
1734
1735 Note: If you \type {fontloader.open()} a \TYPEONE\ file named \type {font.pfb},
1736 the library will automatically search for and apply \type {font.afm} if it exists
1737 in the same directory as the file \type {font.pfb}. In that case, there is no
1738 need for an explicit call to \type {apply_afmfile()}.
1739
1740 If the function fails, the return value is a table containing any errors reported
1741 by fontloader while applying the \AFM\ file. On success, \type {nil} is returned.
1742
1743 \subsection[fontloadertables]{Fontloader font tables}
1744
1745 As mentioned earlier, the return value of \type {fontloader.open()} is a userdata
1746 object. One way to have access to the actual metrics is to call \type
1747 {fontloader.to_table()} on this object, returning the table structure that is
1748 explained in the following subsections.
1749
1750 However, it turns out that the result from \type {fontloader.to_table()}
1751 sometimes needs very large amounts of memory (depending on the font's complexity
1752 and size) so it is possible to access the userdata object directly.
1753
1754 \startitemize
1755 \startitem
1756     All top|-|level keys that would be returned by \type {to_table()}
1757     can also be accessed directly.
1758 \stopitem
1759 \startitem
1760     The top|-|level key \quote {glyphs} returns a {\it virtual\/} array that
1761     allows indices from \type {f.glyphmin} to \type {f.glyphmax}.
1762 \stopitem
1763 \startitem
1764     The items in that virtual array (the actual glyphs) are themselves also
1765     userdata objects, and each has accessors for all of the keys explained in the
1766     section \quote {Glyph items} below.
1767 \stopitem
1768 \startitem
1769     The top|-|level key \quote {subfonts} returns an {\it actual\/} array of userdata
1770     objects, one for each of the subfonts (or \type {nil}, if there are no subfonts).
1771 \stopitem
1772 \stopitemize
1773
1774 A short example may be helpful. This code generates a printout of all
1775 the glyph names in the font \type {PunkNova.kern.otf}:
1776
1777 \starttyping
1778 -- f is a userdata object, so no large table is built here
1779 local f = fontloader.open('PunkNova.kern.otf')
1780 print (f.fontname)
1781 if f.glyphcnt > 0 then
1782     for i=f.glyphmin,f.glyphmax do
1783        -- slots in the virtual array can be empty
1784        local g = f.glyphs[i]
1785        if g then
1786           print(g.name)
1787        end
1788     end
1789 end
1790 fontloader.close(f)
1791 \stoptyping
1792
1793 In this case, the \LUATEX\ memory requirement stays below 100MB on the test
1794 computer, while the internal structure generated by \type {to_table()} needs more
1795 than 2GB of memory (the font itself is 6.9MB in disk size).
1796
1797 Only the top|-|level font, the subfont table entries, and the glyphs are virtual
1798 objects; everything else still produces normal \LUA\ values and tables.
1799
1800 If you want to know the valid fields in a font or glyph structure, call the \type
1801 {fields} function on an object of a particular type (either glyph or font):
1802
1803 \startfunctioncall
1804 <table> fields = fontloader.fields(<userdata> font)
1805 <table> fields = fontloader.fields(<userdata> font_glyph)
1806 \stopfunctioncall
1807
1808 For instance:
1809
1810 \startfunctioncall
1811 local fields = fontloader.fields(f)
1812 local fields = fontloader.fields(f.glyphs[0])
1813 \stopfunctioncall
1814
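One possible use of the field list is to copy a single virtual object into a
plain table without converting the whole font. The sketch below assumes that the
result of \type {fontloader.fields} is an array of key names; the helper name
\type {snapshot} is made up, and ordinary tables stand in for the userdata so
that the example does not depend on a font file.

\starttyping
-- Copy the listed fields of an object into a plain Lua table.
-- 'object' may be a glyph or font userdata, or an ordinary table;
-- 'fields' is assumed to be an array of key names.
local function snapshot(object, fields)
    local copy = { }
    for i = 1, #fields do
        local key = fields[i]
        local value = object[key]
        if value ~= nil then
            copy[key] = value
        end
    end
    return copy
end

-- Illustration with ordinary tables standing in for the userdata.
local glyph  = { name = "A", unicode = 65, width = 722 }
local fields = { "name", "unicode", "width", "vwidth" }
local copy   = snapshot(glyph, fields)
print(copy.name, copy.unicode, copy.width)
\stoptyping

Under \LUATEX\ the same helper could presumably be called as \type {snapshot(g,
fontloader.fields(g))} for a glyph userdata \type {g}.
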
1815 \subsubsection{Table types}
1816
1817 \subsubsubsection{Top-level}
1818
1819 The top|-|level keys in the returned table are (the explanations in this part of
1820 the documentation are not yet finished):
1821
1822 \starttabulate[|lT|l|p|]
1823 \NC \ssbf key                    \NC \bf type \NC \bf explanation \NC \NR
1824 \NC table_version                \NC number   \NC indicates the metrics version (currently~0.3)\NC \NR
1825 \NC fontname                     \NC string   \NC \POSTSCRIPT\ font name\NC \NR
1826 \NC fullname                     \NC string   \NC official (human-oriented) font name\NC \NR
1827 \NC familyname                   \NC string   \NC family name\NC \NR
1828 \NC weight                       \NC string   \NC weight indicator\NC \NR
1829 \NC copyright                    \NC string   \NC copyright information\NC \NR
1830 \NC filename                     \NC string   \NC the file name\NC \NR
1831 \NC version                      \NC string   \NC font version\NC \NR
1832 \NC italicangle                  \NC float    \NC slant angle\NC \NR
1833 \NC units_per_em                 \NC number   \NC 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC \NR
1834 \NC ascent                       \NC number   \NC height of ascender in \type {units_per_em}\NC \NR
1835 \NC descent                      \NC number   \NC depth of descender in \type {units_per_em}\NC \NR
1836 \NC upos                         \NC float    \NC \NC \NR
1837 \NC uwidth                       \NC float    \NC \NC \NR
1838 \NC uniqueid                     \NC number   \NC \NC \NR
1839 \NC glyphs                       \NC array    \NC \NC \NR
1840 \NC glyphcnt                     \NC number   \NC number of included glyphs\NC \NR
1841 \NC glyphmax                     \NC number   \NC maximum used index of the glyphs array\NC \NR
1842 \NC glyphmin                     \NC number   \NC minimum used index of the glyphs array\NC \NR
1843 \NC notdef_loc                   \NC number   \NC location of the \type {.notdef} glyph
1844                                                   or \type {-1} when not present \NC \NR
1845 \NC hasvmetrics                  \NC number   \NC \NC \NR
1846 \NC onlybitmaps                  \NC number   \NC \NC \NR
1847 \NC serifcheck                   \NC number   \NC \NC \NR
1848 \NC isserif                      \NC number   \NC \NC \NR
1849 \NC issans                       \NC number   \NC \NC \NR
1850 \NC encodingchanged              \NC number   \NC \NC \NR
1851 \NC strokedfont                  \NC number   \NC \NC \NR
1852 \NC use_typo_metrics             \NC number   \NC \NC \NR
1853 \NC weight_width_slope_only      \NC number   \NC \NC \NR
1854 \NC head_optimized_for_cleartype \NC number   \NC \NC \NR
1855 \NC uni_interp                   \NC enum     \NC \type {unset}, \type {none}, \type {adobe},
1856                                                   \type {greek}, \type {japanese}, \type {trad_chinese},
1857                                                   \type {simp_chinese}, \type {korean}, \type {ams}\NC \NR
1858 \NC origname                     \NC string   \NC the file name, as supplied by the user\NC \NR
1859 \NC map                          \NC table    \NC \NC \NR
1860 \NC private                      \NC table    \NC \NC \NR
1861 \NC xuid                         \NC string   \NC \NC \NR
1862 \NC pfminfo                      \NC table    \NC \NC \NR
1863 \NC names                        \NC table    \NC \NC \NR
1864 \NC cidinfo                      \NC table    \NC \NC \NR
1865 \NC subfonts                     \NC array    \NC \NC \NR
1866 \NC comments                     \NC string   \NC \NC \NR
1867 \NC fontlog                      \NC string   \NC \NC \NR
1868 \NC cvt_names                    \NC string   \NC \NC \NR
1869 \NC anchor_classes               \NC table    \NC \NC \NR
1870 \NC ttf_tables                   \NC table    \NC \NC \NR
1871 \NC ttf_tab_saved                \NC table    \NC \NC \NR
1872 \NC kerns                        \NC table    \NC \NC \NR
1873 \NC vkerns                       \NC table    \NC \NC \NR
1874 \NC texdata                      \NC table    \NC \NC \NR
1875 \NC lookups                      \NC table    \NC \NC \NR
1876 \NC gpos                         \NC table    \NC \NC \NR
1877 \NC gsub                         \NC table    \NC \NC \NR
1878 \NC mm                           \NC table    \NC \NC \NR
1879 \NC chosenname                   \NC string   \NC \NC \NR
1880 \NC macstyle                     \NC number   \NC \NC \NR
1881 \NC fondname                     \NC string   \NC \NC \NR
1882 %NC design_size                  \NC number   \NC \NC \NR
1883 \NC fontstyle_id                 \NC number   \NC \NC \NR
1884 \NC fontstyle_name               \NC table    \NC \NC \NR
1885 %NC design_range_bottom          \NC number   \NC \NC \NR
1886 %NC design_range_top             \NC number   \NC \NC \NR
1887 \NC strokewidth                  \NC float    \NC \NC \NR
1888 \NC mark_classes                 \NC table    \NC \NC \NR
1889 \NC creationtime                 \NC number   \NC \NC \NR
1890 \NC modificationtime             \NC number   \NC \NC \NR
1891 \NC os2_version                  \NC number   \NC \NC \NR
1892 \NC sfd_version                  \NC number   \NC \NC \NR
1893 \NC math                         \NC table    \NC \NC \NR
1894 \NC validation_state             \NC table    \NC \NC \NR
1895 \NC horiz_base                   \NC table    \NC \NC \NR
1896 \NC vert_base                    \NC table    \NC \NC \NR
1897 \NC extrema_bound                \NC number   \NC \NC \NR
1898 \stoptabulate
1899
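The values given in font units can be converted to a length at a given font size
by scaling with \type {units_per_em}. A small sketch of that arithmetic follows;
the helper name and the sample metrics are made up, and the size is expressed in
scaled points (65536 per point).

\starttyping
-- Convert a value in font design units to the unit of 'size'
-- for a font with the given units_per_em.
local function scale_units(value, units_per_em, size)
    return value * size / units_per_em
end

-- A hypothetical TrueType font at 10pt, with the size in scaled points.
local font = { units_per_em = 2048, ascent = 1638, descent = 410 }
local size = 10 * 65536

local ascent  = scale_units(font.ascent,  font.units_per_em, size)
local descent = scale_units(font.descent, font.units_per_em, size)
\stoptyping
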
1900 \subsubsubsection{Glyph items}
1901
1902 The \type {glyphs} table is an array containing the per|-|character
1903 information (quite a few of the keys are only present if nonzero).
1904
1905 \starttabulate[|lT|l|p|]
1906 \NC \ssbf key         \NC \bf type \NC \bf explanation \NC \NR
1907 \NC name              \NC string   \NC the glyph name \NC \NR
1908 \NC unicode           \NC number   \NC unicode code point, or -1 \NC \NR
1909 \NC boundingbox       \NC array    \NC array of four numbers, see note below \NC \NR
1910 \NC width             \NC number   \NC only for horizontal fonts \NC \NR
1911 \NC vwidth            \NC number   \NC only for vertical fonts \NC \NR
1912 \NC tsidebearing      \NC number   \NC only for vertical ttf/otf fonts, and only if nonzero \NC \NR
1913 \NC lsidebearing      \NC number   \NC only if nonzero and not equal to boundingbox[1] \NC \NR
1914 \NC class             \NC string   \NC one of "none", "base", "ligature", "mark", "component"
1915                                        (if not present, the glyph class is \quote {automatic}) \NC \NR
1916 \NC kerns             \NC array    \NC only for horizontal fonts, if set \NC \NR
1917 \NC vkerns            \NC array    \NC only for vertical fonts, if set \NC \NR
1918 \NC dependents        \NC array    \NC linear array of glyph name strings, only if nonempty\NC \NR
1919 \NC lookups           \NC table    \NC only if nonempty \NC \NR
1920 \NC ligatures         \NC table    \NC only if nonempty \NC \NR
1921 \NC anchors           \NC table    \NC only if set \NC \NR
1922 \NC comment           \NC string   \NC only if set \NC \NR
1923 \NC tex_height        \NC number   \NC only if set \NC \NR
1924 \NC tex_depth         \NC number   \NC only if set \NC \NR
1925 \NC italic_correction \NC number   \NC only if set \NC \NR
1926 \NC top_accent        \NC number   \NC only if set \NC \NR
1927 \NC is_extended_shape \NC number   \NC only if this character is part of a math extension list \NC \NR
1928 \NC altuni            \NC table    \NC alternate \UNICODE\ items \NC \NR
1929 \NC vert_variants     \NC table    \NC \NC \NR
1930 \NC horiz_variants    \NC table    \NC \NC \NR
1931 \NC mathkern          \NC table    \NC \NC \NR
1932 \stoptabulate
1933
1934 On \type {boundingbox}: The boundingbox information for \TRUETYPE\ fonts and
1935 \TRUETYPE-based \OTF\ fonts is read directly from the font file.
1936 \POSTSCRIPT-based fonts do not have this information, so the boundingbox of
1937 traditional \POSTSCRIPT\ fonts is generated by interpreting the actual bezier
1938 curves to find the exact boundingbox. This can be a slow process, so the
1939 boundingboxes of \POSTSCRIPT-based \OTF\ fonts (and raw \CFF\ fonts) are
1940 calculated using an approximation of the glyph shape based on the actual glyph
1941 points only, instead of taking the whole curve into account. This means that
1942 glyphs that have missing points at extrema will have a too|-|tight boundingbox,
1943 but the processing is so much faster that in our opinion the tradeoff is worth
1944 it.
1945
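Whatever way the boundingbox was obtained, a height and depth can be derived from
it. The sketch below assumes the four numbers are ordered as lower left x, lower
left y, upper right x, upper right y (the \type {lsidebearing} entry in the table
above suggests that the first number is the left edge); the helper name and the
glyph data are hypothetical.

\starttyping
-- Derive height and depth (in font units) from a glyph's boundingbox,
-- assumed to be { llx, lly, urx, ury }. Both results are non-negative.
local function height_depth(glyph)
    local bb = glyph.boundingbox
    if not bb then
        return 0, 0
    end
    local height = bb[4] > 0 and bb[4] or 0
    local depth  = bb[2] < 0 and -bb[2] or 0
    return height, depth
end

-- Made-up glyph with a descender.
local g = { name = "g", boundingbox = { 28, -215, 470, 450 } }
local h, d = height_depth(g)
\stoptyping
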
1946 The \type {kerns} and \type {vkerns} are linear arrays of small hashes:
1947
1948 \starttabulate[|lT|l|p|]
1949 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
1950 \NC char      \NC string   \NC \NC \NR
1951 \NC off       \NC number   \NC \NC \NR
1952 \NC lookup    \NC string   \NC \NC \NR
1953 \stoptabulate
1954
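As an illustration, such an array can be folded into a simple lookup table. The
sketch assumes that \type {char} holds the name of the following glyph and \type
{off} the kern amount in font units, which the table above does not spell out;
the helper name, the sample glyph and the lookup name are made up.

\starttyping
-- Turn the kerns array of a glyph into a map from the name of the
-- following glyph to the kern amount. If several entries exist for
-- the same glyph name, the first one wins; that policy is a choice of
-- this sketch, not something prescribed by the library.
local function kern_map(glyph)
    local map = { }
    for _, kern in ipairs(glyph.kerns or { }) do
        if map[kern.char] == nil then
            map[kern.char] = kern.off
        end
    end
    return map
end

local glyph = {
    name  = "A",
    kerns = {
        { char = "V", off = -80, lookup = "demo_kern_subtable" },
        { char = "T", off = -60, lookup = "demo_kern_subtable" },
    },
}
local map = kern_map(glyph)
\stoptyping
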
1955 The \type {lookups} table is a hash, keyed by lookup subtable name, where
1956 the value of each key is a linear array of small hashes:
1957
1958 % TODO: fix this description
1959 \starttabulate[|lT|l|p|]
1960 \NC \ssbf key     \NC \bf type \NC \bf explanation \NC \NR
1961 \NC type          \NC enum     \NC \type {position}, \type {pair}, \type
1962                                    {substitution}, \type {alternate}, \type
1963                                    {multiple}, \type {ligature}, \type {lcaret},
1964                                    \type {kerning}, \type {vkerning}, \type
1965                                    {anchors}, \type {contextpos}, \type
1966                                    {contextsub}, \type {chainpos}, \type
1967                                    {chainsub}, \type {reversesub}, \type {max},
1968                                    \type {kernback}, \type {vkernback} \NC \NR
1969 \NC specification \NC table    \NC extra data \NC \NR
1970 \stoptabulate
1971
1972 For the first seven values of \type {type}, there can be additional
1973 sub|-|information, stored in the sub-table \type {specification}:
1974
1975 \starttabulate[|lT|l|p|]
1976 \NC \ssbf value  \NC \bf type \NC \bf explanation \NC \NR
1977 \NC position     \NC table    \NC a table of the \type {offset_specs} type \NC \NR
1978 \NC pair         \NC table    \NC one string: \type {paired}, and an array of one
1979                                   or two \type {offset_specs} tables: \type
1980                                   {offsets} \NC \NR
1981 \NC substitution \NC table    \NC one string: \type {variant} \NC \NR
1982 \NC alternate    \NC table    \NC one string: \type {components} \NC \NR
1983 \NC multiple     \NC table    \NC one string: \type {components} \NC \NR
1984 \NC ligature     \NC table    \NC two strings: \type {components}, \type {char} \NC \NR
1985 \NC lcaret       \NC array    \NC linear array of numbers \NC \NR
1986 \stoptabulate
1987
1988 Tables for \type {offset_specs} contain up to four number|-|valued fields: \type
1989 {x} (a horizontal offset), \type {y} (a vertical offset), \type {h} (an advance
1990 width correction) and \type {v} (an advance height correction).
1991
1992 The \type {ligatures} table is a linear array of small hashes:
1993
1994 \starttabulate[|lT|l|p|]
1995 \NC \ssbf key  \NC \bf type \NC \bf explanation \NC \NR
1996 \NC lig        \NC table    \NC uses the same substructure as a single item in
1997                                 the \type {lookups} table explained above \NC \NR
1998 \NC char       \NC string   \NC \NC \NR
1999 \NC components \NC array    \NC linear array of named components \NC \NR
2000 \NC ccnt       \NC number   \NC \NC \NR
2001 \stoptabulate
2002
2003 The \type {anchor} table is indexed by a string signifying the anchor type, which
2004 is one of
2005
2006 \starttabulate[|lT|l|p|]
2007 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2008 \NC mark      \NC table    \NC placement mark \NC \NR
2009 \NC basechar  \NC table    \NC mark for attaching combining items to a base char \NC \NR
2010 \NC baselig   \NC table    \NC mark for attaching combining items to a ligature \NC \NR
2011 \NC basemark  \NC table    \NC mark for attaching combining items to another mark \NC \NR
2012 \NC centry    \NC table    \NC cursive entry point \NC \NR
2013 \NC cexit     \NC table    \NC cursive exit point \NC \NR
2014 \stoptabulate
2015
2016 The content of these is a short array of defined anchors, with the
2017 entry keys being the anchor names. For all except \type {baselig}, the
2018 value is a single table with this definition:
2019
2020 \starttabulate[|lT|l|p|]
2021 \NC \ssbf key    \NC \bf type \NC \bf explanation \NC \NR
2022 \NC x            \NC number   \NC x location \NC \NR
2023 \NC y            \NC number   \NC y location \NC \NR
2024 \NC ttf_pt_index \NC number   \NC truetype point index, only if given \NC \NR
2025 \stoptabulate
2026
2027 For \type {baselig}, the value is a small array of such anchor sets, one for
2028 each constituent item of the ligature.
2029
2030 For clarification, an anchor table could for example look like this:
2031
2032 \starttyping
2033 ['anchor'] = {
2034     ['basemark'] = {
2035         ['Anchor-7'] = { ['x']=170, ['y']=1080 }
2036     },
2037     ['mark'] = {
2038         ['Anchor-1'] = { ['x']=160, ['y']=810 },
2039         ['Anchor-4'] = { ['x']=160, ['y']=800 }
2040     },
2041     ['baselig'] = {
2042         [1] = { ['Anchor-2'] = { ['x']=160, ['y']=650 } },
2043         [2] = { ['Anchor-2'] = { ['x']=460, ['y']=640 } }
2044     }
2045 }
2046 \stoptyping
2047
2048 Note: The \type {baselig} table can be sparse!
2049
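Because of that sparseness, code that walks a \type {baselig} array should not
rely on \type {ipairs}, which stops at the first missing index. A small sketch,
with a hypothetical helper name and made|-|up anchor data in which the second
ligature component has no anchors:

\starttyping
-- Visit every anchor of a baselig table. Because the array can have
-- holes, pairs is used instead of ipairs; the visiting order is
-- therefore not guaranteed.
local function each_baselig_anchor(baselig, action)
    for component, anchors in pairs(baselig) do
        for name, anchor in pairs(anchors) do
            action(component, name, anchor.x, anchor.y)
        end
    end
end

-- component 2 is absent in this made-up ligature
local baselig = {
    [1] = { ['Anchor-2'] = { x = 160, y = 650 } },
    [3] = { ['Anchor-2'] = { x = 760, y = 640 } },
}

local seen = { }
each_baselig_anchor(baselig, function(component, name, x, y)
    seen[#seen + 1] = string.format("%d %s %d %d", component, name, x, y)
end)
table.sort(seen)
\stoptyping
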
2050 \subsubsubsection{map table}
2051
2052 The top|-|level map is a list of encoding mappings. Each of those is a table
2053 itself.
2054
2055 \starttabulate[|lT|l|p|]
2056 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2057 \NC enccount  \NC number   \NC \NC \NR
2058 \NC encmax    \NC number   \NC \NC \NR
2059 \NC backmax   \NC number   \NC \NC \NR
2060 \NC remap     \NC table    \NC \NC \NR
2061 \NC map       \NC array    \NC non|-|linear array of mappings\NC \NR
2062 \NC backmap   \NC array    \NC non|-|linear array of backward mappings\NC \NR
2063 \NC enc       \NC table    \NC \NC \NR
2064 \stoptabulate
2065
2066 The \type {remap} table is very small:
2067
2068 \starttabulate[|lT|l|p|]
2069 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2070 \NC firstenc  \NC number   \NC \NC \NR
2071 \NC lastenc   \NC number   \NC \NC \NR
2072 \NC infont    \NC number   \NC \NC \NR
2073 \stoptabulate
2074
2075 The \type {enc} table is a bit more verbose:
2076
2077 \starttabulate[|lT|l|p|]
2078 \NC \ssbf key        \NC \bf type \NC \bf explanation \NC \NR
2079 \NC enc_name         \NC string   \NC \NC \NR
2080 \NC char_cnt         \NC number   \NC \NC \NR
2081 \NC char_max         \NC number   \NC \NC \NR
2082 \NC unicode          \NC array    \NC of \UNICODE\ position numbers\NC \NR
2083 \NC psnames          \NC array    \NC of \POSTSCRIPT\ glyph names\NC \NR
2084 \NC builtin          \NC number   \NC \NC \NR
2085 \NC hidden           \NC number   \NC \NC \NR
2086 \NC only_1byte       \NC number   \NC \NC \NR
2087 \NC has_1byte        \NC number   \NC \NC \NR
2088 \NC has_2byte        \NC number   \NC \NC \NR
2089 \NC is_unicodebmp    \NC number   \NC only if nonzero\NC \NR
2090 \NC is_unicodefull   \NC number   \NC only if nonzero\NC \NR
2091 \NC is_custom        \NC number   \NC only if nonzero\NC \NR
2092 \NC is_original      \NC number   \NC only if nonzero\NC \NR
2093 \NC is_compact       \NC number   \NC only if nonzero\NC \NR
2094 \NC is_japanese      \NC number   \NC only if nonzero\NC \NR
2095 \NC is_korean        \NC number   \NC only if nonzero\NC \NR
2096 \NC is_tradchinese   \NC number   \NC only if nonzero [name?]\NC \NR
2097 \NC is_simplechinese \NC number   \NC only if nonzero\NC \NR
2098 \NC low_page         \NC number   \NC \NC \NR
2099 \NC high_page        \NC number   \NC \NC \NR
2100 \NC iconv_name       \NC string   \NC \NC \NR
2101 \NC iso_2022_escape  \NC string   \NC \NC \NR
2102 \stoptabulate
2103
2104 \subsubsubsection{private table}
2105
2106 This is the font's private \POSTSCRIPT\ dictionary, if any. Keys and values are
2107 both strings.
2108
2109 \subsubsubsection{cidinfo table}
2110
2111 \starttabulate[|lT|l|p|]
2112 \NC \ssbf key  \NC \bf type \NC \bf explanation \NC \NR
2113 \NC registry   \NC string   \NC \NC \NR
2114 \NC ordering   \NC string   \NC \NC \NR
2115 \NC supplement \NC number   \NC \NC \NR
2116 \NC version    \NC number   \NC \NC \NR
2117 \stoptabulate
2118
2119 \subsubsubsection[fontloaderpfminfotable]{pfminfo table}
2120
2121 The \type {pfminfo} table contains most of the OS/2 information:
2122
2123 \starttabulate[|lT|l|p|]
2124 \NC \ssbf key        \NC \bf type \NC \bf explanation \NC \NR
2125 \NC pfmset           \NC number   \NC \NC \NR
2126 \NC winascent_add    \NC number   \NC \NC \NR
2127 \NC windescent_add   \NC number   \NC \NC \NR
2128 \NC hheadascent_add  \NC number   \NC \NC \NR
2129 \NC hheaddescent_add \NC number   \NC \NC \NR
2130 \NC typoascent_add   \NC number   \NC \NC \NR
2131 \NC typodescent_add  \NC number   \NC \NC \NR
2132 \NC subsuper_set     \NC number   \NC \NC \NR
2133 \NC panose_set       \NC number   \NC \NC \NR
2134 \NC hheadset         \NC number   \NC \NC \NR
2135 \NC vheadset         \NC number   \NC \NC \NR
2136 \NC pfmfamily        \NC number   \NC \NC \NR
2137 \NC weight           \NC number   \NC \NC \NR
2138 \NC width            \NC number   \NC \NC \NR
2139 \NC avgwidth         \NC number   \NC \NC \NR
2140 \NC firstchar        \NC number   \NC \NC \NR
2141 \NC lastchar         \NC number   \NC \NC \NR
2142 \NC fstype           \NC number   \NC \NC \NR
2143 \NC linegap          \NC number   \NC \NC \NR
2144 \NC vlinegap         \NC number   \NC \NC \NR
2145 \NC hhead_ascent     \NC number   \NC \NC \NR
2146 \NC hhead_descent    \NC number   \NC \NC \NR
2147 \NC os2_typoascent   \NC number   \NC \NC \NR
2148 \NC os2_typodescent  \NC number   \NC \NC \NR
2149 \NC os2_typolinegap  \NC number   \NC \NC \NR
2150 \NC os2_winascent    \NC number   \NC \NC \NR
2151 \NC os2_windescent   \NC number   \NC \NC \NR
2152 \NC os2_subxsize     \NC number   \NC \NC \NR
2153 \NC os2_subysize     \NC number   \NC \NC \NR
2154 \NC os2_subxoff      \NC number   \NC \NC \NR
2155 \NC os2_subyoff      \NC number   \NC \NC \NR
2156 \NC os2_supxsize     \NC number   \NC \NC \NR
2157 \NC os2_supysize     \NC number   \NC \NC \NR
2158 \NC os2_supxoff      \NC number   \NC \NC \NR
2159 \NC os2_supyoff      \NC number   \NC \NC \NR
2160 \NC os2_strikeysize  \NC number   \NC \NC \NR
2161 \NC os2_strikeypos   \NC number   \NC \NC \NR
2162 \NC os2_family_class \NC number   \NC \NC \NR
2163 \NC os2_xheight      \NC number   \NC \NC \NR
2164 \NC os2_capheight    \NC number   \NC \NC \NR
2165 \NC os2_defaultchar  \NC number   \NC \NC \NR
2166 \NC os2_breakchar    \NC number   \NC \NC \NR
2167 \NC os2_vendor       \NC string   \NC \NC \NR
2168 \NC codepages        \NC table    \NC A two-number array of encoded code pages\NC \NR
2169 \NC unicoderanges    \NC table    \NC A four-number array of encoded unicode ranges\NC \NR
2170 \NC panose           \NC table    \NC \NC \NR
2171 \stoptabulate
2172
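As one example of using these values, the \type {fstype} entry can be decoded
into embedding permissions. This sketch assumes that the field holds the raw
\type {fsType} value of the OS/2 table and uses the bit assignments from the
\OPENTYPE\ specification; the helper names and the sample value are made up, and
the bit test uses plain arithmetic so that it does not depend on a particular bit
library.

\starttyping
-- Test a single bit of a non-negative integer
-- (bit 0 is the least significant bit).
local function has_bit(value, bit)
    return math.floor(value / 2^bit) % 2 == 1
end

-- Interpret pfminfo.fstype using the fsType bits of the OS/2 table.
-- That the field carries the raw fsType value is an assumption.
local function embedding_flags(pfminfo)
    local fstype = pfminfo.fstype or 0
    return {
        restricted    = has_bit(fstype, 1), -- 0x0002
        preview_print = has_bit(fstype, 2), -- 0x0004
        editable      = has_bit(fstype, 3), -- 0x0008
        no_subsetting = has_bit(fstype, 8), -- 0x0100
        bitmap_only   = has_bit(fstype, 9), -- 0x0200
    }
end

-- Made-up value: preview and print embedding, no subsetting.
local flags = embedding_flags { fstype = 0x0104 }
\stoptyping
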
2173 The \type {panose} subtable has exactly 10 string keys:
2174
2175 \starttabulate[|lT|l|p|]
2176 \NC \ssbf key       \NC \bf type \NC \bf explanation \NC \NR
2177 \NC familytype      \NC string   \NC Values as in the \OPENTYPE\ font
2178                                      specification: \type {Any}, \type {No Fit},
2179                                      \type {Text and Display}, \type {Script},
2180                                      \type {Decorative}, \type {Pictorial} \NC
2181                                      \NR
2182 \NC serifstyle      \NC string   \NC See the \OPENTYPE\ font specification for
2183                                      values \NC \NR
2184 \NC weight          \NC string   \NC id. \NC \NR
2185 \NC proportion      \NC string   \NC id. \NC \NR
2186 \NC contrast        \NC string   \NC id. \NC \NR
2187 \NC strokevariation \NC string   \NC id. \NC \NR
2188 \NC armstyle        \NC string   \NC id. \NC \NR
2189 \NC letterform      \NC string   \NC id. \NC \NR
2190 \NC midline         \NC string   \NC id. \NC \NR
2191 \NC xheight         \NC string   \NC id. \NC \NR
2192 \stoptabulate
2193
2194 \subsubsubsection[fontloadernamestable]{names table}
2195
2196 Each item has two top|-|level keys:
2197
2198 \starttabulate[|lT|l|p|]
2199 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2200 \NC lang      \NC string   \NC language for this entry \NC \NR
2201 \NC names     \NC table    \NC \NC \NR
2202 \stoptabulate
2203
2204 The \type {names} keys are the actual \TRUETYPE\ name strings. The possible keys
2205 are:
2206
2207 \starttabulate[|lT|p|]
2208 \NC \ssbf key       \NC \bf explanation \NC \NR
2209 \NC copyright       \NC \NC \NR
2210 \NC family          \NC \NC \NR
2211 \NC subfamily       \NC \NC \NR
2212 \NC uniqueid        \NC \NC \NR
2213 \NC fullname        \NC \NC \NR
2214 \NC version         \NC \NC \NR
2215 \NC postscriptname  \NC \NC \NR
2216 \NC trademark       \NC \NC \NR
2217 \NC manufacturer    \NC \NC \NR
2218 \NC designer        \NC \NC \NR
2219 \NC descriptor      \NC \NC \NR
2220 \NC venderurl       \NC \NC \NR
2221 \NC designerurl     \NC \NC \NR
2222 \NC license         \NC \NC \NR
2223 \NC licenseurl      \NC \NC \NR
2224 \NC idontknow       \NC \NC \NR
2225 \NC preffamilyname  \NC \NC \NR
2226 \NC prefmodifiers   \NC \NC \NR
2227 \NC compatfull      \NC \NC \NR
2228 \NC sampletext      \NC \NC \NR
2229 \NC cidfindfontname \NC \NC \NR
2230 \NC wwsfamily       \NC \NC \NR
2231 \NC wwssubfamily    \NC \NC \NR
2232 \stoptabulate
2233
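A typical use is to fetch one name string for one language. The following sketch
uses a hypothetical helper and made|-|up sample data; in particular the exact
language strings (such as \type {English (US)}) are an assumption and should be
checked against what a real font reports.

\starttyping
-- Look up one name string for a given language in the names array.
-- Returns nil when the language or the key is not present.
local function find_name(names, lang, key)
    for _, item in ipairs(names or { }) do
        if item.lang == lang and item.names and item.names[key] ~= nil then
            return item.names[key]
        end
    end
    return nil
end

-- Made-up data shaped like the top-level names table.
local names = {
    { lang = "English (US)",  names = { family = "Demo", subfamily = "Bold" } },
    { lang = "German German", names = { subfamily = "Fett" } },
}

local family = find_name(names, "English (US)", "family")
local style  = find_name(names, "German German", "subfamily")
\stoptyping
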
2234 \subsubsubsection{anchor_classes table}
2235
2236 Each entry in the \type {anchor_classes} table has:
2237
2238 \starttabulate[|lT|l|p|]
2239 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2240 \NC name      \NC string   \NC a descriptive id of this anchor class\NC \NR
2241 \NC lookup    \NC string   \NC \NC \NR
2242 \NC type      \NC string   \NC one of \type {mark}, \type {mkmk}, \type {curs}, \type {mklg} \NC \NR
2243 \stoptabulate
2244
2245 % type is actually a lookup subtype, not a feature name. Officially, these
2246 % strings should be gpos_mark2mark etc.
2247
2248 \subsubsubsection{gpos table}
2249
2250 The \type {gpos} table has one array entry for each lookup. (The \type {gpos_}
2251 prefix is somewhat redundant.)
2252
2253 \starttabulate[|lT|l|p|]
2254 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2255 \NC type      \NC string   \NC one of \type {gpos_single}, \type {gpos_pair},
2256                                \type {gpos_cursive}, \type {gpos_mark2base},\crlf
2257                                \type {gpos_mark2ligature}, \type
2258                                {gpos_mark2mark}, \type {gpos_context},\crlf \type
2259                                {gpos_contextchain} \NC \NR
2260 \NC flags     \NC table    \NC \NC \NR
2261 \NC name      \NC string   \NC \NC \NR
2262 \NC features  \NC array    \NC \NC \NR
2263 \NC subtables \NC array    \NC \NC \NR
2264 \stoptabulate
2265
2266 The flags table has a true value for each of the lookup flags that is actually
2267 set:
2268
2269 \starttabulate[|lT|l|p|]
2270 \NC \ssbf key            \NC \bf type \NC \bf explanation \NC \NR
2271 \NC r2l                  \NC boolean  \NC \NC \NR
2272 \NC ignorebaseglyphs     \NC boolean  \NC \NC \NR
2273 \NC ignoreligatures      \NC boolean  \NC \NC \NR
2274 \NC ignorecombiningmarks \NC boolean  \NC \NC \NR
2275 \NC mark_class           \NC string   \NC \NC \NR
2276 \stoptabulate
2277
2278 Each item in the \type {features} array of a \type {gpos} lookup has:
2279
2280 \starttabulate[|lT|l|p|]
2281 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2282 \NC tag       \NC string   \NC \NC \NR
2283 \NC scripts   \NC table    \NC \NC \NR
2284 \stoptabulate
2285
2286 The scripts table within features has:
2287
2288 \starttabulate[|lT|l|p|]
2289 \NC \ssbf key \NC \bf type         \NC \bf explanation \NC \NR
2290 \NC script    \NC string           \NC \NC \NR
2291 \NC langs     \NC array of strings \NC \NC \NR
2292 \stoptabulate
2293
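To illustrate how these nested tables fit together, the sketch below collects,
per feature tag, the names of the lookups that implement it. It works on plain
tables shaped like the ones described above; the helper name, the lookup names
and the assumption that the arrays start at index~1 are not taken from the
library itself.

\starttyping
-- Collect, for each feature tag, the names of the lookups that
-- implement it. 'lookups' stands for the gpos or gsub array.
local function lookups_by_feature(lookups)
    local result = { }
    for _, lookup in ipairs(lookups or { }) do
        for _, feature in ipairs(lookup.features or { }) do
            local list = result[feature.tag]
            if not list then
                list = { }
                result[feature.tag] = list
            end
            list[#list + 1] = lookup.name
        end
    end
    return result
end

-- Made-up data with two lookups.
local latin = { { script = "latn", langs = { "dflt" } } }
local gpos = {
    { type = "gpos_pair",      name = "demo_kern",
      features = { { tag = "kern", scripts = latin } } },
    { type = "gpos_mark2base", name = "demo_mark",
      features = { { tag = "mark", scripts = latin } } },
}
local by_tag = lookups_by_feature(gpos)
\stoptyping
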
2294 The subtables table has:
2295
2296 \starttabulate[|lT|l|p|]
2297 \NC \ssbf key        \NC \bf type \NC \bf explanation \NC \NR
2298 \NC name             \NC string   \NC \NC \NR
2299 \NC suffix           \NC string   \NC (only if used)\NC \NR % used by gpos_single to get a default
2300 \NC anchor_classes   \NC number   \NC (only if used)\NC \NR
2301 \NC vertical_kerning \NC number   \NC (only if used)\NC \NR
2302 \NC kernclass        \NC table    \NC (only if used)\NC \NR
2303 \stoptabulate
2304
2305 The \type {kernclass} table within \type {subtables} has:
2306
2307 \starttabulate[|lT|l|p|]
2308 \NC \ssbf key \NC \bf type         \NC \bf explanation \NC \NR
2309 \NC firsts    \NC array of strings \NC \NC \NR
2310 \NC seconds   \NC array of strings \NC \NC \NR
2311 \NC lookup    \NC string or array  \NC associated lookup(s) \NC \NR
2312 \NC offsets   \NC array of numbers \NC \NC \NR
2313 \stoptabulate
2314
2315 Note: as far as we can see, the \type {kernclass} table always has exactly one entry, so it
2316 could have been one level less deep. Also, the \type {seconds} array starts at \type {[2]}; this
2317 stays close to the fontforge internals, so we keep it that way.
2318
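As a rough illustration of how these nested tables fit together, the following
sketch walks the \type {gpos} array of a font table and reports the type, name,
set flags, and subtable names of each lookup. It assumes that \type {t} is a font
table with the layout described above and that \type {gpos} is an array of
lookups; the helper name \type {report_gpos} is made up for this example, and
every field is treated as optional.

\starttyping
-- hypothetical helper; t is assumed to be a font table as described above
local function report_gpos(t)
    for _, lookup in ipairs(t.gpos or { }) do
        print(lookup.type, lookup.name)
        -- only flags that are actually set show up in the flags table
        for flag, value in pairs(lookup.flags or { }) do
            print("  flag", flag, value)
        end
        -- each subtable carries at least a name
        for _, subtable in ipairs(lookup.subtables or { }) do
            print("  subtable", subtable.name)
        end
    end
end
\stoptyping
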
2319 \subsubsubsection{gsub table}
2320
2321 This has identical layout to the \type {gpos} table, except for the
2322 type:
2323
2324 \starttabulate[|lT|l|p|]
2325 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2326 \NC type      \NC string   \NC one of \type {gsub_single}, \type {gsub_multiple},
2327                                \type {gsub_alternate}, \type
2328                                {gsub_ligature},\crlf \type {gsub_context}, \type
2329                                {gsub_contextchain}, \type
2330                                {gsub_reversecontextchain} \NC \NR
2331 \stoptabulate
2332
2333 \subsubsubsection{ttf_tables and ttf_tab_saved tables}
2334
2335 \starttabulate[|lT|l|p|]
2336 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2337 \NC tag       \NC string   \NC \NC \NR
2338 \NC len       \NC number   \NC \NC \NR
2339 \NC maxlen    \NC number   \NC \NC \NR
2340 \NC data      \NC number   \NC \NC \NR
2341 \stoptabulate
2342
2343 \subsubsubsection{mm table}
2344
2345 \starttabulate[|lT|l|p|]
2346 \NC \ssbf key      \NC \bf type \NC \bf explanation \NC \NR
2347 \NC axes           \NC table    \NC array of axis names \NC \NR
2348 \NC instance_count \NC number   \NC \NC \NR
2349 \NC positions      \NC table    \NC array of instance positions
2350                                     (\#axes * instances )\NC \NR
2351 \NC defweights     \NC table    \NC array of default weights for instances \NC \NR
2352 \NC cdv            \NC string   \NC \NC \NR
2353 \NC ndv            \NC string   \NC \NC \NR
2354 \NC axismaps       \NC table    \NC \NC \NR
2355 \stoptabulate
2356
2357 The \type {axismaps}:
2358
2359 \starttabulate[|lT|l|p|]
2360 \NC \ssbf key            \NC \bf type \NC \bf explanation \NC \NR
2361 \NC blends               \NC table     \NC an array of blend points \NC \NR
2362 \NC designs              \NC table     \NC an array of design values \NC \NR
2363 \NC min                  \NC number   \NC \NC \NR
2364 \NC def                  \NC number   \NC \NC \NR
2365 \NC max                  \NC number   \NC \NC \NR
2366 \stoptabulate
2367
2368 \subsubsubsection{mark_classes table}
2369
2370 The keys in this table are mark class names; each value is a
2371 space|-|separated string of the glyph names in that class.
2372
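Because the glyph names arrive as one string, a consumer usually has to split
them first. The sketch below does that with plain \LUA\ string matching. The
function name \type {mark_class_glyphs} and the sample class in the last lines
are invented for illustration and do not come from a real font.

\starttyping
-- classes is assumed to be the mark_classes table of a font table
local function mark_class_glyphs(classes)
    local result = { }
    for name, glyphs in pairs(classes or { }) do
        local list = { }
        -- split the space-separated string into individual glyph names
        for glyph in string.gmatch(glyphs, "%S+") do
            list[#list + 1] = glyph
        end
        result[name] = list
    end
    return result
end

-- made-up sample data, only to show the call
local split = mark_class_glyphs { accents = "acutecomb gravecomb" }
print(split.accents[1], split.accents[2])
\stoptyping
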
2373 \subsubsubsection{math table}
2374
2375 \starttabulate[|lT|p|]
2376 \NC ScriptPercentScaleDown                   \NC \NC \NR
2377 \NC ScriptScriptPercentScaleDown             \NC \NC \NR
2378 \NC DelimitedSubFormulaMinHeight             \NC \NC \NR
2379 \NC DisplayOperatorMinHeight                 \NC \NC \NR
2380 \NC MathLeading                              \NC \NC \NR
2381 \NC AxisHeight                               \NC \NC \NR
2382 \NC AccentBaseHeight                         \NC \NC \NR
2383 \NC FlattenedAccentBaseHeight                \NC \NC \NR
2384 \NC SubscriptShiftDown                       \NC \NC \NR
2385 \NC SubscriptTopMax                          \NC \NC \NR
2386 \NC SubscriptBaselineDropMin                 \NC \NC \NR
2387 \NC SuperscriptShiftUp                       \NC \NC \NR
2388 \NC SuperscriptShiftUpCramped                \NC \NC \NR
2389 \NC SuperscriptBottomMin                     \NC \NC \NR
2390 \NC SuperscriptBaselineDropMax               \NC \NC \NR
2391 \NC SubSuperscriptGapMin                     \NC \NC \NR
2392 \NC SuperscriptBottomMaxWithSubscript        \NC \NC \NR
2393 \NC SpaceAfterScript                         \NC \NC \NR
2394 \NC UpperLimitGapMin                         \NC \NC \NR
2395 \NC UpperLimitBaselineRiseMin                \NC \NC \NR
2396 \NC LowerLimitGapMin                         \NC \NC \NR
2397 \NC LowerLimitBaselineDropMin                \NC \NC \NR
2398 \NC StackTopShiftUp                          \NC \NC \NR
2399 \NC StackTopDisplayStyleShiftUp              \NC \NC \NR
2400 \NC StackBottomShiftDown                     \NC \NC \NR
2401 \NC StackBottomDisplayStyleShiftDown         \NC \NC \NR
2402 \NC StackGapMin                              \NC \NC \NR
2403 \NC StackDisplayStyleGapMin                  \NC \NC \NR
2404 \NC StretchStackTopShiftUp                   \NC \NC \NR
2405 \NC StretchStackBottomShiftDown              \NC \NC \NR
2406 \NC StretchStackGapAboveMin                  \NC \NC \NR
2407 \NC StretchStackGapBelowMin                  \NC \NC \NR
2408 \NC FractionNumeratorShiftUp                 \NC \NC \NR
2409 \NC FractionNumeratorDisplayStyleShiftUp     \NC \NC \NR
2410 \NC FractionDenominatorShiftDown             \NC \NC \NR
2411 \NC FractionDenominatorDisplayStyleShiftDown \NC \NC \NR
2412 \NC FractionNumeratorGapMin                  \NC \NC \NR
2413 \NC FractionNumeratorDisplayStyleGapMin      \NC \NC \NR
2414 \NC FractionRuleThickness                    \NC \NC \NR
2415 \NC FractionDenominatorGapMin                \NC \NC \NR
2416 \NC FractionDenominatorDisplayStyleGapMin    \NC \NC \NR
2417 \NC SkewedFractionHorizontalGap              \NC \NC \NR
2418 \NC SkewedFractionVerticalGap                \NC \NC \NR
2419 \NC OverbarVerticalGap                       \NC \NC \NR
2420 \NC OverbarRuleThickness                     \NC \NC \NR
2421 \NC OverbarExtraAscender                     \NC \NC \NR
2422 \NC UnderbarVerticalGap                      \NC \NC \NR
2423 \NC UnderbarRuleThickness                    \NC \NC \NR
2424 \NC UnderbarExtraDescender                   \NC \NC \NR
2425 \NC RadicalVerticalGap                       \NC \NC \NR
2426 \NC RadicalDisplayStyleVerticalGap           \NC \NC \NR
2427 \NC RadicalRuleThickness                     \NC \NC \NR
2428 \NC RadicalExtraAscender                     \NC \NC \NR
2429 \NC RadicalKernBeforeDegree                  \NC \NC \NR
2430 \NC RadicalKernAfterDegree                   \NC \NC \NR
2431 \NC RadicalDegreeBottomRaisePercent          \NC \NC \NR
2432 \NC MinConnectorOverlap                      \NC \NC \NR
2433 \NC FractionDelimiterSize                    \NC \NC \NR
2434 \NC FractionDelimiterDisplayStyleSize        \NC \NC \NR
2435 \stoptabulate
2436
2437 \subsubsubsection{validation_state table}
2438
2439 \starttabulate[|lT|p|]
2440 \NC \ssbf key         \NC \bf explanation \NC \NR
2441 \NC bad_ps_fontname   \NC \NC \NR
2442 \NC bad_glyph_table   \NC \NC \NR
2443 \NC bad_cff_table     \NC \NC \NR
2444 \NC bad_metrics_table \NC \NC \NR
2445 \NC bad_cmap_table    \NC \NC \NR
2446 \NC bad_bitmaps_table \NC \NC \NR
2447 \NC bad_gx_table      \NC \NC \NR
2448 \NC bad_ot_table      \NC \NC \NR
2449 \NC bad_os2_version   \NC \NC \NR
2450 \NC bad_sfnt_header   \NC \NC \NR
2451 \stoptabulate
2452
2453 \subsubsubsection{horiz_base and vert_base table}
2454
2455 \starttabulate[|lT|l|p|]
2456 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2457 \NC tags      \NC table    \NC an array of script list tags\NC \NR
2458 \NC scripts   \NC table    \NC \NC \NR
2459 \stoptabulate
2460
2461 The \type {scripts} subtable:
2462
2463 \starttabulate[|lT|l|p|]
2464 \NC \ssbf key        \NC \bf type \NC \bf explanation \NC \NR
2465 \NC baseline         \NC table   \NC \NC \NR
2466 \NC default_baseline \NC number  \NC \NC \NR
2467 \NC lang             \NC table   \NC \NC \NR
2468 \stoptabulate
2469
2470
2471 The \type {lang} subtable:
2472
2473 \starttabulate[|lT|l|p|]
2474 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2475 \NC tag       \NC string   \NC a script tag \NC \NR
2476 \NC ascent    \NC number   \NC \NC \NR
2477 \NC descent   \NC number   \NC \NC \NR
2478 \NC features  \NC table    \NC \NC \NR
2479 \stoptabulate
2480
2481 The \type {features} points to an array of tables with the same layout except
2482 that in those nested tables, the tag represents a language.
2483
2484 \subsubsubsection{altuni table}
2485
2486 An array of alternate \UNICODE\ values. Inside that array are hashes with:
2487
2488 \starttabulate[|lT|l|p|]
2489 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2490 \NC unicode   \NC number   \NC this glyph is also used for this unicode \NC \NR
2491 \NC variant   \NC number   \NC the alternative is driven by this unicode selector \NC \NR
2492 \stoptabulate
2493
2494 \subsubsubsection{vert_variants and horiz_variants table}
2495
2496 \starttabulate[|lT|l|p|]
2497 \NC \ssbf key         \NC \bf type \NC \bf explanation \NC \NR
2498 \NC variants          \NC string   \NC \NC \NR
2499 \NC italic_correction \NC number   \NC \NC \NR
2500 \NC parts             \NC table    \NC \NC \NR
2501 \stoptabulate
2502
2503 The \type {parts} table is an array of smaller tables:
2504
2505 \starttabulate[|lT|l|p|]
2506 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2507 \NC component \NC string   \NC \NC \NR
2508 \NC extender  \NC number   \NC \NC \NR
2509 \NC start     \NC number   \NC \NC \NR
2510 \NC end       \NC number   \NC \NC \NR
2511 \NC advance   \NC number   \NC \NC \NR
2512 \stoptabulate
2513
2514
2515 \subsubsubsection{mathkern table}
2516
2517 \starttabulate[|lT|l|p|]
2518 \NC \ssbf key    \NC \bf type \NC \bf explanation \NC \NR
2519 \NC top_right    \NC table    \NC \NC \NR
2520 \NC bottom_right \NC table    \NC \NC \NR
2521 \NC top_left     \NC table    \NC \NC \NR
2522 \NC bottom_left  \NC table    \NC \NC \NR
2523 \stoptabulate
2524
2525 Each of the subtables is an array of small hashes with two keys:
2526
2527 \starttabulate[|lT|l|p|]
2528 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2529 \NC height    \NC number   \NC \NC \NR
2530 \NC kern      \NC number   \NC \NC \NR
2531 \stoptabulate
2532
2533 \subsubsubsection{kerns table}
2534
2535 Substructure is identical to the per|-|glyph subtable.
2536
2537 \subsubsubsection{vkerns table}
2538
2539 Substructure is identical to the per|-|glyph subtable.
2540
2541 \subsubsubsection{texdata table}
2542
2543 \starttabulate[|lT|l|p|]
2544 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2545 \NC type      \NC string   \NC \type {unset}, \type {text}, \type {math}, \type {mathext} \NC \NR
2546 \NC params    \NC array    \NC 22 font numeric parameters \NC \NR
2547 \stoptabulate
2548
2549 \subsubsubsection{lookups table}
2550
2551 The top|-|level \type {lookups} table is quite different from the ones at character level.
2552 The keys in this hash are strings; the values are the actual lookups, represented as
2553 dictionary tables.
2554
2555 \starttabulate[|lT|l|p|]
2556 \NC \ssbf key     \NC \bf type \NC \bf explanation \NC \NR
2557 \NC type          \NC string   \NC \NC \NR
2558 \NC format        \NC enum     \NC one of \type {glyphs}, \type {class}, \type {coverage}, \type {reversecoverage} \NC \NR
2559 \NC tag           \NC string   \NC \NC \NR
2560 \NC current_class \NC array    \NC \NC \NR
2561 \NC before_class  \NC array    \NC \NC \NR
2562 \NC after_class   \NC array    \NC \NC \NR
2563 \NC rules         \NC array    \NC an array of rule items\NC \NR
2564 \stoptabulate
2565
2566 Rule items have one common item and one specialized item:
2567
2568 \starttabulate[|lT|l|p|]
2569 \NC \ssbf key       \NC \bf type \NC \bf explanation \NC \NR
2570 \NC lookups         \NC array    \NC a linear array of lookup names\NC \NR
2571 \NC glyphs          \NC array    \NC only if the parent's format is \type {glyphs}\NC \NR
2572 \NC class           \NC array    \NC only if the parent's format is \type {class}\NC \NR
2573 \NC coverage        \NC array    \NC only if the parent's format is \type {coverage}\NC \NR
2574 \NC reversecoverage \NC array    \NC only if the parent's format is \type {reversecoverage}\NC \NR
2575 \stoptabulate
2576
2577 A glyph table is:
2578
2579 \starttabulate[|lT|l|p|]
2580 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2581 \NC names     \NC string   \NC \NC \NR
2582 \NC back      \NC string   \NC \NC \NR
2583 \NC fore      \NC string   \NC \NC \NR
2584 \stoptabulate
2585
2586 A class table is:
2587
2588 \starttabulate[|lT|l|p|]
2589 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2590 \NC current   \NC array    \NC of numbers \NC \NR
2591 \NC before    \NC array    \NC of numbers  \NC \NR
2592 \NC after     \NC array    \NC of numbers  \NC \NR
2593 \stoptabulate
2594
2595 A coverage table is:
2596
2597 \starttabulate[|lT|l|p|]
2598 \NC \ssbf key \NC \bf type \NC \bf explanation \NC \NR
2599 \NC current   \NC array    \NC of strings \NC \NR
2600 \NC before    \NC array    \NC of strings\NC \NR
2601 \NC after     \NC array    \NC of strings \NC \NR
2602 \stoptabulate
2603
2604 A reversecoverage table is:
2605
2606 \starttabulate[|lT|l|p|]
2607 \NC \ssbf key    \NC \bf type \NC \bf explanation \NC \NR
2608 \NC current      \NC array    \NC of strings \NC \NR
2609 \NC before       \NC array    \NC of strings\NC \NR
2610 \NC after        \NC array    \NC of strings \NC \NR
2611 \NC replacements \NC string   \NC \NC \NR
2612 \stoptabulate
2613
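To show how the format of a lookup selects the specialized item of its rules,
here is a small sketch that lists the top|-|level lookups of a font table. It
assumes that \type {t} follows the layout given above; the helper name \type
{report_lookups} is hypothetical, and all fields are checked before use because
not every lookup has rules.

\starttyping
-- hypothetical helper; t is assumed to be a font table as described above
local function report_lookups(t)
    for name, lookup in pairs(t.lookups or { }) do
        local rules = lookup.rules or { }
        print(name, lookup.type, lookup.format, #rules)
        for _, rule in ipairs(rules) do
            -- the specialized item is stored under the parent's format name
            local detail = lookup.format and rule[lookup.format]
            if type(detail) == "table" and type(detail.current) == "table" then
                print("  current entries:", #detail.current)
            end
        end
    end
end
\stoptyping
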
2614 \section{The \type {img} library}
2615
2616 The \type {img} library can be used as an alternative to \type {\pdfximage} and
2617 \type {\pdfrefximage}, and the associated \quote {satellite} commands like \type
2618 {\pdfximagebbox}. Image objects can also be used within virtual fonts via the
2619 \type {image} command listed in~\in {section} [virtualfonts].
2620
2621 \subsection{\type {img.new}}
2622
2623 \startfunctioncall
2624 <image> var = img.new()
2625 <image> var = img.new(<table> image_spec)
2626 \stopfunctioncall
2627
2628 This function creates a userdata object of type \quote {image}. The \type
2629 {image_spec} argument is optional. If it is given, it must be a table, and that
2630 table must contain a \type {filename} key. A number of other keys can also be
2631 useful; these are explained below.
2632
2633 You can either say
2634
2635 \starttyping
2636 a = img.new()
2637 \stoptyping
2638
2639 followed by
2640
2641 \starttyping
2642 a.filename = "foo.png"
2643 \stoptyping
2644
2645 or you can put the file name (and some or all of the other keys) into a table
2646 directly, like so:
2647
2648 \starttyping
2649 a = img.new({filename='foo.pdf', page=1})
2650 \stoptyping
2651
2652 The generated \type {<image>} userdata object allows access to a set of
2653 user|-|specified values as well as a set of values that are normally filled in
2654 and updated automatically by \LUATEX\ itself. Some of those are derived from the
2655 actual image file, others are updated to reflect the \PDF\ output status of the
2656 object.
2657
2658 There is one required user|-|specified field: the file name (\type {filename}). It
2659 can optionally be augmented by the requested image dimensions (\type {width},
2660 \type {depth}, \type {height}), user|-|specified image attributes (\type {attr}),
2661 the requested \PDF\ page identifier (\type {page}), the requested bounding box
2662 (\type {pagebox}) for \PDF\ inclusion, and the requested color space object (\type
2663 {colorspace}).
2664
2665 The function \type {img.new} does not access the actual image file; it just
2666 creates the \type {<image>} userdata object and initializes some memory
2667 structures. The \type {<image>} object and its internal structures are
2668 automatically garbage collected.
2669
2670 Once the image is scanned, all the values in the \type {<image>} object except \type
2671 {width}, \type {height} and \type {depth} become frozen, and you can no longer
2672 change them.
2673
2674 You can use \type {pdf.setignoreunknownimages(1)} (or at the \TEX\ end the \type
2675 {\pdfvariable} \type {ignoreunknownimages}) to prevent the run from aborting when no known
2676 image type is found (based on name or preamble). Beware: this will not catch
2677 invalid images and we cannot guarantee that there are no side effects. A zero dimension image is
2678 still included when requested. No special flags are set. A proper workflow will
2679 not rely on such a catch but make sure that images are valid.
2680
2681 \subsection{\type {img.keys}}
2682
2683 \startfunctioncall
2684 <table> keys = img.keys()
2685 \stopfunctioncall
2686
2687 This function returns a list of all the possible \type {image_spec} keys, both
2688 user-supplied and automatic ones.
2689
2690 % hahe: i need to add r/w ro column...
2691 \starttabulate[|l|l|p|]
2692 \NC \bf field name \NC \bf type \NC \bf description \NC \NR
2693 \NC attr           \NC string   \NC the image attributes for \LUATEX \NC \NR
2694 \NC bbox           \NC table    \NC table with 4 boundingbox dimensions
2695                                     \type {llx}, \type {lly}, \type {urx},
2696                                     and \type {ury} overruling the \type {pagebox}
2697                                     entry\NC \NR
2698 \NC colordepth     \NC number   \NC the number of bits used by the color space\NC \NR
2699 \NC colorspace     \NC number   \NC the color space object number \NC \NR
2700 \NC depth          \NC number   \NC the image depth for \LUATEX\
2701                                     (in scaled points)\NC \NR
2702 \NC filename       \NC string   \NC the image file name \NC \NR
2703 \NC filepath       \NC string   \NC the full (expanded) file name of the image\NC \NR
2704 \NC height         \NC number   \NC the image height for \LUATEX\
2705                                     (in scaled points)\NC \NR
2706 \NC imagetype      \NC string   \NC one of \type {pdf}, \type {png}, \type {jpg}, \type {jp2},
2707                                     \type {jbig2}, or \type {nil} \NC \NR
2708 \NC index          \NC number   \NC the \PDF\ image name suffix \NC \NR
2709 \NC objnum         \NC number   \NC the \PDF\ image object number \NC \NR
2710 \NC page           \NC number or string
2711                                 \NC the identifier for the requested image page
2712                                     (default is the number 1)\NC \NR
2713 \NC pagebox        \NC string   \NC the requested bounding box, one of
2714                                     \type {none}, \type {media}, \type {crop},
2715                                     \type {bleed}, \type {trim}, \type {art} \NC \NR
2716 \NC pages          \NC number   \NC the total number of available pages \NC \NR
2717 \NC rotation       \NC number   \NC the image rotation from included \PDF\ file,
2718                                     in multiples of 90~deg. \NC \NR
2719 \NC stream         \NC string   \NC the raw stream data for an \type {/XObject}
2720                                     \type {/Form} object\NC \NR
2721 \NC transform      \NC number   \NC the image transform, integer number 0..7\NC \NR
2722 \NC width          \NC number   \NC the image width for \LUATEX\
2723                                     (in scaled points)\NC \NR
2724 \NC xres           \NC number   \NC the horizontal natural image resolution
2725                                     (in \DPI) \NC \NR
2726 \NC xsize          \NC number   \NC the natural image width \NC \NR
2727 \NC yres           \NC number   \NC the vertical natural image resolution
2728                                     (in \DPI) \NC \NR
2729 \NC ysize          \NC number   \NC the natural image height \NC \NR
2730 \NC visiblefilename \NC string  \NC when set, this name will find its way in the
2731                                     \PDF\ file as \type {PTEX} specification; when
2732                                     an empty string is assigned nothing is written
2733                                     to file, otherwise the natural filename is taken \NC \NR
2734 \stoptabulate
2735
2736 A running (undefined) dimension in \type {width}, \type {height}, or \type
2737 {depth} is represented as \type {nil} in \LUA, so if you want to load an image at
2738 its \quote {natural} size, you do not have to specify any of those three fields.
2739
2740 The \type {stream} parameter allows you to fabricate an \type {/XObject} \type
2741 {/Form} object from a string giving the stream contents, e.g., for a filled
2742 rectangle:
2743
2744 \startfunctioncall
2745 a.stream = "0 0 20 10 re f"
2746 \stopfunctioncall
2747
2748 When writing the image, an \type {/XObject} \type {/Form} object is created, like
2749 with embedded \PDF\ file writing. The object is written out only once. The \type
2750 {stream} key requires that the \type {bbox} table is given as well. The \type
2751 {stream} key conflicts with the \type {filename} key. The \type {transform} key
2752 works as usual also with \type {stream}.
2753
2754 The \type {bbox} key needs a table with four boundingbox values, e.g.:
2755
2756 \startfunctioncall
2757 a.bbox = {"30bp", 0, "225bp", "200bp"}
2758 \stopfunctioncall
2759
2760 This replaces and overrules any given \type {pagebox} value; with given \type
2761 {bbox} the box dimensions coming with an embedded \PDF\ file are ignored. The
2762 \type {xsize} and \type {ysize} dimensions are set accordingly, when the image is
2763 scaled. The \type {bbox} parameter is ignored for non-\PDF\ images.
2764
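Putting the \type {stream} and \type {bbox} keys together, a minimal sketch of a
form object that is built without any file could look as follows. The bounding
box values are an assumption chosen to match the rectangle in the stream, and the
code is meant to run inside a document, for instance in a \type {\directlua}
call.

\starttyping
-- a form object built from a literal stream instead of a file
local a = img.new()
a.stream = "0 0 20 10 re f"           -- the filled rectangle from above
a.bbox   = { 0, 0, "20bp", "10bp" }   -- assumed box; stream requires a bbox
img.write(a)                          -- place it in the current list
\stoptyping
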
2765 The \type {transform} key allows you to mirror and rotate the image in steps of 90~deg.
2766 The default value~$0$ gives an unmirrored, unrotated image. Values $1$--$3$ give
2767 counterclockwise rotation by $90$, $180$, or $270$~degrees, whereas with values
2768 $4$--$7$ the image is first mirrored and then rotated counterclockwise by $0$, $90$,
2769 $180$, or $270$~degrees. The \type {transform} operation gives the same visual
2770 result as if you had preprocessed the image with an external graphics tool and
2771 then used it in \LUATEX. If a \PDF\ file to be embedded already contains a \type
2772 {/Rotate} specification, the rotation result is the combination of the \type
2773 {/Rotate} rotation followed by the \type {transform} operation.
2774
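As a short sketch of the \type {transform} key in use, the two calls below place
the same image once rotated and once mirrored and rotated. The file name \type
{foo.png} is just the placeholder used elsewhere in this section.

\starttyping
-- rotate counterclockwise by 90 degrees
img.write { filename = "foo.png", transform = 1 }
-- a value in the range 4..7 selects one of the mirrored variants
img.write { filename = "foo.png", transform = 5 }
\stoptyping
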
2775 \subsection{\type {img.scan}}
2776
2777 \startfunctioncall
2778 <image> var = img.scan(<image> var)
2779 <image> var = img.scan(<table> image_spec)
2780 \stopfunctioncall
2781
2782 When you say \type {img.scan(a)} for a new image, the file is scanned, and
2783 variables such as \type {xsize}, \type {ysize}, \type {imagetype}, the number of
2784 \type {pages}, and the resolution are extracted. Each of the \type {width}, \type
2785 {height}, and \type {depth} fields is set up according to the image dimensions, if
2786 it was not given an explicit value already. An image file will never be
2787 scanned more than once for a given image variable. Subsequent \type
2788 {img.scan(a)} calls only set up the dimensions again (if they have been
2789 changed by the user in the meantime).
2790
2791 For ease of use, you can also call it right away, as in
2792
2793 \starttyping
2794 a = img.scan ({ filename = "foo.png" })
2795 \stoptyping
2796
2797 without a prior \type {img.new}.
2798
2799 Nothing is written yet at this point, so you can do \type {a=img.scan}, retrieve
2800 the available info like image width and height, and then throw away \type {a}
2801 again by saying \type {a=nil}. In that case no image object will be reserved in
2802 the \PDF\ file, and the used memory will be cleaned up automatically.
2803
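The scan|-|and|-|discard pattern described here can be sketched as follows. The
fields that are printed are taken from the key table above; which of them carry
useful values depends on the image type, so treat the output as informal.

\starttyping
-- scan only: nothing is written to the output file
local a = img.scan { filename = "foo.png" }
print(a.xsize, a.ysize, a.xres, a.yres)
print(a.pages, a.imagetype)
-- drop the reference so the object can be garbage collected
a = nil
\stoptyping
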
2804 \subsection{\type {img.copy}}
2805
2806 \startfunctioncall
2807 <image> var = img.copy(<image> var)
2808 <image> var = img.copy(<table> image_spec)
2809 \stopfunctioncall
2810
2811 If you say \type {a = b}, then both variables point to the same \type {<image>}
2812 object. If you want to write out an image with different sizes, you can do a
2813 \type {b=img.copy(a)}.
2814
2815 Afterwards, \type {a} and \type {b} still reference the same actual image
2816 dictionary, but the dimensions for \type {b} can now be changed from their
2817 initial values that were just copies from \type {a}.
2818
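A minimal sketch of this use of \type {img.copy}: the copy gets half the width
and height of the original, and both are written. The halving is an arbitrary
choice for the example, and \type {math.floor} is used on the assumption that the
dimensions should stay whole numbers of scaled points.

\starttyping
local a = img.scan { filename = "foo.png" }
local b = img.copy(a)
-- only the dimensions of the copy are changed
b.width  = math.floor(a.width  / 2)
b.height = math.floor(a.height / 2)
img.write(a)   -- natural size
img.write(b)   -- half size, same underlying image
\stoptyping
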
2819 \subsection{\type {img.write}}
2820
2821 \startfunctioncall
2822 <image> var = img.write(<image> var)
2823 <image> var = img.write(<table> image_spec)
2824 \stopfunctioncall
2825
2826 A call to \type {img.write(a)} allocates a \PDF\ object number and generates a whatsit node of
2827 subtype \type {pdf_refximage} that is put into the output list. This way
2828 the image \type {a} is placed into the page stream, and the image file is written
2829 out into an image stream object after the shipping of the current page is
2830 finished.
2831
2832 Again you can do a terse call like
2833
2834 \starttyping
2835 img.write ({ filename = "foo.png" })
2836 \stoptyping
2837
2838 The \type {<image>} variable is returned in case you want it for later
2839 processing.
2840
2841 \subsection{\type {img.immediatewrite}}
2842
2843 \startfunctioncall
2844 <image> var = img.immediatewrite(<image> var)
2845 <image> var = img.immediatewrite(<table> image_spec)
2846 \stopfunctioncall
2847
2848 A call to \type {img.immediatewrite(a)} allocates a \PDF\ object number and writes the
2849 image file for image \type {a} out immediately into the \PDF\ file as
2850 an image stream object (like with \type {\immediate}\type {\pdfximage}). The object
2851 number of the image stream dictionary is then available by the \type {objnum}
2852 key. No \type {pdf_refximage} whatsit node is generated. You will need an
2853 \type {img.write(a)} or \type {img.node(a)} call to let the image appear on the
2854 page, or reference it by another trick; else you will have a dangling image
2855 object in the \PDF\ file.
2856
2857 Also here you can do a terse call like
2858
2859 \starttyping
2860 a = img.immediatewrite ({ filename = "foo.png" })
2861 \stoptyping
2862
2863 The \type {<image>} variable is returned and you will most likely need it.
2864
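The following sketch combines the immediate write with a later placement, so
that no dangling object is left behind. It relies only on the calls described in
this section plus \type {node.write}; the file name is a placeholder.

\starttyping
-- write the image stream right away and keep the returned object
local a = img.immediatewrite { filename = "foo.png" }
print(a.objnum)              -- object number of the image stream dictionary
-- later: reference the already written object on the page
node.write(img.node(a))
\stoptyping
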
2865 \subsection{\type {img.node}}
2866
2867 \startfunctioncall
2868 <node> n = img.node(<image> var)
2869 <node> n = img.node(<table> image_spec)
2870 \stopfunctioncall
2871
2872 This function allocates a \PDF\ object number and returns a whatsit node of
2873 subtype \type {pdf_refximage}, filled with the image parameters \type {width},
2874 \type {height}, \type {depth}, and \type {objnum}. Also here you can do a terse
2875 call like:
2876
2877 \starttyping
2878 n = img.node ({ filename = "foo.png" })
2879 \stoptyping
2880
2881 This example outputs an image:
2882
2883 \starttyping
2884 node.write(img.node{filename="foo.png"})
2885 \stoptyping
2886
2887 \subsection{\type {img.types}}
2888
2889 \startfunctioncall
2890 <table> types = img.types()
2891 \stopfunctioncall
2892
2893 This function returns a list with the supported image file type names; currently
2894 these are \type {pdf}, \type {png}, \type {jpg}, \type {jp2} (JPEG~2000), and
2895 \type {jbig2}.
2896
2897 \subsection{\type {img.boxes}}
2898
2899 \startfunctioncall
2900 <table> boxes = img.boxes()
2901 \stopfunctioncall
2902
2903 This function returns a list with the supported \PDF\ page box names; currently
2904 these are \type {media}, \type {crop}, \type {bleed}, \type {trim}, and \type
2905 {art} (all in lowercase letters).
2906
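Both \type {img.types} and \type {img.boxes} can be inspected at runtime, which
is handy when a script has to validate user input. The sketch assumes that the
returned tables are plain arrays of strings.

\starttyping
-- list what this binary supports (assuming array-like results)
for _, name in ipairs(img.types()) do
    print("image type", name)
end
for _, name in ipairs(img.boxes()) do
    print("page box", name)
end
\stoptyping
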
2907 \section{The \type {kpse} library}
2908
2909 This library provides two separate, but nearly identical interfaces to the
2910 \KPATHSEA\ file search functionality: there is a \quote {normal} procedural
2911 interface that shares its \KPATHSEA\ instance with \LUATEX\ itself, and an
2912 object|-|oriented interface that is completely on its own.
2913
2914 \subsection{\type {kpse.set_program_name} and \type {kpse.new}}
2915
2916 Before the search library can be used at all, its database has to be initialized.
2917 There are three possibilities, two of which belong to the procedural interface.
2918
2919 First, when \LUATEX\ is used to typeset documents, this initialization happens
2920 automatically and the \KPATHSEA\ executable and program names are set to \type
2921 {luatex} (that is, unless explicitly prohibited by the user's startup script;
2922 see~\in {section} [init] for more details).
2923
2924 Second, in \TEXLUA\ mode, the initialization has to be done explicitly via the
2925 \type {kpse.set_program_name} function, which sets the \KPATHSEA\ executable
2926 (and optionally program) name.
2927
2928 \startfunctioncall
2929 kpse.set_program_name(<string> name)
2930 kpse.set_program_name(<string> name, <string> progname)
2931 \stopfunctioncall
2932
2933 The second argument controls the use of the \quote {dotted} values in the \type
2934 {texmf.cnf} configuration file, and defaults to the first argument.
2935
2936 Third, if you prefer the object|-|oriented interface, you have to call a different
2937 function. It has the same arguments, but it returns a userdata variable.
2938
2939 \startfunctioncall
2940 local kpathsea = kpse.new(<string> name)
2941 local kpathsea = kpse.new(<string> name, <string> progname)
2942 \stopfunctioncall
2943
2944 Apart from these two functions, the calling conventions of the interfaces are
2945 identical. Depending on the chosen interface, you either call \type
2946 {kpse.find_file()} or \type {kpathsea:find_file()}, with identical arguments and
2947 return values.
2948
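A small side|-|by|-|side sketch of the two interfaces, intended for a script run
in \TEXLUA\ mode. The program name \type {luatex} and the file \type {plain.tex}
are example values, not requirements.

\starttyping
-- procedural interface: explicit initialization is needed in texlua mode
kpse.set_program_name("luatex")
print(kpse.find_file("plain.tex"))

-- object-oriented interface: an independent instance
local kpathsea = kpse.new("luatex")
print(kpathsea:find_file("plain.tex"))
\stoptyping
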
2949 \subsection{\type {find_file}}
2950
2951 The most often used function in the library is \type {find_file}:
2952
2953 \startfunctioncall
2954 <string> f = kpse.find_file(<string> filename)
2955 <string> f = kpse.find_file(<string> filename, <string> ftype)
2956 <string> f = kpse.find_file(<string> filename, <boolean> mustexist)
2957 <string> f = kpse.find_file(<string> filename, <string> ftype, <boolean> mustexist)
2958 <string> f = kpse.find_file(<string> filename, <string> ftype, <number> dpi)
2959 \stopfunctioncall
2960
2961 Arguments:
2962 \startitemize[intro]
2963
2964 \sym{filename}
2965
2966 the name of the file you want to find, with or without extension.
2967
2968 \sym{ftype}
2969
2970 maps to the \type {-format} argument of \KPSEWHICH. The supported \type {ftype}
2971 values are the same as the ones supported by the standalone \type {kpsewhich}
2972 program:
2973
2974 \startsimplecolumns
2975 \starttyping
2976 gf
2977 pk
2978 bitmap font
2979 tfm
2980 afm
2981 base
2982 bib
2983 bst
2984 cnf
2985 ls-R
2986 fmt
2987 map
2988 mem
2989 mf
2990 mfpool
2991 mft
2992 mp
2993 mppool
2994 MetaPost support
2995 ocp
2996 ofm
2997 opl
2998 otp
2999 ovf
3000 ovp
3001 graphic/figure
3002 tex
3003 TeX system documentation
3004 texpool
3005 TeX system sources
3006 PostScript header
3007 Troff fonts
3008 type1 fonts
3009 vf
3010 dvips config
3011 ist
3012 truetype fonts
3013 type42 fonts
3014 web2c files
3015 other text files
3016 other binary files
3017 misc fonts
3018 web
3019 cweb
3020 enc files
3021 cmap files
3022 subfont definition files
3023 opentype fonts
3024 pdftex config
3025 lig files
3026 texmfscripts
3027 lua
3028 font feature files
3029 cid maps
3030 mlbib
3031 mlbst
3032 clua
3033 \stoptyping
3034 \stopsimplecolumns
3035
3036 The default type is \type {tex}. Note: this is different from \KPSEWHICH, which
3037 tries to deduce the file type itself from looking at the supplied extension.
3038
3039 \sym{mustexist}
3040
3041 is similar to \KPSEWHICH's \type {-must-exist}, and the default is \type {false}.
3042 If you specify \type {true} (or a non|-|zero integer), then the \KPSE\ library
3043 will search the disk as well as the \type {ls-R} databases.
3044
3045 \sym{dpi}
3046
3047 This is used for the size argument of the formats \type {pk}, \type {gf}, and
3048 \type {bitmap font}.
3049 \stopitemize
3050
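The different argument combinations can be sketched like this. The file names
are common examples and may not exist on a given installation, in which case the
corresponding result is presumably not a string.

\starttyping
kpse.set_program_name("luatex")   -- only needed in texlua mode

-- explicit file type instead of the default "tex"
local tfm = kpse.find_file("cmr10", "tfm")
-- also search the disk, not only the ls-R databases
local map = kpse.find_file("pdftex.map", "map", true)
-- bitmap fonts take a resolution as third argument
local pk  = kpse.find_file("cmr10", "pk", 600)

print(tfm, map, pk)
\stoptyping
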
3051 \subsection{\type {lookup}}
3052
3053 A more powerful (but slower) generic method for finding files is also available.
3054 It returns a string for each found file.
3055
3056 \startfunctioncall
3057 <string> f, ... = kpse.lookup(<string> filename, <table> options)
3058 \stopfunctioncall
3059
3060 The options match commandline arguments from \type {kpsewhich}:
3061
3062 \starttabulate[|l|l|p|]
3063 \NC \ssbf key \NC \ssbf type \NC \ssbf description \NC \NR
3064 \NC debug     \NC number     \NC set debugging flags for this lookup\NC     \NR
3065 \NC format    \NC string     \NC use specific file type (see list above)\NC \NR
3066 \NC dpi       \NC number     \NC use this resolution for this lookup; default 600\NC \NR
3067 \NC path      \NC string     \NC search in the given path\NC \NR
3068 \NC all       \NC boolean    \NC output all matches, not just the first\NC \NR
3069 \NC mustexist \NC boolean    \NC search the disk as well as ls-R if necessary\NC \NR
3070 \NC mktexpk   \NC boolean    \NC disable/enable mktexpk generation for this lookup\NC \NR
3071 \NC mktextex  \NC boolean    \NC disable/enable mktextex generation for this lookup\NC \NR
3072 \NC mktexmf   \NC boolean    \NC disable/enable mktexmf generation for this lookup\NC \NR
3073 \NC mktextfm  \NC boolean    \NC disable/enable mktextfm generation for this lookup\NC \NR
3074 \NC subdir    \NC string
3075                   or table   \NC only output matches whose directory part
3076                                  ends with the given string(s) \NC \NR
3077 \stoptabulate
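
A minimal sketch of a lookup that collects every match. The file name \type
{article.cls} is only an illustration, and the code assumes it runs inside
\LUATEX, where the \KPSE\ library has already been initialized.

\starttyping
-- Gather all results into a table; the table stays empty when nothing matches.
local matches = { kpse.lookup("article.cls", { format = "tex", all = true }) }

if #matches == 0 then
    print("no match found")
else
    for i, path in ipairs(matches) do
        print(i, path)
    end
end
\stoptyping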
3078
3079 \subsection{\type {init_prog}}
3080
3081 Extra initialization for programs that need to generate bitmap fonts.
3082
3083 \startfunctioncall
3084 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode)
3085 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode, <string> fallback)
3086 \stopfunctioncall
3087
3088 \subsection{\type {readable_file}}
3089
3090 Test if an (absolute) file name is a readable file.
3091
3092 \startfunctioncall
3093 <string> f = kpse.readable_file(<string> name)
3094 \stopfunctioncall
3095
3096 The return value is the actual absolute filename you should use, because the disk
3097 name is not always the same as the requested name, due to aliases and
3098 system|-|specific handling under e.g.\ \MSDOS.
3099
3100 Returns \type {nil} if the file does not exist or is not readable.
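
As a short sketch, the returned name can replace the requested one before the
file is opened. The path below is hypothetical.

\starttyping
local requested = "/tmp/example.tex"          -- hypothetical absolute path
local actual    = kpse.readable_file(requested)

if actual then
    print("using " .. actual)                 -- may differ from the request
else
    print(requested .. " is missing or unreadable")
end
\stoptyping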
3101
3102 \subsection{\type {expand_path}}
3103
3104 Like kpsewhich's \type {-expand-path}:
3105
3106 \startfunctioncall
3107 <string> r = kpse.expand_path(<string> s)
3108 \stopfunctioncall
3109
3110 \subsection{\type {expand_var}}
3111
3112 Like kpsewhich's \type {-expand-var}:
3113
3114 \startfunctioncall
3115 <string> r = kpse.expand_var(<string> s)
3116 \stopfunctioncall
3117
3118 \subsection{\type {expand_braces}}
3119
3120 Like kpsewhich's \type {-expand-braces}:
3121
3122 \startfunctioncall
3123 <string> r = kpse.expand_braces(<string> s)
3124 \stopfunctioncall
3125
3126 \subsection{\type {show_path}}
3127
3128 Like kpsewhich's \type {-show-path}:
3129
3130 \startfunctioncall
3131 <string> r = kpse.show_path(<string> ftype)
3132 \stopfunctioncall
3133
3134
3135 \subsection{\type {var_value}}
3136
3137 Like kpsewhich's \type {-var-value}:
3138
3139 \startfunctioncall
3140 <string> r = kpse.var_value(<string> s)
3141 \stopfunctioncall
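
The expansion helpers can be compared side by side. This is only a sketch: the
variable \type {TEXMFHOME} is an assumption about the local setup, and any other
configured variable can be used instead.

\starttyping
-- TEXMFHOME is used as an illustration only.
local raw      = kpse.var_value("TEXMFHOME")    -- like kpsewhich -var-value
local expanded = kpse.expand_var("$TEXMFHOME")  -- like kpsewhich -expand-var
local texpath  = kpse.show_path("tex")          -- like kpsewhich -show-path

print(raw)
print(expanded)
print(texpath)
\stoptyping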
3142
3143 \subsection{\type {version}}
3144
3145 Returns the kpathsea version string.
3146
3147 \startfunctioncall
3148 <string> r = kpse.version()
3149 \stopfunctioncall
3150
3151
3152 \section{The \type {lang} library}
3153
3154 This library provides the interface to \LUATEX's structure
3155 representing a language, and the associated functions.
3156
3157 \startfunctioncall
3158 <language> l = lang.new()
3159 <language> l = lang.new(<number> id)
3160 \stopfunctioncall
3161
3162 This function creates a new userdata object. An object of type \type {<language>}
3163 is the first argument to most of the other functions in the \type {lang}
3164 library. These functions can also be used as if they were object methods, using
3165 the colon syntax.
3166
3167 Without an argument, the next available internal id number will be assigned to
3168 this object. With argument, an object will be created that links to the internal
3169 language with that id number.
3170
3171 \startfunctioncall
3172 <number> n = lang.id(<language> l)
3173 \stopfunctioncall
3174
3175 Returns the internal \type {\language} id number this object refers to.
3176
3177 \startfunctioncall
3178 <string> n = lang.hyphenation(<language> l)
3179 lang.hyphenation(<language> l, <string> n)
3180 \stopfunctioncall
3181
3182 Either returns the current hyphenation exceptions for this language, or adds new
3183 ones. The syntax of the string is explained in~\in {section}
3184 [patternsexceptions].
3185
3186 \startfunctioncall
3187 lang.clear_hyphenation(<language> l)
3188 \stopfunctioncall
3189
3190 Clears the exception dictionary (string) for this language.
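
The following sketch shows the function style and the method style side by side.
It assumes that exceptions are written as words with hyphens at the permitted
break points; the words themselves are arbitrary.

\starttyping
local l = lang.new()                  -- takes the next free language id

lang.hyphenation(l, "man-u-script")   -- function style
l:hyphenation("ta-ble")               -- method style, same effect

print(l:id(), l:hyphenation())        -- id and the stored exceptions

l:clear_hyphenation()                 -- drop all exceptions again
\stoptyping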
3191
3192 \startfunctioncall
3193 <string> n = lang.clean(<language> l, <string> o)
3194 <string> n = lang.clean(<string> o)
3195 \stopfunctioncall
3196
3197 Creates a hyphenation key from the supplied hyphenation value. The syntax of the
3198 argument string is explained in~\in {section} [patternsexceptions]. This function
3199 is useful if you want to do something else based on the words in a dictionary
3200 file, like spell|-|checking.
3201
3202 \startfunctioncall
3203 <string> n = lang.patterns(<language> l)
3204 lang.patterns(<language> l, <string> n)
3205 \stopfunctioncall
3206
3207 Adds additional patterns for this language object, or returns the current set.
3208 The syntax of this string is explained in~\in {section} [patternsexceptions].
3209
3210 \startfunctioncall
3211 lang.clear_patterns(<language> l)
3212 \stopfunctioncall
3213
3214 Clears the pattern dictionary for this language.
3215
3216 \startfunctioncall
3217 <number> n = lang.prehyphenchar(<language> l)
3218 lang.prehyphenchar(<language> l, <number> n)
3219 \stopfunctioncall
3220
3221 Gets or sets the \quote {pre|-|break} hyphen character for implicit hyphenation
3222 in this language (initially the hyphen, decimal 45).
3223
3224 \startfunctioncall
3225 <number> n = lang.posthyphenchar(<language> l)
3226 lang.posthyphenchar(<language> l, <number> n)
3227 \stopfunctioncall
3228
3229 Gets or sets the \quote {post|-|break} hyphen character for implicit hyphenation
3230 in this language (initially null, decimal~0, indicating emptiness).
3231
3232 \startfunctioncall
3233 <number> n = lang.preexhyphenchar(<language> l)
3234 lang.preexhyphenchar(<language> l, <number> n)
3235 \stopfunctioncall
3236
3237 Gets or sets the \quote {pre|-|break} hyphen character for explicit hyphenation
3238 in this language (initially null, decimal~0, indicating emptiness).
3239
3240 \startfunctioncall
3241 <number> n = lang.postexhyphenchar(<language> l)
3242 lang.postexhyphenchar(<language> l, <number> n)
3243 \stopfunctioncall
3244
3245 Gets or sets the \quote {post|-|break} hyphen character for explicit hyphenation
3246 in this language (initially null, decimal~0, indicating emptiness).
3247
3248 \startfunctioncall
3249 <boolean> success = lang.hyphenate(<node> head)
3250 <boolean> success = lang.hyphenate(<node> head, <node> tail)
3251 \stopfunctioncall
3252
3253 Inserts hyphenation points (discretionary nodes) in a node list. If \type {tail}
3254 is given as argument, processing stops on that node. Currently, \type {success}
3255 is always true if \type {head} (and \type {tail}, if specified) are proper nodes,
3256 regardless of possible other errors.
3257
3258 Hyphenation works only on \quote {characters}, that is, glyph nodes whose
3259 \type {subtype} field has the value \type {1}. Glyph nodes with
3260 different subtypes are not processed. See \in {section~} [charsandglyphs] for
3261 more details.
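
A typical place to call this function is a \type {hyphenate} callback that wraps
the built|-|in pass. The sketch below assumes the callback receives the head and
tail of the list and has no return values.

\starttyping
callback.register("hyphenate", function(head, tail)
    -- Custom processing of the list could go here.
    lang.hyphenate(head, tail)        -- run the internal hyphenation pass
end)
\stoptyping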
3262
3263 The following two commands can be used to set or query hj codes:
3264
3265 \startfunctioncall
3266 lang.sethjcode(<language> l, <number> char, <number> usedchar)
3267 <number> usedchar = lang.gethjcode(<language> l, <number> char)
3268 \stopfunctioncall
3269
3270 When you set a hjcode the current sets get initialized unless the set was already
3271 initialized due to \type {\savinghyphcodes} being larger than zero.
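
As an illustration, the sketch below lets the right single quotation mark behave
like an apostrophe during hyphenation. The choice of characters is an
assumption made for the example.

\starttyping
local l = lang.new()

-- Treat U+2019 (right single quotation mark) like U+0027 (apostrophe).
lang.sethjcode(l, 0x2019, 0x0027)

print(lang.gethjcode(l, 0x2019))
\stoptyping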
3272
3273 \section{The \type {lua} library}
3274
3275 This library contains one read|-|only item:
3276
3277 \starttyping
3278 <string> s = lua.version
3279 \stoptyping
3280
3281 This returns the \LUA\ version identifier string. The value is currently
3282 \directlua {tex.print(lua.version)}.
3283
3284 \subsection{\LUA\ bytecode registers}
3285
3286 \LUA\ registers can be used to communicate \LUA\ functions across \LUA\ chunks.
3287 The accepted values for assignments are functions and \type {nil}. Likewise, the
3288 retrieved value is either a function or \type {nil}.
3289
3290 \starttyping
3291 lua.bytecode[<number> n] = <function> f
3292 lua.bytecode[<number> n]()
3293 \stoptyping
3294
3295 The contents of the \type {lua.bytecode} array are stored inside the format file
3296 as actual \LUA\ bytecode, so it can also be used to preload \LUA\ code.
3297
3298 Note: The function must not contain any upvalues. Currently, functions containing
3299 upvalues can be stored (and their upvalues are set to \type {nil}), but this is
3300 an artifact of the current \LUA\ implementation and thus subject to change.
3301
3302 The associated function calls are
3303
3304 \startfunctioncall
3305 <function> f = lua.getbytecode(<number> n)
3306 lua.setbytecode(<number> n, <function> f)
3307 \stopfunctioncall
3308
3309 Note: Since a \LUA\ file loaded using \type {loadfile(filename)} is essentially
3310 an anonymous function, a complete file can be stored in a bytecode register like
3311 this:
3312
3313 \startfunctioncall
3314 lua.bytecode[n] = loadfile(filename)
3315 \stopfunctioncall
3316
3317 Now all definitions (functions, variables) contained in the file can be
3318 created by executing this bytecode register:
3319
3320 \startfunctioncall
3321 lua.bytecode[n]()
3322 \stopfunctioncall
3323
3324 Note that the path of the file is stored in the \LUA\ bytecode to be used in
3325 stack backtraces and therefore dumped into the format file if the above code is
3326 used in \INITEX. If it contains private information, e.g.\ the user name, this
3327 information is then contained in the format file as well. This should be kept in
3328 mind when preloading files into a bytecode register in \INITEX.
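
A slightly more defensive variant checks the result of \type {loadfile} before
storing it. The register number and the file name \type {mymacros.lua} are
placeholders chosen for this sketch.

\starttyping
local slot = 1                                   -- arbitrary register number
local chunk, message = loadfile("mymacros.lua")  -- hypothetical file

if chunk then
    lua.setbytecode(slot, chunk)
else
    print("cannot load: " .. tostring(message))
end

local stored = lua.getbytecode(slot)
if stored then
    stored()                         -- runs the definitions from the file
end
\stoptyping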
3329
3330 \subsection{\LUA\ chunk name registers}
3331
3332 There is an array of 65536 (0--65535) potential chunk names for use with the
3333 \type {\directlua} and \type {\latelua} primitives.
3334
3335 \startfunctioncall
3336 lua.name[<number> n] = <string> s
3337 <string> s = lua.name[<number> n]
3338 \stopfunctioncall
3339
3340 If you want to unset a lua name, you can assign \type {nil} to it.
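
A small sketch of the three operations; the register number and the name are
arbitrary.

\starttyping
lua.name[1] = "setup-fonts"   -- illustrative chunk name
print(lua.name[1])
lua.name[1] = nil             -- unset the name again
\stoptyping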
3341
3342 \section{The \type {mplib} library}
3343
3344 The \MP\ library interface registers itself in the table \type {mplib}. It is
3345 based on \MPLIB\ version \ctxlua {context(mplib.version())}.
3346
3347 \subsection{\type {mplib.new}}
3348
3349 To create a new \METAPOST\ instance, call
3350
3351 \startfunctioncall
3352 <mpinstance> mp = mplib.new({...})
3353 \stopfunctioncall
3354
3355 This creates the \type {mp} instance object. The argument hash can have a number
3356 of different fields, as follows:
3357
3358 \starttabulate[|lT|l|p|p|]
3359 \NC \ssbf name  \NC \bf type \NC \bf description          \NC \bf default       \NC \NR
3360 \NC error_line  \NC number   \NC error line width         \NC 79                \NC \NR
3361 \NC print_line  \NC number   \NC line length in ps output \NC 100               \NC \NR
3362 \NC random_seed \NC number   \NC the initial random seed  \NC variable          \NC \NR
3363 \NC interaction \NC string   \NC the interaction mode,
3364                                  one of
3365                                  \type {batch},
3366                                  \type {nonstop},
3367                                  \type {scroll},
3368                                  \type {errorstop}        \NC \type {errorstop} \NC \NR
3369 \NC job_name    \NC string   \NC \type {--jobname}        \NC \type {mpout}     \NC \NR
3370 \NC find_file   \NC function \NC a function to find files \NC only local files  \NC \NR
3371 \stoptabulate
3372
3373 The \type {find_file} function should be of this form:
3374
3375 \starttyping
3376 <string> found = finder (<string> name, <string> mode, <string> type)
3377 \stoptyping
3378
3379 with:
3380
3381 \starttabulate[|lT|l|p|]
3382 \NC name     \NC the requested file \NC \NR
3383 \NC mode     \NC the file mode: \type {r} or \type {w} \NC \NR
3384 \NC type     \NC the kind of file, one of: \type {mp}, \type {tfm}, \type {map},
3385                  \type {pfb}, \type {enc} \NC \NR
3386 \stoptabulate
3387
3388 Return either the full pathname of the found file, or \type {nil} if the file
3389 cannot be found.
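
A possible finder, sketched under the assumption that files are located with
\type {kpse.find_file} and that write requests can simply return the given name.
The mapping from \type {pfb} and \type {enc} to format names follows the format
list of the \type {kpse} library; the job name is arbitrary.

\starttyping
-- kpse format names that differ from the type strings passed to the finder
local special = { pfb = "type1 fonts", enc = "enc files" }

local function finder(name, mode, ftype)
    if mode == "w" then
        return name                          -- write where requested
    end
    return kpse.find_file(name, special[ftype] or ftype)
end

local mp = mplib.new({
    find_file   = finder,
    interaction = "nonstop",
    job_name    = "example",
})

-- There are no mem files, so macros are loaded with an input command.
local result = mp:execute("input plain;")
print(result.status)
\stoptyping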
3390
3391 Note that the new version of \MPLIB\ no longer uses binary mem files, so the way
3392 to preload a set of macros is simply to start off with an \type {input} command
3393 in the first \type {mp:execute()} call.
3394
3395 \subsection{\type {mp:statistics}}
3396
3397 You can request statistics with:
3398
3399 \startfunctioncall
3400 <table> stats = mp:statistics()
3401 \stopfunctioncall
3402
3403 This function returns the vital statistics for an \MPLIB\ instance. There are
3404 four fields, giving the maximum number of used items in each of four allocated
3405 object classes:
3406
3407 \starttabulate[|lT|l|p|]
3408 \NC main_memory \NC number \NC memory size \NC \NR
3409 \NC hash_size   \NC number \NC hash size\NC \NR
3410 \NC param_size  \NC number \NC simultaneous macro parameters\NC \NR
3411 \NC max_in_open \NC number \NC input file nesting levels\NC \NR
3412 \stoptabulate
3413
3414 Note that in the new version of \MPLIB, this is informational only. The objects
3415 are all allocated dynamically, so there is no chance of running out of space
3416 unless the available system memory is exhausted.
3417
3418 \subsection{\type {mp:execute}}
3419
3420 You can ask the \METAPOST\ interpreter to run a chunk of code by calling
3421
3422 \startfunctioncall
3423 <table> rettable = mp:execute('metapost language chunk')
3424 \stopfunctioncall
3425
3426 for various bits of \METAPOST\ language input. Be sure to check the \type
3427 {rettable.status} (see below) because when a fatal \METAPOST\ error occurs the
3428 \MPLIB\ instance will become unusable thereafter.
3429
3430 Generally speaking, it is best to keep your chunks small, but beware that all
3431 chunks have to obey proper syntax, like each of them is a small file. For
3432 instance, you cannot split a single statement over multiple chunks.
3433
3434 In contrast with the normal standalone \type {mpost} command, there is {\em no}
3435 implied \quote{input} at the start of the first chunk.
3436
3437 \subsection{\type {mp:finish}}
3438
3439 \startfunctioncall
3440 <table> rettable = mp:finish()
3441 \stopfunctioncall
3442
3443 If for some reason you want to stop using an \MPLIB\ instance while processing is
3444 not yet actually done, you can call \type {mp:finish}. Eventually, used memory
3445 will be freed and open files will be closed by the \LUA\ garbage collector, but
3446 an explicit \type {mp:finish} is the only way to capture the final part of the
3447 output streams.
3448
3449 \subsection{Result table}
3450
3451 The return value of \type {mp:execute} and \type {mp:finish} is a table with a
3452 few possible keys (only \type {status} is always guaranteed to be present).
3453
3454 \starttabulate[|l|l|p|]
3455 \NC log    \NC string \NC output to the \quote {log} stream \NC \NR
3456 \NC term   \NC string \NC output to the \quote {term} stream \NC \NR
3457 \NC error  \NC string \NC output to the \quote {error} stream
3458                           (only used for \quote {out of memory}) \NC \NR
3459 \NC status \NC number \NC the return value:
3460                           \type {0} = good,
3461                           \type {1} = warning,
3462                           \type {2} = errors,
3463                           \type {3} = fatal error \NC \NR
3464 \NC fig    \NC table  \NC an array of generated figures (if any) \NC \NR
3465 \stoptabulate
3466
3467 When \type {status} equals~3, you should stop using this \MPLIB\ instance
3468 immediately: it is no longer capable of processing input.
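
The status check can be wrapped in a small helper. This is a sketch, not a
prescribed interface: the name \type {run_chunk} and its return values are
invented for the example, and \type {mp} is assumed to be an instance created
with \type {mplib.new}.

\starttyping
-- Returns the figures (or nil on errors), the collected messages, and a
-- flag that tells whether the instance must be abandoned.
local function run_chunk(mp, code)
    local result   = mp:execute(code)
    local messages = result.log or result.term or result.error or ""
    if result.status >= 2 then
        return nil, messages, result.status == 3
    end
    return result.fig or {}, messages, false
end
\stoptyping

A caller would typically stop using the instance as soon as the third return
value is true.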
3469
3470 If it is present, each of the entries in the \type {fig} array is a userdata
3471 representing a figure object, and each of those has a number of object methods
3472 you can call:
3473
3474 \starttabulate[|l|l|p|]
3475 \NC boundingbox  \NC function \NC returns the bounding box, as an array of 4
3476                                   values\NC \NR
3477 \NC postscript   \NC function \NC returns a string that is the ps output of the
3478                                   \type {fig}. This function accepts two optional
3479                                   integer arguments for specifying the values of
3480                                   \type {prologues} (first argument) and \type
3481                                   {procset} (second argument)\NC \NR
3482 \NC svg          \NC function \NC returns a string that is the svg output of the
3483                                   \type {fig}. This function accepts an optional
3484                                   integer argument for specifying the value of
3485                                   \type {prologues}\NC \NR
3486 \NC objects      \NC function \NC returns the actual array of graphic objects in
3487                                   this \type {fig} \NC \NR
3488 \NC copy_objects \NC function \NC returns a deep copy of the array of graphic
3489                                   objects in this \type {fig} \NC \NR
3490 \NC filename     \NC function \NC the filename this \type {fig}'s \POSTSCRIPT\
3491                                   output would have written to in standalone
3492                                   mode \NC \NR
3493 \NC width        \NC function \NC the \type {fontcharwd} value \NC \NR
3494 \NC height       \NC function \NC the \type {fontcharht} value \NC \NR
3495 \NC depth        \NC function \NC the \type {fontchardp} value \NC \NR
3496 \NC italcorr     \NC function \NC the \type {fontcharit} value \NC \NR
3497 \NC charcode     \NC function \NC the (rounded) \type {charcode} value \NC \NR
3498 \stoptabulate
3499
3500 Note: you can call \type {fig:objects()} only once for any one \type {fig}
3501 object!
3502
3503 When the boundingbox represents a \quote {negated rectangle}, i.e.\ when the
3504 first set of coordinates is larger than the second set, the picture is empty.
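
The emptiness test can be sketched as follows. The order of the four values
(lower left corner first, then upper right corner) is an assumption, and the
helper name is invented for the example.

\starttyping
-- Returns width and height of a figure, or nil for an empty picture.
local function figure_size(fig)
    local bb = fig:boundingbox()
    local llx, lly, urx, ury = bb[1], bb[2], bb[3], bb[4]
    if llx > urx or lly > ury then
        return nil                   -- negated rectangle: nothing was drawn
    end
    return urx - llx, ury - lly
end
\stoptyping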
3505
3506 Graphical objects come in various types, each with a different list of
3507 accessible values. The types are: \type {fill}, \type {outline}, \type {text},
3508 \type {start_clip}, \type {stop_clip}, \type {start_bounds}, \type {stop_bounds},
3509 \type {special}.
3510
3511 There is a helper function (\type {mplib.fields(obj)}) to get the list of
3512 accessible values for a particular object, but you can just as easily use the
3513 tables given below.
3514
3515 All graphical objects have a field \type {type} that gives the object type as a
3516 string value; it is not explicitly mentioned in the following tables. In the
3517 following, \type {number}s are \POSTSCRIPT\ points represented as a floating
3518 point number, unless stated otherwise. Field values that are of type \type
3519 {table} are explained in the next section.
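
To get a quick overview of a figure, the objects can be listed together with
their field names. This sketch assumes that \type {mplib.fields} returns an
array of strings; the helper name is invented.

\starttyping
local function list_objects(fig)
    local objects = fig:objects()    -- may be called only once per figure
    for i, obj in ipairs(objects) do
        local names = mplib.fields(obj)
        print(i, obj.type, table.concat(names, ", "))
    end
    return objects                   -- keep the array for later use
end
\stoptyping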
3520
3521 \subsubsection{fill}
3522
3523 \starttabulate[|l|l|p|]
3524 \NC path       \NC table  \NC the list of knots \NC \NR
3525 \NC htap       \NC table  \NC the list of knots for the reversed trajectory \NC \NR
3526 \NC pen        \NC table  \NC knots of the pen \NC \NR
3527 \NC color      \NC table  \NC the object's color \NC \NR
3528 \NC linejoin   \NC number \NC line join style (bare number)\NC \NR
3529 \NC miterlimit \NC number \NC miterlimit\NC \NR
3530 \NC prescript  \NC string \NC the prescript text \NC \NR
3531 \NC postscript \NC string \NC the postscript text \NC \NR
3532 \stoptabulate
3533
3534 The entries \type {htap} and \type {pen} are optional.
3535
3536 There is a helper function (\type {mplib.pen_info(obj)}) that returns a table
3537 containing a bunch of vital characteristics of the used pen (all values are
3538 floats):
3539
3540 \starttabulate[|l|l|p|]
3541 \NC width \NC number \NC width of the pen \NC \NR
3542 \NC sx    \NC number \NC $x$ scale        \NC \NR
3543 \NC rx    \NC number \NC $xy$ multiplier  \NC \NR
3544 \NC ry    \NC number \NC $yx$ multiplier  \NC \NR
3545 \NC sy    \NC number \NC $y$ scale        \NC \NR
3546 \NC tx    \NC number \NC $x$ offset       \NC \NR
3547 \NC ty    \NC number \NC $y$ offset       \NC \NR
3548 \stoptabulate
3549
3550 \subsubsection{outline}
3551
3552 \starttabulate[|l|l|p|]
3553 \NC path       \NC table  \NC the list of knots \NC \NR
3554 \NC pen        \NC table  \NC knots of the pen \NC \NR
3555 \NC color      \NC table  \NC the object's color \NC \NR
3556 \NC linejoin   \NC number \NC line join style (bare number) \NC \NR
3557 \NC miterlimit \NC number \NC miterlimit \NC \NR
3558 \NC linecap    \NC number \NC line cap style (bare number) \NC \NR
3559 \NC dash       \NC table  \NC representation of a dash list \NC \NR
3560 \NC prescript  \NC string \NC the prescript text \NC \NR
3561 \NC postscript \NC string \NC the postscript text \NC \NR
3562 \stoptabulate
3563
3564 The entry \type {dash} is optional.
3565
3566 \subsubsection{text}
3567
3568 \starttabulate[|l|l|p|]
3569 \NC text       \NC string \NC the text \NC \NR
3570 \NC font       \NC string \NC font tfm name \NC \NR
3571 \NC dsize      \NC number \NC font size \NC \NR
3572 \NC color      \NC table  \NC the object's color \NC \NR
3573 \NC width      \NC number \NC \NC \NR
3574 \NC height     \NC number \NC \NC \NR
3575 \NC depth      \NC number \NC \NC \NR
3576 \NC transform  \NC table  \NC a text transformation \NC \NR
3577 \NC prescript  \NC string \NC the prescript text \NC \NR
3578 \NC postscript \NC string \NC the postscript text \NC \NR
3579 \stoptabulate
3580
3581 \subsubsection{special}
3582
3583 \starttabulate[|l|l|p|]
3584 \NC prescript \NC string \NC special text \NC \NR
3585 \stoptabulate
3586
3587 \subsubsection{start_bounds, start_clip}
3588
3589 \starttabulate[|l|l|p|]
3590 \NC path \NC table \NC the list of knots \NC \NR
3591 \stoptabulate
3592
3593 \subsubsection{stop_bounds, stop_clip}
3594
3595 There are no fields available.
3596
3597 \subsection{Subsidiary table formats}
3598
3599 \subsubsection{Paths and pens}
3600
3601 Paths and pens (that are really just a special type of path as far as \MPLIB\ is
3602 concerned) are represented by an array where each entry is a table that
3603 represents a knot.
3604
3605 \starttabulate[|lT|l|p|]
3606 \NC left_type   \NC string \NC when present: endpoint, but usually absent \NC \NR
3607 \NC right_type  \NC string \NC like \type {left_type} \NC \NR
3608 \NC x_coord     \NC number \NC X coordinate of this knot \NC \NR
3609 \NC y_coord     \NC number \NC Y coordinate of this knot \NC \NR
3610 \NC left_x      \NC number \NC X coordinate of the precontrol point of this knot \NC \NR
3611 \NC left_y      \NC number \NC Y coordinate of the precontrol point of this knot \NC \NR
3612 \NC right_x     \NC number \NC X coordinate of the postcontrol point of this knot \NC \NR
3613 \NC right_y     \NC number \NC Y coordinate of the postcontrol point of this knot \NC \NR
3614 \stoptabulate
3615
3616 There is one special case: pens that are (possibly transformed) ellipses have an
3617 extra string-valued key \type {type} with value \type {elliptical} besides the
3618 array part containing the knot list.
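
A knot list can be turned into B\'ezier segments by pairing each knot with its
successor. The sketch below ignores the closing segment of a cyclic path and
falls back to the knot itself when a control point is absent; both choices are
assumptions made to keep the example short.

\starttyping
-- Each segment is { x0, y0, x1, y1, x2, y2, x3, y3 }: start point,
-- two control points, end point.
local function segments(path)
    local result = {}
    for i = 1, #path - 1 do
        local a, b = path[i], path[i + 1]
        result[#result + 1] = {
            a.x_coord, a.y_coord,
            a.right_x or a.x_coord, a.right_y or a.y_coord,
            b.left_x or b.x_coord, b.left_y or b.y_coord,
            b.x_coord, b.y_coord,
        }
    end
    return result
end
\stoptyping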
3619
3620 \subsubsection{Colors}
3621
3622 A color is an array of numbers with 0, 1, 3 or 4 values:
3623
3624 \starttabulate[|l|l|p|]
3625 \NC 0 \NC marking only \NC no values                                                     \NC \NR
3626 \NC 1 \NC greyscale    \NC one value in the range $[0,1]$, \quote {black} is $0$         \NC \NR
3627 \NC 3 \NC \RGB         \NC three values in the range $[0,1]$, \quote {black} is $0,0,0$  \NC \NR
3628 \NC 4 \NC \CMYK        \NC four values in the range $[0,1]$, \quote {black} is $0,0,0,1$ \NC \NR
3629 \stoptabulate
3630
3631 If the color model of the internal object was \type {uninitialized}, then it was
3632 initialized to the values representing \quote {black} in the colorspace \type
3633 {defaultcolormodel} that was in effect at the time of the \type {shipout}.
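
Since the color model follows from the number of values, it can be derived from
the length of the array. The helper name and the model labels are chosen for
this sketch only.

\starttyping
local models = {
    [0] = "marking only",
    [1] = "greyscale",
    [3] = "rgb",
    [4] = "cmyk",
}

-- Returns a label for the color model, or nil for an unexpected length.
local function color_model(color)
    return models[#color]
end

print(color_model({ 0, 0, 0, 1 }))   -- prints cmyk
\stoptyping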
3634
3635 \subsubsection{Transforms}
3636
3637 Each transform is a six|-|item array.
3638
3639 \starttabulate[|l|l|p|]
3640 \NC 1 \NC number \NC represents x  \NC \NR
3641 \NC 2 \NC number \NC represents y  \NC \NR
3642 \NC 3 \NC number \NC represents xx \NC \NR
3643 \NC 4 \NC number \NC represents yx \NC \NR
3644 \NC 5 \NC number \NC represents xy \NC \NR
3645 \NC 6 \NC number \NC represents yy \NC \NR
3646 \stoptabulate
3647
3648 Note that the translation (indices 1 and 2) comes first. This differs from the
3649 ordering in \POSTSCRIPT, where the translation comes last.
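
Reordering a transform for \POSTSCRIPT\ style output then amounts to moving the
first two items to the end. The sketch assumes the usual six number matrix
layout with the translation last; the helper name is invented.

\starttyping
-- { x, y, xx, yx, xy, yy }  becomes  { xx, yx, xy, yy, x, y }
local function to_postscript_matrix(t)
    return { t[3], t[4], t[5], t[6], t[1], t[2] }
end
\stoptyping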
3650
3651 \subsubsection{Dashes}
3652
3653 Each \type {dash} is a two|-|item hash, using the same model as \POSTSCRIPT\ for the
3654 representation of the dashlist. \type {dashes} is an array of \quote {on} and
3655 \quote {off} values, and \type {offset} is the phase of the pattern.
3656
3657 \starttabulate[|l|l|p|]
3658 \NC dashes \NC hash   \NC an array of on-off numbers \NC \NR
3659 \NC offset \NC number \NC the starting offset value  \NC \NR
3660 \stoptabulate
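
Because the model matches \POSTSCRIPT, a dash table maps directly onto a \type
{setdash} line. The helper below is a sketch with an invented name; it treats a
missing offset as zero, which is an assumption.

\starttyping
local function setdash_string(dash)
    local pattern = table.concat(dash.dashes, " ")
    return string.format("[%s] %s setdash", pattern, tostring(dash.offset or 0))
end

print(setdash_string({ dashes = { 3, 2 }, offset = 0 }))
\stoptyping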
3661
3662 \subsection{Character size information}
3663
3664 These functions find the size of a glyph in a defined font. The \type {fontname}
3665 is the same name as the argument to \type {infont}; the \type {char} is a glyph
3666 id in the range 0 to 255; the returned \type {w} is in AFM units.
3667
3668 \subsubsection{\type {mp:char_width}}
3669
3670 \startfunctioncall
3671 <number> w = mp:char_width(<string> fontname, <number> char)
3672 \stopfunctioncall
3673
3674 \subsubsection{\type {mp:char_height}}
3675
3676 \startfunctioncall
3677 <number> w = mp:char_height(<string> fontname, <number> char)
3678 \stopfunctioncall
3679
3680 \subsubsection{\type {mp:char_depth}}
3681
3682 \startfunctioncall
3683 <number> w = mp:char_depth(<string> fontname, <number> char)
3684 \stopfunctioncall
3685
3686 \section{The \type {node} library}
3687
3688 The \type {node} library contains functions that facilitate dealing with (lists
3689 of) nodes and their values. They allow you to create, alter, copy, delete, and
3690 insert \LUATEX\ node objects, the core objects within the typesetter.
3691
3692 \LUATEX\ nodes are represented in \LUA\ as userdata with the metadata type
3693 \type {luatex.node}. The various parts within a node can be accessed using
3694 named fields.
3695
3696 Each node has at least the three fields \type {next}, \type {id}, and \type
3697 {subtype}:
3698
3699 \startitemize[intro]
3700
3701 \startitem
3702     The \type {next} field returns the userdata object for the next node in a
3703     linked list of nodes, or \type {nil}, if there is no next node.
3704 \stopitem
3705
3706 \startitem
3707     The \type {id} indicates \TEX's \quote{node type}. The field \type {id} has a
3708     numeric value for efficiency reasons, but some of the library functions also
3709     accept a string value instead of \type {id}.
3710 \stopitem
3711
3712 \startitem
3713     The \type {subtype} is another number. It often gives further information
3714     about a node of a particular \type {id}, but it is most important when
3715     dealing with \quote {whatsits}, because they are differentiated solely based
3716     on their \type {subtype}.
3717 \stopitem
3718
3719 \stopitemize
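
These three fields are enough to walk a list and summarize its contents. The
helper below is a sketch with an invented name; it uses \type {node.type} to
turn the numeric \type {id} into a readable string.

\starttyping
-- Counts the nodes in a list, grouped by node type name.
local function count_by_type(head)
    local counts = {}
    local n = head
    while n do
        local name = node.type(n.id)
        counts[name] = (counts[name] or 0) + 1
        n = n.next
    end
    return counts
end
\stoptyping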
3720
3721 The other available fields depend on the \type {id} (and for \quote {whatsits},
3722 the \type {subtype}) of the node. Further details on the various fields and their
3723 meanings are given in~\in{chapter}[nodes].
3724
3725 Support for \type {unset} (alignment) nodes is partial: they can be queried and
3726 modified from \LUA\ code, but not created.
3727
3728 Nodes can be compared to each other, but: you are actually comparing indices into
3729 the node memory. This means that equality tests can only be trusted under very
3730 limited conditions. It will not work correctly in any situation where one of the
3731 two nodes has been freed and|/|or reallocated: in that case, there will be false
3732 positives.
3733
3734 At the moment, memory management of nodes should still be done explicitly by the
3735 user. Nodes are not \quote {seen} by the \LUA\ garbage collector, so you have to
3736 call the node freeing functions yourself when you are no longer in need of a node
3737 (list). Nodes form linked lists without reference counting, so you have to be
3738 careful that when control returns back to \LUATEX\ itself, you have not deleted
3739 nodes that are still referenced from a \type {next} pointer elsewhere, and that
3740 you did not create nodes that are referenced more than once.
3741
3742 There are statistics available with regard to the allocated node memory, which
3743 can be handy for tracing.
3744
3745 \subsection{Node handling functions}
3746
3747 \subsubsection{\type {node.is_node}}
3748
3749 \startfunctioncall
3750 <boolean> t = node.is_node(<any> item)
3751 \stopfunctioncall
3752
3753 This function returns true if the argument is a userdata object of
3754 type \type {<node>}.
3755
3756 \subsubsection{\type {node.types}}
3757
3758 \startfunctioncall
3759 <table> t = node.types()
3760 \stopfunctioncall
3761
3762 This function returns an array that maps node id numbers to node type strings,
3763 providing an overview of the possible top|-|level \type {id} types.
3764
3765 \subsubsection{\type {node.whatsits}}
3766
3767 \startfunctioncall
3768 <table> t = node.whatsits()
3769 \stopfunctioncall
3770
3771 \TEX's \quote{whatsits} all have the same \type {id}. The various subtypes are
3772 defined by their \type {subtype} fields. The function is much like \type
3773 {node.types}, except that it provides an array of \type {subtype} mappings.
3774
3775 \subsubsection{\type {node.id}}
3776
3777 \startfunctioncall
3778 <number> id = node.id(<string> type)
3779 \stopfunctioncall
3780
3781 This converts a single type name to its internal numeric representation.
3782
3783 \subsubsection{\type {node.subtype}}
3784
3785 \startfunctioncall
3786 <number> subtype = node.subtype(<string> type)
3787 \stopfunctioncall
3788
3789 This converts a single whatsit name to its internal numeric representation (\type
3790 {subtype}).
3791
3792 \subsubsection{\type {node.type}}
3793
3794 \startfunctioncall
3795 <string> type = node.type(<any> n)
3796 \stopfunctioncall
3797
3798 If the argument is a number, then this function converts an internal numeric
3799 representation to an external string representation. Otherwise, it will return
3800 the string \type {node} if the object represents a node, and \type {nil}
3801 otherwise.
3802
3803 \subsubsection{\type {node.fields}}
3804
3805 \startfunctioncall
3806 <table> t = node.fields(<number> id)
3807 <table> t = node.fields(<number> id, <number> subtype)
3808 \stopfunctioncall
3809
3810 This function returns an array of valid field names for a particular type of
3811 node. If you want to get the valid fields for a \quote {whatsit}, you have to
3812 supply the second argument also. In other cases, any given second argument will
3813 be silently ignored.
3814
3815 This function accepts string \type {id} and \type {subtype} values as well.
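
A quick way to inspect a node type is to print its field names. The sketch uses
\type {pairs} rather than \type {ipairs} on the assumption that the returned
indices need not start at~1.

\starttyping
for index, name in pairs(node.fields("glyph")) do
    print(index, name)
end
\stoptyping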
3816
3817 \subsubsection{\type {node.has_field}}
3818
3819 \startfunctioncall
3820 <boolean> t = node.has_field(<node> n, <string> field)
3821 \stopfunctioncall
3822
3823 This function returns a boolean that is only true if \type {n} is
3824 actually a node, and it has the field.
3825
3826 \subsubsection{\type {node.new}}
3827
3828 \startfunctioncall
3829 <node> n = node.new(<number> id)
3830 <node> n = node.new(<number> id, <number> subtype)
3831 \stopfunctioncall
3832
3833 Creates a new node. All of the new node's fields are initialized to either zero
3834 or \type {nil} except for \type {id} and \type {subtype} (if supplied). If you
3835 want to create a new whatsit, then the second argument is required, otherwise it
3836 need not be present. As with all node functions, this function creates a node on
3837 the \TEX\ level.
3838
3839 This function accepts string \type {id} and \type {subtype} values as well.
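
As a sketch, a glyph node can be created, filled in and released again. The
field names \type {char} and \type {font} belong to glyph nodes; using \type
{font.current()} for the font is an assumption made for the example.

\starttyping
local g = node.new("glyph")
g.char = 0x41                    -- LATIN CAPITAL LETTER A
g.font = font.current()

print(node.type(g.id), node.has_field(g, "char"))

node.free(g)                     -- nodes are not garbage collected
\stoptyping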
3840
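The following sketch creates a kern node by name, sets its amount, and releases
it again. The value of 2pt is an arbitrary choice; \type {tex.sp} converts it to
scaled points.

\starttyping
local k = node.new("kern")   -- a string id instead of a number
k.kern = tex.sp("2pt")       -- the kern amount in scaled points
-- link k into a node list here, or release it when it is not needed:
node.free(k)
\stoptyping
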
3841 \subsubsection{\type {node.free}}
3842
3843 \startfunctioncall
3844 node.free(<node> n)
3845 \stopfunctioncall
3846
3847 Removes the node \type {n} from \TEX's memory. Be careful: no checks are done on
3848 whether this node is still pointed to from a register or some \type {next} field:
3849 it is up to you to make sure that the internal data structures remain correct.
3850
3851 \subsubsection{\type {node.flush_list}}
3852
3853 \startfunctioncall
3854 node.flush_list(<node> n)
3855 \stopfunctioncall
3856
3857 Removes the node list \type {n} and the complete node list following \type {n}
3858 from \TEX's memory. Be careful: no checks are done on whether any of these nodes
3859 is still pointed to from a register or some \type {next} field: it is up to you
3860 to make sure that the internal data structures remain correct.
3861
3862 \subsubsection{\type {node.copy}}
3863
3864 \startfunctioncall
3865 <node> m = node.copy(<node> n)
3866 \stopfunctioncall
3867
3868 Creates a deep copy of node \type {n}, including all nested lists as in the case
3869 of a hlist or vlist node. Only the \type {next} field is not copied.
3870
3871 \subsubsection{\type {node.copy_list}}
3872
3873 \startfunctioncall
3874 <node> m = node.copy_list(<node> n)
3875 <node> m = node.copy_list(<node> n, <node> m)
3876 \stopfunctioncall
3877
3878 Creates a deep copy of the node list that starts at \type {n}. If \type {m} is
3879 also given, the copy stops just before node \type {m}.
3880
3881 Note that you cannot copy attribute lists this way; specialized functions for
3882 dealing with attribute lists will be provided later but are not there yet.
3883 However, there is normally no need to copy attribute lists as when you do
3884 assignments to the \type {attr} field or make changes to specific attributes, the
3885 needed copying and freeing takes place automatically.
3886
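A minimal sketch of working on a private copy follows. It assumes that box
register 0 may or may not hold a box, and therefore checks for that first.

\starttyping
local box = tex.box[0]
if box and box.head then
  local copy = node.copy_list(box.head)
  -- inspect or modify the copy; the content of the box is untouched
  node.flush_list(copy)    -- release the copy when done
end
\stoptyping
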
3887 \subsubsection{\type {node.next}}
3888
3889 \startfunctioncall
3890 <node> m = node.next(<node> n)
3891 \stopfunctioncall
3892
3893 Returns the node following this node, or \type {nil} if there is no such node.
3894
3895 \subsubsection{\type {node.prev}}
3896
3897 \startfunctioncall
3898 <node> m = node.prev(<node> n)
3899 \stopfunctioncall
3900
3901 Returns the node preceding this node, or \type {nil} if there is no such node.
3902
3903 \subsubsection{\type {node.current_attr}}
3904
3905 \startfunctioncall
3906 <node> m = node.current_attr()
3907 \stopfunctioncall
3908
3909 Returns the currently active list of attributes, if there is one.
3910
3911 The intended usage of \type {current_attr} is as follows:
3912
3913 \starttyping
3914 local x1 = node.new("glyph")
3915 x1.attr = node.current_attr()
3916 local x2 = node.new("glyph")
3917 x2.attr = node.current_attr()
3918 \stoptyping
3919
3920 or:
3921
3922 \starttyping
3923 local x1 = node.new("glyph")
3924 local x2 = node.new("glyph")
3925 local ca = node.current_attr()
3926 x1.attr = ca
3927 x2.attr = ca
3928 \stoptyping
3929
3930 The attribute lists are ref counted and the assignment takes care of incrementing
3931 the refcount. You cannot expect the value \type {ca} to be valid any more when
3932 you assign attributes (using \type {tex.setattribute}) or when control has been
3933 passed back to \TEX.
3934
3935 Note: this function is somewhat experimental, and it returns the {\it actual}
3936 attribute list, not a copy thereof. Therefore, changing any of the attributes in
3937 the list will change these values for all nodes that have the current attribute
3938 list assigned to them.
3939
3940 \subsubsection{\type {node.hpack}}
3941
3942 \startfunctioncall
3943 <node> h, <number> b = node.hpack(<node> n)
3944 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info)
3945 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info, <string> dir)
3946 \stopfunctioncall
3947
3948 This function creates a new hlist by packaging the list that begins at node \type
3949 {n} into a horizontal box. With only a single argument, this box is created using
3950 the natural width of its components. In the three argument form, \type {info}
3951 must be either \type {additional} or \type {exactly}, and \type {w} is the
3952 additional (\type {\hbox spread}) or exact (\type {\hbox to}) width to be used. The
3953 second return value is the badness of the generated box.
3954
3955 Caveat: at this moment, there can be unexpected side|-|effects to this function,
3956 like updating some of the \type {\marks} and \type {\inserts}. Also note that the
3957 content of \type {h} is the original node list \type {n}: if you call \type
3958 {node.free(h)} you will also free the node list itself, unless you explicitly set
3959 the \type {list} field to \type {nil} beforehand. And in a similar way, calling
3960 \type {node.free(n)} will invalidate \type {h} as well!
3961
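The caveat about shared content can be handled as in the sketch below, which
packs a list to a given width only to learn its badness and then frees the
wrapper box without freeing the list. The helper name \type {badness_at} is
hypothetical, and the other side effects mentioned above still apply.

\starttyping
-- report the badness of the list at head when set to the given width
local function badness_at(head, width)
  local h, badness = node.hpack(head, width, "exactly")
  h.list = nil     -- detach the content so that it survives
  node.free(h)     -- frees only the wrapper box
  return badness
end
\stoptyping
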
3962 \subsubsection{\type {node.vpack}}
3963
3964 \startfunctioncall
3965 <node> h, <number> b = node.vpack(<node> n)
3966 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info)
3967 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info, <string> dir)
3968 \stopfunctioncall
3969
3970 This function creates a new vlist by packaging the list that begins at node \type
3971 {n} into a vertical box. With only a single argument, this box is created using
3972 the natural height of its components. In the three argument form, \type {info}
3973 must be either \type {additional} or \type {exactly}, and \type {w} is the
3974 additional (\type {\vbox spread}) or exact (\type {\vbox to}) height to be used.
3975
3976 The second return value is the badness of the generated box.
3977
3978 See the description of \type {node.hpack()} for a few memory allocation caveats.
3979
3980 \subsubsection{\type {node.dimensions}}
3981
3982 \startfunctioncall
3983 <number> w, <number> h, <number> d  = node.dimensions(<node> n)
3984 <number> w, <number> h, <number> d  = node.dimensions(<node> n, <string> dir)
3985 <number> w, <number> h, <number> d  = node.dimensions(<node> n, <node> t)
3986 <number> w, <number> h, <number> d  = node.dimensions(<node> n, <node> t, <string> dir)
3987 \stopfunctioncall
3988
3989 This function calculates the natural in-line dimensions of the node list starting
3990 at node \type {n} and terminating just before node \type {t} (or the end of the
3991 list, if there is no second argument). The return values are scaled points. An
3992 alternative format that starts with glue parameters as the first three arguments
3993 is also possible:
3994
3995 \startfunctioncall
3996 <number> w, <number> h, <number> d  =
3997   node.dimensions(<number> glue_set, <number> glue_sign,
3998                  <number> glue_order, <node> n)
3999 <number> w, <number> h, <number> d  =
4000   node.dimensions(<number> glue_set, <number> glue_sign,
4001                  <number> glue_order, <node> n, <string> dir)
4002 <number> w, <number> h, <number> d  =
4003   node.dimensions(<number> glue_set, <number> glue_sign,
4004                  <number> glue_order, <node> n, <node> t)
4005 <number> w, <number> h, <number> d  =
4006   node.dimensions(<number> glue_set, <number> glue_sign,
4007                  <number> glue_order, <node> n, <node> t, <string> dir)
4008 \stopfunctioncall
4009
4010 This calling method takes glue settings into account and is especially useful for
4011 finding the actual width of a sublist of nodes that are already boxed, for
4012 example in code like this, which prints the width of the space in between the
4013 \type {a} and \type {b} as it would be if \type {\box0} was used as-is:
4014
4015 \starttyping
4016 \setbox0 = \hbox to 20pt {a b}
4017
4018 \directlua{print (node.dimensions(
4019     tex.box[0].glue_set,
4020     tex.box[0].glue_sign,
4021     tex.box[0].glue_order,
4022     tex.box[0].head.next,
4023     node.tail(tex.box[0].head)
4024 )) }
4025 \stoptyping
4026
4027 \subsubsection{\type {node.mlist_to_hlist}}
4028
4029 \startfunctioncall
4030 <node> h = node.mlist_to_hlist(<node> n,
4031              <string> display_type, <boolean> penalties)
4032 \stopfunctioncall
4033
4034 This runs the internal mlist to hlist conversion, converting the math list in
4035 \type {n} into the horizontal list \type {h}. The interface is exactly the same
4036 as for the callback \type {mlist_to_hlist}.
4037
4038 \subsubsection{\type {node.slide}}
4039
4040 \startfunctioncall
4041 <node> m = node.slide(<node> n)
4042 \stopfunctioncall
4043
4044 Returns the last node of the node list that starts at \type {n}. As a
4045 side|-|effect, it also creates a reverse chain of \type {prev} pointers between
4046 nodes.
4047
4048 \subsubsection{\type {node.tail}}
4049
4050 \startfunctioncall
4051 <node> m = node.tail(<node> n)
4052 \stopfunctioncall
4053
4054 Returns the last node of the node list that starts at \type {n}.
4055
4056 \subsubsection{\type {node.length}}
4057
4058 \startfunctioncall
4059 <number> i = node.length(<node> n)
4060 <number> i = node.length(<node> n, <node> m)
4061 \stopfunctioncall
4062
4063 Returns the number of nodes contained in the node list that starts at \type {n}.
4064 If \type {m} is also supplied it stops at \type {m} instead of at the end of the
4065 list. The node \type {m} is not counted.
4066
4067 \subsubsection{\type {node.count}}
4068
4069 \startfunctioncall
4070 <number> i = node.count(<number> id, <node> n)
4071 <number> i = node.count(<number> id, <node> n, <node> m)
4072 \stopfunctioncall
4073
4074 Returns the number of nodes contained in the node list that starts at \type {n}
4075 that have a matching \type {id} field. If \type {m} is also supplied, counting
4076 stops at \type {m} instead of at the end of the list. The node \type {m} is not
4077 counted.
4078
4079 This function also accepts string \type {id}s.
4080
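As an illustration, \type {node.length} and \type {node.count} can be combined
to report how many nodes of a list are glyph nodes. The function name \type
{report} is hypothetical, and \type {head} is assumed to be the first node of a
list.

\starttyping
local function report(head)
  print("nodes:  " .. node.length(head))
  print("glyphs: " .. node.count("glyph", head))  -- a string id is accepted
end
\stoptyping
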
4081 \subsubsection{\type {node.traverse}}
4082
4083 \startfunctioncall
4084 <node> t = node.traverse(<node> n)
4085 \stopfunctioncall
4086
4087 This is a \LUA\ iterator that loops over the node list that starts at \type {n}.
4088 A typical loop such as this one:
4089
4090 \starttyping
4091 for n in node.traverse(head) do
4092    ...
4093 end
4094 \stoptyping
4095
4096 is functionally equivalent to:
4097
4098 \starttyping
4099 do
4100   local n
4101   local function f (head,var)
4102     local t
4103     if var == nil then
4104        t = head
4105     else
4106        t = var.next
4107     end
4108     return t
4109   end
4110   while true do
4111     n = f (head, n)
4112     if n == nil then break end
4113     ...
4114   end
4115 end
4116 \stoptyping
4117
4118 It should be clear from the definition of the function \type {f} that even though
4119 it is possible to add or remove nodes from the node list while traversing, you
4120 have to take great care to make sure all the \type {next} (and \type {prev})
4121 pointers remain valid.
4122
4123 If the above is unclear to you, see the section \quote {For Statement} in the
4124 \LUA\ Reference Manual.
4125
4126 \subsubsection{\type {node.traverse_id}}
4127
4128 \startfunctioncall
4129 <node> t = node.traverse_id(<number> id, <node> n)
4130 \stopfunctioncall
4131
4132 This is an iterator that loops over all the nodes in the list that starts at
4133 \type {n} that have a matching \type {id} field.
4134
4135 See the previous section for details. The change is in the local function \type
4136 {f}, which now does an extra while loop checking against the upvalue \type {id}:
4137
4138 \starttyping
4139  local function f(head,var)
4140    local t
4141    if var == nil then
4142       t = head
4143    else
4144       t = var.next
4145    end
4146    while t and t.id ~= id do
4147       t = t.next
4148    end
4149    return t
4150  end
4151 \stoptyping
4152
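In practice the iterator is used directly in a \type {for} loop. The sketch
below prints the character and font of every glyph node in a list. It assumes
\type {node.id} is available for looking up the numeric id, and the function
name \type {show_glyphs} is hypothetical.

\starttyping
local glyph_id = node.id("glyph")

local function show_glyphs(head)
  for g in node.traverse_id(glyph_id, head) do
    print(g.char, g.font)
  end
end
\stoptyping
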
4153 \subsubsection{\type {node.end_of_math}}
4154
4155 \startfunctioncall
4156 <node> t = node.end_of_math(<node> start)
4157 \stopfunctioncall
4158
4159 Looks for and returns the next \type {math_node} following the \type {start}. If
4160 the given node is a math end node, this helper returns that node; otherwise it
4161 follows the list and returns the next math end node. If no such node is found,
4162 \type {nil} is returned.
4163
4164 \subsubsection{\type {node.remove}}
4165
4166 \startfunctioncall
4167 <node> head, current = node.remove(<node> head, <node> current)
4168 \stopfunctioncall
4169
4170 This function removes the node \type {current} from the list following \type
4171 {head}. It is your responsibility to make sure it is really part of that list.
4172 The return values are the new \type {head} and \type {current} nodes. The
4173 returned \type {current} is the node following the \type {current} in the calling
4174 argument, and is only passed back as a convenience (or \type {nil}, if there is
4175 no such node). The returned \type {head} is more important, because if the
4176 function is called with \type {current} equal to \type {head}, it will be
4177 changed.
4178
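Because both return values matter, removing nodes in a loop is usually written
with an explicit \type {while} rather than with an iterator. The following is a
sketch only: it drops every kern node from a list, whatever its origin, and the
name \type {strip_kerns} is hypothetical.

\starttyping
local kern_id = node.id("kern")

local function strip_kerns(head)
  local current = head
  while current do
    if current.id == kern_id then
      local removed = current
      -- head may change when the first node is removed
      head, current = node.remove(head, current)
      node.free(removed)         -- the removed node is no longer in the list
    else
      current = current.next
    end
  end
  return head
end
\stoptyping
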
4179 \subsubsection{\type {node.insert_before}}
4180
4181 \startfunctioncall
4182 <node> head, new = node.insert_before(<node> head, <node> current, <node> new)
4183 \stopfunctioncall
4184
4185 This function inserts the node \type {new} before \type {current} into the list
4186 following \type {head}. It is your responsibility to make sure that \type
4187 {current} is really part of that list. The return values are the (potentially
4188 mutated) \type {head} and the node \type {new}, set up to be part of the list
4189 (with correct \type {next} field). If \type {head} is initially \type {nil}, it
4190 will become \type {new}.
4191
4192 \subsubsection{\type {node.insert_after}}
4193
4194 \startfunctioncall
4195 <node> head, new = node.insert_after(<node> head, <node> current, <node> new)
4196 \stopfunctioncall
4197
4198 This function inserts the node \type {new} after \type {current} into the list
4199 following \type {head}. It is your responsibility to make sure that \type
4200 {current} is really part of that list. The return values are the \type {head} and
4201 the node \type {new}, set up to be part of the list (with correct \type {next}
4202 field). If \type {head} is initially \type {nil}, it will become \type {new}.
4203
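A small sketch of \type {node.insert_after} in use: it adds a kern after every
glyph node in a list, including the last one. The function name \type
{letterspace} and its \type {amount} parameter (in scaled points) are
hypothetical.

\starttyping
local glyph_id = node.id("glyph")

local function letterspace(head, amount)
  for g in node.traverse_id(glyph_id, head) do
    local k = node.new("kern")
    k.kern = amount
    -- only the first return value, the head, is kept here
    head = node.insert_after(head, g, k)
  end
  return head
end
\stoptyping

Inserting after the node that is currently visited is safe here, because the
iterator continues from that node's \type {next} field and skips the new kern.
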
4204 \subsubsection{\type {node.first_glyph}}
4205
4206 \startfunctioncall
4207 <node> n = node.first_glyph(<node> n)
4208 <node> n = node.first_glyph(<node> n, <node> m)
4209 \stopfunctioncall
4210
4211 Returns the first node in the list starting at \type {n} that is a glyph node
4212 with a subtype indicating it is a glyph, or \type {nil}. If \type {m} is given,
4213 processing stops at (and including) that node, otherwise processing stops at the
4214 end of the list.
4215
4216 \subsubsection{\type {node.ligaturing}}
4217
4218 \startfunctioncall
4219 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n)
4220 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n, <node> m)
4221 \stopfunctioncall
4222
4223 Apply \TEX|-|style ligaturing to the specified node list. The tail node \type {m} is
4224 optional. The two returned nodes \type {h} and \type {t} are the new head and
4225 tail (both \type {n} and \type {m} can change into a new ligature).
4226
4227 \subsubsection{\type {node.kerning}}
4228
4229 \startfunctioncall
4230 <node> h, <node> t, <boolean> success = node.kerning(<node> n)
4231 <node> h, <node> t, <boolean> success = node.kerning(<node> n, <node> m)
4232 \stopfunctioncall
4233
4234 Apply \TEX|-|style kerning to the specified node list. The tail node \type {m} is
4235 optional. The two returned nodes \type {h} and \type {t} are the head and tail
4236 (either one of these can be an inserted kern node, because special kernings with
4237 word boundaries are possible).
4238
4239 \subsubsection{\type {node.unprotect_glyphs}}
4240
4241 \startfunctioncall
4242 node.unprotect_glyphs(<node> n)
4243 \stopfunctioncall
4244
4245 Subtracts 256 from all glyph node subtypes. This and the next function are
4246 helpers to convert from \type {characters} to \type {glyphs} during node
4247 processing.
4248
4249 \subsubsection{\type {node.protect_glyphs}}
4250
4251 \startfunctioncall
4252 node.protect_glyphs(<node> n)
4253 \stopfunctioncall
4254
4255 Adds 256 to all glyph node subtypes in the node list starting at \type {n},
4256 except that if the value is 1, it adds only 255. The special handling of 1 means
4257 that \type {characters} will become \type {glyphs} after subtraction of 256.
4258
4259 \subsubsection{\type {node.last_node}}
4260
4261 \startfunctioncall
4262 <node> n = node.last_node()
4263 \stopfunctioncall
4264
4265 This function pops the last node from \TEX's \quote{current list}. It returns
4266 that node, or \type {nil} if the current list is empty.
4267
4268 \subsubsection{\type {node.write}}
4269
4270 \startfunctioncall
4271 node.write(<node> n)
4272 \stopfunctioncall
4273
4274 This is an experimental function that will append a node list to \TEX's \quote
4275 {current list}. The node list is not deep|-|copied! There is no error checking
4276 either!
4277
4278 \subsubsection{\type {node.protrusion_skippable}}
4279 \startfunctioncall
4280 <boolean> skippable = node.protrusion_skippable(<node> n)
4281 \stopfunctioncall
4282
4283 Returns \type {true} if, for the purpose of line boundary discovery when
4284 character protrusion is active, this node can be skipped.
4285
4286 \subsection{Attribute handling}
4287
4288 Attributes appear as a linked list of userdata objects in the \type {attr} field of
4289 individual nodes. They can be handled individually, but it is much safer and more
4290 efficient to use the dedicated functions associated with them.
4291
4292 \subsubsection{\type {node.has_attribute}}
4293
4294 \startfunctioncall
4295 <number> v = node.has_attribute(<node> n, <number> id)
4296 <number> v = node.has_attribute(<node> n, <number> id, <number> val)
4297 \stopfunctioncall
4298
4299 Tests if a node has the attribute with number \type {id} set. If \type {val} is
4300 also supplied, also tests if the value matches \type {val}. It returns the value,
4301 or, if no match is found, \type {nil}.
4302
4303 \subsubsection{\type {node.set_attribute}}
4304
4305 \startfunctioncall
4306 node.set_attribute(<node> n, <number> id, <number> val)
4307 \stopfunctioncall
4308
4309 Sets the attribute with number \type {id} to the value \type {val}. Duplicate
4310 assignments are ignored. {\em [needs explanation]}
4311
4312 \subsubsection{\type {node.unset_attribute}}
4313
4314 \startfunctioncall
4315 <number> v = node.unset_attribute(<node> n, <number> id)
4316 <number> v = node.unset_attribute(<node> n, <number> id, <number> val)
4317 \stopfunctioncall
4318
4319 Unsets the attribute with number \type {id}. If \type {val} is also supplied, it
4320 will only perform this operation if the value matches \type {val}. Missing
4321 attributes or attribute|-|value pairs are ignored.
4322
4323 If the attribute was actually deleted, returns its old value. Otherwise, returns
4324 \type {nil}.
4325
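The three functions combine naturally, as in this sketch that toggles an
attribute on a single node. The helper name \type {toggle} and the choice of 1
as the value to set are assumptions made for the example.

\starttyping
-- switch attribute id on node n on or off; returns the previous value
local function toggle(n, id)
  local old = node.has_attribute(n, id)
  if old then
    node.unset_attribute(n, id)
  else
    node.set_attribute(n, id, 1)
  end
  return old
end
\stoptyping
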
4326 \section{The \type {pdf} library}
4327
4328 This contains variables and functions that are related to the \PDF\ backend.
4329
4330 \subsection{\type {pdf.mapfile}, \type {pdf.mapline}}
4331
4332 \startfunctioncall
4333 pdf.mapfile(<string> map file)
4334 pdf.mapline(<string> map line)
4335 \stopfunctioncall
4336
4337 These two functions can be used to replace primitives \type {\pdfmapfile} and
4338 \type {\pdfmapline} from \PDFTEX. They expect a string as only parameter and have
4339 no return value.
4340
4341 These functions also replace the former variables \type {pdf.pdfmapfile} and
4342 \type {pdf.pdfmapline}.
4343
4344 \subsection{\type {pdf.catalog}, \type {pdf.info}, \type {pdf.names},
4345     \type {pdf.trailer}}
4346
4347 These variables offer a read|-|write interface to the corresponding \PDFTEX\
4348 token lists. The value types are strings and they are written out to the \PDF\
4349 file directly after the \PDFTEX\ token registers.
4350
4351 The preferred interface is now \type {pdf.setcatalog}, \type {pdf.setinfo},
4352 \type {pdf.setnames} and \type {pdf.settrailer} for setting these properties
4353 and \type {pdf.getcatalog}, \type {pdf.getinfo}, \type {pdf.getnames} and
4354 \type {pdf.gettrailer} for querying them.
4355
4356 The corresponding \quote {\type {pdf}} parameter names \type {pdf.pdfcatalog},
4357 \type {pdf.pdfinfo}, \type {pdf.pdfnames}, and \type {pdf.pdftrailer} are
4358 not available.
4359
4360 \subsection{\type {pdf.<set/get>pageattributes}, \type {pdf.<set/get>pageresources},
4361     \type {pdf.<set/get>pagesattributes}}
4362
4363 These variables offer a read|-|write interface to related token lists. The value
4364 types are strings. The variables have no interaction with the corresponding
4365 \PDFTEX\ token registers \type {\pdfpageattr}, \type {\pdfpageresources}, and \type
4366 {\pdfpagesattr}. They are written out to the \PDF\ file directly after the
4367 \PDFTEX\ token registers.
4368
4369 The preferred interface is now \type {pdf.setpageattributes}, \type
4370 {pdf.setpagesattributes} and \type {pdf.setpageresources} for setting these
4371 properties and \type {pdf.getpageattributes}, \type {pdf.getpagesattributes}
4372 and \type {pdf.getpageresources} for querying them.
4373
4374 \subsection{\type {pdf.<set/get>xformattributes}, \type {pdf.<set/get>xformresources}}
4375
4376 These variables offer a read|-|write interface to related token lists. The value
4377 types are strings. The variables have no interaction with the corresponding
4378 \PDFTEX\ token registers \type {\pdfxformattr} and \type {\pdfxformresources}. They
4379 are written out to the \PDF\ file directly after the \PDFTEX\ token registers.
4380
4381 The preferred interface is now \type {pdf.setxformattributes} and \type
4382 {pdf.setxformresources} for setting these properties and \type
4383 {pdf.getxformattributes} and \type {pdf.getxformresources} for querying them.
4384
4385 \subsection{\type {pdf.setcompresslevel} and \type {pdf.setobjcompresslevel}}
4386
4387 These two functions set the level of compression. The minimum value is~0,
4388 the maximum is~9.
4389
4390 \subsection{\type {pdf.setdecimaldigits} and \type {pdf.getdecimaldigits}}
4391
4392 These two functions set and query the accuracy of floats written to the \PDF\ file.
4393 You can set any value, but the backend will not go below 3 or above 6.
4394
4395 \subsection{\type {pdf.setpkresolution} and \type {pdf.getpkresolution}}
4396
4397 The setter takes two arguments: the resolution and an optional zero or one that
4398 indicates if this is a fixed one. The getter returns these two values.
4399
4400 \subsection{\type {pdf.lastobj}, \type {pdf.lastlink}, \type {pdf.lastannot},
4401 and \type {pdf.retval}}
4402
4403 These status variables are similar to the ones traditionally used at the \TEX\
4404 end.
4405
4406 \subsection{\type {pdf.setorigin}, \type {pdf.getorigin}}
4407
4408 This one is used to set the horizontal and/or vertical offset (a traditional
4409 backend property).
4410
4411 \starttyping
4412 pdf.setorigin() -- sets both to 0pt
4413 pdf.setorigin(tex.sp("1in")) -- sets both to 1in
4414 pdf.setorigin(tex.sp("1in"),tex.sp("1in"))
4415 \stoptyping
4416
4417 The counterpart of this function returns two values.
4418
4419 \subsection{\type {pdf.setlinkmargin}, \type {pdf.getlinkmargin}, \type
4420 {pdf.setdestmargin}, \type {pdf.getdestmargin}, \type {pdf.setthreadmargin},
4421 \type {pdf.getthreadmargin}, \type {pdf.setxformmargin}, \type
4422 {pdf.getxformmargin}}
4423
4424 These functions can be used to set and retrieve the margins that are added to the
4425 natural bounding boxes of the respective objects.
4426
4427 \subsection{\type {pdf.h}, \type {pdf.v}}
4428
4429 These are the \type {h} and \type {v} values that define the current location on
4430 the output page, measured from its lower left corner. The values can be queried
4431 using scaled points as units.
4432
4433 \starttyping
4434 local h = pdf.h
4435 local v = pdf.v
4436 \stoptyping
4437
4438 \subsection{\type {pdf.getpos}, \type {pdf.gethpos}, \type {pdf.getvpos}}
4439
4440 These are the function variants of \type {pdf.h} and \type {pdf.v}. Sometimes
4441 using a function is preferred over a key so this saves wrapping. Also, these
4442 functions are faster than the key based access, as \type {h} and \type {v} keys
4443 are not real variables but looked up using a metatable call. The \type {getpos}
4444 function returns two values, the others return one each.
4445
4446 \starttyping
4447 local h, v = pdf.getpos()
4448 \stoptyping
4449
4450 \subsection{\type {pdf.hasmatrix}, \type {pdf.getmatrix}}
4451
4452 The current matrix transformation is available via the \type {getmatrix} command,
4453 which returns 6 values: \type {sx}, \type {rx}, \type {ry}, \type {sy}, \type
4454 {tx}, and \type {ty}. The \type {hasmatrix} function returns \type {true} when a
4455 matrix is applied.
4456
4457 \starttyping
4458 if pdf.hasmatrix() then
4459     local sx, rx, ry, sy, tx, ty = pdf.getmatrix()
4460     -- do something useful or not
4461 end
4462 \stoptyping
4463
4464 \subsection{\type {pdf.print}}
4465
4466 A print function to write stuff to the \PDF\ document that can be used from
4467 within a \type {\latelua} argument. This function is not to be used inside
4468 \type {\directlua} unless you know {\it exactly} what you are doing.
4469
4470 \startfunctioncall
4471 pdf.print(<string> s)
4472 pdf.print(<string> type, <string> s)
4473 \stopfunctioncall
4474
4475 The optional parameter can be used to mimic the behavior of \type {\pdfliteral}:
4476 the \type {type} is \type {direct} or \type {page}.
4477
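A minimal sketch of the two|-|argument form, called from \type {\latelua} as
recommended above, is shown below. The literal \PDF\ operators (a small filled
red rectangle) are only an illustration.

\starttyping
\latelua{
  % save the graphics state, set a red fill, paint a 10 by 10 unit
  % rectangle and restore the state
  pdf.print("page", "q 1 0 0 rg 0 0 10 10 re f Q ")
}
\stoptyping
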
4478 \subsection{\type {pdf.immediateobj}}
4479
4480 This function creates a \PDF\ object and immediately writes it to the \PDF\ file.
4481 It is modelled after \PDFTEX's \type {\immediate} \type {\pdfobj} primitives. All
4482 function variants return the object number of the newly generated object.
4483
4484 \startfunctioncall
4485 <number> n = pdf.immediateobj(<string> objtext)
4486 <number> n = pdf.immediateobj("file", <string> filename)
4487 <number> n = pdf.immediateobj("stream", <string> streamtext, <string> attrtext)
4488 <number> n = pdf.immediateobj("streamfile", <string> filename, <string> attrtext)
4489 \stopfunctioncall
4490
4491 The first version puts the \type {objtext} raw into an object. Only the object
4492 wrapper is automatically generated, but any internal structure (like \type {<<
4493 >>} dictionary markers) needs to be provided by the user. The second version with
4494 keyword \type {"file"} as 1st argument puts the contents of the file with name
4495 \type {filename} raw into the object. The third version with keyword \type
4496 {"stream"} creates a stream object and puts the \type {streamtext} raw into the
4497 stream. The stream length is automatically calculated. The optional \type
4498 {attrtext} goes into the dictionary of that object. The fourth version with
4499 keyword \type {"streamfile"} does the same as the 3rd one, it just reads the
4500 stream data raw from a file.
4501
4502 An optional first argument can be given to make the function use a previously
4503 reserved \PDF\ object.
4504
4505 \startfunctioncall
4506 <number> n = pdf.immediateobj(<integer> n, <string> objtext)
4507 <number> n = pdf.immediateobj(<integer> n, "file", <string> filename)
4508 <number> n = pdf.immediateobj(<integer> n, "stream", <string> streamtext, <string> attrtext)
4509 <number> n = pdf.immediateobj(<integer> n, "streamfile", <string> filename, <string> attrtext)
4510 \stopfunctioncall
4511
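The first and third variants could be used as in the following sketch. The
dictionary keys and the stream payload are made up for the example and carry no
meaning in \PDF.

\starttyping
-- a raw object: the dictionary markers are supplied by the caller
local n1 = pdf.immediateobj("<< /Type /Example /Value 42 >>")

-- a stream object: the length is computed automatically
local n2 = pdf.immediateobj("stream", "hypothetical payload", "/MyKey (demo)")

print(n1, n2)   -- the object numbers of the two new objects
\stoptyping
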
4512 \subsection{\type {pdf.obj}}
4513
4514 This function creates a \PDF\ object, which is written to the \PDF\ file only
4515 when referenced, e.g., by \type {pdf.refobj()}.
4516
4517 All function variants return the object number of the newly generated object, and
4518 there are two separate calling modes.
4519
4520 The first mode is modelled after \PDFTEX's \type {\pdfobj} primitive.
4521
4522 \startfunctioncall
4523 <number> n = pdf.obj(<string> objtext)
4524 <number> n = pdf.obj("file", <string> filename)
4525 <number> n = pdf.obj("stream", <string> streamtext, <string> attrtext)
4526 <number> n = pdf.obj("streamfile", <string> filename, <string> attrtext)
4527 \stopfunctioncall
4528
4529 An optional first argument can be given to make the function use a previously
4530 reserved \PDF\ object.
4531
4532 \startfunctioncall
4533 <number> n = pdf.obj(<integer> n, <string> objtext)
4534 <number> n = pdf.obj(<integer> n, "file", <string> filename)
4535 <number> n = pdf.obj(<integer> n, "stream", <string> streamtext, <string> attrtext)
4536 <number> n = pdf.obj(<integer> n, "streamfile", <string> filename, <string> attrtext)
4537 \stopfunctioncall
4538
4539 The second mode accepts a single argument table with key--value pairs.
4540
4541 \startfunctioncall
4542 <number> n = pdf.obj {
4543     type           = <string>,
4544     immediate      = <boolean>,
4545     objnum         = <number>,
4546     attr           = <string>,
4547     compresslevel  = <number>,
4548     objcompression = <boolean>,
4549     file           = <string>,
4550     string         = <string>
4551 }
4552 \stopfunctioncall
4553
4554 The \type {type} field can have the values \type {raw} and \type {stream}; this
4555 field is required, the others are optional (within constraints).
4556
4557 Note: this mode makes \type {pdf.obj} look more flexible than it actually is: the
4558 constraints from the separate parameter version still apply, so for example you
4559 can't have both \type {string} and \type {file} at the same time.
4560
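A sketch of the table form, using only keys from the list above, might look
like this. The payload and the attribute text are hypothetical. Because the
object is written only when referenced, the example ends with a call to \type
{pdf.refobj}.

\starttyping
local n = pdf.obj {
    type   = "stream",
    string = "hypothetical stream payload",  -- use either string or file
    attr   = "/MyKey (demo)",
}
pdf.refobj(n)   -- make sure the object ends up in the file
\stoptyping
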
4561 \subsection{\type {pdf.refobj}}
4562
4563 This function, the \LUA\ version of the \type {\pdfrefobj} primitive, references an
4564 object by its object number, so that the object will be written out.
4565
4566 \startfunctioncall
4567 pdf.refobj(<integer> n)
4568 \stopfunctioncall
4569
4570 This function works in both the \type {\directlua} and \type {\latelua} environment.
4571 Inside \type {\directlua} a new whatsit node \quote {pdf_refobj} is created, which
4572 will be marked for flushing during page output and the object is then written
4573 directly after the page, when also the resources objects are written out. Inside
4574 \type {\latelua} the object will be marked for flushing.
4575
4576 This function has no return values.
4577
4578 \subsection{\type {pdf.reserveobj}}
4579
4580 This function creates an empty \PDF\ object and returns its number.
4581
4582 \startfunctioncall
4583 <number> n = pdf.reserveobj()
4584 <number> n = pdf.reserveobj("annot")
4585 \stopfunctioncall
4586
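Reserving a number first is useful when other objects need to refer to an
object before its content is known. The sketch below reserves a number and fills
the object later, using the optional first argument of \type {pdf.immediateobj}.
The dictionary content is made up for the example.

\starttyping
local n = pdf.reserveobj()
-- the number n can already be used in references to this object
pdf.immediateobj(n, "<< /Type /Example >>")
\stoptyping
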
4587 \subsection{\type {pdf.registerannot}}
4588
4589 This function adds an object number to the \type {/Annots} array for the current
4590 page without doing anything else. This function can only be used from within
4591 \type {\latelua}.
4592
4593 \startfunctioncall
4594 pdf.registerannot (<number> objnum)
4595 \stopfunctioncall
4596
4597 \subsection{\type {pdf.newcolorstack}}
4598
4599 This function allocates a new color stack and returns its id. The arguments
4600 are the same as for the similar backend extension primitive.
4601
4602 \startfunctioncall
4603 pdf.newcolorstack("0 g","page",true) -- page|direct|origin
4604 \stopfunctioncall
4605
4606 \section{The \type {pdfscanner} library}
4607
4608 The \type {pdfscanner} library allows interpretation of PDF content streams and
4609 \type {/ToUnicode} (cmap) streams. You can get those streams from the \type
4610 {epdf} library, as explained in an earlier section. There is only a single
4611 top|-|level function in this library:
4612
4613 \startfunctioncall
4614 pdfscanner.scan (<Object> stream, <table> operatortable, <table> info)
4615 \stopfunctioncall
4616
4617 The first argument, \type {stream}, should be either a PDF stream object, or a
4618 PDF array of PDF stream objects (those options comprise the possible return
4619 values of \type {<Page>:getContents()} and \type {<Object>:getStream()} in the
4620 \type {epdf} library).
4621
4622 The second argument, \type {operatortable}, should be a Lua table where the keys
4623 are PDF operator name strings and the values are Lua functions (defined by you)
4624 that are used to process those operators. The functions are called whenever the
4625 scanner finds one of these PDF operators in the content stream(s). The functions
4626 are called with two arguments: the \type {scanner} object itself, and the \type
{info} table that was passed as the third argument to \type {pdfscanner.scan}.
4628
4629 Internally, \type {pdfscanner.scan} loops over the PDF operators in the
4630 stream(s), collecting operands on an internal stack until it finds a PDF
4631 operator. If that PDF operator's name exists in \type {operatortable}, then the
4632 associated function is executed. After the function has run (or when there is no
4633 function to execute) the internal operand stack is cleared in preparation for the
4634 next operator, and processing continues.
4635
4636 The \type {scanner} argument to the processing functions is needed because it
4637 offers various methods to get the actual operands from the internal operand
4638 stack.
4639
4640 A simple example of processing a PDF's document stream could look like this:
4641
4642 \starttyping
4643 function Do (scanner, info)
4644    local val       = scanner:pop()
4645    local name      = val[2] -- val[1] == 'name'
4646    local resources = info.resources
4647    local xobject   = resources:lookup("XObject"):getDict():lookup(name)
4648    print (info.space ..'Use XObject '.. name)
4649    if xobject and xobject:isStream() then
4650       local dict = xobject:getStream():getDict()
4651       if dict then
4652         local name = dict:lookup("Subtype")
4653         if name:getName() == "Form" then
4654           local newinfo =  {
4655             space = info.space .. "  " ,
4656             resources = dict:lookup("Resources"):getDict()
4657           }
4658           pdfscanner.scan(xobject, operatortable, newinfo)
4659         end
4660       end
4661    end
4662 end
4663
4664 operatortable = { Do = Do }
4665
4666 doc      = epdf.open(arg[1])
4667 pagenum  = 1
4668
4669 while pagenum <= doc:getNumPages() do
4670    local page = doc:getCatalog():getPage(pagenum)
4671    local info = {
4672      space     = "  " ,
4673      resources = page:getResourceDict()
4674    }
4675    print('Page ' .. pagenum)
4676    pdfscanner.scan(page:getContents(), operatortable, info)
4677    pagenum = pagenum + 1
4678 end
4679 \stoptyping
4680
This example iterates over all the actual content in the PDF, and prints out the
found XObject names. While the code demonstrates quite a few of the \type {epdf}
functions, let's focus on the \type {pdfscanner} specific code instead.
4684
4685 From the bottom up, the line
4686
4687 \starttyping
4688    pdfscanner.scan(page:getContents(), operatortable, info)
4689 \stoptyping
4690
4691 runs the scanner with the PDF page's top-level content.
4692
4693 The third argument, \type {info}, contains two entries: \type {space} is used to
4694 indent the printed output, and \type {resources} is needed so that embedded \type
4695 {XForms} can find their own content.
4696
The second argument, \type {operatortable}, defines a processing function for a
single PDF operator, \type {Do}.
4699
The function \type {Do} prints the name of the current XObject, and then starts a
new scanner for that object's content stream, under the condition that the
XObject is in fact a \type {/Form}. That nested scanner is called with a new \type
{info} argument with an updated \type {space} value so that the indentation of
the output nicely nests, and with a new \type {resources} field so that the next
level down can properly process any other, embedded XObjects.
4706
Of course, this is not a very useful example in practice, but for the purpose of
demonstrating \type {pdfscanner}, it is just long enough. It makes use of only
one \type {scanner} method: \type {scanner:pop()}. That function pops the top
operand of the internal stack, and returns a \LUA\ table where the object at index
one is a string representing the type of the operand, and object two is its
value.
4713
The list of possible operand types and associated \LUA\ value types is:
4715
4716 \starttabulate[|lT|p|]
4717 \NC integer  \NC <number>  \NC \NR
4718 \NC real     \NC <number>  \NC \NR
4719 \NC boolean  \NC <boolean> \NC \NR
4720 \NC name     \NC <string>  \NC \NR
4721 \NC operator \NC <string>  \NC \NR
4722 \NC string   \NC <string>  \NC \NR
4723 \NC array    \NC <table>   \NC \NR
4724 \NC dict     \NC <table>   \NC \NR
4725 \stoptabulate
4726
4727 In case of \type {integer} or \type {real}, the value is always a \LUA\ (floating
4728 point) number.
4729
4730 In case of \type {name}, the leading slash is always stripped.
4731
In case of \type {string}, please bear in mind that PDF actually supports
different types of strings (with different encodings) in different parts of the
PDF document, so you may need to reencode some of the results; \type {pdfscanner}
always outputs the byte stream without reencoding anything. \type {pdfscanner}
does not differentiate between literal strings and hexadecimal strings (the
hexadecimal values are decoded), and it treats the stream data for inline images
as a string that is the single operand for \type {EI}.
4739
4740 In case of \type {array}, the table content is a list of \type {pop} return
4741 values.
4742
4743 In case of \type {dict}, the table keys are PDF name strings and the values are
4744 \type {pop} return values.
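
To illustrate how such nested \type {pop} results can be walked, the following
sketch handles the PDF \type {TJ} operator, whose single operand is an array of
strings and numbers. It assumes that the caller has set up \type {info.text} as
a table; that field name is chosen for this example only.

\starttyping
local function TJ(scanner, info)
    local val = scanner:pop()          -- { "array", { ... } }
    if val and val[1] == "array" then
        for _, entry in ipairs(val[2]) do
            -- each entry is itself a pair: { type, value }
            if entry[1] == "string" then
                info.text[#info.text + 1] = entry[2]
            end
        end
    end
end

local operatortable = { TJ = TJ }
\stoptyping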
4745
4746 \blank
4747
There are a few more methods defined that you can call on \type {scanner}:
4749
4750 \starttabulate[|lT|p|]
4751 \NC pop       \NC as explained above \NC \NR
4752 \NC popNumber \NC return only the value of a \type {real} or \type {integer} \NC \NR
4753 \NC popName   \NC return only the value of a \type {name} \NC \NR
4754 \NC popString \NC return only the value of a \type {string} \NC \NR
4755 \NC popArray  \NC return only the value of a \type {array} \NC \NR
4756 \NC popDict   \NC return only the value of a \type {dict} \NC \NR
4757 \NC popBool   \NC return only the value of a \type {boolean} \NC \NR
4758 \NC done      \NC abort further processing of this \type {scan()} call \NC \NR
4759 \stoptabulate
4760
The \type {popXXX} methods are convenience functions, and come in handy when you know the
4762 type of the operands beforehand (which you usually do, in PDF). For example, the
4763 \type {Do} function could have used \type {local name = scanner:popName()}
4764 instead, because the single operand to the \type {Do} operator is always a PDF
4765 name object.
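
For operators with more than one operand, keep in mind that the operands come
off the stack in reverse order. A small sketch for the \type {Tf} operator (a
font name followed by a font size), assuming a caller|-|provided \type
{info.fonts} table that exists only for the sake of this example:

\starttyping
local function Tf(scanner, info)
    local size = scanner:popNumber() -- the last operand is on top
    local font = scanner:popName()   -- leading slash already stripped
    info.fonts[font] = size
end

local operatortable = { Tf = Tf }
\stoptyping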
4766
4767 The \type {done} function allows you to abort processing of a stream once you
4768 have learned everything you want to learn. This comes in handy while parsing
4769 \type {/ToUnicode}, because there usually is trailing garbage that you are not
interested in. Without \type {done}, processing only ends at the end of the
4771 stream, possibly wasting CPU cycles.
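
A possible use in a \type {/ToUnicode} handler could look as follows. This is a
sketch only: it assumes that nothing of interest follows the \type {endcmap}
operator, and the \type {info.finished} field is a name invented for this
example.

\starttyping
local operatortable = {
    endcmap = function(scanner, info)
        info.finished = true
        scanner:done() -- ignore whatever follows in this stream
    end,
}
\stoptyping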
4772
4773 \section{The \type {status} library}
4774
4775 This contains a number of run|-|time configuration items that you may find useful
4776 in message reporting, as well as an iterator function that gets all of the names
4777 and values as a table.
4778
4779 \startfunctioncall
4780 <table> info = status.list()
4781 \stopfunctioncall
4782
4783 The keys in the table are the known items, the value is the current value. Almost
4784 all of the values in \type {status} are fetched through a metatable at run|-|time
4785 whenever they are accessed, so you cannot use \type {pairs} on \type {status},
4786 but you {\it can\/} use \type {pairs} on \type {info}, of course. If you do not
4787 need the full list, you can also ask for a single item by using its name as an
4788 index into \type {status}.
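
As a sketch of both ways of access, the following code writes a sorted snapshot
of all items to the log file and then fetches a single item directly. The use of
\type {texio.write_nl} for reporting is just a choice made for this example.

\starttyping
-- take a snapshot and collect the keys so that they can be sorted
local info = status.list()
local keys = { }
for key in pairs(info) do
    keys[#keys + 1] = key
end
table.sort(keys)

for _, key in ipairs(keys) do
    texio.write_nl("log", key .. " = " .. tostring(info[key]))
end

-- a single item is fetched by using its name as index
texio.write_nl("log", "pages so far: " .. status.total_pages)
\stoptyping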
4789
4790 The current list is:
4791
4792 \starttabulate[|lT|p|]
4793 \NC \ssbf key          \NC \bf explanation \NC \NR
4794 \NC pdf_gone           \NC written \PDF\ bytes \NC \NR
4795 \NC pdf_ptr            \NC not yet written \PDF\ bytes \NC \NR
4796 \NC dvi_gone           \NC written \DVI\ bytes \NC \NR
4797 \NC dvi_ptr            \NC not yet written \DVI\ bytes \NC \NR
4798 \NC total_pages        \NC number of written pages \NC \NR
4799 \NC output_file_name   \NC name of the \PDF\ or \DVI\ file \NC \NR
4800 \NC log_name           \NC name of the log file \NC \NR
4801 \NC banner             \NC terminal display banner \NC \NR
4802 \NC var_used           \NC variable (one|-|word) memory in use \NC \NR
4803 \NC dyn_used           \NC token (multi|-|word) memory in use  \NC \NR
4804 \NC str_ptr            \NC number of strings \NC \NR
4805 \NC init_str_ptr       \NC number of \INITEX\ strings \NC \NR
4806 \NC max_strings        \NC maximum allowed strings \NC \NR
4807 \NC pool_ptr           \NC string pool index \NC \NR
4808 \NC init_pool_ptr      \NC \INITEX\ string pool index \NC \NR
4809 \NC pool_size          \NC current size allocated for string characters \NC \NR
4810 \NC node_mem_usage     \NC a string giving insight into currently used nodes \NC \NR
4811 \NC var_mem_max        \NC number of allocated words for nodes \NC \NR
4812 \NC fix_mem_max        \NC number of allocated words for tokens \NC \NR
4813 \NC fix_mem_end        \NC maximum number of used tokens \NC \NR
4814 \NC cs_count           \NC number of control sequences \NC \NR
4815 \NC hash_size          \NC size of hash \NC \NR
4816 \NC hash_extra         \NC extra allowed hash \NC \NR
4817 \NC font_ptr           \NC number of active fonts \NC \NR
\NC input_ptr          \NC the level of input we're at \NC \NR
4819 \NC max_in_stack       \NC max used input stack entries \NC \NR
4820 \NC max_nest_stack     \NC max used nesting stack entries \NC \NR
4821 \NC max_param_stack    \NC max used parameter stack entries \NC \NR
4822 \NC max_buf_stack      \NC max used buffer position \NC \NR
4823 \NC max_save_stack     \NC max used save stack entries \NC \NR
4824 \NC stack_size         \NC input stack size \NC \NR
4825 \NC nest_size          \NC nesting stack size \NC \NR
4826 \NC param_size         \NC parameter stack size \NC \NR
4827 \NC buf_size           \NC current allocated size of the line buffer \NC \NR
4828 \NC save_size          \NC save stack size \NC \NR
4829 \NC obj_ptr            \NC max \PDF\ object pointer \NC \NR
4830 \NC obj_tab_size       \NC \PDF\ object table size \NC \NR
4831 \NC pdf_os_cntr        \NC max \PDF\ object stream pointer \NC \NR
4832 \NC pdf_os_objidx      \NC \PDF\ object stream index \NC \NR
4833 \NC pdf_dest_names_ptr \NC max \PDF\ destination pointer \NC \NR
4834 \NC dest_names_size    \NC \PDF\ destination table size \NC \NR
4835 \NC pdf_mem_ptr        \NC max \PDF\ memory used \NC \NR
4836 \NC pdf_mem_size       \NC \PDF\ memory size \NC \NR
4837 \NC largest_used_mark  \NC max referenced marks class \NC \NR
4838 \NC filename           \NC name of the current input file \NC \NR
4839 \NC inputid            \NC numeric id of the current input \NC \NR
4840 \NC linenumber         \NC location in the current input file \NC \NR
4841 \NC lasterrorstring    \NC last tex error string \NC \NR
4842 \NC lastluaerrorstring \NC last lua error string \NC \NR
\NC lastwarningtag     \NC last warning tag, normally an indication of in what part the warning occurred \NC \NR
\NC lastwarningstring  \NC last warning string \NC \NR
4845 \NC lasterrorcontext   \NC last error context string (with newlines) \NC \NR
4846 \NC luabytecodes       \NC number of active \LUA\ bytecode registers \NC \NR
4847 \NC luabytecode_bytes  \NC number of bytes in \LUA\ bytecode registers \NC \NR
4848 \NC luastate_bytes     \NC number of bytes in use by \LUA\ interpreters \NC \NR
4849 \NC output_active      \NC \type {true} if the \type {\output} routine is active \NC \NR
4850 \NC callbacks          \NC total number of executed callbacks so far \NC \NR
4851 \NC indirect_callbacks \NC number of those that were themselves
4852                            a result of other callbacks (e.g. file readers) \NC \NR
4853 \NC luatex_version     \NC the luatex version number \NC \NR
4854 \NC luatex_revision    \NC the luatex revision string \NC \NR
4855 \NC ini_version        \NC \type {true} if this is an \INITEX\ run \NC \NR
4856 \NC shell_escape       \NC \type {0} means disabled, \type {1} is restricted and
4857                            \type {2} means anything is permitted \NC \NR
4858 \stoptabulate
4859
4860 The error and warning messages can be wiped with the \type {resetmessages}
4861 function.
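
A typical pattern, sketched here under the assumption that an empty or absent
string means that no error has been recorded, is to report the last message and
then wipe it:

\starttyping
local message = status.lasterrorstring
if message and message ~= "" then
    texio.write_nl("log", "last error: " .. message)
end
status.resetmessages()
\stoptyping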
4862
4863 \section{The \type {tex} library}
4864
4865 The \type {tex} table contains a large list of virtual internal \TEX\
4866 parameters that are partially writable.
4867
4868 The designation \quote {virtual} means that these items are not properly defined
4869 in \LUA, but are only front\-ends that are handled by a metatable that operates
4870 on the actual \TEX\ values. As a result, most of the \LUA\ table operators (like
4871 \type {pairs} and \type {#}) do not work on such items.
4872
4873 At the moment, it is possible to access almost every parameter that has these
4874 characteristics:
4875
4876 \startitemize[packed]
4877 \item You can use it after \type {\the}
4878 \item It is a single token.
4879 \item Some special others, see the list below
4880 \stopitemize
4881
4882 This excludes parameters that need extra arguments, like \type {\the\scriptfont}.
4883
The subset comprising simple integer and dimension registers is
writable as well as readable (for example \type {\tracingcommands} and
\type {\parindent}).
4887
4888 \subsection{Internal parameter values}
4889
4890 For all the parameters in this section, it is possible to access them directly
4891 using their names as index in the \type {tex} table, or by using one of the
functions \type {tex.get} and \type {tex.set}. If you created aliases,
4893 you can use accessors like \type {tex.getdimen} as these also understand
4894 names of built|-|in variables.
4895
4896 The exact parameters and return values differ depending on the actual parameter,
4897 and so does whether \type {tex.set} has any effect. For the parameters that {\it
4898 can\/} be set, it is possible to use \type {global} as the first argument to
4899 \type {tex.set}; this makes the assignment global instead of local.
4900
4901 \startfunctioncall
4902 tex.set (<string> n, ...)
4903 tex.set ("global", <string> n, ...)
4904 ... = tex.get (<string> n)
4905 \stopfunctioncall
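
A short sketch of these calls, with values that are chosen arbitrarily. The
helper \type {tex.sp} is used here to convert a dimension string into scaled
points.

\starttyping
-- local assignment of an integer parameter
tex.set("tolerance", 500)

-- global assignment of a dimension parameter
tex.set("global", "parindent", tex.sp("20pt"))

-- reading a value back
local tolerance = tex.get("tolerance")
\stoptyping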
4906
4907 There are also dedicated setters, getters and checkers:
4908
4909 \startfunctioncall
4910 local d = tex.getdimen("foo")
4911 if tex.isdimen("bar") then
4912     tex.setdimen("bar",d)
4913 end
4914 \stopfunctioncall
4915
4916 There are such helpers for \type {dimen}, \type {count}, \type {skip}, \type
4917 {box} and \type {attribute} registers.
4918
4919 \subsubsection{Integer parameters}
4920
4921 The integer parameters accept and return \LUA\ numbers.
4922
4923 Read|-|write:
4924
4925 \starttwocolumns
4926 \starttyping
4927 tex.adjdemerits
4928 tex.binoppenalty
4929 tex.brokenpenalty
4930 tex.catcodetable
4931 tex.clubpenalty
4932 tex.day
4933 tex.defaulthyphenchar
4934 tex.defaultskewchar
4935 tex.delimiterfactor
4936 tex.displaywidowpenalty
4937 tex.doublehyphendemerits
4938 tex.endlinechar
4939 tex.errorcontextlines
4940 tex.escapechar
4941 tex.exhyphenpenalty
4942 tex.fam
4943 tex.finalhyphendemerits
4944 tex.floatingpenalty
4945 tex.globaldefs
4946 tex.hangafter
4947 tex.hbadness
4948 tex.holdinginserts
4949 tex.hyphenpenalty
4950 tex.interlinepenalty
4951 tex.language
4952 tex.lastlinefit
4953 tex.lefthyphenmin
4954 tex.linepenalty
4955 tex.localbrokenpenalty
4956 tex.localinterlinepenalty
4957 tex.looseness
4958 tex.mag
4959 tex.maxdeadcycles
4960 tex.month
4961 tex.newlinechar
4962 tex.outputpenalty
4963 tex.pausing
4964 tex.postdisplaypenalty
4965 tex.predisplaydirection
4966 tex.predisplaypenalty
4967 tex.pretolerance
4968 tex.relpenalty
4969 tex.righthyphenmin
4970 tex.savinghyphcodes
4971 tex.savingvdiscards
4972 tex.showboxbreadth
4973 tex.showboxdepth
4974 tex.time
4975 tex.tolerance
4976 tex.tracingassigns
4977 tex.tracingcommands
4978 tex.tracinggroups
4979 tex.tracingifs
4980 tex.tracinglostchars
4981 tex.tracingmacros
4982 tex.tracingnesting
4983 tex.tracingonline
4984 tex.tracingoutput
4985 tex.tracingpages
4986 tex.tracingparagraphs
4987 tex.tracingrestores
4988 tex.tracingscantokens
4989 tex.tracingstats
4990 tex.uchyph
4991 tex.vbadness
4992 tex.widowpenalty
4993 tex.year
4994 \stoptyping
4995 \stoptwocolumns
4996
4997 Read|-|only:
4998
4999 \startthreecolumns
5000 \starttyping
5001 tex.deadcycles
5002 tex.insertpenalties
5003 tex.parshape
5004 tex.prevgraf
5005 tex.spacefactor
5006 \stoptyping
5007 \stopthreecolumns
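
Because these parameters behave like plain numbers, saving and restoring a value
is straightforward. The following sketch temporarily enables terminal tracing;
the code in between is left open.

\starttyping
local saved = tex.tracingonline
tex.tracingonline = 1
-- ... code whose tracing output should also go to the terminal ...
tex.tracingonline = saved
\stoptyping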
5008
5009 \subsubsection{Dimension parameters}
5010
5011 The dimension parameters accept \LUA\ numbers (signifying scaled points) or
5012 strings (with included dimension). The result is always a number in scaled
5013 points.
5014
5015 Read|-|write:
5016
5017 \startthreecolumns
5018 \starttyping
5019 tex.boxmaxdepth
5020 tex.delimitershortfall
5021 tex.displayindent
5022 tex.displaywidth
5023 tex.emergencystretch
5024 tex.hangindent
5025 tex.hfuzz
5026 tex.hoffset
5027 tex.hsize
5028 tex.lineskiplimit
5029 tex.mathsurround
5030 tex.maxdepth
5031 tex.nulldelimiterspace
5032 tex.overfullrule
5033 tex.pagebottomoffset
5034 tex.pageheight
5035 tex.pageleftoffset
5036 tex.pagerightoffset
5037 tex.pagetopoffset
5038 tex.pagewidth
5039 tex.parindent
5040 tex.predisplaysize
5041 tex.scriptspace
5042 tex.splitmaxdepth
5043 tex.vfuzz
5044 tex.voffset
5045 tex.vsize
5046 tex.prevdepth
5047 tex.prevgraf
5048 tex.spacefactor
5049 \stoptyping
5050 \stopthreecolumns
5051
5052 Read|-|only:
5053
5054 \startthreecolumns
5055 \starttyping
5056 tex.pagedepth
5057 tex.pagefilllstretch
5058 tex.pagefillstretch
5059 tex.pagefilstretch
5060 tex.pagegoal
5061 tex.pageshrink
5062 tex.pagestretch
5063 tex.pagetotal
5064 \stoptyping
5065 \stopthreecolumns
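
The following sketch shows both ways of assigning a dimension and reads two of
the read|-|only page parameters. The values are arbitrary.

\starttyping
tex.parindent = "20pt"  -- a string with a unit
tex.hfuzz     = 65536   -- a number in scaled points, so 1pt

-- the result of a query is always a number in scaled points
texio.write_nl("log", string.format("page total %d, page goal %d",
    tex.pagetotal, tex.pagegoal))
\stoptyping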
5066
5067 Beware: as with all \LUA\ tables you can add values to them. So, the following is valid:
5068
5069 \starttyping
5070 tex.foo = 123
5071 \stoptyping
5072
When you access a \TEX\ parameter a lookup takes place. For read|-|only variables
that means that you will get something back, but when you set them you create a
new entry in the table, thereby making the original invisible.
5076
5077 There are a few special cases that we make an exception for: \type {prevdepth},
5078 \type {prevgraf} and \type {spacefactor}. These normally are accessed via the
5079 \type {tex.nest} table:
5080
5081 \starttyping
5082 tex.nest[tex.nest.ptr].prevdepth   = p
5083 tex.nest[tex.nest.ptr].spacefactor = s
5084 \stoptyping
5085
5086 However, the following also works:
5087
5088 \starttyping
5089 tex.prevdepth   = p
5090 tex.spacefactor = s
5091 \stoptyping
5092
Keep in mind that when you mess with node lists directly at the \LUA\ end you
might need to update the top of the nesting stack's \type {prevdepth} explicitly
as there is no way \LUATEX\ can guess your intentions. By using the accessor in
the \type {tex} table, you get and set the values at the top of the nest stack.
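
As an illustration, the following sketch mimics what the plain \TEX\ macro \type
{\nointerlineskip} does, namely setting \type {prevdepth} to $-1000$pt. It
assumes that the code runs while \TEX\ is in vertical mode.

\starttyping
-- -1000pt expressed in scaled points
tex.prevdepth = -1000 * 65536

-- the shortcut and the nest entry refer to the same value
local top = tex.nest[tex.nest.ptr]
assert(top.prevdepth == tex.prevdepth)
\stoptyping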
5097
5098 \subsubsection{Direction parameters}
5099
5100 The direction parameters are read|-|only and return a \LUA\ string.
5101
5102 \startthreecolumns
5103 \starttyping
5104 tex.bodydir
5105 tex.mathdir
5106 tex.pagedir
5107 tex.pardir
5108 tex.textdir
5109 \stoptyping
5110 \stopthreecolumns
5111
5112 \subsubsection{Glue parameters}
5113
5114 The glue parameters accept and return a userdata object that represents a \type
5115 {glue_spec} node.
5116
5117 \startthreecolumns
5118 \starttyping
5119 tex.abovedisplayshortskip
5120 tex.abovedisplayskip
5121 tex.baselineskip
5122 tex.belowdisplayshortskip
5123 tex.belowdisplayskip
5124 tex.leftskip
5125 tex.lineskip
5126 tex.parfillskip
5127 tex.parskip
5128 tex.rightskip
5129 tex.spaceskip
5130 tex.splittopskip
5131 tex.tabskip
5132 tex.topskip
5133 tex.xspaceskip
5134 \stoptyping
5135 \stopthreecolumns
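
A sketch of querying such a parameter. It assumes the \type {width}, \type
{stretch} and \type {shrink} fields of \type {glue_spec} nodes as described in
the node chapter, and it ignores the stretch and shrink orders.

\starttyping
local spec = tex.parskip
texio.write_nl("log", string.format("parskip: %d plus %d minus %d",
    spec.width, spec.stretch, spec.shrink))
\stoptyping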
5136
5137 \subsubsection{Muglue parameters}
5138
5139 All muglue parameters are to be used read|-|only and return a \LUA\ string.
5140
5141 \startthreecolumns
5142 \starttyping
5143 tex.medmuskip
5144 tex.thickmuskip
5145 tex.thinmuskip
5146 \stoptyping
5147 \stopthreecolumns
5148
5149 \subsubsection{Tokenlist parameters}
5150
5151 The tokenlist parameters accept and return \LUA\ strings. \LUA\ strings are
5152 converted to and from token lists using \type {\the} \type {\toks} style expansion:
5153 all category codes are either space (10) or other (12). It follows that assigning
to some of these, like \type {tex.output}, is actually useless, but it feels bad
5155 to make exceptions in view of a coming extension that will accept full|-|blown
5156 token strings.
5157
5158 \startthreecolumns
5159 \starttyping
5160 tex.errhelp
5161 tex.everycr
5162 tex.everydisplay
5163 tex.everyeof
5164 tex.everyhbox
5165 tex.everyjob
5166 tex.everymath
5167 tex.everypar
5168 tex.everyvbox
5169 tex.output
5170 tex.pdfpageattr
5171 tex.pdfpageresources
5172 tex.pdfpagesattr
5173 tex.pdfpkmode
5174 \stoptyping
5175 \stopthreecolumns
5176
5177 \subsection{Convert commands}
5178
5179 All \quote {convert} commands are read|-|only and return a \LUA\ string. The
5180 supported commands at this moment are:
5181
5182 \starttwocolumns
5183 \starttyping
5184 tex.eTeXVersion
5185 tex.eTeXrevision
5186 tex.formatname
5187 tex.jobname
5188 tex.luatexbanner
5189 tex.luatexrevision
5190 tex.pdfnormaldeviate
5191 tex.fontname(number)
5192 tex.pdffontname(number)
5193 tex.pdffontobjnum(number)
5194 tex.pdffontsize(number)
5195 tex.uniformdeviate(number)
5196 tex.number(number)
5197 tex.romannumeral(number)
5198 tex.pdfpageref(number)
5199 tex.pdfxformname(number)
5200 tex.fontidentifier(number)
5201 \stoptyping
5202 \stoptwocolumns
5203
If you are wondering why this list looks haphazard: these are all the cases of
5205 the \quote {convert} internal command that do not require an argument, as well as
5206 the ones that require only a simple numeric value.
5207
The special (\LUA|-|only) case of \type {tex.fontidentifier} returns the \type
5209 {csname} string that matches a font id number (if there is one).
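
A few of these in use. The numbers passed are arbitrary; note that the results
are strings, also for \type {tex.number}.

\starttyping
texio.write_nl("log", tex.jobname .. " / " .. tex.formatname)

local roman  = tex.romannumeral(2024) -- the string "mmxxiv"
local digits = tex.number(42)         -- the string "42"
\stoptyping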
5210
5213 \subsection{Last item commands}
5214
5215 All \quote {last item} commands are read|-|only and return a number.
5216
5217 The supported commands at this moment are:
5218
5219 \startthreecolumns
5220 \starttyping
5221 tex.lastpenalty
5222 tex.lastkern
5223 tex.lastskip
5224 tex.lastnodetype
5225 tex.inputlineno
5226 tex.pdflastobj
5227 tex.pdflastxform
5228 tex.pdflastximage
5229 tex.pdflastximagepages
5230 tex.pdflastannot
5231 tex.pdflastxpos
5232 tex.pdflastypos
5233 tex.pdfrandomseed
5234 tex.pdflastlink
5235 tex.luatexversion
5236 tex.eTeXminorversion
5237 tex.eTeXversion
5238 tex.currentgrouplevel
5239 tex.currentgrouptype
5240 tex.currentiflevel
5241 tex.currentiftype
5242 tex.currentifbranch
5243 tex.pdflastximagecolordepth
5244 \stoptyping
5245 \stopthreecolumns
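
These values can for instance be used in diagnostic messages. The message format
in the following sketch is just an example.

\starttyping
texio.write_nl("log", string.format(
    "line %d, group level %d, conditional level %d",
    tex.inputlineno, tex.currentgrouplevel, tex.currentiflevel))
\stoptyping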
5246
5247 \subsection{Attribute, count, dimension, skip and token registers}
5248
\TEX's attribute (\type {\attribute}), counter (\type {\count}), dimension (\type
{\dimen}), skip (\type {\skip}) and token (\type {\toks}) registers can be accessed
and written to using five virtual sub|-|tables of the \type {tex}
table:
5253
5254 \startthreecolumns
5255 \starttyping
5256 tex.attribute
5257 tex.count
5258 tex.dimen
5259 tex.skip
5260 tex.toks
5261 \stoptyping
5262 \stopthreecolumns
5263
5264 It is possible to use the names of relevant \type {\attributedef}, \type {\countdef},
5265 \type {\dimendef}, \type {\skipdef}, or \type {\toksdef} control sequences as indices
5266 to these tables:
5267
5268 \starttyping
5269 tex.count.scratchcounter = 0
5270 enormous = tex.dimen['maxdimen']
5271 \stoptyping
5272
5273 In this case, \LUATEX\ looks up the value for you on the fly. You have to use a
5274 valid \type {\countdef} (or \type {\attributedef}, or \type {\dimendef}, or \type
{\skipdef}, or \type {\toksdef}); anything else will generate an error (the intent
5276 is to eventually also allow \type {<chardef tokens>} and even macros that expand
5277 into a number).
5278
5279 The attribute and count registers accept and return \LUA\ numbers.
5280
5281 The dimension registers accept \LUA\ numbers (in scaled points) or strings (with
5282 an included absolute dimension; \type {em} and \type {ex} and \type {px} are
5283 forbidden). The result is always a number in scaled points.
5284
5285 The token registers accept and return \LUA\ strings. \LUA\ strings are converted
5286 to and from token lists using \type {\the} \type {\toks} style expansion: all
5287 category codes are either space (10) or other (12).
5288
5289 The skip registers accept and return \type {glue_spec} userdata node objects (see
5290 the description of the node interface elsewhere in this manual).
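
A small sketch of array access by register number. Register 0 is used here as
scratch register, which is an assumption that may not hold in your macro
package.

\starttyping
tex.count[0] = 42
tex.dimen[0] = "10pt"      -- same as assigning 655360
tex.toks[0]  = "some text"

local c = tex.count[0]     -- a number
local d = tex.dimen[0]     -- a number in scaled points
local t = tex.toks[0]      -- a string
\stoptyping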
5291
As an alternative to array addressing, there are also accessor functions defined
for all cases. For example, here is the set of possibilities for \type {\skip}
registers:
5295
5296 \startfunctioncall
5297 tex.setskip (<number> n, <node> s)
5298 tex.setskip (<string> s, <node> s)
5299 tex.setskip ('global',<number> n, <node> s)
5300 tex.setskip ('global',<string> s, <node> s)
5301 <node> s = tex.getskip (<number> n)
5302 <node> s = tex.getskip (<string> s)
5303 \stopfunctioncall
5304
We have similar setters for \type {count}, \type {dimen}, \type {muskip}, and
\type {toks}. Counts and dimens are represented by numbers, skips and muskips by
nodes, and toks by strings. For token registers we have an alternative where a
catcode table is specified:
5309
5310 \startfunctioncall
5311 tex.scantoks(0,3,"$e=mc^2$")
tex.scantoks("global",0,3,"$\int\limits^1_2$")
5313 \stopfunctioncall
5314
5315 In the function-based interface, it is possible to define values globally by
5316 using the string \type {global} as the first function argument.
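
The difference with array access is sketched below. The register numbers are
arbitrary and \type {tex.sp} is used to convert a dimension string into scaled
points.

\starttyping
-- a global assignment, like \global\count0=1
tex.setcount("global", 0, 1)

-- local assignments, undone at the end of the current group
tex.setdimen(0, tex.sp("12pt"))
tex.settoks(0, "some text")

local c = tex.getcount(0)
\stoptyping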
5317
5318 \subsection{Character code registers}
5319
5320 \TEX's character code tables (\type {\lccode}, \type {\uccode}, \type {\sfcode}, \type
5321 {\catcode}, \type {\mathcode}, \type {\delcode}) can be accessed and written to using
six virtual subtables of the \type {tex} table:
5323
5324 \startthreecolumns
5325 \starttyping
5326 tex.lccode
5327 tex.uccode
5328 tex.sfcode
5329 tex.catcode
5330 tex.mathcode
5331 tex.delcode
5332 \stoptyping
5333 \stopthreecolumns
5334
5335 The function call interfaces are roughly as above, but there are a few twists.
5336 \type {sfcode}s are the simple ones:
5337
5338 \startfunctioncall
5339 tex.setsfcode (<number> n, <number> s)
5340 tex.setsfcode ('global', <number> n, <number> s)
5341 <number> s = tex.getsfcode (<number> n)
5342 \stopfunctioncall
5343
5344 The function call interface for \type {lccode} and \type {uccode} additionally
5345 allows you to set the associated sibling at the same time:
5346
5347 \startfunctioncall
5348 tex.setlccode (['global'], <number> n, <number> lc)
5349 tex.setlccode (['global'], <number> n, <number> lc, <number> uc)
5350 <number> lc = tex.getlccode (<number> n)
5351 tex.setuccode (['global'], <number> n, <number> uc)
5352 tex.setuccode (['global'], <number> n, <number> uc, <number> lc)
5353 <number> uc = tex.getuccode (<number> n)
5354 \stopfunctioncall
5355
5356 The function call interface for \type {catcode} also allows you to specify a
5357 category table to use on assignment or on query (default in both cases is the
5358 current one):
5359
5360 \startfunctioncall
5361 tex.setcatcode (['global'], <number> n, <number> c)
5362 tex.setcatcode (['global'], <number> cattable, <number> n, <number> c)
<number> c = tex.getcatcode (<number> n)
<number> c = tex.getcatcode (<number> cattable, <number> n)
5365 \stopfunctioncall
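
For instance, the following sketch makes the at sign (character 64) a letter
(category code 11) in the current catcode table and reads the value back:

\starttyping
tex.setcatcode(64, 11)
local c = tex.getcatcode(64)
\stoptyping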
5366
5367 The interfaces for \type {delcode} and \type {mathcode} use small array tables to
5368 set and retrieve values:
5369
5370 \startfunctioncall
5371 tex.setmathcode (['global'], <number> n, <table> mval )
5372 <table> mval = tex.getmathcode (<number> n)
5373 tex.setdelcode (['global'], <number> n, <table> dval )
5374 <table> dval = tex.getdelcode (<number> n)
5375 \stopfunctioncall
5376
5377 Where the table for \type {mathcode} is an array of 3 numbers, like this:
5378
5379 \starttyping
5380 {<number> mathclass, <number> family, <number> character}
5381 \stoptyping
5382
5383 And the table for \type {delcode} is an array with 4 numbers, like this:
5384
5385 \starttyping
5386 {<number> small_fam, <number> small_char, <number> large_fam, <number> large_char}
5387 \stoptyping
5388
5389 You can also avoid the table:
5390
5391 \startfunctioncall
5392 class, family, char = tex.getmathcodes (<number> n)
5393 smallfam, smallchar, largefam, largechar = tex.getdelcodes (<number> n)
5394 \stopfunctioncall
5395
5396 Normally, the third and fourth values in a delimiter code assignment will be zero
5397 according to \type {\Udelcode} usage, but the returned table can have values there
(if the delimiter code was set using \type {\delcode}, for example). Unset
delimiter codes can be recognized because \type {dval[1]} is $-1$.
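
The calls above can be combined as in the following sketch. The choice of
characters and of family 0 is arbitrary.

\starttyping
-- make the plus sign a binary operator (class 2) from family 0
tex.setmathcode(0x2B, { 2, 0, 0x2B })
local class, family, char = tex.getmathcodes(0x2B)

-- check whether the left parenthesis has a delimiter code at all
local dval = tex.getdelcode(0x28)
if dval[1] == -1 then
    texio.write_nl("log", "no delimiter code set for (")
end
\stoptyping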
5400
5401 \subsection{Box registers}
5402
5403 It is possible to set and query actual boxes, using the node interface as defined
5404 in the \type {node} library:
5405
5406 \starttyping
5407 tex.box
5408 \stoptyping
5409
5410 for array access, or
5411
5412 \starttyping
5413 tex.setbox(<number> n, <node> s)
5414 tex.setbox(<string> cs, <node> s)
5415 tex.setbox('global', <number> n, <node> s)
5416 tex.setbox('global', <string> cs, <node> s)
5417 <node> n = tex.getbox(<number> n)
5418 <node> n = tex.getbox(<string> cs)
5419 \stoptyping
5420
5421 for function|-|based access. In the function-based interface, it is possible to
5422 define values globally by using the string \type {global} as the first function
5423 argument.
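
A sketch of querying a box: because a void box gives no node, the result should
be tested before its fields are used. Box register 0 is an arbitrary choice.

\starttyping
local b = tex.getbox(0)
if b then
    texio.write_nl("log", string.format("box 0: wd %d, ht %d, dp %d",
        b.width, b.height, b.depth))
else
    texio.write_nl("log", "box 0 is void")
end
\stoptyping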
5424
5425 Be warned that an assignment like
5426
5427 \starttyping
5428 tex.box[0] = tex.box[2]
5429 \stoptyping
5430
does not copy the node list, it just duplicates a node pointer. If \type {\box2}
is cleared by \TEX\ commands later on, the contents of \type {\box0} become
invalid as well. To prevent this from happening, always use \type
{node.copy_list()} unless you are assigning to a temporary variable:
5435
5436 \starttyping
5437 tex.box[0] = node.copy_list(tex.box[2])
5438 \stoptyping
5439
The following function will register a box for reuse (this is modelled after
so|-|called xforms in \PDF). You can (re)use the box with \type {\useboxresource} or
by creating a rule node with subtype~2.
5443
5444 \starttyping
5445 local index = tex.saveboxresource(n,attributes,resources,immediate)
5446 \stoptyping
5447
5448 The optional second and third arguments are strings, the fourth is a boolean.
5449
5450 You can generate the reference (a rule type) with:
5451
5452 \starttyping
5453 local reused = tex.useboxresource(n,wd,ht,dp)
5454 \stoptyping
5455
5456 The dimensions are optional and the final ones are returned as extra values. The
5457 following is just a bonus (no dimensions returned means that the resource is
5458 unknown):
5459
5460 \starttyping
5461 local w, h, d = tex.getboxresourcedimensions(n)
5462 \stoptyping
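
Put together, a minimal sketch could look as follows. It assumes that box 0
holds the content to be reused and that \type {nil} can be passed for the
optional attribute and resource strings; injecting the result with \type
{node.write} is just one way to use it.

\starttyping
-- register box 0, without extra attributes or resources
local index = tex.saveboxresource(0, nil, nil, true)

-- get a rule node that refers to the resource, plus its dimensions
local rule, wd, ht, dp = tex.useboxresource(index)

-- append the reference to the current list
node.write(rule)
\stoptyping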
5463
5464 You can split a box:
5465
5466 \starttyping
5467 local vlist = tex.splitbox(n,height,mode)
5468 \stoptyping
5469
5470 The remainder is kept in the original box and a packaged vlist is returned. This
5471 operation is comparable to the \type {\vsplit} operation. The mode can be \type
5472 {additional} or \type {exactly} and concerns the split off box.
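
As a sketch, the following splits 10cm off box 0 and stores the result in box 2.
Both box numbers are arbitrary and the height is assumed to be given in scaled
points.

\starttyping
local vlist = tex.splitbox(0, tex.sp("10cm"), "exactly")
tex.setbox(2, vlist)
\stoptyping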
5473
5474 \subsection{Math parameters}
5475
5476 It is possible to set and query the internal math parameters using:
5477
5478 \startfunctioncall
5479 tex.setmath(<string> n, <string> t, <number> n)
5480 tex.setmath('global', <string> n, <string> t, <number> n)
5481 <number> n = tex.getmath(<string> n, <string> t)
5482 \stopfunctioncall
5483
As before, an optional first parameter \type {global} indicates a global
assignment.
5486
5487 The first string is the parameter name minus the leading \quote {Umath}, and the
5488 second string is the style name minus the trailing \quote {style}.
5489
5490 Just to be complete, the values for the math parameter name are:
5491
5492 \starttyping
5493 quad                axis                operatorsize
5494 overbarkern         overbarrule         overbarvgap
5495 underbarkern        underbarrule        underbarvgap
5496 radicalkern         radicalrule         radicalvgap
5497 radicaldegreebefore radicaldegreeafter  radicaldegreeraise
5498 stackvgap           stacknumup          stackdenomdown
5499 fractionrule        fractionnumvgap     fractionnumup
5500 fractiondenomvgap   fractiondenomdown   fractiondelsize
5501 limitabovevgap      limitabovebgap      limitabovekern
5502 limitbelowvgap      limitbelowbgap      limitbelowkern
5503 underdelimitervgap  underdelimiterbgap
5504 overdelimitervgap   overdelimiterbgap
5505 subshiftdrop        supshiftdrop        subshiftdown
5506 subsupshiftdown     subtopmax           supshiftup
5507 supbottommin        supsubbottommax     subsupvgap
5508 spaceafterscript    connectoroverlapmin
5509 ordordspacing       ordopspacing        ordbinspacing     ordrelspacing
5510 ordopenspacing      ordclosespacing     ordpunctspacing   ordinnerspacing
5511 opordspacing        opopspacing         opbinspacing      oprelspacing
5512 opopenspacing       opclosespacing      oppunctspacing    opinnerspacing
5513 binordspacing       binopspacing        binbinspacing     binrelspacing
5514 binopenspacing      binclosespacing     binpunctspacing   bininnerspacing
5515 relordspacing       relopspacing        relbinspacing     relrelspacing
5516 relopenspacing      relclosespacing     relpunctspacing   relinnerspacing
5517 openordspacing      openopspacing       openbinspacing    openrelspacing
5518 openopenspacing     openclosespacing    openpunctspacing  openinnerspacing
5519 closeordspacing     closeopspacing      closebinspacing   closerelspacing
5520 closeopenspacing    closeclosespacing   closepunctspacing closeinnerspacing
5521 punctordspacing     punctopspacing      punctbinspacing   punctrelspacing
5522 punctopenspacing    punctclosespacing   punctpunctspacing punctinnerspacing
5523 innerordspacing     inneropspacing      innerbinspacing   innerrelspacing
5524 inneropenspacing    innerclosespacing   innerpunctspacing innerinnerspacing
5525 \stoptyping
5526
5527 The values for the style parameter name are:
5528
5529 \starttyping
5530 display       crampeddisplay
5531 text          crampedtext
5532 script        crampedscript
5533 scriptscript  crampedscriptscript
5534 \stoptyping
5535
5536 The value is either a number (representing a dimension or number) or a glue spec
5537 node representing a muskip for \type {ordordspacing} and similar spacing
5538 parameters.
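
The calls could be combined as in the following sketch. The parameter and style
names come from the lists above; the factor of 1.1 is an arbitrary value chosen
for illustration only:

\starttyping
-- read \Umathquad for text style
local quad = tex.getmath("quad", "text")

-- locally enlarge the axis height in display style by ten percent
local axis = tex.getmath("axis", "display")
tex.setmath("axis", "display", tex.round(axis * 1.1))

-- the same assignment, but global
tex.setmath("global", "axis", "display", tex.round(axis * 1.1))
\stoptyping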
5539
5540 \subsection{Special list heads}
5541
5542 The virtual table \type {tex.lists} contains the set of internal registers that
5543 keep track of building page lists.
5544
5545 \starttabulate[|lT|p|]
5546 \NC \bf field           \NC \bf description \NC \NR
5547 \NC page_ins_head       \NC circular list of pending insertions \NC \NR
5548 \NC contrib_head        \NC the recent contributions \NC \NR
5549 \NC page_head           \NC the current page content \NC \NR
5550 %NC temp_head           \NC \NC \NR
5551 \NC hold_head           \NC used for held-over items for next page \NC \NR
5552 \NC adjust_head         \NC head of the current \type {\vadjust} list \NC \NR
5553 \NC pre_adjust_head     \NC head of the current \type {\vadjust pre} list \NC \NR
5554 %NC align_head          \NC \NC \NR
5555 \NC page_discards_head  \NC head of the discarded items of a page break \NC \NR
5556 \NC split_discards_head \NC head of the discarded items in a vsplit \NC \NR
5557 \stoptabulate
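
As an illustration, the following sketch walks over the recent contributions and
writes the type of each node to the log. It only reads the list; whether the
list is empty depends on the moment at which the code runs:

\starttyping
local head = tex.lists.contrib_head
if head then
    for n in node.traverse(head) do
        texio.write_nl("log", "contribution: " .. node.type(n.id))
    end
end
\stoptyping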
5558
5559 \subsection{Semantic nest levels}
5560
5561 The virtual table \type {tex.nest} contains the currently active
5562 semantic nesting state. It has two main parts: a zero-based array of userdata for
5563 the semantic nest itself, and the numerical value \type {tex.nest.ptr}, which
5564 gives the highest available index. Neither the array items in \type {tex.nest[]}
5565 nor \type {tex.nest.ptr} can be assigned to (as this would confuse the
5566 typesetting engine beyond repair), but you can assign to the individual values
5567 inside the array items, e.g.\ \type {tex.nest[tex.nest.ptr].prevdepth}.
5568
5569 \type {tex.nest[tex.nest.ptr]} is the current nest state, \type {tex.nest[0]} the
5570 outermost (main vertical list) level.
5571
5572 The known fields are:
5573
5574 \starttabulate[|lT|l|l|p|]
5575 \NC \ssbf key   \NC \bf type \NC \bf modes \NC \bf explanation \NC \NR
5576 \NC mode        \NC number   \NC all       \NC The current mode. This is a number representing the
5577                                                main mode at this level:\crlf
                                               \type {0} = no mode (this happens during \type {\write}),\crlf
                                               \type {1} = vertical,\crlf
                                               \type {127} = horizontal,\crlf
                                               \type {253} = display math,\crlf
                                               \type {-1} = internal vertical,\crlf
                                               \type {-127} = restricted horizontal,\crlf
                                               \type {-253} = inline math. \NC \NR
\NC modeline    \NC number   \NC all       \NC source input line where this mode was entered,
                                               negative inside the output routine \NC \NR
5587 \NC head        \NC node     \NC all       \NC the head of the current list \NC \NR
5588 \NC tail        \NC node     \NC all       \NC the tail of the current list \NC \NR
5589 \NC prevgraf    \NC number   \NC vmode     \NC number of lines in the previous paragraph \NC \NR
5590 \NC prevdepth   \NC number   \NC vmode     \NC depth of the previous paragraph (equal to \type {\pdfignoreddimen}
5591                                                when it is to be ignored) \NC \NR
5592 \NC spacefactor \NC number   \NC hmode     \NC the current space factor \NC \NR
5593 \NC dirs        \NC node     \NC hmode     \NC used for temporary storage by the line break algorithm\NC \NR
5594 \NC noad        \NC node     \NC mmode     \NC used for temporary storage of a pending fraction numerator,
5595                                                for \type {\over} etc. \NC \NR
5596 \NC delimptr    \NC node     \NC mmode     \NC used for temporary storage of the previous math delimiter,
5597                                                for \type {\middle} \NC \NR
5598 \NC mathdir     \NC boolean  \NC mmode     \NC true when during math processing the \type {\mathdir} is not
5599                                                the same as the surrounding \type {\textdir} \NC \NR
5600 \NC mathstyle   \NC number   \NC mmode     \NC the current \type {\mathstyle} \NC \NR
5601 \stoptabulate
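
The following sketch reports the mode of every nest level, from the outermost to
the current one. The helper \type {modename} and its lookup table are
hypothetical names introduced for this example; the mode numbers are the ones
listed in the table above.

\starttyping
-- mode numbers as listed in the table above
local modenames = {
    [   0] = "no mode",
    [   1] = "vertical",
    [ 127] = "horizontal",
    [ 253] = "display math",
    [  -1] = "internal vertical",
    [-127] = "restricted horizontal",
    [-253] = "inline math",
}

-- hypothetical helper: translate a mode number into a readable name
local function modename(mode)
    return modenames[mode] or "unknown"
end

-- only touch tex.nest when running inside the engine
if tex and tex.nest then
    for i = 0, tex.nest.ptr do
        texio.write_nl("log", i .. ": " .. modename(tex.nest[i].mode))
    end
end
\stoptyping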
5602
5603 \subsection[sec:luaprint]{Print functions}
5604
The \type {tex} table also contains the print functions that are the
major interface from \LUA\ scripting to \TEX.
5607
The arguments to these functions are all stored in an in|-|memory virtual
5609 file that is fed to the \TEX\ scanner as the result of the expansion of
5610 \type {\directlua}.
5611
5612 The total amount of returnable text from a \type {\directlua} command is only
5613 limited by available system \RAM. However, each separate printed string has to
5614 fit completely in \TEX's input buffer.
5615
5616 The result of using these functions from inside callbacks is undefined
5617 at the moment.
5618
5619 \subsubsection{\type {tex.print}}
5620
5621 \startfunctioncall
5622 tex.print(<string> s, ...)
5623 tex.print(<number> n, <string> s, ...)
5624 tex.print(<table> t)
5625 tex.print(<number> n, <table> t)
5626 \stopfunctioncall
5627
5628 Each string argument is treated by \TEX\ as a separate input line. If there is a
5629 table argument instead of a list of strings, this has to be a consecutive array
5630 of strings to print (the first non-string value will stop the printing process).
5631
5632 The optional parameter can be used to print the strings using the catcode regime
5633 defined by \type {\catcodetable}~\type {n}. If \type {n} is $-1$, the currently
5634 active catcode regime is used. If \type {n} is $-2$, the resulting catcodes are
5635 the result of \type {\the} \type {\toks}: all category codes are 12 (other) except for
the space character, which has category code 10 (space). Otherwise, if \type {n}
5637 is not a valid catcode table, then it is ignored, and the currently active
5638 catcode regime is used instead.
5639
5640 The very last string of the very last \type {tex.print()} command in a \type
{\directlua} will not have the \type {\endlinechar} appended; all others will.
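
A few calls, as a sketch of the variants described above. The code is meant to
live in a \LUA\ file or in \type {\directlua}, and the printed texts are
arbitrary:

\starttyping
-- two separate input lines, using the current catcode regime
tex.print("first line", "second line")

-- the same strings passed as an array
tex.print({ "first line", "second line" })

-- regime -2: the percent sign and ampersand get category code 12
tex.print(-2, "50% of A&B")
\stoptyping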
5642
5643 \subsubsection{\type {tex.sprint}}
5644
5645 \startfunctioncall
5646 tex.sprint(<string> s, ...)
5647 tex.sprint(<number> n, <string> s, ...)
5648 tex.sprint(<table> t)
5649 tex.sprint(<number> n, <table> t)
5650 \stopfunctioncall
5651
5652 Each string argument is treated by \TEX\ as a special kind of input line that
5653 makes it suitable for use as a partial line input mechanism:
5654
5655 \startitemize[packed]
5656 \startitem
5657     \TEX\ does not switch to the \quote {new line} state, so that leading spaces
5658     are not ignored.
5659 \stopitem
5660 \startitem
5661     No \type {\endlinechar} is inserted.
5662 \stopitem
5663 \startitem
5664     Trailing spaces are not removed.
5665
    Note that this does not prevent \TEX\ itself from eating spaces as a result of
5667     interpreting the line. For example, in
5668
5669 \starttyping
5670 before\directlua{tex.sprint("\\relax")tex.sprint(" inbetween")}after
5671 \stoptyping
5672     the space before \type {inbetween} will be gobbled as a result of the \quote
5673     {normal} scanning of \type {\relax}.
5674 \stopitem
5675 \stopitemize
5676
5677 If there is a table argument instead of a list of strings, this has to
5678 be a consecutive array of strings to print (the first non-string value
5679 will stop the printing process).
5680
5681 The optional argument sets the catcode regime, as with \type {tex.print()}.
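
Because no line ends are inserted, a single piece of input can be assembled from
several strings. In this sketch \type {\greeting} is a hypothetical macro name:

\starttyping
-- the three pieces are scanned as one line: \def\greeting{Hello}
tex.sprint("\\def\\greeting{", "Hello", "}")

-- leading and trailing spaces in a string are kept
tex.sprint("[", " padded ", "]")
\stoptyping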
5682
5683 \subsubsection{\type {tex.tprint}}
5684
5685 \startfunctioncall
5686 tex.tprint({<number> n, <string> s, ...}, {...})
5687 \stopfunctioncall
5688
5689 This function is basically a shortcut for repeated calls to \type
5690 {tex.sprint(<number> n, <string> s, ...)}, once for each of the supplied argument
5691 tables.
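
For instance, the following sketch prints two strings with different catcode
regimes in one call. The regime numbers $-2$ and $-1$ have the meaning described
for \type {tex.print()}, and \type {\TeX} is assumed to be defined by the macro
package in use:

\starttyping
-- first table: regime -2, second table: the currently active regime
tex.tprint({ -2, "100% " }, { -1, "\\TeX" })

-- which is meant to be equivalent to:
--   tex.sprint(-2, "100% ")
--   tex.sprint(-1, "\\TeX")
\stoptyping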
5692
5693 \subsubsection{\type {tex.cprint}}
5694
This function takes a number indicating the catcode to be used, plus either a
table of strings or an argument list of strings that will be pushed into the
input stream.
5698
5699 \startfunctioncall
5700 tex.cprint( 1," 1: $&{\\foo}") tex.print("\\par") -- a lot of \bgroup s
5701 tex.cprint( 2," 2: $&{\\foo}") tex.print("\\par") -- matching \egroup s
5702 tex.cprint( 9," 9: $&{\\foo}") tex.print("\\par") -- all get ignored
5703 tex.cprint(10,"10: $&{\\foo}") tex.print("\\par") -- all become spaces
5704 tex.cprint(11,"11: $&{\\foo}") tex.print("\\par") -- letters
5705 tex.cprint(12,"12: $&{\\foo}") tex.print("\\par") -- other characters
tex.cprint(14,"14: $&{\\foo}") tex.print("\\par") -- comment triggers
5707 \stopfunctioncall
5708
5709 \subsubsection{\type {tex.write}}
5710
5711 \startfunctioncall
5712 tex.write(<string> s, ...)
5713 tex.write(<table> t)
5714 \stopfunctioncall
5715
5716 Each string argument is treated by \TEX\ as a special kind of input line that
5717 makes it suitable for use as a quick way to dump information:
5718
5719 \startitemize
5720 \item All catcodes on that line are either \quote{space} (for '~') or
5721      \quote{character} (for all others).
5722 \item There is no \type {\endlinechar} appended.
5723 \stopitemize
5724
5725 If there is a table argument instead of a list of strings, this has to be a
5726 consecutive array of strings to print (the first non-string value will stop the
5727 printing process).
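
A small sketch with an arbitrary string: characters that are normally special
need no escaping on the \TEX\ side, because they are not seen as special:

\starttyping
-- percent, dollar, ampersand and hash are passed on as plain characters
tex.write("50% of $10 & #5")
\stoptyping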
5728
5729 \subsection{Helper functions}
5730
5731 \subsubsection{\type {tex.round}}
5732
5733 \startfunctioncall
5734 <number> n = tex.round(<number> o)
5735 \stopfunctioncall
5736
Rounds \LUA\ number \type {o}, and returns a number that is in the range of a
valid \TEX\ register value. If the number is out of range, it generates a
\quote {number too big} error as well.
5740
5741 \subsubsection{\type {tex.scale}}
5742
5743 \startfunctioncall
5744 <number> n = tex.scale(<number> o, <number> delta)
<table> n = tex.scale(<table> o, <number> delta)
5746 \stopfunctioncall
5747
5748 Multiplies the \LUA\ numbers \type {o} and \type {delta}, and returns a rounded
5749 number that is in the range of a valid \TEX\ register value. In the table
5750 version, it creates a copy of the table with all numeric top||level values scaled
in that manner. If the multiplied number(s) are out of range, it generates
\quote {number too big} error(s) as well.
5753
5754 Note: the precision of the output of this function will depend on your computer's
5755 architecture and operating system, so use with care! An interface to \LUATEX's
5756 internal, 100\% portable scale function will be added at a later date.
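
The two helpers might be used as in this sketch; the numbers and table keys are
arbitrary values chosen for illustration:

\starttyping
-- round an arbitrary Lua number to a valid register value
local n = tex.round(1234.5678)

-- scale a single dimension (in scaled points) by 120 percent
local width = tex.scale(tex.sp("10pt"), 1.2)

-- the table variant returns a copy with the numeric values scaled
local half = tex.scale({ width = 65536, height = 131072 }, 0.5)
\stoptyping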
5757
5758 \subsubsection{\type {tex.sp}}
5759
5760 \startfunctioncall
5761 <number> n = tex.sp(<number> o)
5762 <number> n = tex.sp(<string> s)
5763 \stopfunctioncall
5764
5765 Converts the number \type {o} or a string \type {s} that represents an explicit
5766 dimension into an integer number of scaled points.
5767
5768 For parsing the string, the same scanning and conversion rules are used that
5769 \LUATEX\ would use if it was scanning a dimension specifier in its \TEX|-|like
input language (this includes generating errors for bad values), except for the
5771 following:
5772
5773 \startitemize[n]
5774 \startitem
5775     only explicit values are allowed, control sequences are not handled
5776 \stopitem
5777 \startitem
5778     infinite dimension units (\type {fil...}) are forbidden
5779 \stopitem
5780 \startitem
5781     \type {mu} units do not generate an error (but may not be useful either)
5782 \stopitem
5783 \stopitemize
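
The following sketch illustrates the accepted input and, in comments, the two
forbidden cases from the list above:

\starttyping
local onepoint = tex.sp("1pt")    -- 65536 scaled points
local width    = tex.sp("12.5cm") -- an explicit value with a unit
local rounded  = tex.sp(65536.2)  -- a number is converted to an integer

-- not handled: control sequences and infinite units
-- tex.sp("\\hsize")
-- tex.sp("1fil")
\stoptyping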
5784
5785 \subsubsection{\type {tex.definefont}}
5786
5787 \startfunctioncall
5788 tex.definefont(<string> csname, <number> fontid)
5789 tex.definefont(<boolean> global, <string> csname, <number> fontid)
5790 \stopfunctioncall
5791
5792 Associates \type {csname} with the internal font number \type {fontid}. The
5793 definition is global if (and only if) \type {global} is specified and true (the
setting of \type {\globaldefs} is not taken into account).
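
As a sketch, the font that is active at the moment can be bound to a control
sequence. The name \type {currentfontcopy} is a hypothetical choice, and \type
{font.current()} is used here to obtain a valid font id:

\starttyping
-- local definition: \currentfontcopy selects the font that is active now
tex.definefont("currentfontcopy", font.current())

-- the same as a global definition
tex.definefont(true, "currentfontcopy", font.current())
\stoptyping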
5795
5796 \subsubsection{\type {tex.getlinenumber} and \type {tex.setlinenumber}}
5797
5798 You can mess with the current line number:
5799
5800 \startfunctioncall
5801 local n = tex.getlinenumber()
5802 tex.setlinenumber(n+10)
5803 \stopfunctioncall
5804
5805 which can be shortcut to:
5806
5807 \startfunctioncall
5808 tex.setlinenumber(10,true)
5809 \stopfunctioncall
5810
This might be handy when you have a callback that reads numbers from a file and
combines them in one line (in which case an error message probably has to refer
to the original line). Interference with \TEX's internal handling of line
numbers is of course possible.
5815
5816 \subsubsection{\type {tex.error}}
5817
5818 \startfunctioncall
5819 tex.error(<string> s)
5820 tex.error(<string> s, <table> help)
5821 \stopfunctioncall
5822
5823 This creates an error somewhat like the combination of \type {\errhelp} and \type
5824 {\errmessage} would. During this error, deletions are disabled.
5825
5826 The array part of the \type {help} table has to contain strings, one for each
5827 line of error help.
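
A sketch of a user level check that raises such an error; the function \type
{checkwidth} and the message texts are made up for this example:

\starttyping
-- hypothetical helper that validates a width given in scaled points
local function checkwidth(width)
    if width <= 0 then
        tex.error("Invalid width", {
            "The width passed to this function has to be positive.",
            "Correct the value in your source and run again.",
        })
    end
end
\stoptyping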
5828
5829 \subsubsection{\type {tex.hashtokens}}
5830
5831 \startfunctioncall
5832 for i,v in pairs (tex.hashtokens()) do ... end
5833 \stopfunctioncall
5834
5835 Returns a name and token table pair (see~\in {section} [luatokens] about token
5836 tables) iterator for every non-zero entry in the hash table. This can be useful
5837 for debugging, but note that this also reports control sequences that may be
5838 unreachable at this moment due to local redefinitions: it is strictly a dump of
5839 the hash table.
5840
\subsection[luaprimitives]{Functions for dealing with primitives}
5842
5843 \subsubsection{\type {tex.enableprimitives}}
5844
5845 \startfunctioncall
5846 tex.enableprimitives(<string> prefix, <table> primitive names)
5847 \stopfunctioncall
5848
5849 This function accepts a prefix string and an array of primitive names.
5850
For each combination of \quote {prefix} and \quote {name}, \type
5852 {tex.enableprimitives} first verifies that \quote {name} is an actual primitive
5853 (it must be returned by one of the \type {tex.extraprimitives()} calls explained
5854 below, or part of \TEX82, or \type {\directlua}). If it is not, \type
5855 {tex.enableprimitives} does nothing and skips to the next pair.
5856
5857 But if it is, then it will construct a csname variable by concatenating the
5858 \quote {prefix} and \quote {name}, unless the \quote {prefix} is already the
5859 actual prefix of \quote {name}. In the latter case, it will discard the \quote
5860 {prefix}, and just use \quote {name}.
5861
5862 Then it will check for the existence of the constructed csname. If the csname is
5863 currently undefined (note: that is not the same as \type {\relax}), it will
5864 globally define the csname to have the meaning: run code belonging to the
5865 primitive \quote {name}. If for some reason the csname is already defined, it
5866 does nothing and tries the next pair.
5867
5868 An example:
5869
5870 \starttyping
5871   tex.enableprimitives('LuaTeX', {'formatname'})
5872 \stoptyping
5873
5874 will define \type {\LuaTeXformatname} with the same intrinsic meaning as the
documented primitive \type {\formatname}, provided that the control sequence \type
5876 {\LuaTeXformatname} is currently undefined.
5877
5878 % Second example:
5879 %
5880 % \starttyping
5881 %   tex.enableprimitives('Omega',tex.extraprimitives ('omega'))
5882 % \stoptyping
5883 %
5884 % will define a whole series of csnames like \type {\Omegatextdir}, \type
5885 % {\Omegapardir}, etc., but it will stick with \type {\OmegaVersion} instead of
5886 % creating the doubly-prefixed \type {\OmegaOmegaVersion}.
5887
5888 When \LUATEX\ is run with \type {--ini} only the \TEX82 primitives and \type
5889 {\directlua} are available, so no extra primitives {\bf at all}.
5890
5891 If you want to have all the new functionality available using their default
5892 names, as it is now, you will have to add
5893
5894 \starttyping
5895   \ifx\directlua\undefined \else
5896      \directlua {tex.enableprimitives('',tex.extraprimitives ())}
5897   \fi
5898 \stoptyping
5899
5900 near the beginning of your format generation file. Or you can choose different
5901 prefixes for different subsets, as you see fit.
5902
5903 Calling some form of \type {tex.enableprimitives()} is highly important though,
5904 because if you do not, you will end up with a \TEX82-lookalike that can run \LUA\
5905 code but not do much else. The defined csnames are (of course) saved in the
5906 format and will be available at runtime.
5907
5908 \subsubsection{\type {tex.extraprimitives}}
5909
5910 \startfunctioncall
5911 <table> t = tex.extraprimitives(<string> s, ...)
5912 \stopfunctioncall
5913
5914 This function returns a list of the primitives that originate from the engine(s)
5915 given by the requested string value(s). The possible values and their (current)
5916 return values are:
5917
5918 \startluacode
5919 function document.showprimitives(tag)
5920     for k, v in table.sortedpairs(tex.extraprimitives(tag)) do
5921         if v == ' ' then
5922             v = '\\normalcontrolspace'
5923         end
5924         context.type(v)
5925         context.space()
5926     end
5927 end
5928 \stopluacode
5929
5930 \starttabulate[|l|pl|]
5931 \NC \bf name\NC \bf values \NC \NR
5932 \NC tex     \NC \ctxlua{document.showprimitives('tex')    } \NC \NR
5933 \NC core    \NC \ctxlua{document.showprimitives('core')   } \NC \NR
5934 \NC etex    \NC \ctxlua{document.showprimitives('etex')   } \NC \NR
5935 \NC luatex  \NC \ctxlua{document.showprimitives('luatex') } \NC \NR
5936 \stoptabulate
5937
Note that \type {'luatex'} does not contain \type {directlua}, as that
is considered to be a core primitive, along with all the \TEX82 primitives, so it
is part of the list that is returned from \type {'core'}.
5941
5942 % \type {'umath'} is a subset of \type {'luatex'} that covers the Unicode math
5943 % primitives as it might be desired to handle the prefixing of that subset
5944 % differently.
5945
Running \type {tex.extraprimitives()} will give you the complete list of
primitives that are not defined at \type {--ini} startup. It is exactly
equivalent to \type {tex.extraprimitives('etex', 'luatex')}.
5949
5950 \subsubsection{\type {tex.primitives}}
5951
5952 \startfunctioncall
5953 <table> t = tex.primitives()
5954 \stopfunctioncall
5955
This function returns a hash table listing all primitives that \LUATEX\ knows
about. The keys in the hash are primitive names, the values are tables
representing tokens (see~\in {section} [luatokens]). The third value in each
token table is always zero.
5960
{\em In the beginning we had \type {omega} and \type {pdftex} subsets but in the
meantime relevant primitives have been promoted (adapted or not) to the
\type {luatex} set when found useful, or removed when considered to be of no use.
Originally we had two sets of math definition primitives but the \OMEGA\ ones
have been removed, so we no longer have a subset for math either.}
5966
5967 \subsection{Core functionality interfaces}
5968
5969 \subsubsection{\type {tex.badness}}
5970
5971 \startfunctioncall
5972 <number> b = tex.badness(<number> t, <number> s)
5973 \stopfunctioncall
5974
5975 This helper function is useful during linebreak calculations. \type {t} and \type
5976 {s} are scaled values; the function returns the badness for when total \type {t}
5977 is supposed to be made from amounts that sum to \type {s}. The returned number is
a reasonable approximation of $100(t/s)^3$.
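
To make the approximation concrete, the following plain \LUA\ function sketches
the integer calculation that \TEX82 uses for badness. It is an illustration
only: it is not the code that \LUATEX\ runs, and the names \type {badness} and
\type {infbad} are chosen for this example.

\starttyping
local infbad = 10000 -- the value used for infinitely bad

local function badness(t, s)
    if t == 0 then
        return 0
    elseif s <= 0 then
        return infbad
    end
    local r -- approximation of alpha * t / s, with alpha^3 close to 100 * 2^18
    if t <= 7230584 then
        r = math.floor(t * 297 / s)
    elseif s >= 1663497 then
        r = math.floor(t / math.floor(s / 297))
    else
        r = t
    end
    if r > 1290 then
        return infbad -- r^3 would no longer fit in 31 bits
    end
    -- r^3 / 2^18, rounded to the nearest integer
    return math.floor((r * r * r + 131072) / 262144)
end
\stoptyping

With this sketch, a total that equals the available amount gives a badness of
100, and anything that needs more than about 4.3 times the available amount is
reported as infinitely bad.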
5979
5980 \subsubsection{\type {tex.linebreak}}
5981
5982 \startfunctioncall
5983 local <node> nodelist, <table> info =
5984        tex.linebreak(<node> listhead, <table> parameters)
5985 \stopfunctioncall
5986
5987 The understood parameters are as follows:
5988
5989 \starttabulate[|l|l|p|]
5990 \NC \bf name                 \NC \bf type        \NC \bf description \NC \NR
5991 \NC pardir                   \NC string          \NC \NC \NR
5992 \NC pretolerance             \NC number          \NC \NC \NR
5993 \NC tracingparagraphs        \NC number          \NC \NC \NR
5994 \NC tolerance                \NC number          \NC \NC \NR
5995 \NC looseness                \NC number          \NC \NC \NR
5996 \NC hyphenpenalty            \NC number          \NC \NC \NR
5997 \NC exhyphenpenalty          \NC number          \NC \NC \NR
5998 \NC pdfadjustspacing         \NC number          \NC \NC \NR
5999 \NC adjdemerits              \NC number          \NC \NC \NR
6000 \NC pdfprotrudechars         \NC number          \NC \NC \NR
6001 \NC linepenalty              \NC number          \NC \NC \NR
6002 \NC lastlinefit              \NC number          \NC \NC \NR
6003 \NC doublehyphendemerits     \NC number          \NC \NC \NR
6004 \NC finalhyphendemerits      \NC number          \NC \NC \NR
6005 \NC hangafter                \NC number          \NC \NC \NR
6006 \NC interlinepenalty         \NC number or table \NC if a table, then it is an array like \type {\interlinepenalties} \NC \NR
6007 \NC clubpenalty              \NC number or table \NC if a table, then it is an array like \type {\clubpenalties} \NC \NR
6008 \NC widowpenalty             \NC number or table \NC if a table, then it is an array like \type {\widowpenalties} \NC \NR
6009 \NC brokenpenalty            \NC number          \NC \NC \NR
6010 \NC emergencystretch         \NC number          \NC in scaled points \NC \NR
6011 \NC hangindent               \NC number          \NC in scaled points \NC \NR
6012 \NC hsize                    \NC number          \NC in scaled points \NC \NR
6013 \NC leftskip                 \NC glue_spec node  \NC \NC \NR
6014 \NC rightskip                \NC glue_spec node  \NC \NC \NR
6015 \NC pdfignoreddimen          \NC number          \NC in scaled points \NC \NR
6016 \NC parshape                 \NC table           \NC \NC \NR
6017 \stoptabulate
6018
Note that there is no interface for \type {\displaywidowpenalties}; you have to
pass the right choice for \type {widowpenalty} yourself.
6021
6022 The meaning of the various keys should be fairly obvious from the table (the
6023 names match the \TEX\ and \PDFTEX\ primitives) except for the last 5 entries. The
6024 four \type {pdf...line...} keys are ignored if their value equals \type
6025 {pdfignoreddimen}.
6026
6027 It is your own job to make sure that \type {listhead} is a proper paragraph list:
6028 this function does not add any nodes to it. To be exact, if you want to replace
6029 the core line breaking, you may have to do the following (when you are not
6030 actually working in the \type {pre_linebreak_filter} or \type {linebreak_filter}
6031 callbacks, or when the original list starting at listhead was generated in
6032 horizontal mode):
6033
6034 \startitemize
6035 \startitem
6036     add an \quote {indent box} and perhaps a \type {local_par} node at the start
6037     (only if you need them)
6038 \stopitem
6039 \startitem
6040     replace any found final glue by an infinite penalty (or add such a penalty,
6041     if the last node is not a glue)
6042 \stopitem
6043 \startitem
6044     add a glue node for the \type {\parfillskip} after that penalty node
6045 \stopitem
6046 \startitem
6047     make sure all the \type {prev} pointers are OK
6048 \stopitem
6049 \stopitemize
6050
The result is a node list; it still needs to be vpacked if you want to assign it
to a \type {\vbox}.
6053
6054 The returned \type {info} table contains four values that are all numbers:
6055
6056 \starttabulate[|l|p|]
6057 \NC prevdepth \NC depth of the last line in the broken paragraph \NC \NR
6058 \NC prevgraf  \NC number of lines in the broken paragraph \NC \NR
6059 \NC looseness \NC the actual looseness value in the broken paragraph \NC \NR
6060 \NC demerits  \NC the total demerits of the chosen solution  \NC \NR
6061 \stoptabulate
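
The call and the use of the \type {info} table could look as follows. This is a
sketch under two assumptions: \type {head} is already a properly prepared
paragraph list (see the steps above), and keys that are left out of the
parameter table fall back to the current \TEX\ values. The function name \type
{breakandpack} is hypothetical.

\starttyping
local function breakandpack(head, width)
    -- only hsize is passed here; other keys from the table can be added
    local lines, info = tex.linebreak(head, { hsize = width })
    texio.write_nl("log", info.prevgraf .. " lines, demerits " .. info.demerits)
    -- the result is a plain node list, so package it before storing it in a box
    return node.vpack(lines)
end
\stoptyping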
6062
Note that there are a few things you cannot interface using this function. You
cannot influence font expansion other than via \type {pdfadjustspacing}, because
the settings for that take place elsewhere. The same is true for \type
{\hbadness}, \type {\hfuzz}, etc. All these are used in the \type {hpack()}
routine, and that fetches its own variables via globals.
6068
6069 \subsubsection{\type {tex.shipout}}
6070
6071 \startfunctioncall
6072 tex.shipout(<number> n)
6073 \stopfunctioncall
6074
6075 Ships out box number \type {n} to the output file, and clears the box register.
6076
6077 \section[texconfig]{The \type {texconfig} table}
6078
6079 This is a table that is created empty. A startup \LUA\ script could
6080 fill this table with a number of settings that are read out by
6081 the executable after loading and executing the startup file.
6082
6083 \starttabulate[|lT|l|l|p|]
6084 \NC \ssbf key             \NC \bf type \NC \bf default \NC \bf explanation \NC \NR
6085 \NC kpse_init             \NC boolean  \NC true
6086 \NC
    \type {false} totally disables \KPATHSEA\ initialisation, and enables
    interpretation of the following numeric key--value pairs (only ever unset
    this if you implement {\it all\/} file find callbacks!)
6090 \NC \NR
6091 \NC
6092     shell_escape          \NC string   \NC \type {'f'} \NC
6093     Use \type {'y'} or \type {'t'} or \type {'1'} to enable \type {\write18}
6094     unconditionally, \type {'p'} to enable the commands that are listed in \type
6095     {shell_escape_commands}
6096 \NC \NR
6097 \NC
6098     shell_escape_commands \NC string \NC \NC Comma-separated list of command
6099     names that may be executed by \type {\write18} even if \type {shell_escape}
6100     is set to \type {'p'}. Do {\it not\/} use spaces around commas, separate any
6101     required command arguments by using a space, and use the ASCII double quote
6102     (\type {"}) for any needed argument or path quoting
6103 \NC \NR
6104
6105 \NC string_vacancies      \NC number   \NC  75000  \NC cf.\ web2c docs \NC \NR
6106 \NC pool_free             \NC number   \NC   5000  \NC cf.\ web2c docs \NC \NR
6107 \NC max_strings           \NC number   \NC  15000  \NC cf.\ web2c docs \NC \NR
6108 \NC strings_free          \NC number   \NC    100  \NC cf.\ web2c docs \NC \NR
6109 \NC nest_size             \NC number   \NC     50  \NC cf.\ web2c docs \NC \NR
6110 \NC max_in_open           \NC number   \NC     15  \NC cf.\ web2c docs \NC \NR
6111 \NC param_size            \NC number   \NC     60  \NC cf.\ web2c docs \NC \NR
6112 \NC save_size             \NC number   \NC   4000  \NC cf.\ web2c docs \NC \NR
6113 \NC stack_size            \NC number   \NC    300  \NC cf.\ web2c docs \NC \NR
6114 \NC dvi_buf_size          \NC number   \NC  16384  \NC cf.\ web2c docs \NC \NR
6115 \NC error_line            \NC number   \NC     79  \NC cf.\ web2c docs \NC \NR
6116 \NC half_error_line       \NC number   \NC     50  \NC cf.\ web2c docs \NC \NR
6117 \NC max_print_line        \NC number   \NC     79  \NC cf.\ web2c docs \NC \NR
6118 \NC hash_extra            \NC number   \NC      0  \NC cf.\ web2c docs \NC \NR
6119 \NC pk_dpi                \NC number   \NC     72  \NC cf.\ web2c docs \NC \NR
6120 \NC trace_file_names      \NC boolean  \NC true
6121 \NC
6122     \type {false} disables \TEX's normal file open|-|close feedback (the
6123     assumption is that callbacks will take care of that)
6124 \NC \NR
6125 \NC file_line_error       \NC boolean  \NC false
6126 \NC
6127     do \type {file:line} style error messages
6128 \NC \NR
6129 \NC halt_on_error         \NC boolean  \NC false
6130 \NC
6131     abort run on the first encountered error
6132 \NC \NR
6133 \NC formatname            \NC string   \NC
6134 \NC
6135     if no format name was given on the commandline, this key will be tested first
6136     instead of simply quitting
6137 \NC \NR
6138 \NC jobname               \NC string   \NC
6139 \NC
6140     if no input file name was given on the commandline, this key will be tested
6141     first instead of simply giving up
6142 \NC \NR
6143 \stoptabulate
6144
6145 Note: the numeric values that match web2c parameters are only used if \type
6146 {kpse_init} is explicitly set to \type {false}. In all other cases, the normal
6147 values from \type {texmf.cnf} are used.
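
A startup script could set some of these keys as in the following sketch. The
chosen values and the command list are examples only, and the script is assumed
to be passed to the executable with the \type {--lua} option:

\starttyping
-- report errors in file:line style and stop at the first one
texconfig.file_line_error = true
texconfig.halt_on_error   = true

-- only allow a restricted set of commands in \write18
texconfig.shell_escape          = 'p'
texconfig.shell_escape_commands = 'kpsewhich,bibtex'
\stoptyping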
6148
6149 \section{The \type {texio} library}
6150
6151 This library takes care of the low|-|level I/O interface.
6152
6153 \subsection{Printing functions}
6154
6155 \subsubsection{\type {texio.write}}
6156
6157 \startfunctioncall
6158 texio.write(<string> target, <string> s, ...)
6159 texio.write(<string> s, ...)
6160 \stopfunctioncall
6161
6162 Without the \type {target} argument, writes all given strings to the same
6163 location(s) \TEX\ writes messages to at this moment. If \type {\batchmode} is in
6164 effect, it writes only to the log, otherwise it writes to the log and the
6165 terminal. The optional \type {target} can be one of three possibilities: \type
6166 {term}, \type {log} or \type {term and log}.
6167
6168 Note: If several strings are given, and if the first of these strings is or might
6169 be one of the targets above, the \type {target} must be specified explicitly to
6170 prevent \LUA\ from interpreting the first string as the target.
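
As a small illustration of the note above, the following sketch (the message
texts are made up for this example) passes the target explicitly, so that a
message that happens to start with \type {log} cannot be mistaken for a target:

\starttyping
-- explicit target: only the log file receives this text, and the
-- second "log" is printed as part of the message
texio.write("log", "log", " is the first word of this message")

-- no target: the current destination(s) of TeX messages are used
texio.write("processing ", "chapter ", "one")
\stoptyping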
6171
6172 \subsubsection{\type {texio.write_nl}}
6173
6174 \startfunctioncall
6175 texio.write_nl(<string> target, <string> s, ...)
6176 texio.write_nl(<string> s, ...)
6177 \stopfunctioncall
6178
6179 This function behaves like \type {texio.write}, but makes sure that the given
6180 strings will appear at the beginning of a new line. You can pass a single empty
6181 string if you only want to move to the next line.
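
A minimal sketch of how the two printing functions can be combined; the message
text and the name \type {mymodule} are only examples:

\starttyping
-- start on a fresh line in the log and on the terminal
texio.write_nl("term and log", "mymodule: loading tables")
-- continue on that same line
texio.write("term and log", " ... done")
-- a single empty string only moves to the next line
texio.write_nl("")
\stoptyping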
6182
6183 \subsubsection{\type {texio.setescape}}
6184
6185 You can disable \type {^^} escaping of control characters by passing a value of
6186 zero.
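
Assuming the value is passed as a plain number, as the description above
suggests, a call could look like this:

\starttyping
-- disable ^^ escaping of control characters in the output
texio.setescape(0)
\stoptyping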
6187
6188 % \section[luatokens]{The \type {oldtoken} library (obsolete)}
6189 %
6190 % {\em Nota Bene: This library will disappear soon. It is replaced by the \type
6191 % {token} library, that used to be called \type {newtoken}.}
6192 %
6193 % The \type {token} table contains interface functions to \TEX's handling of
6194 % tokens. These functions are most useful when combined with the \type
6195 % {token_filter} callback, but they could be used standalone as well.
6196 %
6197 % A token is represented in \LUA\ as a small table. For the moment, this table
6198 % consists of three numeric entries:
6199 %
6200 % \starttabulate[|l|l|p|]
6201 % \NC \bf index \NC \bf meaning         \NC \bf description \NC \NR
6202 % \NC 1         \NC command code        \NC this is a value between~$0$ and~$130$ (approximately)\NC \NR
6203 % \NC 2         \NC command modifier    \NC this is a value between~$0$ and~$2^{21}$ \NC \NR
6204 % \NC 3         \NC control sequence id \NC for commands that are not the result of control
6205 %                                           sequences, like letters and characters, it is zero,
6206 %                                           otherwise, it is a number pointing into the \quote
6207 %                                           {equivalence table} \NC \NR
6208 % \stoptabulate
6209 %
6210 % \subsection{\type {oldtoken.get_next}}
6211 %
6212 % \startfunctioncall
6213 % token t = oldtoken.get_next()
6214 % \stopfunctioncall
6215 %
6216 % This fetches the next input token from the current input source, without
6217 % expansion.
6218 %
6219 % \subsection{\type {oldtoken.is_expandable}}
6220 %
6221 % \startfunctioncall
6222 % <boolean> b = oldtoken.is_expandable(<token> t)
6223 % \stopfunctioncall
6224 %
6225 % This tests if the token \type {t} could be expanded.
6226 %
6227 % \subsection{\type {oldtoken.expand}}
6228 %
6229 % \startfunctioncall
6230 % oldtoken.expand(<token> t)
6231 % \stopfunctioncall
6232 %
6233 % If a token is expandable, this will expand one level of it, so that the first
6234 % token of the expansion will now be the next token to be read by \type
6235 % {oldtoken.get_next()}.
6236 %
6237 % \subsection{\type {oldtoken.is_activechar}}
6238 %
6239 % \startfunctioncall
6240 % <boolean> b = oldtoken.is_activechar(<token> t)
6241 % \stopfunctioncall
6242 %
6243 % This is a special test that is sometimes handy. Discovering whether some control
6244 % sequence is the result of an active character turned out to be very hard
6245 % otherwise.
6246 %
6247 % \subsection{\type {oldtoken.create}}
6248 %
6249 % \startfunctioncall
6250 % token t = oldtoken.create(<string> csname)
6251 % token t = oldtoken.create(<number> charcode)
6252 % token t = oldtoken.create(<number> charcode, <number> catcode)
6253 % \stopfunctioncall
6254 %
6255 % This is the token factory. If you feed it a string, then it is the name of a
6256 % control sequence (without leading backslash), and it will be looked up in the
6257 % equivalence table.
6258 %
6259 % If you feed it number, then this is assumed to be an input character, and an
6260 % optional second number gives its category code. This means it is possible to
6261 % overrule a character's category code, with a few exceptions: the category codes~0
6262 % (escape), 9~(ignored), 13~(active), 14~(comment), and 15 (invalid) cannot occur
6263 % inside a token. The values~0, 9, 14 and~15 are therefore illegal as input to
6264 % \type {oldtoken.create()}, and active characters will be resolved immediately.
6265 %
6266 % Note: unknown string sequences and never defined active characters will result in
6267 % a token representing an \quote {undefined control sequence} with a near|-|random
6268 % name. It is {\em not} possible to define brand new control sequences using
6269 % \type {oldtoken.create}!
6270 %
6271 % \subsection{\type {oldtoken.command_name}}
6272 %
6273 % \startfunctioncall
6274 % <string> commandname = oldtoken.command_name(<token> t)
6275 % \stopfunctioncall
6276 %
6277 % This returns the name associated with the \quote {command} value of the token in
6278 % \LUATEX. There is not always a direct connection between these names and
6279 % primitives. For instance, all \type {\ifxxx} tests are grouped under \type
6280 % {if_test}, and the \quote {command modifier} defines which test is to be run.
6281 %
6282 % \subsection{\type {oldtoken.command_id}}
6283 %
6284 % \startfunctioncall
6285 % <number> i = oldtoken.command_id(<string> commandname)
6286 % \stopfunctioncall
6287 %
6288 % This returns a number that is the inverse operation of the previous command, to
6289 % be used as the first item in a token table.
6290 %
6291 % \subsection{\type {oldtoken.csname_name}}
6292 %
6293 % \startfunctioncall
6294 % <string> csname = oldtoken.csname_name(<token> t)
6295 % \stopfunctioncall
6296 %
6297 % This returns the name associated with the \quote {equivalence table} value of the
6298 % token in \LUATEX. It returns the string value of the command used to create the
6299 % current token, or an empty string if there is no associated control sequence.
6300 %
6301 % Keep in mind that there are potentially two control sequences that return the
6302 % same csname string: single character control sequences and active characters have
6303 % the same \quote {name}.
6304 %
6305 % \subsection{\type {oldtoken.csname_id}}
6306 %
6307 % \startfunctioncall
6308 % <number> i = oldtoken.csname_id(<string> csname)
6309 % \stopfunctioncall
6310 %
6311 % This returns a number that is the inverse operation of the previous command, to
6312 % be used as the third item in a token table.
6313
6314 \subsection{The \type {token} library}
6315
6316 The current \type {token} library will be replaced by a new one that is more
6317 flexible and powerful. The transition takes place in steps. In version 0.80 we
6318 have \type {token} and in version 0.85 the old library will be replaced
6319 completely. So if you use this new mechanism in production code you need to be
6320 aware of incompatible updates between 0.80 and 0.90. Because the related input
6321 and output code will also be cleaned up and rewritten, you should expect
6322 incompatible changes in logging and error reporting too.
6323
6324 The old library presents tokens as triplets or numbers; the new library presents
6325 them as userdata objects. The old library used a callback to intercept tokens in
6326 the input, but the new library provides a basic scanner infrastructure that can be
6327 used to write macros that accept a wide range of arguments. This interface is kept
6328 general on purpose, and as performance is quite ok one can build additional
6329 parsers without too much overhead. It's up to macro package writers to see how
6330 they can benefit from this, as the main principle behind \LUATEX\ is to provide a
6331 minimal set of tools and no solutions.
6332
6333 The current functions in the \type {token} namespace are given in the next
6334 table:
6335
6336 \starttabulate[|lT|lT|p|]
6337 \NC \bf function \NC \bf argument       \NC \bf result \NC \NR
6338 \HL
6339 \NC is_token     \NC token              \NC checks if the given argument is a token userdatum \NC \NR
6340 \NC get_next     \NC                    \NC returns the next token in the input \NC \NR
6341 \NC scan_keyword \NC string             \NC returns true if the given keyword is gobbled \NC \NR
6342 \NC scan_int     \NC                    \NC returns a number \NC \NR
6343 \NC scan_dimen   \NC infinity, mu-units \NC returns a number representing a dimension, or two numbers being the filler and order \NC \NR
6344 \NC scan_glue    \NC mu-units           \NC returns a glue spec node \NC \NR
6345 \NC scan_toks    \NC definer, expand    \NC returns a table of tokens (this can become a linked list in later releases) \NC \NR
6346 \NC scan_code    \NC bitset             \NC returns a character if its category is in the given bitset (representing catcodes) \NC \NR
6347 \NC scan_string  \NC                    \NC returns a string given between \type {{}}, as \type {\macro} or as sequence of characters with catcode 11 or 12 \NC \NR
6348 \NC scan_word    \NC                    \NC returns a sequence of characters with catcode 11 or 12 as string \NC \NR
6349 \NC scan_csname  \NC                    \NC returns \type {foo} after scanning \type {\foo} \NC \NR
6350 \NC set_macro    \NC see below          \NC assign a macro \NC \NR
6351 \NC create       \NC                    \NC returns a userdata token object of the given control sequence name (or character); this interface can change  \NC \NR
6352 \stoptabulate
6353
6354 The scanners can be considered stable apart from the one scanning for a token
6355 list. This is because future releases can return a linked list instead of a table
6356 (as with nodes). The \type {scan_code} function takes an optional number, the
6357 \type {scan_keyword} function a normal \LUA\ string. The \type {infinity} boolean
6358 signals that we also permit \type {fill} as dimension and the \type {mu-units}
6359 flag tells the scanner that we expect math units. When scanning tokens we can
6360 indicate that we are defining a macro, in which case the result will also provide
6361 information about what arguments are expected; in the result this is separated
6362 from the meaning by a separator token. The \type {expand} flag determines if the
6363 list will be expanded.
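
The following sketch shows how these scanners can be combined to pick up
optional keywords. The names \type {\myrule} and \type {myrule} and the three
keywords are made up for this example, and the result is only written to the
log. Comments use the \TEX\ percent sign because \LUA\ line comments are not
safe inside \type {\directlua}, where line ends become spaces.

\starttyping
\directlua {
    function myrule()
        local width, height, count = 0, 0, 1
        % each keyword is optional; a dimension is returned as a number
        if token.scan_keyword("width") then
            width = token.scan_dimen()
        end
        if token.scan_keyword("height") then
            height = token.scan_dimen()
        end
        if token.scan_keyword("repeat") then
            count = token.scan_int()
        end
        texio.write_nl("log", "width=" .. width .. " height=" .. height .. " count=" .. count)
    end
}

\def\myrule{\directlua{myrule()}}

\myrule width 10pt height 2pt repeat 3
\stoptyping

The macro itself takes no arguments; everything after \type {\myrule} is
fetched from the input by the scanner calls.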
6364
6365 The string scanner scans for something between curly braces and expands on the
6366 way, or when it sees a control sequence it will return its meaning. Otherwise it
6367 will scan characters with catcode \type {letter} or \type {other}. So, given the
6368 following definition:
6369
6370 \startbuffer
6371 \def\bar{bar}
6372 \def\foo{foo-\bar}
6373 \stopbuffer
6374
6375 \typebuffer \getbuffer
6376
6377 we get:
6378
6379 \starttabulate[|l|Tl|l|]
6380 \NC \type {\directlua{token.scan_string()}{foo}} \NC \directlua{context("{\\red\\type {"..token.scan_string().."}}")} {foo} \NC full expansion \NR
6381 \NC \type {\directlua{token.scan_string()}foo}   \NC \directlua{context("{\\red\\type {"..token.scan_string().."}}")} foo   \NC letters and others \NR
6382 \NC \type {\directlua{token.scan_string()}\foo}  \NC \directlua{context("{\\red\\type {"..token.scan_string().."}}")}\foo   \NC meaning \NR
6383 \stoptabulate
6384
6385 The \type {\foo} case only gives the meaning, but one can pass an already
6386 expanded definition (\type {\edef}'d). In the case of the braced variant one can of
6387 course use the \type {\detokenize} and \type {\unexpanded} primitives as there we
6388 do expand.
6389
6390 The \type {scan_word} scanner can be used to implement for instance a number scanner:
6391
6392 \starttyping
6393 function token.scan_number(base)
6394     return tonumber(token.scan_word(),base)
6395 end
6396 \stoptyping
6397
6398 This scanner accepts any valid \LUA\ number so it is a way to pick up floats
6399 in the input.
6400
6401 The creator function can be used as follows:
6402
6403 \starttyping
6404 local t = token.create("relax")
6405 \stoptyping
6406
6407 This gives back a token object that has the properties of the \type {\relax}
6408 primitive. The possible properties of tokens are:
6409
6410 \starttabulate[|lT|p|]
6411 \NC command    \NC a number representing the internal command number \NC \NR
6412 \NC cmdname    \NC the type of the command (for instance the catcode in case of a
6413                    character or the classifier that determines the internal
6414                    treatment) \NC \NR
6415 \NC csname     \NC the associated control sequence (if applicable) \NC \NR
6416 \NC id         \NC the unique id of the token \NC \NR
6417 %NC tok        \NC \NC \NR % might change
6418 \NC active     \NC a boolean indicating the active state of the token \NC \NR
6419 \NC expandable \NC a boolean indicating if the token (macro) is expandable \NC \NR
6420 \NC protected  \NC a boolean indicating if the token (macro) is protected \NC \NR
6421 \stoptabulate
6422
6423 The numbers that represent a catcode are the same as in \TEX\ itself, so using
6424 this information assumes that you know a bit about \TEX's internals. The other
6425 numbers and names are used consistently but are not frozen. So, when you use them
6426 in comparisons it is best to query a known primitive or character first to see
6427 the values.
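
A sketch of such a query, meant for a \LUA\ file and using only the properties
listed above (the helper name \type {is_relax} is made up for this example):

\starttyping
local relax = token.create("relax")

-- report what this version uses for \relax
texio.write_nl("log", string.format("command=%s cmdname=%s csname=%s",
    tostring(relax.command), tostring(relax.cmdname), tostring(relax.csname)))

-- compare against the queried value instead of a hard coded number
local function is_relax(t)
    return token.is_token(t) and t.command == relax.command
end
\stoptyping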
6428
6429 More interesting are the scanners. You can use the \LUA\ interface as follows:
6430
6431 \starttyping
6432 \directlua {
6433     function mymacro(n)
6434         ...
6435     end
6436 }
6437
6438 \def\mymacro#1{%
6439     \directlua {
6440         mymacro(\number\dimexpr#1)
6441     }%
6442 }
6443
6444 \mymacro{12pt}
6445 \mymacro{\dimen0}
6446 \stoptyping
6447
6448 You can also do this:
6449
6450 \starttyping
6451 \directlua {
6452     function mymacro()
6453         local d = token.scan_dimen()
6454         ...
6455     end
6456 }
6457
6458 \def\mymacro{%
6459     \directlua {
6460         mymacro()
6461     }%
6462 }
6463
6464 \mymacro 12pt
6465 \mymacro \dimen0
6466 \stoptyping
6467
6468 It is quite clear from looking at the code what the first method needs as
6469 argument(s). For the second method you need to look at the \LUA\ code to see what
6470 gets picked up. Instead of passing from \TEX\ to \LUA\ we let \LUA\ fetch from
6471 the input stream.
6472
6473 In the first case the input is tokenized and then turned into a string when it's
6474 passed to \LUA\ where it gets interpreted. In the second case only a function
6475 call gets interpreted but then the input is picked up by explicitly calling the
6476 scanner functions. These return proper \LUA\ variables so no further conversion
6477 has to be done. This is more efficient but in practice (given what \TEX\ has to
6478 do) this effect should not be overestimated. For numbers and dimensions it saves a
6479 bit but for passing strings conversion to and from tokens has to be done anyway
6480 (although we can probably speed up the process in later versions if needed).
6481
6482 When the interface is stable and has replaced the old one completely we will add
6483 some more information here. By that time the internals will have been cleaned up a
6484 bit more, so we will know what stays and what goes. A positive side effect of this
6485 transition is that we can simplify the input part because we no longer need to
6486 intercept using callbacks.
6487
6488 The \type {set_macro} function can get up to 4 arguments:
6489
6490 \starttyping
6491 token.set_macro("csname","content")
6492 token.set_macro("csname","content","global")
6493 token.set_macro("csname")
6494 \stoptyping
6495
6496 You can pass a catcodetable identifier as first argument:
6497
6498 \starttyping
6499 token.set_macro(catcodetable,"csname","content")
6500 token.set_macro(catcodetable,"csname","content","global")
6501 token.set_macro(catcodetable,"csname")
6502 \stoptyping
6502 \stoptyping
6503
6504 The results are like:
6505
6506 \starttyping
6507  \def\csname{content}
6508 \gdef\csname{content}
6509  \def\csname{}
6510 \stoptyping
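
As a usage sketch, assuming the function is reached as \type {token.set_macro}
(the table above lists \type {set_macro} in the \type {token} namespace) and
with \type {\myname} as a made up macro name:

\starttyping
\directlua {
    token.set_macro("myname", "Hans", "global")
}

% from here on this should behave like \gdef\myname{Hans}
\myname
\stoptyping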
6511
6512 There is a (for now) experimental putter:
6513
6514 \starttyping
6515 local t1 = token.get_next()
6516 local t2 = token.get_next()
6517 local t3 = token.get_next()
6518 local t4 = token.get_next()
6519 -- watch out, we flush in sequence
6520 token.put_next { t1, t2 }
6521 -- but this one gets pushed in front
6522 token.put_next ( t3, t4 )
6523 \stoptyping
6524
6525 When we scan \type {wxyz!} we get \type {yzwx!} back. The argument is either a table
6526 with tokens or a list of tokens.
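
One possible use, given as a sketch only, is looking at the next token without
removing it from the input. The helper name \type {peek_next} is made up for
this example, and a single token is passed as a list of one:

\starttyping
local function peek_next()
    local t = token.get_next() -- take the token from the input
    token.put_next(t)          -- and push it back in front again
    return t
end
\stoptyping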
6527
6528 \stopchapter
6529
6530 \stopcomponent