1 % engine=luatex language=uk
2 % $Id$
4 % TODO: fix layout of function legend descriptions
5 % check numbers
6 % check \luatex command
8 \usemodule[newotf]
10 %\nopdfcompression
11 %\loggingall
12 \environment luatexref-env
13 \logo[DFONT] {dfont}
14 \logo[CFF] {cff}
15 \logo[CMAP] {CMap}
16 \logo[PATGEN] {patgen}
17 \logo[MP] {MetaPost}
18 \logo[METAPOST]{MetaPost}
19 \logo[MPLIB] {MPlib}
20 \logo[COCO] {coco}
21 \logo[SUNOS] {SunOS}
22 \logo[BSD] {bsd}
23 \logo[SYSV] {sysv}
24 \logo[DPI] {dpi}
26 \setvariables
27 [document]
28 [beta=0.80.1]
30 \starttext
32 \dontcomplain \nonknuthmode
34 \setups[titlepage]
36 \title{Contents}
38 \placecontent[criterium=text,level=subsection]
40 \chapter{Introduction}
42 \startframedtext[framecolor=red,foregroundcolor=red,width=\hsize,style=\tfa]
This book will eventually become the reference manual of \LUATEX. At the moment,
it simply reports the behavior of the executable matching the snapshot or beta
release date on the title page.
48 \blank
50 Features may come and go. The current version of \LUATEX\ is not meant for
51 production and users cannot depend on stability, nor on functionality staying the
52 same.
54 \blank
56 Nothing is considered stable just yet. This manual therefore simply reflects the
57 current state of the executable. {\bs Absolutely nothing\/} on the following
58 pages is set in stone. When the need arises, anything can (and will) be changed.
60 \stopframedtext
62 \blank[2*line]
64 \LUATEX\ consists of a number of interrelated but (still) distinguishable parts:
66 \startitemize[packed]
67 \item \PDFTEX\ version 1.40.9, converted to C (with patches from later releases).
68 \item The direction model and some other bits from \ALEPH\ RC4 converted to C.
69 \item \LUA\ 5.2.1
70 \item dedicated \LUA\ libraries
71 \item various \TEX\ extensions
72 \item parts of \FONTFORGE\ 2008.11.17
73 \item the \METAPOST\ library
74 \item newly written compiled source code to glue it all together
75 \stopitemize
Neither \ALEPH's I/O translation processes, nor tcx files, nor \ENCTEX\ can be
used; these encoding|-|related functions are superseded by a \LUA|-|based
solution (reader callbacks). Also, some experimental \PDFTEX\ features have been
removed. These can be implemented in \LUA\ instead.
82 \chapter{Basic \TEX\ enhancements}
84 \section{Introduction}
86 From day one, \LUATEX\ has offered extra functionality when compared to the
87 superset of \PDFTEX\ and \ALEPH. That has not been limited to the possibility to
88 execute \LUA\ code via \type {\directlua}, but \LUATEX\ also adds functionality
89 via new \TEX-side primitives.
91 However, starting with beta \type {0.39.0}, most of that functionality is hidden
92 by default. When \LUATEX\ 0.40.0 starts up in \quote {iniluatex} mode (\type
93 {luatex -ini}), it defines only the primitive commands known by \TEX82 and the
94 one extra command \type {\directlua}.
96 As is fitting, a \LUA\ function has to be called to add the extra primitives to
97 the user environment. The simplest method to get access to all of the new
98 primitive commands is by adding this line to the format generation file:
\starttyping
\directlua { tex.enableprimitives('',tex.extraprimitives()) }
\stoptyping

But be aware that the curly braces may not have the proper \type{\catcode}
assigned to them at this early time (giving a \quote {Missing number} error), so
you may need to put these assignments

\starttyping
\catcode `\{=1
\catcode `\}=2
\stoptyping

before the above line. More fine|-|grained control over the primitives is
possible; you can look up the details in \in{section}[luaprimitives]. For
simplicity's sake, this manual assumes that you have executed the \type
{\directlua} command as given above.
118 The startup behavior documented above is considered stable in the sense that
119 there will not be backward-incompatible changes any more.
121 \section{Version information}
123 There are three new primitives to test the version of \LUATEX:
\starttabulate[|l|p|p|]
\NC \bf primitive \NC \bf explanation \NC \bf value \NC \NR
\NC \tex{luatexbanner} \NC the banner as reported on the command line \NC \luatexbanner \NC \NR
\NC \tex{luatexversion} \NC a combination of major and minor number \NC \the\luatexversion \NC \NR
\NC \tex{luatexrevision} \NC the revision number \NC \luatexrevision \NC \NR
\stoptabulate
132 The official \LUATEX\ version is defined as follows:
134 \startitemize
135 \startitem
136 The major version is the integer result of \tex {luatexversion} divided by
137 100. The primitive is an \quote {internal variable}, so you may need to prefix
138 its use with \type {\the} depending on the context.
139 \stopitem
140 \startitem
141 The minor version is the two-digit result of \tex {luatexversion} modulo 100.
142 \stopitem
143 \startitem
The revision is given by \tex {luatexrevision}. This primitive expands to
a positive integer.
146 \stopitem
147 \startitem
148 The full version number consists of the major version, minor version and
149 revision, separated by dots.
150 \stopitem
151 \stopitemize
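
As a minimal sketch of the rules above, the following plain \TEX\ code assembles
the full version number. The macro names \type {\luatexmajor} and \type
{\luatexminor} are made up for this illustration, and \type {\count255} is used
as a scratch register. Integer division with \type {\divide} truncates, which is
what is needed here.

\starttyping
\count255=\luatexversion
\divide\count255 by 100
\edef\luatexmajor{\the\count255}   % integer part of version / 100
\multiply\count255 by -100
\advance\count255 by \luatexversion
\edef\luatexminor{\the\count255}   % version modulo 100
\message{This is LuaTeX \luatexmajor.\luatexminor.\luatexrevision}
\stoptyping

Note that this sketch does not pad a single|-|digit minor version with a leading
zero.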
153 \section{\UNICODE\ text support}
155 Text input and output is now considered to be \UNICODE\ text, so input characters
156 can use the full range of \UNICODE\ ($2^{20}+2^{16}-1 = \hbox{0x10FFFF}$).
158 Later chapters will talk of characters and glyphs. Although these are not
159 interchangeable, they are closely related. During typesetting, a character is
160 always converted to a suitable graphic representation of that character in a
161 specific font. However, while processing a list of to|-|be|-|typeset nodes, its
162 contents may still be seen as a character. Inside \LUATEX\ there is not yet a
163 clear separation between the two concepts. Until this is implemented, please do
164 not be too harsh on us if we make errors in the usage of the terms.
A few primitives are affected by this, all in a similar fashion: each of them has
to accommodate a larger range of acceptable numbers. For instance, \tex
{char} now accepts values between~0 and $1{,}114{,}111$. This should not be a
problem for well|-|behaved input files, but it could create incompatibilities for
input that would have generated an error when processed by older \TEX|-|based
engines. The affected commands with an altered initial (left of the equals sign)
or secondary (right of the equals sign) value are: \tex {char}, \tex {lccode},
\tex {uccode}, \tex {catcode}, \tex {sfcode}, \tex {efcode}, \tex {lpcode}, \tex
{rpcode}, \tex {chardef}.
176 As far as the core engine is concerned, all input and output to text files is
177 \UTF-8 encoded. Input files can be pre|-|processed using the \luatex {reader}
178 callback. This will be explained in a later chapter.
Output in byte|-|sized chunks can be achieved by using characters just outside of
the valid \UNICODE\ range, starting at the value $1{,}114{,}112$ (0x110000). When
the time comes to print a character $c \geq 1{,}114{,}112$, \LUATEX\ will actually
print the single byte corresponding to $c - 1{,}114{,}112$.
185 Output to the terminal uses \type {^^} notation for the lower control range
186 ($c<32$), with the exception of \type {^^I}, \type {^^J} and \type {^^M}. These
187 are considered \quote {safe} and therefore printed as-is.
189 Normalization of the \UNICODE\ input can be handled by a macro package during
190 callback processing (this will be explained in \in{section}[iocallback]).
192 \section{Extended tables}
194 All traditional \TEX\ and \ETEX\ registers can be 16-bit numbers as in \ALEPH.
195 The affected commands are:
\startcolumns[n=4]
\starttyping
\count
\dimen
\skip
\muskip
\marks
\toks
\countdef
\dimendef
\skipdef
\muskipdef
\toksdef
\insert
\box
\unhbox
\unvbox
\copy
\unhcopy
\unvcopy
\wd
\ht
\dp
\setbox
\vsplit
\stoptyping
\stopcolumns
The glyph properties (like \type {\efcode}) introduced in \PDFTEX\ that deal with
font expansion (hz) and character protruding are also 16-bit. Because font memory
management has been rewritten, these character properties are no longer shared
among font instances that originate from the same metric file.
230 The behavior documented in the above section is considered stable in the sense
231 that there will not be backward-incompatible changes any more.
233 \section{Attribute registers}
235 Attributes are a completely new concept in \LUATEX. Syntactically, they behave a
236 lot like counters: attributes obey \TEX's nesting stack and can be used after
237 \tex {the} etc.\ just like the normal \tex {count} registers.
239 \startsyntax
240 \attribute <16-bit number> <optional equals> <32-bit number>!crlf
241 \attributedef <csname> <optional equals> <16-bit number>
242 \stopsyntax
Conceptually, an attribute is either \quote {set} or \quote {unset}. Unset
attributes have a special negative value to indicate that they are unset. That
value is the lowest legal value: \type {-"7FFFFFFF} in hexadecimal, a.k.a.
$-2147483647$ in decimal. It follows that the value \type {-"7FFFFFFF} cannot be
used as a legal attribute value, but you {\it can\/} assign \type {-"7FFFFFFF} to
\quote {unset} an attribute. All attributes start out in this \quote {unset}
state in \INITEX\ (prior to 0.37, there could not be valid negative attribute
values, and the \quote {unset} value was $-1$).
253 Attributes can be used as extra counter values, but their usefulness comes mostly
254 from the fact that the numbers and values of all \quote {set} attributes are
255 attached to all nodes created in their scope. These can then be queried from any
256 \LUA\ code that deals with node processing. Further information about how to use
257 attributes for node list processing from \LUA\ is given in~\in {chapter}[nodes].
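
A small sketch of how this might look in practice. The name \type
{\myattribute} and the register number 100 are arbitrary choices for this
example, not something defined by \LUATEX:

\starttyping
\attributedef\myattribute=100
\myattribute=7                 % set: nodes created from here on carry 100=7
\hbox{marked}
\directlua{tex.print(tex.attribute[100])} % the current value, seen from Lua
\myattribute=-"7FFFFFFF        % unset again
\hbox{unmarked}
\stoptyping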
259 The behavior documented in the above subsection is considered stable in the sense
260 that there will not be backward-incompatible changes any more.
262 \subsection{Box attributes}
264 Nodes typically receive the list of attributes that is in effect when they are
265 created. This moment can be quite asynchronous. For example: in paragraph
266 building, the individual line boxes are created after the \tex {par} command has
267 been processed, so they will receive the list of attributes that is in effect
268 then, not the attributes that were in effect in, say, the first or third line of
269 the paragraph.
271 Similar situations happen in \LUATEX\ regularly. A few of the more obvious
272 problematic cases are dealt with: the attributes for nodes that are created
273 during hyphenation, kerning and ligaturing borrow their attributes from their
274 surrounding glyphs, and it is possible to influence box attributes directly.
276 When you assemble a box in a register, the attributes of the nodes contained in
277 the box are unchanged when such a box is placed, unboxed, or copied. In this
278 respect attributes act the same as characters that have been converted to
279 references to glyphs in fonts. For instance, when you use attributes to implement
280 color support, each node carries information about its eventual color. In that
281 case, unless you implement mechanisms that deal with it, applying a color to
282 already boxed material will have no effect. Keep in mind that this
283 incompatibility is mostly due to the fact that separate specials and literals are
284 a more unnatural approach to colors than attributes.
286 It is possible to fine-tune the list of attributes that are applied to a \type
287 {hbox}, \type {vbox} or \type {vtop} by the use of the keyword \type {attr}. An
288 example:
\starttyping
\attribute2=5
\setbox0=\hbox {Hello}
\setbox2=\hbox attr1=12 attr2=-"7FFFFFFF{Hello}
\stoptyping
296 This will set the attribute list of box~2 to $1=12$, and the attributes of box~0
297 will be $2=5$. As you can see, assigning the maximum negative value causes an
298 attribute to be ignored.
300 The \type {attr} keyword(s) should come before a \type {to} or \type {spread}, if
301 that is also specified.
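
For example, with an arbitrarily chosen attribute number~3, a box with an
explicit width could be specified like this (a sketch only):

\starttyping
\setbox4=\hbox attr3=1 to 5cm{Hello\hss}
\stoptyping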
303 \section{\LUA\ related primitives}
305 In order to merge \LUA\ code with \TEX\ input, a few new primitives are needed.
307 \subsection{\tex{directlua}}
309 The primitive \tex {directlua} is used to execute \LUA\ code immediately. The
310 syntax is
312 \startsyntax
313 \directlua <general text>!crlf
314 \directlua name <general text> <general text>!crlf
315 \directlua <16-bit number> <general text>
316 \stopsyntax
318 The last \syntax {<general text>} is expanded fully, and then fed into the \LUA\
319 interpreter. After reading and expansion has been applied to the \syntax
320 {<general text>}, the resulting token list is converted to a string as if it was
321 displayed using \type {\the\toks}. On the \LUA\ side, each \type {\directlua}
322 block is treated as a separate chunk. In such a chunk you can use the \type
323 {local} directive to keep your variables from interfering with those used by the
324 macro package.
The conversion to and from a token list means that you normally cannot use \LUA\
line comments (starting with \type {--}) within the argument. As there typically
will be only one \quote {line}, the first line comment will run on until the end
of the input. You will either need to use \TEX|-|style line comments (starting
with \%), or change the \TEX\ category codes locally. Another possibility is to
say:

\starttyping
\begingroup
\endlinechar=10
\directlua ...
\endgroup
\stoptyping
340 Then \LUA\ line comments can be used, since \TEX\ does not replace line endings
341 with spaces.
The \syntax {name <general text>} specifies the name of the \LUA\ chunk, mainly
shown in the stack backtrace of error messages created by \LUA\ code. The \syntax
{<general text>} is expanded fully, thus macros can be used to generate the chunk
name, e.g.

\starttyping
\directlua name{\jobname:\the\inputlineno} ...
\stoptyping

to include the name of the input file as well as the input line in the chunk
name.
355 Likewise, the \syntax {<16-bit number>} designates a name of a \LUA\ chunk, but
356 in this case the name will be taken from the \type {lua.name} array (see the
357 documentation of the \type {lua} table further in this manual). This syntax is
358 new in version 0.36.0.
360 The chunk name should not start with a \type {@}, or it will be displayed as a
361 file name (this is a quirk in the current \LUA\ implementation).
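
A sketch of the numeric form, with an arbitrary slot number and chunk name chosen
for this example:

\starttyping
\directlua{lua.name[1] = "mysetup"}
\directlua 1 {tex.print("printed by the chunk named mysetup")}
\stoptyping

Errors raised in the second chunk should then be reported under the name \type
{mysetup}.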
The \tex {directlua} command is expandable. Since it passes \LUA\ code to the
\LUA\ interpreter, its expansion from the \TEX\ viewpoint is usually empty.
However, there are some \LUA\ functions that produce material to be read by \TEX,
the so|-|called print functions. The simplest use of these is \type
{tex.print(<string> s)}. The characters of the string \type {s} will be placed on
the \TEX\ input buffer, that is, \quote {before \TEX's eyes}, to be read by \TEX\
immediately. For example:

\startbuffer
\count10=20
a\directlua{tex.print(tex.count[10]+5)}b
\stopbuffer

\typebuffer

expands to

\getbuffer

Here is another example:

\startbuffer
$\pi = \directlua{tex.print(math.pi)}$
\stopbuffer

\typebuffer

will result in

\getbuffer

Note that the expansion of \tex {directlua} is a sequence of characters, not of
tokens, contrary to all \TEX\ commands. So formally speaking its expansion is
null, but it places material on a pseudo-file to be immediately read by \TEX,
like \ETEX's \tex {scantokens}.
399 For a description of print functions look at \in{section~}[sec:luaprint].
401 Because the \syntax {<general text>} is a chunk, the normal \LUA\ error handling
402 is triggered if there is a problem in the included code. The \LUA\ error messages
403 should be clear enough, but the contextual information is still pretty bad.
404 Often, you will only see the line number of the right brace at the end of the
405 code.
While on the subject of errors: some of the things you can do inside \LUA\ code
can break \LUATEX\ quite badly. If you are not careful while working with the
node list interface, you may even end up with assertion errors from within the
\TEX\ portion of the executable.
412 The behavior documented in the above subsection is considered stable in the sense
413 that there will not be backward-incompatible changes any more.
415 \subsection{\tex{luafunction}}
The \type {\directlua} command involves tokenization of its argument (after
picking up an optional name or number specification). The token list is then
converted into a string and given to \LUA\ to turn into a function that is
called. The overhead is rather small but when you use this primitive hundreds or
thousands of times, it can become noticeable. For this reason there is a variant
call available: \type {\luafunction}. This command is used as follows:

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
    t[2] = function() tex.print("?") end
}

\luafunction1
\luafunction2
\stoptyping

The functions can also be defined in a separate file. There is no limit on the
number of functions apart from normal \LUA\ limitations. The functions cannot
take arguments from the \TEX\ side, because that would involve parsing and
thereby give no gain. The function, when called, in fact gets one argument,
being the index, so after

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[8] = function(slot) tex.print(slot) end
}
\stoptyping

a call to \type {\luafunction8} typesets the number \type {8}.
450 \subsection{\tex{latelua}}
\tex{latelua} stores \LUA\ code in a whatsit that will be processed
at the time of shipping out. Its intended use is a cross between
\tex{pdfliteral} and \tex{write}.
Within the \LUA\ code you can print \PDF\
statements directly to the \PDF\ file via \type{pdf.print},
or you can write to other output streams via \type{texio.write}
or simply using \LUA's I/O routines.
460 \startsyntax
461 \latelua <general text>!crlf
462 \latelua name <general text> <general text>!crlf
463 \latelua <16-bit number> <general text>
464 \stopsyntax
Expansion of macros etcetera in the final \type{<general text>} is delayed
until just before the whatsit is executed (like in \tex{write}). With
regard to the \PDF\ output stream, \tex{latelua} behaves as \tex{pdfliteral page}.

The \syntax{name <general text>} and \syntax{<16-bit number>} behave
in the same way as they do for \type{\directlua}.
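
As an illustration, the following sketch writes a line to the log and the
terminal when the page containing the whatsit is shipped out. It assumes that
\type {\count0} holds the page number, as in plain \TEX, and it uses \type
{texio.write_nl} for the output:

\starttyping
\latelua{texio.write_nl("shipping out page " .. tex.count[0])}
\stoptyping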
473 \subsection{\tex{luaescapestring}}
475 This primitive converts a \TEX\ token sequence so that it can be
476 safely used as the contents of a \LUA\ string: embedded backslashes,
477 double and single quotes, and newlines and carriage returns are
478 escaped. This is done by prepending an extra token consisting of a
479 backslash with category code~12, and for the line endings,
480 converting them to \type{n} and \type{r} respectively. The token
481 sequence is fully expanded.
483 \startsyntax
484 \luaescapestring <general text>
485 \stopsyntax
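
A minimal sketch of the intended use; the macro \type {\mytext} is a made|-|up
example:

\starttyping
\def\mytext{He said "hello"}
\directlua{texio.write_nl("\luaescapestring{\mytext}")}
\stoptyping

Without \type {\luaescapestring}, the embedded double quotes would end the \LUA\
string prematurely.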
Most often, this command is not actually the best way to deal with the
differences between \TEX\ and \LUA. In very short bits of \LUA\
code it is often not needed, and for longer stretches of \LUA\ code it
is easier to keep the code in a separate file and load it using \LUA's
\type{dofile}:

\starttyping
\directlua { dofile('mysetups.lua')}
\stoptyping
498 \section{New \ETEX\ primitives}
500 \subsection{\tex{clearmarks}}
502 This primitive clears a mark class completely, resetting all three
503 connected mark texts to empty.
505 \startsyntax
506 \clearmarks <16-bit number>
507 \stopsyntax
509 \subsection{\tex{noligs} and \tex{nokerns}}
511 These primitives prohibit ligature and kerning insertion at the time
512 when the initial node list is built by \LUATEX's main control loop.
513 They are part of a temporary trick and will be removed in the near
514 future. For now, you need to enable these primitives when you want to
515 do node list processing of \quote{characters}, where \TEX's normal
516 processing would get in the way.
518 \startsyntax
519 \noligs <integer>!crlf
520 \nokerns <integer>
521 \stopsyntax
523 These primitives can now be implemented by overloading the ligature
524 building and kerning functions, i.e.\ by assigning dummy functions
525 to their associated callbacks.
527 \subsection{\tex{formatname}}
529 \tex{formatname}'s syntax is identical to \tex{jobname}.
531 In \INITEX, the expansion is empty. Otherwise, the expansion is the
532 value that \tex{jobname} had during the \INITEX\ run that dumped the
533 currently loaded format.
535 \subsection{\tex{scantextokens}}
537 The syntax of \tex{scantextokens} is identical to \tex{scantokens}.
538 This primitive is a slightly adapted version of \ETEX's \tex{scantokens}. The
539 differences are:
541 \startitemize
542 \item The last (and usually only) line does not have a
543 \tex{endlinechar} appended
544 \item \tex{scantextokens} never raises an EOF error,
545 and it does not execute \tex{everyeof} tokens.
546 \item The \quote{\unknown\ while end of file \unknown} error tests are not executed, allowing
547 the expansion to end on a different grouping level or while a
548 conditional is still incomplete.
549 \stopitemize
\subsection {Verbose versions of single-character alignment commands (0.45)}

\LUATEX\ defines two new primitives that have the same function as
\type{#} and \type{&} in alignments:
556 \starttabulate[|l|l|l|l|]
557 \NC \bf primitive \NC \bf explanation \NC\NR
558 \NC \tex{alignmark} \NC Duplicates the functionality of \char`\#~%
559 inside alignment preambles\NC\NR
560 \NC \tex{aligntab} \NC Duplicates the functionality of \char`\&~%
561 inside alignments (and preambles)\NC\NR
562 \stoptabulate
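
A sketch of a plain \TEX\ alignment written with these primitives instead of the
single characters:

\starttyping
\halign{\alignmark\hfil\aligntab\quad\alignmark\cr
  one\aligntab two\cr
  three\aligntab four\cr}
\stoptyping

This should be equivalent to the same \type {\halign} written with \type {#} and
\type {&}.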
565 \subsection{Catcode tables}
567 Catcode tables are a new feature that allows you to switch to a
568 predefined catcode regime in a single statement. You can have a
569 practically unlimited number of different tables.
571 The subsystem is backward compatible: if you never use the following
572 commands, your document will not notice any difference in behavior
573 compared to traditional \TEX.
The contents of each catcode table are independent of any other
catcode table, and they are stored in and retrieved from the
format file.
579 \subsubsection{\tex{catcodetable}}
581 \startsyntax
582 \catcodetable <15-bit number>
583 \stopsyntax
585 The primitive \tex{catcodetable} switches to a different catcode table.
586 Such a table has to be previously created using one of the two
587 primitives below, or it has to be zero. Table zero is initialized by
588 \INITEX.
590 \subsubsection{\tex{initcatcodetable}}
592 \startsyntax
593 \initcatcodetable <15-bit number>
594 \stopsyntax
596 The primitive \tex{initcatcodetable} creates a new table with catcodes
597 identical to those defined by \INITEX:
599 \starttabulate[|l|l|l|l|l|]
600 \NC~0\NC \tt\letterbackslash \NC \NC \tt escape \NC\NR
601 \NC~5\NC \tt\letterhat\letterhat M \NC return \NC \tt car{\_}ret \NC (this name may change) \NC\NR
602 \NC~9\NC \tt\letterhat\letterhat @ \NC null \NC \tt ignore \NC\NR
603 \NC10\NC \tt <space> \NC space \NC \tt spacer \NC\NR
604 \NC11\NC {\tt a} -- {\tt z} \NC \NC \tt letter \NC\NR
605 \NC11\NC {\tt A} -- {\tt Z} \NC \NC \tt letter \NC\NR
606 \NC12\NC everything else \NC \NC \tt other \NC\NR
607 \NC14\NC \tt\letterpercent \NC \NC \tt comment \NC\NR
608 \NC15\NC \tt\letterhat\letterhat ? \NC delete \NC \tt invalid{\_}char \NC\NR
609 \stoptabulate
611 The new catcode table is allocated globally: it will not go away after
612 the current group has ended. If the supplied number is identical to
613 the currently active table, an error is raised.
615 \subsubsection{\tex{savecatcodetable}}
617 \startsyntax
618 \savecatcodetable <15-bit number>
619 \stopsyntax
621 \tex{savecatcodetable} copies the current set of catcodes to a
622 new table with the requested number. The definitions in this new table
623 are all treated as if they were made in the outermost level.
625 The new table is allocated globally: it will not go away after the
626 current group has ended. If the supplied number is the currently
627 active table, an error is raised.
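
The following sketch shows how the three primitives might be combined. The table
numbers 10 and~11 are arbitrary and are assumed to be unused and not currently
active, which is something a real macro package would have to manage itself. The
groups are meant to keep both the \type {\catcode} change and the table switch
local, on the assumption that \type {\catcodetable} follows the normal grouping
rules.

\starttyping
\initcatcodetable 10        % table 10: the pristine INITEX regime
\begingroup
  \catcode`\_=12            % make the underscore an ordinary character
  \savecatcodetable 11      % table 11: the current regime, saved globally
\endgroup
\begingroup
  \catcodetable 11          % switch regimes in a single statement
  file_name
\endgroup
\stoptyping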
629 \subsection{\tex{suppressfontnotfounderror} (0.11)}
631 \startsyntax
632 \suppressfontnotfounderror = 1
633 \stopsyntax
635 If this new integer parameter is non|-|zero, then \LUATEX\ will not
636 complain about font metrics that are not found. Instead it will
637 silently skip the font assignment, making the requested csname for the
638 font \tex{ifx} equal to \tex{nullfont}, so that it can be tested
639 against that without bothering the user.
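
A sketch of such a test; the font name is deliberately bogus and \type
{\testfont} is a made|-|up control sequence:

\starttyping
\suppressfontnotfounderror=1
\font\testfont=nosuchfont10
\ifx\testfont\nullfont
  \message{font not found, using a fallback}
\else
  \testfont
\fi
\stoptyping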
641 \subsection{\tex{suppresslongerror} (0.36)}
643 \startsyntax
644 \suppresslongerror = 1
645 \stopsyntax
647 If this new integer parameter is non|-|zero, then \LUATEX\ will not
648 complain about \type{\par} commands encountered in contexts where
649 that is normally prohibited (most prominently in the arguments
650 of non-long macros).
652 \subsection{\tex{suppressifcsnameerror} (0.36)}
654 \startsyntax
655 \suppressifcsnameerror = 1
656 \stopsyntax
If this new integer parameter is non|-|zero, then \LUATEX\ will not
complain about non-expandable commands appearing in the middle of a
\type{\ifcsname} expansion. Instead, it will keep getting expanded
tokens from the input until it encounters an \type{\endcsname}
command. Use with care! This command is experimental: if the input
expansion is unbalanced with respect to \type{\csname} \ldots \type{\endcsname}
pairs, the \LUATEX\ process may hang indefinitely.
667 \subsection{\tex{suppressoutererror} (0.36)}
669 \startsyntax
670 \suppressoutererror = 1
671 \stopsyntax
673 If this new integer parameter is non|-|zero, then \LUATEX\ will not
674 complain about \type{\outer} commands encountered in contexts where
675 that is normally prohibited.
The addition of this command coincides with a change in the
\LUATEX\ engine: ever since the snapshot of 20060915, \type{\outer}
was simply ignored. That behavior has now been reverted to be
\TEX82-compatible by default.
684 \subsection{\tex{suppressmathparerror} (0.80)}
The following setting will permit \type{\par} tokens in a math formula:

\startsyntax
\suppressmathparerror = 1
\stopsyntax

The following code is then valid:

\starttyping
$ x + 1 =

a $
\stoptyping
700 \subsection{\tex{matheqnogapstep} (0.81)}
702 By default \TEX\ will add one quad between the equation and the number. This
703 is hardcoded. A new primitive can control this:
705 \startsyntax
706 \matheqnogapstep = 1000
707 \stopsyntax
Because a math quad from the math text font is used instead of a dimension, we
use a step to control the size. A value of zero will suppress the gap. The step
is divided by 1000, which is the usual way to mimic floating point factors in
\TEX.
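
As a worked example, the gap equals the step divided by 1000, times one math
quad, so the following sketch should give half a quad between the equation and
its number:

\starttyping
\matheqnogapstep=500   % 500/1000 = 0.5 math quad
$$ a = b \eqno(1) $$
\stoptyping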
714 \subsection{\tex{outputbox} (0.37)}
716 \startsyntax
717 \outputbox = 65535
718 \stopsyntax
720 This new integer parameter allows you to alter the number of the box
721 that will be used to store the page sent to the output routine. Its default
722 value is 255, and the acceptable range is from 0 to 65535.
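
A sketch of an output routine that follows this parameter; box number 300 is an
arbitrary choice for this example:

\starttyping
\outputbox=300
\output={\shipout\box\outputbox}
\stoptyping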
726 \subsection{\tex{fontid}}
728 \startsyntax
729 \fontid\font
730 \stopsyntax
732 This primitive expands into a number. It is not a register so there is no need to
733 prefix with \type {\number} (and using \type {\the} gives an error). The currently
734 used font id is \fontid\font. Here are some more:
736 \starttabulate[|l|c|]
737 \NC \type {\bf} \NC \bf \fontid\font \NC \NR
738 \NC \type {\it} \NC \it \fontid\font \NC \NR
739 \NC \type {\bi} \NC \bi \fontid\font \NC \NR
740 \stoptabulate
These numbers depend on the macro package used because each one has its own way
of dealing with fonts. They can also differ per run, as they can depend on the
order of loading fonts. For instance, when in \CONTEXT\ virtual math \UNICODE\
fonts are used, we can easily get over a hundred ids in use. Not all ids have to
be bound to a real font; after all, an id is just a number.
751 \subsection{Font syntax}
753 \LUATEX\ will accept a braced argument as a font name:
\starttyping
\font\myfont = {cmr10}
\stoptyping
759 This allows for embedded spaces, without the need for double quotes.
760 Macro expansion takes place inside the argument.
762 \subsection{File syntax (0.45)}
764 \LUATEX\ will accept a braced argument as a file name:
\starttyping
\input {plain}
\openin 0 {plain}
\stoptyping
771 This allows for embedded spaces, without the need for double quotes.
772 Macro expansion takes place inside the argument.
774 \subsection{Images and Forms}
776 \LUATEX\ accepts optional dimension parameters for \type{\pdfrefximage}
777 and \type{\pdfrefxform} in the same format as for \type{\pdfximage}.
778 With images, these dimensions are then used
779 instead of the ones given to \type{\pdfximage};
780 but the original dimensions are not overwritten,
781 so that a \type{\pdfrefximage} without dimensions still provides
782 the image with dimensions defined by \type{\pdfximage}.
783 These optional parameters are not implemented for \type{\pdfxform}.
\starttyping
\pdfrefximage width 20mm height 10mm depth 5mm \pdflastximage
\pdfrefxform width 20mm height 10mm depth 5mm \pdflastxform
\stoptyping
790 \section{Debugging}
792 If \tex{tracingonline} is larger than~2, the node list display will
793 also print the node number of the nodes.
795 \section{Global leaders}
797 There is a new experimental primitive: \type{\gleaders} (a \LUATEX\
798 extension, added in 0.43). This type of leaders is anchored to the
799 origin of the box to be shipped out. So they are like normal
800 \type{\leaders} in that they align nicely, except that the alignment
801 is based on the {\it largest\/} enclosing box instead of the
802 {\it smallest\/}.
805 \section{Expandable character codes (0.75)}
807 The new expandable command \tex{Uchar} reads a number between~0 and
808 $1{,}114{,}111$ and expands to the associated Unicode character.
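
Because the command is expandable, it can be used inside an \type {\edef}. In
this sketch \type {\eacute} is a made|-|up macro name:

\starttyping
\edef\eacute{\Uchar"E9}   % U+00E9, LATIN SMALL LETTER E WITH ACUTE
caf\eacute
\stoptyping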
811 \chapter {\LUA\ general}
813 \section[init]{Initialization}
815 \subsection{\LUATEX\ as a \LUA\ interpreter}
817 There are some situations that make \LUATEX\ behave like a standalone \LUA\
818 interpreter:
820 \startitemize[packed]
821 \item if a \type{--luaonly} option is given on the commandline, or
822 \item if the executable is named \type{texlua} (or \type{luatexlua}), or
823 \item if the only non|-|option argument (file) on the commandline has the extension
824 \type{lua} or \type{luc}.
825 \stopitemize
827 In this mode, it will set \LUA's \type{arg[0]} to the found script
828 name, pushing preceding options in negative values and the rest of the
829 commandline in the positive values, just like the \LUA\
830 interpreter.
832 \LUATEX\ will exit immediately after executing the specified \LUA\
833 script and is, in effect, a somewhat bulky standalone \LUA\
834 interpreter with a bunch of extra preloaded libraries.
836 \subsection{\LUATEX\ as a \LUA\ byte compiler}
838 There are two situations that make \LUATEX\ behave like the \LUA\
839 byte compiler:
841 \startitemize[packed]
842 \item if a \type{--luaconly} option is given on the commandline, or
843 \item if the executable is named \type{texluac}
844 \stopitemize
846 In this mode, \LUATEX\ is exactly like \type{luac} from the standalone
847 \LUA\ distribution, except that it does not have the \type{-l} switch,
848 and that it accepts (but ignores) the \type{--luaconly} switch.
850 \subsection{Other commandline processing}
852 When the \LUATEX\ executable starts, it looks for the \type{--lua}
853 commandline option. If there is no \type{--lua} option, the
commandline is interpreted in much the same way as in traditional
855 \PDFTEX\ and \ALEPH.
857 The following command-line switches are understood.
859 \starttabulate[|lT|p|]
860 \NC --fmt=FORMAT \NC load the format file FORMAT \NC\NR
861 \NC --lua=FILE \NC load and execute a \LUA\ initialization script\NC\NR
862 \NC --safer \NC disable easily exploitable \LUA\ commands \NC\NR
863 \NC --nosocket \NC disable the \LUA\ socket library \NC\NR
864 \NC --help \NC display help and exit \NC\NR
865 \NC --ini \NC be iniluatex, for dumping formats \NC\NR
866 \NC --interaction=STRING \NC set interaction mode (STRING=batchmode/nonstopmode/\crlf
867 scrollmode/errorstopmode) \NC \NR
868 \NC --halt-on-error \NC stop processing at the first error\NC \NR
869 \NC --kpathsea-debug=NUMBER \NC set path searching debugging flags according to
870 the bits of NUMBER \NC \NR
871 \NC --progname=STRING \NC set the program name to STRING \NC \NR
872 \NC --version \NC display version and exit \NC\NR
873 \NC --credits \NC display credits and exit \NC\NR
874 \NC --recorder \NC enable filename recorder \NC \NR
875 \NC --etex \NC ignored\NC \NR
876 \NC --output-comment=STRING \NC use STRING for DVI file comment instead of date
877 (no effect for PDF)\NC \NR
878 \NC --output-directory=DIR \NC use DIR as the directory to write files to \NC \NR
879 \NC --draftmode \NC switch on draft mode (generates no output PDF)\NC \NR
880 \NC --output-format=FORMAT \NC use FORMAT for job output; FORMAT is 'dvi' or 'pdf' \NC \NR
881 \NC --[no-]shell-escape \NC disable/enable \type{\write18{SHELL COMMAND}} \NC \NR
882 \NC --enable-write18 \NC enable \type{\write18{SHELL COMMAND}} \NC \NR
883 \NC --disable-write18 \NC disable \type{\write18{SHELL COMMAND}} \NC \NR
884 \NC --shell-restricted \NC restrict \type{\write18} to a list of commands
885 given in texmf.cnf \NC \NR
886 \NC --debug-format \NC enable format debugging \NC \NR
887 \NC --[no-]file-line-error \NC disable/enable file:line:error style messages \NC \NR
888 \NC --[no-]file-line-error-style \NC aliases of --[no-]file-line-error \NC \NR
889 \NC --jobname=STRING \NC set the job name to STRING \NC \NR
890 \NC --[no-]parse-first-line \NC ignored \NC \NR
891 \NC --translate-file= \NC ignored \NC \NR
892 \NC --default-translate-file= \NC ignored \NC \NR
893 \NC --8bit \NC ignored \NC \NR
894 \NC --[no-]mktex=FMT \NC disable/enable mktexFMT generation (FMT=tex/tfm)\NC \NR
895 \NC --synctex=NUMBER \NC enable synctex \NC \NR
896 \stoptabulate
898 A note on the creation of the various temporary files and the \type{\jobname}.
899 The value to use for \type{\jobname} is decided as follows:
901 \startitemize
902 \item If \type{--jobname} is given on the command line, its argument
903 will be the value for \tex{jobname}, without any changes. The
904 argument will not be used for actual input so it need not exist.
905 The \type{--jobname} switch only controls the \tex{jobname} setting.
906 \item Otherwise, \tex{jobname} will be the name of the first file that
907 is read from the file system, with any path components and the last
908 extension (the part following the last \type{.}) stripped off.
909 \item An exception to the previous point: if the command
910 line goes into interactive mode (by starting with a command) and
911 there are no files input via \type{\everyjob} either, then the
912 \tex{jobname} is set to \type{texput} as a last resort.
913 \stopitemize
915 The file names for output files that are generated automatically are
916 created by attaching the proper extension (\type{.log}, \type{.pdf},
917 etc.) to the found \tex{jobname}. These files are created in the
918 directory pointed to by \type{--output-directory}, or in the current
919 directory, if that switch is not present.
921 \blank
923 Without the \type{--lua} option, command line processing works like it does in
924 any other web2c-based typesetting engine, except that \LUATEX\ has a few extra
925 switches.
If the \type{--lua} option is present, \LUATEX\ enters a mode of
commandline processing that differs from that of the standard web2c
programs.
932 In this mode, a small series of actions is taken in order. First,
933 it will parse the commandline as usual, but it will only interpret
934 a small subset of the options immediately: \type{--safer}, \type{--nosocket},
935 \type{--[no-]shell-escape}, \type{--enable-write18}, \type{--disable-write18},
936 \type{--shell-restricted}, \type{--help}, \type{--version}, and \type{--credits}.
938 Now it searches for the requested \LUA\ initialization script. If it
939 cannot be found using the actual name given on the commandline, a
940 second attempt is made by prepending the value of the environment
941 variable \type{LUATEXDIR}, if that variable is defined in the environment.
943 Then it checks the various safety switches. You can use those to disable
944 some \LUA\ commands that can easily be abused by a malicious document. At
945 the moment, \type{--safer} \type{nil}s the following functions:
947 \starttabulate[|l|l|]
948 \NC \bf library \NC \bf functions \NC \NR
949 \NC \tt os \NC \tt execute exec setenv rename remove tmpdir \NC \NR
950 \NC \tt io \NC \tt popen output tmpfile \NC \NR
951 \NC \tt lfs \NC \tt rmdir mkdir chdir lock touch \NC \NR
952 \stoptabulate
954 Furthermore, it disables loading of compiled \LUA\ libraries (support
955 for these was added in 0.46.0), and it makes \lua{io.open()} fail on
956 files that are opened for anything besides reading.
958 \type{--nosocket} makes the socket library unavailable, so that
959 \LUA\ cannot use networking.
961 The switches \type{--[no-]shell-escape}, \type{--[enable|disable]-write18}, and
962 \type{--shell-restricted} have the same
963 effects as in \PDFTEX, and additionally make
964 \type{io.popen()}, \type{os.execute}, \type{os.exec} and \type{os.spawn}
965 adhere to the requested option.
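
Because these switches can remove or restrict functions, portable \LUA\
code may want to check before calling them. The following is a minimal
sketch of such a guard; the helper name \type{run_command} and the choice
of \type{kpsewhich --version} as the command are assumptions made for the
example.

\starttyping
-- Call a shell command only if os.execute is still present.
-- With --safer the function is nil; with shell escape disabled
-- the call itself may be refused.
local function run_command(cmd)
  if type(os.execute) ~= "function" then
    return nil, "os.execute is unavailable (was --safer given?)"
  end
  return os.execute(cmd)
end

local ok, why = run_command("kpsewhich --version")
if not ok then
  texio.write_nl("command not run: " .. tostring(why))
end
\stoptyping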
967 Next the initialization script is loaded and executed. From within the
968 script, the entire commandline is available in the \LUA\ table
969 \lua{arg}, beginning with \lua {arg[0]}, containing the name of the executable.
As a consequence, the warning about unrecognized options is suppressed.
972 Commandline processing happens very early on. So early, in fact, that
973 none of \TEX's initializations have taken place yet. For that reason,
974 the tables that deal with typesetting, like \luatex{tex}, \luatex{token},
975 \luatex{node} and \luatex{pdf}, are off|-|limits during the execution
976 of the startup file (they are nilled). Special care is taken that \luatex{texio.write} and
977 \luatex{texio.write_nl} function properly, so that you can at least
978 report your actions to the log file when (and if) it eventually
979 becomes opened (note that \TEX\ does not even know its \tex{jobname}
980 yet at this point). See \in{chapter}[libraries] for more information
981 about the \LUATEX-specific \LUA\ extension tables.
984 Everything you do in the \LUA\ initialization script will remain
985 visible during the rest of the run, with the exception of the
986 aforementioned \luatex{tex}, \luatex{token}, \luatex{node} and
987 \luatex{pdf} tables: those will be initialized
988 to their documented state after the execution of the script. You
989 should not store anything in variables or within tables with these
990 four global names, as they will be overwritten completely.
992 We recommend you use the startup file only for your own
993 \TEX|-|independent initializations (if you need any), to parse the
994 commandline, set values in the \luatex{texconfig} table, and register
995 the callbacks you need.
997 \LUATEX\ allows some of the commandline options to be overridden
998 by reading values from the \luatex{texconfig} table at the end of
999 script execution (see the description of the \luatex{texconfig} table
1000 later on in this document for more details on which ones exactly).
1002 Unless the \luatex{texconfig} table tells \LUATEX\ not to initialize
1003 \KPATHSEA\ at all (set \luatex{texconfig.kpse_init} to \type{false} for that),
1004 \LUATEX\ acts on some more commandline options after the
1005 initialization script is finished:
1006 in order to initialize the built|-|in \KPATHSEA\ library properly,
1007 \LUATEX\ needs to know the correct program name to use, and for that it
1008 needs to check \type{--progname}, or \type{--ini} and \type{--fmt}, if
1009 \type{--progname} is missing.
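
A startup file that follows these recommendations could look roughly like
the sketch below. The file name \type{init.lua} and the private switch
\type{--myflag} are invented for this example; they are not part of
\LUATEX.

\starttyping
-- init.lua: used as   luatex --lua=init.lua --myflag document.tex
-- The tex, token, node and pdf tables are not usable here, texio is.
texio.write_nl("init.lua: started by " .. tostring(arg[0]))

-- Parse the commandline ourselves; arg[0] is the executable name.
local myflag = false
for i = 1, #arg do
  if arg[i] == "--myflag" then
    myflag = true
  end
end
texio.write_nl("init.lua: myflag is " .. tostring(myflag))

-- Keep the default behavior: let LuaTeX initialize KPATHSEA after
-- this script has finished. Setting this to false would skip that.
texconfig.kpse_init = true
\stoptyping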
1012 \section{\LUA\ changes}
1014 {\bf NOTE:} \LUATEX\ 0.74.0 is the first version with Lua 5.2, and
1015 this is used without any patches to the core, which has some side
1016 effects. In particular, Lua's \type{tonumber()} may return values in
1017 scientific notation, thereby confusing the \TEX\ end of things when it
1018 is used as the right-hand side of an assignment to a \type{\dimen}
1019 or \type{\count}.
1021 {\bf NOTE:} Also in \LUATEX\ 0.74.0 (this is a change in Lua 5.2),
1022 loading dynamic Lua libraries will fail if there are two Lua libraries
1023 loaded at the same time (which will typically happen on Win32, because
1024 there is one Lua 5.2 inside luatex, and another will likely be linked
to the \type{dll} file of the module itself). We plan to fix that later
by switching \LUATEX\ itself to using the DLL version of Lua 5.2
instead of including a static version in the binary.
1029 Starting from version 0.45, \LUATEX\ is able to use the kpathsea
1030 library to find \type{require()}d modules. For this purpose,
1031 \type{package.searchers[2]} is replaced by a different loader function,
1032 that decides at runtime whether to use kpathsea or the built-in core
1033 lua function. It uses \KPATHSEA\ when that is already initialized at
1034 that point in time, otherwise it reverts to using the normal
1035 \type{package.path} loader.
1037 Initialization of \KPATHSEA\ can happen either implicitly (when
1038 \LUATEX\ starts up and the startup script has not set
1039 \type{texconfig.kpse_init} to false), or explicitly by calling the
1040 \LUA\ function \type{kpse.set_program_name()}.
1042 Starting from version 0.46.0 \LUATEX\ is
1043 also able to use dynamically loadable \LUA\ libraries, unless
1044 \type{--safer} was given as an option on the command line.
1046 For this purpose, \type{package.searchers[3]} is replaced by a different
1047 loader function, that decides at runtime whether to use kpathsea or
the built-in core lua function. As in the previous paragraph, it uses
1049 \KPATHSEA\ when that is already initialized at that point in time,
1050 otherwise it reverts to using the normal \type{package.cpath} loader.
1052 This functionality required an extension to kpathsea:
1054 \startnarrower
1055 There is a new kpathsea file format: \type{kpse_clua_format} that
1056 searches for files with extension \type{.dll} and \type{.so}. The
1057 \type{texmf.cnf} setting for this variable is \type{CLUAINPUTS}, and
1058 by default it has this value:
1060 \starttyping
1061 CLUAINPUTS=.:$SELFAUTOLOC/lib/{$progname,$engine,}/lua//
1062 \stoptyping %$
1064 This path is imperfect (it requires a TDS subtree below the binaries
1065 directory), but the architecture has to be in the path somewhere, and
1066 the currently simplest way to do that is to search below the binaries
1067 directory only.
1069 One level up (a \type{lib} directory parallel to \type{bin}) would
1070 have been nicer, but that is not doable because \TEXLIVE\ uses a
1071 \type{bin/<arch>} structure.
1072 \stopnarrower
1074 In keeping with the other \TEX-like programs in \TEXLIVE, the two
1075 \LUA\ functions
1076 \type{os.execute} and \type{io.popen} (as well as the two new functions \type{os.exec}
1077 and \type{os.spawn} that are explained below) take the value of \type{shell_escape}
and/or \type{shell_escape_commands} into account. Whenever \LUATEX\ is run with the
1079 assumed intention to typeset a document (and by that I mean that it is called as
1080 \type{luatex}, as opposed to \type{texlua}, and that the commandline option
1081 \type{--luaonly} was not given), it will only run the four functions above if the
1082 matching texmf.cnf variable(s) or their \type{texconfig} (see~\in{section}[texconfig])
1083 counterparts allow execution of the requested system command. In \quote{script
1084 interpreter} runs of \LUATEX, these settings have no effect, and all four functions
work as normal. This change is new in 0.37.0.
1089 The \lua{f:read("*line")} and \lua{f:lines()} functions from the io library have
1090 been adjusted so that they are line|-|ending neutral: any of \type{LF}, \type
1091 {CR} or \type{CR+LF} are acceptable line endings.
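
The following sketch writes a scratch file with mixed line endings and
reads it back. Under the behavior described above, the loop should see
four lines. The file name is made up for the example, and the script
needs write access, so it will not work under \type{--safer}.

\starttyping
local name = "lineend-demo.txt"   -- scratch file, hypothetical name

-- Write LF, CR+LF and CR line endings into one file.
local f = assert(io.open(name, "wb"))
f:write("one\ntwo\r\nthree\rfour\n")
f:close()

-- Read it back line by line and count what comes out.
local count = 0
f = assert(io.open(name, "rb"))
for line in f:lines() do
  count = count + 1
  print(count, line)
end
f:close()
os.remove(name)
\stoptyping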
1093 \lua{luafilesystem} has been extended: there are two extra boolean functions
1094 (\luatex{lfs.isdir(filename)} and \luatex{lfs.isfile(filename)}) and
1095 one extra string field in its attributes table
1096 (\type{permissions}). There is an additional function (added in 0.51)
1097 \type{lfs.shortname()} which takes a file name and returns its short
1098 name on WIN32 platforms. On other platforms, it just returns the given
1099 argument. The file name is not tested for existence. Finally, for
1100 non-WIN32 platforms only, there is the new function
1101 \type{lfs.readlink()} (added in 0.51) that takes an existing symbolic
1102 link as argument and returns its content. It returns an error on
1103 WIN32.
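
A short sketch of these extensions in use follows. It inspects the
current directory and the running script; nothing in it goes beyond the
functions and the field named above.

\starttyping
local lfs = require("lfs")

local here   = "."
local script = arg and arg[0] or here

print("is directory:", lfs.isdir(here))
print("is file:     ", lfs.isfile(script))

-- The attributes table carries the extra 'permissions' string field.
local attr = lfs.attributes(here)
print("permissions: ", attr and attr.permissions)

-- On non-WIN32 platforms this just returns its argument.
print("short name:  ", lfs.shortname(script))
\stoptyping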
1105 The \lua{string} library has an extra function:
1106 \luatex{string.explode(s[,m])}. This function returns an array containing
1107 the string argument \type{s} split into sub-strings based on the value
1108 of the string argument \type{m}. The second argument is a string that
1109 is either empty (this splits the string into characters), a single
1110 character (this splits on each occurrence of that character, possibly
1111 introducing empty strings), or a single character followed by the plus
1112 sign \type{+} (this special version does not create empty
1113 sub-strings). The default value for \type{m} is \quote{\type{ +}} (multiple
1114 spaces).
1116 Note: \type{m} is not hidden by surrounding braces (as it would be if
1117 this function was written in \TEX\ macros).
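
The different forms of the second argument can be compared with a sketch
like the following. The helper \type{show} is only there for printing and
is not part of the library.

\starttyping
-- Print each piece between brackets so empty strings stay visible.
local function show(label, list)
  local pieces = {}
  for i, piece in ipairs(list) do
    pieces[i] = "[" .. piece .. "]"
  end
  print(label, table.concat(pieces, " "))
end

show("single char", string.explode("a,b,,c", ","))   -- keeps the empty piece
show("char plus +", string.explode("a,b,,c", ",+"))  -- drops empty pieces
show("default    ", string.explode("one   two three"))
show("empty      ", string.explode("abc", ""))       -- one piece per character
\stoptyping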
1119 The \lua{string} library also has six extra iterators that return strings
1120 piecemeal:
1122 \startitemize
1123 \item \luatex{string.utfvalues(s)} (returns an integer value in the
1124 \UNICODE\ range)
1125 \item \luatex{string.utfcharacters(s)} (returns a string with a single
1126 \UTF-8 token in it)
1127 \item \luatex{string.characters(s)} (a string containing one byte)
\item \luatex{string.characterpairs(s)} (two strings each containing one byte). This will
produce an empty second string if the string length was odd.
\item \luatex{string.bytes(s)} (a single byte value)
\item \luatex{string.bytepairs(s)} (two byte values). This will produce \type{nil} instead of a
number as its second return value if the string length was odd.
1133 \stopitemize
1135 The \luatex{string.characterpairs()} and \luatex{string.bytepairs()}
1136 are useful especially in the conversion of UTF-16 encoded data into UTF-8.
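
As an illustration, the sketch below first lists the code points of a
UTF-8 string and then converts big-endian UTF-16 data to UTF-8 with
\type{string.bytepairs}. The function name \type{utf16be_to_utf8} is
invented for this example, and it deliberately ignores surrogate pairs, so
it only handles characters in the basic multilingual plane. It relies on
\type{unicode.utf8.char} from the \type{slnunicode} library.

\starttyping
-- Code points of a UTF-8 encoded string, one per iteration.
for codepoint in string.utfvalues("aä") do
  print(codepoint)
end

-- Convert UTF-16BE to UTF-8 (no surrogate handling in this sketch).
local function utf16be_to_utf8(s)
  local out = {}
  for high, low in string.bytepairs(s) do
    if low == nil then
      break                     -- odd length: drop the dangling byte
    end
    out[#out + 1] = unicode.utf8.char(high * 256 + low)
  end
  return table.concat(out)
end

-- U+0061 followed by U+00E4, both as big-endian byte pairs.
print(utf16be_to_utf8("\0a\0\228"))
\stoptyping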
1139 Starting with \LUATEX\ 0.74, there is also a two-argument form of
1140 \type{string.dump()}. The second argument is a boolean which, if true,
1141 strips the symbols from the dumped data. This matches an extension
1142 made in \type{luajit}.
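
A minimal sketch of the two forms follows. The function \type{f} is just
a stand-in; the sizes that are printed depend on the build, so no
particular numbers are claimed here.

\starttyping
local function f(x)
  return x + 1
end

local full     = string.dump(f)        -- with debug symbols
local stripped = string.dump(f, true)  -- symbols stripped

print(#full, #stripped)

-- The stripped chunk is still loadable as binary code.
local g = assert(load(stripped, "stripped f", "b"))
print(g(41))   -- 42
\stoptyping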
1144 Note: The \lua{string} library functions \luatex{len}, \luatex{lower},
1145 \luatex{sub} etc. are not \UNICODE|-|aware. For strings in the UTF-8
1146 encoding, i.e., strings containing characters above code point 127, the
1147 corresponding functions from the \lua{slnunicode} library can be used,
1148 e.g., \luatex{unicode.utf8.len}, \luatex{unicode.utf8.lower} etc. The
1149 exceptions are \luatex{unicode.utf8.find}, that always returns byte
1150 positions in a string, and \luatex{unicode.utf8.match} and
1151 \luatex{unicode.utf8.gmatch}. While the latter two functions in general
{\it are} \UNICODE|-|aware, they fall back to non|-|\UNICODE|-|aware
1153 behavior when using the empty capture \lua{()} (other captures work as
1154 expected). For the interpretation of character classes in
1155 \luatex{unicode.utf8} functions refer to the library sources at
1156 \hyphenatedurl{http://luaforge.net/projects/sln}. The \lua{slnunicode}
1157 library will be replaced by an internal \UNICODE\ library in a future
1158 \LUATEX\ version.
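
The difference is easy to see in a small sketch, assuming that the source
file itself is saved in UTF-8:

\starttyping
local s = "aä"                 -- 'ä' takes two bytes in UTF-8

print(string.len(s))           -- 3: bytes
print(unicode.utf8.len(s))     -- 2: characters

-- string.upper leaves the non-ASCII bytes alone in the C locale,
-- unicode.utf8.upper works on whole characters.
print(string.upper(s))
print(unicode.utf8.upper(s))
\stoptyping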
1159 \blank
1161 The \lua{os} library has a few extra functions and variables:
1163 \startitemize
1164 \item \luatex{os.selfdir} is a variable that holds the directory path
1165 of the actual executable. For example: {\tt \directlua{tex.sprint(os.selfdir)}}
1166 (present since 0.27.0).
1168 \item \luatex{os.exec(commandline)} is a variation on \lua{os.execute}.
1170 The \type{commandline} can be either a single string or a single table.
1172 If the argument is a table: \LUATEX\ first checks if there is a value at
1173 integer index zero. If there is, this is the command to be executed. Otherwise,
it will use the value at integer index one. (If neither is present, nothing
at all happens.)
1177 The set of consecutive values starting at integer 1 in the table are
1178 the arguments that are passed on to the command (the value at index 1
1179 becomes \type{arg[0]}). The command is searched for in the execution path,
1180 so there is normally no need to pass on a fully qualified pathname.
1182 If the argument is a string, then it is automatically converted into
1183 a table by splitting on whitespace. In this case, it is impossible
1184 for the command and first argument to differ from each other.
1186 In the string argument format, whitespace can be protected by putting (part
1187 of) an argument inside single or double quotes. One layer of quotes is
1188 interpreted by \LUATEX, and all occurrences of \tex{"}, \tex{'} or
1189 \type{\\} within the quoted text are un-escaped. In the table format, there
1190 is no string handling taking place.
1192 This function normally does not return control back to the \LUA\ script: the
1193 command will replace the current process. However, it will return the two values
1194 \type{nil} and \type {'error'} if there was a problem while attempting to execute the command.
1196 On Windows, the current process is actually kept in memory until after the
1197 execution of the command has finished. This prevents crashes in situations
1198 where \TEXLUA\ scripts are run inside integrated \TEX\ environments.
1200 The original reason for this command is that it cleans out the current
1201 process before starting the new one, making it especially useful for
1202 use in \TEXLUA.
1204 \item \luatex{os.spawn(commandline)} is a returning version of \lua{os.exec},
1205 with otherwise identical calling conventions.
1207 If the command ran ok, then the return value is the exit status of the
1208 command. Otherwise, it will return the two values \type{nil} and \type {'error'}.
1210 \item \luatex{os.setenv('key','value')}
1211 This sets a variable in the environment. Passing \lua{nil} instead of a
1212 value string will remove the variable.
1214 \item \luatex{os.env}
1215 This is a hash table containing a dump of the variables and values
1216 in the process environment at the start of the run. It is writeable,
1217 but the actual environment is {\em not\/} updated automatically.
1219 \item \luatex{os.gettimeofday()}
1220 Returns the current \quote {\UNIX\ time}, but as a float. This function is
1221 not available on the \SUNOS\ platforms, so do not use this function
1222 for portable documents.
1224 \item \luatex{os.times()}
Returns the current process times according to the \UNIX\ C library function
1226 \quote {times}. This function is not available on the \MSWINDOWS\
1227 and \SUNOS\ platforms, so do not use this function for portable
1228 documents.
1230 \item \luatex{os.tmpdir()} This will create a directory in the \quote {current
1231 directory} with the name \type{luatex.XXXXXX} where the \type {X}-es are
1232 replaced by a unique string. The function also returns this string,
1233 so you can \type{lfs.chdir()} into it, or \type{nil} if it failed to
create the directory. The user is responsible for cleaning up at
the end of the run; this does not happen automatically.
1237 \item \luatex{os.type}
1238 This is a string that gives a global indication of the class of operating
1239 system. The possible values are currently \type{windows}, \type{unix}, and
1240 \type{msdos} (you are unlikely to find this value \quote {in the wild}).
1242 \item \luatex{os.name}
1243 This is a string that gives a more precise indication of the operating
system. The possible values are not yet fixed; for \type{os.type} values
\type{windows} and \type{msdos}, the \type{os.name} values are simply
\type{windows} and \type{msdos}.
1248 The list for the type \type{unix} is more precise: \type{linux},
1249 \type{freebsd}, \type{kfreebsd} (since 0.51), \type{cygwin} (since
1250 0.53), \type{openbsd}, \type{solaris}, \type{sunos} (pre-solaris),
1251 \type{hpux}, \type{irix}, \type{macosx}, \type{gnu} (hurd), \type{bsd} (unknown, but \BSD|-|like),
1252 \type{sysv} (unknown, but \SYSV|-|like), \type{generic} (unknown).
1254 (\type{os.version} is planned as a future extension)
1256 \item \luatex{os.uname()}
1257 This function returns a table with specific operating system
1258 information acquired at runtime. The keys in the returned table are
1259 all string valued, and their names are: \type{sysname}, \type{machine},
1260 \type{release}, \type{version}, and \type{nodename}.
1263 \stopitemize
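
The following sketch exercises a few of these additions from a \TEXLUA\
script. The command \type{kpsewhich --version} is only an example of
something to run; any other program would do.

\starttyping
-- Coarse and fine operating system identification.
print(os.type, os.name)

-- Runtime details; all values are strings.
local u = os.uname()
print(u.sysname, u.machine, u.release)

-- Table form of os.spawn: no index 0 is given, so index 1 is the
-- command, and it is also passed on as arg[0].
local status, err = os.spawn({ "kpsewhich", "--version" })
if status == nil then
  print("spawn failed:", err)
else
  print("exit status:", status)
end
\stoptyping

The table form avoids the quoting rules of the string form, which can be
convenient when arguments contain spaces.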
1265 In stock \LUA, many things depend on the current locale. In \LUATEX, we can't do
that, because it makes documents unportable. While \LUATEX\ is running, it
1267 forces the following locale settings:
1269 \starttyping
1270 LC_CTYPE=C
1271 LC_COLLATE=C
1272 LC_NUMERIC=C
1273 \stoptyping
1275 \section {\LUA\ modules}
1277 {\bf NOTE}: Starting with \LUATEX\ 0.74, the implied use of the
1278 built-in Lua modules in this section is deprecated. If you want to use
1279 one of these libraries, please start your source file with a
1280 proper \type{require} line. In the near future, \LUATEX\ will switch
1281 to loading these modules on demand.
1283 Some modules that are normally external to \LUA\ are statically linked
1284 in with \LUATEX, because they offer useful functionality:
1286 \startitemize
1287 \item \lua{slnunicode}, from the \type {Selene} libraries, \hyphenatedurl{http://luaforge.net/projects/sln}. (version 1.1)
1289 This library has been slightly extended so that the \type{unicode.utf8.*}
1290 functions also accept the first 256 values of plane~18. This is the range \LUATEX\
1291 uses for raw binary output, as explained above.
1293 \item \lua{luazip}, from the kepler project, \hyphenatedurl{http://www.keplerproject.org/luazip/}.
1294 (version 1.2.1, but patched for compilation with \LUA\ 5.2)
1295 \item \lua{luafilesystem}, also from the kepler project, \hyphenatedurl{http://www.keplerproject.org/luafilesystem/}.
1296 (version 1.5.0)
1297 \item \lua{lpeg}, by Roberto Ierusalimschy, \hyphenatedurl{http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html}. (version 0.10.2)
1299 Note: \lua{lpeg} is not \UNICODE|-|aware, but interprets strings on a
1300 byte|-|per|-|byte basis. This mainly means that \luatex{lpeg.S} cannot be
used with characters above code point 127, since those characters are
encoded using two or more bytes, and thus \luatex{lpeg.S} will look for any
one of those bytes when matching, not the combination of them.
1305 The same is true for \luatex{lpeg.R}, although the latter will display
1306 an error message if used with characters above code point 127: I.\,e.\
1307 \luatex{lpeg.R('aä')} results in the message \type{bad argument #1 to
1308 'R' (range must have two characters)}, since to \lua{lpeg}, \type{ä}
1309 is two 'characters' (bytes), so \type{aä} totals three.
1311 \item \lua{lzlib}, by Tiago Dionizio, \hyphenatedurl{http://luaforge.net/projects/lzlib/}. (version 0.2)
\item \lua{md5}, by Roberto Ierusalimschy, \hyphenatedurl{http://www.inf.puc-rio.br/~roberto/md5/md5-5/md5.html}.
1314 \item \lua{luasocket}, by Diego Nehab
1315 \hyphenatedurl{http://w3.impa.br/~diego/software/luasocket/}
1316 (version 2.0.2).
1318 Note: the \type{.lua} support modules from \type{luasocket} are also
preloaded inside the executable, so there are no external file dependencies.
1320 \stopitemize
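
One way around the byte|-|wise behavior of \type{lpeg.S} noted above is
an ordered choice of literal strings, since \type{lpeg.P} matches a whole
multi|-|byte sequence. The sketch below assumes a UTF-8 encoded source
file; the pattern names are chosen for the example.

\starttyping
local lpeg = require("lpeg")

-- Each alternative is a complete UTF-8 sequence, not a single byte.
local umlaut = lpeg.P("ä") + lpeg.P("ö") + lpeg.P("ü")

-- A lowercase word that may contain those three characters.
local word = lpeg.C((lpeg.R("az") + umlaut)^1)

print(word:match("grün"))   -- grün
print(word:match("GRÜN"))   -- nil: uppercase is not covered
\stoptyping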
1323 \chapter[libraries]{\LUATEX\ \LUA\ Libraries}
1325 {\bf NOTE}: Starting with \LUATEX\ 0.74, the implied use of the
1326 built-in Lua modules \type{epdf}, \type{fontloader}, \type{mplib},
1327 and \type{pdfscanner} is deprecated. If you want to use these, please
1328 start your source file with a proper \type{require} line. In the near
1329 future, \LUATEX\ will switch to loading these modules on demand.
1332 The interfacing between \TEX\ and \LUA\ is facilitated by a set of
1333 library modules. The \LUA\ libraries in this chapter are all defined and
1334 initialized by the \LUATEX\ executable. Together, they allow \LUA\
1335 scripts to query and change a number of \TEX's internal variables, run
1336 various internal \TEX\ functions, and set up \LUATEX's hooks to execute
1337 \LUA\ code.
1339 The following sections are in alphabetical order.
1341 \section{The \luatex{callback} library}
1343 This library has functions that register, find and list callbacks.
1345 A quick note on what callbacks are (thanks, Paul!):
1347 Callbacks are entry points to \LUATEX's internal operations, which can be
1348 interspersed with additional \LUA\ code, and even replaced altogether.
1349 In the first case, \TEX\ is simply augmented with new operations
1350 (for instance, a manipulation of the nodes resulting from the paragraph
1351 builder); in the second case, its hard-coded behavior (for instance, the
1352 paragraph builder itself) is ignored and processing relies on user code only.
1354 More precisely, the code to be inserted at a given callback is a function
1355 (an anonymous function or the name of a function variable); % Is this line useful?
1356 it will receive the arguments associated with the callback, if any, and must
1357 frequently return some other arguments for \TEX\ to resume its operations.
1359 The first task is registering a callback:
1361 \startfunctioncall
1362 id, error = callback.register (<string> callback_name, <function> func)
1363 id, error = callback.register (<string> callback_name, nil)
1364 id, error = callback.register (<string> callback_name, false)
1365 \stopfunctioncall
1367 where the \syntax{callback_name} is a predefined callback name, see
1368 below. The function returns the internal \type{id} of the callback
1369 or \type{nil}, if the callback could not be registered. In the latter
1370 case, \type{error} contains an error message, otherwise it is
1371 \type{nil}.
1373 \LUATEX\ internalizes the callback function in such a way that
1374 it does not matter if you redefine a function accidentally.
1376 Callback assignments are always global. You can use the special value
1377 \type {nil} instead of a function for clearing the callback.
1379 For some minor speed gain, you can assign the boolean \type{false} to
the non-file related callbacks; doing so will prevent \LUATEX\ from
1381 executing whatever it would execute by default (when no callback
1382 function is registered at all). Be warned: this may cause all sorts of
1383 grief unless you know {\it exactly} what you are doing! This functionality
1384 is present since version 0.38.
1386 Currently, callbacks are not dumped into the format file.
1388 \startfunctioncall
1389 <table> info = callback.list()
1390 \stopfunctioncall
1392 The keys in the table are the known callback names, the value is a
1393 boolean where \type{true} means that the callback is currently set
1394 (active).
1396 \startfunctioncall
1397 <function> f = callback.find (callback_name)
1398 \stopfunctioncall
1400 If the callback is not set, \luatex{callback.find} returns \type{nil}.
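
The three functions can be tried together in a sketch like the one below,
for instance from a \LUA\ file loaded during a normal run. The function
name \type{report} is made up; it only logs the request and then defers to
\type{kpse.find_file}, so it assumes that \KPATHSEA\ has been initialized.

\starttyping
-- A pass-through callback that logs what is being looked for.
local function report(id_number, asked_name)
  texio.write_nl("find_read_file: " .. asked_name)
  return kpse.find_file(asked_name)
end

local id, err = callback.register("find_read_file", report)
if not id then
  texio.write_nl("could not register: " .. tostring(err))
end

-- List the callbacks that are currently set.
for name, active in pairs(callback.list()) do
  if active then
    texio.write_nl("active callback: " .. name)
  end
end

-- Look the function up again, then clear the callback.
texio.write_nl(type(callback.find("find_read_file")))
callback.register("find_read_file", nil)
texio.write_nl(tostring(callback.find("find_read_file")))
\stoptyping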
1402 \subsection{File discovery callbacks}
1404 The behavior documented in this subsection is considered stable in the
1405 sense that there will not be backward-incompatible changes any more.
1407 \subsubsection{\luatex{find_read_file} and \luatex{find_write_file}}
1409 Your callback function should have the following conventions:
1411 \startfunctioncall
1412 <string> actual_name = function (<number> id_number, <string> asked_name)
1413 \stopfunctioncall
1415 Arguments:
1417 \startitemize
1419 \sym{id_number}
1421 This number is zero for the log or \tex{input} files. For \TEX's \tex{read} or
1422 \tex{write} the number is incremented by one, so \tex{read0} becomes~1.
1424 \sym{asked_name}
1426 This is the user|-|supplied filename, as found by \tex{input}, \tex{openin}
1427 or \tex{openout}.
1429 \stopitemize
1431 Return value:
1433 \startitemize
1435 \sym{actual_name}
1437 This is the filename used. For the very first file that is read in by
1438 \TEX, you have to make sure you return an \type{actual_name} that has
1439 an extension and that is suitable for use as \type{jobname}. If you
1440 don't, you will have to manually fix the name of the log file and
output file after \LUATEX\ is finished, and the name of any format
file will be mangled. That is because these file names depend
1443 on the jobname.
1445 You have to return \type{nil} if the file cannot be found.
1447 \stopitemize
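
A callback that follows these conventions might look like the sketch
below. It checks a private directory first and falls back to
\KPATHSEA\ otherwise. The directory name \type{./private-inputs/} is
purely hypothetical, and the sketch assumes that \KPATHSEA\ has been
initialized.

\starttyping
local extra_dir = "./private-inputs/"   -- hypothetical location

callback.register("find_read_file", function(id_number, asked_name)
  -- id_number is 0 for the log or \input files, n+1 for \read n.
  texio.write_nl(string.format("channel %d asks for %s",
                               id_number, asked_name))

  local candidate = extra_dir .. asked_name
  if lfs.isfile(candidate) then
    return candidate
  end

  -- kpse.find_file returns nil when nothing is found, which is
  -- exactly what the callback has to return in that case.
  return kpse.find_file(asked_name)
end)
\stoptyping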
1449 \subsubsection{\luatex{find_font_file}}
1451 Your callback function should have the following conventions:
1453 \startfunctioncall
1454 <string> actual_name = function (<string> asked_name)
1455 \stopfunctioncall
1457 The \type{asked_name} is an \OTF\ or \TFM\ font metrics file.
1459 Return \type{nil} if the file cannot be found.
1461 \subsubsection{\luatex{find_output_file}}
1463 Your callback function should have the following conventions:
1465 \startfunctioncall
1466 <string> actual_name = function (<string> asked_name)
1467 \stopfunctioncall
1469 The \type{asked_name} is the \PDF\ or \DVI\ file for writing.
1471 \subsubsection{\luatex{find_format_file}}
1473 Your callback function should have the following conventions:
1475 \startfunctioncall
1476 <string> actual_name = function (<string> asked_name)
1477 \stopfunctioncall
1479 The \type{asked_name} is a format file for reading (the format file
1480 for writing is always opened in the current directory).
1482 \subsubsection{\luatex{find_vf_file}}
1484 Like \luatex{find_font_file}, but for virtual fonts. This applies to
1485 both \ALEPH's \OVF\ files and traditional Knuthian \VF\ files.
1487 \subsubsection{\luatex{find_map_file}}
1489 Like \luatex{find_font_file}, but for map files.
1491 \subsubsection{\luatex{find_enc_file}}
1493 Like \luatex{find_font_file}, but for enc files.
1495 \subsubsection{\luatex{find_sfd_file}}
1497 Like \luatex{find_font_file}, but for subfont definition files.
1499 \subsubsection{\luatex{find_pk_file}}
1501 Like \luatex{find_font_file}, but for pk bitmap files. The argument
1502 \type{asked_name} is a bit special in this case. Its form is
1504 \starttyping
1505 <base res>dpi/<fontname>.<actual res>pk
1506 \stoptyping
1508 So you may be asked for \type{600dpi/manfnt.720pk}. It is up to you
1509 to find a \quote{reasonable} bitmap file to go with that specification.
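
Taking such a name apart is plain string matching. The following sketch
shows one way to do it; the function name \type{parse_pk_name} and the
field names of the returned table are chosen for this example only.

\starttyping
-- Split "<base res>dpi/<fontname>.<actual res>pk" into its parts.
local function parse_pk_name(asked_name)
  local base, font, actual =
    asked_name:match("^(%d+)dpi/(.+)%.(%d+)pk$")
  if not base then
    return nil                  -- not in the expected form
  end
  return {
    base_res   = tonumber(base),
    fontname   = font,
    actual_res = tonumber(actual),
  }
end

local spec = parse_pk_name("600dpi/manfnt.720pk")
print(spec.base_res, spec.fontname, spec.actual_res)  -- 600 manfnt 720
\stoptyping

A \type{find_pk_file} callback could use the parsed values to decide
which bitmap file on disk is the closest match.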
1511 \subsubsection{\luatex{find_data_file}}
1513 Like \luatex{find_font_file}, but for embedded files (\tex{pdfobj file '...'}).
1515 \subsubsection{\luatex{find_opentype_file}}
1517 Like \luatex{find_font_file}, but for \OPENTYPE\ font files.
1519 \subsubsection{\luatex{find_truetype_file} and \luatex{find_type1_file}}
1521 Your callback function should have the following conventions:
1523 \startfunctioncall
1524 <string> actual_name = function (<string> asked_name)
1525 \stopfunctioncall
1527 The \type{asked_name} is a font file. This callback is called while
1528 \LUATEX\ is building its internal list of needed font files, so the
1529 actual timing may surprise you. Your return value is later fed back
1530 into the matching \luatex{read_file} callback.
1532 Strangely enough, \luatex{find_type1_file} is also used for \OPENTYPE\
1533 (\OTF) fonts.
1535 \subsubsection{\luatex{find_image_file}}
1537 Your callback function should have the following conventions:
1539 \startfunctioncall
1540 <string> actual_name = function (<string> asked_name)
1541 \stopfunctioncall
The \type{asked_name} is an image file. Your return value is used to
open a file from the hard disk, so make sure you return something that
your operating system accepts as the name of a valid file.
1547 \subsection[iocallback]{File reading callbacks}
1549 The behavior documented in this subsection is considered stable in the
1550 sense that there will not be backward-incompatible changes any more.
1552 \subsubsection{\luatex{open_read_file}}
1554 Your callback function should have the following conventions:
1556 \startfunctioncall
1557 <table> env = function (<string> file_name)
1558 \stopfunctioncall
1560 Argument:
1562 \startitemize
1564 \sym{file_name}
1566 The filename returned by a previous \luatex{find_read_file} or the return
1567 value of \luatex{kpse.find_file()} if there was no such callback defined.
1569 \stopitemize
1571 Return value:
1573 \startitemize
1575 \sym{env}
1577 This is a table containing at least one required and one optional
1578 callback function for this file. The required field is
1579 \luatex{reader} and the associated function will be called once
1580 for each new line to be read, the optional one is \luatex{close}
1581 that will be called once when \LUATEX\ is done with the file.
1583 \LUATEX\ never looks at the rest of the table, so you can use it to
1584 store your private per|-|file data. Both the callback functions will
1585 receive the table as their only argument.
1587 \stopitemize
1589 \subsubsubsection{\luatex{reader}}
1591 \LUATEX\ will run this function whenever it needs a new input line
1592 from the file.
1594 \startfunctioncall
function(<table> env)
    return <string> line
end
1598 \stopfunctioncall
1600 Your function should return either a string or \type{nil}. The value \type{nil}
1601 signals that the end of file has occurred, and will make \TEX\ call
1602 the optional \luatex{close} function next.
1604 \subsubsubsection{\luatex{close}}
1606 \LUATEX\ will run this optional function when it decides to close the file.
1608 \startfunctioncall
function(<table> env)
end
1611 \stopfunctioncall
1613 Your function should not return any value.
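
The following is a minimal sketch of an \luatex{open_read_file} callback
built on the standard \type{io} library. The field names \type{handle} and
\type{lines} are private data invented for this example, and returning
\type{nil} when the file cannot be opened is an assumption, not documented
behavior.

\starttyping
local function open_read_file(file_name)
    local handle = io.open(file_name, "r")
    if not handle then
        return nil               -- assumption: signal failure with nil
    end
    return {
        handle = handle,         -- private per-file data, ignored by LuaTeX
        lines  = 0,
        -- called once for every input line; nil signals end of file
        reader = function(env)
            local line = env.handle:read("*l")
            if line then
                env.lines = env.lines + 1
            end
            return line
        end,
        -- called once when LuaTeX is done with the file
        close = function(env)
            env.handle:close()
        end,
    }
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("open_read_file", open_read_file)
end
\stoptyping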
1615 \subsubsection{General file readers}
1617 There is a set of callbacks for the loading of binary data
1618 files. These all use the same interface:
1620 \startfunctioncall
function(<string> name)
    return <boolean> success, <string> data, <number> data_size
end
1624 \stopfunctioncall
1626 The \type{name} will normally be a full path name as it is returned by
1627 either one of the file discovery callbacks or the internal version of
1628 \luatex{kpse.find_file()}.
1630 \startitemize
1632 \sym{success}
1634 Return \type{false} when a fatal error occurred (e.\,g.\ when the file cannot be
1635 found, after all).
1637 \sym{data}
1639 The bytes comprising the file.
1641 \sym{data_size}
1643 The length of the \type{data}, in bytes.
1645 \stopitemize
1647 Return an empty string and zero if the file was found but there was a
1648 reading problem.
1650 The list of functions is as follows:
1652 \starttabulate[|l|p|]
1653 \NC \luatex{read_font_file} \NC ofm or tfm files \NC\NR
1654 \NC \luatex{read_vf_file} \NC virtual fonts \NC\NR
1655 \NC \luatex{read_map_file} \NC map files \NC\NR
1656 \NC \luatex{read_enc_file} \NC encoding files \NC\NR
1657 \NC \luatex{read_sfd_file} \NC subfont definition files \NC\NR
1658 \NC \luatex{read_pk_file} \NC pk bitmap files \NC\NR
1659 \NC \luatex{read_data_file} \NC embedded files (\tex{pdfobj file ...}) \NC\NR
1660 \NC \luatex{read_truetype_file} \NC \TRUETYPE\ font files \NC\NR
1661 \NC \luatex{read_type1_file} \NC \TYPEONE\ font files \NC\NR
1662 \NC \luatex{read_opentype_file} \NC \OPENTYPE\ font files \NC\NR
1663 \stoptabulate
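
A reader following this interface could look roughly like the sketch below.
The function name \type{read_binary} is made up for the example, and keeping
\type{success} true in the \quote{found but unreadable} case is an
assumption about how the three return values are meant to combine.

\starttyping
local function read_binary(name)
    local handle = io.open(name, "rb")
    if not handle then
        return false             -- fatal: the file cannot be opened after all
    end
    local data = handle:read("*a")
    handle:close()
    if not data then
        return true, "", 0       -- found, but there was a reading problem
    end
    return true, data, #data     -- the bytes and their length
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("read_data_file", read_binary)
end
\stoptyping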
1665 \subsection{Data processing callbacks}
1667 \subsubsection{\luatex{process_input_buffer}}
1670 This callback allows you to change the contents of the line input
1671 buffer just before \LUATEX\ actually starts looking at it.
1673 \startfunctioncall
function(<string> buffer)
    return <string> adjusted_buffer
end
1677 \stopfunctioncall
If you return \type{nil}, \LUATEX\ will act as if your callback had
never been called. Doing so for lines that need no change saves a small
amount of processing time.
1683 This callback does not replace any internal code.
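
As a small example, the sketch below removes a \UTF-8 byte order mark from
an input line and returns \type{nil} for every line that needs no change.
The function name \type{strip_bom} is an assumption made for illustration.

\starttyping
local BOM = "\239\187\191"       -- the bytes EF BB BF

local function strip_bom(buffer)
    if buffer:sub(1, 3) == BOM then
        return buffer:sub(4)     -- adjusted buffer
    end
    return nil                   -- leave the line as it is
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("process_input_buffer", strip_bom)
end
\stoptyping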
1685 \subsubsection{\luatex{process_output_buffer} (0.43)}
1687 This callback allows you to change the contents of the line output
1688 buffer just before \LUATEX\ actually starts writing it to a file as the
1689 result of a \tex{write} command. It is only called for output to an
1690 actual file (that is, excluding the log, the terminal, and \tex{write18}
1691 calls).
1693 \startfunctioncall
function(<string> buffer)
    return <string> adjusted_buffer
end
1697 \stopfunctioncall
If you return \type{nil}, \LUATEX\ will act as if your callback had
never been called. Doing so for lines that need no change saves a small
amount of processing time.
1703 This callback does not replace any internal code.
1706 \subsubsection{\luatex{process_jobname} (0.71)}
1708 This callback allows you to change the jobname given by \type{\jobname}
1709 in \TEX\ and \type{tex.jobname} in Lua. It does not affect the internal
1710 job name or the name of the output or log files.
1712 \startfunctioncall
function(<string> jobname)
    return <string> adjusted_jobname
end
1716 \stopfunctioncall
1718 The only argument is the actual job name; you should not use
1719 \type{tex.jobname} inside this function or infinite recursion may occur.
1720 If you return \type{nil}, \LUATEX\ will pretend your callback never
1721 happened.
1723 This callback does not replace any internal code.
1726 \subsubsection{\luatex{token_filter}}
1728 This callback allows you to replace the way \LUATEX\ fetches
1729 lexical tokens.
1731 \startfunctioncall
function()
    return <table> token
end
1735 \stopfunctioncall
1737 The calling convention for this callback is a bit more complicated than
1738 for most other callbacks. The function should either return a \LUA\
1739 table representing a valid to|-|be|-|processed token or tokenlist, or
1740 something else like \type{nil} or an empty table.
If your \LUA\ function does not return a table representing a valid
token, it will be immediately called again, until it eventually does
return a useful token or tokenlist (or until you reset the callback
value to \type{nil}). See the description of \luatex{token} for some
handy functions to be used in conjunction with this callback.
1748 If your function returns a single usable token, then that token will
1749 be processed by \LUATEX\ immediately. If the function returns a token
1750 list (a table consisting of a list of consecutive token tables), then
1751 that list will be pushed to the input stack at a completely new token
1752 list level, with its token type set to \quote{inserted}. In either case,
1753 the returned token(s) will not be fed back into the callback function.
1755 Setting this callback to \type{false} has no effect (because otherwise
1756 nothing would happen, forever).
1758 \subsection{Node list processing callbacks}
1760 The description of nodes and node lists is in~\in{chapter}[nodes].
1762 \subsubsection{\luatex{buildpage_filter}}
1764 This callback is called whenever \LUATEX\ is ready to move stuff to
1765 the main vertical list. You can use this callback to do specialized
1766 manipulation of the page building stage like imposition or column
1767 balancing.
1769 \startfunctioncall
function(<string> extrainfo)
end
1772 \stopfunctioncall
1774 The string \type{extrainfo} gives some additional information about
1775 what \TEX's state is with respect to the \quote{current page}. The possible
1776 values are:
1778 \starttabulate[|lT|p|]
1779 \NC \ssbf value \NC \bf explanation \NC\NR
1780 \NC alignment \NC a (partial) alignment is being added \NC\NR
1781 \NC after_output \NC an output routine has just finished \NC\NR
1782 \NC box \NC a typeset box is being added \NC\NR
1783 %\NC pre_box \NC interline material is being added \NC\NR
1784 %\NC adjust \NC \tex{vadjust} material is being added \NC\NR
1785 \NC new_graf \NC the beginning of a new paragraph \NC\NR
1786 \NC vmode_par \NC \tex{par} was found in vertical mode \NC\NR
1787 \NC hmode_par \NC \tex{par} was found in horizontal mode \NC\NR
1788 \NC insert \NC an insert is added \NC\NR
1789 \NC penalty \NC a penalty (in vertical mode) \NC\NR
1790 \NC before_display \NC immediately before a display starts \NC\NR
1791 \NC after_display \NC a display is finished \NC\NR
1792 \NC end \NC \LUATEX\ is terminating (it's all over)\NC\NR
1793 \stoptabulate
1795 This callback does not replace any internal code.
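
A simple way to get a feeling for these values is to count how often each
one is reported. The sketch below does just that; the table name
\type{contributions} and the function name \type{count_contribution} are
assumptions made for the example.

\starttyping
local contributions = {}         -- extrainfo value -> number of calls

local function count_contribution(extrainfo)
    contributions[extrainfo] = (contributions[extrainfo] or 0) + 1
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("buildpage_filter", count_contribution)
end
\stoptyping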
1798 \subsubsection{\luatex{pre_linebreak_filter}}
1800 This callback is called just before \LUATEX\ starts converting a list
1801 of nodes into a stack of \tex{hbox}es, after the addition of
1802 \type{\parfillskip}.
1804 \startfunctioncall
function(<node> head, <string> groupcode)
    return true | false | <node> newhead
end
1808 \stopfunctioncall
The string called \type {groupcode} identifies the nodelist's context
within \TEX's processing. The range of possibilities is given in the
table below, but not all of those can actually appear in
\luatex {pre_linebreak_filter}; some are for the
\luatex {hpack_filter} and \luatex {vpack_filter} callbacks that
are explained later in this section.
1817 \starttabulate[|lT|p|]
1818 \NC \ssbf value \NC \bf explanation \NC\NR
1819 \NC <empty> \NC main vertical list \NC\NR
1820 \NC hbox \NC \tex{hbox} in horizontal mode \NC\NR
1821 \NC adjusted_hbox\NC \tex{hbox} in vertical mode \NC\NR
1822 \NC vbox \NC \tex{vbox} \NC\NR
1823 \NC vtop \NC \tex{vtop} \NC\NR
1824 \NC align \NC \tex{halign} or \tex{valign} \NC\NR
1825 \NC disc \NC discretionaries \NC\NR
1826 \NC insert \NC packaging an insert \NC\NR
1827 \NC vcenter \NC \tex{vcenter} \NC\NR
1828 \NC local_box \NC \tex{localleftbox} or \tex{localrightbox} \NC\NR
1829 \NC split_off \NC top of a \tex{vsplit} \NC\NR
1830 \NC split_keep \NC remainder of a \tex{vsplit} \NC\NR
1831 \NC align_set \NC alignment cell \NC\NR
1832 \NC fin_row \NC alignment row \NC\NR
1833 \stoptabulate
1835 As for all the callbacks that deal with nodes, the return value can be one of three things:
1837 \startitemize
\item boolean \type{true} signals successful processing
1839 \item \type{<node>} signals that the \quote{head} node should be replaced by the returned node
1840 \item boolean \type{false} signals that the \quote{head} node list should be ignored and
1841 flushed from memory
1842 \stopitemize
1845 This callback does not replace any internal code.
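
The sketch below shows the simplest of the three return conventions: the
list is only inspected, so the function returns \type{true}. It walks the
top level of the list through the \type{next} field and does not descend
into nested lists. The names \type{count_glyphs} and \type{glyphs_seen} are
made up for the example, and the literal \type{37} is only a stand|-|in so
that the function can be tried outside \LUATEX; inside \LUATEX\ the value
comes from \type{node.id("glyph")}.

\starttyping
local GLYPH = node and node.id("glyph") or 37   -- 37: stand-in for testing
local glyphs_seen = 0

local function count_glyphs(head, groupcode)
    local n = head
    while n do
        if n.id == GLYPH then
            glyphs_seen = glyphs_seen + 1
        end
        n = n.next
    end
    return true                  -- the list is left untouched
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("pre_linebreak_filter", count_glyphs)
end
\stoptyping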
1848 \subsubsection{\luatex{linebreak_filter}}
1850 This callback replaces \LUATEX's line breaking algorithm.
1852 \startfunctioncall
function(<node> head, <boolean> is_display)
    return <node> newhead
end
1856 \stopfunctioncall
The returned node is the head of the list that will be added to the
main vertical list. The boolean argument is true if this paragraph is
interrupted by a following math display.
1862 If you return something that is not a \type{<node>}, \LUATEX\ will
1863 apply the internal linebreak algorithm on the list that starts at
1864 \type{<head>}. Otherwise, the \type{<node>} you return is supposed
1865 to be the head of a list of nodes that are all allowed in vertical
1866 mode, and at least one of those has to represent a hbox. Failure to do
1867 so will result in a fatal error.
1869 Setting this callback to \type{false} is possible, but dangerous,
1870 because it is possible you will end up in an unfixable
1871 \quote{deadcycles loop}.
1873 \subsubsection{\luatex{post_linebreak_filter}}
1875 This callback is called just after \LUATEX\ has converted a list
1876 of nodes into a stack of \tex{hbox}es.
1878 \startfunctioncall
function(<node> head, <string> groupcode)
    return true | false | <node> newhead
end
1882 \stopfunctioncall
1884 This callback does not replace any internal code.
1886 \subsubsection{\luatex{hpack_filter}}
1888 This callback is called when \TEX\ is ready to start boxing some
1889 horizontal mode material. Math items and line boxes are ignored
1890 at the moment.
1892 \startfunctioncall
function(<node> head, <string> groupcode, <number> size,
         <string> packtype [, <string> direction])
    return true | false | <node> newhead
end
1897 \stopfunctioncall
1899 The \type{packtype} is either \type{additional} or \type{exactly}. If
1900 \type{additional}, then the \type{size} is a \tex{hbox spread ...}
1901 argument. If \type{exactly}, then the \type{size} is a \tex{hbox to ...}.
1902 In both cases, the number is in scaled points.
1904 The \type{direction} is either one of the three-letter direction specifier
1905 strings, or \type{nil} (added in 0.45).
1908 This callback does not replace any internal code.
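
To make the meaning of \type{size} and \type{packtype} concrete, the sketch
below turns them back into the familiar \type{to} or \type{spread} notation
and writes the result to the log. The function names \type{describe_hpack}
and \type{trace_hpack} are assumptions made for this example; the conversion
uses the fact that one point equals 65536 scaled points.

\starttyping
local function describe_hpack(size, packtype)
    local points = size / 65536
    if packtype == "exactly" then
        return string.format("to %.2fpt", points)
    end
    return string.format("spread %.2fpt", points)   -- "additional"
end

local function trace_hpack(head, groupcode, size, packtype, direction)
    if texio then
        texio.write_nl("log",
            "hpack " .. groupcode .. ": " .. describe_hpack(size, packtype))
    end
    return true                  -- the list is left untouched
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("hpack_filter", trace_hpack)
end
\stoptyping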
1910 \subsubsection{\luatex{vpack_filter}}
1912 This callback is called when \TEX\ is ready to start boxing some
1913 vertical mode material. Math displays are ignored at the moment.
1915 This function is very similar to the \luatex{hpack_filter}. Besides
1916 the fact that it is called at different moments, there is an extra
1917 variable that matches \TEX's \tex{maxdepth} setting.
1919 \startfunctioncall
function(<node> head, <string> groupcode, <number> size, <string>
         packtype, <number> maxdepth [, <string> direction])
    return true | false | <node> newhead
end
1924 \stopfunctioncall
1926 This callback does not replace any internal code.
1928 \subsubsection{\luatex{pre_output_filter}}
1930 This callback is called when \TEX\ is ready to start boxing the
1931 box 255 for \tex{output}.
1933 \startfunctioncall
function(<node> head, <string> groupcode, <number> size, <string> packtype,
         <number> maxdepth [, <string> direction])
    return true | false | <node> newhead
end
1938 \stopfunctioncall
1940 This callback does not replace any internal code.
1942 \subsubsection{\luatex{hyphenate}}
1944 \startfunctioncall
function(<node> head, <node> tail)
end
1947 \stopfunctioncall
1949 No return values. This callback has to insert discretionary nodes in
1950 the node list it receives.
1952 Setting this callback to \type{false} will prevent the internal
1953 discretionary insertion pass.
1955 \subsubsection{\luatex{ligaturing}}
1957 \startfunctioncall
function(<node> head, <node> tail)
end
1960 \stopfunctioncall
1962 No return values. This callback has to apply ligaturing to the node
1963 list it receives.
1965 You don't have to worry about return values because the \type{head}
1966 node that is passed on to the callback is guaranteed not to be a
1967 glyph_node (if need be, a temporary node will be prepended), and
1968 therefore it cannot be affected by the mutations that take place.
1969 After the callback, the internal value of the \quote {tail of the list}
1970 will be recalculated.
1972 The \type{next} of \type{head} is guaranteed to be non-nil.
1974 The \type{next} of \type{tail} is guaranteed to be nil, and therefore the
1975 second callback argument can often be ignored. It is provided for
1976 orthogonality, and because it can sometimes be handy when special
1977 processing has to take place.
1979 Setting this callback to \type{false} will prevent the internal
1980 ligature creation pass.
1982 \subsubsection{\luatex{kerning}}
1984 \startfunctioncall
function(<node> head, <node> tail)
end
1987 \stopfunctioncall
1989 No return values. This callback has to apply kerning between the nodes
1990 in the node list it receives. See \type{ligaturing} for calling
1991 conventions.
1993 Setting this callback to \type{false} will prevent the internal
1994 kern insertion pass.
1996 \subsubsection{\luatex{mlist_to_hlist}}
1998 This callback replaces \LUATEX's math list to node list conversion algorithm.
2000 \startfunctioncall
function(<node> head, <string> display_type, <boolean> need_penalties)
    return <node> newhead
end
2004 \stopfunctioncall
The returned node is the head of the list that will be added to the vertical or
horizontal list. The string argument is either \quote{text} or \quote{display},
depending on the current math mode. The boolean argument is \type{true} if penalties
have to be inserted in this list, \type{false} otherwise.
Setting this callback to \type{false} is bad: it will almost
certainly result in an endless loop.
2014 \subsection{Information reporting callbacks}
2016 \subsubsection{\luatex{pre_dump} (0.61)}
2018 \startfunctioncall
function()
end
2021 \stopfunctioncall
2023 This function is called just before dumping to a format file starts.
2024 It does not replace any code and there are neither arguments nor return values.
2026 \subsubsection{\luatex{start_run}}
2028 \startfunctioncall
function()
end
2031 \stopfunctioncall
2033 This callback replaces the code that prints \LUATEX's banner. Note that for
2034 successful use, this callback has to be set in the lua initialization script,
2035 otherwise it will be seen only after the run has already started.
2037 \subsubsection{\luatex{stop_run}}
2039 \startfunctioncall
function()
end
2042 \stopfunctioncall
2044 This callback replaces the code that prints \LUATEX's statistics and \quote{output written
2045 to} messages.
2047 \subsubsection{\luatex{start_page_number}}
2049 \startfunctioncall
function()
end
2052 \stopfunctioncall
Replaces the code that prints the \type{[} and the page number at the
beginning of \tex{shipout}. This callback will also override the
printing of box information that normally takes place when
\tex{tracingoutput} is positive.
2059 \subsubsection{\luatex{stop_page_number}}
2061 \startfunctioncall
function()
end
2064 \stopfunctioncall
2066 Replaces the code that prints the \type{]} at the end of \tex{shipout}.
2068 \subsubsection{\luatex{show_error_hook}}
2070 \startfunctioncall
function()
end
2073 \stopfunctioncall
2075 This callback is run from inside the \TEX\ error function, and the idea
2076 is to allow you to do some extra reporting on top of what \TEX\ already
2077 does (none of the normal actions are removed). You may find some of
2078 the values in the \luatex{status} table useful.
2080 This callback does not replace any internal code.
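
A sketch of such extra reporting is shown below. It assumes that the
\type{filename} and \type{linenumber} entries of the \luatex{status} table
are available; the helper name \type{format_error_location} is invented for
the example.

\starttyping
-- Build a short location string from a status-like table.
local function format_error_location(st)
    return string.format("error near %s:%d",
        st.filename or "?", st.linenumber or 0)
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("show_error_hook", function()
        texio.write_nl("term and log", format_error_location(status))
    end)
end
\stoptyping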
2082 \iffalse % this has been retracted for the moment
2083 \startitemize
2085 \sym{message}
2087 is the formal error message \TEX\ has given to the user.
2088 (the line after the '!').
2090 \sym{indicator}
2092 is either a filename (when it is a string) or a location indicator (a
2093 number) that can mean lots of different things like a token list id
2094 or a \tex{read} number.
2096 \sym{lineno}
2098 is the current line number.
2099 \stopitemize
2101 This is an investigative item for 'testing the water' only.
2102 The final goal is the total replacement of \TEX's error handling
2103 routines, but that needs lots of adjustments in the web source because
2104 \TEX\ deals with errors in a somewhat haphazard fashion. This is why the
exact definition of \type{indicator} is not given here.

\fi
2109 \subsubsection{\luatex{show_error_message}}
2111 \startfunctioncall
function()
end
2114 \stopfunctioncall
2116 This callback replaces the code that prints the error message. The usual
2117 interaction after the message is not affected.
2119 \subsubsection{\luatex{show_lua_error_hook}}
2121 \startfunctioncall
function()
end
2124 \stopfunctioncall
2126 This callback replaces the code that prints the extra lua error message.
2129 \subsubsection{\luatex{start_file}}
2131 \startfunctioncall
function(category,filename)
end
2134 \stopfunctioncall
This callback replaces the code that prints \LUATEX's message when a file is opened,
like \type {(filename} for regular files. The category is a number:
2139 \starttabulate[|||]
2140 \NC 1 \NC a normal data file, like a \TEX\ source \NC \NR
2141 \NC 2 \NC a font map coupling font names to resources \NC \NR
2142 \NC 3 \NC an image file (\type {png}, \type {pdf}, etc) \NC \NR
2143 \NC 4 \NC an embedded font subset \NC \NR
2144 \NC 5 \NC a fully embedded font \NC \NR
2145 \stoptabulate
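
The sketch below maps these category numbers to short labels and prints a
custom message when a file is opened. The label texts, the message layout
and the names \type{category_names} and \type{file_banner} are assumptions
made for this example.

\starttyping
local category_names = {
    [1] = "data file",
    [2] = "font map",
    [3] = "image",
    [4] = "font subset",
    [5] = "embedded font",
}

local function file_banner(category, filename)
    local label = category_names[category] or "unknown"
    return string.format("<%s: %s>", label, filename)
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("start_file", function(category, filename)
        texio.write_nl("term and log", file_banner(category, filename))
    end)
end
\stoptyping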
2147 \subsubsection{\luatex{stop_file}}
2149 \startfunctioncall
function(category)
end
2152 \stopfunctioncall
This callback replaces the code that prints \LUATEX's message when a file is closed,
like the \type{)} for regular files.
2157 \subsection{PDF-related callbacks}
2159 \subsubsection{\luatex{finish_pdffile}}
2161 \startfunctioncall
function()
end
2164 \stopfunctioncall
2166 This callback is called when all document pages are already written to the \PDF\
2167 file and \LUATEX\ is about to finalize the output document structure. Its intended
2168 use is final update of \PDF\ dictionaries such as \type{/Catalog} or
2169 \type{/Info}. The callback does not replace any code. There are neither
2170 arguments nor return values.
2173 \subsubsection{\luatex{finish_pdfpage}}
2176 \startfunctioncall
function(shippingout)
end
2179 \stopfunctioncall
This callback is called after the \PDF\ page stream has been assembled and before the
page object gets finalized. This callback is available in \LUATEX\ 0.78.4 and later.
2185 \subsection{Font-related callbacks}
2187 \subsubsection{\luatex{define_font}}
2189 \startfunctioncall
function(<string> name, <number> size, <number> id)
    return <table> font | <number> id
end
2193 \stopfunctioncall
2195 The string \type{name} is the filename part of the font
2196 specification, as given by the user.
2198 The number \type{size} is a bit special:
2200 \startitemize[packed]
2201 \item if it is positive, it specifies an \quote{at size} in scaled points.
2202 \item if it is negative, its absolute value represents a \quote{scaled}
2203 setting relative to the designsize of the font.
2204 \stopitemize
2206 The \type{id} is the internal number assigned to the font.
2208 The internal structure of the \type{font} table that is to be
2209 returned is explained in \in{chapter}[fonts]. That table is saved
2210 internally, so you can put extra fields in the table for your
2211 later \LUA\ code to use.
Alternatively, the return value can be the id of a previously defined font. This is
useful if a previous definition can be reused instead of
creating a whole new font structure.
2217 Setting this callback to \type{false} is pointless as it will prevent
2218 font loading completely but will nevertheless generate errors.
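
The two meanings of the \type{size} argument can be made explicit with a
small helper, as in the sketch below. It assumes that a negative value is
expressed in thousandths of the design size, as with \TEX's \type{scaled}
keyword; the helper name \type{requested_size} is made up for the example.
The registered callback simply hands \type{name} and \type{size} on to
\type{font.read_tfm}, so it only loads \TFM\ based fonts.

\starttyping
-- Turn the size argument into an "at size" in scaled points.
local function requested_size(size, designsize)
    if size >= 0 then
        return size              -- already an "at size"
    end
    -- assumption: -1200 means "scaled 1200", i.e. 1.2 times the design size
    return math.floor(designsize * (-size) / 1000 + 0.5)
end

-- Register only when running inside LuaTeX.
if callback then
    callback.register("define_font", function(name, size, id)
        return font.read_tfm(name, size)
    end)
end
\stoptyping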
2220 \section{The \luatex{epdf} library}
2222 The \type{epdf} library provides Lua bindings to many \PDF\ access functions
2223 that are defined by the poppler pdf viewer library (written in C$+{}+$
2224 by Kristian H\o gsberg, based on xpdf by Derek Noonburg).
Within \LUATEX\ (and \PDFTEX),
xpdf functionality has long been used to embed \PDF\ files.
The \type{epdf} library makes it possible to inspect an external \PDF\ file.
It gives access to its document structure,
e.\,g., catalog, cross-reference table, individual pages, objects,
annotations, info, and metadata. The \LUATEX\ team is evaluating
the possibility of reducing the binding to basic low-level \PDF\
primitives and delegating the complete set of functions
to an external shared object module.
2236 The \type{epdf} library is still in alpha state:
2237 \PDF\ access is currently read|-|only
2238 (it's not yet possible to alter a \PDF\ file or to assemble it from scratch),
2239 and many function bindings are still missing.
To start,
a \PDF\ file is opened by calling \type{epdf.open()} with its file name, e.\,g.:
2244 \starttyping
2245 doc = epdf.open("foo.pdf")
2246 \stoptyping
This normally returns a \type{PDFDoc} userdata variable;
if the file could not be opened successfully,
the value \type{nil} is returned instead of a fatal error being raised.
2252 All Lua functions in the \type{epdf} library are named after the
2253 poppler functions listed in the poppler header files for the various classes,
2254 e.\,g., files \type{PDFDoc.h}, \type{Dict.h}, and \type{Array.h}.
2255 These files can be found in the poppler subdirectory within the \LUATEX\ sources.
2256 Which functions are already implemented in the \type{epdf} library
2257 can be found in the \LUATEX\ source file \type{lepdflib.cc}.
2258 For using the \type{epdf} library,
2259 knowledge of the \PDF\ file architecture is indispensable.
2261 There are many different userdata types defined
2262 by the \type{epdf} library, currently these are
2263 \type{AnnotBorderStyle},
2264 \type{AnnotBorder},
2265 \type{Annots},
2266 \type{Annot},
2267 \type{Array},
2268 \type{Attribute},
2269 \type{Catalog},
2270 \type{Dict},
2271 \type{EmbFile},
2272 \type{GString},
2273 \type{LinkDest},
2274 \type{Links},
2275 \type{Link},
2276 \type{ObjectStream},
2277 \type{Object},
2278 \type{PDFDoc},
2279 \type{PDFRectangle},
2280 \type{Page},
2281 \type{Ref},
2282 \type{Stream},
2283 \type{StructElement},
\type{StructTreeRoot},
\type{TextSpan},
\type{XRefEntry}
and
\type{XRef}.
All these userdata names and the Lua access functions closely resemble
the class names in the poppler header files,
including the choice of mixed upper and lower case letters.
2294 The Lua function calls use object-oriented syntax, e.\,g.,
2295 the following calls return the \type{Page} object for page~1:
2297 \starttyping
2298 pageref = doc:getCatalog():getPageRef(1)
2299 pageobj = doc:getXRef():fetch(pageref.num, pageref.gen)
2300 \stoptyping
But writing such chained calls is risky,
as an intermediate function may return \type{nil} on error;
the result of each call should therefore be checked in Lua
(e.\,g., against \type{nil}) before the next call is made.
2306 If a non-object item is requested
2307 (e.\,g., a \type{Dict} item by calling \type{page:getPieceInfo()},
2308 cf.~\type{Page.h}) but not available,
2309 the Lua functions return \type{nil} (without error).
If a function should return an \type{Object} that does not exist,
2311 a \type{Null} object is returned instead
2312 (also without error; this is in|-|line with poppler behavior).
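
A more defensive variant of such a call chain might look like the sketch
below, which checks every intermediate result before going on. The function
name \type{page_count} and the error messages are made up for the example;
the optional second argument only exists so that the function can be tried
with a stand|-|in for \type{epdf.open}.

\starttyping
local function page_count(filename, open)
    open = open or (epdf and epdf.open)
    if not open then
        return nil, "epdf library not available"
    end
    local doc = open(filename)
    if not doc then
        return nil, "cannot open " .. filename
    end
    local catalog = doc:getCatalog()
    if not catalog or not catalog:isOK() then
        return nil, "no usable catalog in " .. filename
    end
    return catalog:getNumPages()
end
\stoptyping

A call like \type{page_count("foo.pdf")} then returns either the number of
pages, or \type{nil} followed by a message.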
2314 All library objects have a \type{__gc} metamethod for garbage collection.
2315 The \type{__tostring} metamethod gives the type name for each object.
2317 All object constructors:
2319 \startfunctioncall
2320 <PDFDoc> = epdf.open(<string> PDF filename)
2321 <Annot> = epdf.Annot(<XRef>, <Dict>, <Catalog>, <Ref>)
2322 <Annots> = epdf.Annots(<XRef>, <Catalog>, <Object>)
2323 <Array> = epdf.Array(<XRef>)
2324 <Attribute> = epdf.Attribute(<Type>,<Object>)| epdf.Attribute(<string>, <int>, <Object>)
2325 <Dict> = epdf.Dict(<XRef>)
2326 <Object> = epdf.Object()
2327 <PDFRectangle> = epdf.PDFRectangle()
2328 \stopfunctioncall
2330 The functions \type{StructElement_Type},
2331 \type{Attribute_Type} and
2332 \type{AttributeOwner_Type} return a hash table \type{{<string>,<integer>}}.
2336 \type{Annot} methods:
2338 \startfunctioncall
2339 <boolean> = <Annot>:isOK()
2340 <Object> = <Annot>:getAppearance()
2341 <AnnotBorder> = <Annot>:getBorder()
2342 <boolean> = <Annot>:match(<Ref>)
2343 \stopfunctioncall
2345 \type{AnnotBorderStyle} methods:
2347 \startfunctioncall
2348 <number> = <AnnotBorderStyle>:getWidth()
2349 \stopfunctioncall
2351 \type{Annots} methods:
2353 \startfunctioncall
2354 <integer> = <Annots>:getNumAnnots()
2355 <Annot> = <Annots>:getAnnot(<integer>)
2356 \stopfunctioncall
2358 \type{Array} methods:
2360 \startfunctioncall
2361 <Array>:incRef()
2362 <Array>:decRef()
2363 <integer> = <Array>:getLength()
2364 <Array>:add(<Object>)
2365 <Object> = <Array>:get(<integer>)
2366 <Object> = <Array>:getNF(<integer>)
2367 <string> = <Array>:getString(<integer>)
2368 \stopfunctioncall
2371 \type{Attribute} methods:
2373 \startfunctioncall
2374 <boolean> = <Attribute>:isOk()
2375 <integer> = <Attribute>:getType()
2376 <integer> = <Attribute>:getOwner()
2377 <string> = <Attribute>:getTypeName()
2378 <string> = <Attribute>:getOwnerName()
2379 <Object> = <Attribute>:getValue()
<Object> = <Attribute>:getDefaultValue()
2381 <string> = <Attribute>:getName()
2382 <integer> = <Attribute>:getRevision()
2383 <Attribute>:setRevision(<unsigned integer>)
<boolean> = <Attribute>:isHidden()
2385 <Attribute>:setHidden(<boolean>)
2386 <string> = <Attribute>:getFormattedValue()
2387 <string> = <Attribute>:setFormattedValue(<string>)
2388 \stopfunctioncall
2392 \type{Catalog} methods:
2394 \startfunctioncall
2395 <boolean> = <Catalog>:isOK()
2396 <integer> = <Catalog>:getNumPages()
2397 <Page> = <Catalog>:getPage(<integer>)
2398 <Ref> = <Catalog>:getPageRef(<integer>)
2399 <string> = <Catalog>:getBaseURI()
2400 <string> = <Catalog>:readMetadata()
2401 <Object> = <Catalog>:getStructTreeRoot()
2402 <integer> = <Catalog>:findPage(<integer> object number, <integer> object generation)
2403 <LinkDest> = <Catalog>:findDest(<string> name)
2404 <Object> = <Catalog>:getDests()
2405 <integer> = <Catalog>:numEmbeddedFiles()
2406 <EmbFile> = <Catalog>:embeddedFile(<integer>)
2407 <integer> = <Catalog>:numJS()
2408 <string> = <Catalog>:getJS(<integer>)
2409 <Object> = <Catalog>:getOutline()
2410 <Object> = <Catalog>:getAcroForm()
2411 \stopfunctioncall
2413 \type{EmbFile} methods:
2415 \startfunctioncall
2416 <string> = <EmbFile>:name()
2417 <string> = <EmbFile>:description()
2418 <integer> = <EmbFile>:size()
2419 <string> = <EmbFile>:modDate()
2420 <string> = <EmbFile>:createDate()
2421 <string> = <EmbFile>:checksum()
2422 <string> = <EmbFile>:mimeType()
2423 <Object> = <EmbFile>:streamObject()
2424 <boolean> = <EmbFile>:isOk()
2425 \stopfunctioncall
2427 \type{Dict} methods:
2429 \startfunctioncall
2430 <Dict>:incRef()
2431 <Dict>:decRef()
2432 <integer> = <Dict>:getLength()
2433 <Dict>:add(<string>, <Object>)
2434 <Dict>:set(<string>, <Object>)
2435 <Dict>:remove(<string>)
2436 <boolean> = <Dict>:is(<string>)
2437 <Object> = <Dict>:lookup(<string>)
2438 <Object> = <Dict>:lookupNF(<string>)
2439 <integer> = <Dict>:lookupInt(<string>, <string>)
2440 <string> = <Dict>:getKey(<integer>)
2441 <Object> = <Dict>:getVal(<integer>)
2442 <Object> = <Dict>:getValNF(<integer>)
2443 <boolean> = <Dict>:hasKey(<string>)
2444 \stopfunctioncall
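
As an example of how these methods combine, the sketch below collects all
keys of a \type{Dict} in sorted order. It assumes that \type{getKey} counts
entries from~1, following \LUA\ conventions; this should be checked against
\type{lepdflib.cc}. The function name \type{dict_keys} is made up for the
example.

\starttyping
local function dict_keys(dict)
    local keys = {}
    for i = 1, dict:getLength() do   -- assumption: indices start at 1
        keys[#keys + 1] = dict:getKey(i)
    end
    table.sort(keys)
    return keys
end
\stoptyping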
2446 \type{Link} methods:
2448 \startfunctioncall
2449 <boolean> = <Link>:isOK()
2450 <boolean> = <Link>:inRect(<number>, <number>)
2451 \stopfunctioncall
2453 \type{LinkDest} methods:
2455 \startfunctioncall
2456 <boolean> = <LinkDest>:isOK()
2457 <integer> = <LinkDest>:getKind()
2458 <string> = <LinkDest>:getKindName()
2459 <boolean> = <LinkDest>:isPageRef()
2460 <integer> = <LinkDest>:getPageNum()
2461 <Ref> = <LinkDest>:getPageRef()
2462 <number> = <LinkDest>:getLeft()
2463 <number> = <LinkDest>:getBottom()
2464 <number> = <LinkDest>:getRight()
2465 <number> = <LinkDest>:getTop()
2466 <number> = <LinkDest>:getZoom()
2467 <boolean> = <LinkDest>:getChangeLeft()
2468 <boolean> = <LinkDest>:getChangeTop()
2469 <boolean> = <LinkDest>:getChangeZoom()
2470 \stopfunctioncall
2472 \type{Links} methods:
2474 \startfunctioncall
2475 <integer> = <Links>:getNumLinks()
2476 <Link> = <Links>:getLink(<integer>)
2477 \stopfunctioncall
2479 \type{Object} methods:
2481 \startfunctioncall
2482 <Object>:initBool(<boolean>)
2483 <Object>:initInt(<integer>)
2484 <Object>:initReal(<number>)
2485 <Object>:initString(<string>)
2486 <Object>:initName(<string>)
2487 <Object>:initNull()
2488 <Object>:initArray(<XRef>)
2489 <Object>:initDict(<XRef>)
2490 <Object>:initStream(<Stream>)
2491 <Object>:initRef(<integer> object number, <integer> object generation)
2492 <Object>:initCmd(<string>)
2493 <Object>:initError()
2494 <Object>:initEOF()
2495 <Object> = <Object>:fetch(<XRef>)
2496 <integer> = <Object>:getType()
2497 <string> = <Object>:getTypeName()
2498 <boolean> = <Object>:isBool()
2499 <boolean> = <Object>:isInt()
2500 <boolean> = <Object>:isReal()
2501 <boolean> = <Object>:isNum()
2502 <boolean> = <Object>:isString()
2503 <boolean> = <Object>:isName()
2504 <boolean> = <Object>:isNull()
2505 <boolean> = <Object>:isArray()
2506 <boolean> = <Object>:isDict()
2507 <boolean> = <Object>:isStream()
2508 <boolean> = <Object>:isRef()
2509 <boolean> = <Object>:isCmd()
2510 <boolean> = <Object>:isError()
2511 <boolean> = <Object>:isEOF()
2512 <boolean> = <Object>:isNone()
2513 <boolean> = <Object>:getBool()
2514 <integer> = <Object>:getInt()
2515 <number> = <Object>:getReal()
2516 <number> = <Object>:getNum()
2517 <string> = <Object>:getString()
2518 <string> = <Object>:getName()
2519 <Array> = <Object>:getArray()
2520 <Dict> = <Object>:getDict()
2521 <Stream> = <Object>:getStream()
2522 <Ref> = <Object>:getRef()
2523 <integer> = <Object>:getRefNum()
2524 <integer> = <Object>:getRefGen()
2525 <string> = <Object>:getCmd()
2526 <integer> = <Object>:arrayGetLength()
<Object>:arrayAdd(<Object>)
2528 <Object> = <Object>:arrayGet(<integer>)
2529 <Object> = <Object>:arrayGetNF(<integer>)
2530 <integer> = <Object>:dictGetLength(<integer>)
<Object>:dictAdd(<string>, <Object>)
<Object>:dictSet(<string>, <Object>)
2533 <Object> = <Object>:dictLookup(<string>)
2534 <Object> = <Object>:dictLookupNF(<string>)
<string> = <Object>:dictGetKey(<integer>)
<Object> = <Object>:dictGetVal(<integer>)
<Object> = <Object>:dictGetValNF(<integer>)
2538 <boolean> = <Object>:streamIs(<string>)
<Object>:streamReset()
2540 <integer> = <Object>:streamGetChar()
2541 <integer> = <Object>:streamLookChar()
2542 <integer> = <Object>:streamGetPos()
<Object>:streamSetPos(<integer>)
2544 <Dict> = <Object>:streamGetDict()
2545 \stopfunctioncall
2547 \type{Page} methods:
2549 \startfunctioncall
2550 <boolean> = <Page>:isOk()
2551 <integer> = <Page>:getNum()
2552 <PDFRectangle> = <Page>:getMediaBox()
2553 <PDFRectangle> = <Page>:getCropBox()
2554 <boolean> = <Page>:isCropped()
2555 <number> = <Page>:getMediaWidth()
2556 <number> = <Page>:getMediaHeight()
2557 <number> = <Page>:getCropWidth()
2558 <number> = <Page>:getCropHeight()
2559 <PDFRectangle> = <Page>:getBleedBox()
2560 <PDFRectangle> = <Page>:getTrimBox()
2561 <PDFRectangle> = <Page>:getArtBox()
2562 <integer> = <Page>:getRotate()
2563 <string> = <Page>:getLastModified()
2564 <Dict> = <Page>:getBoxColorInfo()
2565 <Dict> = <Page>:getGroup()
2566 <Stream> = <Page>:getMetadata()
2567 <Dict> = <Page>:getPieceInfo()
2568 <Dict> = <Page>:getSeparationInfo()
2569 <Dict> = <Page>:getResourceDict()
2570 <Object> = <Page>:getAnnots()
2571 <Links> = <Page>:getLinks(<Catalog>)
2572 <Object> = <Page>:getContents()
2573 \stopfunctioncall
2575 \type{PDFDoc} methods:
2577 \startfunctioncall
2578 <boolean> = <PDFDoc>:isOk()
2579 <integer> = <PDFDoc>:getErrorCode()
2580 <string> = <PDFDoc>:getErrorCodeName()
2581 <string> = <PDFDoc>:getFileName()
2582 <XRef> = <PDFDoc>:getXRef()
2583 <Catalog> = <PDFDoc>:getCatalog()
2584 <number> = <PDFDoc>:getPageMediaWidth()
2585 <number> = <PDFDoc>:getPageMediaHeight()
2586 <number> = <PDFDoc>:getPageCropWidth()
2587 <number> = <PDFDoc>:getPageCropHeight()
2588 <integer> = <PDFDoc>:getNumPages()
2589 <string> = <PDFDoc>:readMetadata()
2590 <Object> = <PDFDoc>:getStructTreeRoot()
2591 <integer> = <PDFDoc>:findPage(<integer> object number, <integer> object generation)
2592 <Links> = <PDFDoc>:getLinks(<integer>)
2593 <LinkDest> = <PDFDoc>:findDest(<string>)
2594 <boolean> = <PDFDoc>:isEncrypted()
2595 <boolean> = <PDFDoc>:okToPrint()
2596 <boolean> = <PDFDoc>:okToChange()
2597 <boolean> = <PDFDoc>:okToCopy()
2598 <boolean> = <PDFDoc>:okToAddNotes()
2599 <boolean> = <PDFDoc>:isLinearized()
2600 <Object> = <PDFDoc>:getDocInfo()
2601 <Object> = <PDFDoc>:getDocInfoNF()
2602 <integer> = <PDFDoc>:getPDFMajorVersion()
2603 <integer> = <PDFDoc>:getPDFMinorVersion()
2604 \stopfunctioncall
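The \type{PDFDoc} accessors listed above can be combined into a short summary
of a document. The following is only a sketch: the function name
\type{describe_pdf} and the keys of the returned table are hypothetical, and
\type{doc} is assumed to be a \type{PDFDoc} object obtained elsewhere.

\starttyping
-- Summarize a PDFDoc object; returns nil plus an error name on failure.
local function describe_pdf(doc)
  if not doc:isOk() then
    return nil, doc:getErrorCodeName()
  end
  return {
    filename  = doc:getFileName(),
    pages     = doc:getNumPages(),
    version   = doc:getPDFMajorVersion() .. "." .. doc:getPDFMinorVersion(),
    encrypted = doc:isEncrypted(),
  }
end
\stoptyping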
2606 \type{PDFRectangle} methods:
2608 \startfunctioncall
2609 <boolean> = <PDFRectangle>:isValid()
2610 \stopfunctioncall
2612 %\type{Ref} methods:
2614 %\startfunctioncall
2615 %\stopfunctioncall
2617 \type{Stream} methods:
2619 \startfunctioncall
2620 <integer> = <Stream>:getKind()
2621 <string> = <Stream>:getKindName()
<Stream>:reset()
<Stream>:close()
2624 <integer> = <Stream>:getChar()
2625 <integer> = <Stream>:lookChar()
2626 <integer> = <Stream>:getRawChar()
2627 <integer> = <Stream>:getUnfilteredChar()
<Stream>:unfilteredReset()
2629 <integer> = <Stream>:getPos()
2630 <boolean> = <Stream>:isBinary()
2631 <Stream> = <Stream>:getUndecodedStream()
2632 <Dict> = <Stream>:getDict()
2633 \stopfunctioncall
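A typical use of the \type{Stream} methods is to read the decoded data byte by
byte. The sketch below is not a definitive implementation: the helper name
\type{read_stream} and its \type{maxbytes} guard are hypothetical, and it
assumes that \type{getChar()} returns a negative value once the end of the
data is reached.

\starttyping
-- Read at most maxbytes bytes from a Stream object into a Lua string.
local function read_stream(stream, maxbytes)
  local bytes = {}
  stream:reset()
  while #bytes < maxbytes do
    local c = stream:getChar()
    if c < 0 then      -- assumption: a negative value signals end of data
      break
    end
    bytes[#bytes + 1] = string.char(c)
  end
  return table.concat(bytes)
end
\stoptyping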
2635 \type{StructElement} methods:
2637 \startfunctioncall
2638 <string> = <StructElement>:getTypeName()
2639 <integer> = <StructElement>:getType()
2640 <boolean> = <StructElement>:isOk()
2641 <boolean> = <StructElement>:isBlock()
2642 <boolean> = <StructElement>:isInline()
2643 <boolean> = <StructElement>:isGrouping()
2644 <boolean> = <StructElement>:isContent()
2645 <boolean> = <StructElement>:isObjectRef()
2646 <integer> = <StructElement>:getMCID()
2647 <Ref> = <StructElement>:getObjectRef()
2648 <Ref> = <StructElement>:getParentRef()
2649 <boolean> = <StructElement>:hasPageRef()
2650 <Ref> = <StructElement>:getPageRef()
2651 <StructTreeRoot> = <StructElement>:getStructTreeRoot()
2652 <string> = <StructElement>:getID()
2653 <string> = <StructElement>:getLanguage()
2654 <integer> = <StructElement>:getRevision()
2655 <StructElement>:setRevision(<unsigned integer>)
2656 <string> = <StructElement>:getTitle()
2657 <string> = <StructElement>:getExpandedAbbr()
2658 <integer> = <StructElement>:getNumChildren()
2659 <StructElement> = <StructElement>:getChild()
<StructElement>:appendChild(<StructElement>)
<integer> = <StructElement>:getNumAttributes()
<Attribute> = <StructElement>:getAttribute(<integer>)
<string> = <StructElement>:appendAttribute(<Attribute>)
<Attribute> = <StructElement>:findAttribute(<Attribute::Type>, <boolean>, <Attribute::Owner>)
2665 <string> = <StructElement>:getAltText()
2666 <string> = <StructElement>:getActualText()
2667 <string> = <StructElement>:getText(<boolean>)
2668 <table> = <StructElement>:getTextSpans()
2669 \stopfunctioncall
2672 \type{StructTreeRoot} methods:
2674 \startfunctioncall
2675 <StructElement> = <StructTreeRoot>:findParentElement
2676 <PDFDoc> = <StructTreeRoot>:getDoc
2677 <Dict> = <StructTreeRoot>:getRoleMap
2678 <Dict> = <StructTreeRoot>:getClassMap
2679 <integer> = <StructTreeRoot>:getNumChildren
2680 <StructElement> = <StructTreeRoot>:getChild
2681 <StructTreeRoot>:appendChild
2682 <StructElement> = <StructTreeRoot>:findParentElement
2683 \stopfunctioncall
\type{TextSpan} has only one method:

\startfunctioncall
<string> = <TextSpan>:getText()
2690 \stopfunctioncall
2695 \type{XRef} methods:
2697 \startfunctioncall
2698 <boolean> = <XRef>:isOk()
2699 <integer> = <XRef>:getErrorCode()
2700 <boolean> = <XRef>:isEncrypted()
2701 <boolean> = <XRef>:okToPrint()
2702 <boolean> = <XRef>:okToPrintHighRes()
2703 <boolean> = <XRef>:okToChange()
2704 <boolean> = <XRef>:okToCopy()
2705 <boolean> = <XRef>:okToAddNotes()
2706 <boolean> = <XRef>:okToFillForm()
2707 <boolean> = <XRef>:okToAccessibility()
2708 <boolean> = <XRef>:okToAssemble()
2709 <Object> = <XRef>:getCatalog()
2710 <Object> = <XRef>:fetch(<integer> object number, <integer> object generation)
2711 <Object> = <XRef>:getDocInfo()
2712 <Object> = <XRef>:getDocInfoNF()
2713 <integer> = <XRef>:getNumObjects()
2714 <integer> = <XRef>:getRootNum()
2715 <integer> = <XRef>:getRootGen()
2716 <integer> = <XRef>:getSize()
2717 <Object> = <XRef>:getTrailerDict()
2718 \stopfunctioncall
2720 %***********************************************************************
2722 \section{The \luatex{font} library}
The font library provides the interface into the internals of the font
system, and it also contains helper functions to load traditional
\TEX\ font metrics formats. Other font loading functionality is
provided by the \luatex{fontloader} library, which is discussed in
the next section.
2730 \subsection{Loading a \TFM\ file}
2732 The behavior documented in this subsection is considered stable in the
2733 sense that there will not be backward-incompatible changes any more.
2735 \startfunctioncall
2736 <table> fnt = font.read_tfm(<string> name, <number> s)
2737 \stopfunctioncall
2739 The number is a bit special:
2741 \startitemize
2742 \item if it is positive, it specifies an \quote{at size} in scaled points.
2743 \item if it is negative, its absolute value represents a \quote{scaled}
2744 setting relative to the designsize of the font.
2745 \stopitemize
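The two interpretations of the size argument can be wrapped in small helpers.
This is a sketch under stated assumptions: the helper names are hypothetical,
one point is taken to be 65536 scaled points, and the \quote{scaled} value is
assumed to be expressed in thousandths of the design size, as in \TEX's
\type{scaled} keyword.

\starttyping
-- Convert points to scaled points (1pt = 65536sp), rounded to an integer.
local function pt_to_sp(pt)
  return math.floor(pt * 65536 + 0.5)
end

-- Load a TFM file at an explicit size given in points (positive argument).
local function read_tfm_at(name, pt)
  return font.read_tfm(name, pt_to_sp(pt))
end

-- Load a TFM file scaled relative to its design size (negative argument).
-- Assumption: permille = 1000 means the design size itself.
local function read_tfm_scaled(name, permille)
  return font.read_tfm(name, -permille)
end
\stoptyping

For example, \type{read_tfm_at("cmr10", 12)} would request the font at 12pt,
while \type{read_tfm_scaled("cmr10", 1200)} would request it at 1.2 times the
design size.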
2747 The internal structure of the metrics font table that is returned is
2748 explained in \in{chapter}[fonts].
2750 \subsection{Loading a \VF\ file}
2752 The behavior documented in this subsection is considered stable in the
2753 sense that there will not be backward-incompatible changes any more.
2755 \startfunctioncall
2756 <table> vf_fnt = font.read_vf(<string> name, <number> s)
2757 \stopfunctioncall
2759 The meaning of the number \type{s} and the format of the returned
2760 table are similar to the ones in the \luatex{read_tfm()} function.
2762 \subsection{The fonts array}
2764 The whole table of \TEX\ fonts is accessible from \LUA\ using a virtual array.
2766 \starttyping
2767 font.fonts[n] = { ... }
2768 <table> f = font.fonts[n]
2769 \stoptyping
2771 See \in{chapter}[fonts] for the structure of the tables. Because this
2772 is a virtual array, you cannot call \type{pairs} on it, but see below
2773 for the \type{font.each} iterator.
2775 The two metatable functions implementing the virtual array are:
2777 \startfunctioncall
2778 <table> f = font.getfont(<number> n)
2779 font.setfont(<number> n, <table> f)
2780 \stopfunctioncall
Note that at the moment, each access to \type{font.fonts} and each call
to \type{font.getfont} creates a \LUA\ table for the whole font. This
process can be quite slow. In a later version of \LUATEX, this
interface will change (it will start using userdata objects instead of
actual tables).
Also note the following: assignments can only be made to fonts that
have already been defined in \TEX, but have not been accessed {\it at
all\/} since that definition. This limits the usability of the write
access to \type{font.fonts} quite a lot; a less stringent ruleset will
likely be implemented later.
2794 \subsection{Checking a font's status}
2796 You can test for the status of a font by calling this function:
2798 \startfunctioncall
2799 <boolean> f = font.frozen(<number> n)
2800 \stopfunctioncall
2802 The return value is one of \type{true} (unassignable), \type{false} (can be changed)
2803 or \type{nil} (not a valid font at all).
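Because the return value has three possible states, it is worth testing for
\type{nil} explicitly before attempting an assignment. A minimal sketch, with
the hypothetical helper name \type{replace_font}:

\starttyping
-- Assign a new font table to slot n only if that slot is still assignable.
local function replace_font(n, f)
  local frozen = font.frozen(n)
  if frozen == nil then
    return false, "not a valid font"
  elseif frozen then
    return false, "font can no longer be assigned to"
  end
  font.setfont(n, f)
  return true
end
\stoptyping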
2805 \subsection{Defining a font directly}
2807 You can define your own font into \luatex{font.fonts} by calling this function:
2809 \startfunctioncall
2810 <number> i = font.define(<table> f)
2811 \stopfunctioncall
2813 The return value is the internal id number of the defined font (the
2814 index into \luatex{font.fonts}). If the font creation fails, an error is
2815 raised. The table is a font structure, as explained in
2816 \in{chapter}[fonts].
2818 \subsection{Projected next font id}
2820 \startfunctioncall
2821 <number> i = font.nextid()
2822 \stopfunctioncall
2824 This returns the font id number that would be returned by a
2825 \type{font.define} call if it was executed at this spot in the code
2826 flow. This is useful for virtual fonts that need to reference
2827 themselves.
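The self|-|reference case could be sketched as follows. The helper name
\type{define_self_referencing} is hypothetical, and the sketch assumes that
the font table accepts a \type{fonts} array whose entries may carry an
\type{id} key, as described in \in{chapter}[fonts].

\starttyping
-- Define a font whose list of base fonts includes the font itself.
local function define_self_referencing(f)
  local id = font.nextid()              -- id that font.define should return
  f.fonts = f.fonts or {}
  f.fonts[#f.fonts + 1] = { id = id }   -- assumption: 'id' entry in 'fonts'
  local defined = font.define(f)
  assert(defined == id, "font id changed between nextid and define")
  return defined
end
\stoptyping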
2829 \subsection{Font id (0.47)}
2831 \startfunctioncall
2832 <number> i = font.id(<string> csname)
2833 \stopfunctioncall
This returns the font id associated with the \type{csname} string, or $-1$
if \type{csname} is not defined; new in 0.47.
2838 \subsection{Currently active font}
2840 \startfunctioncall
2841 <number> i = font.current()
2842 font.current(<number> i)
2843 \stopfunctioncall
2845 This gets or sets the currently used font number.
2847 \subsection{Maximum font id}
2849 \startfunctioncall
2850 <number> i = font.max()
2851 \stopfunctioncall
2853 This is the largest used index in \type{font.fonts}.
2855 \subsection{Iterating over all fonts}
2857 \startfunctioncall
for i,v in font.each() do
  ...
end
2861 \stopfunctioncall
2863 This is an iterator over each of the defined \TEX\ fonts. The first
2864 returned value is the index in \type{font.fonts}, the second the font
2865 itself, as a \LUA\ table. The indices are listed incrementally, but they
2866 do not always form an array of consecutive numbers: in some cases
2867 there can be holes in the sequence.
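As a small illustration, the iterator can be used to build a listing of all
defined fonts. The helper name \type{font_names} is hypothetical, and the
\type{name} key is assumed to be present in the font tables (see
\in{chapter}[fonts]); \type{tostring} guards against it being absent.

\starttyping
-- Return an array of "id: name" strings, one per defined font.
local function font_names()
  local names = {}
  for id, f in font.each() do
    names[#names + 1] = string.format("%d: %s", id, tostring(f.name))
  end
  return names
end
\stoptyping

Since every step creates a \LUA\ table for a whole font, such a listing is
better built once and cached than recomputed repeatedly.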
2869 \section{The \luatex{fontloader} library (0.36)}
2871 \subsection{Getting quick information on a font}
2873 \startfunctioncall
2874 <table> info = fontloader.info(<string> filename)
2875 \stopfunctioncall
2877 This function returns either \type{nil}, or a \type{table}, or an
2878 array of small tables (in the case of a TrueType collection). The
2879 returned table(s) will contain some fairly interesting information
2880 items from the font(s) defined by the file:
2882 \starttabulate[|lT|l|p|]
2883 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
2884 \NC fontname \NC string \NC the \POSTSCRIPT\ name of the font\NC\NR
2885 \NC fullname \NC string \NC the formal name of the font\NC\NR
2886 \NC familyname \NC string \NC the family name this font belongs to\NC\NR
2887 \NC weight \NC string \NC a string indicating the color value of the font\NC\NR
2888 \NC version \NC string \NC the internal font version\NC\NR
2889 \NC italicangle \NC float \NC the slant angle\NC\NR
2890 \NC units_per_em \NC number \NC (since 0.78.2) 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC\NR
2891 \NC pfminfo \NC table \NC (since 0.78.2) (see \in{section}[fontloaderpfminfotable])\NC\NR
2892 \stoptabulate
Getting information through this function is (sometimes much) more
efficient than loading the font properly, and is therefore handy when
you want to create a dictionary of available fonts based on the
contents of a directory.
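Such a dictionary could be built along the following lines. This is a sketch
only: the function name \type{catalogue} is hypothetical, the list of file
names is assumed to come from elsewhere, and a single font is distinguished
from a collection by the presence of a top|-|level \type{fontname} key.

\starttyping
-- Map PostScript font names to the files that provide them.
local function catalogue(filenames)
  local byname = {}
  for _, filename in ipairs(filenames) do
    local info = fontloader.info(filename)
    if info then
      -- one table for a single font, an array of tables for a collection
      local list = info.fontname and { info } or info
      for _, entry in ipairs(list) do
        byname[entry.fontname] = filename
      end
    end
  end
  return byname
end
\stoptyping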
2899 \subsection{Loading an \OPENTYPE\ or \TRUETYPE\ file}
2901 If you want to use an \OPENTYPE\ font, you have to get the metric
2902 information from somewhere. Using the \type{fontloader} library, the
2903 simplest way to get that information is thus:
2905 \starttyping
function load_font (filename)
  local metrics = nil
  local font = fontloader.open(filename)
  if font then
     -- convert to a table, then release the userdata object
     metrics = fontloader.to_table(font)
     fontloader.close(font)
  end
  return metrics
end

myfont = load_font('/opt/tex/texmf/fonts/data/arial.ttf')
2917 \stoptyping
2919 The main function call is
2921 \startfunctioncall
2922 <userdata> f, <table> w = fontloader.open(<string> filename)
2923 <userdata> f, <table> w = fontloader.open(<string> filename, <string> fontname)
2924 \stopfunctioncall
The first return value is a userdata representation of the font. The
second return value is a table containing any warnings and errors
reported by fontloader while opening the font. In normal typesetting,
you would probably ignore the second return value, but it can be useful
for debugging purposes.
2932 For \TRUETYPE\ collections (when filename ends in 'ttc') and \DFONT\
2933 collections, you have to use a second string argument to specify which
2934 font you want from the collection. Use the \type{fontname}
2935 strings that are returned by \type{fontloader.info} for that.
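The combination of \type{fontloader.info} and \type{fontloader.open} for a
collection might look like this. The helper name \type{open_member} and the
idea of selecting a font by its position in the info array are hypothetical
conveniences, not part of the library.

\starttyping
-- Open the index-th font of a collection file.
local function open_member(filename, index)
  local info = fontloader.info(filename)
  local entry = info and info[index]
  if not entry then
    return nil, "no such font in collection"
  end
  -- returns the userdata object and the table of warnings
  return fontloader.open(filename, entry.fontname)
end
\stoptyping

The caller remains responsible for calling \type{fontloader.close} on the
returned object.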
2937 To turn the font into a table, \type{fontloader.to_table} is used on
2938 the font returned by \type{fontloader.open}.
2940 \startfunctioncall
2941 <table> f = fontloader.to_table(<userdata> font)
2942 \stopfunctioncall
2944 This table cannot be used directly by \LUATEX\ and should be turned
2945 into another one as described in~\in{chapter}[fonts].
2946 Do not forget to store the \type{fontname} value in the \type{psname}
2947 field of the metrics table to be returned to \LUATEX, otherwise the
2948 font inclusion backend will not be able to find the correct font in
2949 the collection.
2951 See \in{section}[fontloadertables] for details on the userdata object
2952 returned by \type{fontloader.open()} and the layout of the
2953 \type{metrics} table returned by \type{fontloader.to_table()}.
2955 The font file is parsed and partially interpreted by the font
2956 loading routines from \FONTFORGE. The file format can be \OPENTYPE,
2957 \TRUETYPE, \TRUETYPE\ Collection, \CFF, or \TYPEONE.
2959 There are a few advantages to this approach compared to reading the
2960 actual font file ourselves:
2962 \startitemize
2964 \item The font is automatically re|-|encoded, so that the \type{metrics}
2965 table for \TRUETYPE\ and \OPENTYPE\ fonts is using \UNICODE\ for
2966 the character indices.
2968 \item Many features are pre|-|processed into a format that is easier to handle
2969 than just the bare tables would be.
2971 \item \POSTSCRIPT|-|based \OPENTYPE\ fonts do not store the character height and
2972 depth in the font file, so the character boundingbox has to be
2973 calculated in some way.
2975 \item In the future, it may be interesting to allow \LUA\ scripts access to
2976 the font program itself, perhaps even creating or changing the font.
2978 \stopitemize
2980 A loaded font is discarded with:
2982 \startfunctioncall
2983 fontloader.close(<userdata> font)
2984 \stopfunctioncall
2986 \subsection{Applying a \quote{feature file}}
2988 You can apply a \quote{feature file} to a loaded font:
2990 \startfunctioncall
2991 <table> errors = fontloader.apply_featurefile(<userdata> font, <string> filename)
2992 \stopfunctioncall
2994 A \quote{feature file} is a textual representation of the features in an
2995 \OPENTYPE\ font. See\crlf
2996 \hyphenatedurl {http://www.adobe.com/devnet/opentype/afdko/topic_feature_file_syntax.html}\crlf
2997 and\crlf
2998 \hyphenatedurl {http://fontforge.sourceforge.net/featurefile.html}\crlf
2999 for a more detailed description of feature files.
3001 If the function fails, the return value is a table containing any
3002 errors reported by fontloader while applying the feature file. On
3003 success, \type{nil} is returned. (the return value is new in 0.65)
3007 \subsection{Applying an \quote{\AFM\ file}}
3009 You can apply an \quote{\AFM\ file} to a loaded font:
3011 \startfunctioncall
3012 <table> errors = fontloader.apply_afmfile(<userdata> font, <string> filename)
3013 \stopfunctioncall
3015 An \AFM\ file is a textual representation of (some of) the meta information
3016 in a \TYPEONE\ font. See \hyphenatedurl{ftp://ftp.math.utah.edu/u/ma/hohn/linux/postscript/5004.AFM_Spec.pdf}
3017 for more information about afm files.
3019 Note: If you \type{fontloader.open()} a \TYPEONE\ file named \type{font.pfb},
3020 the library will automatically search for and apply \type{font.afm}
3021 if it exists in the same directory as the file \type{font.pfb}. In that case,
3022 there is no need for an explicit call to \type{apply_afmfile()}.
3024 If the function fails, the return value is a table containing any
3025 errors reported by fontloader while applying the AFM file. On
3026 success, \type{nil} is returned. (the return value is new in 0.65)
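Since both \type{apply_featurefile} and \type{apply_afmfile} return \type{nil}
on success and a table of errors on failure, one wrapper can serve both. This
is a sketch: the name \type{apply_or_report} is hypothetical, and the error
table is assumed to be an array whose items can be converted to strings.

\starttyping
-- Call one of the fontloader.apply_* functions and turn its result into
-- the usual Lua convention: true, or false plus a message.
local function apply_or_report(apply, f, filename)
  local errors = apply(f, filename)
  if errors then
    local messages = {}
    for i, e in ipairs(errors) do
      messages[i] = tostring(e)
    end
    return false, table.concat(messages, "\n")
  end
  return true
end
\stoptyping

A call would then read
\type{apply_or_report(fontloader.apply_featurefile, f, "extra.fea")}, where the
file name is only an example.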
3028 \subsection[fontloadertables]{Fontloader font tables}
3030 As mentioned earlier, the return value of \type{fontloader.open()} is
3031 a userdata object. In \LUATEX\ versions before 0.63, the only way to
3032 have access to the actual metrics was to call
3033 \type{fontloader.to_table()} on this object, returning the table
3034 structure that is explained in the following subsections.
3036 However, it turns out that the result from
3037 \type{fontloader.to_table()} sometimes needs very large amounts of memory
3038 (depending on the font's complexity and size) so starting with \LUATEX\ 0.63,
3039 it is possible to access the userdata object directly.
In \LUATEX\ 0.63.0, the following is implemented:
3043 \startitemize
3044 \item all top-level keys that would be returned by \type{to_table()}
3045 can also be accessed directly.
3046 %\item the top-level key \quote{glyphs} returns a {\it virtual\/} array that
3047 % allows indices from \type{0} to ($\type{f.glyphmax}-1$).
3048 \item the top-level key \quote{glyphs} returns a {\it virtual\/} array that
3049 allows indices from \type{f.glyphmin} to (\type{f.glyphmax}).
3050 \item the items in that virtual array (the actual glyphs) are themselves also
3051 userdata objects, and each has accessors for all of the keys
3052 explained in the section \quote{Glyph items} below.
3053 \item the top-level key \quote{subfonts} returns an {\it actual} array of
3054 userdata objects, one for each of the subfonts (or nil, if there are no subfonts).
3055 \stopitemize
3058 A short example may be helpful. This code generates a printout of all
3059 the glyph names in the font \type{PunkNova.kern.otf}:
3062 \starttyping
local f = fontloader.open('PunkNova.kern.otf')
print (f.fontname)
if f.glyphcnt > 0 then
    for i=f.glyphmin,f.glyphmax do
       local g = f.glyphs[i]
       if g then
          print(g.name)
       end
    end
end
fontloader.close(f)
3076 \stoptyping
In this case, the \LUATEX\ memory requirement stays below 100MB on the
test computer, while the internal structure generated by
\type{to_table()} needs more than 2GB of memory (the font itself is
6.9MB in disk size).
In \LUATEX\ 0.63 only the top-level font, the subfont table entries,
and the glyphs are virtual objects; everything else still produces
normal \LUA\ values and tables. In future versions, more return values
may be replaced by userdata objects (as much as needed to keep the
memory requirements in check).
3089 If you want to know the valid fields in a font or glyph
3090 structure, call the \type{fields} function on an object of a
3091 particular type (either glyph or font for now, more will be
3092 implemented later):
3094 \startfunctioncall
3095 <table> fields = fontloader.fields(<userdata> font)
3096 <table> fields = fontloader.fields(<userdata> font_glyph)
3097 \stopfunctioncall
3099 For instance:
3101 \startfunctioncall
3102 local fields = fontloader.fields(f)
3103 local fields = fontloader.fields(f.glyphs[0])
3104 \stopfunctioncall
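The result can be used to find out which fields actually carry a value for a
given object. The sketch below assumes that \type{fontloader.fields} returns
an array of key names and that indexing the object with an unset key yields
\type{nil}; the helper name \type{present_fields} is hypothetical.

\starttyping
-- List the field names of a font or glyph object that are not nil.
local function present_fields(object)
  local present = {}
  for _, key in ipairs(fontloader.fields(object)) do
    if object[key] ~= nil then
      present[#present + 1] = key
    end
  end
  return present
end
\stoptyping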
3107 \subsubsection{Table types}
3109 \subsubsubsection{Top-level}
3111 The top|-|level keys in the returned table are (the explanations in
3112 this part of the documentation are not yet finished):
3114 \starttabulate[|lT|l|p|]
3115 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3116 \NC table_version \NC number \NC indicates the metrics version (currently~0.3)\NC\NR
3117 \NC fontname \NC string \NC \POSTSCRIPT\ font name\NC\NR
3118 \NC fullname \NC string \NC official (human-oriented) font name\NC\NR
3119 \NC familyname \NC string \NC family name\NC\NR
3120 \NC weight \NC string \NC weight indicator\NC\NR
3121 \NC copyright \NC string \NC copyright information\NC\NR
3122 \NC filename \NC string \NC the file name\NC\NR
3123 \NC version \NC string \NC font version\NC\NR
3124 \NC italicangle \NC float \NC slant angle\NC\NR
3125 \NC units_per_em \NC number \NC 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC\NR
3126 \NC ascent \NC number \NC height of ascender in \type{units_per_em}\NC\NR
3127 \NC descent \NC number \NC depth of descender in \type{units_per_em}\NC\NR
3128 \NC upos \NC float \NC \NC\NR
3129 \NC uwidth \NC float \NC \NC\NR
3130 \NC uniqueid \NC number \NC \NC\NR
3131 \NC glyphs \NC array \NC \NC\NR
3132 \NC glyphcnt \NC number \NC number of included glyphs\NC\NR
\NC glyphmax \NC number \NC maximum used index in the glyphs array\NC\NR
\NC glyphmin \NC number \NC minimum used index in the glyphs array\NC\NR
3135 \NC hasvmetrics \NC number \NC \NC\NR
3136 \NC onlybitmaps \NC number \NC \NC\NR
3137 \NC serifcheck \NC number \NC \NC\NR
3138 \NC isserif \NC number \NC \NC\NR
3139 \NC issans \NC number \NC \NC\NR
3140 \NC encodingchanged \NC number \NC \NC\NR
3141 \NC strokedfont \NC number \NC \NC\NR
3142 \NC use_typo_metrics \NC number \NC \NC\NR
3143 \NC weight_width_slope_only \NC number \NC \NC\NR
3144 \NC head_optimized_for_cleartype \NC number \NC \NC\NR
3145 \NC uni_interp \NC enum \NC \type {unset}, \type {none}, \type {adobe},
3146 \type {greek}, \type {japanese}, \type {trad_chinese},
3147 \type {simp_chinese}, \type {korean}, \type {ams}\NC\NR
3148 \NC origname \NC string \NC the file name, as supplied by the user\NC\NR
3149 \NC map \NC table \NC \NC\NR
3150 \NC private \NC table \NC \NC\NR
3151 \NC xuid \NC string \NC \NC\NR
3152 \NC pfminfo \NC table \NC \NC\NR
3153 \NC names \NC table \NC \NC\NR
3154 \NC cidinfo \NC table \NC \NC\NR
3155 \NC subfonts \NC array \NC \NC\NR
\NC comments \NC string \NC \NC\NR
3157 \NC fontlog \NC string \NC \NC\NR
3158 \NC cvt_names \NC string \NC \NC\NR
3159 \NC anchor_classes \NC table \NC \NC\NR
3160 \NC ttf_tables \NC table \NC \NC\NR
3161 \NC ttf_tab_saved \NC table \NC \NC\NR
3162 \NC kerns \NC table \NC \NC\NR
3163 \NC vkerns \NC table \NC \NC\NR
3164 \NC texdata \NC table \NC \NC\NR
3165 \NC lookups \NC table \NC \NC\NR
3166 \NC gpos \NC table \NC \NC\NR
3167 \NC gsub \NC table \NC \NC\NR
3168 \NC mm \NC table \NC \NC\NR
3169 \NC chosenname \NC string \NC \NC\NR
3170 \NC macstyle \NC number \NC \NC\NR
3171 \NC fondname \NC string \NC \NC\NR
3172 %\NC design_size \NC number \NC \NC\NR
3173 \NC fontstyle_id \NC number \NC \NC\NR
3174 \NC fontstyle_name \NC table \NC \NC\NR
3175 %\NC design_range_bottom \NC number \NC \NC\NR
3176 %\NC design_range_top \NC number \NC \NC\NR
3177 \NC strokewidth \NC float \NC \NC\NR
3178 \NC mark_classes \NC table \NC \NC\NR
3179 \NC creationtime \NC number \NC \NC\NR
3180 \NC modificationtime \NC number \NC \NC\NR
3181 \NC os2_version \NC number \NC \NC\NR
3182 \NC sfd_version \NC number \NC \NC\NR
3183 \NC math \NC table \NC \NC\NR
3184 \NC validation_state \NC table \NC \NC\NR
3185 \NC horiz_base \NC table \NC \NC\NR
3186 \NC vert_base \NC table \NC \NC\NR
3187 \NC extrema_bound \NC number \NC \NC\NR
3188 \stoptabulate
3190 \subsubsubsection{Glyph items}
3192 The \type{glyphs} is an array containing the per|-|character
3193 information (quite a few of these are only present if nonzero).
3195 \starttabulate[|lT|l|p|]
3196 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3197 \NC name \NC string \NC the glyph name\NC\NR
3198 \NC unicode \NC number \NC unicode code point, or -1\NC\NR
3199 \NC boundingbox \NC array \NC array of four numbers, see note below\NC\NR
3200 \NC width \NC number \NC only for horizontal fonts\NC\NR
3201 \NC vwidth \NC number \NC only for vertical fonts\NC\NR
3202 \NC tsidebearing \NC number \NC only for vertical ttf/otf fonts, and only if nonzero (0.79.0)\NC\NR
3203 \NC lsidebearing \NC number \NC only if nonzero and not equal to boundingbox[1]\NC\NR
3204 \NC class \NC string \NC one of "none", "base", "ligature", "mark", "component"
3205 (if not present, the glyph class is \quote{automatic})\NC\NR
3206 \NC kerns \NC array \NC only for horizontal fonts, if set\NC\NR
3207 \NC vkerns \NC array \NC only for vertical fonts, if set\NC\NR
3208 \NC dependents \NC array \NC linear array of glyph name strings, only if nonempty\NC\NR
3209 \NC lookups \NC table \NC only if nonempty\NC\NR
3210 \NC ligatures \NC table \NC only if nonempty\NC\NR
3211 \NC anchors \NC table \NC only if set\NC\NR
3212 \NC comment \NC string \NC only if set\NC\NR
3213 \NC tex_height \NC number \NC only if set\NC\NR
3214 \NC tex_depth \NC number \NC only if set\NC\NR
3215 \NC italic_correction \NC number \NC only if set\NC\NR
3216 \NC top_accent \NC number \NC only if set\NC\NR
3217 \NC is_extended_shape \NC number \NC only if this character is part of a math extension list\NC\NR
3218 \NC altuni \NC table \NC alternate \UNICODE\ items \NC\NR
3219 \NC vert_variants \NC table \NC \NC \NR
3220 \NC horiz_variants \NC table \NC \NC \NR
3221 \NC mathkern \NC table \NC \NC \NR
3222 \stoptabulate
3224 On \type{boundingbox}: The boundingbox information for \TRUETYPE\ fonts and \TRUETYPE-based \OTF\ fonts is read
3225 directly from the font file. \POSTSCRIPT-based fonts do not have this information, so the boundingbox of
3226 traditional \POSTSCRIPT\ fonts is generated by interpreting the actual bezier curves to find the exact
3227 boundingbox. This can be a slow process, so starting from \LUATEX\ 0.45, the boundingboxes of \POSTSCRIPT-based
3228 \OTF\ fonts (and raw \CFF\ fonts) are calculated using an approximation of the glyph shape based on the actual
3229 glyph points only, instead of taking the whole curve into account. This means that glyphs that have missing
3230 points at extrema will have a too-tight boundingbox, but the processing is so much faster that in our opinion
3231 the tradeoff is worth it.
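As an example of how the \type{boundingbox} array might be used, the following
sketch derives a height and depth for a glyph. The helper name
\type{glyph_dimensions} is hypothetical, and the four numbers are assumed to
be ordered left, bottom, right, top, in font units.

\starttyping
-- Derive width, height and depth (in font units) from a glyph item.
local function glyph_dimensions(glyph)
  local bb = glyph.boundingbox
  if not bb then
    return nil
  end
  return {
    width  = glyph.width or 0,
    height = math.max(bb[4], 0),    -- extent above the baseline
    depth  = math.max(-bb[2], 0),   -- extent below the baseline
  }
end
\stoptyping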
3234 The \type{kerns} and \type{vkerns} are linear arrays of small hashes:
3236 \starttabulate[|lT|l|p|]
3237 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3238 \NC char \NC string \NC \NC\NR
3239 \NC off \NC number \NC \NC\NR
3240 \NC lookup \NC string \NC \NC\NR
3241 \stoptabulate
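A glyph's kern list can be flattened into a simple lookup table. The sketch
below makes two assumptions that the table above leaves open: that
\type{char} holds the name of the following glyph, and that \type{off} holds
the kern amount. The helper name \type{kerns_for} is hypothetical.

\starttyping
-- Turn the linear kerns array of a glyph into a name -> offset hash.
local function kerns_for(glyph)
  local result = {}
  for _, kern in ipairs(glyph.kerns or {}) do
    result[kern.char] = kern.off
  end
  return result
end
\stoptyping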
3243 The \type{lookups} is a hash, based on lookup subtable names, with
3244 the value of each key inside that a linear array of small hashes:
3246 % TODO: fix this description
3247 \starttabulate[|lT|l|p|]
3248 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3249 \NC type \NC enum \NC \type {position}, \type {pair}, \type {substitution}, \type {alternate},
3250 \type {multiple}, \type {ligature}, \type {lcaret}, \type {kerning}, \type {vkerning}, \type {anchors},
3251 \type {contextpos}, \type {contextsub}, \type {chainpos}, \type {chainsub},
3252 \type {reversesub}, \type {max}, \type {kernback}, \type {vkernback} \NC\NR
3253 \NC specification \NC table \NC extra data \NC\NR
3254 \stoptabulate
3256 For the first seven values of \type{type}, there can be additional
3257 sub|-|information, stored in the sub-table \type{specification}:
3259 \starttabulate[|lT|l|p|]
3260 \NC \ssbf value \NC \bf type \NC \bf explanation \NC\NR
3261 \NC position \NC table \NC a table of the \type {offset_specs} type\NC\NR
3262 \NC pair \NC table \NC one string: \type {paired}, and an array of one or
3263 two \type {offset_specs} tables: \type{offsets}\NC\NR
3264 \NC substitution \NC table \NC one string: \type {variant}\NC\NR
3265 \NC alternate \NC table \NC one string: \type {components}\NC\NR
3266 \NC multiple \NC table \NC one string: \type {components}\NC\NR
3267 \NC ligature \NC table \NC two strings: \type {components}, \type {char}\NC\NR
3268 \NC lcaret \NC array \NC linear array of numbers\NC\NR
3269 \stoptabulate
3271 Tables for \type{offset_specs} contain up to four number|-|valued
3272 fields: \type{x} (a horizontal offset), \type{y} (a vertical offset),
3273 \type{h} (an advance width correction) and \type{v} (an advance height
3274 correction).
3276 The \type{ligatures} is a linear array of small hashes:
3278 \starttabulate[|lT|l|p|]
3279 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3280 \NC lig \NC table \NC uses the same substructure as a single item in the \type{lookups} table explained above\NC\NR
3281 \NC char \NC string \NC \NC\NR
3282 \NC components \NC array \NC linear array of named components\NC\NR
3283 \NC ccnt \NC number \NC \NC\NR
3284 \stoptabulate
3286 The \type{anchor} table is indexed by a string signifying the
3287 anchor type, which is one of
3289 \starttabulate[|lT|l|p|]
3290 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3291 \NC mark \NC table \NC placement mark\NC\NR
3292 \NC basechar \NC table \NC mark for attaching combining items to a base char\NC\NR
3293 \NC baselig \NC table \NC mark for attaching combining items to a ligature\NC\NR
3294 \NC basemark \NC table \NC generic mark for attaching combining items to connect to\NC\NR
3295 \NC centry \NC table \NC cursive entry point\NC\NR
3296 \NC cexit \NC table \NC cursive exit point\NC\NR
3297 \stoptabulate
3299 The content of these is a short array of defined anchors, with the
3300 entry keys being the anchor names. For all except \type{baselig}, the
3301 value is a single table with this definition:
3303 \starttabulate[|lT|l|p|]
3304 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3305 \NC x \NC number \NC x location\NC\NR
3306 \NC y \NC number \NC y location\NC\NR
3307 \NC ttf_pt_index \NC number \NC truetype point index, only if given\NC\NR
3308 \stoptabulate
For \type{baselig}, the value is a small array of such anchor sets,
one for each constituent item of the ligature.
For clarification, an anchor table could for example look like this:
3315 \starttyping
['anchor'] = {
    ['basemark'] = {
        ['Anchor-7'] = { ['x']=170, ['y']=1080 }
    },
    ['mark'] = {
        ['Anchor-1'] = { ['x']=160, ['y']=810 },
        ['Anchor-4'] = { ['x']=160, ['y']=800 }
    },
    ['baselig'] = {
        [1] = { ['Anchor-2'] = { ['x']=160, ['y']=650 } },
        [2] = { ['Anchor-2'] = { ['x']=460, ['y']=640 } }
    }
}
3329 \stoptyping
3330 Note: The \type {baselig} table can be sparse!
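
Because \type{baselig} can be sparse, code that walks an anchor table should
not rely on \type{ipairs} for that entry. The following is a minimal sketch
in plain \LUA, not a definitive implementation: the helper
\type{collect_anchors} is hypothetical (it is not part of any library), and
the sample data simply repeats the shape described in this section.

\starttyping
-- Sample data in the shape of an anchor table.
local anchor = {
    basemark = { ['Anchor-7'] = { x = 170, y = 1080 } },
    mark = {
        ['Anchor-1'] = { x = 160, y = 810 },
        ['Anchor-4'] = { x = 160, y = 800 },
    },
    baselig = {
        [1] = { ['Anchor-2'] = { x = 160, y = 650 } },
        [2] = { ['Anchor-2'] = { x = 460, y = 640 } },
    },
}

-- collect_anchors is a hypothetical helper: it flattens an anchor
-- table into a list of records.
local function collect_anchors(anchor)
    local result = { }
    for kind, anchors in pairs(anchor) do
        if kind == "baselig" then
            -- pairs, not ipairs: the array may have holes
            for component, set in pairs(anchors) do
                for name, pos in pairs(set) do
                    result[#result+1] = { kind = kind, name = name,
                        component = component, x = pos.x, y = pos.y }
                end
            end
        else
            for name, pos in pairs(anchors) do
                result[#result+1] = { kind = kind, name = name,
                    x = pos.x, y = pos.y }
            end
        end
    end
    return result
end

local list = collect_anchors(anchor)
print(#list) -- 5 records for the sample data
\stoptyping

The order of the records is not defined, because \type{pairs} does not
guarantee any traversal order.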
3333 \subsubsubsection{map table}
3335 The top|-|level map is a list of encoding mappings. Each of those is a table itself.
3337 \starttabulate[|lT|l|p|]
3338 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3339 \NC enccount \NC number \NC \NC\NR
3340 \NC encmax \NC number \NC \NC\NR
3341 \NC backmax \NC number \NC \NC\NR
3342 \NC remap \NC table \NC \NC\NR
3343 \NC map \NC array \NC non|-|linear array of mappings\NC\NR
3344 \NC backmap \NC array \NC non|-|linear array of backward mappings\NC\NR
3345 \NC enc \NC table \NC \NC\NR
3346 \stoptabulate
3348 The \type{remap} table is very small:
3350 \starttabulate[|lT|l|p|]
3351 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3352 \NC firstenc \NC number \NC \NC\NR
3353 \NC lastenc \NC number \NC \NC\NR
3354 \NC infont \NC number \NC \NC\NR
3355 \stoptabulate
3357 The \type{enc} table is a bit more verbose:
3359 \starttabulate[|lT|l|p|]
3360 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3361 \NC enc_name \NC string \NC \NC\NR
3362 \NC char_cnt \NC number \NC \NC\NR
3363 \NC char_max \NC number \NC \NC\NR
3364 \NC unicode \NC array \NC of \UNICODE\ position numbers\NC\NR
3365 \NC psnames \NC array \NC of \POSTSCRIPT\ glyph names\NC\NR
3366 \NC builtin \NC number \NC \NC\NR
3367 \NC hidden \NC number \NC \NC\NR
3368 \NC only_1byte \NC number \NC \NC\NR
3369 \NC has_1byte \NC number \NC \NC\NR
3370 \NC has_2byte \NC number \NC \NC\NR
3371 \NC is_unicodebmp \NC number \NC only if nonzero\NC\NR
3372 \NC is_unicodefull \NC number \NC only if nonzero\NC\NR
3373 \NC is_custom \NC number \NC only if nonzero\NC\NR
3374 \NC is_original \NC number \NC only if nonzero\NC\NR
3375 \NC is_compact \NC number \NC only if nonzero\NC\NR
3376 \NC is_japanese \NC number \NC only if nonzero\NC\NR
3377 \NC is_korean \NC number \NC only if nonzero\NC\NR
3378 \NC is_tradchinese \NC number \NC only if nonzero [name?]\NC\NR
3379 \NC is_simplechinese \NC number \NC only if nonzero\NC\NR
3380 \NC low_page \NC number \NC \NC\NR
3381 \NC high_page \NC number \NC \NC\NR
3382 \NC iconv_name \NC string \NC \NC\NR
3383 \NC iso_2022_escape \NC string \NC \NC\NR
3384 \stoptabulate
3386 \subsubsubsection{private table}
3388 This is the font's private \POSTSCRIPT\ dictionary, if any. Keys and
3389 values are both strings.
3391 \subsubsubsection{cidinfo table}
3393 \starttabulate[|lT|l|p|]
3394 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3395 \NC registry \NC string \NC \NC\NR
3396 \NC ordering \NC string \NC \NC\NR
3397 \NC supplement \NC number \NC \NC\NR
3398 \NC version \NC number \NC \NC\NR
3399 \stoptabulate
3401 \subsubsubsection[fontloaderpfminfotable]{pfminfo table}
3403 The \type{pfminfo} table contains most of the OS/2 information:
3405 \starttabulate[|lT|l|p|]
3406 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3407 \NC pfmset \NC number \NC \NC\NR
3408 \NC winascent_add \NC number \NC \NC\NR
3409 \NC windescent_add \NC number \NC \NC\NR
3410 \NC hheadascent_add \NC number \NC \NC\NR
3411 \NC hheaddescent_add \NC number \NC \NC\NR
3412 \NC typoascent_add \NC number \NC \NC\NR
3413 \NC typodescent_add \NC number \NC \NC\NR
3414 \NC subsuper_set \NC number \NC \NC\NR
3415 \NC panose_set \NC number \NC \NC\NR
3416 \NC hheadset \NC number \NC \NC\NR
3417 \NC vheadset \NC number \NC \NC\NR
3418 \NC pfmfamily \NC number \NC \NC\NR
3419 \NC weight \NC number \NC \NC\NR
3420 \NC width \NC number \NC \NC\NR
3421 \NC avgwidth \NC number \NC \NC\NR
3422 \NC firstchar \NC number \NC \NC\NR
3423 \NC lastchar \NC number \NC \NC\NR
3424 \NC fstype \NC number \NC \NC\NR
3425 \NC linegap \NC number \NC \NC\NR
3426 \NC vlinegap \NC number \NC \NC\NR
3427 \NC hhead_ascent \NC number \NC \NC\NR
3428 \NC hhead_descent \NC number \NC \NC\NR
3429 \NC os2_typoascent \NC number \NC \NC\NR
3430 \NC os2_typodescent \NC number \NC \NC\NR
3431 \NC os2_typolinegap \NC number \NC \NC\NR
3432 \NC os2_winascent \NC number \NC \NC\NR
3433 \NC os2_windescent \NC number \NC \NC\NR
3434 \NC os2_subxsize \NC number \NC \NC\NR
3435 \NC os2_subysize \NC number \NC \NC\NR
3436 \NC os2_subxoff \NC number \NC \NC\NR
3437 \NC os2_subyoff \NC number \NC \NC\NR
3438 \NC os2_supxsize \NC number \NC \NC\NR
3439 \NC os2_supysize \NC number \NC \NC\NR
3440 \NC os2_supxoff \NC number \NC \NC\NR
3441 \NC os2_supyoff \NC number \NC \NC\NR
3442 \NC os2_strikeysize \NC number \NC \NC\NR
3443 \NC os2_strikeypos \NC number \NC \NC\NR
3444 \NC os2_family_class \NC number \NC \NC\NR
3445 \NC os2_xheight \NC number \NC \NC\NR
3446 \NC os2_capheight \NC number \NC \NC\NR
3447 \NC os2_defaultchar \NC number \NC \NC\NR
3448 \NC os2_breakchar \NC number \NC \NC\NR
3449 \NC os2_vendor \NC string \NC \NC\NR
3450 \NC codepages \NC table \NC A two-number array of encoded code pages\NC\NR
3451 \NC unicoderages \NC table \NC A four-number array of encoded unicode ranges\NC\NR
3452 \NC panose \NC table \NC \NC\NR
3453 \stoptabulate
3455 The \type{panose} subtable has exactly 10 string keys:
3457 \starttabulate[|lT|l|p|]
3458 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3459 \NC familytype \NC string \NC Values as in the \OPENTYPE\ font specification:
3460 \type {Any}, \type {No Fit}, \type {Text and Display}, \type {Script},
3461 \type {Decorative}, \type {Pictorial} \NC\NR
3462 \NC serifstyle \NC string \NC See the \OPENTYPE\ font specification for values\NC\NR
3463 \NC weight \NC string \NC id. \NC\NR
3464 \NC proportion \NC string \NC id. \NC\NR
3465 \NC contrast \NC string \NC id. \NC\NR
3466 \NC strokevariation \NC string \NC id. \NC\NR
3467 \NC armstyle \NC string \NC id. \NC\NR
3468 \NC letterform \NC string \NC id. \NC\NR
3469 \NC midline \NC string \NC id. \NC\NR
3470 \NC xheight \NC string \NC id. \NC\NR
3471 \stoptabulate
3473 \subsubsubsection[fontloadernamestable]{names table}
3475 Each item has two top|-|level keys:
3477 \starttabulate[|lT|l|p|]
3478 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3479 \NC lang \NC string \NC language for this entry \NC\NR
3480 \NC names \NC table \NC \NC\NR
3481 \stoptabulate
3483 The \type{names} keys are the actual \TRUETYPE\ name strings. The
3484 possible keys are:
3486 \starttabulate[|lT|p|]
3487 \NC \ssbf key \NC \bf explanation \NC\NR
3488 \NC copyright \NC \NC\NR
3489 \NC family \NC \NC\NR
3490 \NC subfamily \NC \NC\NR
3491 \NC uniqueid \NC \NC\NR
3492 \NC fullname \NC \NC\NR
3493 \NC version \NC \NC\NR
3494 \NC postscriptname \NC \NC\NR
3495 \NC trademark \NC \NC\NR
3496 \NC manufacturer \NC \NC\NR
3497 \NC designer \NC \NC\NR
3498 \NC descriptor \NC \NC\NR
3499 \NC venderurl \NC \NC\NR
3500 \NC designerurl \NC \NC\NR
3501 \NC license \NC \NC\NR
3502 \NC licenseurl \NC \NC\NR
3503 \NC idontknow \NC \NC\NR
3504 \NC preffamilyname \NC \NC\NR
3505 \NC prefmodifiers \NC \NC\NR
3506 \NC compatfull \NC \NC\NR
3507 \NC sampletext \NC \NC\NR
3508 \NC cidfindfontname \NC \NC\NR
3509 \NC wwsfamily \NC \NC\NR
3510 \NC wwssubfamily \NC \NC\NR
3511 \stoptabulate
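
As an illustration of how the \type{lang} and \type{names} keys fit together,
here is a small sketch in plain \LUA. It assumes that the items form a linear
array; the helper \type{find_name} is hypothetical, and the language strings
and name values in the sample data are made up.

\starttyping
-- Hypothetical sample data in the shape described above.
local names = {
    { lang = "English (US)",
      names = { family = "Foo", subfamily = "Regular" } },
    { lang = "German German",
      names = { family = "Foo", subfamily = "Normal" } },
}

-- find_name is a hypothetical helper: it returns the value of `key`
-- in the first entry whose lang matches, or nil if there is none.
local function find_name(names, lang, key)
    for _, entry in ipairs(names) do
        if entry.lang == lang and entry.names then
            return entry.names[key]
        end
    end
    return nil
end

print(find_name(names, "English (US)", "subfamily"))
\stoptyping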
3513 \subsubsubsection{anchor_classes table}
Each anchor class in \type{anchor_classes} has the following keys:
3517 \starttabulate[|lT|l|p|]
3518 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3519 \NC name \NC string \NC a descriptive id of this anchor class\NC\NR
3520 \NC lookup \NC string \NC \NC\NR
3521 \NC type \NC string \NC one of \type {mark}, \type {mkmk}, \type {curs}, \type {mklg} \NC\NR
3522 \stoptabulate
3524 % type is actually a lookup subtype, not a feature name. Officially, these strings
3525 % should be gpos_mark2mark etc.
3527 \subsubsubsection{gpos table}
The \type{gpos} table has one array entry for each lookup. (The \type {gpos_} prefix is somewhat redundant.)
3531 \starttabulate[|lT|l|p|]
3532 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3533 \NC type \NC string \NC one of
3534 \type {gpos_single}, \type {gpos_pair}, \type {gpos_cursive},
3535 \type {gpos_mark2base},\crlf \type {gpos_mark2ligature}, \type {gpos_mark2mark}, \type {gpos_context},\crlf
3536 \type {gpos_contextchain}
3537 \NC\NR
3538 \NC flags \NC table \NC \NC\NR
3539 \NC name \NC string \NC \NC\NR
3540 \NC features \NC array \NC \NC\NR
3541 \NC subtables \NC array \NC \NC\NR
3542 \stoptabulate
3544 The flags table has a true value for each of the lookup flags that is
3545 actually set:
3547 \starttabulate[|lT|l|p|]
3548 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3549 \NC r2l \NC boolean \NC \NC\NR
3550 \NC ignorebaseglyphs \NC boolean \NC \NC\NR
3551 \NC ignoreligatures \NC boolean \NC \NC\NR
3552 \NC ignorecombiningmarks \NC boolean \NC \NC\NR
3553 \NC mark_class \NC string \NC (new in 0.44)\NC\NR
3554 \stoptabulate
Each item in the \type{features} subtable of \type{gpos} has:
3559 \starttabulate[|lT|l|p|]
3560 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3561 \NC tag \NC string \NC \NC\NR
3562 \NC scripts \NC table \NC \NC\NR
3563 \stoptabulate
3565 The scripts table within features has:
3567 \starttabulate[|lT|l|p|]
3568 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3569 \NC script \NC string \NC \NC\NR
3570 \NC langs \NC array of strings \NC \NC\NR
3571 \stoptabulate
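
The nesting of lookups, features and scripts can be sketched as follows. This
is plain \LUA\ operating on made|-|up sample data in the shape described
above; the helper \type{lookups_by_tag} and all lookup names are hypothetical.

\starttyping
-- Hypothetical sample data: three lookups, one of them without features.
local gpos = {
    { type = "gpos_pair", name = "lookup-1",
      features = { { tag = "kern",
          scripts = { { script = "latn", langs = { "dflt" } } } } } },
    { type = "gpos_mark2base", name = "lookup-2",
      features = { { tag = "mark",
          scripts = { { script = "latn", langs = { "dflt" } } } } } },
    { type = "gpos_single", name = "lookup-3" },
}

-- lookups_by_tag is a hypothetical helper: it maps each feature tag
-- to the names of the lookups that carry it.
local function lookups_by_tag(gpos)
    local result = { }
    for _, lookup in ipairs(gpos) do
        for _, feature in ipairs(lookup.features or { }) do
            local list = result[feature.tag] or { }
            list[#list+1] = lookup.name
            result[feature.tag] = list
        end
    end
    return result
end

local by_tag = lookups_by_tag(gpos)
print(table.concat(by_tag.kern, " "))
\stoptyping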
3574 The subtables table has:
3576 \starttabulate[|lT|l|p|]
3577 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3578 \NC name \NC string \NC \NC\NR
3579 \NC suffix \NC string \NC (only if used)\NC\NR % used by gpos_single to get a default
3580 \NC anchor_classes \NC number \NC (only if used)\NC\NR
3581 \NC vertical_kerning \NC number \NC (only if used)\NC\NR
3582 \NC kernclass \NC table \NC (only if used)\NC\NR
3583 \stoptabulate
The \type{kernclass} table within \type{subtables} has:
3588 \starttabulate[|lT|l|p|]
3589 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3590 \NC firsts \NC array of strings \NC \NC\NR
3591 \NC seconds \NC array of strings \NC \NC\NR
3592 \NC lookup \NC string or array \NC associated lookup(s) \NC \NR
3593 \NC offsets \NC array of numbers \NC \NC\NR
3594 \stoptabulate
3596 \subsubsubsection{gsub table}
3598 This has identical layout to the \type{gpos} table, except for the
3599 type:
3601 \starttabulate[|lT|l|p|]
3602 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3603 \NC type \NC string \NC one of \type {gsub_single}, \type {gsub_multiple}, \type {gsub_alternate},
3604 \type {gsub_ligature},\crlf \type {gsub_context}, \type {gsub_contextchain}, \type {gsub_reversecontextchain}
3605 \NC\NR
3606 \stoptabulate
3610 \subsubsubsection{ttf_tables and ttf_tab_saved tables}
3612 \starttabulate[|lT|l|p|]
3613 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3614 \NC tag \NC string \NC \NC\NR
3615 \NC len \NC number \NC \NC\NR
3616 \NC maxlen \NC number \NC \NC\NR
3617 \NC data \NC number \NC \NC\NR
3618 \stoptabulate
3620 \subsubsubsection{mm table}
3622 \starttabulate[|lT|l|p|]
3623 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3624 \NC axes \NC table \NC array of axis names \NC \NR
3625 \NC instance_count \NC number \NC \NC \NR
3626 \NC positions \NC table \NC array of instance positions
3627 (\#axes * instances )\NC \NR
3628 \NC defweights \NC table \NC array of default weights for instances \NC \NR
3629 \NC cdv \NC string \NC \NC \NR
3630 \NC ndv \NC string \NC \NC \NR
3631 \NC axismaps \NC table \NC \NC \NR
3632 \stoptabulate
3634 The \type{axismaps}:
3636 \starttabulate[|lT|l|p|]
3637 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3638 \NC blends \NC table \NC an array of blend points \NC \NR
3639 \NC designs \NC table \NC an array of design values \NC \NR
3640 \NC min \NC number \NC \NC \NR
3641 \NC def \NC number \NC \NC \NR
3642 \NC max \NC number \NC \NC \NR
3643 \stoptabulate
3646 \subsubsubsection{mark_classes table (0.44)}
3648 The keys in this table are mark class names, and the values
3649 are a space-separated string of glyph names in this class.
Note: This table is indeed new in 0.44. The manual said it existed
before then, but in practice it was missing due to a bug.
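
Since each value is a single space|-|separated string, it usually has to be
split before use. A minimal sketch in plain \LUA\ follows; the helper
\type{class_members}, the class names and the glyph names are all
hypothetical.

\starttyping
-- Hypothetical sample data in the shape described above.
local mark_classes = {
    ["MarkClass-1"] = "gravecomb acutecomb",
    ["MarkClass-2"] = "dotbelowcomb",
}

-- class_members is a hypothetical helper: it turns the glyph name
-- string of one class into a set, for quick membership tests.
local function class_members(mark_classes, class)
    local set = { }
    for glyph in string.gmatch(mark_classes[class] or "", "%S+") do
        set[glyph] = true
    end
    return set
end

local members = class_members(mark_classes, "MarkClass-1")
print(members.acutecomb, members.dotbelowcomb)
\stoptyping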
3654 \subsubsubsection{math table}
3656 \starttabulate[|lT|p|]
3657 \NC ScriptPercentScaleDown \NC \NC \NR
3658 \NC ScriptScriptPercentScaleDown \NC \NC \NR
3659 \NC DelimitedSubFormulaMinHeight \NC \NC \NR
3660 \NC DisplayOperatorMinHeight \NC \NC \NR
3661 \NC MathLeading \NC \NC \NR
3662 \NC AxisHeight \NC \NC \NR
3663 \NC AccentBaseHeight \NC \NC \NR
3664 \NC FlattenedAccentBaseHeight \NC \NC \NR
3665 \NC SubscriptShiftDown \NC \NC \NR
3666 \NC SubscriptTopMax \NC \NC \NR
3667 \NC SubscriptBaselineDropMin \NC \NC \NR
3668 \NC SuperscriptShiftUp \NC \NC \NR
3669 \NC SuperscriptShiftUpCramped \NC \NC \NR
3670 \NC SuperscriptBottomMin \NC \NC \NR
3671 \NC SuperscriptBaselineDropMax \NC \NC \NR
3672 \NC SubSuperscriptGapMin \NC \NC \NR
3673 \NC SuperscriptBottomMaxWithSubscript \NC \NC \NR
3674 \NC SpaceAfterScript \NC \NC \NR
3675 \NC UpperLimitGapMin \NC \NC \NR
3676 \NC UpperLimitBaselineRiseMin \NC \NC \NR
3677 \NC LowerLimitGapMin \NC \NC \NR
3678 \NC LowerLimitBaselineDropMin \NC \NC \NR
3679 \NC StackTopShiftUp \NC \NC \NR
3680 \NC StackTopDisplayStyleShiftUp \NC \NC \NR
3681 \NC StackBottomShiftDown \NC \NC \NR
3682 \NC StackBottomDisplayStyleShiftDown \NC \NC \NR
3683 \NC StackGapMin \NC \NC \NR
3684 \NC StackDisplayStyleGapMin \NC \NC \NR
3685 \NC StretchStackTopShiftUp \NC \NC \NR
3686 \NC StretchStackBottomShiftDown \NC \NC \NR
3687 \NC StretchStackGapAboveMin \NC \NC \NR
3688 \NC StretchStackGapBelowMin \NC \NC \NR
3689 \NC FractionNumeratorShiftUp \NC \NC \NR
3690 \NC FractionNumeratorDisplayStyleShiftUp \NC \NC \NR
3691 \NC FractionDenominatorShiftDown \NC \NC \NR
3692 \NC FractionDenominatorDisplayStyleShiftDown \NC \NC \NR
3693 \NC FractionNumeratorGapMin \NC \NC \NR
3694 \NC FractionNumeratorDisplayStyleGapMin \NC \NC \NR
3695 \NC FractionRuleThickness \NC \NC \NR
3696 \NC FractionDenominatorGapMin \NC \NC \NR
3697 \NC FractionDenominatorDisplayStyleGapMin \NC \NC \NR
3698 \NC SkewedFractionHorizontalGap \NC \NC \NR
3699 \NC SkewedFractionVerticalGap \NC \NC \NR
3700 \NC OverbarVerticalGap \NC \NC \NR
3701 \NC OverbarRuleThickness \NC \NC \NR
3702 \NC OverbarExtraAscender \NC \NC \NR
3703 \NC UnderbarVerticalGap \NC \NC \NR
3704 \NC UnderbarRuleThickness \NC \NC \NR
3705 \NC UnderbarExtraDescender \NC \NC \NR
3706 \NC RadicalVerticalGap \NC \NC \NR
3707 \NC RadicalDisplayStyleVerticalGap \NC \NC \NR
3708 \NC RadicalRuleThickness \NC \NC \NR
3709 \NC RadicalExtraAscender \NC \NC \NR
3710 \NC RadicalKernBeforeDegree \NC \NC \NR
3711 \NC RadicalKernAfterDegree \NC \NC \NR
3712 \NC RadicalDegreeBottomRaisePercent \NC \NC \NR
3713 \NC MinConnectorOverlap \NC \NC \NR
3714 \NC FractionDelimiterSize \NC (new in 0.47.0)\NC \NR
3715 \NC FractionDelimiterDisplayStyleSize \NC (new in 0.47.0)\NC \NR
3716 \stoptabulate
3718 \subsubsubsection{validation_state table}
3720 \starttabulate[|lT|p|]
3721 \NC \ssbf key \NC \bf explanation \NC\NR
3722 \NC bad_ps_fontname \NC \NC \NR
3723 \NC bad_glyph_table \NC \NC \NR
3724 \NC bad_cff_table \NC \NC \NR
3725 \NC bad_metrics_table \NC \NC \NR
3726 \NC bad_cmap_table \NC \NC \NR
3727 \NC bad_bitmaps_table \NC \NC \NR
3728 \NC bad_gx_table \NC \NC \NR
3729 \NC bad_ot_table \NC \NC \NR
3730 \NC bad_os2_version \NC \NC \NR
3731 \NC bad_sfnt_header \NC \NC \NR
3732 \stoptabulate
3734 \subsubsubsection{horiz_base and vert_base table}
3736 \starttabulate[|lT|l|p|]
3737 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3738 \NC tags \NC table \NC an array of script list tags\NC \NR
3739 \NC scripts \NC table \NC \NC \NR
3740 \stoptabulate
3743 The \type{scripts} subtable:
3745 \starttabulate[|lT|l|p|]
3746 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3747 \NC baseline \NC table \NC \NC \NR
3748 \NC default_baseline \NC number \NC \NC \NR
3749 \NC lang \NC table \NC \NC \NR
3750 \stoptabulate
3753 The \type{lang} subtable:
3755 \starttabulate[|lT|l|p|]
3756 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3757 \NC tag \NC string \NC a script tag \NC \NR
3758 \NC ascent \NC number \NC \NC \NR
3759 \NC descent \NC number \NC \NC \NR
3760 \NC features \NC table \NC \NC \NR
3761 \stoptabulate
3763 The \type{features} points to an array of tables with the same layout
3764 except that in those nested tables, the tag represents a language.
3766 \subsubsubsection{altuni table}
3768 An array of alternate \UNICODE\ values. Inside that array
3769 are hashes with:
3771 \starttabulate[|lT|l|p|]
3772 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3773 \NC unicode \NC number \NC this glyph is also used for this unicode\NC \NR
3774 \NC variant \NC number \NC the alternative is driven by this unicode selector\NC \NR
3775 \stoptabulate
3777 \subsubsubsection{vert_variants and horiz_variants table}
3779 \starttabulate[|lT|l|p|]
3780 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3781 \NC variants \NC string \NC \NC \NR
3782 \NC italic_correction \NC number \NC \NC \NR
3783 \NC parts \NC table \NC \NC \NR
3784 \stoptabulate
3786 The \type{parts} table is an array of smaller tables:
3788 \starttabulate[|lT|l|p|]
3789 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3790 \NC component \NC string \NC \NC \NR
3791 \NC extender \NC number \NC \NC \NR
3792 \NC start \NC number \NC \NC \NR
3793 \NC end \NC number \NC \NC \NR
3794 \NC advance \NC number \NC \NC \NR
3795 \stoptabulate
3798 \subsubsubsection{mathkern table}
3800 \starttabulate[|lT|l|p|]
3801 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3802 \NC top_right \NC table \NC \NC \NR
3803 \NC bottom_right \NC table \NC \NC \NR
3804 \NC top_left \NC table \NC \NC \NR
3805 \NC bottom_left \NC table \NC \NC \NR
3806 \stoptabulate
3808 Each of the subtables is an array of small hashes with two keys:
3810 \starttabulate[|lT|l|p|]
3811 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3812 \NC height \NC number \NC \NC \NR
3813 \NC kern \NC number \NC \NC \NR
3814 \stoptabulate
3816 \subsubsubsection{kerns table}
3818 Substructure is identical to the per|-|glyph subtable.
3820 \subsubsubsection{vkerns table}
3822 Substructure is identical to the per|-|glyph subtable.
3824 \subsubsubsection{texdata table}
3827 \starttabulate[|lT|l|p|]
3828 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3829 \NC type \NC string \NC \type {unset}, \type {text}, \type {math}, \type {mathext}\NC\NR
3830 \NC params \NC array \NC 22 font numeric parameters\NC\NR
3831 \stoptabulate
3833 \subsubsubsection{lookups table}
3835 Top|-|level \type{lookups} is quite different from the ones at
3836 character level. The keys in this hash are strings, the values the
3837 actual lookups, represented as dictionary tables.
3839 \starttabulate[|lT|l|p|]
3840 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3841 \NC type \NC string \NC \NC\NR
3842 \NC format \NC enum \NC one of \type {glyphs}, \type {class}, \type {coverage}, \type {reversecoverage} \NC\NR
3843 \NC tag \NC string \NC \NC\NR
3844 \NC current_class \NC array \NC \NC\NR
3845 \NC before_class \NC array \NC \NC\NR
3846 \NC after_class \NC array \NC \NC\NR
3847 \NC rules \NC array \NC an array of rule items\NC\NR
3848 \stoptabulate
3850 Rule items have one common item and one specialized item:
3852 \starttabulate[|lT|l|p|]
3853 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3854 \NC lookups \NC array \NC a linear array of lookup names\NC\NR
3855 \NC glyphs \NC array \NC only if the parent's format is \type{glyphs}\NC\NR
3856 \NC class \NC array \NC only if the parent's format is \type{class}\NC\NR
3857 \NC coverage \NC array \NC only if the parent's format is \type{coverage}\NC\NR
3858 \NC reversecoverage \NC array \NC only if the parent's format is \type{reversecoverage}\NC\NR
3859 \stoptabulate
3861 A glyph table is:
3863 \starttabulate[|lT|l|p|]
3864 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3865 \NC names \NC string \NC \NC\NR
3866 \NC back \NC string \NC \NC\NR
3867 \NC fore \NC string \NC \NC\NR
3868 \stoptabulate
3870 A class table is:
3872 \starttabulate[|lT|l|p|]
3873 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3874 \NC current \NC array \NC of numbers \NC\NR
3875 \NC before \NC array \NC of numbers \NC\NR
3876 \NC after \NC array \NC of numbers \NC\NR
3877 \stoptabulate
A coverage table is:
3881 \starttabulate[|lT|l|p|]
3882 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3883 \NC current \NC array \NC of strings \NC\NR
3884 \NC before \NC array \NC of strings\NC\NR
3885 \NC after \NC array \NC of strings \NC\NR
3886 \stoptabulate
A reversecoverage table is:
3890 \starttabulate[|lT|l|p|]
3891 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3892 \NC current \NC array \NC of strings \NC\NR
3893 \NC before \NC array \NC of strings\NC\NR
3894 \NC after \NC array \NC of strings \NC\NR
3895 \NC replacements \NC string \NC \NC\NR
3896 \stoptabulate
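
Because the specialized item of a rule is stored under the key named by the
parent's \type{format}, it can be fetched without a chain of tests. The
following plain \LUA\ sketch assumes exactly that correspondence; the helper
\type{rule_data} and the sample data are hypothetical.

\starttyping
-- Hypothetical sample data in the shape described above.
local lookup = {
    format = "coverage",
    rules = {
        { lookups = { "lookup-a" },
          coverage = { current = { "f" }, after = { "i" } } },
    },
}

-- rule_data is a hypothetical helper: it returns the specialized
-- item of a rule, or nil if the rule has none for this format.
local function rule_data(lookup, rule)
    return rule[lookup.format]
end

for _, rule in ipairs(lookup.rules) do
    local data = rule_data(lookup, rule)
    print(lookup.format, table.concat(data.current, " "))
end
\stoptyping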
3898 %***********************************************************************
3900 \section{The \luatex{img} library}
3902 The \type{img} library can be used as an alternative to
3903 \tex{pdfximage} and \tex{pdfrefximage}, and the associated \quote {satellite}
3904 commands like \tex{pdfximagebbox}.
3905 Image objects can also be used within virtual fonts
3906 via the \type{image} command listed in~\in{section}[virtualfonts].
3908 \subsection{\luatex{img.new}}
3910 \startfunctioncall
3911 <image> var = img.new()
3912 <image> var = img.new(<table> image_spec)
3913 \stopfunctioncall
3915 This function creates a userdata object of type \quote {image}. The
3916 \type{image_spec} argument is optional. If it is given, it must be
a table, and that table must contain a \type{filename} key. A number of
other keys can also be useful; these are explained below.
3920 You can either say
3922 \starttyping
3923 a = img.new()
3924 \stoptyping
3926 followed by
3928 \starttyping
3929 a.filename = "foo.png"
3930 \stoptyping
3932 or you can put the file name (and some or all of the other keys)
3933 into a table directly, like so:
3935 \starttyping
3936 a = img.new({filename='foo.pdf', page=1})
3937 \stoptyping
3939 The generated \type{<image>} userdata object allows access to a set of
3940 user|-|specified values as well as a set of values that are normally
3941 filled in and updated automatically by \LUATEX\ itself. Some of those
3942 are derived from the actual image file, others are updated to reflect
3943 the \PDF\ output status of the object.
3945 There is one required user-specified field: the file name
3946 (\type{filename}). It can optionally be augmented by the requested
3947 image dimensions (\type{width}, \type{depth}, \type{height}),
3948 user-specified image attributes (\type{attr}), the requested \PDF\ page
3949 identifier (\type{page}), the requested boundingbox (\type{pagebox})
for \PDF\ inclusion, and the requested color space object (\type{colorspace}).
The function \type{img.new} does not access the actual image file; it
3953 just creates the \type{<image>} userdata object and initializes some
3954 memory structures. The \type{<image>} object and its internal
3955 structures are automatically garbage collected.
3957 Once the image is scanned, all the values in the \type{<image>}
3958 except \type{width}, \type{height} and \type{depth}, become frozen,
3959 and you cannot change them any more.
3961 \subsection{\luatex{img.keys}}
3963 \startfunctioncall
3964 <table> keys = img.keys()
3965 \stopfunctioncall
3967 This function returns a list of all the possible \type{image_spec}
3968 keys, both user-supplied and automatic ones.
3970 % hahe: i need to add r/w ro column...
3971 \starttabulate[|l|l|p|]
\NC \bf field name\NC \bf type \NC \bf description \NC \NR
3973 \NC attr \NC string \NC the image attributes for \LUATEX \NC \NR
3974 \NC bbox \NC table \NC table with 4 boundingbox dimensions
3975 \type{llx}, \type{lly}, \type{urx},
3976 and \type{ury} overruling the \type{pagebox}
3977 entry\NC \NR
3978 \NC colordepth \NC number \NC the number of bits used by the color space\NC \NR
3979 \NC colorspace \NC number \NC the color space object number \NC \NR
3980 \NC depth \NC number \NC the image depth for \LUATEX\
3981 (in scaled points)\NC \NR
3982 \NC filename \NC string \NC the image file name \NC \NR
3983 \NC filepath \NC string \NC the full (expanded) file name of the image\NC \NR
3984 \NC height \NC number \NC the image height for \LUATEX\
3985 (in scaled points)\NC \NR
3986 \NC imagetype \NC string \NC one of \type{pdf}, \type{png}, \type{jpg}, \type{jp2},
3987 \type{jbig2}, or \type{nil} \NC \NR
3988 \NC index \NC number \NC the \PDF\ image name suffix \NC \NR
3989 \NC objnum \NC number \NC the \PDF\ image object number \NC \NR
\NC page \NC number or string \NC the identifier for the requested image page
(default is the number 1)\NC \NR
3993 \NC pagebox \NC string \NC the requested bounding box, one of
3994 \type {none}, \type {media}, \type {crop},
3995 \type {bleed}, \type {trim}, \type {art} \NC \NR
3996 \NC pages \NC number \NC the total number of available pages \NC \NR
3997 \NC rotation \NC number \NC the image rotation from included \PDF\ file,
3998 in multiples of 90~deg. \NC \NR
\NC stream \NC string \NC the raw stream data for an \type{/XObject}
\type{/Form} object\NC \NR
4001 \NC transform \NC number \NC the image transform, integer number 0..7\NC \NR
4002 \NC width \NC number \NC the image width for \LUATEX\
4003 (in scaled points)\NC \NR
4004 \NC xres \NC number \NC the horizontal natural image resolution
4005 (in \DPI) \NC \NR
4006 \NC xsize \NC number \NC the natural image width \NC \NR
4007 \NC yres \NC number \NC the vertical natural image resolution
4008 (in \DPI) \NC \NR
4009 \NC ysize \NC number \NC the natural image height \NC \NR
4010 \stoptabulate
4012 A running (undefined) dimension in \type{width}, \type{height}, or \type{depth} is
4013 represented as \type{nil} in \LUA, so if you want to load an image at
4014 its \quote {natural} size, you do not have to specify any of those three fields.
The \type{stream} parameter allows you to fabricate an \type{/XObject} \type{/Form}
4017 object from a string giving the stream contents,
4018 e.\,g., for a filled rectangle:
4020 \startfunctioncall
4021 a.stream = "0 0 20 10 re f"
4022 \stopfunctioncall
When writing the image, an \type{/XObject} \type{/Form} object is created,
like with embedded \PDF\ file writing. The object is written out only once.
The \type{stream} key requires that the \type{bbox} table is also given.
The \type{stream} key conflicts with the \type{filename} key.
The \type{transform} key works as usual also with \type{stream}.
4030 The \type{bbox} key needs a table with four boundingbox values, e.\,g.:
4032 \startfunctioncall
4033 a.bbox = {"30bp", 0, "225bp", "200bp"}
4034 \stopfunctioncall
4036 This replaces and overrules any given \type{pagebox} value;
4037 with given \type{bbox} the box dimensions coming with an embedded \PDF\ file
4038 are ignored.
4039 The \type{xsize} and \type{ysize} dimensions are set accordingly,
4040 when the image is scaled.
4041 The \type{bbox} parameter is ignored for non-\PDF\ images.
The \type{transform} key allows you to mirror the image and rotate it in steps of 90~deg.
The default value~0 gives an unmirrored, unrotated image.
Values 1|--|3 give counterclockwise rotation by 90, 180, or 270~degrees,
whereas with values 4|--|7 the image is first mirrored
and then rotated counterclockwise by 0, 90, 180, or 270~degrees.
The \type{transform} operation gives the same visual result
as if you had preprocessed the image externally with a graphics tool
and then used it in \LUATEX.
If a \PDF\ file to be embedded already contains a \type{/Rotate} specification,
the rotation result is the combination of the \type{/Rotate} rotation
followed by the \type{transform} operation.
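
The numbering of the \type{transform} values can be decoded with a little
arithmetic. The sketch below is plain \LUA\ and only illustrates the
numbering; the helper \type{decode_transform} is hypothetical and not part of
the \type{img} library. It assumes that value~4 stands for the mirrored image
without further rotation.

\starttyping
-- decode_transform is a hypothetical helper: it splits a transform
-- value into a mirror flag and a counterclockwise rotation in degrees.
local function decode_transform(t)
    assert(t >= 0 and t <= 7 and t == math.floor(t),
        "transform must be an integer in the range 0..7")
    local mirrored = t >= 4
    local rotation = (t % 4) * 90
    return mirrored, rotation
end

print(decode_transform(6)) -- mirrored, then rotated by 180 degrees
\stoptyping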
4055 \subsection{\luatex{img.scan}}
4057 \startfunctioncall
4058 <image> var = img.scan(<image> var)
4059 <image> var = img.scan(<table> image_spec)
4060 \stopfunctioncall
4062 When you say \type{img.scan(a)} for a new image, the file is scanned,
and variables such as \type{xsize}, \type{ysize}, \type{imagetype}, number of
\type{pages}, and the resolution are extracted. Each of the \type{width},
\type{height}, \type{depth} fields is set up according to the image dimensions,
if it was not given an explicit value already.
4067 An image file will never be scanned more than once for a given image variable.
4068 With all subsequent \type{img.scan(a)} calls only the dimensions are again
4069 set up (if they have been changed by the user in the meantime).
For ease of use, you can call \type{img.scan} right away, as in
4073 \starttyping
a = img.scan ({ filename = "foo.png" })
4075 \stoptyping
4077 without a prior \type{img.new}.
4079 Nothing is written yet at this point, so you can do \type{a=img.scan},
4080 retrieve the available info like image width and height, and then
4081 throw away \type{a} again by saying \type{a=nil}. In that case no
image object will be reserved in the \PDF, and the used memory will be
4083 cleaned up automatically.
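
Since nothing is written at this point, the scanned values can also be used
to compute new dimensions before the image is placed. The following sketch
scales an image to a given width while keeping its aspect ratio. The helper
\type{fit_width} is hypothetical, \type{foo.png} is a placeholder file name,
and the target width is an arbitrary choice; the part that talks to \LUATEX\
is guarded so that the helper can also be tried in a standalone \LUA\
interpreter.

\starttyping
-- fit_width is a hypothetical helper: given the current width and
-- height (in scaled points) it returns the height that keeps the
-- aspect ratio when the width becomes `target`.
local function fit_width(width, height, target)
    return math.floor(height * target / width + 0.5)
end

-- Only meaningful inside LuaTeX.
if img and tex then
    local a = img.scan { filename = "foo.png" }
    -- after scanning, width and height hold the natural size
    local target = tex.sp("5cm")
    a.height = fit_width(a.width, a.height, target)
    a.width  = target
    img.write(a)
end
\stoptyping

The height is computed before the width is overwritten, so that the ratio is
taken from the values that \type{img.scan} has set up.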
4085 \subsection{\luatex{img.copy}}
4087 \startfunctioncall
4088 <image> var = img.copy(<image> var)
4089 <image> var = img.copy(<table> image_spec)
4090 \stopfunctioncall
4092 If you say \type{a = b}, then both variables point to the same
\type{<image>} object. If you want to write out an image with
different sizes, you can do a \type{b=img.copy(a)}.
4096 Afterwards, \type{a} and \type{b} still reference the same actual
4097 image dictionary, but the dimensions for \type{b} can now be changed
4098 from their initial values that were just copies from \type{a}.
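
As a rough sketch of this, the following code places the same image twice,
once at its natural size and once at half that size. It only uses functions
described in this chapter; \type{foo.png} is a placeholder file name, and the
guard merely keeps the fragment harmless outside \LUATEX.

\starttyping
-- Only meaningful inside LuaTeX; "foo.png" is a placeholder.
if img then
    local a = img.scan { filename = "foo.png" }
    local b = img.copy(a)
    -- change the dimensions of the copy only; a keeps its own values
    b.width  = math.floor(a.width  / 2)
    b.height = math.floor(a.height / 2)
    img.write(a)
    img.write(b)
end
\stoptyping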
4100 % Hartmut, I don't know if this makes sense. An example of what
4101 % can, and what cannot be changed would be helpful.
4102 % -- will think about it...
4104 \subsection{\luatex{img.write}}
4106 \startfunctioncall
4107 <image> var = img.write(<image> var)
4108 <image> var = img.write(<table> image_spec)
4109 \stopfunctioncall
4111 By \type{img.write(a)} a \PDF\ object number is allocated,
4112 and a whatsit node of subtype \type{pdf_refximage} is generated
4113 and put into the output list.
4114 By this the image \type{a} is placed into the page stream,
4115 and the image file is written out into an image stream object
4116 after the shipping of the current page is finished.
4118 Again you can do a terse call like
4120 \starttyping
4121 img.write ({ filename = "foo.png" })
4122 \stoptyping
4124 The \type{<image>} variable is returned in case you want it for later
4125 processing.
4127 \subsection{\luatex{img.immediatewrite}}
4129 \startfunctioncall
4130 <image> var = img.immediatewrite(<image> var)
4131 <image> var = img.immediatewrite(<table> image_spec)
4132 \stopfunctioncall
4134 By \type{img.immediatewrite(a)} a \PDF\ object number is
4135 allocated, and the image file for image \type{a} is written out
4136 immediately into the \PDF\ file as an image stream object (like
4137 with \tex{immediate}\tex{pdfximage}). The object number of the image
4138 stream dictionary is then available by the \type{objnum} key. No
4139 \type{pdf_refximage} whatsit node is generated. You will need an
4140 \luatex{img.write(a)} or \luatex{img.node(a)} call to let the
4141 image appear on the page, or reference it by another trick; else
4142 you will have a dangling image object in the \PDF\ file.
4144 Also here you can do a terse call like
4146 \starttyping
4147 a = img.immediatewrite ({ filename = "foo.png" })
4148 \stoptyping
4150 The \type{<image>} variable is returned and you will most likely need it.
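
A possible use, sketched here with the placeholder file \type{foo.png}, is to
report the object number and then place the image so that no dangling object
remains:

\starttyping
local a = img.immediatewrite({ filename = "foo.png" })
texio.write_nl("image stream object: " .. a.objnum)
-- make the image appear on the page
node.write(img.node(a))
\stoptyping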
4152 \subsection{\luatex{img.node}}
4154 \startfunctioncall
4155 <node> n = img.node(<image> var)
4156 <node> n = img.node(<table> image_spec)
4157 \stopfunctioncall
4159 This function allocates a \PDF\ object number and returns a
4160 whatsit node of subtype \type{pdf_refximage}, filled with the
4161 image parameters \type{width}, \type{height}, \type{depth}, and
4162 \type{objnum}. Also here you can do a terse call like:
4164 \starttyping
4165 n = img.node ({ filename = "foo.png" })
4166 \stoptyping
4168 This example outputs an image:
4170 \starttyping
4171 node.write(img.node{filename="foo.png"})
4172 \stoptyping
4174 \subsection{\luatex{img.types}}
4176 \startfunctioncall
4177 <table> types = img.types()
4178 \stopfunctioncall
This function returns a list with the supported image file type names;
currently these are \type{pdf}, \type{png}, \type{jpg}, \type{jp2} (JPEG~2000),
and \type{jbig2}.
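
Because the result is a plain list, a membership test needs a small loop. The
sketch below turns the list into a lookup table; the variable name
\type{supported} is only illustrative.

\starttyping
local supported = { }
for _, name in ipairs(img.types()) do
  supported[name] = true
end
if not supported["jp2"] then
  texio.write_nl("this binary cannot include JPEG 2000 images")
end
\stoptyping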
4184 \subsection{\luatex{img.boxes}}
4186 \startfunctioncall
4187 <table> boxes = img.boxes()
4188 \stopfunctioncall
This function returns a list with the supported \PDF\ page box names;
currently these are \type {media}, \type {crop}, \type {bleed}, \type {trim}, and \type {art}
(all in lowercase letters).
4194 %***********************************************************************
4196 \section{The \luatex{kpse} library}
4198 This library provides two separate, but nearly identical interfaces to
4199 the \KPATHSEA\ file search functionality: there is a \quote{normal}
4200 procedural interface that shares its kpathsea instance with \LUATEX\
4201 itself, and an object oriented interface that is completely on its
4202 own. The object oriented interface and \type{kpse.new} have been added
4203 in \LUATEX\ 0.37.
4205 \subsection{\luatex{kpse.set_program_name} and \luatex{kpse.new}}
4207 Before the search library can be used at all, its database has to be
4208 initialized. There are three possibilities, two of which belong to the
4209 procedural interface.
First, when \LUATEX\ is used to typeset documents, this initialization
happens automatically and the \KPATHSEA\ executable and program names
are set to \type{luatex} (that is, unless explicitly prohibited by the
user's startup script; see~\in{section}[init] for more details).
4216 Second, in \TEXLUA\ mode, the initialization has to be done explicitly
4217 via the \luatex{kpse.set_program_name} function, which sets the
4218 \KPATHSEA\ executable (and optionally program) name.
4220 \startfunctioncall
4221 kpse.set_program_name(<string> name)
4222 kpse.set_program_name(<string> name, <string> progname)
4223 \stopfunctioncall
4225 The second argument controls the use of the \quote{dotted} values in the
4226 \type{texmf.cnf} configuration file, and defaults to the first argument.
4228 Third, if you prefer the object oriented interface, you have to call a
4229 different function. It has the same arguments, but it returns a
4230 userdata variable.
4232 \startfunctioncall
4233 local kpathsea = kpse.new(<string> name)
4234 local kpathsea = kpse.new(<string> name, <string> progname)
4235 \stopfunctioncall
4237 Apart from these two functions, the calling conventions of the
4238 interfaces are identical. Depending on the chosen interface, you
4239 either call \type{kpse.find_file()} or \type{kpathsea:find_file()},
with identical arguments and return values.
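
A short sketch for \TEXLUA\ mode shows both interfaces side by side. The file
name \type{plain.tex} is only an example of something that is likely to be
present in a \TEX\ installation.

\starttyping
-- procedural interface: initializes the shared kpathsea instance
kpse.set_program_name("luatex")
print(kpse.find_file("plain.tex"))

-- object oriented interface: an independent instance
local kpathsea = kpse.new("luatex")
print(kpathsea:find_file("plain.tex"))
\stoptyping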
4242 \subsection{\luatex{find_file}}
The most often used function in the library is \type{find_file}:
4246 \startfunctioncall
4247 <string> f = kpse.find_file(<string> filename)
4248 <string> f = kpse.find_file(<string> filename, <string> ftype)
4249 <string> f = kpse.find_file(<string> filename, <boolean> mustexist)
4250 <string> f = kpse.find_file(<string> filename, <string> ftype, <boolean> mustexist)
4251 <string> f = kpse.find_file(<string> filename, <string> ftype, <number> dpi)
4252 \stopfunctioncall
4254 Arguments:
4255 \startitemize[intro]
4257 \sym{filename}
4259 the name of the file you want to find, with or without extension.
4261 \sym{ftype}
4263 maps to the \type {-format} argument of \KPSEWHICH. The supported
4264 \type{ftype} values are the same as the ones supported by the
4265 standalone \type{kpsewhich} program:
4267 \startsimplecolumns
4268 \starttyping
4269 'gf'
4270 'pk'
4271 'bitmap font'
4272 'tfm'
4273 'afm'
4274 'base'
4275 'bib'
4276 'bst'
4277 'cnf'
4278 'ls-R'
4279 'fmt'
4280 'map'
4281 'mem'
4282 'mf'
4283 'mfpool'
4284 'mft'
4285 'mp'
4286 'mppool'
4287 'MetaPost support'
4288 'ocp'
4289 'ofm'
4290 'opl'
4291 'otp'
4292 'ovf'
4293 'ovp'
4294 'graphic/figure'
4295 'tex'
4296 'TeX system documentation'
4297 'texpool'
4298 'TeX system sources'
4299 'PostScript header'
4300 'Troff fonts'
4301 'type1 fonts'
4302 'vf'
4303 'dvips config'
4304 'ist'
4305 'truetype fonts'
4306 'type42 fonts'
4307 'web2c files'
4308 'other text files'
4309 'other binary files'
4310 'misc fonts'
4311 'web'
4312 'cweb'
4313 'enc files'
4314 'cmap files'
4315 'subfont definition files'
4316 'opentype fonts'
4317 'pdftex config'
4318 'lig files'
4319 'texmfscripts'
'lua'
'font feature files'
'cid maps'
'mlbib'
'mlbst'
'clua'
4326 \stoptyping
4327 \stopsimplecolumns
The default type is \type{tex}. Note: this is different from
\KPSEWHICH, which tries to deduce the file type itself from
looking at the supplied extension. The types
\type{font feature files}, \type{cid maps}, \type{mlbib}, and \type{mlbst}
were new additions in \LUATEX\ 0.40.2.
4336 \sym{mustexist}
4338 is similar to \KPSEWHICH's \type{-must-exist}, and the default is \type{false}.
4339 If you specify \type{true} (or a non|-|zero integer), then the \KPSE\ library
4340 will search the disk as well as the \type {ls-R} databases.
4342 \sym{dpi}
4344 This is used for the size argument of the formats \type{pk}, \type{gf}, and \type{bitmap font}.
4345 \stopitemize
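
The call variants can be sketched as follows for \TEXLUA\ mode. The file names
are examples, \type{mymodule.lua} is hypothetical, and each call returns
\type{nil} when nothing is found.

\starttyping
kpse.set_program_name("luatex")
local src = kpse.find_file("plain")              -- ftype defaults to "tex"
local tfm = kpse.find_file("cmr10", "tfm")       -- explicit file type
local lua = kpse.find_file("mymodule.lua", "lua", true) -- also search the disk
local pk  = kpse.find_file("cmr10", "pk", 600)   -- bitmap font at 600 dpi
print(src, tfm, lua, pk)
\stoptyping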
4347 \subsection{\luatex{lookup}}
4349 A more powerful (but slower) generic method for finding files is also
4350 available (since 0.51). It returns a string for each found file.
4352 \startfunctioncall
4353 <string> f, ... = kpse.lookup(<string> filename, <table> options)
4354 \stopfunctioncall
4356 The options match commandline arguments from \type{kpsewhich}:
4358 \starttabulate[|l|l|p|]
4359 \NC \ssbf key \NC \ssbf type \NC \ssbf description \NC \NR
4360 \NC debug \NC number \NC set debugging flags for this lookup\NC \NR
4361 \NC format \NC string \NC use specific file type (see list above)\NC \NR
4362 \NC dpi \NC number \NC use this resolution for this lookup; default 600\NC \NR
4363 \NC path \NC string \NC search in the given path\NC \NR
4364 \NC all \NC boolean \NC output all matches, not just the first\NC \NR
4365 \NC mustexist \NC boolean \NC (0.65 and higher) search the disk as well as ls-R if necessary\NC \NR
4366 \NC must-exist\NC boolean \NC (0.64 and lower) search the disk as well as ls-R if necessary\NC \NR
4367 \NC mktexpk \NC boolean \NC disable/enable mktexpk generation for this lookup\NC \NR
4368 \NC mktextex \NC boolean \NC disable/enable mktextex generation for this lookup\NC \NR
4369 \NC mktexmf \NC boolean \NC disable/enable mktexmf generation for this lookup\NC \NR
4370 \NC mktextfm \NC boolean \NC disable/enable mktextfm generation for this lookup\NC \NR
4371 \NC subdir \NC string
4372 or table \NC only output matches whose directory part
4373 ends with the given string(s) \NC \NR
4374 \stoptabulate
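
Since one string is returned per match, the results can be collected in a
table. This is a sketch for \TEXLUA\ mode; the file name is an example.

\starttyping
kpse.set_program_name("luatex")
-- collect all return values in an array
local found = { kpse.lookup("plain.tex", { format = "tex", all = true }) }
for i, name in ipairs(found) do
  print(i, name)
end
\stoptyping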
4376 \subsection{\luatex{init_prog}}
4378 Extra initialization for programs that need to generate bitmap fonts.
4380 \startfunctioncall
4381 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode)
4382 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode, <string> fallback)
4383 \stopfunctioncall
4386 \subsection{\luatex{readable_file}}
4388 Test if an (absolute) file name is a readable file.
4390 \startfunctioncall
4391 <string> f = kpse.readable_file(<string> name)
4392 \stopfunctioncall
4394 The return value is the actual absolute filename you should use,
4395 because the disk name is not always the same as the requested name,
4396 due to aliases and system|-|specific handling under e.\,g.\ \MSDOS.
4398 Returns \lua {nil} if the file does not exist or is not readable.
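
A typical pattern, sketched here with a hypothetical path, is to continue with
the returned name instead of the requested one:

\starttyping
local requested = "/tmp/example.tex"  -- hypothetical file
local actual = kpse.readable_file(requested)
if actual then
  print("use " .. actual)
else
  print(requested .. " does not exist or is not readable")
end
\stoptyping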
4400 \subsection{\luatex{expand_path}}
4402 Like kpsewhich's \type {-expand-path}:
4404 \startfunctioncall
4405 <string> r = kpse.expand_path(<string> s)
4406 \stopfunctioncall
4408 \subsection{\luatex{expand_var}}
4410 Like kpsewhich's \type{-expand-var}:
4412 \startfunctioncall
4413 <string> r = kpse.expand_var(<string> s)
4414 \stopfunctioncall
4416 \subsection{\luatex{expand_braces}}
4418 Like kpsewhich's \type{-expand-braces}:
4420 \startfunctioncall
4421 <string> r = kpse.expand_braces(<string> s)
4422 \stopfunctioncall
4424 \subsection{\luatex{show_path}}
4426 Like kpsewhich's \type{-show-path}:
4428 \startfunctioncall
4429 <string> r = kpse.show_path(<string> ftype)
4430 \stopfunctioncall
4433 \subsection{\luatex{var_value}}
4435 Like kpsewhich's \type{-var-value}:
4437 \startfunctioncall
4438 <string> r = kpse.var_value(<string> s)
4439 \stopfunctioncall
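
The following sketch for \TEXLUA\ mode queries a few values. The variable
\type{TEXMFHOME} is used as an example and is assumed to be defined in the
local \type{texmf.cnf}.

\starttyping
kpse.set_program_name("luatex")
print(kpse.var_value("TEXMFHOME"))    -- variable name without the dollar
print(kpse.expand_var("$TEXMFHOME"))  -- expression with the dollar
print(kpse.show_path("tex"))          -- search path for the tex file type
\stoptyping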
4441 \subsection{\luatex{version}}
Returns the kpathsea version string (new in 0.51).
4445 \startfunctioncall
4446 <string> r = kpse.version()
4447 \stopfunctioncall
4450 \section{The \luatex{lang} library}
4452 This library provides the interface to \LUATEX's structure
4453 representing a language, and the associated functions.
4455 \startfunctioncall
4456 <language> l = lang.new()
4457 <language> l = lang.new(<number> id)
4458 \stopfunctioncall
4460 This function creates a new userdata object. An object of type
4461 \type{<language>} is the first argument to most of the other functions
4462 in the \luatex{lang} library. These functions can also be used as if
4463 they were object methods, using the colon syntax.
4465 Without an argument, the next available internal id number will be
4466 assigned to this object. With argument, an object will be created that
4467 links to the internal language with that id number.
4469 \startfunctioncall
4470 <number> n = lang.id(<language> l)
4471 \stopfunctioncall
4473 returns the internal \tex{language} id number this object refers to.
4475 \startfunctioncall
4476 <string> n = lang.hyphenation(<language> l)
4477 lang.hyphenation(<language> l, <string> n)
4478 \stopfunctioncall
4480 Either returns the current hyphenation exceptions for this language,
4481 or adds new ones. The syntax of the string is explained in~\in{section}[patternsexceptions].
4483 \startfunctioncall
4484 lang.clear_hyphenation(<language> l)
4485 \stopfunctioncall
4487 Clears the exception dictionary for this language.
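
The functions introduced so far can be combined as in this sketch. The
exception words are arbitrary examples, and the last line uses the colon
syntax mentioned above.

\starttyping
local l = lang.new()
texio.write_nl("language id: " .. lang.id(l))
lang.hyphenation(l, "ta-ble data-base")       -- add two exceptions
texio.write_nl(tostring(lang.hyphenation(l))) -- query the current exceptions
l:clear_hyphenation()                         -- same as lang.clear_hyphenation(l)
\stoptyping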
4489 \startfunctioncall
4490 <string> n = lang.clean(<string> o)
4491 \stopfunctioncall
4493 Creates a hyphenation key from the supplied hyphenation value. The
4494 syntax of the argument string is explained in~\in{section}[patternsexceptions].
4495 This function is useful if
4496 you want to do something else based on the words in a dictionary file,
4497 like spell-checking.
4499 \startfunctioncall
4500 <string> n = lang.patterns(<language> l)
4501 lang.patterns(<language> l, <string> n)
4502 \stopfunctioncall
4504 Adds additional patterns for this language object, or returns the
4505 current set. The syntax of this string is explained in~\in{section}[patternsexceptions].
4507 \startfunctioncall
4508 lang.clear_patterns(<language> l)
4509 \stopfunctioncall
4511 Clears the pattern dictionary for this language.
4513 \startfunctioncall
4514 <number> n = lang.prehyphenchar(<language> l)
4515 lang.prehyphenchar(<language> l, <number> n)
4516 \stopfunctioncall
4518 Gets or sets the \quote{pre|-|break} hyphen character for implicit
4519 hyphenation in this language (initially the hyphen, decimal 45).
4521 \startfunctioncall
4522 <number> n = lang.posthyphenchar(<language> l)
4523 lang.posthyphenchar(<language> l, <number> n)
4524 \stopfunctioncall
4526 Gets or sets the \quote{post|-|break} hyphen character for implicit
4527 hyphenation in this language (initially null, decimal~0, indicating
4528 emptiness).
4531 \startfunctioncall
4532 <number> n = lang.preexhyphenchar(<language> l)
4533 lang.preexhyphenchar(<language> l, <number> n)
4534 \stopfunctioncall
4536 Gets or sets the \quote{pre|-|break} hyphen character for explicit
4537 hyphenation in this language (initially null, decimal~0, indicating
4538 emptiness).
4540 \startfunctioncall
4541 <number> n = lang.postexhyphenchar(<language> l)
4542 lang.postexhyphenchar(<language> l, <number> n)
4543 \stopfunctioncall
4545 Gets or sets the \quote{post|-|break} hyphen character for explicit
4546 hyphenation in this language (initially null, decimal~0, indicating
4547 emptiness).
4549 \startfunctioncall
4550 <boolean> success = lang.hyphenate(<node> head)
4551 <boolean> success = lang.hyphenate(<node> head, <node> tail)
4552 \stopfunctioncall
4554 Inserts hyphenation points (discretionary nodes) in a node list. If
4555 \type{tail} is given as argument, processing stops on that node.
4556 Currently, \type{success} is always true if \type{head} (and \type{tail}, if
4557 specified) are proper nodes, regardless of possible other errors.
Hyphenation works only on \quote{characters}, a special subtype of all
the glyph nodes with the node subtype having the value \type{1}. Glyph
nodes with different subtypes are not processed. See
\in{section~}[charsandglyphs] for more details.
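
One place where this function is commonly called is the \type{hyphenate}
callback. The sketch below assumes that this callback receives the head and
tail of the list and expects no return value; that interface is not described
in this section.

\starttyping
callback.register("hyphenate", function(head, tail)
  -- do the default work, but leave room for extra processing
  lang.hyphenate(head, tail)
end)
\stoptyping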
4565 \section{The \luatex{lua} library}
4567 This library contains one read|-|only item:
4569 \starttyping
4570 <string> s = lua.version
4571 \stoptyping
4573 This returns the \LUA\ version identifier string. The value is
4574 currently \directlua {tex.print(lua.version)}.
4576 \subsection{\LUA\ bytecode registers}
4578 \LUA\ registers can be used to communicate \LUA\ functions across \LUA\
4579 chunks. The accepted values for assignments are functions and
4580 \type{nil}. Likewise, the retrieved value is either a function or \type{nil}.
4582 \starttyping
4583 lua.bytecode[<number> n] = <function> f
4584 lua.bytecode[<number> n]()
4585 \stoptyping
The contents of the \luatex{lua.bytecode} array are stored inside the format
file as actual \LUA\ bytecode, so it can also be used to preload \LUA\ code.
4590 Note: The function must not contain any upvalues. Currently, functions
4591 containing upvalues can be stored (and their upvalues are set to
4592 \type{nil}), but this is an artifact of the current \LUA\
4593 implementation and thus subject to change.
4595 The associated function calls are
4597 \startfunctioncall
4598 <function> f = lua.getbytecode(<number> n)
4599 lua.setbytecode(<number> n, <function> f)
4600 \stopfunctioncall
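
Both notations can be mixed, as in this sketch. The register numbers are
arbitrary, and the stored functions deliberately use no upvalues.

\starttyping
-- array notation
lua.bytecode[1] = function() return "stored in register 1" end
texio.write_nl(lua.bytecode[1]())

-- function notation
lua.setbytecode(2, function() return 42 end)
local f = lua.getbytecode(2)
texio.write_nl(tostring(f()))

lua.bytecode[1] = nil  -- clear the register again
\stoptyping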
4602 Note: Since a \LUA\ file loaded using \luatex{loadfile(filename)} is
4603 essentially an anonymous function, a complete file can be stored in a
4604 bytecode register like this:
4606 \startfunctioncall
4607 lua.bytecode[n] = loadfile(filename)
4608 \stopfunctioncall
4610 Now all definitions (functions, variables) contained in the file can be
4611 created by executing this bytecode register:
4613 \startfunctioncall
4614 lua.bytecode[n]()
4615 \stopfunctioncall
Note that the path of the file is stored in the \LUA\ bytecode to be
used in stack backtraces and therefore dumped into the format file if
the above code is used in \INITEX. If it contains private information, e.g.\
the user name, this information is then contained in the format file as
well. This should be kept in mind when preloading files into a bytecode
register in \INITEX.
4624 \subsection{\LUA\ chunk name registers}
4626 There is an array of 65536 (0--65535) potential chunk names for use with
4627 the \type{\directlua} and \type{\latelua} primitives.
4629 \startfunctioncall
4630 lua.name[<number> n] = <string> s
4631 <string> s = lua.name[<number> n]
4632 \stopfunctioncall
4634 If you want to unset a lua name, you can assign \type{nil} to it.
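
A small sketch: the name \type{mymacros} and the index are arbitrary, and the
comment assumes that a numbered \type{\directlua} call picks up the chunk name
stored at that index.

\starttyping
lua.name[1] = "mymacros"
-- assumed use on the TeX side: \directlua1{error("oops")}
-- would then report the chunk as "mymacros"
texio.write_nl(lua.name[1])
lua.name[1] = nil  -- unset the name again
\stoptyping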
4637 \section{The \luatex{mplib} library}
4639 The \MP\ library interface registers itself in the table \type{mplib}. It
4640 is based on \MPLIB\ version \ctxlua{tex.sprint(mplib.version())}.
4642 \subsection{\luatex{mplib.new}}
4644 To create a new \METAPOST\ instance, call
4646 \startfunctioncall
4647 <mpinstance> mp = mplib.new({...})
4648 \stopfunctioncall
4650 This creates the \type{mp} instance object. The argument hash can have a number of
4651 different fields, as follows:
4653 \starttabulate[|lT|l|p|p|]
4654 \NC \ssbf name \NC \bf type \NC \bf description \NC \bf default \NC\NR
4655 \NC error_line \NC number \NC error line width \NC 79 \NC\NR
4656 \NC print_line \NC number \NC line length in ps output \NC 100\NC\NR
4657 \NC random_seed \NC number \NC the initial random seed \NC variable\NC\NR
4658 \NC interaction \NC string \NC the interaction mode, one of
4659 \type {batch}, \type {nonstop}, \type {scroll}, \type {errorstop} \NC \type {errorstop}\NC\NR
4660 \NC job_name \NC string \NC \type {--jobname} \NC \type {mpout} \NC\NR
4661 \NC find_file \NC function \NC a function to find files \NC only local files\NC\NR
4662 \stoptabulate
4664 The \type{find_file} function should be of this form:
4666 \starttyping
4667 <string> found = finder (<string> name, <string> mode, <string> type)
4668 \stoptyping
4670 with:
4672 \starttabulate[|lT|l|p|]
4673 \NC \bf name \NC \bf the requested file \NC \NR
4674 \NC mode \NC the file mode: \type {r} or \type {w} \NC \NR
4675 \NC type \NC the kind of file, one of: \type {mp}, \type {tfm}, \type {map}, \type {pfb}, \type {enc} \NC \NR
4676 \stoptabulate
4678 Return either the full pathname of the found file, or \type{nil} if
4679 the file cannot be found.
4681 Note that the new version of \MPLIB\ no longer uses binary mem files,
4682 so the way to preload a set of macros is simply to start off with
4683 an \type{input} command in the first \type{mp:execute()} call.
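
The sketch below shows one way to set up an instance with a finder based on
the \type{kpse} library. The mapping of \type{pfb} and \type{enc} to
\KPATHSEA\ type names and the job name \type{example} are assumptions made
for this illustration.

\starttyping
-- type names that differ between mplib and kpse
local special = { pfb = "type1 fonts", enc = "enc files" }

local function finder(name, mode, ftype)
  if mode == "w" then
    return name  -- output files are written where requested
  end
  return kpse.find_file(name, special[ftype] or ftype)
end

local mp = mplib.new({
  find_file   = finder,
  interaction = "nonstop",
  job_name    = "example",
})

-- preload the plain macros in the first chunk
local result = mp:execute("input plain;")
\stoptyping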
4686 \subsection{\luatex{mp:statistics}}
4688 You can request statistics with:
4690 \startfunctioncall
4691 <table> stats = mp:statistics()
4692 \stopfunctioncall
4694 This function returns the vital statistics for an \MPLIB\ instance. There are four
4695 fields, giving the maximum number of used items in each of four
4696 allocated object classes:
4698 \starttabulate[|lT|l|p|]
4699 \NC main_memory \NC number \NC memory size \NC\NR
4700 \NC hash_size \NC number \NC hash size\NC\NR
4701 \NC param_size \NC number \NC simultaneous macro parameters\NC\NR
4702 \NC max_in_open \NC number \NC input file nesting levels\NC\NR
4703 \stoptabulate
4705 Note that in the new version of \MPLIB, this is informational only. The
4706 objects are all allocated dynamically, so there is no chance of running
4707 out of space unless the available system memory is exhausted.
4709 \subsection{\luatex{mp:execute}}
4711 You can ask the \METAPOST\ interpreter to run a chunk of code by calling
4713 \startfunctioncall
4714 <table> rettable = mp:execute('metapost language chunk')
4715 \stopfunctioncall
4717 for various bits of \METAPOST\ language input. Be sure to check the
4718 \type{rettable.status} (see below) because when a fatal \METAPOST\
4719 error occurs the \MPLIB\ instance will become unusable thereafter.
Generally speaking, it is best to keep your chunks small, but beware
that all chunks have to obey proper syntax, as if each of them were a
small file. For instance, you cannot split a single statement over
multiple chunks.
4726 In contrast with the normal standalone \type{mpost} command, there is
4727 {\em no\/} implied \quote{input} at the start of the first chunk.
4729 \subsection{\luatex{mp:finish}}
4731 \startfunctioncall
4732 <table> rettable = mp:finish()
4733 \stopfunctioncall
4735 If for some reason you want to stop using an \MPLIB\ instance while
4736 processing is not yet actually done, you can call \type{mp:finish}.
4737 Eventually, used memory will be freed and open files will be closed by
4738 the \LUA\ garbage collector, but an explicit \type{mp:finish} is the
4739 only way to capture the final part of the output streams.
4741 \subsection{Result table}
4743 The return value of \type{mp:execute} and \type{mp:finish} is a table
4744 with a few possible keys (only \type {status} is always guaranteed to be present).
4746 \starttabulate[|l|l|p|]
4747 \NC log \NC string \NC output to the \quote {log} stream \NC \NR
4748 \NC term \NC string \NC output to the \quote {term} stream \NC \NR
4749 \NC error \NC string \NC output to the \quote {error} stream (only used for \quote {out of memory})\NC \NR
4750 \NC status \NC number \NC the return value: 0=good, 1=warning, 2=errors, 3=fatal error \NC \NR
4751 \NC fig \NC table \NC an array of generated figures (if any)\NC \NR
4752 \stoptabulate
When \type{status} equals~3, you should stop using this \MPLIB\ instance
immediately: it is no longer capable of processing input.
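
A wrapper along the following lines can take care of the status check. The
function name \type{run} is hypothetical, and \type{mp} is assumed to be an
instance created with \type{mplib.new}.

\starttyping
local function run(mp, chunk)
  local result = mp:execute(chunk)
  if result.status > 1 then
    -- errors: show whatever output stream is available
    texio.write_nl("metapost: " ..
      (result.log or result.term or result.error or "no output"))
  end
  if result.status == 3 then
    return nil, "fatal error, instance is unusable"
  end
  return result
end
\stoptyping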
4757 If it is present, each of the entries in the \type{fig} array is a
4758 userdata representing a figure object, and each of those has a number of
4759 object methods you can call:
4761 \starttabulate[|l|l|p|]
4762 \NC boundingbox \NC function \NC returns the bounding box, as an array of 4 values\NC \NR
4763 \NC postscript \NC function \NC returns a string that is the ps output of the \type{fig}.
This function accepts two optional integer arguments for
4765 specifying the values of \type{prologues} (first argument)
4766 and \type{procset} (second argument)\NC \NR
4767 \NC svg \NC function \NC returns a string that is the svg output of the \type{fig}.
4768 This function accepts an optional integer argument for
4769 specifying the value of \type{prologues}\NC \NR
4770 \NC objects \NC function \NC returns the actual array of graphic objects in this \type{fig} \NC \NR
4771 \NC copy_objects \NC function \NC returns a deep copy of the array of graphic objects in this \type{fig} \NC \NR
4772 \NC filename \NC function \NC the filename this \type{fig}'s \POSTSCRIPT\ output
4773 would have written to in standalone mode\NC \NR
4774 \NC width \NC function \NC the \type{charwd} value \NC \NR
4775 \NC height \NC function \NC the \type{charht} value \NC \NR
4776 \NC depth \NC function \NC the \type{chardp} value \NC \NR
4777 \NC italcorr \NC function \NC the \type{charit} value \NC \NR
4778 \NC charcode \NC function \NC the (rounded) \type{charcode} value \NC \NR
4779 \stoptabulate
4781 {\bf NOTE:} you can call \type{fig:objects()} only once for any one \type{fig} object!
4783 When the boundingbox represents a \quote {negated rectangle}, i.e.\ when the first set
4784 of coordinates is larger than the second set, the picture is empty.
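
The figure methods can be used as in this sketch. The function name is
hypothetical, and the code assumes that the four bounding box values are
ordered as lower left $x$, lower left $y$, upper right $x$, upper right $y$.

\starttyping
local function report_figures(result)
  for i, fig in ipairs(result.fig or { }) do
    local bb = fig:boundingbox()
    if bb[1] > bb[3] then
      texio.write_nl("figure " .. i .. " is empty")
    else
      texio.write_nl(string.format("figure %d (charcode %s): %.3f x %.3f",
        i, tostring(fig:charcode()), bb[3] - bb[1], bb[4] - bb[2]))
    end
  end
end
\stoptyping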
4786 Graphical objects come in various types that each has a different list of
4787 accessible values. The types are: \type{fill}, \type{outline}, \type{text},
4788 \type{start_clip}, \type{stop_clip}, \type{start_bounds}, \type{stop_bounds}, \type{special}.
There is a helper function (\type{mplib.fields(obj)}) to get the list of
accessible values for a particular object, but you can just as easily
use the tables given below.
All graphical objects have a field \type{type} that gives the object
type as a string value; it is not explicitly mentioned in the following tables.
4796 In the following, \type{number}s are \POSTSCRIPT\ points represented as
4797 a floating point number, unless stated otherwise. Field values that
4798 are of type \type{table} are explained in the next section.
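
As a sketch, the objects of a figure can be listed together with their
accessible fields. The function name is hypothetical, and
\type{mplib.fields} is assumed to return an array of field names.

\starttyping
local function list_objects(fig)
  local objects = fig:objects()  -- can be called only once per fig
  for i, object in ipairs(objects) do
    local fields = table.concat(mplib.fields(object), ", ")
    texio.write_nl(i .. ": " .. object.type .. " (" .. fields .. ")")
  end
  return objects  -- keep the array, it cannot be fetched again
end
\stoptyping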
4800 \subsubsection{fill}
4802 \starttabulate[|l|l|p|]
4803 \NC path \NC table \NC the list of knots \NC \NR
4804 \NC htap \NC table \NC the list of knots for the reversed trajectory \NC \NR
4805 \NC pen \NC table \NC knots of the pen \NC \NR
4806 \NC color \NC table \NC the object's color \NC \NR
4807 \NC linejoin \NC number \NC line join style (bare number)\NC \NR
4808 \NC miterlimit \NC number \NC miterlimit\NC \NR
4809 \NC prescript \NC string \NC the prescript text \NC \NR
4810 \NC postscript \NC string \NC the postscript text \NC \NR
4811 \stoptabulate
4813 The entries \type{htap} and \type{pen} are optional.
There is a helper function (\type{mplib.pen_info(obj)}) that returns
4816 a table containing a bunch of vital characteristics of the used pen
4817 (all values are floats):
4819 \starttabulate[|l|l|p|]
4820 \NC width \NC number \NC width of the pen\NC \NR
4821 \NC sx \NC number \NC $x$ scale \NC \NR
4822 \NC rx \NC number \NC $xy$ multiplier \NC \NR
4823 \NC ry \NC number \NC $yx$ multiplier \NC \NR
4824 \NC sy \NC number \NC $y$ scale \NC \NR
4825 \NC tx \NC number \NC $x$ offset \NC \NR
4826 \NC ty \NC number \NC $y$ offset \NC \NR
4827 \stoptabulate
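
These values make it possible to test whether a pen is transformed at all.
In the following sketch the function name is hypothetical, and \type{object}
is assumed to be a graphical object that actually has a pen.

\starttyping
local function pen_characteristics(object)
  local t = mplib.pen_info(object)
  -- an untransformed pen has unit scales and no skew or offset
  local identity = t.sx == 1 and t.sy == 1
               and t.rx == 0 and t.ry == 0
               and t.tx == 0 and t.ty == 0
  return not identity, t.width
end
\stoptyping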
4829 \subsubsection{outline}
4831 \starttabulate[|l|l|p|]
4832 \NC path \NC table \NC the list of knots \NC \NR
4833 \NC pen \NC table \NC knots of the pen \NC \NR
4834 \NC color \NC table \NC the object's color \NC \NR
4835 \NC linejoin \NC number \NC line join style (bare number)\NC \NR
4836 \NC miterlimit \NC number \NC miterlimit \NC \NR
4837 \NC linecap \NC number \NC line cap style (bare number)\NC \NR
4838 \NC dash \NC table \NC representation of a dash list\NC \NR
4839 \NC prescript \NC string \NC the prescript text \NC \NR
4840 \NC postscript \NC string \NC the postscript text \NC \NR
4841 \stoptabulate
4843 The entry \type{dash} is optional.
4845 \subsubsection{text}
4847 \starttabulate[|l|l|p|]
4848 \NC text \NC string \NC the text \NC \NR
4849 \NC font \NC string \NC font tfm name \NC \NR
4850 \NC dsize \NC number \NC font size\NC \NR
4851 \NC color \NC table \NC the object's color \NC \NR
4852 \NC width \NC number \NC \NC \NR
4853 \NC height \NC number \NC \NC \NR
4854 \NC depth \NC number \NC \NC \NR
4855 \NC transform \NC table \NC a text transformation \NC \NR
4856 \NC prescript \NC string \NC the prescript text \NC \NR
4857 \NC postscript \NC string \NC the postscript text \NC \NR
4858 \stoptabulate
4860 \subsubsection{special}
4862 \starttabulate[|l|l|p|]
4863 \NC prescript \NC string \NC special text \NC \NR
4864 \stoptabulate
4866 \subsubsection{start_bounds, start_clip}
4868 \starttabulate[|l|l|p|]
4869 \NC path \NC table \NC the list of knots \NC \NR
4870 \stoptabulate
4872 \subsubsection{stop_bounds, stop_clip}
There are no fields available here.
4876 \subsection{Subsidiary table formats}
4878 \subsubsection{Paths and pens}
4880 Paths and pens (that are really just a special type of paths as far as
4881 \MPLIB\ is concerned) are represented by an array where each entry
4882 is a table that represents a knot.
4884 \starttabulate[|lT|l|p|]
4885 \NC left_type \NC string \NC when present: 'endpoint', but usually absent \NC \NR
4886 \NC right_type \NC string \NC like \type{left_type}\NC \NR
4887 \NC x_coord \NC number \NC X coordinate of this knot\NC \NR
4888 \NC y_coord \NC number \NC Y coordinate of this knot\NC \NR
4889 \NC left_x \NC number \NC X coordinate of the precontrol point of this knot\NC \NR
4890 \NC left_y \NC number \NC Y coordinate of the precontrol point of this knot\NC \NR
4891 \NC right_x \NC number \NC X coordinate of the postcontrol point of this knot\NC \NR
4892 \NC right_y \NC number \NC Y coordinate of the postcontrol point of this knot\NC \NR
4893 \stoptabulate
4895 There is one special case: pens that are (possibly transformed)
4896 ellipses have an extra string-valued key \type{type} with value
4897 \type{elliptical} besides the array part containing the knot list.
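
To illustrate the knot fields, this sketch converts an open path into
\POSTSCRIPT\ operators. The function name is hypothetical, and closing of
cyclic paths is deliberately left out.

\starttyping
local function path_to_postscript(path)
  if #path == 0 then
    return ""
  end
  local out = { string.format("%f %f moveto",
    path[1].x_coord, path[1].y_coord) }
  for i = 2, #path do
    local p, q = path[i-1], path[i]
    -- postcontrol of the previous knot, precontrol of this knot
    out[#out+1] = string.format("%f %f %f %f %f %f curveto",
      p.right_x, p.right_y, q.left_x, q.left_y, q.x_coord, q.y_coord)
  end
  return table.concat(out, "\n")
end
\stoptyping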
4899 \subsubsection{Colors}
A color is an integer|-|indexed array with 0, 1, 3 or 4 values:
4903 \starttabulate[|l|l|p|]
4904 \NC 0 \NC marking only \NC no values \NC\NR
\NC 1 \NC greyscale \NC one value in the range $[0,1]$, \quote {black} is $0$ \NC\NR
\NC 3 \NC \RGB \NC three values in the range $[0,1]$, \quote {black} is $0,0,0$ \NC\NR
\NC 4 \NC \CMYK \NC four values in the range $[0,1]$, \quote {black} is $0,0,0,1$ \NC\NR
4908 \stoptabulate
4910 If the color model of the internal object was \type{uninitialized}, then
4911 it was initialized to the values representing \quote {black} in the colorspace
4912 \type{defaultcolormodel} that was in effect at the time of the \type{shipout}.
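
The number of values is enough to select the color model. The sketch below
maps a color array to the matching \PDF\ fill color operator; the function
name is hypothetical.

\starttyping
local function pdf_fill_color(color)
  local n = color and #color or 0
  if n == 1 then
    return string.format("%.3f g", color[1])
  elseif n == 3 then
    return string.format("%.3f %.3f %.3f rg",
      color[1], color[2], color[3])
  elseif n == 4 then
    return string.format("%.3f %.3f %.3f %.3f k",
      color[1], color[2], color[3], color[4])
  end
  return nil  -- marking only: there are no color values
end
\stoptyping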
4914 \subsubsection{Transforms}
4916 Each transform is a six-item array.
4918 \starttabulate[|l|l|p|]
4919 \NC 1 \NC number \NC represents x \NC\NR
4920 \NC 2 \NC number \NC represents y \NC\NR
4921 \NC 3 \NC number \NC represents xx \NC\NR
4922 \NC 4 \NC number \NC represents yx \NC\NR
4923 \NC 5 \NC number \NC represents xy \NC\NR
4924 \NC 6 \NC number \NC represents yy \NC\NR
4925 \stoptabulate
4927 Note that the translation (index 1 and 2) comes first. This differs
4928 from the ordering in \POSTSCRIPT, where the translation comes last.
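
Reordering a transform for use as a \POSTSCRIPT\ matrix is therefore a matter
of moving the first two entries to the end, as in this sketch with a
hypothetical function name:

\starttyping
local function to_postscript_matrix(t)
  -- input: x, y, xx, yx, xy, yy; output: xx, yx, xy, yy, x, y
  return { t[3], t[4], t[5], t[6], t[1], t[2] }
end
\stoptyping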
4930 \subsubsection{Dashes}
Each \type{dash} is a two-item hash, using the same model as \POSTSCRIPT\
for the representation of the dashlist. \type{dashes} is an array of
\quote {on} and \quote {off} values, and \type{offset} is the phase of the pattern.
4936 \starttabulate[|l|l|p|]
4937 \NC dashes \NC hash \NC an array of on-off numbers \NC\NR
4938 \NC offset \NC number \NC the starting offset value \NC\NR
4939 \stoptabulate
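
Because the model is the same, a dash table translates directly into a
\POSTSCRIPT\ \type{setdash} call. The function name in this sketch is
hypothetical.

\starttyping
local function dash_to_postscript(dash)
  local pattern = table.concat(dash.dashes or { }, " ")
  return string.format("[%s] %s setdash",
    pattern, tostring(dash.offset or 0))
end
\stoptyping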
4941 \subsection{Character size information}
4943 These functions find the size of a glyph in a defined font. The
4944 \type{fontname} is the same name as the argument to \type{infont};
4945 the \type{char} is a glyph id in the range 0 to 255; the returned
4946 \type{w} is in AFM units.
4948 \subsubsection{\luatex{mp:char_width}}
4950 \startfunctioncall
4951 <number> w = mp:char_width(<string> fontname, <number> char)
4952 \stopfunctioncall
4954 \subsubsection{\luatex{mp:char_height}}
4956 \startfunctioncall
4957 <number> w = mp:char_height(<string> fontname, <number> char)
4958 \stopfunctioncall
4960 \subsubsection{\luatex{mp:char_depth}}
4962 \startfunctioncall
4963 <number> w = mp:char_depth(<string> fontname, <number> char)
4964 \stopfunctioncall
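
The three functions can be wrapped into one helper, sketched here under the
assumption that \type{mp} is a working instance that is able to find the
font metrics. The function name and the font \type{cmr10} are examples.

\starttyping
local function char_box(mp, fontname, char)
  return mp:char_width(fontname, char),
         mp:char_height(fontname, char),
         mp:char_depth(fontname, char)
end

-- for instance: local w, h, d = char_box(mp, "cmr10", 65)
\stoptyping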
4966 \section{The \luatex{node} library}
4968 The \luatex{node} library contains functions that facilitate dealing
4969 with (lists of) nodes and their values. They allow you to create, alter,
4970 copy, delete, and insert \LUATEX\ node objects, the core
4971 objects within the typesetter.
4973 \LUATEX\ nodes are represented in \LUA\ as userdata with
4974 the metadata type \luatex{luatex.node}. The various parts within
4975 a node can be accessed using named fields.
4977 Each node has at least the three fields \type{next}, \type{id}, and
4978 \type{subtype}:
4980 \startitemize[intro]
4982 \item The \type{next} field returns the userdata
4983 object for the next node in a linked list of nodes, or
4984 \type{nil}, if there is no next node.
4986 \item The \type{id} indicates \TEX's \quote{node type}. The field \type{id}
4987 has a numeric value for efficiency reasons, but some of the library
4988 functions also accept a string value instead of \type{id}.
4990 \item The \type{subtype} is another number. It often gives further information
4991 about a node of a particular \type{id}, but it is most important when dealing
4992 with \quote{whatsits}, because they are differentiated solely based on their
4993 \type{subtype}.
4994 \stopitemize
4996 The other available fields depend on the \type{id} (and for \quote{whatsits}, the
4997 \type{subtype}) of the node. Further details on the various fields and their
4998 meanings are given in~\in{chapter}[nodes].
5000 Support for \type{unset} (alignment) nodes is partial:
5001 they can be queried and modified from \LUA\ code, but not created.
Nodes can be compared to each other, but such a comparison actually compares
indices into the node memory. This means that equality tests can only
be trusted under very limited conditions. They will not work correctly
in any situation where one of the two nodes has been freed and|/|or
reallocated: in that case, there will be false positives.
5009 At the moment, memory management of nodes should still be done
5010 explicitly by the user. Nodes are not \quote{seen} by the \LUA\
5011 garbage collector, so you have to call the node freeing functions
5012 yourself when you are no longer in need of a node (list). Nodes form
5013 linked lists without reference counting, so you have to be careful
5014 that when control returns back to \LUATEX\ itself, you have not
5015 deleted nodes that are still referenced from a \type{next} pointer
5016 elsewhere, and that you did not create nodes that are referenced more
5017 than once.
There are statistics available with regard to the allocated node memory,
which can be handy for tracing.
5022 \subsection{Node handling functions}
5024 \subsubsection{\luatex{node.is_node}}
5026 \startfunctioncall
5027 <boolean> t = node.is_node(<any> item)
5028 \stopfunctioncall
5030 This function returns true if the argument is a userdata object of
5031 type \type{<node>}.
5033 \subsubsection{\luatex{node.types}}
5035 \startfunctioncall
5036 <table> t = node.types()
5037 \stopfunctioncall
5039 This function returns an array that maps node id numbers to node type
5040 strings, providing an overview of the possible top|-|level \type{id}
5041 types.
5043 \subsubsection{\luatex{node.whatsits}}
5045 \startfunctioncall
5046 <table> t = node.whatsits()
5047 \stopfunctioncall
5049 \TEX's \quote{whatsits} all have the same \type{id}. The various subtypes
5050 are defined by their \type{subtype} fields. The function is much like
5051 \luatex{node.types}, except that it provides an array of \type{subtype}
5052 mappings.
5054 \subsubsection{\luatex{node.id}}
5056 \startfunctioncall
5057 <number> id = node.id(<string> type)
5058 \stopfunctioncall
5060 This converts a single type name to its internal numeric
5061 representation.
5063 \subsubsection{\luatex{node.subtype}}
5065 \startfunctioncall
5066 <number> subtype = node.subtype(<string> type)
5067 \stopfunctioncall
5069 This converts a single whatsit name to its internal numeric
5070 representation (\type{subtype}).
5072 \subsubsection{\luatex{node.type}}
5074 \startfunctioncall
5075 <string> type = node.type(<any> n)
5076 \stopfunctioncall
If the argument is a number, then this function converts an internal
numeric representation to an external string representation.
Otherwise, it will return the string \type{node} if the object
represents a node (this is new in 0.65), and \type{nil} otherwise.
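
The name and number conversions described above are each other's inverse. The
following is a minimal sketch that checks this; it assumes the code is placed
in a \LUA\ file that is loaded from within \LUATEX, and the variable names are
only illustrative.

\starttyping
-- Name to number, and back again.
local glyph_id = node.id("glyph")
assert(node.type(glyph_id) == "glyph")
assert(node.types()[glyph_id] == "glyph")

-- For a node object, node.type() reports "node".
local n = node.new(glyph_id)
assert(n.id == glyph_id)
assert(node.type(n) == "node")
node.free(n)
\stoptyping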
5083 \subsubsection{\luatex{node.fields}}
5085 \startfunctioncall
5086 <table> t = node.fields(<number> id)
5087 <table> t = node.fields(<number> id, <number> subtype)
5088 \stopfunctioncall
5090 This function returns an array of valid field names for a particular
5091 type of node. If you want to get the valid fields for a
5092 \quote{whatsit}, you have to supply the second argument also. In other
5093 cases, any given second argument will be silently ignored.
5095 This function accepts string \type{id} and \type{subtype} values as
5096 well.
5098 \subsubsection{\luatex{node.has_field}}
5100 \startfunctioncall
5101 <boolean> t = node.has_field(<node> n, <string> field)
5102 \stopfunctioncall
5104 This function returns a boolean that is only true if \type{n} is
5105 actually a node, and it has the field.
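
A short sketch of how the two introspection functions can be combined. It
assumes that glyph nodes have a \type{char} field (see~\in{chapter}[nodes]);
the name \type{no_such_field} is made up for this example. The loop uses
\type{pairs} so that no entry is missed, whatever the index range of the
returned array is.

\starttyping
local g = node.new("glyph")

-- has_field checks one name on an existing node.
assert(node.has_field(g, "char"))
assert(not node.has_field(g, "no_such_field"))

-- fields lists all names for a node type; a string id is accepted.
for index, name in pairs(node.fields("glyph")) do
  print(index, name)
end

node.free(g)
\stoptyping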
5107 \subsubsection{\luatex{node.new}}
5109 \startfunctioncall
5110 <node> n = node.new(<number> id)
5111 <node> n = node.new(<number> id, <number> subtype)
5112 \stopfunctioncall
5114 Creates a new node. All of the new node's fields are initialized to
5115 either zero or \type{nil} except for \type{id} and \type{subtype} (if
5116 supplied). If you want to create a new whatsit, then the second
5117 argument is required, otherwise it need not be present. As with all
5118 node functions, this function creates a node on the \TEX\ level.
5120 This function accepts string \type{id} and \type{subtype} values as
5121 well.
5123 \subsubsection{\luatex{node.free}}
5125 \startfunctioncall
5126 node.free(<node> n)
5127 \stopfunctioncall
5129 Removes the node \type{n} from \TEX's memory. Be careful: no checks
5130 are done on whether this node is still pointed to from a register or some
5131 \type{next} field: it is up to you to make sure that the internal data
5132 structures remain correct.
5134 \subsubsection{\luatex{node.flush_list}}
5136 \startfunctioncall
5137 node.flush_list(<node> n)
5138 \stopfunctioncall
Removes the node \type{n} and the complete node list following
\type{n} from \TEX's memory. Be careful: no checks are done on whether
any of these nodes is still pointed to from a register or some
\type{next} field: it is up to you to make sure that the internal data
structures remain correct.
5146 \subsubsection{\luatex{node.copy}}
5148 \startfunctioncall
5149 <node> m = node.copy(<node> n)
5150 \stopfunctioncall
5152 Creates a deep copy of node \type{n}, including all nested lists as in
5153 the case of a hlist or vlist node. Only the \type{next} field is not
5154 copied.
5156 \subsubsection{\luatex{node.copy_list}}
5158 \startfunctioncall
5159 <node> m = node.copy_list(<node> n)
5160 <node> m = node.copy_list(<node> n, <node> m)
5161 \stopfunctioncall
5163 Creates a deep copy of the node list that starts at \type{n}. If
5164 \type{m} is also given, the copy stops just before node \type{m}.
Note that you cannot copy attribute lists this way. Specialized functions for
dealing with attribute lists will be provided later but are not there yet.
However, there is normally no need to copy attribute lists: when you
assign to the \type{attr} field or make changes to specific attributes, the
needed copying and freeing takes place automatically.
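
The difference between \luatex{node.copy} and \luatex{node.copy_list} shows up
on a list of two nodes. This is a minimal sketch; the kern values are
arbitrary, and all nodes are freed explicitly at the end because nodes are not
seen by the garbage collector.

\starttyping
local a = node.new("kern")
local b = node.new("kern")
a.kern = 65536        -- 1pt in scaled points
b.kern = 2 * 65536
a.next = b

local single = node.copy(a)       -- copies a only; next is not copied
local whole  = node.copy_list(a)  -- copies a and b

assert(single.next == nil)
assert(single.kern == 65536)
assert(node.length(whole) == 2)
assert(whole.next.kern == 2 * 65536)

-- Three independent lists now exist; release each of them.
node.free(single)
node.flush_list(whole)
node.flush_list(a)
\stoptyping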
5172 \subsubsection{\luatex{node.next} (0.65)}
5174 \startfunctioncall
5175 <node> m = node.next(<node> n)
5176 \stopfunctioncall
5178 Returns the node following this node, or \type{nil} if there is no
5179 such node.
5181 \subsubsection{\luatex{node.prev} (0.65)}
5183 \startfunctioncall
5184 <node> m = node.prev(<node> n)
5185 \stopfunctioncall
5187 Returns the node preceding this node, or \type{nil} if there is no
5188 such node.
5191 \subsubsection{\luatex{node.current_attr} (0.66)}
5193 \startfunctioncall
5194 <node> m = node.current_attr()
5195 \stopfunctioncall
5197 Returns the currently active list of attributes, if there is one.
The intended usage of \type{current_attr} is as follows:

\starttyping
local x1 = node.new("glyph")
x1.attr = node.current_attr()
local x2 = node.new("glyph")
x2.attr = node.current_attr()
\stoptyping

or:

\starttyping
local x1 = node.new("glyph")
local x2 = node.new("glyph")
local ca = node.current_attr()
x1.attr = ca
x2.attr = ca
\stoptyping

The attribute lists are reference counted and the assignment takes care
of incrementing the reference count. You cannot expect the value \type {ca}
to be valid any more when you assign attributes (using \type
{tex.setattribute}) or when control has been passed back to \TEX.
5224 Note: this function is somewhat experimental, and it returns the {\it
5225 actual} attribute list, not a copy thereof.
5226 Therefore, changing any of the attributes in the list will change
5227 these values for all nodes that have the current attribute list
5228 assigned to them.
5231 \subsubsection{\luatex{node.hpack}}
5233 \startfunctioncall
5234 <node> h, <number> b = node.hpack(<node> n)
5235 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info)
5236 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info, <string> dir)
5237 \stopfunctioncall
5239 This function creates a new hlist by packaging the list that begins at node
5240 \type{n} into a horizontal box. With only a single argument, this box
5241 is created using the natural width of its components. In the three
5242 argument form, \type{info} must be either \type{additional} or
5243 \type{exactly}, and \type{w} is the additional (\tex{hbox spread})
5244 or exact (\tex{hbox to}) width to be used.
5246 Direction support added in \LUATEX\ 0.45.
The second return value is the badness of the generated box;
this extension was added in 0.51.
5251 Caveat: at this moment, there can be unexpected side|-|effects to this
5252 function, like updating some of the \tex{marks} and \tex{inserts}.
5253 Also note that the content of \type{h} is the original node list
5254 \type{n}: if you call \type{node.free(h)} you will also free the
5255 node list itself, unless you explicitly set the \type{list} field
5256 to \type{nil} beforehand. And in a similar way, calling
5257 \type{node.free(n)} will invalidate \type{h} as well!
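
The ownership caveat can be illustrated as follows. This is only a sketch: it
packs a single kern (an arbitrary choice that avoids fonts and glue) at its
natural width, then detaches the content before freeing the box, so that the
original list survives.

\starttyping
local k = node.new("kern")
k.kern = 10 * 65536                 -- 10pt in scaled points

local box, badness = node.hpack(k)  -- natural width
assert(box.width == 10 * 65536)
assert(box.list == k)               -- the box holds k itself, not a copy

box.list = nil                      -- detach the content first
node.free(box)                      -- now only the box node is freed
node.free(k)                        -- k is still valid; free it separately
\stoptyping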
5259 \subsubsection{\luatex{node.vpack} (since 0.36)}
5261 \startfunctioncall
5262 <node> h, <number> b = node.vpack(<node> n)
5263 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info)
5264 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info, <string> dir)
5265 \stopfunctioncall
5267 This function creates a new vlist by packaging the list that begins at node
5268 \type{n} into a vertical box. With only a single argument, this box
5269 is created using the natural height of its components. In the three
5270 argument form, \type{info} must be either \type{additional} or
5271 \type{exactly}, and \type{w} is the additional (\tex{vbox spread}) or exact (\tex{vbox to}) height to be used.
5273 Direction support added in \LUATEX\ 0.45.
The second return value is the badness of the generated box;
this extension was added in 0.51.
5278 See the description of \type{node.hpack()} for a few memory allocation
5279 caveats.
5281 \subsubsection{\luatex{node.dimensions} (0.43)}
5283 \startfunctioncall
5284 <number> w, <number> h, <number> d = node.dimensions(<node> n)
5285 <number> w, <number> h, <number> d = node.dimensions(<node> n, <string> dir)
5286 <number> w, <number> h, <number> d = node.dimensions(<node> n, <node> t)
5287 <number> w, <number> h, <number> d = node.dimensions(<node> n, <node> t, <string> dir)
5288 \stopfunctioncall
5290 This function calculates the natural in-line dimensions of the node
5291 list starting at node \type{n} and terminating just before node \type{t}
5292 (or the end of the list, if there is no second argument). The return values are scaled
5293 points. An alternative format that starts with glue parameters as the
5294 first three arguments is also possible:
5296 \startfunctioncall
5297 <number> w, <number> h, <number> d =
5298 node.dimensions(<number> glue_set, <number> glue_sign,
5299 <number> glue_order, <node> n)
5300 <number> w, <number> h, <number> d =
5301 node.dimensions(<number> glue_set, <number> glue_sign,
5302 <number> glue_order, <node> n, <string> dir)
5303 <number> w, <number> h, <number> d =
5304 node.dimensions(<number> glue_set, <number> glue_sign,
5305 <number> glue_order, <node> n, <node> t)
5306 <number> w, <number> h, <number> d =
5307 node.dimensions(<number> glue_set, <number> glue_sign,
5308 <number> glue_order, <node> n, <node> t, <string> dir)
5309 \stopfunctioncall
This calling method takes glue settings into account and is especially
useful for finding the actual width of a sublist of nodes that are
already boxed, for example in code like this, which prints the
width of the space in between the \type{a} and \type{b} as it would
be if \type{\box0} were used as-is:

\starttyping
\setbox0 = \hbox to 20pt {a b}

\directlua{print (node.dimensions(tex.box[0].glue_set,
                                  tex.box[0].glue_sign,
                                  tex.box[0].glue_order,
                                  tex.box[0].head.next,
                                  node.tail(tex.box[0].head))) }
\stoptyping
5327 Direction support added in \LUATEX\ 0.45.
5329 \subsubsection{\luatex{node.mlist_to_hlist}}
5331 \startfunctioncall
5332 <node> h = node.mlist_to_hlist(<node> n,
5333 <string> display_type, <boolean> penalties)
5334 \stopfunctioncall
5336 This runs the internal mlist to hlist conversion, converting the math list in
5337 \type{n} into the horizontal list \type{h}. The interface is exactly the same as
5338 for the callback \type{mlist_to_hlist}.
5340 \subsubsection{\luatex{node.slide}}
5342 \startfunctioncall
5343 <node> m = node.slide(<node> n)
5344 \stopfunctioncall
5346 Returns the last node of the node list that starts at \type{n}. As a
5347 side|-|effect, it also creates a reverse chain of \type{prev} pointers
5348 between nodes.
5350 \subsubsection{\luatex{node.tail}}
5352 \startfunctioncall
5353 <node> m = node.tail(<node> n)
5354 \stopfunctioncall
5356 Returns the last node of the node list that starts at \type{n}.
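
Both \luatex{node.slide} and \luatex{node.tail} return the last node. The
sketch below shows this on a two|-|node list and checks the \type{prev}
pointer that \luatex{node.slide} sets up; the kern nodes are placeholders.

\starttyping
local a = node.new("kern")
local b = node.new("kern")
a.next = b

assert(node.tail(a) == b)

local last = node.slide(a)
assert(last == b)
assert(b.prev == a)   -- reverse chain created by slide

node.flush_list(a)
\stoptyping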
5359 \subsubsection{\luatex{node.length}}
5361 \startfunctioncall
5362 <number> i = node.length(<node> n)
5363 <number> i = node.length(<node> n, <node> m)
5364 \stopfunctioncall
5366 Returns the number of nodes contained in the node list that starts at
5367 \type{n}. If \type{m} is also supplied it stops at \type{m} instead of
5368 at the end of the list. The node \type{m} is not counted.
5370 \subsubsection{\luatex{node.count}}
5372 \startfunctioncall
5373 <number> i = node.count(<number> id, <node> n)
5374 <number> i = node.count(<number> id, <node> n, <node> m)
5375 \stopfunctioncall
5377 Returns the number of nodes contained in the node list that starts at
5378 \type{n} that have a matching \type{id} field.
5379 If \type{m} is also supplied, counting stops at \type{m} instead of at
5380 the end of the list. The node \type{m} is not counted.
This function also accepts string \type{id} values.
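
A small sketch of the two counting functions on a hand|-|built list of two
kerns followed by a penalty (an arbitrary choice). It also shows that the stop
node \type{m} itself is excluded.

\starttyping
local a = node.new("kern")
local b = node.new("kern")
local c = node.new("penalty")
a.next = b
b.next = c

assert(node.length(a) == 3)
assert(node.length(a, c) == 2)          -- c is not counted
assert(node.count("kern", a) == 2)      -- string id
assert(node.count(node.id("penalty"), a, c) == 0)

node.flush_list(a)
\stoptyping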
5384 \subsubsection{\luatex{node.traverse}}
5386 \startfunctioncall
5387 <node> t = node.traverse(<node> n)
5388 \stopfunctioncall
This is a \LUA\ iterator that loops over the node list that starts at \type{n}.
Typical input code like this

\starttyping
for n in node.traverse(head) do
   -- loop body, using n
end
\stoptyping

is functionally equivalent to:

\starttyping
do
  local n
  local function f (head,var)
    local t
    if var == nil then
       t = head
    else
       t = var.next
    end
    return t
  end
  while true do
    n = f (head, n)
    if n == nil then break end
    -- loop body, using n
  end
end
\stoptyping
5421 It should be clear from the definition of the function \type{f} that
5422 even though it is possible to add or remove nodes from the node list while
5423 traversing, you have to take great care to make sure all the \type{next}
5424 (and \type{prev}) pointers remain valid.
5426 If the above is unclear to you, see the section \quote{For Statement}
5427 in the Lua Reference Manual.
5429 \subsubsection{\luatex{node.traverse_id}}
5431 \startfunctioncall
5432 <node> t = node.traverse_id(<number> id, <node> n)
5433 \stopfunctioncall
5435 This is an iterator that loops over all the nodes in the list that
5436 starts at \type{n} that have a matching \type{id} field.
5438 See the previous section for details. The change is in the local
5439 function \type{f}, which now does an extra while loop checking
5440 against the upvalue \type{id}:
\starttyping
local function f (head,var)
  local t
  if var == nil then
     t = head
  else
     t = var.next
  end
  -- skip nodes whose id does not match the upvalue id
  while t and t.id ~= id do
     t = t.next
  end
  return t
end
\stoptyping
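
In practice the iterator is used directly in a \type{for} loop. The following
sketch sums the \type{kern} fields of all kern nodes in a small hand|-|built
list; the list contents and values are made up for the example.

\starttyping
local a = node.new("kern")
local b = node.new("penalty")
local c = node.new("kern")
a.kern = 65536
c.kern = 3 * 65536
a.next = b
b.next = c

local total = 0
for k in node.traverse_id(node.id("kern"), a) do
  total = total + k.kern   -- the penalty node is skipped
end
assert(total == 4 * 65536)

node.flush_list(a)
\stoptyping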
5457 \subsubsection{\luatex{node.end_of_math} (0.76)}
5459 \startfunctioncall
5460 <node> t = node.end_of_math(<node> start)
5461 \stopfunctioncall
Looks for and returns the next \type{math_node} following the \type{start}.
If the given node is a math endnode, this helper returns that node; otherwise it
follows the list and returns the next math endnode. If no such node is found,
\type{nil} is returned.
5466 \subsubsection{\luatex{node.remove}}
5468 \startfunctioncall
5469 <node> head, current = node.remove(<node> head, <node> current)
5470 \stopfunctioncall
5472 This function removes the node \type{current} from the list following
5473 \type{head}. It is your responsibility to make sure it is really part
5474 of that list. The return values are the new \type{head} and
5475 \type{current} nodes. The returned \type{current} is the node
5476 following the \type{current} in the calling argument, and is only
5477 passed back as a convenience (or \type{nil}, if there is no such node). The
5478 returned \type{head} is more important, because if the function is
5479 called with \type{current} equal to \type{head}, it will be changed.
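
Because the iterators rely on valid \type{next} pointers, removing nodes is
often easier with an explicit loop around \luatex{node.remove}. A minimal
sketch follows; the helper name \type{remove_kerns} is hypothetical. The
removed node is only unlinked by \luatex{node.remove}, so the sketch frees it
separately.

\starttyping
local function remove_kerns(head)
  local kern_id = node.id("kern")
  local current = head
  while current do
    if current.id == kern_id then
      local removed = current
      -- head may change when the first node is removed
      head, current = node.remove(head, current)
      node.free(removed)
    else
      current = current.next
    end
  end
  return head
end

-- Usage on a hand-built list: kern, penalty, kern.
local k1 = node.new("kern")
local p  = node.new("penalty")
local k2 = node.new("kern")
k1.next = p
p.next = k2

local head = remove_kerns(k1)
assert(head == p and head.next == nil)
node.flush_list(head)
\stoptyping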
5481 \subsubsection{\luatex{node.insert_before}}
5483 \startfunctioncall
5484 <node> head, new = node.insert_before(<node> head, <node> current, <node> new)
5485 \stopfunctioncall
5487 This function inserts the node \type{new} before \type{current} into
5488 the list following \type{head}. It is your responsibility to make sure
5489 that \type{current} is really part of that list. The return values are
5490 the (potentially mutated) \type{head} and the node \type{new}, set up to
5491 be part of the list (with correct \type{next} field). If \type{head}
5492 is initially \type{nil}, it will become \type{new}.
5494 \subsubsection{\luatex{node.insert_after}}
5496 \startfunctioncall
5497 <node> head, new = node.insert_after(<node> head, <node> current, <node> new)
5498 \stopfunctioncall
5500 This function inserts the node \type{new} after \type{current} into
5501 the list following \type{head}. It is your responsibility to make sure
5502 that \type{current} is really part of that list. The return values are
5503 the \type{head} and the node \type{new}, set up to be part of the list
5504 (with correct \type{next} field). If \type{head} is initially
5505 \type{nil}, it will become \type{new}.
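
The two insertion functions can be sketched on a one|-|node list. The node
types and the penalty value are arbitrary; the point is that the returned
\type{head} must always be kept, because inserting before the first node
changes it.

\starttyping
local head = node.new("kern")

local pen = node.new("penalty")
pen.penalty = 10000
head, pen = node.insert_after(head, head, pen)
assert(head.next == pen)

local first = node.new("kern")
head, first = node.insert_before(head, head, first)
assert(head == first)             -- the head has changed
assert(node.length(head) == 3)

node.flush_list(head)
\stoptyping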
5507 \subsubsection{\luatex{node.first_glyph} (0.65)}
5509 \startfunctioncall
5510 <node> n = node.first_glyph(<node> n)
5511 <node> n = node.first_glyph(<node> n, <node> m)
5512 \stopfunctioncall
5514 Returns the first node in the list starting at \type{n} that is a
5515 glyph node with a subtype indicating it is a glyph, or \type{nil}.
5516 If \type{m} is given, processing stops at (but including) that node,
5517 otherwise processing stops at the end of the list.
5519 Note: this function used to be called \type{first_character}. It has
5520 been renamed in \LUATEX\ 0.65, and the old name is deprecated now.
5522 \subsubsection{\luatex{node.ligaturing}}
5524 \startfunctioncall
5525 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n)
5526 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n, <node> m)
5527 \stopfunctioncall
Apply \TEX|-|style ligaturing to the specified nodelist. The tail node
5530 \type{m} is optional. The two returned nodes \type{h} and \type{t} are
5531 the new head and tail (both \type{n} and \type{m} can change into
5532 a new ligature).
5534 \subsubsection{\luatex{node.kerning}}
5536 \startfunctioncall
5537 <node> h, <node> t, <boolean> success = node.kerning(<node> n)
5538 <node> h, <node> t, <boolean> success = node.kerning(<node> n, <node> m)
5539 \stopfunctioncall
5541 Apply \TEX|-|style kerning to the specified nodelist. The tail node
5542 \type{m} is optional. The two returned nodes \type{h} and \type{t} are
5543 the head and tail (either one of these can be an inserted kern node,
5544 because special kernings with word boundaries are possible).
5546 \subsubsection{\luatex{node.unprotect_glyphs}}
5548 \startfunctioncall
5549 node.unprotect_glyphs(<node> n)
5550 \stopfunctioncall
5552 Subtracts 256 from all glyph node subtypes. This and the next
5553 function are helpers to convert from \type{characters} to
5554 \type{glyphs} during node processing.
5556 \subsubsection{\luatex{node.protect_glyphs}}
5558 \startfunctioncall
5559 node.protect_glyphs(<node> n)
5560 \stopfunctioncall
5562 Adds 256 to all glyph node subtypes in the node list starting at
5563 \type{n}, except that if the value is 1, it adds only 255. The special
5564 handling of 1 means that \type{characters} will become \type{glyphs}
5565 after subtraction of 256.
5567 \subsubsection{\luatex{node.last_node}}
5569 \startfunctioncall
5570 <node> n = node.last_node()
5571 \stopfunctioncall
5573 This function pops the last node from \TEX's \quote{current list}.
5574 It returns that node, or \type{nil} if the current list is empty.
5576 \subsubsection{\luatex{node.write}}
5578 \startfunctioncall
5579 node.write(<node> n)
5580 \stopfunctioncall
5582 This is an experimental function that will append a node list to
5583 \TEX's \quote {current list} (the node list is not deep-copied
5584 any more since version 0.38). There is no error checking yet!
5586 \subsubsection{\luatex{node.protrusion_skippable} (0.60.1)}
5587 \startfunctioncall
5588 <boolean> skippable = node.protrusion_skippable(<node> n)
5589 \stopfunctioncall
5591 Returns \type{true} if, for the purpose of line boundary discovery
5592 when character protrusion is active, this node can be skipped.
5594 \subsection{Attribute handling}
Attributes appear as a linked list of userdata objects in the
5597 \type{attr} field of individual nodes. They can be handled
5598 individually, but it is much safer and more efficient to use the
5599 dedicated functions associated with them.
5601 \subsubsection{\luatex{node.has_attribute}}
5603 \startfunctioncall
5604 <number> v = node.has_attribute(<node> n, <number> id)
5605 <number> v = node.has_attribute(<node> n, <number> id, <number> val)
5606 \stopfunctioncall
5608 Tests if a node has the attribute with number \type{id} set. If
5609 \type{val} is also supplied, also tests if the value matches \type{val}.
5610 It returns the value, or, if no match is found, \type{nil}.
5612 \subsubsection{\luatex{node.set_attribute}}
5614 \startfunctioncall
5615 node.set_attribute(<node> n, <number> id, <number> val)
5616 \stopfunctioncall
5618 Sets the attribute with number \type{id} to the value
5619 \type{val}. Duplicate assignments are ignored. {\em [needs explanation]}
5621 \subsubsection{\luatex{node.unset_attribute}}
5623 \startfunctioncall
5624 <number> v = node.unset_attribute(<node> n, <number> id)
5625 <number> v = node.unset_attribute(<node> n, <number> id, <number> val)
5626 \stopfunctioncall
5628 Unsets the attribute with number \type{id}. If \type{val} is also supplied,
5629 it will only perform this operation if the value matches \type{val}.
5630 Missing attributes or attribute|-|value pairs are ignored.
5632 If the attribute was actually deleted, returns its old
5633 value. Otherwise, returns \type{nil}.
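
The three attribute functions can be exercised together. This is a sketch
only: the attribute number 100 and the values 7 and 8 are arbitrary choices
for the example, not values with any predefined meaning.

\starttyping
local n = node.new("glyph")

node.set_attribute(n, 100, 7)
assert(node.has_attribute(n, 100) == 7)       -- set, value returned
assert(node.has_attribute(n, 100, 8) == nil)  -- value does not match

assert(node.unset_attribute(n, 100) == 7)     -- old value returned
assert(node.has_attribute(n, 100) == nil)     -- no longer set

node.free(n)
\stoptyping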
5635 \section{The \luatex{pdf} library}
5637 This contains variables and functions that are related to the \PDF\ backend.
5639 %***********************************************************************
5641 \subsection{\luatex{pdf.mapfile}, \luatex{pdf.mapline} (new in 0.53.0)}
5643 \startfunctioncall
5644 pdf.mapfile(<string> map file)
5645 pdf.mapline(<string> map line)
5646 \stopfunctioncall
5648 These two functions can be used to replace primitives \type{\pdfmapfile}
5649 and \type{\pdfmapline} from \PDFTEX. They expect a string as only parameter
5650 and have no return value.
The functions also replace the former variables
\luatex{pdf.pdfmapfile} and \luatex{pdf.pdfmapline}.
5655 %***********************************************************************
5656 \subsection{\luatex{pdf.catalog}, \luatex{pdf.info},
5657 \luatex{pdf.names}, \luatex{pdf.trailer} (new in 0.53.0)}
5659 These variables offer a read|-|write interface to the corresponding
5660 \PDFTEX\ token lists. The value types are strings and they are
5661 written out to the \PDF\ file directly after the \PDFTEX\ token registers.
The preferred interface is now \luatex {pdf.setcatalog}, \luatex {pdf.setinfo},
\luatex {pdf.setnames} and \luatex {pdf.settrailer} for setting these properties
and \luatex {pdf.getcatalog}, \luatex {pdf.getinfo}, \luatex {pdf.getnames} and
\luatex {pdf.gettrailer} for querying them.
5668 The corresponding \quote {\type{pdf}} parameter names \luatex {pdf.pdfcatalog},
5669 \luatex {pdf.pdfinfo}, \luatex {pdf.pdfnames}, and \luatex {pdf.pdftrailer} are
5670 removed in 0.79.0.
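
A minimal sketch of one setter and getter pair. The \PDF\ fragment is an
arbitrary example value, and the code assumes a \LUATEX\ version that provides
these functions.

\starttyping
-- Add an entry to the document information dictionary.
pdf.setinfo("/Subject (An example subject)")

-- Query what is currently stored and report it.
local info = pdf.getinfo()
texio.write_nl("info: " .. tostring(info))
\stoptyping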
5672 %***********************************************************************
5673 \subsection{\luatex{pdf.<set/get>pageattributes}, \luatex{pdf.<set/get>pageresources},
5674 \luatex{pdf.<set/get>pagesattributes}}
5676 These variables offer a read|-|write interface to related
5677 token lists. The value types are strings. The variables have no
5678 interaction with the corresponding \PDFTEX\ token registers
5679 \tex{pdfpageattr}, \tex{pdfpageresources}, and \tex{pdfpagesattr}.
5680 They are written out to the \PDF\ file directly after
5681 the \PDFTEX\ token registers.
The preferred interface is now \luatex {pdf.setpageattributes}, \luatex
{pdf.setpagesattributes} and \luatex {pdf.setpageresources} for setting these
properties and \luatex {pdf.getpageattributes}, \luatex {pdf.getpagesattributes} and
\luatex {pdf.getpageresources} for querying them.
5688 %***********************************************************************
5690 \subsection{\luatex{pdf.h}, \luatex{pdf.v}}
5693 These are the \type{h} and \type{v} values that define the current location
5694 on the output page, measured from its lower left corner. The values can be queried
5695 using scaled points as units.
5697 \starttyping
5698 local h = pdf.h
5699 local v = pdf.v
5700 \stoptyping
5702 \subsection{\luatex{pdf.getpos}, \luatex{pdf.gethpos}, \luatex{pdf.getvpos}}
These are the function variants of \type {pdf.h} and \type {pdf.v}. Sometimes
using a function is preferred over a key, so this saves wrapping. Also, these
functions are faster than the key based access, as the \type {h} and \type {v}
keys are not real variables but are looked up using a metatable call. The
\type {getpos} function returns two values, the others return one.
5710 \starttyping
5711 local h, v = pdf.getpos()
5712 \stoptyping
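
The position is normally only meaningful while a page is being shipped out, so
these functions are typically called from \tex{latelua}. The helper below is a
sketch; its name \type{report_position} is hypothetical, and it converts
scaled points to points by dividing by 65536.

\starttyping
function report_position(label)
  local h, v = pdf.getpos()
  texio.write_nl(label .. ": h=" .. h / 65536 .. "pt, v=" .. v / 65536 .. "pt")
end
\stoptyping

Once this function has been loaded from a \LUA\ file, it could be called at
the point of interest with \type{\latelua{report_position("here")}}.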
5714 \subsection{\luatex{pdf.hasmatrix}, \luatex{pdf.getmatrix}}
5716 The current matrix transformation is available via the \type {getmatrix} command,
5717 which returns 6 values: \type {sx}, \type {rx}, \type {ry}, \type {sy}, \type {tx},
5718 and \type {ty}. The \type {hasmatrix} function returns \type {true} when a matrix is
5719 applied.
\starttyping
if pdf.hasmatrix() then
    local sx, rx, ry, sy, tx, ty = pdf.getmatrix()
    -- do something useful or not
end
\stoptyping
5730 \subsection{\luatex{pdf.print}}
5732 A print function to write stuff to the \PDF\ document
5733 that can be used from within a \tex{latelua} argument.
5734 This function is not to be used inside \tex{directlua}
5735 unless you know {\it exactly} what you are doing.
5737 \startfunctioncall
5738 pdf.print(<string> s)
5739 pdf.print(<string> type, <string> s)
5740 \stopfunctioncall
5742 The optional parameter can be used to mimic the behavior of
5743 \tex{pdfliteral}: the \type{type} is \type{direct} or \type{page}.
5745 \subsection{\luatex{pdf.immediateobj}}
5747 This function creates a \PDF\ object
5748 and immediately writes it to the \PDF\ file.
5749 It is modelled after \PDFTEX's \tex{immediate}\tex{pdfobj} primitives.
5750 All function variants return the object number
5751 of the newly generated object.
5753 \startfunctioncall
5754 <number> n = pdf.immediateobj(<string> objtext)
5755 <number> n = pdf.immediateobj("file", <string> filename)
5756 <number> n = pdf.immediateobj("stream", <string> streamtext, <string> attrtext)
5757 <number> n = pdf.immediateobj("streamfile", <string> filename, <string> attrtext)
5758 \stopfunctioncall
The first version puts the \type{objtext} raw into an object.
Only the object wrapper is automatically generated,
but any internal structure (like \type{<< >>} dictionary markers)
needs to be provided by the user.
The second version with keyword \type{"file"} as first argument
puts the contents of the file with name \type{filename} raw into the object.
The third version with keyword \type{"stream"} creates a stream object
and puts the \type{streamtext} raw into the stream.
The stream length is automatically calculated.
The optional \type{attrtext} goes into the dictionary of that object.
The fourth version with keyword \type{"streamfile"} does the same as the third one,
except that it reads the stream data raw from a file.
5773 An optional first argument can be given to make the function use a
5774 previously reserved \PDF\ object.
5776 \startfunctioncall
5777 <number> n = pdf.immediateobj(<integer> n, <string> objtext)
5778 <number> n = pdf.immediateobj(<integer> n, "file", <string> filename)
5779 <number> n = pdf.immediateobj(<integer> n, "stream", <string> streamtext, <string> attrtext)
5780 <number> n = pdf.immediateobj(<integer> n, "streamfile", <string> filename, <string> attrtext)
5781 \stopfunctioncall
5783 %***********************************************************************
5785 \subsection{\luatex{pdf.obj}}
5787 This function creates a \PDF\ object,
5788 which is written to the \PDF\ file only when referenced,
5789 e.\,g., by \luatex{pdf.refobj()}.
5791 All function variants return the object number of the newly generated
5792 object, and there are two separate calling modes.
5794 The first mode is modelled after \PDFTEX's \tex{pdfobj} primitive.
5796 \startfunctioncall
5797 <number> n = pdf.obj(<string> objtext)
5798 <number> n = pdf.obj("file", <string> filename)
5799 <number> n = pdf.obj("stream", <string> streamtext, <string> attrtext)
5800 <number> n = pdf.obj("streamfile", <string> filename, <string> attrtext)
5801 \stopfunctioncall
5803 An optional first argument can be given to make the function use a
5804 previously reserved \PDF\ object.
5806 \startfunctioncall
5807 <number> n = pdf.obj(<integer> n, <string> objtext)
5808 <number> n = pdf.obj(<integer> n, "file", <string> filename)
5809 <number> n = pdf.obj(<integer> n, "stream", <string> streamtext, <string> attrtext)
5810 <number> n = pdf.obj(<integer> n, "streamfile", <string> filename, <string> attrtext)
5811 \stopfunctioncall
5813 The second mode accepts a single argument table with key--value pairs.
5815 \startfunctioncall
<number> n = pdf.obj{ type           = <string>,
                      immediate      = <boolean>,
                      objnum         = <number>,
                      attr           = <string>,
                      compresslevel  = <number>,
                      objcompression = <boolean>,
                      file           = <string>,
                      string         = <string>}
5824 \stopfunctioncall
The \type{type} field can have the values \type{raw} and
\type{stream}. This field is required; the others are optional
(within constraints).
5830 Note: this mode makes \type{pdf.obj} look more flexible than it
5831 actually is: the constraints from the separate parameter version
5832 still apply, so for example you can't have both \type{string} and
5833 \type{file} at the same time.
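
A sketch of the table form, creating a stream object from a string and then
referencing it so that it is actually written. The stream data and the
\type{attr} value are made up for the example, and \PDF\ output mode is
assumed.

\starttyping
local n = pdf.obj{
  type   = "stream",
  string = "Hello, stream",   -- stream data
  attr   = "/Kind /Demo",     -- extra entries for the stream dictionary
}

-- Without a reference the object would not be written out.
pdf.refobj(n)
\stoptyping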
5835 %***********************************************************************
5837 \subsection{\luatex{pdf.refobj}}
5839 This function,
5840 the \LUA\ version of the \tex{pdfrefobj} primitive,
5841 references an object by its object number,
5842 so that the object will be written out.
5844 \startfunctioncall
5845 pdf.refobj(<integer> n)
5846 \stopfunctioncall
This function works in both the \tex{directlua} and \tex{latelua} environments.
Inside \tex{directlua} a new whatsit node
\quote{pdf_refobj} is created, which will be marked for flushing during
page output; the object is then written directly after the page,
at the point where the resource objects are also written out.
Inside \tex{latelua} the object will be marked for flushing.
5855 This function has no return values.
5857 %***********************************************************************
5859 \subsection{\luatex{pdf.reserveobj}}
5861 This function creates an empty \PDF\ object and returns its number.
5863 \startfunctioncall
5864 <number> n = pdf.reserveobj()
5865 <number> n = pdf.reserveobj("annot")
5866 \stopfunctioncall
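
A typical use, sketched here under the assumption that the object content is
not yet known when its number is needed, is to reserve the number first, use
it in a reference, and fill the object later with \type{pdf.obj} and
\type{pdf.refobj} as described above. The dictionary content is hypothetical.

\starttyping
-- reserve a number now, so that other objects can already refer to it
local n = pdf.reserveobj()
local reference = string.format("%d 0 R", n) -- an indirect PDF reference

-- later on: give the reserved object its content (made-up dictionary)
pdf.obj(n, "<< /ExampleKey (filled in later) >>")

-- and make sure that the object is actually written out
pdf.refobj(n)
\stoptyping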
5868 \subsection{\luatex{pdf.registerannot} (new in 0.47.0)}
5870 This function adds an object number to the \type{/Annots} array for the
5871 current page without doing anything else. This function can only be
5872 used from within \type{\latelua}.
5874 \startfunctioncall
5875 pdf.registerannot (<number> objnum)
5876 \stopfunctioncall
5878 \section{The \luatex{pdfscanner} library (new in 0.72.0)}
5880 The \luatex{pdfscanner} library allows interpretation of PDF content streams
5881 and \type{/ToUnicode} (cmap) streams. You can get those streams from the
5882 \luatex{epdf} library, as explained in an earlier section. There is only
5883 a single top|-|level function in this library:
5885 \startfunctioncall
5886 pdfscanner.scan (<Object> stream, <table> operatortable, <table> info)
5887 \stopfunctioncall
5889 The first argument, \type{stream}, should be either a PDF stream
5890 object, or a PDF array of PDF stream objects (those options comprise
5891 the possible return values of \type{<Page>:getContents()}
5892 and \type{<Object>:getStream()} in the \type{epdf} library).
5894 The second argument, \type{operatortable}, should be a Lua table where
5895 the keys are PDF operator name strings and the values are Lua
5896 functions (defined by you) that are used to process those
5897 operators. The functions are called whenever the scanner finds one
5898 of these PDF operators in the content stream(s). The functions are
called with two arguments: the \type{scanner} object itself, and
the \type{info} table that was passed as the third argument
to \type{pdfscanner.scan}.
5903 Internally, \type{pdfscanner.scan} loops over the PDF operators in the
5904 stream(s), collecting operands on an internal stack until it finds a
5905 PDF operator. If that PDF operator's name exists
5906 in \type{operatortable}, then the associated function is
5907 executed. After the function has run (or when there is no function to
5908 execute) the internal operand stack is cleared in preparation for the
5909 next operator, and processing continues.
The \type{scanner} argument to the processing functions is needed
because it offers various methods to get the actual operands from the
internal operand stack. The most important of those methods is
\type{pop}, which is used in the example below and described after it.
5916 A simple example of processing a PDF's document stream
5917 could look like this:
5919 \starttyping
-- handler for the Do operator: report the XObject and recurse into forms
function Do (scanner, info)
  local val  = scanner:pop()
  local name = val[2] -- val[1] == 'name'
  print (info.space ..'Use XObject '.. name)
  local resources = info.resources
  local xobject   = resources:lookup("XObject"):getDict():lookup(name)
  if (xobject and xobject:isStream()) then
    local dict = xobject:getStream():getDict()
    if dict then
      local name = dict:lookup('Subtype')
      if name:getName() == 'Form' then
        local newinfo = { space     = info.space .. " " ,
                          resources = dict:lookup('Resources'):getDict() }
        pdfscanner.scan(xobject, operatortable, newinfo)
      end
    end
  end
end

operatortable = {Do = Do}

-- scan the content of every page of the file named on the command line
doc = epdf.open(arg[1])
pagenum = 1
while pagenum <= doc:getNumPages() do
  local page = doc:getCatalog():getPage(pagenum)
  local info = { space = " " , resources = page:getResourceDict()}
  print ('Page ' .. pagenum)
  pdfscanner.scan(page:getContents(), operatortable, info)
  pagenum = pagenum + 1
end
5949 \stoptyping
This example iterates over all the actual content in the PDF, and
prints out the XObject names it finds. While the code demonstrates quite
a few of the \type{epdf} functions, let's focus on the
\type{pdfscanner}-specific code instead.
5956 From the bottom up, the line
5958 \starttyping
5959 pdfscanner.scan(page:getContents(), operatortable, info)
5960 \stoptyping
5962 runs the scanner with the PDF page's top-level content.
5964 The third argument, \type{info}, contains two entries: \type{space} is
5965 used to indent the printed output, and \type{resources} is needed so
5966 that embedded \type{XForms} can find their own content.
The second argument, \type{operatortable}, defines a processing function
for a single PDF operator, \type{Do}.
5971 The function \type{Do} prints the name of the current XObject, and
5972 then starts a new scanner for that object's content stream, under the
condition that the XObject is in fact a \type{/Form}. That nested
scanner is called with a new \type{info} argument with an
updated \type{space} value so that the indentation of the output nicely
nests, and with a new \type{resources} field to help the next
iteration down to properly process any other, embedded XObjects.
Of course, this is not a very useful example in practice, but for the
purpose of demonstrating \type{pdfscanner}, it is just long enough.
It makes use of only one \type{scanner} method: \type{scanner:pop()}.
That function pops the top operand of the internal stack, and returns
a \LUA\ table where the object at index one is a string representing
the type of the operand, and object two is its value.

The list of possible operand types and associated \LUA\ value types is:
5988 \starttabulate[|lT|p|]
5989 \NC integer \NC <number> \NC \NR
5990 \NC real \NC <number> \NC \NR
5991 \NC boolean \NC <boolean> \NC \NR
5992 \NC name \NC <string> \NC \NR
5993 \NC operator \NC <string> \NC \NR
5994 \NC string \NC <string> \NC \NR
5995 \NC array \NC <table> \NC \NR
5996 \NC dict \NC <table> \NC \NR
5997 \stoptabulate
5999 In case of \type{integer} or \type{real}, the value is always
6000 a Lua (floating point) number.
6002 In case of \type{name}, the leading slash is always stripped.
In case of \type{string}, please bear in mind that PDF actually
supports different types of strings (with different encodings) in
different parts of the PDF document, so you may need to reencode some of
the results; \type{pdfscanner} always outputs the byte stream without
reencoding anything. \type{pdfscanner} does not differentiate between
literal strings and hexadecimal strings (the hexadecimal values are
decoded), and it treats the stream data for inline images as a string
that is the single operand for \type{EI}.
6013 In case of \type{array}, the table content is a list of \type{pop}
6014 return values.
6016 In case of \type{dict}, the table keys are PDF name strings
6017 and the values are \type{pop} return values.
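
To make the nesting of \type{pop} return values concrete, here is a hedged
sketch of a handler for the PDF operator \type{TJ}, whose single operand is
an array of strings and numbers. The handler name and the way the pieces are
printed are illustrative choices; no reencoding is attempted.

\starttyping
-- collect the string pieces of a TJ operand
local function TJ (scanner, info)
  local val = scanner:pop()            -- expected: { "array", { ... } }
  if val and val[1] == "array" then
    local pieces = { }
    for _, item in ipairs(val[2]) do   -- each item is again { type, value }
      if item[1] == "string" then
        pieces[#pieces + 1] = item[2]  -- raw bytes, not reencoded
      end
    end
    print((info.space or "") .. table.concat(pieces))
  end
end

local operatortable = { TJ = TJ }
\stoptyping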
6019 \blank
There are a few more methods that you can call on \type{scanner}:
6023 \starttabulate[|lT|p|]
6024 \NC pop \NC as explained above\NC \NR
6025 \NC popNumber \NC return only the value of a \type{real} or \type{integer}\NC \NR
6026 \NC popName \NC return only the value of a \type{name} \NC \NR
6027 \NC popString \NC return only the value of a \type{string} \NC \NR
6028 \NC popArray \NC return only the value of a \type{array} \NC \NR
6029 \NC popDict \NC return only the value of a \type{dict} \NC \NR
6030 \NC popBool \NC return only the value of a \type{boolean} \NC \NR
6031 \NC done \NC abort further processing of this \type{scan()} call\NC \NR
6032 \stoptabulate
6034 The \type{popXXX} are convenience functions, and come in handy when
6035 you know the type of the operands beforehand (which you usually do, in
6036 PDF). For example, the \type{Do} function could have used \type{local
6037 name = scanner:popName()} instead, because the single operand
6038 to the \type{Do} operator is always a PDF name object.
6040 The \type{done} function allows you to abort processing of a stream
6041 once you have learned everything you want to learn. This comes in handy
6042 while parsing \type{/ToUnicode}, because there usually is trailing
garbage that you are not interested in. Without \type{done}, processing
only ends at the end of the stream, possibly wasting CPU cycles.
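
The following sketch shows one way \type{done} might be used: it collects
the operands of the \type{Tj} operator and stops scanning once a limit is
reached. The \type{strings} and \type{limit} fields of the \type{info} table
are hypothetical names chosen for this example.

\starttyping
local function Tj (scanner, info)
  -- Tj has a single string operand
  info.strings[#info.strings + 1] = scanner:popString()
  if #info.strings >= info.limit then
    scanner:done() -- abort the rest of this scan() call
  end
end

local operatortable = { Tj = Tj }
local info          = { strings = { }, limit = 10 }

-- with a stream obtained from the epdf library one would then call:
-- pdfscanner.scan(stream, operatortable, info)
\stoptyping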
6046 \section{The \luatex{status} library}
6048 This contains a number of run|-|time configuration items that
6049 you may find useful in message reporting, as well as an iterator
6050 function that gets all of the names and values as a table.
6052 \startfunctioncall
6053 <table> info = status.list()
6054 \stopfunctioncall
The keys in the table are the known items; the values are their
current values. Almost all of the values in \type{status} are
6058 fetched through a metatable at run|-|time whenever they are
6059 accessed, so you cannot use \type{pairs} on \type{status}, but you
6060 {\it can\/} use \type{pairs} on \type{info}, of course. If you do
6061 not need the full list, you can also ask for a single item by
6062 using its name as an index into \type{status}.
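
A small sketch of both access styles follows. The sorting step is only there
to make the output easier to read; \type{total_pages} is one of the items
from the list in this section.

\starttyping
-- full list: iterate over the table returned by status.list()
local info  = status.list()
local names = { }
for name in pairs(info) do
  names[#names + 1] = name
end
table.sort(names)
for _, name in ipairs(names) do
  print(name, tostring(info[name]))
end

-- single item: index status directly
print(status.total_pages)
\stoptyping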
6064 The current list is:
6066 \starttabulate[|lT|p|]
6067 \NC \ssbf key \NC \bf explanation \NC\NR
6068 \NC pdf_gone\NC written \PDF\ bytes \NC \NR
6069 \NC pdf_ptr\NC not yet written \PDF\ bytes \NC \NR
6070 \NC dvi_gone\NC written \DVI\ bytes \NC \NR
6071 \NC dvi_ptr\NC not yet written \DVI\ bytes \NC \NR
6072 \NC total_pages\NC number of written pages \NC \NR
6073 \NC output_file_name\NC name of the \PDF\ or \DVI\ file \NC \NR
6074 \NC log_name\NC name of the log file \NC \NR
6075 \NC banner\NC terminal display banner \NC \NR
6076 \NC var_used\NC variable (one|-|word) memory in use \NC \NR
6077 \NC dyn_used\NC token (multi|-|word) memory in use \NC \NR
6078 \NC str_ptr\NC number of strings \NC \NR
6079 \NC init_str_ptr\NC number of \INITEX\ strings \NC \NR
6080 \NC max_strings\NC maximum allowed strings \NC \NR
6081 \NC pool_ptr\NC string pool index \NC \NR
6082 \NC init_pool_ptr\NC \INITEX\ string pool index \NC \NR
6083 \NC pool_size\NC current size allocated for string characters \NC \NR
6084 \NC node_mem_usage\NC a string giving insight into currently used nodes\NC\NR
6085 \NC var_mem_max\NC number of allocated words for nodes\NC \NR
6086 \NC fix_mem_max\NC number of allocated words for tokens\NC \NR
6087 \NC fix_mem_end\NC maximum number of used tokens\NC \NR
6088 \NC cs_count\NC number of control sequences \NC \NR
6089 \NC hash_size\NC size of hash \NC \NR
6090 \NC hash_extra\NC extra allowed hash \NC \NR
6091 \NC font_ptr\NC number of active fonts \NC \NR
6092 \NC max_in_stack\NC max used input stack entries \NC \NR
6093 \NC max_nest_stack\NC max used nesting stack entries \NC \NR
6094 \NC max_param_stack\NC max used parameter stack entries \NC \NR
6095 \NC max_buf_stack\NC max used buffer position \NC \NR
6096 \NC max_save_stack\NC max used save stack entries \NC \NR
6097 \NC stack_size\NC input stack size \NC \NR
6098 \NC nest_size\NC nesting stack size \NC \NR
6099 \NC param_size\NC parameter stack size \NC \NR
6100 \NC buf_size\NC current allocated size of the line buffer \NC \NR
6101 \NC save_size\NC save stack size \NC \NR
6102 \NC obj_ptr\NC max \PDF\ object pointer \NC \NR
6103 \NC obj_tab_size\NC \PDF\ object table size \NC \NR
6104 \NC pdf_os_cntr\NC max \PDF\ object stream pointer \NC \NR
6105 \NC pdf_os_objidx\NC \PDF\ object stream index \NC \NR
6106 \NC pdf_dest_names_ptr\NC max \PDF\ destination pointer \NC \NR
6107 \NC dest_names_size\NC \PDF\ destination table size \NC \NR
6108 \NC pdf_mem_ptr\NC max \PDF\ memory used \NC \NR
6109 \NC pdf_mem_size\NC \PDF\ memory size \NC \NR
6110 \NC largest_used_mark\NC max referenced marks class \NC \NR
6111 \NC filename\NC name of the current input file \NC \NR
6112 \NC inputid\NC numeric id of the current input \NC \NR
6113 \NC linenumber\NC location in the current input file\NC \NR
6114 \NC lasterrorstring\NC last error string\NC \NR
6115 \NC luabytecodes\NC number of active \LUA\ bytecode registers\NC \NR
6116 \NC luabytecode_bytes\NC number of bytes in \LUA\ bytecode registers\NC \NR
6117 \NC luastate_bytes\NC number of bytes in use by \LUA\ interpreters\NC \NR
6118 \NC output_active\NC \type{true} if the \tex{output} routine is active\NC \NR
6119 \NC callbacks\NC total number of executed callbacks so far\NC \NR
6120 \NC indirect_callbacks\NC number of those that were themselves
6121 a result of other callbacks (e.g. file readers)\NC \NR
6122 \NC luatex_svn\NC the luatex repository id (added in 0.51)\NC\NR
6123 \NC luatex_version\NC the luatex version number (added in 0.38)\NC\NR
6124 \NC luatex_revision\NC the luatex revision string (added in 0.38)\NC\NR
6125 \NC ini_version\NC \type{true} if this is an \INITEX\ run (added in 0.38)\NC\NR
6126 \stoptabulate
6129 \section{The \luatex{tex} library}
6131 The \luatex{tex} table contains a large list of virtual internal \TEX\
6132 parameters that are partially writable.
6134 The designation \quote{virtual} means that these items are not properly
6135 defined in \LUA, but are only front\-ends that are handled by a metatable
6136 that operates on the actual \TEX\ values. As a result, most of the \LUA\
6137 table operators (like \type{pairs} and \type{#}) do not work on such
6138 items.
6140 At the moment, it is possible to access almost every parameter
6141 that has these characteristics:
6143 \startitemize[packed]
6144 \item You can use it after \tex{the}
6145 \item It is a single token.
6146 \item Some special others, see the list below
6147 \stopitemize
6149 This excludes parameters that need extra arguments, like
6150 \tex{the}\tex{scriptfont}.
The subset comprising simple integer and dimension registers is
writable as well as readable (for example \tex{tracingcommands} and
\tex{parindent}).
6156 \subsection{Internal parameter values}
6158 For all the parameters in this section, it is possible to access them
6159 directly using their names as index in the \type{tex} table, or by
6160 using one of the functions \type{tex.get()} and \type{tex.set()}.
6162 The exact parameters and return values differ depending on the actual
6163 parameter, and so does whether \type{tex.set} has any effect. For the
6164 parameters that {\it can\/} be set, it is possible to use
6165 \type{'global'} as the first argument to \type{tex.set}; this makes
6166 the assignment global instead of local.
6168 \startfunctioncall
6169 tex.set (<string> n, ...)
6170 tex.set ('global', <string> n, ...)
6171 ... = tex.get (<string> n)
6172 \stopfunctioncall
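
As an illustration, the calls below set and query \tex{tolerance}, one of
the writable integer parameters. The value 500 is arbitrary.

\starttyping
tex.set("tolerance", 500)           -- local assignment
tex.set("global", "tolerance", 500) -- global assignment
local tolerance = tex.get("tolerance")

-- the same parameter through direct table access
tex.tolerance = 500
print(tolerance, tex.tolerance)
\stoptyping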
6174 \subsubsection{Integer parameters}
6176 The integer parameters accept and return \LUA\ numbers.
6178 Read-write:
6180 \startcolumns[n=2]
6181 \starttyping
6182 tex.adjdemerits
6183 tex.binoppenalty
6184 tex.brokenpenalty
6185 tex.catcodetable
6186 tex.clubpenalty
6187 tex.day
6188 tex.defaulthyphenchar
6189 tex.defaultskewchar
6190 tex.delimiterfactor
6191 tex.displaywidowpenalty
6192 tex.doublehyphendemerits
6193 tex.endlinechar
6194 tex.errorcontextlines
6195 tex.escapechar
6196 tex.exhyphenpenalty
6197 tex.fam
6198 tex.finalhyphendemerits
6199 tex.floatingpenalty
6200 tex.globaldefs
6201 tex.hangafter
6202 tex.hbadness
6203 tex.holdinginserts
6204 tex.hyphenpenalty
6205 tex.interlinepenalty
6206 tex.language
6207 tex.lastlinefit
6208 tex.lefthyphenmin
6209 tex.linepenalty
6210 tex.localbrokenpenalty
6211 tex.localinterlinepenalty
6212 tex.looseness
6213 tex.mag
6214 tex.maxdeadcycles
6215 tex.month
6216 tex.newlinechar
6217 tex.outputpenalty
6218 tex.pausing
6219 tex.pdfadjustspacing
6220 tex.pdfcompresslevel
6221 tex.pdfdecimaldigits
6222 tex.pdfgamma
6223 tex.pdfgentounicode
6224 tex.pdfimageapplygamma
6225 tex.pdfimagegamma
6226 tex.pdfimagehicolor
6227 tex.pdfimageresolution
6228 tex.pdfinclusionerrorlevel
6229 tex.pdfminorversion
6230 tex.pdfobjcompresslevel
6231 tex.pdfoutput
6232 tex.pdfpagebox
6233 tex.pdfpkresolution
6234 tex.pdfprotrudechars
6235 tex.pdftracingfonts
6236 tex.pdfuniqueresname
6237 tex.postdisplaypenalty
6238 tex.predisplaydirection
6239 tex.predisplaypenalty
6240 tex.pretolerance
6241 tex.relpenalty
6242 tex.righthyphenmin
6243 tex.savinghyphcodes
6244 tex.savingvdiscards
6245 tex.showboxbreadth
6246 tex.showboxdepth
6247 tex.time
6248 tex.tolerance
6249 tex.tracingassigns
6250 tex.tracingcommands
6251 tex.tracinggroups
6252 tex.tracingifs
6253 tex.tracinglostchars
6254 tex.tracingmacros
6255 tex.tracingnesting
6256 tex.tracingonline
6257 tex.tracingoutput
6258 tex.tracingpages
6259 tex.tracingparagraphs
6260 tex.tracingrestores
6261 tex.tracingscantokens
6262 tex.tracingstats
6263 tex.uchyph
6264 tex.vbadness
6265 tex.widowpenalty
6266 tex.year
6267 \stoptyping
6268 \stopcolumns
6270 Read|-|only:
6272 \startcolumns[n=3]
6273 \starttyping
6274 tex.deadcycles
6275 tex.insertpenalties
6276 tex.parshape
6277 tex.prevgraf
6278 tex.spacefactor
6279 \stoptyping
6280 \stopcolumns
6282 \subsubsection{Dimension parameters}
6284 The dimension parameters accept \LUA\ numbers (signifying scaled points)
6285 or strings (with included dimension). The result is always a number in
6286 scaled points.
6288 Read|-|write:
6290 \startcolumns[n=3]
6291 \starttyping
6292 tex.boxmaxdepth
6293 tex.delimitershortfall
6294 tex.displayindent
6295 tex.displaywidth
6296 tex.emergencystretch
6297 tex.hangindent
6298 tex.hfuzz
6299 tex.hoffset
6300 tex.hsize
6301 tex.lineskiplimit
6302 tex.mathsurround
6303 tex.maxdepth
6304 tex.nulldelimiterspace
6305 tex.overfullrule
6306 tex.pagebottomoffset
6307 tex.pageheight
6308 tex.pageleftoffset
6309 tex.pagerightoffset
6310 tex.pagetopoffset
6311 tex.pagewidth
6312 tex.parindent
6313 tex.pdfdestmargin
6314 tex.pdfhorigin
6315 tex.pdflinkmargin
6316 tex.pdfpageheight
6317 tex.pdfpagewidth
6318 tex.pdfpxdimen
6319 tex.pdfthreadmargin
6320 tex.pdfvorigin
6321 tex.predisplaysize
6322 tex.scriptspace
6323 tex.splitmaxdepth
6324 tex.vfuzz
6325 tex.voffset
6326 tex.vsize
6327 \stoptyping
6328 \stopcolumns
6330 Read|-|only:
6332 \startcolumns[n=3]
6333 \starttyping
6334 tex.pagedepth
6335 tex.pagefilllstretch
6336 tex.pagefillstretch
6337 tex.pagefilstretch
6338 tex.pagegoal
6339 tex.pageshrink
6340 tex.pagestretch
6341 tex.pagetotal
6342 tex.prevdepth
6343 \stoptyping
6344 \stopcolumns
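
The following sketch shows both input forms for a dimension parameter. It
relies on the fact that one point equals 65536 scaled points; the concrete
values are arbitrary.

\starttyping
tex.parindent = "20pt"     -- string with an included dimension
tex.hsize     = 300 * 65536 -- number in scaled points (here 300pt)

-- results are always numbers in scaled points
print(tex.parindent, tex.parindent / 65536)

-- read-only parameters can be queried in the same way
print(tex.pagetotal)
\stoptyping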
6346 \subsubsection{Direction parameters}
6348 The direction parameters are read|-|only and return a \LUA\ string.
6350 \startcolumns[n=3]
6351 \starttyping
6352 tex.bodydir
6353 tex.mathdir
6354 tex.pagedir
6355 tex.pardir
6356 tex.textdir
6357 \stoptyping
6358 \stopcolumns
6360 \subsubsection{Glue parameters}
6362 The glue parameters accept and return a userdata object that
6363 represents a \type{glue_spec} node.
6365 \startcolumns[n=3]
6366 \starttyping
6367 tex.abovedisplayshortskip
6368 tex.abovedisplayskip
6369 tex.baselineskip
6370 tex.belowdisplayshortskip
6371 tex.belowdisplayskip
6372 tex.leftskip
6373 tex.lineskip
6374 tex.parfillskip
6375 tex.parskip
6376 tex.rightskip
6377 tex.spaceskip
6378 tex.splittopskip
6379 tex.tabskip
6380 tex.topskip
6381 tex.xspaceskip
6382 \stoptyping
6383 \stopcolumns
6385 \subsubsection{Muglue parameters}
6387 All muglue parameters are to be used read|-|only and return a \LUA\ string.
6389 \startcolumns[n=3]
6390 \starttyping
6391 tex.medmuskip
6392 tex.thickmuskip
6393 tex.thinmuskip
6394 \stoptyping
6395 \stopcolumns
6397 \subsubsection{Tokenlist parameters}
6399 The tokenlist parameters accept and return \LUA\ strings. \LUA\ strings are
6400 converted to and from token lists using \tex{the}\tex{toks} style
6401 expansion: all category codes are either space (10) or other (12).
6402 It follows that assigning to some of these, like \quote{tex.output},
6403 is actually useless, but it feels bad to make exceptions in view
6404 of a coming extension that will accept full-blown token strings.
6406 \startcolumns[n=3]
6407 \starttyping
6408 tex.errhelp
6409 tex.everycr
6410 tex.everydisplay
6411 tex.everyeof
6412 tex.everyhbox
6413 tex.everyjob
6414 tex.everymath
6415 tex.everypar
6416 tex.everyvbox
6417 tex.output
6418 tex.pdfpageattr
6419 tex.pdfpageresources
6420 tex.pdfpagesattr
6421 tex.pdfpkmode
6422 \stoptyping
6423 \stopcolumns
6426 \subsection{Convert commands}
6428 All \quote{convert} commands are read|-|only and return a \LUA\ string.
6429 The supported commands at this moment are:
6431 \startcolumns[n=2]
6432 \starttyping
6433 tex.eTeXVersion
6434 tex.eTeXrevision
6435 tex.formatname
6436 tex.jobname
6437 tex.luatexbanner
6438 tex.luatexrevision
6439 tex.pdfnormaldeviate
6440 tex.fontname(number)
6441 tex.pdffontname(number)
6442 tex.pdffontobjnum(number)
6443 tex.pdffontsize(number)
6444 tex.uniformdeviate(number)
6445 tex.number(number)
6446 tex.romannumeral(number)
6447 tex.pdfpageref(number)
6448 tex.pdfxformname(number)
6449 tex.fontidentifier(number)
6450 \stoptyping
6451 \stopcolumns
If you are wondering why this list looks haphazard: these are all the
6454 cases of the \quote{convert} internal command that do not require an
6455 argument, as well as the ones that require only a simple numeric
6456 value.
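
A short sketch of both kinds of access, assuming that the entries without an
argument are read as fields and the others are called as functions, as the
list above suggests:

\starttyping
-- convert commands without an argument
print(tex.jobname)
print(tex.luatexbanner)

-- convert commands that take a simple numeric value
print(tex.romannumeral(14))
print(tex.number(14))
\stoptyping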
6458 The special (lua-only) case of \type{tex.fontidentifier} returns the
6459 \type{csname} string that matches a font id number (if there is one).
6461 if these are really needed in a macro package.
6463 \subsection{Last item commands}
6465 All \quote{last item} commands are read|-|only and return a number.
6467 The supported commands at this moment are:
6469 \startcolumns[n=3]
6470 \starttyping
6471 tex.lastpenalty
6472 tex.lastkern
6473 tex.lastskip
6474 tex.lastnodetype
6475 tex.inputlineno
6476 tex.pdflastobj
6477 tex.pdflastxform
6478 tex.pdflastximage
6479 tex.pdflastximagepages
6480 tex.pdflastannot
6481 tex.pdflastxpos
6482 tex.pdflastypos
6483 tex.pdfrandomseed
6484 tex.pdflastlink
6485 tex.luatexversion
6486 tex.eTeXminorversion
6487 tex.eTeXversion
6488 tex.currentgrouplevel
6489 tex.currentgrouptype
6490 tex.currentiflevel
6491 tex.currentiftype
6492 tex.currentifbranch
6493 tex.pdflastximagecolordepth
6494 \stoptyping
6495 \stopcolumns
6497 \subsection{Attribute, count, dimension, skip and token registers}
6499 \TEX's attributes (\tex{attribute}), counters (\tex{count}),
6500 dimensions (\tex{dimen}), skips (\tex{skip}) and token (\tex{toks})
registers can be accessed and written to using five virtual
sub|-|tables of the \luatex{tex} table:
6504 \startcolumns[n=3]
6505 \starttyping
6506 tex.attribute
6507 tex.count
6508 tex.dimen
6509 tex.skip
6510 tex.toks
6511 \stoptyping
6512 \stopcolumns
6514 It is possible to use the names of relevant \tex{attributedef}, \tex{countdef},
6515 \tex{dimendef}, \tex{skipdef}, or \tex{toksdef} control sequences as indices
6516 to these tables:
6518 \starttyping
6519 tex.count.scratchcounter = 0
6520 enormous = tex.dimen['maxdimen']
6521 \stoptyping
6523 In this case, \LUATEX\ looks up the value for you on the fly. You have
6524 to use a valid \tex{countdef} (or \tex{attributedef}, or
6525 \tex{dimendef}, or \tex{skipdef}, or \tex{toksdef}), anything else
6526 will generate an error (the intent is to eventually also allow
6527 \type{<chardef tokens>} and even macros that expand into a number).
6529 The attribute and count registers accept and return \LUA\ numbers.
6531 The dimension registers accept \LUA\ numbers (in scaled points) or
6532 strings (with an included absolute dimension; \type {em} and \type {ex} and \type {px}
6533 are forbidden). The result is always a number in scaled points.
6535 The token registers accept and return \LUA\ strings. \LUA\ strings are
6536 converted to and from token lists using \tex{the}\tex{toks} style
6537 expansion: all category codes are either space (10) or other (12).
6539 The skip registers accept and return \type{glue_spec} userdata node
6540 objects (see the description of the node interface elsewhere in this
6541 manual).
6543 As an alternative to array addressing, there are also accessor
6544 functions defined for all cases, for example, here is the set
6545 of possibilities for \type{\skip} registers:
6547 \startfunctioncall
6548 tex.setskip (<number> n, <node> s)
6549 tex.setskip (<string> s, <node> s)
6550 tex.setskip ('global',<number> n, <node> s)
6551 tex.setskip ('global',<string> s, <node> s)
6552 <node> s = tex.getskip (<number> n)
6553 <node> s = tex.getskip (<string> s)
6554 \stopfunctioncall
6556 In the function-based interface, it is possible to define values
6557 globally by using the string \type{'global'} as the first function argument.
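
By analogy with the \type{\skip} functions above, a sketch for count,
dimension and token registers could look like this. The register numbers are
arbitrary scratch choices and may clash with what a macro package uses.

\starttyping
tex.setcount(10, 42)           -- local assignment to \count10
tex.setcount("global", 10, 42) -- global assignment
local c = tex.getcount(10)

tex.setdimen(0, 10 * 65536)    -- 10pt, given in scaled points
local d = tex.getdimen(0)

tex.settoks(0, "some text")    -- converted as described above
local t = tex.gettoks(0)

print(c, d, t)
\stoptyping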
6559 \subsection{Character code registers (0.63)}
6561 \TEX's character code tables (\tex{lccode}, \tex{uccode},
6562 \tex{sfcode}, \tex{catcode}, \tex{mathcode}, \tex{delcode}) can be
6563 accessed and written to using six virtual subtables of the \type{tex}
6564 table
6566 \startcolumns[n=3]
6567 \starttyping
6568 tex.lccode
6569 tex.uccode
6570 tex.sfcode
6571 tex.catcode
6572 tex.mathcode
6573 tex.delcode
6574 \stoptyping
6575 \stopcolumns
6577 The function call interfaces are roughly as above, but there are a few twists.
6578 \type{sfcode}s are the simple ones:
6580 \startfunctioncall
6581 tex.setsfcode (<number> n, <number> s)
6582 tex.setsfcode ('global', <number> n, <number> s)
6583 <number> s = tex.getsfcode (<number> n)
6584 \stopfunctioncall
6586 The function call interface for \type{lccode} and \type{uccode} additionally allows you to set the associated sibling at the same time:
6588 \startfunctioncall
6589 tex.setlccode (['global'], <number> n, <number> lc)
6590 tex.setlccode (['global'], <number> n, <number> lc, <number> uc)
6591 <number> lc = tex.getlccode (<number> n)
6592 tex.setuccode (['global'], <number> n, <number> uc)
6593 tex.setuccode (['global'], <number> n, <number> uc, <number> lc)
6594 <number> uc = tex.getuccode (<number> n)
6595 \stopfunctioncall
6597 The function call interface for \type{catcode} also allows you to
6598 specify a category table to use on assignment or on query (default in
6599 both cases is the current one):
6601 \startfunctioncall
6602 tex.setcatcode (['global'], <number> n, <number> c)
6603 tex.setcatcode (['global'], <number> cattable, <number> n, <number> c)
<number> c = tex.getcatcode (<number> n)
<number> c = tex.getcatcode (<number> cattable, <number> n)
6606 \stopfunctioncall
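
The calls above can be combined as in the following sketch. The characters
and code values are examples only.

\starttyping
local A, a = string.byte("A"), string.byte("a")

tex.setsfcode(A, 999)
tex.setlccode(A, a, A)           -- lccode of A is a, uccode of A is A
tex.setuccode("global", a, A, a) -- uccode of a is A, lccode of a is a

local at = string.byte("@")
tex.setcatcode(at, 11)           -- letter, in the current catcode table

print(tex.getsfcode(A), tex.getlccode(A), tex.getuccode(a))
print(tex.getcatcode(at))
\stoptyping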
6609 The interfaces for \type{delcode} and \type{mathcode} use small array tables to
6610 set and retrieve values:
6612 \startfunctioncall
6613 tex.setmathcode (['global'], <number> n, <table> mval )
6614 <table> mval = tex.getmathcode (<number> n)
6615 tex.setdelcode (['global'], <number> n, <table> dval )
6616 <table> dval = tex.getdelcode (<number> n)
6617 \stopfunctioncall
6619 Where the table for \type{mathcode} is an array of 3 numbers, like this:
6621 \starttyping
6622 {<number> mathclass, <number> family, <number> character}
6623 \stoptyping
6625 And the table for \type{delcode} is an array with 4 numbers, like this:
6627 \starttyping
6628 {<number> small_fam, <number> small_char, <number> large_fam, <number> large_char}
6629 \stoptyping
6631 Normally, the third and fourth values in a delimiter code assignment
6632 will be zero according to \tex{Udelcode} usage, but the returned table can have
6633 values there (if the delimiter code was set using \type{\delcode}, for
example). Unset \type{delcode}s can be recognized because
6635 \type{dval[1]} is $-1$.
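
A sketch of how these small array tables might be used; the choice of
characters and the math class value are illustrative.

\starttyping
local plus = string.byte("+")

-- { mathclass, family, character }
tex.setmathcode(plus, { 2, 0, plus })
local mval = tex.getmathcode(plus)
print(mval[1], mval[2], mval[3])

-- { small_fam, small_char, large_fam, large_char }
local dval = tex.getdelcode(string.byte("("))
if dval[1] == -1 then
  print("no delimiter code set")
else
  print(dval[1], dval[2], dval[3], dval[4])
end
\stoptyping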
6637 \subsection{Box registers}
6639 It is possible to set and query actual boxes, using the node
6640 interface as defined in the \luatex{node} library:
6642 \starttyping
6643 tex.box
6644 \stoptyping
6646 for array access, or
6648 \starttyping
6649 tex.setbox(<number> n, <node> s)
6650 tex.setbox(<string> cs, <node> s)
6651 tex.setbox('global', <number> n, <node> s)
6652 tex.setbox('global', <string> cs, <node> s)
6653 <node> n = tex.getbox(<number> n)
6654 <node> n = tex.getbox(<string> cs)
6655 \stoptyping
6657 for function|-|based access.
6658 In the function-based interface, it is possible to define values
6659 globally by using the string \type{'global'} as the first function argument.
6661 Be warned that an assignment like
6663 \starttyping
6664 tex.box[0] = tex.box[2]
6665 \stoptyping
does not copy the node list, it just duplicates a node pointer. If
\tex{box2} is cleared by \TEX\ commands later on, the contents
of \tex{box0} become invalid as well. To prevent this from
happening, always use \luatex{node.copy_list()} unless you are
assigning to a temporary variable:
6673 \starttyping
6674 tex.box[0] = node.copy_list(tex.box[2])
6675 \stoptyping
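
The function|-|based interface can be used in the same way. The sketch below
assumes that a void box register yields \type{nil}, and reads the
\type{width}, \type{height} and \type{depth} fields of the box node.

\starttyping
local b = tex.getbox(0)   -- temporary variable, so no copy is needed
if b then
  print(b.width, b.height, b.depth) -- in scaled points
  -- keep an independent copy in box 2, assigned globally
  tex.setbox("global", 2, node.copy_list(b))
end
\stoptyping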
6677 %{\bf note: In previous versions of \LUATEX\ there were also three
6678 %virtual tables called \type{tex.wd}, \type{tex.ht}, and \type{tex.dp}
6679 %along with an associated function call interface. These were
6680 %removed in version 0.63. You should switch to using \type{tex.box[].width}
6681 %etc. instead.}
6683 %If for some reason you want the functionality of these tables back,
6684 %you can add \LUA\ code to do that for you, like this:
6686 %\starttyping
6687 %local box = tex.box
6689 %local wd = {
6690 % __index = function(t,k) local bk = box[k] return bk and bk.width or 0 end,
6691 % __newindex = function(t,k,v) local bk = box[k] if bk then bk.width = v end end,
6693 %local ht = {
6694 % __index = function(t,k) local bk = box[k] return bk and bk.height or 0 end,
6695 % __newindex = function(t,k,v) local bk = box[k] if bk then bk.height = v end end,
6697 %local dp = {
6698 % __index = function(t,k) local bk = box[k] return bk and bk.depth or 0 end,
6699 % __newindex = function(t,k,v) local bk = box[k] if bk then bk.depth = v end end,
6702 %tex.wd = { } setmetatable(tex.wd,wd)
6703 %tex.ht = { } setmetatable(tex.ht,ht)
6704 %tex.dp = { } setmetatable(tex.dp,dp)
6705 %\stoptyping
6708 \subsection{Math parameters}
6710 It is possible to set and query the internal math parameters
6711 using:
6713 \startfunctioncall
6714 tex.setmath(<string> n, <string> t, <number> n)
6715 tex.setmath('global', <string> n, <string> t, <number> n)
6716 <number> n = tex.getmath(<string> n, <string> t)
6717 \stopfunctioncall
6719 As before an optional first parameter \type{'global'} indicates a
6720 global assignment.
6722 The first string is the parameter name minus the leading \quote{Umath},
6723 and the second string is the style name minus the trailing \quote{style}.
6725 Just to be complete, the values for the math parameter name are:
6727 \starttyping
6728 quad axis operatorsize
6729 overbarkern overbarrule overbarvgap
6730 underbarkern underbarrule underbarvgap
6731 radicalkern radicalrule radicalvgap
6732 radicaldegreebefore radicaldegreeafter radicaldegreeraise
6733 stackvgap stacknumup stackdenomdown
6734 fractionrule fractionnumvgap fractionnumup
6735 fractiondenomvgap fractiondenomdown fractiondelsize
6736 limitabovevgap limitabovebgap limitabovekern
6737 limitbelowvgap limitbelowbgap limitbelowkern
6738 underdelimitervgap underdelimiterbgap
6739 overdelimitervgap overdelimiterbgap
6740 subshiftdrop supshiftdrop subshiftdown
6741 subsupshiftdown subtopmax supshiftup
6742 supbottommin supsubbottommax subsupvgap
6743 spaceafterscript connectoroverlapmin
6744 ordordspacing ordopspacing ordbinspacing ordrelspacing
6745 ordopenspacing ordclosespacing ordpunctspacing ordinnerspacing
6746 opordspacing opopspacing opbinspacing oprelspacing
6747 opopenspacing opclosespacing oppunctspacing opinnerspacing
6748 binordspacing binopspacing binbinspacing binrelspacing
6749 binopenspacing binclosespacing binpunctspacing bininnerspacing
6750 relordspacing relopspacing relbinspacing relrelspacing
6751 relopenspacing relclosespacing relpunctspacing relinnerspacing
6752 openordspacing openopspacing openbinspacing openrelspacing
6753 openopenspacing openclosespacing openpunctspacing openinnerspacing
6754 closeordspacing closeopspacing closebinspacing closerelspacing
6755 closeopenspacing closeclosespacing closepunctspacing closeinnerspacing
6756 punctordspacing punctopspacing punctbinspacing punctrelspacing
6757 punctopenspacing punctclosespacing punctpunctspacing punctinnerspacing
6758 innerordspacing inneropspacing innerbinspacing innerrelspacing
6759 inneropenspacing innerclosespacing innerpunctspacing innerinnerspacing
6760 \stoptyping
6762 The values for the style parameter name are:
6764 \starttyping
6765 display crampeddisplay
6766 text crampedtext
6767 script crampedscript
6768 scriptscript crampedscriptscript
6769 \stoptyping
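
As a minimal sketch of how the two name lists combine, the following \LUA\
fragment reads \type{\Umathquad} for \type{\textstyle} and then globally sets
\type{\Umathaxis} for \type{\displaystyle}. The value \type{2.5pt} is an
arbitrary assumption chosen for illustration only.

\starttyping
-- parameter name without the leading "Umath", style name without the trailing "style"
local quad = tex.getmath("quad", "text")
texio.write_nl("log", "math quad in text style: " .. quad .. "sp")

-- global assignment; 2.5pt is an arbitrary example value
tex.setmath("global", "axis", "display", tex.sp("2.5pt"))
\stoptyping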
6772 \subsection{Special list heads}
6774 The virtual table \luatex{tex.lists} contains the set of internal
6775 registers that keep track of building page lists.
6778 \starttabulate[|lT|p|]
6779 \NC \bf field \NC \bf description \NC \NR
6780 \NC page_ins_head \NC circular list of pending insertions \NC \NR
6781 \NC contrib_head \NC the recent contributions \NC \NR
6782 \NC page_head \NC the current page content\NC \NR
6783 %\NC temp_head \NC \NC \NR
6784 \NC hold_head \NC used for held-over items for next page\NC \NR
6785 \NC adjust_head \NC head of the current \tex{vadjust} list \NC \NR
6786 \NC pre_adjust_head \NC head of the current \tex{vadjust pre} list\NC \NR
6787 % \NC align_head \NC \NC \NR
6788 \stoptabulate
6790 \subsection{Semantic nest levels (0.51)}
6792 The virtual table \luatex{tex.nest} contains the currently active
6793 semantic nesting state. It has two main parts: a zero-based array of
6794 userdata for the semantic nest itself, and the numerical value
6795 \type{tex.nest.ptr}, which gives the highest available index. Neither
6796 the array items in \type{tex.nest[]} nor \type{tex.nest.ptr} can be
6797 assigned to (as this would confuse the typesetting engine beyond
6798 repair), but you can assign to the individual values inside the array
6799 items, e.g. \type{tex.nest[tex.nest.ptr].prevdepth}.
6801 \type{tex.nest[tex.nest.ptr]} is the current nest state, \type{tex.nest[0]}
6802 the outermost (main vertical list) level.
6804 The known fields are:
6806 \starttabulate[|lT|l|l|p|]
6807 \NC \ssbf key \NC \bf type \NC \bf modes \NC \bf explanation \NC\NR
6808 \NC mode \NC number \NC all \NC The current mode. This is a number representing the
6809 main mode at this level:\crlf
6810 0 = no mode (this happens during \type{\write}),\crlf
6811 1 = vertical,\crlf
6812 127 = horizontal,\crlf
6813 253 = display math,\crlf
6814 $-1$ = internal vertical,\crlf
6815 $-127$ = restricted horizontal,\crlf
6816 $-253$ = inline math.\NC\NR
6817 \NC modeline \NC number \NC all \NC source input line where this mode was entered,
6818 negative inside the output routine.\NC\NR
6819 \NC head \NC node \NC all \NC the head of the current list\NC\NR
6820 \NC tail \NC node \NC all \NC the tail of the current list\NC\NR
6821 \NC prevgraf \NC number \NC vmode \NC number of lines in the previous paragraph\NC\NR
6822 \NC prevdepth \NC number \NC vmode \NC depth of the previous paragraph (equal to \type{\pdfignoreddimen}
6823 when it is to be ignored)\NC\NR
6824 \NC spacefactor \NC number \NC hmode \NC the current space factor\NC\NR
6825 \NC dirs \NC node \NC hmode \NC used for temporary storage by the line break algorithm\NC\NR
6826 \NC noad \NC node \NC mmode \NC used for temporary storage of a pending fraction numerator,
6827 for \type{\over} etc.\NC\NR
6828 \NC delimptr \NC node \NC mmode \NC used for temporary storage of the previous math delimiter,
6829 for \type{\middle}.\NC\NR
6830 \NC mathdir \NC boolean \NC mmode \NC true when during math processing the \type{\mathdir} is not
6831 the same as the surrounding \type{\textdir}\NC\NR
6832 \NC mathstyle \NC number \NC mmode \NC the current \type{\mathstyle} \NC\NR
6833 \stoptabulate
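
The following sketch inspects the current nest level and uses the mode numbers
from the table above to decide which fields are meaningful. Resetting
\type{prevdepth} to zero is only an illustration of an assignment to a field
inside an array item; whether that is desirable depends on your macro package.

\starttyping
local top = tex.nest[tex.nest.ptr]   -- the current nest state

if top.mode == 1 or top.mode == -1 then
    -- vertical or internal vertical mode: prevdepth and prevgraf apply
    texio.write_nl("log", "prevdepth = " .. top.prevdepth .. "sp")
    top.prevdepth = 0                -- illustrative assignment only
elseif top.mode == 127 or top.mode == -127 then
    -- horizontal or restricted horizontal mode: spacefactor applies
    texio.write_nl("log", "spacefactor = " .. top.spacefactor)
end
\stoptyping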
6836 \subsection[sec:luaprint]{Print functions}
6838 The \luatex{tex} table also contains the three print functions that
6839 are the major interface from \LUA\ scripting to \TEX.
6841 The arguments to these three functions are all stored in an in|-|memory
6842 virtual file that is fed to the \TEX\ scanner as the result of the
6843 expansion of \tex{directlua}.
6845 The total amount of returnable text from a \tex{directlua} command
6846 is only limited by available system \RAM. However, each separate
6847 printed string has to fit completely in \TEX's input buffer.
6849 The result of using these functions from inside callbacks is undefined
6850 at the moment.
6852 \subsubsection{\luatex{tex.print}}
6854 \startfunctioncall
6855 tex.print(<string> s, ...)
6856 tex.print(<number> n, <string> s, ...)
6857 tex.print(<table> t)
6858 tex.print(<number> n, <table> t)
6859 \stopfunctioncall
6861 Each string argument is treated by \TEX\ as a separate input line.
6862 If there is a table argument instead of a list of strings, this has to
6863 be a consecutive array of strings to print (the first non-string value
6864 will stop the printing process). This syntax was added in 0.36.
6866 The optional parameter can be used to print the strings using the
6867 catcode regime defined by \tex{catcodetable}~\type{n}. If \type{n} is
6868 $-1$, the currently active catcode regime is used. If \type{n} is
6869 $-2$, the resulting catcodes are the result of \type{\the\toks}: all
6870 category codes are 12 (other) except for the space character, which has
6871 category code 10 (space). Otherwise, if \type{n} is not
6872 a valid catcode table, then it is ignored, and the currently
6873 active catcode regime is used instead.
6875 The very last string of the very last \luatex{tex.print()} command in a
6876 \tex{directlua} will not have the \tex{endlinechar} appended, all
6877 others do.
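
A short sketch of both call forms, written as it would appear in a \LUA\ file
(so that backslashes only need \LUA's own escaping). The texts are arbitrary
examples.

\starttyping
-- two strings, so two input lines, read under the current catcode regime
tex.print("First line of text", "\\par Second line of text")

-- catcode regime -2: every character is "other" except the space, so the
-- percent sign and the backslash are ordinary characters rather than syntax
tex.print(-2, "100% literal \\ text")
\stoptyping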
6879 \subsubsection{\luatex{tex.sprint}}
6881 \startfunctioncall
6882 tex.sprint(<string> s, ...)
6883 tex.sprint(<number> n, <string> s, ...)
6884 tex.sprint(<table> t)
6885 tex.sprint(<number> n, <table> t)
6886 \stopfunctioncall
6888 Each string argument is treated by \TEX\ as a special kind of input line
6889 that makes it suitable for use as a partial line input mechanism:
6891 \startitemize[packed]
6892 \item \TEX\ does not switch to the \quote{new line} state, so
6893 that leading spaces are not ignored.
6894 \item No \tex{endlinechar} is inserted.
6895 \item Trailing spaces are not removed.
6897 Note that this does not prevent \TEX\ itself from eating spaces as
6898 a result of interpreting the line. For example, in
6900 \starttyping
6901 before\directlua{tex.sprint("\\relax")tex.sprint(" inbetween")}after
6902 \stoptyping
6904 the space before \type{inbetween} will be gobbled as a result of
6905 the \quote{normal} scanning of \tex{relax}.
6906 \stopitemize
6908 If there is a table argument instead of a list of strings, this has to
6909 be a consecutive array of strings to print (the first non-string value
6910 will stop the printing process). This syntax was added in 0.36.
6912 The optional argument sets the catcode regime, as with \type{tex.print()}.
6914 \subsubsection{\luatex{tex.tprint}}
6916 \startfunctioncall
6917 tex.tprint({<number> n, <string> s, ...}, {...})
6918 \stopfunctioncall
6920 This function is basically a shortcut for repeated calls to
6921 \luatex{tex.sprint(<number> n, <string> s, ...)}, once for each of
6922 the supplied argument tables.
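
For instance, the call below should behave like two successive
\type{tex.sprint} calls, each with its own catcode regime. The strings are
arbitrary, and \type{\bf} is assumed to be defined by the macro package.

\starttyping
-- same effect as:
--   tex.sprint(-2, "50%")
--   tex.sprint(-1, " of the {\\bf total}")
tex.tprint({-2, "50%"}, {-1, " of the {\\bf total}"})
\stoptyping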
6924 \subsubsection{\luatex{tex.write}}
6926 \startfunctioncall
6927 tex.write(<string> s, ...)
6928 tex.write(<table> t)
6929 \stopfunctioncall
6931 Each string argument is treated by \TEX\ as a special kind of input
6932 line that makes it suitable for use as a quick way to dump
6933 information:
6935 \startitemize
6936 \item All catcodes on that line are either \quote{space} (for '~') or
6937 \quote{character} (for all others).
6938 \item There is no \tex{endlinechar} appended.
6939 \stopitemize
6941 If there is a table argument instead of a list of strings, this has to
6942 be a consecutive array of strings to print (the first non-string value
6943 will stop the printing process). This syntax was added in 0.36.
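
A minimal sketch: because of the catcode assignment described above, characters
that normally have a special meaning in \TEX\ are passed on as plain
characters. The string is an arbitrary example.

\starttyping
-- "_", "&" and "#" are not interpreted as TeX syntax here
tex.write("a_b & c#1")
\stoptyping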
6946 \subsection{Helper functions}
6948 \subsubsection{\luatex{tex.round}}
6950 \startfunctioncall
6951 <number> n = tex.round(<number> o)
6952 \stopfunctioncall
6954 Rounds \LUA\ number \type{o}, and returns a number that is in the range
6955 of a valid \TEX\ register value. If the number starts out of range, it
6956 generates a \quote{number too big} error as well.
6958 \subsubsection{\luatex{tex.scale}}
6960 \startfunctioncall
6961 <number> n = tex.scale(<number> o, <number> delta)
6962 <table> n = tex.scale(<table> o, <number> delta)
6963 \stopfunctioncall
6965 Multiplies the \LUA\ numbers \type{o} and \type{delta}, and returns a
6966 rounded number that is in the range of a valid \TEX\ register value.
6967 In the table version, it creates a copy of the table with all numeric
6968 top||level values scaled in that manner. If the multiplied number(s) are
6969 out of range, it generates \quote{number too big} error(s) as well.
6971 Note: the precision of the output of this function will depend on your
6972 computer's architecture and operating system, so use with care! An
6973 interface to \LUATEX's internal, 100\% portable scale function will be
6974 added at a later date.
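
The fragment below sketches the three call forms of \type{tex.round} and
\type{tex.scale}. All input values are arbitrary; no results are shown because,
as noted above, the precision of \type{tex.scale} may vary between platforms.

\starttyping
local n = tex.round(12345.6)              -- an integer in TeX's register range

local w = tex.scale(tex.sp("10pt"), 1.2)  -- 10pt times 1.2, in scaled points

-- table form: returns a copy with the numeric top-level values scaled
local box = tex.scale({ width = 65536, height = 32768 }, 2)
\stoptyping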
6976 \subsubsection{\luatex{tex.sp} (0.51)}
6978 \startfunctioncall
6979 <number> n = tex.sp(<number> o)
6980 <number> n = tex.sp(<string> s)
6981 \stopfunctioncall
6983 Converts the number \type{o} or a string \type{s} that represents
6984 an explicit dimension into an integer number of scaled points.
6986 For parsing the string, the same scanning and conversion rules are used
6987 that \LUATEX\ would use if it were scanning a dimension specifier in
6988 its \TEX-like input language (this includes generating errors for bad
6989 values), except for the following:
6991 \startitemize[n]
6992 \item only explicit values are allowed, control sequences are not handled
6993 \item infinite dimension units (\type{fil...}) are forbidden
6994 \item \type{mu} units do not generate an error (but may not be useful either)
6995 \stopitemize
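
A few illustrative calls (the dimensions are arbitrary examples):

\starttyping
local one_pt = tex.sp("1pt")      -- 65536, because 1pt equals 65536sp
local width  = tex.sp("12.5cm")   -- an explicit dimension given as a string
local count  = tex.sp(1234.5)     -- a number is converted to an integer number of sp
\stoptyping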
6997 \subsubsection{\luatex{tex.definefont}}
6999 \startfunctioncall
7000 tex.definefont(<string> csname, <number> fontid)
7001 tex.definefont(<boolean> global, <string> csname, <number> fontid)
7002 \stopfunctioncall
7004 Associates \type{csname} with the internal font number \type{fontid}.
7005 The definition is global if (and only if) \type{global} is specified
7006 and true (the setting of \type{globaldefs} is not taken into account).
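
As a sketch, the call below globally binds a control sequence to the font that
is active at that moment. The name \type{mainfont} is hypothetical, and
\type{font.current()} is assumed to be available from the \type{font} library,
which is described outside this section.

\starttyping
-- after this call, \mainfont selects the font that was current at this point
tex.definefont(true, "mainfont", font.current())
\stoptyping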
7009 \subsubsection{\luatex{tex.error} (0.61)}
7011 \startfunctioncall
7012 tex.error(<string> s)
7013 tex.error(<string> s, <table> help)
7014 \stopfunctioncall
7016 This creates an error somewhat like the combination of \tex{errhelp}
7017 and \tex{errmessage} would. During this error, deletions are disabled.
7019 The array part of the \type{help} table has to contain strings,
7020 one for each line of error help.
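
For example (the message and help texts are invented for illustration):

\starttyping
tex.error("Unknown color name", {
    "The color you asked for has not been defined.",   -- first help line
    "Type <return> to continue with the default color." -- second help line
})
\stoptyping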
7023 \subsubsection{\luatex{tex.hashtokens} (0.25)}
7025 \startfunctioncall
7026 for i,v in pairs (tex.hashtokens()) do ... end
7027 \stopfunctioncall
7029 Returns a name and token table pair (see~\in{section}[luatokens] about
7030 token tables) iterator for every non-zero entry in the hash table.
7031 This can be useful for debugging, but note that this also reports
7032 control sequences that may be unreachable at this moment due to local
7033 redefinitions: it is strictly a dump of the hash table.
7035 \subsection[luaprimitives]{Functions for dealing with primitives}
7037 \subsubsection{\luatex{tex.enableprimitives}}
7039 \startfunctioncall
7040 tex.enableprimitives(<string> prefix, <table> primitive names)
7041 \stopfunctioncall
7043 This function accepts a prefix string and an array of primitive names.
7045 For each combination of \quote{prefix} and \quote{name},
7046 \type{tex.enableprimitives} first verifies that \quote{name} is
7047 an actual primitive (it must be returned by one of the
7048 \type{tex.extraprimitives()} calls explained below, or part of
7049 \TEX82, or \type{\directlua}). If it is not,
7050 \type{tex.enableprimitives} does nothing and skips to the next pair.
7052 But if it is, then it will construct a csname variable by concatenating the
7053 \quote{prefix} and \quote{name}, unless the \quote{prefix} is already the actual
7054 prefix of \quote{name}. In the latter case, it will discard the \quote{prefix},
7055 and just use \quote{name}.
7057 Then it will check for the existence of the constructed csname.
7058 If the csname is currently undefined (note: that is not the same as
7059 \type{\relax}), it will globally define the csname to have the
7060 meaning: run code belonging to the primitive \quote{name}. If for some
7061 reason the csname is already defined, it does nothing and tries the
7062 next pair.
7064 An example:
7066 \starttyping
7067 tex.enableprimitives('LuaTeX', {'formatname'})
7068 \stoptyping
7070 will define \type{\LuaTeXformatname} with the same intrinsic meaning
7071 as the documented primitive \type{\formatname}, provided that the
7072 control sequence \type{\LuaTeXformatname} is currently undefined.
7074 Second example:
7076 \starttyping
7077 tex.enableprimitives('Omega',tex.extraprimitives ('omega'))
7078 \stoptyping
7080 will define a whole series of csnames like \type{\Omegatextdir},
7081 \type{\Omegapardir}, etc., but it will stick with \type{\OmegaVersion}
7082 instead of creating the doubly-prefixed \type{\OmegaOmegaVersion}.
7084 Starting with version 0.39.0 (and this is why the above two functions
7085 are needed), \LUATEX\ in \type{--ini} mode contains only the \TEX82
7086 primitives and \type{\directlua}, no extra primitives {\bf at all}.
7088 So, if you want to have all the new functionality available using
7089 their default names, as it is now, you will have to add
7091 \starttyping
7092 \ifx\directlua\undefined \else
7093 \directlua {tex.enableprimitives('',tex.extraprimitives ())}
7094 \fi
7095 \stoptyping
7097 near the beginning of your format generation file. Or you can choose
7098 different prefixes for different subsets, as you see fit.
7100 Calling some form of \type{tex.enableprimitives()} is highly important
7101 though, because if you do not, you will end up with a \TEX82-lookalike
7102 that can run lua code but not do much else. The defined csnames are
7103 (of course) saved in the format and will be available at runtime.
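
A sketch of the \quote{different prefixes for different subsets} approach
could look as follows. The choice of subsets and of the \type{pdf} prefix is
only an assumption for illustration; primitives whose names already start with
\type{pdf} keep their names, as explained above.

\starttyping
\ifx\directlua\undefined \else
    \directlua {
        tex.enableprimitives('',    tex.extraprimitives('etex', 'luatex'))
        tex.enableprimitives('pdf', tex.extraprimitives('pdftex'))
    }
\fi
\stoptyping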
7106 \subsubsection{\luatex{tex.extraprimitives}}
7108 \startfunctioncall
7109 <table> t = tex.extraprimitives(<string> s, ...)
7110 \stopfunctioncall
7112 This function returns a list of the primitives that originate
7113 from the engine(s) given by the requested string value(s). The
7114 possible values and their (current) return values are:
7116 \startluacode
7117 function out_prim (a)
7118   local v = tex.extraprimitives(a)
7119   table.sort(v)
7120   for _,n in pairs(v) do
7121     if n == ' ' then
7122       n = '\\normalcontrolspace'
7123     end
7124     tex.print(n .. '\\hskip 4pt plus 5em')
7125   end
7126 end
7127 \stopluacode
7129 \starttabulate[|l|p|]
7130 \NC \bf name\NC \bf values \NC \NR
7131 \NC tex \NC \ctxlua{out_prim('tex') } \NC \NR
7132 \NC core \NC \ctxlua{out_prim('core') } \NC \NR
7133 \NC etex \NC \ctxlua{out_prim('etex') } \NC \NR
7134 \NC pdftex \NC \ctxlua{out_prim('pdftex') } \NC \NR
7135 \NC omega \NC \ctxlua{out_prim('omega') } \NC \NR
7136 \NC aleph \NC \ctxlua{out_prim('aleph') } \NC \NR
7137 \NC luatex \NC \ctxlua{out_prim('luatex') } \NC \NR
7138 \NC umath \NC \ctxlua{out_prim('umath') } \NC \NR
7139 \stoptabulate
7141 Note that \type{'luatex'} does not contain \type{directlua}, as that is
7142 considered to be a core primitive, along with all the \TEX82
7143 primitives, so it is part of the list that is returned from \type{'core'}.
7145 \type{'umath'} is a subset of \type{'luatex'} that covers the Unicode math
7146 primitives. It was added in \LUATEX\ 0.75.0 because it might be desirable to
7147 handle the prefixing of that subset differently.
7149 Running \type{tex.extraprimitives()} will give you the complete list
7150 of primitives that are not defined at \LUATEX\ 0.39.0 \type{-ini}
7151 startup. It is exactly equivalent to \type{tex.extraprimitives('etex',
7152 'pdftex', 'omega', 'aleph', 'luatex')}.
7154 \subsubsection{\luatex{tex.primitives}}
7156 \startfunctioncall
7157 <table> t = tex.primitives()
7158 \stopfunctioncall
7160 This function returns a hash table listing all primitives that \LUATEX\
7161 knows about. The keys in the hash are primitive names, the values are
7162 tables representing tokens (see~\in{section}[luatokens]). The third value
7163 of each token table is always zero.
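
A minimal sketch that looks up one primitive and writes the first two entries
of its token table to the log. The primitive \type{relax} is used only as an
example key.

\starttyping
local prims = tex.primitives()
local tok = prims["relax"]     -- token table: { command code, modifier, 0 }
if tok then
    texio.write_nl("log", "relax: command " .. tok[1] .. ", modifier " .. tok[2])
end
\stoptyping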
7165 \subsection{Core functionality interfaces}
7167 \subsubsection{\luatex{tex.badness} (0.53)}
7169 \startfunctioncall
7170 <number> b = tex.badness(<number> t, <number> s)
7171 \stopfunctioncall
7173 This helper function is useful during linebreak calculations. \type{t} and
7174 \type{s} are scaled values; the function returns the badness for when total
7175 \type{t} is supposed to be made from amounts that sum to \type{s}. The returned
7176 number is a reasonable approximation of $100(t/s)^3$.
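
As a worked illustration with arbitrary values: for a total of 1pt to be made
from amounts that sum to 2pt, the formula gives $100(1/2)^3 = 12.5$, so the
returned integer can be expected to lie close to that value.

\starttyping
local t = tex.sp("1pt")
local s = tex.sp("2pt")
local b = tex.badness(t, s)   -- approximately 100 * (t/s)^3, i.e. close to 12.5
texio.write_nl("log", "badness: " .. b)
\stoptyping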
7178 \subsubsection{\luatex{tex.linebreak} (0.53)}
7180 \startfunctioncall
7181 local <node> nodelist, <table> info =
7182 tex.linebreak(<node> listhead, <table> parameters)
7183 \stopfunctioncall
7185 The understood parameters are as follows:
7187 \starttabulate[|l|l|p|]
7188 \NC \bf name \NC \bf type \NC \bf description \NC \NR
7189 \NC pardir \NC string \NC \NC \NR
7190 \NC pretolerance \NC number \NC \NC \NR
7191 \NC tracingparagraphs \NC number \NC \NC \NR
7192 \NC tolerance \NC number \NC \NC \NR
7193 \NC looseness \NC number \NC \NC \NR
7194 \NC hyphenpenalty \NC number \NC \NC \NR
7195 \NC exhyphenpenalty \NC number \NC \NC \NR
7196 \NC pdfadjustspacing \NC number \NC \NC \NR
7197 \NC adjdemerits \NC number \NC \NC \NR
7198 \NC pdfprotrudechars \NC number \NC \NC \NR
7199 \NC linepenalty \NC number \NC \NC \NR
7200 \NC lastlinefit \NC number \NC \NC \NR
7201 \NC doublehyphendemerits \NC number \NC \NC \NR
7202 \NC finalhyphendemerits \NC number \NC \NC \NR
7203 \NC hangafter \NC number \NC \NC \NR
7204 \NC interlinepenalty \NC number or table \NC if a table, then it is an array like \type{\interlinepenalties}\NC \NR
7205 \NC clubpenalty \NC number or table \NC if a table, then it is an array like \type{\clubpenalties}\NC \NR
7206 \NC widowpenalty \NC number or table \NC if a table, then it is an array like \type{\widowpenalties}\NC \NR
7207 \NC brokenpenalty \NC number \NC \NC \NR
7208 \NC emergencystretch \NC number \NC in scaled points \NC \NR
7209 \NC hangindent \NC number \NC in scaled points \NC \NR
7210 \NC hsize \NC number \NC in scaled points \NC \NR
7211 \NC leftskip \NC glue_spec node \NC \NC \NR
7212 \NC rightskip \NC glue_spec node \NC \NC \NR
7213 \NC pdfignoreddimen \NC number \NC in scaled points \NC \NR
7214 \NC parshape \NC table \NC \NC \NR
7215 \stoptabulate
7217 Note that there is no interface for \type{\displaywidowpenalties}; you
7218 have to pass the right choice for \type{widowpenalty} yourself.
7220 The meaning of the various keys should be fairly obvious from the
7221 table (the names match the \TEX\ and \PDFTEX\ primitives) except for
7222 the last 5 entries. The four \type{pdf...line...} keys are ignored if
7223 their value equals \type{pdfignoreddimen}.
7225 It is your own job to make sure that \type{listhead} is a proper
7226 paragraph list: this function does not add any nodes to it. To be
7227 exact, if you want to replace the core line breaking, you may have to
7228 do the following (when you are not actually working in the
7229 \type{pre_linebreak_filter} or \type{linebreak_filter} callbacks, or when the
7230 original list starting at listhead was generated in horizontal mode):
7232 \startitemize
7233 \item add an \quote{indent box} and perhaps a \type{local_par} node at
7234 the start (only if you need them)
7235 \item replace any found final glue by an infinite penalty (or add such
7236 a penalty, if the last node is not a glue)
7237 \item add a glue node for the \type{\parfillskip} after that penalty node
7238 \item make sure all the \type{prev} pointers are OK
7239 \stopitemize
7241 The result is a node list; it still needs to be vpacked if you
7242 want to assign it to a \tex{vbox}.
7245 The returned \type{info} table contains four values that are all numbers:
7247 \starttabulate[|l|p|]
7248 \NC prevdepth \NC depth of the last line in the broken paragraph \NC \NR
7249 \NC prevgraf \NC number of lines in the broken paragraph \NC \NR
7250 \NC looseness \NC the actual looseness value in the broken paragraph \NC \NR
7251 \NC demerits \NC the total demerits of the chosen solution \NC \NR
7252 \stoptabulate
7254 Note there are a few things you cannot interface using this function:
7255 You cannot influence font expansion other than via
7256 \type{pdfadjustspacing}, because the settings for that take place
7257 elsewhere. The same is true for hbadness and hfuzz etc. All these are
7258 in the \type{hpack()} routine, and that fetches its own variables via
7259 globals.
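
The following is a sketch only, under the assumption that box register~0 holds
an \type{\hbox} whose list has already been prepared as a proper paragraph list
in the sense described above. The register numbers and parameter values are
arbitrary, and \type{node.copy_list} and \type{node.vpack} are assumed to be
available from the \type{node} library.

\starttyping
-- assumption: box 0 contains a prepared paragraph list
local head = node.copy_list(tex.box[0].list)

local lines, info = tex.linebreak(head, {
    hsize     = tex.sp("10cm"),   -- in scaled points
    tolerance = 500,
})

texio.write_nl("log", "lines: " .. info.prevgraf ..
                      ", demerits: " .. info.demerits)

-- the result is a plain node list, so vpack it before storing it in a box
local vbox = node.vpack(lines)
tex.box[1] = vbox
\stoptyping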
7261 \subsubsection{\luatex{tex.shipout} (0.51)}
7263 \startfunctioncall
7264 tex.shipout(<number> n)
7265 \stopfunctioncall
7267 Ships out box number \type{n} to the output file, and clears the box
7268 register.
7271 \section[texconfig]{The \luatex{texconfig} table}
7273 This is a table that is created empty. A startup \LUA\ script could
7274 fill this table with a number of settings that are read out by
7275 the executable after loading and executing the startup file.
7277 \starttabulate[|lT|l|l|p|]
7278 \NC \ssbf key \NC \bf type \NC \bf default \NC \bf explanation \NC\NR
7279 \NC kpse_init \NC boolean \NC true \NC \type{false} totally disables \KPATHSEA\ initialisation,
7280 and enables interpretation of the following numeric key--value pairs
7281 (only ever unset this if you implement {\it all\/} file
7282 find callbacks!)\NC \NR
7283 \NC shell_escape \NC string\NC \type{'f'}\NC Use \type{'y'} or \type{'t'} or \type{'1'} to enable \type{\write18} unconditionally,
7284 \type{'p'} to enable the commands that are listed in \type{shell_escape_commands} (new in 0.37)\NC\NR
7285 \NC shell_escape_commands \NC string\NC \NC Comma-separated list of command names that may be executed by \type{\write18} even
7286 if \type{shell_escape} is set to \type{'p'}. Do {\it not\/} use spaces around commas,
7287 separate any required command arguments by using a space, and use the ASCII double quote
7288 (\type{"}) for any needed argument or path quoting (new in 0.37)\NC\NR
7289 \NC string_vacancies \NC number\NC 75000\NC cf.\ web2c docs \NC \NR
7290 \NC pool_free \NC number\NC 5000\NC cf.\ web2c docs \NC \NR
7291 \NC max_strings \NC number\NC 15000\NC cf.\ web2c docs \NC \NR
7292 \NC strings_free \NC number\NC 100\NC cf.\ web2c docs \NC \NR
7293 \NC nest_size \NC number\NC 50\NC cf.\ web2c docs \NC \NR
7294 \NC max_in_open \NC number\NC 15\NC cf.\ web2c docs \NC \NR
7295 \NC param_size \NC number\NC 60\NC cf.\ web2c docs \NC \NR
7296 \NC save_size \NC number\NC 4000\NC cf.\ web2c docs \NC \NR
7297 \NC stack_size \NC number\NC 300\NC cf.\ web2c docs \NC \NR
7298 \NC dvi_buf_size \NC number\NC 16384\NC cf.\ web2c docs \NC \NR
7299 \NC error_line \NC number\NC 79\NC cf.\ web2c docs \NC \NR
7300 \NC half_error_line \NC number\NC 50\NC cf.\ web2c docs \NC \NR
7301 \NC max_print_line \NC number\NC 79\NC cf.\ web2c docs \NC \NR
7302 \NC hash_extra \NC number\NC 0\NC cf.\ web2c docs \NC \NR
7303 \NC pk_dpi \NC number\NC 72\NC cf.\ web2c docs \NC \NR
7304 \NC trace_file_names \NC boolean \NC true \NC \type{false} disables \TEX's normal file open|-|close
7305 feedback (the assumption is that callbacks will take care of
7306 that) \NC \NR
7307 \NC file_line_error \NC boolean \NC false \NC do \type{file:line} style error messages\NC \NR
7308 \NC halt_on_error \NC boolean \NC false \NC abort run on the first encountered error\NC \NR
7309 \NC formatname \NC string \NC \NC if no format name was given
7310 on the commandline, this key will be tested first
7311 instead of simply quitting\NC \NR
7312 \NC jobname \NC string \NC \NC if no input file name was given
7313 on the commandline, this key will be tested first
7314 instead of simply giving up\NC \NR
7315 \stoptabulate
7317 {\bf Note:} the numeric values that match web2c parameters are only used if
7318 \type{kpse_init} is explicitly set to \type{false}. In all other cases, the normal values from
7319 \type{texmf.cnf} are used.
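
A startup script might fill the table along these lines. This is a sketch: the
command names in \type{shell_escape_commands} are arbitrary examples, and the
numeric keys are left out because they only take effect when \type{kpse_init}
is \type{false}.

\starttyping
-- contents of a startup Lua script
texconfig.file_line_error = true    -- file:line style error messages
texconfig.halt_on_error   = true    -- stop at the first error

-- allow only the listed commands in \write18 (no spaces around the comma)
texconfig.shell_escape          = 'p'
texconfig.shell_escape_commands = 'kpsewhich,mpost'
\stoptyping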
7321 \section{The \luatex{texio} library}
7323 This library takes care of the low|-|level I/O interface.
7325 \subsection{Printing functions}
7327 \subsubsection{\luatex{texio.write}}
7329 \startfunctioncall
7330 texio.write(<string> target, <string> s, ...)
7331 texio.write(<string> s, ...)
7332 \stopfunctioncall
7334 Without the \type{target} argument, writes all given strings to the same
7335 location(s) \TEX\ writes messages to at this moment. If
7336 \tex{batchmode} is in effect, it writes only to the log,
7337 otherwise it writes to the log and the terminal.
7338 The optional \type{target} can be one of three possibilities:
7339 \type{term}, \type{log} or \type {term and log}.
7341 Note: If several strings are given, and if the first of these strings
7342 is or might be one of the targets above, the \type{target} must be
7343 specified explicitly to prevent \LUA\ from interpreting the first
7344 string as the target.
7346 \subsubsection{\luatex{texio.write_nl}}
7348 \startfunctioncall
7349 texio.write_nl(<string> target, <string> s, ...)
7350 texio.write_nl(<string> s, ...)
7351 \stopfunctioncall
7353 This function behaves like \luatex{texio.write}, but makes sure that the given strings will
7354 appear at the beginning of a new line. You can pass a single empty string
7355 if you only want to move to the next line.
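
A short sketch of both functions, including the case where the first string
could be mistaken for a target. The messages are arbitrary.

\starttyping
-- start a new line in the log only
texio.write_nl("log", "entering stage two")

-- the first text string equals a target name, so the target is given explicitly
texio.write("term and log", "log", " is printed as text here")

-- a single empty string just moves to the next line
texio.write_nl("")
\stoptyping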
7357 %***********************************************************************
7359 \section[luatokens]{The \luatex{token} library}
7361 The \luatex{token} table contains interface functions to \TEX's
7362 handling of tokens. These functions are most useful when combined with
7363 the \luatex{token_filter} callback, but they could be used standalone
7364 as well.
7366 A token is represented in \LUA\ as a small table. For the moment, this
7367 table consists of three numeric entries:
7369 \starttabulate[|l|l|p|]
7370 \NC \bf index\NC \bf meaning \NC \bf description \NC \NR
7371 \NC 1 \NC command code \NC this is a value between~$0$ and~$130$ (approximately)\NC \NR
7372 \NC 2 \NC command modifier \NC this is a value between~$0$ and~$2^{21}$ \NC \NR
7373 \NC 3 \NC control sequence id \NC for commands that are not the result of control
7374 sequences, like letters and characters, it is zero,
7375 otherwise, it is a number pointing into the \quote
7376 {equivalence table} \NC \NR
7377 \stoptabulate
7379 \subsection{\luatex{token.get_next}}
7381 \startfunctioncall
7382 <token> t = token.get_next()
7383 \stopfunctioncall
7385 This fetches the next input token from the current input source,
7386 without expansion.
7388 \subsection{\luatex{token.is_expandable}}
7390 \startfunctioncall
7391 <boolean> b = token.is_expandable(<token> t)
7392 \stopfunctioncall
7394 This tests if the token \type{t} could be expanded.
7396 \subsection{\luatex{token.expand}}
7398 \startfunctioncall
7399 token.expand(<token> t)
7400 \stopfunctioncall
7402 If a token is expandable, this will expand one level of it, so that
7403 the first token of the expansion will now be the next token to be read
7404 by \luatex{token.get_next()}.
7406 \subsection{\luatex{token.is_activechar}}
7408 \startfunctioncall
7409 <boolean> b = token.is_activechar(<token> t)
7410 \stopfunctioncall
7412 This is a special test that is sometimes handy. Discovering whether
7413 some control sequence is the result of an active character turned out
7414 to be very hard otherwise.
7416 \subsection{\luatex{token.create}}
7418 \startfunctioncall
7419 <token> t = token.create(<string> csname)
7420 <token> t = token.create(<number> charcode)
7421 <token> t = token.create(<number> charcode, <number> catcode)
7422 \stopfunctioncall
7424 This is the token factory. If you feed it a string, then it is the
7425 name of a control sequence (without leading backslash), and it will be
7426 looked up in the equivalence table.
7428 If you feed it a number, then this is assumed to be an input character,
7429 and an optional second number gives its category code. This means it
7430 is possible to overrule a character's category code, with a few
7431 exceptions: the category codes~0 (escape), 9~(ignored), 13~(active),
7432 14~(comment), and 15 (invalid) cannot occur inside a token. The values~0, 9, 14
7433 and~15 are therefore illegal as input to \luatex{token.create()}, and
7434 active characters will be resolved immediately.
7436 Note: unknown string sequences and never defined active characters
7437 will result in a token representing an \quote{undefined control sequence}
7438 with a near|-|random name. It is {\em not} possible to define brand
7439 new control sequences using \luatex{token.create}!
7441 \subsection{\luatex{token.command_name}}
7443 \startfunctioncall
7444 <string> commandname = token.command_name(<token> t)
7445 \stopfunctioncall
7447 This returns the name associated with the \quote{command} value of the token
7448 in \LUATEX. There is not always a direct connection between these names and
7449 primitives. For instance, all \tex{ifxxx} tests are grouped under
7450 \type {if_test}, and the \quote{command modifier} defines which test is to be run.
7452 \subsection{\luatex{token.command_id}}
7454 \startfunctioncall
7455 <number> i = token.command_id(<string> commandname)
7456 \stopfunctioncall
7458 This returns a number that is the inverse operation of the previous
7459 command, to be used as the first item in a token table.
7461 \subsection{\luatex{token.csname_name}}
7463 \startfunctioncall
7464 <string> csname = token.csname_name(<token> t)
7465 \stopfunctioncall
7467 This returns the name associated with the \quote{equivalence table} value of
7468 the token in \LUATEX. It returns the string value of the command used
7469 to create the current token, or an empty string if there is no
7470 associated control sequence.
7472 Keep in mind that there are potentially two control sequences that
7473 return the same csname string: single character control sequences
7474 and active characters have the same \quote{name}.
7476 \subsection{\luatex{token.csname_id}}
7478 \startfunctioncall
7479 <number> i = token.csname_id(<string> csname)
7480 \stopfunctioncall
7482 This returns a number that is the inverse operation of the previous
7483 command, to be used as the third item in a token table.
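
The lookup functions above can be combined as in the following sketch, which
creates a token for \type{\relax} and maps its entries to names and back. The
choice of \type{relax} is arbitrary.

\starttyping
local t = token.create("relax")           -- token table for \relax

local cmd   = token.command_name(t)       -- name of the command code in t[1]
local cmdid = token.command_id(cmd)       -- inverse lookup, for use as t[1]

local cs    = token.csname_name(t)        -- name of the control sequence
local csid  = token.csname_id(cs)         -- inverse lookup, for use as t[3]

texio.write_nl("log", cmd .. " (" .. cmdid .. "), " .. cs .. " (" .. csid .. ")")
\stoptyping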
7485 \subsection{The \luatex{newtoken} library}
7487 The current \type {token} library will be replaced by a new one that is more
7488 flexible and powerful. The transition takes place in steps. In version 0.80 we
7489 have \type {newtoken} and in version 0.85 the old lib will be replaced
7490 completely. So if you use this new mechanism in production code you need to be
7491 aware of incompatible updates between 0.80 and 0.90. Because the related in- and
7492 output code will also be cleaned up and rewritten you should be aware of
7493 incompatible logging and error reporting too.
7495 The old library presents tokens as triplets or numbers, the new library presents
7496 a userdata object. The old library used a callback to intercept tokens in the
7497 input but the new library provides a basic scanner infrastructure that can be
7498 used to write macros that accept a wide range of arguments. This interface is on
7499 purpose kept general and as performance is quite ok one can build additional
7500 parsers without too much overhead. It's up to macro package writers to see how
7501 they can benefit from this as the main principle behind \LUATEX\ is to provide
7502 a minimal set of tools and no solutions.
7504 The current functions in the \type {newtoken} namespace are given in the next
7505 table:
7507 \starttabulate[|lT|lT|p|]
7508 \NC \bf function \NC \bf argument \NC \bf result \NC \NR
7510 \NC is_token \NC token \NC checks if the given argument is a token userdatum \NC \NR
7511 \NC get_next \NC \NC returns the next token in the input \NC \NR
7512 \NC scan_keyword \NC string \NC returns true if the given keyword is gobbled \NC \NR
7513 \NC scan_int \NC \NC returns a number \NC \NR
\NC scan_dimen \NC infinity, mu-units \NC returns a number representing a dimension, or two numbers being the filler and order \NC \NR
7515 \NC scan_glue \NC mu-units \NC returns a glue spec node \NC \NR
\NC scan_toks \NC definer, expand \NC returns a table of tokens (this can become a linked list in later releases) \NC \NR
7517 \NC scan_code \NC bitset \NC returns a character if its category is in the given bitset (representing catcodes) \NC \NR
7518 \NC scan_string \NC \NC returns a string given between \type {{}}, as \type {\macro} or as sequence of characters with catcode 11 or 12 \NC \NR
7519 \NC scan_word \NC \NC returns a sequence of characters with catcode 11 or 12 as string \NC \NR
7520 \NC create \NC \NC returns a userdata token object of the given control sequence name (or character); this interface can change \NC \NR
7521 \stoptabulate
The scanners can be considered stable apart from the one scanning for tokens.
This is because future releases can return a linked list instead of a table (as
with nodes). The \type {scan_code} function takes an optional number, the \type
{scan_keyword} function a normal \LUA\ string. The \type {infinity} boolean signals
that we also permit \type {fill} as dimension and the \type {mu-units} flag tells
the scanner that we expect math units. When scanning tokens we can indicate that we
are defining a macro, in which case the result will also provide information
about what arguments are expected; in the result this is separated from the
meaning by a separator token. The \type {expand} flag determines if the list will
be expanded.
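
As a sketch of how these scanners combine, the following hypothetical \type
{\myrule} macro accepts an optional \type {width} keyword followed by a
dimension. The names \type {myrule} and \type {\myrule} are made up for this
illustration, and the example assumes that \type {scan_dimen}, called without
arguments, returns the dimension as a number in scaled points.

\starttyping
\directlua {
    function myrule()
        local width = 0
        if newtoken.scan_keyword("width") then
            width = newtoken.scan_dimen()
        end
        texio.write_nl("width in scaled points: " .. width)
    end
}

\def\myrule{\directlua{myrule()}}

\myrule width 10pt
\myrule
\stoptyping

In the second call no keyword is found, so under the stated assumption the
fallback value of zero is reported.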
7534 The string scanner scans for something between curly braces and expands on the way,
7535 or when it sees a control sequence it will return its meaning. Otherwise it will
7536 scan characters with catcode \type {letter} or \type {other}. So, given the
7537 following definition:
7539 \startbuffer
7540 \def\bar{bar}
7541 \def\foo{foo-\bar}
7542 \stopbuffer
7544 \typebuffer \getbuffer
7546 we get:
7548 \starttabulate[|l|Tl|l|]
7549 \NC \type{\directlua{newtoken.scan_string()}{foo}} \NC \directlua{context("{\\red\\type{"..newtoken.scan_string().."}}")} {foo} \NC full expansion \NR
7550 \NC \type{\directlua{newtoken.scan_string()}foo} \NC \directlua{context("{\\red\\type{"..newtoken.scan_string().."}}")} foo \NC letters and others \NR
7551 \NC \type{\directlua{newtoken.scan_string()}\foo} \NC \directlua{context("{\\red\\type{"..newtoken.scan_string().."}}")}\foo \NC meaning \NR
7552 \stoptabulate
7554 The \type {\foo} case only gives the meaning, but one can pass an already expanded
7555 definition (\type {\edef}'d). In the case of the braced variant one can of course
7556 use the \type {\detokenize} and \type {\unexpanded} primitives as there we do
7557 expand.
7559 The \type {scan_word} scanner can be used to implement for instance a number scanner:
\starttyping
function newtoken.scan_number(base)
    return tonumber(newtoken.scan_word(),base)
end
\stoptyping
7567 This scanner accepts any valid \LUA\ number so it is a way to pick up floats
7568 in the input.
7570 The creator function can be used as follows:
\starttyping
local t = newtoken.create("relax")
\stoptyping
7576 This gives back a token object that has the properties of the \type {\relax}
7577 primitive. The possible properties of tokens are:
7579 \starttabulate[|lT|p|]
7580 \NC command \NC a number representing the internal command number \NC \NR
\NC cmdname \NC the type of the command (for instance the catcode in case of a
                character or the classifier that determines the internal
                treatment) \NC \NR
7584 \NC csname \NC the associated control sequence (if applicable) \NC \NR
7585 \NC id \NC the unique id of the token \NC \NR
7586 %NC tok \NC \NC \NR % might change
7587 \NC active \NC a boolean indicating the active state of the token \NC \NR
7588 \NC expandable \NC a boolean indicating if the token (macro) is expandable \NC \NR
7589 \NC protected \NC a boolean indicating if the token (macro) is protected \NC \NR
7590 \stoptabulate
7592 The numbers that represent a catcode are the same as in \TEX\ itself, so using
7593 this information assumes that you know a bit about \TEX's internals. The other
7594 numbers and names are used consistently but are not frozen. So, when you use them
7595 for comparing you can best query a known primitive or character first to see the
7596 values.
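
The advice to query a known primitive first can be sketched as follows. This is
only an illustration: it assumes that the properties listed above can be read as
fields of the userdata object, and it writes them to the log with \type
{texio.write_nl}.

\starttyping
\directlua {
    local t = newtoken.create("relax")
    texio.write_nl("csname: "     .. tostring(t.csname))
    texio.write_nl("command: "    .. tostring(t.command))
    texio.write_nl("cmdname: "    .. tostring(t.cmdname))
    texio.write_nl("expandable: " .. tostring(t.expandable))
    texio.write_nl("protected: "  .. tostring(t.protected))
}
\stoptyping

The reported \type {command} number can then be stored and compared against the
\type {command} of tokens picked up later, instead of hard coding a number that
may change between releases.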
7598 More interesting are the scanners. You can use the \LUA\ interface as follows:
\starttyping
\directlua {
    function mymacro(n)
        ...
    end
}

\def\mymacro#1{%
    \directlua {
        mymacro(\number\dimexpr#1)
    }%
}

\mymacro{12pt}
\mymacro{\dimen0}
\stoptyping
7617 But starting with version 0.80 you can also do this:
\starttyping
\directlua {
    function mymacro()
        local d = newtoken.scan_dimen()
        ...
    end
}

\def\mymacro{%
    \directlua {
        mymacro()
    }%
}

\mymacro 12pt
\mymacro \dimen0
\stoptyping
7637 It is quite clear from looking at the code what the first method needs as
7638 argument(s). For the second method you need to look at the \LUA\ code to see what
7639 gets picked up. Instead of passing from \TEX\ to \LUA\ we let \LUA\ fetch from
7640 the input stream.
7642 In the first case the input is tokenized and then turned into a string when it's
7643 passed to \LUA\ where it gets interpreted. In the second case only a function
7644 call gets interpreted but then the input is picked up by explicitly calling the
7645 scanner functions. These return proper \LUA\ variables so no further conversion
7646 has to be done. This is more efficient but in practice (given what \TEX\ has to
7647 do) this effect should not be overestimated. For numbers and dimensions it saves a
7648 bit but for passing strings conversion to and from tokens has to be done anyway
7649 (although we can probably speed up the process in later versions if needed).
When the interface is stable and has replaced the old one completely we will add
some more information here. By that time the internals will have been cleaned up
a bit more, so we will know what stays and what goes. A positive side effect of
this transition is that we can simplify the input part because we no longer need
to intercept using callbacks.
7657 \chapter[math]{Math}
7659 The handling of mathematics in \LUATEX\ differs quite a bit from how
7660 \TEX82 (and therefore \PDFTEX) handles math. First, \LUATEX\ adds primitives and
7661 extends some others so that \UNICODE\ input can be used easily. Second, all
7662 of \TEX82's internal special values (for example for operator spacing) have
7663 been made accessible and changeable via control sequences. Third, there are
7664 extensions that make it easier to use \OPENTYPE\ math fonts. And finally,
7665 there are some extensions that have been proposed in the past that are now
7666 added to the engine.
7668 \section{The current math style}
7670 Starting with \LUATEX\ 0.39.0, it is possible to discover the math
7671 style that will be used for a formula in an expandable fashion
7672 (while the math list is still being read). To make this possible,
7673 \LUATEX\ adds the new primitive: \type{\mathstyle}. This is a
7674 \quote{convert command} like e.g. \type{\romannumeral}: its value can
7675 only be read, not set.
7677 \subsection{\tex{mathstyle}}
The returned value is between 0 and 7 (in math mode), or $-1$
(all other modes). For easy testing, the eight math style commands
have been altered so that they can be used as numeric values, so you
can write code like this:
7684 \starttyping
7685 \ifnum\mathstyle=\textstyle
7686 \message{normal text style}
7687 \else \ifnum\mathstyle=\crampedtextstyle
7688 \message{cramped text style}
7689 \fi \fi
7690 \stoptyping
7692 \subsection{\tex{Ustack}}
7694 There are a few math commands in \TEX\ where the style that will be used
7695 is not known straight from the start. These commands (\tex{over},
7696 \tex{atop}, \tex{overwithdelims}, \tex{atopwithdelims}) would
7697 therefore normally return wrong values for \type{\mathstyle}. To
7698 fix this, \LUATEX\ introduces a special prefix command:
7699 \type{\Ustack}:
7701 \starttyping
7702 $\Ustack {a \over b}$
7703 \stoptyping
7705 The \type{\Ustack} command will scan the next brace and start a new
7706 math group with the correct (numerator) math style.
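
A small sketch of the effect: the following formula typesets the style number
that is seen while the numerator is being read. The use of \type {\number} in
front of \type {\mathstyle} is a choice made for this illustration only.

\starttyping
$\Ustack {\number\mathstyle \over b}$
\stoptyping

Without the \type{\Ustack} prefix the number reported inside the fraction is
not guaranteed to be the numerator style, for the reason given above.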
7708 \section{Unicode math characters}
7710 Character handling is now extended up to the full \UNICODE\ range
7711 (the \type{\U} prefix), which is compatible with \XETEX.
The math primitives from \TEX\ are kept as they are, except for
the ones that convert from input to math commands: \tex{mathcode}
and \tex{delcode}. These two now allow
for a 21-bit character argument on the left hand side of the equals sign.
7718 Some of the new \LUATEX\ primitives read
7719 more than one separate value. This is shown in the tables below by a plus
7720 sign in the second column.
7722 The input for such primitives would look like this:
7724 \starttyping
7725 \def\overbrace {\Umathaccent 0 1 "23DE }
7726 \stoptyping
7729 Altered \TEX82 primitives:
7731 \starttabulate[|l|l|l|]
7732 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7733 \NC \tex{mathcode} \NC 0--10FFFF = 0--8000 \NC\NR
7734 \NC \tex{delcode} \NC 0--10FFFF = 0--FFFFFF \NC\NR
7735 \stoptabulate
7737 Unaltered:
7739 \starttabulate[|l|l|l|]
7740 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7741 \NC \tex{mathchardef} \NC 0--8000 \NC\NR
7742 \NC \tex{mathchar} \NC 0--7FFF \NC\NR
7743 \NC \tex{mathaccent} \NC 0--7FFF \NC\NR
7744 \NC \tex{delimiter} \NC 0--7FFFFFF \NC\NR
7745 \NC \tex{radical} \NC 0--7FFFFFF \NC\NR
7746 \stoptabulate
7748 New primitives that are compatible with \XETEX:
7750 \starttabulate[|l|l|l|l|]
7751 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7752 \NC \tex{Umathchardef} \NC 0+0+0--7+FF+10FFFF$^1$ \NC\NR
7753 \NC \tex{Umathcharnumdef}$^5$ \NC -80000000--7FFFFFFF$^3$ \NC\NR
7754 \NC \tex{Umathcode} \NC 0--10FFFF = 0+0+0--7+FF+10FFFF$^1$ \NC\NR
7755 \NC \tex{Udelcode} \NC 0--10FFFF = 0+0--FF+10FFFF$^2$ \NC\NR
7756 \NC \tex{Umathchar} \NC 0+0+0--7+FF+10FFFF \NC\NR
7757 \NC \tex{Umathaccent} \NC 0+0+0--7+FF+10FFFF$^{2,4}$ \NC\NR
7758 \NC \tex{Udelimiter} \NC 0+0+0--7+FF+10FFFF$^2$ \NC\NR
7759 \NC \tex{Uradical} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7760 \NC \tex{Umathcharnum} \NC -80000000--7FFFFFFF$^3$ \NC\NR
7761 \NC \tex{Umathcodenum} \NC 0--10FFFF = -80000000--7FFFFFFF$^3$ \NC\NR
7762 \NC \tex{Udelcodenum} \NC 0--10FFFF = -80000000--7FFFFFFF$^3$ \NC\NR
7763 \stoptabulate
Note 1: \type{\Umathchardef<csname>="8"0"0} and \type{\Umathcode<number>="8"0"0}
are also accepted.
7768 Note 2: The new primitives that deal with delimiter-style objects do not
7769 set up a \quote{large family}. Selecting a suitable size for display
7770 purposes is expected to be dealt with by the font via the
\tex{Umathoperatorsize} parameter (more information can be found in a following section).
7773 Note 3: For these three primitives, all information is packed into a single
7774 signed integer. For the first two (\tex{Umathcharnum} and
7775 \tex{Umathcodenum}), the lowest 21 bits are the character code, the 3
7776 bits above that represent the math class, and the family data is kept in
the topmost bits (this means that the values for math families 128--255 are
7778 actually negative). For \tex{Udelcodenum} there is no math class; the
7779 math family information is stored in the bits directly on top of the
7780 character code. Using these three commands is not as natural as using the
7781 two- and three-value commands, so unless you know exactly what you are
7782 doing and absolutely require the speedup resulting from the faster input
7783 scanning, it is better to use the verbose commands instead.
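
As a worked illustration of Note 3 (a sketch derived from the bit layout
described there, not taken from the engine source), the packed value for
\tex{Umathcharnum} and \tex{Umathcodenum} can be written as:

\startformula
  \hbox{value} = \hbox{character} + 2^{21} \times \hbox{class} + 2^{24} \times \hbox{family}
\stopformula

For class~1, family~0 and character \type{"2211} this gives \type{"200000} plus
\type{"2211}, which is \type{"202211}. Under this reading the following two
lines are expected to be equivalent:

\starttyping
\Umathchar    "1 "0 "2211
\Umathcharnum "202211
\stoptyping

With family~128 the family term alone is $2^{24} \times 128 = 2^{31}$, which no
longer fits in a positive signed 32-bit integer; this is why the values for
families 128--255 turn out negative.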
7785 Note 4: As of \LUATEX\ 0.65, \tex{Umathaccent} accepts optional
7786 keywords to control various details regarding math accents. See
7787 \in{section}[mathacc] below for details.
7789 Note 5: \tex{Umathcharnumdef} was added in release 0.72.
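
To make the notation in the table concrete, here is a sketch of the multi-value
syntax, with the values given in the order shown in the second column. The
control sequence \type{\mysum} is hypothetical, and the example assumes that the
font in math family~0 contains the characters involved.

\starttyping
\Umathcode "2211 = "1 "0 "2211      % class 1, family 0, character U+2211
\Umathchardef \mysum = "1 "0 "2211  % the same triplet bound to a control sequence
\Udelcode "28 = "0 "28              % no class: family 0, character U+0028
\stoptyping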
7792 New primitives that exist in \LUATEX\ only (all of these will be explained
7793 in following sections):
7796 \starttabulate[|l|l|l|l|]
7797 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7798 \NC \tex{Uroot} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7799 \NC \tex{Uoverdelimiter} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7800 \NC \tex{Uunderdelimiter} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7801 \NC \tex{Udelimiterover} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7802 \NC \tex{Udelimiterunder} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7803 \stoptabulate
7805 \section{Cramped math styles}
7807 \LUATEX\ has four new primitives to set the cramped math styles
7808 directly:
7810 \starttyping
7811 \crampeddisplaystyle
7812 \crampedtextstyle
7813 \crampedscriptstyle
7814 \crampedscriptscriptstyle
7815 \stoptyping
These additional commands are not all that valuable on their own, but
they come in handy as arguments to the math parameter settings that
are introduced in the next section.
7821 \section{Math parameter settings}
7823 In \LUATEX, the font dimension parameters that \TEX\ used in math
7824 typesetting are now accessible via primitive commands. In fact,
7825 refactoring of the math engine has resulted in many more parameters
7826 than were accessible before.
7828 \starttabulate
7829 \NC \bf primitive name \NC \bf description \NC \NR
7830 \NC \type{\Umathquad} \NC the width of 18mu's\NC \NR
7831 \NC \type{\Umathaxis} \NC height of the vertical center axis of
7832 the math formula above the baseline\NC \NR
7833 \NC \type{\Umathoperatorsize} \NC minimum size of large operators in display mode \NC \NR
7834 \NC \type{\Umathoverbarkern} \NC vertical clearance above the rule \NC \NR
7835 \NC \type{\Umathoverbarrule} \NC the width of the rule \NC \NR
7836 \NC \type{\Umathoverbarvgap} \NC vertical clearance below the rule \NC \NR
7837 \NC \type{\Umathunderbarkern} \NC vertical clearance below the rule \NC \NR
7838 \NC \type{\Umathunderbarrule} \NC the width of the rule \NC \NR
7839 \NC \type{\Umathunderbarvgap} \NC vertical clearance above the rule \NC \NR
7840 \NC \type{\Umathradicalkern} \NC vertical clearance above the rule \NC \NR
7841 \NC \type{\Umathradicalrule} \NC the width of the rule \NC \NR
7842 \NC \type{\Umathradicalvgap} \NC vertical clearance below the rule \NC \NR
7843 \NC \type{\Umathradicaldegreebefore}\NC the forward kern that takes place before placement of
7844 the radical degree \NC \NR
7845 \NC \type{\Umathradicaldegreeafter} \NC the backward kern that takes place after placement of
7846 the radical degree \NC \NR
7847 \NC \type{\Umathradicaldegreeraise} \NC this is the percentage of the total height and depth of
7848 the radical sign that the degree is raised by. It is
7849 expressed in \type{percents}, so 60\% is expressed as the
7850 integer $60$.\NC \NR
7851 \NC \type{\Umathstackvgap} \NC vertical clearance between the two
7852 elements in a \type{\atop} stack \NC \NR
7853 \NC \type{\Umathstacknumup} \NC numerator shift upward in \type{\atop} stack \NC \NR
7854 \NC \type{\Umathstackdenomdown} \NC denominator shift downward in \type{\atop} stack\NC \NR
7855 \NC \type{\Umathfractionrule} \NC the width of the rule in a \type{\over}\NC \NR
7856 \NC \type{\Umathfractionnumvgap} \NC vertical clearance between the numerator and the rule\NC \NR
7857 \NC \type{\Umathfractionnumup} \NC numerator shift upward in \type{\over} \NC \NR
7858 \NC \type{\Umathfractiondenomvgap} \NC vertical clearance between the denominator and the rule\NC \NR
7859 \NC \type{\Umathfractiondenomdown} \NC denominator shift downward in \type{\over} \NC \NR
7860 \NC \type{\Umathfractiondelsize} \NC minimum delimiter size for \type{\...withdelims}\NC \NR
7861 \NC \type{\Umathlimitabovevgap} \NC vertical clearance for limits above operators\NC \NR
7862 \NC \type{\Umathlimitabovebgap} \NC vertical baseline clearance for limits above operators\NC \NR
7863 \NC \type{\Umathlimitabovekern} \NC space reserved at the top of the limit\NC \NR
7864 \NC \type{\Umathlimitbelowvgap} \NC vertical clearance for limits below operators\NC \NR
7865 \NC \type{\Umathlimitbelowbgap} \NC vertical baseline clearance for limits below operators\NC \NR
7866 \NC \type{\Umathlimitbelowkern} \NC space reserved at the bottom of the limit\NC \NR
7867 \NC \type{\Umathoverdelimitervgap} \NC vertical clearance for limits above delimiters\NC \NR
7868 \NC \type{\Umathoverdelimiterbgap} \NC vertical baseline clearance for limits above delimiters\NC \NR
7869 \NC \type{\Umathunderdelimitervgap} \NC vertical clearance for limits below delimiters\NC \NR
7870 \NC \type{\Umathunderdelimiterbgap} \NC vertical baseline clearance for limits below delimiters\NC \NR
7871 \NC \type{\Umathsubshiftdrop} \NC subscript drop for boxes and subformulas\NC \NR
7872 \NC \type{\Umathsubshiftdown} \NC subscript drop for characters\NC \NR
7873 \NC \type{\Umathsupshiftdrop} \NC superscript drop (raise, actually) for boxes and subformulas\NC \NR
7874 \NC \type{\Umathsupshiftup} \NC superscript raise for characters\NC \NR
7875 \NC \type{\Umathsubsupshiftdown} \NC subscript drop in the presence of a superscript\NC \NR
7876 \NC \type{\Umathsubtopmax} \NC the top of standalone subscripts cannot be higher than this above the baseline\NC \NR
7877 \NC \type{\Umathsupbottommin} \NC the bottom of standalone superscripts cannot be less than this above the baseline\NC \NR
\NC \type{\Umathsupsubbottommax}    \NC the bottom of the superscript of a combined super- and subscript
                                        must be at least as high as this above the baseline\NC \NR
7880 \NC \type{\Umathsubsupvgap} \NC vertical clearance between super- and subscript\NC \NR
7881 \NC \type{\Umathspaceafterscript} \NC additional space added after a super- or subscript\NC \NR
7882 \NC \type{\Umathconnectoroverlapmin}\NC minimum overlap between parts in an extensible recipe\NC \NR
7883 \stoptabulate
7885 Each of the parameters in this section can be set by a command like this:
7887 \starttyping
7888 \Umathquad\displaystyle=1em
7889 \stoptyping
They obey grouping, and you can use \type{\the\Umathquad\displaystyle} if needed.
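
A minimal sketch of the grouping behaviour, using \tex{Umathfractionrule} as an
arbitrary example parameter and \type{\message} to report the values:

\starttyping
\begingroup
    \Umathfractionrule\displaystyle=1pt
    \message{inside: \the\Umathfractionrule\displaystyle}
\endgroup
\message{outside: \the\Umathfractionrule\displaystyle}
\stoptyping

The second message is expected to report whatever value was in effect before the
group was opened. As each parameter is set per style, the cramped variant
(\type{\crampeddisplaystyle}) has to be assigned separately.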
7893 \section{Font-based Math Parameters}
7895 While it is nice to have these math parameters available for tweaking, it
7896 would be tedious to have to set each of them by hand. For this reason,
7897 \LUATEX\ initializes a bunch of these parameters whenever you assign a font
7898 identifier to a math family based on either the traditional math font
7899 dimensions in the font (for assignments to math family~2 and~3 using
7900 \TFM|-|based fonts like \type{cmsy} and \type{cmex}), or based on the named
7901 values in a potential \type{MathConstants} table when the font is loaded
7902 via Lua. If there is a \type{MathConstants} table, this takes precedence
7903 over font dimensions, and in that case no attention is paid to which
family is being assigned to: the \type{MathConstants} table in the last
assigned family sets all parameters.
7907 In the table below, the one-letter style abbreviations and symbolic tfm
font dimension names match those used in the \TeX book. Assignments to
7909 \tex{textfont} set the values for the cramped and uncramped display and
7910 text styles. Use \tex{scriptfont} for the script styles, and
7911 \tex{scriptscriptfont} for the scriptscript styles (totalling eight
7912 parameters for three font sizes). In the \TFM\ case, assignments only happen
7913 in family~2 and family~3 (and of course only for the parameters for which
7914 there are font dimensions).
7916 Besides the parameters below, \LUATEX\ also looks at the \quote{space}
7917 font dimension parameter. For math fonts, this should be set to zero.
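
The initialization can be observed with a sketch like the following. It assumes
that a \TFM\ font \type{cmsy10} is available, and the font identifier
\type{\mysy} is made up for this example.

\starttyping
\font\mysy=cmsy10
\textfont2=\mysy
\message{uncramped: \the\Umathaxis\textstyle}
\message{cramped: \the\Umathaxis\crampedtextstyle}
\stoptyping

Because an assignment to \tex{textfont} sets both the cramped and uncramped
text styles, both messages are expected to report the same value.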
7919 \start
7921 \switchtobodyfont[8pt]
7923 \starttabulate[|l|l|l|p|]
7924 \NC \bf variable \NC \bf style \NC \bf default value opentype \NC \bf default value tfm \NC\NR
7925 \NC \tex{Umathaxis} \NC -- \NC AxisHeight \NC axis_height \NC\NR
7926 \NC \tex{Umathoperatorsize} \NC D, D' \NC DisplayOperatorMinHeight \NC $^6$ \NC\NR
7927 \NC \tex{Umathfractiondelsize} \NC D, D' \NC FractionDelimiterDisplayStyleSize$^9$ \NC delim1 \NC\NR
7928 \NC " \NC T, T', S, S', SS, SS' \NC FractionDelimiterSize$^9$ \NC delim2 \NC\NR
7929 \NC \tex{Umathfractiondenomdown}\NC D, D' \NC FractionDenominatorDisplayStyleShiftDown \NC denom1 \NC\NR
7930 \NC " \NC T, T', S, S', SS, SS' \NC FractionDenominatorShiftDown \NC denom2 \NC\NR
7931 \NC \tex{Umathfractiondenomvgap}\NC D, D' \NC FractionDenominatorDisplayStyleGapMin \NC 3*default_rule_thickness \NC\NR
7932 \NC " \NC T, T', S, S', SS, SS' \NC FractionDenominatorGapMin \NC default_rule_thickness \NC\NR
7933 \NC \tex{Umathfractionnumup} \NC D, D' \NC FractionNumeratorDisplayStyleShiftUp \NC num1 \NC\NR
7934 \NC " \NC T, T', S, S', SS, SS' \NC FractionNumeratorShiftUp \NC num2 \NC\NR
7935 \NC \tex{Umathfractionnumvgap} \NC D, D' \NC FractionNumeratorDisplayStyleGapMin \NC 3*default_rule_thickness \NC\NR
7936 \NC " \NC T, T', S, S', SS, SS' \NC FractionNumeratorGapMin \NC default_rule_thickness \NC\NR
7937 \NC \tex{Umathfractionrule} \NC -- \NC FractionRuleThickness \NC default_rule_thickness \NC\NR
7938 \NC \tex{Umathlimitabovebgap} \NC -- \NC UpperLimitBaselineRiseMin \NC big_op_spacing3 \NC\NR
7939 \NC \tex{Umathlimitabovekern} \NC -- \NC 0$^1$ \NC big_op_spacing5 \NC\NR
7940 \NC \tex{Umathlimitabovevgap} \NC -- \NC UpperLimitGapMin \NC big_op_spacing1 \NC\NR
7941 \NC \tex{Umathlimitbelowbgap} \NC -- \NC LowerLimitBaselineDropMin \NC big_op_spacing4 \NC\NR
7942 \NC \tex{Umathlimitbelowkern} \NC -- \NC 0$^1$ \NC big_op_spacing5 \NC\NR
7943 \NC \tex{Umathlimitbelowvgap} \NC -- \NC LowerLimitGapMin \NC big_op_spacing2 \NC\NR
7944 \NC \tex{Umathoverdelimitervgap}\NC -- \NC StretchStackGapBelowMin \NC big_op_spacing1 \NC\NR
7945 \NC \tex{Umathoverdelimiterbgap}\NC -- \NC StretchStackTopShiftUp \NC big_op_spacing3 \NC\NR
7946 \NC \tex{Umathunderdelimitervgap}\NC-- \NC StretchStackGapAboveMin \NC big_op_spacing2 \NC\NR
7947 \NC \tex{Umathunderdelimiterbgap}\NC-- \NC StretchStackBottomShiftDown \NC big_op_spacing4 \NC\NR
7948 \NC \tex{Umathoverbarkern} \NC -- \NC OverbarExtraAscender \NC default_rule_thickness \NC\NR
7949 \NC \tex{Umathoverbarrule} \NC -- \NC OverbarRuleThickness \NC default_rule_thickness \NC\NR
7950 \NC \tex{Umathoverbarvgap} \NC -- \NC OverbarVerticalGap \NC 3*default_rule_thickness \NC\NR
7951 \NC \tex{Umathquad} \NC -- \NC <font_size(f)>$^1$ \NC math_quad \NC\NR
7952 \NC \tex{Umathradicalkern} \NC -- \NC RadicalExtraAscender \NC default_rule_thickness \NC\NR
7953 \NC \tex{Umathradicalrule} \NC -- \NC RadicalRuleThickness \NC <not set>$^2$ \NC\NR
7954 \NC \tex{Umathradicalvgap} \NC D, D' \NC RadicalDisplayStyleVerticalGap \NC (default_rule_thickness+\crlf
7955 (abs(math_x_height)/4))$^3$ \NC\NR
7956 \NC " \NC T, T', S, S', SS, SS' \NC RadicalVerticalGap \NC (default_rule_thickness+\crlf
7957 (abs(default_rule_thickness)/4))$^3$ \NC\NR
7958 \NC \tex{Umathradicaldegreebefore}\NC -- \NC RadicalKernBeforeDegree \NC <not set>$^2$ \NC\NR
7959 \NC \tex{Umathradicaldegreeafter}\NC -- \NC RadicalKernAfterDegree \NC <not set>$^2$ \NC\NR
7960 \NC \tex{Umathradicaldegreeraise}\NC -- \NC RadicalDegreeBottomRaisePercent \NC <not set>$^{2,7}$ \NC\NR
7961 \NC \tex{Umathspaceafterscript} \NC -- \NC SpaceAfterScript \NC script_space$^4$ \NC\NR
7962 \NC \tex{Umathstackdenomdown} \NC D, D' \NC StackBottomDisplayStyleShiftDown \NC denom1 \NC\NR
7963 \NC " \NC T, T', S, S', SS, SS' \NC StackBottomShiftDown \NC denom2 \NC\NR
7964 \NC \tex{Umathstacknumup} \NC D, D' \NC StackTopDisplayStyleShiftUp \NC num1 \NC\NR
7965 \NC " \NC T, T', S, S', SS, SS' \NC StackTopShiftUp \NC num3 \NC\NR
7966 \NC \tex{Umathstackvgap} \NC D, D' \NC StackDisplayStyleGapMin \NC 7*default_rule_thickness \NC\NR
7967 \NC " \NC T, T', S, S', SS, SS' \NC StackGapMin \NC 3*default_rule_thickness \NC\NR
7968 \NC \tex{Umathsubshiftdown} \NC -- \NC SubscriptShiftDown \NC sub1 \NC\NR
7969 \NC \tex{Umathsubshiftdrop} \NC -- \NC SubscriptBaselineDropMin \NC sub_drop \NC\NR
7970 \NC \tex{Umathsubsupshiftdown} \NC -- \NC SubscriptShiftDownWithSuperscript$^8$ \NC \NC\NR
7971 \NC \NC \NC \quad\ or SubscriptShiftDown \NC sub2 \NC\NR
7972 \NC \tex{Umathsubtopmax} \NC -- \NC SubscriptTopMax \NC (abs(math_x_height * 4) / 5) \NC\NR
7973 \NC \tex{Umathsubsupvgap} \NC -- \NC SubSuperscriptGapMin \NC 4*default_rule_thickness \NC\NR
7974 \NC \tex{Umathsupbottommin} \NC -- \NC SuperscriptBottomMin \NC (abs(math_x_height) / 4) \NC\NR
7975 \NC \tex{Umathsupshiftdrop} \NC -- \NC SuperscriptBaselineDropMax \NC sup_drop \NC\NR
7976 \NC \tex{Umathsupshiftup} \NC D \NC SuperscriptShiftUp \NC sup1 \NC\NR
\NC "                          \NC T, S, SS \NC SuperscriptShiftUp \NC sup2 \NC\NR
7978 \NC " \NC D', T', S', SS' \NC SuperscriptShiftUpCramped \NC sup3 \NC\NR
7979 \NC \tex{Umathsupsubbottommax} \NC -- \NC SuperscriptBottomMaxWithSubscript \NC (abs(math_x_height * 4) / 5) \NC\NR
7980 \NC \tex{Umathunderbarkern} \NC -- \NC UnderbarExtraDescender \NC default_rule_thickness \NC\NR
7981 \NC \tex{Umathunderbarrule} \NC -- \NC UnderbarRuleThickness \NC default_rule_thickness \NC\NR
7982 \NC \tex{Umathunderbarvgap} \NC -- \NC UnderbarVerticalGap \NC 3*default_rule_thickness \NC\NR
7983 \NC \tex{Umathconnectoroverlapmin}\NC -- \NC MinConnectorOverlap \NC 0$^5$ \NC\NR
7984 \stoptabulate
7986 \stop
7988 Note 1: \OPENTYPE\ fonts set \tex{Umathlimitabovekern} and
7989 \tex{Umathlimitbelowkern} to zero and set \tex{Umathquad} to the font size of the used font,
because these are not supported in the MATH table.
7992 Note 2: \TFM\ fonts do not set \tex{Umathradicalrule} because \TeX82\ uses the height of the radical
7993 instead. When this parameter is indeed not set when \LUATEX\ has to typeset a radical, a backward
7994 compatibility mode will kick in that assumes that an oldstyle \TeX\ font is used. Also, they do
7995 not set \tex{Umathradicaldegreebefore}, \tex{Umathradicaldegreeafter}, and
7996 \tex{Umathradicaldegreeraise}. These are then automatically initialized to $5/18$quad, $-10/18$quad, and 60.
Note 3: If tfm fonts are used, then the \tex{Umathradicalvgap} is not set until the first time
\LUATEX\ has to typeset a formula because this needs parameters from both family2 and family3.
This provides backward compatibility with \TEX82, but only partially:
once the \tex{Umathradicalvgap} is set, it will not be recalculated any more.
Note 4: (also if tfm fonts are used) A similar situation arises with respect to \tex{Umathspaceafterscript}: it is not
set until the first time \LUATEX\ has to typeset a formula. This provides some backward compatibility with
\TEX82. But once the \tex{Umathspaceafterscript} is set, \tex{scriptspace} will never be looked at again.
8007 Note 5: Tfm fonts set \tex{Umathconnectoroverlapmin} to zero because
8008 \TeX82\ always stacks extensibles without any overlap.
8010 Note 6: The \tex{Umathoperatorsize} is only used in \type{\displaystyle}, and is only set
8011 in \OPENTYPE\ fonts. In \TFM\ font mode, it is artificially set to one scaled point more than the
8012 initial attempt's size, so that always the \quote{first next} will be tried, just like in \TEX82.
8014 Note 7: The \tex{Umathradicaldegreeraise} is a special case because it is the only parameter that is
8015 expressed in a percentage instead of as a number of scaled points.
8017 Note 8: \type{SubscriptShiftDownWithSuperscript} does not actually exist in the \quote{standard}
8018 Opentype Math font Cambria, but it is useful enough to be added. New in version 0.38.
8020 Note 9: \type{FractionDelimiterDisplayStyleSize} and \type{FractionDelimiterSize} do not actually exist in the \quote{standard}
8021 Opentype Math font Cambria, but were useful enough to be added. New in version 0.47.
8024 \section{Math spacing setting}
8026 Besides the parameters mentioned in the previous sections, there are
8027 also 64 new primitives to control the math spacing table (as explained in
8028 Chapter~18 of the \TeX book). The primitive names are a simple matter
8029 of combining two math atom types, but for completeness' sake, here is
8030 the whole list:
8032 \startcolumns[n=2]
8033 \starttyping
8034 \Umathordordspacing
8035 \Umathordopspacing
8036 \Umathordbinspacing
8037 \Umathordrelspacing
8038 \Umathordopenspacing
8039 \Umathordclosespacing
8040 \Umathordpunctspacing
8041 \Umathordinnerspacing
8042 \Umathopordspacing
8043 \Umathopopspacing
8044 \Umathopbinspacing
8045 \Umathoprelspacing
8046 \Umathopopenspacing
8047 \Umathopclosespacing
8048 \Umathoppunctspacing
8049 \Umathopinnerspacing
8050 \Umathbinordspacing
8051 \Umathbinopspacing
8052 \Umathbinbinspacing
8053 \Umathbinrelspacing
8054 \Umathbinopenspacing
8055 \Umathbinclosespacing
8056 \Umathbinpunctspacing
8057 \Umathbininnerspacing
8058 \Umathrelordspacing
8059 \Umathrelopspacing
8060 \Umathrelbinspacing
8061 \Umathrelrelspacing
8062 \Umathrelopenspacing
8063 \Umathrelclosespacing
8064 \Umathrelpunctspacing
8065 \Umathrelinnerspacing
8066 \Umathopenordspacing
8067 \Umathopenopspacing
8068 \Umathopenbinspacing
8069 \Umathopenrelspacing
8070 \Umathopenopenspacing
8071 \Umathopenclosespacing
8072 \Umathopenpunctspacing
8073 \Umathopeninnerspacing
8074 \Umathcloseordspacing
8075 \Umathcloseopspacing
8076 \Umathclosebinspacing
8077 \Umathcloserelspacing
8078 \Umathcloseopenspacing
8079 \Umathcloseclosespacing
8080 \Umathclosepunctspacing
8081 \Umathcloseinnerspacing
8082 \Umathpunctordspacing
8083 \Umathpunctopspacing
8084 \Umathpunctbinspacing
8085 \Umathpunctrelspacing
8086 \Umathpunctopenspacing
8087 \Umathpunctclosespacing
8088 \Umathpunctpunctspacing
8089 \Umathpunctinnerspacing
8090 \Umathinnerordspacing
8091 \Umathinneropspacing
8092 \Umathinnerbinspacing
8093 \Umathinnerrelspacing
8094 \Umathinneropenspacing
8095 \Umathinnerclosespacing
8096 \Umathinnerpunctspacing
8097 \Umathinnerinnerspacing
8098 \stoptyping
8099 \stopcolumns
8101 These parameters are of type \type{\muskip}, so setting a parameter
8102 can be done like this:
8104 \starttyping
8105 \Umathopordspacing\displaystyle=4mu plus 2mu
8106 \stoptyping
8108 They are all initialized by initex to the values mentioned in the
8109 table in Chapter~18 of the \TeX book.
8111 Note 1: for ease of use as well as for backward compatibility, \type{\thinmuskip},
\type{\medmuskip} and \type{\thickmuskip} are treated specially. In their case a pointer to
8113 the corresponding internal parameter is saved, not the actual \type{\muskip} value. This
8114 means that any later changes to one of these three parameters will be taken into account.
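
The behaviour described in Note 1 can be sketched like this; the choice of
\type{\Umathordbinspacing} and of the value \type{6mu} is arbitrary:

\starttyping
\Umathordbinspacing\textstyle=\medmuskip
\medmuskip=6mu
$a+b$
\stoptyping

Because a reference to \type{\medmuskip} is stored instead of its value at the
time of the assignment, the formula is expected to use the later \type{6mu}
setting between the ordinary atom and the binary operator.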
8116 Note 2: Careful readers will realise that there are also primitives
8117 for the items marked \type{*} in the \TeX book. These will not
8118 actually be used as those combinations of atoms cannot actually
8119 happen, but it seemed better not to break orthogonality. They are initialized to zero.
8122 \section[mathacc]{Math accent handling}
8124 \LUATEX\ supports both top accents and bottom accents in math mode,
8125 and math accents stretch automatically (if this is supported by the
8126 font the accent comes from, of course). Bottom and combined accents as
8127 well as fixed-width math accents are controlled by optional keywords
8128 following \tex{Umathaccent}.
8130 The keyword \type{bottom} after \tex{Umathaccent} signals that a bottom
8131 accent is needed, and the keyword \type{both} signals that both a top
8132 and a bottom accent are needed (in this case two accents need to be
8133 specified, of course).
8135 Then the set of three integers defining the accent is read. This set
8136 of integers can be prefixed by the \type{fixed} keyword to indicate
8137 that a non-stretching variant is requested (in case of both accents,
8138 this step is repeated).
8140 A simple example:
8141 \starttyping
8142 \Umathaccent both fixed 0 0 "20D7 fixed 0 0 "20D7 {example}
8143 \stoptyping
8145 If a math top accent has to be placed and the accentee is a character and has a non-zero
8146 \type{top_accent} value, then this value will be used to place the accent instead of
8147 the \type{\skewchar} kern used by \TEX82.
8149 The \type{top_accent} value represents a vertical line somewhere in the accentee. The
8150 accent will be shifted horizontally such that its own \type{top_accent} line coincides
8151 with the one from the accentee. If the \type{top_accent} value of the accent is zero,
8152 then half the width of the accent followed by its italic correction is used instead.
8154 The vertical placement of a top accent depends on the \type{x_height} of the font of the
accentee (as explained in the \TEX book), but if that value turns out to be zero and the
font has a MathConstants table, then \type{AccentBaseHeight} is used instead.
8158 If a math bottom accent has to be placed, the \type{bot_accent} value is checked instead
8159 of \type{top_accent}. Because bottom accents do not exist in \TEX82, the \type{\skewchar}
8160 kern is ignored.
8162 The vertical placement of a bottom accent is straight below the accentee, no correction
8163 takes place.
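
The keyword variants can be compared side by side. This is only a sketch: it assumes that the three integers are class, family and character slot, that math family~0 is an OpenType math font, and that this font provides the combining arrows U+20D7 (above) and U+20EF (below).

\starttyping
$\Umathaccent        0 0 "20D7 {v}$    % top accent (no keyword)
$\Umathaccent bottom 0 0 "20EF {v}$    % bottom accent
$\Umathaccent fixed  0 0 "20D7 {abc}$  % top accent, non-stretching variant
\stoptyping
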
8165 \section{Math root extension}
8167 The new primitive \type{\Uroot} allows the construction of a radical
8168 noad including a degree field. Its syntax is an extension of \type{\Uradical}:
8170 \starttyping
8171 \Uradical <fam integer> <char integer> <radicand>
8172 \Uroot <fam integer> <char integer> <degree> <radicand>
8173 \stoptyping
8175 The placement of the degree is controlled by the math parameters
8176 \type{\Umathradicaldegreebefore}, \type{\Umathradicaldegreeafter}, and
8177 \type{\Umathradicaldegreeraise}. The degree will be typeset in \type{\scriptscriptstyle}.
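
As an illustration, the two primitives could be used as follows. The sketch assumes that math family~0 is an OpenType math font with a radical sign at U+221A.

\starttyping
$\Uradical 0 "221A {x+1}$      % plain square root
$\Uroot    0 "221A {3} {x+1}$  % cube root: the degree comes before the radicand
\stoptyping
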
8180 \section{Math kerning in super- and subscripts}
8182 The character fields in a lua-loaded OpenType math font can have a \quote{mathkern} table.
8183 The format of this table is the same as the \quote{mathkern} table that is returned by
8184 the \type{fontloader} library, except that all height and kern values have to
8185 be specified in actual scaled points.
8187 When a super- or subscript has to be placed next to a math item, \LUATEX\ checks
8188 whether the super- or subscript and the nucleus are both simple character items. If
they are, and if the fonts of both character items are OpenType fonts (as opposed to
8190 legacy \TEX\ fonts), then \LUATEX\ will use the OpenType MATH algorithm for deciding
8191 on the horizontal placement of the super- or subscript.
8193 This works as follows:
8195 \startitemize
8196 \item The vertical position of the script is calculated.
8197 \item The default horizontal position is flat next to the base character.
8198 \item For superscripts, the italic correction of the base character is added.
8199 \item For a superscript, two vertical values are calculated: the bottom of the
8200 script (after shifting up), and the top of the base. For a subscript,
8201 the two values are the top of the (shifted down) script, and the bottom
8202 of the base.
8203 \item For each of these two locations:
8204 \startitemize
8205 \item find the mathkern value at this height for the base
8206 (for a subscript placement, this is the bottom_right corner,
8207 for a superscript placement the top_right corner)
8208 \item find the mathkern value at this height for the script
8209 (for a subscript placement, this is the top_left corner,
8210 for a superscript placement the bottom_left corner)
8211 \item add the found values together to get a preliminary result.
8212 \stopitemize
\item The horizontal kern to be applied is the smaller of the two results from the
previous step.
8215 \stopitemize
8217 The mathkern value at a specific height is the kern value that is specified by the
8218 next higher height and kern pair, or the highest one in the character (if there is no
8219 value high enough in the character), or simply zero (if the character has no mathkern
8220 pairs at all).
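
The lookup and the final selection can be sketched in \LUA. This is not the engine's actual code. The function names \type{sketch_kern_at} and \type{sketch_script_kern} are hypothetical, and the sketch assumes that each corner list is an array of tables with \type{height} and \type{kern} fields, sorted by increasing height. Treating a pair whose height equals the requested height as a match is also an assumption.

\starttyping
\directlua{
  % TeX comments are used here: a Lua "--" comment would swallow the
  % rest of the \directlua argument, which Lua sees as a single line.

  % kern for one corner list at a given height
  function sketch_kern_at(list, height)
    local last = 0                  % stays zero when there are no pairs
    for _, p in ipairs(list or {}) do
      last = p.kern
      if p.height >= height then    % first pair that is high enough
        return p.kern
      end
    end
    return last                     % nothing high enough: highest pair
  end

  % base and script are the relevant corner lists, h1 and h2 the two
  % vertical locations described in the list above
  function sketch_script_kern(base, script, h1, h2)
    local k1 = sketch_kern_at(base, h1) + sketch_kern_at(script, h1)
    local k2 = sketch_kern_at(base, h2) + sketch_kern_at(script, h2)
    return math.min(k1, k2)
  end
}
\stoptyping

For a superscript, \type{base} would be the \type{top_right} list of the base character and \type{script} the \type{bottom_left} list of the script character.
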
8222 \section{Scripts on horizontally extensible items like arrows}
8224 The new primitives \tex{Uunderdelimiter} and \tex{Uoverdelimiter}
8225 (both from 0.35) allow the placement of a subscript or superscript on
8226 an automatically extensible item and \tex{Udelimiterunder} and
8227 \tex{Udelimiterover} (both from 0.37) allow the placement of
8228 an automatically extensible item as a subscript or superscript on a
8229 nucleus.
8231 The vertical placements are controlled by
8232 \tex{Umathunderdelimiterbgap}, \tex{Umathunderdelimitervgap},
8233 \tex{Umathoverdelimiterbgap}, and \tex{Umathoverdelimitervgap} in a similar way as limit
8234 placements on large operators. The superscript in \tex{Uoverdelimiter} is typeset in
8235 a suitable scripted style, the subscript in \tex{Uunderdelimiter} is cramped as well.
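
A short sketch of the four primitives, assuming that math family~0 is an OpenType math font with an extensible arrow at U+2194:

\starttyping
$\Uoverdelimiter  0 "2194 {\hbox{\strut script}}$   % script above the arrow
$\Uunderdelimiter 0 "2194 {\hbox{\strut script}}$   % script below the arrow
$\Udelimiterover  0 "2194 {\hbox{\strut nucleus}}$  % arrow above the nucleus
$\Udelimiterunder 0 "2194 {\hbox{\strut nucleus}}$  % arrow below the nucleus
\stoptyping
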
8237 \section {Extensible delimiters}
8239 \LUATEX\ internally uses a structure that supports \OPENTYPE\ \quote{MathVariants} as well
8240 as \TFM\ \quote{extensible recipes}.
8243 \section{Other Math changes}
8245 \subsection {Verbose versions of single-character math commands}
8247 \LUATEX\ defines six new primitives that have the same function as
8248 \type{^}, \type{_}, \type{$}, and \type{$$}. %$
\starttabulate[|l|p|]
8251 \NC \bf primitive \NC \bf explanation \NC\NR
8252 \NC \tex{Usuperscript} \NC Duplicates the functionality of \type{^} \NC\NR
8253 \NC \tex{Usubscript} \NC Duplicates the functionality of \type{_} \NC\NR
8254 \NC \tex{Ustartmath} \NC Duplicates the functionality of \type{$}, % $
8255 when used in non-math mode. \NC\NR
8256 \NC \tex{Ustopmath} \NC Duplicates the functionality of \type{$}, % $
8257 when used in inline math mode. \NC\NR
8258 \NC \tex{Ustartdisplaymath}\NC Duplicates the functionality of \type{$$}, % $$
8259 when used in non-math mode. \NC\NR
8260 \NC \tex{Ustopdisplaymath} \NC Duplicates the functionality of \type{$$}, % $$
8261 when used in display math mode. \NC\NR
8262 \stoptabulate
8264 All are new in version 0.38. The \tex{Ustopmath} and \tex{Ustopdisplaymath}
8265 primitives check if the current math mode is the correct one (inline
8266 vs. displayed), but you can freely intermix the four mathon|/|mathoff
8267 commands with explicit dollar sign(s).
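
For example, the following lines use only the verbose primitives, except for the last one, which closes inline math with an explicit dollar sign:

\starttyping
\Ustartmath a\Usuperscript{2} + b\Usubscript{i} \Ustopmath

\Ustartdisplaymath
  x\Usuperscript{n} = y\Usubscript{n}
\Ustopdisplaymath

\Ustartmath c + d$
\stoptyping
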
8270 \subsection{Allowed math commands in non-math modes}
The commands \type{\mathchar} and \type{\Umathchar}, as well as control
sequences that are the result of \type{\mathchardef} or
\type{\Umathchardef}, are also acceptable in the horizontal and vertical modes.
8275 In those cases, the \type{\textfont} from the requested math family is used.
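
A small sketch, assuming the plain \TEX\ definition \type{\mathchardef\alpha="010B} and the usual plain \TEX\ font setup for family~1:

\starttyping
The coefficient \alpha\ is positive.  % glyph taken from \textfont1
The same glyph again: \mathchar"010B
\stoptyping
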
8277 \section{Math todo}
8279 The following items are still todo.
8281 \startitemize
8282 \item Pre-scripts.
8283 \item Multi-story stacks.
8284 \item Flattened accents for high characters (?).
8285 \item Better control over the spacing around displays and handling of equation numbers.
8286 \item Support for multi-line displays using \MATHML\ style alignment points.
8287 \stopitemize
8289 \chapter[languages]{Languages and characters, fonts and glyphs}
8291 \LUATEX's internal handling of the characters and glyphs that eventually
8292 become typeset is quite different from the way \TEX82 handles those
8293 same objects. The easiest way to explain the difference is to focus on
8294 unrestricted horizontal mode (i.\,e.\ paragraphs) and hyphenation first.
8295 Later on, it will be easy to deal with the differences that occur in
8296 horizontal and math modes.
8298 In \TEX82, the characters you type are converted into \type{char_node}
8299 records when they are encountered by the main control loop. \TEX\
8300 attaches and processes the font information while creating those
8301 records, so that the resulting \quote{horizontal list} contains the final
8302 forms of ligatures and implicit kerning. This packaging is needed because
8303 we may want to get the effective width of for instance a horizontal box.
8305 When it becomes necessary to hyphenate words in a paragraph, \TEX\
converts (one word at a time) the \type{char_node} records into a
8307 string array by replacing ligatures with their components and
8308 ignoring the kerning. Then it runs the hyphenation algorithm on this
8309 string, and converts the hyphenated result back into a
\quote{horizontal list} that is subsequently spliced back into
8311 the paragraph stream. Keep in mind that the paragraph may contain unboxed horizontal material,
8312 which then already contains ligatures and kerns and the words therein
8313 are part of the hyphenation process.
8315 The \type{char_node} records are somewhat misnamed, as they are glyph
8316 positions in specific fonts, and therefore not really \quote{characters}
8317 in the linguistic sense. There is no language information inside the
8318 \type{char_node} records. Instead, language information is passed along
8319 using \type{language whatsit} records inside the horizontal list.
8321 In \LUATEX, the situation is quite different. The characters you
8322 type are always converted into \type{glyph_node} records with a
8323 special subtype to identify them as being intended as linguistic
8324 characters. \LUATEX\ stores the needed language information in those
8325 records, but does not do any font|-|related processing at the time of
8326 node creation. It only stores the index of the font.
8328 When it becomes necessary to typeset a paragraph, \LUATEX\ first
8329 inserts all hyphenation points right into the whole node list.
8330 Next, it processes all the font information in the whole list
8331 (creating ligatures and adjusting kerning), and finally it adjusts
8332 all the subtype identifiers so that the records are \quote{glyph
8333 nodes} from now on.
8335 That was the broad overview. The rest of this chapter will deal with the
8336 minutiae of the new process.
8338 \section[charsandglyphs]{Characters and glyphs}
8340 \TEX82 (including \PDFTEX) differentiated between \type{char_node}s
8341 and \type{lig_node}s. The former are simple items that contained
8342 nothing but a \quote{character} and a \quote{font} field, and they
8343 lived in the same memory as tokens did. The latter also contained a
8344 list of components, and a subtype indicating whether this ligature was
8345 the result of a word boundary, and it was stored in the same place as
8346 other nodes like boxes and kerns and glues.
8348 In \LUATEX, these two types are merged into one, somewhat larger
8349 structure called a \type{glyph_node}. Besides having the old
8350 character, font, and component fields, and the new special fields like
8351 \quote{attr} (see~\in{section}[glyphnodes]), these nodes also contain:
8353 \startitemize
8355 \item A subtype, split into four main types:
8357 \startitemize
8358 \item \type{character}, for characters to be hyphenated: the lowest
8359 bit (bit 0) is set to 1.
8360 \item \type{glyph}, for specific font glyphs: the lowest bit
8361 (bit 0) is not set.
8362 \item \type{ligature}, for ligatures (bit 1 is set)
8363 \item \type{ghost}, for \quote{ghost objects} (bit 2 is set)
8364 \stopitemize
8366 The latter two make further use of two extra fields (bits 3 and 4):
8368 \startitemize
8369 \item \type{left}, for ligatures created from a left word boundary and
8370 for ghosts created from \tex{leftghost}
8371 \item \type{right}, for ligatures created from a right word boundary and
8372 for ghosts created from \tex{rightghost}
8373 \stopitemize
8375 For ligatures, both bits can be set at the same time (in case of a single|-|glyph word).
8377 \item \type{glyph_node}s of type \quote{character} also contain language data,
8378 split into four items that were current when the node was created:
8379 the \tex{setlanguage} (15 bits), \tex{lefthyphenmin} (8 bits),
8380 \tex{righthyphenmin} (8 bits), and \tex{uchyph} (1 bit).
8382 \stopitemize
8384 Incidentally, \LUATEX\ allows 16383 separate languages, and words can
8385 be 256 characters long.
8387 Because the \tex{uchyph} value is saved in the actual nodes, its
8388 handling is subtly different from \TEX82: changes to \tex{uchyph}
8389 become effective immediately, not at the end of the current partial
8390 paragraph.
8392 Typeset boxes now always have their language information embedded in
8393 the nodes themselves, so there is no longer a possible dependency on
8394 the surrounding language settings. In \TEX82, a mid-paragraph
8395 statement like \tex{unhbox0} would process the box using the current
8396 paragraph language unless there was a \tex{setlanguage} issued inside
8397 the box. In \LUATEX, all language variables are already frozen.
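
The following sketch illustrates the difference. The language numbers 0 and~1 are hypothetical placeholders for two languages with loaded patterns.

\starttyping
% the glyph nodes in box 0 record language 1 and the hyphenmin values
% that are current at the moment the nodes are created
\setbox0=\hbox{\language=1 \lefthyphenmin=2 \righthyphenmin=3 hyphenation}

\language=0
Some text \unhbox0\ and more text.
\stoptyping

In \LUATEX, the word from the box is hyphenated according to language~1, even though the surrounding paragraph uses language~0.
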
8400 \section{The main control loop}
8402 In \LUATEX's main loop, almost all input characters that are to be
8403 typeset are converted into \type{glyph_node} records with subtype
8404 \quote{character}, but there are a few small exceptions.
First, the \tex{accent} primitive creates nodes with subtype \quote{glyph}
8407 instead of \quote{character}: one for the actual accent and one for the
8408 accentee. The primary reason for this is that \tex{accent} in \TEX82
8409 is explicitly dependent on the current font encoding, so it would not
8410 make much sense to attach a new meaning to the primitive's name, as
8411 that would invalidate many old documents and macro packages. A
8412 secondary reason is that in \TEX82, \tex{accent} prohibits hyphenation
8413 of the current word. Since in \LUATEX\ hyphenation only takes place on
8414 \quote{character} nodes, it is possible to achieve the same effect.
8416 This change of meaning did happen with \tex{char}, that now generates
8417 \quote{character} nodes, consistent with its changed meaning in \XETEX.
8418 The changed status of \tex{char} is not yet finalized, but if it stays
8419 as it is now, a new primitive \tex{glyph} should be added to directly
8420 insert a font glyph id.
8422 Second, all the results of processing in math mode eventually become
8423 nodes with \quote{glyph} subtypes.
8425 Third, the \ALEPH-derived commands \tex{leftghost} and
8426 \tex{rightghost} create nodes of a third subtype: \quote{ghost}. These nodes
8427 are ignored completely by all further processing until the stage where
8428 inter-glyph kerning is added.
8430 Fourth, automatic discretionaries are handled differently. \TEX82
8431 inserts an empty discretionary after sensing an input character that
8432 matches the \tex{hyphenchar} in the current font. This test is wrong,
8433 in our opinion: whether or not hyphenation takes place should not
8434 depend on the current font, it is a language property.
8436 In \LUATEX, it works like this: if \LUATEX\ senses a string of input
8437 characters that matches the value of the new integer parameter
8438 \tex{exhyphenchar}, it will insert an explicit discretionary after that
8439 series of nodes. Initex sets the \tex{exhyphenchar=`\-}.
8440 Incidentally, this is a global parameter instead of a
8441 language-specific one because it may be useful to change the value
8442 depending on the document structure instead of the text language.
8444 Note: as of \LUATEX\ 0.63.0, the insertion of discretionaries after
8445 a sequence of explicit hyphens happens at the same time as the other
8446 hyphenation processing, {\it not\/} inside the main control loop.
8448 The only use \LUATEX\ has for \tex{hyphenchar} is at the check
8449 whether a word should be considered for hyphenation at all. If the
8450 \tex{hyphenchar} of the font attached to the first character node in a
8451 word is negative, then hyphenation of that word is abandoned
immediately. {\bf This behavior is added for backward
compatibility only, and \type{\hyphenchar=-1} should not be used as a means of
preventing hyphenation in new \LUATEX\ documents.}
8456 Fifth, \tex{setlanguage} no longer creates whatsits. The meaning of
8457 \tex{setlanguage} is changed so that it is now an integer parameter
like all others. That integer parameter is used in \type{glyph_node}
8459 creation to add language information to the glyph nodes. In
8460 conjunction, the \tex{language} primitive is extended so that it
8461 always also updates the value of \tex{setlanguage}.
8463 Sixth, the \tex{noboundary} command (this command prohibits word
8464 boundary processing where that would normally take place) now does
8465 create whatsits. These whatsits are needed because the exact place of
8466 the \tex{noboundary} command in the input stream has to be retained
8467 until after the ligature and font processing stages.
8469 Finally, there is no longer a \type{main_loop} label in the
8470 code. Remember that \TEX82 did quite a lot of processing while adding
8471 \type{char_nodes} to the horizontal list? For speed reasons, it handled
8472 that processing code outside of the \quote{main control} loop, and only the
8473 first character of any \quote{word} was handled by that \quote{main control} loop.
8474 In \LUATEX, there is no longer a need for that (all hard work is done
8475 later), and the (now very small) bits of character-handling code have
8476 been moved back inline. When \tex{tracingcommands} is on, this is
8477 visible because the full word is reported, instead of just the initial
8478 character.
8481 \section[patternsexceptions]{Loading patterns and exceptions}
8483 The hyphenation algorithm in \LUATEX\ is quite different from the one
8484 in \TEX82, although it uses essentially the same user input.
After expansion, the argument for \tex{patterns} has to be proper
UTF-8 with individual patterns separated by spaces; no \tex{char} or
\tex{chardef}-ed commands are allowed. (The current implementation is
even more strict, and will reject all non|-|\UNICODE\ characters, but
that will be changed in the future. For now, the generated errors are
a valuable tool in discovering font-encoding specific pattern files.)
8493 Likewise, the expanded argument for \tex{hyphenation} also has to be
8494 proper UTF-8, but here a tiny little bit of extra syntax is provided:
8496 \startitemize[n]
\item three sets of arguments in curly braces (\type{{}{}{}})
indicate a desired complex discretionary, with arguments
as in the \tex{discretionary} command in normal document input.
8500 \item \type{-} indicates a desired simple discretionary, cf. \tex{-} and
8501 \type{\discretionary{-}{}{}} in normal document input.
8502 \item Internal command names are ignored. This rule is provided
8503 especially for \tex{discretionary}, but it also helps to deal with
8504 \tex{relax} commands that may sneak in.
8505 \item \type{=} indicates a (non-discretionary) hyphen in the document input.
8506 \stopitemize
8508 The expanded argument is first converted back to a space-separated
8509 string while dropping the internal command names. This string is then
8510 converted into a dictionary by a routine that creates key||value pairs
8511 by converting the other listed items. It is important to note that the
8512 keys in an exception dictionary can always be generated from the
8513 values. Here are a few examples:
8515 \starttabulate[|l|l|l|]
8516 \NC \ssbf value \NC \ssbf implied key (input) \NC \ssbf effect\NC\NR
8517 \NC \type{ta-ble} \NC table \NC \type{ta\-ble}
8518 ($=$ \type{ta\discretionary{-}{}{}ble})\NC\NR
8519 \NC \type{ba{k-}{}{c}ken}\NC backen \NC \type{ba\discretionary{k-}{}{c}ken}\NC\NR
8520 \stoptabulate
8522 The resultant patterns and exception dictionary will be stored under
8523 the language code that is the present value of \tex{language}.
8525 In the last line of the table, you see there is no \tex{discretionary}
8526 command in the value: the command is optional in the \TEX-based input
8527 syntax. The underlying reason for that is that it is conceivable that
8528 a whole dictionary of words is stored as a plain text file and loaded
8529 into \LUATEX\ using one of the functions in the \LUA\ \luatex{lang}
8530 library. This loading method is quite a bit faster than going through
8531 the \TEX\ language primitives, but some (most?) of that speed gain
8532 would be lost if it had to interpret command sequences while doing so.
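
The two exceptions from the table can be loaded either way. This is a sketch only: it assumes that \type{lang.new} called with a language number returns the object for that existing language, and that \type{lang.hyphenation} accepts the same space-separated string as the primitive.

\starttyping
% through the TeX primitive
\hyphenation{ta-ble ba{k-}{}{c}ken}

% through the Lua library, for the current \language
\directlua{
  local l = lang.new(tex.language)
  lang.hyphenation(l, "ta-ble ba{k-}{}{c}ken")
}
\stoptyping
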
8534 Starting with \LUATEX\ 0.63.0, it is possible to specify extra hyphenation
8535 points in compound words by using \type{{-}{}{-}} for the explicit hyphen
8536 character (replace \type{-} by the actual explicit hyphen character if needed).
8537 For example, this matches the word \quote{multi-word-boundaries} and allows
an extra break in between \quote{boun} and \quote{daries}:
8540 \starttyping
8541 \hyphenation{multi{-}{}{-}word{-}{}{-}boun-daries}
8542 \stoptyping
8544 The motivation behind the \ETEX\ extension \tex{savinghyphcodes} was
8545 that hyphenation heavily depended on font encodings. This is no longer
8546 true in \LUATEX, and the corresponding primitive is ignored pending
complete removal. The future semantics of \tex{uppercase} and
\tex{lowercase} are still under consideration; no changes have taken
place yet.
8552 \section{Applying hyphenation}
The internal structures \LUATEX\ uses for the insertion of
discretionaries in words are very different from the ones in \TEX82,
8556 and that means there are some noticeable differences in handling as
8557 well.
8559 First and foremost, there is no \quote{compressed trie} involved in
8560 hyphenation. The algorithm still reads \PATGEN-generated pattern
8561 files, but \LUATEX\ uses a finite state hash to match the patterns
8562 against the word to be hyphenated. This algorithm is based on the
8563 \quote{libhnj} library used by OpenOffice, which in turn is inspired
8564 by \TEX.
8565 The memory allocation for this new implementation is completely
8566 dynamic, so the \WEBC\ setting for \type{trie_size} is ignored.
8568 Differences between \LUATEX\ and \TEX82 that are a direct result of that:
8570 \startitemize
8571 \item \LUATEX\ happily hyphenates the full \UNICODE\ character range.
8572 \item Pattern and exception dictionary size is limited by the
8573 available memory only, all allocations are done dynamically.
8574 The trie-related settings in \type{texmf.cnf} are ignored.
8575 \item Because there is no \quote{trie preparation} stage, language patterns
8576 never become frozen. This means that the primitive \tex{patterns}
8577 (and its \LUA\ counterpart \luatex{lang.patterns}) can be used at any
8578 time, not only in initex.
8579 \item Only the string representation of \tex{patterns} and
8580 \tex{hyphenation} is stored in the format file. At format load time,
8581 they are simply re-evaluated. It follows that there is no real
8582 reason to preload languages in the format file. In fact, it is
8583 usually not a good idea to do so. It is much smarter to load
8584 patterns no sooner than the first time they are actually needed.
8585 \item \LUATEX\ uses the language-specific variables
8586 \tex{prehyphenchar} and \tex{posthyphenchar} in the creation of
8587 implicit discretionaries, instead of \TEX82's \tex{hyphenchar}, and
8588 the values of the language-specific variables \tex{preexhyphenchar} and
8589 \tex{postexhyphenchar} for explicit discretionaries (instead of
8590 \TEX82's empty discretionary).
8591 \stopitemize
8593 Inserted characters and ligatures inherit their attributes from the
8594 nearest glyph node item (usually the preceding one, but the following
8595 one for the items inserted at the left-hand side of a word).
8597 Word boundaries are no longer implied by font switches, but by
8598 language switches. One word can have two separate fonts and still be
hyphenated correctly (but it cannot have two different languages:
the \tex{setlanguage} command forces a word boundary).
8602 All languages start out with \tex{prehyphenchar=`\-},
8603 \tex{posthyphenchar=0}, \tex{preexhyphenchar=0} and
8604 \tex{postexhyphenchar=0}.
When you assign a value to one of these four parameters, you are
actually changing the settings for the current \tex{language}. This
behavior is compatible with \tex{patterns} and \tex{hyphenation}.
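
As a sketch, a language that repeats an explicit hyphen at the start of the next line could be set up as follows. The language number~1 is a hypothetical placeholder.

\starttyping
\language=1             % the assignments below apply to this language
\prehyphenchar=`\-      % the default: a hyphen at the end of the line
\postexhyphenchar=`\-   % after a break at an explicit hyphen, start the
                        % next line with a hyphen as well
\stoptyping
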
8609 \LUATEX\ also hyphenates the first word in a paragraph.
8611 Words can be up to 256 characters long (up from 64 in \TEX82). Longer
8612 words generate an error right now, but eventually either the
8613 limitation will be removed or perhaps it will become possible to
8614 silently ignore the excess characters (this is what happens in \TEX82,
8615 but there the behavior cannot be controlled).
8617 If you are using the \LUA\ function \type{lang.hyphenate}, you should be
8618 aware that this function expects to receive a list of \quote{character}
8619 nodes. It will not operate properly in the presence of \quote{glyph},
8620 \quote{ligature}, or \quote{ghost} nodes, nor does it know how to deal with
8621 kerning. In the near future, it will be able to skip over \quote{ghost}
8622 nodes, and we may add a less fuzzy function you can call as well.
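
One place where the list still consists of \quote{character} nodes is the \type{hyphenate} callback. A minimal sketch, assuming that the callback receives the head and tail of the list and is not expected to return anything:

\starttyping
\directlua{
  callback.register("hyphenate", function(head, tail)
    % additional processing of the list could go here (hypothetical)
    lang.hyphenate(head, tail)
  end)
}
\stoptyping
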
The hyphenation exception dictionary is maintained as a key-value
8625 hash, and that is also dynamic, so the \type{hyph_size} setting is not
8626 used either.
8628 A technical paper detailing the new algorithm will be released as a
8629 separate document.
8631 \section{Applying ligatures and kerning}
8633 After all possible hyphenation points have been inserted in the list,
8634 \LUATEX\ will process the list to convert the \quote{character} nodes into
8635 \quote{glyph} and \quote{ligature} nodes. This is actually done in two stages:
8636 first all ligatures are processed, then all kerning information is
8637 applied to the result list. But those two stages are somewhat
8638 dependent on each other: If the used font makes it possible to do so,
8639 the ligaturing stage adds virtual \quote{character} nodes to the word
8640 boundaries in the list. While doing so, it removes and interprets
8641 \type{noboundary} nodes. The kerning stage deletes those word boundary
8642 items after it is done with them, and it does the same for \quote{ghost}
8643 nodes. Finally, at the end of the kerning stage, all remaining
8644 \quote{character} nodes are converted to \quote{glyph} nodes.
8646 This work separation is worth mentioning because, if you overrule from
8647 \LUA\ only one of the two callbacks related to font handling, then you
8648 have to make sure you perform the tasks normally done by \LUATEX\
8649 itself in order to make sure that the other, non|-|overruled, routine
8650 continues to function properly.
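
A minimal sketch of overruling only the ligaturing stage while still performing the built-in work, so that the kerning stage receives the list in the form it expects. It assumes that the \type{ligaturing} callback receives the head and tail of the list and returns nothing.

\starttyping
\directlua{
  callback.register("ligaturing", function(head, tail)
    % custom processing would go here (hypothetical)
    node.ligaturing(head, tail)   % then run the built-in ligature pass
  end)
}
\stoptyping

The \type{kerning} callback is left alone here, so the default kerning routine still runs afterwards.
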
8652 Work in this area is not yet complete, but most of the possible cases
8653 are handled by our rewritten ligaturing engine. We are working hard to
8654 make sure all of the possible inputs will become supported soon.
8656 For example, take the word \type{office}, hyphenated \type{of-fice},
8657 using a \quote{normal} font with all the \type{f}-\type{f} and
8658 \type{f}-\type{i} type ligatures:
8660 \starttabulate[|l|l|]
8661 \NC Initial: \NC \type{{o}{f}{f}{i}{c}{e}}\NC\NR
8662 \NC After hyphenation: \NC \type{{o}{f}{{-},{},{}}{f}{i}{c}{e}}\NC\NR
8663 \NC First ligature stage: \NC \type{{o}{{f-},{f},{<ff>}}{i}{c}{e}}\NC\NR
8664 \NC Final result: \NC \type{{o}{{f-},{<fi>},{<ffi>}}{c}{e}} \NC\NR
8665 \stoptabulate
8667 That's bad enough, but let us assume that there is also a hyphenation
8668 point between the \type{f} and the \type{i}, to create
8669 \type{of-f-ice}. Then the final result should be:
8671 \starttyping
8672 {o}{{f-},
8673 {{f-},
8674 {i},
8675 {<fi>}},
8676 {{<ff>-},
8677 {i},
8678 {<ffi>}}}{c}{e}
8679 \stoptyping
8681 with discretionaries in the post-break text as well as in the
8682 replacement text of the top-level discretionary that resulted from the
8683 first hyphenation point.
8685 Here is that nested solution again, in a different representation:
8687 \starttabulate[|l|l|l|l|]
8688 \NC \NC pre \NC post \NC replace \NC \NR
8689 \NC topdisc \NC \type{f-}$^1$ \NC sub1 \NC sub2 \NC \NR
8690 \NC sub1 \NC \type{f-}$^2$ \NC \type{i}$^3$ \NC \type{<fi>}$^4$ \NC \NR
8691 \NC sub2 \NC \type{<ff>-}$^5$\NC \type{i}$^6$ \NC \type{<ffi>}$^7$\NC \NR
8692 \stoptabulate
8694 When line breaking is choosing its breakpoints, the following fields will eventually
8695 be selected:
8697 \starttabulate[|l|l|l|]
8698 \NC \type{of-f-ice} \NC \type{f-}$^1$ \NC \NR
8699 \NC \NC \type{f-}$^2$ \NC \NR
8700 \NC \NC \type{i}$^3$ \NC \NR
8701 \NC \type{of-fice} \NC \type{f-}$^1$ \NC \NR
8702 \NC \NC \type{<fi>}$^4$ \NC \NR
8703 \NC \type{off-ice} \NC \type{<ff>-}$^5$ \NC \NR
8704 \NC \NC \type{i}$^6$ \NC \NR
8705 \NC \type{office} \NC \type{<ffi>}$^7$ \NC \NR
8706 \stoptabulate
8708 The current solution in \LUATEX\ is not able to handle nested
8709 discretionaries, but it is in fact smart enough to handle this
8710 fictional \type{of-f-ice} example. It does so by combining two
8711 sequential discretionary nodes as if they were a single object
8712 (where the second discretionary node is treated as an extension
8713 of the first node).
8715 One can observe that the \type{of-f-ice} and \type{off-ice} cases both end
8716 with the same actual post replacement list (\type{i}), and that this
8717 would be the case even if that \type{i} was the first item of a
8718 potential following ligature like \type{ic}. This allows \LUATEX\
to do away with one of the fields, and thus make the whole thing fit
8720 into just two discretionary nodes.
8722 The mapping of the seven list fields to the six fields in this
8723 discretionary node pair is as follows:
8725 \starttabulate[|l|p|]
8726 \NC \bf field \NC \bf description \NC \NR
8727 \NC \type{disc1.pre} \NC \type{f-}$^1$ \NC \NR
8728 \NC \type{disc1.post} \NC \type{<fi>}$^4$ \NC \NR
8729 \NC \type{disc1.replace} \NC \type{<ffi>}$^7$ \NC \NR
8730 \NC \type{disc2.pre} \NC \type{f-}$^2$ \NC \NR
8731 \NC \type{disc2.post} \NC \type{i}$^{3{,}6}$\NC \NR
8732 \NC \type{disc2.replace} \NC \type{<ff>-}$^5$\NC \NR
8733 \stoptabulate
8735 What is actually generated after ligaturing has been applied is
8736 therefore:
8738 \starttyping
8739 {o}{{f-},
8740 {<fi>},
8741 {<ffi>}}
8742 {{f-},
8743 {i},
8744 {<ff>-}}{c}{e}
8745 \stoptyping
8747 The two discretionaries have different subtypes from a discretionary
8748 appearing on its own: the first has subtype 4, and the second has
8749 subtype 5. The need for these special subtypes stems from the fact
8750 that not all of the fields appear in their \quote{normal} location.
8751 The second discretionary especially looks odd, with things like the
8752 \type{<ff>-} appearing in \type{disc2.replace}. The fact that some of
8753 the fields have different meanings (and different processing code
8754 internally) is what makes it necessary to have different subtypes:
8755 this enables \LUATEX\ to distinguish this sequence of two joined
8756 discretionary nodes from the case of two standalone discretionaries
8757 appearing in a row.
8760 \section{Breaking paragraphs into lines}
8762 This code is still almost unchanged, but because of the
8763 above|-|mentioned changes with respect to discretionaries and ligatures,
8764 line breaking will potentially be different from traditional \TEX.
8765 The actual line breaking code is still based on the \TEX82 algorithms,
8766 and it does not expect there to be discretionaries inside of
8767 discretionaries.
8769 But that situation is now fairly common in \LUATEX, due to the changes
8770 to the ligaturing mechanism. And also, the \LUATEX\ discretionary
nodes are implemented slightly differently from the \TEX82 nodes: the
8772 \type{no_break} text is now embedded inside the disc node, where
8773 previously these nodes kept their place in the horizontal list (the
8774 discretionary node contained a counter indicating how many nodes to
8775 skip).
8777 The combined effect of these two differences is that \LUATEX\ does not
8778 always use all of the potential breakpoints in a paragraph, especially
8779 when fonts with many ligatures are used.
8781 % TODO:
8782 % Check \sfcode handling
8783 % Implement \glyph
8785 % Remove \savinghyphcodes
8786 % Allow non-UCS characters in \patterns
8788 \chapter[fonts]{Font structure}
8790 All \TEX\ fonts are represented to \LUA\ code as tables, and
8791 internally as C~structures. All keys in the table below are saved in
8792 the internal font structure if they are present in the table returned
8793 by the
8794 \luatex{define_font} callback, or if they result from the normal \TFM|/|\VF\
8795 reading routines if there is no \luatex{define_font} callback defined.
8797 The column \quote{from \VF} means that this key will be created by the
8798 \luatex{font.read_vf()} routine, \quote{from \TFM} means that the key will be created
8799 by the \luatex{font.read_tfm()} routine, and \quote{used} means whether or not the
8800 \LUATEX\ engine itself will do something with the key.
8802 The top|-|level keys in the table are as follows:
8804 \starttabulate[|Tl|l|l|l|l|p|]
8805 \NC \ssbf key \NC \bf from vf \NC \bf from tfm \NC \bf used\NC \bf value type \NC \bf description \NC\NR
8806 \NC name \NC yes \NC yes \NC yes \NC string \NC metric (file) name\NC\NR
8807 \NC area \NC no \NC yes \NC yes \NC string \NC (directory) location, typically empty\NC\NR
8808 \NC used \NC no \NC yes \NC yes \NC boolean\NC used already? (initial: false)\NC \NR
8809 \NC characters \NC yes \NC yes \NC yes \NC table \NC the defined glyphs of this font \NC \NR
8810 \NC checksum \NC yes \NC yes \NC no \NC number \NC default: 0 \NC \NR
8811 \NC designsize \NC no \NC yes \NC yes \NC number \NC expected size (default: 655360 == 10pt) \NC \NR
8812 \NC direction \NC no \NC yes \NC yes \NC number \NC default: 0 (TLT) \NC \NR
8813 \NC encodingbytes \NC no \NC no \NC yes \NC number \NC default: depends on \type {format}\NC\NR
8814 \NC encodingname \NC no \NC no \NC yes \NC string \NC encoding name\NC\NR
8815 \NC fonts \NC yes \NC no \NC yes \NC table \NC locally used fonts\NC \NR
8816 \NC psname \NC no \NC no \NC yes \NC string
8817 \NC actual (\POSTSCRIPT) name (this is the PS fontname in the
8818 incoming font source, also used as fontname identifier in the \PDF\ output, new in 0.43)\NC\NR
8819 \NC fullname \NC no \NC no \NC yes \NC string \NC output font name, used as a fallback in the \PDF\ output if the psname is not set\NC\NR
8820 \NC header \NC yes \NC no \NC no \NC string \NC header comments, if any\NC \NR
8821 \NC hyphenchar \NC no \NC no \NC yes \NC number \NC default: TeX's \tex{hyphenchar} \NC \NR
8822 \NC parameters \NC no \NC yes \NC yes \NC hash \NC default: 7 parameters, all zero \NC \NR
8823 \NC size \NC no \NC yes \NC yes \NC number \NC loaded (at) size. (default: same as designsize) \NC \NR
8824 \NC skewchar \NC no \NC no \NC yes \NC number \NC default: TeX's \tex{skewchar} \NC \NR
8825 \NC type \NC yes \NC no \NC yes \NC string \NC basic type of this font\NC \NR
8826 \NC format \NC no \NC no \NC yes \NC string \NC disk format type\NC \NR
8827 \NC embedding \NC no \NC no \NC yes \NC string \NC \PDF\ inclusion\NC \NR
8828 \NC filename \NC no \NC no \NC yes \NC string \NC disk file name\NC\NR
8829 \NC tounicode \NC no \NC yes \NC yes \NC number \NC if 1, \LUATEX\ assumes per-glyph tounicode entries are
8830 present in the font\NC\NR
8831 \NC stretch \NC no \NC no \NC yes \NC number \NC the \quote {stretch} value from \tex{pdffontexpand}\NC\NR
8832 \NC shrink \NC no \NC no \NC yes \NC number \NC the \quote {shrink} value from \tex{pdffontexpand}\NC\NR
8833 \NC step \NC no \NC no \NC yes \NC number \NC the \quote {step} value from \tex{pdffontexpand}\NC\NR
8834 \NC auto_expand \NC no \NC no \NC yes \NC boolean\NC the \quote {autoexpand} keyword from\crlf \tex{pdffontexpand}\NC\NR
8835 \NC expansion_factor \NC no \NC no \NC no \NC number \NC the actual expansion factor of an expanded font\NC\NR
8836 \NC attributes \NC no \NC no \NC yes \NC string \NC the \tex{pdffontattr}\NC\NR
8837 \NC cache \NC no \NC no \NC yes \NC string \NC this key controls caching of the lua table on the \type{tex}
8838 end. \type{yes}: use a reference to the table that is
8839 passed to \LUATEX\ (this is the default). \type{no}: don't store the table
8840 reference, don't cache any lua data for this font.
8841 \type{renew}: don't store the table reference, but
8842 save a reference to the table that is created at the
8843 first access to one of its fields in font.fonts.
8844 (new in 0.40.0, before that caching was always \type{yes}).
8845 Note: the saved reference is thread-local, so be careful when you are using coroutines: an error will be thrown if the table
8846 has been cached in one thread, but you reference it from another thread ($\approx$ coroutine)\NC\NR
\NC nomath \NC no \NC no \NC yes \NC boolean\NC this key allows a minor speedup for text fonts. If it is
present and true, then \LUATEX\ will not check the
character entries for math-specific keys. (0.42.0)\NC\NR
8850 \NC slant \NC no \NC no \NC yes \NC number \NC This has the same semantics as the \type{SlantFont} operator
8851 in font map files. (0.47.0)\NC\NR
\NC extend \NC no \NC no \NC yes \NC number \NC This has the same semantics as the \type{ExtendFont} operator
in font map files. (0.50.0)\NC\NR
8854 \stoptabulate
The key \type{name} is always required. The keys \type{stretch},
\type{shrink}, \type{step} and optionally \type{auto_expand} only
have meaning when used together: they can be used to replace a
post-loading \tex{pdffontexpand} command. The
\type{expansion_factor} is a value that can be present inside a font
in \type{font.fonts}. It is the actual expansion factor (a value
between \type{-shrink} and \type{stretch}, with step \type{step})
of a font that was automatically generated by the font expansion
algorithm. The key \type{attributes} can be used to replace
\tex{pdffontattr}. The key \type{used} is set by the engine when a
font is actively in use; this makes sure that the font's
definition is written to the output file (\DVI\ or \PDF). The
\TFM\ reader sets it to false. The \type{direction} is a number
signalling the \quote{normal} direction for this font. There are
sixteen possibilities:
8872 \starttabulate[|Tc|c|c|c|]
8873 \NC \ssbf number \NC \bf meaning \NC \bf number \NC \bf meaning \NC\NR
8874 \NC 0 \NC LT \NC 8 \NC TT \NC\NR
8875 \NC 1 \NC LL \NC 9 \NC TL \NC\NR
8876 \NC 2 \NC LB \NC 10 \NC TB \NC\NR
8877 \NC 3 \NC LR \NC 11 \NC TR \NC\NR
8878 \NC 4 \NC RT \NC 12 \NC BT \NC\NR
8879 \NC 5 \NC RL \NC 13 \NC BL \NC\NR
8880 \NC 6 \NC RB \NC 14 \NC BB \NC\NR
8881 \NC 7 \NC RR \NC 15 \NC BR \NC\NR
8882 \stoptabulate
8884 These are \OMEGA|-|style direction abbreviations: the first character
8885 indicates the \quote{first} edge of the character glyphs (the edge that is
8886 seen first in the writing direction), the second the \quote{top} side.
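
The numbering in the table above is regular: the first letter cycles through
L, R, T, B once every four entries, and the second letter cycles through
T, L, B, R. The following \LUA\ sketch decodes a direction number on that
basis. The helper name \type{direction_name} is made up for this illustration
and is not part of \LUATEX.

\starttyping
-- Sketch only: direction_name is a hypothetical helper, not a LuaTeX function.
local first  = { [0] = 'L', 'R', 'T', 'B' } -- selected by number div 4
local second = { [0] = 'T', 'L', 'B', 'R' } -- selected by number mod 4

local function direction_name(n)
    assert(type(n) == 'number' and n >= 0 and n <= 15 and n == math.floor(n),
        'direction must be an integer in the range 0..15')
    return first[math.floor(n / 4)] .. second[n % 4]
end

print(direction_name(0))  -- LT
print(direction_name(13)) -- BL
\stoptyping
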
8888 The \type{parameters} is a hash with mixed key types. There are seven
8889 possible string keys, as well as a number of integer indices (these
8890 start from 8 up). The seven strings are actually used instead of the
8891 bottom seven indices, because that gives a nicer user interface.
8893 The names and their internal remapping are:
8895 \starttabulate[|lT|c|]
8896 \NC \ssbf name \NC \bf internal remapped number \NC\NR
8897 \NC slant \NC 1 \NC\NR
8898 \NC space \NC 2 \NC\NR
8899 \NC space_stretch \NC 3 \NC\NR
8900 \NC space_shrink \NC 4 \NC\NR
8901 \NC x_height \NC 5 \NC\NR
8902 \NC quad \NC 6 \NC\NR
8903 \NC extra_space \NC 7 \NC\LR
8904 \stoptabulate
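
As a sketch of how the two key styles mix, the table below uses the seven
named keys plus one extra parameter at index~8. The values are placeholders in
scaled points (roughly what one would expect for a 10pt text font), the extra
parameter is invented, and the helper \type{to_array} is hypothetical: it only
illustrates the remapping listed above.

\starttyping
-- Placeholder values in scaled points; index 8 is a made-up extra parameter.
local parameters = {
    slant         = 0,
    space         = 218453,
    space_stretch = 109226,
    space_shrink  = 72818,
    x_height      = 282168,
    quad          = 655361,
    extra_space   = 72818,
    [8]           = 100000,
}

-- The internal remapping from the table above.
local remap = {
    slant = 1, space = 2, space_stretch = 3, space_shrink = 4,
    x_height = 5, quad = 6, extra_space = 7,
}

-- Hypothetical helper: return the parameters as an array indexed by number.
local function to_array(p)
    local result = { }
    for key, value in pairs(p) do
        local index = remap[key] or key
        if type(index) == 'number' then -- skip unknown string keys
            result[index] = value
        end
    end
    return result
end

local a = to_array(parameters)
print(a[2], a[6], a[8])
\stoptyping
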
The keys \type{type}, \type{format}, \type{embedding}, \type{fullname} and
\type{filename} are used to embed \OPENTYPE\ fonts in the resulting \PDF.
8909 The \type{characters} table is a list of character hashes indexed by
8910 an integer number. The number is the \quote{internal code} \TEX\ knows this
8911 character by.
8913 Two very special string indexes can be used also: \type{left_boundary} is a
8914 virtual character whose ligatures and kerns are used to handle word
8915 boundary processing. \type{right_boundary} is similar but not actually
8916 used for anything (yet!).
8918 Other index keys are ignored.
Each entry in the \type{characters} table is itself a hash. For example, here is the
character \quote{f} (decimal 102) in the font cmr10 at 10 points:
\starttyping
[102] = {
  ['width'] = 200250,
  ['height'] = 455111,
  ['depth'] = 0,
  ['italic'] = 50973,
  ['kerns'] = {
    [63] = 50973,
    [93] = 50973,
    [39] = 50973,
    [33] = 50973,
    [41] = 50973
  },
  ['ligatures'] = {
    [102] = {
      ['char'] = 11,
      ['type'] = 0
    },
    [108] = {
      ['char'] = 13,
      ['type'] = 0
    },
    [105] = {
      ['char'] = 12,
      ['type'] = 0
    }
  }
}
\stoptyping
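
The dimensions in such a hash are in scaled points, and 65536 scaled points
make one point. As a small sketch (the helper name \type{sp_to_pt} is ours,
not part of \LUATEX), the width of the \quote{f} above can be converted as follows:

\starttyping
-- Hypothetical helper: convert scaled points to points (65536 sp = 1 pt).
local function sp_to_pt(sp)
    return sp / 65536
end

print(string.format('%.3fpt', sp_to_pt(200250))) -- 3.056pt
\stoptyping
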
8953 The following top|-|level keys can be present inside a character hash:
8955 \starttabulate[|lT|c|c|c|l|p|]
8956 \NC \ssbf key \NC \bf from vf \NC \bf from tfm \NC \bf used \NC \bf value type \NC \bf description \NC\NR
8957 \NC width \NC yes \NC yes \NC yes \NC number \NC character's width, in sp (default 0) \NC\NR
8958 \NC height \NC no \NC yes \NC yes \NC number \NC character's height, in sp (default 0) \NC\NR
8959 \NC depth \NC no \NC yes \NC yes \NC number \NC character's depth, in sp (default 0) \NC\NR
8960 \NC italic \NC no \NC yes \NC yes \NC number \NC character's italic correction, in sp (default zero) \NC\NR
8961 \NC top_accent \NC no \NC no \NC maybe \NC number \NC character's top accent alignment place, in sp (default zero) \NC\NR
8962 \NC bot_accent \NC no \NC no \NC maybe \NC number \NC character's bottom accent alignment place, in sp (default zero) \NC\NR
8963 \NC left_protruding \NC no \NC no \NC maybe \NC number \NC character's \tex{lpcode}\NC\NR
8964 \NC right_protruding \NC no \NC no \NC maybe \NC number \NC character's \tex{rpcode}\NC\NR
8965 \NC expansion_factor \NC no \NC no \NC maybe \NC number \NC character's \tex{efcode}\NC\NR
8966 \NC tounicode \NC no \NC no \NC maybe \NC string \NC character's Unicode equivalent(s), in UTF-16BE hexadecimal format\NC\NR
8967 \NC next \NC no \NC yes \NC yes \NC number \NC the \quote{next larger} character index \NC\NR
8968 \NC extensible \NC no \NC yes \NC yes \NC table \NC the constituent parts of an extensible recipe \NC\NR
8969 \NC vert_variants \NC no \NC no \NC yes \NC table \NC constituent parts of a vertical variant set\NC \NR
8970 \NC horiz_variants\NC no \NC no \NC yes \NC table \NC constituent parts of a horizontal variant set\NC \NR
8971 \NC kerns \NC no \NC yes \NC yes \NC table \NC kerning information \NC\NR
8972 \NC ligatures \NC no \NC yes \NC yes \NC table \NC ligaturing information \NC\NR
8973 \NC commands \NC yes \NC no \NC yes \NC array \NC virtual font commands \NC\NR
8974 \NC name \NC no \NC no \NC no \NC string \NC the character (\POSTSCRIPT) name \NC\NR
8975 \NC index \NC no \NC no \NC yes \NC number \NC the (\OPENTYPE\ or \TRUETYPE) font glyph index \NC\NR
8976 \NC used \NC no \NC yes \NC yes \NC boolean \NC typeset already (default: false)? \NC\NR
8977 \NC mathkern \NC no \NC no \NC yes \NC table \NC math cut-in specifications \NC\NR
8978 \stoptabulate
8980 The values of \type{top_accent}, \type{bot_accent} and \type{mathkern} are used only for math
8981 accent and superscript placement, see the \at{math chapter}[math] in this manual for details.
8983 The values of \type{left_protruding} and \type{right_protruding} are used only when
8984 \tex{pdfprotrudechars} is non-zero.
8986 Whether or not \type{expansion_factor} is used depends on the font's global expansion
8987 settings, as well as on the value of \tex{pdfadjustspacing}.
8989 The usage of \type{tounicode} is this: if this font specifies a \type{tounicode=1} at
8990 the top level, then \LUATEX\ will construct a \type{/ToUnicode} entry for the \PDF\
8991 font (or font subset) based on the character-level \type{tounicode} strings, where
8992 they are available. If a character does not have a sensible \UNICODE\ equivalent,
8993 do not provide a string either (no empty strings).
8995 If the font-level \type{tounicode} is not set, then \LUATEX\ will build up
8996 \type{/ToUnicode} based on the \TEX\ code points you used, and any character-level
8997 \type{tounicodes} will be ignored. {\it At the moment, the string format is exactly the
8998 format that is expected by Adobe \CMAP\ files (\UTF-16BE in hexadecimal encoding), minus
8999 the enclosing angle brackets. This may change in the future.} Small example: the
9000 \type{tounicode} for a \type{fi} ligature would be \type{00660069}.
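
A character|-|level \type{tounicode} string can be produced from \UNICODE\ code
points with a few lines of \LUA. The helper below is a sketch and not part of
\LUATEX; its name is our own. It follows the format described above (\UTF-16BE in
hexadecimal, no angle brackets) and uses surrogate pairs for code points above
\type{0xFFFF}.

\starttyping
-- Hypothetical helper: build a tounicode string from Unicode code points.
local function tounicode(...)
    local parts = { }
    for _, cp in ipairs({ ... }) do
        if cp < 0x10000 then
            parts[#parts + 1] = string.format('%04X', cp)
        else
            -- split into a high and a low surrogate
            local v    = cp - 0x10000
            local high = 0xD800 + math.floor(v / 0x400)
            local low  = 0xDC00 + v % 0x400
            parts[#parts + 1] = string.format('%04X%04X', high, low)
        end
    end
    return table.concat(parts)
end

print(tounicode(0x66, 0x69)) -- 00660069, the fi ligature from the text
\stoptyping
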
The presence of \type{extensible} will overrule \type{next}, if that is also present.
It can in turn be overruled by \type{vert_variants}.
9005 The \type{extensible} table is very simple:
9007 \starttabulate[|lT|l|p|]
9008 \NC \ssbf key \NC \bf type \NC \bf description \NC\NR
9009 \NC top \NC number \NC \quote{top} character index \NC\NR
9010 \NC mid \NC number \NC \quote{middle} character index \NC\NR
9011 \NC bot \NC number \NC \quote{bottom} character index \NC\NR
9012 \NC rep \NC number \NC \quote{repeatable} character index \NC\NR
9013 \stoptabulate
9015 The \type{horiz_variants} and \type{vert_variants} are arrays of components. Each of those
9016 components is itself a hash of up to five keys:
9018 \starttabulate[|lT|l|p|]
9019 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
9020 \NC glyph \NC number \NC The character index (note that this is an encoding number, not a name).\NC \NR
9021 \NC extender \NC number \NC One (1) if this part is repeatable, zero (0) otherwise.\NC \NR
9022 \NC start \NC number \NC Maximum overlap at the starting side (in scaled points).\NC \NR
9023 \NC end \NC number \NC Maximum overlap at the ending side (in scaled points).\NC \NR
\NC advance \NC number \NC Total advance width of this item (can be zero or missing,
then the natural size of the glyph for character \type{glyph}
is used).\NC \NR
9027 \stoptabulate
The \type{kerns} table is a hash indexed by character index (and
\quote{character index} is defined as either a non|-|negative integer or the
string value \type {right_boundary}), with the values being the kerning to be
applied, in scaled points.
9034 The \type{ligatures} table is a hash indexed by character index (and
9035 \quote{character index} is defined as either a non|-|negative integer or the
9036 string value \type {right_boundary}), with the values being yet another small
9037 hash, with two fields:
9039 \starttabulate[|lT|l|p|]
9040 \NC \ssbf key \NC \bf type \NC \bf description \NC\NR
9041 \NC type \NC number \NC the type of this ligature command, default 0 \NC\NR
9042 \NC char \NC number \NC the character index of the resultant ligature \NC\NR
9043 \stoptabulate
9045 The \type{char} field in a ligature is required.
9047 The \type{type} field inside a ligature is the numerical or string value of one of the eight
9048 possible ligature types supported by \TEX. When \TEX\ inserts a new ligature, it puts the new
9049 glyph in the middle of the left and right glyphs. The original left and right glyphs can
9050 optionally be retained, and when at least one of them is kept, it is also possible to move the
9051 new \quote{insertion point} forward one or two places. The glyph that ends up to the right of the
9052 insertion point will become the next \quote{left}.
9054 \starttabulate[|l|c|l|l|]
\NC \bf textual (Knuth) \NC \bf number \NC \bf string \NC \bf result \NC\NR
9056 \NC l + r =: n \NC 0 \NC \type{=:} \NC \|n \NC\NR
9057 \NC l + r =:\| n \NC 1 \NC \type{=:|} \NC \|nr \NC\NR
9058 \NC l + r \|=: n \NC 2 \NC \type{|=:} \NC \|ln \NC\NR
9059 \NC l + r \|=:\| n \NC 3 \NC \type{|=:|} \NC \|lnr \NC\NR
9060 \NC l + r =:\|\> n \NC 5 \NC \type{=:|>} \NC n\|r \NC\NR
9061 \NC l + r \|=:\> n \NC 6 \NC \type{|=:>} \NC l\|n \NC\NR
9062 \NC l + r \|=:\|\> n \NC 7 \NC \type{|=:|>} \NC l\|nr \NC\NR
9063 \NC l + r \|=:\|\>\> n \NC 11 \NC \type{|=:|>>} \NC ln\|r \NC\NR
9064 \stoptabulate
9066 The default value is~0, and can be left out. That signifies a \quote{normal}
9067 ligature where the ligature replaces both original glyphs. In this table
9068 the~\| indicates the final insertion point.
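
Reading the table, the type number can be seen as a sum: 1 if the right glyph
is kept, 2 if the left glyph is kept, and 4 for every place the insertion
point moves forward. The sketch below decodes a type number on that basis and
returns the resulting glyph list together with the number of glyphs that
precede the final insertion point. It only models the table; it is not the
engine's own code, and the function name is invented.

\starttyping
-- Hypothetical model of the ligature types listed above.
-- l, r: the original left and right glyphs; n: the new ligature glyph.
local function apply_ligature(l, r, n, ligtype)
    ligtype = ligtype or 0 -- type 0 is the default
    local keep_right = ligtype % 2 == 1
    local keep_left  = math.floor(ligtype / 2) % 2 == 1
    local skip       = math.floor(ligtype / 4)
    local result = { }
    if keep_left then
        result[#result + 1] = l
    end
    result[#result + 1] = n
    if keep_right then
        result[#result + 1] = r
    end
    return result, skip
end

-- f + i => fi (slot 12 in cmr10), a normal ligature of type 0
local glyphs, point = apply_ligature(102, 105, 12)
print(#glyphs, glyphs[1], point)
\stoptyping
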
9070 The \type{commands} array is explained below.
9072 \section {Real fonts}
9074 Whether or not a \TEX\ font is a \quote{real} font that should be written to
9075 the \PDF\ document is decided by the \type{type} value in the top|-|level
9076 font structure. If the value is \type{real}, then this is a proper
9077 font, and the inclusion mechanism will attempt to add the needed
9078 font object definitions to the \PDF.
9080 Values for \type{type}:
9082 \starttabulate[|Tl|p|]
9083 \NC \ssbf value \NC \bf description \NC\NR
9084 \NC real \NC this is a base font \NC\NR
9085 \NC virtual \NC this is a virtual font \NC\NR
9086 \stoptabulate
9088 The actions to be taken depend on a number of different variables:
9090 \startitemize[packed]
9091 \item Whether the used font fits in an 8-bit encoding scheme or not
9092 \item The type of the disk font file
9093 \item The level of embedding requested
9094 \stopitemize
9096 A font that uses anything other than an 8-bit encoding vector has to
9097 be written to the \PDF\ in a different way.
9099 The rule is: if the font table has \type {encodingbytes} set to~2,
9100 then this is a wide font, in all other cases it isn't. The value~2 is
9101 the default for \OPENTYPE\ and \TRUETYPE\ fonts loaded via \LUA. For
9102 \TYPEONE\ fonts, you have to set \type {encodingbytes} to~2
9103 explicitly. For \PK\ bitmap fonts, wide font encoding is not
9104 supported at all.
9106 If no special care is needed, \LUATEX\ currently falls back to the
9107 mapfile|-|based solution used by \PDFTEX\ and \DVIPS. This behavior
9108 will be removed in the future, when the existing code becomes
9109 integrated in the new subsystem.
9111 But if this is a \quote{wide} font, then the new subsystem kicks in, and
9112 some extra fields have to be present in the font structure. In this
9113 case, \LUATEX\ does not use a map file at all.
9115 The extra fields are: \type{format}, \type{embedding}, \type{fullname},
9116 \type{cidinfo} (as explained above), \type{filename}, and the
9117 \type{index} key in the separate characters.
9119 Values for \type{format} are:
9121 \starttabulate[|Tl|p|]
9122 \NC \ssbf value \NC \bf description \NC\NR
9123 \NC type1 \NC this is a \POSTSCRIPT\ \TYPEONE\ font \NC\NR
9124 \NC type3 \NC this is a bitmapped (\PK) font \NC\NR
9125 \NC truetype \NC this is a \TRUETYPE\ or \TRUETYPE|-|based \OPENTYPE\ font \NC\NR
9126 \NC opentype \NC this is a \POSTSCRIPT|-|based \OPENTYPE\ font \NC\NR
9127 \stoptabulate
9129 (\type{type3} fonts are provided for backward compatibility only, and do not
9130 support the new wide encoding options.)
9132 Values for \type{embedding} are:
9134 \starttabulate[|Tl|p|]
9135 \NC \ssbf value \NC \bf description \NC\NR
9136 \NC no \NC don't embed the font at all \NC\NR
\NC subset \NC include and attempt to subset the font \NC\NR
9138 \NC full \NC include this font in its entirety \NC\NR
9139 \stoptabulate
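
The requirements for a wide font can be checked before the table is handed to
\LUATEX. The following sketch does so for the documented values of
\type{format} and \type{embedding} and for the presence of \type{fullname},
\type{filename} and the per|-|character \type{index}. The checker and the
sample font are invented for this example, and \type{cidinfo} is left out
because its structure is not described in this section.

\starttyping
-- Hypothetical checker, not part of LuaTeX: list the extra wide-font
-- fields that are missing or hold an undocumented value.
local formats    = { type1 = true, type3 = true, truetype = true, opentype = true }
local embeddings = { no = true, subset = true, full = true }

local function check_wide_font(f)
    local problems = { }
    if f.encodingbytes ~= 2 then
        return problems -- not a wide font, the extra fields are not needed
    end
    if not formats[f.format] or f.format == 'type3' then
        problems[#problems + 1] = 'format' -- type3 has no wide encoding
    end
    if not embeddings[f.embedding] then
        problems[#problems + 1] = 'embedding'
    end
    for _, key in ipairs({ 'fullname', 'filename' }) do
        if type(f[key]) ~= 'string' then
            problems[#problems + 1] = key
        end
    end
    for slot, character in pairs(f.characters or { }) do
        if type(slot) == 'number' and type(character.index) ~= 'number' then
            problems[#problems + 1] = 'characters[' .. slot .. '].index'
        end
    end
    return problems
end

-- This font table is invented for the example.
local example = {
    name = 'example', encodingbytes = 2, format = 'opentype',
    embedding = 'subset', fullname = 'Example-Regular',
    filename = 'example.otf',
    characters = { [65] = { width = 400000, index = 36 } },
}
print(#check_wide_font(example)) -- 0
\stoptyping
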
9141 It is not possible to artificially modify the transformation matrix
9142 for the font at the moment.
9144 The other fields are used as follows: The \type{fullname} will be the
9145 \POSTSCRIPT|/|\PDF\ font name. The \type{cidinfo} will be used as the
9146 character set (the CID \type{/Ordering} and \type{/Registry} keys). The
\type{filename} points to the actual font file. If you include the
full path in the \type{filename} or if the file is in the local
directory, \LUATEX\ will run a little bit more efficiently because it
will not have to re|-|run the \type{find_xxx_file} callback in that
case.
9153 Be careful: when mixing old and new fonts in one document, it is possible to
9154 create \POSTSCRIPT\ name clashes that can result in printing
9155 errors. When this happens, you have to change the \type{fullname}
9156 of the font.
9158 Typeset strings are written out in a wide format using 2~bytes per
9159 glyph, using the \type{index} key in the character information as
9160 value. The overall effect is like having an encoding based on numbers
9161 instead of traditional (\POSTSCRIPT) name|-|based reencoding. The way
9162 to get the correct \type{index} numbers for \TYPEONE\ fonts is by
9163 loading the font via \type{fontloader.open}; use the table indices as
9164 \type{index} fields.
9166 This type of reencoding means that there is no longer a clear
9167 connection between the text in your input file and the strings in the
9168 output \PDF\ file. Dealing with this is high on the agenda.
9170 \section[virtualfonts]{Virtual fonts}
9172 You have to take the following steps if you want \LUATEX\ to treat the
9173 returned table from \luatex{define_font} as a virtual font:
9175 \startitemize[packed]
9176 \item Set the top|-|level key \type {type} to \type {virtual}.
9177 \item Make sure there is at least one valid entry in \luatex{fonts} (see below).
9178 \item Give a \type {commands} array to every character (see below).
9179 \stopitemize
The presence of the top|-|level \type {type} key with the specific value
\type {virtual} will trigger handling of the rest of the special virtual
font fields in the table, but the mere existence of \type {type} is enough
to prevent \LUATEX\ from looking for a virtual font on its own.
9186 Therefore, this also works \quote{in reverse}: if you are absolutely certain
9187 that a font is not a virtual font, assigning the value \type{base} or
9188 \type{real} to \type{type} will inhibit \LUATEX\ from looking for a virtual font
9189 file, thereby saving you a disk search.
The \luatex{fonts} is another \LUA\ array. The values are one- or two|-|key
hashes themselves, each entry indicating one of the base fonts in a
virtual font. In case your font is referring to itself, you can use the
\type {font.nextid()} function, which returns the index of the next font to be
defined; that is probably the font you are currently defining.
An example makes this easy to understand:

\starttyping
fonts = {
  { name = 'ptmr8a', size = 655360 },
  { name = 'psyr', size = 600000 },
  { id = 38 }
}
\stoptyping

This says that the first referenced font (index 1) in this virtual font is
\type{ptmr8a} loaded at 10pt, and the second is \type{psyr} loaded
at a little over 9pt. The third one is a previously defined font that
is known to \LUATEX\ as fontid \quote{38}.
9212 The array index numbers are used by the character command definitions
9213 that are part of each character.
The \luatex{commands} array is a list in which each item is another small array, with the first
entry representing a command and the extra items being the parameters to that command. The
allowed commands and their arguments are:
9219 \starttabulate[|Tl|l|l|p|]
9220 \NC \ssbf command name \NC \bf arguments \NC \bf arg type \NC \bf description \NC\NR
9221 \NC font \NC 1 \NC number \NC select a new font from the local \luatex{fonts} table\NC\NR
9222 \NC char \NC 1 \NC number \NC typeset this character number from the current font,
9223 and move right by the character's width\NC\NR
9224 \NC node \NC 1 \NC node \NC output this node (list), and move right
9225 by the width of this list\NC\NR
9226 \NC slot \NC 2 \NC number \NC a shortcut for the combination of a font and char command\NC\NR
9227 \NC push \NC 0 \NC \NC save current position\NC\NR
9228 \NC nop \NC 0 \NC \NC do nothing \NC\NR
9229 \NC pop \NC 0 \NC \NC pop position \NC\NR
9230 \NC rule \NC 2 \NC 2 numbers \NC output a rule $ht*wd$, and move right.\NC\NR
9231 \NC down \NC 1 \NC number \NC move down on the page\NC\NR
9232 \NC right \NC 1 \NC number \NC move right on the page\NC\NR
9233 \NC special \NC 1 \NC string \NC output a \tex{special} command\NC\NR
9234 \NC lua \NC 1 \NC string \NC execute a \LUA\ script (at \tex{latelua} time)\NC\NR
9235 \NC image \NC 1 \NC image \NC output an image (the argument can be either an \type{<image>}
9236 variable or an \type{image_spec} table)\NC\NR
9237 \NC comment \NC any \NC any \NC the arguments of this command are ignored\NC\NR
9238 \stoptabulate
9240 Here is a rather elaborate glyph commands example:
\starttyping
commands = {
    {'push'},                      -- remember where we are
    {'right', 5000},               -- move right about 0.08pt
    {'font', 3},                   -- select the fonts[3] entry
    {'char', 97},                  -- place character 97 (ASCII 'a')
    {'pop'},                       -- go all the way back
    {'down', -200000},             -- move upwards by about 3pt
    {'special', 'pdf: 1 0 0 rg'},  -- switch to red color
    {'rule', 500000, 20000},       -- draw a bar
    {'special', 'pdf: 0 g'}        -- back to black
}
\stoptyping
The default value for \type {font} is always~1 at the start of the \type{commands} array.
Therefore, if the virtual font is essentially only a re|-|encoding, then you usually do not
have to create an explicit \quote{font} command in the array.
9261 Rules inside of \type{commands} arrays are built up using only two dimensions:
9262 they do not have depth. For correct vertical placement, an extra \type{down} command
9263 may be needed.
9265 Regardless of the amount of movement you create within the \type {commands},
9266 the output pointer will always move by exactly the width that was given in
9267 the \type {width} key of the character hash. Any movements that take place
9268 inside the \type{commands} array are ignored on the upper level.
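
To see what the movements inside a \type{commands} array add up to, the
position can be traced with a small model. The function below is a simplified
sketch, not the engine's implementation: it handles only the positioning
commands from the table above (leaving out \type{node}, \type{slot} and
\type{image}) and expects a hypothetical \type{widths} table that maps
character numbers to widths in scaled points.

\starttyping
-- Simplified model: follow the horizontal (h) and vertical (v) position.
local function trace(commands, widths)
    local h, v, stack = 0, 0, { }
    for _, c in ipairs(commands) do
        local name = c[1]
        if name == 'push' then
            stack[#stack + 1] = { h, v }
        elseif name == 'pop' then
            local saved = assert(table.remove(stack), 'pop without push')
            h, v = saved[1], saved[2]
        elseif name == 'right' then
            h = h + c[2]
        elseif name == 'down' then
            v = v + c[2]
        elseif name == 'char' then
            h = h + (widths[c[2]] or 0)
        elseif name == 'rule' then
            h = h + c[3] -- the arguments are the height, then the width
        end
        -- font, special, lua, nop and comment leave the position alone
    end
    return h, v
end

local h, v = trace({
    {'push'}, {'right', 5000}, {'font', 3}, {'char', 97}, {'pop'},
    {'down', -200000}, {'rule', 500000, 20000},
}, { [97] = 327681 }) -- the width of character 97 is a placeholder
print(h, v)
\stoptyping

Whatever values such a trace ends on, the engine advances by the \type{width}
key of the character hash, as described above.
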
9270 \subsection{Artificial fonts}
9272 Even in a \quote{real} font, there can be virtual characters. When \LUATEX\ encounters a \type {commands}
9273 field inside a character when it becomes time to typeset the character, it will interpret the commands, just
9274 like for a true virtual character. In this case, if you have created no \quote{fonts} array, then the default
9275 (and only) \quote{base} font is taken to be the current font itself. In practice, this means that you can
9276 create virtual duplicates of existing characters which is useful if you want to create composite characters.
Note: this feature does {\it not\/} work the other way around. There cannot be \quote{real} characters in a
virtual font! You cannot use this technique for font re-encoding either; you need a truly virtual
font for that (because characters that are already present cannot be altered).
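
A virtual duplicate of an existing character could be added along the following
lines. This is a sketch that works on a plain font table; the helper name and
the slot numbers are assumptions made for the example. Because no \type{fonts}
array is involved, the \type{char} command refers to the font itself.

\starttyping
-- Hypothetical helper: add a virtual copy of an existing character
-- under a new, still unused slot.
local function add_duplicate(f, new_slot, old_slot)
    local old = assert(f.characters[old_slot], 'source character is missing')
    assert(f.characters[new_slot] == nil, 'existing characters cannot be altered')
    f.characters[new_slot] = {
        width    = old.width,
        height   = old.height,
        depth    = old.depth,
        commands = { { 'char', old_slot } },
    }
    return f.characters[new_slot]
end

-- A minimal stand-in for a font table, using the 'f' from cmr10.
local f = { characters = { [102] = { width = 200250, height = 455111, depth = 0 } } }
local copy = add_duplicate(f, 0xE000, 102)
print(copy.width, copy.commands[1][1], copy.commands[1][2])
\stoptyping
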
9282 \subsection{Example virtual font}
9284 Finally, here is a plain \TEX\ input file with a virtual font demonstration:
\startbuffer
\directlua {
  callback.register('define_font',
    function (name,size)
      local f
      if name == 'cmr10-red' then
        f = font.read_tfm('cmr10',size)
        f.name = 'cmr10-red'
        f.type = 'virtual'
        f.fonts = {{ name = 'cmr10', size = size }}
        for i,v in pairs(f.characters) do
          if (string.char(i)):find('[tacohanshartmut]') then
            v.commands = {
              {'special','pdf: 1 0 0 rg'},
              {'char',i},
              {'special','pdf: 0 g'},
            }
          else
            v.commands = {{'char',i}}
          end
        end
      else
        f = font.read_tfm(name,size)
      end
      return f
    end
  )
}

\font\myfont = cmr10-red at 10pt \myfont This is a line of text \par
\font\myfontx= cmr10 at 10pt \myfontx Here is another line of text \par
\stopbuffer
9318 \typebuffer
9320 %\getbuffer
9322 \chapter[nodes]{Nodes}
9324 \section{\LUA\ node representation}
\TEX's nodes are represented in \LUA\ as userdata objects with a variable
set of fields. In the following syntax tables, the type of such a
userdata object is represented as \syntax{<node>}.
9331 The current return value of \luatex{node.types()} is:
\ctxlua {
    local d = node.types()
    tex.print('\\type{' .. d[0] .. '} (' .. 0 .. '), ')
    for _,v in pairs(d) do
        if _ > 0 then
            tex.print('\\type{' .. v .. '} (' .. _ .. '), ')
        end
    end
}

9342 NOTE: The \type {\lastnodetype} primitive is \ETEX\ compliant. The valid
9343 range is still -1 .. 15 and glyph nodes have number 0 (used to be
9344 char node) and ligature nodes are mapped to 7. That way macro
9345 packages can use the same symbolic names as in traditional \ETEX.
9346 Keep in mind that the internal node numbers are different and that
9347 there are more node types than 15.
9349 \subsection{Auxiliary items}
9351 A few node|-|typed userdata objects do not occur in the \quote{normal}
9352 list of nodes, but can be pointed to from within that list. They are
9353 not quite the same as regular nodes, but it is easier for the library
9354 routines to treat them as if they were.
9356 \subsubsection{glue_spec items}
9358 Skips are about the only type of data objects in traditional \TEX\
9359 that are not a simple value. The structure that represents the glue
9360 components of a skip is called a \type {glue_spec}, and it has the following
9361 accessible fields:
9363 \starttabulate[|lT|l|p|]
9364 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
9365 \NC width \NC number \NC \NC\NR
9366 \NC stretch \NC number \NC \NC\NR
9367 \NC stretch_order \NC number \NC \NC\NR
9368 \NC shrink \NC number \NC \NC\NR
9369 \NC shrink_order \NC number \NC \NC\NR
\NC writable \NC boolean \NC If this is false, you can't assign to this \type{glue_spec}
because it is one of the preallocated special cases. New in 0.52\NC\NR
9372 \stoptabulate
9374 These objects are reference counted, so there is actually an extra
9375 read-only field named \type {ref_count} as well. This item type will likely
9376 disappear in the future, and the glue fields themselves will
9377 become part of the nodes referencing glue items.
9379 \subsubsection{attribute{\_}list and attribute items}
9381 The newly introduced attribute registers are non|-|trivial, because
9382 the value that is attached to a node is essentially a sparse array of
9383 key|-|value pairs.
9385 It is generally easiest to deal with attribute lists and attributes
9386 by using the dedicated functions in the \luatex{node} library, but
9387 for completeness, here is the low|-|level interface.
9389 An \type{attribute_list} item is used as a head pointer for a list
9390 of attribute items. It has only one user-visible field:
9392 \starttabulate[|lT|l|p|]
9393 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9394 \NC next \NC \syntax{<node>} \NC pointer to the first attribute\NC\NR
9395 \stoptabulate
9397 A normal node's attribute field will point to an item of type
9398 \type{attribute_list}, and the \type{next} field in that item will point
9399 to the first defined \quote{attribute} item, whose \type {next} will
9400 point to the second \quote{attribute} item, etc.
9402 Valid fields in \type{attribute} items:
9404 \starttabulate[|lT|l|p|]
9405 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9406 \NC next \NC \syntax{<node>} \NC pointer to the next attribute\NC\NR
9407 \NC number \NC number \NC the attribute type id\NC\NR
9408 \NC value \NC number \NC the attribute value\NC\NR
9409 \stoptabulate
As mentioned, it is better to use the official helpers than to edit
these fields directly. For instance, the \type {prev} field is
used for other purposes, and there is no doubly linked list.
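
For illustration only, this is what a lookup through the low|-|level fields
amounts to. Plain \LUA\ tables stand in for the userdata items here, and the
function name is invented; in real code a helper such as
\type{node.has_attribute} is the better choice.

\starttyping
-- Sketch: find the value of attribute 'number' by walking the list by hand.
local function find_attribute(attribute_list, number)
    local item = attribute_list.next -- the head itself carries no value
    while item do
        if item.number == number then
            return item.value
        end
        item = item.next
    end
    return nil -- the attribute is not set
end

-- A hand-made list: head -> (3 = 10) -> (7 = 42)
local list = {
    next = { number = 3, value = 10,
        next = { number = 7, value = 42 } },
}
print(find_attribute(list, 7)) -- 42
\stoptyping
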
9417 \subsubsection{action item}
9419 Valid fields: \showfields{action}\crlf
9420 Id: \showid{action}
9422 These are a special kind of item that only appears inside
9423 pdf start link objects.
9425 \starttabulate[|lT|l|p|]
9426 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9427 \NC action_type \NC number \NC \NC\NR
9428 \NC action_id \NC number or string \NC \NC\NR
9429 \NC named_id \NC number \NC \NC\NR
9430 \NC file \NC string \NC \NC\NR
9431 \NC new_window \NC number \NC \NC\NR
9432 \NC data \NC string \NC \NC\NR
9433 \NC ref_count \NC number \NC (read-only)\NC\NR
9434 \stoptabulate
9436 \subsection{Main text nodes}
9438 These are the nodes that comprise actual typesetting commands.
A few fields are present in all nodes regardless of their type:
9442 \starttabulate[|lT|l|p|]
9443 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9444 \NC next \NC \syntax{<node>} \NC The next node in a list, or nil\NC\NR
9445 \NC id \NC number \NC The node's type (\type{id}) number \NC\NR
9446 \NC subtype \NC number \NC The node \type{subtype} identifier\NC\NR
9447 \stoptabulate
9449 The \type{subtype} is sometimes just a stub entry. Not all nodes
9450 actually use the \type{subtype}, but this way you can be sure that all
9451 nodes accept it as a valid field name, and that is often handy in node
9452 list traversal. In the following tables \type{next} and \type{id} are
9453 not explicitly mentioned.
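
Because every node accepts \type{subtype} as a field name, a traversal can read
it without first testing the node type. The following sketch counts the
\type{id}/\type{subtype} combinations in a list. The function name
\type{count_kinds} and the numeric ids in the stand|-|in tables are invented for
the illustration; inside \LUATEX\ the real numbers would come from calls such as
\type{node.id("glyph")}.

\starttyping
-- Count how often each (id, subtype) pair occurs in the list starting at head.
local function count_kinds(head)
  local counts = { }
  local n = head
  while n do
    -- subtype is a valid field name on every node, so no per-type test is needed
    local key = n.id .. ":" .. n.subtype
    counts[key] = (counts[key] or 0) + 1
    n = n.next
  end
  return counts
end

-- Stand-in list of three items (plain tables, arbitrary id numbers).
local c = { id = 10, subtype = 0 }
local b = { id = 37, subtype = 1, next = c }
local a = { id = 37, subtype = 1, next = b }

local counts = count_kinds(a)
print(counts["37:1"], counts["10:0"])
\stoptyping
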
9455 Besides these three fields, almost all nodes also have an \type {attr}
field, and there is also a field called \type{prev}. That last field
9457 is always present, but only initialized on explicit request: when the
9458 function \type{node.slide()} is called, it will set up the \type{prev}
9459 fields to be a backwards pointer in the argument node list.
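
The effect on the \type{prev} fields can be pictured with plain \LUA\ tables.
In the sketch below \type{slide_sketch} merely imitates the side effect
described above and is not the real function; in \LUATEX\ one would call
\type{node.slide(head)} on a real node list instead. Both helper names are
invented for the example.

\starttyping
-- Imitation of the side effect of node.slide(): walk to the tail of the
-- list and make each node's prev point at its predecessor.
local function slide_sketch(head)
  local tail = head
  while tail and tail.next do
    tail.next.prev = tail
    tail = tail.next
  end
  return tail
end

-- Once prev is set up, the list can be walked from tail to head.
local function ids_backwards(tail)
  local ids = { }
  local n = tail
  while n do
    ids[#ids + 1] = n.id
    n = n.prev
  end
  return ids
end

-- Stand-in list a -> b -> c (plain tables, arbitrary ids).
local c = { id = 3 }
local b = { id = 2, next = c }
local a = { id = 1, next = b }

local tail = slide_sketch(a)
print(table.concat(ids_backwards(tail), " "))
\stoptyping
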
9462 \subsubsection{hlist nodes}
9464 Valid fields: \showfields{hlist}\crlf
9465 Id: \showid{hlist}
9467 \starttabulate[|lT|l|p|]
9468 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9469 \NC subtype \NC number \NC 0 = unknown origin, 1 = created by
9470 linebreaking, 2 = explicit box command. (0.46.0),
9471 3 = paragraph indentation box, 4 = alignment column or row, 5 = alignment cell (0.62.0)\NC\NR
9472 \NC attr \NC \syntax{<node>} \NC The head of the associated attribute list \NC\NR
9473 \NC width \NC number \NC \NC\NR
9474 \NC height \NC number \NC \NC\NR
9475 \NC depth \NC number \NC \NC\NR
9476 \NC shift \NC number \NC a displacement perpendicular to the
9477 character progression direction \NC\NR
9478 \NC glue_order \NC number \NC a number in the range 0--4, indicating
9479 the glue order\NC\NR
9480 \NC glue_set \NC number \NC the calculated glue ratio\NC\NR
\NC glue_sign \NC number \NC 0 = normal, 1 = stretching, 2 = shrinking \NC\NR
9482 \NC head \NC \syntax{<node>} \NC the first node of the body of this list\NC\NR
\NC dir \NC string \NC the direction of this box, see~\in{}[dirnodes]\NC\NR
9484 \stoptabulate
9486 A warning: never assign a node list to the \type{head} field
9487 unless you are sure its internal link structure is correct, otherwise
9488 an error may result.
9490 Note: the new field name \type{head} was introduced in 0.65 to replace
9491 the old name \type{list}. Use of the name \type{list} is now
9492 deprecated, but it will stay available until at least version 0.80.
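
The three glue fields of a box work together. The sketch below shows how they
could be combined to estimate the space that one glue item occupies inside the
box, following the usual \TEX\ rule that only glue of the matching order takes
part. It assumes that the glue spec offers the fields \type{width},
\type{stretch}, \type{stretch_order}, \type{shrink} and \type{shrink_order};
the function name is invented, the tables are stand|-|ins for real nodes, and the
rounding only approximates what the engine does.

\starttyping
-- Estimated width of a glue item inside a box, given the box's glue settings.
-- box  : anything with glue_sign, glue_order and glue_set fields
-- spec : the glue's spec (width, stretch, stretch_order, shrink, shrink_order)
local function effective_glue_width(box, spec)
  local width = spec.width
  if box.glue_sign == 1 and spec.stretch_order == box.glue_order then
    width = width + box.glue_set * spec.stretch     -- box is stretching
  elseif box.glue_sign == 2 and spec.shrink_order == box.glue_order then
    width = width - box.glue_set * spec.shrink      -- box is shrinking
  end
  return math.floor(width + 0.5)                    -- whole scaled points
end

-- Stand-in data: a box stretched by half of the available finite stretch.
local stretched = { glue_sign = 1, glue_order = 0, glue_set = 0.5 }
local shrunk    = { glue_sign = 2, glue_order = 0, glue_set = 0.25 }
local space     = { width = 65536, stretch = 32768, stretch_order = 0,
                    shrink = 20000, shrink_order = 0 }

print(effective_glue_width(stretched, space))
print(effective_glue_width(shrunk, space))
\stoptyping
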
9494 \subsubsection{vlist nodes}
9496 Valid fields: As for hlist, except that \quote{shift} is a displacement
9497 perpendicular to the line progression direction, and \quote{subtype} only
9498 has subtypes 0, 4, and 5.
9500 \subsubsection{rule nodes}
9502 Valid fields: \showfields{rule}\crlf
9503 Id: \showid{rule}
9505 \starttabulate[|lT|l|p|]
9506 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9507 \NC subtype \NC number \NC unused\NC\NR
9508 \NC attr \NC \syntax{<node>} \NC \NC\NR
9509 \NC width \NC number \NC the width of the rule; the special value $-1073741824$
9510 is used for \quote{running} glue dimensions\NC\NR
9511 \NC height \NC number \NC the height of the rule (can be negative)\NC\NR
9512 \NC depth \NC number \NC the depth of the rule (can be negative)\NC\NR
\NC dir \NC string \NC the direction of this rule, see~\in{}[dirnodes]\NC\NR
9514 \stoptabulate
9516 \subsubsection{ins nodes}
9518 Valid fields: \showfields{ins}\crlf
9519 Id: \showid{ins}
9521 \starttabulate[|lT|l|p|]
9522 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9523 \NC subtype \NC number \NC the insertion class\NC\NR
9524 \NC attr \NC \syntax{<node>} \NC \NC\NR
9525 \NC cost \NC number \NC the penalty associated with this insert\NC\NR
9526 \NC height \NC number \NC \NC\NR
9527 \NC depth \NC number \NC \NC\NR
9528 \NC head \NC \syntax{<node>} \NC the first node of the body of this insert\NC\NR
9529 \NC spec \NC \syntax{<node>} \NC a pointer to the \tex{splittopskip} glue spec\NC\NR
9530 \stoptabulate
9532 A warning: never assign a node list to the \type{head} field
9533 unless you are sure its internal link structure is correct, otherwise
an error may result.
9536 Note: the new field name \type{head} was introduced in 0.65 to replace
9537 the old name \type{list}. Use of the name \type{list} is now
9538 deprecated, but it will stay available until at least version 0.80.
9541 \subsubsection{mark nodes}
9543 Valid fields: \showfields{mark}\crlf
9544 Id: \showid{mark}
9546 \starttabulate[|lT|l|p|]
9547 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9548 \NC subtype \NC number \NC unused\NC\NR
9549 \NC attr \NC \syntax{<node>} \NC \NC\NR
9550 \NC class \NC number \NC the mark class\NC\NR
9551 \NC mark \NC table \NC a table representing a token list\NC\NR
9552 \stoptabulate
9554 \subsubsection{adjust nodes}
9556 Valid fields: \showfields{adjust}\crlf
9557 Id: \showid{adjust}
9559 \starttabulate[|lT|l|p|]
9560 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9561 \NC subtype \NC number \NC 0 = normal, 1 = \quote{pre}\NC\NR
9562 \NC attr \NC \syntax{<node>} \NC \NC\NR
9563 \NC head \NC \syntax{<node>} \NC adjusted material\NC\NR
9564 \stoptabulate
9566 A warning: never assign a node list to the \type{head} field
9567 unless you are sure its internal link structure is correct, otherwise
an error may result.
9570 Note: the new field name \type{head} was introduced in 0.65 to replace
9571 the old name \type{list}. Use of the name \type{list} is now
9572 deprecated, but it will stay available until at least version 0.80.
9575 \subsubsection{disc nodes}
9577 Valid fields: \showfields{disc}\crlf
9578 Id: \showid{disc}
9580 \starttabulate[|lT|l|p|]
9581 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9582 \NC subtype \NC number \NC indicates the source of a discretionary.
9583 0 = the \tex{discretionary} command,
9584 1 = the \tex{-} command,
9585 2 = added automatically following a \type{-},
9586 3 = added by the hyphenation algorithm (simple),
9587 4 = added by the hyphenation algorithm (hard, first item),
9588 5 = added by the hyphenation algorithm (hard, second item)\NC\NR
9589 \NC attr \NC \syntax{<node>} \NC \NC\NR
9590 \NC pre \NC \syntax{<node>} \NC pointer to the pre|-|break text\NC\NR
9591 \NC post \NC \syntax{<node>} \NC pointer to the post|-|break text\NC\NR
9592 \NC replace \NC \syntax{<node>} \NC pointer to the no|-|break text\NC\NR
9593 \stoptabulate
9595 The subtype numbers~4 and~5 belong to the \quote{of-f-ice} explanation given elsewhere.
9597 A warning: never assign a node list to the pre, post or replace field
9598 unless you are sure its internal link structure is correct, otherwise
an error may result.
9601 \subsubsection{math nodes}
9603 Valid fields: \showfields{math}\crlf
9604 Id: \showid{math}
9606 \starttabulate[|lT|l|p|]
9607 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9608 \NC subtype \NC number \NC 0 = \quote{on}, 1 = \quote{off}\NC\NR
9609 \NC attr \NC \syntax{<node>} \NC \NC\NR
9610 \NC surround \NC number \NC width of the \tex{mathsurround} kern\NC\NR
9611 \stoptabulate
9613 \subsubsection{glue nodes}
9615 Valid fields: \showfields{glue}\crlf
9616 Id: \showid{glue}
9618 \starttabulate[|lT|l|p|]
9619 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9620 \NC subtype \NC number \NC 0 = \tex{skip},
9621 1--18 = internal glue parameters,
100--103 = \quote{leader} subtypes \NC\NR
9623 \NC attr \NC \syntax{<node>} \NC \NC\NR
9624 \NC spec \NC \syntax{<node>} \NC pointer to a glue{\_}spec item \NC\NR
9625 \NC leader \NC \syntax{<node>} \NC pointer to a box or rule for leaders\NC\NR
9626 \stoptabulate
9628 The exact meanings of the subtypes are as follows:
9630 \starttabulate[|rT|l|]
9631 \NC 1 \NC \tex{lineskip} \NC \NR
9632 \NC 2 \NC \tex{baselineskip} \NC \NR
9633 \NC 3 \NC \tex{parskip} \NC \NR
9634 \NC 4 \NC \tex{abovedisplayskip} \NC \NR
9635 \NC 5 \NC \tex{belowdisplayskip} \NC \NR
9636 \NC 6 \NC \tex{abovedisplayshortskip} \NC \NR
9637 \NC 7 \NC \tex{belowdisplayshortskip} \NC \NR
9638 \NC 8 \NC \tex{leftskip} \NC \NR
9639 \NC 9 \NC \tex{rightskip} \NC \NR
9640 \NC 10 \NC \tex{topskip} \NC \NR
9641 \NC 11 \NC \tex{splittopskip} \NC \NR
9642 \NC 12 \NC \tex{tabskip} \NC \NR
9643 \NC 13 \NC \tex{spaceskip} \NC \NR
9644 \NC 14 \NC \tex{xspaceskip} \NC \NR
9645 \NC 15 \NC \tex{parfillskip} \NC \NR
9646 \NC 16 \NC \tex{thinmuskip} \NC \NR
9647 \NC 17 \NC \tex{medmuskip} \NC \NR
9648 \NC 18 \NC \tex{thickmuskip} \NC \NR
9649 \NC 100 \NC \tex{leaders} \NC \NR
9650 \NC 101 \NC \tex{cleaders} \NC \NR
9651 \NC 102 \NC \tex{xleaders} \NC \NR
9652 \NC 103 \NC \tex{gleaders} \NC \NR
9653 \stoptabulate
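
For diagnostic output it can be handy to turn a glue \type{subtype} back into
the name of its origin. The lookup below simply mirrors the table above; the
function name \type{glue_origin} and the fallback string \type{unknown} are
choices made for this sketch only.

\starttyping
-- Subtypes 1..18: internal glue parameters, in the order of the table above.
local glue_parameters = {
  "lineskip", "baselineskip", "parskip", "abovedisplayskip",
  "belowdisplayskip", "abovedisplayshortskip", "belowdisplayshortskip",
  "leftskip", "rightskip", "topskip", "splittopskip", "tabskip",
  "spaceskip", "xspaceskip", "parfillskip", "thinmuskip", "medmuskip",
  "thickmuskip",
}

-- Subtypes 100..103: the leader variants.
local leader_kinds = {
  [100] = "leaders", [101] = "cleaders", [102] = "xleaders", [103] = "gleaders",
}

local function glue_origin(subtype)
  if subtype == 0 then
    return "skip"
  end
  return glue_parameters[subtype] or leader_kinds[subtype] or "unknown"
end

print(glue_origin(9), glue_origin(101))
\stoptyping
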
9655 \subsubsection{kern nodes}
9657 Valid fields: \showfields{kern}\crlf
9658 Id: \showid{kern}
9660 \starttabulate[|lT|l|p|]
9661 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9662 \NC subtype \NC number \NC 0 = from font,
9663 1 = from \tex{kern} or \tex{/},
9664 2 = from \tex{accent}\NC\NR
9665 \NC attr \NC \syntax{<node>} \NC \NC\NR
9666 \NC kern \NC number \NC \NC\NR
9667 \stoptabulate
9670 \subsubsection{penalty nodes}
9672 Valid fields: \showfields{penalty}\crlf
9673 Id: \showid{penalty}
9675 \starttabulate[|lT|l|p|]
9676 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9677 \NC subtype \NC number \NC not used\NC\NR
9678 \NC attr \NC \syntax{<node>} \NC \NC\NR
9679 \NC penalty \NC number \NC \NC\NR
9680 \stoptabulate
9682 \subsubsection[glyphnodes]{glyph nodes}
9684 Valid fields: \showfields{glyph}\crlf
9685 Id: \showid{glyph}
9687 \starttabulate[|lT|l|p|]
9688 \NC \ssbf field \NC \ssbf type \NC \ssbf explanation \NC \NR
9689 \NC subtype \NC number \NC bitfield \NC \NR
9690 \NC attr \NC \syntax{<node>} \NC \NC \NR
9691 \NC char \NC number \NC \NC \NR
9692 \NC font \NC number \NC \NC \NR
9693 \NC lang \NC number \NC \NC \NR
9694 \NC left \NC number \NC \NC \NR
9695 \NC right \NC number \NC \NC \NR
9696 \NC uchyph \NC boolean \NC \NC \NR
9697 \NC components \NC \syntax{<node>} \NC pointer to ligature components \NC \NR
9698 \NC xoffset \NC number \NC \NC \NR
9699 \NC yoffset \NC number \NC \NC \NR
9700 \NC width \NC number \NC (new in 0.53) \NC \NR
9701 \NC height \NC number \NC (new in 0.53) \NC \NR
9702 \NC depth \NC number \NC (new in 0.53) \NC \NR
9703 \NC expansion_factor \NC number \NC (new in 0.78) \NC \NR
9704 \stoptabulate
9706 A warning: never assign a node list to the components field
9707 unless you are sure its internal link structure is correct, otherwise
an error may result.
9710 Valid bits for the \type{subtype} field are:
9712 \starttabulate[|c|l|]
9713 \NC \ssbf bit \NC \bf meaning \NC\NR
9714 \NC 0 \NC character \NC\NR
9715 \NC 1 \NC ligature \NC\NR
9716 \NC 2 \NC ghost \NC\NR
9717 \NC 3 \NC left \NC\NR
9718 \NC 4 \NC right \NC\NR
9719 \stoptabulate
9721 See \in{section}[charsandglyphs] for a detailed description of the
9722 \type{subtype} field.
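
Since the \type{subtype} of a glyph is a bitfield, individual bits have to be
tested rather than the value as a whole. The sketch below does so with plain
arithmetic, taking bit~$n$ to have the value $2^n$; the helper names are
invented for the example and bits beyond the five listed above are ignored.

\starttyping
-- Names of the bits, indexed by bit number as in the table above.
local glyph_bits = { [0] = "character", "ligature", "ghost", "left", "right" }

-- True when the given bit (0 = lowest) is set in value.
local function has_bit(value, bit)
  return math.floor(value / 2^bit) % 2 == 1
end

-- Return the names of all listed bits that are set in a glyph subtype.
local function glyph_flags(subtype)
  local flags = { }
  for bit = 0, 4 do
    if has_bit(subtype, bit) then
      flags[#flags + 1] = glyph_bits[bit]
    end
  end
  return flags
end

print(table.concat(glyph_flags(3), ", "))
\stoptyping
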
The \type {expansion_factor} field is relatively new and is the result of extensive
experiments with a more efficient implementation of expansion. Early versions of
\LUATEX\ already replaced multiple instances of fonts in the backend by scaling.
Contrary to \PDFTEX, \LUATEX\ has now also got rid of font copies in the
frontend and replaced them by expansion factors that travel with glyph nodes. Apart
from being a cleaner approach, this is also a step towards a better separation between
front- and backend.
9732 \subsubsection{margin{\_}kern nodes}
9734 Valid fields: \showfields{margin_kern}\crlf
9735 Id: \showid{margin_kern}
9737 \starttabulate[|lT|l|p|]
9738 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9739 \NC subtype \NC number \NC 0 = left side,
9740 1 = right side\NC\NR
9741 \NC attr \NC \syntax{<node>} \NC \NC\NR
9742 \NC width \NC number \NC \NC\NR
9743 \NC glyph \NC \syntax{<node>} \NC \NC\NR
9744 \stoptabulate
9746 \subsection{Math nodes}
9748 These are the so||called \quote{noad}s and the nodes that are specifically
associated with math processing. Most of these nodes contain subnodes so
9750 that the list of possible fields is actually quite small. First, the subnodes:
9752 \subsubsection{Math kernel subnodes}
9754 Many object fields in math mode are either simple characters in a
9755 specific family or math lists or node lists. There are four associated
9756 subnodes that represent these cases (in the following node
9757 descriptions these are indicated by the word \type{<kernel>}).
9759 The \type{next} and \type{prev} fields for these subnodes are unused.
9761 \subsubsubsection{math{\_}char and math{\_}text{\_}char subnodes}
9763 Valid fields: \showfields{math_char}\crlf
9764 Id: \showid{math_char}
9766 \starttabulate[|lT|l|p|]
9767 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9768 \NC attr \NC \syntax{<node>}\NC \NC\NR
9769 \NC char \NC number \NC \NC \NR
9770 \NC fam \NC number \NC \NC\NR
9771 \stoptabulate
The \type{math_char} is the simplest subnode field: it contains
the character and family for a single glyph object. The
\type{math_text_char} is a special case that you will not
normally encounter: it arises temporarily during math list conversion
(its sole function is to suppress a following italic correction).
9779 \subsubsubsection{sub{\_}box and sub{\_}mlist subnodes}
9781 Valid fields: \showfields{sub_box}\crlf
9782 Id: \showid{sub_box}
9784 \starttabulate[|lT|l|p|]
9785 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9786 \NC attr \NC \syntax{<node>}\NC \NC\NR
9787 \NC head \NC \syntax{<node>}\NC \NC \NR
9788 \stoptabulate
9790 These two subnode types are used for subsidiary list items. For
9791 \type{sub_box}, the \type{head} points to a \quote{normal} vbox or
9792 hbox. For \type{sub_mlist}, the \type{head} points to a math list
9793 that is yet to be converted.
9795 A warning: never assign a node list to the \type{head} field
9796 unless you are sure its internal link structure is correct, otherwise
an error may result.
9799 Note: the new field name \type{head} was introduced in 0.65 to replace
9800 the old name \type{list}. Use of the name \type{list} is now
9801 deprecated, but it will stay available until at least version 0.80.
9803 \subsubsection{Math delimiter subnode}
9805 There is a fifth subnode type that is used exclusively for delimiter
9806 fields. As before, the \type{next} and \type{prev} fields are unused.
9808 \subsubsubsection{delim subnodes}
9810 Valid fields: \showfields{delim}\crlf
9811 Id: \showid{delim}
9813 \starttabulate[|lT|l|p|]
9814 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9815 \NC attr \NC \syntax{<node>}\NC \NC\NR
9816 \NC small_char \NC number \NC \NC \NR
9817 \NC small_fam \NC number \NC \NC\NR
9818 \NC large_char \NC number \NC \NC \NR
9819 \NC large_fam \NC number \NC \NC\NR
9820 \stoptabulate
The fields \type{large_char} and \type{large_fam} can be zero. In that
case the font that is used for the \type{small_fam} is expected to
provide the large version as an extension to the \type{small_char}.
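
A small sketch of how code might decide where the large variant comes from. It
treats the large part as absent only when both \type{large_char} and
\type{large_fam} are zero, which is an interpretation of the sentence above;
the function name and the returned labels are invented, and the tables are
stand|-|ins for real \type{delim} subnodes.

\starttyping
-- Return the character and family to consult for the large variant,
-- plus a label telling where that information comes from.
local function large_source(delim)
  if delim.large_char == 0 and delim.large_fam == 0 then
    -- no explicit large variant: rely on the font of the small family
    return delim.small_char, delim.small_fam, "font extension"
  end
  return delim.large_char, delim.large_fam, "explicit"
end

-- Stand-in data: one delimiter with and one without an explicit large part.
local with_large    = { small_char = 40, small_fam = 0,
                        large_char = 0,  large_fam = 3 }
local without_large = { small_char = 40, small_fam = 0,
                        large_char = 0,  large_fam = 0 }

print(large_source(with_large))
print(large_source(without_large))
\stoptyping
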
9826 \subsubsection{Math core nodes}
First, there are the objects (the \TEX book calls them \quote{atoms})
9829 that are associated with the simple math objects: Ord, Op, Bin, Rel,
9830 Open, Close, Punct, Inner, Over, Under, Vcent. These all have
9831 the same fields, and they are combined into a single node type with
9832 separate subtypes for differentiation.
9834 \subsubsubsection{simple nodes}
9836 Valid fields: \showfields{noad}\crlf
9837 Id: \showid{noad}
9839 \starttabulate[|lT|l|p|]
9840 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9841 \NC subtype \NC number \NC see below \NC\NR
9842 \NC attr \NC \syntax{<node>} \NC \NC\NR
9843 \NC nucleus \NC \syntax{<kernel>}\NC \NC\NR
9844 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9845 \NC sup \NC \syntax{<kernel>}\NC \NC\NR
9846 \stoptabulate
Operators are a bit special because they occupy three values of the
\type{subtype} field. The possible values are:
9851 \starttabulate[|lT|p|]
9852 \NC \ssbf number \NC \bf node sub type \NC\NR
9853 \NC 0 \NC Ord \NC\NR
9854 \NC 1 \NC Op, \type{\displaylimits} \NC\NR
9855 \NC 2 \NC Op, \type{\limits} \NC\NR
9856 \NC 3 \NC Op, \type{\nolimits} \NC\NR
9857 \NC 4 \NC Bin \NC\NR
9858 \NC 5 \NC Rel \NC\NR
9859 \NC 6 \NC Open \NC\NR
9860 \NC 7 \NC Close \NC\NR
9861 \NC 8 \NC Punct \NC\NR
9862 \NC 9 \NC Inner \NC\NR
9863 \NC 10 \NC Under \NC\NR
9864 \NC 11 \NC Over \NC\NR
9865 \NC 12 \NC Vcent \NC\NR
9866 \stoptabulate
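
The table above translates directly into a lookup. The sketch below also
reports the limits convention for the three operator subtypes; the names
\type{describe_noad} and the fallback \type{unknown} are made up for the
example.

\starttyping
-- Class names indexed by subtype, in the order of the table above.
local noad_class = {
  [0] = "Ord", "Op", "Op", "Op", "Bin", "Rel", "Open", "Close",
  "Punct", "Inner", "Under", "Over", "Vcent",
}

-- The three operator subtypes differ only in their limits convention.
local op_limits = { [1] = "displaylimits", [2] = "limits", [3] = "nolimits" }

local function describe_noad(subtype)
  local class = noad_class[subtype]
  if not class then
    return "unknown"
  end
  local limits = op_limits[subtype]
  return limits and (class .. " (" .. limits .. ")") or class
end

print(describe_noad(2), describe_noad(10))
\stoptyping
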
9868 \subsubsubsection{accent nodes}
9870 Valid fields: \showfields{accent}\crlf
9871 Id: \showid{accent}
9873 \starttabulate[|lT|l|p|]
9874 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9875 \NC subtype \NC number \NC the first bit is used for a fixed top accent flag (if the \type{accent} field is present),
9876 the second bit for a fixed bottom accent flag (if the \type{bot_accent} field is present).
9877 Example: the actual value \type{3} means: do not stretch either accent\NC\NR
9878 \NC attr \NC \syntax{<node>}\NC \NC\NR
9879 \NC nucleus \NC \syntax{<kernel>}\NC \NC \NR
9880 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9881 \NC sup \NC \syntax{<kernel>}\NC \NC \NR
9882 \NC accent \NC \syntax{<kernel>}\NC \NC\NR
9883 \NC bot_accent \NC \syntax{<kernel>}\NC \NC\NR
9884 \stoptabulate
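
The two flags in the accent \type{subtype} can be separated with a little
arithmetic. This sketch takes \quote{first bit} to mean the bit with value~1
and \quote{second bit} the bit with value~2, in line with the example value
\type{3} above; the function name is invented.

\starttyping
-- Split an accent subtype into its two flags:
-- first result  : the top accent is fixed (not stretched)
-- second result : the bottom accent is fixed (not stretched)
local function accent_flags(subtype)
  local fixed_top    = subtype % 2 == 1
  local fixed_bottom = math.floor(subtype / 2) % 2 == 1
  return fixed_top, fixed_bottom
end

print(accent_flags(3))
\stoptyping
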
9886 \subsubsubsection{style nodes}
9888 Valid fields: \showfields{style}\crlf
9889 Id: \showid{style}
9891 \starttabulate[|lT|l|p|]
9892 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9893 \NC style \NC string \NC contains the style \NC\NR
9894 \stoptabulate
There are eight possibilities for the string value: one of
\quote{display}, \quote{text}, \quote{script}, or \quote{scriptscript},
each optionally followed by a trailing \type{'} that signifies the
\quote{cramped} variant of that style.
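
Code that inspects style nodes may want to split such a string into the base
style and the cramped flag. A possible way to do that is sketched here; the
function name \type{parse_style} is invented and the function returns
\type{nil} for anything that is not one of the eight strings.

\starttyping
local base_styles = {
  display = true, text = true, script = true, scriptscript = true,
}

-- Split a style string into its base style and a boolean cramped flag.
local function parse_style(style)
  local base, mark = style:match("^(%a+)('?)$")
  if not base or not base_styles[base] then
    return nil
  end
  return base, mark == "'"
end

print(parse_style("script'"))
print(parse_style("display"))
\stoptyping
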
9901 \subsubsubsection{choice nodes}
9903 Valid fields: \showfields{choice}\crlf
9904 Id: \showid{choice}
9906 \starttabulate[|lT|l|p|]
9907 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9908 \NC attr \NC \syntax{<node>}\NC \NC\NR
9909 \NC display \NC \syntax{<node>}\NC \NC\NR
9910 \NC text \NC \syntax{<node>}\NC \NC\NR
9911 \NC script \NC \syntax{<node>}\NC \NC\NR
9912 \NC scriptscript \NC \syntax{<node>}\NC \NC\NR
9913 \stoptabulate
9915 A warning: never assign a node list to the display, text, script, or
9916 scriptscript field unless you are sure its internal link structure is
correct, otherwise an error may result.
9919 \subsubsubsection{radical nodes}
9921 Valid fields: \showfields{radical}\crlf
9922 Id: \showid{radical}
9924 \starttabulate[|lT|l|p|]
9925 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9926 \NC attr \NC \syntax{<node>}\NC \NC\NR
9927 \NC nucleus \NC \syntax{<kernel>}\NC \NC \NR
9928 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9929 \NC sup \NC \syntax{<kernel>}\NC \NC \NR
9930 \NC left \NC \syntax{<delim>}\NC \NC \NR
9931 \NC degree \NC \syntax{<kernel>}\NC Only set by \type{\Uroot} \NC \NR
9932 \stoptabulate
9934 A warning: never assign a node list to the nucleus, sub, sup, left, or
9935 degree field
9936 unless you are sure its internal link structure is correct, otherwise
an error may result.
9939 \subsubsubsection{fraction nodes}
9941 Valid fields: \showfields{fraction}\crlf
9942 Id: \showid{fraction}
9944 \starttabulate[|lT|l|p|]
9945 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9946 \NC attr \NC \syntax{<node>}\NC \NC\NR
9947 \NC width \NC number \NC \NC \NR
9948 \NC num \NC \syntax{<kernel>}\NC \NC\NR
9949 \NC denom \NC \syntax{<kernel>}\NC \NC \NR
9950 \NC left \NC \syntax{<delim>}\NC \NC \NR
9951 \NC right \NC \syntax{<delim>}\NC \NC \NR
9952 \stoptabulate
A warning: never assign a node list to the num or denom field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9958 \subsubsubsection{fence nodes}
9960 Valid fields: \showfields{fence}\crlf
9961 Id: \showid{fence}
9963 \starttabulate[|lT|l|p|]
9964 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9965 \NC subtype \NC number \NC 1 = \type{\left},
9966 2 = \type{\middle},
9967 3 = \type{\right} \NC\NR
9968 \NC attr \NC \syntax{<node>}\NC \NC\NR
9969 \NC delim \NC \syntax{<delim>}\NC \NC \NR
9970 \stoptabulate
9972 \subsection{whatsit nodes}
9974 Whatsit nodes come in many subtypes that you can ask for by running
9975 \luatex{node.whatsits()}:
9976 \ctxlua {for n,name in table.sortedpairs(node.whatsits()) do
9977 if (n<100) then
9978 if (n>0) then tex.sprint (', ') end
9979 tex.sprint('\\type{' .. name .. '} (' .. n .. ')') end
9980 end }
9982 \subsubsection{open nodes}
9984 Valid fields: \showfields{whatsit,open}\crlf
9985 Id: \showid{whatsit,open}
9987 \starttabulate[|lT|l|p|]
9988 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9989 \NC attr \NC \syntax{<node>} \NC \NC\NR
9990 \NC stream \NC number \NC \TEX's stream id number\NC\NR
9991 \NC name \NC string \NC file name \NC\NR
9992 \NC ext \NC string \NC file extension \NC\NR
9993 \NC area \NC string \NC file area (this may become obsolete) \NC\NR
9994 \stoptabulate
9996 \subsubsection{write nodes}
9998 Valid fields: \showfields{whatsit,write}\crlf
9999 Id: \showid{whatsit,write}
10001 \starttabulate[|lT|l|p|]
10002 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10003 \NC attr \NC \syntax{<node>} \NC \NC\NR
10004 \NC stream \NC number \NC \TEX's stream id number\NC\NR
10005 \NC data \NC table \NC a table representing the token list to be written\NC\NR
10006 \stoptabulate
10008 \subsubsection{close nodes}
10010 Valid fields: \showfields{whatsit,close}\crlf
10011 Id: \showid{whatsit,close}
10013 \starttabulate[|lT|l|p|]
10014 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10015 \NC attr \NC \syntax{<node>} \NC \NC\NR
10016 \NC stream \NC number \NC \TEX's stream id number\NC\NR
10017 \stoptabulate
10019 \subsubsection{special nodes}
10021 Valid fields: \showfields{whatsit,special}\crlf
10022 Id: \showid{whatsit,special}
10024 \starttabulate[|lT|l|p|]
10025 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10026 \NC attr \NC \syntax{<node>} \NC \NC\NR
10027 \NC data \NC string \NC the \tex{special} information\NC\NR
10028 \stoptabulate
10030 \subsubsection{language nodes}
10033 \LUATEX\ does not have language whatsits any more. All language
10034 information is already present inside the glyph nodes themselves.
10035 This whatsit subtype will be removed in the next release.
10038 \subsubsection{local_par nodes}
10040 Valid fields: \showfields{whatsit,local_par}\crlf
10041 Id: \showid{whatsit,local_par}
10043 \starttabulate[|lT|l|p|]
10044 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10045 \NC attr \NC \syntax{<node>} \NC \NC\NR
10046 \NC pen_inter \NC number \NC local interline penalty (from \tex{localinterlinepenalty})\NC\NR
10047 \NC pen_broken\NC number \NC local broken penalty (from \tex{localbrokenpenalty})\NC\NR
\NC dir \NC string \NC the direction of this par, see~\in{}[dirnodes]\NC\NR
10049 \NC box_left \NC \syntax{<node>} \NC the \tex{localleftbox}\NC\NR
10050 \NC box_left_width\NC number\NC width of the \tex{localleftbox}\NC\NR
10051 \NC box_right \NC \syntax{<node>} \NC the \tex{localrightbox}\NC\NR
10052 \NC box_right_width\NC number\NC width of the \tex{localrightbox}\NC\NR
10053 \stoptabulate
10055 A warning: never assign a node list to the box_left or box_right field
10056 unless you are sure its internal link structure is correct, otherwise
an error may result.
10062 \subsubsection[dirnodes]{dir nodes}
10064 Valid fields: \showfields{whatsit,dir}\crlf
10065 Id: \showid{whatsit,dir}
10067 \starttabulate[|lT|l|p|]
10068 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10069 \NC attr \NC \syntax{<node>} \NC \NC\NR
10070 \NC dir \NC string \NC the direction (but see below)\NC\NR
10071 \NC level \NC number \NC nesting level of this direction whatsit\NC\NR
10072 \NC dvi_ptr \NC number \NC a saved dvi buffer byte offset\NC\NR
10073 \NC dir_h \NC number \NC a saved dvi position\NC\NR
10074 \stoptabulate
10076 A note on \type{dir} strings. Direction specifiers are three-letter
10077 combinations of \type{T}, \type{B}, \type{R}, and \type{L}.
10079 These are built up out of three separate items:
10080 \startitemize
10081 \item the first is the direction of the \quote{top} of paragraphs.
10082 \item the second is the direction of the \quote{start} of lines.
10083 \item the third is the direction of the \quote{top} of glyphs.
10084 \stopitemize
10086 However, only four combinations are accepted: \type{TLT}, \type{TRT},
10087 \type{RTT}, and \type{LTL}.
10089 Inside actual \type{dir} whatsit nodes, the representation of
10090 \type{dir} is not a three-letter but a four-letter combination. The
10091 first character in this case is always either \type{+} or \type{-},
10092 indicating whether the value is pushed or popped from the direction
10093 stack.
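
Both forms of the \type{dir} string can be handled by one small parser. The
sketch below accepts an optional leading sign and checks the remaining three
letters against the four accepted combinations. It assumes that \type{+} means
\quote{pushed} and \type{-} means \quote{popped}, in the order given above; the
function name and the labels \type{push} and \type{pop} are invented.

\starttyping
local accepted = { TLT = true, TRT = true, RTT = true, LTL = true }

-- Split a dir string into the three-letter specifier and, for the
-- four-letter form used in dir whatsits, the stack action.
local function parse_dir(dir)
  local sign, spec = dir:match("^([+-]?)(%u%u%u)$")
  if not spec or not accepted[spec] then
    return nil
  end
  local action = nil
  if sign == "+" then
    action = "push"
  elseif sign == "-" then
    action = "pop"
  end
  return spec, action
end

print(parse_dir("+TRT"))
print(parse_dir("TLT"))
\stoptyping
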
10095 \subsubsection{pdf_literal nodes}
10097 Valid fields: \showfields{whatsit,pdf_literal}\crlf
10098 Id: \showid{whatsit,pdf_literal}
10100 \starttabulate[|lT|l|p|]
10101 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10102 \NC attr \NC \syntax{<node>} \NC \NC\NR
10103 \NC mode \NC number \NC the \quote{mode} setting of this literal\NC\NR
10104 \NC data \NC string \NC the \tex{pdfliteral} information\NC\NR
10105 \stoptabulate
10107 Mode values:
10109 \starttabulate[|lT|p|]
10110 \NC \ssbf value \NC \ssbf corresponding \tex{pdftex} keyword \NC \NR
10111 \NC 0 \NC setorigin \NC \NR
10112 \NC 1 \NC page \NC \NR
10113 \NC 2 \NC direct \NC \NR
10114 \stoptabulate
10116 \subsubsection{pdf_refobj nodes}
10118 Valid fields: \showfields{whatsit,pdf_refobj}\crlf
10119 Id: \showid{whatsit,pdf_refobj}
10121 \starttabulate[|lT|l|p|]
10122 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10123 \NC attr \NC \syntax{<node>} \NC \NC\NR
10124 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
10125 \stoptabulate
10127 \subsubsection{pdf_refxform nodes}
10129 Valid fields: \showfields{whatsit,pdf_refxform}\crlf
Id: \showid{whatsit,pdf_refxform}
10132 \starttabulate[|lT|l|p|]
10133 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10134 \NC attr \NC \syntax{<node>} \NC \NC\NR
10135 \NC width \NC number \NC \NC \NR
10136 \NC height \NC number \NC \NC \NR
10137 \NC depth \NC number \NC \NC \NR
10138 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
10139 \stoptabulate
10141 Be aware that \type{pdf_refxform} nodes have dimensions that are used by \LUATEX.
10143 \subsubsection{pdf_refximage nodes}
10145 Valid fields: \showfields{whatsit,pdf_refximage}\crlf
10146 Id: \showid{whatsit,pdf_refximage}
10148 \starttabulate[|lT|l|p|]
10149 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10150 \NC attr \NC \syntax{<node>} \NC \NC\NR
10151 \NC width \NC number \NC \NC \NR
10152 \NC height \NC number \NC \NC \NR
10153 \NC depth \NC number \NC \NC \NR
10154 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
10155 \stoptabulate
10157 Be aware that \type{pdf_refximage} nodes have dimensions that are used by \LUATEX.
10159 \subsubsection{pdf_annot nodes}
10161 Valid fields: \showfields{whatsit,pdf_annot}\crlf
10162 Id: \showid{whatsit,pdf_annot}
10164 \starttabulate[|lT|l|p|]
10165 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10166 \NC attr \NC \syntax{<node>} \NC \NC\NR
10167 \NC width \NC number \NC \NC \NR
10168 \NC height \NC number \NC \NC \NR
10169 \NC depth \NC number \NC \NC \NR
10170 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
10171 \NC data \NC string \NC the annotation data\NC\NR
10172 \stoptabulate
10175 \subsubsection{pdf_start_link nodes}
10177 Valid fields: \showfields{whatsit,pdf_start_link}\crlf
10178 Id: \showid{whatsit,pdf_start_link}
10180 \starttabulate[|lT|l|p|]
10181 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10182 \NC attr \NC \syntax{<node>} \NC \NC\NR
10183 \NC width \NC number \NC \NC \NR
10184 \NC height \NC number \NC \NC \NR
10185 \NC depth \NC number \NC \NC \NR
10186 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
10187 \NC link_attr \NC table \NC the link attribute token list\NC\NR
10188 \NC action \NC \syntax{<node>} \NC the action to perform\NC\NR
10189 \stoptabulate
10191 \subsubsection{pdf_end_link nodes}
10193 Valid fields: \showfields{whatsit,pdf_end_link}\crlf
10194 Id: \showid{whatsit,pdf_end_link}
10196 \starttabulate[|lT|l|p|]
10197 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10198 \NC attr \NC \syntax{<node>} \NC \NC\NR
10199 \stoptabulate
10201 \subsubsection{pdf_dest nodes}
10203 Valid fields: \showfields{whatsit,pdf_dest}\crlf
10204 Id: \showid{whatsit,pdf_dest}
10206 \starttabulate[|lT|l|p|]
10207 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10208 \NC attr \NC \syntax{<node>} \NC \NC\NR
10209 \NC width \NC number \NC \NC \NR
10210 \NC height \NC number \NC \NC \NR
10211 \NC depth \NC number \NC \NC \NR
10212 \NC named_id \NC number \NC is the dest_id a string value?\NC\NR
10213 \NC dest_id \NC number or string \NC the destination id\NC\NR
10214 \NC dest_type \NC number\NC type of destination\NC\NR
10215 \NC xyz_zoom \NC number\NC \NC\NR
10216 \NC objnum \NC number \NC the \PDF\ object number\NC\NR
10217 \stoptabulate
10219 \subsubsection{pdf_thread nodes}
10221 Valid fields: \showfields{whatsit,pdf_thread}\crlf
10222 Id: \showid{whatsit,pdf_thread}
10224 \starttabulate[|lT|l|p|]
10225 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10226 \NC attr \NC \syntax{<node>} \NC \NC\NR
10227 \NC width \NC number \NC \NC \NR
10228 \NC height \NC number \NC \NC \NR
10229 \NC depth \NC number \NC \NC \NR
10230 \NC named_id \NC number \NC is the tread_id a string value?\NC\NR
10231 \NC tread_id \NC number or string \NC the thread id\NC\NR
10232 \NC thread_attr\NC number \NC extra thread information\NC\NR
10233 \stoptabulate
10235 \subsubsection{pdf_start_thread nodes}
10237 Valid fields: \showfields{whatsit,pdf_start_thread}\crlf
10238 Id: \showid{whatsit,pdf_start_thread}
10240 \starttabulate[|lT|l|p|]
10241 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10242 \NC attr \NC \syntax{<node>} \NC \NC\NR
10243 \NC width \NC number \NC \NC \NR
10244 \NC height \NC number \NC \NC \NR
10245 \NC depth \NC number \NC \NC \NR
10246 \NC named_id \NC number \NC is the tread_id a string value?\NC\NR
10247 \NC tread_id \NC number or string \NC the thread id\NC\NR
10248 \NC thread_attr\NC number \NC extra thread information\NC\NR
10249 \stoptabulate
10251 \subsubsection{pdf_end_thread nodes}
10253 Valid fields: \showfields{whatsit,pdf_end_thread}\crlf
10254 Id: \showid{whatsit,pdf_end_thread}
10256 \starttabulate[|lT|l|p|]
10257 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10258 \NC attr \NC \syntax{<node>} \NC \NC\NR
10259 \stoptabulate
10261 \subsubsection{pdf_save_pos nodes}
10263 Valid fields: \showfields{whatsit,pdf_save_pos}\crlf
10264 Id: \showid{whatsit,pdf_save_pos}
10266 \starttabulate[|lT|l|p|]
10267 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10268 \NC attr \NC \syntax{<node>} \NC \NC\NR
10269 \stoptabulate
10271 \subsubsection{late_lua nodes}
10273 Valid fields: \showfields{whatsit,late_lua}\crlf
10274 Id: \showid{whatsit,late_lua}
10276 \starttabulate[|lT|l|p|]
10277 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10278 \NC attr \NC \syntax{<node>} \NC \NC\NR
10279 \NC data \NC string \NC data to execute\NC\NR
10280 \NC string \NC string \NC data to execute (0.63)\NC\NR
10281 \NC name \NC string \NC the name to use for lua error reporting\NC\NR
10282 \stoptabulate
The difference between \type{data} and \type{string} is that on
assignment, the \type{data} field is converted to a token list, as happens with
\tex{latelua}, whereas the \type{string} field is treated as a literal string.
10288 \subsubsection{pdf_colorstack nodes}
10290 Valid fields: \showfields{whatsit,pdf_colorstack}\crlf
10291 Id: \showid{whatsit,pdf_colorstack}
10293 \starttabulate[|lT|l|p|]
10294 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10295 \NC attr \NC \syntax{<node>} \NC \NC\NR
10296 \NC stack \NC number \NC colorstack id number\NC\NR
10297 \NC cmd \NC number \NC command to execute\NC\NR
10298 \NC data \NC string \NC data\NC\NR
10299 \stoptabulate
10301 \subsubsection{pdf_setmatrix nodes}
10303 Valid fields: \showfields{whatsit,pdf_setmatrix}\crlf
10304 Id: \showid{whatsit,pdf_setmatrix}
10306 \starttabulate[|lT|l|p|]
10307 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10308 \NC attr \NC \syntax{<node>} \NC \NC\NR
10309 \NC data \NC string \NC data\NC\NR
10310 \stoptabulate
10312 \subsubsection{pdf_save nodes}
10314 Valid fields: \showfields{whatsit,pdf_save}\crlf
10315 Id: \showid{whatsit,pdf_save}
10317 \starttabulate[|lT|l|p|]
10318 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10319 \NC attr \NC \syntax{<node>} \NC \NC\NR
10320 \stoptabulate
10322 \subsubsection{pdf_restore nodes}
10324 Valid fields: \showfields{whatsit,pdf_restore}\crlf
10325 Id: \showid{whatsit,pdf_restore}
10327 \starttabulate[|lT|l|p|]
10328 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10329 \NC attr \NC \syntax{<node>} \NC \NC\NR
10330 \stoptabulate
10332 \subsubsection{user_defined nodes}
10334 User|-|defined whatsit nodes can only be created and handled from \LUA\
10335 code. In effect, they are an extension to the extension
10336 mechanism. The \LUATEX\ engine will simply step over such whatsits
10337 without ever looking at the contents.
10339 Valid fields: \showfields{whatsit,user_defined}\crlf
10340 Id: \showid{whatsit,user_defined}
10342 \starttabulate[|lT|l|p|]
10343 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10344 \NC attr \NC \syntax{<node>} \NC \NC\NR
10345 \NC user_id \NC number \NC id number\NC\NR
10346 \NC type \NC number \NC type of the value\NC\NR
10347 \NC value \NC number \NC \NC\NR
10348 \NC \NC string \NC \NC\NR
10349 \NC \NC \syntax{<node>} \NC \NC\NR
10350 \NC \NC table \NC \NC\NR
10351 \stoptabulate
10353 The \type{type} can have one of five distinct values:
10355 \starttabulate[|lT|p|]
10356 \NC \ssbf value \NC \bf explanation \NC\NR
10357 \NC 97 \NC the value is an attribute node list \NC\NR
10358 \NC 100 \NC the value is a number \NC\NR
10359 \NC 110 \NC the value is a node list \NC\NR
10360 \NC 115 \NC the value is a string\NC\NR
10361 \NC 116 \NC the value is a token list in \LUA\ table form\NC\NR
10362 \stoptabulate
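
The five codes coincide with the ASCII codes of the letters a, d, n, s and t,
which makes them easier to remember. The following \LUA\ sketch relies on that
observation. The helper names \type{describe_user_type} and
\type{new_user_string} as well as the id 4711 are made up for the example, and
the whatsit is only created when the code runs inside \LUATEX.

\starttyping
-- Type codes of user_defined whatsits, keyed by the ASCII letter they match.
local user_types = {
    [string.byte("a")] = "attribute node list", -- 97
    [string.byte("d")] = "number",              -- 100
    [string.byte("n")] = "node list",           -- 110
    [string.byte("s")] = "string",              -- 115
    [string.byte("t")] = "token list",          -- 116
}

-- Hypothetical helper: describe the value type of a user_defined whatsit.
local function describe_user_type(code)
    return user_types[code] or "unknown"
end

-- Hypothetical helper: wrap a string in a user_defined whatsit.
-- The type is set before the value because the value depends on it.
local function new_user_string(user_id, str)
    local n = node.new("whatsit", "user_defined")
    n.user_id = user_id
    n.type    = string.byte("s")
    n.value   = str
    return n
end

-- Only run the demonstration where the node library exists.
if node and node.new then
    local n = new_user_string(4711, "hello") -- 4711 is an arbitrary id
    print(describe_user_type(n.type), n.value)
    node.free(n)
end
\stoptyping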
10364 \section{Two access models}
After doing lots of tests with \LUATEX\ and \LUAJITTEX, with and without
just|-|in|-|time compilation enabled, and with and without using ffi, we came to
the conclusion that userdata prevents a speedup. We also found that the checking
of metatables as well as assignment comes with overhead that can't be neglected.
This is normally not really a problem, but when processing fonts for more complex
scripts the overhead can become significant.
Because the userdata approach has some benefits, this remains the recommended way
to access nodes. We did several experiments with faster access using this model,
but eventually settled on the \quote {direct} approach. For code that is proven
to be okay, one can use this access model, which operates on nodes more directly.
Deep down in \TEX\ a node has a number which is an entry in a memory table. In
fact, this model, where \TEX\ manages memory, is really fast and one of the
reasons why plugging in callbacks that operate on nodes is quite fast. No matter
what future memory model \LUATEX\ has, an internal reference will always be a
simple data type (like a number or light userdata in \LUA\ speak). So, if you use
the direct model, even if you know that you currently deal with numbers, you
should not depend on that property but treat it as an abstraction just like
traditional nodes. Using a simple basic datatype has the penalty that less
checking can be done, but less checking is also the reason why it's somewhat
faster. An important aspect is that one cannot mix both methods, but you can cast
between both models.
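
As a minimal sketch of such a cast, assuming a userdata node obtained elsewhere:
\type{node.direct.todirect} and \type{node.direct.tonode} convert between the two
models. The helper name \type{id_via_direct} is made up for the example, and the
demonstration part only runs inside \LUATEX.

\starttyping
-- Hypothetical helper: look up the id of a userdata node via the direct model.
local function id_via_direct(n)
    local d = node.direct.todirect(n) -- opaque reference, never do arithmetic on it
    return node.direct.getid(d)
end

-- Only run the demonstration where the node library exists.
if node and node.direct then
    local g = node.new("glyph")
    assert(id_via_direct(g) == node.id("glyph"))
    -- cast back when a function expects a userdata node again
    local d = node.direct.todirect(g)
    node.free(node.direct.tonode(d))
end
\stoptyping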
10390 So our advice is: use the indexed approach when possible and investigate the
10391 direct one when speed might be an issue. For that reason we also provide the
10392 \type {get*} and \type {set*} functions in the top level node namespace. There is
10393 a limited set of getters. When implementing this direct approach the regular
10394 index by key variant was also optimized, so direct access only makes sense when
10395 we're accessing nodes millions of times (which happens in some font processing
10396 for instance).
We're talking mostly of getters because setters are less important. Documents
do not have that many content|-|related nodes, and setting many thousands of
properties is hardly a burden compared to millions of consultations.
10402 Normally you will access nodes like this:
10404 \starttyping
local next = current.next
if next then
    -- do something
end
10409 \stoptyping
10411 Here \type {next} is not a real field, but a virtual one. Accessing it results in
10412 a metatable method being called. In practice it boils down to looking up the
10413 node type and based on the node type checking for the field name. In a worst case
10414 you have a node type that sits at the end of the lookup list and a field that is
10415 last in the lookup chain. However, in successive versions of \LUATEX\ these lookups
10416 have been optimized and the most frequently accessed nodes and fields have a higher
10417 priority.
Because in practice the \type {next} accessor results in a function call, there
is some overhead involved. The following code does the same and performs a tiny
bit faster (but not by much, because it is still a function call, albeit one that
knows what to look up).
10424 \starttyping
local next = node.next(current)
if next then
    -- do something
end
10429 \stoptyping
10431 There are several such function based accessors now:
10433 \starttabulate[|T|p|]
\NC getnext \NC parsing a node list always involves this one \NC \NR
\NC getprev \NC used less but is the logical companion to getnext \NC \NR
\NC getid \NC consulted a lot \NC \NR
\NC getsubtype \NC consulted less but still among the most used \NC \NR
10438 \NC getfont \NC used a lot in otf handling (glyph nodes are consulted a lot) \NC \NR
10439 \NC getchar \NC idem and also in other places \NC \NR
10440 \NC getlist \NC we often parse nested lists so this is a convenient one too
10441 (only works for hlist and vlist!) \NC \NR
10442 \NC getleader \NC comparable to list, seldom used in \TEX\ (but needs frequent consulting
10443 like lists; leaders could have been made a dedicated node type) \NC \NR
10444 \NC getfield \NC generic getter, sufficient for the rest (other field names are
10445 often shared so a specific getter makes no sense then) \NC \NR
10446 \stoptabulate
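
A typical use of these accessors is a loop over a node list. The sketch below
tallies the glyphs per font in a flat list with the function based getters. The
function name \type{glyphs_per_font} is made up for the example and nested lists
are deliberately ignored to keep it short.

\starttyping
-- Hypothetical helper: count the glyphs per font in a flat node list.
local function glyphs_per_font(head)
    local getnext, getid = node.getnext, node.getid
    local getfont  = node.getfont
    local glyph_id = node.id("glyph")
    local counts   = { }
    local current  = head
    while current do
        if getid(current) == glyph_id then
            local font = getfont(current)
            counts[font] = (counts[font] or 0) + 1
        end
        current = getnext(current)
    end
    return counts -- maps font id to number of glyphs
end
\stoptyping

The getters are stored in locals first, which saves a table lookup per node in
long lists.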
It doesn't make sense to add more. Profiling demonstrated that these fields are
accessed far more often than other fields. Even in complex documents, many node
and field types never get seen, or are seen only a few times. Most functions in
the \type {node} namespace have a companion in \type {node.direct}, but of course
not the ones that don't deal with nodes themselves. The following table
summarizes this:
10454 \start \def\yes{$+$} \def\nop{$-$}
10456 \starttabulate[|T|c|c|]
10458 \NC \bf function \NC \bf node \NC \bf direct \NC \NR
10460 \NC copy \NC \yes \NC \yes \NC \NR
10461 \NC copy_list \NC \yes \NC \yes \NC \NR
10462 \NC count \NC \yes \NC \yes \NC \NR
10463 \NC current_attr \NC \yes \NC \yes \NC \NR
10464 \NC dimensions \NC \yes \NC \yes \NC \NR
10465 \NC do_ligature_n \NC \yes \NC \yes \NC \NR
10466 \NC end_of_math \NC \yes \NC \yes \NC \NR
10467 \NC family_font \NC \yes \NC \nop \NC \NR
10468 \NC fields \NC \yes \NC \nop \NC \NR
10469 \NC first_character \NC \yes \NC \nop \NC \NR
10470 \NC first_glyph \NC \yes \NC \yes \NC \NR
10471 \NC flush_list \NC \yes \NC \yes \NC \NR
10472 \NC flush_node \NC \yes \NC \yes \NC \NR
10473 \NC free \NC \yes \NC \yes \NC \NR
10474 \NC getbox \NC \nop \NC \yes \NC \NR
10475 \NC getchar \NC \yes \NC \yes \NC \NR
10476 \NC getfield \NC \yes \NC \yes \NC \NR
10477 \NC getfont \NC \yes \NC \yes \NC \NR
10478 \NC getid \NC \yes \NC \yes \NC \NR
10479 \NC getnext \NC \yes \NC \yes \NC \NR
10480 \NC getprev \NC \yes \NC \yes \NC \NR
10481 \NC getlist \NC \yes \NC \yes \NC \NR
10482 \NC getleader \NC \yes \NC \yes \NC \NR
10483 \NC getsubtype \NC \yes \NC \yes \NC \NR
10484 \NC has_glyph \NC \yes \NC \yes \NC \NR
10485 \NC has_attribute \NC \yes \NC \yes \NC \NR
10486 \NC has_field \NC \yes \NC \yes \NC \NR
10487 \NC hpack \NC \yes \NC \yes \NC \NR
10488 \NC id \NC \yes \NC \nop \NC \NR
10489 \NC insert_after \NC \yes \NC \yes \NC \NR
10490 \NC insert_before \NC \yes \NC \yes \NC \NR
10491 \NC is_direct \NC \nop \NC \yes \NC \NR
10492 \NC is_node \NC \yes \NC \yes \NC \NR
10493 \NC kerning \NC \yes \NC \nop \NC \NR
10494 \NC last_node \NC \yes \NC \yes \NC \NR
10495 \NC length \NC \yes \NC \yes \NC \NR
10496 \NC ligaturing \NC \yes \NC \nop \NC \NR
10497 \NC mlist_to_hlist \NC \yes \NC \nop \NC \NR
10498 \NC new \NC \yes \NC \yes \NC \NR
10499 \NC next \NC \yes \NC \nop \NC \NR
10500 \NC prev \NC \yes \NC \nop \NC \NR
10501 \NC tostring \NC \yes \NC \yes \NC \NR
10502 \NC protect_glyphs \NC \yes \NC \yes \NC \NR
10503 \NC protrusion_skippable \NC \yes \NC \yes \NC \NR
10504 \NC remove \NC \yes \NC \yes \NC \NR
10505 \NC set_attribute \NC \yes \NC \yes \NC \NR
10506 \NC setbox \NC \yes \NC \yes \NC \NR
10507 \NC setfield \NC \yes \NC \yes \NC \NR
10508 \NC slide \NC \yes \NC \yes \NC \NR
10509 \NC subtype \NC \yes \NC \nop \NC \NR
10510 \NC tail \NC \yes \NC \yes \NC \NR
10511 \NC todirect \NC \yes \NC \yes \NC \NR
10512 \NC tonode \NC \yes \NC \yes \NC \NR
10513 \NC traverse \NC \yes \NC \yes \NC \NR
10514 \NC traverse_id \NC \yes \NC \yes \NC \NR
10515 \NC type \NC \yes \NC \nop \NC \NR
10516 \NC types \NC \yes \NC \nop \NC \NR
10517 \NC unprotect_glyphs \NC \yes \NC \yes \NC \NR
10518 \NC unset_attribute \NC \yes \NC \yes \NC \NR
10519 \NC usedlist \NC \yes \NC \yes \NC \NR
10520 \NC vpack \NC \yes \NC \yes \NC \NR
10521 \NC whatsits \NC \yes \NC \nop \NC \NR
10522 \NC write \NC \yes \NC \yes \NC \NR
10523 \stoptabulate
10525 \stop
10527 The \type {node.next} and \type {node.prev} functions will stay but for
10528 consistency there are variants called \type {getnext} and \type {getprev}.
10529 We had to use \type{get} because \type {node.id} and \type {node.subtype} are
10530 already taken for providing meta information about nodes.
Note that the getters perform only basic checking for valid keys, so you should
stick to the keys mentioned in the sections that describe node properties.
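
Because the getters carry the same names in both namespaces, a traversal can be
written once and used with either model. The following sketch assumes nothing
beyond the functions listed in the table above. The name \type{count_glyphs} and
its \type{api} and \type{ids} arguments are made up for the example. The ids are
looked up with \type{node.id} because that function has no direct companion.

\starttyping
-- Count glyph nodes in a list, descending into hlist and vlist nodes.
-- 'api' is either the 'node' table or 'node.direct', 'ids' holds the
-- numeric node ids of the types we are interested in.
local function count_glyphs(head, api, ids)
    local n = 0
    local current = head
    while current do
        local id = api.getid(current)
        if id == ids.glyph then
            n = n + 1
        elseif id == ids.hlist or id == ids.vlist then
            n = n + count_glyphs(api.getlist(current), api, ids)
        end
        current = api.getnext(current)
    end
    return n
end

-- Only run the demonstration where the node library exists.
if node and node.direct then
    local ids = {
        glyph = node.id("glyph"),
        hlist = node.id("hlist"),
        vlist = node.id("vlist"),
    }
    local g = node.new("glyph")
    print(count_glyphs(g, node, ids))
    print(count_glyphs(node.direct.todirect(g), node.direct, ids))
    node.free(g)
end
\stoptyping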
10534 \chapter{Modifications}
10536 Besides the expected changes caused by new functionality, there are a
10537 number of not|-|so|-|expected changes. These are sometimes a side|-|effect
10538 of a new (conflicting) feature, or, more often than not, a change
10539 necessary to clean up the internal interfaces.
10541 \section{Changes from \TEX\ 3.1415926}
10543 \startitemize
10545 \item The current code base is written in C, not Pascal web (as of \LUATEX~0.42.0).
10547 \item See~\in{chapter}[languages] for many small changes related to paragraph
10548 building, language handling, and hyphenation. Most important change:
10549 adding a brace group in the middle of a word (like in \type{of{}fice})
10550 does not prevent ligature creation.
10552 \item There is no pool file, all strings are embedded during compilation.
10554 \item \type {plus 1 fillll} does not generate an error. The extra \quote{l} is
10555 simply typeset.
10557 \item The upper limit to \tex{endlinechar} and \tex{newlinechar} is 127.
10559 \stopitemize
10561 \section{Changes from \ETEX\ 2.2}
10563 \startitemize
10565 \item The \ETEX\ functionality is always present and enabled
10566 (but see below about \TEXXET), so the prepended asterisk or
10567 \type{-etex} switch for \INITEX\ is not needed.
10569 \item \TEXXET\ is not present, so the primitives
10571 \starttyping
10572 \TeXXeTstate
10573 \beginR
10574 \beginL
10575 \endR
10576 \endL
10577 \stoptyping
10579 are missing.
10581 \item Some of the tracing information that is output by \ETEX's \tex{tracingassigns} and
10582 \tex{tracingrestores} is not there.
\item Register management in \LUATEX\ uses the \ALEPH\ model, so the maximum
register number is 65535 and the implementation uses a flat array instead of the
mixed flat|\&|sparse model from \ETEX.
10588 \item \type{savinghyphcodes} is a no-op.
10589 See~\in{chapter}[languages] for details.
10591 \item When kpathsea is used to find files, \LUATEX\ uses the
10592 \type{ofm} file format to search for font metrics. In turn, this means
10593 that \LUATEX\ looks at the \type{OFMFONTS} configuration variable
10594 (like \OMEGA\ and \ALEPH) instead of \type{TFMFONTS} (like \TEX\ and
10595 \PDFTEX). Likewise for virtual fonts (\LUATEX\ uses the variable
10596 \type{OVFFONTS} instead of \type{VFFONTS}).
10599 \stopitemize
10601 \section{Changes from \PDFTEX\ 1.40}
10603 \startitemize
10605 \item The (experimental) support for snap nodes has been removed, because
10606 it is much more natural to build this functionality on top of node
10607 processing and attributes. The associated primitives that are now gone
10608 are: \tex{pdfsnaprefpoint}, \tex{pdfsnapy}, and \tex{pdfsnapycomp}.
10610 \item The (experimental) support for specialized spacing around nodes
10611 has also been removed. The associated primitives that are now gone are:
10612 \tex{pdfadjustinterwordglue}, \tex{pdfprependkern}, and \tex{pdfappendkern},
10613 as well as the five supporting primitives \tex{knbscode}, \tex{stbscode},
10614 \tex{shbscode}, \tex{knbccode}, and \tex{knaccode}.
10616 \item A number of \quote{pdftex primitives} have been removed:
10618 \startcolumns[n=2,balance=yes]
10619 \starttyping
10620 \pdfeachlinedepth
10621 \pdfeachlineheight
10622 \pdfelapsedtime
10623 \pdfescapehex
10624 \pdfescapename
10625 \pdfescapestring
10626 \pdffiledump
10627 \pdffilemoddate
10628 \pdffilesize
10629 \pdffirstlineheight
10630 \pdfforcepagebox
10631 \pdfignoreddimen
10632 \pdflastlinedepth
10633 \pdflastmatch
10634 \pdfmatch
10635 \pdfmdfivesum
10636 \pdfmovechars
10637 \pdfoptionalwaysusepdfpagebox
10638 \pdfoptionpdfinclusionerrorlevel
10639 \pdfresettimer
10640 \pdfshellescape
10641 \pdfstrcmp
10642 \pdftexbanner
10643 \pdftexrevision
10644 \pdftexversion
10645 \pdfunescapehex
10646 \stoptyping
10648 \stopcolumns
10650 \item A few other experimental primitives are also provided without the
10651 extra \luatex {pdf} prefix, so they can also be called like this:
10653 \startcolumns[n=2,balance=yes]
10654 \starttyping
10655 \primitive
10656 \ifprimitive
10657 \ifabsnum
10658 \ifabsdim
10659 \stoptyping
10660 \stopcolumns
\item The PNG transparency fix from 1.40.6 is not applied
(high-level support is pending).
10665 \item LFS (\PDF\ Files larger than 2GiB) support is not working yet.
10667 \item \LUATEX~0.45.0 introduces two extra token lists, \tex{pdfxformresources}
10668 and \tex{pdfxformattr}, as an alternative to \tex{pdfxform} keywords.
\item As of \LUATEX~0.50.0 it is no longer possible for fonts from embedded pdf files
10671 to be replaced by / merged with the document fonts of the enveloping
10672 pdf document. This regression may be temporary, depending on how the
10673 rewritten font backend will look after beta 0.60.
10676 \stopitemize
10678 \section{Changes from \ALEPH\ RC4}
10680 \startitemize
10682 \item Starting with \LUATEX\ 0.75.0, the extended 16-bit math primitives
(\tex{omathcode} etc.) have been removed.
10685 \item Starting with \LUATEX\ 0.63.0, OCP processing is no longer
10686 supported at all. As a consequence, the following primitives have
10687 been removed:
10689 \startcolumns[n=2]
10690 \starttyping
10691 \ocp
10692 \externalocp
10693 \ocplist
10694 \pushocplist
10695 \popocplist
10696 \clearocplists
10697 \addbeforeocplist
10698 \addafterocplist
10699 \removebeforeocplist
10700 \removeafterocplist
10701 \ocptracelevel
10702 \stoptyping
10703 \stopcolumns
10705 \item \LUATEX\ only understands 4~of the 16~direction
10706 specifiers of \ALEPH: \type{TLT} (latin), \type{TRT} (arabic),
10707 \type{RTT} (cjk), \type{LTL} (mongolian). All other direction
10708 specifiers generate an error (\LUATEX\ 0.45).
10710 \item The input translations from \ALEPH\ are not implemented, the
10711 related primitives are not available:
10713 \startcolumns[n=2]
10714 \starttyping
10715 \DefaultInputMode
10716 \noDefaultInputMode
10717 \noInputMode
10718 \InputMode
10719 \DefaultOutputMode
10720 \noDefaultOutputMode
10721 \noOutputMode
10722 \OutputMode
10723 \DefaultInputTranslation
10724 \noDefaultInputTranslation
10725 \noInputTranslation
10726 \InputTranslation
10727 \DefaultOutputTranslation
10728 \noDefaultOutputTranslation
10729 \noOutputTranslation
10730 \OutputTranslation
10731 \stoptyping
10732 \stopcolumns
\item The \tex{hoffset} bug when \tex{pagedir TRT} is fixed,
removing the need for an explicit fix to \tex{hoffset}.
10737 \item A bug causing \tex{fam} to fail for family numbers above
10738 15 is fixed.
\item A fair number of other minor bugs are fixed as well, most of these
related to \tex{tracingcommands} output.
10743 \item The internal function \type{scan_dir()} has been renamed to
10744 \type{scan_direction()} to prevent a naming clash, and it now allows
10745 an optional space after the direction is completely parsed.
10747 \item The \type{^^} notation can come in five and six item repetitions also, to
10748 insert characters that do not fit in the BMP.
10750 \item Glues {\it immediately after} direction change commands are not
10751 legal breakpoints.
10753 \stopitemize
10755 \section{Changes from standard \WEBC}
10757 \startitemize
\item There is no mltex.
\item There is no enctex.
10763 \item The following commandline switches are silently ignored, even
10764 in non|-|\LUA\ mode:
10766 \starttyping
10767 -8bit
10768 -translate-file=TCXNAME
10769 -mltex
10770 -enc
10771 -etex
10772 \stoptyping
10774 \item \tex{openout} whatsits are not written to the log file.
10776 \item Some of the so|-|called web2c extensions are hard to set up
10777 in non|-|\KPSE\ mode because texmf.cnf is not read: \type{shell-escape}
10778 is off (but that is not a problem because of \LUA's
10779 \lua{os.execute}), and the paranoia checks on \type{openin} and
10780 \type{openout} do not happen (however, it is easy for a \LUA\ script
10781 to do this itself by overloading \lua{io.open}).
10783 \item The \quote{E} option does not do anything useful.
10785 \stopitemize
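
The remark about overloading \lua{io.open} can be sketched as follows. The policy
shown here is invented for the example and is not the check that \type{kpathsea}
performs: it refuses writing to absolute paths, to paths that climb out of the
current directory and to dot files. The name \type{write_allowed} is made up as
well.

\starttyping
local real_open = io.open

-- Hypothetical policy: refuse to write to absolute paths, to paths that
-- climb out of the current directory and to hidden (dot) files.
local function write_allowed(name)
    if name:match("^[/\\]") or name:match("^%a:") then
        return false -- absolute path (Unix or Windows style)
    end
    local last
    for part in name:gmatch("[^/\\]+") do
        if part == ".." then
            return false -- would leave the current directory
        end
        last = part
    end
    if last == nil or last:sub(1, 1) == "." then
        return false -- empty name or dot file
    end
    return true
end

-- Replacement that consults the policy for every mode that can write.
function io.open(name, mode)
    mode = mode or "r"
    if mode:find("[wa+]") and not write_allowed(name) then
        return nil, name .. ": writing denied by local policy"
    end
    return real_open(name, mode)
end
\stoptyping

Reading is left untouched in this sketch. A stricter setup could apply a second
policy to the read modes in the same way.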
10787 \chapter{Implementation notes}
10789 \section{Primitives overlap}
10791 The primitives
10793 \starttabulate[|l|l|]
10794 \NC \tex{pdfpagewidth} \NC \tex{pagewidth} \NC \NR
10795 \NC \tex{pdfpageheight}\NC \tex{pageheight} \NC \NR
10796 \NC \tex{fontcharwd} \NC \tex{charwd} \NC \NR
10797 \NC \tex{fontcharht} \NC \tex{charht} \NC \NR
10798 \NC \tex{fontchardp} \NC \tex{chardp} \NC \NR
10799 \NC \tex{fontcharic} \NC \tex{charit} \NC \NR
10800 \stoptabulate
are aliases of each other: the two primitives in each row are equivalent.
10804 \section{Memory allocation}
10806 The single internal memory heap that traditional \TEX\ used for tokens
10807 and nodes is split into two separate arrays. Each of these will grow
10808 dynamically when needed.
10810 The \type{texmf.cnf} settings related to main memory are no longer
10811 used (these are: \type{main_memory}, \type{mem_bot},
10812 \type{extra_mem_top} and \type{extra_mem_bot}). \quote{Out of main
10813 memory} errors can still occur, but the limiting factor is now the
10814 amount of RAM in your system, not a predefined limit.
10816 Also, the memory (de)allocation routines for nodes are completely
10817 rewritten. The relevant code now lives in the C file \type{texnode.c},
10818 and basically uses a dozen or so \quote{avail} lists instead of a
10819 doubly|-|linked model. An extra function layer is added so that the
10820 code can ask for nodes by type instead of directly requisitioning
10821 a certain amount of memory words.
10823 Because of the split into two arrays and the resulting differences in the data
10824 structures, some of the macros have been duplicated. For instance, there are now
10825 \type{vlink} and \type{vinfo} as well as \type{token_link} and \type{token_info}. All
10826 access to the variable memory array is now hidden behind a macro called \type{vmem}.
10828 The implementation of the growth of two arrays (via reallocation)
10829 introduces a potential pitfall: the memory arrays should never be used
10830 as the left hand side of a statement that can modify the array in
10831 question.
10833 The input line buffer and pool size are now also reallocated when
10834 needed, and the \type{texmf.cnf} settings \type{buf_size} and
10835 \type{pool_size} are silently ignored.
10837 \section{Sparse arrays}
10839 The \tex{mathcode}, \tex{delcode}, \tex{catcode},
10840 \tex{sfcode}, \tex{lccode} and \tex{uccode} tables are now
10841 sparse arrays that are implemented in~C. They are no longer part of
10842 the \TEX\ \quote{equivalence table} and because each had 1.1 million
10843 entries with a few memory words each, this makes a major difference
10844 in memory usage.
10846 The \tex{catcode}, \tex{sfcode}, \tex{lccode} and \tex{uccode} assignments
10847 do not yet show up when using the etex tracing routines \tex{tracingassigns}
10848 and \tex{tracingrestores} (code simply not written yet).
10850 A side|-|effect of the current implementation is that \tex{global} is
10851 now more expensive in terms of processing than non|-|global assignments.
10853 See \type{mathcodes.c} and \type{textcodes.c} if you are interested in
10854 the details.
10856 Also, the glyph ids within a font are now managed by means
10857 of a sparse array and glyph ids can go up to index $2^{21}-1$.
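
The idea of a sparse array can be illustrated in a few lines of \LUA. This is
only a sketch of the concept, not the C implementation found in
\type{mathcodes.c} and \type{textcodes.c}. The name \type{sparse_array} and the
default value are made up, and the slot count assumes that \quote{1.1 million}
refers to the code point range up to \type{0x10FFFF}.

\starttyping
-- Number of slots when every code point up to 0x10FFFF gets an entry.
local slots = 0x10FFFF + 1 -- 1114112, roughly 1.1 million
-- Largest glyph index mentioned above.
local max_glyph = 2^21 - 1 -- 2097151

-- Hypothetical sparse array: only values that differ from the default
-- are stored, every other index yields the default.
local function sparse_array(default)
    return setmetatable({ }, { __index = function() return default end })
end

local codes = sparse_array(0)
codes[0x41]     = 0x61 -- two explicit assignments ...
codes[0x10FFFF] = 7

local stored = 0
for _ in pairs(codes) do
    stored = stored + 1 -- ... so only two entries occupy memory
end
print(slots, stored)
\stoptyping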
10859 \section{Simple single-character csnames}
10861 Single|-|character commands are no longer treated specially in the
10862 internals, they are stored in the hash just like the multiletter
10863 csnames.
10865 The code that displays control sequences explicitly checks if
10866 the length is one when it has to decide whether or not to add a
10867 trailing space.
10869 Active characters are internally implemented as a special type
10870 of multi-letter control sequences that uses a prefix that is
10871 otherwise impossible to obtain.
10873 \section{Compressed format}
10875 The format is passed through zlib, allowing it to shrink to roughly
half of the size it would have had in uncompressed form. This takes a
few more CPU cycles but much less disk I/O, so it should still be
faster.
10880 \section{Binary file reading}
10882 All of the internal code is changed in such a way that if one of the
10883 \type{read_xxx_file} callbacks is not set, then the file is read by
10884 a C function using basically the same convention as the callback: a
10885 single read into a buffer big enough to hold the entire file
10886 contents. While this uses more memory than the previous code (that
10887 mostly used \type{getc} calls), it can be quite a bit faster
10888 (depending on your I/O subsystem).
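
The convention can be sketched in \LUA\ as follows, assuming that a
\type{read_xxx_file} callback receives a file name and returns a success flag,
the data and its size. The function name \type{read_whole_file} is made up, and
the registration is left as a comment because the callback name is only an
example.

\starttyping
-- Hypothetical helper: read a whole file in one go and report the result
-- as success flag, data and size.
local function read_whole_file(name)
    local f = io.open(name, "rb")
    if not f then
        return false, nil, 0
    end
    local data = f:read("*a") -- one read instead of many small ones
    f:close()
    return true, data, #data
end

-- Inside LuaTeX the function could be registered for one of the binary
-- file types, for instance:
-- callback.register("read_truetype_file", read_whole_file)
\stoptyping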
10890 \chapter{Known bugs and limitations, TODO}
There used to be lists of bugs and planned features below here, but that did not
10893 work out too well. There are lists of open bugs and feature requests in the tracker at
10894 \hyphenatedurl{http://tracker.luatex.org}.
10896 \stoptext