1 % engine=luatex language=uk
2 % $Id$
4 % TODO: fix layout of function legend descriptions
5 % check numbers
6 % check \luatex command
8 %\nopdfcompression
9 %\loggingall
10 \environment luatexref-env
11 \logo[DFONT] {dfont}
12 \logo[CFF] {cff}
13 \logo[CMAP] {CMap}
14 \logo[PATGEN] {patgen}
15 \logo[MP] {MetaPost}
16 \logo[METAPOST]{MetaPost}
17 \logo[MPLIB] {MPlib}
18 \logo[COCO] {coco}
19 \logo[SUNOS] {SunOS}
20 \logo[BSD] {bsd}
21 \logo[SYSV] {sysv}
22 \logo[DPI] {dpi}
24 \setvariables
25 [document]
26 [beta=0.79.2]
28 \starttext
30 \dontcomplain \nonknuthmode
32 \setups[titlepage]
34 \title{Contents}
36 \placecontent[criterium=text,level=subsection]
38 \chapter{Introduction}
40 \startframedtext[framecolor=red,foregroundcolor=red,width=\hsize,style=\tfa]
42 This book will eventually become the reference manual of \LUATEX.
43 At the moment, it simply reports the behavior of the executable
44 matching the snapshot or beta release date in the title page.
46 \blank
48 Features may come and go. The current version of \LUATEX\ is not
49 meant for production and users cannot depend on stability, nor on
50 functionality staying the same.
52 \blank
54 Nothing is considered stable just yet. This manual therefore
55 simply reflects the current state of the executable. {\bs
56 Absolutely nothing\/} on the following pages is set in stone. When
57 the need arises, anything can (and will) be changed.
59 \blank
61 {\bf If you are not willing to deal with this situation, you should
62 wait for the stable version. Currently we expect the 1.0 release to
happen in spring 2014. Full stabilization will not happen soon; the
TODO list is still large.}
66 \stopframedtext
68 \blank[2*line]
70 \LUATEX\ consists of a number of interrelated but (still)
71 distinguishable parts:
73 \startitemize[packed]
74 \item \PDFTEX\ version 1.40.9, converted to C (with patches from later releases).
75 \item The direction model and some other bits from \ALEPH\ RC4 converted to C.
76 \item \LUA\ 5.2.1
77 \item dedicated \LUA\ libraries
78 \item various \TEX\ extensions
79 \item parts of \FONTFORGE\ 2008.11.17
80 \item the \METAPOST\ library
81 \item newly written compiled source code to glue it all together
82 \stopitemize
Neither \ALEPH's I/O translation processes, nor tcx files, nor
\ENCTEX\ can be used; these encoding|-|related functions are
86 superseded by a \LUA|-|based solution (reader callbacks). Also, some
87 experimental \PDFTEX\ features are removed. These can be implemented
88 in \LUA\ instead.
90 \chapter{Basic \TEX\ enhancements}
92 \section{Introduction}
94 From day one, \LUATEX\ has offered extra functionality when compared
95 to the superset of \PDFTEX\ and \ALEPH. That has not been limited to
the possibility to execute \LUA\ code via \type{\directlua}, but
97 \LUATEX\ also adds functionality via new \TEX-side primitives.
99 However, starting with beta \type{0.39.0}, most of that functionality
100 is hidden by default. When \LUATEX\ 0.40.0 starts up in
101 \quote{iniluatex} mode (\type{luatex -ini}), it defines only the
102 primitive commands known by \TEX82 and the one extra command
103 \type{\directlua}.
As is fitting, a \LUA\ function has to be called to add the extra
106 primitives to the user environment. The simplest method to get access
107 to all of the new primitive commands is by adding this line to the
108 format generation file:
110 \starttyping
111 \directlua { tex.enableprimitives('',tex.extraprimitives()) }
112 \stoptyping
But be aware that the curly braces may not have the proper \type{\catcode}
assigned to them at this early time (giving a \quote{Missing number} error),
so you may need to put these assignments
118 \starttyping
119 \catcode `\{=1
120 \catcode `\}=2
121 \stoptyping
123 before the above line.
More fine-grained control over the primitives is possible; you can look up the details in
125 \in{section}[luaprimitives]. For simplicity's sake, this manual assumes
126 that you have executed the \type{\directlua} command as given above.
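
As a minimal sketch of such finer control (assuming the interface described in
\in{section}[luaprimitives]), the call below enables a single primitive and
gives it a prefix. The prefix \type{LuaTeX} is an arbitrary choice made for
this illustration:

\starttyping
% sketch: define \LuaTeXformatname with the meaning of \formatname,
% leaving all other extra primitives hidden
\directlua { tex.enableprimitives('LuaTeX', {'formatname'}) }
\stoptyping
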
128 The startup behavior documented above is considered stable in the sense
129 that there will not be backward-incompatible changes any more.
131 \section{Version information}
133 There are three new primitives to test the version of \LUATEX:
135 \starttabulate[|l|p|]
136 \NC \bf primitive \NC \bf explanation \NC\NR
137 \NC \tex{luatexversion} \NC a combination of major and minor number, as in \PDFTEX;
the current value is {\bf\the\luatexversion} \NC\NR
139 \NC \tex{luatexrevision} \NC the revision number, as in \PDFTEX;
140 the current value is {\bf\luatexrevision} \NC\NR
141 \NC \tex{luatexdatestamp} \NC (deprecated in 0.78.1, will be gone in 0.80.0)
142 a combination of the local date and hour when
143 the current executable was compiled,
144 the syntax is identical to \tex{luatexrevision};
145 the value for the executable that generated this
146 document is {\bf\luatexdatestamp}. \NC\NR
147 \stoptabulate
149 The official \LUATEX\ version is defined as follows:
151 \startitemize
152 \item The major version is the integer result of \tex{luatexversion} divided by 100.
153 The primitive is an \quote{internal variable}, so you may need to prefix its
154 use with \type{\the} depending on the context.
155 \item The minor version is the two-digit result of \tex{luatexversion} modulo 100.
\item The revision is given by \tex{luatexrevision}. This primitive expands to a
157 positive integer.
158 \item The full version number consists of the major version,
159 minor version and revision, separated by dots.
160 \stopitemize
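
The rules above can be sketched in plain \TEX\ arithmetic as follows. The
register names \type{\vmajor} and \type{\vminor} are hypothetical, and
\type{\newcount} is assumed to be provided by the macro package:

\starttyping
\newcount\vmajor \newcount\vminor
\vmajor=\luatexversion \divide\vmajor by 100   % \divide truncates: major version
\vminor=\vmajor \multiply\vminor by -100
\advance\vminor by \luatexversion              % version modulo 100: minor version
\message{LuaTeX \the\vmajor.\ifnum\vminor<10 0\fi\the\vminor.\luatexrevision}
\stoptyping

For an executable with \tex{luatexversion} 79 and \tex{luatexrevision} 2, the
message should read \type{LuaTeX 0.79.2}. Note that \type{\numexpr} is avoided
for the division because it rounds instead of truncating.
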
162 \section{\UNICODE\ text support}
164 Text input and output is now considered to be \UNICODE\ text, so
165 input characters can use the full range of \UNICODE\ ($2^{20}+2^{16}-1
166 = \hbox{0x10FFFF}$).
168 Later chapters will talk of characters and glyphs. Although these
169 are not interchangeable, they are closely related. During
170 typesetting, a character is always converted to a suitable graphic
171 representation of that character in a specific font. However,
172 while processing a list of to|-|be|-|typeset nodes, its contents
173 may still be seen as a character. Inside \LUATEX\ there is not yet
174 a clear separation between the two concepts. Until this is
175 implemented, please do not be too harsh on us if we make errors in
176 the usage of the terms.
178 A few primitives are affected by this, all in a similar fashion: each
of them has to accommodate a larger range of acceptable numbers.
180 For instance, \tex{char} now accepts values between~0 and
181 $1{,}114{,}111$. This should not be a problem for well|-|behaved input
182 files, but it could create incompatibilities for input that would have
183 generated an error when processed by older \TEX|-|based engines. The
184 affected commands with an altered initial (left of the equals sign) or
185 secondary (right of the equals sign) value are: \tex{char},
\tex{lccode}, \tex{uccode}, \tex{catcode}, \tex{sfcode}, \tex{efcode},
187 \tex{lpcode}, \tex{rpcode}, \tex{chardef}.
189 As far as the core engine is concerned, all input and output to
190 text files is \UTF-8 encoded. Input files can be pre|-|processed
191 using the \luatex{reader} callback. This will be explained in a
192 later chapter.
194 Output in byte|-|sized chunks can be achieved by using characters
195 just outside of the valid \UNICODE\ range, starting at the value
196 $1{,}114{,}112$ (0x110000). When the time comes to print a character
$c \geq 1{,}114{,}112$, \LUATEX\ will actually print the single byte
corresponding to $c - 1{,}114{,}112$.
200 Output to the terminal uses \type{^^} notation for the lower
201 control range ($c<32$), with the exception of \type{^^I},
202 \type{^^J} and \type{^^M}. These are considered \quote{safe} and
203 therefore printed as-is.
205 Normalization of the \UNICODE\ input can be handled by a macro package
206 during callback processing (this will be explained in \in{section}[iocallback]).
208 \section{Extended tables}
All traditional \TEX\ and \ETEX\ registers can be addressed with 16-bit numbers, as in
211 \ALEPH. The affected commands are:
213 \startcolumns[n=4]
214 \starttyping
215 \count
216 \dimen
217 \skip
218 \muskip
219 \marks
220 \toks
221 \countdef
222 \dimendef
223 \skipdef
224 \muskipdef
225 \toksdef
226 \box
227 \unhbox
228 \unvbox
229 \copy
230 \unhcopy
231 \unvcopy
235 \setbox
236 \vsplit
237 \stoptyping
238 \stopcolumns
240 The glyph properties (like \type {\efcode}) introduced in \PDFTEX\
241 that deal with font expansion (hz) and character protruding are
242 also 16-bit. Because font memory management has been rewritten,
these character properties are no longer shared among font
instances that originate from the same metric file.
246 The behavior documented in the above section is considered stable
247 in the sense that there will not be backward-incompatible changes any
248 more.
250 \section{Attribute registers}
252 Attributes are a completely new concept in \LUATEX. Syntactically,
253 they behave a lot like counters: attributes obey \TEX's nesting stack
254 and can be used after \tex{the} etc.\ just like the normal
255 \tex{count} registers.
257 \startsyntax
258 \attribute <16-bit number> <optional equals> <32-bit number>!crlf
259 \attributedef <csname> <optional equals> <16-bit number>
260 \stopsyntax
262 Conceptually, an attribute is either \quote{set} or
\quote{unset}. Unset attributes have a special negative value to
indicate that they are unset; that value is the lowest legal value:
265 \type{-"7FFFFFFF} in hexadecimal, a.k.a. $-2147483647$ in decimal.
266 It follows that the value \type{-"7FFFFFFF} cannot be used as
267 a legal attribute value, but you {\it can\/} assign \type{-"7FFFFFFF} to
268 \quote{unset} an attribute. All attributes start out in this
269 \quote{unset} state in \INITEX\ (prior to 0.37, there could not be
270 valid negative attribute values, and the \quote{unset} value was $-1$).
272 Attributes can be used as extra counter values, but their usefulness
273 comes mostly from the fact that the numbers and values of all \quote{set}
274 attributes are attached to all nodes created in their scope. These can
275 then be queried from any \LUA\ code that deals with node
276 processing. Further information about how to use attributes for node
277 list processing from \LUA\ is given in~\in{chapter}[nodes].
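
A minimal sketch of the syntax, assuming the extra primitives are enabled as
described earlier; the name \type{\myattr} and the attribute number 10 are
hypothetical:

\starttyping
\attributedef\myattr=10   % \myattr now refers to attribute register 10
\myattr=3                 % set: nodes created from here on carry 10=3
\hbox{marked}
\message{\the\myattr}     % attributes can be used after \the, like counters
\myattr=-"7FFFFFFF        % assigning the special value unsets the attribute
\stoptyping
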
279 The behavior documented in the above subsection is considered stable
280 in the sense that there will not be backward-incompatible changes any
281 more.
284 \subsection{Box attributes}
286 Nodes typically receive the list of attributes that is in effect when
287 they are created. This moment can be quite asynchronous. For example: in
288 paragraph building, the individual line boxes are created after the
289 \tex{par} command has been processed, so they will receive the list of
290 attributes that is in effect then, not the attributes that were in
291 effect in, say, the first or third line of the paragraph.
293 Similar situations happen in \LUATEX\ regularly. A few of the more
294 obvious problematic cases are dealt with: the attributes for nodes
295 that are created during hyphenation, kerning and ligaturing borrow their
296 attributes from their surrounding glyphs, and it is possible to
297 influence box attributes directly.
299 When you assemble a box in a register, the attributes of the nodes
300 contained in the box are unchanged when such a box is placed,
301 unboxed, or copied. In this respect attributes act the same as
302 characters that have been converted to references to glyphs in
303 fonts. For instance, when you use attributes to implement color
304 support, each node carries information about its eventual color. In that
305 case, unless you implement mechanisms that deal with it, applying
306 a color to already boxed material will have no effect. Keep in
307 mind that this incompatibility is mostly due to the fact that separate
308 specials and literals are a more unnatural approach to colors than
309 attributes.
311 It is possible to fine-tune the list of attributes that are applied
312 to a \type{hbox}, \type{vbox} or \type{vtop} by the use of the
313 keyword \type{attr}. An example:
315 \starttyping
316 \attribute2=5
317 \setbox0=\hbox {Hello}
318 \setbox2=\hbox attr1=12 attr2=-"7FFFFFFF{Hello}
319 \stoptyping
321 This will set the attribute list of box~2 to $1=12$, and the
322 attributes of box~0 will be $2=5$. As you can see, assigning
323 the maximum negative value causes an attribute to be ignored.
325 The \type{attr} keyword(s) should come before a \type{to} or
326 \type{spread}, if that is also specified.
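
For instance, the following sketch combines \type{attr} with a \type{to}
specification; the box number, attribute value and width are arbitrary:

\starttyping
\setbox4=\hbox attr1=12 to 5cm{Hello}   % attr comes first, then to
\stoptyping
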
328 \section{\LUA\ related primitives}
330 In order to merge \LUA\ code with \TEX\ input, a few new primitives are
331 needed.
334 \subsection{\tex{directlua}}
336 The primitive \tex{directlua} is used to execute \LUA\ code immediately.
337 The syntax is
339 \startsyntax
340 \directlua <general text>!crlf
341 \directlua name <general text> <general text>!crlf
342 \directlua <16-bit number> <general text>
343 \stopsyntax
The last \syntax{<general text>} is expanded fully, and then fed
into the \LUA\ interpreter. After reading and expansion have been applied to the
\syntax{<general text>}, the resulting token list is converted to a
string as if it were displayed using \type{\the\toks}. On the \LUA\
349 side, each \type{\directlua} block is treated as a separate chunk. In
350 such a chunk you can use the \type {local} directive to keep your variables
351 from interfering with those used by the macro package.
The conversion to and from a token list means that you normally
cannot use \LUA\ line comments (starting with \type{--}) within the
355 argument. As there typically will be only one \quote{line} the first
356 line comment will run on until the end of the input. You will either need to
357 use \TEX-style line comments (starting with \%), or change the \TEX\
358 category codes locally. Another possibility is to say:
360 \starttyping
361 \begingroup
362 \endlinechar=10
363 \directlua ...
364 \endgroup
365 \stoptyping
367 Then \LUA\ line comments can be used, since \TEX\ does not replace
368 line endings with spaces.
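
The first option, \TEX-style comments, can be sketched as follows under the
normal category codes. The comment is removed by \TEX\ before \LUA\ ever sees
the code, so the two statements end up on one \quote{line} separated by a space:

\starttyping
\count10=20
\directlua{
  local n = tex.count[10] % a TeX comment: discarded before Lua sees the code
  tex.print(n + 5)
}
\stoptyping
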
370 The \syntax{name <general text>} specifies the name of the \LUA\ chunk,
371 mainly shown in the stack backtrace of error messages created by \LUA\
code. The \syntax{<general text>} is expanded fully, thus macros can
be used to generate the chunk name, for example
375 \starttyping
376 \directlua name{\jobname:\the\inputlineno} ...
377 \stoptyping
379 to include the name of the input file as well as the input line into
380 the chunk name.
382 Likewise, the \syntax{<16-bit number>} designates a name of a \LUA\
383 chunk, but in this case the name will be taken from the
384 \type{lua.name} array (see the documentation of the \type{lua} table
385 further in this manual). This syntax is new in version 0.36.0.
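
A small sketch of the numbered form; the slot number 1 and the name
\type{mychunk} are hypothetical choices:

\starttyping
\directlua { lua.name[1] = "mychunk" }   % register a chunk name in slot 1
\directlua 1 { tex.print("hello") }      % this chunk is identified by that name
\stoptyping
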
387 The chunk name should not start with a \type{@}, or it will be displayed
388 as a file name (this is a quirk in the current \LUA\ implementation).
390 The \tex{directlua} command is expandable. Since it passes {\LUA} code to the
391 {\LUA} interpreter its expansion from the {\TEX} viewpoint is usually empty.
392 However, there are some {\LUA} functions that produce material to be read
by {\TeX}, the so-called print functions. The simplest use of these is
\type{tex.print(<string> s)}. The characters of the string \type{s} will be placed
395 on the {\TeX} input buffer, that is, \quote{before \TeX's eyes} to be read by {\TeX}
396 immediately. For example:
398 \startbuffer
399 \count10=20
400 a\directlua{tex.print(tex.count[10]+5)}b
401 \stopbuffer
403 \typebuffer
405 expands to
407 \getbuffer
409 Here is another example:
411 \startbuffer
412 $\pi = \directlua{tex.print(math.pi)}$
413 \stopbuffer
415 \typebuffer
417 will result in
419 \getbuffer
421 Note that the expansion of \tex{directlua} is a sequence of characters, not
422 of tokens, contrary to all {\TeX} commands. So formally speaking its
423 expansion is null, but it places material on a pseudo-file to be
immediately read by {\TeX}, like \ETEX's \tex{scantokens}.
426 For a description of print functions look at \in{section~}[sec:luaprint].
428 Because the \syntax{<general text>} is a chunk, the normal \LUA\ error
429 handling is triggered if there is a problem in the included code. The
430 \LUA\ error messages should be clear enough, but the contextual
431 information is still pretty bad. Often, you will only see the line
432 number of the right brace at the end of the code.
434 While on the subject of errors: some of the things you can do inside
\LUA\ code can break \LUATEX\ quite badly. If you are not careful
436 while working with the node list interface, you may even end up with
437 assertion errors from within the \TEX\ portion of the executable.
439 The behavior documented in the above subsection is considered stable
440 in the sense that there will not be backward-incompatible changes any
441 more.
443 \subsection{\tex{luafunction}}
The \type {\directlua} command involves tokenization of its argument (after picking up
an optional name or number specification). The token list is then converted into a string and
447 given to \LUA\ to turn into a function that is called. The overhead is rather small but when
448 you use this primitive hundreds or thousands of times, it can become noticeable. For this
449 reason there is a variant call available: \type {\luafunction}. This command is used as
450 follows:
452 \starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
    t[2] = function() tex.print("?") end
}

\luafunction1
\luafunction2
461 \stoptyping
Of course the functions can also be defined in a separate file. There is no
limit on the number of functions apart from normal \LUA\ limitations. The
functions cannot take arguments, because passing them would involve parsing and
thereby cancel the gain. When called, a function does in fact get one
argument, namely its index, so in:
469 \starttyping
\directlua {
    local t = lua.get_functions_table()
    t[8] = function(slot) tex.print(slot) end
}
474 \stoptyping
476 the number \type {8} gets typeset.
480 \subsection{\tex{latelua}}
482 \tex{latelua} stores \LUA\ code in a whatsit that will be processed
483 at the time of shipping out. Its intended use is a cross between
484 \tex{pdfliteral} and \tex{write}.
485 Within the \LUA\ code you can print \PDF\
486 statements directly to the \PDF\ file via \type{pdf.print},
487 or you can write to other output streams via \type{texio.write}
or simply using \LUA's I/O routines.
490 \startsyntax
491 \latelua <general text>!crlf
492 \latelua name <general text> <general text>!crlf
493 \latelua <16-bit number> <general text>
494 \stopsyntax
Expansion of macros and the like in the final \type{<general text>} is delayed
until just before the whatsit is executed (like in \tex{write}). With
regard to the \PDF\ output stream, \tex{latelua} behaves like \tex{pdfliteral page}.
The \syntax{name <general text>} and \syntax{<16-bit number>} behave
in the same way as they do for \type{\directlua}.
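
As a sketch of the \tex{write}-like use, the line below reports each page to
the log and terminal at the moment it is shipped out. It assumes that
\type{\count0} holds the page number, as in plain \TEX:

\starttyping
\latelua{ texio.write_nl("shipping out page " .. tex.count[0]) }
\stoptyping

Because the code runs at shipout time, \type{tex.count[0]} is read when the
page is written, not when the \tex{latelua} command is encountered.
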
503 \subsection{\tex{luaescapestring}}
505 This primitive converts a \TEX\ token sequence so that it can be
506 safely used as the contents of a \LUA\ string: embedded backslashes,
507 double and single quotes, and newlines and carriage returns are
508 escaped. This is done by prepending an extra token consisting of a
509 backslash with category code~12, and for the line endings,
510 converting them to \type{n} and \type{r} respectively. The token
511 sequence is fully expanded.
513 \startsyntax
514 \luaescapestring <general text>
515 \stopsyntax
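
A minimal sketch of typical use; \type{\quoted} is a hypothetical macro whose
replacement text contains double quotes that would otherwise end the \LUA\
string early:

\starttyping
\def\quoted{a "quoted" word}
\directlua{ texio.write_nl("\luaescapestring{\quoted}") }
\stoptyping

After expansion, \LUA\ receives \type{texio.write_nl("a \"quoted\" word")}.
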
517 Most often, this command is not actually the best way to deal with the
differences between \TEX\ and \LUA. In very short bits of \LUA\
519 code it is often not needed, and for longer stretches of \LUA\ code it
520 is easier to keep the code in a separate file and load it using \LUA's
521 \type{dofile}:
523 \starttyping
524 \directlua { dofile('mysetups.lua')}
525 \stoptyping
528 \section{New \ETEX\ primitives}
530 \subsection{\tex{clearmarks}}
532 This primitive clears a mark class completely, resetting all three
533 connected mark texts to empty.
535 \startsyntax
536 \clearmarks <16-bit number>
537 \stopsyntax
539 \subsection{\tex{noligs} and \tex{nokerns}}
541 These primitives prohibit ligature and kerning insertion at the time
542 when the initial node list is built by \LUATEX's main control loop.
543 They are part of a temporary trick and will be removed in the near
544 future. For now, you need to enable these primitives when you want to
545 do node list processing of \quote{characters}, where \TEX's normal
546 processing would get in the way.
548 \startsyntax
549 \noligs <integer>!crlf
550 \nokerns <integer>
551 \stopsyntax
553 These primitives can now be implemented by overloading the ligature
554 building and kerning functions, i.e.\ by assigning dummy functions
555 to their associated callbacks.
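
Both approaches can be sketched as follows. The callback variant assumes the
\type{ligaturing} and \type{kerning} callbacks described in the callbacks
chapter, which are not expected to return a value:

\starttyping
% primitive variant
\noligs=1 \nokerns=1

% callback variant: dummy functions that leave the node list untouched
\directlua {
  callback.register("ligaturing", function() end)
  callback.register("kerning",    function() end)
}
\stoptyping
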
557 \subsection{\tex{formatname}}
559 \tex{formatname}'s syntax is identical to \tex{jobname}.
561 In \INITEX, the expansion is empty. Otherwise, the expansion is the
562 value that \tex{jobname} had during the \INITEX\ run that dumped the
563 currently loaded format.
565 \subsection{\tex{scantextokens}}
567 The syntax of \tex{scantextokens} is identical to \tex{scantokens}.
568 This primitive is a slightly adapted version of \ETEX's \tex{scantokens}. The
569 differences are:
571 \startitemize
572 \item The last (and usually only) line does not have a
573 \tex{endlinechar} appended
574 \item \tex{scantextokens} never raises an EOF error,
575 and it does not execute \tex{everyeof} tokens.
576 \item The \quote{\unknown\ while end of file \unknown} error tests are not executed, allowing
577 the expansion to end on a different grouping level or while a
578 conditional is still incomplete.
579 \stopitemize
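
The first difference can be seen in the following sketch, which assumes the
default \type{\endlinechar} and normal category codes:

\starttyping
a\scantokens{b}c      % the appended \endlinechar yields a space after the b
a\scantextokens{b}c   % no \endlinechar is appended, so no space appears
\stoptyping
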
\subsection {Verbose versions of single-character alignment commands (0.45)}
\LUATEX\ defines two new primitives that have the same function as
\type{#} and \type{&} in alignments:
586 \starttabulate[|l|l|l|l|]
587 \NC \bf primitive \NC \bf explanation \NC\NR
588 \NC \tex{alignmark} \NC Duplicates the functionality of \char`\#~%
589 inside alignment preambles\NC\NR
590 \NC \tex{aligntab} \NC Duplicates the functionality of \char`\&~%
591 inside alignments (and preambles)\NC\NR
592 \stoptabulate
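
A short sketch of an alignment written without \type{#} and \type{&}, using
plain \TEX's \type{\halign}; the cell contents are arbitrary:

\starttyping
\halign{\hfil\alignmark\aligntab\quad\alignmark\hfil\cr
  left\aligntab right\cr
  a\aligntab b\cr}
\stoptyping
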
595 \subsection{Catcode tables}
597 Catcode tables are a new feature that allows you to switch to a
598 predefined catcode regime in a single statement. You can have a
599 practically unlimited number of different tables.
601 The subsystem is backward compatible: if you never use the following
602 commands, your document will not notice any difference in behavior
603 compared to traditional \TEX.
The contents of each catcode table are independent of any other
catcode table, and the contents are stored in and retrieved from the
format file.
609 \subsubsection{\tex{catcodetable}}
611 \startsyntax
612 \catcodetable <15-bit number>
613 \stopsyntax
615 The primitive \tex{catcodetable} switches to a different catcode table.
616 Such a table has to be previously created using one of the two
617 primitives below, or it has to be zero. Table zero is initialized by
618 \INITEX.
620 \subsubsection{\tex{initcatcodetable}}
622 \startsyntax
623 \initcatcodetable <15-bit number>
624 \stopsyntax
626 The primitive \tex{initcatcodetable} creates a new table with catcodes
627 identical to those defined by \INITEX:
629 \starttabulate[|l|l|l|l|l|]
630 \NC~0\NC \tt\letterbackslash \NC \NC \tt escape \NC\NR
631 \NC~5\NC \tt\letterhat\letterhat M \NC return \NC \tt car{\_}ret \NC (this name may change) \NC\NR
632 \NC~9\NC \tt\letterhat\letterhat @ \NC null \NC \tt ignore \NC\NR
633 \NC10\NC \tt <space> \NC space \NC \tt spacer \NC\NR
634 \NC11\NC {\tt a} -- {\tt z} \NC \NC \tt letter \NC\NR
635 \NC11\NC {\tt A} -- {\tt Z} \NC \NC \tt letter \NC\NR
636 \NC12\NC everything else \NC \NC \tt other \NC\NR
637 \NC14\NC \tt\letterpercent \NC \NC \tt comment \NC\NR
638 \NC15\NC \tt\letterhat\letterhat ? \NC delete \NC \tt invalid{\_}char \NC\NR
639 \stoptabulate
641 The new catcode table is allocated globally: it will not go away after
642 the current group has ended. If the supplied number is identical to
643 the currently active table, an error is raised.
645 \subsubsection{\tex{savecatcodetable}}
647 \startsyntax
648 \savecatcodetable <15-bit number>
649 \stopsyntax
651 \tex{savecatcodetable} copies the current set of catcodes to a
652 new table with the requested number. The definitions in this new table
653 are all treated as if they were made in the outermost level.
655 The new table is allocated globally: it will not go away after the
656 current group has ended. If the supplied number is the currently
657 active table, an error is raised.
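
The three primitives can be combined as in the sketch below. The table numbers
10 and 11 are arbitrary, and table~0 is assumed to be the active table at the
start:

\starttyping
\initcatcodetable 10     % table 10: pristine INITEX catcodes
\begingroup
  \catcode`\@=11         % make @ a letter, locally
  \savecatcodetable 11   % snapshot the current regime as table 11
\endgroup
\begingroup
  \catcodetable 11       % switch regimes in a single statement
  % ... input that treats @ as a letter ...
\endgroup
\stoptyping
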
659 \subsection{\tex{suppressfontnotfounderror} (0.11)}
661 \startsyntax
662 \suppressfontnotfounderror = 1
663 \stopsyntax
665 If this new integer parameter is non|-|zero, then \LUATEX\ will not
666 complain about font metrics that are not found. Instead it will
667 silently skip the font assignment, making the requested csname for the
668 font \tex{ifx} equal to \tex{nullfont}, so that it can be tested
669 against that without bothering the user.
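
A sketch of such a test; \type{nosuchfont} stands for any font name that
cannot be found, and \type{\myfont} is a hypothetical csname:

\starttyping
\suppressfontnotfounderror=1
\font\myfont=nosuchfont
\ifx\myfont\nullfont
  \message{font not found, falling back}
\fi
\stoptyping
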
671 \subsection{\tex{suppresslongerror} (0.36)}
673 \startsyntax
674 \suppresslongerror = 1
675 \stopsyntax
677 If this new integer parameter is non|-|zero, then \LUATEX\ will not
678 complain about \type{\par} commands encountered in contexts where
679 that is normally prohibited (most prominently in the arguments
680 of non-long macros).
682 \subsection{\tex{suppressifcsnameerror} (0.36)}
684 \startsyntax
685 \suppressifcsnameerror = 1
686 \stopsyntax
688 If this new integer parameter is non|-|zero, then \LUATEX\ will not
689 complain about non-expandable commands appearing in the middle of a
690 \type{\ifcsname} expansion. Instead, it will keep getting expanded
691 tokens from the input until it encounters an \type{\endcsname}
692 command. Use with care! This command is experimental: if the input
expansion is unbalanced with respect to \type{\csname} \ldots \type{\endcsname}
694 pairs, the \LUATEX\ process may hang indefinitely.
697 \subsection{\tex{suppressoutererror} (0.36)}
699 \startsyntax
700 \suppressoutererror = 1
701 \stopsyntax
703 If this new integer parameter is non|-|zero, then \LUATEX\ will not
704 complain about \type{\outer} commands encountered in contexts where
705 that is normally prohibited.
707 The addition of this command coincides with a change in the
708 \LUATEX\ engine: ever since the snapshot of 20060915, \type{\outer}
was simply ignored. That behavior has now been reverted to be
\TEX82-compatible by default.
713 \subsection{\tex{outputbox} (0.37)}
715 \startsyntax
716 \outputbox = 65535
717 \stopsyntax
719 This new integer parameter allows you to alter the number of the box
720 that will be used to store the page sent to the output routine. Its default
721 value is 255, and the acceptable range is from 0 to 65535.
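
As a sketch, an output routine can refer to the parameter instead of a
hard-coded box number. The value 200 is arbitrary, and a real output routine
would do considerably more than this:

\starttyping
\outputbox=200
\output={\shipout\box\outputbox}   % the page arrives in box 200, not box 255
\stoptyping
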
724 \subsection{Font syntax}
726 \LUATEX\ will accept a braced argument as a font name:
728 \starttyping
729 \font\myfont = {cmr10}
730 \stoptyping
732 This allows for embedded spaces, without the need for double quotes.
733 Macro expansion takes place inside the argument.
735 \subsection{File syntax (0.45)}
737 \LUATEX\ will accept a braced argument as a file name:
739 \starttyping
740 \input {plain}
741 \openin 0 {plain}
742 \stoptyping
744 This allows for embedded spaces, without the need for double quotes.
745 Macro expansion takes place inside the argument.
747 \subsection{Images and Forms}
749 \LUATEX\ accepts optional dimension parameters for \type{\pdfrefximage}
750 and \type{\pdfrefxform} in the same format as for \type{\pdfximage}.
751 With images, these dimensions are then used
752 instead of the ones given to \type{\pdfximage};
753 but the original dimensions are not overwritten,
754 so that a \type{\pdfrefximage} without dimensions still provides
755 the image with dimensions defined by \type{\pdfximage}.
756 These optional parameters are not implemented for \type{\pdfxform}.
758 \starttyping
759 \pdfrefximage width 20mm height 10mm depth 5mm \pdflastximage
760 \pdfrefxform width 20mm height 10mm depth 5mm \pdflastxform
761 \stoptyping
763 \section{Debugging}
765 If \tex{tracingonline} is larger than~2, the node list display will
766 also print the node number of the nodes.
768 \section{Global leaders}
770 There is a new experimental primitive: \type{\gleaders} (a \LUATEX\
771 extension, added in 0.43). This type of leaders is anchored to the
772 origin of the box to be shipped out. So they are like normal
773 \type{\leaders} in that they align nicely, except that the alignment
774 is based on the {\it largest\/} enclosing box instead of the
775 {\it smallest\/}.
778 \section{Expandable character codes (0.75)}
780 The new expandable command \tex{Uchar} reads a number between~0 and
781 $1{,}114{,}111$ and expands to the associated Unicode character.
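
Because the command is expandable, it can be used inside \type{\edef}. In this
sketch \type{\eacute} is a hypothetical macro name:

\starttyping
\edef\eacute{\Uchar"E9}   % \eacute now expands to the character U+00E9
\stoptyping
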
784 \chapter {\LUA\ general}
786 \section[init]{Initialization}
788 \subsection{\LUATEX\ as a \LUA\ interpreter}
790 There are some situations that make \LUATEX\ behave like a standalone \LUA\
791 interpreter:
793 \startitemize[packed]
794 \item if a \type{--luaonly} option is given on the commandline, or
795 \item if the executable is named \type{texlua} (or \type{luatexlua}), or
796 \item if the only non|-|option argument (file) on the commandline has the extension
797 \type{lua} or \type{luc}.
798 \stopitemize
800 In this mode, it will set \LUA's \type{arg[0]} to the found script
801 name, pushing preceding options in negative values and the rest of the
802 commandline in the positive values, just like the \LUA\
803 interpreter.
805 \LUATEX\ will exit immediately after executing the specified \LUA\
806 script and is, in effect, a somewhat bulky standalone \LUA\
807 interpreter with a bunch of extra preloaded libraries.
809 \subsection{\LUATEX\ as a \LUA\ byte compiler}
811 There are two situations that make \LUATEX\ behave like the \LUA\
812 byte compiler:
814 \startitemize[packed]
815 \item if a \type{--luaconly} option is given on the commandline, or
816 \item if the executable is named \type{texluac}
817 \stopitemize
819 In this mode, \LUATEX\ is exactly like \type{luac} from the standalone
820 \LUA\ distribution, except that it does not have the \type{-l} switch,
821 and that it accepts (but ignores) the \type{--luaconly} switch.
823 \subsection{Other commandline processing}
825 When the \LUATEX\ executable starts, it looks for the \type{--lua}
826 commandline option. If there is no \type{--lua} option, the
827 commandline is interpreted in a similar fashion as in traditional
828 \PDFTEX\ and \ALEPH.
830 The following command-line switches are understood.
832 \starttabulate[|lT|p|]
833 \NC --fmt=FORMAT \NC load the format file FORMAT \NC\NR
834 \NC --lua=FILE \NC load and execute a \LUA\ initialization script\NC\NR
835 \NC --safer \NC disable easily exploitable \LUA\ commands \NC\NR
836 \NC --nosocket \NC disable the \LUA\ socket library \NC\NR
837 \NC --help \NC display help and exit \NC\NR
838 \NC --ini \NC be iniluatex, for dumping formats \NC\NR
839 \NC --interaction=STRING \NC set interaction mode (STRING=batchmode/nonstopmode/\crlf
840 scrollmode/errorstopmode) \NC \NR
841 \NC --halt-on-error \NC stop processing at the first error\NC \NR
842 \NC --kpathsea-debug=NUMBER \NC set path searching debugging flags according to
843 the bits of NUMBER \NC \NR
844 \NC --progname=STRING \NC set the program name to STRING \NC \NR
845 \NC --version \NC display version and exit \NC\NR
846 \NC --credits \NC display credits and exit \NC\NR
847 \NC --recorder \NC enable filename recorder \NC \NR
848 \NC --etex \NC ignored\NC \NR
849 \NC --output-comment=STRING \NC use STRING for DVI file comment instead of date
850 (no effect for PDF)\NC \NR
851 \NC --output-directory=DIR \NC use DIR as the directory to write files to \NC \NR
852 \NC --draftmode \NC switch on draft mode (generates no output PDF)\NC \NR
853 \NC --output-format=FORMAT \NC use FORMAT for job output; FORMAT is 'dvi' or 'pdf' \NC \NR
854 \NC --[no-]shell-escape \NC disable/enable \type{\write18{SHELL COMMAND}} \NC \NR
855 \NC --enable-write18 \NC enable \type{\write18{SHELL COMMAND}} \NC \NR
856 \NC --disable-write18 \NC disable \type{\write18{SHELL COMMAND}} \NC \NR
857 \NC --shell-restricted \NC restrict \type{\write18} to a list of commands
858 given in texmf.cnf \NC \NR
859 \NC --debug-format \NC enable format debugging \NC \NR
860 \NC --[no-]file-line-error \NC disable/enable file:line:error style messages \NC \NR
861 \NC --[no-]file-line-error-style \NC aliases of --[no-]file-line-error \NC \NR
862 \NC --jobname=STRING \NC set the job name to STRING \NC \NR
863 \NC --[no-]parse-first-line \NC disable/enable parsing of the first line of the
864 input file \NC \NR
865 \NC --translate-file= \NC ignored \NC \NR
866 \NC --default-translate-file= \NC ignored \NC \NR
867 \NC --8bit \NC ignored \NC \NR
868 \NC --[no-]mktex=FMT \NC disable/enable mktexFMT generation (FMT=tex/tfm)\NC \NR
869 \NC --synctex=NUMBER \NC enable synctex \NC \NR
870 \stoptabulate
872 A note on the creation of the various temporary files and the \type{\jobname}.
873 The value to use for \type{\jobname} is decided as follows:
875 \startitemize
876 \item If \type{--jobname} is given on the command line, its argument
877 will be the value for \tex{jobname}, without any changes. The
878 argument will not be used for actual input so it need not exist.
879 The \type{--jobname} switch only controls the \tex{jobname} setting.
880 \item Otherwise, \tex{jobname} will be the name of the first file that
881 is read from the file system, with any path components and the last
882 extension (the part following the last \type{.}) stripped off.
883 \item An exception to the previous point: if the command
884 line goes into interactive mode (by starting with a command) and
885 there are no files input via \type{\everyjob} either, then the
886 \tex{jobname} is set to \type{texput} as a last resort.
887 \stopitemize
889 The file names for output files that are generated automatically are
890 created by attaching the proper extension (\type{.log}, \type{.pdf},
891 etc.) to the found \tex{jobname}. These files are created in the
892 directory pointed to by \type{--output-directory}, or in the current
893 directory, if that switch is not present.
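
The second rule above (strip the path components and the last extension) can be
sketched in a few lines of \LUA. This is only an illustration of the documented
rule, not the code the engine runs; the helper name \type{guess_jobname} is
hypothetical.

\starttyping
-- Hypothetical helper: mimic the documented rule for deriving the
-- jobname from the first file name that is read.
local function guess_jobname(filename)
    -- drop any leading path components (both / and \ separators)
    local base = filename:gsub("^.*[/\\]", "")
    -- drop the last extension, i.e. the part following the last "."
    base = base:gsub("%.[^.]*$", "")
    return base
end

print(guess_jobname("/home/user/docs/thesis.final.tex"))  -- thesis.final
\stoptyping

Only the last extension is removed, so a name with several dots keeps all
but the final part.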
895 \blank
897 Without the \type{--lua} option, command line processing works like it does in
898 any other web2c-based typesetting engine, except that \LUATEX\ has a few extra
899 switches.
If the \type{--lua} option is present, \LUATEX\ enters an alternative mode
of commandline processing that differs from that of the standard web2c
programs.
906 In this mode, a small series of actions is taken in order. First,
907 it will parse the commandline as usual, but it will only interpret
908 a small subset of the options immediately: \type{--safer}, \type{--nosocket},
909 \type{--[no-]shell-escape}, \type{--enable-write18}, \type{--disable-write18},
910 \type{--shell-restricted}, \type{--help}, \type{--version}, and \type{--credits}.
912 Now it searches for the requested \LUA\ initialization script. If it
913 cannot be found using the actual name given on the commandline, a
914 second attempt is made by prepending the value of the environment
915 variable \type{LUATEXDIR}, if that variable is defined in the environment.
917 Then it checks the various safety switches. You can use those to disable
918 some \LUA\ commands that can easily be abused by a malicious document. At
919 the moment, \type{--safer} \type{nil}s the following functions:
921 \starttabulate[|l|l|]
922 \NC \bf library \NC \bf functions \NC \NR
923 \NC \tt os \NC \tt execute exec setenv rename remove tmpdir \NC \NR
924 \NC \tt io \NC \tt popen output tmpfile \NC \NR
925 \NC \tt lfs \NC \tt rmdir mkdir chdir lock touch \NC \NR
926 \stoptabulate
928 Furthermore, it disables loading of compiled \LUA\ libraries (support
929 for these was added in 0.46.0), and it makes \lua{io.open()} fail on
930 files that are opened for anything besides reading.
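
A script can check for itself which of these functions are missing. The
following is a minimal sketch that assumes nothing beyond the table above: it
tests each listed name and reports the ones that are \type{nil}.

\starttyping
-- Report which of the functions listed above are currently unavailable.
local guarded = {
    os  = { "execute", "exec", "setenv", "rename", "remove", "tmpdir" },
    io  = { "popen", "output", "tmpfile" },
    lfs = { "rmdir", "mkdir", "chdir", "lock", "touch" },
}
for libname, names in pairs(guarded) do
    local lib = _G[libname]
    for _, name in ipairs(names) do
        if type(lib) ~= "table" or lib[name] == nil then
            print(libname .. "." .. name .. " is not available")
        end
    end
end
\stoptyping

Code that has to run both with and without \type{--safer} can use the same
kind of test before calling one of these functions.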
932 \type{--nosocket} makes the socket library unavailable, so that
933 \LUA\ cannot use networking.
935 The switches \type{--[no-]shell-escape}, \type{--[enable|disable]-write18}, and
936 \type{--shell-restricted} have the same
937 effects as in \PDFTEX, and additionally make
938 \type{io.popen()}, \type{os.execute}, \type{os.exec} and \type{os.spawn}
939 adhere to the requested option.
941 Next the initialization script is loaded and executed. From within the
942 script, the entire commandline is available in the \LUA\ table
943 \lua{arg}, beginning with \lua {arg[0]}, containing the name of the executable.
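
As a small sketch, the initialization script can list the commandline it
received. It uses \type{texio.write_nl}, which is meant to work at this stage
according to the paragraph that follows.

\starttyping
-- List the command line as seen by the initialization script.
for i = 0, #arg do
    texio.write_nl("arg[" .. i .. "] = " .. tostring(arg[i]))
end
\stoptyping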
945 Commandline processing happens very early on. So early, in fact, that
946 none of \TEX's initializations have taken place yet. For that reason,
947 the tables that deal with typesetting, like \luatex{tex}, \luatex{token},
948 \luatex{node} and \luatex{pdf}, are off|-|limits during the execution
949 of the startup file (they are nilled). Special care is taken that \luatex{texio.write} and
950 \luatex{texio.write_nl} function properly, so that you can at least
report your actions to the log file when (and if) it is eventually
opened (note that \TEX\ does not even know its \tex{jobname}
953 yet at this point). See \in{chapter}[libraries] for more information
954 about the \LUATEX-specific \LUA\ extension tables.
957 Everything you do in the \LUA\ initialization script will remain
958 visible during the rest of the run, with the exception of the
959 aforementioned \luatex{tex}, \luatex{token}, \luatex{node} and
960 \luatex{pdf} tables: those will be initialized
961 to their documented state after the execution of the script. You
962 should not store anything in variables or within tables with these
963 four global names, as they will be overwritten completely.
965 We recommend you use the startup file only for your own
966 \TEX|-|independent initializations (if you need any), to parse the
967 commandline, set values in the \luatex{texconfig} table, and register
968 the callbacks you need.
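
A startup file that follows this recommendation could look like the sketch
below. The file name \type{startup.lua} is hypothetical, and the callback
shown is only a placeholder for whatever a real setup needs; it is described
in the callback section of this manual.

\starttyping
-- startup.lua (hypothetical name), used as: luatex --lua=startup.lua ...

-- 1. Parse the command line: remember a --jobname value, if any.
local jobname
for i = 1, #arg do
    local value = string.match(arg[i], "^%-%-?jobname=(.*)$")
    if value then
        jobname = value
    end
end
texio.write_nl("startup: jobname option = " .. tostring(jobname))

-- 2. Set values in the texconfig table.
texconfig.kpse_init = true   -- the default; false would skip kpathsea

-- 3. Register the callbacks that are needed.
callback.register("find_read_file", function(id_number, asked_name)
    return kpse.find_file(asked_name)
end)
\stoptyping

Nothing in this sketch touches the \type{tex}, \type{token}, \type{node} or
\type{pdf} tables, which are not usable while the startup file runs.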
970 \LUATEX\ allows some of the commandline options to be overridden
971 by reading values from the \luatex{texconfig} table at the end of
972 script execution (see the description of the \luatex{texconfig} table
973 later on in this document for more details on which ones exactly).
975 Unless the \luatex{texconfig} table tells \LUATEX\ not to initialize
976 \KPATHSEA\ at all (set \luatex{texconfig.kpse_init} to \type{false} for that),
977 \LUATEX\ acts on some more commandline options after the
978 initialization script is finished:
979 in order to initialize the built|-|in \KPATHSEA\ library properly,
980 \LUATEX\ needs to know the correct program name to use, and for that it
981 needs to check \type{--progname}, or \type{--ini} and \type{--fmt}, if
982 \type{--progname} is missing.
985 \section{\LUA\ changes}
987 {\bf NOTE:} \LUATEX\ 0.74.0 is the first version with Lua 5.2, and
988 this is used without any patches to the core, which has some side
989 effects. In particular, Lua's \type{tonumber()} may return values in
990 scientific notation, thereby confusing the \TEX\ end of things when it
991 is used as the right-hand side of an assignment to a \type{\dimen}
992 or \type{\count}.
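
One way to stay clear of this problem is to format numbers explicitly before
passing them to \TEX. The sketch below only builds the strings; inside a
document they would typically be handed over with \type{tex.sprint}. The
values are arbitrary examples.

\starttyping
-- Format a number explicitly before handing it to TeX, so that no
-- exponent notation can slip into an assignment.
local value = 2^40
local as_integer = string.format("%d", value)      -- for a \count
local as_points  = string.format("%.5fpt", 12.5)   -- for a \dimen
print(as_integer, as_points)
\stoptyping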
994 {\bf NOTE:} Also in \LUATEX\ 0.74.0 (this is a change in Lua 5.2),
995 loading dynamic Lua libraries will fail if there are two Lua libraries
996 loaded at the same time (which will typically happen on Win32, because
997 there is one Lua 5.2 inside luatex, and another will likely be linked
to the \type{dll} file of the module itself). We plan to fix that later
by switching \LUATEX\ itself to using the DLL version of Lua 5.2
instead of including a static version in the binary.
1002 Starting from version 0.45, \LUATEX\ is able to use the kpathsea
1003 library to find \type{require()}d modules. For this purpose,
1004 \type{package.searchers[2]} is replaced by a different loader function,
1005 that decides at runtime whether to use kpathsea or the built-in core
1006 lua function. It uses \KPATHSEA\ when that is already initialized at
1007 that point in time, otherwise it reverts to using the normal
1008 \type{package.path} loader.
1010 Initialization of \KPATHSEA\ can happen either implicitly (when
1011 \LUATEX\ starts up and the startup script has not set
1012 \type{texconfig.kpse_init} to false), or explicitly by calling the
1013 \LUA\ function \type{kpse.set_program_name()}.
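
The explicit route matters mostly for standalone scripts. A minimal sketch,
assuming a script run in script interpreter mode; the module name
\type{mymodule} is hypothetical and only serves to show the lookup.

\starttyping
-- Initialize kpathsea explicitly before relying on kpathsea-based
-- module lookup.
kpse.set_program_name("luatex")

-- "mymodule" is a hypothetical module name used for illustration only.
local ok, result = pcall(require, "mymodule")
if not ok then
    print("module not found: " .. tostring(result))
end
\stoptyping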
1015 Starting from version 0.46.0 \LUATEX\ is
1016 also able to use dynamically loadable \LUA\ libraries, unless
1017 \type{--safer} was given as an option on the command line.
1019 For this purpose, \type{package.searchers[3]} is replaced by a different
1020 loader function, that decides at runtime whether to use kpathsea or
the built-in core lua function. As in the previous paragraph, it uses
1022 \KPATHSEA\ when that is already initialized at that point in time,
1023 otherwise it reverts to using the normal \type{package.cpath} loader.
1025 This functionality required an extension to kpathsea:
1027 \startnarrower
1028 There is a new kpathsea file format: \type{kpse_clua_format} that
1029 searches for files with extension \type{.dll} and \type{.so}. The
1030 \type{texmf.cnf} setting for this variable is \type{CLUAINPUTS}, and
1031 by default it has this value:
1033 \starttyping
1034 CLUAINPUTS=.:$SELFAUTOLOC/lib/{$progname,$engine,}/lua//
1035 \stoptyping %$
1037 This path is imperfect (it requires a TDS subtree below the binaries
1038 directory), but the architecture has to be in the path somewhere, and
1039 the currently simplest way to do that is to search below the binaries
1040 directory only.
1042 One level up (a \type{lib} directory parallel to \type{bin}) would
1043 have been nicer, but that is not doable because \TEXLIVE\ uses a
1044 \type{bin/<arch>} structure.
1045 \stopnarrower
1047 In keeping with the other \TEX-like programs in \TEXLIVE, the two
1048 \LUA\ functions
1049 \type{os.execute} and \type{io.popen} (as well as the two new functions \type{os.exec}
1050 and \type{os.spawn} that are explained below) take the value of \type{shell_escape}
and/or \type{shell_escape_commands} into account. Whenever \LUATEX\ is run with the
1052 assumed intention to typeset a document (and by that I mean that it is called as
1053 \type{luatex}, as opposed to \type{texlua}, and that the commandline option
1054 \type{--luaonly} was not given), it will only run the four functions above if the
1055 matching texmf.cnf variable(s) or their \type{texconfig} (see~\in{section}[texconfig])
1056 counterparts allow execution of the requested system command. In \quote{script
interpreter} runs of \LUATEX, these settings have no effect, and all four functions
work as normal. This change is new in 0.37.0.
1062 The \lua{f:read("*line")} and \lua{f:lines()} functions from the io library have
1063 been adjusted so that they are line|-|ending neutral: any of \type{LF}, \type
1064 {CR} or \type{CR+LF} are acceptable line endings.
1066 \lua{luafilesystem} has been extended: there are two extra boolean functions
1067 (\luatex{lfs.isdir(filename)} and \luatex{lfs.isfile(filename)}) and
1068 one extra string field in its attributes table
1069 (\type{permissions}). There is an additional function (added in 0.51)
1070 \type{lfs.shortname()} which takes a file name and returns its short
1071 name on WIN32 platforms. On other platforms, it just returns the given
1072 argument. The file name is not tested for existence. Finally, for
1073 non-WIN32 platforms only, there is the new function
1074 \type{lfs.readlink()} (added in 0.51) that takes an existing symbolic
1075 link as argument and returns its content. It returns an error on
1076 WIN32.
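
The additions can be tried out with a short script. This is a sketch that
inspects the path given as first script argument (or the current directory
if there is none); \type{lfs.attributes} is the regular \type{luafilesystem}
call that returns the attributes table.

\starttyping
local lfs  = require("lfs")
local name = arg and arg[1] or "."   -- path to inspect

if lfs.isdir(name) then
    print(name .. " is a directory")
elseif lfs.isfile(name) then
    print(name .. " is a regular file")
end

local attr = lfs.attributes(name)
if attr then
    print("permissions: " .. tostring(attr.permissions))
end

-- returns the argument unchanged on non-Windows platforms
print(lfs.shortname(name))
\stoptyping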
1078 The \lua{string} library has an extra function:
1079 \luatex{string.explode(s[,m])}. This function returns an array containing
1080 the string argument \type{s} split into sub-strings based on the value
1081 of the string argument \type{m}. The second argument is a string that
1082 is either empty (this splits the string into characters), a single
1083 character (this splits on each occurrence of that character, possibly
1084 introducing empty strings), or a single character followed by the plus
1085 sign \type{+} (this special version does not create empty
1086 sub-strings). The default value for \type{m} is \quote{\type{ +}} (multiple
1087 spaces).
1089 Note: \type{m} is not hidden by surrounding braces (as it would be if
1090 this function was written in \TEX\ macros).
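
The different forms of the second argument can be compared with a sketch like
the following. The helper \type{show} is hypothetical; it prints the number of
pieces and the pieces themselves, separated by a vertical bar.

\starttyping
-- string.explode is a LuaTeX extension; it is absent from stock Lua.
local function show(list)
    print(#list, table.concat(list, "|"))
end

show(string.explode("a,b,,c", ","))    -- splits at every comma
show(string.explode("a,b,,c", ",+"))   -- runs of commas, no empty parts
show(string.explode("one   two"))      -- default separator " +"
show(string.explode("abc", ""))        -- one entry per character
\stoptyping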
1092 The \lua{string} library also has six extra iterators that return strings
1093 piecemeal:
1095 \startitemize
1096 \item \luatex{string.utfvalues(s)} (returns an integer value in the
1097 \UNICODE\ range)
1098 \item \luatex{string.utfcharacters(s)} (returns a string with a single
1099 \UTF-8 token in it)
1100 \item \luatex{string.characters(s)} (a string containing one byte)
\item \luatex{string.characterpairs(s)} (two strings each containing one byte); this
produces an empty second string if the string length was odd
\item \luatex{string.bytes(s)} (a single byte value)
\item \luatex{string.bytepairs(s)} (two byte values); this produces \type{nil} instead of a
number as its second return value if the string length was odd
1106 \stopitemize
The \luatex{string.characterpairs()} and \luatex{string.bytepairs()}
iterators are especially useful in the conversion of UTF-16 encoded data into UTF-8.
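
As an illustration of that use, here is a sketch of a converter for big-endian
UTF-16 data. It is deliberately incomplete: surrogate pairs are not handled,
and it assumes that \type{unicode.utf8.char} from the \type{slnunicode}
library is available to produce the UTF-8 bytes. The function name
\type{utf16be_to_utf8} is hypothetical.

\starttyping
-- Decode UTF-16BE data (basic multilingual plane only, no surrogates)
-- into UTF-8, using the LuaTeX-specific string.bytepairs iterator.
local function utf16be_to_utf8(data)
    local result = {}
    for hi, lo in string.bytepairs(data) do
        if lo == nil then break end     -- odd length: ignore last byte
        result[#result + 1] = unicode.utf8.char(hi * 256 + lo)
    end
    return table.concat(result)
end

print(utf16be_to_utf8("\0A\0B"))

-- The other iterators are used in the same way:
for codepoint in string.utfvalues("aä") do
    print(codepoint)
end
\stoptyping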
1112 Starting with \LUATEX\ 0.74, there is also a two-argument form of
1113 \type{string.dump()}. The second argument is a boolean which, if true,
1114 strips the symbols from the dumped data. This matches an extension
1115 made in \type{luajit}.
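
A minimal sketch of the two-argument form follows. The function \type{greet}
is just an example; the sizes that are printed depend on the platform and the
\LUA\ version.

\starttyping
local function greet(name)
    return "hello " .. name
end

local full     = string.dump(greet)        -- with debug symbols
local stripped = string.dump(greet, true)  -- second argument strips them
print(#full, #stripped)

-- the stripped chunk can still be loaded and called
local again = load(stripped, "greet", "b")
print(again("world"))
\stoptyping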
1117 Note: The \lua{string} library functions \luatex{len}, \luatex{lower},
1118 \luatex{sub} etc. are not \UNICODE|-|aware. For strings in the UTF-8
1119 encoding, i.e., strings containing characters above code point 127, the
1120 corresponding functions from the \lua{slnunicode} library can be used,
1121 e.g., \luatex{unicode.utf8.len}, \luatex{unicode.utf8.lower} etc. The
1122 exceptions are \luatex{unicode.utf8.find}, that always returns byte
1123 positions in a string, and \luatex{unicode.utf8.match} and
1124 \luatex{unicode.utf8.gmatch}. While the latter two functions in general
1125 {\it are} \UNICODE|-|aware, they fall-back to non|-|\UNICODE|-|aware
1126 behavior when using the empty capture \lua{()} (other captures work as
1127 expected). For the interpretation of character classes in
1128 \luatex{unicode.utf8} functions refer to the library sources at
1129 \hyphenatedurl{http://luaforge.net/projects/sln}. The \lua{slnunicode}
1130 library will be replaced by an internal \UNICODE\ library in a future
1131 \LUATEX\ version.
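
The difference is easy to see with a string that contains a character above
code point 127. A small sketch, assuming the source file itself is saved in
UTF-8:

\starttyping
local s = "Gödel"
print(string.len(s))            -- counts bytes
print(unicode.utf8.len(s))      -- counts UTF-8 characters
print(unicode.utf8.lower(s))    -- case mapping that knows about "Ö"
\stoptyping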
1132 \blank
1134 The \lua{os} library has a few extra functions and variables:
1136 \startitemize
1137 \item \luatex{os.selfdir} is a variable that holds the directory path
1138 of the actual executable. For example: {\tt \directlua{tex.sprint(os.selfdir)}}
1139 (present since 0.27.0).
1141 \item \luatex{os.exec(commandline)} is a variation on \lua{os.execute}.
1143 The \type{commandline} can be either a single string or a single table.
1145 If the argument is a table: \LUATEX\ first checks if there is a value at
1146 integer index zero. If there is, this is the command to be executed. Otherwise,
it will use the value at integer index one. If neither is present, nothing
at all happens.
1150 The set of consecutive values starting at integer 1 in the table are
1151 the arguments that are passed on to the command (the value at index 1
1152 becomes \type{arg[0]}). The command is searched for in the execution path,
1153 so there is normally no need to pass on a fully qualified pathname.
1155 If the argument is a string, then it is automatically converted into
1156 a table by splitting on whitespace. In this case, it is impossible
1157 for the command and first argument to differ from each other.
1159 In the string argument format, whitespace can be protected by putting (part
1160 of) an argument inside single or double quotes. One layer of quotes is
1161 interpreted by \LUATEX, and all occurrences of \tex{"}, \tex{'} or
1162 \type{\\} within the quoted text are un-escaped. In the table format, there
1163 is no string handling taking place.
1165 This function normally does not return control back to the \LUA\ script: the
1166 command will replace the current process. However, it will return the two values
1167 \type{nil} and \type {'error'} if there was a problem while attempting to execute the command.
1169 On Windows, the current process is actually kept in memory until after the
1170 execution of the command has finished. This prevents crashes in situations
1171 where \TEXLUA\ scripts are run inside integrated \TEX\ environments.
1173 The original reason for this command is that it cleans out the current
1174 process before starting the new one, making it especially useful for
1175 use in \TEXLUA.
1177 \item \luatex{os.spawn(commandline)} is a returning version of \lua{os.exec},
1178 with otherwise identical calling conventions.
1180 If the command ran ok, then the return value is the exit status of the
1181 command. Otherwise, it will return the two values \type{nil} and \type {'error'}.
1183 \item \luatex{os.setenv('key','value')}
1184 This sets a variable in the environment. Passing \lua{nil} instead of a
1185 value string will remove the variable.
1187 \item \luatex{os.env}
1188 This is a hash table containing a dump of the variables and values
1189 in the process environment at the start of the run. It is writeable,
1190 but the actual environment is {\em not\/} updated automatically.
1192 \item \luatex{os.gettimeofday()}
1193 Returns the current \quote {\UNIX\ time}, but as a float. This function is
1194 not available on the \SUNOS\ platforms, so do not use this function
1195 for portable documents.
1197 \item \luatex{os.times()}
Returns the current process times according to the \UNIX\ C library function
1199 \quote {times}. This function is not available on the \MSWINDOWS\
1200 and \SUNOS\ platforms, so do not use this function for portable
1201 documents.
1203 \item \luatex{os.tmpdir()} This will create a directory in the \quote {current
1204 directory} with the name \type{luatex.XXXXXX} where the \type {X}-es are
1205 replaced by a unique string. The function also returns this string,
1206 so you can \type{lfs.chdir()} into it, or \type{nil} if it failed to
create the directory. The user is responsible for cleaning up at
the end of the run; this does not happen automatically.
1210 \item \luatex{os.type}
1211 This is a string that gives a global indication of the class of operating
1212 system. The possible values are currently \type{windows}, \type{unix}, and
1213 \type{msdos} (you are unlikely to find this value \quote {in the wild}).
1215 \item \luatex{os.name}
1216 This is a string that gives a more precise indication of the operating
system. The possible values are not yet fixed; for \type{os.type} values
\type{windows} and \type{msdos}, the \type{os.name} values are simply
\type{windows} and \type{msdos}.
1221 The list for the type \type{unix} is more precise: \type{linux},
1222 \type{freebsd}, \type{kfreebsd} (since 0.51), \type{cygwin} (since
1223 0.53), \type{openbsd}, \type{solaris}, \type{sunos} (pre-solaris),
1224 \type{hpux}, \type{irix}, \type{macosx}, \type{gnu} (hurd), \type{bsd} (unknown, but \BSD|-|like),
1225 \type{sysv} (unknown, but \SYSV|-|like), \type{generic} (unknown).
1227 (\type{os.version} is planned as a future extension)
1229 \item \luatex{os.uname()}
1230 This function returns a table with specific operating system
1231 information acquired at runtime. The keys in the returned table are
1232 all string valued, and their names are: \type{sysname}, \type{machine},
1233 \type{release}, \type{version}, and \type{nodename}.
1236 \stopitemize
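
Several of these extensions are shown together in the sketch below. The
command \type{kpsewhich} and the variable name \type{MY_HYPOTHETICAL_VAR} are
placeholders chosen for the example. Keep in mind that \type{os.spawn} and
\type{os.setenv} are subject to the shell escape and \type{--safer} settings
described earlier.

\starttyping
-- Run an external command and wait for it; the table form avoids any
-- quoting issues. The command itself ("kpsewhich") is only an example.
local status, err = os.spawn({ "kpsewhich", "plain.tex" })
if status == nil then
    print("could not run command: " .. tostring(err))
else
    print("exit status: " .. status)
end

-- os.env is a snapshot; use os.setenv to change the real environment.
print(os.env["HOME"])
os.setenv("MY_HYPOTHETICAL_VAR", "1")

print(os.type, os.name)
for key, value in pairs(os.uname()) do
    print(key, value)
end
\stoptyping

With \type{os.exec} instead of \type{os.spawn}, the lines after the call would
normally not be reached, because the command replaces the current process.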
1238 In stock \LUA, many things depend on the current locale. In \LUATEX, we can't do
that, because it makes documents unportable. While \LUATEX\ is running, it
forces the following locale settings:
1242 \starttyping
1243 LC_CTYPE=C
1244 LC_COLLATE=C
1245 LC_NUMERIC=C
1246 \stoptyping
1248 \section {\LUA\ modules}
1250 {\bf NOTE}: Starting with \LUATEX\ 0.74, the implied use of the
1251 built-in Lua modules in this section is deprecated. If you want to use
1252 one of these libraries, please start your source file with a
1253 proper \type{require} line. In the near future, \LUATEX\ will switch
1254 to loading these modules on demand.
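
In practice this means a few extra lines at the top of a script. The sketch
below assumes the conventional module names \type{lpeg} and \type{lfs}; the
names to use for the other libraries should be checked against the running
binary.

\starttyping
-- Explicitly request the statically linked modules that are used,
-- instead of relying on the (deprecated) implicit globals.
local lpeg = require("lpeg")
local lfs  = require("lfs")
\stoptyping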
1256 Some modules that are normally external to \LUA\ are statically linked
1257 in with \LUATEX, because they offer useful functionality:
1259 \startitemize
1260 \item \lua{slnunicode}, from the \type {Selene} libraries, \hyphenatedurl{http://luaforge.net/projects/sln}. (version 1.1)
1262 This library has been slightly extended so that the \type{unicode.utf8.*}
1263 functions also accept the first 256 values of plane~18. This is the range \LUATEX\
1264 uses for raw binary output, as explained above.
1266 \item \lua{luazip}, from the kepler project, \hyphenatedurl{http://www.keplerproject.org/luazip/}.
1267 (version 1.2.1, but patched for compilation with \LUA\ 5.2)
1268 \item \lua{luafilesystem}, also from the kepler project, \hyphenatedurl{http://www.keplerproject.org/luafilesystem/}.
1269 (version 1.5.0)
1270 \item \lua{lpeg}, by Roberto Ierusalimschy, \hyphenatedurl{http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html}. (version 0.10.2)
1272 Note: \lua{lpeg} is not \UNICODE|-|aware, but interprets strings on a
1273 byte|-|per|-|byte basis. This mainly means that \luatex{lpeg.S} cannot be
used with characters above code point 127, since those characters are
encoded using two or more bytes, and thus \luatex{lpeg.S} will look for one
of those bytes when matching, not the combination of them.
The same is true for \luatex{lpeg.R}, although the latter will display
an error message if used with characters above code point 127: for example,
1280 \luatex{lpeg.R('aä')} results in the message \type{bad argument #1 to
1281 'R' (range must have two characters)}, since to \lua{lpeg}, \type{ä}
1282 is two 'characters' (bytes), so \type{aä} totals three.
1284 \item \lua{lzlib}, by Tiago Dionizio, \hyphenatedurl{http://luaforge.net/projects/lzlib/}. (version 0.2)
\item \lua{md5}, by Roberto Ierusalimschy, \hyphenatedurl{http://www.inf.puc-rio.br/~roberto/md5/md5-5/md5.html}.
1287 \item \lua{luasocket}, by Diego Nehab
1288 \hyphenatedurl{http://w3.impa.br/~diego/software/luasocket/}
1289 (version 2.0.2).
Note: the \type{.lua} support modules from \type{luasocket} are also
preloaded inside the executable, so there are no external file dependencies.
1293 \stopitemize
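
For \type{lpeg}, a common way around the byte|-|wise matching is to spell out
multi|-|byte characters as alternatives of \type{lpeg.P}. The following is a
sketch of that idea, assuming a UTF-8 encoded source file; the pattern names
are arbitrary.

\starttyping
local lpeg = require("lpeg")

-- lpeg.S("äöü") would match single bytes, so list the multi-byte
-- characters as alternatives of lpeg.P instead.
local umlaut = lpeg.P("ä") + lpeg.P("ö") + lpeg.P("ü")
local word   = (lpeg.R("az") + umlaut)^1

-- returns the byte position just after the matched word
print(lpeg.match(word, "höher"))
\stoptyping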
1296 \chapter[libraries]{\LUATEX\ \LUA\ Libraries}
1298 {\bf NOTE}: Starting with \LUATEX\ 0.74, the implied use of the
1299 built-in Lua modules \type{epdf}, \type{fontloader}, \type{mplib},
1300 and \type{pdfscanner} is deprecated. If you want to use these, please
1301 start your source file with a proper \type{require} line. In the near
1302 future, \LUATEX\ will switch to loading these modules on demand.
1305 The interfacing between \TEX\ and \LUA\ is facilitated by a set of
1306 library modules. The \LUA\ libraries in this chapter are all defined and
1307 initialized by the \LUATEX\ executable. Together, they allow \LUA\
1308 scripts to query and change a number of \TEX's internal variables, run
1309 various internal \TEX\ functions, and set up \LUATEX's hooks to execute
1310 \LUA\ code.
1312 The following sections are in alphabetical order.
1314 \section{The \luatex{callback} library}
1316 This library has functions that register, find and list callbacks.
1318 A quick note on what callbacks are (thanks, Paul!):
1320 Callbacks are entry points to \LUATEX's internal operations, which can be
1321 interspersed with additional \LUA\ code, and even replaced altogether.
1322 In the first case, \TEX\ is simply augmented with new operations
1323 (for instance, a manipulation of the nodes resulting from the paragraph
1324 builder); in the second case, its hard-coded behavior (for instance, the
1325 paragraph builder itself) is ignored and processing relies on user code only.
1327 More precisely, the code to be inserted at a given callback is a function
1328 (an anonymous function or the name of a function variable); % Is this line useful?
1329 it will receive the arguments associated with the callback, if any, and must
1330 frequently return some other arguments for \TEX\ to resume its operations.
1332 The first task is registering a callback:
1334 \startfunctioncall
1335 id, error = callback.register (<string> callback_name, <function> func)
1336 id, error = callback.register (<string> callback_name, nil)
1337 id, error = callback.register (<string> callback_name, false)
1338 \stopfunctioncall
1340 where the \syntax{callback_name} is a predefined callback name, see
1341 below. The function returns the internal \type{id} of the callback
1342 or \type{nil}, if the callback could not be registered. In the latter
1343 case, \type{error} contains an error message, otherwise it is
1344 \type{nil}.
1346 \LUATEX\ internalizes the callback function in such a way that
1347 it does not matter if you redefine a function accidentally.
1349 Callback assignments are always global. You can use the special value
1350 \type {nil} instead of a function for clearing the callback.
For some minor speed gain, you can assign the boolean \type{false} to
the non-file related callbacks; doing so will prevent \LUATEX\ from
executing whatever it would execute by default (when no callback
function is registered at all). Be warned: this may cause all sorts of
grief unless you know {\it exactly} what you are doing! This functionality
has been present since version 0.38.
1359 Currently, callbacks are not dumped into the format file.
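
A registration with error checking could be sketched as follows. The handler
name \type{my_find_read_file} is hypothetical, and the callback used here is
documented later in this chapter.

\starttyping
-- Hypothetical handler for one of the file discovery callbacks.
local function my_find_read_file(id_number, asked_name)
    return kpse.find_file(asked_name)
end

local id, err = callback.register("find_read_file", my_find_read_file)
if id == nil then
    texio.write_nl("callback not registered: " .. tostring(err))
end

-- Later on, clear the callback again to restore the default behavior.
callback.register("find_read_file", nil)
\stoptyping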
1361 \startfunctioncall
1362 <table> info = callback.list()
1363 \stopfunctioncall
The keys in the table are the known callback names; each value is a
boolean where \type{true} means that the callback is currently set
(active).
1369 \startfunctioncall
1370 <function> f = callback.find (callback_name)
1371 \stopfunctioncall
1373 If the callback is not set, \luatex{callback.find} returns \type{nil}.
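
Both functions are handy for debugging. A short sketch that reports the
callbacks that are currently set and checks one of them by name:

\starttyping
-- Print the names of all callbacks that are currently set.
for name, active in pairs(callback.list()) do
    if active then
        texio.write_nl("active callback: " .. name)
    end
end

-- Check a single callback by name.
if callback.find("find_read_file") == nil then
    texio.write_nl("find_read_file is not set")
end
\stoptyping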
1375 \subsection{File discovery callbacks}
1377 The behavior documented in this subsection is considered stable in the
1378 sense that there will not be backward-incompatible changes any more.
1380 \subsubsection{\luatex{find_read_file} and \luatex{find_write_file}}
1382 Your callback function should have the following conventions:
1384 \startfunctioncall
1385 <string> actual_name = function (<number> id_number, <string> asked_name)
1386 \stopfunctioncall
1388 Arguments:
1390 \startitemize
1392 \sym{id_number}
1394 This number is zero for the log or \tex{input} files. For \TEX's \tex{read} or
1395 \tex{write} the number is incremented by one, so \tex{read0} becomes~1.
1397 \sym{asked_name}
1399 This is the user|-|supplied filename, as found by \tex{input}, \tex{openin}
1400 or \tex{openout}.
1402 \stopitemize
1404 Return value:
1406 \startitemize
1408 \sym{actual_name}
1410 This is the filename used. For the very first file that is read in by
1411 \TEX, you have to make sure you return an \type{actual_name} that has
1412 an extension and that is suitable for use as \type{jobname}. If you
don't, you will have to manually fix the name of the log file and
output file after \LUATEX\ is finished, and the name of any format
file will become mangled. That is because these file names depend
1416 on the jobname.
1418 You have to return \type{nil} if the file cannot be found.
1420 \stopitemize
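
The conventions above can be sketched as a callback that looks in a private
directory first and then falls back to the normal search. The directory name
\type{local-inputs} is hypothetical, and the sketch assumes that
\type{kpse.find_file} returns \type{nil} when nothing is found.

\starttyping
local lfs = require("lfs")
local local_dir = "local-inputs"   -- hypothetical directory name

callback.register("find_read_file", function(id_number, asked_name)
    -- 1. prefer a file in the private directory
    local candidate = local_dir .. "/" .. asked_name
    if lfs.isfile(candidate) then
        return candidate
    end
    -- 2. otherwise fall back to the regular kpathsea search
    return kpse.find_file(asked_name)
end)
\stoptyping

Because the names that are returned keep their extension, they remain
suitable for deriving the \type{jobname} from the first file.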
1422 \subsubsection{\luatex{find_font_file}}
1424 Your callback function should have the following conventions:
1426 \startfunctioncall
1427 <string> actual_name = function (<string> asked_name)
1428 \stopfunctioncall
1430 The \type{asked_name} is an \OTF\ or \TFM\ font metrics file.
1432 Return \type{nil} if the file cannot be found.
1434 \subsubsection{\luatex{find_output_file}}
1436 Your callback function should have the following conventions:
1438 \startfunctioncall
1439 <string> actual_name = function (<string> asked_name)
1440 \stopfunctioncall
1442 The \type{asked_name} is the \PDF\ or \DVI\ file for writing.
1444 \subsubsection{\luatex{find_format_file}}
1446 Your callback function should have the following conventions:
1448 \startfunctioncall
1449 <string> actual_name = function (<string> asked_name)
1450 \stopfunctioncall
1452 The \type{asked_name} is a format file for reading (the format file
1453 for writing is always opened in the current directory).
1455 \subsubsection{\luatex{find_vf_file}}
1457 Like \luatex{find_font_file}, but for virtual fonts. This applies to
1458 both \ALEPH's \OVF\ files and traditional Knuthian \VF\ files.
1460 \subsubsection{\luatex{find_map_file}}
1462 Like \luatex{find_font_file}, but for map files.
1464 \subsubsection{\luatex{find_enc_file}}
1466 Like \luatex{find_font_file}, but for enc files.
1468 \subsubsection{\luatex{find_sfd_file}}
1470 Like \luatex{find_font_file}, but for subfont definition files.
1472 \subsubsection{\luatex{find_pk_file}}
1474 Like \luatex{find_font_file}, but for pk bitmap files. The argument
1475 \type{asked_name} is a bit special in this case. Its form is
1477 \starttyping
1478 <base res>dpi/<fontname>.<actual res>pk
1479 \stoptyping
1481 So you may be asked for \type{600dpi/manfnt.720pk}. It is up to you
1482 to find a \quote{reasonable} bitmap file to go with that specification.
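
Taking such a request apart is a matter of one pattern match. The sketch below
only does the parsing; the helper name \type{parse_pk_request} is
hypothetical, and how the bitmap is then located is left to the callback.

\starttyping
-- Split "<base res>dpi/<fontname>.<actual res>pk" into its parts.
local function parse_pk_request(asked_name)
    local base_res, fontname, actual_res =
        string.match(asked_name, "^(%d+)dpi/(.+)%.(%d+)pk$")
    if not fontname then
        return nil
    end
    return {
        base_res   = tonumber(base_res),
        fontname   = fontname,
        actual_res = tonumber(actual_res),
    }
end

local request = parse_pk_request("600dpi/manfnt.720pk")
print(request.base_res, request.fontname, request.actual_res)
\stoptyping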
1484 \subsubsection{\luatex{find_data_file}}
1486 Like \luatex{find_font_file}, but for embedded files (\tex{pdfobj file '...'}).
1488 \subsubsection{\luatex{find_opentype_file}}
1490 Like \luatex{find_font_file}, but for \OPENTYPE\ font files.
1492 \subsubsection{\luatex{find_truetype_file} and \luatex{find_type1_file}}
1494 Your callback function should have the following conventions:
1496 \startfunctioncall
1497 <string> actual_name = function (<string> asked_name)
1498 \stopfunctioncall
1500 The \type{asked_name} is a font file. This callback is called while
1501 \LUATEX\ is building its internal list of needed font files, so the
1502 actual timing may surprise you. Your return value is later fed back
1503 into the matching \luatex{read_file} callback.
1505 Strangely enough, \luatex{find_type1_file} is also used for \OPENTYPE\
1506 (\OTF) fonts.
1508 \subsubsection{\luatex{find_image_file}}
1510 Your callback function should have the following conventions:
1512 \startfunctioncall
1513 <string> actual_name = function (<string> asked_name)
1514 \stopfunctioncall
1516 The \type{asked_name} is an image file. Your return value is used to
open a file from the hard disk, so make sure you return something that
1518 is considered the name of a valid file by your operating system.
1520 \subsection[iocallback]{File reading callbacks}
1522 The behavior documented in this subsection is considered stable in the
1523 sense that there will not be backward-incompatible changes any more.
1525 \subsubsection{\luatex{open_read_file}}
1527 Your callback function should have the following conventions:
1529 \startfunctioncall
1530 <table> env = function (<string> file_name)
1531 \stopfunctioncall
1533 Argument:
1535 \startitemize
1537 \sym{file_name}
1539 The filename returned by a previous \luatex{find_read_file} or the return
1540 value of \luatex{kpse.find_file()} if there was no such callback defined.
1542 \stopitemize
1544 Return value:
1546 \startitemize
1548 \sym{env}
This is a table containing one required and one optional callback
function for this file. The required field is \luatex{reader}; the
associated function will be called once for each new line to be read.
The optional field is \luatex{close}; its function will be called once
when \LUATEX\ is done with the file.

\LUATEX\ never looks at the rest of the table, so you can use it to
store your private per|-|file data. Both callback functions will
receive the table as their only argument.
1560 \stopitemize
1562 \subsubsubsection{\luatex{reader}}
1564 \LUATEX\ will run this function whenever it needs a new input line
1565 from the file.
\startfunctioncall
function(<table> env)
    return <string> line
end
\stopfunctioncall
1573 Your function should return either a string or \type{nil}. The value \type{nil}
1574 signals that the end of file has occurred, and will make \TEX\ call
1575 the optional \luatex{close} function next.
1577 \subsubsubsection{\luatex{close}}
1579 \LUATEX\ will run this optional function when it decides to close the file.
\startfunctioncall
function(<table> env)
end
\stopfunctioncall
1586 Your function should not return any value.
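
The pieces above can be combined as in the following sketch, which assumes
that the file can be read with \LUA's standard \type{io} library. The fields
\type{handle} and \type{lines} are private illustration data. Returning
\type{nil} when the file cannot be opened is an assumption of this sketch, not
documented behavior. The guard around \type{callback.register} only lets the
snippet load outside \LUATEX.

\starttyping
-- Sketch of an open_read_file callback built on the standard io library.
local function open_read_file(file_name)
    local handle = io.open(file_name, "r")
    if not handle then
        return nil             -- assumption: signal failure with nil
    end
    return {
        handle = handle,       -- private per-file data, ignored by LuaTeX
        lines  = 0,            -- private counter, for illustration only
        reader = function(env)
            local line = env.handle:read("*l")   -- nil at end of file
            if line then
                env.lines = env.lines + 1
            end
            return line
        end,
        close = function(env)
            env.handle:close()
        end,
    }
end

if callback then               -- only available inside LuaTeX
    callback.register("open_read_file", open_read_file)
end
\stoptyping

Because both functions receive the table itself, the line counter needs no
global state and each open file keeps its own count.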
1588 \subsubsection{General file readers}
1590 There is a set of callbacks for the loading of binary data
1591 files. These all use the same interface:
\startfunctioncall
function(<string> name)
    return <boolean> success, <string> data, <number> data_size
end
\stopfunctioncall
1599 The \type{name} will normally be a full path name as it is returned by
1600 either one of the file discovery callbacks or the internal version of
1601 \luatex{kpse.find_file()}.
1603 \startitemize
1605 \sym{success}
1607 Return \type{false} when a fatal error occurred (e.\,g.\ when the file cannot be
1608 found, after all).
1610 \sym{data}
1612 The bytes comprising the file.
1614 \sym{data_size}
1616 The length of the \type{data}, in bytes.
1618 \stopitemize
1620 Return an empty string and zero if the file was found but there was a
1621 reading problem.
1623 The list of functions is as follows:
1625 \starttabulate[|l|p|]
1626 \NC \luatex{read_font_file} \NC ofm or tfm files \NC\NR
1627 \NC \luatex{read_vf_file} \NC virtual fonts \NC\NR
1628 \NC \luatex{read_map_file} \NC map files \NC\NR
1629 \NC \luatex{read_enc_file} \NC encoding files \NC\NR
1630 \NC \luatex{read_sfd_file} \NC subfont definition files \NC\NR
1631 \NC \luatex{read_pk_file} \NC pk bitmap files \NC\NR
1632 \NC \luatex{read_data_file} \NC embedded files (\tex{pdfobj file ...}) \NC\NR
1633 \NC \luatex{read_truetype_file} \NC \TRUETYPE\ font files \NC\NR
1634 \NC \luatex{read_type1_file} \NC \TYPEONE\ font files \NC\NR
1635 \NC \luatex{read_opentype_file} \NC \OPENTYPE\ font files \NC\NR
1636 \stoptabulate
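
A minimal sketch of a reader that follows this interface is given below. It
uses only \LUA's standard \type{io} library; the function name
\type{read_binary_file} is hypothetical, and returning \type{true} together
with the empty string in the \quote{reading problem} case is an assumption,
since the value of \type{success} for that case is not stated above.

\starttyping
-- Sketch of a general binary file reader.
local function read_binary_file(name)
    local handle = io.open(name, "rb")
    if not handle then
        return false           -- fatal: the file cannot be opened after all
    end
    local data = handle:read("*a")
    handle:close()
    if not data then
        return true, "", 0     -- assumption: found, but reading failed
    end
    return true, data, #data   -- data_size is the length in bytes
end

if callback then               -- only available inside LuaTeX
    callback.register("read_data_file", read_binary_file)
end
\stoptyping

The same function could be registered for any of the other callbacks in the
table, as they share one interface.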
1638 \subsection{Data processing callbacks}
1640 \subsubsection{\luatex{process_input_buffer}}
1643 This callback allows you to change the contents of the line input
1644 buffer just before \LUATEX\ actually starts looking at it.
\startfunctioncall
function(<string> buffer)
    return <string> adjusted_buffer
end
\stopfunctioncall

If you return \type{nil}, \LUATEX\ will behave as if your callback
had never been called. You can gain a small amount of processing time
from that.
1656 This callback does not replace any internal code.
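
As an illustration, the sketch below replaces tab characters in each input
line by single spaces and returns \type{nil} for lines that need no change.
The function name \type{untabify} is hypothetical.

\starttyping
-- Replace tabs by spaces; return nil when the line is left untouched.
local function untabify(buffer)
    if not buffer:find("\t", 1, true) then
        return nil
    end
    -- the parentheses drop gsub's second return value (the match count)
    return (buffer:gsub("\t", " "))
end

if callback then               -- only available inside LuaTeX
    callback.register("process_input_buffer", untabify)
end
\stoptyping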
1658 \subsubsection{\luatex{process_output_buffer} (0.43)}
1660 This callback allows you to change the contents of the line output
1661 buffer just before \LUATEX\ actually starts writing it to a file as the
1662 result of a \tex{write} command. It is only called for output to an
1663 actual file (that is, excluding the log, the terminal, and \tex{write18}
1664 calls).
\startfunctioncall
function(<string> buffer)
    return <string> adjusted_buffer
end
\stopfunctioncall

If you return \type{nil}, \LUATEX\ will behave as if your callback
had never been called. You can gain a small amount of processing time
from that.
1676 This callback does not replace any internal code.
1679 \subsubsection{\luatex{process_jobname} (0.71)}
1681 This callback allows you to change the jobname given by \type{\jobname}
1682 in \TEX\ and \type{tex.jobname} in Lua. It does not affect the internal
1683 job name or the name of the output or log files.
\startfunctioncall
function(<string> jobname)
    return <string> adjusted_jobname
end
\stopfunctioncall
1691 The only argument is the actual job name; you should not use
1692 \type{tex.jobname} inside this function or infinite recursion may occur.
1693 If you return \type{nil}, \LUATEX\ will pretend your callback never
1694 happened.
1696 This callback does not replace any internal code.
1699 \subsubsection{\luatex{token_filter}}
1701 This callback allows you to replace the way \LUATEX\ fetches
1702 lexical tokens.
\startfunctioncall
function()
    return <table> token
end
\stopfunctioncall
1710 The calling convention for this callback is a bit more complicated than
1711 for most other callbacks. The function should either return a \LUA\
1712 table representing a valid to|-|be|-|processed token or tokenlist, or
1713 something else like \type{nil} or an empty table.
If your \LUA\ function does not return a table representing a valid
token, it will be immediately called again, until it eventually does
return a useful token or tokenlist (or until you reset the callback
value to \type{nil}). See the description of \luatex{token} for some
handy functions to be used in conjunction with this callback.
1721 If your function returns a single usable token, then that token will
1722 be processed by \LUATEX\ immediately. If the function returns a token
1723 list (a table consisting of a list of consecutive token tables), then
1724 that list will be pushed to the input stack at a completely new token
1725 list level, with its token type set to \quote{inserted}. In either case,
1726 the returned token(s) will not be fed back into the callback function.
1728 Setting this callback to \type{false} has no effect (because otherwise
1729 nothing would happen, forever).
1731 \subsection{Node list processing callbacks}
1733 The description of nodes and node lists is in~\in{chapter}[nodes].
1735 \subsubsection{\luatex{buildpage_filter}}
1737 This callback is called whenever \LUATEX\ is ready to move stuff to
1738 the main vertical list. You can use this callback to do specialized
1739 manipulation of the page building stage like imposition or column
1740 balancing.
\startfunctioncall
function(<string> extrainfo)
end
\stopfunctioncall
1747 The string \type{extrainfo} gives some additional information about
1748 what \TEX's state is with respect to the \quote{current page}. The possible
1749 values are:
1751 \starttabulate[|lT|p|]
1752 \NC \ssbf value \NC \bf explanation \NC\NR
1753 \NC alignment \NC a (partial) alignment is being added \NC\NR
1754 \NC after_output \NC an output routine has just finished \NC\NR
1755 \NC box \NC a typeset box is being added \NC\NR
1756 %\NC pre_box \NC interline material is being added \NC\NR
1757 %\NC adjust \NC \tex{vadjust} material is being added \NC\NR
1758 \NC new_graf \NC the beginning of a new paragraph \NC\NR
1759 \NC vmode_par \NC \tex{par} was found in vertical mode \NC\NR
1760 \NC hmode_par \NC \tex{par} was found in horizontal mode \NC\NR
1761 \NC insert \NC an insert is added \NC\NR
1762 \NC penalty \NC a penalty (in vertical mode) \NC\NR
1763 \NC before_display \NC immediately before a display starts \NC\NR
1764 \NC after_display \NC a display is finished \NC\NR
1765 \NC end \NC \LUATEX\ is terminating (it's all over)\NC\NR
1766 \stoptabulate
1768 This callback does not replace any internal code.
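
To get a feeling for when the callback fires, one could simply count the
reported values, as in this sketch. The table \type{seen} and the function
name \type{count_buildpage} are hypothetical.

\starttyping
-- Count how often each extrainfo value is reported.
local seen = {}

local function count_buildpage(extrainfo)
    seen[extrainfo] = (seen[extrainfo] or 0) + 1
end

if callback then               -- only available inside LuaTeX
    callback.register("buildpage_filter", count_buildpage)
end
\stoptyping

The table can then be inspected or written to the log once the run is nearly
finished.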
1771 \subsubsection{\luatex{pre_linebreak_filter}}
1773 This callback is called just before \LUATEX\ starts converting a list
1774 of nodes into a stack of \tex{hbox}es, after the addition of
1775 \type{\parfillskip}.
\startfunctioncall
function(<node> head, <string> groupcode)
    return true | false | <node> newhead
end
\stopfunctioncall
The string called \type {groupcode} identifies the nodelist's context
within \TEX's processing. The range of possibilities is given in the
table below, but not all of those can actually appear in
\luatex {pre_linebreak_filter}; some are for the
\luatex {hpack_filter} and \luatex {vpack_filter} callbacks that
are explained later in this section.
1790 \starttabulate[|lT|p|]
1791 \NC \ssbf value \NC \bf explanation \NC\NR
1792 \NC <empty> \NC main vertical list \NC\NR
1793 \NC hbox \NC \tex{hbox} in horizontal mode \NC\NR
1794 \NC adjusted_hbox\NC \tex{hbox} in vertical mode \NC\NR
1795 \NC vbox \NC \tex{vbox} \NC\NR
1796 \NC vtop \NC \tex{vtop} \NC\NR
1797 \NC align \NC \tex{halign} or \tex{valign} \NC\NR
1798 \NC disc \NC discretionaries \NC\NR
1799 \NC insert \NC packaging an insert \NC\NR
1800 \NC vcenter \NC \tex{vcenter} \NC\NR
1801 \NC local_box \NC \tex{localleftbox} or \tex{localrightbox} \NC\NR
1802 \NC split_off \NC top of a \tex{vsplit} \NC\NR
1803 \NC split_keep \NC remainder of a \tex{vsplit} \NC\NR
1804 \NC align_set \NC alignment cell \NC\NR
1805 \NC fin_row \NC alignment row \NC\NR
1806 \stoptabulate
1808 As for all the callbacks that deal with nodes, the return value can be one of three things:
1810 \startitemize
\item boolean \type{true} signals successful processing
1812 \item \type{<node>} signals that the \quote{head} node should be replaced by the returned node
1813 \item boolean \type{false} signals that the \quote{head} node list should be ignored and
1814 flushed from memory
1815 \stopitemize
1818 This callback does not replace any internal code.
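
The following sketch shows the simplest of the three return conventions: the
list is inspected but left unchanged, so \type{true} is returned. It relies on
\type{node.id}, \type{node.traverse_id} and \type{texio.write_nl} from the
\LUATEX\ libraries; the function name \type{count_glyphs} is hypothetical.
Only the top level of the list is visited, so glyphs inside nested boxes are
not counted.

\starttyping
-- Report the number of top-level glyph nodes in a paragraph.
local function count_glyphs(head, groupcode)
    local glyph = node.id("glyph")
    local n = 0
    for _ in node.traverse_id(glyph, head) do
        n = n + 1
    end
    texio.write_nl("log",
        string.format("%d glyph(s) in '%s' list", n, groupcode))
    return true                -- the list was inspected, not changed
end

if callback then               -- only available inside LuaTeX
    callback.register("pre_linebreak_filter", count_glyphs)
end
\stoptyping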
1821 \subsubsection{\luatex{linebreak_filter}}
1823 This callback replaces \LUATEX's line breaking algorithm.
\startfunctioncall
function(<node> head, <boolean> is_display)
    return <node> newhead
end
\stopfunctioncall
The returned node is the head of the list that will be added to the
main vertical list. The boolean argument is true if this paragraph is
interrupted by a following math display.
1835 If you return something that is not a \type{<node>}, \LUATEX\ will
1836 apply the internal linebreak algorithm on the list that starts at
1837 \type{<head>}. Otherwise, the \type{<node>} you return is supposed
1838 to be the head of a list of nodes that are all allowed in vertical
1839 mode, and at least one of those has to represent a hbox. Failure to do
1840 so will result in a fatal error.
1842 Setting this callback to \type{false} is possible, but dangerous,
1843 because it is possible you will end up in an unfixable
1844 \quote{deadcycles loop}.
1846 \subsubsection{\luatex{post_linebreak_filter}}
1848 This callback is called just after \LUATEX\ has converted a list
1849 of nodes into a stack of \tex{hbox}es.
\startfunctioncall
function(<node> head, <string> groupcode)
    return true | false | <node> newhead
end
\stopfunctioncall
1857 This callback does not replace any internal code.
1859 \subsubsection{\luatex{hpack_filter}}
1861 This callback is called when \TEX\ is ready to start boxing some
1862 horizontal mode material. Math items and line boxes are ignored
1863 at the moment.
\startfunctioncall
function(<node> head, <string> groupcode, <number> size,
         <string> packtype [, <string> direction])
    return true | false | <node> newhead
end
\stopfunctioncall
1872 The \type{packtype} is either \type{additional} or \type{exactly}. If
1873 \type{additional}, then the \type{size} is a \tex{hbox spread ...}
1874 argument. If \type{exactly}, then the \type{size} is a \tex{hbox to ...}.
1875 In both cases, the number is in scaled points.
1877 The \type{direction} is either one of the three-letter direction specifier
1878 strings, or \type{nil} (added in 0.45).
1881 This callback does not replace any internal code.
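
As a sketch of how \type{packtype} and \type{size} relate, the helper below
turns the two arguments back into the corresponding \TEX\ box specification,
using the fact that 65536 scaled points make one point. The names
\type{describe_hpack} and \type{trace_hpack} are hypothetical.

\starttyping
-- Turn the size and packtype arguments into a readable description.
local function describe_hpack(size, packtype)
    local pt = size / 65536    -- scaled points to points
    if packtype == "exactly" then
        return string.format("hbox to %.2fpt", pt)
    else                       -- "additional"
        return string.format("hbox spread %.2fpt", pt)
    end
end

-- Log every packaging request and leave the list unchanged.
local function trace_hpack(head, groupcode, size, packtype, direction)
    texio.write_nl("log", describe_hpack(size, packtype))
    return true
end

if callback then               -- only available inside LuaTeX
    callback.register("hpack_filter", trace_hpack)
end
\stoptyping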
1883 \subsubsection{\luatex{vpack_filter}}
1885 This callback is called when \TEX\ is ready to start boxing some
1886 vertical mode material. Math displays are ignored at the moment.
1888 This function is very similar to the \luatex{hpack_filter}. Besides
1889 the fact that it is called at different moments, there is an extra
1890 variable that matches \TEX's \tex{maxdepth} setting.
\startfunctioncall
function(<node> head, <string> groupcode, <number> size, <string> packtype,
         <number> maxdepth [, <string> direction])
    return true | false | <node> newhead
end
\stopfunctioncall
1899 This callback does not replace any internal code.
1901 \subsubsection{\luatex{pre_output_filter}}
This callback is called when \TEX\ is ready to start boxing
box 255 for \tex{output}.

\startfunctioncall
function(<node> head, <string> groupcode, <number> size, <string> packtype,
         <number> maxdepth [, <string> direction])
    return true | false | <node> newhead
end
\stopfunctioncall
1913 This callback does not replace any internal code.
1915 \subsubsection{\luatex{hyphenate}}
\startfunctioncall
function(<node> head, <node> tail)
end
\stopfunctioncall
1922 No return values. This callback has to insert discretionary nodes in
1923 the node list it receives.
1925 Setting this callback to \type{false} will prevent the internal
1926 discretionary insertion pass.
1928 \subsubsection{\luatex{ligaturing}}
\startfunctioncall
function(<node> head, <node> tail)
end
\stopfunctioncall
1935 No return values. This callback has to apply ligaturing to the node
1936 list it receives.
1938 You don't have to worry about return values because the \type{head}
1939 node that is passed on to the callback is guaranteed not to be a
1940 glyph_node (if need be, a temporary node will be prepended), and
1941 therefore it cannot be affected by the mutations that take place.
1942 After the callback, the internal value of the \quote {tail of the list}
1943 will be recalculated.
1945 The \type{next} of \type{head} is guaranteed to be non-nil.
1947 The \type{next} of \type{tail} is guaranteed to be nil, and therefore the
1948 second callback argument can often be ignored. It is provided for
1949 orthogonality, and because it can sometimes be handy when special
1950 processing has to take place.
1952 Setting this callback to \type{false} will prevent the internal
1953 ligature creation pass.
1955 \subsubsection{\luatex{kerning}}
\startfunctioncall
function(<node> head, <node> tail)
end
\stopfunctioncall
1962 No return values. This callback has to apply kerning between the nodes
1963 in the node list it receives. See \type{ligaturing} for calling
1964 conventions.
1966 Setting this callback to \type{false} will prevent the internal
1967 kern insertion pass.
1969 \subsubsection{\luatex{mlist_to_hlist}}
1971 This callback replaces \LUATEX's math list to node list conversion algorithm.
\startfunctioncall
function(<node> head, <string> display_type, <boolean> need_penalties)
    return <node> newhead
end
\stopfunctioncall
The returned node is the head of the list that will be added to the vertical or
horizontal list. The string argument is either \quote{text} or \quote{display},
depending on the current math mode. The boolean argument is \type{true} if penalties
have to be inserted in this list, \type{false} otherwise.

Setting this callback to \type{false} is bad: it will almost
certainly result in an endless loop.
1987 \subsection{Information reporting callbacks}
1989 \subsubsection{\luatex{pre_dump} (0.61)}
\startfunctioncall
function()
end
\stopfunctioncall
1996 This function is called just before dumping to a format file starts.
1997 It does not replace any code and there are neither arguments nor return values.
1999 \subsubsection{\luatex{start_run}}
\startfunctioncall
function()
end
\stopfunctioncall
2006 This callback replaces the code that prints \LUATEX's banner. Note that for
2007 successful use, this callback has to be set in the lua initialization script,
2008 otherwise it will be seen only after the run has already started.
2010 \subsubsection{\luatex{stop_run}}
\startfunctioncall
function()
end
\stopfunctioncall
2017 This callback replaces the code that prints \LUATEX's statistics and \quote{output written
2018 to} messages.
2020 \subsubsection{\luatex{start_page_number}}
\startfunctioncall
function()
end
\stopfunctioncall
Replaces the code that prints the \type{[} and the page number at the
beginning of \tex{shipout}. This callback will also override the
2029 printing of box information that normally takes place when
2030 \tex{tracingoutput} is positive.
2032 \subsubsection{\luatex{stop_page_number}}
\startfunctioncall
function()
end
\stopfunctioncall
2039 Replaces the code that prints the \type{]} at the end of \tex{shipout}.
2041 \subsubsection{\luatex{show_error_hook}}
\startfunctioncall
function()
end
\stopfunctioncall
2048 This callback is run from inside the \TEX\ error function, and the idea
2049 is to allow you to do some extra reporting on top of what \TEX\ already
2050 does (none of the normal actions are removed). You may find some of
2051 the values in the \luatex{status} table useful.
2053 This callback does not replace any internal code.
2055 \iffalse % this has been retracted for the moment
2056 \startitemize
2058 \sym{message}
2060 is the formal error message \TEX\ has given to the user.
2061 (the line after the '!').
2063 \sym{indicator}
2065 is either a filename (when it is a string) or a location indicator (a
2066 number) that can mean lots of different things like a token list id
2067 or a \tex{read} number.
2069 \sym{lineno}
2071 is the current line number.
2072 \stopitemize
2074 This is an investigative item for 'testing the water' only.
2075 The final goal is the total replacement of \TEX's error handling
2076 routines, but that needs lots of adjustments in the web source because
2077 \TEX\ deals with errors in a somewhat haphazard fashion. This is why the
2078 exact definition of \type{indicator} is not given here.
\fi
2082 \subsubsection{\luatex{show_error_message}}
\startfunctioncall
function()
end
\stopfunctioncall
2089 This callback replaces the code that prints the error message. The usual
2090 interaction after the message is not affected.
2092 \subsubsection{\luatex{show_lua_error_hook}}
\startfunctioncall
function()
end
\stopfunctioncall
2099 This callback replaces the code that prints the extra lua error message.
2102 \subsubsection{\luatex{start_file}}
\startfunctioncall
function(category,filename)
end
\stopfunctioncall

This callback replaces the code that prints \LUATEX's message when a file is
opened, like \type {(filename} for regular files. The category is a number:
2112 \starttabulate[|||]
2113 \NC 1 \NC a normal data file, like a \TEX\ source \NC \NR
2114 \NC 2 \NC a font map coupling font names to resources \NC \NR
2115 \NC 3 \NC an image file (\type {png}, \type {pdf}, etc) \NC \NR
2116 \NC 4 \NC an embedded font subset \NC \NR
2117 \NC 5 \NC a fully embedded font \NC \NR
2118 \stoptabulate
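
A replacement could, for instance, translate the category number into a short
label before reporting the file. The sketch below mirrors the table above; the
names \type{category_names} and \type{report_start_file} as well as the
message format are hypothetical.

\starttyping
-- Labels for the category numbers listed in the table above.
local category_names = {
    [1] = "data file",
    [2] = "font map",
    [3] = "image file",
    [4] = "embedded font subset",
    [5] = "fully embedded font",
}

-- Build the message that is shown when a file is opened.
local function report_start_file(category, filename)
    local kind = category_names[category] or "unknown category"
    return string.format("open %s: %s", kind, filename)
end

if callback then               -- only available inside LuaTeX
    callback.register("start_file", function(category, filename)
        texio.write_nl(report_start_file(category, filename))
    end)
end
\stoptyping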
2120 \subsubsection{\luatex{stop_file}}
\startfunctioncall
function(category)
end
\stopfunctioncall

This callback replaces the code that prints \LUATEX's message when a file is
closed, like the \type{)} for regular files.
2130 \subsection{PDF-related callbacks}
2132 \subsubsection{\luatex{finish_pdffile}}
\startfunctioncall
function()
end
\stopfunctioncall
2139 This callback is called when all document pages are already written to the \PDF\
2140 file and \LUATEX\ is about to finalize the output document structure. Its intended
2141 use is final update of \PDF\ dictionaries such as \type{/Catalog} or
2142 \type{/Info}. The callback does not replace any code. There are neither
2143 arguments nor return values.
2146 \subsubsection{\luatex{finish_pdfpage}}
\startfunctioncall
function(shippingout)
end
\stopfunctioncall
2154 This callback is called after the pdf page stream has been assembled and before the
2155 page object gets finalized. This callback is available in \LUATEX\ 0.78.4 and later.
2158 \subsection{Font-related callbacks}
2160 \subsubsection{\luatex{define_font}}
\startfunctioncall
function(<string> name, <number> size, <number> id)
    return <table> font
end
\stopfunctioncall
2168 The string \type{name} is the filename part of the font
2169 specification, as given by the user.
2171 The number \type{size} is a bit special:
2173 \startitemize[packed]
2174 \item if it is positive, it specifies an \quote{at size} in scaled points.
2175 \item if it is negative, its absolute value represents a \quote{scaled}
2176 setting relative to the designsize of the font.
2177 \stopitemize
2179 The \type{id} is the internal number assigned to the font.
2181 The internal structure of the \type{font} table that is to be
2182 returned is explained in \in{chapter}[fonts]. That table is saved
2183 internally, so you can put extra fields in the table for your
2184 later \LUA\ code to use.
2186 Setting this callback to \type{false} is pointless as it will prevent
2187 font loading completely but will nevertheless generate errors.
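
The two cases for the \type{size} argument described above can be told apart
as in the following sketch. The helper name \type{describe_size} is
hypothetical, and building the actual font table is left out.

\starttyping
-- Interpret the size argument of define_font.
local function describe_size(size)
    if size > 0 then
        -- an "at size" in scaled points (65536 sp = 1 pt)
        return string.format("at %.2fpt", size / 65536)
    else
        -- the absolute value is a "scaled" setting
        return string.format("scaled %d", -size)
    end
end

print(describe_size(655360))
print(describe_size(-1200))
\stoptyping

A real \luatex{define_font} callback would use this distinction to compute the
final size before filling in the font table.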
2189 \section{The \luatex{epdf} library}
2191 The \type{epdf} library provides Lua bindings to many \PDF\ access functions
2192 that are defined by the poppler pdf viewer library (written in C$+{}+$
2193 by Kristian H\o gsberg, based on xpdf by Derek Noonburg).
Within \LUATEX\ (and \PDFTEX),
xpdf functionality has long been used to embed \PDF\ files.
The \type{epdf} library allows you to scrutinize an external \PDF\ file.
It gives access to its document structure,
e.\,g., catalog, cross-reference table, individual pages, objects,
annotations, info, and metadata. The \LUATEX\ team is evaluating
the possibility of reducing the binding to basic low|-|level \PDF\
primitives and delegating the complete set of functions
to an external shared object module.
2205 The \type{epdf} library is still in alpha state:
2206 \PDF\ access is currently read|-|only
2207 (it's not yet possible to alter a \PDF\ file or to assemble it from scratch),
2208 and many function bindings are still missing.
To start,
open a \PDF\ file by calling \type{epdf.open()} with its file name, e.\,g.:
2213 \starttyping
2214 doc = epdf.open("foo.pdf")
2215 \stoptyping
This normally returns a \type{PDFDoc} userdata variable.
If the file cannot be opened,
the function returns \type{nil} instead of raising a fatal error.
2221 All Lua functions in the \type{epdf} library are named after the
2222 poppler functions listed in the poppler header files for the various classes,
2223 e.\,g., files \type{PDFDoc.h}, \type{Dict.h}, and \type{Array.h}.
2224 These files can be found in the poppler subdirectory within the \LUATEX\ sources.
2225 Which functions are already implemented in the \type{epdf} library
2226 can be found in the \LUATEX\ source file \type{lepdflib.cc}.
2227 For using the \type{epdf} library,
2228 knowledge of the \PDF\ file architecture is indispensable.
There are many different userdata types defined
by the \type{epdf} library. Currently these are
2232 \type{AnnotBorderStyle},
2233 \type{AnnotBorder},
2234 \type{Annots},
2235 \type{Annot},
2236 \type{Array},
2237 \type{Attribute},
2238 \type{Catalog},
2239 \type{Dict},
2240 \type{EmbFile},
2241 \type{GString},
2242 \type{LinkDest},
2243 \type{Links},
2244 \type{Link},
2245 \type{ObjectStream},
2246 \type{Object},
2247 \type{PDFDoc},
2248 \type{PDFRectangle},
2249 \type{Page},
2250 \type{Ref},
2251 \type{Stream},
2252 \type{StructElement},
\type{StructTreeRoot},
\type{TextSpan},
\type{XRefEntry},
and
\type{XRef}.
All these userdata names and the Lua access functions closely resemble
the class naming from the poppler header files,
including the choice of mixed upper and lower case letters.
2263 The Lua function calls use object-oriented syntax, e.\,g.,
2264 the following calls return the \type{Page} object for page~1:
2266 \starttyping
2267 pageref = doc:getCatalog():getPageRef(1)
2268 pageobj = doc:getXRef():fetch(pageref.num, pageref.gen)
2269 \stoptyping
But writing such chained calls is risky,
as an intermediate function may return \type{nil} on error;
therefore Lua type checks (e.\,g., against \type{nil})
should be done between function calls.
2275 If a non-object item is requested
2276 (e.\,g., a \type{Dict} item by calling \type{page:getPieceInfo()},
2277 cf.~\type{Page.h}) but not available,
2278 the Lua functions return \type{nil} (without error).
If a function should return an \type{Object} that does not exist,
2280 a \type{Null} object is returned instead
2281 (also without error; this is in|-|line with poppler behavior).
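
A minimal sketch of such defensive checks is shown below. It uses only
functions listed in this section; the helper name \type{page_count} and the
error strings are hypothetical.

\starttyping
-- Return the number of pages of a PDF file, or nil plus a message.
local function page_count(filename)
    local doc = epdf.open(filename)
    if doc == nil then
        return nil, "cannot open " .. filename
    end
    local catalog = doc:getCatalog()
    if catalog == nil then
        return nil, "no catalog in " .. filename
    end
    return catalog:getNumPages()
end
\stoptyping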
2283 All library objects have a \type{__gc} metamethod for garbage collection.
2284 The \type{__tostring} metamethod gives the type name for each object.
2286 All object constructors:
2288 \startfunctioncall
2289 <PDFDoc> = epdf.open(<string> PDF filename)
2290 <Annot> = epdf.Annot(<XRef>, <Dict>, <Catalog>, <Ref>)
2291 <Annots> = epdf.Annots(<XRef>, <Catalog>, <Object>)
2292 <Array> = epdf.Array(<XRef>)
2293 <Attribute> = epdf.Attribute(<Type>,<Object>)| epdf.Attribute(<string>, <int>, <Object>)
2294 <Dict> = epdf.Dict(<XRef>)
2295 <Object> = epdf.Object()
2296 <PDFRectangle> = epdf.PDFRectangle()
2297 \stopfunctioncall
2299 The functions \type{StructElement_Type},
2300 \type{Attribute_Type} and
2301 \type{AttributeOwner_Type} return a hash table \type{{<string>,<integer>}}.
2305 \type{Annot} methods:
2307 \startfunctioncall
2308 <boolean> = <Annot>:isOK()
2309 <Object> = <Annot>:getAppearance()
2310 <AnnotBorder> = <Annot>:getBorder()
2311 <boolean> = <Annot>:match(<Ref>)
2312 \stopfunctioncall
2314 \type{AnnotBorderStyle} methods:
2316 \startfunctioncall
2317 <number> = <AnnotBorderStyle>:getWidth()
2318 \stopfunctioncall
2320 \type{Annots} methods:
2322 \startfunctioncall
2323 <integer> = <Annots>:getNumAnnots()
2324 <Annot> = <Annots>:getAnnot(<integer>)
2325 \stopfunctioncall
2327 \type{Array} methods:
2329 \startfunctioncall
2330 <Array>:incRef()
2331 <Array>:decRef()
2332 <integer> = <Array>:getLength()
2333 <Array>:add(<Object>)
2334 <Object> = <Array>:get(<integer>)
2335 <Object> = <Array>:getNF(<integer>)
2336 <string> = <Array>:getString(<integer>)
2337 \stopfunctioncall
2340 \type{Attribute} methods:
2342 \startfunctioncall
2343 <boolean> = <Attribute>:isOk()
2344 <integer> = <Attribute>:getType()
2345 <integer> = <Attribute>:getOwner()
2346 <string> = <Attribute>:getTypeName()
2347 <string> = <Attribute>:getOwnerName()
2348 <Object> = <Attribute>:getValue()
<Object> = <Attribute>:getDefaultValue()
2350 <string> = <Attribute>:getName()
2351 <integer> = <Attribute>:getRevision()
2352 <Attribute>:setRevision(<unsigned integer>)
<boolean> = <Attribute>:isHidden()
2354 <Attribute>:setHidden(<boolean>)
2355 <string> = <Attribute>:getFormattedValue()
2356 <string> = <Attribute>:setFormattedValue(<string>)
2357 \stopfunctioncall
2361 \type{Catalog} methods:
2363 \startfunctioncall
2364 <boolean> = <Catalog>:isOK()
2365 <integer> = <Catalog>:getNumPages()
2366 <Page> = <Catalog>:getPage(<integer>)
2367 <Ref> = <Catalog>:getPageRef(<integer>)
2368 <string> = <Catalog>:getBaseURI()
2369 <string> = <Catalog>:readMetadata()
2370 <Object> = <Catalog>:getStructTreeRoot()
2371 <integer> = <Catalog>:findPage(<integer> object number, <integer> object generation)
2372 <LinkDest> = <Catalog>:findDest(<string> name)
2373 <Object> = <Catalog>:getDests()
2374 <integer> = <Catalog>:numEmbeddedFiles()
2375 <EmbFile> = <Catalog>:embeddedFile(<integer>)
2376 <integer> = <Catalog>:numJS()
2377 <string> = <Catalog>:getJS(<integer>)
2378 <Object> = <Catalog>:getOutline()
2379 <Object> = <Catalog>:getAcroForm()
2380 \stopfunctioncall
2382 \type{EmbFile} methods:
2384 \startfunctioncall
2385 <string> = <EmbFile>:name()
2386 <string> = <EmbFile>:description()
2387 <integer> = <EmbFile>:size()
2388 <string> = <EmbFile>:modDate()
2389 <string> = <EmbFile>:createDate()
2390 <string> = <EmbFile>:checksum()
2391 <string> = <EmbFile>:mimeType()
2392 <Object> = <EmbFile>:streamObject()
2393 <boolean> = <EmbFile>:isOk()
2394 \stopfunctioncall
2396 \type{Dict} methods:
2398 \startfunctioncall
2399 <Dict>:incRef()
2400 <Dict>:decRef()
2401 <integer> = <Dict>:getLength()
2402 <Dict>:add(<string>, <Object>)
2403 <Dict>:set(<string>, <Object>)
2404 <Dict>:remove(<string>)
2405 <boolean> = <Dict>:is(<string>)
2406 <Object> = <Dict>:lookup(<string>)
2407 <Object> = <Dict>:lookupNF(<string>)
2408 <integer> = <Dict>:lookupInt(<string>, <string>)
2409 <string> = <Dict>:getKey(<integer>)
2410 <Object> = <Dict>:getVal(<integer>)
2411 <Object> = <Dict>:getValNF(<integer>)
2412 <boolean> = <Dict>:hasKey(<string>)
2413 \stopfunctioncall
2415 \type{Link} methods:
2417 \startfunctioncall
2418 <boolean> = <Link>:isOK()
2419 <boolean> = <Link>:inRect(<number>, <number>)
2420 \stopfunctioncall
2422 \type{LinkDest} methods:
2424 \startfunctioncall
2425 <boolean> = <LinkDest>:isOK()
2426 <integer> = <LinkDest>:getKind()
2427 <string> = <LinkDest>:getKindName()
2428 <boolean> = <LinkDest>:isPageRef()
2429 <integer> = <LinkDest>:getPageNum()
2430 <Ref> = <LinkDest>:getPageRef()
2431 <number> = <LinkDest>:getLeft()
2432 <number> = <LinkDest>:getBottom()
2433 <number> = <LinkDest>:getRight()
2434 <number> = <LinkDest>:getTop()
2435 <number> = <LinkDest>:getZoom()
2436 <boolean> = <LinkDest>:getChangeLeft()
2437 <boolean> = <LinkDest>:getChangeTop()
2438 <boolean> = <LinkDest>:getChangeZoom()
2439 \stopfunctioncall
2441 \type{Links} methods:
2443 \startfunctioncall
2444 <integer> = <Links>:getNumLinks()
2445 <Link> = <Links>:getLink(<integer>)
2446 \stopfunctioncall
2448 \type{Object} methods:
2450 \startfunctioncall
2451 <Object>:initBool(<boolean>)
2452 <Object>:initInt(<integer>)
2453 <Object>:initReal(<number>)
2454 <Object>:initString(<string>)
2455 <Object>:initName(<string>)
2456 <Object>:initNull()
2457 <Object>:initArray(<XRef>)
2458 <Object>:initDict(<XRef>)
2459 <Object>:initStream(<Stream>)
2460 <Object>:initRef(<integer> object number, <integer> object generation)
2461 <Object>:initCmd(<string>)
2462 <Object>:initError()
2463 <Object>:initEOF()
2464 <Object> = <Object>:fetch(<XRef>)
2465 <integer> = <Object>:getType()
2466 <string> = <Object>:getTypeName()
2467 <boolean> = <Object>:isBool()
2468 <boolean> = <Object>:isInt()
2469 <boolean> = <Object>:isReal()
2470 <boolean> = <Object>:isNum()
2471 <boolean> = <Object>:isString()
2472 <boolean> = <Object>:isName()
2473 <boolean> = <Object>:isNull()
2474 <boolean> = <Object>:isArray()
2475 <boolean> = <Object>:isDict()
2476 <boolean> = <Object>:isStream()
2477 <boolean> = <Object>:isRef()
2478 <boolean> = <Object>:isCmd()
2479 <boolean> = <Object>:isError()
2480 <boolean> = <Object>:isEOF()
2481 <boolean> = <Object>:isNone()
2482 <boolean> = <Object>:getBool()
2483 <integer> = <Object>:getInt()
2484 <number> = <Object>:getReal()
2485 <number> = <Object>:getNum()
2486 <string> = <Object>:getString()
2487 <string> = <Object>:getName()
2488 <Array> = <Object>:getArray()
2489 <Dict> = <Object>:getDict()
2490 <Stream> = <Object>:getStream()
2491 <Ref> = <Object>:getRef()
2492 <integer> = <Object>:getRefNum()
2493 <integer> = <Object>:getRefGen()
2494 <string> = <Object>:getCmd()
2495 <integer> = <Object>:arrayGetLength()
<Object>:arrayAdd(<Object>)
<Object> = <Object>:arrayGet(<integer>)
<Object> = <Object>:arrayGetNF(<integer>)
<integer> = <Object>:dictGetLength(<integer>)
<Object>:dictAdd(<string>, <Object>)
<Object>:dictSet(<string>, <Object>)
<Object> = <Object>:dictLookup(<string>)
<Object> = <Object>:dictLookupNF(<string>)
<string> = <Object>:dictGetKey(<integer>)
<Object> = <Object>:dictGetVal(<integer>)
<Object> = <Object>:dictGetValNF(<integer>)
2507 <boolean> = <Object>:streamIs(<string>)
<Object>:streamReset()
<integer> = <Object>:streamGetChar()
<integer> = <Object>:streamLookChar()
<integer> = <Object>:streamGetPos()
<Object>:streamSetPos(<integer>)
2513 <Dict> = <Object>:streamGetDict()
2514 \stopfunctioncall
2516 \type{Page} methods:
2518 \startfunctioncall
2519 <boolean> = <Page>:isOk()
2520 <integer> = <Page>:getNum()
2521 <PDFRectangle> = <Page>:getMediaBox()
2522 <PDFRectangle> = <Page>:getCropBox()
2523 <boolean> = <Page>:isCropped()
2524 <number> = <Page>:getMediaWidth()
2525 <number> = <Page>:getMediaHeight()
2526 <number> = <Page>:getCropWidth()
2527 <number> = <Page>:getCropHeight()
2528 <PDFRectangle> = <Page>:getBleedBox()
2529 <PDFRectangle> = <Page>:getTrimBox()
2530 <PDFRectangle> = <Page>:getArtBox()
2531 <integer> = <Page>:getRotate()
2532 <string> = <Page>:getLastModified()
2533 <Dict> = <Page>:getBoxColorInfo()
2534 <Dict> = <Page>:getGroup()
2535 <Stream> = <Page>:getMetadata()
2536 <Dict> = <Page>:getPieceInfo()
2537 <Dict> = <Page>:getSeparationInfo()
2538 <Dict> = <Page>:getResourceDict()
2539 <Object> = <Page>:getAnnots()
2540 <Links> = <Page>:getLinks(<Catalog>)
2541 <Object> = <Page>:getContents()
2542 \stopfunctioncall
2544 \type{PDFDoc} methods:
2546 \startfunctioncall
2547 <boolean> = <PDFDoc>:isOk()
2548 <integer> = <PDFDoc>:getErrorCode()
2549 <string> = <PDFDoc>:getErrorCodeName()
2550 <string> = <PDFDoc>:getFileName()
2551 <XRef> = <PDFDoc>:getXRef()
2552 <Catalog> = <PDFDoc>:getCatalog()
2553 <number> = <PDFDoc>:getPageMediaWidth()
2554 <number> = <PDFDoc>:getPageMediaHeight()
2555 <number> = <PDFDoc>:getPageCropWidth()
2556 <number> = <PDFDoc>:getPageCropHeight()
2557 <integer> = <PDFDoc>:getNumPages()
2558 <string> = <PDFDoc>:readMetadata()
2559 <Object> = <PDFDoc>:getStructTreeRoot()
2560 <integer> = <PDFDoc>:findPage(<integer> object number, <integer> object generation)
2561 <Links> = <PDFDoc>:getLinks(<integer>)
2562 <LinkDest> = <PDFDoc>:findDest(<string>)
2563 <boolean> = <PDFDoc>:isEncrypted()
2564 <boolean> = <PDFDoc>:okToPrint()
2565 <boolean> = <PDFDoc>:okToChange()
2566 <boolean> = <PDFDoc>:okToCopy()
2567 <boolean> = <PDFDoc>:okToAddNotes()
2568 <boolean> = <PDFDoc>:isLinearized()
2569 <Object> = <PDFDoc>:getDocInfo()
2570 <Object> = <PDFDoc>:getDocInfoNF()
2571 <integer> = <PDFDoc>:getPDFMajorVersion()
2572 <integer> = <PDFDoc>:getPDFMinorVersion()
2573 \stopfunctioncall
2575 \type{PDFRectangle} methods:
2577 \startfunctioncall
2578 <boolean> = <PDFRectangle>:isValid()
2579 \stopfunctioncall
2581 %\type{Ref} methods:
2583 %\startfunctioncall
2584 %\stopfunctioncall
2586 \type{Stream} methods:
2588 \startfunctioncall
2589 <integer> = <Stream>:getKind()
2590 <string> = <Stream>:getKindName()
<Stream>:reset()
<Stream>:close()
<integer> = <Stream>:getChar()
<integer> = <Stream>:lookChar()
<integer> = <Stream>:getRawChar()
<integer> = <Stream>:getUnfilteredChar()
<Stream>:unfilteredReset()
2598 <integer> = <Stream>:getPos()
2599 <boolean> = <Stream>:isBinary()
2600 <Stream> = <Stream>:getUndecodedStream()
2601 <Dict> = <Stream>:getDict()
2602 \stopfunctioncall
2604 \type{StructElement} methods:
2606 \startfunctioncall
2607 <string> = <StructElement>:getTypeName()
2608 <integer> = <StructElement>:getType()
2609 <boolean> = <StructElement>:isOk()
2610 <boolean> = <StructElement>:isBlock()
2611 <boolean> = <StructElement>:isInline()
2612 <boolean> = <StructElement>:isGrouping()
2613 <boolean> = <StructElement>:isContent()
2614 <boolean> = <StructElement>:isObjectRef()
2615 <integer> = <StructElement>:getMCID()
2616 <Ref> = <StructElement>:getObjectRef()
2617 <Ref> = <StructElement>:getParentRef()
2618 <boolean> = <StructElement>:hasPageRef()
2619 <Ref> = <StructElement>:getPageRef()
2620 <StructTreeRoot> = <StructElement>:getStructTreeRoot()
2621 <string> = <StructElement>:getID()
2622 <string> = <StructElement>:getLanguage()
2623 <integer> = <StructElement>:getRevision()
2624 <StructElement>:setRevision(<unsigned integer>)
2625 <string> = <StructElement>:getTitle()
2626 <string> = <StructElement>:getExpandedAbbr()
2627 <integer> = <StructElement>:getNumChildren()
2628 <StructElement> = <StructElement>:getChild()
<StructElement>:appendChild(<StructElement>)
<integer> = <StructElement>:getNumAttributes()
<Attribute> = <StructElement>:getAttribute(<integer>)
<string> = <StructElement>:appendAttribute(<Attribute>)
<Attribute> = <StructElement>:findAttribute(<Attribute::Type>, <boolean>, <Attribute::Owner>)
2634 <string> = <StructElement>:getAltText()
2635 <string> = <StructElement>:getActualText()
2636 <string> = <StructElement>:getText(<boolean>)
2637 <table> = <StructElement>:getTextSpans()
2638 \stopfunctioncall
2641 \type{StructTreeRoot} methods:
2643 \startfunctioncall
2644 <StructElement> = <StructTreeRoot>:findParentElement
2645 <PDFDoc> = <StructTreeRoot>:getDoc
2646 <Dict> = <StructTreeRoot>:getRoleMap
2647 <Dict> = <StructTreeRoot>:getClassMap
2648 <integer> = <StructTreeRoot>:getNumChildren
2649 <StructElement> = <StructTreeRoot>:getChild
2650 <StructTreeRoot>:appendChild
2652 \stopfunctioncall
\type{TextSpan} has only one method:
\startfunctioncall
<string> = <TextSpan>:getText()
2659 \stopfunctioncall
2664 \type{XRef} methods:
2666 \startfunctioncall
2667 <boolean> = <XRef>:isOk()
2668 <integer> = <XRef>:getErrorCode()
2669 <boolean> = <XRef>:isEncrypted()
2670 <boolean> = <XRef>:okToPrint()
2671 <boolean> = <XRef>:okToPrintHighRes()
2672 <boolean> = <XRef>:okToChange()
2673 <boolean> = <XRef>:okToCopy()
2674 <boolean> = <XRef>:okToAddNotes()
2675 <boolean> = <XRef>:okToFillForm()
2676 <boolean> = <XRef>:okToAccessibility()
2677 <boolean> = <XRef>:okToAssemble()
2678 <Object> = <XRef>:getCatalog()
2679 <Object> = <XRef>:fetch(<integer> object number, <integer> object generation)
2680 <Object> = <XRef>:getDocInfo()
2681 <Object> = <XRef>:getDocInfoNF()
2682 <integer> = <XRef>:getNumObjects()
2683 <integer> = <XRef>:getRootNum()
2684 <integer> = <XRef>:getRootGen()
2685 <integer> = <XRef>:getSize()
2686 <Object> = <XRef>:getTrailerDict()
2687 \stopfunctioncall
2689 %***********************************************************************
2691 \section{The \luatex{font} library}
The font library provides the interface to the internals of the font
system, and it also contains helper functions to load traditional
\TEX\ font metrics formats. Other font loading functionality is
provided by the \luatex{fontloader} library, which is discussed in
the next section.
2699 \subsection{Loading a \TFM\ file}
2701 The behavior documented in this subsection is considered stable in the
2702 sense that there will not be backward-incompatible changes any more.
2704 \startfunctioncall
2705 <table> fnt = font.read_tfm(<string> name, <number> s)
2706 \stopfunctioncall
2708 The number is a bit special:
2710 \startitemize
2711 \item if it is positive, it specifies an \quote{at size} in scaled points.
2712 \item if it is negative, its absolute value represents a \quote{scaled}
2713 setting relative to the designsize of the font.
2714 \stopitemize
2716 The internal structure of the metrics font table that is returned is
2717 explained in \in{chapter}[fonts].
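
As a hedged illustration of the two cases for the size argument, the
following sketch resolves \type{s} into an at size in scaled points. The
helper name \type{resolve_size} is hypothetical, and the sketch assumes that
a negative value follows the usual \TEX\ \quote{scaled} convention, in which
1000 stands for the design size itself.

\starttyping
-- Sketch only: resolve the size argument s into an at size (in sp).
-- Assumption: negative s is a "scaled" value where 1000 = design size.
local function resolve_size(s, designsize)
  if s > 0 then
    return s                                   -- already an at size
  elseif s < 0 then
    return math.floor(designsize * -s / 1000 + 0.5)
  end
  return nil                                   -- s == 0 is not covered here
end

local ten_pt = 10 * 65536                      -- 10pt in scaled points
print(resolve_size(ten_pt, ten_pt))            -- at size of 10pt
print(resolve_size(-1200, ten_pt))             -- design size scaled by 1.2

-- Inside LuaTeX the same number would be passed on unchanged, e.g.:
if font and font.read_tfm then
  local fnt = font.read_tfm("cmr10", -1200)
end
\stoptyping

The helper is only meant to make the sign convention explicit; the library
functions themselves accept the raw number.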
2719 \subsection{Loading a \VF\ file}
2721 The behavior documented in this subsection is considered stable in the
2722 sense that there will not be backward-incompatible changes any more.
2724 \startfunctioncall
2725 <table> vf_fnt = font.read_vf(<string> name, <number> s)
2726 \stopfunctioncall
2728 The meaning of the number \type{s} and the format of the returned
2729 table are similar to the ones in the \luatex{read_tfm()} function.
2731 \subsection{The fonts array}
2733 The whole table of \TEX\ fonts is accessible from \LUA\ using a virtual array.
2735 \starttyping
2736 font.fonts[n] = { ... }
2737 <table> f = font.fonts[n]
2738 \stoptyping
2740 See \in{chapter}[fonts] for the structure of the tables. Because this
2741 is a virtual array, you cannot call \type{pairs} on it, but see below
2742 for the \type{font.each} iterator.
2744 The two metatable functions implementing the virtual array are:
2746 \startfunctioncall
2747 <table> f = font.getfont(<number> n)
2748 font.setfont(<number> n, <table> f)
2749 \stopfunctioncall
Note that at the moment, each access to \type{font.fonts} or call
to \type{font.getfont} creates a \LUA\ table for the whole font. This
process can be quite slow. In a later version of \LUATEX, this
interface will change (it will start using userdata objects instead of
actual tables).
Also note the following: assignments can only be made to fonts that
have already been defined in \TEX, but have not been accessed {\it at
all\/} since that definition. This limits the usability of write
access to \type{font.fonts} quite a lot; a less stringent ruleset will
likely be implemented later.
2763 \subsection{Checking a font's status}
2765 You can test for the status of a font by calling this function:
2767 \startfunctioncall
2768 <boolean> f = font.frozen(<number> n)
2769 \stopfunctioncall
2771 The return value is one of \type{true} (unassignable), \type{false} (can be changed)
2772 or \type{nil} (not a valid font at all).
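
Because the result has three possible states, a plain truth test is not
enough to tell an assignable font from an invalid one. A minimal sketch (the
helper name \type{font_status} is hypothetical):

\starttyping
-- Sketch only: map the three-state result of font.frozen to a label.
local function font_status(state)
  if state == nil then
    return "invalid"        -- not a valid font at all
  elseif state then
    return "frozen"         -- unassignable
  else
    return "assignable"     -- can still be changed
  end
end

-- Inside LuaTeX:
if font and font.frozen then
  print(font_status(font.frozen(font.current())))
end
\stoptyping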
2774 \subsection{Defining a font directly}
2776 You can define your own font into \luatex{font.fonts} by calling this function:
2778 \startfunctioncall
2779 <number> i = font.define(<table> f)
2780 \stopfunctioncall
2782 The return value is the internal id number of the defined font (the
2783 index into \luatex{font.fonts}). If the font creation fails, an error is
2784 raised. The table is a font structure, as explained in
2785 \in{chapter}[fonts].
2787 \subsection{Projected next font id}
2789 \startfunctioncall
2790 <number> i = font.nextid()
2791 \stopfunctioncall
2793 This returns the font id number that would be returned by a
2794 \type{font.define} call if it was executed at this spot in the code
2795 flow. This is useful for virtual fonts that need to reference
2796 themselves.
2798 \subsection{Font id (0.47)}
2800 \startfunctioncall
2801 <number> i = font.id(<string> csname)
2802 \stopfunctioncall
This returns the font id associated with the \type{csname} string, or $-1$
if \type{csname} is not defined; new in 0.47.
2807 \subsection{Currently active font}
2809 \startfunctioncall
2810 <number> i = font.current()
2811 font.current(<number> i)
2812 \stopfunctioncall
2814 This gets or sets the currently used font number.
2816 \subsection{Maximum font id}
2818 \startfunctioncall
2819 <number> i = font.max()
2820 \stopfunctioncall
2822 This is the largest used index in \type{font.fonts}.
2824 \subsection{Iterating over all fonts}
2826 \startfunctioncall
for i,v in font.each() do
  -- use the font id i and the font table v here
end
2830 \stopfunctioncall
2832 This is an iterator over each of the defined \TEX\ fonts. The first
2833 returned value is the index in \type{font.fonts}, the second the font
2834 itself, as a \LUA\ table. The indices are listed incrementally, but they
2835 do not always form an array of consecutive numbers: in some cases
2836 there can be holes in the sequence.
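
Since the sequence of ids may contain holes, collecting the ids through the
iterator is likely safer than counting up to \type{font.max()}. The sketch
below is an illustration under that assumption; \type{list_font_ids} is a
hypothetical helper that takes the iterator factory as an argument.

\starttyping
-- Sketch only: collect the ids of all defined fonts.
-- 'each' is expected to behave like font.each.
local function list_font_ids(each)
  local ids = {}
  for id in each() do
    ids[#ids + 1] = id      -- ids arrive in increasing order, maybe with gaps
  end
  return ids
end

-- Inside LuaTeX:
if font and font.each then
  for _, id in ipairs(list_font_ids(font.each)) do
    print(id)
  end
end
\stoptyping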
2838 \section{The \luatex{fontloader} library (0.36)}
2840 \subsection{Getting quick information on a font}
2842 \startfunctioncall
2843 <table> info = fontloader.info(<string> filename)
2844 \stopfunctioncall
2846 This function returns either \type{nil}, or a \type{table}, or an
2847 array of small tables (in the case of a TrueType collection). The
2848 returned table(s) will contain some fairly interesting information
2849 items from the font(s) defined by the file:
2851 \starttabulate[|lT|l|p|]
2852 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
2853 \NC fontname \NC string \NC the \POSTSCRIPT\ name of the font\NC\NR
2854 \NC fullname \NC string \NC the formal name of the font\NC\NR
2855 \NC familyname \NC string \NC the family name this font belongs to\NC\NR
\NC weight \NC string \NC a string indicating the weight (typographic color) of the font\NC\NR
2857 \NC version \NC string \NC the internal font version\NC\NR
2858 \NC italicangle \NC float \NC the slant angle\NC\NR
2859 \NC units_per_em \NC number \NC (since 0.78.2) 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC\NR
2860 \NC pfminfo \NC table \NC (since 0.78.2) (see \in{section}[fontloaderpfminfotable])\NC\NR
2861 \stoptabulate
Getting information through this function is (sometimes much) more
efficient than loading the font properly, and is therefore handy when
you want to create a dictionary of available fonts based on the
contents of a directory.
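
A dictionary of that kind could be built roughly as follows. This is a
sketch, not a definitive implementation: the helper names
\type{info_records} and \type{index_fonts} are hypothetical, and the code
assumes that a single font is recognised by a top-level \type{fontname} key
while a collection comes back as an array of such tables.

\starttyping
-- Sketch only: normalise the three possible results of fontloader.info.
local function info_records(info)
  if info == nil then
    return {}               -- file could not be read
  elseif info.fontname then
    return { info }         -- a single font
  end
  return info               -- a collection: array of small tables
end

-- Map each PostScript font name to the file that provides it.
-- 'get_info' is expected to behave like fontloader.info.
local function index_fonts(filenames, get_info)
  local index = {}
  for _, filename in ipairs(filenames) do
    for _, rec in ipairs(info_records(get_info(filename))) do
      if rec.fontname then
        index[rec.fontname] = filename
      end
    end
  end
  return index
end

-- Inside LuaTeX:
if fontloader then
  local files = { '/opt/tex/texmf/fonts/data/arial.ttf' }
  for name, file in pairs(index_fonts(files, fontloader.info)) do
    print(name, file)
  end
end
\stoptyping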
2868 \subsection{Loading an \OPENTYPE\ or \TRUETYPE\ file}
2870 If you want to use an \OPENTYPE\ font, you have to get the metric
2871 information from somewhere. Using the \type{fontloader} library, the
2872 simplest way to get that information is thus:
2874 \starttyping
function load_font (filename)
  local metrics = nil
  local font = fontloader.open(filename)
  if font then
    metrics = fontloader.to_table(font)
    fontloader.close(font)
  end
  return metrics
end

myfont = load_font('/opt/tex/texmf/fonts/data/arial.ttf')
2886 \stoptyping
2888 The main function call is
2890 \startfunctioncall
2891 <userdata> f, <table> w = fontloader.open(<string> filename)
2892 <userdata> f, <table> w = fontloader.open(<string> filename, <string> fontname)
2893 \stopfunctioncall
The first return value is a userdata representation of the font. The
second return value is a table containing any warnings and errors
reported by fontloader while opening the font. In normal typesetting,
you would probably ignore the second return value, but it can be useful
for debugging purposes.
For \TRUETYPE\ collections (when the filename ends in \type{ttc}) and \DFONT\
collections, you have to use a second string argument to specify which
font you want from the collection. Use the \type{fontname}
strings that are returned by \type{fontloader.info} for that.
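
The combination of \type{fontloader.info} and the two-argument form of
\type{fontloader.open} might look like the sketch below. The helper name
\type{open_member} and its \type{loader} parameter are hypothetical; the
parameter only exists so that the function can be tried outside \LUATEX.

\starttyping
-- Sketch only: open the n-th font of a collection.
-- 'loader' is expected to behave like the fontloader library.
local function open_member(loader, filename, n)
  local info = loader.info(filename)
  if not info or info.fontname or not info[n] then
    return nil              -- unreadable, not a collection, or no such member
  end
  local fontname = info[n].fontname
  local f, warnings = loader.open(filename, fontname)
  return f, warnings, fontname
end

-- Inside LuaTeX ('fonts.ttc' is a placeholder file name):
if fontloader then
  local f, warnings, fontname = open_member(fontloader, 'fonts.ttc', 1)
  if f then
    print(fontname)
    fontloader.close(f)
  end
end
\stoptyping

Returning the \type{fontname} as well keeps it available for later use by
the caller.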
2906 To turn the font into a table, \type{fontloader.to_table} is used on
2907 the font returned by \type{fontloader.open}.
2909 \startfunctioncall
2910 <table> f = fontloader.to_table(<userdata> font)
2911 \stopfunctioncall
2913 This table cannot be used directly by \LUATEX\ and should be turned
2914 into another one as described in~\in{chapter}[fonts].
2915 Do not forget to store the \type{fontname} value in the \type{psname}
2916 field of the metrics table to be returned to \LUATEX, otherwise the
2917 font inclusion backend will not be able to find the correct font in
2918 the collection.
2920 See \in{section}[fontloadertables] for details on the userdata object
2921 returned by \type{fontloader.open()} and the layout of the
2922 \type{metrics} table returned by \type{fontloader.to_table()}.
2924 The font file is parsed and partially interpreted by the font
2925 loading routines from \FONTFORGE. The file format can be \OPENTYPE,
2926 \TRUETYPE, \TRUETYPE\ Collection, \CFF, or \TYPEONE.
2928 There are a few advantages to this approach compared to reading the
2929 actual font file ourselves:
2931 \startitemize
2933 \item The font is automatically re|-|encoded, so that the \type{metrics}
2934 table for \TRUETYPE\ and \OPENTYPE\ fonts is using \UNICODE\ for
2935 the character indices.
2937 \item Many features are pre|-|processed into a format that is easier to handle
2938 than just the bare tables would be.
2940 \item \POSTSCRIPT|-|based \OPENTYPE\ fonts do not store the character height and
2941 depth in the font file, so the character boundingbox has to be
2942 calculated in some way.
2944 \item In the future, it may be interesting to allow \LUA\ scripts access to
2945 the font program itself, perhaps even creating or changing the font.
2947 \stopitemize
2949 A loaded font is discarded with:
2951 \startfunctioncall
2952 fontloader.close(<userdata> font)
2953 \stopfunctioncall
2955 \subsection{Applying a \quote{feature file}}
2957 You can apply a \quote{feature file} to a loaded font:
2959 \startfunctioncall
2960 <table> errors = fontloader.apply_featurefile(<userdata> font, <string> filename)
2961 \stopfunctioncall
2963 A \quote{feature file} is a textual representation of the features in an
2964 \OPENTYPE\ font. See\crlf
2965 \hyphenatedurl {http://www.adobe.com/devnet/opentype/afdko/topic_feature_file_syntax.html}\crlf
2966 and\crlf
2967 \hyphenatedurl {http://fontforge.sourceforge.net/featurefile.html}\crlf
2968 for a more detailed description of feature files.
2970 If the function fails, the return value is a table containing any
2971 errors reported by fontloader while applying the feature file. On
2972 success, \type{nil} is returned. (the return value is new in 0.65)
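
A caller can therefore treat \type{nil} as success and anything else as a
list of problems. The following sketch assumes that the error table is an
array of messages, which is not spelled out here; \type{apply_or_report}
is a hypothetical helper.

\starttyping
-- Sketch only: run an apply function and report what it returns.
-- 'apply' is expected to behave like fontloader.apply_featurefile.
local function apply_or_report(apply, f, filename)
  local errors = apply(f, filename)
  if errors == nil then
    return true             -- success
  end
  for _, message in ipairs(errors) do
    io.stderr:write(filename, ": ", tostring(message), "\n")
  end
  return false
end

-- Inside LuaTeX ('extra.fea' is a placeholder file name):
if fontloader then
  local f = fontloader.open('/opt/tex/texmf/fonts/data/arial.ttf')
  if f then
    apply_or_report(fontloader.apply_featurefile, f, 'extra.fea')
    fontloader.close(f)
  end
end
\stoptyping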
2976 \subsection{Applying an \quote{\AFM\ file}}
2978 You can apply an \quote{\AFM\ file} to a loaded font:
2980 \startfunctioncall
2981 <table> errors = fontloader.apply_afmfile(<userdata> font, <string> filename)
2982 \stopfunctioncall
An \AFM\ file is a textual representation of (some of) the meta information
in a \TYPEONE\ font. See \hyphenatedurl{ftp://ftp.math.utah.edu/u/ma/hohn/linux/postscript/5004.AFM_Spec.pdf}
for more information about \AFM\ files.
2988 Note: If you \type{fontloader.open()} a \TYPEONE\ file named \type{font.pfb},
2989 the library will automatically search for and apply \type{font.afm}
2990 if it exists in the same directory as the file \type{font.pfb}. In that case,
2991 there is no need for an explicit call to \type{apply_afmfile()}.
If the function fails, the return value is a table containing any
errors reported by fontloader while applying the \AFM\ file. On
success, \type{nil} is returned. (The return value is new in 0.65.)
2997 \subsection[fontloadertables]{Fontloader font tables}
2999 As mentioned earlier, the return value of \type{fontloader.open()} is
3000 a userdata object. In \LUATEX\ versions before 0.63, the only way to
3001 have access to the actual metrics was to call
3002 \type{fontloader.to_table()} on this object, returning the table
3003 structure that is explained in the following subsections.
3005 However, it turns out that the result from
3006 \type{fontloader.to_table()} sometimes needs very large amounts of memory
3007 (depending on the font's complexity and size) so starting with \LUATEX\ 0.63,
3008 it is possible to access the userdata object directly.
In \LUATEX\ 0.63.0, the following is implemented:
3012 \startitemize
3013 \item all top-level keys that would be returned by \type{to_table()}
3014 can also be accessed directly.
3015 \item the top-level key \quote{glyphs} returns a {\it virtual\/} array that
3016 allows indices from \type{0} to ($\type{f.glyphmax}-1$).
3017 \item the items in that virtual array (the actual glyphs) are themselves also
3018 userdata objects, and each has accessors for all of the keys
3019 explained in the section \quote{Glyph items} below.
3020 \item the top-level key \quote{subfonts} returns an {\it actual} array of
3021 userdata objects, one for each of the subfonts (or nil, if there are no subfonts).
3022 \stopitemize
3025 A short example may be helpful. This code generates a printout of all
3026 the glyph names in the font \type{PunkNova.kern.otf}:
3028 \starttyping
local f = fontloader.open('PunkNova.kern.otf')
print (f.fontname)
local i = 0
while (i < f.glyphmax) do
  local g = f.glyphs[i]
  if g then
    print(g.name)
  end
  i = i + 1
end
fontloader.close(f)
3040 \stoptyping
In this case, the \LUATEX\ memory requirement stays below 100MB on the
test computer, while the internal structure generated by
\type{to_table()} needs more than 2GB of memory (the font itself is
6.9MB on disk).
In \LUATEX\ 0.63 only the top-level font, the subfont table entries,
and the glyphs are virtual objects; everything else still produces
normal \LUA\ values and tables. In future versions, more return values
may be replaced by userdata objects (as much as needed to keep the
memory requirements in check).
3053 If you want to know the valid fields in a font or glyph
3054 structure, call the \type{fields} function on an object of a
3055 particular type (either glyph or font for now, more will be
3056 implemented later):
3058 \startfunctioncall
3059 <table> fields = fontloader.fields(<userdata> font)
3060 <table> fields = fontloader.fields(<userdata> font_glyph)
3061 \stopfunctioncall
3063 For instance:
3065 \startfunctioncall
3066 local fields = fontloader.fields(f)
3067 local fields = fontloader.fields(f.glyphs[0])
3068 \stopfunctioncall
3071 \subsubsection{Table types}
3073 \subsubsubsection{Top-level}
3075 The top|-|level keys in the returned table are (the explanations in
3076 this part of the documentation are not yet finished):
3078 \starttabulate[|lT|l|p|]
3079 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3080 \NC table_version \NC number \NC indicates the metrics version (currently~0.3)\NC\NR
3081 \NC fontname \NC string \NC \POSTSCRIPT\ font name\NC\NR
3082 \NC fullname \NC string \NC official (human-oriented) font name\NC\NR
3083 \NC familyname \NC string \NC family name\NC\NR
3084 \NC weight \NC string \NC weight indicator\NC\NR
3085 \NC copyright \NC string \NC copyright information\NC\NR
3086 \NC filename \NC string \NC the file name\NC\NR
3087 \NC version \NC string \NC font version\NC\NR
3088 \NC italicangle \NC float \NC slant angle\NC\NR
3089 \NC units_per_em \NC number \NC 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC\NR
3090 \NC ascent \NC number \NC height of ascender in \type{units_per_em}\NC\NR
3091 \NC descent \NC number \NC depth of descender in \type{units_per_em}\NC\NR
3092 \NC upos \NC float \NC \NC\NR
3093 \NC uwidth \NC float \NC \NC\NR
3094 \NC uniqueid \NC number \NC \NC\NR
3095 \NC glyphcnt \NC number \NC number of included glyphs\NC\NR
3096 \NC glyphs \NC array \NC \NC\NR
\NC glyphmax \NC number \NC maximum used index in the glyphs array\NC\NR
3098 \NC hasvmetrics \NC number \NC \NC\NR
3099 \NC onlybitmaps \NC number \NC \NC\NR
3100 \NC serifcheck \NC number \NC \NC\NR
3101 \NC isserif \NC number \NC \NC\NR
3102 \NC issans \NC number \NC \NC\NR
3103 \NC encodingchanged \NC number \NC \NC\NR
3104 \NC strokedfont \NC number \NC \NC\NR
3105 \NC use_typo_metrics \NC number \NC \NC\NR
3106 \NC weight_width_slope_only \NC number \NC \NC\NR
3107 \NC head_optimized_for_cleartype \NC number \NC \NC\NR
3108 \NC uni_interp \NC enum \NC \type {unset}, \type {none}, \type {adobe},
3109 \type {greek}, \type {japanese}, \type {trad_chinese},
3110 \type {simp_chinese}, \type {korean}, \type {ams}\NC\NR
3111 \NC origname \NC string \NC the file name, as supplied by the user\NC\NR
3112 \NC map \NC table \NC \NC\NR
3113 \NC private \NC table \NC \NC\NR
3114 \NC xuid \NC string \NC \NC\NR
3115 \NC pfminfo \NC table \NC \NC\NR
3116 \NC names \NC table \NC \NC\NR
3117 \NC cidinfo \NC table \NC \NC\NR
3118 \NC subfonts \NC array \NC \NC\NR
\NC comments \NC string \NC \NC\NR
3120 \NC fontlog \NC string \NC \NC\NR
3121 \NC cvt_names \NC string \NC \NC\NR
3122 \NC anchor_classes \NC table \NC \NC\NR
3123 \NC ttf_tables \NC table \NC \NC\NR
3124 \NC ttf_tab_saved \NC table \NC \NC\NR
3125 \NC kerns \NC table \NC \NC\NR
3126 \NC vkerns \NC table \NC \NC\NR
3127 \NC texdata \NC table \NC \NC\NR
3128 \NC lookups \NC table \NC \NC\NR
3129 \NC gpos \NC table \NC \NC\NR
3130 \NC gsub \NC table \NC \NC\NR
3131 \NC mm \NC table \NC \NC\NR
3132 \NC chosenname \NC string \NC \NC\NR
3133 \NC macstyle \NC number \NC \NC\NR
3134 \NC fondname \NC string \NC \NC\NR
3135 %\NC design_size \NC number \NC \NC\NR
3136 \NC fontstyle_id \NC number \NC \NC\NR
3137 \NC fontstyle_name \NC table \NC \NC\NR
3138 %\NC design_range_bottom \NC number \NC \NC\NR
3139 %\NC design_range_top \NC number \NC \NC\NR
3140 \NC strokewidth \NC float \NC \NC\NR
3141 \NC mark_classes \NC table \NC \NC\NR
3142 \NC creationtime \NC number \NC \NC\NR
3143 \NC modificationtime \NC number \NC \NC\NR
3144 \NC os2_version \NC number \NC \NC\NR
3145 \NC sfd_version \NC number \NC \NC\NR
3146 \NC math \NC table \NC \NC\NR
3147 \NC validation_state \NC table \NC \NC\NR
3148 \NC horiz_base \NC table \NC \NC\NR
3149 \NC vert_base \NC table \NC \NC\NR
3150 \NC extrema_bound \NC number \NC \NC\NR
3151 \stoptabulate
3153 \subsubsubsection{Glyph items}
The \type{glyphs} field is an array containing the per|-|character
information (quite a few of these keys are only present if nonzero).
3158 \starttabulate[|lT|l|p|]
3159 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3160 \NC name \NC string \NC the glyph name\NC\NR
3161 \NC unicode \NC number \NC unicode code point, or -1\NC\NR
3162 \NC boundingbox \NC array \NC array of four numbers, see note below\NC\NR
3163 \NC width \NC number \NC only for horizontal fonts\NC\NR
3164 \NC vwidth \NC number \NC only for vertical fonts\NC\NR
3165 \NC tsidebearing \NC number \NC only for vertical ttf/otf fonts, and only if nonzero (0.79.0)\NC\NR
3166 \NC lsidebearing \NC number \NC only if nonzero and not equal to boundingbox[1]\NC\NR
3167 \NC class \NC string \NC one of "none", "base", "ligature", "mark", "component"
3168 (if not present, the glyph class is \quote{automatic})\NC\NR
3169 \NC kerns \NC array \NC only for horizontal fonts, if set\NC\NR
3170 \NC vkerns \NC array \NC only for vertical fonts, if set\NC\NR
3171 \NC dependents \NC array \NC linear array of glyph name strings, only if nonempty\NC\NR
3172 \NC lookups \NC table \NC only if nonempty\NC\NR
3173 \NC ligatures \NC table \NC only if nonempty\NC\NR
3174 \NC anchors \NC table \NC only if set\NC\NR
3175 \NC comment \NC string \NC only if set\NC\NR
3176 \NC tex_height \NC number \NC only if set\NC\NR
3177 \NC tex_depth \NC number \NC only if set\NC\NR
3178 \NC italic_correction \NC number \NC only if set\NC\NR
3179 \NC top_accent \NC number \NC only if set\NC\NR
3180 \NC is_extended_shape \NC number \NC only if this character is part of a math extension list\NC\NR
3181 \NC altuni \NC table \NC alternate \UNICODE\ items \NC\NR
3182 \NC vert_variants \NC table \NC \NC \NR
3183 \NC horiz_variants \NC table \NC \NC \NR
3184 \NC mathkern \NC table \NC \NC \NR
3185 \stoptabulate
3187 On \type{boundingbox}: The boundingbox information for \TRUETYPE\ fonts and \TRUETYPE-based \OTF\ fonts is read
3188 directly from the font file. \POSTSCRIPT-based fonts do not have this information, so the boundingbox of
3189 traditional \POSTSCRIPT\ fonts is generated by interpreting the actual bezier curves to find the exact
3190 boundingbox. This can be a slow process, so starting from \LUATEX\ 0.45, the boundingboxes of \POSTSCRIPT-based
3191 \OTF\ fonts (and raw \CFF\ fonts) are calculated using an approximation of the glyph shape based on the actual
3192 glyph points only, instead of taking the whole curve into account. This means that glyphs that have missing
3193 points at extrema will have a too-tight boundingbox, but the processing is so much faster that in our opinion
3194 the tradeoff is worth it.
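
As an illustration of how the boundingbox can feed character dimensions, the
sketch below derives width, height and depth for a given font size. It
assumes the four numbers are ordered as lower left x, lower left y, upper
right x and upper right y, which is not stated explicitly here, and the
helper name \type{glyph_dimensions} is hypothetical.

\starttyping
-- Sketch only: derive TeX-style dimensions from a glyph item.
-- Assumption: boundingbox = { llx, lly, urx, ury } in font units.
local function glyph_dimensions(glyph, units_per_em, size)
  local function scale(v)
    return v * size / units_per_em     -- font units to scaled points
  end
  local bb = glyph.boundingbox
  return {
    width  = scale(glyph.width or 0),
    height = scale(math.max(bb[4], 0)),   -- extent above the baseline
    depth  = scale(math.max(-bb[2], 0)),  -- extent below the baseline
  }
end

local sample = { width = 500, boundingbox = { 50, -200, 450, 700 } }
local d = glyph_dimensions(sample, 1000, 10 * 65536)
print(d.width, d.height, d.depth)
\stoptyping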
3197 The \type{kerns} and \type{vkerns} are linear arrays of small hashes:
3199 \starttabulate[|lT|l|p|]
3200 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3201 \NC char \NC string \NC \NC\NR
3202 \NC off \NC number \NC \NC\NR
3203 \NC lookup \NC string \NC \NC\NR
3204 \stoptabulate
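
Since the explanations for these keys are still empty, the following lookup
is only a guess at their meaning: it assumes \type{char} names the glyph
that follows and \type{off} is the kern amount in font units. The helper
name \type{kern_with} is hypothetical.

\starttyping
-- Sketch only: find the kern between a glyph and a following glyph name.
-- Assumptions: k.char is the name of the next glyph, k.off the kern value.
local function kern_with(glyph, next_name)
  for _, k in ipairs(glyph.kerns or {}) do
    if k.char == next_name then
      return k.off, k.lookup
    end
  end
  return 0                  -- no kern entry for this pair
end

local sample = {
  name  = "A",
  kerns = { { char = "V", off = -80, lookup = "pp_l_1_s" } },
}
print(kern_with(sample, "V"))
print(kern_with(sample, "x"))
\stoptyping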
The \type{lookups} field is a hash keyed by lookup subtable name; the
value of each key is a linear array of small hashes:
3209 % TODO: fix this description
3210 \starttabulate[|lT|l|p|]
3211 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3212 \NC type \NC enum \NC \type {position}, \type {pair}, \type {substitution}, \type {alternate},
3213 \type {multiple}, \type {ligature}, \type {lcaret}, \type {kerning}, \type {vkerning}, \type {anchors},
3214 \type {contextpos}, \type {contextsub}, \type {chainpos}, \type {chainsub},
3215 \type {reversesub}, \type {max}, \type {kernback}, \type {vkernback} \NC\NR
3216 \NC specification \NC table \NC extra data \NC\NR
3217 \stoptabulate
3219 For the first seven values of \type{type}, there can be additional
3220 sub|-|information, stored in the sub-table \type{specification}:
3222 \starttabulate[|lT|l|p|]
3223 \NC \ssbf value \NC \bf type \NC \bf explanation \NC\NR
3224 \NC position \NC table \NC a table of the \type {offset_specs} type\NC\NR
3225 \NC pair \NC table \NC one string: \type {paired}, and an array of one or
3226 two \type {offset_specs} tables: \type{offsets}\NC\NR
3227 \NC substitution \NC table \NC one string: \type {variant}\NC\NR
3228 \NC alternate \NC table \NC one string: \type {components}\NC\NR
3229 \NC multiple \NC table \NC one string: \type {components}\NC\NR
3230 \NC ligature \NC table \NC two strings: \type {components}, \type {char}\NC\NR
3231 \NC lcaret \NC array \NC linear array of numbers\NC\NR
3232 \stoptabulate
3234 Tables for \type{offset_specs} contain up to four number|-|valued
3235 fields: \type{x} (a horizontal offset), \type{y} (a vertical offset),
3236 \type{h} (an advance width correction) and \type{v} (an advance height
3237 correction).
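
Because any of the four fields may be absent, code that consumes such a
table usually wants defaults. A minimal sketch, assuming that a missing
field means zero (the helper name \type{complete_offsets} is hypothetical):

\starttyping
-- Sketch only: fill in the fields of an offset_specs table.
-- Assumption: an absent field stands for 0.
local function complete_offsets(spec)
  spec = spec or {}
  return {
    x = spec.x or 0,        -- horizontal offset
    y = spec.y or 0,        -- vertical offset
    h = spec.h or 0,        -- advance width correction
    v = spec.v or 0,        -- advance height correction
  }
end

local o = complete_offsets({ x = -30, h = -30 })
print(o.x, o.y, o.h, o.v)
\stoptyping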
The \type{ligatures} field is a linear array of small hashes:
3241 \starttabulate[|lT|l|p|]
3242 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3243 \NC lig \NC table \NC uses the same substructure as a single item in the \type{lookups} table explained above\NC\NR
3244 \NC char \NC string \NC \NC\NR
3245 \NC components \NC array \NC linear array of named components\NC\NR
3246 \NC ccnt \NC number \NC \NC\NR
3247 \stoptabulate
3249 The \type{anchor} table is indexed by a string signifying the
3250 anchor type, which is one of
3252 \starttabulate[|lT|l|p|]
3253 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3254 \NC mark \NC table \NC placement mark\NC\NR
3255 \NC basechar \NC table \NC mark for attaching combining items to a base char\NC\NR
3256 \NC baselig \NC table \NC mark for attaching combining items to a ligature\NC\NR
3257 \NC basemark \NC table \NC generic mark for attaching combining items to connect to\NC\NR
3258 \NC centry \NC table \NC cursive entry point\NC\NR
3259 \NC cexit \NC table \NC cursive exit point\NC\NR
3260 \stoptabulate
3262 The content of these is a short array of defined anchors, with the
3263 entry keys being the anchor names. For all except \type{baselig}, the
3264 value is a single table with this definition:
3266 \starttabulate[|lT|l|p|]
3267 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3268 \NC x \NC number \NC x location\NC\NR
3269 \NC y \NC number \NC y location\NC\NR
3270 \NC ttf_pt_index \NC number \NC truetype point index, only if given\NC\NR
3271 \stoptabulate
For \type{baselig}, the value is a small array of such anchor
sets, one for each constituent item of the ligature.
For clarification, an anchor table could for example look like this:
3278 \starttyping
['anchor'] = {
  ['basemark'] = {
    ['Anchor-7'] = { ['x']=170, ['y']=1080 }
  },
  ['mark'] = {
    ['Anchor-1'] = { ['x']=160, ['y']=810 },
    ['Anchor-4'] = { ['x']=160, ['y']=800 }
  },
  ['baselig'] = {
    [1] = { ['Anchor-2'] = { ['x']=160, ['y']=650 } },
    [2] = { ['Anchor-2'] = { ['x']=460, ['y']=640 } }
  }
}
3292 \stoptyping
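
The extra nesting level of \type{baselig} is the main thing to watch when
walking such a table. The sketch below flattens an anchor table into
printable lines; \type{list_anchors} is a hypothetical helper and the sample
data merely mirrors the layout described above.

\starttyping
-- Sketch only: flatten an anchor table into sorted text lines.
local function list_anchors(anchors)
  local out = {}
  local function add(kind, name, a)
    out[#out + 1] = string.format("%s %s %s %s", kind, name,
                                  tostring(a.x), tostring(a.y))
  end
  for kind, set in pairs(anchors) do
    if kind == "baselig" then
      -- one anchor set per ligature component
      for component, names in ipairs(set) do
        for name, a in pairs(names) do
          add(kind .. "[" .. component .. "]", name, a)
        end
      end
    else
      for name, a in pairs(set) do
        add(kind, name, a)
      end
    end
  end
  table.sort(out)
  return out
end

local sample = {
  mark    = { ["Anchor-1"] = { x = 160, y = 810 } },
  baselig = {
    { ["Anchor-2"] = { x = 160, y = 650 } },
    { ["Anchor-2"] = { x = 460, y = 640 } },
  },
}
for _, line in ipairs(list_anchors(sample)) do
  print(line)
end
\stoptyping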
3294 \subsubsubsection{map table}
3296 The top|-|level map is a list of encoding mappings. Each of those is a table itself.
3298 \starttabulate[|lT|l|p|]
3299 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3300 \NC enccount \NC number \NC \NC\NR
3301 \NC encmax \NC number \NC \NC\NR
3302 \NC backmax \NC number \NC \NC\NR
3303 \NC remap \NC table \NC \NC\NR
3304 \NC map \NC array \NC non|-|linear array of mappings\NC\NR
3305 \NC backmap \NC array \NC non|-|linear array of backward mappings\NC\NR
3306 \NC enc \NC table \NC \NC\NR
3307 \stoptabulate
3309 The \type{remap} table is very small:
3311 \starttabulate[|lT|l|p|]
3312 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3313 \NC firstenc \NC number \NC \NC\NR
3314 \NC lastenc \NC number \NC \NC\NR
3315 \NC infont \NC number \NC \NC\NR
3316 \stoptabulate
3318 The \type{enc} table is a bit more verbose:
3320 \starttabulate[|lT|l|p|]
3321 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3322 \NC enc_name \NC string \NC \NC\NR
3323 \NC char_cnt \NC number \NC \NC\NR
3324 \NC char_max \NC number \NC \NC\NR
3325 \NC unicode \NC array \NC of \UNICODE\ position numbers\NC\NR
3326 \NC psnames \NC array \NC of \POSTSCRIPT\ glyph names\NC\NR
3327 \NC builtin \NC number \NC \NC\NR
3328 \NC hidden \NC number \NC \NC\NR
3329 \NC only_1byte \NC number \NC \NC\NR
3330 \NC has_1byte \NC number \NC \NC\NR
3331 \NC has_2byte \NC number \NC \NC\NR
3332 \NC is_unicodebmp \NC number \NC only if nonzero\NC\NR
3333 \NC is_unicodefull \NC number \NC only if nonzero\NC\NR
3334 \NC is_custom \NC number \NC only if nonzero\NC\NR
3335 \NC is_original \NC number \NC only if nonzero\NC\NR
3336 \NC is_compact \NC number \NC only if nonzero\NC\NR
3337 \NC is_japanese \NC number \NC only if nonzero\NC\NR
3338 \NC is_korean \NC number \NC only if nonzero\NC\NR
3339 \NC is_tradchinese \NC number \NC only if nonzero [name?]\NC\NR
3340 \NC is_simplechinese \NC number \NC only if nonzero\NC\NR
3341 \NC low_page \NC number \NC \NC\NR
3342 \NC high_page \NC number \NC \NC\NR
3343 \NC iconv_name \NC string \NC \NC\NR
3344 \NC iso_2022_escape \NC string \NC \NC\NR
3345 \stoptabulate
3347 \subsubsubsection{private table}
3349 This is the font's private \POSTSCRIPT\ dictionary, if any. Keys and
3350 values are both strings.
3352 \subsubsubsection{cidinfo table}
3354 \starttabulate[|lT|l|p|]
3355 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3356 \NC registry \NC string \NC \NC\NR
3357 \NC ordering \NC string \NC \NC\NR
3358 \NC supplement \NC number \NC \NC\NR
3359 \NC version \NC number \NC \NC\NR
3360 \stoptabulate
3362 \subsubsubsection[fontloaderpfminfotable]{pfminfo table}
3364 The \type{pfminfo} table contains most of the OS/2 information:
3366 \starttabulate[|lT|l|p|]
3367 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3368 \NC pfmset \NC number \NC \NC\NR
3369 \NC winascent_add \NC number \NC \NC\NR
3370 \NC windescent_add \NC number \NC \NC\NR
3371 \NC hheadascent_add \NC number \NC \NC\NR
3372 \NC hheaddescent_add \NC number \NC \NC\NR
3373 \NC typoascent_add \NC number \NC \NC\NR
3374 \NC typodescent_add \NC number \NC \NC\NR
3375 \NC subsuper_set \NC number \NC \NC\NR
3376 \NC panose_set \NC number \NC \NC\NR
3377 \NC hheadset \NC number \NC \NC\NR
3378 \NC vheadset \NC number \NC \NC\NR
3379 \NC pfmfamily \NC number \NC \NC\NR
3380 \NC weight \NC number \NC \NC\NR
3381 \NC width \NC number \NC \NC\NR
3382 \NC avgwidth \NC number \NC \NC\NR
3383 \NC firstchar \NC number \NC \NC\NR
3384 \NC lastchar \NC number \NC \NC\NR
3385 \NC fstype \NC number \NC \NC\NR
3386 \NC linegap \NC number \NC \NC\NR
3387 \NC vlinegap \NC number \NC \NC\NR
3388 \NC hhead_ascent \NC number \NC \NC\NR
3389 \NC hhead_descent \NC number \NC \NC\NR
3391 \NC os2_typoascent \NC number \NC \NC\NR
3392 \NC os2_typodescent \NC number \NC \NC\NR
3393 \NC os2_typolinegap \NC number \NC \NC\NR
3394 \NC os2_winascent \NC number \NC \NC\NR
3395 \NC os2_windescent \NC number \NC \NC\NR
3396 \NC os2_subxsize \NC number \NC \NC\NR
3397 \NC os2_subysize \NC number \NC \NC\NR
3398 \NC os2_subxoff \NC number \NC \NC\NR
3399 \NC os2_subyoff \NC number \NC \NC\NR
3400 \NC os2_supxsize \NC number \NC \NC\NR
3401 \NC os2_supysize \NC number \NC \NC\NR
3402 \NC os2_supxoff \NC number \NC \NC\NR
3403 \NC os2_supyoff \NC number \NC \NC\NR
3404 \NC os2_strikeysize \NC number \NC \NC\NR
3405 \NC os2_strikeypos \NC number \NC \NC\NR
3406 \NC os2_family_class \NC number \NC \NC\NR
3407 \NC os2_xheight \NC number \NC \NC\NR
3408 \NC os2_capheight \NC number \NC \NC\NR
3409 \NC os2_defaultchar \NC number \NC \NC\NR
3410 \NC os2_breakchar \NC number \NC \NC\NR
3411 \NC os2_vendor \NC string \NC \NC\NR
3412 \NC codepages \NC table \NC A two-number array of encoded code pages\NC\NR
\NC unicoderanges \NC table \NC A four-number array of encoded unicode ranges\NC\NR
3414 \NC panose \NC table \NC \NC\NR
3415 \stoptabulate
3417 The \type{panose} subtable has exactly 10 string keys:
3419 \starttabulate[|lT|l|p|]
3420 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3421 \NC familytype \NC string \NC Values as in the \OPENTYPE\ font specification:
3422 \type {Any}, \type {No Fit}, \type {Text and Display}, \type {Script},
3423 \type {Decorative}, \type {Pictorial} \NC\NR
3424 \NC serifstyle \NC string \NC See the \OPENTYPE\ font specification for values\NC\NR
3425 \NC weight \NC string \NC id. \NC\NR
3426 \NC proportion \NC string \NC id. \NC\NR
3427 \NC contrast \NC string \NC id. \NC\NR
3428 \NC strokevariation \NC string \NC id. \NC\NR
3429 \NC armstyle \NC string \NC id. \NC\NR
3430 \NC letterform \NC string \NC id. \NC\NR
3431 \NC midline \NC string \NC id. \NC\NR
3432 \NC xheight \NC string \NC id. \NC\NR
3433 \stoptabulate
3435 \subsubsubsection[fontloadernamestable]{names table}
3437 Each item has two top|-|level keys:
3439 \starttabulate[|lT|l|p|]
3440 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3441 \NC lang \NC string \NC language for this entry \NC\NR
3442 \NC names \NC table \NC \NC\NR
3443 \stoptabulate
3445 The \type{names} keys are the actual \TRUETYPE\ name strings. The
3446 possible keys are:
3448 \starttabulate[|lT|p|]
3449 \NC \ssbf key \NC \bf explanation \NC\NR
3450 \NC copyright \NC \NC\NR
3451 \NC family \NC \NC\NR
3452 \NC subfamily \NC \NC\NR
3453 \NC uniqueid \NC \NC\NR
3454 \NC fullname \NC \NC\NR
3455 \NC version \NC \NC\NR
3456 \NC postscriptname \NC \NC\NR
3457 \NC trademark \NC \NC\NR
3458 \NC manufacturer \NC \NC\NR
3459 \NC designer \NC \NC\NR
3460 \NC descriptor \NC \NC\NR
3461 \NC venderurl \NC \NC\NR
3462 \NC designerurl \NC \NC\NR
3463 \NC license \NC \NC\NR
3464 \NC licenseurl \NC \NC\NR
3465 \NC idontknow \NC \NC\NR
3466 \NC preffamilyname \NC \NC\NR
3467 \NC prefmodifiers \NC \NC\NR
3468 \NC compatfull \NC \NC\NR
3469 \NC sampletext \NC \NC\NR
3470 \NC cidfindfontname \NC \NC\NR
3471 \NC wwsfamily \NC \NC\NR
3472 \NC wwssubfamily \NC \NC\NR
3473 \stoptabulate
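
A small plain \LUA\ sketch may help to show how this two|-|level layout can
be queried. The sample data, the language strings and the helper name
\type{find_name} are all assumptions made for this illustration; a real font
supplies its own values.

```lua
-- Hypothetical sample data in the shape described above.
local names = {
  { lang = "English (US)", names = { family = "Example", subfamily = "Regular" } },
  { lang = "German",       names = { family = "Beispiel" } },
}

-- find_name is a made-up helper, not a fontloader function. It returns the
-- requested name string of the first entry whose lang matches, or nil.
local function find_name(names, lang, key)
  for _, entry in ipairs(names) do
    if entry.lang == lang and entry.names then
      return entry.names[key]
    end
  end
  return nil
end

print(find_name(names, "English (US)", "family"))
```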
3475 \subsubsubsection{anchor_classes table}
The entries of \type{anchor_classes} have these keys:
3479 \starttabulate[|lT|l|p|]
3480 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3481 \NC name \NC string \NC a descriptive id of this anchor class\NC\NR
3482 \NC lookup \NC string \NC \NC\NR
3483 \NC type \NC string \NC one of \type {mark}, \type {mkmk}, \type {curs}, \type {mklg} \NC\NR
3484 \stoptabulate
3486 % type is actually a lookup subtype, not a feature name. Officially, these strings
3487 % should be gpos_mark2mark etc.
3489 \subsubsubsection{gpos table}
The \type{gpos} table has one array entry for each lookup. (The \type {gpos_} prefix is somewhat redundant.)
3493 \starttabulate[|lT|l|p|]
3494 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3495 \NC type \NC string \NC one of
3496 \type {gpos_single}, \type {gpos_pair}, \type {gpos_cursive},
3497 \type {gpos_mark2base},\crlf \type {gpos_mark2ligature}, \type {gpos_mark2mark}, \type {gpos_context},\crlf
3498 \type {gpos_contextchain}
3499 \NC\NR
3500 \NC flags \NC table \NC \NC\NR
3501 \NC name \NC string \NC \NC\NR
3502 \NC features \NC array \NC \NC\NR
3503 \NC subtables \NC array \NC \NC\NR
3504 \stoptabulate
3506 The flags table has a true value for each of the lookup flags that is
3507 actually set:
3509 \starttabulate[|lT|l|p|]
3510 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3511 \NC r2l \NC boolean \NC \NC\NR
3512 \NC ignorebaseglyphs \NC boolean \NC \NC\NR
3513 \NC ignoreligatures \NC boolean \NC \NC\NR
3514 \NC ignorecombiningmarks \NC boolean \NC \NC\NR
3515 \NC mark_class \NC string \NC (new in 0.44)\NC\NR
3516 \stoptabulate
The items of the \type{features} subtable of \type{gpos} have:
3521 \starttabulate[|lT|l|p|]
3522 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3523 \NC tag \NC string \NC \NC\NR
3524 \NC scripts \NC table \NC \NC\NR
3525 \stoptabulate
3527 The scripts table within features has:
3529 \starttabulate[|lT|l|p|]
3530 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3531 \NC script \NC string \NC \NC\NR
3532 \NC langs \NC array of strings \NC \NC\NR
3533 \stoptabulate
3536 The subtables table has:
3538 \starttabulate[|lT|l|p|]
3539 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3540 \NC name \NC string \NC \NC\NR
3541 \NC suffix \NC string \NC (only if used)\NC\NR % used by gpos_single to get a default
3542 \NC anchor_classes \NC number \NC (only if used)\NC\NR
3543 \NC vertical_kerning \NC number \NC (only if used)\NC\NR
3544 \NC kernclass \NC table \NC (only if used)\NC\NR
3545 \stoptabulate
The \type{kernclass} table within a subtables entry has:
3550 \starttabulate[|lT|l|p|]
3551 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3552 \NC firsts \NC array of strings \NC \NC\NR
3553 \NC seconds \NC array of strings \NC \NC\NR
3554 \NC lookup \NC string or array \NC associated lookup(s) \NC \NR
3555 \NC offsets \NC array of numbers \NC \NC\NR
3556 \stoptabulate
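
To make the nesting of lookups, flags, features and subtables more concrete,
here is a plain \LUA\ sketch that prints a one|-|line summary per lookup. The
sample lookups and their names are invented for this illustration, and
\type{summarize_lookups} is a hypothetical helper, not a \LUATEX\ function.

```lua
-- Hypothetical sample lookups in the shape described above.
local gpos = {
  { type = "gpos_pair", name = "kern-lookup",
    flags = { ignorecombiningmarks = true },
    features = { { tag = "kern" } },
    subtables = { { name = "kern-lookup-1" }, { name = "kern-lookup-2" } } },
  { type = "gpos_mark2base", name = "mark-lookup",
    flags = { },
    features = { { tag = "mark" } },
    subtables = { { name = "mark-lookup-1", anchor_classes = 1 } } },
}

-- Returns one summary string per lookup, in array order.
local function summarize_lookups(lookups)
  local lines = {}
  for i, lookup in ipairs(lookups) do
    local tags = {}
    for _, feature in ipairs(lookup.features or {}) do
      tags[#tags + 1] = feature.tag
    end
    -- only flags that are actually set appear in the flags table
    local flags = {}
    for flag, value in pairs(lookup.flags or {}) do
      if value then flags[#flags + 1] = flag end
    end
    table.sort(flags)
    lines[i] = string.format("%s (%s) features=%s flags=%s subtables=%d",
      lookup.name, lookup.type, table.concat(tags, ","),
      table.concat(flags, ","), #(lookup.subtables or {}))
  end
  return lines
end

for _, line in ipairs(summarize_lookups(gpos)) do print(line) end
```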
3558 \subsubsubsection{gsub table}
3560 This has identical layout to the \type{gpos} table, except for the
3561 type:
3563 \starttabulate[|lT|l|p|]
3564 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3565 \NC type \NC string \NC one of \type {gsub_single}, \type {gsub_multiple}, \type {gsub_alternate},
3566 \type {gsub_ligature},\crlf \type {gsub_context}, \type {gsub_contextchain}, \type {gsub_reversecontextchain}
3567 \NC\NR
3568 \stoptabulate
3572 \subsubsubsection{ttf_tables and ttf_tab_saved tables}
3574 \starttabulate[|lT|l|p|]
3575 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3576 \NC tag \NC string \NC \NC\NR
3577 \NC len \NC number \NC \NC\NR
3578 \NC maxlen \NC number \NC \NC\NR
3579 \NC data \NC number \NC \NC\NR
3580 \stoptabulate
3582 \subsubsubsection{mm table}
3584 \starttabulate[|lT|l|p|]
3585 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3586 \NC axes \NC table \NC array of axis names \NC \NR
3587 \NC instance_count \NC number \NC \NC \NR
3588 \NC positions \NC table \NC array of instance positions
3589 (\#axes * instances )\NC \NR
3590 \NC defweights \NC table \NC array of default weights for instances \NC \NR
3591 \NC cdv \NC string \NC \NC \NR
3592 \NC ndv \NC string \NC \NC \NR
3593 \NC axismaps \NC table \NC \NC \NR
3594 \stoptabulate
3596 The \type{axismaps}:
3598 \starttabulate[|lT|l|p|]
3599 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3600 \NC blends \NC table \NC an array of blend points \NC \NR
3601 \NC designs \NC table \NC an array of design values \NC \NR
3602 \NC min \NC number \NC \NC \NR
3603 \NC def \NC number \NC \NC \NR
3604 \NC max \NC number \NC \NC \NR
3605 \stoptabulate
3608 \subsubsubsection{mark_classes table (0.44)}
3610 The keys in this table are mark class names, and the values
3611 are a space-separated string of glyph names in this class.
Note: This table is indeed new in 0.44. The manual said it existed
before then, but in practice it was missing due to a bug.
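
Because each value is a single space-separated string, it usually has to be
split before use. A minimal plain \LUA\ sketch, with invented class and glyph
names and a hypothetical helper \type{glyphs_in_class}:

```lua
-- Hypothetical sample data; real class and glyph names come from the font.
local mark_classes = {
  MarkClass1 = "gravecomb acutecomb",
  MarkClass2 = "uni0327",
}

-- Split the space-separated glyph list of one class into an array.
local function glyphs_in_class(mark_classes, class)
  local names = {}
  for name in string.gmatch(mark_classes[class] or "", "%S+") do
    names[#names + 1] = name
  end
  return names
end

print(table.concat(glyphs_in_class(mark_classes, "MarkClass1"), ", "))
```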
3616 \subsubsubsection{math table}
3618 \starttabulate[|lT|p|]
3619 \NC ScriptPercentScaleDown \NC \NC \NR
3620 \NC ScriptScriptPercentScaleDown \NC \NC \NR
3621 \NC DelimitedSubFormulaMinHeight \NC \NC \NR
3622 \NC DisplayOperatorMinHeight \NC \NC \NR
3623 \NC MathLeading \NC \NC \NR
3624 \NC AxisHeight \NC \NC \NR
3625 \NC AccentBaseHeight \NC \NC \NR
3626 \NC FlattenedAccentBaseHeight \NC \NC \NR
3627 \NC SubscriptShiftDown \NC \NC \NR
3628 \NC SubscriptTopMax \NC \NC \NR
3629 \NC SubscriptBaselineDropMin \NC \NC \NR
3630 \NC SuperscriptShiftUp \NC \NC \NR
3631 \NC SuperscriptShiftUpCramped \NC \NC \NR
3632 \NC SuperscriptBottomMin \NC \NC \NR
3633 \NC SuperscriptBaselineDropMax \NC \NC \NR
3634 \NC SubSuperscriptGapMin \NC \NC \NR
3635 \NC SuperscriptBottomMaxWithSubscript \NC \NC \NR
3636 \NC SpaceAfterScript \NC \NC \NR
3637 \NC UpperLimitGapMin \NC \NC \NR
3638 \NC UpperLimitBaselineRiseMin \NC \NC \NR
3639 \NC LowerLimitGapMin \NC \NC \NR
3640 \NC LowerLimitBaselineDropMin \NC \NC \NR
3641 \NC StackTopShiftUp \NC \NC \NR
3642 \NC StackTopDisplayStyleShiftUp \NC \NC \NR
3643 \NC StackBottomShiftDown \NC \NC \NR
3644 \NC StackBottomDisplayStyleShiftDown \NC \NC \NR
3645 \NC StackGapMin \NC \NC \NR
3646 \NC StackDisplayStyleGapMin \NC \NC \NR
3647 \NC StretchStackTopShiftUp \NC \NC \NR
3648 \NC StretchStackBottomShiftDown \NC \NC \NR
3649 \NC StretchStackGapAboveMin \NC \NC \NR
3650 \NC StretchStackGapBelowMin \NC \NC \NR
3651 \NC FractionNumeratorShiftUp \NC \NC \NR
3652 \NC FractionNumeratorDisplayStyleShiftUp \NC \NC \NR
3653 \NC FractionDenominatorShiftDown \NC \NC \NR
3654 \NC FractionDenominatorDisplayStyleShiftDown \NC \NC \NR
3655 \NC FractionNumeratorGapMin \NC \NC \NR
3656 \NC FractionNumeratorDisplayStyleGapMin \NC \NC \NR
3657 \NC FractionRuleThickness \NC \NC \NR
3658 \NC FractionDenominatorGapMin \NC \NC \NR
3659 \NC FractionDenominatorDisplayStyleGapMin \NC \NC \NR
3660 \NC SkewedFractionHorizontalGap \NC \NC \NR
3661 \NC SkewedFractionVerticalGap \NC \NC \NR
3662 \NC OverbarVerticalGap \NC \NC \NR
3663 \NC OverbarRuleThickness \NC \NC \NR
3664 \NC OverbarExtraAscender \NC \NC \NR
3665 \NC UnderbarVerticalGap \NC \NC \NR
3666 \NC UnderbarRuleThickness \NC \NC \NR
3667 \NC UnderbarExtraDescender \NC \NC \NR
3668 \NC RadicalVerticalGap \NC \NC \NR
3669 \NC RadicalDisplayStyleVerticalGap \NC \NC \NR
3670 \NC RadicalRuleThickness \NC \NC \NR
3671 \NC RadicalExtraAscender \NC \NC \NR
3672 \NC RadicalKernBeforeDegree \NC \NC \NR
3673 \NC RadicalKernAfterDegree \NC \NC \NR
3674 \NC RadicalDegreeBottomRaisePercent \NC \NC \NR
3675 \NC MinConnectorOverlap \NC \NC \NR
3676 \NC FractionDelimiterSize \NC (new in 0.47.0)\NC \NR
3677 \NC FractionDelimiterDisplayStyleSize \NC (new in 0.47.0)\NC \NR
3678 \stoptabulate
3680 \subsubsubsection{validation_state table}
3682 \starttabulate[|lT|p|]
3683 \NC \ssbf key \NC \bf explanation \NC\NR
3684 \NC bad_ps_fontname \NC \NC \NR
3685 \NC bad_glyph_table \NC \NC \NR
3686 \NC bad_cff_table \NC \NC \NR
3687 \NC bad_metrics_table \NC \NC \NR
3688 \NC bad_cmap_table \NC \NC \NR
3689 \NC bad_bitmaps_table \NC \NC \NR
3690 \NC bad_gx_table \NC \NC \NR
3691 \NC bad_ot_table \NC \NC \NR
3692 \NC bad_os2_version \NC \NC \NR
3693 \NC bad_sfnt_header \NC \NC \NR
3694 \stoptabulate
3696 \subsubsubsection{horiz_base and vert_base table}
3698 \starttabulate[|lT|l|p|]
3699 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3700 \NC tags \NC table \NC an array of script list tags\NC \NR
3701 \NC scripts \NC table \NC \NC \NR
3702 \stoptabulate
3705 The \type{scripts} subtable:
3707 \starttabulate[|lT|l|p|]
3708 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3709 \NC baseline \NC table \NC \NC \NR
3710 \NC default_baseline \NC number \NC \NC \NR
3711 \NC lang \NC table \NC \NC \NR
3712 \stoptabulate
3715 The \type{lang} subtable:
3717 \starttabulate[|lT|l|p|]
3718 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3719 \NC tag \NC string \NC a script tag \NC \NR
3720 \NC ascent \NC number \NC \NC \NR
3721 \NC descent \NC number \NC \NC \NR
3722 \NC features \NC table \NC \NC \NR
3723 \stoptabulate
3725 The \type{features} points to an array of tables with the same layout
3726 except that in those nested tables, the tag represents a language.
3728 \subsubsubsection{altuni table}
3730 An array of alternate \UNICODE\ values. Inside that array
3731 are hashes with:
3733 \starttabulate[|lT|l|p|]
3734 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3735 \NC unicode \NC number \NC this glyph is also used for this unicode\NC \NR
3736 \NC variant \NC number \NC the alternative is driven by this unicode selector\NC \NR
3737 \stoptabulate
3739 \subsubsubsection{vert_variants and horiz_variants table}
3741 \starttabulate[|lT|l|p|]
3742 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3743 \NC variants \NC string \NC \NC \NR
3744 \NC italic_correction \NC number \NC \NC \NR
3745 \NC parts \NC table \NC \NC \NR
3746 \stoptabulate
3748 The \type{parts} table is an array of smaller tables:
3750 \starttabulate[|lT|l|p|]
3751 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3752 \NC component \NC string \NC \NC \NR
3753 \NC extender \NC number \NC \NC \NR
3754 \NC start \NC number \NC \NC \NR
3755 \NC end \NC number \NC \NC \NR
3756 \NC advance \NC number \NC \NC \NR
3757 \stoptabulate
3760 \subsubsubsection{mathkern table}
3762 \starttabulate[|lT|l|p|]
3763 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3764 \NC top_right \NC table \NC \NC \NR
3765 \NC bottom_right \NC table \NC \NC \NR
3766 \NC top_left \NC table \NC \NC \NR
3767 \NC bottom_left \NC table \NC \NC \NR
3768 \stoptabulate
3770 Each of the subtables is an array of small hashes with two keys:
3772 \starttabulate[|lT|l|p|]
3773 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3774 \NC height \NC number \NC \NC \NR
3775 \NC kern \NC number \NC \NC \NR
3776 \stoptabulate
3778 \subsubsubsection{kerns table}
3780 Substructure is identical to the per|-|glyph subtable.
3782 \subsubsubsection{vkerns table}
3784 Substructure is identical to the per|-|glyph subtable.
3786 \subsubsubsection{texdata table}
3789 \starttabulate[|lT|l|p|]
3790 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3791 \NC type \NC string \NC \type {unset}, \type {text}, \type {math}, \type {mathext}\NC\NR
3792 \NC params \NC array \NC 22 font numeric parameters\NC\NR
3793 \stoptabulate
3795 \subsubsubsection{lookups table}
3797 Top|-|level \type{lookups} is quite different from the ones at
3798 character level. The keys in this hash are strings, the values the
3799 actual lookups, represented as dictionary tables.
3801 \starttabulate[|lT|l|p|]
3802 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3803 \NC type \NC string \NC \NC\NR
3804 \NC format \NC enum \NC one of \type {glyphs}, \type {class}, \type {coverage}, \type {reversecoverage} \NC\NR
3805 \NC tag \NC string \NC \NC\NR
3806 \NC current_class \NC array \NC \NC\NR
3807 \NC before_class \NC array \NC \NC\NR
3808 \NC after_class \NC array \NC \NC\NR
3809 \NC rules \NC array \NC an array of rule items\NC\NR
3810 \stoptabulate
3812 Rule items have one common item and one specialized item:
3814 \starttabulate[|lT|l|p|]
3815 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3816 \NC lookups \NC array \NC a linear array of lookup names\NC\NR
3817 \NC glyphs \NC array \NC only if the parent's format is \type{glyphs}\NC\NR
3818 \NC class \NC array \NC only if the parent's format is \type{class}\NC\NR
3819 \NC coverage \NC array \NC only if the parent's format is \type{coverage}\NC\NR
3820 \NC reversecoverage \NC array \NC only if the parent's format is \type{reversecoverage}\NC\NR
3821 \stoptabulate
3823 A glyph table is:
3825 \starttabulate[|lT|l|p|]
3826 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3827 \NC names \NC string \NC \NC\NR
3828 \NC back \NC string \NC \NC\NR
3829 \NC fore \NC string \NC \NC\NR
3830 \stoptabulate
3832 A class table is:
3834 \starttabulate[|lT|l|p|]
3835 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3836 \NC current \NC array \NC of numbers \NC\NR
3837 \NC before \NC array \NC of numbers \NC\NR
3838 \NC after \NC array \NC of numbers \NC\NR
3839 \stoptabulate
A coverage table is:
3843 \starttabulate[|lT|l|p|]
3844 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3845 \NC current \NC array \NC of strings \NC\NR
3846 \NC before \NC array \NC of strings\NC\NR
3847 \NC after \NC array \NC of strings \NC\NR
3848 \stoptabulate
A reversecoverage table is:
3852 \starttabulate[|lT|l|p|]
3853 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3854 \NC current \NC array \NC of strings \NC\NR
3855 \NC before \NC array \NC of strings\NC\NR
3856 \NC after \NC array \NC of strings \NC\NR
3857 \NC replacements \NC string \NC \NC\NR
3858 \stoptabulate
3860 %***********************************************************************
3862 \section{The \luatex{img} library}
3864 The \type{img} library can be used as an alternative to
3865 \tex{pdfximage} and \tex{pdfrefximage}, and the associated \quote {satellite}
3866 commands like \tex{pdfximagebbox}.
3867 Image objects can also be used within virtual fonts
3868 via the \type{image} command listed in~\in{section}[virtualfonts].
3870 \subsection{\luatex{img.new}}
3872 \startfunctioncall
3873 <image> var = img.new()
3874 <image> var = img.new(<table> image_spec)
3875 \stopfunctioncall
This function creates a userdata object of type \quote {image}. The
\type{image_spec} argument is optional. If it is given, it must be
a table, and that table must contain a \type{filename} key. A number of
other keys can also be useful; these are explained below.
You can either say

\starttyping
a = img.new()
\stoptyping

followed by

\starttyping
a.filename = "foo.png"
\stoptyping

or you can put the file name (and some or all of the other keys)
into a table directly, like so:

\starttyping
a = img.new({filename='foo.pdf', page=1})
\stoptyping
3901 The generated \type{<image>} userdata object allows access to a set of
3902 user|-|specified values as well as a set of values that are normally
3903 filled in and updated automatically by \LUATEX\ itself. Some of those
3904 are derived from the actual image file, others are updated to reflect
3905 the \PDF\ output status of the object.
There is one required user-specified field: the file name
(\type{filename}). It can optionally be augmented by the requested
image dimensions (\type{width}, \type{depth}, \type{height}),
user-specified image attributes (\type{attr}), the requested \PDF\ page
identifier (\type{page}), the requested boundingbox (\type{pagebox})
for \PDF\ inclusion, and the requested color space object (\type{colorspace}).

The function \type{img.new} does not access the actual image file; it
just creates the \type{<image>} userdata object and initializes some
memory structures. The \type{<image>} object and its internal
structures are automatically garbage collected.
Once the image is scanned, all the values in the \type{<image>} object
except \type{width}, \type{height} and \type{depth} become frozen,
and you cannot change them any more.
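
The two styles of filling in an image object can be combined. The sketch
below is meant to be called from within \LUATEX\ (for instance from
\type{\directlua}); it builds an image object from a spec table and sets a
dimension afterwards. The names \type{pt} and \type{make_image} and the file
name are placeholders chosen for this illustration.

```lua
-- 1pt equals 65536 scaled points, the unit used by width, height and depth.
local function pt(n)
  return math.floor(n * 65536 + 0.5)
end

-- make_image is a hypothetical wrapper; it needs LuaTeX's img library.
local function make_image(filename, width_in_pt)
  local a = img.new({ filename = filename, page = 1 })
  if width_in_pt then
    a.width = pt(width_in_pt)  -- dimensions stay changeable even after scanning
  end
  return a                     -- the image file has not been accessed yet
end
```

A call such as \type{make_image("foo.pdf", 100)} would then request a width
of 100pt, leaving \type{height} and \type{depth} undefined.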
3923 \subsection{\luatex{img.keys}}
3925 \startfunctioncall
3926 <table> keys = img.keys()
3927 \stopfunctioncall
3929 This function returns a list of all the possible \type{image_spec}
3930 keys, both user-supplied and automatic ones.
3932 % hahe: i need to add r/w ro column...
3933 \starttabulate[|l|l|p|]
\NC \bf field name \NC \bf type \NC \bf description \NC \NR
3935 \NC attr \NC string \NC the image attributes for \LUATEX \NC \NR
3936 \NC bbox \NC table \NC table with 4 boundingbox dimensions
3937 \type{llx}, \type{lly}, \type{urx},
3938 and \type{ury} overruling the \type{pagebox}
3939 entry\NC \NR
3940 \NC colordepth \NC number \NC the number of bits used by the color space\NC \NR
3941 \NC colorspace \NC number \NC the color space object number \NC \NR
3942 \NC depth \NC number \NC the image depth for \LUATEX\
3943 (in scaled points)\NC \NR
3944 \NC filename \NC string \NC the image file name \NC \NR
3945 \NC filepath \NC string \NC the full (expanded) file name of the image\NC \NR
3946 \NC height \NC number \NC the image height for \LUATEX\
3947 (in scaled points)\NC \NR
3948 \NC imagetype \NC string \NC one of \type{pdf}, \type{png}, \type{jpg}, \type{jp2},
3949 \type{jbig2}, or \type{nil} \NC \NR
3950 \NC index \NC number \NC the \PDF\ image name suffix \NC \NR
3951 \NC objnum \NC number \NC the \PDF\ image object number \NC \NR
\NC page \NC number or string \NC the identifier for the requested image page
(default is the number 1)\NC \NR
3955 \NC pagebox \NC string \NC the requested bounding box, one of
3956 \type {none}, \type {media}, \type {crop},
3957 \type {bleed}, \type {trim}, \type {art} \NC \NR
3958 \NC pages \NC number \NC the total number of available pages \NC \NR
3959 \NC rotation \NC number \NC the image rotation from included \PDF\ file,
3960 in multiples of 90~deg. \NC \NR
\NC stream \NC string \NC the raw stream data for an \type{/XObject}
\type{/Form} object\NC \NR
3963 \NC transform \NC number \NC the image transform, integer number 0..7\NC \NR
3964 \NC width \NC number \NC the image width for \LUATEX\
3965 (in scaled points)\NC \NR
3966 \NC xres \NC number \NC the horizontal natural image resolution
3967 (in \DPI) \NC \NR
3968 \NC xsize \NC number \NC the natural image width \NC \NR
3969 \NC yres \NC number \NC the vertical natural image resolution
3970 (in \DPI) \NC \NR
3971 \NC ysize \NC number \NC the natural image height \NC \NR
3972 \stoptabulate
3974 A running (undefined) dimension in \type{width}, \type{height}, or \type{depth} is
3975 represented as \type{nil} in \LUA, so if you want to load an image at
3976 its \quote {natural} size, you do not have to specify any of those three fields.
The \type{stream} parameter allows you to fabricate an \type{/XObject} \type{/Form}
object from a string giving the stream contents,
e.\,g., for a filled rectangle:

\startfunctioncall
a.stream = "0 0 20 10 re f"
\stopfunctioncall

When writing the image, an \type{/XObject} \type{/Form} object is created,
like with embedded \PDF\ file writing. The object is written out only once.
The \type{stream} key also requires that the \type{bbox} table is given.
The \type{stream} key conflicts with the \type{filename} key.
The \type{transform} key works as usual also with \type{stream}.
The \type{bbox} key needs a table with four boundingbox values, e.\,g.:

\startfunctioncall
a.bbox = {"30bp", 0, "225bp", "200bp"}
\stopfunctioncall
This replaces and overrules any given \type{pagebox} value;
when \type{bbox} is given, the box dimensions coming with an embedded \PDF\ file
are ignored.
The \type{xsize} and \type{ysize} dimensions are set accordingly
when the image is scaled.
The \type{bbox} parameter is ignored for non-\PDF\ images.
The \type{transform} key allows you to mirror and rotate the image in steps of 90~deg.
The default value~0 gives an unmirrored, unrotated image.
Values 1|--|3 give counterclockwise rotation by 90, 180, or 270~degrees,
whereas with values 4|--|7 the image is first mirrored
and then rotated counterclockwise by 90, 180, or 270~degrees.
The \type{transform} operation gives the same visual result
as if you had preprocessed the image externally with a graphics tool
and then used it in \LUATEX.
If a \PDF\ file to be embedded already contains a \type{/Rotate} specification,
the rotation result is the combination of the \type{/Rotate} rotation
followed by the \type{transform} operation.
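
The numbering of the \type{transform} values can be decoded with a little
arithmetic. The plain \LUA\ sketch below does so under one assumption that the
text implies but does not state: value~4 is taken to mean \quote{mirrored,
not rotated}. The helper \type{describe_transform} is hypothetical and not
part of the \type{img} library.

```lua
-- Map a transform value 0..7 to (mirrored, counterclockwise degrees).
-- Assumption: 4 means "mirrored, no rotation".
local function describe_transform(t)
  assert(type(t) == "number" and t >= 0 and t <= 7 and t % 1 == 0,
         "transform must be an integer in 0..7")
  local mirrored = t >= 4
  local degrees = (t % 4) * 90
  return mirrored, degrees
end

for t = 0, 7 do
  local mirrored, degrees = describe_transform(t)
  print(t, mirrored and "mirrored" or "plain", degrees)
end
```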
4017 \subsection{\luatex{img.scan}}
4019 \startfunctioncall
4020 <image> var = img.scan(<image> var)
4021 <image> var = img.scan(<table> image_spec)
4022 \stopfunctioncall
When you say \type{img.scan(a)} for a new image, the file is scanned,
and variables such as \type{xsize}, \type{ysize}, \type{imagetype}, number of
\type{pages}, and the resolution are extracted. Each of the \type{width},
\type{height}, \type{depth} fields is set up according to the image dimensions,
if it was not given an explicit value already.
An image file will never be scanned more than once for a given image variable.
With all subsequent \type{img.scan(a)} calls only the dimensions are set up
again (if they have been changed by the user in the meantime).

For ease of use, you can directly say

\starttyping
a = img.scan ({ filename = "foo.png" })
\stoptyping

without a prior \type{img.new}.
Nothing is written yet at this point, so you can do \type{a=img.scan},
retrieve the available info like image width and height, and then
throw away \type{a} again by saying \type{a=nil}. In that case no
image object will be reserved in the \PDF, and the used memory will be
cleaned up automatically.
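
A typical use of the scanned values is to scale an image to a requested width
while keeping its proportions. The following sketch is meant to run inside
\LUATEX; the function name \type{scan_to_width} and the file name are
placeholders, and the code assumes that \type{width} and \type{height} were
left undefined before scanning, so that they hold the natural dimensions
afterwards.

```lua
-- scan_to_width is a hypothetical helper; it needs LuaTeX's img library.
-- 'width' is the requested width in scaled points.
local function scan_to_width(filename, width)
  local a = img.scan({ filename = filename })
  -- after scanning, width and height hold the natural dimensions
  local natural_width, natural_height = a.width, a.height
  a.width  = width
  a.height = math.floor(natural_height * width / natural_width + 0.5)
  return a   -- nothing has been written to the PDF at this point
end
```

For example, \type{scan_to_width("foo.png", 100 * 65536)} would request a
width of 100pt.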
4047 \subsection{\luatex{img.copy}}
4049 \startfunctioncall
4050 <image> var = img.copy(<image> var)
4051 <image> var = img.copy(<table> image_spec)
4052 \stopfunctioncall
If you say \type{a = b}, then both variables point to the same
\type{<image>} object. If you want to write out an image with
different sizes, you can say \type{b=img.copy(a)}.
4058 Afterwards, \type{a} and \type{b} still reference the same actual
4059 image dictionary, but the dimensions for \type{b} can now be changed
4060 from their initial values that were just copies from \type{a}.
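
As a sketch of this pattern, the function below writes the same image twice,
once at its scanned size and once at half that size. It has to be called from
within \LUATEX; the name \type{write_at_two_sizes} and the file name are
invented for this illustration.

```lua
-- write_at_two_sizes is a hypothetical helper; it needs LuaTeX's img library.
local function write_at_two_sizes(filename)
  local a = img.scan({ filename = filename })
  local b = img.copy(a)                 -- separate dimensions, same image data
  b.width  = math.floor(a.width / 2)
  b.height = math.floor(a.height / 2)
  img.write(a)
  img.write(b)
  return a, b
end
```

Only the dimensions of \type{b} are changed here; the other fields are frozen
once the image has been scanned.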
4062 % Hartmut, I don't know if this makes sense. An example of what
4063 % can, and what cannot be changed would be helpful.
4064 % -- will think about it...
4066 \subsection{\luatex{img.write}}
4068 \startfunctioncall
4069 <image> var = img.write(<image> var)
4070 <image> var = img.write(<table> image_spec)
4071 \stopfunctioncall
Calling \type{img.write(a)} allocates a \PDF\ object number,
generates a whatsit node of subtype \type{pdf_refximage},
and puts that node into the output list.
This places the image \type{a} into the page stream,
and the image file is written out into an image stream object
after the shipping of the current page is finished.
Again you can do a terse call like

\starttyping
img.write ({ filename = "foo.png" })
\stoptyping
4086 The \type{<image>} variable is returned in case you want it for later
4087 processing.
4089 \subsection{\luatex{img.immediatewrite}}
4091 \startfunctioncall
4092 <image> var = img.immediatewrite(<image> var)
4093 <image> var = img.immediatewrite(<table> image_spec)
4094 \stopfunctioncall
Calling \type{img.immediatewrite(a)} allocates a \PDF\ object number
and writes the image file for image \type{a}
immediately into the \PDF\ file as an image stream object (like
with \tex{immediate}\tex{pdfximage}). The object number of the image
stream dictionary is then available through the \type{objnum} key. No
\type{pdf_refximage} whatsit node is generated. You will need an
\luatex{img.write(a)} or \luatex{img.node(a)} call to let the
image appear on the page, or reference it by another trick; otherwise
you will have a dangling image object in the \PDF\ file.
Here, too, you can do a terse call like

\starttyping
a = img.immediatewrite ({ filename = "foo.png" })
\stoptyping
4112 The \type{<image>} variable is returned and you will most likely need it.
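
A sketch of how the returned object might be used: the image is embedded at
once, its object number is read, and a node is created so that the image can
still be placed on a page. The function name \type{embed_now} and the file
name are placeholders; the code has to be called inside a \LUATEX\ run that
produces \PDF\ output.

```lua
-- embed_now is a hypothetical helper; it needs LuaTeX's img library.
local function embed_now(filename)
  local a = img.immediatewrite({ filename = filename })
  local objnum = a.objnum   -- object number of the image stream dictionary
  local n = img.node(a)     -- pdf_refximage whatsit that refers to the image
  return objnum, n
end
```

Passing the returned node to \type{node.write} then makes the image appear on
the page, which avoids the dangling object mentioned above.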
4114 \subsection{\luatex{img.node}}
4116 \startfunctioncall
4117 <node> n = img.node(<image> var)
4118 <node> n = img.node(<table> image_spec)
4119 \stopfunctioncall
This function allocates a \PDF\ object number and returns a
whatsit node of subtype \type{pdf_refximage}, filled with the
image parameters \type{width}, \type{height}, \type{depth}, and
\type{objnum}. Here, too, you can do a terse call like:

\starttyping
n = img.node ({ filename = "foo.png" })
\stoptyping

This example outputs an image:

\starttyping
node.write(img.node{filename="foo.png"})
\stoptyping
4136 \subsection{\luatex{img.types}}
4138 \startfunctioncall
4139 <table> types = img.types()
4140 \stopfunctioncall
This function returns a list with the supported image file type names;
currently these are \type{pdf}, \type{png}, \type{jpg}, \type{jp2} (JPEG~2000),
and \type{jbig2}.
4146 \subsection{\luatex{img.boxes}}
4148 \startfunctioncall
4149 <table> boxes = img.boxes()
4150 \stopfunctioncall
4152 This function returns a list with the supported \PDF\ page box names.
4153 Currently these are \type {media}, \type {crop}, \type {bleed}, \type {trim}, and \type {art}
4154 (all in lowercase letters).
4156 %***********************************************************************
4158 \section{The \luatex{kpse} library}
4160 This library provides two separate, but nearly identical interfaces to
4161 the \KPATHSEA\ file search functionality: there is a \quote{normal}
4162 procedural interface that shares its kpathsea instance with \LUATEX\
4163 itself, and an object oriented interface that is completely on its
4164 own. The object oriented interface and \type{kpse.new} have been added
4165 in \LUATEX\ 0.37.
4167 \subsection{\luatex{kpse.set_program_name} and \luatex{kpse.new}}
4169 Before the search library can be used at all, its database has to be
4170 initialized. There are three possibilities, two of which belong to the
4171 procedural interface.
4173 First, when \LUATEX\ is used to typeset documents, this initialization
4174 happens automatically and the \KPATHSEA\ executable and program names
4175 are set to \type{luatex} (that is, unless explicitly prohibited by the
4176 user's startup script; see~\in{section}[init] for more details).
4178 Second, in \TEXLUA\ mode, the initialization has to be done explicitly
4179 via the \luatex{kpse.set_program_name} function, which sets the
4180 \KPATHSEA\ executable (and optionally program) name.
4182 \startfunctioncall
4183 kpse.set_program_name(<string> name)
4184 kpse.set_program_name(<string> name, <string> progname)
4185 \stopfunctioncall
4187 The second argument controls the use of the \quote{dotted} values in the
4188 \type{texmf.cnf} configuration file, and defaults to the first argument.
4190 Third, if you prefer the object oriented interface, you have to call a
4191 different function. It has the same arguments, but it returns a
4192 userdata variable.
4194 \startfunctioncall
4195 local kpathsea = kpse.new(<string> name)
4196 local kpathsea = kpse.new(<string> name, <string> progname)
4197 \stopfunctioncall
4199 Apart from these two functions, the calling conventions of the
4200 interfaces are identical. Depending on the chosen interface, you
4201 either call \type{kpse.find_file()} or \type{kpathsea:find_file()},
4202 with identical arguments and return values.
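The difference between the two calling conventions can be hidden behind a small wrapper. The following is a minimal sketch, not part of \LUATEX: \type{make_finder} is a hypothetical helper, and it only relies on \type{kpse.set_program_name}, \type{kpse.new} and \type{find_file} as described above. The demonstration at the end assumes the script is run with \TEXLUA.

```lua
-- Hypothetical helper (not part of LuaTeX): returns a plain function that
-- forwards its arguments to find_file of either interface.
local function make_finder(iface)
  if type(iface) == "userdata" then
    -- object returned by kpse.new: methods use the colon syntax
    return function(...) return iface:find_file(...) end
  end
  -- the procedural kpse table: ordinary function call
  return function(...) return iface.find_file(...) end
end

if kpse then -- the kpse table only exists inside LuaTeX/texlua
  kpse.set_program_name("luatex")
  local procedural = make_finder(kpse)
  local object     = make_finder(kpse.new("luatex"))
  print(procedural("plain.tex"), object("plain.tex"))
end
```

Because the wrapper forwards all arguments unchanged, the optional \type{ftype} and \type{mustexist} arguments work the same way for both interfaces.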
4204 \subsection{\luatex{find_file}}
4206 The most often used function in the library is \type{find_file}:
4208 \startfunctioncall
4209 <string> f = kpse.find_file(<string> filename)
4210 <string> f = kpse.find_file(<string> filename, <string> ftype)
4211 <string> f = kpse.find_file(<string> filename, <boolean> mustexist)
4212 <string> f = kpse.find_file(<string> filename, <string> ftype, <boolean> mustexist)
4213 <string> f = kpse.find_file(<string> filename, <string> ftype, <number> dpi)
4214 \stopfunctioncall
4216 Arguments:
4217 \startitemize[intro]
4219 \sym{filename}
4221 the name of the file you want to find, with or without extension.
4223 \sym{ftype}
4225 maps to the \type {-format} argument of \KPSEWHICH. The supported
4226 \type{ftype} values are the same as the ones supported by the
4227 standalone \type{kpsewhich} program:
4229 \startsimplecolumns
4230 \starttyping
4231 'gf'
4232 'pk'
4233 'bitmap font'
4234 'tfm'
4235 'afm'
4236 'base'
4237 'bib'
4238 'bst'
4239 'cnf'
4240 'ls-R'
4241 'fmt'
4242 'map'
4243 'mem'
4244 'mf'
4245 'mfpool'
4246 'mft'
4247 'mp'
4248 'mppool'
4249 'MetaPost support'
4250 'ocp'
4251 'ofm'
4252 'opl'
4253 'otp'
4254 'ovf'
4255 'ovp'
4256 'graphic/figure'
4257 'tex'
4258 'TeX system documentation'
4259 'texpool'
4260 'TeX system sources'
4261 'PostScript header'
4262 'Troff fonts'
4263 'type1 fonts'
4264 'vf'
4265 'dvips config'
4266 'ist'
4267 'truetype fonts'
4268 'type42 fonts'
4269 'web2c files'
4270 'other text files'
4271 'other binary files'
4272 'misc fonts'
4273 'web'
4274 'cweb'
4275 'enc files'
4276 'cmap files'
4277 'subfont definition files'
4278 'opentype fonts'
4279 'pdftex config'
4280 'lig files'
4281 'texmfscripts'
4282 'lua'
4283 'font feature files'
4284 'cid maps'
4285 'mlbib'
4286 'mlbst'
4287 'clua'
4288 \stoptyping
4289 \stopsimplecolumns
4291 The default type is \type{tex}. Note: this is different from
4292 \KPSEWHICH, which tries to deduce the file type itself by
4293 looking at the supplied extension. The four types
4294 \type{font feature files}, \type{cid maps}, \type{mlbib}, and \type{mlbst} were new
4295 additions in \LUATEX\ 0.40.2.
4298 \sym{mustexist}
4300 is similar to \KPSEWHICH's \type{-must-exist}, and the default is \type{false}.
4301 If you specify \type{true} (or a non|-|zero integer), then the \KPSE\ library
4302 will search the disk as well as the \type {ls-R} databases.
4304 \sym{dpi}
4306 This is used for the size argument of the formats \type{pk}, \type{gf}, and \type{bitmap font}.
4307 \stopitemize
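As an illustration of the three|-|argument form, the sketch below tries several \type{ftype} values in turn. The helper \type{find_first} is hypothetical, and it assumes that \type{find_file} returns \type{nil} when nothing is found, which is not stated explicitly above. The file name \type{cmr10} is only an example.

```lua
-- Hypothetical helper: return the first hit among several file types.
-- find_file is expected to behave like kpse.find_file(name, ftype, mustexist).
local function find_first(find_file, name, ftypes, mustexist)
  for _, ftype in ipairs(ftypes) do
    local path = find_file(name, ftype, mustexist == true)
    if path then
      return path, ftype
    end
  end
  return nil
end

if kpse then -- only inside LuaTeX/texlua
  kpse.set_program_name("luatex")
  -- the default ftype is "tex", so font metrics need an explicit type
  print(find_first(kpse.find_file, "cmr10", { "tfm", "ofm" }))
end
```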
4309 \subsection{\luatex{lookup}}
4311 A more powerful (but slower) generic method for finding files is also
4312 available (since 0.51). It returns a string for each found file.
4314 \startfunctioncall
4315 <string> f, ... = kpse.lookup(<string> filename, <table> options)
4316 \stopfunctioncall
4318 The options match commandline arguments from \type{kpsewhich}:
4320 \starttabulate[|l|l|p|]
4321 \NC \ssbf key \NC \ssbf type \NC \ssbf description \NC \NR
4322 \NC debug \NC number \NC set debugging flags for this lookup\NC \NR
4323 \NC format \NC string \NC use specific file type (see list above)\NC \NR
4324 \NC dpi \NC number \NC use this resolution for this lookup; default 600\NC \NR
4325 \NC path \NC string \NC search in the given path\NC \NR
4326 \NC all \NC boolean \NC output all matches, not just the first\NC \NR
4327 \NC mustexist \NC boolean \NC (0.65 and higher) search the disk as well as ls-R if necessary\NC \NR
4328 \NC must-exist\NC boolean \NC (0.64 and lower) search the disk as well as ls-R if necessary\NC \NR
4329 \NC mktexpk \NC boolean \NC disable/enable mktexpk generation for this lookup\NC \NR
4330 \NC mktextex \NC boolean \NC disable/enable mktextex generation for this lookup\NC \NR
4331 \NC mktexmf \NC boolean \NC disable/enable mktexmf generation for this lookup\NC \NR
4332 \NC mktextfm \NC boolean \NC disable/enable mktextfm generation for this lookup\NC \NR
4333 \NC subdir \NC string
4334 or table \NC only output matches whose directory part
4335 ends with the given string(s) \NC \NR
4336 \stoptabulate
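Since \type{kpse.lookup} returns one string per match, the results are easiest to handle when collected into a table. A minimal sketch, in which \type{lookup_all} is a hypothetical helper and only the \type{format} and \type{all} keys from the table above are used:

```lua
-- Hypothetical helper: collect every match of a lookup into an array.
-- lookup is expected to behave like kpse.lookup(filename, options).
local function lookup_all(lookup, name, format)
  return { lookup(name, { format = format, all = true }) }
end

if kpse then -- only inside LuaTeX/texlua
  kpse.set_program_name("luatex")
  for i, path in ipairs(lookup_all(kpse.lookup, "plain.tex", "tex")) do
    print(i, path)
  end
end
```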
4338 \subsection{\luatex{init_prog}}
4340 Extra initialization for programs that need to generate bitmap fonts.
4342 \startfunctioncall
4343 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode)
4344 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode, <string> fallback)
4345 \stopfunctioncall
4348 \subsection{\luatex{readable_file}}
4350 Test if an (absolute) file name is a readable file.
4352 \startfunctioncall
4353 <string> f = kpse.readable_file(<string> name)
4354 \stopfunctioncall
4356 The return value is the actual absolute filename you should use,
4357 because the disk name is not always the same as the requested name,
4358 due to aliases and system|-|specific handling under e.\,g.\ \MSDOS.
4360 Returns \lua {nil} if the file does not exist or is not readable.
4362 \subsection{\luatex{expand_path}}
4364 Like kpsewhich's \type {-expand-path}:
4366 \startfunctioncall
4367 <string> r = kpse.expand_path(<string> s)
4368 \stopfunctioncall
4370 \subsection{\luatex{expand_var}}
4372 Like kpsewhich's \type{-expand-var}:
4374 \startfunctioncall
4375 <string> r = kpse.expand_var(<string> s)
4376 \stopfunctioncall
4378 \subsection{\luatex{expand_braces}}
4380 Like kpsewhich's \type{-expand-braces}:
4382 \startfunctioncall
4383 <string> r = kpse.expand_braces(<string> s)
4384 \stopfunctioncall
4386 \subsection{\luatex{show_path}}
4388 Like kpsewhich's \type{-show-path}:
4390 \startfunctioncall
4391 <string> r = kpse.show_path(<string> ftype)
4392 \stopfunctioncall
4395 \subsection{\luatex{var_value}}
4397 Like kpsewhich's \type{-var-value}:
4399 \startfunctioncall
4400 <string> r = kpse.var_value(<string> s)
4401 \stopfunctioncall
4403 \subsection{\luatex{version}}
4405 Returns the kpathsea version string (new in 0.51).
4407 \startfunctioncall
4408 <string> r = kpse.version()
4409 \stopfunctioncall
4412 \section{The \luatex{lang} library}
4414 This library provides the interface to \LUATEX's structure
4415 representing a language, and the associated functions.
4417 \startfunctioncall
4418 <language> l = lang.new()
4419 <language> l = lang.new(<number> id)
4420 \stopfunctioncall
4422 This function creates a new userdata object. An object of type
4423 \type{<language>} is the first argument to most of the other functions
4424 in the \luatex{lang} library. These functions can also be used as if
4425 they were object methods, using the colon syntax.
4427 Without an argument, the next available internal id number will be
4428 assigned to this object. With argument, an object will be created that
4429 links to the internal language with that id number.
4431 \startfunctioncall
4432 <number> n = lang.id(<language> l)
4433 \stopfunctioncall
4435 Returns the internal \tex{language} id number this object refers to.
4437 \startfunctioncall
4438 <string> n = lang.hyphenation(<language> l)
4439 lang.hyphenation(<language> l, <string> n)
4440 \stopfunctioncall
4442 Either returns the current hyphenation exceptions for this language,
4443 or adds new ones. The syntax of the string is explained in~\in{section}[patternsexceptions].
4445 \startfunctioncall
4446 lang.clear_hyphenation(<language> l)
4447 \stopfunctioncall
4449 Clears the exception dictionary for this language.
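The following sketch shows the colon syntax in combination with the exception functions. \type{set_exceptions} is a hypothetical helper, and joining the words with spaces is an assumption about the exception string syntax; the authoritative description is in~\in{section}[patternsexceptions].

```lua
-- Hypothetical helper: replace all hyphenation exceptions of a language
-- object and return what the object reports afterwards.
local function set_exceptions(l, words)
  l:clear_hyphenation()
  -- assumption: exceptions are separated by spaces
  l:hyphenation(table.concat(words, " "))
  return l:hyphenation()
end

if lang then -- the lang table only exists inside LuaTeX
  local l = lang.new() -- takes the next available internal id
  print(l:id(), set_exceptions(l, { "ta-ble", "man-u-script" }))
end
```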
4451 \startfunctioncall
4452 <string> n = lang.clean(<string> o)
4453 \stopfunctioncall
4455 Creates a hyphenation key from the supplied hyphenation value. The
4456 syntax of the argument string is explained in~\in{section}[patternsexceptions].
4457 This function is useful if
4458 you want to do something else based on the words in a dictionary file,
4459 like spell-checking.
4461 \startfunctioncall
4462 <string> n = lang.patterns(<language> l)
4463 lang.patterns(<language> l, <string> n)
4464 \stopfunctioncall
4466 Adds additional patterns for this language object, or returns the
4467 current set. The syntax of this string is explained in~\in{section}[patternsexceptions].
4469 \startfunctioncall
4470 lang.clear_patterns(<language> l)
4471 \stopfunctioncall
4473 Clears the pattern dictionary for this language.
4475 \startfunctioncall
4476 <number> n = lang.prehyphenchar(<language> l)
4477 lang.prehyphenchar(<language> l, <number> n)
4478 \stopfunctioncall
4480 Gets or sets the \quote{pre|-|break} hyphen character for implicit
4481 hyphenation in this language (initially the hyphen, decimal 45).
4483 \startfunctioncall
4484 <number> n = lang.posthyphenchar(<language> l)
4485 lang.posthyphenchar(<language> l, <number> n)
4486 \stopfunctioncall
4488 Gets or sets the \quote{post|-|break} hyphen character for implicit
4489 hyphenation in this language (initially null, decimal~0, indicating
4490 emptiness).
4493 \startfunctioncall
4494 <number> n = lang.preexhyphenchar(<language> l)
4495 lang.preexhyphenchar(<language> l, <number> n)
4496 \stopfunctioncall
4498 Gets or sets the \quote{pre|-|break} hyphen character for explicit
4499 hyphenation in this language (initially null, decimal~0, indicating
4500 emptiness).
4502 \startfunctioncall
4503 <number> n = lang.postexhyphenchar(<language> l)
4504 lang.postexhyphenchar(<language> l, <number> n)
4505 \stopfunctioncall
4507 Gets or sets the \quote{post|-|break} hyphen character for explicit
4508 hyphenation in this language (initially null, decimal~0, indicating
4509 emptiness).
4511 \startfunctioncall
4512 <boolean> success = lang.hyphenate(<node> head)
4513 <boolean> success = lang.hyphenate(<node> head, <node> tail)
4514 \stopfunctioncall
4516 Inserts hyphenation points (discretionary nodes) in a node list. If
4517 \type{tail} is given as argument, processing stops on that node.
4518 Currently, \type{success} is always true if \type{head} (and \type{tail}, if
4519 specified) are proper nodes, regardless of possible other errors.
4521 Hyphenation works only on \quote{characters}, a special subtype of all
4522 the glyph nodes with the node subtype having the value \type{1}. Glyph
4523 nodes with different subtypes are not processed. See
4524 \in{section~}[charsandglyphs] for more details.
4527 \section{The \luatex{lua} library}
4529 This library contains one read|-|only item:
4531 \starttyping
4532 <string> s = lua.version
4533 \stoptyping
4535 This returns the \LUA\ version identifier string. The value is
4536 currently \directlua {tex.print(lua.version)}.
4538 \subsection{\LUA\ bytecode registers}
4540 \LUA\ registers can be used to communicate \LUA\ functions across \LUA\
4541 chunks. The accepted values for assignments are functions and
4542 \type{nil}. Likewise, the retrieved value is either a function or \type{nil}.
4544 \starttyping
4545 lua.bytecode[<number> n] = <function> f
4546 lua.bytecode[<number> n]()
4547 \stoptyping
4549 The contents of the \luatex{lua.bytecode} array are stored inside the format
4550 file as actual \LUA\ bytecode, so it can also be used to preload \LUA\ code.
4552 Note: The function must not contain any upvalues. Currently, functions
4553 containing upvalues can be stored (and their upvalues are set to
4554 \type{nil}), but this is an artifact of the current \LUA\
4555 implementation and thus subject to change.
4557 The associated function calls are
4559 \startfunctioncall
4560 <function> f = lua.getbytecode(<number> n)
4561 lua.setbytecode(<number> n, <function> f)
4562 \stopfunctioncall
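A minimal sketch of a store and retrieve cycle with these two calls. The register number~1 and the function \type{banner} are arbitrary choices for illustration; the function deliberately uses no upvalues, as required by the note above.

```lua
-- A function without upvalues: it only returns a constant.
local function banner()
  return "stored in bytecode register 1"
end

if lua and lua.setbytecode then -- only inside LuaTeX
  lua.setbytecode(1, banner)
  local f = lua.getbytecode(1) -- a function again, or nil if unset
  if f then
    print(f())
  end
end
```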
4564 Note: Since a \LUA\ file loaded using \luatex{loadfile(filename)} is
4565 essentially an anonymous function, a complete file can be stored in a
4566 bytecode register like this:
4568 \startfunctioncall
4569 lua.bytecode[n] = loadfile(filename)
4570 \stopfunctioncall
4572 Now all definitions (functions, variables) contained in the file can be
4573 created by executing this bytecode register:
4575 \startfunctioncall
4576 lua.bytecode[n]()
4577 \stopfunctioncall
4579 Note that the path of the file is stored in the \LUA\ bytecode to be
4580 used in stack backtraces and therefore dumped into the format file if
4581 the above code is used in \INITEX. If it contains private information, e.g.
4582 the user name, this information is then contained in the format file as
4583 well. This should be kept in mind when preloading files into a bytecode
4584 register in \INITEX.
4586 \subsection{\LUA\ chunk name registers}
4588 There is an array of 65536 (0--65535) potential chunk names for use with
4589 the \type{\directlua} and \type{\latelua} primitives.
4591 \startfunctioncall
4592 lua.name[<number> n] = <string> s
4593 <string> s = lua.name[<number> n]
4594 \stopfunctioncall
4596 If you want to unset a \LUA\ chunk name, you can assign \type{nil} to it.
4599 \section{The \luatex{mplib} library}
4601 The \MP\ library interface registers itself in the table \type{mplib}. It
4602 is based on \MPLIB\ version \ctxlua{tex.sprint(mplib.version())}.
4604 \subsection{\luatex{mplib.new}}
4606 To create a new \METAPOST\ instance, call
4608 \startfunctioncall
4609 <mpinstance> mp = mplib.new({...})
4610 \stopfunctioncall
4612 This creates the \type{mp} instance object. The argument hash can have a number of
4613 different fields, as follows:
4615 \starttabulate[|lT|l|p|p|]
4616 \NC \ssbf name \NC \bf type \NC \bf description \NC \bf default \NC\NR
4617 \NC error_line \NC number \NC error line width \NC 79 \NC\NR
4618 \NC print_line \NC number \NC line length in ps output \NC 100\NC\NR
4619 \NC random_seed \NC number \NC the initial random seed \NC variable\NC\NR
4620 \NC interaction \NC string \NC the interaction mode, one of
4621 \type {batch}, \type {nonstop}, \type {scroll}, \type {errorstop} \NC \type {errorstop}\NC\NR
4622 \NC job_name \NC string \NC \type {--jobname} \NC \type {mpout} \NC\NR
4623 \NC find_file \NC function \NC a function to find files \NC only local files\NC\NR
4624 \stoptabulate
4626 The \type{find_file} function should be of this form:
4628 \starttyping
4629 <string> found = finder (<string> name, <string> mode, <string> type)
4630 \stoptyping
4632 with:
4634 \starttabulate[|lT|l|p|]
4635 \NC name \NC the requested file \NC \NR
4636 \NC mode \NC the file mode: \type {r} or \type {w} \NC \NR
4637 \NC type \NC the kind of file, one of: \type {mp}, \type {tfm}, \type {map}, \type {pfb}, \type {enc} \NC \NR
4638 \stoptabulate
4640 Return either the full pathname of the found file, or \type{nil} if
4641 the file cannot be found.
4643 Note that the new version of \MPLIB\ no longer uses binary mem files,
4644 so the way to preload a set of macros is simply to start off with
4645 an \type{input} command in the first \type{mp:execute()} call.
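A finder is often built on top of the \type{kpse} library. The sketch below is one possible approach, not the only one: \type{make_mp_finder} is a hypothetical helper, the mapping of the \type{pfb} and \type{enc} kinds to the \KPATHSEA\ type names \type{type1 fonts} and \type{enc files} is an assumption based on the type list in the \type{kpse} section, and returning the bare name for files opened for writing is likewise an assumption.

```lua
-- Assumed mapping from MPlib file kinds to kpse type names; kinds that are
-- not listed (mp, tfm, map) are passed through unchanged.
local kpse_ftype = { pfb = "type1 fonts", enc = "enc files" }

-- Hypothetical helper: build a finder(name, mode, type) for mplib.new.
local function make_mp_finder(find_file)
  return function(name, mode, ftype)
    if mode == "w" then
      return name -- assumption: output files are written where requested
    end
    return find_file(name, kpse_ftype[ftype] or ftype)
  end
end

if mplib and kpse then -- only inside LuaTeX/texlua
  kpse.set_program_name("luatex")
  local mp = mplib.new({ find_file = make_mp_finder(kpse.find_file) })
  -- no mem files: preload the macros with an explicit input command
  local result = mp:execute("input plain;")
  print(result.status)
  mp:finish()
end
```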
4648 \subsection{\luatex{mp:statistics}}
4650 You can request statistics with:
4652 \startfunctioncall
4653 <table> stats = mp:statistics()
4654 \stopfunctioncall
4656 This function returns the vital statistics for an \MPLIB\ instance. There are four
4657 fields, giving the maximum number of used items in each of four
4658 allocated object classes:
4660 \starttabulate[|lT|l|p|]
4661 \NC main_memory \NC number \NC memory size \NC\NR
4662 \NC hash_size \NC number \NC hash size\NC\NR
4663 \NC param_size \NC number \NC simultaneous macro parameters\NC\NR
4664 \NC max_in_open \NC number \NC input file nesting levels\NC\NR
4665 \stoptabulate
4667 Note that in the new version of \MPLIB, this is informational only. The
4668 objects are all allocated dynamically, so there is no chance of running
4669 out of space unless the available system memory is exhausted.
4671 \subsection{\luatex{mp:execute}}
4673 You can ask the \METAPOST\ interpreter to run a chunk of code by calling
4675 \startfunctioncall
4676 <table> rettable = mp:execute('metapost language chunk')
4677 \stopfunctioncall
4679 for various bits of \METAPOST\ language input. Be sure to check the
4680 \type{rettable.status} (see below) because when a fatal \METAPOST\
4681 error occurs the \MPLIB\ instance will become unusable thereafter.
4683 Generally speaking, it is best to keep your chunks small, but beware
4684 that all chunks have to obey proper syntax, as if each of them were a
4685 small file. For instance, you cannot split a single statement over
4686 multiple chunks.
4688 In contrast with the normal standalone \type{mpost} command, there is
4689 {\em no\/} implied \quote{input} at the start of the first chunk.
4691 \subsection{\luatex{mp:finish}}
4693 \startfunctioncall
4694 <table> rettable = mp:finish()
4695 \stopfunctioncall
4697 If for some reason you want to stop using an \MPLIB\ instance while
4698 processing is not yet actually done, you can call \type{mp:finish}.
4699 Eventually, used memory will be freed and open files will be closed by
4700 the \LUA\ garbage collector, but an explicit \type{mp:finish} is the
4701 only way to capture the final part of the output streams.
4703 \subsection{Result table}
4705 The return value of \type{mp:execute} and \type{mp:finish} is a table
4706 with a few possible keys (only \type {status} is always guaranteed to be present).
4708 \starttabulate[|l|l|p|]
4709 \NC log \NC string \NC output to the \quote {log} stream \NC \NR
4710 \NC term \NC string \NC output to the \quote {term} stream \NC \NR
4711 \NC error \NC string \NC output to the \quote {error} stream (only used for \quote {out of memory})\NC \NR
4712 \NC status \NC number \NC the return value: 0=good, 1=warning, 2=errors, 3=fatal error \NC \NR
4713 \NC fig \NC table \NC an array of generated figures (if any)\NC \NR
4714 \stoptabulate
4716 When \type{status} equals~3, you should stop using this \MPLIB\ instance
4717 immediately; it is no longer capable of processing input.
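A small sketch of how a result table might be inspected. The helper \type{check} is hypothetical; it uses only the \type{status} and \type{fig} keys described above and treats every other key as optional.

```lua
-- Status values as listed in the table above.
local status_text = {
  [0] = "good", [1] = "warning", [2] = "errors", [3] = "fatal error",
}

-- Hypothetical helper: returns whether the instance may still be used,
-- a readable status, and the number of figures in this result.
local function check(result)
  local usable = result.status < 3
  local text = status_text[result.status] or "unknown"
  local figures = result.fig and #result.fig or 0
  return usable, text, figures
end

local usable, text, figures = check({ status = 1, fig = { "a", "b" } })
print(usable, text, figures)
```

When the first return value is \type{false}, the instance should be dropped and a new one created with \type{mplib.new}.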
4719 If it is present, each of the entries in the \type{fig} array is a
4720 userdata representing a figure object, and each of those has a number of
4721 object methods you can call:
4723 \starttabulate[|l|l|p|]
4724 \NC boundingbox \NC function \NC returns the bounding box, as an array of 4 values\NC \NR
4725 \NC postscript \NC function \NC returns a string that is the ps output of the \type{fig}.
4726 This function accepts two optional integer arguments for
4727 specifying the values of \type{prologues} (first argument)
4728 and \type{procset} (second argument)\NC \NR
4729 \NC svg \NC function \NC returns a string that is the svg output of the \type{fig}.
4730 This function accepts an optional integer argument for
4731 specifying the value of \type{prologues}\NC \NR
4732 \NC objects \NC function \NC returns the actual array of graphic objects in this \type{fig} \NC \NR
4733 \NC copy_objects \NC function \NC returns a deep copy of the array of graphic objects in this \type{fig} \NC \NR
4734 \NC filename \NC function \NC the filename this \type{fig}'s \POSTSCRIPT\ output
4735 would have written to in standalone mode\NC \NR
4736 \NC width \NC function \NC the \type{charwd} value \NC \NR
4737 \NC height \NC function \NC the \type{charht} value \NC \NR
4738 \NC depth \NC function \NC the \type{chardp} value \NC \NR
4739 \NC italcorr \NC function \NC the \type{charit} value \NC \NR
4740 \NC charcode \NC function \NC the (rounded) \type{charcode} value \NC \NR
4741 \stoptabulate
4743 {\bf NOTE:} you can call \type{fig:objects()} only once for any one \type{fig} object!
4745 When the bounding box represents a \quote {negated rectangle}, i.e.\ when the first set
4746 of coordinates is larger than the second set, the picture is empty.
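The emptiness test can be sketched as follows. The helper \type{is_empty} is hypothetical, and the order of the four values (lower left $x$, lower left $y$, upper right $x$, upper right $y$) is an assumption, since the table above only says that the array has four values.

```lua
-- Hypothetical helper: true when the bounding box is a "negated rectangle".
-- Assumed layout: { llx, lly, urx, ury }.
local function is_empty(bbox)
  return bbox[1] > bbox[3] or bbox[2] > bbox[4]
end

print(is_empty({ 10, 10, -10, -10 }))
print(is_empty({ 0, 0, 5, 5 }))
```

In real code the argument would come from \type{fig:boundingbox()}.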
4748 Graphical objects come in various types, each of which has a different list of
4749 accessible values. The types are: \type{fill}, \type{outline}, \type{text},
4750 \type{start_clip}, \type{stop_clip}, \type{start_bounds}, \type{stop_bounds}, \type{special}.
4752 There is a helper function (\type{mplib.fields(obj)}) to get the list of
4753 accessible values for a particular object, but you can just as easily
4754 use the tables given below.
4756 All graphical objects have a field \type{type} that gives the object
4757 type as a string value; it is not explicitly mentioned in the following tables.
4758 In the following, \type{number}s are \POSTSCRIPT\ points represented as
4759 a floating point number, unless stated otherwise. Field values that
4760 are of type \type{table} are explained in the next section.
4762 \subsubsection{fill}
4764 \starttabulate[|l|l|p|]
4765 \NC path \NC table \NC the list of knots \NC \NR
4766 \NC htap \NC table \NC the list of knots for the reversed trajectory \NC \NR
4767 \NC pen \NC table \NC knots of the pen \NC \NR
4768 \NC color \NC table \NC the object's color \NC \NR
4769 \NC linejoin \NC number \NC line join style (bare number)\NC \NR
4770 \NC miterlimit \NC number \NC miterlimit\NC \NR
4771 \NC prescript \NC string \NC the prescript text \NC \NR
4772 \NC postscript \NC string \NC the postscript text \NC \NR
4773 \stoptabulate
4775 The entries \type{htap} and \type{pen} are optional.
4777 There is a helper function (\type{mplib.pen_info(obj)}) that returns
4778 a table containing a bunch of vital characteristics of the used pen
4779 (all values are floats):
4781 \starttabulate[|l|l|p|]
4782 \NC width \NC number \NC width of the pen\NC \NR
4783 \NC sx \NC number \NC $x$ scale \NC \NR
4784 \NC rx \NC number \NC $xy$ multiplier \NC \NR
4785 \NC ry \NC number \NC $yx$ multiplier \NC \NR
4786 \NC sy \NC number \NC $y$ scale \NC \NR
4787 \NC tx \NC number \NC $x$ offset \NC \NR
4788 \NC ty \NC number \NC $y$ offset \NC \NR
4789 \stoptabulate
4791 \subsubsection{outline}
4793 \starttabulate[|l|l|p|]
4794 \NC path \NC table \NC the list of knots \NC \NR
4795 \NC pen \NC table \NC knots of the pen \NC \NR
4796 \NC color \NC table \NC the object's color \NC \NR
4797 \NC linejoin \NC number \NC line join style (bare number)\NC \NR
4798 \NC miterlimit \NC number \NC miterlimit \NC \NR
4799 \NC linecap \NC number \NC line cap style (bare number)\NC \NR
4800 \NC dash \NC table \NC representation of a dash list\NC \NR
4801 \NC prescript \NC string \NC the prescript text \NC \NR
4802 \NC postscript \NC string \NC the postscript text \NC \NR
4803 \stoptabulate
4805 The entry \type{dash} is optional.
4807 \subsubsection{text}
4809 \starttabulate[|l|l|p|]
4810 \NC text \NC string \NC the text \NC \NR
4811 \NC font \NC string \NC font tfm name \NC \NR
4812 \NC dsize \NC number \NC font size\NC \NR
4813 \NC color \NC table \NC the object's color \NC \NR
4814 \NC width \NC number \NC \NC \NR
4815 \NC height \NC number \NC \NC \NR
4816 \NC depth \NC number \NC \NC \NR
4817 \NC transform \NC table \NC a text transformation \NC \NR
4818 \NC prescript \NC string \NC the prescript text \NC \NR
4819 \NC postscript \NC string \NC the postscript text \NC \NR
4820 \stoptabulate
4822 \subsubsection{special}
4824 \starttabulate[|l|l|p|]
4825 \NC prescript \NC string \NC special text \NC \NR
4826 \stoptabulate
4828 \subsubsection{start_bounds, start_clip}
4830 \starttabulate[|l|l|p|]
4831 \NC path \NC table \NC the list of knots \NC \NR
4832 \stoptabulate
4834 \subsubsection{stop_bounds, stop_clip}
4836 There are no fields available here.
4838 \subsection{Subsidiary table formats}
4840 \subsubsection{Paths and pens}
4842 Paths and pens (that are really just a special type of paths as far as
4843 \MPLIB\ is concerned) are represented by an array where each entry
4844 is a table that represents a knot.
4846 \starttabulate[|lT|l|p|]
4847 \NC left_type \NC string \NC when present: 'endpoint', but usually absent \NC \NR
4848 \NC right_type \NC string \NC like \type{left_type}\NC \NR
4849 \NC x_coord \NC number \NC X coordinate of this knot\NC \NR
4850 \NC y_coord \NC number \NC Y coordinate of this knot\NC \NR
4851 \NC left_x \NC number \NC X coordinate of the precontrol point of this knot\NC \NR
4852 \NC left_y \NC number \NC Y coordinate of the precontrol point of this knot\NC \NR
4853 \NC right_x \NC number \NC X coordinate of the postcontrol point of this knot\NC \NR
4854 \NC right_y \NC number \NC Y coordinate of the postcontrol point of this knot\NC \NR
4855 \stoptabulate
4857 There is one special case: pens that are (possibly transformed)
4858 ellipses have an extra string-valued key \type{type} with value
4859 \type{elliptical} besides the array part containing the knot list.
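To show how the knot fields fit together, the sketch below turns a knot array into \POSTSCRIPT\ path operators. It is an illustration only: \type{path_to_ps} is a hypothetical helper, and treating a path whose first knot carries no \type{endpoint} marker as cyclic is an assumption.

```lua
-- Hypothetical helper: convert an array of knots into PostScript operators.
local function path_to_ps(path)
  local first = path[1]
  if not first then
    return ""
  end
  local out = {}
  -- one Bezier segment: postcontrol of 'from', precontrol of 'to', then 'to'
  local function curve(from, to)
    out[#out + 1] = string.format("%g %g %g %g %g %g curveto",
      from.right_x, from.right_y, to.left_x, to.left_y, to.x_coord, to.y_coord)
  end
  out[#out + 1] = string.format("%g %g moveto", first.x_coord, first.y_coord)
  for i = 2, #path do
    curve(path[i - 1], path[i])
  end
  if first.left_type ~= "endpoint" then
    -- assumption: no endpoint marker means the path is cyclic
    curve(path[#path], first)
    out[#out + 1] = "closepath"
  end
  return table.concat(out, "\n")
end

print(path_to_ps({
  { left_type = "endpoint", x_coord = 0, y_coord = 0,
    left_x = 0, left_y = 0, right_x = 1, right_y = 0 },
  { right_type = "endpoint", x_coord = 3, y_coord = 0,
    left_x = 2, left_y = 0, right_x = 3, right_y = 0 },
}))
```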
4861 \subsubsection{Colors}
4863 A color is an integer-indexed array with 0, 1, 3 or 4 values:
4865 \starttabulate[|l|l|p|]
4866 \NC 0 \NC marking only \NC no values \NC\NR
4867 \NC 1 \NC greyscale \NC one value in the range $(0,1)$, \quote {black} is $0$ \NC\NR
4868 \NC 3 \NC \RGB \NC three values in the range $(0,1)$, \quote {black} is $0,0,0$ \NC\NR
4869 \NC 4 \NC \CMYK \NC four values in the range $(0,1)$, \quote {black} is $0,0,0,1$ \NC\NR
4870 \stoptabulate
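Because the color model follows from the number of values alone, it can be recovered with the length operator. A minimal sketch; \type{color_model} and the returned names are hypothetical and simply mirror the table above.

```lua
-- Model names keyed by the number of values in the color array.
local models = {
  [0] = "marking only", [1] = "greyscale", [3] = "rgb", [4] = "cmyk",
}

-- Hypothetical helper: name the color model of a color array.
local function color_model(color)
  return models[#color]
end

print(color_model({ 0, 0, 0, 1 }))
```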
4872 If the color model of the internal object was \type{uninitialized}, then
4873 it was initialized to the values representing \quote {black} in the colorspace
4874 \type{defaultcolormodel} that was in effect at the time of the \type{shipout}.
4876 \subsubsection{Transforms}
4878 Each transform is a six-item array.
4880 \starttabulate[|l|l|p|]
4881 \NC 1 \NC number \NC represents x \NC\NR
4882 \NC 2 \NC number \NC represents y \NC\NR
4883 \NC 3 \NC number \NC represents xx \NC\NR
4884 \NC 4 \NC number \NC represents yx \NC\NR
4885 \NC 5 \NC number \NC represents xy \NC\NR
4886 \NC 6 \NC number \NC represents yy \NC\NR
4887 \stoptabulate
4889 Note that the translation (index 1 and 2) comes first. This differs
4890 from the ordering in \POSTSCRIPT, where the translation comes last.
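The reordering can be written down directly. The two helpers below are hypothetical; they assume the usual \METAPOST\ meaning of the components, that is $x' = x + xx\cdot p_x + xy\cdot p_y$ and $y' = y + yx\cdot p_x + yy\cdot p_y$ for a point $(p_x,p_y)$.

```lua
-- Hypothetical helper: reorder { x, y, xx, yx, xy, yy } into the
-- PostScript matrix order { xx, yx, xy, yy, x, y }.
local function to_postscript_matrix(t)
  return { t[3], t[4], t[5], t[6], t[1], t[2] }
end

-- Hypothetical helper: apply the transform to the point (px, py).
local function apply(t, px, py)
  return t[1] + t[3] * px + t[5] * py,
         t[2] + t[4] * px + t[6] * py
end

local t = { 10, 20, 2, 0, 0, 3 } -- shift by (10,20), scale x by 2 and y by 3
print(table.concat(to_postscript_matrix(t), " "))
print(apply(t, 1, 1))
```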
4892 \subsubsection{Dashes}
4894 Each \type{dash} is a two-item hash, using the same model as \POSTSCRIPT\
4895 for the representation of the dashlist. \type{dashes} is an array of
4896 \quote {on} and \quote {off} values, and \type{offset} is the phase of the pattern.
4898 \starttabulate[|l|l|p|]
4899 \NC dashes \NC hash \NC an array of on-off numbers \NC\NR
4900 \NC offset \NC number \NC the starting offset value \NC\NR
4901 \stoptabulate
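Since the model is the same as in \POSTSCRIPT, a dash table maps directly onto a \type{setdash} operation. A minimal sketch with a hypothetical helper \type{dash_to_ps}; falling back to~0 when \type{offset} is absent is an assumption.

```lua
-- Hypothetical helper: format a dash table as a PostScript setdash call.
local function dash_to_ps(dash)
  local parts = {}
  for i, value in ipairs(dash.dashes) do
    parts[i] = string.format("%g", value)
  end
  return string.format("[%s] %g setdash",
    table.concat(parts, " "), dash.offset or 0)
end

print(dash_to_ps({ dashes = { 3, 3 }, offset = 0 }))
```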
4903 \subsection{Character size information}
4905 These functions find the size of a glyph in a defined font. The
4906 \type{fontname} is the same name as the argument to \type{infont};
4907 the \type{char} is a glyph id in the range 0 to 255; the returned
4908 \type{w} is in AFM units.
4910 \subsubsection{\luatex{mp:char_width}}
4912 \startfunctioncall
4913 <number> w = mp:char_width(<string> fontname, <number> char)
4914 \stopfunctioncall
4916 \subsubsection{\luatex{mp:char_height}}
4918 \startfunctioncall
4919 <number> w = mp:char_height(<string> fontname, <number> char)
4920 \stopfunctioncall
4922 \subsubsection{\luatex{mp:char_depth}}
4924 \startfunctioncall
4925 <number> w = mp:char_depth(<string> fontname, <number> char)
4926 \stopfunctioncall
4928 \section{The \luatex{node} library}
4930 The \luatex{node} library contains functions that facilitate dealing
4931 with (lists of) nodes and their values. They allow you to create, alter,
4932 copy, delete, and insert \LUATEX\ node objects, the core
4933 objects within the typesetter.
4935 \LUATEX\ nodes are represented in \LUA\ as userdata with
4936 the metadata type \luatex{luatex.node}. The various parts within
4937 a node can be accessed using named fields.
4939 Each node has at least the three fields \type{next}, \type{id}, and
4940 \type{subtype}:
4942 \startitemize[intro]
4944 \item The \type{next} field returns the userdata
4945 object for the next node in a linked list of nodes, or
4946 \type{nil}, if there is no next node.
4948 \item The \type{id} indicates \TEX's \quote{node type}. The field \type{id}
4949 has a numeric value for efficiency reasons, but some of the library
4950 functions also accept a string value instead of \type{id}.
4952 \item The \type{subtype} is another number. It often gives further information
4953 about a node of a particular \type{id}, but it is most important when dealing
4954 with \quote{whatsits}, because they are differentiated solely based on their
4955 \type{subtype}.
4956 \stopitemize
4958 The other available fields depend on the \type{id} (and for \quote{whatsits}, the
4959 \type{subtype}) of the node. Further details on the various fields and their
4960 meanings are given in~\in{chapter}[nodes].
4962 Support for \type{unset} (alignment) nodes is partial:
4963 they can be queried and modified from \LUA\ code, but not created.
4965 Nodes can be compared to each other, but you are actually comparing
4966 indices into the node memory. This means that equality tests can only
4967 be trusted under very limited conditions. It will not work correctly
4968 in any situation where one of the two nodes has been freed and|/|or
4969 reallocated: in that case, there will be false positives.
4971 At the moment, memory management of nodes should still be done
4972 explicitly by the user. Nodes are not \quote{seen} by the \LUA\
4973 garbage collector, so you have to call the node freeing functions
4974 yourself when you are no longer in need of a node (list). Nodes form
4975 linked lists without reference counting, so you have to be careful
4976 that when control returns back to \LUATEX\ itself, you have not
4977 deleted nodes that are still referenced from a \type{next} pointer
4978 elsewhere, and that you did not create nodes that are referenced more
4979 than once.
4981 There are statistics available with regard to the allocated node memory,
4982 which can be handy for tracing.
4984 \subsection{Node handling functions}
4986 \subsubsection{\luatex{node.is_node}}
4988 \startfunctioncall
4989 <boolean> t = node.is_node(<any> item)
4990 \stopfunctioncall
4992 This function returns true if the argument is a userdata object of
4993 type \type{<node>}.
4995 \subsubsection{\luatex{node.types}}
4997 \startfunctioncall
4998 <table> t = node.types()
4999 \stopfunctioncall
5001 This function returns an array that maps node id numbers to node type
5002 strings, providing an overview of the possible top|-|level \type{id}
5003 types.
5005 \subsubsection{\luatex{node.whatsits}}
5007 \startfunctioncall
5008 <table> t = node.whatsits()
5009 \stopfunctioncall
5011 \TEX's \quote{whatsits} all have the same \type{id}. The various subtypes
5012 are defined by their \type{subtype} fields. The function is much like
5013 \luatex{node.types}, except that it provides an array of \type{subtype}
5014 mappings.
5016 \subsubsection{\luatex{node.id}}
5018 \startfunctioncall
5019 <number> id = node.id(<string> type)
5020 \stopfunctioncall
5022 This converts a single type name to its internal numeric
5023 representation.
5025 \subsubsection{\luatex{node.subtype}}
5027 \startfunctioncall
5028 <number> subtype = node.subtype(<string> type)
5029 \stopfunctioncall
5031 This converts a single whatsit name to its internal numeric
5032 representation (\type{subtype}).
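
Because these lookups are by name, code that inspects many nodes often
resolves the numbers once and compares against them afterwards. The
following is only a sketch of that pattern: the helper name
\type{is_pdf_literal} is made up for this example, and it assumes that
\type{pdf_literal} is among the names reported by \type{node.whatsits()}.

\starttyping
-- resolve the numeric codes once, outside of any loop
local WHATSIT     = node.id("whatsit")
local PDF_LITERAL = node.subtype("pdf_literal")

-- hypothetical helper: true if n is a pdf_literal whatsit
local function is_pdf_literal (n)
  return n.id == WHATSIT and n.subtype == PDF_LITERAL
end
\stoptyping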
5034 \subsubsection{\luatex{node.type}}
5036 \startfunctioncall
5037 <string> type = node.type(<any> n)
5038 \stopfunctioncall
If the argument is a number, then this function converts an internal
numeric representation to an external string representation.
Otherwise, it will return the string \type{node} if the object
represents a node (this is new in 0.65), and \type{nil} otherwise.
5045 \subsubsection{\luatex{node.fields}}
5047 \startfunctioncall
5048 <table> t = node.fields(<number> id)
5049 <table> t = node.fields(<number> id, <number> subtype)
5050 \stopfunctioncall
5052 This function returns an array of valid field names for a particular
5053 type of node. If you want to get the valid fields for a
5054 \quote{whatsit}, you have to supply the second argument also. In other
5055 cases, any given second argument will be silently ignored.
5057 This function accepts string \type{id} and \type{subtype} values as
5058 well.
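
As an illustration, the field names can simply be printed. This sketch
iterates with \type{pairs} so that it makes no assumption about the index
at which the returned array starts.

\starttyping
-- field names of an ordinary node type
for index, name in pairs(node.fields("glyph")) do
  print(index, name)
end

-- for a whatsit, the second argument selects the subtype
for index, name in pairs(node.fields("whatsit", "pdf_literal")) do
  print(index, name)
end
\stoptyping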
5060 \subsubsection{\luatex{node.has_field}}
5062 \startfunctioncall
5063 <boolean> t = node.has_field(<node> n, <string> field)
5064 \stopfunctioncall
5066 This function returns a boolean that is only true if \type{n} is
5067 actually a node, and it has the field.
5069 \subsubsection{\luatex{node.new}}
5071 \startfunctioncall
5072 <node> n = node.new(<number> id)
5073 <node> n = node.new(<number> id, <number> subtype)
5074 \stopfunctioncall
5076 Creates a new node. All of the new node's fields are initialized to
5077 either zero or \type{nil} except for \type{id} and \type{subtype} (if
5078 supplied). If you want to create a new whatsit, then the second
5079 argument is required, otherwise it need not be present. As with all
5080 node functions, this function creates a node on the \TEX\ level.
5082 This function accepts string \type{id} and \type{subtype} values as
5083 well.
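
A short sketch of both calling forms, using string arguments. The assigned
values are arbitrary, and the field names \type{kern} and \type{data} are
assumed to be the ones that \type{node.fields} reports for these node types.

\starttyping
-- an ordinary node: the subtype may be omitted
local k = node.new("kern")
k.kern = 2 * 65536                 -- 2pt, in scaled points

-- a whatsit: the subtype is required
local lit = node.new("whatsit", "pdf_literal")
lit.data = "q Q"

-- neither node was linked into a list, so release them explicitly
node.free(k)
node.free(lit)
\stoptyping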
5085 \subsubsection{\luatex{node.free}}
5087 \startfunctioncall
5088 node.free(<node> n)
5089 \stopfunctioncall
5091 Removes the node \type{n} from \TEX's memory. Be careful: no checks
5092 are done on whether this node is still pointed to from a register or some
5093 \type{next} field: it is up to you to make sure that the internal data
5094 structures remain correct.
5096 \subsubsection{\luatex{node.flush_list}}
5098 \startfunctioncall
5099 node.flush_list(<node> n)
5100 \stopfunctioncall
5102 Removes the node list \type{n} and the complete node list following
5103 \type{n} from \TEX's memory. Be careful: no checks are done on whether
5104 any of these nodes is still pointed to from a register or some
5105 \type{next} field: it is up to you to make sure that the internal data
5106 structures remain correct.
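
The difference between the two freeing functions can be sketched as follows.
The nodes are created only for the sake of the example and are not part of
any list that \TEX\ itself refers to.

\starttyping
local a = node.new("kern")
local b = node.new("kern")
a.next = b            -- a two-node list: a -> b

node.flush_list(a)    -- frees a and everything after it (here: b)
a, b = nil, nil       -- the Lua variables must not be used any more

local c = node.new("kern")
node.free(c)          -- frees this single node only
c = nil
\stoptyping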
5108 \subsubsection{\luatex{node.copy}}
5110 \startfunctioncall
5111 <node> m = node.copy(<node> n)
5112 \stopfunctioncall
5114 Creates a deep copy of node \type{n}, including all nested lists as in
5115 the case of a hlist or vlist node. Only the \type{next} field is not
5116 copied.
5118 \subsubsection{\luatex{node.copy_list}}
5120 \startfunctioncall
5121 <node> m = node.copy_list(<node> n)
5122 <node> m = node.copy_list(<node> n, <node> m)
5123 \stopfunctioncall
5125 Creates a deep copy of the node list that starts at \type{n}. If
5126 \type{m} is also given, the copy stops just before node \type{m}.
Note that you cannot copy attribute lists this way. Specialized functions for
dealing with attribute lists will be provided later but are not there yet.
However, there is normally no need to copy attribute lists: when you do
assignments to the \type{attr} field or make changes to specific attributes, the
needed copying and freeing takes place automatically.
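
For example, a copy of a box's content can be inspected without touching the
original. This is a sketch that assumes box register 0 has been set
beforehand; since the copy is an independent list, it has to be freed
explicitly.

\starttyping
local box = tex.box[0]
if box and box.head then
  local copy = node.copy_list(box.head)  -- deep copy of the content
  print(node.length(copy))               -- work on the copy only
  node.flush_list(copy)                  -- the copy is ours to free
end
\stoptyping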
5134 \subsubsection{\luatex{node.next} (0.65)}
5136 \startfunctioncall
5137 <node> m = node.next(<node> n)
5138 \stopfunctioncall
5140 Returns the node following this node, or \type{nil} if there is no
5141 such node.
5143 \subsubsection{\luatex{node.prev} (0.65)}
5145 \startfunctioncall
5146 <node> m = node.prev(<node> n)
5147 \stopfunctioncall
5149 Returns the node preceding this node, or \type{nil} if there is no
5150 such node.
5153 \subsubsection{\luatex{node.current_attr} (0.66)}
5155 \startfunctioncall
5156 <node> m = node.current_attr()
5157 \stopfunctioncall
5159 Returns the currently active list of attributes, if there is one.
5161 Note: this function is somewhat experimental, and it returns the {\it
5162 actual} attribute list, not a copy thereof.
5163 Therefore, changing any of the attributes in the list will change
5164 these values for all nodes that have the current attribute list
5165 assigned to them.
5168 \subsubsection{\luatex{node.hpack}}
5170 \startfunctioncall
5171 <node> h, <number> b = node.hpack(<node> n)
5172 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info)
5173 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info, <string> dir)
5174 \stopfunctioncall
5176 This function creates a new hlist by packaging the list that begins at node
5177 \type{n} into a horizontal box. With only a single argument, this box
5178 is created using the natural width of its components. In the three
5179 argument form, \type{info} must be either \type{additional} or
5180 \type{exactly}, and \type{w} is the additional (\tex{hbox spread})
5181 or exact (\tex{hbox to}) width to be used.
5183 Direction support added in \LUATEX\ 0.45.
The second return value is the badness of the generated box;
this extension was added in 0.51.
5188 Caveat: at this moment, there can be unexpected side|-|effects to this
5189 function, like updating some of the \tex{marks} and \tex{inserts}.
5190 Also note that the content of \type{h} is the original node list
5191 \type{n}: if you call \type{node.free(h)} you will also free the
5192 node list itself, unless you explicitly set the \type{list} field
5193 to \type{nil} beforehand. And in a similar way, calling
5194 \type{node.free(n)} will invalidate \type{h} as well!
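
A sketch of the three|-|argument form, wrapped in a helper whose name and
badness threshold are chosen freely for this example. The argument
\type{head} is assumed to be the start of a horizontal node list owned by
the caller.

\starttyping
-- pack a list to an exact width (in scaled points) and report bad boxes
local function box_of_width (head, width)
  local box, badness = node.hpack(head, width, "exactly")
  if badness > 1000 then
    texio.write_nl("log", "box with badness " .. badness)
  end
  -- box.list is the original list: freeing box frees head as well
  return box
end
\stoptyping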
5196 \subsubsection{\luatex{node.vpack} (since 0.36)}
5198 \startfunctioncall
5199 <node> h, <number> b = node.vpack(<node> n)
5200 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info)
5201 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info, <string> dir)
5202 \stopfunctioncall
5204 This function creates a new vlist by packaging the list that begins at node
5205 \type{n} into a vertical box. With only a single argument, this box
5206 is created using the natural height of its components. In the three
5207 argument form, \type{info} must be either \type{additional} or
5208 \type{exactly}, and \type{w} is the additional (\tex{vbox spread}) or exact (\tex{vbox to}) height to be used.
5210 Direction support added in \LUATEX\ 0.45.
The second return value is the badness of the generated box;
this extension was added in 0.51.
5215 See the description of \type{node.hpack()} for a few memory allocation
5216 caveats.
5218 \subsubsection{\luatex{node.dimensions} (0.43)}
5220 \startfunctioncall
5221 <number> w, <number> h, <number> d = node.dimensions(<node> n)
5222 <number> w, <number> h, <number> d = node.dimensions(<node> n, <string> dir)
5223 <number> w, <number> h, <number> d = node.dimensions(<node> n, <node> t)
5224 <number> w, <number> h, <number> d = node.dimensions(<node> n, <node> t, <string> dir)
5225 \stopfunctioncall
5227 This function calculates the natural in-line dimensions of the node
5228 list starting at node \type{n} and terminating just before node \type{t}
5229 (or the end of the list, if there is no second argument). The return values are scaled
5230 points. An alternative format that starts with glue parameters as the
5231 first three arguments is also possible:
5233 \startfunctioncall
5234 <number> w, <number> h, <number> d =
5235 node.dimensions(<number> glue_set, <number> glue_sign,
5236 <number> glue_order, <node> n)
5237 <number> w, <number> h, <number> d =
5238 node.dimensions(<number> glue_set, <number> glue_sign,
5239 <number> glue_order, <node> n, <string> dir)
5240 <number> w, <number> h, <number> d =
5241 node.dimensions(<number> glue_set, <number> glue_sign,
5242 <number> glue_order, <node> n, <node> t)
5243 <number> w, <number> h, <number> d =
5244 node.dimensions(<number> glue_set, <number> glue_sign,
5245 <number> glue_order, <node> n, <node> t, <string> dir)
5246 \stopfunctioncall
5248 This calling method takes glue settings into account and is especially
5249 useful for finding the actual width of a sublist of nodes that are
5250 already boxed, for example in code like this, which prints the
width of the space in between the \type{a} and \type{b} as it would
5252 be if \type{\box0} was used as-is:
5254 \starttyping
5255 \setbox0 = \hbox to 20pt {a b}
5257 \directlua{print (node.dimensions(tex.box[0].glue_set,
5258 tex.box[0].glue_sign,
5259 tex.box[0].glue_order,
5260 tex.box[0].head.next,
5261 node.tail(tex.box[0].head))) }
5262 \stoptyping
5264 Direction support added in \LUATEX\ 0.45.
5266 \subsubsection{\luatex{node.mlist_to_hlist}}
5268 \startfunctioncall
5269 <node> h = node.mlist_to_hlist(<node> n,
5270 <string> display_type, <boolean> penalties)
5271 \stopfunctioncall
5273 This runs the internal mlist to hlist conversion, converting the math list in
5274 \type{n} into the horizontal list \type{h}. The interface is exactly the same as
5275 for the callback \type{mlist_to_hlist}.
5277 \subsubsection{\luatex{node.slide}}
5279 \startfunctioncall
5280 <node> m = node.slide(<node> n)
5281 \stopfunctioncall
5283 Returns the last node of the node list that starts at \type{n}. As a
5284 side|-|effect, it also creates a reverse chain of \type{prev} pointers
5285 between nodes.
5287 \subsubsection{\luatex{node.tail}}
5289 \startfunctioncall
5290 <node> m = node.tail(<node> n)
5291 \stopfunctioncall
5293 Returns the last node of the node list that starts at \type{n}.
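
The side|-|effect of \type{node.slide} is what makes a backward walk
possible. The following sketch (the function name is illustrative) prints
the node types of a list in reverse order and stops explicitly at
\type{head} rather than relying on its \type{prev} field.

\starttyping
local function print_reversed (head)
  local n = node.slide(head)   -- last node; prev pointers are now set
  while n do
    print(node.type(n.id))
    if n == head then break end
    n = n.prev
  end
end
\stoptyping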
5296 \subsubsection{\luatex{node.length}}
5298 \startfunctioncall
5299 <number> i = node.length(<node> n)
5300 <number> i = node.length(<node> n, <node> m)
5301 \stopfunctioncall
5303 Returns the number of nodes contained in the node list that starts at
5304 \type{n}. If \type{m} is also supplied it stops at \type{m} instead of
5305 at the end of the list. The node \type{m} is not counted.
5307 \subsubsection{\luatex{node.count}}
5309 \startfunctioncall
5310 <number> i = node.count(<number> id, <node> n)
5311 <number> i = node.count(<number> id, <node> n, <node> m)
5312 \stopfunctioncall
5314 Returns the number of nodes contained in the node list that starts at
5315 \type{n} that have a matching \type{id} field.
5316 If \type{m} is also supplied, counting stops at \type{m} instead of at
5317 the end of the list. The node \type{m} is not counted.
This function also accepts string \type{id}s.
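
As a small illustration, the two counting functions can be combined. The
helper name is made up, and \type{head} is assumed to be the start of a
valid node list.

\starttyping
-- returns the number of glyph nodes and the total number of nodes
local function glyph_ratio (head)
  local total  = node.length(head)
  local glyphs = node.count("glyph", head)
  return glyphs, total
end
\stoptyping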
5321 \subsubsection{\luatex{node.traverse}}
5323 \startfunctioncall
5324 <node> t = node.traverse(<node> n)
5325 \stopfunctioncall
This is a \LUA\ iterator that loops over the node list that starts at \type{n}.
Typical input code like this
\starttyping
for n in node.traverse(head) do
  -- process n here
end
\stoptyping
5336 is functionally equivalent to:
\starttyping
do
  local n
  -- stepping function: first call yields head, later calls follow next
  local function f (head,var)
    local t
    if var == nil then
      t = head
    else
      t = var.next
    end
    return t
  end
  while true do
    n = f (head, n)
    if n == nil then break end
    -- process n here
  end
end
\stoptyping
5358 It should be clear from the definition of the function \type{f} that
5359 even though it is possible to add or remove nodes from the node list while
5360 traversing, you have to take great care to make sure all the \type{next}
5361 (and \type{prev}) pointers remain valid.
5363 If the above is unclear to you, see the section \quote{For Statement}
5364 in the Lua Reference Manual.
5366 \subsubsection{\luatex{node.traverse_id}}
5368 \startfunctioncall
5369 <node> t = node.traverse_id(<number> id, <node> n)
5370 \stopfunctioncall
5372 This is an iterator that loops over all the nodes in the list that
5373 starts at \type{n} that have a matching \type{id} field.
5375 See the previous section for details. The change is in the local
5376 function \type{f}, which now does an extra while loop checking
5377 against the upvalue \type{id}:
\starttyping
local function f (head,var)
  local t
  if var == nil then
    t = head
  else
    t = var.next
  end
  -- skip nodes with a different id; stop at the end of the list
  while t and t.id ~= id do
    t = t.next
  end
  return t
end
\stoptyping
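
In practice the iterator is used directly in a \type{for} loop. A minimal
sketch, with an illustrative function name, that prints the font and
character number of every glyph node in a list:

\starttyping
local GLYPH = node.id("glyph")

local function list_glyphs (head)
  for g in node.traverse_id(GLYPH, head) do
    print(g.font, g.char)   -- read-only access: the list is not changed
  end
end
\stoptyping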
5394 \subsubsection{\luatex{node.end_of_math} (0.76)}
5396 \startfunctioncall
5397 <node> t = node.end_of_math(<node> start)
5398 \stopfunctioncall
Looks for and returns the next \type{math_node} following the \type{start}.
If the given node is a math endnode, this helper returns that node; otherwise it
follows the list and returns the next math endnode. If no such node is found,
\type{nil} is returned.
5403 \subsubsection{\luatex{node.remove}}
5405 \startfunctioncall
5406 <node> head, current = node.remove(<node> head, <node> current)
5407 \stopfunctioncall
5409 This function removes the node \type{current} from the list following
5410 \type{head}. It is your responsibility to make sure it is really part
5411 of that list. The return values are the new \type{head} and
5412 \type{current} nodes. The returned \type{current} is the node
5413 following the \type{current} in the calling argument, and is only
5414 passed back as a convenience (or \type{nil}, if there is no such node). The
5415 returned \type{head} is more important, because if the function is
5416 called with \type{current} equal to \type{head}, it will be changed.
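
The two return values make it possible to delete nodes inside a loop without
losing track of the list. The sketch below removes all kern nodes; the
function name is illustrative. The description above does not say that the
removed node is released, so the sketch assumes that it has to be freed
explicitly.

\starttyping
local KERN = node.id("kern")

local function strip_kerns (head)
  local current = head
  while current do
    if current.id == KERN then
      local removed = current
      -- head may change; current becomes the node after the removed one
      head, current = node.remove(head, current)
      node.free(removed)
    else
      current = current.next
    end
  end
  return head
end
\stoptyping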
5418 \subsubsection{\luatex{node.insert_before}}
5420 \startfunctioncall
5421 <node> head, new = node.insert_before(<node> head, <node> current, <node> new)
5422 \stopfunctioncall
5424 This function inserts the node \type{new} before \type{current} into
5425 the list following \type{head}. It is your responsibility to make sure
5426 that \type{current} is really part of that list. The return values are
5427 the (potentially mutated) \type{head} and the node \type{new}, set up to
5428 be part of the list (with correct \type{next} field). If \type{head}
5429 is initially \type{nil}, it will become \type{new}.
5431 \subsubsection{\luatex{node.insert_after}}
5433 \startfunctioncall
5434 <node> head, new = node.insert_after(<node> head, <node> current, <node> new)
5435 \stopfunctioncall
5437 This function inserts the node \type{new} after \type{current} into
5438 the list following \type{head}. It is your responsibility to make sure
5439 that \type{current} is really part of that list. The return values are
5440 the \type{head} and the node \type{new}, set up to be part of the list
5441 (with correct \type{next} field). If \type{head} is initially
5442 \type{nil}, it will become \type{new}.
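
A sketch of a typical insertion, assuming \type{current} is known to be part
of the list that starts at \type{head}. The helper name and its arguments
are made up for this example.

\starttyping
-- insert a kern of the given amount (scaled points) after current
local function add_kern_after (head, current, amount)
  local k = node.new("kern")
  k.kern = amount
  head, k = node.insert_after(head, current, k)
  return head, k   -- always continue with the returned head
end
\stoptyping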
5444 \subsubsection{\luatex{node.first_glyph} (0.65)}
5446 \startfunctioncall
5447 <node> n = node.first_glyph(<node> n)
5448 <node> n = node.first_glyph(<node> n, <node> m)
5449 \stopfunctioncall
5451 Returns the first node in the list starting at \type{n} that is a
5452 glyph node with a subtype indicating it is a glyph, or \type{nil}.
If \type{m} is given, processing stops at (and including) that node,
5454 otherwise processing stops at the end of the list.
5456 Note: this function used to be called \type{first_character}. It has
5457 been renamed in \LUATEX\ 0.65, and the old name is deprecated now.
5459 \subsubsection{\luatex{node.ligaturing}}
5461 \startfunctioncall
5462 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n)
5463 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n, <node> m)
5464 \stopfunctioncall
Apply \TEX|-|style ligaturing to the specified nodelist. The tail node
5467 \type{m} is optional. The two returned nodes \type{h} and \type{t} are
5468 the new head and tail (both \type{n} and \type{m} can change into
5469 a new ligature).
5471 \subsubsection{\luatex{node.kerning}}
5473 \startfunctioncall
5474 <node> h, <node> t, <boolean> success = node.kerning(<node> n)
5475 <node> h, <node> t, <boolean> success = node.kerning(<node> n, <node> m)
5476 \stopfunctioncall
5478 Apply \TEX|-|style kerning to the specified nodelist. The tail node
5479 \type{m} is optional. The two returned nodes \type{h} and \type{t} are
5480 the head and tail (either one of these can be an inserted kern node,
5481 because special kernings with word boundaries are possible).
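
The two functions are commonly applied one after the other. The following
is only a sketch of that sequence, with an illustrative function name; it
assumes that \type{head} starts a list that has not been processed yet.

\starttyping
local function ligatures_and_kerns (head)
  local tail, success
  -- the head can change in either step, so always reassign it
  head, tail, success = node.ligaturing(head)
  head, tail, success = node.kerning(head)
  return head
end
\stoptyping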
5483 \subsubsection{\luatex{node.unprotect_glyphs}}
5485 \startfunctioncall
5486 node.unprotect_glyphs(<node> n)
5487 \stopfunctioncall
5489 Subtracts 256 from all glyph node subtypes. This and the next
5490 function are helpers to convert from \type{characters} to
5491 \type{glyphs} during node processing.
5493 \subsubsection{\luatex{node.protect_glyphs}}
5495 \startfunctioncall
5496 node.protect_glyphs(<node> n)
5497 \stopfunctioncall
5499 Adds 256 to all glyph node subtypes in the node list starting at
5500 \type{n}, except that if the value is 1, it adds only 255. The special
5501 handling of 1 means that \type{characters} will become \type{glyphs}
5502 after subtraction of 256.
5504 \subsubsection{\luatex{node.last_node}}
5506 \startfunctioncall
5507 <node> n = node.last_node()
5508 \stopfunctioncall
5510 This function pops the last node from \TEX's \quote{current list}.
5511 It returns that node, or \type{nil} if the current list is empty.
5513 \subsubsection{\luatex{node.write}}
5515 \startfunctioncall
5516 node.write(<node> n)
5517 \stopfunctioncall
5519 This is an experimental function that will append a node list to
5520 \TEX's \quote {current list} (the node list is not deep-copied
5521 any more since version 0.38). There is no error checking yet!
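
Since the list is not copied, ownership of the nodes passes to \TEX. A
minimal sketch with arbitrary dimensions, meant to be run from \LUA\ code
that is called while \TEX\ is building a list:

\starttyping
local r = node.new("rule")
r.width  = 10 * 65536   -- 10pt
r.height =  5 * 65536   --  5pt
r.depth  =  0

node.write(r)           -- r now belongs to TeX: do not free it
r = nil
\stoptyping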
5523 \subsubsection{\luatex{node.protrusion_skippable} (0.60.1)}
5524 \startfunctioncall
5525 <boolean> skippable = node.protrusion_skippable(<node> n)
5526 \stopfunctioncall
5528 Returns \type{true} if, for the purpose of line boundary discovery
5529 when character protrusion is active, this node can be skipped.
5531 \subsection{Attribute handling}
Attributes appear as a linked list of userdata objects in the
5534 \type{attr} field of individual nodes. They can be handled
5535 individually, but it is much safer and more efficient to use the
5536 dedicated functions associated with them.
5538 \subsubsection{\luatex{node.has_attribute}}
5540 \startfunctioncall
5541 <number> v = node.has_attribute(<node> n, <number> id)
5542 <number> v = node.has_attribute(<node> n, <number> id, <number> val)
5543 \stopfunctioncall
5545 Tests if a node has the attribute with number \type{id} set. If
5546 \type{val} is also supplied, also tests if the value matches \type{val}.
5547 It returns the value, or, if no match is found, \type{nil}.
5549 \subsubsection{\luatex{node.set_attribute}}
5551 \startfunctioncall
5552 node.set_attribute(<node> n, <number> id, <number> val)
5553 \stopfunctioncall
5555 Sets the attribute with number \type{id} to the value
5556 \type{val}. Duplicate assignments are ignored. {\em [needs explanation]}
5558 \subsubsection{\luatex{node.unset_attribute}}
5560 \startfunctioncall
5561 <number> v = node.unset_attribute(<node> n, <number> id)
5562 <number> v = node.unset_attribute(<node> n, <number> id, <number> val)
5563 \stopfunctioncall
5565 Unsets the attribute with number \type{id}. If \type{val} is also supplied,
5566 it will only perform this operation if the value matches \type{val}.
5567 Missing attributes or attribute|-|value pairs are ignored.
5569 If the attribute was actually deleted, returns its old
5570 value. Otherwise, returns \type{nil}.
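
The three functions can be sketched together as follows. The attribute
number 100 and the values are arbitrary choices for this example, and the
comments restate the behaviour documented above rather than verified
output.

\starttyping
local ATTR = 100                        -- hypothetical attribute number
local n = node.new("kern")

node.set_attribute(n, ATTR, 7)
print(node.has_attribute(n, ATTR))      -- the value, as the attribute is set
print(node.has_attribute(n, ATTR, 8))   -- nil, as the value does not match
print(node.unset_attribute(n, ATTR))    -- the old value, if it was deleted
print(node.has_attribute(n, ATTR))      -- nil, as the attribute is gone

node.free(n)
\stoptyping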
5572 \section{The \luatex{pdf} library}
5574 This contains variables and functions that are related to the \PDF\ backend.
5576 %***********************************************************************
5578 \subsection{\luatex{pdf.mapfile}, \luatex{pdf.mapline} (new in 0.53.0)}
5580 \startfunctioncall
pdf.mapfile(<string> map file)
pdf.mapline(<string> map line)
5583 \stopfunctioncall
5585 These two functions can be used to replace primitives \type{\pdfmapfile}
5586 and \type{\pdfmapline} from \PDFTEX. They expect a string as only parameter
5587 and have no return value.
These functions also replace the former variables
5590 \luatex{pdf.pdfmapfile} and \luatex{pdf.pdfmapline}.
5592 %***********************************************************************
5593 \subsection{\luatex{pdf.catalog}, \luatex{pdf.info},
5594 \luatex{pdf.names}, \luatex{pdf.trailer} (new in 0.53.0)}
5596 These variables offer a read|-|write interface to the corresponding
5597 \PDFTEX\ token lists. The value types are strings and they are
5598 written out to the \PDF\ file directly after the \PDFTEX\ token registers.
The preferred interface is now \luatex {pdf.setcatalog}, \luatex {pdf.setinfo},
\luatex {pdf.setnames} and \luatex {pdf.settrailer} for setting these properties
and \luatex {pdf.getcatalog}, \luatex {pdf.getinfo}, \luatex {pdf.getnames} and
\luatex {pdf.gettrailer} for querying them.
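
A minimal sketch of the setter and getter functions. The \PDF\ key|-|value
strings are arbitrary examples, and it is assumed here that the functions
take and return plain strings, like the variables they replace.

\starttyping
pdf.setinfo("/Subject (An example document)")
pdf.setcatalog("/PageMode /UseOutlines")

print(pdf.getinfo())      -- the string that was set above
print(pdf.getcatalog())
\stoptyping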
5605 The corresponding \quote {\type{pdf}} parameter names \luatex {pdf.pdfcatalog},
5606 \luatex {pdf.pdfinfo}, \luatex {pdf.pdfnames}, and \luatex {pdf.pdftrailer} are
5607 removed in 0.79.0.
5609 %***********************************************************************
5610 \subsection{\luatex{pdf.<set/get>pageattributes}, \luatex{pdf.<set/get>pageresources},
5611 \luatex{pdf.<set/get>pagesattributes}}
5613 These variables offer a read|-|write interface to related
5614 token lists. The value types are strings. The variables have no
5615 interaction with the corresponding \PDFTEX\ token registers
5616 \tex{pdfpageattr}, \tex{pdfpageresources}, and \tex{pdfpagesattr}.
5617 They are written out to the \PDF\ file directly after
5618 the \PDFTEX\ token registers.
The preferred interface is now \luatex {pdf.setpageattributes}, \luatex
{pdf.setpagesattributes} and \luatex {pdf.setpageresources} for setting these
properties and \luatex {pdf.getpageattributes}, \luatex {pdf.getpagesattributes} and
\luatex {pdf.getpageresources} for querying them.
5625 %***********************************************************************
5627 \subsection{\luatex{pdf.h}, \luatex{pdf.v}}
5630 These are the \type{h} and \type{v} values that define the current location
5631 on the output page, measured from its lower left corner. The values can be queried
5632 using scaled points as units.
5634 \starttyping
5635 local h = pdf.h
5636 local v = pdf.v
5637 \stoptyping
5639 \subsection{\luatex{pdf.getpos}, \luatex{pdf.gethpos}, \luatex{pdf.getvpos}}
These are the function variants of \type {pdf.h} and \type {pdf.v}. Sometimes
using a function is preferred over a key so this saves wrapping. Also, these
functions are faster than the key based access, as \type {h} and \type {v}
keys are not real variables but looked up using a metatable call. The
\type {getpos} function returns two values, the others return one.
5647 \starttyping
5648 local h, v = pdf.getpos()
5649 \stoptyping
5651 \subsection{\luatex{pdf.hasmatrix}, \luatex{pdf.getmatrix}}
5653 The current matrix transformation is available via the \type {getmatrix} command,
5654 which returns 6 values: \type {sx}, \type {rx}, \type {ry}, \type {sy}, \type {tx},
5655 and \type {ty}. The \type {hasmatrix} function returns \type {true} when a matrix is
5656 applied.
\starttyping
if pdf.hasmatrix() then
  local sx, rx, ry, sy, tx, ty = pdf.getmatrix()
  -- do something useful or not
end
\stoptyping
5667 \subsection{\luatex{pdf.print}}
5669 A print function to write stuff to the \PDF\ document
5670 that can be used from within a \tex{latelua} argument.
5671 This function is not to be used inside \tex{directlua}
5672 unless you know {\it exactly} what you are doing.
5674 \startfunctioncall
5675 pdf.print(<string> s)
5676 pdf.print(<string> type, <string> s)
5677 \stopfunctioncall
5679 The optional parameter can be used to mimic the behavior of
5680 \tex{pdfliteral}: the \type{type} is \type{direct} or \type{page}.
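
One way to keep such code readable is to define a \LUA\ function once and
call it from \tex{latelua}. This is only a sketch: the function name
\type{draw_gray_square} and the \PDF\ operators are examples chosen for
illustration.

\starttyping
-- paints a small gray square at the current position
function draw_gray_square ()
  pdf.print("page", "q 0.5 g 0 0 10 10 re f Q\n")
end
\stoptyping

On the \TEX\ side, the call would then be made as
\type{\latelua{draw_gray_square()}}.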
5682 \subsection{\luatex{pdf.immediateobj}}
5684 This function creates a \PDF\ object
5685 and immediately writes it to the \PDF\ file.
5686 It is modelled after \PDFTEX's \tex{immediate}\tex{pdfobj} primitives.
5687 All function variants return the object number
5688 of the newly generated object.
5690 \startfunctioncall
5691 <number> n = pdf.immediateobj(<string> objtext)
5692 <number> n = pdf.immediateobj("file", <string> filename)
5693 <number> n = pdf.immediateobj("stream", <string> streamtext, <string> attrtext)
5694 <number> n = pdf.immediateobj("streamfile", <string> filename, <string> attrtext)
5695 \stopfunctioncall
5697 The first version puts the \type{objtext} raw into an object.
5698 Only the object wrapper is automatically generated,
5699 but any internal structure (like \type{<< >>} dictionary markers)
needs to be provided by the user.
5701 The second version with keyword \type{"file"} as 1st argument
5702 puts the contents of the file with name \type{filename} raw into the object.
5703 The third version with keyword \type{"stream"} creates a stream object
5704 and puts the \type{streamtext} raw into the stream.
5705 The stream length is automatically calculated.
5706 The optional \type{attrtext} goes into the dictionary of that object.
The fourth version with keyword \type{"streamfile"} does the same as the 3rd one,
except that it reads the stream data raw from a file.
5710 An optional first argument can be given to make the function use a
5711 previously reserved \PDF\ object.
5713 \startfunctioncall
5714 <number> n = pdf.immediateobj(<integer> n, <string> objtext)
5715 <number> n = pdf.immediateobj(<integer> n, "file", <string> filename)
5716 <number> n = pdf.immediateobj(<integer> n, "stream", <string> streamtext, <string> attrtext)
5717 <number> n = pdf.immediateobj(<integer> n, "streamfile", <string> filename, <string> attrtext)
5718 \stopfunctioncall
5720 %***********************************************************************
5722 \subsection{\luatex{pdf.obj}}
5724 This function creates a \PDF\ object,
5725 which is written to the \PDF\ file only when referenced,
5726 e.\,g., by \luatex{pdf.refobj()}.
5728 All function variants return the object number of the newly generated
5729 object, and there are two separate calling modes.
5731 The first mode is modelled after \PDFTEX's \tex{pdfobj} primitive.
5733 \startfunctioncall
5734 <number> n = pdf.obj(<string> objtext)
5735 <number> n = pdf.obj("file", <string> filename)
5736 <number> n = pdf.obj("stream", <string> streamtext, <string> attrtext)
5737 <number> n = pdf.obj("streamfile", <string> filename, <string> attrtext)
5738 \stopfunctioncall
5740 An optional first argument can be given to make the function use a
5741 previously reserved \PDF\ object.
5743 \startfunctioncall
5744 <number> n = pdf.obj(<integer> n, <string> objtext)
5745 <number> n = pdf.obj(<integer> n, "file", <string> filename)
5746 <number> n = pdf.obj(<integer> n, "stream", <string> streamtext, <string> attrtext)
5747 <number> n = pdf.obj(<integer> n, "streamfile", <string> filename, <string> attrtext)
5748 \stopfunctioncall
5750 The second mode accepts a single argument table with key--value pairs.
5752 \startfunctioncall
<number> n = pdf.obj{ type = <string>,
                      immediate = <boolean>,
                      objnum = <number>,
                      attr = <string>,
                      compresslevel = <number>,
                      objcompression = <boolean>,
                      file = <string>,
                      string = <string>}
5761 \stopfunctioncall
The \type{type} field can have the values \type{raw} and
\type{stream}. This field is required; the others are optional
(within constraints).
5767 Note: this mode makes \type{pdf.obj} look more flexible than it
5768 actually is: the constraints from the separate parameter version
5769 still apply, so for example you can't have both \type{string} and
5770 \type{file} at the same time.
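
A sketch of the table form, followed by the reference that causes the
object to be written. The stream content and the attribute string are
placeholders chosen for this example.

\starttyping
local n = pdf.obj{ type   = "stream",
                   string = "example stream data",
                   attr   = "/Type /ExampleStream" }

-- without a reference, the object would never reach the file
pdf.refobj(n)
\stoptyping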
5772 %***********************************************************************
5774 \subsection{\luatex{pdf.refobj}}
5776 This function,
5777 the \LUA\ version of the \tex{pdfrefobj} primitive,
5778 references an object by its object number,
5779 so that the object will be written out.
5781 \startfunctioncall
5782 pdf.refobj(<integer> n)
5783 \stopfunctioncall
5785 This function works in both the \tex{directlua} and \tex{latelua} environment.
5786 Inside \tex{directlua} a new whatsit node
5787 \quote{pdf_refobj} is created, which will be marked for flushing during
5788 page output and the object is then written directly after the page,
5789 when also the resources objects are written out.
5790 Inside \tex{latelua} the object will be marked for flushing.
5792 This function has no return values.
5794 %***********************************************************************
5796 \subsection{\luatex{pdf.reserveobj}}
5798 This function creates an empty \PDF\ object and returns its number.
5800 \startfunctioncall
5801 <number> n = pdf.reserveobj()
5802 <number> n = pdf.reserveobj("annot")
5803 \stopfunctioncall
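
Reserving a number is useful when other objects have to refer to an object
before its content is known. A minimal sketch, in which the dictionary
content is a placeholder:

\starttyping
local n   = pdf.reserveobj()
local ref = string.format("%d 0 R", n)   -- usable in other objects

-- later, once the content is known, fill the reserved object
pdf.immediateobj(n, "<< /Example true >>")
\stoptyping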
5805 \subsection{\luatex{pdf.registerannot} (new in 0.47.0)}
5807 This function adds an object number to the \type{/Annots} array for the
5808 current page without doing anything else. This function can only be
5809 used from within \type{\latelua}.
5811 \startfunctioncall
5812 pdf.registerannot (<number> objnum)
5813 \stopfunctioncall
5815 \section{The \luatex{pdfscanner} library (new in 0.72.0)}
5817 The \luatex{pdfscanner} library allows interpretation of PDF content streams
5818 and \type{/ToUnicode} (cmap) streams. You can get those streams from the
5819 \luatex{epdf} library, as explained in an earlier section. There is only
5820 a single top|-|level function in this library:
5822 \startfunctioncall
5823 pdfscanner.scan (<Object> stream, <table> operatortable, <table> info)
5824 \stopfunctioncall
5826 The first argument, \type{stream}, should be either a PDF stream
5827 object, or a PDF array of PDF stream objects (those options comprise
5828 the possible return values of \type{<Page>:getContents()}
5829 and \type{<Object>:getStream()} in the \type{epdf} library).
5831 The second argument, \type{operatortable}, should be a Lua table where
5832 the keys are PDF operator name strings and the values are Lua
5833 functions (defined by you) that are used to process those
5834 operators. The functions are called whenever the scanner finds one
5835 of these PDF operators in the content stream(s). The functions are
5836 called with two arguments: the \type{scanner} object itself, and
the \type{info} table that was passed as the third argument
5838 to \type{pdfscanner.scan}.
5840 Internally, \type{pdfscanner.scan} loops over the PDF operators in the
5841 stream(s), collecting operands on an internal stack until it finds a
5842 PDF operator. If that PDF operator's name exists
5843 in \type{operatortable}, then the associated function is
5844 executed. After the function has run (or when there is no function to
5845 execute) the internal operand stack is cleared in preparation for the
5846 next operator, and processing continues.
The \type{scanner} argument to the processing functions is needed
because it offers various methods to get the actual operands from the
internal operand stack. The most important of those methods is
\type{scanner:pop()}, which is described below.
5853 A simple example of processing a PDF's document stream
5854 could look like this:
5856 \starttyping
function Do (scanner, info)
   local val = scanner:pop()
   local name = val[2] -- val[1] == 'name'
   print (info.space ..'Use XObject '.. name)
   local resources = info.resources
   local xobject = resources:lookup("XObject"):getDict():lookup(name)
   if (xobject and xobject:isStream()) then
      local dict = xobject:getStream():getDict()
      if dict then
         local name = dict:lookup('Subtype')
         if name:getName() == 'Form' then
            local newinfo = { space = info.space .. " " ,
                              resources = dict:lookup('Resources'):getDict() }
            pdfscanner.scan(xobject, operatortable, newinfo)
         end
      end
   end
end

operatortable = {Do = Do}

doc = epdf.open(arg[1])
pagenum = 1
while pagenum <= doc:getNumPages() do
   local page = doc:getCatalog():getPage(pagenum)
   local info = { space = " " , resources = page:getResourceDict()}
   print ('Page ' .. pagenum)
   pdfscanner.scan(page:getContents(), operatortable, info)
   pagenum = pagenum + 1
end
5886 \stoptyping
This example iterates over all the actual content in the PDF, and
prints out the found XObject names. While the code demonstrates quite
a few of the \type{epdf} functions, let's focus on the
\type{pdfscanner}-specific code instead.
5893 From the bottom up, the line
5895 \starttyping
5896 pdfscanner.scan(page:getContents(), operatortable, info)
5897 \stoptyping
5899 runs the scanner with the PDF page's top-level content.
5901 The third argument, \type{info}, contains two entries: \type{space} is
5902 used to indent the printed output, and \type{resources} is needed so
5903 that embedded \type{XForms} can find their own content.
The second argument, \type{operatortable}, defines a processing function
5906 for a single PDF operator, \type{Do}.
The function \type{Do} prints the name of the current XObject, and
then starts a new scanner for that object's content stream, under the
condition that the XObject is in fact a \type{/Form}. That nested
scanner is called with a new \type{info} argument with an
updated \type{space} value so that the indentation of the output nicely
nests, and with a new \type{resources} field to help the next
iteration down to properly process any other, embedded XObjects.
Of course, this is not a very useful example in practice, but for the
purpose of demonstrating \type{pdfscanner}, it is just long enough.
It makes use of only one \type{scanner} method: \type{scanner:pop()}.
That function pops the top operand of the internal stack, and returns
a \LUA\ table where the object at index one is a string representing
the type of the operand, and object two is its value.

The list of possible operand types and associated \LUA\ value types is:
5925 \starttabulate[|lT|p|]
5926 \NC integer \NC <number> \NC \NR
5927 \NC real \NC <number> \NC \NR
5928 \NC boolean \NC <boolean> \NC \NR
5929 \NC name \NC <string> \NC \NR
5930 \NC operator \NC <string> \NC \NR
5931 \NC string \NC <string> \NC \NR
5932 \NC array \NC <table> \NC \NR
5933 \NC dict \NC <table> \NC \NR
5934 \stoptabulate
5936 In case of \type{integer} or \type{real}, the value is always
5937 a Lua (floating point) number.
5939 In case of \type{name}, the leading slash is always stripped.
In case of \type{string}, please bear in mind that PDF actually
supports different types of strings (with different encodings) in
different parts of the PDF document, so you may need to reencode some of
the results; \type{pdfscanner} always outputs the byte stream without
reencoding anything. \type{pdfscanner} does not differentiate between
literal strings and hexadecimal strings (the hexadecimal values are
decoded), and it treats the stream data for inline images as a string
that is the single operand for \type{EI}.
5950 In case of \type{array}, the table content is a list of \type{pop}
5951 return values.
5953 In case of \type{dict}, the table keys are PDF name strings
5954 and the values are \type{pop} return values.
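
As a hedged illustration of the nested \type{array} layout described
above, the following sketch concatenates the string fragments found in
the operand of the PDF \type{TJ} operator. The handler name \type{TJ}
and the \type{info.text} field are assumptions made for this example
only; they are not defined by the library.

\starttyping
-- Handler for the PDF 'TJ' operator. Its single operand is an array
-- that mixes strings with numeric kerning adjustments.
local function TJ (scanner, info)
   local val = scanner:pop()            -- { 'array', { ... } }
   if val and val[1] == 'array' then
      for _, item in ipairs(val[2]) do
         -- every item is again a pop-style pair: { type, value }
         if item[1] == 'string' then
            info.text = (info.text or '') .. item[2]
         end
      end
   end
end
\stoptyping

The numeric items are simply skipped here; a real text extractor would
probably want to interpret large negative adjustments as word spaces.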
5956 \blank
There are a few more methods that you can call on \type{scanner}:
5960 \starttabulate[|lT|p|]
5961 \NC pop \NC as explained above\NC \NR
5962 \NC popNumber \NC return only the value of a \type{real} or \type{integer}\NC \NR
5963 \NC popName \NC return only the value of a \type{name} \NC \NR
5964 \NC popString \NC return only the value of a \type{string} \NC \NR
5965 \NC popArray \NC return only the value of a \type{array} \NC \NR
5966 \NC popDict \NC return only the value of a \type{dict} \NC \NR
5967 \NC popBool \NC return only the value of a \type{boolean} \NC \NR
5968 \NC done \NC abort further processing of this \type{scan()} call\NC \NR
5969 \stoptabulate
The \type{popXXX} methods are convenience functions, and come in handy when
5972 you know the type of the operands beforehand (which you usually do, in
5973 PDF). For example, the \type{Do} function could have used \type{local
5974 name = scanner:popName()} instead, because the single operand
5975 to the \type{Do} operator is always a PDF name object.
5977 The \type{done} function allows you to abort processing of a stream
5978 once you have learned everything you want to learn. This comes in handy
5979 while parsing \type{/ToUnicode}, because there usually is trailing
5980 garbage that you are not interested in. Without \type{done}, processing
ends only at the end of the stream, possibly wasting CPU cycles.
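
A minimal sketch of the convenience methods and of \type{done} could
look as follows. It collects the operands of \type{Tj} and stops at the
first \type{ET}. The \type{info.strings} field is an assumption of this
example, and \type{page} is assumed to be a page object obtained as in
the earlier example.

\starttyping
-- Store the operand of every 'Tj' operator.
local function Tj (scanner, info)
   local s = scanner:popString()
   if s then
      info.strings[#info.strings + 1] = s
   end
end

-- Stop as soon as the first text object ends.
local function ET (scanner, info)
   scanner:done()    -- aborts the current pdfscanner.scan() call
end

local info = { strings = { } }
pdfscanner.scan(page:getContents(), { Tj = Tj, ET = ET }, info)
\stoptyping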
5983 \section{The \luatex{status} library}
This library contains a number of run|-|time configuration items that
you may find useful in message reporting, as well as a
function that returns all of the names and values as a table.
5989 \startfunctioncall
5990 <table> info = status.list()
5991 \stopfunctioncall
The keys in the table are the names of the known items, and each value
is the current value of that item. Almost all of the values in \type{status} are
5995 fetched through a metatable at run|-|time whenever they are
5996 accessed, so you cannot use \type{pairs} on \type{status}, but you
5997 {\it can\/} use \type{pairs} on \type{info}, of course. If you do
5998 not need the full list, you can also ask for a single item by
5999 using its name as an index into \type{status}.
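
The two access styles can be sketched like this. The use of
\type{texio.write_nl} with the \type{"log"} target and the sorting of
the names are choices made for this example, not requirements of the
\type{status} library.

\starttyping
-- status.list() returns an ordinary table, so pairs() works on it
local info = status.list()
local names = { }
for name in pairs(info) do
   names[#names + 1] = name
end
table.sort(names)
for _, name in ipairs(names) do
   texio.write_nl("log", name .. " = " .. tostring(info[name]))
end

-- a single item can be read directly, without building the full table
texio.write_nl("log", "pages written so far: " .. status.total_pages)
\stoptyping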
6001 The current list is:
6003 \starttabulate[|lT|p|]
6004 \NC \ssbf key \NC \bf explanation \NC\NR
6005 \NC pdf_gone\NC written \PDF\ bytes \NC \NR
6006 \NC pdf_ptr\NC not yet written \PDF\ bytes \NC \NR
6007 \NC dvi_gone\NC written \DVI\ bytes \NC \NR
6008 \NC dvi_ptr\NC not yet written \DVI\ bytes \NC \NR
6009 \NC total_pages\NC number of written pages \NC \NR
6010 \NC output_file_name\NC name of the \PDF\ or \DVI\ file \NC \NR
6011 \NC log_name\NC name of the log file \NC \NR
6012 \NC banner\NC terminal display banner \NC \NR
6013 \NC var_used\NC variable (one|-|word) memory in use \NC \NR
6014 \NC dyn_used\NC token (multi|-|word) memory in use \NC \NR
6015 \NC str_ptr\NC number of strings \NC \NR
6016 \NC init_str_ptr\NC number of \INITEX\ strings \NC \NR
6017 \NC max_strings\NC maximum allowed strings \NC \NR
6018 \NC pool_ptr\NC string pool index \NC \NR
6019 \NC init_pool_ptr\NC \INITEX\ string pool index \NC \NR
6020 \NC pool_size\NC current size allocated for string characters \NC \NR
6021 \NC node_mem_usage\NC a string giving insight into currently used nodes\NC\NR
6022 \NC var_mem_max\NC number of allocated words for nodes\NC \NR
6023 \NC fix_mem_max\NC number of allocated words for tokens\NC \NR
6024 \NC fix_mem_end\NC maximum number of used tokens\NC \NR
6025 \NC cs_count\NC number of control sequences \NC \NR
6026 \NC hash_size\NC size of hash \NC \NR
6027 \NC hash_extra\NC extra allowed hash \NC \NR
6028 \NC font_ptr\NC number of active fonts \NC \NR
6029 \NC max_in_stack\NC max used input stack entries \NC \NR
6030 \NC max_nest_stack\NC max used nesting stack entries \NC \NR
6031 \NC max_param_stack\NC max used parameter stack entries \NC \NR
6032 \NC max_buf_stack\NC max used buffer position \NC \NR
6033 \NC max_save_stack\NC max used save stack entries \NC \NR
6034 \NC stack_size\NC input stack size \NC \NR
6035 \NC nest_size\NC nesting stack size \NC \NR
6036 \NC param_size\NC parameter stack size \NC \NR
6037 \NC buf_size\NC current allocated size of the line buffer \NC \NR
6038 \NC save_size\NC save stack size \NC \NR
6039 \NC obj_ptr\NC max \PDF\ object pointer \NC \NR
6040 \NC obj_tab_size\NC \PDF\ object table size \NC \NR
6041 \NC pdf_os_cntr\NC max \PDF\ object stream pointer \NC \NR
6042 \NC pdf_os_objidx\NC \PDF\ object stream index \NC \NR
6043 \NC pdf_dest_names_ptr\NC max \PDF\ destination pointer \NC \NR
6044 \NC dest_names_size\NC \PDF\ destination table size \NC \NR
6045 \NC pdf_mem_ptr\NC max \PDF\ memory used \NC \NR
6046 \NC pdf_mem_size\NC \PDF\ memory size \NC \NR
6047 \NC largest_used_mark\NC max referenced marks class \NC \NR
6048 \NC filename\NC name of the current input file \NC \NR
6049 \NC inputid\NC numeric id of the current input \NC \NR
6050 \NC linenumber\NC location in the current input file\NC \NR
6051 \NC lasterrorstring\NC last error string\NC \NR
6052 \NC luabytecodes\NC number of active \LUA\ bytecode registers\NC \NR
6053 \NC luabytecode_bytes\NC number of bytes in \LUA\ bytecode registers\NC \NR
6054 \NC luastate_bytes\NC number of bytes in use by \LUA\ interpreters\NC \NR
6055 \NC output_active\NC \type{true} if the \tex{output} routine is active\NC \NR
6056 \NC callbacks\NC total number of executed callbacks so far\NC \NR
6057 \NC indirect_callbacks\NC number of those that were themselves
6058 a result of other callbacks (e.g. file readers)\NC \NR
6059 \NC luatex_svn\NC the luatex repository id (added in 0.51)\NC\NR
6060 \NC luatex_version\NC the luatex version number (added in 0.38)\NC\NR
6061 \NC luatex_revision\NC the luatex revision string (added in 0.38)\NC\NR
6062 \NC ini_version\NC \type{true} if this is an \INITEX\ run (added in 0.38)\NC\NR
6063 \stoptabulate
6066 \section{The \luatex{tex} library}
6068 The \luatex{tex} table contains a large list of virtual internal \TEX\
6069 parameters that are partially writable.
6071 The designation \quote{virtual} means that these items are not properly
6072 defined in \LUA, but are only front\-ends that are handled by a metatable
6073 that operates on the actual \TEX\ values. As a result, most of the \LUA\
6074 table operators (like \type{pairs} and \type{#}) do not work on such
6075 items.
6077 At the moment, it is possible to access almost every parameter
6078 that has these characteristics:
6080 \startitemize[packed]
6081 \item You can use it after \tex{the}
6082 \item It is a single token.
6083 \item Some special others, see the list below
6084 \stopitemize
6086 This excludes parameters that need extra arguments, like
6087 \tex{the}\tex{scriptfont}.
The subset comprising simple integer and dimension registers is
writable as well as readable (for example \tex{tracingcommands} and
\tex{parindent}).
6093 \subsection{Internal parameter values}
6095 For all the parameters in this section, it is possible to access them
6096 directly using their names as index in the \type{tex} table, or by
6097 using one of the functions \type{tex.get()} and \type{tex.set()}.
6099 The exact parameters and return values differ depending on the actual
6100 parameter, and so does whether \type{tex.set} has any effect. For the
6101 parameters that {\it can\/} be set, it is possible to use
6102 \type{'global'} as the first argument to \type{tex.set}; this makes
6103 the assignment global instead of local.
6105 \startfunctioncall
6106 tex.set (<string> n, ...)
6107 tex.set ('global', <string> n, ...)
6108 ... = tex.get (<string> n)
6109 \stopfunctioncall
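
As a small sketch of the two access styles, using integer parameters
from the list in the next subsection (the chosen values are arbitrary):

\starttyping
-- local assignment, the function form of  tex.tolerance = 500
tex.set('tolerance', 500)

-- global assignment
tex.set('global', 'tracingonline', 1)

-- both forms of read access should give the same value
local a = tex.get('tolerance')
local b = tex.tolerance
\stoptyping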
6111 \subsubsection{Integer parameters}
6113 The integer parameters accept and return \LUA\ numbers.
Read|-|write:
6117 \startcolumns[n=2]
6118 \starttyping
6119 tex.adjdemerits
6120 tex.binoppenalty
6121 tex.brokenpenalty
6122 tex.catcodetable
6123 tex.clubpenalty
6124 tex.day
6125 tex.defaulthyphenchar
6126 tex.defaultskewchar
6127 tex.delimiterfactor
6128 tex.displaywidowpenalty
6129 tex.doublehyphendemerits
6130 tex.endlinechar
6131 tex.errorcontextlines
6132 tex.escapechar
6133 tex.exhyphenpenalty
6134 tex.fam
6135 tex.finalhyphendemerits
6136 tex.floatingpenalty
6137 tex.globaldefs
6138 tex.hangafter
6139 tex.hbadness
6140 tex.holdinginserts
6141 tex.hyphenpenalty
6142 tex.interlinepenalty
6143 tex.language
6144 tex.lastlinefit
6145 tex.lefthyphenmin
6146 tex.linepenalty
6147 tex.localbrokenpenalty
6148 tex.localinterlinepenalty
6149 tex.looseness
6150 tex.mag
6151 tex.maxdeadcycles
6152 tex.month
6153 tex.newlinechar
6154 tex.outputpenalty
6155 tex.pausing
6156 tex.pdfadjustspacing
6157 tex.pdfcompresslevel
6158 tex.pdfdecimaldigits
6159 tex.pdfgamma
6160 tex.pdfgentounicode
6161 tex.pdfimageapplygamma
6162 tex.pdfimagegamma
6163 tex.pdfimagehicolor
6164 tex.pdfimageresolution
6165 tex.pdfinclusionerrorlevel
6166 tex.pdfminorversion
6167 tex.pdfobjcompresslevel
6168 tex.pdfoutput
6169 tex.pdfpagebox
6170 tex.pdfpkresolution
6171 tex.pdfprotrudechars
6172 tex.pdftracingfonts
6173 tex.pdfuniqueresname
6174 tex.postdisplaypenalty
6175 tex.predisplaydirection
6176 tex.predisplaypenalty
6177 tex.pretolerance
6178 tex.relpenalty
6179 tex.righthyphenmin
6180 tex.savinghyphcodes
6181 tex.savingvdiscards
6182 tex.showboxbreadth
6183 tex.showboxdepth
6184 tex.time
6185 tex.tolerance
6186 tex.tracingassigns
6187 tex.tracingcommands
6188 tex.tracinggroups
6189 tex.tracingifs
6190 tex.tracinglostchars
6191 tex.tracingmacros
6192 tex.tracingnesting
6193 tex.tracingonline
6194 tex.tracingoutput
6195 tex.tracingpages
6196 tex.tracingparagraphs
6197 tex.tracingrestores
6198 tex.tracingscantokens
6199 tex.tracingstats
6200 tex.uchyph
6201 tex.vbadness
6202 tex.widowpenalty
6203 tex.year
6204 \stoptyping
6205 \stopcolumns
6207 Read|-|only:
6209 \startcolumns[n=3]
6210 \starttyping
6211 tex.deadcycles
6212 tex.insertpenalties
6213 tex.parshape
6214 tex.prevgraf
6215 tex.spacefactor
6216 \stoptyping
6217 \stopcolumns
6219 \subsubsection{Dimension parameters}
6221 The dimension parameters accept \LUA\ numbers (signifying scaled points)
6222 or strings (with included dimension). The result is always a number in
6223 scaled points.
6225 Read|-|write:
6227 \startcolumns[n=3]
6228 \starttyping
6229 tex.boxmaxdepth
6230 tex.delimitershortfall
6231 tex.displayindent
6232 tex.displaywidth
6233 tex.emergencystretch
6234 tex.hangindent
6235 tex.hfuzz
6236 tex.hoffset
6237 tex.hsize
6238 tex.lineskiplimit
6239 tex.mathsurround
6240 tex.maxdepth
6241 tex.nulldelimiterspace
6242 tex.overfullrule
6243 tex.pagebottomoffset
6244 tex.pageheight
6245 tex.pageleftoffset
6246 tex.pagerightoffset
6247 tex.pagetopoffset
6248 tex.pagewidth
6249 tex.parindent
6250 tex.pdfdestmargin
6251 tex.pdfeachlinedepth
6252 tex.pdfeachlineheight
6253 tex.pdffirstlineheight
6254 tex.pdfhorigin
6255 tex.pdflastlinedepth
6256 tex.pdflinkmargin
6257 tex.pdfpageheight
6258 tex.pdfpagewidth
6259 tex.pdfpxdimen
6260 tex.pdfthreadmargin
6261 tex.pdfvorigin
6262 tex.predisplaysize
6263 tex.scriptspace
6264 tex.splitmaxdepth
6265 tex.vfuzz
6266 tex.voffset
6267 tex.vsize
6268 \stoptyping
6269 \stopcolumns
6271 Read|-|only:
6273 \startcolumns[n=3]
6274 \starttyping
6275 tex.pagedepth
6276 tex.pagefilllstretch
6277 tex.pagefillstretch
6278 tex.pagefilstretch
6279 tex.pagegoal
6280 tex.pageshrink
6281 tex.pagestretch
6282 tex.pagetotal
6283 tex.prevdepth
6284 \stoptyping
6285 \stopcolumns
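
The following sketch shows both accepted input forms for two of the
read|-|write dimension parameters. The variable names are made up for
this example, and the division by 65536 assumes the usual relation of
65536 scaled points to one point.

\starttyping
tex.parindent = 20 * 65536    -- a number is taken as scaled points
tex.hsize     = '30pc'        -- a string carries its own unit

local width_sp = tex.hsize    -- read access yields scaled points
local width_pt = width_sp / 65536
\stoptyping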
6287 \subsubsection{Direction parameters}
6289 The direction parameters are read|-|only and return a \LUA\ string.
6291 \startcolumns[n=3]
6292 \starttyping
6293 tex.bodydir
6294 tex.mathdir
6295 tex.pagedir
6296 tex.pardir
6297 tex.textdir
6298 \stoptyping
6299 \stopcolumns
6301 \subsubsection{Glue parameters}
6303 The glue parameters accept and return a userdata object that
6304 represents a \type{glue_spec} node.
6306 \startcolumns[n=3]
6307 \starttyping
6308 tex.abovedisplayshortskip
6309 tex.abovedisplayskip
6310 tex.baselineskip
6311 tex.belowdisplayshortskip
6312 tex.belowdisplayskip
6313 tex.leftskip
6314 tex.lineskip
6315 tex.parfillskip
6316 tex.parskip
6317 tex.rightskip
6318 tex.spaceskip
6319 tex.splittopskip
6320 tex.tabskip
6321 tex.topskip
6322 tex.xspaceskip
6323 \stoptyping
6324 \stopcolumns
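
A hedged sketch of read access: the field names \type{width},
\type{stretch} and \type{shrink} are assumed to be those of the
\type{glue_spec} node as described in the node interface chapter.

\starttyping
local spec = tex.baselineskip     -- a glue_spec node (userdata)
texio.write_nl("log", "baselineskip: width " .. spec.width ..
                      ", stretch " .. spec.stretch ..
                      ", shrink " .. spec.shrink)
\stoptyping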
6326 \subsubsection{Muglue parameters}
6328 All muglue parameters are to be used read|-|only and return a \LUA\ string.
6330 \startcolumns[n=3]
6331 \starttyping
6332 tex.medmuskip
6333 tex.thickmuskip
6334 tex.thinmuskip
6335 \stoptyping
6336 \stopcolumns
6338 \subsubsection{Tokenlist parameters}
6340 The tokenlist parameters accept and return \LUA\ strings. \LUA\ strings are
6341 converted to and from token lists using \tex{the}\tex{toks} style
6342 expansion: all category codes are either space (10) or other (12).
6343 It follows that assigning to some of these, like \quote{tex.output},
6344 is actually useless, but it feels bad to make exceptions in view
6345 of a coming extension that will accept full-blown token strings.
6347 \startcolumns[n=3]
6348 \starttyping
6349 tex.errhelp
6350 tex.everycr
6351 tex.everydisplay
6352 tex.everyeof
6353 tex.everyhbox
6354 tex.everyjob
6355 tex.everymath
6356 tex.everypar
6357 tex.everyvbox
6358 tex.output
6359 tex.pdfpageattr
6360 tex.pdfpageresources
6361 tex.pdfpagesattr
6362 tex.pdfpkmode
6363 \stoptyping
6364 \stopcolumns
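
Because of the conversion rule described above, string assignment is
mostly useful for token lists that consist of plain characters only.
A sketch, with an arbitrarily chosen attribute string:

\starttyping
-- all characters end up with category code 10 (space) or 12 (other),
-- which is fine for literal PDF code
tex.pdfpageattr = '/Rotate 90'

-- reading gives back a Lua string
local attr = tex.pdfpageattr
\stoptyping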
6367 \subsection{Convert commands}
6369 All \quote{convert} commands are read|-|only and return a \LUA\ string.
6370 The supported commands at this moment are:
6372 \startcolumns[n=2]
6373 \starttyping
6374 tex.eTeXVersion
6375 tex.eTeXrevision
6376 tex.formatname
6377 tex.jobname
6378 tex.luatexrevision
6379 tex.pdfnormaldeviate
6380 tex.pdftexbanner
6381 tex.pdftexrevision
6382 tex.fontname(number)
6383 tex.pdffontname(number)
6384 tex.pdffontobjnum(number)
6385 tex.pdffontsize(number)
6386 tex.uniformdeviate(number)
6387 tex.number(number)
6388 tex.romannumeral(number)
6389 tex.pdfpageref(number)
6390 tex.pdfxformname(number)
6391 tex.fontidentifier(number)
6392 \stoptyping
6393 \stopcolumns
If you are wondering why this list looks haphazard: these are all the
cases of the \quote{convert} internal command that do not require an
argument, as well as the ones that require only a simple numeric
value.
6400 The special (lua-only) case of \type{tex.fontidentifier} returns the
6401 \type{csname} string that matches a font id number (if there is one).
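
A short sketch that writes a few of these strings to the log file. The
call to \type{font.current()} to obtain a font id is an assumption of
this example; any valid font id will do.

\starttyping
-- argument-less convert commands are read like table fields
texio.write_nl("log", "job name: "    .. tex.jobname)
texio.write_nl("log", "format name: " .. tex.formatname)

-- the others are called as functions with one numeric argument
texio.write_nl("log", "roman numeral: " .. tex.romannumeral(2014))

local id = font.current()
texio.write_nl("log", "font " .. id .. " is " .. tex.fontname(id))
\stoptyping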
6403 \subsection{Last item commands}
6405 All \quote{last item} commands are read|-|only and return a number.
6407 The supported commands at this moment are:
6409 \startcolumns[n=3]
6410 \starttyping
6411 tex.lastpenalty
6412 tex.lastkern
6413 tex.lastskip
6414 tex.lastnodetype
6415 tex.inputlineno
6416 tex.pdftexversion
6417 tex.pdflastobj
6418 tex.pdflastxform
6419 tex.pdflastximage
6420 tex.pdflastximagepages
6421 tex.pdflastannot
6422 tex.pdflastxpos
6423 tex.pdflastypos
6424 tex.pdfrandomseed
6425 tex.pdflastlink
6426 tex.luatexversion
6427 tex.eTeXminorversion
6428 tex.eTeXversion
6429 tex.currentgrouplevel
6430 tex.currentgrouptype
6431 tex.currentiflevel
6432 tex.currentiftype
6433 tex.currentifbranch
6434 tex.pdflastximagecolordepth
6435 \stoptyping
6436 \stopcolumns
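
For instance, a message that reports the current input position and
grouping level could be sketched like this:

\starttyping
texio.write_nl("log", "input line "    .. tex.inputlineno ..
                      ", group level " .. tex.currentgrouplevel ..
                      ", engine version " .. tex.luatexversion)
\stoptyping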
6438 \subsection{Attribute, count, dimension, skip and token registers}
6440 \TEX's attributes (\tex{attribute}), counters (\tex{count}),
6441 dimensions (\tex{dimen}), skips (\tex{skip}) and token (\tex{toks})
6442 registers can be accessed and written to using two times five virtual
6443 sub|-|tables of the \luatex{tex} table:
6445 \startcolumns[n=3]
6446 \starttyping
6447 tex.attribute
6448 tex.count
6449 tex.dimen
6450 tex.skip
6451 tex.toks
6452 \stoptyping
6453 \stopcolumns
6455 It is possible to use the names of relevant \tex{attributedef}, \tex{countdef},
6456 \tex{dimendef}, \tex{skipdef}, or \tex{toksdef} control sequences as indices
6457 to these tables:
6459 \starttyping
6460 tex.count.scratchcounter = 0
6461 enormous = tex.dimen['maxdimen']
6462 \stoptyping
6464 In this case, \LUATEX\ looks up the value for you on the fly. You have
6465 to use a valid \tex{countdef} (or \tex{attributedef}, or
6466 \tex{dimendef}, or \tex{skipdef}, or \tex{toksdef}), anything else
6467 will generate an error (the intent is to eventually also allow
6468 \type{<chardef tokens>} and even macros that expand into a number).
6470 The attribute and count registers accept and return \LUA\ numbers.
6472 The dimension registers accept \LUA\ numbers (in scaled points) or
6473 strings (with an included absolute dimension; \type {em} and \type {ex} and \type {px}
6474 are forbidden). The result is always a number in scaled points.
6476 The token registers accept and return \LUA\ strings. \LUA\ strings are
6477 converted to and from token lists using \tex{the}\tex{toks} style
6478 expansion: all category codes are either space (10) or other (12).
6480 The skip registers accept and return \type{glue_spec} userdata node
6481 objects (see the description of the node interface elsewhere in this
6482 manual).
6484 As an alternative to array addressing, there are also accessor
6485 functions defined for all cases, for example, here is the set
6486 of possibilities for \type{\skip} registers:
6488 \startfunctioncall
6489 tex.setskip (<number> n, <node> s)
6490 tex.setskip (<string> s, <node> s)
6491 tex.setskip ('global',<number> n, <node> s)
6492 tex.setskip ('global',<string> s, <node> s)
6493 <node> s = tex.getskip (<number> n)
6494 <node> s = tex.getskip (<string> s)
6495 \stopfunctioncall
6497 In the function-based interface, it is possible to define values
6498 globally by using the string \type{'global'} as the first function argument.
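
The same pattern applies to the other register types. A sketch, under
the assumption that the accessors are named after the register type in
the same way as \type{tex.setskip} and \type{tex.getskip}, and using
scratch register numbers picked for this example:

\starttyping
tex.setcount(255, 42)               -- local assignment
tex.setcount('global', 255, 42)     -- global assignment
local n = tex.getcount(255)

tex.setdimen(255, '1in')            -- absolute units only
local d = tex.getdimen(255)         -- always in scaled points

tex.settoks(0, 'some text')
local t = tex.gettoks(0)            -- a Lua string
\stoptyping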
6500 \subsection{Character code registers (0.63)}
6502 \TEX's character code tables (\tex{lccode}, \tex{uccode},
6503 \tex{sfcode}, \tex{catcode}, \tex{mathcode}, \tex{delcode}) can be
6504 accessed and written to using six virtual subtables of the \type{tex}
table:
6507 \startcolumns[n=3]
6508 \starttyping
6509 tex.lccode
6510 tex.uccode
6511 tex.sfcode
6512 tex.catcode
6513 tex.mathcode
6514 tex.delcode
6515 \stoptyping
6516 \stopcolumns
6518 The function call interfaces are roughly as above, but there are a few twists.
6519 \type{sfcode}s are the simple ones:
6521 \startfunctioncall
6522 tex.setsfcode (<number> n, <number> s)
6523 tex.setsfcode ('global', <number> n, <number> s)
6524 <number> s = tex.getsfcode (<number> n)
6525 \stopfunctioncall
6527 The function call interface for \type{lccode} and \type{uccode} additionally allows you to set the associated sibling at the same time:
6529 \startfunctioncall
6530 tex.setlccode (['global'], <number> n, <number> lc)
6531 tex.setlccode (['global'], <number> n, <number> lc, <number> uc)
6532 <number> lc = tex.getlccode (<number> n)
6533 tex.setuccode (['global'], <number> n, <number> uc)
6534 tex.setuccode (['global'], <number> n, <number> uc, <number> lc)
6535 <number> uc = tex.getuccode (<number> n)
6536 \stopfunctioncall
6538 The function call interface for \type{catcode} also allows you to
6539 specify a category table to use on assignment or on query (default in
6540 both cases is the current one):
6542 \startfunctioncall
6543 tex.setcatcode (['global'], <number> n, <number> c)
6544 tex.setcatcode (['global'], <number> cattable, <number> n, <number> c)
<number> c = tex.getcatcode (<number> n)
<number> c = tex.getcatcode (<number> cattable, <number> n)
6547 \stopfunctioncall
6550 The interfaces for \type{delcode} and \type{mathcode} use small array tables to
6551 set and retrieve values:
6553 \startfunctioncall
6554 tex.setmathcode (['global'], <number> n, <table> mval )
6555 <table> mval = tex.getmathcode (<number> n)
6556 tex.setdelcode (['global'], <number> n, <table> dval )
6557 <table> dval = tex.getdelcode (<number> n)
6558 \stopfunctioncall
6560 Where the table for \type{mathcode} is an array of 3 numbers, like this:
6562 \starttyping
6563 {<number> mathclass, <number> family, <number> character}
6564 \stoptyping
6566 And the table for \type{delcode} is an array with 4 numbers, like this:
6568 \starttyping
6569 {<number> small_fam, <number> small_char, <number> large_fam, <number> large_char}
6570 \stoptyping
6572 Normally, the third and fourth values in a delimiter code assignment
6573 will be zero according to \tex{Udelcode} usage, but the returned table can have
6574 values there (if the delimiter code was set using \type{\delcode}, for
example). Unset \type{delcode}s can be recognized because
6576 \type{dval[1]} is $-1$.
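
The following sketch exercises a few of the calls listed above. The
character codes are only examples, and the variable names are made up.

\starttyping
-- make '@' (64) a letter (category code 11) in the current table
tex.setcatcode(64, 11)

-- set the lccode of 'A' (65) and its uccode in one call
tex.setlccode(65, 97, 65)

-- a math code comes back as an array of three numbers
local m = tex.getmathcode(43)       -- '+'
local mathclass, family, character = m[1], m[2], m[3]

-- an unset delimiter code has -1 as its first value
local d = tex.getdelcode(40)        -- '('
if d[1] == -1 then
   texio.write_nl("log", "no delimiter code set for '('")
end
\stoptyping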
6578 \subsection{Box registers}
6580 It is possible to set and query actual boxes, using the node
6581 interface as defined in the \luatex{node} library:
6583 \starttyping
6584 tex.box
6585 \stoptyping
6587 for array access, or
6589 \starttyping
6590 tex.setbox(<number> n, <node> s)
6591 tex.setbox(<string> cs, <node> s)
6592 tex.setbox('global', <number> n, <node> s)
6593 tex.setbox('global', <string> cs, <node> s)
6594 <node> n = tex.getbox(<number> n)
6595 <node> n = tex.getbox(<string> cs)
6596 \stoptyping
6598 for function|-|based access.
6599 In the function-based interface, it is possible to define values
6600 globally by using the string \type{'global'} as the first function argument.
6602 Be warned that an assignment like
6604 \starttyping
6605 tex.box[0] = tex.box[2]
6606 \stoptyping
does not copy the node list, it just duplicates a node pointer. If
\tex{box2} is later cleared by \TEX\ commands, the contents
of \tex{box0} become invalid as well. To prevent this from
happening, always use \luatex{node.copy_list()} unless you are
assigning to a temporary variable:
6614 \starttyping
6615 tex.box[0] = node.copy_list(tex.box[2])
6616 \stoptyping
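
Read access needs no copy as long as the node is only inspected. A
sketch that reports the dimensions of \tex{box0}, assuming that a void
box is returned as \type{nil}:

\starttyping
local b = tex.getbox(0)       -- the same node as tex.box[0]
if b then
   texio.write_nl("log", "box0: width " .. b.width ..
                         ", height "    .. b.height ..
                         ", depth "     .. b.depth)
else
   texio.write_nl("log", "box0 is void")
end
\stoptyping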
6618 %{\bf note: In previous versions of \LUATEX\ there were also three
6619 %virtual tables called \type{tex.wd}, \type{tex.ht}, and \type{tex.dp}
6620 %along with an associated function call interface. These were
6621 %removed in version 0.63. You should switch to using \type{tex.box[].width}
6622 %etc. instead.}
6624 %If for some reason you want the functionality of these tables back,
6625 %you can add \LUA\ code to do that for you, like this:
6627 %\starttyping
6628 %local box = tex.box
%local wd = {
%  __index    = function(t,k) local bk = box[k] return bk and bk.width or 0 end,
%  __newindex = function(t,k,v) local bk = box[k] if bk then bk.width = v end end,
%}
%
%local ht = {
%  __index    = function(t,k) local bk = box[k] return bk and bk.height or 0 end,
%  __newindex = function(t,k,v) local bk = box[k] if bk then bk.height = v end end,
%}
%
%local dp = {
%  __index    = function(t,k) local bk = box[k] return bk and bk.depth or 0 end,
%  __newindex = function(t,k,v) local bk = box[k] if bk then bk.depth = v end end,
%}
6643 %tex.wd = { } setmetatable(tex.wd,wd)
6644 %tex.ht = { } setmetatable(tex.ht,ht)
6645 %tex.dp = { } setmetatable(tex.dp,dp)
6646 %\stoptyping
6649 \subsection{Math parameters}
6651 It is possible to set and query the internal math parameters
6652 using:
6654 \startfunctioncall
6655 tex.setmath(<string> n, <string> t, <number> n)
6656 tex.setmath('global', <string> n, <string> t, <number> n)
6657 <number> n = tex.getmath(<string> n, <string> t)
6658 \stopfunctioncall
As before, an optional first argument \type{'global'} indicates a
global assignment.
6663 The first string is the parameter name minus the leading \quote{Umath},
6664 and the second string is the style name minus the trailing \quote{style}.
6666 Just to be complete, the values for the math parameter name are:
6668 \starttyping
6669 quad axis operatorsize
6670 overbarkern overbarrule overbarvgap
6671 underbarkern underbarrule underbarvgap
6672 radicalkern radicalrule radicalvgap
6673 radicaldegreebefore radicaldegreeafter radicaldegreeraise
6674 stackvgap stacknumup stackdenomdown
6675 fractionrule fractionnumvgap fractionnumup
6676 fractiondenomvgap fractiondenomdown fractiondelsize
6677 limitabovevgap limitabovebgap limitabovekern
6678 limitbelowvgap limitbelowbgap limitbelowkern
6679 underdelimitervgap underdelimiterbgap
6680 overdelimitervgap overdelimiterbgap
6681 subshiftdrop supshiftdrop subshiftdown
6682 subsupshiftdown subtopmax supshiftup
6683 supbottommin supsubbottommax subsupvgap
6684 spaceafterscript connectoroverlapmin
6685 ordordspacing ordopspacing ordbinspacing ordrelspacing
6686 ordopenspacing ordclosespacing ordpunctspacing ordinnerspacing
6687 opordspacing opopspacing opbinspacing oprelspacing
6688 opopenspacing opclosespacing oppunctspacing opinnerspacing
6689 binordspacing binopspacing binbinspacing binrelspacing
6690 binopenspacing binclosespacing binpunctspacing bininnerspacing
6691 relordspacing relopspacing relbinspacing relrelspacing
6692 relopenspacing relclosespacing relpunctspacing relinnerspacing
6693 openordspacing openopspacing openbinspacing openrelspacing
6694 openopenspacing openclosespacing openpunctspacing openinnerspacing
6695 closeordspacing closeopspacing closebinspacing closerelspacing
6696 closeopenspacing closeclosespacing closepunctspacing closeinnerspacing
6697 punctordspacing punctopspacing punctbinspacing punctrelspacing
6698 punctopenspacing punctclosespacing punctpunctspacing punctinnerspacing
6699 innerordspacing inneropspacing innerbinspacing innerrelspacing
6700 inneropenspacing innerclosespacing innerpunctspacing innerinnerspacing
6701 \stoptyping
6703 The values for the style parameter name are:
6705 \starttyping
6706 display crampeddisplay
6707 text crampedtext
6708 script crampedscript
6709 scriptscript crampedscriptscript
6710 \stoptyping
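
Combining one name from each list gives calls like the ones sketched
below. The values are arbitrary, and it is an assumption of this
example that the dimension|-|like parameters are expressed in scaled
points.

\starttyping
-- query the quad for text style
local q = tex.getmath('quad', 'text')

-- set the axis height for display style, locally and globally
tex.setmath('axis', 'display', 5 * 65536)
tex.setmath('global', 'axis', 'display', 5 * 65536)
\stoptyping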
6713 \subsection{Special list heads}
6715 The virtual table \luatex{tex.lists} contains the set of internal
6716 registers that keep track of building page lists.
6719 \starttabulate[|lT|p|]
6720 \NC \bf field \NC \bf description \NC \NR
6721 \NC page_ins_head \NC circular list of pending insertions \NC \NR
6722 \NC contrib_head \NC the recent contributions \NC \NR
6723 \NC page_head \NC the current page content\NC \NR
6724 %\NC temp_head \NC \NC \NR
6725 \NC hold_head \NC used for held-over items for next page\NC \NR
6726 \NC adjust_head \NC head of the current \tex{vadjust} list \NC \NR
6727 \NC pre_adjust_head \NC head of the current \tex{vadjust pre} list\NC \NR
6728 % \NC align_head \NC \NC \NR
6729 \stoptabulate
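
As a sketch, the recent contributions can be inspected with the
traversal functions of the \luatex{node} library. The \type{nil} check
is there because a list head is assumed to be \type{nil} when the list
is empty.

\starttyping
local head  = tex.lists.contrib_head
local count = 0
if head then
   for n in node.traverse(head) do
      count = count + 1
      texio.write_nl("log", "contribution " .. count .. ": " .. node.type(n.id))
   end
end
\stoptyping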
6731 \subsection{Semantic nest levels (0.51)}
6733 The virtual table \luatex{tex.nest} contains the currently active
6734 semantic nesting state. It has two main parts: a zero-based array of
6735 userdata for the semantic nest itself, and the numerical value
6736 \type{tex.nest.ptr}, which gives the highest available index. Neither
6737 the array items in \type{tex.nest[]} nor \type{tex.nest.ptr} can be
6738 assigned to (as this would confuse the typesetting engine beyond
6739 repair), but you can assign to the individual values inside the array
6740 items, e.g. \type{tex.nest[tex.nest.ptr].prevdepth}.
6742 \type{tex.nest[tex.nest.ptr]} is the current nest state, \type{tex.nest[0]}
6743 the outermost (main vertical list) level.
6745 The known fields are:
6747 \starttabulate[|lT|l|l|p|]
6748 \NC \ssbf key \NC \bf type \NC \bf modes \NC \bf explanation \NC\NR
6749 \NC mode \NC number \NC all \NC The current mode. This is a number representing the
6750 main mode at this level:\crlf
                                      0 = no mode (this happens during \type{\write}),\crlf
                                      1 = vertical,\crlf
                                      127 = horizontal,\crlf
                                      253 = display math,\crlf
                                      $-1$ = internal vertical,\crlf
                                      $-127$ = restricted horizontal,\crlf
                                      $-253$ = inline math.\NC\NR
6758 \NC modeline \NC number \NC all \NC source input line where this mode was entered in,
6759 negative inside the output routine.\NC\NR
6760 \NC head \NC node \NC all \NC the head of the current list\NC\NR
6761 \NC tail \NC node \NC all \NC the tail of the current list\NC\NR
6762 \NC prevgraf \NC number \NC vmode \NC number of lines in the previous paragraph\NC\NR
6763 \NC prevdepth \NC number \NC vmode \NC depth of the previous paragraph (equal to \type{\pdfignoreddimen}
6764 when it is to be ignored)\NC\NR
6765 \NC spacefactor \NC number \NC hmode \NC the current space factor\NC\NR
6766 \NC dirs \NC node \NC hmode \NC used for temporary storage by the line break algorithm\NC\NR
6767 \NC noad \NC node \NC mmode \NC used for temporary storage of a pending fraction numerator,
6768 for \type{\over} etc.\NC\NR
6769 \NC delimptr \NC node \NC mmode \NC used for temporary storage of the previous math delimiter,
6770 for \type{\middle}.\NC\NR
6771 \NC mathdir \NC boolean \NC mmode \NC true when during math processing the \type{\mathdir} is not
6772 the same as the surrounding \type{\textdir}\NC\NR
6773 \NC mathstyle \NC number \NC mmode \NC the current \type{\mathstyle} \NC\NR
6774 \stoptabulate
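
As a minimal sketch of how the fields above can be read, the following \LUA\
fragment walks all active nest levels and reports the mode and the input line
of each one. It only reads values and uses \type{texio.write_nl} for output;
the message format is just an illustration.

\starttyping
-- walk from the outermost level (0) to the current one (tex.nest.ptr)
for i = 0, tex.nest.ptr do
  local level = tex.nest[i]
  texio.write_nl("nest level " .. i ..
                 ": mode " .. level.mode ..
                 ", entered at line " .. level.modeline)
end
\stoptyping
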
6777 \subsection[sec:luaprint]{Print functions}
6779 The \luatex{tex} table also contains the three print functions that
6780 are the major interface from \LUA\ scripting to \TEX.
6782 The arguments to these three functions are all stored in an in|-|memory
6783 virtual file that is fed to the \TEX\ scanner as the result of the
6784 expansion of \tex{directlua}.
6786 The total amount of returnable text from a \tex{directlua} command
6787 is only limited by available system \RAM. However, each separate
6788 printed string has to fit completely in \TEX's input buffer.
6790 The result of using these functions from inside callbacks is undefined
6791 at the moment.
6793 \subsubsection{\luatex{tex.print}}
6795 \startfunctioncall
6796 tex.print(<string> s, ...)
6797 tex.print(<number> n, <string> s, ...)
6798 tex.print(<table> t)
6799 tex.print(<number> n, <table> t)
6800 \stopfunctioncall
6802 Each string argument is treated by \TEX\ as a separate input line.
6803 If there is a table argument instead of a list of strings, this has to
6804 be a consecutive array of strings to print (the first non-string value
6805 will stop the printing process). This syntax was added in 0.36.
6807 The optional parameter can be used to print the strings using the
6808 catcode regime defined by \tex{catcodetable}~\type{n}. If \type{n} is
6809 $-1$, the currently active catcode regime is used. If \type{n} is
6810 $-2$, the resulting catcodes are the result of \type{\the\toks}: all
category codes are 12 (other) except for the space character, which has
6812 category code 10 (space). Otherwise, if \type{n} is not
6813 a valid catcode table, then it is ignored, and the currently
6814 active catcode regime is used instead.
The very last string of the very last \luatex{tex.print()} command in a
\tex{directlua} will not have the \tex{endlinechar} appended; all
others will.
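
The call forms described above could be used as in the following sketch. It
is meant to be loaded from a \LUA\ file (inside \tex{directlua} the backslash
and percent characters would need extra care), and the strings are arbitrary
examples.

\starttyping
-- two strings, so two separate input lines
tex.print("first line", "second line")

-- the same, passed as an array of strings
tex.print({"first line", "second line"})

-- explicit catcode regime: -1 selects the currently active regime
tex.print(-1, "read with the current catcodes")

-- -2 gives \the\toks-like catcodes, so $, _ and & are not special here
tex.print(-2, "$x_1$ & friends")
\stoptyping
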
6820 \subsubsection{\luatex{tex.sprint}}
6822 \startfunctioncall
6823 tex.sprint(<string> s, ...)
6824 tex.sprint(<number> n, <string> s, ...)
6825 tex.sprint(<table> t)
6826 tex.sprint(<number> n, <table> t)
6827 \stopfunctioncall
6829 Each string argument is treated by \TEX\ as a special kind of input line
6830 that makes it suitable for use as a partial line input mechanism:
6832 \startitemize[packed]
6833 \item \TEX\ does not switch to the \quote{new line} state, so
6834 that leading spaces are not ignored.
6835 \item No \tex{endlinechar} is inserted.
6836 \item Trailing spaces are not removed.
Note that this does not prevent \TEX\ itself from eating spaces as
a result of interpreting the line. For example, in
6841 \starttyping
6842 before\directlua{tex.sprint("\\relax")tex.sprint(" inbetween")}after
6843 \stoptyping
6845 the space before \type{inbetween} will be gobbled as a result of
6846 the \quote{normal} scanning of \tex{relax}.
6847 \stopitemize
6849 If there is a table argument instead of a list of strings, this has to
6850 be a consecutive array of strings to print (the first non-string value
6851 will stop the printing process). This syntax was added in 0.36.
6853 The optional argument sets the catcode regime, as with \type{tex.print()}.
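
Because no \tex{endlinechar} is inserted, several calls can build up a single
piece of input. A minimal sketch, meant for a \LUA\ file; the macro name
\type{\answer} is made up for this example:

\starttyping
-- the three partial lines are read by TeX as \def\answer{42}
tex.sprint("\\def\\answer{")
tex.sprint("42")
tex.sprint("}")
\stoptyping

With \type{tex.print()}, each of these strings would be treated as a separate
input line instead.
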
6855 \subsubsection{\luatex{tex.tprint}}
6857 \startfunctioncall
6858 tex.tprint({<number> n, <string> s, ...}, {...})
6859 \stopfunctioncall
6861 This function is basically a shortcut for repeated calls to
6862 \luatex{tex.sprint(<number> n, <string> s, ...)}, once for each of
6863 the supplied argument tables.
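
A small sketch of this equivalence, with arbitrary example strings (in real
code you would use only one of the two variants, otherwise the text is
printed twice):

\starttyping
-- one call with two argument tables ...
tex.tprint({-2, "a_b^c"}, {-1, " versus $a_b^c$"})

-- ... has the same effect as these two calls
tex.sprint(-2, "a_b^c")
tex.sprint(-1, " versus $a_b^c$")
\stoptyping
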
6865 \subsubsection{\luatex{tex.write}}
6867 \startfunctioncall
6868 tex.write(<string> s, ...)
6869 tex.write(<table> t)
6870 \stopfunctioncall
6872 Each string argument is treated by \TEX\ as a special kind of input
6873 line that makes it suitable for use as a quick way to dump
6874 information:
6876 \startitemize
6877 \item All catcodes on that line are either \quote{space} (for '~') or
6878 \quote{character} (for all others).
6879 \item There is no \tex{endlinechar} appended.
6880 \stopitemize
6882 If there is a table argument instead of a list of strings, this has to
6883 be a consecutive array of strings to print (the first non-string value
6884 will stop the printing process). This syntax was added in 0.36.
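
For example, a string that contains characters that are normally special to
\TEX\ can be passed on as is. This is only a sketch with an arbitrary string,
intended for a \LUA\ file:

\starttyping
-- %, $, _ and & are not interpreted as special characters here
tex.write("100% of $x_1$ & more")
\stoptyping
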
6887 \subsection{Helper functions}
6889 \subsubsection{\luatex{tex.round}}
6891 \startfunctioncall
6892 <number> n = tex.round(<number> o)
6893 \stopfunctioncall
6895 Rounds \LUA\ number \type{o}, and returns a number that is in the range
6896 of a valid \TEX\ register value. If the number starts out of range, it
generates a \quote{number too big} error as well.
6899 \subsubsection{\luatex{tex.scale}}
6901 \startfunctioncall
6902 <number> n = tex.scale(<number> o, <number> delta)
<table> n = tex.scale(<table> o, <number> delta)
6904 \stopfunctioncall
6906 Multiplies the \LUA\ numbers \type{o} and \type{delta}, and returns a
6907 rounded number that is in the range of a valid \TEX\ register value.
6908 In the table version, it creates a copy of the table with all numeric
top||level values scaled in that manner. If the multiplied number(s) are
out of range, it generates \quote{number too big} error(s) as well.
6912 Note: the precision of the output of this function will depend on your
6913 computer's architecture and operating system, so use with care! An
6914 interface to \LUATEX's internal, 100\% portable scale function will be
6915 added at a later date.
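
Both call forms could be used as in the following sketch. The values are
arbitrary: 655360 and 327680 are 10pt and 5pt expressed in scaled points, and
the field names \type{width} and \type{height} are made up for the example.

\starttyping
-- scale a single value: 10pt (in scaled points) times 1.2
local bigger = tex.scale(655360, 1.2)

-- scale all numeric top-level values of a table; the result is a new table
local dims   = { width = 655360, height = 327680 }
local scaled = tex.scale(dims, 1.2)
texio.write_nl("width: " .. scaled.width ..
               "sp, height: " .. scaled.height .. "sp")
\stoptyping
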
6917 \subsubsection{\luatex{tex.sp} (0.51)}
6919 \startfunctioncall
6920 <number> n = tex.sp(<number> o)
6921 <number> n = tex.sp(<string> s)
6922 \stopfunctioncall
6924 Converts the number \type{o} or a string \type{s} that represents
6925 an explicit dimension into an integer number of scaled points.
6927 For parsing the string, the same scanning and conversion rules are used
6928 that \LUATEX\ would use if it was scanning a dimension specifier in
6929 its \TEX-like input language (this includes generating errors for bad
values), except for the following:
6932 \startitemize[n]
6933 \item only explicit values are allowed, control sequences are not handled
6934 \item infinite dimension units (\type{fil...}) are forbidden
6935 \item \type{mu} units do not generate an error (but may not be useful either)
6936 \stopitemize
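
A few illustrative calls (a sketch only; the variable names are arbitrary):

\starttyping
local one_pt = tex.sp("1pt")    -- 65536, because 1pt is 65536 scaled points
local width  = tex.sp("2.5cm")  -- string form with an explicit unit
local n      = tex.sp(65536)    -- number form

-- not allowed, see the list above: infinite units such as "1fil"
\stoptyping
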
6938 \subsubsection{\luatex{tex.definefont}}
6940 \startfunctioncall
6941 tex.definefont(<string> csname, <number> fontid)
6942 tex.definefont(<boolean> global, <string> csname, <number> fontid)
6943 \stopfunctioncall
6945 Associates \type{csname} with the internal font number \type{fontid}.
6946 The definition is global if (and only if) \type{global} is specified
6947 and true (the setting of \type{globaldefs} is not taken into account).
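
As a sketch, the following associates two new control sequences with the
currently active font. It assumes that \type{font.current()} from the
\type{font} library is used to obtain a font id; the csnames are made up.

\starttyping
local id = font.current()   -- id of the currently active font

tex.definefont("examplefont", id)              -- local: \examplefont
tex.definefont(true, "globalexamplefont", id)  -- global: \globalexamplefont
\stoptyping
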
6950 \subsubsection{\luatex{tex.error} (0.61)}
6952 \startfunctioncall
6953 tex.error(<string> s)
6954 tex.error(<string> s, <table> help)
6955 \stopfunctioncall
6957 This creates an error somewhat like the combination of \tex{errhelp}
6958 and \tex{errmessage} would. During this error, deletions are disabled.
6960 The array part of the \type{help} table has to contain strings,
6961 one for each line of error help.
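
A minimal sketch of the second call form, with invented message and help
texts:

\starttyping
tex.error("Unknown option 'foo'", {
  "The option you gave is not recognized.",
  "It will be ignored if you continue."
})
\stoptyping
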
6964 \subsubsection{\luatex{tex.hashtokens} (0.25)}
6966 \startfunctioncall
6967 for i,v in pairs (tex.hashtokens()) do ... end
6968 \stopfunctioncall
6970 Returns a name and token table pair (see~\in{section}[luatokens] about
6971 token tables) iterator for every non-zero entry in the hash table.
6972 This can be useful for debugging, but note that this also reports
6973 control sequences that may be unreachable at this moment due to local
6974 redefinitions: it is strictly a dump of the hash table.
\subsection[luaprimitives]{Functions for dealing with primitives}
6978 \subsubsection{\luatex{tex.enableprimitives}}
6980 \startfunctioncall
6981 tex.enableprimitives(<string> prefix, <table> primitive names)
6982 \stopfunctioncall
6984 This function accepts a prefix string and an array of primitive names.
For each combination of \quote{prefix} and \quote{name},
\type{tex.enableprimitives} first verifies that \quote{name} is
6988 an actual primitive (it must be returned by one of the
6989 \type{tex.extraprimitives()} calls explained below, or part of
6990 \TEX82, or \type{\directlua}). If it is not,
6991 \type{tex.enableprimitives} does nothing and skips to the next pair.
6993 But if it is, then it will construct a csname variable by concatenating the
6994 \quote{prefix} and \quote{name}, unless the \quote{prefix} is already the actual
6995 prefix of \quote{name}. In the latter case, it will discard the \quote{prefix},
6996 and just use \quote{name}.
6998 Then it will check for the existence of the constructed csname.
6999 If the csname is currently undefined (note: that is not the same as
7000 \type{\relax}), it will globally define the csname to have the
7001 meaning: run code belonging to the primitive \quote{name}. If for some
7002 reason the csname is already defined, it does nothing and tries the
7003 next pair.
7005 An example:
7007 \starttyping
7008 tex.enableprimitives('LuaTeX', {'formatname'})
7009 \stoptyping
7011 will define \type{\LuaTeXformatname} with the same intrinsic meaning
7012 as the documented primitive \type{\formatname}, provided that the
control sequence \type{\LuaTeXformatname} is currently undefined.
7015 Second example:
7017 \starttyping
7018 tex.enableprimitives('Omega',tex.extraprimitives ('omega'))
7019 \stoptyping
7021 will define a whole series of csnames like \type{\Omegatextdir},
7022 \type{\Omegapardir}, etc., but it will stick with \type{\OmegaVersion}
7023 instead of creating the doubly-prefixed \type{\OmegaOmegaVersion}.
7025 Starting with version 0.39.0 (and this is why the above two functions
7026 are needed), \LUATEX\ in \type{--ini} mode contains only the \TEX82
7027 primitives and \type{\directlua}, no extra primitives {\bf at all}.
7029 So, if you want to have all the new functionality available using
7030 their default names, as it is now, you will have to add
7032 \starttyping
\ifx\directlua\undefined \else
    \directlua {tex.enableprimitives('',tex.extraprimitives ())}
\fi
7036 \stoptyping
7038 near the beginning of your format generation file. Or you can choose
7039 different prefixes for different subsets, as you see fit.
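
As a sketch of that second approach, the following keeps the default names for
the \type{'etex'} and \type{'pdftex'} subsets and gives the \type{'luatex'}
subset a prefix. The choice of subsets and the prefix \type{luatex} are only
an illustration; the \type{'omega'} and \type{'aleph'} subsets are not enabled
at all here.

\starttyping
\ifx\directlua\undefined \else
    \directlua {
        tex.enableprimitives('', tex.extraprimitives('etex', 'pdftex'))
        tex.enableprimitives('luatex', tex.extraprimitives('luatex'))
    }
\fi
\stoptyping
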
7041 Calling some form of \type{tex.enableprimitives()} is highly important
7042 though, because if you do not, you will end up with a \TEX82-lookalike
that can run \LUA\ code but not do much else. The defined csnames are
7044 (of course) saved in the format and will be available at runtime.
7047 \subsubsection{\luatex{tex.extraprimitives}}
7049 \startfunctioncall
7050 <table> t = tex.extraprimitives(<string> s, ...)
7051 \stopfunctioncall
7053 This function returns a list of the primitives that originate
7054 from the engine(s) given by the requested string value(s). The
7055 possible values and their (current) return values are:
7057 \startluacode
function out_prim (a)
  local v = tex.extraprimitives(a)
  table.sort(v)
  for _,n in pairs(v) do
    if n == ' ' then
      n = '\\normalcontrolspace'
    end
    tex.print(n .. '\\hskip 4pt plus 5em')
  end
end
7068 \stopluacode
7070 \starttabulate[|l|p|]
7071 \NC \bf name\NC \bf values \NC \NR
7072 \NC tex \NC \ctxlua{out_prim('tex') } \NC \NR
7073 \NC core \NC \ctxlua{out_prim('core') } \NC \NR
7074 \NC etex \NC \ctxlua{out_prim('etex') } \NC \NR
7075 \NC pdftex \NC \ctxlua{out_prim('pdftex') } \NC \NR
7076 \NC omega \NC \ctxlua{out_prim('omega') } \NC \NR
7077 \NC aleph \NC \ctxlua{out_prim('aleph') } \NC \NR
7078 \NC luatex \NC \ctxlua{out_prim('luatex') } \NC \NR
7079 \NC umath \NC \ctxlua{out_prim('umath') } \NC \NR
7080 \stoptabulate
7082 Note that \type{'luatex'} does not contain \type{directlua}, as that is
7083 considered to be a core primitive, along with all the \TEX82
7084 primitives, so it is part of the list that is returned from \type{'core'}.
\type{'umath'} is a subset of \type{'luatex'} that covers the Unicode math
primitives. It was added in \LUATEX\ 0.75.0 because it might be desirable to
handle the prefixing of that subset differently.
Running \type{tex.extraprimitives()} will give you the complete list
of primitives that are not defined at \LUATEX\ 0.39.0 \type{--ini}
startup. It is exactly equivalent to \type{tex.extraprimitives('etex',
'pdftex', 'omega', 'aleph', 'luatex')}.
7095 \subsubsection{\luatex{tex.primitives}}
7097 \startfunctioncall
7098 <table> t = tex.primitives()
7099 \stopfunctioncall
7101 This function returns a hash table listing all primitives that \LUATEX\
knows about. The keys in the hash are primitive names, the values are
tables representing tokens (see~\in{section}[luatokens]). The third entry
of each token table is always zero.
7106 \subsection{Core functionality interfaces}
7108 \subsubsection{\luatex{tex.badness} (0.53)}
7110 \startfunctioncall
7111 <number> b = tex.badness(<number> t, <number> s)
7112 \stopfunctioncall
7114 This helper function is useful
7115 during linebreak calculations. \type{t} and \type{s} are scaled values; the function
7116 returns the badness for when total \type{t} is supposed to be made from amounts
that sum to \type{s}. The returned number is a reasonable approximation of $100(t/s)^3$.
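
As a worked example of the formula, take a total of 1pt that has to be made
from amounts that sum to 2pt. The sketch below (for a \LUA\ file, with
arbitrary variable names) compares the formula with the value returned by the
function, which should be close to it.

\starttyping
local t = 65536     -- 1pt in scaled points
local s = 131072    -- 2pt in scaled points

local approx = 100 * (t / s) ^ 3   -- 100 * 0.5^3 = 12.5
local b      = tex.badness(t, s)

texio.write_nl("tex.badness: " .. b .. ", formula: " .. approx)
\stoptyping
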
7119 \subsubsection{\luatex{tex.linebreak} (0.53)}
7121 \startfunctioncall
7122 local <node> nodelist, <table> info =
7123 tex.linebreak(<node> listhead, <table> parameters)
7124 \stopfunctioncall
7126 The understood parameters are as follows:
7128 \starttabulate[|l|l|p|]
7129 \NC \bf name \NC \bf type \NC \bf description \NC \NR
7130 \NC pardir \NC string \NC \NC \NR
7131 \NC pretolerance \NC number \NC \NC \NR
7132 \NC tracingparagraphs \NC number \NC \NC \NR
7133 \NC tolerance \NC number \NC \NC \NR
7134 \NC looseness \NC number \NC \NC \NR
7135 \NC hyphenpenalty \NC number \NC \NC \NR
7136 \NC exhyphenpenalty \NC number \NC \NC \NR
7137 \NC pdfadjustspacing \NC number \NC \NC \NR
7138 \NC adjdemerits \NC number \NC \NC \NR
7139 \NC pdfprotrudechars \NC number \NC \NC \NR
7140 \NC linepenalty \NC number \NC \NC \NR
7141 \NC lastlinefit \NC number \NC \NC \NR
7142 \NC doublehyphendemerits \NC number \NC \NC \NR
7143 \NC finalhyphendemerits \NC number \NC \NC \NR
7144 \NC hangafter \NC number \NC \NC \NR
7145 \NC interlinepenalty \NC number or table \NC if a table, then it is an array like \type{\interlinepenalties}\NC \NR
7146 \NC clubpenalty \NC number or table \NC if a table, then it is an array like \type{\clubpenalties}\NC \NR
7147 \NC widowpenalty \NC number or table \NC if a table, then it is an array like \type{\widowpenalties}\NC \NR
7148 \NC brokenpenalty \NC number \NC \NC \NR
7149 \NC emergencystretch \NC number \NC in scaled points \NC \NR
7150 \NC hangindent \NC number \NC in scaled points \NC \NR
7151 \NC hsize \NC number \NC in scaled points \NC \NR
7152 \NC leftskip \NC glue_spec node \NC \NC \NR
7153 \NC rightskip \NC glue_spec node \NC \NC \NR
7154 \NC pdfeachlineheight \NC number \NC in scaled points \NC \NR
7155 \NC pdfeachlinedepth \NC number \NC in scaled points \NC \NR
7156 \NC pdffirstlineheight \NC number \NC in scaled points \NC \NR
7157 \NC pdflastlinedepth \NC number \NC in scaled points \NC \NR
7158 \NC pdfignoreddimen \NC number \NC in scaled points \NC \NR
7159 \NC parshape \NC table \NC \NC \NR
7160 \stoptabulate
Note that there is no interface for \type{\displaywidowpenalties}; you
have to pass the right choice for \type{widowpenalty} yourself.
7165 The meaning of the various keys should be fairly obvious from the
7166 table (the names match the \TEX\ and \PDFTEX\ primitives) except for
7167 the last 5 entries. The four \type{pdf...line...} keys are ignored if
7168 their value equals \type{pdfignoreddimen}.
7170 It is your own job to make sure that \type{listhead} is a proper
7171 paragraph list: this function does not add any nodes to it. To be
7172 exact, if you want to replace the core line breaking, you may have to
7173 do the following (when you are not actually working in the
7174 \type{pre_linebreak_filter} or \type{linebreak_filter} callbacks, or when the
original list starting at \type{listhead} was generated in horizontal mode):
7177 \startitemize
7178 \item add an \quote{indent box} and perhaps a \type{local_par} node at
7179 the start (only if you need them)
7180 \item replace any found final glue by an infinite penalty (or add such
7181 a penalty, if the last node is not a glue)
7182 \item add a glue node for the \type{\parfillskip} after that penalty node
7183 \item make sure all the \type{prev} pointers are OK
7184 \stopitemize
The result is a node list; it still needs to be vpacked if you
want to assign it to a \tex{vbox}.
7190 The returned \type{info} table contains four values that are all numbers:
7192 \starttabulate[|l|p|]
7193 \NC prevdepth \NC depth of the last line in the broken paragraph \NC \NR
7194 \NC prevgraf \NC number of lines in the broken paragraph \NC \NR
7195 \NC looseness \NC the actual looseness value in the broken paragraph \NC \NR
7196 \NC demerits \NC the total demerits of the chosen solution \NC \NR
7197 \stoptabulate
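
Putting the pieces together, a call could look like the following sketch. It
assumes that \type{head} already is a proper paragraph list as described
above, and that \type{node.vpack} from the \type{node} library is used for
the packaging; the function name and the parameter values are made up.

\starttyping
-- `head` must already satisfy the requirements listed above
-- (final penalty, \parfillskip glue, correct prev pointers)
local function break_paragraph(head)
  local lines, info = tex.linebreak(head, {
    hsize     = tex.sp("10cm"),
    tolerance = 1000,
  })
  texio.write_nl("lines: " .. info.prevgraf ..
                 ", demerits: " .. info.demerits)
  -- pack the resulting lines so that they can be stored in a box
  return node.vpack(lines)
end
\stoptyping
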
7199 Note there are a few things you cannot interface using this function:
7200 You cannot influence font expansion other than via
7201 \type{pdfadjustspacing}, because the settings for that take place
7202 elsewhere. The same is true for hbadness and hfuzz etc. All these are
7203 in the \type{hpack()} routine, and that fetches its own variables via
7204 globals.
7206 \subsubsection{\luatex{tex.shipout} (0.51)}
7208 \startfunctioncall
7209 tex.shipout(<number> n)
7210 \stopfunctioncall
7212 Ships out box number \type{n} to the output file, and clears the box
7213 register.
7216 \section[texconfig]{The \luatex{texconfig} table}
7218 This is a table that is created empty. A startup \LUA\ script could
7219 fill this table with a number of settings that are read out by
7220 the executable after loading and executing the startup file.
7222 \starttabulate[|lT|l|l|p|]
7223 \NC \ssbf key \NC \bf type \NC \bf default \NC \bf explanation \NC\NR
7224 \NC kpse_init \NC boolean \NC true \NC \type{false} totally disables \KPATHSEA\ initialisation,
7225 and enables interpretation of the following numeric key--value pairs.
7226 (only ever unset this if you implement {\it all\/} file
7227 find callbacks!)\NC \NR
7228 \NC shell_escape \NC string\NC \type{'f'}\NC Use \type{'y'} or \type{'t'} or \type{'1'} to enable \type{\write18} unconditionally,
7229 \type{'p'} to enable the commands that are listed in \type{shell_escape_commands} (new in 0.37)\NC\NR
7230 \NC shell_escape_commands \NC string\NC \NC Comma-separated list of command names that may be executed by \type{\write18} even
7231 if \type{shell_escape} is set to \type{'p'}. Do {\it not\/} use spaces around commas,
7232 separate any required command arguments by using a space, and use the ASCII double quote
7233 (\type{"}) for any needed argument or path quoting (new in 0.37)\NC\NR
7234 \NC string_vacancies \NC number\NC 75000\NC cf.\ web2c docs \NC \NR
7235 \NC pool_free \NC number\NC 5000\NC cf.\ web2c docs \NC \NR
7236 \NC max_strings \NC number\NC 15000\NC cf.\ web2c docs \NC \NR
7237 \NC strings_free \NC number\NC 100\NC cf.\ web2c docs \NC \NR
7238 \NC nest_size \NC number\NC 50\NC cf.\ web2c docs \NC \NR
7239 \NC max_in_open \NC number\NC 15\NC cf.\ web2c docs \NC \NR
7240 \NC param_size \NC number\NC 60\NC cf.\ web2c docs \NC \NR
7241 \NC save_size \NC number\NC 4000\NC cf.\ web2c docs \NC \NR
7242 \NC stack_size \NC number\NC 300\NC cf.\ web2c docs \NC \NR
7243 \NC dvi_buf_size \NC number\NC 16384\NC cf.\ web2c docs \NC \NR
7244 \NC error_line \NC number\NC 79\NC cf.\ web2c docs \NC \NR
7245 \NC half_error_line \NC number\NC 50\NC cf.\ web2c docs \NC \NR
7246 \NC max_print_line \NC number\NC 79\NC cf.\ web2c docs \NC \NR
7247 \NC hash_extra \NC number\NC 0\NC cf.\ web2c docs \NC \NR
7248 \NC pk_dpi \NC number\NC 72\NC cf.\ web2c docs \NC \NR
7249 \NC trace_file_names \NC boolean \NC true \NC \type{false} disables \TEX's normal file open|-|close
7250 feedback (the assumption is that callbacks will take care of
7251 that) \NC \NR
7252 \NC file_line_error \NC boolean \NC false \NC do \type{file:line} style error messages\NC \NR
7253 \NC halt_on_error \NC boolean \NC false \NC abort run on the first encountered error\NC \NR
7254 \NC formatname \NC string \NC \NC if no format name was given
7255 on the commandline, this key will be tested first
7256 instead of simply quitting\NC \NR
7257 \NC jobname \NC string \NC \NC if no input file name was given
7258 on the commandline, this key will be tested first
7259 instead of simply giving up\NC \NR
7260 \stoptabulate
7262 {\bf Note:} the numeric values that match web2c parameters are only used if
7263 \type{kpse_init} is explicitly set to \type{false}. In all other cases, the normal values from
7264 \type{texmf.cnf} are used.
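
A startup script could fill the table as in the following sketch. The keys
are taken from the table above; the command names in
\type{shell_escape_commands} are only examples.

\starttyping
-- in the startup Lua script
texconfig.file_line_error       = true
texconfig.halt_on_error         = true
texconfig.shell_escape          = 'p'
texconfig.shell_escape_commands = 'kpsewhich,bibtex'
\stoptyping
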
7266 \section{The \luatex{texio} library}
7268 This library takes care of the low|-|level I/O interface.
7270 \subsection{Printing functions}
7272 \subsubsection{\luatex{texio.write}}
7274 \startfunctioncall
7275 texio.write(<string> target, <string> s, ...)
7276 texio.write(<string> s, ...)
7277 \stopfunctioncall
7279 Without the \type{target} argument, writes all given strings to the same
7280 location(s) \TEX\ writes messages to at this moment. If
7281 \tex{batchmode} is in effect, it writes only to the log,
7282 otherwise it writes to the log and the terminal.
7283 The optional \type{target} can be one of three possibilities:
7284 \type{term}, \type{log} or \type {term and log}.
7286 Note: If several strings are given, and if the first of these strings
7287 is or might be one of the targets above, the \type{target} must be
7288 specified explicitly to prevent \LUA\ from interpreting the first
7289 string as the target.
7291 \subsubsection{\luatex{texio.write_nl}}
7293 \startfunctioncall
7294 texio.write_nl(<string> target, <string> s, ...)
7295 texio.write_nl(<string> s, ...)
7296 \stopfunctioncall
This function behaves like \luatex{texio.write}, but makes sure that the given strings will
7299 appear at the beginning of a new line. You can pass a single empty string
7300 if you only want to move to the next line.
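
The following sketch shows both printing functions with and without an
explicit target; the message texts are arbitrary.

\starttyping
texio.write("log", "this goes to the log file only")
texio.write_nl("term and log", "this starts on a new line in both places")

-- the first string here is text, not a target, so name the target explicitly
texio.write("term and log", "log", " is the name of a target")

-- only move to the next line
texio.write_nl("")
\stoptyping
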
7302 %***********************************************************************
7304 \section[luatokens]{The \luatex{token} library}
7306 The \luatex{token} table contains interface functions to \TEX's
7307 handling of tokens. These functions are most useful when combined with
7308 the \luatex{token_filter} callback, but they could be used standalone
7309 as well.
7311 A token is represented in \LUA\ as a small table. For the moment, this
7312 table consists of three numeric entries:
7314 \starttabulate[|l|l|p|]
7315 \NC \bf index\NC \bf meaning \NC \bf description \NC \NR
7316 \NC 1 \NC command code \NC this is a value between~$0$ and~$130$ (approximately)\NC \NR
7317 \NC 2 \NC command modifier \NC this is a value between~$0$ and~$2^{21}$ \NC \NR
7318 \NC 3 \NC control sequence id \NC for commands that are not the result of control
7319 sequences, like letters and characters, it is zero,
7320 otherwise, it is a number pointing into the \quote
7321 {equivalence table} \NC \NR
7322 \stoptabulate
7324 \subsection{\luatex{token.get_next}}
7326 \startfunctioncall
<token> t = token.get_next()
7328 \stopfunctioncall
7330 This fetches the next input token from the current input source,
7331 without expansion.
7333 \subsection{\luatex{token.is_expandable}}
7335 \startfunctioncall
7336 <boolean> b = token.is_expandable(<token> t)
7337 \stopfunctioncall
7339 This tests if the token \type{t} could be expanded.
7341 \subsection{\luatex{token.expand}}
7343 \startfunctioncall
7344 token.expand(<token> t)
7345 \stopfunctioncall
7347 If a token is expandable, this will expand one level of it, so that
7348 the first token of the expansion will now be the next token to be read
7349 by \luatex{token.get_next()}.
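
A rough sketch of how the three functions above could be combined to fetch
the next unexpandable token. The function name is made up, and whether such a
loop is appropriate depends on the context it is called from (for instance a
\type{token_filter} callback).

\starttyping
local function next_unexpandable()
  local t = token.get_next()
  while token.is_expandable(t) do
    token.expand(t)        -- the expansion is now at the front of the input
    t = token.get_next()
  end
  return t
end
\stoptyping
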
7351 \subsection{\luatex{token.is_activechar}}
7353 \startfunctioncall
7354 <boolean> b = token.is_activechar(<token> t)
7355 \stopfunctioncall
7357 This is a special test that is sometimes handy. Discovering whether
7358 some control sequence is the result of an active character turned out
7359 to be very hard otherwise.
7361 \subsection{\luatex{token.create}}
7363 \startfunctioncall
<token> t = token.create(<string> csname)
<token> t = token.create(<number> charcode)
<token> t = token.create(<number> charcode, <number> catcode)
7367 \stopfunctioncall
7369 This is the token factory. If you feed it a string, then it is the
7370 name of a control sequence (without leading backslash), and it will be
7371 looked up in the equivalence table.
If you feed it a number, then this is assumed to be an input character,
7374 and an optional second number gives its category code. This means it
7375 is possible to overrule a character's category code, with a few
7376 exceptions: the category codes~0 (escape), 9~(ignored), 13~(active),
7377 14~(comment), and 15 (invalid) cannot occur inside a token. The values~0, 9, 14
7378 and~15 are therefore illegal as input to \luatex{token.create()}, and
7379 active characters will be resolved immediately.
7381 Note: unknown string sequences and never defined active characters
7382 will result in a token representing an \quote{undefined control sequence}
7383 with a near|-|random name. It is {\em not} possible to define brand
7384 new control sequences using \luatex{token.create}!
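
The call forms could be used as in this sketch (variable names are arbitrary;
\type{token.command_name} is described in the next section):

\starttyping
-- a control sequence, looked up by name (no leading backslash)
local relax = token.create("relax")
texio.write_nl(token.command_name(relax))   -- name of its command code

-- the character "a", no category code given
local a = token.create(string.byte("a"))

-- the same character, with category code 12 (other)
local a_other = token.create(string.byte("a"), 12)
\stoptyping
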
7386 \subsection{\luatex{token.command_name}}
7388 \startfunctioncall
7389 <string> commandname = token.command_name(<token> t)
7390 \stopfunctioncall
7392 This returns the name associated with the \quote{command} value of the token
7393 in \LUATEX. There is not always a direct connection between these names and
7394 primitives. For instance, all \tex{ifxxx} tests are grouped under
7395 \type {if_test}, and the \quote{command modifier} defines which test is to be run.
7397 \subsection{\luatex{token.command_id}}
7399 \startfunctioncall
7400 <number> i = token.command_id(<string> commandname)
7401 \stopfunctioncall
This is the inverse of the previous function: it returns the number that
belongs to the given command name, to be used as the first item in a token table.
7406 \subsection{\luatex{token.csname_name}}
7408 \startfunctioncall
7409 <string> csname = token.csname_name(<token> t)
7410 \stopfunctioncall
7412 This returns the name associated with the \quote{equivalence table} value of
7413 the token in \LUATEX. It returns the string value of the command used
7414 to create the current token, or an empty string if there is no
7415 associated control sequence.
7417 Keep in mind that there are potentially two control sequences that
7418 return the same csname string: single character control sequences
7419 and active characters have the same \quote{name}.
7421 \subsection{\luatex{token.csname_id}}
7423 \startfunctioncall
7424 <number> i = token.csname_id(<string> csname)
7425 \stopfunctioncall
This is the inverse of the previous function: it returns the number that
belongs to the given csname, to be used as the third item in a token table.
7431 \chapter[math]{Math}
7433 The handling of mathematics in \LUATEX\ differs quite a bit from how
7434 \TEX82 (and therefore \PDFTEX) handles math. First, \LUATEX\ adds primitives and
7435 extends some others so that \UNICODE\ input can be used easily. Second, all
7436 of \TEX82's internal special values (for example for operator spacing) have
7437 been made accessible and changeable via control sequences. Third, there are
7438 extensions that make it easier to use \OPENTYPE\ math fonts. And finally,
7439 there are some extensions that have been proposed in the past that are now
7440 added to the engine.
7442 \section{The current math style}
7444 Starting with \LUATEX\ 0.39.0, it is possible to discover the math
7445 style that will be used for a formula in an expandable fashion
7446 (while the math list is still being read). To make this possible,
7447 \LUATEX\ adds the new primitive: \type{\mathstyle}. This is a
7448 \quote{convert command} like e.g. \type{\romannumeral}: its value can
7449 only be read, not set.
7451 \subsection{\tex{mathstyle}}
7453 The returned value is between 0 and 7 (in math mode), or $-1$
7454 (all other modes). For easy testing, the eight math style commands
have been altered so that they can be used as numeric values, so you
7456 can write code like this:
7458 \starttyping
7459 \ifnum\mathstyle=\textstyle
7460 \message{normal text style}
7461 \else \ifnum\mathstyle=\crampedtextstyle
7462 \message{cramped text style}
7463 \fi \fi
7464 \stoptyping
7466 \subsection{\tex{Ustack}}
7468 There are a few math commands in \TEX\ where the style that will be used
7469 is not known straight from the start. These commands (\tex{over},
7470 \tex{atop}, \tex{overwithdelims}, \tex{atopwithdelims}) would
7471 therefore normally return wrong values for \type{\mathstyle}. To
7472 fix this, \LUATEX\ introduces a special prefix command:
7473 \type{\Ustack}:
7475 \starttyping
7476 $\Ustack {a \over b}$
7477 \stoptyping
7479 The \type{\Ustack} command will scan the next brace and start a new
7480 math group with the correct (numerator) math style.
7482 \section{Unicode math characters}
7484 Character handling is now extended up to the full \UNICODE\ range
7485 (the \type{\U} prefix), which is compatible with \XETEX.
7487 The math primitives from \TEX\ are kept as they are, except for
7488 the ones that convert from input to math commands: \type{mathcode},
7489 and \type{delcode}. These two now allow
7490 for a 21-bit character argument on the left hand side of the equals sign.
7492 Some of the new \LUATEX\ primitives read
7493 more than one separate value. This is shown in the tables below by a plus
7494 sign in the second column.
7496 The input for such primitives would look like this:
7498 \starttyping
7499 \def\overbrace {\Umathaccent 0 1 "23DE }
7500 \stoptyping
7503 Altered \TEX82 primitives:
7505 \starttabulate[|l|l|l|]
7506 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7507 \NC \tex{mathcode} \NC 0--10FFFF = 0--8000 \NC\NR
7508 \NC \tex{delcode} \NC 0--10FFFF = 0--FFFFFF \NC\NR
7509 \stoptabulate
7511 Unaltered:
7513 \starttabulate[|l|l|l|]
7514 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7515 \NC \tex{mathchardef} \NC 0--8000 \NC\NR
7516 \NC \tex{mathchar} \NC 0--7FFF \NC\NR
7517 \NC \tex{mathaccent} \NC 0--7FFF \NC\NR
7518 \NC \tex{delimiter} \NC 0--7FFFFFF \NC\NR
7519 \NC \tex{radical} \NC 0--7FFFFFF \NC\NR
7520 \stoptabulate
7522 New primitives that are compatible with \XETEX:
7524 \starttabulate[|l|l|l|l|]
7525 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7526 \NC \tex{Umathchardef} \NC 0+0+0--7+FF+10FFFF$^1$ \NC\NR
7527 \NC \tex{Umathcharnumdef}$^5$ \NC -80000000--7FFFFFFF$^3$ \NC\NR
7528 \NC \tex{Umathcode} \NC 0--10FFFF = 0+0+0--7+FF+10FFFF$^1$ \NC\NR
7529 \NC \tex{Udelcode} \NC 0--10FFFF = 0+0--FF+10FFFF$^2$ \NC\NR
7530 \NC \tex{Umathchar} \NC 0+0+0--7+FF+10FFFF \NC\NR
7531 \NC \tex{Umathaccent} \NC 0+0+0--7+FF+10FFFF$^{2,4}$ \NC\NR
7532 \NC \tex{Udelimiter} \NC 0+0+0--7+FF+10FFFF$^2$ \NC\NR
7533 \NC \tex{Uradical} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7534 \NC \tex{Umathcharnum} \NC -80000000--7FFFFFFF$^3$ \NC\NR
7535 \NC \tex{Umathcodenum} \NC 0--10FFFF = -80000000--7FFFFFFF$^3$ \NC\NR
7536 \NC \tex{Udelcodenum} \NC 0--10FFFF = -80000000--7FFFFFFF$^3$ \NC\NR
7537 \stoptabulate
7539 Note 1: \type{\Umathchardef<csname>="8"0"0} and \type{\Umathchardef<number>="8"0"0}
7540 are also accepted.
7542 Note 2: The new primitives that deal with delimiter-style objects do not
7543 set up a \quote{large family}. Selecting a suitable size for display
7544 purposes is expected to be dealt with by the font via the
\tex{Umathoperatorsize} parameter (more information can be found in a following section).
7547 Note 3: For these three primitives, all information is packed into a single
7548 signed integer. For the first two (\tex{Umathcharnum} and
7549 \tex{Umathcodenum}), the lowest 21 bits are the character code, the 3
7550 bits above that represent the math class, and the family data is kept in
the topmost bits (this means that the values for math families 128--255 are
7552 actually negative). For \tex{Udelcodenum} there is no math class; the
7553 math family information is stored in the bits directly on top of the
7554 character code. Using these three commands is not as natural as using the
7555 two- and three-value commands, so unless you know exactly what you are
7556 doing and absolutely require the speedup resulting from the faster input
7557 scanning, it is better to use the verbose commands instead.
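
The packing described in Note~3 can be worked out by hand. The sketch below
assumes exactly the layout given there (character code in the lowest 21
bits, math class in the next 3 bits, family above that); the chosen
character is only an example.

\starttyping
% packed value = char + class * "200000 + family * "1000000
%   class 1, family 0, char "2211  ->  "2211 + "200000 = "202211
$\Umathchar    1 0 "2211$
$\Umathcharnum "202211$
% for \Udelcodenum there is no class field:
%   packed value = char + family * "200000
\stoptyping

Under that reading both formulas request the same math character; the
three-value form is clearly the easier one to read.
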
7559 Note 4: As of \LUATEX\ 0.65, \tex{Umathaccent} accepts optional
7560 keywords to control various details regarding math accents. See
7561 \in{section}[mathacc] below for details.
7563 Note 5: \tex{Umathcharnumdef} was added in release 0.72.
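
The following lines sketch how the multi-value primitives from the table
are typically written. The control sequence name \type{\mysum} is made up
for this example, and the family numbers assume that family~0 holds a font
that actually contains the requested characters.

\starttyping
\Umathchardef\mysum = 1 0 "2211      % class 1, family 0, U+2211
\Umathcode "2212    = 2 0 "2212      % U+2212 as a binary operator
\Udelcode  "28      = 0 "28          % family 0, U+0028 (no large family)
$\Udelimiter 4 0 "28 x \Udelimiter 5 0 "29$
$\Uradical 0 "221A {x}$
\stoptyping
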
7566 New primitives that exist in \LUATEX\ only (all of these will be explained
7567 in following sections):
7570 \starttabulate[|l|l|l|l|]
7571 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7572 \NC \tex{Uroot} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7573 \NC \tex{Uoverdelimiter} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7574 \NC \tex{Uunderdelimiter} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7575 \NC \tex{Udelimiterover} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7576 \NC \tex{Udelimiterunder} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7577 \stoptabulate
7579 \section{Cramped math styles}
7581 \LUATEX\ has four new primitives to set the cramped math styles
7582 directly:
7584 \starttyping
7585 \crampeddisplaystyle
7586 \crampedtextstyle
7587 \crampedscriptstyle
7588 \crampedscriptscriptstyle
7589 \stoptyping
7591 These additional commands are not all that valuable on their own, but
7592 they come in handy as arguments to the math parameter settings that
7593 will be added shortly.
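
A minimal sketch of both uses: the first line switches to a cramped style
inside a formula, the second one passes a cramped style as the argument of
a math parameter (the value \type{3pt} is arbitrary).

\starttyping
$ x^2 \quad {\crampedtextstyle x^2} $
\Umathsupshiftup\crampedtextstyle=3pt
\stoptyping
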
7595 \section{Math parameter settings}
7597 In \LUATEX, the font dimension parameters that \TEX\ used in math
7598 typesetting are now accessible via primitive commands. In fact,
7599 refactoring of the math engine has resulted in many more parameters
7600 than were accessible before.
7602 \starttabulate
7603 \NC \bf primitive name \NC \bf description \NC \NR
7604 \NC \type{\Umathquad} \NC the width of 18mu's\NC \NR
7605 \NC \type{\Umathaxis} \NC height of the vertical center axis of
7606 the math formula above the baseline\NC \NR
7607 \NC \type{\Umathoperatorsize} \NC minimum size of large operators in display mode \NC \NR
7608 \NC \type{\Umathoverbarkern} \NC vertical clearance above the rule \NC \NR
7609 \NC \type{\Umathoverbarrule} \NC the width of the rule \NC \NR
7610 \NC \type{\Umathoverbarvgap} \NC vertical clearance below the rule \NC \NR
7611 \NC \type{\Umathunderbarkern} \NC vertical clearance below the rule \NC \NR
7612 \NC \type{\Umathunderbarrule} \NC the width of the rule \NC \NR
7613 \NC \type{\Umathunderbarvgap} \NC vertical clearance above the rule \NC \NR
7614 \NC \type{\Umathradicalkern} \NC vertical clearance above the rule \NC \NR
7615 \NC \type{\Umathradicalrule} \NC the width of the rule \NC \NR
7616 \NC \type{\Umathradicalvgap} \NC vertical clearance below the rule \NC \NR
7617 \NC \type{\Umathradicaldegreebefore}\NC the forward kern that takes place before placement of
7618 the radical degree \NC \NR
7619 \NC \type{\Umathradicaldegreeafter} \NC the backward kern that takes place after placement of
7620 the radical degree \NC \NR
7621 \NC \type{\Umathradicaldegreeraise} \NC this is the percentage of the total height and depth of
7622 the radical sign that the degree is raised by. It is
7623 expressed in \type{percents}, so 60\% is expressed as the
7624 integer $60$.\NC \NR
7625 \NC \type{\Umathstackvgap} \NC vertical clearance between the two
7626 elements in a \type{\atop} stack \NC \NR
7627 \NC \type{\Umathstacknumup} \NC numerator shift upward in \type{\atop} stack \NC \NR
7628 \NC \type{\Umathstackdenomdown} \NC denominator shift downward in \type{\atop} stack\NC \NR
7629 \NC \type{\Umathfractionrule} \NC the width of the rule in a \type{\over}\NC \NR
7630 \NC \type{\Umathfractionnumvgap} \NC vertical clearance between the numerator and the rule\NC \NR
7631 \NC \type{\Umathfractionnumup} \NC numerator shift upward in \type{\over} \NC \NR
7632 \NC \type{\Umathfractiondenomvgap} \NC vertical clearance between the denominator and the rule\NC \NR
7633 \NC \type{\Umathfractiondenomdown} \NC denominator shift downward in \type{\over} \NC \NR
7634 \NC \type{\Umathfractiondelsize} \NC minimum delimiter size for \type{\...withdelims}\NC \NR
7635 \NC \type{\Umathlimitabovevgap} \NC vertical clearance for limits above operators\NC \NR
7636 \NC \type{\Umathlimitabovebgap} \NC vertical baseline clearance for limits above operators\NC \NR
7637 \NC \type{\Umathlimitabovekern} \NC space reserved at the top of the limit\NC \NR
7638 \NC \type{\Umathlimitbelowvgap} \NC vertical clearance for limits below operators\NC \NR
7639 \NC \type{\Umathlimitbelowbgap} \NC vertical baseline clearance for limits below operators\NC \NR
7640 \NC \type{\Umathlimitbelowkern} \NC space reserved at the bottom of the limit\NC \NR
7641 \NC \type{\Umathoverdelimitervgap} \NC vertical clearance for limits above delimiters\NC \NR
7642 \NC \type{\Umathoverdelimiterbgap} \NC vertical baseline clearance for limits above delimiters\NC \NR
7643 \NC \type{\Umathunderdelimitervgap} \NC vertical clearance for limits below delimiters\NC \NR
7644 \NC \type{\Umathunderdelimiterbgap} \NC vertical baseline clearance for limits below delimiters\NC \NR
7645 \NC \type{\Umathsubshiftdrop} \NC subscript drop for boxes and subformulas\NC \NR
7646 \NC \type{\Umathsubshiftdown} \NC subscript drop for characters\NC \NR
7647 \NC \type{\Umathsupshiftdrop} \NC superscript drop (raise, actually) for boxes and subformulas\NC \NR
7648 \NC \type{\Umathsupshiftup} \NC superscript raise for characters\NC \NR
7649 \NC \type{\Umathsubsupshiftdown} \NC subscript drop in the presence of a superscript\NC \NR
7650 \NC \type{\Umathsubtopmax} \NC the top of standalone subscripts cannot be higher than this above the baseline\NC \NR
7651 \NC \type{\Umathsupbottommin} \NC the bottom of standalone superscripts cannot be less than this above the baseline\NC \NR
\NC \type{\Umathsupsubbottommax} \NC the bottom of the superscript of a combined super- and subscript
must be at least as high as this above the baseline\NC \NR
7654 \NC \type{\Umathsubsupvgap} \NC vertical clearance between super- and subscript\NC \NR
7655 \NC \type{\Umathspaceafterscript} \NC additional space added after a super- or subscript\NC \NR
7656 \NC \type{\Umathconnectoroverlapmin}\NC minimum overlap between parts in an extensible recipe\NC \NR
7657 \stoptabulate
7659 Each of the parameters in this section can be set by a command like this:
7661 \starttyping
7662 \Umathquad\displaystyle=1em
7663 \stoptyping
They obey grouping, and you can use \type{\the\Umathquad\displaystyle} if needed.
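
The grouping behaviour can be observed with a small test like the one
below, which is only a sketch: the value \type{3pt} is arbitrary and the
messages merely report what \type{\the} returns.

\starttyping
\begingroup
  \Umathaxis\textstyle=3pt
  \message{inside: \the\Umathaxis\textstyle}
\endgroup
\message{outside: \the\Umathaxis\textstyle}
\stoptyping

The first message reports the locally assigned value; after
\type{\endgroup} the value that was in effect before the group is reported
again.
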
7667 \section{Font-based Math Parameters}
7669 While it is nice to have these math parameters available for tweaking, it
7670 would be tedious to have to set each of them by hand. For this reason,
7671 \LUATEX\ initializes a bunch of these parameters whenever you assign a font
7672 identifier to a math family based on either the traditional math font
7673 dimensions in the font (for assignments to math family~2 and~3 using
7674 \TFM|-|based fonts like \type{cmsy} and \type{cmex}), or based on the named
7675 values in a potential \type{MathConstants} table when the font is loaded
7676 via Lua. If there is a \type{MathConstants} table, this takes precedence
7677 over font dimensions, and in that case no attention is paid to which
family is being assigned to: the \type{MathConstants} table in the last
assigned family sets all parameters.
7681 In the table below, the one-letter style abbreviations and symbolic tfm
font dimension names match those used in the \TeX book. Assignments to
7683 \tex{textfont} set the values for the cramped and uncramped display and
7684 text styles. Use \tex{scriptfont} for the script styles, and
7685 \tex{scriptscriptfont} for the scriptscript styles (totalling eight
7686 parameters for three font sizes). In the \TFM\ case, assignments only happen
7687 in family~2 and family~3 (and of course only for the parameters for which
7688 there are font dimensions).
7690 Besides the parameters below, \LUATEX\ also looks at the \quote{space}
7691 font dimension parameter. For math fonts, this should be set to zero.
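
As a sketch of the \TFM\ case, the assignments below should initialize the
family~2 based parameters for the text and script styles. The font
identifiers \type{\mysy} and \type{\mysysmall} are made up for this
example, and it is assumed that the \type{cmsy} metrics can be found.

\starttyping
\font\mysy=cmsy10
\font\mysysmall=cmsy7
\textfont2=\mysy         % display and text styles, cramped and uncramped
\scriptfont2=\mysysmall  % script styles
\message{\the\Umathaxis\textstyle, \the\Umathaxis\scriptstyle}
\stoptyping
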
7693 \start
7695 \switchtobodyfont[8pt]
7697 \starttabulate[|l|l|l|p|]
7698 \NC \bf variable \NC \bf style \NC \bf default value opentype \NC \bf default value tfm \NC\NR
7699 \NC \tex{Umathaxis} \NC -- \NC AxisHeight \NC axis_height \NC\NR
7700 \NC \tex{Umathoperatorsize} \NC D, D' \NC DisplayOperatorMinHeight \NC $^6$ \NC\NR
7701 \NC \tex{Umathfractiondelsize} \NC D, D' \NC FractionDelimiterDisplayStyleSize$^9$ \NC delim1 \NC\NR
7702 \NC " \NC T, T', S, S', SS, SS' \NC FractionDelimiterSize$^9$ \NC delim2 \NC\NR
7703 \NC \tex{Umathfractiondenomdown}\NC D, D' \NC FractionDenominatorDisplayStyleShiftDown \NC denom1 \NC\NR
7704 \NC " \NC T, T', S, S', SS, SS' \NC FractionDenominatorShiftDown \NC denom2 \NC\NR
7705 \NC \tex{Umathfractiondenomvgap}\NC D, D' \NC FractionDenominatorDisplayStyleGapMin \NC 3*default_rule_thickness \NC\NR
7706 \NC " \NC T, T', S, S', SS, SS' \NC FractionDenominatorGapMin \NC default_rule_thickness \NC\NR
7707 \NC \tex{Umathfractionnumup} \NC D, D' \NC FractionNumeratorDisplayStyleShiftUp \NC num1 \NC\NR
7708 \NC " \NC T, T', S, S', SS, SS' \NC FractionNumeratorShiftUp \NC num2 \NC\NR
7709 \NC \tex{Umathfractionnumvgap} \NC D, D' \NC FractionNumeratorDisplayStyleGapMin \NC 3*default_rule_thickness \NC\NR
7710 \NC " \NC T, T', S, S', SS, SS' \NC FractionNumeratorGapMin \NC default_rule_thickness \NC\NR
7711 \NC \tex{Umathfractionrule} \NC -- \NC FractionRuleThickness \NC default_rule_thickness \NC\NR
7712 \NC \tex{Umathlimitabovebgap} \NC -- \NC UpperLimitBaselineRiseMin \NC big_op_spacing3 \NC\NR
7713 \NC \tex{Umathlimitabovekern} \NC -- \NC 0$^1$ \NC big_op_spacing5 \NC\NR
7714 \NC \tex{Umathlimitabovevgap} \NC -- \NC UpperLimitGapMin \NC big_op_spacing1 \NC\NR
7715 \NC \tex{Umathlimitbelowbgap} \NC -- \NC LowerLimitBaselineDropMin \NC big_op_spacing4 \NC\NR
7716 \NC \tex{Umathlimitbelowkern} \NC -- \NC 0$^1$ \NC big_op_spacing5 \NC\NR
7717 \NC \tex{Umathlimitbelowvgap} \NC -- \NC LowerLimitGapMin \NC big_op_spacing2 \NC\NR
7718 \NC \tex{Umathoverdelimitervgap}\NC -- \NC StretchStackGapBelowMin \NC big_op_spacing1 \NC\NR
7719 \NC \tex{Umathoverdelimiterbgap}\NC -- \NC StretchStackTopShiftUp \NC big_op_spacing3 \NC\NR
7720 \NC \tex{Umathunderdelimitervgap}\NC-- \NC StretchStackGapAboveMin \NC big_op_spacing2 \NC\NR
7721 \NC \tex{Umathunderdelimiterbgap}\NC-- \NC StretchStackBottomShiftDown \NC big_op_spacing4 \NC\NR
7722 \NC \tex{Umathoverbarkern} \NC -- \NC OverbarExtraAscender \NC default_rule_thickness \NC\NR
7723 \NC \tex{Umathoverbarrule} \NC -- \NC OverbarRuleThickness \NC default_rule_thickness \NC\NR
7724 \NC \tex{Umathoverbarvgap} \NC -- \NC OverbarVerticalGap \NC 3*default_rule_thickness \NC\NR
7725 \NC \tex{Umathquad} \NC -- \NC <font_size(f)>$^1$ \NC math_quad \NC\NR
7726 \NC \tex{Umathradicalkern} \NC -- \NC RadicalExtraAscender \NC default_rule_thickness \NC\NR
7727 \NC \tex{Umathradicalrule} \NC -- \NC RadicalRuleThickness \NC <not set>$^2$ \NC\NR
7728 \NC \tex{Umathradicalvgap} \NC D, D' \NC RadicalDisplayStyleVerticalGap \NC (default_rule_thickness+\crlf
7729 (abs(math_x_height)/4))$^3$ \NC\NR
7730 \NC " \NC T, T', S, S', SS, SS' \NC RadicalVerticalGap \NC (default_rule_thickness+\crlf
7731 (abs(default_rule_thickness)/4))$^3$ \NC\NR
7732 \NC \tex{Umathradicaldegreebefore}\NC -- \NC RadicalKernBeforeDegree \NC <not set>$^2$ \NC\NR
7733 \NC \tex{Umathradicaldegreeafter}\NC -- \NC RadicalKernAfterDegree \NC <not set>$^2$ \NC\NR
7734 \NC \tex{Umathradicaldegreeraise}\NC -- \NC RadicalDegreeBottomRaisePercent \NC <not set>$^{2,7}$ \NC\NR
7735 \NC \tex{Umathspaceafterscript} \NC -- \NC SpaceAfterScript \NC script_space$^4$ \NC\NR
7736 \NC \tex{Umathstackdenomdown} \NC D, D' \NC StackBottomDisplayStyleShiftDown \NC denom1 \NC\NR
7737 \NC " \NC T, T', S, S', SS, SS' \NC StackBottomShiftDown \NC denom2 \NC\NR
7738 \NC \tex{Umathstacknumup} \NC D, D' \NC StackTopDisplayStyleShiftUp \NC num1 \NC\NR
7739 \NC " \NC T, T', S, S', SS, SS' \NC StackTopShiftUp \NC num3 \NC\NR
7740 \NC \tex{Umathstackvgap} \NC D, D' \NC StackDisplayStyleGapMin \NC 7*default_rule_thickness \NC\NR
7741 \NC " \NC T, T', S, S', SS, SS' \NC StackGapMin \NC 3*default_rule_thickness \NC\NR
7742 \NC \tex{Umathsubshiftdown} \NC -- \NC SubscriptShiftDown \NC sub1 \NC\NR
7743 \NC \tex{Umathsubshiftdrop} \NC -- \NC SubscriptBaselineDropMin \NC sub_drop \NC\NR
7744 \NC \tex{Umathsubsupshiftdown} \NC -- \NC SubscriptShiftDownWithSuperscript$^8$ \NC \NC\NR
7745 \NC \NC \NC \quad\ or SubscriptShiftDown \NC sub2 \NC\NR
7746 \NC \tex{Umathsubtopmax} \NC -- \NC SubscriptTopMax \NC (abs(math_x_height * 4) / 5) \NC\NR
7747 \NC \tex{Umathsubsupvgap} \NC -- \NC SubSuperscriptGapMin \NC 4*default_rule_thickness \NC\NR
7748 \NC \tex{Umathsupbottommin} \NC -- \NC SuperscriptBottomMin \NC (abs(math_x_height) / 4) \NC\NR
7749 \NC \tex{Umathsupshiftdrop} \NC -- \NC SuperscriptBaselineDropMax \NC sup_drop \NC\NR
7750 \NC \tex{Umathsupshiftup} \NC D \NC SuperscriptShiftUp \NC sup1 \NC\NR
\NC " \NC T, S, SS \NC SuperscriptShiftUp \NC sup2 \NC\NR
7752 \NC " \NC D', T', S', SS' \NC SuperscriptShiftUpCramped \NC sup3 \NC\NR
7753 \NC \tex{Umathsupsubbottommax} \NC -- \NC SuperscriptBottomMaxWithSubscript \NC (abs(math_x_height * 4) / 5) \NC\NR
7754 \NC \tex{Umathunderbarkern} \NC -- \NC UnderbarExtraDescender \NC default_rule_thickness \NC\NR
7755 \NC \tex{Umathunderbarrule} \NC -- \NC UnderbarRuleThickness \NC default_rule_thickness \NC\NR
7756 \NC \tex{Umathunderbarvgap} \NC -- \NC UnderbarVerticalGap \NC 3*default_rule_thickness \NC\NR
7757 \NC \tex{Umathconnectoroverlapmin}\NC -- \NC MinConnectorOverlap \NC 0$^5$ \NC\NR
7758 \stoptabulate
7760 \stop
7762 Note 1: \OPENTYPE\ fonts set \tex{Umathlimitabovekern} and
7763 \tex{Umathlimitbelowkern} to zero and set \tex{Umathquad} to the font size of the used font,
because these are not supported in the MATH table.
7766 Note 2: \TFM\ fonts do not set \tex{Umathradicalrule} because \TeX82\ uses the height of the radical
7767 instead. When this parameter is indeed not set when \LUATEX\ has to typeset a radical, a backward
7768 compatibility mode will kick in that assumes that an oldstyle \TeX\ font is used. Also, they do
7769 not set \tex{Umathradicaldegreebefore}, \tex{Umathradicaldegreeafter}, and
7770 \tex{Umathradicaldegreeraise}. These are then automatically initialized to $5/18$quad, $-10/18$quad, and 60.
7772 Note 3: If tfm fonts are used, then the \tex{Umathradicalvgap} is not set until the first time
7773 \LUATEX\ has to typeset a formula because this needs parameters from both family2 and family3.
7774 This provides a partial backward compatibility with \TEX82, but that compatibility is only partial:
7775 once the \tex{Umathradicalvgap} is set, it will not be recalculated any more.
Note 4: (also if tfm fonts are used) A similar situation arises with respect to \tex{Umathspaceafterscript}: it is not
7778 set until the first time \LUATEX\ has to typeset a formula. This provides some backward compatibility with
7779 \TEX82. But once the \tex{Umathspaceafterscript} is set, \tex{scriptspace} will never be looked at again.
7781 Note 5: Tfm fonts set \tex{Umathconnectoroverlapmin} to zero because
7782 \TeX82\ always stacks extensibles without any overlap.
7784 Note 6: The \tex{Umathoperatorsize} is only used in \type{\displaystyle}, and is only set
7785 in \OPENTYPE\ fonts. In \TFM\ font mode, it is artificially set to one scaled point more than the
initial attempt's size, so that the \quote{first next} will always be tried, just like in \TEX82.
7788 Note 7: The \tex{Umathradicaldegreeraise} is a special case because it is the only parameter that is
7789 expressed in a percentage instead of as a number of scaled points.
7791 Note 8: \type{SubscriptShiftDownWithSuperscript} does not actually exist in the \quote{standard}
7792 Opentype Math font Cambria, but it is useful enough to be added. New in version 0.38.
7794 Note 9: \type{FractionDelimiterDisplayStyleSize} and \type{FractionDelimiterSize} do not actually exist in the \quote{standard}
7795 Opentype Math font Cambria, but were useful enough to be added. New in version 0.47.
7798 \section{Math spacing setting}
7800 Besides the parameters mentioned in the previous sections, there are
7801 also 64 new primitives to control the math spacing table (as explained in
7802 Chapter~18 of the \TeX book). The primitive names are a simple matter
7803 of combining two math atom types, but for completeness' sake, here is
7804 the whole list:
7806 \startcolumns[n=2]
7807 \starttyping
7808 \Umathordordspacing
7809 \Umathordopspacing
7810 \Umathordbinspacing
7811 \Umathordrelspacing
7812 \Umathordopenspacing
7813 \Umathordclosespacing
7814 \Umathordpunctspacing
7815 \Umathordinnerspacing
7816 \Umathopordspacing
7817 \Umathopopspacing
7818 \Umathopbinspacing
7819 \Umathoprelspacing
7820 \Umathopopenspacing
7821 \Umathopclosespacing
7822 \Umathoppunctspacing
7823 \Umathopinnerspacing
7824 \Umathbinordspacing
7825 \Umathbinopspacing
7826 \Umathbinbinspacing
7827 \Umathbinrelspacing
7828 \Umathbinopenspacing
7829 \Umathbinclosespacing
7830 \Umathbinpunctspacing
7831 \Umathbininnerspacing
7832 \Umathrelordspacing
7833 \Umathrelopspacing
7834 \Umathrelbinspacing
7835 \Umathrelrelspacing
7836 \Umathrelopenspacing
7837 \Umathrelclosespacing
7838 \Umathrelpunctspacing
7839 \Umathrelinnerspacing
7840 \Umathopenordspacing
7841 \Umathopenopspacing
7842 \Umathopenbinspacing
7843 \Umathopenrelspacing
7844 \Umathopenopenspacing
7845 \Umathopenclosespacing
7846 \Umathopenpunctspacing
7847 \Umathopeninnerspacing
7848 \Umathcloseordspacing
7849 \Umathcloseopspacing
7850 \Umathclosebinspacing
7851 \Umathcloserelspacing
7852 \Umathcloseopenspacing
7853 \Umathcloseclosespacing
7854 \Umathclosepunctspacing
7855 \Umathcloseinnerspacing
7856 \Umathpunctordspacing
7857 \Umathpunctopspacing
7858 \Umathpunctbinspacing
7859 \Umathpunctrelspacing
7860 \Umathpunctopenspacing
7861 \Umathpunctclosespacing
7862 \Umathpunctpunctspacing
7863 \Umathpunctinnerspacing
7864 \Umathinnerordspacing
7865 \Umathinneropspacing
7866 \Umathinnerbinspacing
7867 \Umathinnerrelspacing
7868 \Umathinneropenspacing
7869 \Umathinnerclosespacing
7870 \Umathinnerpunctspacing
7871 \Umathinnerinnerspacing
7872 \stoptyping
7873 \stopcolumns
7875 These parameters are of type \type{\muskip}, so setting a parameter
7876 can be done like this:
7878 \starttyping
7879 \Umathopordspacing\displaystyle=4mu plus 2mu
7880 \stoptyping
7882 They are all initialized by initex to the values mentioned in the
7883 table in Chapter~18 of the \TeX book.
7885 Note 1: for ease of use as well as for backward compatibility, \type{\thinmuskip},
\type{\medmuskip} and \type{\thickmuskip} are treated specially. In their case a pointer to
7887 the corresponding internal parameter is saved, not the actual \type{\muskip} value. This
7888 means that any later changes to one of these three parameters will be taken into account.
7890 Note 2: Careful readers will realise that there are also primitives
7891 for the items marked \type{*} in the \TeX book. These will not
7892 actually be used as those combinations of atoms cannot actually
7893 happen, but it seemed better not to break orthogonality. They are initialized to zero.
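
The effect described in Note~1 can be sketched as follows. A punctuation
atom followed by an ordinary atom gets a thin space in the non-script
styles, so the second formula should pick up the changed
\type{\thinmuskip}. The values are arbitrary.

\starttyping
$a,b$                                 % punct followed by ord: thin space
\thinmuskip=6mu
$a,b$                                 % the wider \thinmuskip is used
\Umathpunctordspacing\textstyle=3mu   % an explicit value, for text style only
\stoptyping
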
7896 \section[mathacc]{Math accent handling}
7898 \LUATEX\ supports both top accents and bottom accents in math mode,
7899 and math accents stretch automatically (if this is supported by the
7900 font the accent comes from, of course). Bottom and combined accents as
7901 well as fixed-width math accents are controlled by optional keywords
7902 following \tex{Umathaccent}.
7904 The keyword \type{bottom} after \tex{Umathaccent} signals that a bottom
7905 accent is needed, and the keyword \type{both} signals that both a top
7906 and a bottom accent are needed (in this case two accents need to be
7907 specified, of course).
7909 Then the set of three integers defining the accent is read. This set
7910 of integers can be prefixed by the \type{fixed} keyword to indicate
7911 that a non-stretching variant is requested (in case of both accents,
7912 this step is repeated).
7914 A simple example:
7915 \starttyping
7916 \Umathaccent both fixed 0 0 "20D7 fixed 0 0 "20D7 {example}
7917 \stoptyping
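
The single-accent variants follow the same pattern. The lines below are a
sketch that assumes family~0 holds a font containing the accents
\type{"20D7} and \type{"20EF}.

\starttyping
$\Umathaccent 0 0 "20D7 {x}$               % top accent, stretches if possible
$\Umathaccent fixed 0 0 "20D7 {xyz}$       % top accent, non-stretching
$\Umathaccent bottom 0 0 "20EF {xyz}$      % bottom accent
$\Umathaccent bottom fixed 0 0 "20EF {xyz}$
\stoptyping
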
7919 If a math top accent has to be placed and the accentee is a character and has a non-zero
7920 \type{top_accent} value, then this value will be used to place the accent instead of
7921 the \type{\skewchar} kern used by \TEX82.
7923 The \type{top_accent} value represents a vertical line somewhere in the accentee. The
7924 accent will be shifted horizontally such that its own \type{top_accent} line coincides
7925 with the one from the accentee. If the \type{top_accent} value of the accent is zero,
7926 then half the width of the accent followed by its italic correction is used instead.
7928 The vertical placement of a top accent depends on the \type{x_height} of the font of the
accentee (as explained in the \TEX book), but if that value turns out to be zero and the
font has a MathConstants table, then \type{AccentBaseHeight} is used instead.
7932 If a math bottom accent has to be placed, the \type{bot_accent} value is checked instead
7933 of \type{top_accent}. Because bottom accents do not exist in \TEX82, the \type{\skewchar}
7934 kern is ignored.
7936 The vertical placement of a bottom accent is straight below the accentee, no correction
7937 takes place.
7939 \section{Math root extension}
7941 The new primitive \type{\Uroot} allows the construction of a radical
7942 noad including a degree field. Its syntax is an extension of \type{\Uradical}:
7944 \starttyping
7945 \Uradical <fam integer> <char integer> <radicand>
7946 \Uroot <fam integer> <char integer> <degree> <radicand>
7947 \stoptyping
7949 The placement of the degree is controlled by the math parameters
7950 \type{\Umathradicaldegreebefore}, \type{\Umathradicaldegreeafter}, and
7951 \type{\Umathradicaldegreeraise}. The degree will be typeset in \type{\scriptscriptstyle}.
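
A minimal usage sketch, assuming that family~0 holds an \OPENTYPE\ math
font with a radical at \type{"221A}; the value \type{50} is arbitrary:

\starttyping
$\Uroot 0 "221A {3} {x+1}$
\Umathradicaldegreeraise\textstyle=50   % a percentage, not a dimension
\stoptyping
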
7954 \section{Math kerning in super- and subscripts}
7956 The character fields in a lua-loaded OpenType math font can have a \quote{mathkern} table.
7957 The format of this table is the same as the \quote{mathkern} table that is returned by
7958 the \type{fontloader} library, except that all height and kern values have to
7959 be specified in actual scaled points.
7961 When a super- or subscript has to be placed next to a math item, \LUATEX\ checks
7962 whether the super- or subscript and the nucleus are both simple character items. If
they are, and if the fonts of both character items are OpenType fonts (as opposed to
7964 legacy \TEX\ fonts), then \LUATEX\ will use the OpenType MATH algorithm for deciding
7965 on the horizontal placement of the super- or subscript.
7967 This works as follows:
7969 \startitemize
7970 \item The vertical position of the script is calculated.
7971 \item The default horizontal position is flat next to the base character.
7972 \item For superscripts, the italic correction of the base character is added.
7973 \item For a superscript, two vertical values are calculated: the bottom of the
7974 script (after shifting up), and the top of the base. For a subscript,
7975 the two values are the top of the (shifted down) script, and the bottom
7976 of the base.
7977 \item For each of these two locations:
7978 \startitemize
7979 \item find the mathkern value at this height for the base
7980 (for a subscript placement, this is the bottom_right corner,
7981 for a superscript placement the top_right corner)
7982 \item find the mathkern value at this height for the script
7983 (for a subscript placement, this is the top_left corner,
7984 for a superscript placement the bottom_left corner)
7985 \item add the found values together to get a preliminary result.
7986 \stopitemize
\item The horizontal kern to be applied is the smaller of the two results from the
previous step.
7989 \stopitemize
7991 The mathkern value at a specific height is the kern value that is specified by the
7992 next higher height and kern pair, or the highest one in the character (if there is no
7993 value high enough in the character), or simply zero (if the character has no mathkern
7994 pairs at all).
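
The lookup rule and the final step can be sketched in Lua. This is only an
illustration: the helper names \type{kern_at} and \type{superscript_kern}
are made up and are not part of any \LUATEX\ library, the corner lists are
assumed to be arrays of height and kern pairs sorted by ascending height,
and \quote{next higher} is read here as strictly greater.

\starttyping
\startluacode
-- Illustration only; these helpers are hypothetical.
-- "corner" is one corner list of a mathkern table (e.g. top_right).
local function kern_at(corner, height)
    if not corner or #corner == 0 then
        return 0                          -- no mathkern pairs at all
    end
    for _, pair in ipairs(corner) do
        if pair.height and pair.height > height then
            return pair.kern              -- the next higher pair
        end
    end
    return corner[#corner].kern           -- nothing high enough: use the highest
end

-- Superscript case: h1 is the bottom of the shifted script, h2 the top
-- of the base. A real implementation has to express each height in the
-- coordinate system of the glyph that is queried; this is ignored here.
local function superscript_kern(base, script, h1, h2)
    local k1 = kern_at(base.top_right, h1) + kern_at(script.bottom_left, h1)
    local k2 = kern_at(base.top_right, h2) + kern_at(script.bottom_left, h2)
    return math.min(k1, k2)
end
\stopluacode
\stoptyping

The subscript case is analogous, using the \type{bottom_right} corner of
the base and the \type{top_left} corner of the script.
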
7996 \section{Scripts on horizontally extensible items like arrows}
7998 The new primitives \tex{Uunderdelimiter} and \tex{Uoverdelimiter}
7999 (both from 0.35) allow the placement of a subscript or superscript on
8000 an automatically extensible item and \tex{Udelimiterunder} and
8001 \tex{Udelimiterover} (both from 0.37) allow the placement of
8002 an automatically extensible item as a subscript or superscript on a
8003 nucleus.
8005 The vertical placements are controlled by
8006 \tex{Umathunderdelimiterbgap}, \tex{Umathunderdelimitervgap},
8007 \tex{Umathoverdelimiterbgap}, and \tex{Umathoverdelimitervgap} in a similar way as limit
placements on large operators. The superscript in \tex{Uoverdelimiter} is typeset in
a suitable scripted style; the subscript in \tex{Uunderdelimiter} is cramped as well.
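
The four primitives can be tried with lines like the following, which are
a sketch that assumes family~0 holds a font in which \type{"2194} is
horizontally extensible:

\starttyping
$\Uoverdelimiter  0 "2194 {\hbox{\strut over}}$
$\Uunderdelimiter 0 "2194 {\hbox{\strut under}}$
$\Udelimiterover  0 "2194 {\hbox{\strut nucleus}}$
$\Udelimiterunder 0 "2194 {\hbox{\strut nucleus}}$
\stoptyping
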
8011 \section {Extensible delimiters}
8013 \LUATEX\ internally uses a structure that supports \OPENTYPE\ \quote{MathVariants} as well
8014 as \TFM\ \quote{extensible recipes}.
8017 \section{Other Math changes}
8019 \subsection {Verbose versions of single-character math commands}
8021 \LUATEX\ defines six new primitives that have the same function as
8022 \type{^}, \type{_}, \type{$}, and \type{$$}. %$
8024 \starttabulate[|l|l|l|l|]
8025 \NC \bf primitive \NC \bf explanation \NC\NR
8026 \NC \tex{Usuperscript} \NC Duplicates the functionality of \type{^} \NC\NR
8027 \NC \tex{Usubscript} \NC Duplicates the functionality of \type{_} \NC\NR
8028 \NC \tex{Ustartmath} \NC Duplicates the functionality of \type{$}, % $
8029 when used in non-math mode. \NC\NR
8030 \NC \tex{Ustopmath} \NC Duplicates the functionality of \type{$}, % $
8031 when used in inline math mode. \NC\NR
8032 \NC \tex{Ustartdisplaymath}\NC Duplicates the functionality of \type{$$}, % $$
8033 when used in non-math mode. \NC\NR
8034 \NC \tex{Ustopdisplaymath} \NC Duplicates the functionality of \type{$$}, % $$
8035 when used in display math mode. \NC\NR
8036 \stoptabulate
8038 All are new in version 0.38. The \tex{Ustopmath} and \tex{Ustopdisplaymath}
8039 primitives check if the current math mode is the correct one (inline
8040 vs. displayed), but you can freely intermix the four mathon|/|mathoff
8041 commands with explicit dollar sign(s).
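
For example, the following lines are a sketch of how the verbose forms can
replace or be mixed with the single-character commands:

\starttyping
\Ustartmath x \Usuperscript 2 \Usubscript i \Ustopmath
\Ustartdisplaymath a \Usuperscript {2} = b \Ustopdisplaymath
$ x \Usuperscript 2 \Ustopmath
\stoptyping
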
8044 \subsection{Allowed math commands in non-math modes}
The commands \type{\mathchar} and \type{\Umathchar}, as well as control
sequences that are the result of \type{\mathchardef} or
\type{\Umathchardef}, are also acceptable in the horizontal and vertical modes.
8049 In those cases, the \type{\textfont} from the requested math family is used.
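
A small sketch, assuming the traditional setup in which family~1 holds a
math italic font with alpha in slot \type{"0B}; the name \type{\myalpha}
is made up for this example:

\starttyping
\mathchardef\myalpha="010B
Running text with \myalpha\ and \mathchar"010B\ outside math mode.
\stoptyping
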
8051 \section{Math todo}
8053 The following items are still todo.
8055 \startitemize
8056 \item Pre-scripts.
8057 \item Multi-story stacks.
8058 \item Flattened accents for high characters (?).
8059 \item Better control over the spacing around displays and handling of equation numbers.
8060 \item Support for multi-line displays using \MATHML\ style alignment points.
8061 \stopitemize
8063 \chapter[languages]{Languages and characters, fonts and glyphs}
8065 \LUATEX's internal handling of the characters and glyphs that eventually
8066 become typeset is quite different from the way \TEX82 handles those
8067 same objects. The easiest way to explain the difference is to focus on
8068 unrestricted horizontal mode (i.\,e.\ paragraphs) and hyphenation first.
8069 Later on, it will be easy to deal with the differences that occur in
8070 horizontal and math modes.
8072 In \TEX82, the characters you type are converted into \type{char_node}
8073 records when they are encountered by the main control loop. \TEX\
8074 attaches and processes the font information while creating those
8075 records, so that the resulting \quote{horizontal list} contains the final
8076 forms of ligatures and implicit kerning. This packaging is needed because
8077 we may want to get the effective width of for instance a horizontal box.
8079 When it becomes necessary to hyphenate words in a paragraph, \TEX\
converts (one word at a time) the \type{char_node} records into a
8081 string array by replacing ligatures with their components and
8082 ignoring the kerning. Then it runs the hyphenation algorithm on this
8083 string, and converts the hyphenated result back into a
8084 \quote{horizontal list} that is consecutively spliced back into
8085 the paragraph stream. Keep in mind that the paragraph may contain unboxed horizontal material,
8086 which then already contains ligatures and kerns and the words therein
8087 are part of the hyphenation process.
8089 The \type{char_node} records are somewhat misnamed, as they are glyph
8090 positions in specific fonts, and therefore not really \quote{characters}
8091 in the linguistic sense. There is no language information inside the
8092 \type{char_node} records. Instead, language information is passed along
8093 using \type{language whatsit} records inside the horizontal list.
8095 In \LUATEX, the situation is quite different. The characters you
8096 type are always converted into \type{glyph_node} records with a
8097 special subtype to identify them as being intended as linguistic
8098 characters. \LUATEX\ stores the needed language information in those
8099 records, but does not do any font|-|related processing at the time of
8100 node creation. It only stores the index of the font.
8102 When it becomes necessary to typeset a paragraph, \LUATEX\ first
8103 inserts all hyphenation points right into the whole node list.
8104 Next, it processes all the font information in the whole list
8105 (creating ligatures and adjusting kerning), and finally it adjusts
8106 all the subtype identifiers so that the records are \quote{glyph
8107 nodes} from now on.
8109 That was the broad overview. The rest of this chapter will deal with the
8110 minutiae of the new process.
8112 \section[charsandglyphs]{Characters and glyphs}
8114 \TEX82 (including \PDFTEX) differentiated between \type{char_node}s
8115 and \type{lig_node}s. The former are simple items that contained
8116 nothing but a \quote{character} and a \quote{font} field, and they
8117 lived in the same memory as tokens did. The latter also contained a
8118 list of components, and a subtype indicating whether this ligature was
8119 the result of a word boundary, and it was stored in the same place as
8120 other nodes like boxes and kerns and glues.
8122 In \LUATEX, these two types are merged into one, somewhat larger
8123 structure called a \type{glyph_node}. Besides having the old
8124 character, font, and component fields, and the new special fields like
8125 \quote{attr} (see~\in{section}[glyphnodes]), these nodes also contain:
8127 \startitemize
8129 \item A subtype, split into four main types:
8131 \startitemize
8132 \item \type{character}, for characters to be hyphenated: the lowest
8133 bit (bit 0) is set to 1.
8134 \item \type{glyph}, for specific font glyphs: the lowest bit
8135 (bit 0) is not set.
8136 \item \type{ligature}, for ligatures (bit 1 is set)
8137 \item \type{ghost}, for \quote{ghost objects} (bit 2 is set)
8138 \stopitemize
8140 The latter two make further use of two extra fields (bits 3 and 4):
8142 \startitemize
8143 \item \type{left}, for ligatures created from a left word boundary and
8144 for ghosts created from \tex{leftghost}
8145 \item \type{right}, for ligatures created from a right word boundary and
8146 for ghosts created from \tex{rightghost}
8147 \stopitemize
8149 For ligatures, both bits can be set at the same time (in case of a single|-|glyph word).
8151 \item \type{glyph_node}s of type \quote{character} also contain language data,
8152 split into four items that were current when the node was created:
8153 the \tex{setlanguage} (15 bits), \tex{lefthyphenmin} (8 bits),
8154 \tex{righthyphenmin} (8 bits), and \tex{uchyph} (1 bit).
8156 \stopitemize
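
The bit layout listed above can be made concrete with a small \LUA\ sketch.
The helpers \type{bit_is_set} and \type{describe_subtype} are hypothetical
names, not part of any \LUATEX\ library; they only decode an integer
according to the bit positions given in the list.

\starttyping
\directlua{
  % Comment lines like this one are removed by TeX before Lua sees the code.
  local function bit_is_set(value, position)
    return math.fmod(math.floor(value / 2^position), 2) == 1
  end
  % Hypothetical helper: split a glyph_node subtype into its flags.
  local function describe_subtype(subtype)
    return {
      character = bit_is_set(subtype, 0),
      ligature  = bit_is_set(subtype, 1),
      ghost     = bit_is_set(subtype, 2),
      left      = bit_is_set(subtype, 3),
      right     = bit_is_set(subtype, 4),
    }
  end
  % 2 + 8 + 16 = 26: a ligature with both boundary bits set.
  local d = describe_subtype(26)
  texio.write_nl(tostring(d.ligature) .. " " ..
                 tostring(d.left) .. " " .. tostring(d.right))
}
\stoptyping

Arithmetic is used instead of a bit library so that the sketch does not
depend on which \LUA\ version the executable embeds.
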
8158 Incidentally, \LUATEX\ allows 16383 separate languages, and words can
8159 be 256 characters long.
8161 Because the \tex{uchyph} value is saved in the actual nodes, its
8162 handling is subtly different from \TEX82: changes to \tex{uchyph}
8163 become effective immediately, not at the end of the current partial
8164 paragraph.
8166 Typeset boxes now always have their language information embedded in
8167 the nodes themselves, so there is no longer a possible dependency on
8168 the surrounding language settings. In \TEX82, a mid-paragraph
8169 statement like \tex{unhbox0} would process the box using the current
8170 paragraph language unless there was a \tex{setlanguage} issued inside
8171 the box. In \LUATEX, all language variables are already frozen.
8174 \section{The main control loop}
8176 In \LUATEX's main loop, almost all input characters that are to be
8177 typeset are converted into \type{glyph_node} records with subtype
8178 \quote{character}, but there are a few small exceptions.
8180 First, the \tex{accent} primitive creates nodes with subtype \quote{glyph}
8181 instead of \quote{character}: one for the actual accent and one for the
8182 accentee. The primary reason for this is that \tex{accent} in \TEX82
8183 is explicitly dependent on the current font encoding, so it would not
8184 make much sense to attach a new meaning to the primitive's name, as
8185 that would invalidate many old documents and macro packages. A
8186 secondary reason is that in \TEX82, \tex{accent} prohibits hyphenation
8187 of the current word. Since in \LUATEX\ hyphenation only takes place on
8188 \quote{character} nodes, it is possible to achieve the same effect.
8190 This change of meaning did happen with \tex{char}, which now generates
8191 \quote{character} nodes, consistent with its changed meaning in \XETEX.
8192 The changed status of \tex{char} is not yet finalized, but if it stays
8193 as it is now, a new primitive \tex{glyph} should be added to directly
8194 insert a font glyph id.
8196 Second, all the results of processing in math mode eventually become
8197 nodes with \quote{glyph} subtypes.
8199 Third, the \ALEPH-derived commands \tex{leftghost} and
8200 \tex{rightghost} create nodes of a third subtype: \quote{ghost}. These nodes
8201 are ignored completely by all further processing until the stage where
8202 inter-glyph kerning is added.
8204 Fourth, automatic discretionaries are handled differently. \TEX82
8205 inserts an empty discretionary after sensing an input character that
8206 matches the \tex{hyphenchar} in the current font. This test is wrong,
8207 in our opinion: whether or not hyphenation takes place should not
8208 depend on the current font; it is a language property.
8210 In \LUATEX, it works like this: if \LUATEX\ senses a string of input
8211 characters that matches the value of the new integer parameter
8212 \tex{exhyphenchar}, it will insert an explicit discretionary after that
8213 series of nodes. Initex sets \tex{exhyphenchar=`\-}.
8214 Incidentally, this is a global parameter instead of a
8215 language-specific one because it may be useful to change the value
8216 depending on the document structure instead of the text language.
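
As a sketch of changing the value according to document structure, the
following input makes the slash act as the explicit hyphen character for one
paragraph. It assumes that the value in force when the paragraph ends is the
one that counts, and the sample text is purely illustrative.

\starttyping
% Initex default: a run of hyphen characters is followed by an
% explicit discretionary.
\exhyphenchar=`\-

% Assumption: for this paragraph only, a run of slashes is treated
% that way instead, so the line may break after a slash.
\begingroup
  \exhyphenchar=`\/
  A path such as input/output/error may now break after a slash.
  \par
\endgroup
\stoptyping
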
8218 Note: as of \LUATEX\ 0.63.0, the insertion of discretionaries after
8219 a sequence of explicit hyphens happens at the same time as the other
8220 hyphenation processing, {\it not\/} inside the main control loop.
8222 The only use \LUATEX\ has for \tex{hyphenchar} is in the check of
8223 whether a word should be considered for hyphenation at all. If the
8224 \tex{hyphenchar} of the font attached to the first character node in a
8225 word is negative, then hyphenation of that word is abandoned
8226 immediately. {\bf This behavior is added for backward
8227 compatibility only, and \type{\hyphenchar=-1} should not be used as a
8228 means of preventing hyphenation in new \LUATEX\ documents.}
8230 Fifth, \tex{setlanguage} no longer creates whatsits. The meaning of
8231 \tex{setlanguage} is changed so that it is now an integer parameter
8232 like all others. That integer parameter is used in \type{glyph_node}
8233 creation to add language information to the glyph nodes. In
8234 conjunction, the \tex{language} primitive is extended so that it
8235 always also updates the value of \tex{setlanguage}.
8237 Sixth, the \tex{noboundary} command (this command prohibits word
8238 boundary processing where that would normally take place) now does
8239 create whatsits. These whatsits are needed because the exact place of
8240 the \tex{noboundary} command in the input stream has to be retained
8241 until after the ligature and font processing stages.
8243 Finally, there is no longer a \type{main_loop} label in the
8244 code. Remember that \TEX82 did quite a lot of processing while adding
8245 \type{char_nodes} to the horizontal list? For speed reasons, it handled
8246 that processing code outside of the \quote{main control} loop, and only the
8247 first character of any \quote{word} was handled by that \quote{main control} loop.
8248 In \LUATEX, there is no longer a need for that (all hard work is done
8249 later), and the (now very small) bits of character-handling code have
8250 been moved back inline. When \tex{tracingcommands} is on, this is
8251 visible because the full word is reported, instead of just the initial
8252 character.
8255 \section[patternsexceptions]{Loading patterns and exceptions}
8257 The hyphenation algorithm in \LUATEX\ is quite different from the one
8258 in \TEX82, although it uses essentially the same user input.
8260 After expansion, the argument for \tex{patterns} has to be proper
8261 UTF-8 with individual patterns separated by spaces; no \tex{char} or
8262 \tex{chardef}-ed commands are allowed. (The current implementation is
8263 even more strict, and will reject all non|-|\UNICODE\ characters, but
8264 that will be changed in the future. For now, the generated errors are
8265 a valuable tool in discovering font-encoding specific pattern files.)
8267 Likewise, the expanded argument for \tex{hyphenation} also has to be
8268 proper UTF-8, but here a tiny little bit of extra syntax is provided:
8270 \startitemize[n]
8271 \item three sets of arguments in curly braces (\type{{}{}{}})
8272 indicate a desired complex discretionary, with arguments
8273 as in the \tex{discretionary} command in normal document input.
8274 \item \type{-} indicates a desired simple discretionary, cf. \tex{-} and
8275 \type{\discretionary{-}{}{}} in normal document input.
8276 \item Internal command names are ignored. This rule is provided
8277 especially for \tex{discretionary}, but it also helps to deal with
8278 \tex{relax} commands that may sneak in.
8279 \item \type{=} indicates a (non-discretionary) hyphen in the document input.
8280 \stopitemize
8282 The expanded argument is first converted back to a space-separated
8283 string while dropping the internal command names. This string is then
8284 converted into a dictionary by a routine that creates key||value pairs
8285 by converting the other listed items. It is important to note that the
8286 keys in an exception dictionary can always be generated from the
8287 values. Here are a few examples:
8289 \starttabulate[|l|l|l|]
8290 \NC \ssbf value \NC \ssbf implied key (input) \NC \ssbf effect\NC\NR
8291 \NC \type{ta-ble} \NC table \NC \type{ta\-ble}
8292 ($=$ \type{ta\discretionary{-}{}{}ble})\NC\NR
8293 \NC \type{ba{k-}{}{c}ken}\NC backen \NC \type{ba\discretionary{k-}{}{c}ken}\NC\NR
8294 \stoptabulate
8296 The resultant patterns and exception dictionary will be stored under
8297 the language code that is the present value of \tex{language}.
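
As a sketch, the two rows of the table could be entered as follows. The
language code 0 is an arbitrary choice for the example.

\starttyping
% Exceptions are stored under the current value of \language.
\language=0
\hyphenation{ta-ble ba{k-}{}{c}ken}
\stoptyping
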
8299 In the last line of the table, you see there is no \tex{discretionary}
8300 command in the value: the command is optional in the \TEX-based input
8301 syntax. The underlying reason for that is that it is conceivable that
8302 a whole dictionary of words is stored as a plain text file and loaded
8303 into \LUATEX\ using one of the functions in the \LUA\ \luatex{lang}
8304 library. This loading method is quite a bit faster than going through
8305 the \TEX\ language primitives, but some (most?) of that speed gain
8306 would be lost if it had to interpret command sequences while doing so.
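
A minimal sketch of that loading route is shown here. The word list is
hypothetical and is written inline to keep the example self-contained; in
practice the string could just as well come from a plain text file.

\starttyping
\directlua{
  % An object linked to the internal language with id 0.
  local l = lang.new(0)
  % Hypothetical word list, using the same syntax as \hyphenation.
  local words = { "ta-ble", "ba{k-}{}{c}ken" }
  lang.hyphenation(l, table.concat(words, " "))
}
\stoptyping

No \tex{discretionary} command appears in the string, which is why the
command is optional in the input syntax.
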
8308 Starting with \LUATEX\ 0.63.0, it is possible to specify extra hyphenation
8309 points in compound words by using \type{{-}{}{-}} for the explicit hyphen
8310 character (replace \type{-} by the actual explicit hyphen character if needed).
8311 For example, this matches the word \quote{multi-word-boundaries} and allows
8312 an extra break in between \quote{boun} and \quote{daries}:
8314 \starttyping
8315 \hyphenation{multi{-}{}{-}word{-}{}{-}boun-daries}
8316 \stoptyping
8318 The motivation behind the \ETEX\ extension \tex{savinghyphcodes} was
8319 that hyphenation heavily depended on font encodings. This is no longer
8320 true in \LUATEX, and the corresponding primitive is ignored pending
8321 complete removal. The future semantics of \tex{uppercase} and
8322 \tex{lowercase} are still under consideration; no changes have taken
8323 place yet.
8326 \section{Applying hyphenation}
8328 The internal structures \LUATEX\ uses for the insertion of
8329 discretionaries in words are very different from the ones in \TEX82,
8330 and that means there are some noticeable differences in handling as
8331 well.
8333 First and foremost, there is no \quote{compressed trie} involved in
8334 hyphenation. The algorithm still reads \PATGEN-generated pattern
8335 files, but \LUATEX\ uses a finite state hash to match the patterns
8336 against the word to be hyphenated. This algorithm is based on the
8337 \quote{libhnj} library used by OpenOffice, which in turn is inspired
8338 by \TEX.
8339 The memory allocation for this new implementation is completely
8340 dynamic, so the \WEBC\ setting for \type{trie_size} is ignored.
8342 Differences between \LUATEX\ and \TEX82 that are a direct result of that:
8344 \startitemize
8345 \item \LUATEX\ happily hyphenates the full \UNICODE\ character range.
8346 \item Pattern and exception dictionary size is limited by the
8347 available memory only, all allocations are done dynamically.
8348 The trie-related settings in \type{texmf.cnf} are ignored.
8349 \item Because there is no \quote{trie preparation} stage, language patterns
8350 never become frozen. This means that the primitive \tex{patterns}
8351 (and its \LUA\ counterpart \luatex{lang.patterns}) can be used at any
8352 time, not only in initex.
8353 \item Only the string representation of \tex{patterns} and
8354 \tex{hyphenation} is stored in the format file. At format load time,
8355 they are simply re-evaluated. It follows that there is no real
8356 reason to preload languages in the format file. In fact, it is
8357 usually not a good idea to do so. It is much smarter to load
8358 patterns no sooner than the first time they are actually needed.
8359 \item \LUATEX\ uses the language-specific variables
8360 \tex{prehyphenchar} and \tex{posthyphenchar} in the creation of
8361 implicit discretionaries, instead of \TEX82's \tex{hyphenchar}, and
8362 the values of the language-specific variables \tex{preexhyphenchar} and
8363 \tex{postexhyphenchar} for explicit discretionaries (instead of
8364 \TEX82's empty discretionary).
8365 \stopitemize
8367 Inserted characters and ligatures inherit their attributes from the
8368 nearest glyph node item (usually the preceding one, but the following
8369 one for the items inserted at the left-hand side of a word).
8371 Word boundaries are no longer implied by font switches, but by
8372 language switches. One word can have two separate fonts and still be
8373 hyphenated correctly (but it cannot have two different languages:
8374 the \tex{setlanguage} command forces a word boundary).
8376 All languages start out with \tex{prehyphenchar=`\-},
8377 \tex{posthyphenchar=0}, \tex{preexhyphenchar=0} and
8378 \tex{postexhyphenchar=0}.
8379 When you assign a value to one of these four parameters, you are
8380 actually changing the settings for the current \tex{language}; this
8381 behavior is compatible with \tex{patterns} and \tex{hyphenation}.
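
The following sketch first spells out the initial settings for the current
language and then changes one of them for another language. The language
number 1 and the convention of repeating an explicit hyphen at the start of
the next line are assumptions made for the example.

\starttyping
% The initial settings, written out for the current \language:
\prehyphenchar=`\-
\posthyphenchar=0
\preexhyphenchar=0
\postexhyphenchar=0

% Assumption: language 1 repeats an explicit hyphen after the break.
\language=1
\postexhyphenchar=`\-
\stoptyping
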
8383 \LUATEX\ also hyphenates the first word in a paragraph.
8385 Words can be up to 256 characters long (up from 64 in \TEX82). Longer
8386 words generate an error right now, but eventually either the
8387 limitation will be removed or perhaps it will become possible to
8388 silently ignore the excess characters (this is what happens in \TEX82,
8389 but there the behavior cannot be controlled).
8391 If you are using the \LUA\ function \type{lang.hyphenate}, you should be
8392 aware that this function expects to receive a list of \quote{character}
8393 nodes. It will not operate properly in the presence of \quote{glyph},
8394 \quote{ligature}, or \quote{ghost} nodes, nor does it know how to deal with
8395 kerning. In the near future, it will be able to skip over \quote{ghost}
8396 nodes, and we may add a less fuzzy function you can call as well.
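
One place where that precondition should hold is the \type{hyphenate}
callback, which runs before ligaturing and kerning. The sketch below
replaces the built-in discretionary insertion pass by an explicit call; it
is only an illustration of where the function fits, not a recommended setup.

\starttyping
\directlua{
  % At this stage the list still consists of "character" nodes,
  % which is what lang.hyphenate expects.
  callback.register("hyphenate", function(head, tail)
    lang.hyphenate(head)
  end)
}
\stoptyping
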
8398 The hyphenation exception dictionary is maintained as a key-value
8399 hash, and that is also dynamic, so the \type{hyph_size} setting is not
8400 used either.
8402 A technical paper detailing the new algorithm will be released as a
8403 separate document.
8405 \section{Applying ligatures and kerning}
8407 After all possible hyphenation points have been inserted in the list,
8408 \LUATEX\ will process the list to convert the \quote{character} nodes into
8409 \quote{glyph} and \quote{ligature} nodes. This is actually done in two stages:
8410 first all ligatures are processed, then all kerning information is
8411 applied to the result list. But those two stages are somewhat
8412 dependent on each other: if the font in use makes it possible to do so,
8413 the ligaturing stage adds virtual \quote{character} nodes to the word
8414 boundaries in the list. While doing so, it removes and interprets
8415 \type{noboundary} nodes. The kerning stage deletes those word boundary
8416 items after it is done with them, and it does the same for \quote{ghost}
8417 nodes. Finally, at the end of the kerning stage, all remaining
8418 \quote{character} nodes are converted to \quote{glyph} nodes.
8420 This work separation is worth mentioning because, if you overrule from
8421 \LUA\ only one of the two callbacks related to font handling, then you
8422 have to make sure you perform the tasks normally done by \LUATEX\
8423 itself in order to make sure that the other, non|-|overruled, routine
8424 continues to function properly.
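
A sketch of what this means in practice: the \type{ligaturing} callback is
overruled here, but it still calls the built-in routine, so the built-in
kerning stage finds the list in the state it expects. The place where custom
processing would go is marked as hypothetical.

\starttyping
\directlua{
  callback.register("ligaturing", function(head, tail)
    % A custom step could go here (hypothetical).
    % Then perform the work LuaTeX would have done itself.
    node.ligaturing(head, tail)
  end)
}
\stoptyping
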
8426 Work in this area is not yet complete, but most of the possible cases
8427 are handled by our rewritten ligaturing engine. We are working hard to
8428 make sure all of the possible inputs will become supported soon.
8430 For example, take the word \type{office}, hyphenated \type{of-fice},
8431 using a \quote{normal} font with all the \type{f}-\type{f} and
8432 \type{f}-\type{i} type ligatures:
8434 \starttabulate[|l|l|]
8435 \NC Initial: \NC \type{{o}{f}{f}{i}{c}{e}}\NC\NR
8436 \NC After hyphenation: \NC \type{{o}{f}{{-},{},{}}{f}{i}{c}{e}}\NC\NR
8437 \NC First ligature stage: \NC \type{{o}{{f-},{f},{<ff>}}{i}{c}{e}}\NC\NR
8438 \NC Final result: \NC \type{{o}{{f-},{<fi>},{<ffi>}}{c}{e}} \NC\NR
8439 \stoptabulate
8441 That's bad enough, but let us assume that there is also a hyphenation
8442 point between the \type{f} and the \type{i}, to create
8443 \type{of-f-ice}. Then the final result should be:
8445 \starttyping
8446 {o}{{f-},
8447     {{f-},
8448      {i},
8449      {<fi>}},
8450     {{<ff>-},
8451      {i},
8452      {<ffi>}}}{c}{e}
8453 \stoptyping
8455 with discretionaries in the post-break text as well as in the
8456 replacement text of the top-level discretionary that resulted from the
8457 first hyphenation point.
8459 Here is that nested solution again, in a different representation:
8461 \starttabulate[|l|l|l|l|]
8462 \NC \NC pre \NC post \NC replace \NC \NR
8463 \NC topdisc \NC \type{f-}$^1$ \NC sub1 \NC sub2 \NC \NR
8464 \NC sub1 \NC \type{f-}$^2$ \NC \type{i}$^3$ \NC \type{<fi>}$^4$ \NC \NR
8465 \NC sub2 \NC \type{<ff>-}$^5$\NC \type{i}$^6$ \NC \type{<ffi>}$^7$\NC \NR
8466 \stoptabulate
8468 When line breaking is choosing its breakpoints, the following fields will eventually
8469 be selected:
8471 \starttabulate[|l|l|l|]
8472 \NC \type{of-f-ice} \NC \type{f-}$^1$ \NC \NR
8473 \NC \NC \type{f-}$^2$ \NC \NR
8474 \NC \NC \type{i}$^3$ \NC \NR
8475 \NC \type{of-fice} \NC \type{f-}$^1$ \NC \NR
8476 \NC \NC \type{<fi>}$^4$ \NC \NR
8477 \NC \type{off-ice} \NC \type{<ff>-}$^5$ \NC \NR
8478 \NC \NC \type{i}$^6$ \NC \NR
8479 \NC \type{office} \NC \type{<ffi>}$^7$ \NC \NR
8480 \stoptabulate
8482 The current solution in \LUATEX\ is not able to handle nested
8483 discretionaries, but it is in fact smart enough to handle this
8484 fictional \type{of-f-ice} example. It does so by combining two
8485 sequential discretionary nodes as if they were a single object
8486 (where the second discretionary node is treated as an extension
8487 of the first node).
8489 One can observe that the \type{of-f-ice} and \type{off-ice} cases both end
8490 with the same actual post replacement list (\type{i}), and that this
8491 would be the case even if that \type{i} was the first item of a
8492 potential following ligature like \type{ic}. This allows \LUATEX\
8493 to do away with one of the fields, and thus make the whole thing fit
8494 into just two discretionary nodes.
8496 The mapping of the seven list fields to the six fields in this
8497 discretionary node pair is as follows:
8499 \starttabulate[|l|p|]
8500 \NC \bf field \NC \bf description \NC \NR
8501 \NC \type{disc1.pre} \NC \type{f-}$^1$ \NC \NR
8502 \NC \type{disc1.post} \NC \type{<fi>}$^4$ \NC \NR
8503 \NC \type{disc1.replace} \NC \type{<ffi>}$^7$ \NC \NR
8504 \NC \type{disc2.pre} \NC \type{f-}$^2$ \NC \NR
8505 \NC \type{disc2.post} \NC \type{i}$^{3{,}6}$\NC \NR
8506 \NC \type{disc2.replace} \NC \type{<ff>-}$^5$\NC \NR
8507 \stoptabulate
8509 What is actually generated after ligaturing has been applied is
8510 therefore:
8512 \starttyping
8513 {o}{{f-},
8514     {<fi>},
8515     {<ffi>}}
8516    {{f-},
8517     {i},
8518     {<ff>-}}{c}{e}
8519 \stoptyping
8521 The two discretionaries have different subtypes from a discretionary
8522 appearing on its own: the first has subtype 4, and the second has
8523 subtype 5. The need for these special subtypes stems from the fact
8524 that not all of the fields appear in their \quote{normal} location.
8525 The second discretionary especially looks odd, with things like the
8526 \type{<ff>-} appearing in \type{disc2.replace}. The fact that some of
8527 the fields have different meanings (and different processing code
8528 internally) is what makes it necessary to have different subtypes:
8529 this enables \LUATEX\ to distinguish this sequence of two joined
8530 discretionary nodes from the case of two standalone discretionaries
8531 appearing in a row.
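
To see whether a paragraph contains such a joined pair, the node list can be
inspected once ligaturing has been applied. The sketch below does so in the
\type{pre_linebreak_filter} callback and assumes the subtype numbers 4 and 5
given above; the log message is of course arbitrary.

\starttyping
\directlua{
  local disc = node.id("disc")
  callback.register("pre_linebreak_filter", function(head, groupcode)
    for n in node.traverse_id(disc, head) do
      if n.subtype == 4 or n.subtype == 5 then
        texio.write_nl("joined discretionary, subtype " .. n.subtype)
      end
    end
    % Returning true leaves the list unchanged.
    return true
  end)
}
\stoptyping
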
8534 \section{Breaking paragraphs into lines}
8536 This code is still almost unchanged, but because of the
8537 above|-|mentioned changes with respect to discretionaries and ligatures,
8538 line breaking will potentially be different from traditional \TEX.
8539 The actual line breaking code is still based on the \TEX82 algorithms,
8540 and it does not expect there to be discretionaries inside of
8541 discretionaries.
8543 But that situation is now fairly common in \LUATEX, due to the changes
8544 to the ligaturing mechanism. And also, the \LUATEX\ discretionary
8545 nodes are implemented slightly differently from the \TEX82 nodes: the
8546 \type{no_break} text is now embedded inside the disc node, where
8547 previously these nodes kept their place in the horizontal list (the
8548 discretionary node contained a counter indicating how many nodes to
8549 skip).
8551 The combined effect of these two differences is that \LUATEX\ does not
8552 always use all of the potential breakpoints in a paragraph, especially
8553 when fonts with many ligatures are used.
8555 % TODO:
8556 % Check \sfcode handling
8557 % Implement \glyph
8559 % Remove \savinghyphcodes
8560 % Allow non-UCS characters in \patterns
8562 \chapter[fonts]{Font structure}
8564 All \TEX\ fonts are represented to \LUA\ code as tables, and
8565 internally as C~structures. All keys in the table below are saved in
8566 the internal font structure if they are present in the table returned
8567 by the
8568 \luatex{define_font} callback, or if they result from the normal \TFM|/|\VF\
8569 reading routines if there is no \luatex{define_font} callback defined.
8571 The column \quote{from \VF} means that this key will be created by the
8572 \luatex{font.read_vf()} routine, \quote{from \TFM} means that the key will be created
8573 by the \luatex{font.read_tfm()} routine, and \quote{used} means whether or not the
8574 \LUATEX\ engine itself will do something with the key.
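
As a minimal sketch, a \type{define_font} callback that simply falls back on
the \TFM\ reader looks like this. It ignores virtual fonts, and the font name
\type{\demo} is only there for the example.

\starttyping
\directlua{
  callback.register("define_font", function(name, size, id)
    % Let the normal TFM reader build the table ...
    local f = font.read_tfm(name, size)
    % ... where any of the top-level keys could be added or changed ...
    % ... and hand the result to LuaTeX.
    return f
  end)
}
\font\demo=cmr10 at 12pt
\stoptyping
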
8576 The top|-|level keys in the table are as follows:
8578 \starttabulate[|Tl|l|l|l|l|p|]
8579 \NC \ssbf key \NC \bf from vf \NC \bf from tfm \NC \bf used\NC \bf value type \NC \bf description \NC\NR
8580 \NC name \NC yes \NC yes \NC yes \NC string \NC metric (file) name\NC\NR
8581 \NC area \NC no \NC yes \NC yes \NC string \NC (directory) location, typically empty\NC\NR
8582 \NC used \NC no \NC yes \NC yes \NC boolean\NC used already? (initial: false)\NC \NR
8583 \NC characters \NC yes \NC yes \NC yes \NC table \NC the defined glyphs of this font \NC \NR
8584 \NC checksum \NC yes \NC yes \NC no \NC number \NC default: 0 \NC \NR
8585 \NC designsize \NC no \NC yes \NC yes \NC number \NC expected size (default: 655360 == 10pt) \NC \NR
8586 \NC direction \NC no \NC yes \NC yes \NC number \NC default: 0 (TLT) \NC \NR
8587 \NC encodingbytes \NC no \NC no \NC yes \NC number \NC default: depends on \type {format}\NC\NR
8588 \NC encodingname \NC no \NC no \NC yes \NC string \NC encoding name\NC\NR
8589 \NC fonts \NC yes \NC no \NC yes \NC table \NC locally used fonts\NC \NR
8590 \NC psname \NC no \NC no \NC yes \NC string
8591 \NC actual (\POSTSCRIPT) name (this is the PS fontname in the
8592 incoming font source, also used as fontname identifier in the \PDF\ output, new in 0.43)\NC\NR
8593 \NC fullname \NC no \NC no \NC yes \NC string \NC output font name, used as a fallback in the \PDF\ output if the psname is not set\NC\NR
8594 \NC header \NC yes \NC no \NC no \NC string \NC header comments, if any\NC \NR
8595 \NC hyphenchar \NC no \NC no \NC yes \NC number \NC default: \TEX's \tex{hyphenchar} \NC \NR
8596 \NC parameters \NC no \NC yes \NC yes \NC hash \NC default: 7 parameters, all zero \NC \NR
8597 \NC size \NC no \NC yes \NC yes \NC number \NC loaded (at) size. (default: same as designsize) \NC \NR
8598 \NC skewchar \NC no \NC no \NC yes \NC number \NC default: \TEX's \tex{skewchar} \NC \NR
8599 \NC type \NC yes \NC no \NC yes \NC string \NC basic type of this font\NC \NR
8600 \NC format \NC no \NC no \NC yes \NC string \NC disk format type\NC \NR
8601 \NC embedding \NC no \NC no \NC yes \NC string \NC \PDF\ inclusion\NC \NR
8602 \NC filename \NC no \NC no \NC yes \NC string \NC disk file name\NC\NR
8603 \NC tounicode \NC no \NC yes \NC yes \NC number \NC if 1, \LUATEX\ assumes per-glyph tounicode entries are
8604 present in the font\NC\NR
8605 \NC stretch \NC no \NC no \NC yes \NC number \NC the \quote {stretch} value from \tex{pdffontexpand}\NC\NR
8606 \NC shrink \NC no \NC no \NC yes \NC number \NC the \quote {shrink} value from \tex{pdffontexpand}\NC\NR
8607 \NC step \NC no \NC no \NC yes \NC number \NC the \quote {step} value from \tex{pdffontexpand}\NC\NR
8608 \NC auto_expand \NC no \NC no \NC yes \NC boolean\NC the \quote {autoexpand} keyword from\crlf \tex{pdffontexpand}\NC\NR
8609 \NC expansion_factor \NC no \NC no \NC no \NC number \NC the actual expansion factor of an expanded font\NC\NR
8610 \NC attributes \NC no \NC no \NC yes \NC string \NC the \tex{pdffontattr}\NC\NR
8611 \NC cache \NC no \NC no \NC yes \NC string \NC this key controls caching of the \LUA\ table on the \type{tex}
8612 end. \type{yes}: use a reference to the table that is
8613 passed to \LUATEX\ (this is the default). \type{no}: don't store the table
8614 reference, don't cache any \LUA\ data for this font.
8615 \type{renew}: don't store the table reference, but
8616 save a reference to the table that is created at the
8617 first access to one of its fields in \type{font.fonts}.
8618 (new in 0.40.0, before that caching was always \type{yes}).
8619 Note: the saved reference is thread-local, so be careful when you are using coroutines: an error will be thrown if the table
8620 has been cached in one thread, but you reference it from another thread ($\approx$ coroutine)\NC\NR
8621 \NC nomath \NC no \NC no \NC yes \NC boolean\NC this key allows a minor speedup for text fonts. If it is
8622 present and true, then \LUATEX\ will not check the
8623 character entries for math-specific keys. (0.42.0)\NC\NR
8624 \NC slant \NC no \NC no \NC yes \NC number \NC This has the same semantics as the \type{SlantFont} operator
8625 in font map files. (0.47.0)\NC\NR
8626 \NC extend \NC no \NC no \NC yes \NC number \NC This has the same semantics as the \type{ExtendFont} operator
8627 in font map files. (0.50.0)\NC\NR
8628 \stoptabulate
8630 The key \type{name} is always required. The keys \type{stretch},
8631 \type{shrink}, \type{step} and optionally \type{auto_expand} only
8632 have meaning when used together: they can be used to replace a
8633 post-loading \tex{pdffontexpand} command. The
8634 \type{expansion_factor} is a value that can be present inside a font
8635 in \type{font.fonts}. It is the actual expansion factor (a value
8636 between \type{-shrink} and \type{stretch}, with step \type{step})
8637 of a font that was automatically generated by the font expansion
8638 algorithm. The key \type{attributes} can be used to replace
8639 \tex{pdffontattr}. The key \type{used} is set by the engine when a
8640 font is actively in use; this makes sure that the font's
8641 definition is written to the output file (\DVI\ or \PDF). The
8642 \TFM\ reader sets it to false. The \type{direction} is a number
8643 signalling the \quote{normal} direction for this font. There are
8644 sixteen possibilities:
8646 \starttabulate[|Tc|c|c|c|]
8647 \NC \ssbf number \NC \bf meaning \NC \bf number \NC \bf meaning \NC\NR
8648 \NC 0 \NC LT \NC 8 \NC TT \NC\NR
8649 \NC 1 \NC LL \NC 9 \NC TL \NC\NR
8650 \NC 2 \NC LB \NC 10 \NC TB \NC\NR
8651 \NC 3 \NC LR \NC 11 \NC TR \NC\NR
8652 \NC 4 \NC RT \NC 12 \NC BT \NC\NR
8653 \NC 5 \NC RL \NC 13 \NC BL \NC\NR
8654 \NC 6 \NC RB \NC 14 \NC BB \NC\NR
8655 \NC 7 \NC RR \NC 15 \NC BR \NC\NR
8656 \stoptabulate
8658 These are \OMEGA|-|style direction abbreviations: the first character
8659 indicates the \quote{first} edge of the character glyphs (the edge that is
8660 seen first in the writing direction), the second the \quote{top} side.
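
The numbering in the table follows a simple pattern: four times the position
of the first letter plus the position of the second. The sketch below
reproduces it; the function \type{direction_number} is a hypothetical helper,
not part of the \type{font} library.

\starttyping
\directlua{
  % Position of each letter within its group of the table.
  local first  = { L = 0, R = 1, T = 2, B = 3 }
  local second = { T = 0, L = 1, B = 2, R = 3 }
  local function direction_number(abbreviation)
    local a = abbreviation:sub(1, 1)
    local b = abbreviation:sub(2, 2)
    return 4 * first[a] + second[b]
  end
  % TL is entry 9 in the table: 4 * 2 + 1.
  texio.write_nl(tostring(direction_number("TL")))
}
\stoptyping
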
8662 The \type{parameters} is a hash with mixed key types. There are seven
8663 possible string keys, as well as a number of integer indices (these
8664 start from 8 up). The seven strings are actually used instead of the
8665 bottom seven indices, because that gives a nicer user interface.
8667 The names and their internal remapping are:
8669 \starttabulate[|lT|c|]
8670 \NC \ssbf name \NC \bf internal remapped number \NC\NR
8671 \NC slant \NC 1 \NC\NR
8672 \NC space \NC 2 \NC\NR
8673 \NC space_stretch \NC 3 \NC\NR
8674 \NC space_shrink \NC 4 \NC\NR
8675 \NC x_height \NC 5 \NC\NR
8676 \NC quad \NC 6 \NC\NR
8677 \NC extra_space \NC 7 \NC\LR
8678 \stoptabulate
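
A short sketch of how the two kinds of keys are addressed from \LUA, using
the table returned by the \TFM\ reader. The choice of cmr10 at 10 points is
only an example.

\starttyping
\directlua{
  local f = font.read_tfm("cmr10", 655360)
  % The first seven parameters are addressed by name ...
  texio.write_nl("quad: " .. f.parameters.quad)
  % ... while any further parameters use integer indices from 8 up
  % (this prints nil when the font has none).
  texio.write_nl("parameter 8: " .. tostring(f.parameters[8]))
}
\stoptyping
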
8680 The keys \type{type}, \type{format}, \type{embedding}, \type{fullname} and
8681 \type{filename} are used to embed \OPENTYPE\ fonts in the result \PDF.
8683 The \type{characters} table is a list of character hashes indexed by
8684 an integer number. The number is the \quote{internal code} \TEX\ knows this
8685 character by.
8687 Two very special string indexes can be used also: \type{left_boundary} is a
8688 virtual character whose ligatures and kerns are used to handle word
8689 boundary processing. \type{right_boundary} is similar but not actually
8690 used for anything (yet!).
8692 Other index keys are ignored.
8694 Each character hash itself is a hash. For example, here is the
8695 character \quote{f} (decimal 102) in the font cmr10 at 10 points:
8697 \starttyping
8698 [102] = {
8699   ['width'] = 200250,
8700   ['height'] = 455111,
8701   ['depth'] = 0,
8702   ['italic'] = 50973,
8703   ['kerns'] = {
8704     [63] = 50973,
8705     [93] = 50973,
8706     [39] = 50973,
8707     [33] = 50973,
8708     [41] = 50973
8709   },
8710   ['ligatures'] = {
8711     [102] = {
8712       ['char'] = 11,
8713       ['type'] = 0
8714     },
8715     [108] = {
8716       ['char'] = 13,
8717       ['type'] = 0
8718     },
8719     [105] = {
8720       ['char'] = 12,
8721       ['type'] = 0
8722     }
8723   }
8724 }
8725 \stoptyping
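
Such a hash can be walked from \LUA\ in the usual way. The following sketch
lists the kerns and ligatures of the same character; it assumes that cmr10
can be found by the \TFM\ reader, and the wording of the log messages is
arbitrary.

\starttyping
\directlua{
  local f = font.read_tfm("cmr10", 655360)
  local c = f.characters[102]
  % kerns: code of the following character -> kern amount in sp
  for code, amount in pairs(c.kerns) do
    texio.write_nl("kern when followed by " .. code .. ": " .. amount)
  end
  % ligatures: code of the following character -> table with char and type
  for code, lig in pairs(c.ligatures) do
    texio.write_nl("ligature with " .. code .. " gives " .. lig.char)
  end
}
\stoptyping
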
8727 The following top|-|level keys can be present inside a character hash:
8729 \starttabulate[|lT|c|c|c|l|p|]
8730 \NC \ssbf key \NC \bf from vf \NC \bf from tfm \NC \bf used \NC \bf value type \NC \bf description \NC\NR
8731 \NC width \NC yes \NC yes \NC yes \NC number \NC character's width, in sp (default 0) \NC\NR
8732 \NC height \NC no \NC yes \NC yes \NC number \NC character's height, in sp (default 0) \NC\NR
8733 \NC depth \NC no \NC yes \NC yes \NC number \NC character's depth, in sp (default 0) \NC\NR
8734 \NC italic \NC no \NC yes \NC yes \NC number \NC character's italic correction, in sp (default zero) \NC\NR
8735 \NC top_accent \NC no \NC no \NC maybe \NC number \NC character's top accent alignment place, in sp (default zero) \NC\NR
8736 \NC bot_accent \NC no \NC no \NC maybe \NC number \NC character's bottom accent alignment place, in sp (default zero) \NC\NR
8737 \NC left_protruding \NC no \NC no \NC maybe \NC number \NC character's \tex{lpcode}\NC\NR
8738 \NC right_protruding \NC no \NC no \NC maybe \NC number \NC character's \tex{rpcode}\NC\NR
8739 \NC expansion_factor \NC no \NC no \NC maybe \NC number \NC character's \tex{efcode}\NC\NR
8740 \NC tounicode \NC no \NC no \NC maybe \NC string \NC character's Unicode equivalent(s), in UTF-16BE hexadecimal format\NC\NR
8741 \NC next \NC no \NC yes \NC yes \NC number \NC the \quote{next larger} character index \NC\NR
8742 \NC extensible \NC no \NC yes \NC yes \NC table \NC the constituent parts of an extensible recipe \NC\NR
8743 \NC vert_variants \NC no \NC no \NC yes \NC table \NC constituent parts of a vertical variant set\NC \NR
8744 \NC horiz_variants\NC no \NC no \NC yes \NC table \NC constituent parts of a horizontal variant set\NC \NR
8745 \NC kerns \NC no \NC yes \NC yes \NC table \NC kerning information \NC\NR
8746 \NC ligatures \NC no \NC yes \NC yes \NC table \NC ligaturing information \NC\NR
8747 \NC commands \NC yes \NC no \NC yes \NC array \NC virtual font commands \NC\NR
8748 \NC name \NC no \NC no \NC no \NC string \NC the character (\POSTSCRIPT) name \NC\NR
8749 \NC index \NC no \NC no \NC yes \NC number \NC the (\OPENTYPE\ or \TRUETYPE) font glyph index \NC\NR
8750 \NC used \NC no \NC yes \NC yes \NC boolean \NC typeset already (default: false)? \NC\NR
8751 \NC mathkern \NC no \NC no \NC yes \NC table \NC math cut-in specifications \NC\NR
8752 \stoptabulate
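
All dimensions in a character hash are given in scaled points, with 65536 scaled
points to a point. A small conversion helper makes the numbers in the \quote{f}
example easier to check. The name \type{sp_to_pt} is made up for this sketch and
is not part of \LUATEX.

\starttyping
-- Hypothetical helper: converts scaled points to points.
local function sp_to_pt(sp)
    return sp / 65536
end

-- width of the cmr10 'f' from the example above
print(string.format('%.5fpt', sp_to_pt(200250)))  --> 3.05557pt
\stoptyping
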
8754 The values of \type{top_accent}, \type{bot_accent} and \type{mathkern} are used only for math
8755 accent and superscript placement, see the \at{math chapter}[math] in this manual for details.
8757 The values of \type{left_protruding} and \type{right_protruding} are used only when
8758 \tex{pdfprotrudechars} is non-zero.
8760 Whether or not \type{expansion_factor} is used depends on the font's global expansion
8761 settings, as well as on the value of \tex{pdfadjustspacing}.
8763 The usage of \type{tounicode} is this: if this font specifies a \type{tounicode=1} at
8764 the top level, then \LUATEX\ will construct a \type{/ToUnicode} entry for the \PDF\
8765 font (or font subset) based on the character-level \type{tounicode} strings, where
8766 they are available. If a character does not have a sensible \UNICODE\ equivalent,
8767 do not provide a string either (no empty strings).
8769 If the font-level \type{tounicode} is not set, then \LUATEX\ will build up
8770 \type{/ToUnicode} based on the \TEX\ code points you used, and any character-level
8771 \type{tounicodes} will be ignored. {\it At the moment, the string format is exactly the
8772 format that is expected by Adobe \CMAP\ files (\UTF-16BE in hexadecimal encoding), minus
8773 the enclosing angle brackets. This may change in the future.} Small example: the
8774 \type{tounicode} for a \type{fi} ligature would be \type{00660069}.
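
As an illustration of the string format described above, the following \LUA\
sketch builds a character|-|level \type{tounicode} string from a list of \UNICODE\
code points. The function name \type{to_utf16be_hex} is hypothetical and not part
of \LUATEX. Code points above \type{0xFFFF} are written as surrogate pairs, as
\UTF-16BE requires.

\starttyping
-- Hypothetical helper: not a LuaTeX function.
-- Converts a list of Unicode code points into the hexadecimal
-- UTF-16BE string used for the character-level 'tounicode' key.
local function to_utf16be_hex(codepoints)
    local parts = {}
    for _, cp in ipairs(codepoints) do
        if cp < 0x10000 then
            parts[#parts + 1] = string.format('%04X', cp)
        else
            -- split into a high and a low surrogate
            local v    = cp - 0x10000
            local high = 0xD800 + math.floor(v / 0x400)
            local low  = 0xDC00 + v % 0x400
            parts[#parts + 1] = string.format('%04X%04X', high, low)
        end
    end
    return table.concat(parts)
end

-- the fi ligature decomposes into 'f' (U+0066) and 'i' (U+0069)
print(to_utf16be_hex({ 0x66, 0x69 }))  --> 00660069
\stoptyping
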
The presence of \type{extensible} will overrule \type{next}, if that is also present.
It can in turn be overruled by \type{vert_variants}.
8779 The \type{extensible} table is very simple:
8781 \starttabulate[|lT|l|p|]
8782 \NC \ssbf key \NC \bf type \NC \bf description \NC\NR
8783 \NC top \NC number \NC \quote{top} character index \NC\NR
8784 \NC mid \NC number \NC \quote{middle} character index \NC\NR
8785 \NC bot \NC number \NC \quote{bottom} character index \NC\NR
8786 \NC rep \NC number \NC \quote{repeatable} character index \NC\NR
8787 \stoptabulate
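
The precedence between \type{next}, \type{extensible} and \type{vert_variants}
can be summarised in a few lines of \LUA. This is only a sketch of the rule as it
is stated in this section, not of \LUATEX's internal code. The function name
\type{vertical_growth_source} and the character indices are invented for the example.

\starttyping
-- Hypothetical helper: reports which key decides how a character
-- grows vertically, following the precedence described in the text.
local function vertical_growth_source(chr)
    if chr.vert_variants then
        return 'vert_variants'
    elseif chr.extensible then
        return 'extensible'
    elseif chr.next then
        return 'next'
    end
    return nil
end

-- illustrative character entry with both 'next' and 'extensible'
local paren = {
    next       = 16,
    extensible = { top = 48, mid = 0, bot = 64, rep = 66 },
}
print(vertical_growth_source(paren))  --> extensible
\stoptyping
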
8789 The \type{horiz_variants} and \type{vert_variants} are arrays of components. Each of those
8790 components is itself a hash of up to five keys:
8792 \starttabulate[|lT|l|p|]
8793 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
8794 \NC glyph \NC number \NC The character index (note that this is an encoding number, not a name).\NC \NR
8795 \NC extender \NC number \NC One (1) if this part is repeatable, zero (0) otherwise.\NC \NR
8796 \NC start \NC number \NC Maximum overlap at the starting side (in scaled points).\NC \NR
8797 \NC end \NC number \NC Maximum overlap at the ending side (in scaled points).\NC \NR
\NC advance \NC number \NC Total advance width of this item (can be zero or missing,
then the natural size of the glyph for character \type{glyph}
is used).\NC \NR
8801 \stoptabulate
8803 The \type{kerns} table is a hash indexed by character index (and
8804 \quote{character index} is defined as either a non|-|negative integer or the
8805 string value \type {right_boundary}), with the values the kerning to be
8806 applied, in scaled points.
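
A kern lookup in such a table is a plain table access. The sketch below assumes a
\type{characters} table shaped like the \quote{f} example earlier in this section.
The helper \type{kern_between} is hypothetical.

\starttyping
-- Hypothetical helper: returns the kern (in scaled points) that the
-- font asks for between 'left' and 'right', or 0 if there is none.
-- 'right' may be a character index or the string 'right_boundary'.
local function kern_between(characters, left, right)
    local chr = characters[left]
    if chr and chr.kerns then
        return chr.kerns[right] or 0
    end
    return 0
end

-- a cut-down version of the cmr10 'f' entry shown earlier
local characters = {
    [102] = { width = 200250, kerns = { [33] = 50973, [41] = 50973 } },
}
print(kern_between(characters, 102, 33))   --> 50973
print(kern_between(characters, 102, 102))  --> 0
\stoptyping
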
8808 The \type{ligatures} table is a hash indexed by character index (and
8809 \quote{character index} is defined as either a non|-|negative integer or the
8810 string value \type {right_boundary}), with the values being yet another small
8811 hash, with two fields:
8813 \starttabulate[|lT|l|p|]
8814 \NC \ssbf key \NC \bf type \NC \bf description \NC\NR
8815 \NC type \NC number \NC the type of this ligature command, default 0 \NC\NR
8816 \NC char \NC number \NC the character index of the resultant ligature \NC\NR
8817 \stoptabulate
8819 The \type{char} field in a ligature is required.
8821 The \type{type} field inside a ligature is the numerical or string value of one of the eight
8822 possible ligature types supported by \TEX. When \TEX\ inserts a new ligature, it puts the new
8823 glyph in the middle of the left and right glyphs. The original left and right glyphs can
8824 optionally be retained, and when at least one of them is kept, it is also possible to move the
8825 new \quote{insertion point} forward one or two places. The glyph that ends up to the right of the
8826 insertion point will become the next \quote{left}.
8828 \starttabulate[|l|c|l|l|]
8829 \NC \bf textual (Knuth) \NC \bf number \NC \bf string \NC result \NC\NR
8830 \NC l + r =: n \NC 0 \NC \type{=:} \NC \|n \NC\NR
8831 \NC l + r =:\| n \NC 1 \NC \type{=:|} \NC \|nr \NC\NR
8832 \NC l + r \|=: n \NC 2 \NC \type{|=:} \NC \|ln \NC\NR
8833 \NC l + r \|=:\| n \NC 3 \NC \type{|=:|} \NC \|lnr \NC\NR
8834 \NC l + r =:\|\> n \NC 5 \NC \type{=:|>} \NC n\|r \NC\NR
8835 \NC l + r \|=:\> n \NC 6 \NC \type{|=:>} \NC l\|n \NC\NR
8836 \NC l + r \|=:\|\> n \NC 7 \NC \type{|=:|>} \NC l\|nr \NC\NR
8837 \NC l + r \|=:\|\>\> n \NC 11 \NC \type{|=:|>>} \NC ln\|r \NC\NR
8838 \stoptabulate
8840 The default value is~0, and can be left out. That signifies a \quote{normal}
8841 ligature where the ligature replaces both original glyphs. In this table
8842 the~\| indicates the final insertion point.
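
The numbers in the table are not arbitrary. They follow the encoding that \TEX\
font metric files use for ligature steps: $4a + 2b + c$, where $b$ and $c$ are~1
when the left and right glyph are kept and $a$ is the number of places the
insertion point moves forward. The sketch below decodes a number on that
assumption and prints the result in the notation of the table. Both function
names are hypothetical.

\starttyping
-- Hypothetical helper: decodes a numeric ligature type into its parts,
-- assuming the encoding 4*skip + 2*keep_left + keep_right.
local function decode_ligature_type(t)
    return {
        keep_right = t % 2 == 1,
        keep_left  = math.floor(t / 2) % 2 == 1,
        skip       = math.floor(t / 4),
    }
end

-- Renders the outcome in the notation of the table above:
-- l, r and n are the left, right and new glyph, | the insertion point.
local function describe_ligature(t)
    local d = decode_ligature_type(t)
    local glyphs = {}
    if d.keep_left then glyphs[#glyphs + 1] = 'l' end
    glyphs[#glyphs + 1] = 'n'
    if d.keep_right then glyphs[#glyphs + 1] = 'r' end
    table.insert(glyphs, d.skip + 1, '|')
    return table.concat(glyphs)
end

for _, t in ipairs({ 0, 1, 2, 3, 5, 6, 7, 11 }) do
    print(t, describe_ligature(t))
end
\stoptyping
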
8844 The \type{commands} array is explained below.
8846 \section {Real fonts}
8848 Whether or not a \TEX\ font is a \quote{real} font that should be written to
8849 the \PDF\ document is decided by the \type{type} value in the top|-|level
8850 font structure. If the value is \type{real}, then this is a proper
8851 font, and the inclusion mechanism will attempt to add the needed
8852 font object definitions to the \PDF.
8854 Values for \type{type}:
8856 \starttabulate[|Tl|p|]
8857 \NC \ssbf value \NC \bf description \NC\NR
8858 \NC real \NC this is a base font \NC\NR
8859 \NC virtual \NC this is a virtual font \NC\NR
8860 \stoptabulate
8862 The actions to be taken depend on a number of different variables:
8864 \startitemize[packed]
8865 \item Whether the used font fits in an 8-bit encoding scheme or not
8866 \item The type of the disk font file
8867 \item The level of embedding requested
8868 \stopitemize
8870 A font that uses anything other than an 8-bit encoding vector has to
8871 be written to the \PDF\ in a different way.
8873 The rule is: if the font table has \type {encodingbytes} set to~2,
8874 then this is a wide font, in all other cases it isn't. The value~2 is
8875 the default for \OPENTYPE\ and \TRUETYPE\ fonts loaded via \LUA. For
8876 \TYPEONE\ fonts, you have to set \type {encodingbytes} to~2
8877 explicitly. For \PK\ bitmap fonts, wide font encoding is not
8878 supported at all.
8880 If no special care is needed, \LUATEX\ currently falls back to the
8881 mapfile|-|based solution used by \PDFTEX\ and \DVIPS. This behavior
8882 will be removed in the future, when the existing code becomes
8883 integrated in the new subsystem.
8885 But if this is a \quote{wide} font, then the new subsystem kicks in, and
8886 some extra fields have to be present in the font structure. In this
8887 case, \LUATEX\ does not use a map file at all.
8889 The extra fields are: \type{format}, \type{embedding}, \type{fullname},
8890 \type{cidinfo} (as explained above), \type{filename}, and the
8891 \type{index} key in the separate characters.
8893 Values for \type{format} are:
8895 \starttabulate[|Tl|p|]
8896 \NC \ssbf value \NC \bf description \NC\NR
8897 \NC type1 \NC this is a \POSTSCRIPT\ \TYPEONE\ font \NC\NR
8898 \NC type3 \NC this is a bitmapped (\PK) font \NC\NR
8899 \NC truetype \NC this is a \TRUETYPE\ or \TRUETYPE|-|based \OPENTYPE\ font \NC\NR
8900 \NC opentype \NC this is a \POSTSCRIPT|-|based \OPENTYPE\ font \NC\NR
8901 \stoptabulate
8903 (\type{type3} fonts are provided for backward compatibility only, and do not
8904 support the new wide encoding options.)
8906 Values for \type{embedding} are:
8908 \starttabulate[|Tl|p|]
8909 \NC \ssbf value \NC \bf description \NC\NR
8910 \NC no \NC don't embed the font at all \NC\NR
\NC subset \NC include and attempt to subset the font \NC\NR
8912 \NC full \NC include this font in its entirety \NC\NR
8913 \stoptabulate
8915 It is not possible to artificially modify the transformation matrix
8916 for the font at the moment.
The other fields are used as follows: The \type{fullname} will be the
\POSTSCRIPT|/|\PDF\ font name. The \type{cidinfo} will be used as the
character set (the CID \type{/Ordering} and \type{/Registry} keys). The
\type{filename} points to the actual font file. If you include the
full path in the \type{filename} or if the file is in the local
directory, \LUATEX\ will run a little more efficiently because it
will not have to re|-|run the \type{find_xxx_file} callback in that
case.
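
The requirements for a wide font can be checked before the table is handed back
to \LUATEX. The following sketch reports which of the fields listed above are
missing. The function \type{missing_wide_fields} and the sample values
(\type{MyFont-Regular}, \type{myfont.otf}, the glyph index) are invented for the
example, and the check makes no claim about what \LUATEX\ itself verifies.

\starttyping
-- Fields the text lists for wide fonts (encodingbytes == 2).
local wide_fields = { 'format', 'embedding', 'fullname', 'cidinfo', 'filename' }

-- Hypothetical helper: returns an array with the names of missing fields;
-- for a font that is not wide the array is always empty.
local function missing_wide_fields(f)
    local missing = {}
    if f.encodingbytes ~= 2 then
        return missing
    end
    for _, key in ipairs(wide_fields) do
        if f[key] == nil then
            missing[#missing + 1] = key
        end
    end
    for index, chr in pairs(f.characters or {}) do
        if chr.index == nil then
            missing[#missing + 1] = 'characters[' .. index .. '].index'
            break -- one report is enough
        end
    end
    return missing
end

local f = {
    encodingbytes = 2,
    format        = 'opentype',
    embedding     = 'subset',
    fullname      = 'MyFont-Regular',  -- invented name
    filename      = 'myfont.otf',      -- invented file
    characters    = { [65] = { width = 400000, index = 36 } },
}
print(table.concat(missing_wide_fields(f), ', '))  --> cidinfo
\stoptyping
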
8927 Be careful: when mixing old and new fonts in one document, it is possible to
8928 create \POSTSCRIPT\ name clashes that can result in printing
8929 errors. When this happens, you have to change the \type{fullname}
8930 of the font.
8932 Typeset strings are written out in a wide format using 2~bytes per
8933 glyph, using the \type{index} key in the character information as
8934 value. The overall effect is like having an encoding based on numbers
8935 instead of traditional (\POSTSCRIPT) name|-|based reencoding. The way
8936 to get the correct \type{index} numbers for \TYPEONE\ fonts is by
8937 loading the font via \type{fontloader.open}; use the table indices as
8938 \type{index} fields.
8940 This type of reencoding means that there is no longer a clear
8941 connection between the text in your input file and the strings in the
8942 output \PDF\ file. Dealing with this is high on the agenda.
8944 \section[virtualfonts]{Virtual fonts}
8946 You have to take the following steps if you want \LUATEX\ to treat the
8947 returned table from \luatex{define_font} as a virtual font:
8949 \startitemize[packed]
8950 \item Set the top|-|level key \type {type} to \type {virtual}.
8951 \item Make sure there is at least one valid entry in \luatex{fonts} (see below).
8952 \item Give a \type {commands} array to every character (see below).
8953 \stopitemize
The presence of the top|-|level \type {type} key with the specific value
\type {virtual} will trigger handling of the rest of the special virtual
font fields in the table, but the mere existence of \type {type} is enough
to prevent \LUATEX\ from looking for a virtual font on its own.
8960 Therefore, this also works \quote{in reverse}: if you are absolutely certain
8961 that a font is not a virtual font, assigning the value \type{base} or
8962 \type{real} to \type{type} will inhibit \LUATEX\ from looking for a virtual font
8963 file, thereby saving you a disk search.
The \luatex{fonts} table is another \LUA\ array. Its values are one- or two|-|key
hashes themselves, each entry indicating one of the base fonts in a
virtual font. In case your font is referring to itself, you can use the
\type {font.nextid()} function, which returns the index of the next font to be
defined, which is probably the font currently being defined.
An example makes this easy to understand:

\starttyping
fonts = {
  { name = 'ptmr8a', size = 655360 },
  { name = 'psyr', size = 600000 },
  { id = 38 }
}
\stoptyping

This says that the first referenced font (index 1) in this virtual font is
\type{ptmr8a} loaded at 10pt, and the second is \type{psyr} loaded
at a little over 9pt. The third one is a previously defined font that
is known to \LUATEX\ as fontid \quote{38}.
8986 The array index numbers are used by the character command definitions
8987 that are part of each character.
The \luatex{commands} array is a list where each item is another small array, with the first
entry representing a command and the extra items being the parameters to that command. The
allowed commands and their arguments are:
8993 \starttabulate[|Tl|l|l|p|]
8994 \NC \ssbf command name \NC \bf arguments \NC \bf arg type \NC \bf description \NC\NR
8995 \NC font \NC 1 \NC number \NC select a new font from the local \luatex{fonts} table\NC\NR
8996 \NC char \NC 1 \NC number \NC typeset this character number from the current font,
8997 and move right by the character's width\NC\NR
8998 \NC node \NC 1 \NC node \NC output this node (list), and move right
8999 by the width of this list\NC\NR
9000 \NC slot \NC 2 \NC number \NC a shortcut for the combination of a font and char command\NC\NR
9001 \NC push \NC 0 \NC \NC save current position\NC\NR
9002 \NC nop \NC 0 \NC \NC do nothing \NC\NR
9003 \NC pop \NC 0 \NC \NC pop position \NC\NR
9004 \NC rule \NC 2 \NC 2 numbers \NC output a rule $ht*wd$, and move right.\NC\NR
9005 \NC down \NC 1 \NC number \NC move down on the page\NC\NR
9006 \NC right \NC 1 \NC number \NC move right on the page\NC\NR
9007 \NC special \NC 1 \NC string \NC output a \tex{special} command\NC\NR
9008 \NC lua \NC 1 \NC string \NC execute a \LUA\ script (at \tex{latelua} time)\NC\NR
9009 \NC image \NC 1 \NC image \NC output an image (the argument can be either an \type{<image>}
9010 variable or an \type{image_spec} table)\NC\NR
9011 \NC comment \NC any \NC any \NC the arguments of this command are ignored\NC\NR
9012 \stoptabulate
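
Because the \type{font} command refers to entries of the \luatex{fonts} array by
index, a dangling index is an easy mistake to make. The following sketch looks
for such references in a font table before it is returned to \LUATEX. The
function \type{bad_font_refs} is hypothetical, and only the \type{font} command
is inspected.

\starttyping
-- Hypothetical check: returns a set (index -> true) of the characters
-- whose 'font' commands point outside the fonts array.
local function bad_font_refs(f)
    local bad = {}
    local nfonts = #(f.fonts or {})
    for index, chr in pairs(f.characters) do
        for _, cmd in ipairs(chr.commands or {}) do
            if cmd[1] == 'font' and (cmd[2] < 1 or cmd[2] > nfonts) then
                bad[index] = true
                break
            end
        end
    end
    return bad
end

local f = {
    type  = 'virtual',
    fonts = { { name = 'ptmr8a', size = 655360 }, { id = 38 } },
    characters = {
        [65] = { commands = { {'font', 2}, {'char', 65} } },
        [66] = { commands = { {'font', 3}, {'char', 66} } },  -- no fonts[3]
    },
}
print(bad_font_refs(f)[66])  --> true
\stoptyping
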
9014 Here is a rather elaborate glyph commands example:
\starttyping
commands = {
    {'push'},                      -- remember where we are
    {'right', 5000},               -- move right about 0.08pt
    {'font', 3},                   -- select the fonts[3] entry
    {'char', 97},                  -- place character 97 (ASCII 'a')
    {'pop'},                       -- go all the way back
    {'down', -200000},             -- move upwards by about 3pt
    {'special', 'pdf: 1 0 0 rg'},  -- switch to red color
    {'rule', 500000, 20000},       -- draw a bar
    {'special', 'pdf: 0 g'}        -- back to black
}
\stoptyping
The default value for \type {font} is always~1 at the start of the \type{commands} array.
Therefore, if the virtual font is essentially only a re|-|encoding, then you usually do not
have to create an explicit \quote{font} command in the array.
9035 Rules inside of \type{commands} arrays are built up using only two dimensions:
9036 they do not have depth. For correct vertical placement, an extra \type{down} command
9037 may be needed.
9039 Regardless of the amount of movement you create within the \type {commands},
9040 the output pointer will always move by exactly the width that was given in
9041 the \type {width} key of the character hash. Any movements that take place
9042 inside the \type{commands} array are ignored on the upper level.
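
To see how far a \type{commands} array would move the output position on its own,
the movements can be added up by hand. The following sketch does this for a
subset of the commands (\type{push}, \type{pop}, \type{right}, \type{down},
\type{char} and \type{rule}); everything else is treated as not moving. It is a
model of the description above, not \LUATEX's interpreter, and the names
\type{net_movement} and \type{widths} as well as the character width are
assumptions made for the example.

\starttyping
-- Hypothetical helper: adds up the movements in a commands array.
-- 'widths' maps character numbers to widths in scaled points.
local function net_movement(commands, widths)
    local h, v, stack = 0, 0, {}
    for _, cmd in ipairs(commands) do
        local name = cmd[1]
        if name == 'push' then
            stack[#stack + 1] = { h, v }
        elseif name == 'pop' then
            local saved = table.remove(stack)
            if saved then
                h, v = saved[1], saved[2]
            end
        elseif name == 'right' then
            h = h + cmd[2]
        elseif name == 'down' then
            v = v + cmd[2]
        elseif name == 'char' then
            h = h + (widths[cmd[2]] or 0)
        elseif name == 'rule' then
            h = h + cmd[3]  -- arguments are height, then width
        end
    end
    return h, v
end

local commands = {
    {'push'},
    {'right', 5000},
    {'char', 97},
    {'pop'},
    {'down', -200000},
    {'rule', 500000, 20000},
}
local h, v = net_movement(commands, { [97] = 327681 })
print(string.format('%d %d', h, v))  --> 20000 -200000
\stoptyping

Whatever the sum comes to, the output pointer advances by the \type{width} of the
character hash, so a difference between the two is worth knowing about.
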
9044 \subsection{Artificial fonts}
9046 Even in a \quote{real} font, there can be virtual characters. When \LUATEX\ encounters a \type {commands}
9047 field inside a character when it becomes time to typeset the character, it will interpret the commands, just
9048 like for a true virtual character. In this case, if you have created no \quote{fonts} array, then the default
9049 (and only) \quote{base} font is taken to be the current font itself. In practice, this means that you can
9050 create virtual duplicates of existing characters which is useful if you want to create composite characters.
9052 Note: this feature does {\it not\/} work the other way around. There can not be \quote{real} characters in a
9053 virtual font! You cannot use this technique for font re-encoding either; you need a truly virtual
9054 font for that (because characters that are already present cannot be altered).
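
A composite character inside a real font could be set up along the following
lines. The sketch only manipulates the font table before it is passed to
\LUATEX. The function \type{add_composite}, the slot number 1000 and the metrics
are assumptions made for the example, and no attempt is made to position the
overlay properly.

\starttyping
-- Hypothetical helper: adds a composite character to a real font table.
-- No 'fonts' array is needed: the base font is the font itself.
local function add_composite(f, slot, base, overlay)
    if f.characters[slot] then
        return false  -- existing characters cannot be altered this way
    end
    local b, o = f.characters[base], f.characters[overlay]
    f.characters[slot] = {
        width    = b.width,
        height   = math.max(b.height or 0, o.height or 0),
        depth    = math.max(b.depth or 0, o.depth or 0),
        commands = {
            {'push'},
            {'char', overlay},  -- drawn first, position restored by pop
            {'pop'},
            {'char', base},
        },
    }
    return true
end

local f = {
    type       = 'real',
    characters = {
        [97]  = { width = 327681, height = 282168, depth = 0 },
        [126] = { width = 327681, height = 437910, depth = 0 },
    },
}
print(add_composite(f, 1000, 97, 126))  --> true
print(add_composite(f, 97, 97, 126))    --> false
\stoptyping
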
9056 \subsection{Example virtual font}
9058 Finally, here is a plain \TEX\ input file with a virtual font demonstration:
\startbuffer
\directlua {
  callback.register('define_font',
    function (name,size)
      if name == 'cmr10-red' then
        f = font.read_tfm('cmr10',size)
        f.name = 'cmr10-red'
        f.type = 'virtual'
        f.fonts = {{ name = 'cmr10', size = size }}
        for i,v in pairs(f.characters) do
          if (string.char(i)):find('[tacohanshartmut]') then
            v.commands = {
              {'special','pdf: 1 0 0 rg'},
              {'char',i},
              {'special','pdf: 0 g'},
            }
          else
            v.commands = {{'char',i}}
          end
        end
      else
        f = font.read_tfm(name,size)
      end
      return f
    end
  )
}

\font\myfont = cmr10-red at 10pt \myfont This is a line of text \par
\font\myfontx= cmr10 at 10pt \myfontx Here is another line of text \par
\stopbuffer
9092 \typebuffer
9094 %\getbuffer
9096 \chapter[nodes]{Nodes}
9098 \section{\LUA\ node representation}
\TEX's nodes are represented in \LUA\ as userdata objects with a variable
set of fields. In the following syntax tables, the type of such a
userdata object is represented as \syntax{<node>}.
9105 The current return value of \luatex{node.types()} is:
\ctxlua {
  local d = node.types()
  tex.print('\\type{' .. d[0] .. '} (' .. 0 .. '), ')
  for _,v in pairs(d) do
    if _ > 0 then
      tex.print('\\type{' .. v .. '} (' .. _ .. '), ')
    end
  end
}

9116 NOTE: The \type {\lastnodetype} primitive is \ETEX\ compliant. The valid
9117 range is still -1 .. 15 and glyph nodes have number 0 (used to be
9118 char node) and ligature nodes are mapped to 7. That way macro
9119 packages can use the same symbolic names as in traditional \ETEX.
9120 Keep in mind that the internal node numbers are different and that
9121 there are more node types than 15.
9123 \subsection{Auxiliary items}
9125 A few node|-|typed userdata objects do not occur in the \quote{normal}
9126 list of nodes, but can be pointed to from within that list. They are
9127 not quite the same as regular nodes, but it is easier for the library
9128 routines to treat them as if they were.
9130 \subsubsection{glue_spec items}
9132 Skips are about the only type of data objects in traditional \TEX\
9133 that are not a simple value. The structure that represents the glue
9134 components of a skip is called a \type {glue_spec}, and it has the following
9135 accessible fields:
9137 \starttabulate[|lT|l|p|]
9138 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
9139 \NC width \NC number \NC \NC\NR
9140 \NC stretch \NC number \NC \NC\NR
9141 \NC stretch_order \NC number \NC \NC\NR
9142 \NC shrink \NC number \NC \NC\NR
9143 \NC shrink_order \NC number \NC \NC\NR
9144 \NC writable \NC boolean \NC If this is true, you can't assign to this \type{glue_spec}
9145 because it is one of the preallocated special cases. New in 0.52\NC\NR
9146 \stoptabulate
9148 These objects are reference counted, so there is actually an extra
9149 read-only field named \type {ref_count} as well. This item type will likely
9150 disappear in the future, and the glue fields themselves will
9151 become part of the nodes referencing glue items.
9153 \subsubsection{attribute{\_}list and attribute items}
9155 The newly introduced attribute registers are non|-|trivial, because
9156 the value that is attached to a node is essentially a sparse array of
9157 key|-|value pairs.
9159 It is generally easiest to deal with attribute lists and attributes
9160 by using the dedicated functions in the \luatex{node} library, but
9161 for completeness, here is the low|-|level interface.
9163 An \type{attribute_list} item is used as a head pointer for a list
9164 of attribute items. It has only one user-visible field:
9166 \starttabulate[|lT|l|p|]
9167 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9168 \NC next \NC \syntax{<node>} \NC pointer to the first attribute\NC\NR
9169 \stoptabulate
9171 A normal node's attribute field will point to an item of type
9172 \type{attribute_list}, and the \type{next} field in that item will point
9173 to the first defined \quote{attribute} item, whose \type {next} will
9174 point to the second \quote{attribute} item, etc.
9176 Valid fields in \type{attribute} items:
9178 \starttabulate[|lT|l|p|]
9179 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9180 \NC next \NC \syntax{<node>} \NC pointer to the next attribute\NC\NR
9181 \NC number \NC number \NC the attribute type id\NC\NR
9182 \NC value \NC number \NC the attribute value\NC\NR
9183 \stoptabulate
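
Following these \type{next} pointers by hand looks as follows. The loop only uses
the fields described in this section, so it is shown here on plain tables
standing in for nodes; in \LUATEX\ the same function would be given a real node.
The name \type{attributes_of} is hypothetical, and in practice the dedicated
functions of the \luatex{node} library are usually the better choice.

\starttyping
-- Hypothetical helper: collects the attributes attached to a node
-- into a table that maps attribute numbers to values.
local function attributes_of(n)
    local result = {}
    local list = n.attr               -- the attribute_list item, if any
    local item = list and list.next   -- the first attribute item
    while item do
        result[item.number] = item.value
        item = item.next
    end
    return result
end

-- plain tables standing in for a glyph node with two attributes
local second = { number = 7, value = 42 }
local first  = { number = 3, value = 1, next = second }
local glyph  = { attr = { next = first } }

local found = attributes_of(glyph)
print(found[3])  --> 1
print(found[7])  --> 42
\stoptyping
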
9185 \subsubsection{action item}
9187 Valid fields: \showfields{action}\crlf
9188 Id: \showid{action}
9190 These are a special kind of item that only appears inside
9191 pdf start link objects.
9193 \starttabulate[|lT|l|p|]
9194 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9195 \NC action_type \NC number \NC \NC\NR
9196 \NC action_id \NC number or string \NC \NC\NR
9197 \NC named_id \NC number \NC \NC\NR
9198 \NC file \NC string \NC \NC\NR
9199 \NC new_window \NC number \NC \NC\NR
9200 \NC data \NC string \NC \NC\NR
9201 \NC ref_count \NC number \NC (read-only)\NC\NR
9202 \stoptabulate
9204 \subsection{Main text nodes}
9206 These are the nodes that comprise actual typesetting commands.
9208 A few fields are present in all nodes regardless of their type, these are:
9210 \starttabulate[|lT|l|p|]
9211 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9212 \NC next \NC \syntax{<node>} \NC The next node in a list, or nil\NC\NR
9213 \NC id \NC number \NC The node's type (\type{id}) number \NC\NR
9214 \NC subtype \NC number \NC The node \type{subtype} identifier\NC\NR
9215 \stoptabulate
9217 The \type{subtype} is sometimes just a stub entry. Not all nodes
9218 actually use the \type{subtype}, but this way you can be sure that all
9219 nodes accept it as a valid field name, and that is often handy in node
9220 list traversal. In the following tables \type{next} and \type{id} are
9221 not explicitly mentioned.
Besides these three fields, almost all nodes also have an \type {attr}
field, and there is also a field called \type{prev}. That last field
is always present, but only initialized on explicit request: when the
function \type{node.slide()} is called, it will set up the \type{prev}
fields to be a backwards pointer in the argument node list.
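
Because \type{next}, \type{id} and \type{subtype} are accepted by every node, a
list can be walked without knowing the node types in advance. A minimal sketch,
shown on plain tables in place of nodes: \type{count_by_id} is a made|-|up name,
and the \type{id} values used are arbitrary rather than the real numbers
reported by \luatex{node.types()}.

\starttyping
-- Hypothetical helper: counts how often each node id occurs in a list.
local function count_by_id(head)
    local counts = {}
    local n = head
    while n do
        counts[n.id] = (counts[n.id] or 0) + 1
        n = n.next
    end
    return counts
end

-- three stand-in nodes linked through their 'next' fields
local c = { id = 37, subtype = 0 }
local b = { id = 10, subtype = 0, next = c }
local a = { id = 37, subtype = 0, next = b }
print(count_by_id(a)[37])  --> 2
\stoptyping
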
9230 \subsubsection{hlist nodes}
9232 Valid fields: \showfields{hlist}\crlf
9233 Id: \showid{hlist}
9235 \starttabulate[|lT|l|p|]
9236 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
\NC subtype \NC number \NC 0 = unknown origin, 1 = created by
linebreaking, 2 = explicit box command (0.46.0),
3 = paragraph indentation box, 4 = alignment column or row, 5 = alignment cell (0.62.0)\NC\NR
9240 \NC attr \NC \syntax{<node>} \NC The head of the associated attribute list \NC\NR
9241 \NC width \NC number \NC \NC\NR
9242 \NC height \NC number \NC \NC\NR
9243 \NC depth \NC number \NC \NC\NR
9244 \NC shift \NC number \NC a displacement perpendicular to the
9245 character progression direction \NC\NR
9246 \NC glue_order \NC number \NC a number in the range 0--4, indicating
9247 the glue order\NC\NR
9248 \NC glue_set \NC number \NC the calculated glue ratio\NC\NR
9249 \NC glue_sign \NC number \NC 0 = normal,1 = stretching,2 = shrinking \NC\NR
9250 \NC head \NC \syntax{<node>} \NC the first node of the body of this list\NC\NR
9251 \NC dir \NC string \NC the direction of this box. see~\in{}[dirnodes]\NC\NR
9252 \stoptabulate
9254 A warning: never assign a node list to the \type{head} field
9255 unless you are sure its internal link structure is correct, otherwise
9256 an error may result.
9258 Note: the new field name \type{head} was introduced in 0.65 to replace
9259 the old name \type{list}. Use of the name \type{list} is now
9260 deprecated, but it will stay available until at least version 0.80.
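
The fields \type{glue_set}, \type{glue_sign} and \type{glue_order} together
determine how wide each glue item inside the box ends up. The following sketch
applies the usual \TEX\ rule: glue stretches or shrinks only if its order matches
\type{glue_order}. It reads the \type{spec} of a glue node, works on plain tables
as well as on nodes, and rounds the result to whole scaled points. The name
\type{effective_glue_width} and the sample values are assumptions.

\starttyping
-- Hypothetical helper: width of a glue item once the box's glue
-- setting has been applied, rounded to scaled points.
local function effective_glue_width(box, glue)
    local spec  = glue.spec
    local width = spec.width
    if box.glue_sign == 1 and spec.stretch_order == box.glue_order then
        width = width + spec.stretch * box.glue_set
    elseif box.glue_sign == 2 and spec.shrink_order == box.glue_order then
        width = width - spec.shrink * box.glue_set
    end
    return math.floor(width + 0.5)
end

-- plain tables standing in for an hlist node and a glue node
local box  = { glue_sign = 1, glue_order = 0, glue_set = 0.5 }
local glue = { spec = { width = 218453,
                        stretch = 109226, stretch_order = 0,
                        shrink  = 72818,  shrink_order  = 0 } }
print(effective_glue_width(box, glue))  --> 273066
\stoptyping
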
9262 \subsubsection{vlist nodes}
9264 Valid fields: As for hlist, except that \quote{shift} is a displacement
9265 perpendicular to the line progression direction, and \quote{subtype} only
9266 has subtypes 0, 4, and 5.
9268 \subsubsection{rule nodes}
9270 Valid fields: \showfields{rule}\crlf
9271 Id: \showid{rule}
9273 \starttabulate[|lT|l|p|]
9274 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9275 \NC subtype \NC number \NC unused\NC\NR
9276 \NC attr \NC \syntax{<node>} \NC \NC\NR
9277 \NC width \NC number \NC the width of the rule; the special value $-1073741824$
9278 is used for \quote{running} glue dimensions\NC\NR
9279 \NC height \NC number \NC the height of the rule (can be negative)\NC\NR
9280 \NC depth \NC number \NC the depth of the rule (can be negative)\NC\NR
9281 \NC dir \NC string \NC the direction of this rule. see~\in{}[dirnodes]\NC\NR
9282 \stoptabulate
9284 \subsubsection{ins nodes}
9286 Valid fields: \showfields{ins}\crlf
9287 Id: \showid{ins}
9289 \starttabulate[|lT|l|p|]
9290 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9291 \NC subtype \NC number \NC the insertion class\NC\NR
9292 \NC attr \NC \syntax{<node>} \NC \NC\NR
9293 \NC cost \NC number \NC the penalty associated with this insert\NC\NR
9294 \NC height \NC number \NC \NC\NR
9295 \NC depth \NC number \NC \NC\NR
9296 \NC head \NC \syntax{<node>} \NC the first node of the body of this insert\NC\NR
9297 \NC spec \NC \syntax{<node>} \NC a pointer to the \tex{splittopskip} glue spec\NC\NR
9298 \stoptabulate
A warning: never assign a node list to the \type{head} field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9304 Note: the new field name \type{head} was introduced in 0.65 to replace
9305 the old name \type{list}. Use of the name \type{list} is now
9306 deprecated, but it will stay available until at least version 0.80.
9309 \subsubsection{mark nodes}
9311 Valid fields: \showfields{mark}\crlf
9312 Id: \showid{mark}
9314 \starttabulate[|lT|l|p|]
9315 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9316 \NC subtype \NC number \NC unused\NC\NR
9317 \NC attr \NC \syntax{<node>} \NC \NC\NR
9318 \NC class \NC number \NC the mark class\NC\NR
9319 \NC mark \NC table \NC a table representing a token list\NC\NR
9320 \stoptabulate
9322 \subsubsection{adjust nodes}
9324 Valid fields: \showfields{adjust}\crlf
9325 Id: \showid{adjust}
9327 \starttabulate[|lT|l|p|]
9328 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9329 \NC subtype \NC number \NC 0 = normal, 1 = \quote{pre}\NC\NR
9330 \NC attr \NC \syntax{<node>} \NC \NC\NR
9331 \NC head \NC \syntax{<node>} \NC adjusted material\NC\NR
9332 \stoptabulate
A warning: never assign a node list to the \type{head} field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9338 Note: the new field name \type{head} was introduced in 0.65 to replace
9339 the old name \type{list}. Use of the name \type{list} is now
9340 deprecated, but it will stay available until at least version 0.80.
9343 \subsubsection{disc nodes}
9345 Valid fields: \showfields{disc}\crlf
9346 Id: \showid{disc}
9348 \starttabulate[|lT|l|p|]
9349 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9350 \NC subtype \NC number \NC indicates the source of a discretionary.
9351 0 = the \tex{discretionary} command,
9352 1 = the \tex{-} command,
9353 2 = added automatically following a \type{-},
9354 3 = added by the hyphenation algorithm (simple),
9355 4 = added by the hyphenation algorithm (hard, first item),
9356 5 = added by the hyphenation algorithm (hard, second item)\NC\NR
9357 \NC attr \NC \syntax{<node>} \NC \NC\NR
9358 \NC pre \NC \syntax{<node>} \NC pointer to the pre|-|break text\NC\NR
9359 \NC post \NC \syntax{<node>} \NC pointer to the post|-|break text\NC\NR
9360 \NC replace \NC \syntax{<node>} \NC pointer to the no|-|break text\NC\NR
9361 \stoptabulate
9363 The subtype numbers~4 and~5 belong to the \quote{of-f-ice} explanation given elsewhere.
A warning: never assign a node list to the \type{pre}, \type{post} or \type{replace} field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9369 \subsubsection{math nodes}
9371 Valid fields: \showfields{math}\crlf
9372 Id: \showid{math}
9374 \starttabulate[|lT|l|p|]
9375 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9376 \NC subtype \NC number \NC 0 = \quote{on}, 1 = \quote{off}\NC\NR
9377 \NC attr \NC \syntax{<node>} \NC \NC\NR
9378 \NC surround \NC number \NC width of the \tex{mathsurround} kern\NC\NR
9379 \stoptabulate
9381 \subsubsection{glue nodes}
9383 Valid fields: \showfields{glue}\crlf
9384 Id: \showid{glue}
9386 \starttabulate[|lT|l|p|]
9387 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9388 \NC subtype \NC number \NC 0 = \tex{skip},
9389 1--18 = internal glue parameters,
9390 100-103 = \quote{leader} subtypes \NC\NR
9391 \NC attr \NC \syntax{<node>} \NC \NC\NR
9392 \NC spec \NC \syntax{<node>} \NC pointer to a glue{\_}spec item \NC\NR
9393 \NC leader \NC \syntax{<node>} \NC pointer to a box or rule for leaders\NC\NR
9394 \stoptabulate
9396 The exact meanings of the subtypes are as follows:
9398 \starttabulate[|rT|l|]
9399 \NC 1 \NC \tex{lineskip} \NC \NR
9400 \NC 2 \NC \tex{baselineskip} \NC \NR
9401 \NC 3 \NC \tex{parskip} \NC \NR
9402 \NC 4 \NC \tex{abovedisplayskip} \NC \NR
9403 \NC 5 \NC \tex{belowdisplayskip} \NC \NR
9404 \NC 6 \NC \tex{abovedisplayshortskip} \NC \NR
9405 \NC 7 \NC \tex{belowdisplayshortskip} \NC \NR
9406 \NC 8 \NC \tex{leftskip} \NC \NR
9407 \NC 9 \NC \tex{rightskip} \NC \NR
9408 \NC 10 \NC \tex{topskip} \NC \NR
9409 \NC 11 \NC \tex{splittopskip} \NC \NR
9410 \NC 12 \NC \tex{tabskip} \NC \NR
9411 \NC 13 \NC \tex{spaceskip} \NC \NR
9412 \NC 14 \NC \tex{xspaceskip} \NC \NR
9413 \NC 15 \NC \tex{parfillskip} \NC \NR
9414 \NC 16 \NC \tex{thinmuskip} \NC \NR
9415 \NC 17 \NC \tex{medmuskip} \NC \NR
9416 \NC 18 \NC \tex{thickmuskip} \NC \NR
9417 \NC 100 \NC \tex{leaders} \NC \NR
9418 \NC 101 \NC \tex{cleaders} \NC \NR
9419 \NC 102 \NC \tex{xleaders} \NC \NR
9420 \NC 103 \NC \tex{gleaders} \NC \NR
9421 \stoptabulate
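
In \LUA\ code the table above is conveniently kept as a lookup table. The sketch
below simply transcribes it, together with subtype~0 from the field list. The
names \type{glue_subtype_names} and \type{glue_subtype_name} are chosen for the
example.

\starttyping
-- Transcription of the subtype table; positional entries get 1..18.
local glue_subtype_names = {
    [0] = 'skip',
    'lineskip', 'baselineskip', 'parskip',
    'abovedisplayskip', 'belowdisplayskip',
    'abovedisplayshortskip', 'belowdisplayshortskip',
    'leftskip', 'rightskip', 'topskip', 'splittopskip',
    'tabskip', 'spaceskip', 'xspaceskip', 'parfillskip',
    'thinmuskip', 'medmuskip', 'thickmuskip',
    [100] = 'leaders', [101] = 'cleaders',
    [102] = 'xleaders', [103] = 'gleaders',
}

-- Hypothetical helper: name of a glue subtype, or 'unknown'.
local function glue_subtype_name(subtype)
    return glue_subtype_names[subtype] or 'unknown'
end

print(glue_subtype_name(9))    --> rightskip
print(glue_subtype_name(103))  --> gleaders
\stoptyping
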
9423 \subsubsection{kern nodes}
9425 Valid fields: \showfields{kern}\crlf
9426 Id: \showid{kern}
9428 \starttabulate[|lT|l|p|]
9429 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9430 \NC subtype \NC number \NC 0 = from font,
9431 1 = from \tex{kern} or \tex{/},
9432 2 = from \tex{accent}\NC\NR
9433 \NC attr \NC \syntax{<node>} \NC \NC\NR
9434 \NC kern \NC number \NC \NC\NR
9435 \stoptabulate
9438 \subsubsection{penalty nodes}
9440 Valid fields: \showfields{penalty}\crlf
9441 Id: \showid{penalty}
9443 \starttabulate[|lT|l|p|]
9444 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9445 \NC subtype \NC number \NC not used\NC\NR
9446 \NC attr \NC \syntax{<node>} \NC \NC\NR
9447 \NC penalty \NC number \NC \NC\NR
9448 \stoptabulate
9450 \subsubsection[glyphnodes]{glyph nodes}
9452 Valid fields: \showfields{glyph}\crlf
9453 Id: \showid{glyph}
9455 \starttabulate[|lT|l|p|]
9456 \NC \ssbf field \NC \ssbf type \NC \ssbf explanation \NC \NR
9457 \NC subtype \NC number \NC bitfield \NC \NR
9458 \NC attr \NC \syntax{<node>} \NC \NC \NR
9459 \NC char \NC number \NC \NC \NR
9460 \NC font \NC number \NC \NC \NR
9461 \NC lang \NC number \NC \NC \NR
9462 \NC left \NC number \NC \NC \NR
9463 \NC right \NC number \NC \NC \NR
9464 \NC uchyph \NC boolean \NC \NC \NR
9465 \NC components \NC \syntax{<node>} \NC pointer to ligature components \NC \NR
9466 \NC xoffset \NC number \NC \NC \NR
9467 \NC yoffset \NC number \NC \NC \NR
9468 \NC width \NC number \NC (new in 0.53) \NC \NR
9469 \NC height \NC number \NC (new in 0.53) \NC \NR
9470 \NC depth \NC number \NC (new in 0.53) \NC \NR
9471 \NC expansion_factor \NC number \NC (new in 0.78) \NC \NR
9472 \stoptabulate
A warning: never assign a node list to the \type{components} field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9478 Valid bits for the \type{subtype} field are:
9480 \starttabulate[|c|l|]
9481 \NC \ssbf bit \NC \bf meaning \NC\NR
9482 \NC 0 \NC character \NC\NR
9483 \NC 1 \NC ligature \NC\NR
9484 \NC 2 \NC ghost \NC\NR
9485 \NC 3 \NC left \NC\NR
9486 \NC 4 \NC right \NC\NR
9487 \stoptabulate
9489 See \in{section}[charsandglyphs] for a detailed description of the
9490 \type{subtype} field.
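
The bits can be tested from \LUA\ with plain arithmetic. The following is only a
sketch: it assumes that bit $n$ in the table above carries the value $2^n$, and
the helper name \type{glyph_subtype_flags} is made up for this illustration.

\starttyping
-- names of the bits, indexed by bit number as in the table above
local bit_names = { [0] = "character", "ligature", "ghost", "left", "right" }

-- return a list with the names of all bits that are set in a subtype
local function glyph_subtype_flags(subtype)
    local flags = { }
    for bit = 0, 4 do
        -- shift right by 'bit' positions and test the lowest bit
        if math.floor(subtype / 2^bit) % 2 == 1 then
            flags[#flags + 1] = bit_names[bit]
        end
    end
    return flags
end
\stoptyping

Arithmetic is used instead of a bit library so that the same code can run in
\LUATEX\ as well as in \LUAJITTEX.
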
The \type {expansion_factor} field is relatively new and is the result of extensive
experiments with a more efficient implementation of expansion. Early versions of
\LUATEX\ already replaced multiple instances of fonts in the backend by scaling.
Contrary to \PDFTEX, \LUATEX\ now also gets rid of font copies in the
frontend and replaces them by expansion factors that travel with glyph nodes. Apart
from being a cleaner approach, this is also a step towards a better separation between
front- and backend.
9500 \subsubsection{margin{\_}kern nodes}
9502 Valid fields: \showfields{margin_kern}\crlf
9503 Id: \showid{margin_kern}
9505 \starttabulate[|lT|l|p|]
9506 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9507 \NC subtype \NC number \NC 0 = left side,
9508 1 = right side\NC\NR
9509 \NC attr \NC \syntax{<node>} \NC \NC\NR
9510 \NC width \NC number \NC \NC\NR
9511 \NC glyph \NC \syntax{<node>} \NC \NC\NR
9512 \stoptabulate
9514 \subsection{Math nodes}
These are the so||called \quote{noad}s and the nodes that are specifically
associated with math processing. Most of these nodes contain subnodes so
that the list of possible fields is actually quite small. First, the subnodes:
9520 \subsubsection{Math kernel subnodes}
9522 Many object fields in math mode are either simple characters in a
9523 specific family or math lists or node lists. There are four associated
9524 subnodes that represent these cases (in the following node
9525 descriptions these are indicated by the word \type{<kernel>}).
9527 The \type{next} and \type{prev} fields for these subnodes are unused.
9529 \subsubsubsection{math{\_}char and math{\_}text{\_}char subnodes}
9531 Valid fields: \showfields{math_char}\crlf
9532 Id: \showid{math_char}
9534 \starttabulate[|lT|l|p|]
9535 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9536 \NC attr \NC \syntax{<node>}\NC \NC\NR
9537 \NC char \NC number \NC \NC \NR
9538 \NC fam \NC number \NC \NC\NR
9539 \stoptabulate
The \type{math_char} is the simplest subnode field: it contains
the character and family for a single glyph object. The
\type{math_text_char} is a special case that you will not
normally encounter. It arises temporarily during math list conversion
(its sole function is to suppress a following italic correction).
9547 \subsubsubsection{sub{\_}box and sub{\_}mlist subnodes}
9549 Valid fields: \showfields{sub_box}\crlf
9550 Id: \showid{sub_box}
9552 \starttabulate[|lT|l|p|]
9553 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9554 \NC attr \NC \syntax{<node>}\NC \NC\NR
9555 \NC head \NC \syntax{<node>}\NC \NC \NR
9556 \stoptabulate
9558 These two subnode types are used for subsidiary list items. For
9559 \type{sub_box}, the \type{head} points to a \quote{normal} vbox or
9560 hbox. For \type{sub_mlist}, the \type{head} points to a math list
9561 that is yet to be converted.
A warning: never assign a node list to the \type{head} field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9567 Note: the new field name \type{head} was introduced in 0.65 to replace
9568 the old name \type{list}. Use of the name \type{list} is now
9569 deprecated, but it will stay available until at least version 0.80.
9571 \subsubsection{Math delimiter subnode}
9573 There is a fifth subnode type that is used exclusively for delimiter
9574 fields. As before, the \type{next} and \type{prev} fields are unused.
9576 \subsubsubsection{delim subnodes}
9578 Valid fields: \showfields{delim}\crlf
9579 Id: \showid{delim}
9581 \starttabulate[|lT|l|p|]
9582 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9583 \NC attr \NC \syntax{<node>}\NC \NC\NR
9584 \NC small_char \NC number \NC \NC \NR
9585 \NC small_fam \NC number \NC \NC\NR
9586 \NC large_char \NC number \NC \NC \NR
9587 \NC large_fam \NC number \NC \NC\NR
9588 \stoptabulate
The fields \type{large_char} and \type{large_fam} can be zero. In that
case the font that is used for the \type{small_fam} is expected to
provide the large version as an extension to the \type{small_char}.
9594 \subsubsection{Math core nodes}
First, there are the objects (the \TEX book calls them \quote{atoms})
9597 that are associated with the simple math objects: Ord, Op, Bin, Rel,
9598 Open, Close, Punct, Inner, Over, Under, Vcent. These all have
9599 the same fields, and they are combined into a single node type with
9600 separate subtypes for differentiation.
9602 \subsubsubsection{simple nodes}
9604 Valid fields: \showfields{noad}\crlf
9605 Id: \showid{noad}
9607 \starttabulate[|lT|l|p|]
9608 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9609 \NC subtype \NC number \NC see below \NC\NR
9610 \NC attr \NC \syntax{<node>} \NC \NC\NR
9611 \NC nucleus \NC \syntax{<kernel>}\NC \NC\NR
9612 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9613 \NC sup \NC \syntax{<kernel>}\NC \NC\NR
9614 \stoptabulate
Operators are a bit special because they occupy three subtypes. The possible
values of the \type{subtype} field are:
9619 \starttabulate[|lT|p|]
9620 \NC \ssbf number \NC \bf node sub type \NC\NR
9621 \NC 0 \NC Ord \NC\NR
9622 \NC 1 \NC Op, \type{\displaylimits} \NC\NR
9623 \NC 2 \NC Op, \type{\limits} \NC\NR
9624 \NC 3 \NC Op, \type{\nolimits} \NC\NR
9625 \NC 4 \NC Bin \NC\NR
9626 \NC 5 \NC Rel \NC\NR
9627 \NC 6 \NC Open \NC\NR
9628 \NC 7 \NC Close \NC\NR
9629 \NC 8 \NC Punct \NC\NR
9630 \NC 9 \NC Inner \NC\NR
9631 \NC 10 \NC Under \NC\NR
9632 \NC 11 \NC Over \NC\NR
9633 \NC 12 \NC Vcent \NC\NR
9634 \stoptabulate
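
The following sketch shows how these subtypes could be inspected from \LUA. It
assumes a format where \type{callback.register} is directly available (as in
plain \TEX). The function name \type{report_noads} and the layout of the log
message are made up for this illustration.

\starttyping
local noad_id      = node.id("noad")
local math_char_id = node.id("math_char")

-- subtype numbers as listed in the table above
local names = {
    [0] = "Ord", "Op (displaylimits)", "Op (limits)", "Op (nolimits)",
    "Bin", "Rel", "Open", "Close", "Punct", "Inner", "Under", "Over", "Vcent",
}

local function report_noads(head, display_type, need_penalties)
    -- only the top level of the math list is visited here
    for n in node.traverse_id(noad_id, head) do
        local nucleus = n.nucleus
        if nucleus and nucleus.id == math_char_id then
            texio.write_nl(string.format("noad %s: char %d, fam %d",
                names[n.subtype] or "?", nucleus.char, nucleus.fam))
        end
    end
    -- hand over to the built-in conversion so that typesetting is unaffected
    return node.mlist_to_hlist(head, display_type, need_penalties)
end

callback.register("mlist_to_hlist", report_noads)
\stoptyping
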
9636 \subsubsubsection{accent nodes}
9638 Valid fields: \showfields{accent}\crlf
9639 Id: \showid{accent}
9641 \starttabulate[|lT|l|p|]
9642 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9643 \NC subtype \NC number \NC the first bit is used for a fixed top accent flag (if the \type{accent} field is present),
9644 the second bit for a fixed bottom accent flag (if the \type{bot_accent} field is present).
9645 Example: the actual value \type{3} means: do not stretch either accent\NC\NR
9646 \NC attr \NC \syntax{<node>}\NC \NC\NR
9647 \NC nucleus \NC \syntax{<kernel>}\NC \NC \NR
9648 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9649 \NC sup \NC \syntax{<kernel>}\NC \NC \NR
9650 \NC accent \NC \syntax{<kernel>}\NC \NC\NR
9651 \NC bot_accent \NC \syntax{<kernel>}\NC \NC\NR
9652 \stoptabulate
9654 \subsubsubsection{style nodes}
9656 Valid fields: \showfields{style}\crlf
9657 Id: \showid{style}
9659 \starttabulate[|lT|l|p|]
9660 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9661 \NC style \NC string \NC contains the style \NC\NR
9662 \stoptabulate
There are eight possibilities for the string value: one of the four base styles
\quote{display}, \quote{text}, \quote{script}, or \quote{scriptscript},
optionally followed by a trailing \type{'} to signify the
\quote{cramped} variant of that style.
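
A style string can be taken apart with a small helper. This is a sketch in plain
\LUA; the name \type{parse_style} is hypothetical.

\starttyping
local base_styles = {
    display = true, text = true, script = true, scriptscript = true,
}

-- split a style string into its base style and a 'cramped' flag;
-- returns nil when the string is not one of the eight known values
local function parse_style(style)
    local base, prime = string.match(style, "^(%a+)('?)$")
    if not base or not base_styles[base] then
        return nil
    end
    return base, prime == "'"
end
\stoptyping
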
9669 \subsubsubsection{choice nodes}
9671 Valid fields: \showfields{choice}\crlf
9672 Id: \showid{choice}
9674 \starttabulate[|lT|l|p|]
9675 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9676 \NC attr \NC \syntax{<node>}\NC \NC\NR
9677 \NC display \NC \syntax{<node>}\NC \NC\NR
9678 \NC text \NC \syntax{<node>}\NC \NC\NR
9679 \NC script \NC \syntax{<node>}\NC \NC\NR
9680 \NC scriptscript \NC \syntax{<node>}\NC \NC\NR
9681 \stoptabulate
A warning: never assign a node list to the display, text, script, or
scriptscript field unless you are sure its internal link structure is
correct, otherwise an error may result.
9687 \subsubsubsection{radical nodes}
9689 Valid fields: \showfields{radical}\crlf
9690 Id: \showid{radical}
9692 \starttabulate[|lT|l|p|]
9693 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9694 \NC attr \NC \syntax{<node>}\NC \NC\NR
9695 \NC nucleus \NC \syntax{<kernel>}\NC \NC \NR
9696 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9697 \NC sup \NC \syntax{<kernel>}\NC \NC \NR
9698 \NC left \NC \syntax{<delim>}\NC \NC \NR
9699 \NC degree \NC \syntax{<kernel>}\NC Only set by \type{\Uroot} \NC \NR
9700 \stoptabulate
A warning: never assign a node list to the nucleus, sub, sup, left, or
degree field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9707 \subsubsubsection{fraction nodes}
9709 Valid fields: \showfields{fraction}\crlf
9710 Id: \showid{fraction}
9712 \starttabulate[|lT|l|p|]
9713 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9714 \NC attr \NC \syntax{<node>}\NC \NC\NR
9715 \NC width \NC number \NC \NC \NR
9716 \NC num \NC \syntax{<kernel>}\NC \NC\NR
9717 \NC denom \NC \syntax{<kernel>}\NC \NC \NR
9718 \NC left \NC \syntax{<delim>}\NC \NC \NR
9719 \NC right \NC \syntax{<delim>}\NC \NC \NR
9720 \stoptabulate
A warning: never assign a node list to the num or denom field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9726 \subsubsubsection{fence nodes}
9728 Valid fields: \showfields{fence}\crlf
9729 Id: \showid{fence}
9731 \starttabulate[|lT|l|p|]
9732 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9733 \NC subtype \NC number \NC 1 = \type{\left},
9734 2 = \type{\middle},
9735 3 = \type{\right} \NC\NR
9736 \NC attr \NC \syntax{<node>}\NC \NC\NR
9737 \NC delim \NC \syntax{<delim>}\NC \NC \NR
9738 \stoptabulate
9740 \subsection{whatsit nodes}
9742 Whatsit nodes come in many subtypes that you can ask for by running
9743 \luatex{node.whatsits()}:
9744 \ctxlua {for n,name in table.sortedpairs(node.whatsits()) do
9745 if (n<100) then
9746 if (n>0) then tex.sprint (', ') end
9747 tex.sprint('\\type{' .. name .. '} (' .. n .. ')') end
9748 end }
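
Because the numbers can differ between versions, it is safer to look them up by
name than to hard|-|code them. The following sketch does that with the table
returned by \type{node.whatsits()}. The helper names \type{whatsit_id} and
\type{count_literals} are made up for this illustration.

\starttyping
-- find the numeric whatsit subtype that belongs to a name
local function whatsit_id(name)
    for number, subtype_name in pairs(node.whatsits()) do
        if subtype_name == name then
            return number
        end
    end
end

local whatsit     = node.id("whatsit")
local pdf_literal = whatsit_id("pdf_literal")

-- count the pdf_literal whatsits at the top level of a node list
local function count_literals(head)
    local count = 0
    for n in node.traverse_id(whatsit, head) do
        if n.subtype == pdf_literal then
            count = count + 1
        end
    end
    return count
end
\stoptyping

Note that \type{node.traverse_id} does not descend into nested lists, so boxes
inside the list are not inspected by this sketch.
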
9750 \subsubsection{open nodes}
9752 Valid fields: \showfields{whatsit,open}\crlf
9753 Id: \showid{whatsit,open}
9755 \starttabulate[|lT|l|p|]
9756 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9757 \NC attr \NC \syntax{<node>} \NC \NC\NR
9758 \NC stream \NC number \NC \TEX's stream id number\NC\NR
9759 \NC name \NC string \NC file name \NC\NR
9760 \NC ext \NC string \NC file extension \NC\NR
9761 \NC area \NC string \NC file area (this may become obsolete) \NC\NR
9762 \stoptabulate
9764 \subsubsection{write nodes}
9766 Valid fields: \showfields{whatsit,write}\crlf
9767 Id: \showid{whatsit,write}
9769 \starttabulate[|lT|l|p|]
9770 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9771 \NC attr \NC \syntax{<node>} \NC \NC\NR
9772 \NC stream \NC number \NC \TEX's stream id number\NC\NR
9773 \NC data \NC table \NC a table representing the token list to be written\NC\NR
9774 \stoptabulate
9776 \subsubsection{close nodes}
9778 Valid fields: \showfields{whatsit,close}\crlf
9779 Id: \showid{whatsit,close}
9781 \starttabulate[|lT|l|p|]
9782 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9783 \NC attr \NC \syntax{<node>} \NC \NC\NR
9784 \NC stream \NC number \NC \TEX's stream id number\NC\NR
9785 \stoptabulate
9787 \subsubsection{special nodes}
9789 Valid fields: \showfields{whatsit,special}\crlf
9790 Id: \showid{whatsit,special}
9792 \starttabulate[|lT|l|p|]
9793 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9794 \NC attr \NC \syntax{<node>} \NC \NC\NR
9795 \NC data \NC string \NC the \tex{special} information\NC\NR
9796 \stoptabulate
9798 \subsubsection{language nodes}
9801 \LUATEX\ does not have language whatsits any more. All language
9802 information is already present inside the glyph nodes themselves.
9803 This whatsit subtype will be removed in the next release.
9806 \subsubsection{local_par nodes}
9808 Valid fields: \showfields{whatsit,local_par}\crlf
9809 Id: \showid{whatsit,local_par}
9811 \starttabulate[|lT|l|p|]
9812 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9813 \NC attr \NC \syntax{<node>} \NC \NC\NR
9814 \NC pen_inter \NC number \NC local interline penalty (from \tex{localinterlinepenalty})\NC\NR
9815 \NC pen_broken\NC number \NC local broken penalty (from \tex{localbrokenpenalty})\NC\NR
\NC dir \NC string \NC the direction of this par, see~\in{section}[dirnodes]\NC\NR
9817 \NC box_left \NC \syntax{<node>} \NC the \tex{localleftbox}\NC\NR
9818 \NC box_left_width\NC number\NC width of the \tex{localleftbox}\NC\NR
9819 \NC box_right \NC \syntax{<node>} \NC the \tex{localrightbox}\NC\NR
9820 \NC box_right_width\NC number\NC width of the \tex{localrightbox}\NC\NR
9821 \stoptabulate
A warning: never assign a node list to the box_left or box_right field
unless you are sure its internal link structure is correct, otherwise
an error may result.
9830 \subsubsection[dirnodes]{dir nodes}
9832 Valid fields: \showfields{whatsit,dir}\crlf
9833 Id: \showid{whatsit,dir}
9835 \starttabulate[|lT|l|p|]
9836 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9837 \NC attr \NC \syntax{<node>} \NC \NC\NR
9838 \NC dir \NC string \NC the direction (but see below)\NC\NR
9839 \NC level \NC number \NC nesting level of this direction whatsit\NC\NR
9840 \NC dvi_ptr \NC number \NC a saved dvi buffer byte offset\NC\NR
9841 \NC dir_h \NC number \NC a saved dvi position\NC\NR
9842 \stoptabulate
9844 A note on \type{dir} strings. Direction specifiers are three-letter
9845 combinations of \type{T}, \type{B}, \type{R}, and \type{L}.
9847 These are built up out of three separate items:
9848 \startitemize
9849 \item the first is the direction of the \quote{top} of paragraphs.
9850 \item the second is the direction of the \quote{start} of lines.
9851 \item the third is the direction of the \quote{top} of glyphs.
9852 \stopitemize
9854 However, only four combinations are accepted: \type{TLT}, \type{TRT},
9855 \type{RTT}, and \type{LTL}.
9857 Inside actual \type{dir} whatsit nodes, the representation of
9858 \type{dir} is not a three-letter but a four-letter combination. The
9859 first character in this case is always either \type{+} or \type{-},
9860 indicating whether the value is pushed or popped from the direction
9861 stack.
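
The four|-|letter form can be split into its two parts as sketched below. This
is plain \LUA\ and the function name \type{parse_dir} is hypothetical.

\starttyping
-- the only direction specifiers that are accepted
local accepted = { TLT = true, TRT = true, RTT = true, LTL = true }

-- split the dir field of a dir whatsit into an action and a direction;
-- returns nil when the string does not have the expected shape
local function parse_dir(dir)
    local sign, spec = string.match(dir, "^([+-])(%u%u%u)$")
    if not sign or not accepted[spec] then
        return nil
    end
    return (sign == "+") and "push" or "pop", spec
end
\stoptyping
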
9863 \subsubsection{pdf_literal nodes}
9865 Valid fields: \showfields{whatsit,pdf_literal}\crlf
9866 Id: \showid{whatsit,pdf_literal}
9868 \starttabulate[|lT|l|p|]
9869 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9870 \NC attr \NC \syntax{<node>} \NC \NC\NR
9871 \NC mode \NC number \NC the \quote{mode} setting of this literal\NC\NR
9872 \NC data \NC string \NC the \tex{pdfliteral} information\NC\NR
9873 \stoptabulate
9875 Mode values:
9877 \starttabulate[|lT|p|]
9878 \NC \ssbf value \NC \ssbf corresponding \tex{pdftex} keyword \NC \NR
9879 \NC 0 \NC setorigin \NC \NR
9880 \NC 1 \NC page \NC \NR
9881 \NC 2 \NC direct \NC \NR
9882 \stoptabulate
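
As a sketch, a literal can also be created from \LUA\ and appended to the list
that is currently being built. The \PDF\ code used here (a small gray square) is
only an example, and the code is assumed to run at a point where
\type{node.write} is allowed, for instance inside \tex{directlua}.

\starttyping
local literal = node.new(node.id("whatsit"), node.subtype("pdf_literal"))

literal.mode = 1                          -- 1 = page
literal.data = "q 0.5 g 0 0 10 10 re f Q" -- save, gray fill, rectangle, restore

-- append the node to the current list
node.write(literal)
\stoptyping
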
9884 \subsubsection{pdf_refobj nodes}
9886 Valid fields: \showfields{whatsit,pdf_refobj}\crlf
9887 Id: \showid{whatsit,pdf_refobj}
9889 \starttabulate[|lT|l|p|]
9890 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9891 \NC attr \NC \syntax{<node>} \NC \NC\NR
9892 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9893 \stoptabulate
9895 \subsubsection{pdf_refxform nodes}
9897 Valid fields: \showfields{whatsit,pdf_refxform}\crlf
Id: \showid{whatsit,pdf_refxform}
9900 \starttabulate[|lT|l|p|]
9901 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9902 \NC attr \NC \syntax{<node>} \NC \NC\NR
9903 \NC width \NC number \NC \NC \NR
9904 \NC height \NC number \NC \NC \NR
9905 \NC depth \NC number \NC \NC \NR
9906 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9907 \stoptabulate
9909 Be aware that \type{pdf_refxform} nodes have dimensions that are used by \LUATEX.
9911 \subsubsection{pdf_refximage nodes}
9913 Valid fields: \showfields{whatsit,pdf_refximage}\crlf
9914 Id: \showid{whatsit,pdf_refximage}
9916 \starttabulate[|lT|l|p|]
9917 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9918 \NC attr \NC \syntax{<node>} \NC \NC\NR
9919 \NC width \NC number \NC \NC \NR
9920 \NC height \NC number \NC \NC \NR
9921 \NC depth \NC number \NC \NC \NR
9922 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9923 \stoptabulate
9925 Be aware that \type{pdf_refximage} nodes have dimensions that are used by \LUATEX.
9927 \subsubsection{pdf_annot nodes}
9929 Valid fields: \showfields{whatsit,pdf_annot}\crlf
9930 Id: \showid{whatsit,pdf_annot}
9932 \starttabulate[|lT|l|p|]
9933 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9934 \NC attr \NC \syntax{<node>} \NC \NC\NR
9935 \NC width \NC number \NC \NC \NR
9936 \NC height \NC number \NC \NC \NR
9937 \NC depth \NC number \NC \NC \NR
9938 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9939 \NC data \NC string \NC the annotation data\NC\NR
9940 \stoptabulate
9943 \subsubsection{pdf_start_link nodes}
9945 Valid fields: \showfields{whatsit,pdf_start_link}\crlf
9946 Id: \showid{whatsit,pdf_start_link}
9948 \starttabulate[|lT|l|p|]
9949 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9950 \NC attr \NC \syntax{<node>} \NC \NC\NR
9951 \NC width \NC number \NC \NC \NR
9952 \NC height \NC number \NC \NC \NR
9953 \NC depth \NC number \NC \NC \NR
9954 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9955 \NC link_attr \NC table \NC the link attribute token list\NC\NR
9956 \NC action \NC \syntax{<node>} \NC the action to perform\NC\NR
9957 \stoptabulate
9959 \subsubsection{pdf_end_link nodes}
9961 Valid fields: \showfields{whatsit,pdf_end_link}\crlf
9962 Id: \showid{whatsit,pdf_end_link}
9964 \starttabulate[|lT|l|p|]
9965 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9966 \NC attr \NC \syntax{<node>} \NC \NC\NR
9967 \stoptabulate
9969 \subsubsection{pdf_dest nodes}
9971 Valid fields: \showfields{whatsit,pdf_dest}\crlf
9972 Id: \showid{whatsit,pdf_dest}
9974 \starttabulate[|lT|l|p|]
9975 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9976 \NC attr \NC \syntax{<node>} \NC \NC\NR
9977 \NC width \NC number \NC \NC \NR
9978 \NC height \NC number \NC \NC \NR
9979 \NC depth \NC number \NC \NC \NR
9980 \NC named_id \NC number \NC is the dest_id a string value?\NC\NR
9981 \NC dest_id \NC number or string \NC the destination id\NC\NR
9982 \NC dest_type \NC number\NC type of destination\NC\NR
9983 \NC xyz_zoom \NC number\NC \NC\NR
9984 \NC objnum \NC number \NC the \PDF\ object number\NC\NR
9985 \stoptabulate
9987 \subsubsection{pdf_thread nodes}
9989 Valid fields: \showfields{whatsit,pdf_thread}\crlf
9990 Id: \showid{whatsit,pdf_thread}
9992 \starttabulate[|lT|l|p|]
9993 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9994 \NC attr \NC \syntax{<node>} \NC \NC\NR
9995 \NC width \NC number \NC \NC \NR
9996 \NC height \NC number \NC \NC \NR
9997 \NC depth \NC number \NC \NC \NR
9998 \NC named_id \NC number \NC is the tread_id a string value?\NC\NR
9999 \NC tread_id \NC number or string \NC the thread id\NC\NR
10000 \NC thread_attr\NC number \NC extra thread information\NC\NR
10001 \stoptabulate
10003 \subsubsection{pdf_start_thread nodes}
10005 Valid fields: \showfields{whatsit,pdf_start_thread}\crlf
10006 Id: \showid{whatsit,pdf_start_thread}
10008 \starttabulate[|lT|l|p|]
10009 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10010 \NC attr \NC \syntax{<node>} \NC \NC\NR
10011 \NC width \NC number \NC \NC \NR
10012 \NC height \NC number \NC \NC \NR
10013 \NC depth \NC number \NC \NC \NR
10014 \NC named_id \NC number \NC is the tread_id a string value?\NC\NR
10015 \NC tread_id \NC number or string \NC the thread id\NC\NR
10016 \NC thread_attr\NC number \NC extra thread information\NC\NR
10017 \stoptabulate
10019 \subsubsection{pdf_end_thread nodes}
10021 Valid fields: \showfields{whatsit,pdf_end_thread}\crlf
10022 Id: \showid{whatsit,pdf_end_thread}
10024 \starttabulate[|lT|l|p|]
10025 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10026 \NC attr \NC \syntax{<node>} \NC \NC\NR
10027 \stoptabulate
10029 \subsubsection{pdf_save_pos nodes}
10031 Valid fields: \showfields{whatsit,pdf_save_pos}\crlf
10032 Id: \showid{whatsit,pdf_save_pos}
10034 \starttabulate[|lT|l|p|]
10035 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10036 \NC attr \NC \syntax{<node>} \NC \NC\NR
10037 \stoptabulate
10039 \subsubsection{late_lua nodes}
10041 Valid fields: \showfields{whatsit,late_lua}\crlf
10042 Id: \showid{whatsit,late_lua}
10044 \starttabulate[|lT|l|p|]
10045 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10046 \NC attr \NC \syntax{<node>} \NC \NC\NR
10047 \NC data \NC string \NC data to execute\NC\NR
10048 \NC string \NC string \NC data to execute (0.63)\NC\NR
10049 \NC name \NC string \NC the name to use for lua error reporting\NC\NR
10050 \stoptabulate
10052 The difference between \type{data} and \type{string} is that on
10053 assignment, the \type{data} field is converted to a token list, cf. use as
10054 \tex{latelua}. The \type{string} version is treated as a literal string.
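
The following sketch uses the \type{string} field, so the code is stored as a
literal string and not converted to a token list. The message and the value of
\type{name} are made up for this illustration.

\starttyping
local late = node.new(node.id("whatsit"), node.subtype("late_lua"))

-- stored as a literal string, not as a token list
late.string = "texio.write_nl('late_lua: this node is being shipped out')"

-- label used when an error in the code above is reported
late.name = "shipout-note"

-- append the node to the current list
node.write(late)
\stoptyping
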
10056 \subsubsection{pdf_colorstack nodes}
10058 Valid fields: \showfields{whatsit,pdf_colorstack}\crlf
10059 Id: \showid{whatsit,pdf_colorstack}
10061 \starttabulate[|lT|l|p|]
10062 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10063 \NC attr \NC \syntax{<node>} \NC \NC\NR
10064 \NC stack \NC number \NC colorstack id number\NC\NR
10065 \NC cmd \NC number \NC command to execute\NC\NR
10066 \NC data \NC string \NC data\NC\NR
10067 \stoptabulate
10069 \subsubsection{pdf_setmatrix nodes}
10071 Valid fields: \showfields{whatsit,pdf_setmatrix}\crlf
10072 Id: \showid{whatsit,pdf_setmatrix}
10074 \starttabulate[|lT|l|p|]
10075 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10076 \NC attr \NC \syntax{<node>} \NC \NC\NR
10077 \NC data \NC string \NC data\NC\NR
10078 \stoptabulate
10080 \subsubsection{pdf_save nodes}
10082 Valid fields: \showfields{whatsit,pdf_save}\crlf
10083 Id: \showid{whatsit,pdf_save}
10085 \starttabulate[|lT|l|p|]
10086 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10087 \NC attr \NC \syntax{<node>} \NC \NC\NR
10088 \stoptabulate
10090 \subsubsection{pdf_restore nodes}
10092 Valid fields: \showfields{whatsit,pdf_restore}\crlf
10093 Id: \showid{whatsit,pdf_restore}
10095 \starttabulate[|lT|l|p|]
10096 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10097 \NC attr \NC \syntax{<node>} \NC \NC\NR
10098 \stoptabulate
10100 \subsubsection{user_defined nodes}
10102 User|-|defined whatsit nodes can only be created and handled from \LUA\
10103 code. In effect, they are an extension to the extension
10104 mechanism. The \LUATEX\ engine will simply step over such whatsits
10105 without ever looking at the contents.
10107 Valid fields: \showfields{whatsit,user_defined}\crlf
10108 Id: \showid{whatsit,user_defined}
10110 \starttabulate[|lT|l|p|]
10111 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
10112 \NC attr \NC \syntax{<node>} \NC \NC\NR
10113 \NC user_id \NC number \NC id number\NC\NR
10114 \NC type \NC number \NC type of the value\NC\NR
10115 \NC value \NC number \NC \NC\NR
10116 \NC \NC string \NC \NC\NR
10117 \NC \NC \syntax{<node>} \NC \NC\NR
10118 \NC \NC table \NC \NC\NR
10119 \stoptabulate
10121 The \type{type} can have one of five distinct values:
10123 \starttabulate[|lT|p|]
10124 \NC \ssbf value \NC \bf explanation \NC\NR
10125 \NC 97 \NC the value is an attribute node list \NC\NR
10126 \NC 100 \NC the value is a number \NC\NR
10127 \NC 110 \NC the value is a node list \NC\NR
10128 \NC 115 \NC the value is a string\NC\NR
10129 \NC 116 \NC the value is a token list in \LUA\ table form\NC\NR
10130 \stoptabulate
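
A user|-|defined whatsit that carries a string could be created and found again
as sketched here. The id 4242 is an arbitrary number chosen for this example and
\type{find_markers} is a hypothetical helper. The \type{type} is set before the
\type{value} so that the value is interpreted as a string.

\starttyping
local whatsit      = node.id("whatsit")
local user_defined = node.subtype("user_defined")

local marker = node.new(whatsit, user_defined)
marker.user_id = 4242         -- arbitrary id, private to this example
marker.type    = 115          -- 115: the value is a string
marker.value   = "checkpoint"
node.write(marker)

-- later, for instance in a callback: collect the values of our own markers
local function find_markers(head)
    local found = { }
    for n in node.traverse_id(whatsit, head) do
        if n.subtype == user_defined and n.user_id == 4242 then
            found[#found + 1] = n.value
        end
    end
    return found
end
\stoptyping
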
10132 \section{Two access models}
After doing lots of tests with \LUATEX\ and \LUAJITTEX, with and without just in
time compilation enabled, and with and without using ffi, we came to the
conclusion that userdata prevents a speedup. We also found that the checking of
metatables as well as assignment comes with overhead that can't be neglected.
This is normally not really a problem, but when processing fonts for more complex
scripts the overhead can become significant.
10141 Because the userdata approach has some benefits, this remains the recommended way
10142 to access nodes. We did several experiments with faster access using this model,
10143 but eventually settled for the \quote {direct} approach. For code that is proven
10144 to be okay, one can use this access model that operates on nodes more directly.
Deep down in \TEX\ a node has a number which is an entry in a memory table. In
fact, this model, where \TEX\ manages memory, is really fast and one of the reasons
why plugging in callbacks that operate on nodes is quite fast. No matter what
future memory model \LUATEX\ has, an internal reference will always be a simple data
type (like a number or light userdata in \LUA\ speak). So, if you use the direct
model, even if you know that you currently deal with numbers, you should not depend
on that property but treat it as an abstraction just like traditional nodes. Using
a simple basic datatype has the penalty that less checking can
be done, but less checking is also the reason why it's somewhat faster. An
important aspect is that you cannot mix both methods, but you can cast between
the two models.
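
Casting is done with \type{node.direct.todirect} and \type{node.direct.tonode}.
The following sketch converts once on entry, stays in the direct model while
walking the list, and converts back before returning a node to the caller. The
function name \type{first_glyph_char} is made up for this illustration.

\starttyping
local direct   = node.direct
local todirect = direct.todirect
local tonode   = direct.tonode
local getnext  = direct.getnext
local getid    = direct.getid
local getchar  = direct.getchar

local glyph_id = node.id("glyph")

-- head is a regular (userdata) node; the result is a character code
-- and a regular node again, or nothing when there is no glyph
local function first_glyph_char(head)
    local current = todirect(head)
    while current do
        if getid(current) == glyph_id then
            return getchar(current), tonode(current)
        end
        current = getnext(current)
    end
end
\stoptyping
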
10158 So our advice is: use the indexed approach when possible and investigate the
10159 direct one when speed might be an issue. For that reason we also provide the
10160 \type {get*} and \type {set*} functions in the top level node namespace. There is
10161 a limited set of getters. When implementing this direct approach the regular
10162 index by key variant was also optimized, so direct access only makes sense when
10163 we're accessing nodes millions of times (which happens in some font processing
10164 for instance).
We're talking mostly of getters because setters are less important. Documents
do not have that many content related nodes, and setting many thousands of properties
is hardly a burden, contrary to millions of consultations.
10170 Normally you will access nodes like this:
\starttyping
local next = current.next
if next then
    -- do something
end
\stoptyping
10179 Here \type {next} is not a real field, but a virtual one. Accessing it results in
10180 a metatable method being called. In practice it boils down to looking up the
node type and based on the node type checking for the field name. In the worst case
you have a node type that sits at the end of the lookup list and a field that is
10183 last in the lookup chain. However, in successive versions of \LUATEX\ these lookups
10184 have been optimized and the most frequently accessed nodes and fields have a higher
10185 priority.
Because in practice the \type {next} accessor results in a function call, there
is some overhead involved. The following code does the same and performs a tiny bit
faster (but not by much, because it is still a function call, albeit one that
knows what to look up).
\starttyping
local next = node.next(current)
if next then
    -- do something
end
\stoptyping
10199 There are several such function based accessors now:
10201 \starttabulate[|T|p|]
10202 \NC getnext \NC parsing nodelist always involves this one \NC \NR
10203 \NC getprev \NC used less but is logical companion to getnext \NC \NR
10204 \NC getid \NC consulted a lot \NC \NR
10205 \NC getsubtype \NC consulted less but also a topper \NC \NR
10206 \NC getfont \NC used a lot in otf handling (glyph nodes are consulted a lot) \NC \NR
10207 \NC getchar \NC idem and also in other places \NC \NR
10208 \NC getlist \NC we often parse nested lists so this is a convenient one too
10209 (only works for hlist and vlist!) \NC \NR
10210 \NC getleader \NC comparable to list, seldom used in \TEX\ (but needs frequent consulting
10211 like lists; leaders could have been made a dedicated node type) \NC \NR
10212 \NC getfield \NC generic getter, sufficient for the rest (other field names are
10213 often shared so a specific getter makes no sense then) \NC \NR
10214 \stoptabulate
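
A typical use of these accessors is a loop over a list that also descends into
nested boxes. The sketch below counts glyphs per font with the getters from the
\type{node} namespace. The function name \type{count_glyphs} is hypothetical.

\starttyping
local getnext = node.getnext
local getid   = node.getid
local getfont = node.getfont
local getlist = node.getlist

local glyph_id = node.id("glyph")
local hlist_id = node.id("hlist")
local vlist_id = node.id("vlist")

-- return a table that maps font ids to the number of glyphs using them
local function count_glyphs(head, counts)
    counts = counts or { }
    local current = head
    while current do
        local id = getid(current)
        if id == glyph_id then
            local font = getfont(current)
            counts[font] = (counts[font] or 0) + 1
        elseif id == hlist_id or id == vlist_id then
            -- getlist only works for hlist and vlist nodes
            local list = getlist(current)
            if list then
                count_glyphs(list, counts)
            end
        end
        current = getnext(current)
    end
    return counts
end
\stoptyping

The same loop can be written with the functions from \type{node.direct},
provided that the head of the list is cast first.
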
It doesn't make sense to add more. Profiling demonstrated that these fields are
accessed far more often than other fields. Even in complex documents, many
node and field types are never seen, or are seen only a few times. Most functions in the
\type {node} namespace have a companion in \type {node.direct}, but of course not the
ones that don't deal with nodes themselves. The following table summarizes this:
10222 \start \def\yes{$+$} \def\nop{$-$}
10224 \starttabulate[|T|c|c|]
10226 \NC \bf function \NC \bf node \NC \bf direct \NC \NR
10228 \NC copy \NC \yes \NC \yes \NC \NR
10229 \NC copy_list \NC \yes \NC \yes \NC \NR
10230 \NC count \NC \yes \NC \yes \NC \NR
10231 \NC current_attr \NC \yes \NC \yes \NC \NR
10232 \NC dimensions \NC \yes \NC \yes \NC \NR
10233 \NC do_ligature_n \NC \yes \NC \yes \NC \NR
10234 \NC end_of_math \NC \yes \NC \yes \NC \NR
10235 \NC family_font \NC \yes \NC \nop \NC \NR
10236 \NC fields \NC \yes \NC \nop \NC \NR
10237 \NC first_character \NC \yes \NC \nop \NC \NR
10238 \NC first_glyph \NC \yes \NC \yes \NC \NR
10239 \NC flush_list \NC \yes \NC \yes \NC \NR
10240 \NC flush_node \NC \yes \NC \yes \NC \NR
10241 \NC free \NC \yes \NC \yes \NC \NR
10242 \NC getbox \NC \nop \NC \yes \NC \NR
10243 \NC getchar \NC \yes \NC \yes \NC \NR
10244 \NC getfield \NC \yes \NC \yes \NC \NR
10245 \NC getfont \NC \yes \NC \yes \NC \NR
10246 \NC getid \NC \yes \NC \yes \NC \NR
10247 \NC getnext \NC \yes \NC \yes \NC \NR
10248 \NC getprev \NC \yes \NC \yes \NC \NR
10249 \NC getlist \NC \yes \NC \yes \NC \NR
10250 \NC getleader \NC \yes \NC \yes \NC \NR
10251 \NC getsubtype \NC \yes \NC \yes \NC \NR
10252 \NC has_glyph \NC \yes \NC \yes \NC \NR
10253 \NC has_attribute \NC \yes \NC \yes \NC \NR
10254 \NC has_field \NC \yes \NC \yes \NC \NR
10255 \NC hpack \NC \yes \NC \yes \NC \NR
10256 \NC id \NC \yes \NC \nop \NC \NR
10257 \NC insert_after \NC \yes \NC \yes \NC \NR
10258 \NC insert_before \NC \yes \NC \yes \NC \NR
10259 \NC is_direct \NC \nop \NC \yes \NC \NR
10260 \NC is_node \NC \yes \NC \yes \NC \NR
10261 \NC kerning \NC \yes \NC \nop \NC \NR
10262 \NC last_node \NC \yes \NC \yes \NC \NR
10263 \NC length \NC \yes \NC \yes \NC \NR
10264 \NC ligaturing \NC \yes \NC \nop \NC \NR
10265 \NC mlist_to_hlist \NC \yes \NC \nop \NC \NR
10266 \NC new \NC \yes \NC \yes \NC \NR
10267 \NC next \NC \yes \NC \nop \NC \NR
10268 \NC prev \NC \yes \NC \nop \NC \NR
10269 \NC tostring \NC \yes \NC \yes \NC \NR
10270 \NC protect_glyphs \NC \yes \NC \yes \NC \NR
10271 \NC protrusion_skippable \NC \yes \NC \yes \NC \NR
10272 \NC remove \NC \yes \NC \yes \NC \NR
10273 \NC set_attribute \NC \yes \NC \yes \NC \NR
10274 \NC setbox \NC \yes \NC \yes \NC \NR
10275 \NC setfield \NC \yes \NC \yes \NC \NR
10276 \NC slide \NC \yes \NC \yes \NC \NR
10277 \NC subtype \NC \yes \NC \nop \NC \NR
10278 \NC tail \NC \yes \NC \yes \NC \NR
10279 \NC todirect \NC \yes \NC \yes \NC \NR
10280 \NC tonode \NC \yes \NC \yes \NC \NR
10281 \NC traverse \NC \yes \NC \yes \NC \NR
10282 \NC traverse_id \NC \yes \NC \yes \NC \NR
10283 \NC type \NC \yes \NC \nop \NC \NR
10284 \NC types \NC \yes \NC \nop \NC \NR
10285 \NC unprotect_glyphs \NC \yes \NC \yes \NC \NR
10286 \NC unset_attribute \NC \yes \NC \yes \NC \NR
10287 \NC usedlist \NC \yes \NC \yes \NC \NR
10288 \NC vpack \NC \yes \NC \yes \NC \NR
10289 \NC whatsits \NC \yes \NC \nop \NC \NR
10290 \NC write \NC \yes \NC \yes \NC \NR
10291 \stoptabulate
10293 \stop

The \type {node.next} and \type {node.prev} functions will stay, but for
consistency there are variants called \type {getnext} and \type {getprev}.
We had to use the \type{get} prefix because \type {node.id} and \type {node.subtype}
are already taken for providing meta information about nodes.
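
As a minimal sketch of the \type {get} variants, the following walks a node
list with \type {getnext}. It assumes a plain \LUATEX\ run, that box~0 holds a
small horizontal list, and that \type {getnext} is available in
\type {node.direct}; the message text is only an illustration.

\starttyping
\setbox0\hbox{abc}
\directlua{
  local direct = node.direct
  local n      = direct.todirect(tex.box[0].list)
  local count  = 0
  while n do
    count = count + 1
    n = direct.getnext(n)
  end
  texio.write_nl("nodes in box 0: " .. count)
}
\stoptyping

No \LUA\ comments are used inside \tex {directlua} here, because the line ends
are turned into spaces before \LUA\ sees the code.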

\chapter{Modifications}

Besides the expected changes caused by new functionality, there are a
number of not|-|so|-|expected changes. These are sometimes a side|-|effect
of a new (conflicting) feature, or, more often than not, a change
necessary to clean up the internal interfaces.

\section{Changes from \TEX\ 3.1415926}

\startitemize

\item The current code base is written in C, not Pascal web (as of \LUATEX~0.42.0).

\item See~\in{chapter}[languages] for many small changes related to paragraph
building, language handling, and hyphenation. The most important change:
adding a brace group in the middle of a word (like in \type{of{}fice})
does not prevent ligature creation.

\item There is no pool file; all strings are embedded during compilation.

\item \type {plus 1 fillll} does not generate an error. The extra \quote{l} is
simply typeset.

\item The upper limit to \tex{endlinechar} and \tex{newlinechar} is 127.

\stopitemize
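
The \type {fillll} item above can be tried with a sketch like the following
(plain \TEX\ syntax; the box width is an arbitrary choice). Under the behaviour
described above, the first three \quote {l}'s belong to the glue specification
and the fourth one ends up as a typeset character, where traditional \TEX\
complains about an illegal unit of measure.

\starttyping
\hbox to 4cm{left\hskip 0pt plus 1 fillll right}
\stoptyping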

\section{Changes from \ETEX\ 2.2}

\startitemize

\item The \ETEX\ functionality is always present and enabled
(but see below about \TEXXET), so the prepended asterisk or
\type{-etex} switch for \INITEX\ is not needed.

\item \TEXXET\ is not present, so the primitives

\starttyping
\TeXXeTstate
\beginR
\beginL
\endR
\endL
\stoptyping

are missing.

\item Some of the tracing information that is output by \ETEX's \tex{tracingassigns} and
\tex{tracingrestores} is not there.

\item Register management in \LUATEX\ uses the \ALEPH\ model, so the maximum
register number is 65535 and the implementation uses a flat array instead of the
mixed flat|\&|sparse model from \ETEX.

\item \type{savinghyphcodes} is a no-op.
See~\in{chapter}[languages] for details.

\item When kpathsea is used to find files, \LUATEX\ uses the
\type{ofm} file format to search for font metrics. In turn, this means
that \LUATEX\ looks at the \type{OFMFONTS} configuration variable
(like \OMEGA\ and \ALEPH) instead of \type{TFMFONTS} (like \TEX\ and
\PDFTEX). Likewise for virtual fonts (\LUATEX\ uses the variable
\type{OVFFONTS} instead of \type{VFFONTS}).

\stopitemize
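
The register limit mentioned above can be illustrated as follows. This is only
a sketch in plain \TEX\ syntax, and the value 42 is arbitrary; a register number
of 65536 or higher is expected to be rejected.

\starttyping
\count65535=42            % highest register number in the flat array
\message{\the\count65535}
\stoptyping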

\section{Changes from \PDFTEX\ 1.40}

\startitemize

\item The (experimental) support for snap nodes has been removed, because
it is much more natural to build this functionality on top of node
processing and attributes. The associated primitives that are now gone
are: \tex{pdfsnaprefpoint}, \tex{pdfsnapy}, and \tex{pdfsnapycomp}.

\item The (experimental) support for specialized spacing around nodes
has also been removed. The associated primitives that are now gone are:
\tex{pdfadjustinterwordglue}, \tex{pdfprependkern}, and \tex{pdfappendkern},
as well as the five supporting primitives \tex{knbscode}, \tex{stbscode},
\tex{shbscode}, \tex{knbccode}, and \tex{knaccode}.

\item A number of \quote{utility functions} have been removed:

\startcolumns[n=3]
\starttyping
\pdfelapsedtime
\pdfescapehex
\pdfescapename
\pdfescapestring
\pdffiledump
\pdffilemoddate
\pdffilesize
\pdflastmatch
\pdfmatch
\pdfmdfivesum
\pdfresettimer
\pdfshellescape
\pdfstrcmp
\pdfunescapehex
\stoptyping
\stopcolumns

\item The four primitives that were already marked obsolete in \PDFTEX~1.40
have been removed since \LUATEX~0.42:

\startcolumns[n=2]
\starttyping
\pdfoptionalwaysusepdfpagebox
\pdfoptionpdfinclusionerrorlevel
\pdfforcepagebox
\pdfmovechars
\stoptyping
\stopcolumns

\item A few other experimental primitives are also provided without the
extra \luatex {pdf} prefix, so they can also be called like this:

\startcolumns[n=3]
\starttyping
\primitive
\ifprimitive
\ifabsnum
\ifabsdim
\stoptyping
\stopcolumns

\item The \tex{pdftexversion} is set to 200.

\item The PNG transparency fix from 1.40.6 is not applied
(high-level support is pending).

\item LFS (\PDF\ files larger than 2GiB) support is not working yet.

\item \LUATEX~0.45.0 introduces two extra token lists, \tex{pdfxformresources}
and \tex{pdfxformattr}, as an alternative to \tex{pdfxform} keywords.

\item As of \LUATEX~0.50.0 it is no longer possible for fonts from embedded pdf files
to be replaced by / merged with the document fonts of the enveloping
pdf document. This regression may be temporary, depending on how the
rewritten font backend will look after beta 0.60.

\stopitemize
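
A removed utility primitive such as \tex {pdfstrcmp} can be approximated in
\LUA. The sketch below is an illustration only: the macro name \type {\strcmp}
is hypothetical, and the code assumes that \tex {directlua} and
\tex {luaescapestring} are available under these names in the format. Note that
\LUA\ compares strings according to the current locale, so the ordering may
differ from the byte|-|wise comparison of the original primitive.

\starttyping
\def\strcmp#1#2{\directlua{
  local a = "\luaescapestring{#1}"
  local b = "\luaescapestring{#2}"
  tex.sprint(a == b and "0" or (a < b and "-1" or "1"))
}}

\ifnum\strcmp{abc}{abd}<0 \message{abc sorts first}\fi
\stoptyping

Because \tex {directlua} is expandable, the hypothetical macro can be used
inside \tex {ifnum} tests in the same way as the primitive it stands in for.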

\section{Changes from \ALEPH\ RC4}

\startitemize

\item Starting with \LUATEX\ 0.75.0, the extended 16-bit math primitives
(\tex{omathcode} etc.) have been removed.

\item Starting with \LUATEX\ 0.63.0, OCP processing is no longer
supported at all. As a consequence, the following primitives have
been removed:

\startcolumns[n=2]
\starttyping
\ocp
\externalocp
\ocplist
\pushocplist
\popocplist
\clearocplists
\addbeforeocplist
\addafterocplist
\removebeforeocplist
\removeafterocplist
\ocptracelevel
\stoptyping
\stopcolumns

\item \LUATEX\ only understands 4~of the 16~direction
specifiers of \ALEPH: \type{TLT} (latin), \type{TRT} (arabic),
\type{RTT} (cjk), \type{LTL} (mongolian). All other direction
specifiers generate an error (\LUATEX\ 0.45).

\item The input translations from \ALEPH\ are not implemented; the
related primitives are not available:

\startcolumns[n=2]
\starttyping
\DefaultInputMode
\noDefaultInputMode
\noInputMode
\InputMode
\DefaultOutputMode
\noDefaultOutputMode
\noOutputMode
\OutputMode
\DefaultInputTranslation
\noDefaultInputTranslation
\noInputTranslation
\InputTranslation
\DefaultOutputTranslation
\noDefaultOutputTranslation
\noOutputTranslation
\OutputTranslation
\stoptyping
\stopcolumns

\item The \tex{hoffset} bug that occurs with \tex{pagedir TRT} is fixed,
removing the need for an explicit fix to \tex{hoffset}.

\item A bug causing \tex{fam} to fail for family numbers above
15 is fixed.

\item A fair number of other minor bugs are fixed as well, most of these
related to \tex{tracingcommands} output.

\item The internal function \type{scan_dir()} has been renamed to
\type{scan_direction()} to prevent a naming clash, and it now allows
an optional space after the direction is completely parsed.

\item The \type{^^} notation can come in five and six item repetitions also, to
insert characters that do not fit in the BMP.

\item Glues {\it immediately after} direction change commands are not
legal breakpoints.

\stopitemize
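
Two of the items above can be sketched in plain \TEX\ syntax. The first two
lines use supported direction specifiers. The last two lines assume that the
long \type {^^} form consists of six carets followed by six lowercase
hexadecimal digits; if that assumption holds, the message reports the code
point \type {"1D400}, which is 119808 in decimal. The choice of character and
of \tex {count255} is arbitrary.

\starttyping
\pagedir TLT \bodydir TLT   % page and body: left to right
\pardir  TRT \textdir TRT   % paragraphs and text: right to left

\count255=`^^^^^^01d400     % a character outside the BMP
\message{\the\count255}
\stoptyping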

\section{Changes from standard \WEBC}

\startitemize

\item There is no mltex.

\item There is no enctex.

\item The following command|-|line switches are silently ignored, even
in non|-|\LUA\ mode:

\starttyping
-8bit
-translate-file=TCXNAME
-mltex
-enc
-etex
\stoptyping

\item \tex{openout} whatsits are not written to the log file.

\item Some of the so|-|called web2c extensions are hard to set up
in non|-|\KPSE\ mode because \type{texmf.cnf} is not read: \type{shell-escape}
is off (but that is not a problem because of \LUA's
\lua{os.execute}), and the paranoia checks on \type{openin} and
\type{openout} do not happen (however, it is easy for a \LUA\ script
to do this itself by overloading \lua{io.open}).

\item The \quote{E} option does not do anything useful.

\stopitemize
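
The overloading of \lua {io.open} mentioned above could look like the sketch
below. The policy shown here (refusing to open files with an absolute,
Unix|-|style path for writing) is a hypothetical example and not the check that
web2c performs; other ways of opening files, such as \lua {io.output}, are not
covered by it.

\starttyping
\directlua{
  local original_open = io.open
  io.open = function(name, mode)
    local writing = mode and string.find(mode, "[wa+]")
    if writing and string.sub(name, 1, 1) == "/" then
      return nil, "refused to open '" .. name .. "' for writing"
    end
    return original_open(name, mode)
  end
}
\stoptyping

Returning \type {nil} plus a message follows the convention of the original
function, so existing callers that test the result keep working.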

\chapter{Implementation notes}

\section{Primitives overlap}

The primitives

\starttabulate[|l|l|]
\NC \tex{pdfpagewidth} \NC \tex{pagewidth} \NC \NR
\NC \tex{pdfpageheight}\NC \tex{pageheight} \NC \NR
\NC \tex{fontcharwd} \NC \tex{charwd} \NC \NR
\NC \tex{fontcharht} \NC \tex{charht} \NC \NR
\NC \tex{fontchardp} \NC \tex{chardp} \NC \NR
\NC \tex{fontcharic} \NC \tex{charit} \NC \NR
\stoptabulate

are aliases of each other (each row lists one pair).
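
As a small sketch of what the aliasing means in practice (plain \TEX\ syntax,
with A4 dimensions chosen only as an example): a value assigned through one
name of a pair should be visible through the other name.

\starttyping
\pdfpagewidth =210mm
\pdfpageheight=297mm
\message{\the\pagewidth\space by \the\pageheight}
\stoptyping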

\section{Memory allocation}

The single internal memory heap that traditional \TEX\ used for tokens
and nodes is split into two separate arrays. Each of these will grow
dynamically when needed.

The \type{texmf.cnf} settings related to main memory are no longer
used (these are: \type{main_memory}, \type{mem_bot},
\type{extra_mem_top} and \type{extra_mem_bot}). \quote{Out of main
memory} errors can still occur, but the limiting factor is now the
amount of RAM in your system, not a predefined limit.

Also, the memory (de)allocation routines for nodes are completely
rewritten. The relevant code now lives in the C file \type{texnode.c},
and basically uses a dozen or so \quote{avail} lists instead of a
doubly|-|linked model. An extra function layer is added so that the
code can ask for nodes by type instead of directly requisitioning
a certain amount of memory words.

Because of the split into two arrays and the resulting differences in the data
structures, some of the macros have been duplicated. For instance, there are now
\type{vlink} and \type{vinfo} as well as \type{token_link} and \type{token_info}. All
access to the variable memory array is now hidden behind a macro called \type{vmem}.

The implementation of the growth of the two arrays (via reallocation)
introduces a potential pitfall: the memory arrays should never be used
as the left|-|hand side of a statement that can modify the array in
question, because the reallocation may move the array after the address
of the left|-|hand side has already been computed.

The input line buffer and pool size are now also reallocated when
needed, and the \type{texmf.cnf} settings \type{buf_size} and
\type{pool_size} are silently ignored.

\section{Sparse arrays}

The \tex{mathcode}, \tex{delcode}, \tex{catcode},
\tex{sfcode}, \tex{lccode} and \tex{uccode} tables are now
sparse arrays that are implemented in~C. They are no longer part of
the \TEX\ \quote{equivalence table} and because each had 1.1 million
entries with a few memory words each, this makes a major difference
in memory usage.

The \tex{catcode}, \tex{sfcode}, \tex{lccode} and \tex{uccode} assignments
do not yet show up when using the \ETEX\ tracing routines \tex{tracingassigns}
and \tex{tracingrestores} (code simply not written yet).

A side|-|effect of the current implementation is that \tex{global} is
now more expensive in terms of processing than non|-|global assignments.

See \type{mathcodes.c} and \type{textcodes.c} if you are interested in
the details.

Also, the glyph ids within a font are now managed by means
of a sparse array and glyph ids can go up to index $2^{21}-1$.
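
As a small worked check of the numbers in this section, assuming that the
\quote {1.1 million} entries refer to the full Unicode range of 17~planes with
$2^{16}$ code points each:

\startformula
17 \times 2^{16} = 1\,114\,112 \approx 1.1 \times 10^{6}
\qquad
2^{21} - 1 = 2\,097\,151
\stopformula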

\section{Simple single-character csnames}

Single|-|character commands are no longer treated specially in the
internals; they are stored in the hash just like the multi|-|letter
csnames.

The code that displays control sequences explicitly checks if
the length is one when it has to decide whether or not to add a
trailing space.

Active characters are internally implemented as a special type
of multi|-|letter control sequence that uses a prefix that is
otherwise impossible to obtain.

\section{Compressed format}

The format is passed through zlib, allowing it to shrink to roughly
half of the size it would have had in uncompressed form. This takes a
few more CPU cycles but much less disk I/O, so it should still be
faster.

\section{Binary file reading}

All of the internal code is changed in such a way that if one of the
\type{read_xxx_file} callbacks is not set, then the file is read by
a C function using basically the same convention as the callback: a
single read into a buffer big enough to hold the entire file
contents. While this uses more memory than the previous code (that
mostly used \type{getc} calls), it can be quite a bit faster
(depending on your I/O subsystem).
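
A callback that follows the same convention could be sketched as below. The
choice of \type {read_map_file} is arbitrary, and the return values (a success
flag, the data as one string, and its size) reflect our reading of the
callback chapter rather than anything stated in this section.

\starttyping
\directlua{
  callback.register("read_map_file", function(name)
    local f = io.open(name, "rb")
    if not f then
      return false, nil, 0
    end
    local data = f:read("*a")
    f:close()
    return true, data, string.len(data)
  end)
}
\stoptyping

Like the built|-|in C function, this reads the whole file in one go instead of
fetching it character by character.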

\chapter{Known bugs and limitations, TODO}

There used to be a list of bugs and planned features below here, but that did not
work out too well. There are lists of open bugs and feature requests in the tracker at
\hyphenatedurl{http://tracker.luatex.org}.

\stoptext