1 % engine=luatex language=uk
2 % $Id$
4 % TODO: fix layout of function legend descriptions
5 % check numbers
6 % check \luatex command
8 %\nopdfcompression
9 %\loggingall
10 \environment luatexref-env
11 \logo[DFONT] {dfont}
12 \logo[CFF] {cff}
13 \logo[CMAP] {CMap}
14 \logo[PATGEN] {patgen}
15 \logo[MP] {MetaPost}
16 \logo[METAPOST]{MetaPost}
17 \logo[MPLIB] {MPlib}
18 \logo[COCO] {coco}
19 \logo[SUNOS] {SunOS}
20 \logo[BSD] {bsd}
21 \logo[SYSV] {sysv}
22 \logo[DPI] {dpi}
24 \setvariables
25 [document]
26 [beta=0.79.0]
28 \starttext
30 \dontcomplain \nonknuthmode
32 \setups[titlepage]
34 \title{Contents}
36 \placecontent[criterium=text,level=subsection]
38 \chapter{Introduction}
40 \startframedtext[framecolor=red,foregroundcolor=red,width=\hsize,style=\tfa]
42 This book will eventually become the reference manual of \LUATEX.
43 At the moment, it simply reports the behavior of the executable
44 matching the snapshot or beta release date in the title page.
46 \blank
48 Features may come and go. The current version of \LUATEX\ is not
49 meant for production and users cannot depend on stability, nor on
50 functionality staying the same.
52 \blank
54 Nothing is considered stable just yet. This manual therefore
55 simply reflects the current state of the executable. {\bs
56 Absolutely nothing\/} on the following pages is set in stone. When
57 the need arises, anything can (and will) be changed.
59 \blank
61 {\bf If you are not willing to deal with this situation, you should
62 wait for the stable version. Currently we expect the 1.0 release to
63 happen in spring 2014. Full stabilization will not happen soon; the
64 TODO list is still large.}
66 \stopframedtext
68 \blank[2*line]
70 \LUATEX\ consists of a number of interrelated but (still)
71 distinguishable parts:
73 \startitemize[packed]
74 \item \PDFTEX\ version 1.40.9, converted to C (with patches from later releases).
75 \item The direction model and some other bits from \ALEPH\ RC4 converted to C.
76 \item \LUA\ 5.2.1
77 \item dedicated \LUA\ libraries
78 \item various \TEX\ extensions
79 \item parts of \FONTFORGE\ 2008.11.17
80 \item the \METAPOST\ library
81 \item newly written compiled source code to glue it all together
82 \stopitemize
84 Neither \ALEPH's I/O translation processes, nor tcx files, nor
85 \ENCTEX\ can be used; these encoding|-|related functions are
86 superseded by a \LUA|-|based solution (reader callbacks). Also, some
87 experimental \PDFTEX\ features are removed. These can be implemented
88 in \LUA\ instead.
90 \chapter{Basic \TEX\ enhancements}
92 \section{Introduction}
94 From day one, \LUATEX\ has offered extra functionality when compared
95 to the superset of \PDFTEX\ and \ALEPH. That is not limited to
96 the possibility to execute \LUA\ code via \type{\directlua}:
97 \LUATEX\ also adds functionality via new \TEX-side primitives.
99 However, starting with beta \type{0.39.0}, most of that functionality
100 is hidden by default. When \LUATEX\ 0.40.0 starts up in
101 \quote{iniluatex} mode (\type{luatex -ini}), it defines only the
102 primitive commands known by \TEX82 and the one extra command
103 \type{\directlua}.
105 As is fitting, a \LUA\ function has to be called to add the extra
106 primitives to the user environment. The simplest method to get access
107 to all of the new primitive commands is by adding this line to the
108 format generation file:
110 \starttyping
111 \directlua { tex.enableprimitives('',tex.extraprimitives()) }
112 \stoptyping
114 But be aware that the curly braces may not have the proper \type{\catcode}
115 assigned to them at this early time (giving a 'Missing number' error),
116 so you may need to put these assignments
118 \starttyping
119 \catcode `\{=1
120 \catcode `\}=2
121 \stoptyping
123 before the above line.
124 More fine-grained control over primitives is possible; the details are in
125 \in{section}[luaprimitives]. For simplicity's sake, this manual assumes
126 that you have executed the \type{\directlua} command as given above.
128 The startup behavior documented above is considered stable in the sense
129 that there will not be backward-incompatible changes any more.
131 \section{Version information}
133 There are three new primitives to test the version of \LUATEX:
135 \starttabulate[|l|p|]
136 \NC \bf primitive \NC \bf explanation \NC\NR
137 \NC \tex{luatexversion} \NC a combination of major and minor number, as in \PDFTEX;
138 the current value is {\bf\the\luatexversion} \NC\NR
139 \NC \tex{luatexrevision} \NC the revision number, as in \PDFTEX;
140 the current value is {\bf\luatexrevision} \NC\NR
141 \NC \tex{luatexdatestamp} \NC (deprecated in 0.78.1, will be gone in 0.80.0)
142 a combination of the local date and hour when
143 the current executable was compiled,
144 the syntax is identical to \tex{luatexrevision};
145 the value for the executable that generated this
146 document is {\bf\luatexdatestamp}. \NC\NR
147 \stoptabulate
149 The official \LUATEX\ version is defined as follows:
151 \startitemize
152 \item The major version is the integer result of \tex{luatexversion} divided by 100.
153 The primitive is an \quote{internal variable}, so you may need to prefix its
154 use with \type{\the} depending on the context.
155 \item The minor version is the two-digit result of \tex{luatexversion} modulo 100.
156 \item The revision is given by \tex{luatexrevision}. This primitive expands to a
157 positive integer.
158 \item The full version number consists of the major version,
159 minor version and revision, separated by dots.
160 \stopitemize
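As an illustration, the full version string can be assembled with a little
register arithmetic. This is only a sketch: it assumes that the \ETEX\ primitive
\type{\numexpr} has been enabled, and it uses \type{\count255} and
\type{\count254} as arbitrary scratch registers.

\starttyping
\count255=\luatexversion
\divide\count255 by 100                               % major version
\count254=\numexpr\luatexversion-100*\count255\relax  % minor version
% pad the minor version to two digits before printing
\message{LuaTeX \the\count255.\ifnum\count254<10 0\fi\the\count254.\luatexrevision}
\stoptyping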
162 \section{\UNICODE\ text support}
164 Text input and output is now considered to be \UNICODE\ text, so
165 input characters can use the full range of \UNICODE\ ($2^{20}+2^{16}-1
166 = \hbox{0x10FFFF}$).
168 Later chapters will talk of characters and glyphs. Although these
169 are not interchangeable, they are closely related. During
170 typesetting, a character is always converted to a suitable graphic
171 representation of that character in a specific font. However,
172 while processing a list of to|-|be|-|typeset nodes, its contents
173 may still be seen as a character. Inside \LUATEX\ there is not yet
174 a clear separation between the two concepts. Until this is
175 implemented, please do not be too harsh on us if we make errors in
176 the usage of the terms.
178 A few primitives are affected by this, all in a similar fashion: each
179 of them has to accommodate a larger range of acceptable numbers.
180 For instance, \tex{char} now accepts values between~0 and
181 $1{,}114{,}111$. This should not be a problem for well|-|behaved input
182 files, but it could create incompatibilities for input that would have
183 generated an error when processed by older \TEX|-|based engines. The
184 affected commands with an altered initial (left of the equals sign) or
185 secondary (right of the equals sign) value are: \tex{char},
186 \tex{lccode}, \tex{uccode}, \tex{catcode}, \tex{sfcode}, \tex{efcode},
187 \tex{lpcode}, \tex{rpcode}, \tex{chardef}.
189 As far as the core engine is concerned, all input and output to
190 text files is \UTF-8 encoded. Input files can be pre|-|processed
191 using the \luatex{reader} callback. This will be explained in a
192 later chapter.
194 Output in byte|-|sized chunks can be achieved by using characters
195 just outside of the valid \UNICODE\ range, starting at the value
196 $1{,}114{,}112$ (0x110000). When the time comes to print a character
197 $c\geq 1{,}114{,}112$, \LUATEX\ will actually print the single byte
198 corresponding to $c$ minus 1{,}114{,}112.
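As a worked example (the byte value is chosen arbitrarily): to emit the single
byte \type{0xE9}, which is 233 in decimal, the character code to print would be

\startformula
1{,}114{,}112 + 233 = 1{,}114{,}345 = \hbox{0x1100E9}
\stopformula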
200 Output to the terminal uses \type{^^} notation for the lower
201 control range ($c<32$), with the exception of \type{^^I},
202 \type{^^J} and \type{^^M}. These are considered \quote{safe} and
203 therefore printed as-is.
205 Normalization of the \UNICODE\ input can be handled by a macro package
206 during callback processing (this will be explained in \in{section}[iocallback]).
208 \section{Extended tables}
210 All traditional \TEX\ and \ETEX\ registers can be 16-bit numbers as in
211 \ALEPH. The affected commands are:
213 \startcolumns[n=4]
214 \starttyping
215 \count
216 \dimen
217 \skip
218 \muskip
219 \marks
220 \toks
221 \countdef
222 \dimendef
223 \skipdef
224 \muskipdef
225 \toksdef
226 \box
227 \unhbox
228 \unvbox
229 \copy
230 \unhcopy
231 \unvcopy
235 \setbox
236 \vsplit
237 \stoptyping
238 \stopcolumns
240 The glyph properties (like \type {\efcode}) introduced in \PDFTEX\
241 that deal with font expansion (hz) and character protruding are
242 also 16-bit. Because font memory management has been rewritten,
243 these character properties are no longer shared among font
244 instances that originate from the same metric file.
246 The behavior documented in the above section is considered stable
247 in the sense that there will not be backward-incompatible changes any
248 more.
250 \section{Attribute registers}
252 Attributes are a completely new concept in \LUATEX. Syntactically,
253 they behave a lot like counters: attributes obey \TEX's nesting stack
254 and can be used after \tex{the} etc.\ just like the normal
255 \tex{count} registers.
257 \startsyntax
258 \attribute <16-bit number> <optional equals> <32-bit number>!crlf
259 \attributedef <csname> <optional equals> <16-bit number>
260 \stopsyntax
262 Conceptually, an attribute is either \quote{set} or
263 \quote{unset}. Unset attributes have a special negative value to
264 indicate that they are unset; that value is the lowest legal value:
265 \type{-"7FFFFFFF} in hexadecimal, a.k.a. $-2147483647$ in decimal.
266 It follows that the value \type{-"7FFFFFFF} cannot be used as
267 a legal attribute value, but you {\it can\/} assign \type{-"7FFFFFFF} to
268 \quote{unset} an attribute. All attributes start out in this
269 \quote{unset} state in \INITEX\ (prior to 0.37, there could not be
270 valid negative attribute values, and the \quote{unset} value was $-1$).
272 Attributes can be used as extra counter values, but their usefulness
273 comes mostly from the fact that the numbers and values of all \quote{set}
274 attributes are attached to all nodes created in their scope. These can
275 then be queried from any \LUA\ code that deals with node
276 processing. Further information about how to use attributes for node
277 list processing from \LUA\ is given in~\in{chapter}[nodes].
279 The behavior documented in the above subsection is considered stable
280 in the sense that there will not be backward-incompatible changes any
281 more.
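The following sketch shows the counter-like behavior described above. The name
\type{\myattr} and the attribute number 42 are hypothetical choices made for
this illustration only.

\starttyping
\attributedef\myattr=42   % hypothetical name; 42 is an arbitrary attribute number
\myattr=1                 % nodes created from here on carry 42=1
\setbox0=\hbox{A}
\message{\the\myattr}     % usable after \the, like a count register
\myattr=-"7FFFFFFF        % back to the unset state
\stoptyping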
284 \subsection{Box attributes}
286 Nodes typically receive the list of attributes that is in effect when
287 they are created. This moment can be quite asynchronous. For example: in
288 paragraph building, the individual line boxes are created after the
289 \tex{par} command has been processed, so they will receive the list of
290 attributes that is in effect then, not the attributes that were in
291 effect in, say, the first or third line of the paragraph.
293 Similar situations happen in \LUATEX\ regularly. A few of the more
294 obvious problematic cases are dealt with: the attributes for nodes
295 that are created during hyphenation, kerning and ligaturing borrow their
296 attributes from their surrounding glyphs, and it is possible to
297 influence box attributes directly.
299 When you assemble a box in a register, the attributes of the nodes
300 contained in the box are unchanged when such a box is placed,
301 unboxed, or copied. In this respect attributes act the same as
302 characters that have been converted to references to glyphs in
303 fonts. For instance, when you use attributes to implement color
304 support, each node carries information about its eventual color. In that
305 case, unless you implement mechanisms that deal with it, applying
306 a color to already boxed material will have no effect. Keep in
307 mind that this incompatibility is mostly due to the fact that separate
308 specials and literals are a more unnatural approach to colors than
309 attributes.
311 It is possible to fine-tune the list of attributes that are applied
312 to a \type{hbox}, \type{vbox} or \type{vtop} by the use of the
313 keyword \type{attr}. An example:
315 \starttyping
316 \attribute2=5
317 \setbox0=\hbox {Hello}
318 \setbox2=\hbox attr1=12 attr2=-"7FFFFFFF{Hello}
319 \stoptyping
321 This will set the attribute list of box~2 to $1=12$, and the
322 attributes of box~0 will be $2=5$. As you can see, assigning
323 the maximum negative value causes an attribute to be ignored.
325 The \type{attr} keyword(s) should come before a \type{to} or
326 \type{spread}, if that is also specified.
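A minimal sketch of that ordering, with arbitrary box numbers, attribute values
and dimensions:

\starttyping
\setbox4=\hbox attr1=12 to 5cm{Hello\hfil}
\setbox6=\hbox attr1=12 attr2=7 spread 1em{Hello\hfil}
\stoptyping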
328 \section{\LUA\ related primitives}
330 In order to merge \LUA\ code with \TEX\ input, a few new primitives are
331 needed.
334 \subsection{\tex{directlua}}
336 The primitive \tex{directlua} is used to execute \LUA\ code immediately.
337 The syntax is
339 \startsyntax
340 \directlua <general text>!crlf
341 \directlua name <general text> <general text>!crlf
342 \directlua <16-bit number> <general text>
343 \stopsyntax
345 The last \syntax{<general text>} is expanded fully, and then fed
346 into the \LUA\ interpreter. After reading and expansion have been applied to the
347 \syntax{<general text>}, the resulting token list is converted to a
348 string as if it were displayed using \type{\the\toks}. On the \LUA\
349 side, each \type{\directlua} block is treated as a separate chunk. In
350 such a chunk you can use the \type {local} directive to keep your variables
351 from interfering with those used by the macro package.
353 The conversion to and from a token list means that you normally cannot
354 use \LUA\ line comments (starting with \type{--}) within the
355 argument. As there typically will be only one \quote{line} the first
356 line comment will run on until the end of the input. You will either need to
357 use \TEX-style line comments (starting with \%), or change the \TEX\
358 category codes locally. Another possibility is to say:
360 \starttyping
361 \begingroup
362 \endlinechar=10
363 \directlua ...
364 \endgroup
365 \stoptyping
367 Then \LUA\ line comments can be used, since \TEX\ does not replace
368 line endings with spaces.
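Spelled out, this might look as follows. The sketch assumes that character~10
has category code~12, as it does in plain \TEX. The trailing percent signs only
keep the line ends that follow the braces and \type{\endgroup} out of the
typeset result.

\starttyping
\begingroup
\endlinechar=10
\directlua{%
  -- this Lua line comment stops at the end of the line
  tex.print("ok")
}%
\endgroup%
\stoptyping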
370 The \syntax{name <general text>} specifies the name of the \LUA\ chunk,
371 mainly shown in the stack backtrace of error messages created by \LUA\
372 code. The \syntax{<general text>} is expanded fully, thus macros can
373 be used to generate the chunk name, e.g.
375 \starttyping
376 \directlua name{\jobname:\the\inputlineno} ...
377 \stoptyping
379 to include the name of the input file as well as the input line into
380 the chunk name.
382 Likewise, the \syntax{<16-bit number>} designates a name of a \LUA\
383 chunk, but in this case the name will be taken from the
384 \type{lua.name} array (see the documentation of the \type{lua} table
385 further in this manual). This syntax is new in version 0.36.0.
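A small sketch of the numbered form, assuming that \type{lua.name} can be
filled by plain assignment. The index and the chunk name are arbitrary examples.

\starttyping
\directlua { lua.name[1] = "my setup code" }
\directlua 1 { local answer = 42 }
\stoptyping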
387 The chunk name should not start with a \type{@}, or it will be displayed
388 as a file name (this is a quirk in the current \LUA\ implementation).
390 The \tex{directlua} command is expandable. Since it passes {\LUA} code to the
391 {\LUA} interpreter its expansion from the {\TEX} viewpoint is usually empty.
392 However, there are some {\LUA} functions that produce material to be read
393 by {\TeX}, the so-called print functions. The simplest use of these is
394 \type{tex.print(<string> s)}. The characters of the string \type{s} will be placed
395 on the {\TeX} input buffer, that is, \quote{before \TeX's eyes} to be read by {\TeX}
396 immediately. For example:
398 \startbuffer
399 \count10=20
400 a\directlua{tex.print(tex.count[10]+5)}b
401 \stopbuffer
403 \typebuffer
405 expands to
407 \getbuffer
409 Here is another example:
411 \startbuffer
412 $\pi = \directlua{tex.print(math.pi)}$
413 \stopbuffer
415 \typebuffer
417 will result in
419 \getbuffer
421 Note that the expansion of \tex{directlua} is a sequence of characters, not
422 of tokens, contrary to all {\TeX} commands. So formally speaking its
423 expansion is null, but it places material on a pseudo-file to be
424 immediately read by {\TeX}, like \ETEX's \tex{scantokens}.
426 For a description of print functions look at \in{section~}[sec:luaprint].
428 Because the \syntax{<general text>} is a chunk, the normal \LUA\ error
429 handling is triggered if there is a problem in the included code. The
430 \LUA\ error messages should be clear enough, but the contextual
431 information is still pretty bad. Often, you will only see the line
432 number of the right brace at the end of the code.
434 While on the subject of errors: some of the things you can do inside
435 \LUA\ code can break up \LUATEX\ pretty badly. If you are not careful
436 while working with the node list interface, you may even end up with
437 assertion errors from within the \TEX\ portion of the executable.
439 The behavior documented in the above subsection is considered stable
440 in the sense that there will not be backward-incompatible changes any
441 more.
443 \subsection{\tex{luafunction}}
445 The \type {\directlua} command involves tokenization of its argument (after picking up
446 an optional name or number specification). The tokenlist is then converted into a string and
447 given to \LUA\ to turn into a function that is called. The overhead is rather small but when
448 you use this primitive hundreds or thousands of times, it can become noticeable. For this
449 reason there is a variant call available: \type {\luafunction}. This command is used as
450 follows:
452 \starttyping
453 \directlua {
454 local t = lua.get_functions_table()
455 t[1] = function() tex.print("!") end
456 t[2] = function() tex.print("?") end
457 }
459 \luafunction1
460 \luafunction2
461 \stoptyping
463 Of course the functions can also be defined in a separate file. There is no
464 limit on the number of functions apart from normal \LUA\ limitations. The functions
465 cannot be passed arguments from the \TEX\ side, because that would involve parsing and
466 thereby undo the gain. When called, a function does in fact get one argument, namely its index,
467 so in:
469 \starttyping
470 \directlua {
471 local t = lua.get_functions_table()
472 t[8] = function(slot) tex.print(slot) end
473 }
474 \stoptyping
476 the number \type {8} gets typeset.
480 \subsection{\tex{latelua}}
482 \tex{latelua} stores \LUA\ code in a whatsit that will be processed
483 at the time of shipping out. Its intended use is a cross between
484 \tex{pdfliteral} and \tex{write}.
485 Within the \LUA\ code you can print \PDF\
486 statements directly to the \PDF\ file via \type{pdf.print},
487 or you can write to other output streams via \type{texio.write}
488 or simply using \LUA's I/O routines.
490 \startsyntax
491 \latelua <general text>!crlf
492 \latelua name <general text> <general text>!crlf
493 \latelua <16-bit number> <general text>
494 \stopsyntax
496 Expansion of macros etcetera in the final \type{<general text>} is delayed
497 until just before the whatsit is executed (like in \tex{write}). With
498 regard to the \PDF\ output stream, \tex{latelua} behaves like \tex{pdfliteral page}.
500 The \syntax{name <general text>} and \syntax{<16-bit number>} behave
501 in the same way as they do for \type{\directlua}.
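As a hedged illustration of the \tex{write}-like use, the following line
records a message in the log file at shipout time. It assumes that
\type{\count0} holds the page number, as in plain \TEX.

\starttyping
\latelua{ texio.write_nl("log", "shipping out page " .. tex.count[0]) }
\stoptyping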
503 \subsection{\tex{luaescapestring}}
505 This primitive converts a \TEX\ token sequence so that it can be
506 safely used as the contents of a \LUA\ string: embedded backslashes,
507 double and single quotes, and newlines and carriage returns are
508 escaped. This is done by prepending an extra token consisting of a
509 backslash with category code~12, and for the line endings,
510 converting them to \type{n} and \type{r} respectively. The token
511 sequence is fully expanded.
513 \startsyntax
514 \luaescapestring <general text>
515 \stopsyntax
517 Most often, this command is not actually the best way to deal with the
518 differences between \TEX\ and \LUA. In very short bits of \LUA\
519 code it is often not needed, and for longer stretches of \LUA\ code it
520 is easier to keep the code in a separate file and load it using \LUA's
521 \type{dofile}:
523 \starttyping
524 \directlua { dofile('mysetups.lua')}
525 \stoptyping
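For the cases where \tex{luaescapestring} is the right tool, a minimal sketch
could look like this. The macro \type{\mytext} is a hypothetical example; its
double quotes would otherwise end the \LUA\ string prematurely.

\starttyping
\def\mytext{a "quoted" word}
\directlua{ texio.write_nl("\luaescapestring{\mytext}") }
\stoptyping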
528 \section{New \ETEX\ primitives}
530 \subsection{\tex{clearmarks}}
532 This primitive clears a mark class completely, resetting all three
533 connected mark texts to empty.
535 \startsyntax
536 \clearmarks <16-bit number>
537 \stopsyntax
539 \subsection{\tex{noligs} and \tex{nokerns}}
541 These primitives prohibit ligature and kerning insertion at the time
542 when the initial node list is built by \LUATEX's main control loop.
543 They are part of a temporary trick and will be removed in the near
544 future. For now, you need to enable these primitives when you want to
545 do node list processing of \quote{characters}, where \TEX's normal
546 processing would get in the way.
548 \startsyntax
549 \noligs <integer>!crlf
550 \nokerns <integer>
551 \stopsyntax
553 These primitives can now be implemented by overloading the ligature
554 building and kerning functions, i.e.\ by assigning dummy functions
555 to their associated callbacks.
557 \subsection{\tex{formatname}}
559 \tex{formatname}'s syntax is identical to \tex{jobname}.
561 In \INITEX, the expansion is empty. Otherwise, the expansion is the
562 value that \tex{jobname} had during the \INITEX\ run that dumped the
563 currently loaded format.
565 \subsection{\tex{scantextokens}}
567 The syntax of \tex{scantextokens} is identical to \tex{scantokens}.
568 This primitive is a slightly adapted version of \ETEX's \tex{scantokens}. The
569 differences are:
571 \startitemize
572 \item The last (and usually only) line does not have a
573 \tex{endlinechar} appended
574 \item \tex{scantextokens} never raises an EOF error,
575 and it does not execute \tex{everyeof} tokens.
576 \item The \quote{\unknown\ while end of file \unknown} error tests are not executed, allowing
577 the expansion to end on a different grouping level or while a
578 conditional is still incomplete.
579 \stopitemize
581 \subsection {Verbose versions of single-character alignment commands (0.45)}
583 \LUATEX\ defines two new primitives that have the same function as
584 \type{#} and \type{&} in alignments:
586 \starttabulate[|l|l|l|l|]
587 \NC \bf primitive \NC \bf explanation \NC\NR
588 \NC \tex{alignmark} \NC Duplicates the functionality of \char`\#~%
589 inside alignment preambles\NC\NR
590 \NC \tex{aligntab} \NC Duplicates the functionality of \char`\&~%
591 inside alignments (and preambles)\NC\NR
592 \stoptabulate
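A short sketch of an alignment written with the verbose primitives only; the
cell contents are arbitrary:

\starttyping
\halign{\alignmark\hfil \aligntab \hfil\alignmark \cr
  left \aligntab right \cr
  a \aligntab b \cr}
\stoptyping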
595 \subsection{Catcode tables}
597 Catcode tables are a new feature that allows you to switch to a
598 predefined catcode regime in a single statement. You can have a
599 practically unlimited number of different tables.
601 The subsystem is backward compatible: if you never use the following
602 commands, your document will not notice any difference in behavior
603 compared to traditional \TEX.
605 The contents of each catcode table are independent of any other
606 catcode table, and they are stored in and retrieved from the
607 format file.
609 \subsubsection{\tex{catcodetable}}
611 \startsyntax
612 \catcodetable <15-bit number>
613 \stopsyntax
615 The primitive \tex{catcodetable} switches to a different catcode table.
616 Such a table has to be previously created using one of the two
617 primitives below, or it has to be zero. Table zero is initialized by
618 \INITEX.
620 \subsubsection{\tex{initcatcodetable}}
622 \startsyntax
623 \initcatcodetable <15-bit number>
624 \stopsyntax
626 The primitive \tex{initcatcodetable} creates a new table with catcodes
627 identical to those defined by \INITEX:
629 \starttabulate[|l|l|l|l|l|]
630 \NC~0\NC \tt\letterbackslash \NC \NC \tt escape \NC\NR
631 \NC~5\NC \tt\letterhat\letterhat M \NC return \NC \tt car{\_}ret \NC (this name may change) \NC\NR
632 \NC~9\NC \tt\letterhat\letterhat @ \NC null \NC \tt ignore \NC\NR
633 \NC10\NC \tt <space> \NC space \NC \tt spacer \NC\NR
634 \NC11\NC {\tt a} -- {\tt z} \NC \NC \tt letter \NC\NR
635 \NC11\NC {\tt A} -- {\tt Z} \NC \NC \tt letter \NC\NR
636 \NC12\NC everything else \NC \NC \tt other \NC\NR
637 \NC14\NC \tt\letterpercent \NC \NC \tt comment \NC\NR
638 \NC15\NC \tt\letterhat\letterhat ? \NC delete \NC \tt invalid{\_}char \NC\NR
639 \stoptabulate
641 The new catcode table is allocated globally: it will not go away after
642 the current group has ended. If the supplied number is identical to
643 the currently active table, an error is raised.
645 \subsubsection{\tex{savecatcodetable}}
647 \startsyntax
648 \savecatcodetable <15-bit number>
649 \stopsyntax
651 \tex{savecatcodetable} copies the current set of catcodes to a
652 new table with the requested number. The definitions in this new table
653 are all treated as if they were made in the outermost level.
655 The new table is allocated globally: it will not go away after the
656 current group has ended. If the supplied number is the currently
657 active table, an error is raised.
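The three primitives might be combined as in the following sketch. The table
numbers 10 and 11 are arbitrary and are assumed to be neither in use nor
currently active.

\starttyping
\savecatcodetable 11      % copy the current catcodes to table 11
\initcatcodetable 10      % create table 10 with the IniTeX catcodes
\catcodetable 10          % switch to the IniTeX regime
% ... material that should be read with IniTeX catcodes ...
\catcodetable 11          % back to the saved regime
\stoptyping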
659 \subsection{\tex{suppressfontnotfounderror} (0.11)}
661 \startsyntax
662 \suppressfontnotfounderror = 1
663 \stopsyntax
665 If this new integer parameter is non|-|zero, then \LUATEX\ will not
666 complain about font metrics that are not found. Instead it will
667 silently skip the font assignment, making the requested csname for the
668 font \tex{ifx} equal to \tex{nullfont}, so that it can be tested
669 against that without bothering the user.
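Such a test could be sketched as follows. The font name is a deliberately
missing, hypothetical one, and the fallback \type{\tenrm} assumes plain \TEX.

\starttyping
\suppressfontnotfounderror=1
\font\myfont=no-such-font      % hypothetical, deliberately missing font
\ifx\myfont\nullfont
  \message{font not found, using a fallback}
  \let\myfont\tenrm            % \tenrm is the plain TeX roman font
\fi
\stoptyping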
671 \subsection{\tex{suppresslongerror} (0.36)}
673 \startsyntax
674 \suppresslongerror = 1
675 \stopsyntax
677 If this new integer parameter is non|-|zero, then \LUATEX\ will not
678 complain about \type{\par} commands encountered in contexts where
679 that is normally prohibited (most prominently in the arguments
680 of non-long macros).
682 \subsection{\tex{suppressifcsnameerror} (0.36)}
684 \startsyntax
685 \suppressifcsnameerror = 1
686 \stopsyntax
688 If this new integer parameter is non|-|zero, then \LUATEX\ will not
689 complain about non-expandable commands appearing in the middle of a
690 \type{\ifcsname} expansion. Instead, it will keep getting expanded
691 tokens from the input until it encounters an \type{\endcsname}
692 command. Use with care! This command is experimental: if the input
693 expansion is unbalanced with respect to \type{\csname} \ldots \type{\endcsname}
694 pairs, the \LUATEX\ process may hang indefinitely.
697 \subsection{\tex{suppressoutererror} (0.36)}
699 \startsyntax
700 \suppressoutererror = 1
701 \stopsyntax
703 If this new integer parameter is non|-|zero, then \LUATEX\ will not
704 complain about \type{\outer} commands encountered in contexts where
705 that is normally prohibited.
707 The addition of this command coincides with a change in the
708 \LUATEX\ engine: ever since the snapshot of 20060915, \type{\outer}
709 was simply ignored. That behavior has now been reverted to be
710 \TEX82-compatible by default.
713 \subsection{\tex{outputbox} (0.37)}
715 \startsyntax
716 \outputbox = 65535
717 \stopsyntax
719 This new integer parameter allows you to alter the number of the box
720 that will be used to store the page sent to the output routine. Its default
721 value is 255, and the acceptable range is from 0 to 65535.
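An output routine has to refer to the same box. In the sketch below the box
number 1000 is arbitrary, and the output routine is a bare minimum that ignores
headers, footers and inserts.

\starttyping
\outputbox=1000
\output={\shipout\box\outputbox}
\stoptyping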
724 \subsection{Font syntax}
726 \LUATEX\ will accept a braced argument as a font name:
728 \starttyping
729 \font\myfont = {cmr10}
730 \stoptyping
732 This allows for embedded spaces, without the need for double quotes.
733 Macro expansion takes place inside the argument.
735 \subsection{File syntax (0.45)}
737 \LUATEX\ will accept a braced argument as a file name:
739 \starttyping
740 \input {plain}
741 \openin 0 {plain}
742 \stoptyping
744 This allows for embedded spaces, without the need for double quotes.
745 Macro expansion takes place inside the argument.
747 \subsection{Images and Forms}
749 \LUATEX\ accepts optional dimension parameters for \type{\pdfrefximage}
750 and \type{\pdfrefxform} in the same format as for \type{\pdfximage}.
751 With images, these dimensions are then used
752 instead of the ones given to \type{\pdfximage};
753 but the original dimensions are not overwritten,
754 so that a \type{\pdfrefximage} without dimensions still provides
755 the image with dimensions defined by \type{\pdfximage}.
756 These optional parameters are not implemented for \type{\pdfxform}.
758 \starttyping
759 \pdfrefximage width 20mm height 10mm depth 5mm \pdflastximage
760 \pdfrefxform width 20mm height 10mm depth 5mm \pdflastxform
761 \stoptyping
763 \section{Debugging}
765 If \tex{tracingonline} is larger than~2, the node list display will
766 also print the node number of the nodes.
768 \section{Global leaders}
770 There is a new experimental primitive: \type{\gleaders} (a \LUATEX\
771 extension, added in 0.43). This type of leaders is anchored to the
772 origin of the box to be shipped out. So they are like normal
773 \type{\leaders} in that they align nicely, except that the alignment
774 is based on the {\it largest\/} enclosing box instead of the
775 {\it smallest\/}.
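A sketch of a typical use, with arbitrary text and an arbitrary leader box
width:

\starttyping
\hbox to \hsize{Chapter one\gleaders\hbox to 1em{\hss.\hss}\hfil 1}
\hbox to \hsize{A longer chapter title\gleaders\hbox to 1em{\hss.\hss}\hfil 12}
\stoptyping

Because both lines anchor their leaders to the box that is shipped out, the
dots are expected to line up from one line to the next.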
778 \section{Expandable character codes (0.75)}
780 The new expandable command \tex{Uchar} reads a number between~0 and
781 $1{,}114{,}111$ and expands to the associated Unicode character.
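Because the command is expandable, it can be used inside \type{\edef}. In this
sketch the macro name \type{\eacute} is a hypothetical choice:

\starttyping
\edef\eacute{\Uchar"E9}   % \eacute now contains the character U+00E9
\message{\meaning\eacute}
\stoptyping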
784 \chapter {\LUA\ general}
786 \section[init]{Initialization}
788 \subsection{\LUATEX\ as a \LUA\ interpreter}
790 There are some situations that make \LUATEX\ behave like a standalone \LUA\
791 interpreter:
793 \startitemize[packed]
794 \item if a \type{--luaonly} option is given on the commandline, or
795 \item if the executable is named \type{texlua} (or \type{luatexlua}), or
796 \item if the only non|-|option argument (file) on the commandline has the extension
797 \type{lua} or \type{luc}.
798 \stopitemize
800 In this mode, it will set \LUA's \type{arg[0]} to the found script
801 name, pushing preceding options in negative values and the rest of the
802 commandline in the positive values, just like the \LUA\
803 interpreter.
805 \LUATEX\ will exit immediately after executing the specified \LUA\
806 script and is, in effect, a somewhat bulky standalone \LUA\
807 interpreter with a bunch of extra preloaded libraries.
809 \subsection{\LUATEX\ as a \LUA\ byte compiler}
811 There are two situations that make \LUATEX\ behave like the \LUA\
812 byte compiler:
814 \startitemize[packed]
815 \item if a \type{--luaconly} option is given on the commandline, or
816 \item if the executable is named \type{texluac}
817 \stopitemize
819 In this mode, \LUATEX\ is exactly like \type{luac} from the standalone
820 \LUA\ distribution, except that it does not have the \type{-l} switch,
821 and that it accepts (but ignores) the \type{--luaconly} switch.
823 \subsection{Other commandline processing}
825 When the \LUATEX\ executable starts, it looks for the \type{--lua}
826 commandline option. If there is no \type{--lua} option, the
827 commandline is interpreted in a similar fashion as in traditional
828 \PDFTEX\ and \ALEPH.
830 The following command-line switches are understood.
832 \starttabulate[|lT|p|]
833 \NC --fmt=FORMAT \NC load the format file FORMAT \NC\NR
834 \NC --lua=FILE \NC load and execute a \LUA\ initialization script\NC\NR
835 \NC --safer \NC disable easily exploitable \LUA\ commands \NC\NR
836 \NC --nosocket \NC disable the \LUA\ socket library \NC\NR
837 \NC --help \NC display help and exit \NC\NR
838 \NC --ini \NC be iniluatex, for dumping formats \NC\NR
839 \NC --interaction=STRING \NC set interaction mode (STRING=batchmode/nonstopmode/\crlf
840 scrollmode/errorstopmode) \NC \NR
841 \NC --halt-on-error \NC stop processing at the first error\NC \NR
842 \NC --kpathsea-debug=NUMBER \NC set path searching debugging flags according to
843 the bits of NUMBER \NC \NR
844 \NC --progname=STRING \NC set the program name to STRING \NC \NR
845 \NC --version \NC display version and exit \NC\NR
846 \NC --credits \NC display credits and exit \NC\NR
847 \NC --recorder \NC enable filename recorder \NC \NR
848 \NC --etex \NC ignored\NC \NR
849 \NC --output-comment=STRING \NC use STRING for DVI file comment instead of date
850 (no effect for PDF)\NC \NR
851 \NC --output-directory=DIR \NC use DIR as the directory to write files to \NC \NR
852 \NC --draftmode \NC switch on draft mode (generates no output PDF)\NC \NR
853 \NC --output-format=FORMAT \NC use FORMAT for job output; FORMAT is 'dvi' or 'pdf' \NC \NR
854 \NC --[no-]shell-escape \NC disable/enable \type{\write18{SHELL COMMAND}} \NC \NR
855 \NC --enable-write18 \NC enable \type{\write18{SHELL COMMAND}} \NC \NR
856 \NC --disable-write18 \NC disable \type{\write18{SHELL COMMAND}} \NC \NR
857 \NC --shell-restricted \NC restrict \type{\write18} to a list of commands
858 given in texmf.cnf \NC \NR
859 \NC --debug-format \NC enable format debugging \NC \NR
860 \NC --[no-]file-line-error \NC disable/enable file:line:error style messages \NC \NR
861 \NC --[no-]file-line-error-style \NC aliases of --[no-]file-line-error \NC \NR
862 \NC --jobname=STRING \NC set the job name to STRING \NC \NR
863 \NC --[no-]parse-first-line \NC disable/enable parsing of the first line of the
864 input file \NC \NR
865 \NC --translate-file= \NC ignored \NC \NR
866 \NC --default-translate-file= \NC ignored \NC \NR
867 \NC --8bit \NC ignored \NC \NR
868 \NC --[no-]mktex=FMT \NC disable/enable mktexFMT generation (FMT=tex/tfm)\NC \NR
869 \NC --synctex=NUMBER \NC enable synctex \NC \NR
870 \stoptabulate
A note on the creation of the various temporary files and on \type{\jobname}:
the value to use for \type{\jobname} is decided as follows:
875 \startitemize
876 \item If \type{--jobname} is given on the command line, its argument
877 will be the value for \tex{jobname}, without any changes. The
878 argument will not be used for actual input so it need not exist.
879 The \type{--jobname} switch only controls the \tex{jobname} setting.
880 \item Otherwise, \tex{jobname} will be the name of the first file that
881 is read from the file system, with any path components and the last
882 extension (the part following the last \type{.}) stripped off.
883 \item An exception to the previous point: if the command
884 line goes into interactive mode (by starting with a command) and
885 there are no files input via \type{\everyjob} either, then the
886 \tex{jobname} is set to \type{texput} as a last resort.
887 \stopitemize
889 The file names for output files that are generated automatically are
890 created by attaching the proper extension (\type{.log}, \type{.pdf},
891 etc.) to the found \tex{jobname}. These files are created in the
892 directory pointed to by \type{--output-directory}, or in the current
893 directory, if that switch is not present.
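The second rule and the naming of the output files can be mimicked in a few lines of \LUA. This is only a sketch: the helper names \type{jobname_from} and \type{output_name} are made up for illustration and are not part of \LUATEX, and the code assumes that \type{/} is the only directory separator.

```lua
-- Sketch only: mimic how \jobname is derived from the first file read.
local function jobname_from(filename)
  local base = filename:gsub("^.*/", "")   -- drop any path components
  base = base:gsub("%.[^.]*$", "")         -- drop the last extension only
  return base
end

-- Output files are the jobname plus an extension, placed in the output
-- directory if one is given (otherwise in the current directory).
local function output_name(jobname, extension, output_directory)
  local name = jobname .. "." .. extension
  if output_directory then
    return output_directory .. "/" .. name
  end
  return name
end

print(jobname_from("/home/user/docs/report.final.tex"))
print(output_name("report.final", "log", "build"))
```

Only the part after the last \type{.} is removed, so a file called \type{report.final.tex} yields the jobname \type{report.final}.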
895 \blank
897 Without the \type{--lua} option, command line processing works like it does in
898 any other web2c-based typesetting engine, except that \LUATEX\ has a few extra
899 switches.
If the \type{--lua} option is present, \LUATEX\ will enter a mode
of commandline processing that differs from that of the standard web2c
programs.
906 In this mode, a small series of actions is taken in order. First,
907 it will parse the commandline as usual, but it will only interpret
908 a small subset of the options immediately: \type{--safer}, \type{--nosocket},
909 \type{--[no-]shell-escape}, \type{--enable-write18}, \type{--disable-write18},
910 \type{--shell-restricted}, \type{--help}, \type{--version}, and \type{--credits}.
912 Now it searches for the requested \LUA\ initialization script. If it
913 cannot be found using the actual name given on the commandline, a
914 second attempt is made by prepending the value of the environment
915 variable \type{LUATEXDIR}, if that variable is defined in the environment.
917 Then it checks the various safety switches. You can use those to disable
918 some \LUA\ commands that can easily be abused by a malicious document. At
919 the moment, \type{--safer} \type{nil}s the following functions:
921 \starttabulate[|l|l|]
922 \NC \bf library \NC \bf functions \NC \NR
923 \NC \tt os \NC \tt execute exec setenv rename remove tmpdir \NC \NR
924 \NC \tt io \NC \tt popen output tmpfile \NC \NR
925 \NC \tt lfs \NC \tt rmdir mkdir chdir lock touch \NC \NR
926 \stoptabulate
928 Furthermore, it disables loading of compiled \LUA\ libraries (support
929 for these was added in 0.46.0), and it makes \lua{io.open()} fail on
930 files that are opened for anything besides reading.
932 \type{--nosocket} makes the socket library unavailable, so that
933 \LUA\ cannot use networking.
935 The switches \type{--[no-]shell-escape}, \type{--[enable|disable]-write18}, and
936 \type{--shell-restricted} have the same
937 effects as in \PDFTEX, and additionally make
938 \type{io.popen()}, \type{os.execute}, \type{os.exec} and \type{os.spawn}
939 adhere to the requested option.
941 Next the initialization script is loaded and executed. From within the
942 script, the entire commandline is available in the \LUA\ table
943 \lua{arg}, beginning with \lua {arg[0]}, containing the name of the executable.
945 Commandline processing happens very early on. So early, in fact, that
946 none of \TEX's initializations have taken place yet. For that reason,
947 the tables that deal with typesetting, like \luatex{tex}, \luatex{token},
948 \luatex{node} and \luatex{pdf}, are off|-|limits during the execution
949 of the startup file (they are nilled). Special care is taken that \luatex{texio.write} and
950 \luatex{texio.write_nl} function properly, so that you can at least
report your actions to the log file when (and if) it is eventually
opened (note that \TEX\ does not even know its \tex{jobname}
953 yet at this point). See \in{chapter}[libraries] for more information
954 about the \LUATEX-specific \LUA\ extension tables.
957 Everything you do in the \LUA\ initialization script will remain
958 visible during the rest of the run, with the exception of the
959 aforementioned \luatex{tex}, \luatex{token}, \luatex{node} and
960 \luatex{pdf} tables: those will be initialized
961 to their documented state after the execution of the script. You
962 should not store anything in variables or within tables with these
963 four global names, as they will be overwritten completely.
965 We recommend you use the startup file only for your own
966 \TEX|-|independent initializations (if you need any), to parse the
967 commandline, set values in the \luatex{texconfig} table, and register
968 the callbacks you need.
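A startup file that follows this recommendation might look like the sketch below. It is not taken from any distribution: the helper \type{option_value} and the variable \type{use_kpathsea} are hypothetical, and the guards on \type{texconfig} and \type{texio} are there only so that the fragment also runs in a plain \LUA\ interpreter.

```lua
-- Sketch of an initialization script passed via --lua=FILE.

-- Hypothetical helper: return the value of --NAME=VALUE from a list of
-- commandline arguments, or nil if the option is absent.
local function option_value(args, name)
  local prefix = "--" .. name .. "="
  for i = 1, #args do
    if args[i]:sub(1, #prefix) == prefix then
      return args[i]:sub(#prefix + 1)
    end
  end
  return nil
end

-- 1. Parse the commandline (the table arg is provided by LuaTeX).
local requested_jobname = option_value(arg or {}, "jobname")

-- 2. Set values in texconfig. Only kpse_init is touched here, and only
--    when the (hypothetical) switch below is turned off.
local use_kpathsea = true
texconfig = texconfig or {}
if not use_kpathsea then
  texconfig.kpse_init = false
end

-- 3. Report what happened; texio.write_nl is usable this early in the run.
if texio then
  texio.write_nl("startup: --jobname = " .. tostring(requested_jobname))
end
```

The typesetting tables (\type{tex}, \type{token}, \type{node}, \type{pdf}) are deliberately not used, since they are not available while the script runs.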
970 \LUATEX\ allows some of the commandline options to be overridden
971 by reading values from the \luatex{texconfig} table at the end of
972 script execution (see the description of the \luatex{texconfig} table
973 later on in this document for more details on which ones exactly).
975 Unless the \luatex{texconfig} table tells \LUATEX\ not to initialize
976 \KPATHSEA\ at all (set \luatex{texconfig.kpse_init} to \type{false} for that),
977 \LUATEX\ acts on some more commandline options after the
978 initialization script is finished:
979 in order to initialize the built|-|in \KPATHSEA\ library properly,
980 \LUATEX\ needs to know the correct program name to use, and for that it
981 needs to check \type{--progname}, or \type{--ini} and \type{--fmt}, if
982 \type{--progname} is missing.
985 \section{\LUA\ changes}
987 {\bf NOTE:} \LUATEX\ 0.74.0 is the first version with Lua 5.2, and
988 this is used without any patches to the core, which has some side
989 effects. In particular, Lua's \type{tonumber()} may return values in
990 scientific notation, thereby confusing the \TEX\ end of things when it
991 is used as the right-hand side of an assignment to a \type{\dimen}
992 or \type{\count}.
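One way to sidestep this is to format numbers explicitly with \type{string.format} before they reach \TEX. This sketch assumes that the value is converted to a string on the \LUA\ side; the helper names \type{as_count} and \type{as_dimen} are hypothetical.

```lua
-- The default conversion may switch to scientific notation.
local big = 1e15
print(tostring(big))

-- Hypothetical helpers: force a fixed notation before the text reaches TeX.
local function as_count(n)
  return string.format("%d", math.floor(n + 0.5))  -- rounded integer, no exponent
end

local function as_dimen(n)
  return string.format("%.5fpt", n)                -- fixed-point value in points
end

print(as_count(big))
print(as_dimen(0.00001))
```

Inside \type{\directlua}, the resulting strings could then be handed to \type{tex.sprint}.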
994 {\bf NOTE:} Also in \LUATEX\ 0.74.0 (this is a change in Lua 5.2),
995 loading dynamic Lua libraries will fail if there are two Lua libraries
996 loaded at the same time (which will typically happen on Win32, because
997 there is one Lua 5.2 inside luatex, and another will likely be linked
to the \type{dll} file of the module itself). We plan to fix that later
by switching \LUATEX\ itself to the DLL version of Lua 5.2
instead of including a static version in the binary.
1002 Starting from version 0.45, \LUATEX\ is able to use the kpathsea
1003 library to find \type{require()}d modules. For this purpose,
1004 \type{package.searchers[2]} is replaced by a different loader function,
1005 that decides at runtime whether to use kpathsea or the built-in core
1006 lua function. It uses \KPATHSEA\ when that is already initialized at
1007 that point in time, otherwise it reverts to using the normal
1008 \type{package.path} loader.
1010 Initialization of \KPATHSEA\ can happen either implicitly (when
1011 \LUATEX\ starts up and the startup script has not set
1012 \type{texconfig.kpse_init} to false), or explicitly by calling the
1013 \LUA\ function \type{kpse.set_program_name()}.
1015 Starting from version 0.46.0 \LUATEX\ is
1016 also able to use dynamically loadable \LUA\ libraries, unless
1017 \type{--safer} was given as an option on the command line.
1019 For this purpose, \type{package.searchers[3]} is replaced by a different
1020 loader function, that decides at runtime whether to use kpathsea or
the built-in core lua function. As in the previous paragraph, it uses
1022 \KPATHSEA\ when that is already initialized at that point in time,
1023 otherwise it reverts to using the normal \type{package.cpath} loader.
1025 This functionality required an extension to kpathsea:
1027 \startnarrower
1028 There is a new kpathsea file format: \type{kpse_clua_format} that
1029 searches for files with extension \type{.dll} and \type{.so}. The
1030 \type{texmf.cnf} setting for this variable is \type{CLUAINPUTS}, and
1031 by default it has this value:
1033 \starttyping
1034 CLUAINPUTS=.:$SELFAUTOLOC/lib/{$progname,$engine,}/lua//
1035 \stoptyping %$
1037 This path is imperfect (it requires a TDS subtree below the binaries
1038 directory), but the architecture has to be in the path somewhere, and
1039 the currently simplest way to do that is to search below the binaries
1040 directory only.
1042 One level up (a \type{lib} directory parallel to \type{bin}) would
1043 have been nicer, but that is not doable because \TEXLIVE\ uses a
1044 \type{bin/<arch>} structure.
1045 \stopnarrower
1047 In keeping with the other \TEX-like programs in \TEXLIVE, the two
1048 \LUA\ functions
1049 \type{os.execute} and \type{io.popen} (as well as the two new functions \type{os.exec}
1050 and \type{os.spawn} that are explained below) take the value of \type{shell_escape}
and/or \type{shell_escape_commands} into account. Whenever \LUATEX\ is run with the
1052 assumed intention to typeset a document (and by that I mean that it is called as
1053 \type{luatex}, as opposed to \type{texlua}, and that the commandline option
1054 \type{--luaonly} was not given), it will only run the four functions above if the
1055 matching texmf.cnf variable(s) or their \type{texconfig} (see~\in{section}[texconfig])
1056 counterparts allow execution of the requested system command. In \quote{script
interpreter} runs of \LUATEX, these settings have no effect, and all four functions
behave as normal. This change is new in 0.37.0.
1062 The \lua{f:read("*line")} and \lua{f:lines()} functions from the io library have
1063 been adjusted so that they are line|-|ending neutral: any of \type{LF}, \type
1064 {CR} or \type{CR+LF} are acceptable line endings.
1066 \lua{luafilesystem} has been extended: there are two extra boolean functions
1067 (\luatex{lfs.isdir(filename)} and \luatex{lfs.isfile(filename)}) and
1068 one extra string field in its attributes table
1069 (\type{permissions}). There is an additional function (added in 0.51)
1070 \type{lfs.shortname()} which takes a file name and returns its short
1071 name on WIN32 platforms. On other platforms, it just returns the given
1072 argument. The file name is not tested for existence. Finally, for
1073 non-WIN32 platforms only, there is the new function
1074 \type{lfs.readlink()} (added in 0.51) that takes an existing symbolic
1075 link as argument and returns its content. It returns an error on
1076 WIN32.
1078 The \lua{string} library has an extra function:
1079 \luatex{string.explode(s[,m])}. This function returns an array containing
1080 the string argument \type{s} split into sub-strings based on the value
1081 of the string argument \type{m}. The second argument is a string that
1082 is either empty (this splits the string into characters), a single
1083 character (this splits on each occurrence of that character, possibly
1084 introducing empty strings), or a single character followed by the plus
1085 sign \type{+} (this special version does not create empty
1086 sub-strings). The default value for \type{m} is \quote{\type{ +}} (multiple
1087 spaces).
1089 Note: \type{m} is not hidden by surrounding braces (as it would be if
1090 this function was written in \TEX\ macros).
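The three forms of the second argument can be illustrated with a pure \LUA\ emulation. The function \type{explode_sketch} below is hypothetical and only follows the description given above; it is not the implementation used by \LUATEX, and the built-in function may treat corner cases (such as an empty input string) differently.

```lua
-- Emulation of the documented behaviour of string.explode(s [, m]).
local function explode_sketch(s, m)
  m = m or " +"                      -- documented default: multiple spaces
  local parts = {}
  if m == "" then                    -- empty separator: split into characters
    for i = 1, #s do
      parts[#parts + 1] = s:sub(i, i)
    end
    return parts
  end
  local sep = m:sub(1, 1)            -- the separator character
  local skip_empty = (m:sub(2, 2) == "+")
  local start = 1
  while true do
    local pos = s:find(sep, start, true)           -- plain (non-pattern) search
    local piece = s:sub(start, (pos or #s + 1) - 1)
    if not (skip_empty and piece == "") then
      parts[#parts + 1] = piece
    end
    if not pos then break end
    start = pos + 1
  end
  return parts
end

-- Prefer the built-in function when running inside LuaTeX.
local explode = string.explode or explode_sketch

print(table.concat(explode_sketch("a,b,,c", ","), "|"))
print(table.concat(explode_sketch("a,b,,c", ",+"), "|"))
```

With \type{","} the empty field between the two commas is kept; with \type{",+"} it is dropped.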
1092 The \lua{string} library also has six extra iterators that return strings
1093 piecemeal:
1095 \startitemize
1096 \item \luatex{string.utfvalues(s)} (returns an integer value in the
1097 \UNICODE\ range)
1098 \item \luatex{string.utfcharacters(s)} (returns a string with a single
1099 \UTF-8 token in it)
1100 \item \luatex{string.characters(s)} (a string containing one byte)
\item \luatex{string.characterpairs(s)} (two strings each containing one byte); the
second string is empty if the string length was odd
\item \luatex{string.bytes(s)} (a single byte value)
\item \luatex{string.bytepairs(s)} (two byte values); the second return value is
\type{nil} instead of a number if the string length was odd
1106 \stopitemize
The \luatex{string.characterpairs()} and \luatex{string.bytepairs()} iterators
are especially useful for converting UTF-16 encoded data into UTF-8.
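Such a conversion can be sketched in plain \LUA. The code below assumes big-endian UTF-16 input without a byte order mark and does not validate surrogate pairs. The local \type{bytepairs} function emulates the iterator described above so that the fragment also runs outside \LUATEX; all helper names are hypothetical.

```lua
-- Fallback emulation of string.bytepairs for interpreters that lack it.
local function bytepairs(s)
  local i = -1
  return function()
    i = i + 2
    if i > #s then return nil end
    return s:byte(i), s:byte(i + 1)  -- second value is absent for odd lengths
  end
end

-- Encode one Unicode code point as UTF-8.
local function utf8_encode(cp)
  local floor, char = math.floor, string.char
  if cp < 0x80 then
    return char(cp)
  elseif cp < 0x800 then
    return char(0xC0 + floor(cp / 0x40), 0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    return char(0xE0 + floor(cp / 0x1000),
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  end
  return char(0xF0 + floor(cp / 0x40000),
              0x80 + floor(cp / 0x1000) % 0x40,
              0x80 + floor(cp / 0x40) % 0x40,
              0x80 + cp % 0x40)
end

-- Convert big-endian UTF-16 data to UTF-8, two bytes at a time.
local function utf16be_to_utf8(s)
  local out, high = {}, nil
  for b1, b2 in (string.bytepairs or bytepairs)(s) do
    if not b2 then break end                    -- dangling byte: stop
    local unit = b1 * 256 + b2
    if high then                                -- second half of a surrogate pair
      out[#out + 1] = utf8_encode(
        0x10000 + (high - 0xD800) * 0x400 + (unit - 0xDC00))
      high = nil
    elseif unit >= 0xD800 and unit < 0xDC00 then
      high = unit                               -- first half: wait for the next unit
    else
      out[#out + 1] = utf8_encode(unit)
    end
  end
  return table.concat(out)
end

print(utf16be_to_utf8("\0A\0B"))
```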
1112 Starting with \LUATEX\ 0.74, there is also a two-argument form of
1113 \type{string.dump()}. The second argument is a boolean which, if true,
1114 strips the symbols from the dumped data. This matches an extension
1115 made in \type{luajit}.
1117 Note: The \lua{string} library functions \luatex{len}, \luatex{lower},
1118 \luatex{sub} etc. are not \UNICODE|-|aware. For strings in the UTF-8
1119 encoding, i.e., strings containing characters above code point 127, the
1120 corresponding functions from the \lua{slnunicode} library can be used,
1121 e.g., \luatex{unicode.utf8.len}, \luatex{unicode.utf8.lower} etc. The
1122 exceptions are \luatex{unicode.utf8.find}, that always returns byte
1123 positions in a string, and \luatex{unicode.utf8.match} and
1124 \luatex{unicode.utf8.gmatch}. While the latter two functions in general
{\it are} \UNICODE|-|aware, they fall back to non|-|\UNICODE|-|aware
1126 behavior when using the empty capture \lua{()} (other captures work as
1127 expected). For the interpretation of character classes in
1128 \luatex{unicode.utf8} functions refer to the library sources at
1129 \hyphenatedurl{http://luaforge.net/projects/sln}. The \lua{slnunicode}
1130 library will be replaced by an internal \UNICODE\ library in a future
1131 \LUATEX\ version.
1132 \blank
1134 The \lua{os} library has a few extra functions and variables:
1136 \startitemize
1137 \item \luatex{os.selfdir} is a variable that holds the directory path
1138 of the actual executable. For example: {\tt \directlua{tex.sprint(os.selfdir)}}
1139 (present since 0.27.0).
1141 \item \luatex{os.exec(commandline)} is a variation on \lua{os.execute}.
1143 The \type{commandline} can be either a single string or a single table.
1145 If the argument is a table: \LUATEX\ first checks if there is a value at
integer index zero. If there is, this is the command to be executed. Otherwise,
it will use the value at integer index one. If neither is present, nothing
at all happens.
1150 The set of consecutive values starting at integer 1 in the table are
1151 the arguments that are passed on to the command (the value at index 1
1152 becomes \type{arg[0]}). The command is searched for in the execution path,
1153 so there is normally no need to pass on a fully qualified pathname.
1155 If the argument is a string, then it is automatically converted into
1156 a table by splitting on whitespace. In this case, it is impossible
1157 for the command and first argument to differ from each other.
1159 In the string argument format, whitespace can be protected by putting (part
1160 of) an argument inside single or double quotes. One layer of quotes is
1161 interpreted by \LUATEX, and all occurrences of \tex{"}, \tex{'} or
1162 \type{\\} within the quoted text are un-escaped. In the table format, there
1163 is no string handling taking place.
1165 This function normally does not return control back to the \LUA\ script: the
1166 command will replace the current process. However, it will return the two values
1167 \type{nil} and \type {'error'} if there was a problem while attempting to execute the command.
1169 On Windows, the current process is actually kept in memory until after the
1170 execution of the command has finished. This prevents crashes in situations
1171 where \TEXLUA\ scripts are run inside integrated \TEX\ environments.
1173 The original reason for this command is that it cleans out the current
1174 process before starting the new one, making it especially useful for
1175 use in \TEXLUA.
1177 \item \luatex{os.spawn(commandline)} is a returning version of \lua{os.exec},
1178 with otherwise identical calling conventions.
1180 If the command ran ok, then the return value is the exit status of the
1181 command. Otherwise, it will return the two values \type{nil} and \type {'error'}.
1183 \item \luatex{os.setenv('key','value')}
1184 This sets a variable in the environment. Passing \lua{nil} instead of a
1185 value string will remove the variable.
1187 \item \luatex{os.env}
1188 This is a hash table containing a dump of the variables and values
1189 in the process environment at the start of the run. It is writeable,
1190 but the actual environment is {\em not\/} updated automatically.
1192 \item \luatex{os.gettimeofday()}
1193 Returns the current \quote {\UNIX\ time}, but as a float. This function is
1194 not available on the \SUNOS\ platforms, so do not use this function
1195 for portable documents.
1197 \item \luatex{os.times()}
Returns the current process times according to the \UNIX\ C library function
1199 \quote {times}. This function is not available on the \MSWINDOWS\
1200 and \SUNOS\ platforms, so do not use this function for portable
1201 documents.
1203 \item \luatex{os.tmpdir()} This will create a directory in the \quote {current
1204 directory} with the name \type{luatex.XXXXXX} where the \type {X}-es are
1205 replaced by a unique string. The function also returns this string,
1206 so you can \type{lfs.chdir()} into it, or \type{nil} if it failed to
1207 create the directory. The user is responsible for cleaning up at
the end of the run; this does not happen automatically.
1210 \item \luatex{os.type}
1211 This is a string that gives a global indication of the class of operating
1212 system. The possible values are currently \type{windows}, \type{unix}, and
1213 \type{msdos} (you are unlikely to find this value \quote {in the wild}).
1215 \item \luatex{os.name}
1216 This is a string that gives a more precise indication of the operating
system. The possible values are not yet fixed, and for \type{os.type} values
\type{windows} and \type{msdos}, the \type{os.name} values are simply
\type{windows} and \type{msdos}.
1221 The list for the type \type{unix} is more precise: \type{linux},
1222 \type{freebsd}, \type{kfreebsd} (since 0.51), \type{cygwin} (since
1223 0.53), \type{openbsd}, \type{solaris}, \type{sunos} (pre-solaris),
1224 \type{hpux}, \type{irix}, \type{macosx}, \type{gnu} (hurd), \type{bsd} (unknown, but \BSD|-|like),
1225 \type{sysv} (unknown, but \SYSV|-|like), \type{generic} (unknown).
1227 (\type{os.version} is planned as a future extension)
1229 \item \luatex{os.uname()}
1230 This function returns a table with specific operating system
1231 information acquired at runtime. The keys in the returned table are
1232 all string valued, and their names are: \type{sysname}, \type{machine},
1233 \type{release}, \type{version}, and \type{nodename}.
1236 \stopitemize
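As an illustration of the table calling convention shared by \type{os.exec} and \type{os.spawn}, the following sketch wraps \type{os.spawn} in a small helper. The name \type{run} is hypothetical, and the guard makes the fragment harmless in an interpreter that lacks \type{os.spawn}. In a typesetting run, the shell escape settings still decide whether the command may be executed at all.

```lua
-- Hypothetical wrapper around os.spawn using the table calling convention.
local function run(program, ...)
  if not os.spawn then
    return nil, "os.spawn is not available in this interpreter"
  end
  -- Index 0 holds the command to execute; indices 1..n form the argument
  -- vector, so the value at index 1 is what the command sees as arg[0].
  local commandline = { [0] = program, program, ... }
  return os.spawn(commandline)
end

local status, err = run("kpsewhich", "--version")
if status == nil then
  print("could not run command: " .. tostring(err))
else
  print("exit status: " .. tostring(status))
end
```

Passing a table avoids the whitespace splitting and quote handling that apply to the string form.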
In stock \LUA, many things depend on the current locale. In \LUATEX, we can't do
that, because it makes documents unportable. While \LUATEX\ is running, it
forces the following locale settings:
1242 \starttyping
1243 LC_CTYPE=C
1244 LC_COLLATE=C
1245 LC_NUMERIC=C
1246 \stoptyping
1248 \section {\LUA\ modules}
1250 {\bf NOTE}: Starting with \LUATEX\ 0.74, the implied use of the
1251 built-in Lua modules in this section is deprecated. If you want to use
1252 one of these libraries, please start your source file with a
1253 proper \type{require} line. In the near future, \LUATEX\ will switch
1254 to loading these modules on demand.
1256 Some modules that are normally external to \LUA\ are statically linked
1257 in with \LUATEX, because they offer useful functionality:
1259 \startitemize
1260 \item \lua{slnunicode}, from the \type {Selene} libraries, \hyphenatedurl{http://luaforge.net/projects/sln}. (version 1.1)
1262 This library has been slightly extended so that the \type{unicode.utf8.*}
1263 functions also accept the first 256 values of plane~18. This is the range \LUATEX\
1264 uses for raw binary output, as explained above.
1266 \item \lua{luazip}, from the kepler project, \hyphenatedurl{http://www.keplerproject.org/luazip/}.
1267 (version 1.2.1, but patched for compilation with \LUA\ 5.2)
1268 \item \lua{luafilesystem}, also from the kepler project, \hyphenatedurl{http://www.keplerproject.org/luafilesystem/}.
1269 (version 1.5.0)
1270 \item \lua{lpeg}, by Roberto Ierusalimschy, \hyphenatedurl{http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html}. (version 0.10.2)
1272 Note: \lua{lpeg} is not \UNICODE|-|aware, but interprets strings on a
1273 byte|-|per|-|byte basis. This mainly means that \luatex{lpeg.S} cannot be
used with characters above code point 127, since those characters are
encoded using two or more bytes, and thus \luatex{lpeg.S} will look for one
of those bytes when matching, not the combination of them.
1278 The same is true for \luatex{lpeg.R}, although the latter will display
an error message if used with characters above code point 127. For example,
1280 \luatex{lpeg.R('aä')} results in the message \type{bad argument #1 to
1281 'R' (range must have two characters)}, since to \lua{lpeg}, \type{ä}
1282 is two 'characters' (bytes), so \type{aä} totals three.
1284 \item \lua{lzlib}, by Tiago Dionizio, \hyphenatedurl{http://luaforge.net/projects/lzlib/}. (version 0.2)
\item \lua{md5}, by Roberto Ierusalimschy, \hyphenatedurl{http://www.inf.puc-rio.br/~roberto/md5/md5-5/md5.html}.
1287 \item \lua{luasocket}, by Diego Nehab
1288 \hyphenatedurl{http://w3.impa.br/~diego/software/luasocket/}
1289 (version 2.0.2).
1291 Note: the \type{.lua} support modules from \type{luasocket} are also
preloaded inside the executable; there are no external file dependencies.
1293 \stopitemize
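The byte-oriented behaviour of \lua{lpeg} noted in the list above can be worked around by building a choice of literal strings instead of a character set. The sketch below writes the UTF-8 bytes out as decimal escapes so that it does not depend on the encoding of the source file; the variable names are made up for the example.

```lua
local ok, lpeg = pcall(require, "lpeg")

local a_umlaut = "\195\164"   -- UTF-8 encoding of U+00E4: two bytes
local o_umlaut = "\195\182"   -- UTF-8 encoding of U+00F6: two bytes
print(#a_umlaut)              -- length in bytes, not in characters

if ok then
  -- lpeg.S(a_umlaut .. o_umlaut) matches any ONE of the listed bytes.
  -- An ordered choice of literal strings matches whole characters instead.
  local umlaut = lpeg.P(a_umlaut) + lpeg.P(o_umlaut)
  print(lpeg.match(umlaut, a_umlaut))                      -- whole character
  print(lpeg.match(lpeg.S(a_umlaut .. o_umlaut), "\195"))  -- a lone byte matches
end
```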
1296 \chapter[libraries]{\LUATEX\ \LUA\ Libraries}
1298 {\bf NOTE}: Starting with \LUATEX\ 0.74, the implied use of the
1299 built-in Lua modules \type{epdf}, \type{fontloader}, \type{mplib},
1300 and \type{pdfscanner} is deprecated. If you want to use these, please
1301 start your source file with a proper \type{require} line. In the near
1302 future, \LUATEX\ will switch to loading these modules on demand.
1305 The interfacing between \TEX\ and \LUA\ is facilitated by a set of
1306 library modules. The \LUA\ libraries in this chapter are all defined and
1307 initialized by the \LUATEX\ executable. Together, they allow \LUA\
1308 scripts to query and change a number of \TEX's internal variables, run
1309 various internal \TEX\ functions, and set up \LUATEX's hooks to execute
1310 \LUA\ code.
1312 The following sections are in alphabetical order.
1314 \section{The \luatex{callback} library}
1316 This library has functions that register, find and list callbacks.
1318 A quick note on what callbacks are (thanks, Paul!):
1320 Callbacks are entry points to \LUATEX's internal operations, which can be
1321 interspersed with additional \LUA\ code, and even replaced altogether.
1322 In the first case, \TEX\ is simply augmented with new operations
1323 (for instance, a manipulation of the nodes resulting from the paragraph
1324 builder); in the second case, its hard-coded behavior (for instance, the
1325 paragraph builder itself) is ignored and processing relies on user code only.
1327 More precisely, the code to be inserted at a given callback is a function
1328 (an anonymous function or the name of a function variable); % Is this line useful?
1329 it will receive the arguments associated with the callback, if any, and must
frequently return some values for \TEX\ to resume its operations.
1332 The first task is registering a callback:
1334 \startfunctioncall
1335 id, error = callback.register (<string> callback_name, <function> func)
1336 id, error = callback.register (<string> callback_name, nil)
1337 id, error = callback.register (<string> callback_name, false)
1338 \stopfunctioncall
1340 where the \syntax{callback_name} is a predefined callback name, see
1341 below. The function returns the internal \type{id} of the callback
1342 or \type{nil}, if the callback could not be registered. In the latter
1343 case, \type{error} contains an error message, otherwise it is
1344 \type{nil}.
1346 \LUATEX\ internalizes the callback function in such a way that
1347 it does not matter if you redefine a function accidentally.
1349 Callback assignments are always global. You can use the special value
1350 \type {nil} instead of a function for clearing the callback.
1352 For some minor speed gain, you can assign the boolean \type{false} to
the non-file related callbacks. Doing so will prevent \LUATEX\ from
1354 executing whatever it would execute by default (when no callback
1355 function is registered at all). Be warned: this may cause all sorts of
1356 grief unless you know {\it exactly} what you are doing! This functionality
1357 is present since version 0.38.
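A minimal sketch of registering and clearing a callback follows. The function \type{my_finder} is a placeholder that simply hands back the name it was given, and the \type{if callback} guard only serves to keep the fragment runnable outside \LUATEX.

```lua
-- Placeholder callback function: returns the asked name unchanged.
local function my_finder(asked_name)
  return asked_name
end

if callback then
  local id, err = callback.register("find_font_file", my_finder)
  if not id then
    texio.write_nl("could not register callback: " .. tostring(err))
  end
  -- Passing nil instead of a function clears the callback again.
  callback.register("find_font_file", nil)
end
```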
1359 Currently, callbacks are not dumped into the format file.
1361 \startfunctioncall
1362 <table> info = callback.list()
1363 \stopfunctioncall
The keys in the table are the known callback names; each value is a
1366 boolean where \type{true} means that the callback is currently set
1367 (active).
1369 \startfunctioncall
1370 <function> f = callback.find (callback_name)
1371 \stopfunctioncall
1373 If the callback is not set, \luatex{callback.find} returns \type{nil}.
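The two query functions can be combined to report which callbacks are currently set. In this sketch the helper \type{active_callbacks} is hypothetical; it only filters and sorts the table returned by \type{callback.list}.

```lua
-- Hypothetical helper: collect the names of all callbacks that are set.
local function active_callbacks(info)
  local names = {}
  for name, is_set in pairs(info) do
    if is_set then
      names[#names + 1] = name
    end
  end
  table.sort(names)
  return names
end

if callback then
  for _, name in ipairs(active_callbacks(callback.list())) do
    local f = callback.find(name)   -- the registered function, or nil
    texio.write_nl(name .. " -> " .. tostring(f))
  end
end
```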
1375 \subsection{File discovery callbacks}
1377 The behavior documented in this subsection is considered stable in the
1378 sense that there will not be backward-incompatible changes any more.
1380 \subsubsection{\luatex{find_read_file} and \luatex{find_write_file}}
1382 Your callback function should have the following conventions:
1384 \startfunctioncall
1385 <string> actual_name = function (<number> id_number, <string> asked_name)
1386 \stopfunctioncall
1388 Arguments:
1390 \startitemize
1392 \sym{id_number}
1394 This number is zero for the log or \tex{input} files. For \TEX's \tex{read} or
1395 \tex{write} the number is incremented by one, so \tex{read0} becomes~1.
1397 \sym{asked_name}
1399 This is the user|-|supplied filename, as found by \tex{input}, \tex{openin}
1400 or \tex{openout}.
1402 \stopitemize
1404 Return value:
1406 \startitemize
1408 \sym{actual_name}
1410 This is the filename used. For the very first file that is read in by
1411 \TEX, you have to make sure you return an \type{actual_name} that has
1412 an extension and that is suitable for use as \type{jobname}. If you
1413 don't, you will have to manually fix the name of the log file and
output file after \LUATEX\ is finished, and any format
filename will become mangled. That is because these file names depend
1416 on the jobname.
1418 You have to return \type{nil} if the file cannot be found.
1420 \stopitemize
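A callback following these conventions might be sketched as below. The search strategy (try the name as given, then with \type{.tex} appended) is an assumption made for the example and much simpler than a real path search; note that it can return a name without an extension, which the text above warns against for the very first file.

```lua
-- Sketch of a find_read_file callback with a deliberately naive search.
local function find_read_file(id_number, asked_name)
  for _, candidate in ipairs({ asked_name, asked_name .. ".tex" }) do
    local f = io.open(candidate, "rb")
    if f then
      f:close()
      return candidate    -- actual_name: the first candidate that exists
    end
  end
  return nil              -- the file cannot be found
end

if callback then
  callback.register("find_read_file", find_read_file)
end
```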
1422 \subsubsection{\luatex{find_font_file}}
1424 Your callback function should have the following conventions:
1426 \startfunctioncall
1427 <string> actual_name = function (<string> asked_name)
1428 \stopfunctioncall
1430 The \type{asked_name} is an \OTF\ or \TFM\ font metrics file.
1432 Return \type{nil} if the file cannot be found.
1434 \subsubsection{\luatex{find_output_file}}
1436 Your callback function should have the following conventions:
1438 \startfunctioncall
1439 <string> actual_name = function (<string> asked_name)
1440 \stopfunctioncall
1442 The \type{asked_name} is the \PDF\ or \DVI\ file for writing.
1444 \subsubsection{\luatex{find_format_file}}
1446 Your callback function should have the following conventions:
1448 \startfunctioncall
1449 <string> actual_name = function (<string> asked_name)
1450 \stopfunctioncall
1452 The \type{asked_name} is a format file for reading (the format file
1453 for writing is always opened in the current directory).
1455 \subsubsection{\luatex{find_vf_file}}
1457 Like \luatex{find_font_file}, but for virtual fonts. This applies to
1458 both \ALEPH's \OVF\ files and traditional Knuthian \VF\ files.
1460 \subsubsection{\luatex{find_map_file}}
1462 Like \luatex{find_font_file}, but for map files.
1464 \subsubsection{\luatex{find_enc_file}}
1466 Like \luatex{find_font_file}, but for enc files.
1468 \subsubsection{\luatex{find_sfd_file}}
1470 Like \luatex{find_font_file}, but for subfont definition files.
1472 \subsubsection{\luatex{find_pk_file}}
1474 Like \luatex{find_font_file}, but for pk bitmap files. The argument
1475 \type{asked_name} is a bit special in this case. Its form is
1477 \starttyping
1478 <base res>dpi/<fontname>.<actual res>pk
1479 \stoptyping
1481 So you may be asked for \type{600dpi/manfnt.720pk}. It is up to you
1482 to find a \quote{reasonable} bitmap file to go with that specification.
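Taking such a request apart is a matter of one pattern match. The helper name \type{parse_pk_request} is hypothetical; what to do with the three parts afterwards is left to the callback.

```lua
-- Split "<base res>dpi/<fontname>.<actual res>pk" into its three parts.
local function parse_pk_request(asked_name)
  local base_res, fontname, actual_res =
    asked_name:match("^(%d+)dpi/(.+)%.(%d+)pk$")
  if not base_res then
    return nil            -- not in the expected form
  end
  return tonumber(base_res), fontname, tonumber(actual_res)
end

print(parse_pk_request("600dpi/manfnt.720pk"))
```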
1484 \subsubsection{\luatex{find_data_file}}
1486 Like \luatex{find_font_file}, but for embedded files (\tex{pdfobj file '...'}).
1488 \subsubsection{\luatex{find_opentype_file}}
1490 Like \luatex{find_font_file}, but for \OPENTYPE\ font files.
1492 \subsubsection{\luatex{find_truetype_file} and \luatex{find_type1_file}}
1494 Your callback function should have the following conventions:
1496 \startfunctioncall
1497 <string> actual_name = function (<string> asked_name)
1498 \stopfunctioncall
1500 The \type{asked_name} is a font file. This callback is called while
1501 \LUATEX\ is building its internal list of needed font files, so the
1502 actual timing may surprise you. Your return value is later fed back
1503 into the matching \luatex{read_file} callback.
1505 Strangely enough, \luatex{find_type1_file} is also used for \OPENTYPE\
1506 (\OTF) fonts.
1508 \subsubsection{\luatex{find_image_file}}
1510 Your callback function should have the following conventions:
1512 \startfunctioncall
1513 <string> actual_name = function (<string> asked_name)
1514 \stopfunctioncall
1516 The \type{asked_name} is an image file. Your return value is used to
open a file from the hard disk, so make sure you return something that
1518 is considered the name of a valid file by your operating system.
1520 \subsection[iocallback]{File reading callbacks}
1522 The behavior documented in this subsection is considered stable in the
1523 sense that there will not be backward-incompatible changes any more.
1525 \subsubsection{\luatex{open_read_file}}
1527 Your callback function should have the following conventions:
1529 \startfunctioncall
1530 <table> env = function (<string> file_name)
1531 \stopfunctioncall
1533 Argument:
1535 \startitemize
1537 \sym{file_name}
1539 The filename returned by a previous \luatex{find_read_file} or the return
1540 value of \luatex{kpse.find_file()} if there was no such callback defined.
1542 \stopitemize
1544 Return value:
1546 \startitemize
1548 \sym{env}
This is a table that contains a required and an optional callback
function for this file. The required field is \luatex{reader}; the
associated function is called once for each new line to be read. The
optional field is \luatex{close}; its function is called once when
\LUATEX\ is done with the file.

\LUATEX\ never looks at the rest of the table, so you can use it to
store your private per|-|file data. Both callback functions
receive the table as their only argument.
1560 \stopitemize
1562 \subsubsubsection{\luatex{reader}}
1564 \LUATEX\ will run this function whenever it needs a new input line
1565 from the file.
\startfunctioncall
function(<table> env)
    return <string> line
end
\stopfunctioncall
1573 Your function should return either a string or \type{nil}. The value \type{nil}
1574 signals that the end of file has occurred, and will make \TEX\ call
1575 the optional \luatex{close} function next.
1577 \subsubsubsection{\luatex{close}}
1579 \LUATEX\ will run this optional function when it decides to close the file.
\startfunctioncall
function(<table> env)
end
\stopfunctioncall
1586 Your function should not return any value.
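
Putting the pieces together, a callback that reads ordinary text files
through the standard \LUA\ \type{io} library might look as follows. This is
a sketch, not the built|-|in reader: the field name \type{handle} is our own
choice, and returning \type{nil} when the file cannot be opened is an
assumption.

\starttyping
-- Sketch of an open_read_file callback built on the standard io library.
local function open_read_file(file_name)
    local handle = io.open(file_name, "r")
    if not handle then
        return nil -- assumption: no table means the file was not opened
    end
    return {
        handle = handle, -- private per-file data, ignored by LuaTeX
        reader = function(env)
            -- one line per call; nil once the end of the file is reached
            return env.handle:read("*l")
        end,
        close = function(env)
            env.handle:close()
        end,
    }
end

-- The callback table only exists inside LuaTeX.
if callback then
    callback.register("open_read_file", open_read_file)
end
\stoptyping
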
1588 \subsubsection{General file readers}
1590 There is a set of callbacks for the loading of binary data
1591 files. These all use the same interface:
\startfunctioncall
function(<string> name)
    return <boolean> success, <string> data, <number> data_size
end
\stopfunctioncall
1599 The \type{name} will normally be a full path name as it is returned by
1600 either one of the file discovery callbacks or the internal version of
1601 \luatex{kpse.find_file()}.
1603 \startitemize
1605 \sym{success}
1607 Return \type{false} when a fatal error occurred (e.\,g.\ when the file cannot be
1608 found, after all).
1610 \sym{data}
1612 The bytes comprising the file.
1614 \sym{data_size}
1616 The length of the \type{data}, in bytes.
1618 \stopitemize
1620 Return an empty string and zero if the file was found but there was a
1621 reading problem.
1623 The list of functions is as follows:
1625 \starttabulate[|l|p|]
1626 \NC \luatex{read_font_file} \NC ofm or tfm files \NC\NR
1627 \NC \luatex{read_vf_file} \NC virtual fonts \NC\NR
1628 \NC \luatex{read_map_file} \NC map files \NC\NR
1629 \NC \luatex{read_enc_file} \NC encoding files \NC\NR
1630 \NC \luatex{read_sfd_file} \NC subfont definition files \NC\NR
1631 \NC \luatex{read_pk_file} \NC pk bitmap files \NC\NR
1632 \NC \luatex{read_data_file} \NC embedded files (\tex{pdfobj file ...}) \NC\NR
1633 \NC \luatex{read_truetype_file} \NC \TRUETYPE\ font files \NC\NR
1634 \NC \luatex{read_type1_file} \NC \TYPEONE\ font files \NC\NR
1635 \NC \luatex{read_opentype_file} \NC \OPENTYPE\ font files \NC\NR
1636 \stoptabulate
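
A reader that follows this interface can be written with the standard
\type{io} library. The sketch below registers it for
\luatex{read_data_file}; the function name \type{read_binary_file} is
hypothetical, and reporting an unreadable file as an empty string of size
zero together with \type{true} is an assumption based on the description
above.

\starttyping
-- Sketch of a reader for binary data files using the standard io library.
local function read_binary_file(name)
    local handle = io.open(name, "rb")
    if not handle then
        return false -- fatal: the file cannot be opened after all
    end
    -- assumption: a failed read is reported as empty data of size zero
    local data = handle:read("*a") or ""
    handle:close()
    return true, data, #data
end

-- The callback table only exists inside LuaTeX.
if callback then
    callback.register("read_data_file", read_binary_file)
end
\stoptyping
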
1638 \subsection{Data processing callbacks}
1640 \subsubsection{\luatex{process_input_buffer}}
1643 This callback allows you to change the contents of the line input
1644 buffer just before \LUATEX\ actually starts looking at it.
\startfunctioncall
function(<string> buffer)
    return <string> adjusted_buffer
end
\stopfunctioncall
1652 If you return \type{nil}, \LUATEX\ will pretend like your callback
1653 never happened. You can gain a small amount of processing time from
1654 that.
1656 This callback does not replace any internal code.
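
As an illustration, the sketch below replaces every tab character by a
single space and returns \type{nil} for lines that need no change. The
function name \type{tabs_to_spaces} is hypothetical.

\starttyping
-- Hypothetical input filter: turn tabs into single spaces.
local function tabs_to_spaces(buffer)
    if not string.find(buffer, "\t", 1, true) then
        return nil -- no tab in this line: keep the buffer as it is
    end
    -- the parentheses drop the substitution count returned by gsub
    return (string.gsub(buffer, "\t", " "))
end

-- The callback table only exists inside LuaTeX.
if callback then
    callback.register("process_input_buffer", tabs_to_spaces)
end
\stoptyping
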
1658 \subsubsection{\luatex{process_output_buffer} (0.43)}
1660 This callback allows you to change the contents of the line output
1661 buffer just before \LUATEX\ actually starts writing it to a file as the
1662 result of a \tex{write} command. It is only called for output to an
1663 actual file (that is, excluding the log, the terminal, and \tex{write18}
1664 calls).
\startfunctioncall
function(<string> buffer)
    return <string> adjusted_buffer
end
\stopfunctioncall
1672 If you return \type{nil}, \LUATEX\ will pretend like your callback
1673 never happened. You can gain a small amount of processing time from
1674 that.
1676 This callback does not replace any internal code.
1679 \subsubsection{\luatex{process_jobname} (0.71)}
1681 This callback allows you to change the jobname given by \type{\jobname}
1682 in \TEX\ and \type{tex.jobname} in Lua. It does not affect the internal
1683 job name or the name of the output or log files.
\startfunctioncall
function(<string> jobname)
    return <string> adjusted_jobname
end
\stopfunctioncall
1691 The only argument is the actual job name; you should not use
1692 \type{tex.jobname} inside this function or infinite recursion may occur.
1693 If you return \type{nil}, \LUATEX\ will pretend your callback never
1694 happened.
1696 This callback does not replace any internal code.
1699 \subsubsection{\luatex{token_filter}}
1701 This callback allows you to replace the way \LUATEX\ fetches
1702 lexical tokens.
\startfunctioncall
function()
    return <table> token
end
\stopfunctioncall
1710 The calling convention for this callback is a bit more complicated than
1711 for most other callbacks. The function should either return a \LUA\
1712 table representing a valid to|-|be|-|processed token or tokenlist, or
1713 something else like \type{nil} or an empty table.
1715 If your \LUA\ function does not return a table representing a valid
1716 token, it will be immediately called again, until it eventually does
1717 return a useful token or tokenlist (or until you reset the callback
1718 value to nil). See the description of \luatex{token} for some
1719 handy functions to be used in conjunction with this callback.
1721 If your function returns a single usable token, then that token will
1722 be processed by \LUATEX\ immediately. If the function returns a token
1723 list (a table consisting of a list of consecutive token tables), then
1724 that list will be pushed to the input stack at a completely new token
1725 list level, with its token type set to \quote{inserted}. In either case,
1726 the returned token(s) will not be fed back into the callback function.
1728 Setting this callback to \type{false} has no effect (because otherwise
1729 nothing would happen, forever).
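
The simplest useful filter hands every token on unchanged. The sketch below
does that with \type{token.get_next()} from the \type{token} library and
counts the calls; the names \type{seen} and \type{counting_filter} are
hypothetical.

\starttyping
local seen = 0 -- number of tokens fetched through the filter so far

-- Pass-through filter: fetch the next token the normal way and count it.
local function counting_filter()
    seen = seen + 1
    return token.get_next()
end

-- The callback table only exists inside LuaTeX.
if callback then
    callback.register("token_filter", counting_filter)
end
\stoptyping

Because the filter runs for every single token, anything beyond such
bookkeeping is likely to slow down the run noticeably.
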
1731 \subsection{Node list processing callbacks}
1733 The description of nodes and node lists is in~\in{chapter}[nodes].
1735 \subsubsection{\luatex{buildpage_filter}}
1737 This callback is called whenever \LUATEX\ is ready to move stuff to
1738 the main vertical list. You can use this callback to do specialized
1739 manipulation of the page building stage like imposition or column
1740 balancing.
\startfunctioncall
function(<string> extrainfo)
end
\stopfunctioncall
1747 The string \type{extrainfo} gives some additional information about
1748 what \TEX's state is with respect to the \quote{current page}. The possible
1749 values are:
1751 \starttabulate[|lT|p|]
1752 \NC \ssbf value \NC \bf explanation \NC\NR
1753 \NC alignment \NC a (partial) alignment is being added \NC\NR
1754 \NC after_output \NC an output routine has just finished \NC\NR
1755 \NC box \NC a typeset box is being added \NC\NR
1756 %\NC pre_box \NC interline material is being added \NC\NR
1757 %\NC adjust \NC \tex{vadjust} material is being added \NC\NR
1758 \NC new_graf \NC the beginning of a new paragraph \NC\NR
1759 \NC vmode_par \NC \tex{par} was found in vertical mode \NC\NR
1760 \NC hmode_par \NC \tex{par} was found in horizontal mode \NC\NR
1761 \NC insert \NC an insert is added \NC\NR
1762 \NC penalty \NC a penalty (in vertical mode) \NC\NR
1763 \NC before_display \NC immediately before a display starts \NC\NR
1764 \NC after_display \NC a display is finished \NC\NR
1765 \NC end \NC \LUATEX\ is terminating (it's all over)\NC\NR
1766 \stoptabulate
1768 This callback does not replace any internal code.
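
A callback of this kind is mostly useful for tracing. The sketch below only
counts how often each \type{extrainfo} value is seen; \type{counts} and
\type{count_buildpage} are hypothetical names.

\starttyping
local counts = {} -- extrainfo value -> number of times it was reported

local function count_buildpage(extrainfo)
    counts[extrainfo] = (counts[extrainfo] or 0) + 1
end

-- The callback table only exists inside LuaTeX.
if callback then
    callback.register("buildpage_filter", count_buildpage)
end
\stoptyping
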
1771 \subsubsection{\luatex{pre_linebreak_filter}}
1773 This callback is called just before \LUATEX\ starts converting a list
1774 of nodes into a stack of \tex{hbox}es, after the addition of
1775 \type{\parfillskip}.
\startfunctioncall
function(<node> head, <string> groupcode)
    return true | false | <node> newhead
end
\stopfunctioncall
The string called \type {groupcode} identifies the nodelist's context
within \TEX's processing. The range of possibilities is given in the
table below, but not all of those can actually appear in
\luatex {pre_linebreak_filter}; some are for the
\luatex {hpack_filter} and \luatex {vpack_filter} callbacks that
are explained later in this subsection.
1790 \starttabulate[|lT|p|]
1791 \NC \ssbf value \NC \bf explanation \NC\NR
1792 \NC <empty> \NC main vertical list \NC\NR
1793 \NC hbox \NC \tex{hbox} in horizontal mode \NC\NR
1794 \NC adjusted_hbox\NC \tex{hbox} in vertical mode \NC\NR
1795 \NC vbox \NC \tex{vbox} \NC\NR
1796 \NC vtop \NC \tex{vtop} \NC\NR
1797 \NC align \NC \tex{halign} or \tex{valign} \NC\NR
1798 \NC disc \NC discretionaries \NC\NR
1799 \NC insert \NC packaging an insert \NC\NR
1800 \NC vcenter \NC \tex{vcenter} \NC\NR
1801 \NC local_box \NC \tex{localleftbox} or \tex{localrightbox} \NC\NR
1802 \NC split_off \NC top of a \tex{vsplit} \NC\NR
1803 \NC split_keep \NC remainder of a \tex{vsplit} \NC\NR
1804 \NC align_set \NC alignment cell \NC\NR
1805 \NC fin_row \NC alignment row \NC\NR
1806 \stoptabulate
1808 As for all the callbacks that deal with nodes, the return value can be one of three things:
1810 \startitemize
\item boolean \type{true} signals successful processing
1812 \item \type{<node>} signals that the \quote{head} node should be replaced by the returned node
1813 \item boolean \type{false} signals that the \quote{head} node list should be ignored and
1814 flushed from memory
1815 \stopitemize
1818 This callback does not replace any internal code.
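
A filter that only inspects the list can simply return \type{true}. The
sketch below counts the glyph nodes with \type{node.traverse_id} and reports
the result with \type{texio.write_nl}; the function name
\type{count_glyphs} is hypothetical.

\starttyping
-- Hypothetical filter: report the number of glyphs, change nothing.
local function count_glyphs(head, groupcode)
    local glyph_id = node.id("glyph")
    local n = 0
    for _ in node.traverse_id(glyph_id, head) do
        n = n + 1
    end
    texio.write_nl(string.format("%d glyphs in group '%s'", n, groupcode))
    return true -- the list is kept as it is
end

-- The callback table only exists inside LuaTeX.
if callback then
    callback.register("pre_linebreak_filter", count_glyphs)
end
\stoptyping
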
1821 \subsubsection{\luatex{linebreak_filter}}
1823 This callback replaces \LUATEX's line breaking algorithm.
\startfunctioncall
function(<node> head, <boolean> is_display)
    return <node> newhead
end
\stopfunctioncall
The returned node is the head of the list that will be added to the
main vertical list. The boolean argument is true if this paragraph is
interrupted by a following math display.

If you return something that is not a \type{<node>}, \LUATEX\ will
apply the internal linebreak algorithm on the list that starts at
\type{<head>}. Otherwise, the \type{<node>} you return is supposed
to be the head of a list of nodes that are all allowed in vertical
mode, and at least one of those has to represent an hbox. Failure to do
so will result in a fatal error.
1842 Setting this callback to \type{false} is possible, but dangerous,
1843 because it is possible you will end up in an unfixable
1844 \quote{deadcycles loop}.
1846 \subsubsection{\luatex{post_linebreak_filter}}
1848 This callback is called just after \LUATEX\ has converted a list
1849 of nodes into a stack of \tex{hbox}es.
\startfunctioncall
function(<node> head, <string> groupcode)
    return true | false | <node> newhead
end
\stopfunctioncall
1857 This callback does not replace any internal code.
1859 \subsubsection{\luatex{hpack_filter}}
1861 This callback is called when \TEX\ is ready to start boxing some
1862 horizontal mode material. Math items and line boxes are ignored
1863 at the moment.
\startfunctioncall
function(<node> head, <string> groupcode, <number> size,
         <string> packtype [, <string> direction])
    return true | false | <node> newhead
end
\stopfunctioncall
The \type{packtype} is either \type{additional} or \type{exactly}. If
\type{additional}, then the \type{size} is a \tex{hbox spread ...}
argument. If \type{exactly}, then the \type{size} is a \tex{hbox to ...}
argument. In both cases, the number is in scaled points.
1877 The \type{direction} is either one of the three-letter direction specifier
1878 strings, or \type{nil} (added in 0.45).
1881 This callback does not replace any internal code.
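
To make the meaning of \type{size} and \type{packtype} concrete, the sketch
below turns them back into a readable box specification and writes it with
\type{texio.write_nl}. The names \type{describe_pack} and
\type{report_hpack} are hypothetical; the conversion relies on 65536 scaled
points making one point.

\starttyping
-- Hypothetical helper: turn size and packtype into a readable string.
local function describe_pack(size, packtype)
    local pt = size / 65536 -- 65536 scaled points make one point
    if packtype == "exactly" then
        return string.format("to %.2fpt", pt)
    end
    return string.format("spread %.2fpt", pt) -- packtype "additional"
end

local function report_hpack(head, groupcode, size, packtype, direction)
    texio.write_nl("hpack " .. describe_pack(size, packtype))
    return true -- the list itself is left untouched
end

-- The callback table only exists inside LuaTeX.
if callback then
    callback.register("hpack_filter", report_hpack)
end
\stoptyping
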
1883 \subsubsection{\luatex{vpack_filter}}
1885 This callback is called when \TEX\ is ready to start boxing some
1886 vertical mode material. Math displays are ignored at the moment.
1888 This function is very similar to the \luatex{hpack_filter}. Besides
1889 the fact that it is called at different moments, there is an extra
1890 variable that matches \TEX's \tex{maxdepth} setting.
\startfunctioncall
function(<node> head, <string> groupcode, <number> size,
         <string> packtype, <number> maxdepth [, <string> direction])
    return true | false | <node> newhead
end
\stopfunctioncall
1899 This callback does not replace any internal code.
1901 \subsubsection{\luatex{pre_output_filter}}
This callback is called when \TEX\ is ready to start boxing
box 255 for \tex{output}.
\startfunctioncall
function(<node> head, <string> groupcode, <number> size, <string> packtype,
         <number> maxdepth [, <string> direction])
    return true | false | <node> newhead
end
\stopfunctioncall
1913 This callback does not replace any internal code.
1915 \subsubsection{\luatex{hyphenate}}
\startfunctioncall
function(<node> head, <node> tail)
end
\stopfunctioncall
1922 No return values. This callback has to insert discretionary nodes in
1923 the node list it receives.
1925 Setting this callback to \type{false} will prevent the internal
1926 discretionary insertion pass.
1928 \subsubsection{\luatex{ligaturing}}
\startfunctioncall
function(<node> head, <node> tail)
end
\stopfunctioncall
1935 No return values. This callback has to apply ligaturing to the node
1936 list it receives.
1938 You don't have to worry about return values because the \type{head}
1939 node that is passed on to the callback is guaranteed not to be a
1940 glyph_node (if need be, a temporary node will be prepended), and
1941 therefore it cannot be affected by the mutations that take place.
1942 After the callback, the internal value of the \quote {tail of the list}
1943 will be recalculated.
1945 The \type{next} of \type{head} is guaranteed to be non-nil.
1947 The \type{next} of \type{tail} is guaranteed to be nil, and therefore the
1948 second callback argument can often be ignored. It is provided for
1949 orthogonality, and because it can sometimes be handy when special
1950 processing has to take place.
1952 Setting this callback to \type{false} will prevent the internal
1953 ligature creation pass.
1955 \subsubsection{\luatex{kerning}}
\startfunctioncall
function(<node> head, <node> tail)
end
\stopfunctioncall
1962 No return values. This callback has to apply kerning between the nodes
1963 in the node list it receives. See \type{ligaturing} for calling
1964 conventions.
1966 Setting this callback to \type{false} will prevent the internal
1967 kern insertion pass.
1969 \subsubsection{\luatex{mlist_to_hlist}}
1971 This callback replaces \LUATEX's math list to node list conversion algorithm.
\startfunctioncall
function(<node> head, <string> display_type, <boolean> need_penalties)
    return <node> newhead
end
\stopfunctioncall
The returned node is the head of the list that will be added to the vertical or
horizontal list. The string argument is either \quote{text} or \quote{display},
depending on the current math mode. The boolean argument is \type{true} if penalties
have to be inserted in this list, \type{false} otherwise.

Setting this callback to \type{false} is bad: it will almost
certainly result in an endless loop.
1987 \subsection{Information reporting callbacks}
1989 \subsubsection{\luatex{pre_dump} (0.61)}
\startfunctioncall
function()
end
\stopfunctioncall
1996 This function is called just before dumping to a format file starts.
1997 It does not replace any code and there are neither arguments nor return values.
1999 \subsubsection{\luatex{start_run}}
\startfunctioncall
function()
end
\stopfunctioncall
This callback replaces the code that prints \LUATEX's banner. Note that for
successful use, this callback has to be set in the \LUA\ initialization script;
otherwise it will be seen only after the run has already started.
2010 \subsubsection{\luatex{stop_run}}
\startfunctioncall
function()
end
\stopfunctioncall
2017 This callback replaces the code that prints \LUATEX's statistics and \quote{output written
2018 to} messages.
2020 \subsubsection{\luatex{start_page_number}}
\startfunctioncall
function()
end
\stopfunctioncall
Replaces the code that prints the \type{[} and the page number at the
beginning of \tex{shipout}. This callback will also override the
printing of box information that normally takes place when
\tex{tracingoutput} is positive.
2032 \subsubsection{\luatex{stop_page_number}}
\startfunctioncall
function()
end
\stopfunctioncall
2039 Replaces the code that prints the \type{]} at the end of \tex{shipout}.
2041 \subsubsection{\luatex{show_error_hook}}
\startfunctioncall
function()
end
\stopfunctioncall
2048 This callback is run from inside the \TEX\ error function, and the idea
2049 is to allow you to do some extra reporting on top of what \TEX\ already
2050 does (none of the normal actions are removed). You may find some of
2051 the values in the \luatex{status} table useful.
2053 This callback does not replace any internal code.
2055 \iffalse % this has been retracted for the moment
2056 \startitemize
2058 \sym{message}
2060 is the formal error message \TEX\ has given to the user.
2061 (the line after the '!').
2063 \sym{indicator}
2065 is either a filename (when it is a string) or a location indicator (a
2066 number) that can mean lots of different things like a token list id
2067 or a \tex{read} number.
2069 \sym{lineno}
2071 is the current line number.
2072 \stopitemize
This is an investigative item for 'testing the water' only.
The final goal is the total replacement of \TEX's error handling
routines, but that needs lots of adjustments in the web source because
\TEX\ deals with errors in a somewhat haphazard fashion. This is why the
exact definition of \type{indicator} is not given here.

\fi

2082 \subsubsection{\luatex{show_error_message}}
\startfunctioncall
function()
end
\stopfunctioncall
2089 This callback replaces the code that prints the error message. The usual
2090 interaction after the message is not affected.
2092 \subsubsection{\luatex{show_lua_error_hook}}
\startfunctioncall
function()
end
\stopfunctioncall
This callback replaces the code that prints the extra \LUA\ error message.
2101 \subsection{PDF-related callbacks}
2103 \subsubsection{\luatex{finish_pdffile}}
\startfunctioncall
function()
end
\stopfunctioncall
2110 This callback is called when all document pages are already written to the \PDF\
2111 file and \LUATEX\ is about to finalize the output document structure. Its intended
2112 use is final update of \PDF\ dictionaries such as \type{/Catalog} or
2113 \type{/Info}. The callback does not replace any code. There are neither
2114 arguments nor return values.
2117 \subsubsection{\luatex{finish_pdfpage}}
\startfunctioncall
function(shippingout)
end
\stopfunctioncall
This callback is called after the \PDF\ page stream has been assembled and before the
page object gets finalized. It is available in \LUATEX\ 0.78.4 and later.
2129 \subsection{Font-related callbacks}
2131 \subsubsection{\luatex{define_font}}
\startfunctioncall
function(<string> name, <number> size, <number> id)
    return <table> font
end
\stopfunctioncall
2139 The string \type{name} is the filename part of the font
2140 specification, as given by the user.
2142 The number \type{size} is a bit special:
2144 \startitemize[packed]
2145 \item if it is positive, it specifies an \quote{at size} in scaled points.
2146 \item if it is negative, its absolute value represents a \quote{scaled}
2147 setting relative to the designsize of the font.
2148 \stopitemize
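
Both cases can be folded into a single \quote{at size}. The sketch below
assumes \TEX's usual convention that \type{scaled 1000} stands for the
design size; the helper \type{resolve_size} and its \type{designsize}
parameter (in scaled points, taken from the font data) are hypothetical.

\starttyping
-- Hypothetical helper: turn the size argument into scaled points.
local function resolve_size(size, designsize)
    if size < 0 then
        -- assumption: "scaled 1000" means exactly the design size
        return designsize * (-size) / 1000
    end
    return size -- already an "at size" in scaled points
end
\stoptyping

With a design size of 10pt (655360 scaled points), a \type{size} of
\type{-1200} resolves to 786432 scaled points, that is, 12pt.
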
2150 The \type{id} is the internal number assigned to the font.
2152 The internal structure of the \type{font} table that is to be
2153 returned is explained in \in{chapter}[fonts]. That table is saved
2154 internally, so you can put extra fields in the table for your
2155 later \LUA\ code to use.
2157 Setting this callback to \type{false} is pointless as it will prevent
2158 font loading completely but will nevertheless generate errors.
2160 \section{The \luatex{epdf} library}
The \type{epdf} library provides Lua bindings to many \PDF\ access functions
that are defined by the poppler \PDF\ viewer library (written in C$+{}+$
by Kristian H\o gsberg, based on xpdf by Derek Noonburg).
Within \LUATEX\ (and \PDFTEX),
xpdf functionality has long been used to embed \PDF\ files.
The \type{epdf} library makes it possible to inspect an external \PDF\ file.
It gives access to its document structure,
e.\,g., catalog, cross-reference table, individual pages, objects,
annotations, info, and metadata.
2172 The \type{epdf} library is still in alpha state:
2173 \PDF\ access is currently read|-|only
2174 (it's not yet possible to alter a \PDF\ file or to assemble it from scratch),
2175 and many function bindings are still missing.
To start,
a \PDF\ file is opened by calling \type{epdf.open()} with a file name, e.\,g.:
2180 \starttyping
2181 doc = epdf.open("foo.pdf")
2182 \stoptyping
This normally returns a \type{PDFDoc} userdata variable.
If the file cannot be opened successfully,
the value \type{nil} is returned instead of raising a fatal error.
2188 All Lua functions in the \type{epdf} library are named after the
2189 poppler functions listed in the poppler header files for the various classes,
2190 e.\,g., files \type{PDFDoc.h}, \type{Dict.h}, and \type{Array.h}.
2191 These files can be found in the poppler subdirectory within the \LUATEX\ sources.
2192 Which functions are already implemented in the \type{epdf} library
2193 can be found in the \LUATEX\ source file \type{lepdflib.cc}.
2194 For using the \type{epdf} library,
2195 knowledge of the \PDF\ file architecture is indispensable.
The \type{epdf} library defines many different userdata types.
Currently these are
2199 \type{Annot},
2200 \type{AnnotBorder},
2201 \type{AnnotBorderStyle},
2202 \type{Annots},
2203 \type{Array},
2204 \type{Catalog},
2205 \type{EmbFile},
2206 \type{Dict},
2207 \type{GString},
2208 \type{Link},
2209 \type{LinkDest},
2210 \type{Links},
2211 \type{Object},
2212 \type{ObjectStream},
2213 \type{Page},
2214 \type{PDFDoc},
2215 \type{PDFRectangle},
2216 \type{Ref},
2217 \type{Stream},
2218 \type{XRef}, and
2219 \type{XRefEntry}.
All these userdata names and the Lua access functions closely follow
the class naming in the poppler header files,
including the choice of mixed upper and lower case letters.
The Lua function calls use object-oriented syntax, e.\,g.,
the following calls return the \type{Page} object for page~1:
2227 \starttyping
2228 pageref = doc:getCatalog():getPageRef(1)
2229 pageobj = doc:getXRef():fetch(pageref.num, pageref.gen)
2230 \stoptyping
But writing such chained calls is risky,
as an intermediate function may return \type{nil} on error;
therefore the result of each call should be checked in Lua
(e.\,g., against \type{nil}) before the next call is made.
If a non-object item is requested
(e.\,g., a \type{Dict} item by calling \type{page:getPieceInfo()},
cf.~\type{Page.h}) but not available,
the Lua functions return \type{nil} (without error).
If a function should return an \type{Object} but none exists,
a \type{Null} object is returned instead
(also without error; this is in|-|line with poppler behavior).
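
A defensive variant of the chained calls checks every intermediate result
before it is used. The sketch below only uses methods listed in this
section; the function name \type{fetch_page_object} is hypothetical.

\starttyping
-- Hypothetical helper: fetch the object of page pageno, or nil on failure.
local function fetch_page_object(doc, pageno)
    if not doc then
        return nil -- epdf.open() failed
    end
    local catalog = doc:getCatalog()
    if not catalog then
        return nil
    end
    local pageref = catalog:getPageRef(pageno)
    if not pageref then
        return nil
    end
    local xref = doc:getXRef()
    if not xref then
        return nil
    end
    return xref:fetch(pageref.num, pageref.gen)
end
\stoptyping

It would be called as \type{fetch_page_object(epdf.open("foo.pdf"), 1)}.
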
2244 All library objects have a \type{__gc} metamethod for garbage collection.
2245 The \type{__tostring} metamethod gives the type name for each object.
2247 All object constructors:
2249 \startfunctioncall
2250 <PDFDoc> = epdf.open(<string> PDF filename)
2251 <Annot> = epdf.Annot(<XRef>, <Dict>, <Catalog>, <Ref>)
2252 <Annots> = epdf.Annots(<XRef>, <Catalog>, <Object>)
2253 <Array> = epdf.Array(<XRef>)
2254 <Dict> = epdf.Dict(<XRef>)
2255 <Object> = epdf.Object()
2256 <PDFRectangle> = epdf.PDFRectangle()
2257 \stopfunctioncall
2259 \type{Annot} methods:
2261 \startfunctioncall
2262 <boolean> = <Annot>:isOK()
2263 <Object> = <Annot>:getAppearance()
2264 <AnnotBorder> = <Annot>:getBorder()
2265 <boolean> = <Annot>:match(<Ref>)
2266 \stopfunctioncall
2268 \type{AnnotBorderStyle} methods:
2270 \startfunctioncall
2271 <number> = <AnnotBorderStyle>:getWidth()
2272 \stopfunctioncall
2274 \type{Annots} methods:
2276 \startfunctioncall
2277 <integer> = <Annots>:getNumAnnots()
2278 <Annot> = <Annots>:getAnnot(<integer>)
2279 \stopfunctioncall
2281 \type{Array} methods:
2283 \startfunctioncall
2284 <Array>:incRef()
2285 <Array>:decRef()
2286 <integer> = <Array>:getLength()
2287 <Array>:add(<Object>)
2288 <Object> = <Array>:get(<integer>)
2289 <Object> = <Array>:getNF(<integer>)
2290 <string> = <Array>:getString(<integer>)
2291 \stopfunctioncall
2293 \type{Catalog} methods:
2295 \startfunctioncall
2296 <boolean> = <Catalog>:isOK()
2297 <integer> = <Catalog>:getNumPages()
2298 <Page> = <Catalog>:getPage(<integer>)
2299 <Ref> = <Catalog>:getPageRef(<integer>)
2300 <string> = <Catalog>:getBaseURI()
2301 <string> = <Catalog>:readMetadata()
2302 <Object> = <Catalog>:getStructTreeRoot()
2303 <integer> = <Catalog>:findPage(<integer> object number, <integer> object generation)
2304 <LinkDest> = <Catalog>:findDest(<string> name)
2305 <Object> = <Catalog>:getDests()
2306 <integer> = <Catalog>:numEmbeddedFiles()
2307 <EmbFile> = <Catalog>:embeddedFile(<integer>)
2308 <integer> = <Catalog>:numJS()
2309 <string> = <Catalog>:getJS(<integer>)
2310 <Object> = <Catalog>:getOutline()
2311 <Object> = <Catalog>:getAcroForm()
2312 \stopfunctioncall
2314 \type{EmbFile} methods:
2316 \startfunctioncall
2317 <string> = <EmbFile>:name()
2318 <string> = <EmbFile>:description()
2319 <integer> = <EmbFile>:size()
2320 <string> = <EmbFile>:modDate()
2321 <string> = <EmbFile>:createDate()
2322 <string> = <EmbFile>:checksum()
2323 <string> = <EmbFile>:mimeType()
2324 <Object> = <EmbFile>:streamObject()
2325 <boolean> = <EmbFile>:isOk()
2326 \stopfunctioncall
2328 \type{Dict} methods:
2330 \startfunctioncall
2331 <Dict>:incRef()
2332 <Dict>:decRef()
2333 <integer> = <Dict>:getLength()
2334 <Dict>:add(<string>, <Object>)
2335 <Dict>:set(<string>, <Object>)
2336 <Dict>:remove(<string>)
2337 <boolean> = <Dict>:is(<string>)
2338 <Object> = <Dict>:lookup(<string>)
2339 <Object> = <Dict>:lookupNF(<string>)
2340 <integer> = <Dict>:lookupInt(<string>, <string>)
2341 <string> = <Dict>:getKey(<integer>)
2342 <Object> = <Dict>:getVal(<integer>)
2343 <Object> = <Dict>:getValNF(<integer>)
2344 <boolean> = <Dict>:hasKey(<string>)
2345 \stopfunctioncall
2347 \type{Link} methods:
2349 \startfunctioncall
2350 <boolean> = <Link>:isOK()
2351 <boolean> = <Link>:inRect(<number>, <number>)
2352 \stopfunctioncall
2354 \type{LinkDest} methods:
2356 \startfunctioncall
2357 <boolean> = <LinkDest>:isOK()
2358 <integer> = <LinkDest>:getKind()
2359 <string> = <LinkDest>:getKindName()
2360 <boolean> = <LinkDest>:isPageRef()
2361 <integer> = <LinkDest>:getPageNum()
2362 <Ref> = <LinkDest>:getPageRef()
2363 <number> = <LinkDest>:getLeft()
2364 <number> = <LinkDest>:getBottom()
2365 <number> = <LinkDest>:getRight()
2366 <number> = <LinkDest>:getTop()
2367 <number> = <LinkDest>:getZoom()
2368 <boolean> = <LinkDest>:getChangeLeft()
2369 <boolean> = <LinkDest>:getChangeTop()
2370 <boolean> = <LinkDest>:getChangeZoom()
2371 \stopfunctioncall
2373 \type{Links} methods:
2375 \startfunctioncall
2376 <integer> = <Links>:getNumLinks()
2377 <Link> = <Links>:getLink(<integer>)
2378 \stopfunctioncall
2380 \type{Object} methods:
2382 \startfunctioncall
2383 <Object>:initBool(<boolean>)
2384 <Object>:initInt(<integer>)
2385 <Object>:initReal(<number>)
2386 <Object>:initString(<string>)
2387 <Object>:initName(<string>)
2388 <Object>:initNull()
2389 <Object>:initArray(<XRef>)
2390 <Object>:initDict(<XRef>)
2391 <Object>:initStream(<Stream>)
2392 <Object>:initRef(<integer> object number, <integer> object generation)
2393 <Object>:initCmd(<string>)
2394 <Object>:initError()
2395 <Object>:initEOF()
2396 <Object> = <Object>:fetch(<XRef>)
2397 <integer> = <Object>:getType()
2398 <string> = <Object>:getTypeName()
2399 <boolean> = <Object>:isBool()
2400 <boolean> = <Object>:isInt()
2401 <boolean> = <Object>:isReal()
2402 <boolean> = <Object>:isNum()
2403 <boolean> = <Object>:isString()
2404 <boolean> = <Object>:isName()
2405 <boolean> = <Object>:isNull()
2406 <boolean> = <Object>:isArray()
2407 <boolean> = <Object>:isDict()
2408 <boolean> = <Object>:isStream()
2409 <boolean> = <Object>:isRef()
2410 <boolean> = <Object>:isCmd()
2411 <boolean> = <Object>:isError()
2412 <boolean> = <Object>:isEOF()
2413 <boolean> = <Object>:isNone()
2414 <boolean> = <Object>:getBool()
2415 <integer> = <Object>:getInt()
2416 <number> = <Object>:getReal()
2417 <number> = <Object>:getNum()
2418 <string> = <Object>:getString()
2419 <string> = <Object>:getName()
2420 <Array> = <Object>:getArray()
2421 <Dict> = <Object>:getDict()
2422 <Stream> = <Object>:getStream()
2423 <Ref> = <Object>:getRef()
2424 <integer> = <Object>:getRefNum()
2425 <integer> = <Object>:getRefGen()
2426 <string> = <Object>:getCmd()
2427 <integer> = <Object>:arrayGetLength()
<Object>:arrayAdd(<Object>)
2429 <Object> = <Object>:arrayGet(<integer>)
2430 <Object> = <Object>:arrayGetNF(<integer>)
2431 <integer> = <Object>:dictGetLength(<integer>)
<Object>:dictAdd(<string>, <Object>)
<Object>:dictSet(<string>, <Object>)
2434 <Object> = <Object>:dictLookup(<string>)
2435 <Object> = <Object>:dictLookupNF(<string>)
2436 <string> = <Object>:dictgetKey(<integer>)
2437 <Object> = <Object>:dictgetVal(<integer>)
2438 <Object> = <Object>:dictgetValNF(<integer>)
2439 <boolean> = <Object>:streamIs(<string>)
<Object>:streamReset()
2441 <integer> = <Object>:streamGetChar()
2442 <integer> = <Object>:streamLookChar()
2443 <integer> = <Object>:streamGetPos()
<Object>:streamSetPos(<integer>)
2445 <Dict> = <Object>:streamGetDict()
2446 \stopfunctioncall
2448 \type{Page} methods:
2450 \startfunctioncall
2451 <boolean> = <Page>:isOk()
2452 <integer> = <Page>:getNum()
2453 <PDFRectangle> = <Page>:getMediaBox()
2454 <PDFRectangle> = <Page>:getCropBox()
2455 <boolean> = <Page>:isCropped()
2456 <number> = <Page>:getMediaWidth()
2457 <number> = <Page>:getMediaHeight()
2458 <number> = <Page>:getCropWidth()
2459 <number> = <Page>:getCropHeight()
2460 <PDFRectangle> = <Page>:getBleedBox()
2461 <PDFRectangle> = <Page>:getTrimBox()
2462 <PDFRectangle> = <Page>:getArtBox()
2463 <integer> = <Page>:getRotate()
2464 <string> = <Page>:getLastModified()
2465 <Dict> = <Page>:getBoxColorInfo()
2466 <Dict> = <Page>:getGroup()
2467 <Stream> = <Page>:getMetadata()
2468 <Dict> = <Page>:getPieceInfo()
2469 <Dict> = <Page>:getSeparationInfo()
2470 <Dict> = <Page>:getResourceDict()
2471 <Object> = <Page>:getAnnots()
2472 <Links> = <Page>:getLinks(<Catalog>)
2473 <Object> = <Page>:getContents()
2474 \stopfunctioncall
2476 \type{PDFDoc} methods:
2478 \startfunctioncall
2479 <boolean> = <PDFDoc>:isOk()
2480 <integer> = <PDFDoc>:getErrorCode()
2481 <string> = <PDFDoc>:getErrorCodeName()
2482 <string> = <PDFDoc>:getFileName()
2483 <XRef> = <PDFDoc>:getXRef()
2484 <Catalog> = <PDFDoc>:getCatalog()
2485 <number> = <PDFDoc>:getPageMediaWidth()
2486 <number> = <PDFDoc>:getPageMediaHeight()
2487 <number> = <PDFDoc>:getPageCropWidth()
2488 <number> = <PDFDoc>:getPageCropHeight()
2489 <integer> = <PDFDoc>:getNumPages()
2490 <string> = <PDFDoc>:readMetadata()
2491 <Object> = <PDFDoc>:getStructTreeRoot()
2492 <integer> = <PDFDoc>:findPage(<integer> object number, <integer> object generation)
2493 <Links> = <PDFDoc>:getLinks(<integer>)
2494 <LinkDest> = <PDFDoc>:findDest(<string>)
2495 <boolean> = <PDFDoc>:isEncrypted()
2496 <boolean> = <PDFDoc>:okToPrint()
2497 <boolean> = <PDFDoc>:okToChange()
2498 <boolean> = <PDFDoc>:okToCopy()
2499 <boolean> = <PDFDoc>:okToAddNotes()
2500 <boolean> = <PDFDoc>:isLinearized()
2501 <Object> = <PDFDoc>:getDocInfo()
2502 <Object> = <PDFDoc>:getDocInfoNF()
2503 <integer> = <PDFDoc>:getPDFMajorVersion()
2504 <integer> = <PDFDoc>:getPDFMinorVersion()
2505 \stopfunctioncall
2507 \type{PDFRectangle} methods:
2509 \startfunctioncall
2510 <boolean> = <PDFRectangle>:isValid()
2511 \stopfunctioncall
2513 %\type{Ref} methods:
2515 %\startfunctioncall
2516 %\stopfunctioncall
2518 \type{Stream} methods:
2520 \startfunctioncall
2521 <integer> = <Stream>:getKind()
2522 <string> = <Stream>:getKindName()
2523 = <Stream>:reset()
2524 = <Stream>:close()
2525 <integer> = <Stream>:getChar()
2526 <integer> = <Stream>:lookChar()
2527 <integer> = <Stream>:getRawChar()
2528 <integer> = <Stream>:getUnfilteredChar()
2529 = <Stream>:unfilteredReset()
2530 <integer> = <Stream>:getPos()
2531 <boolean> = <Stream>:isBinary()
2532 <Stream> = <Stream>:getUndecodedStream()
2533 <Dict> = <Stream>:getDict()
2534 \stopfunctioncall
2536 \type{XRef} methods:
2538 \startfunctioncall
2539 <boolean> = <XRef>:isOk()
2540 <integer> = <XRef>:getErrorCode()
2541 <boolean> = <XRef>:isEncrypted()
2542 <boolean> = <XRef>:okToPrint()
2543 <boolean> = <XRef>:okToPrintHighRes()
2544 <boolean> = <XRef>:okToChange()
2545 <boolean> = <XRef>:okToCopy()
2546 <boolean> = <XRef>:okToAddNotes()
2547 <boolean> = <XRef>:okToFillForm()
2548 <boolean> = <XRef>:okToAccessibility()
2549 <boolean> = <XRef>:okToAssemble()
2550 <Object> = <XRef>:getCatalog()
2551 <Object> = <XRef>:fetch(<integer> object number, <integer> object generation)
2552 <Object> = <XRef>:getDocInfo()
2553 <Object> = <XRef>:getDocInfoNF()
2554 <integer> = <XRef>:getNumObjects()
2555 <integer> = <XRef>:getRootNum()
2556 <integer> = <XRef>:getRootGen()
2557 <integer> = <XRef>:getSize()
2558 <Object> = <XRef>:getTrailerDict()
2559 \stopfunctioncall
2561 %***********************************************************************
2563 \section{The \luatex{font} library}
The font library provides the interface into the internals of the font
system, and it also contains helper functions to load traditional
\TEX\ font metrics formats. Other font loading functionality is
provided by the \luatex{fontloader} library that will be discussed in
the next section.
2571 \subsection{Loading a \TFM\ file}
2573 The behavior documented in this subsection is considered stable in the
2574 sense that there will not be backward-incompatible changes any more.
2576 \startfunctioncall
2577 <table> fnt = font.read_tfm(<string> name, <number> s)
2578 \stopfunctioncall
2580 The number is a bit special:
2582 \startitemize
2583 \item if it is positive, it specifies an \quote{at size} in scaled points.
2584 \item if it is negative, its absolute value represents a \quote{scaled}
2585 setting relative to the designsize of the font.
2586 \stopitemize
2588 The internal structure of the metrics font table that is returned is
2589 explained in \in{chapter}[fonts].
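The two conventions for \type{s} can be illustrated with a small sketch. The
helper \type{resolve_size} is hypothetical and not part of the font library,
and the assumption that a \quote{scaled} value of 1000 stands for the design
size itself is the usual \TEX\ convention rather than something stated in
this section.

\starttyping
-- Hypothetical helper, not part of the font library.
-- designsize: design size of the font in scaled points
-- s:          size argument as it would be passed to font.read_tfm
function resolve_size(designsize, s)
  if s > 0 then
    return s                      -- already an "at size" in scaled points
  end
  -- assumed convention: scaled 1000 means 1.0 times the design size
  return math.floor(designsize * (-s) / 1000 + 0.5)
end

-- usage inside LuaTeX (not executed here), with 1pt = 65536sp:
--   local f1 = font.read_tfm("cmr10", 10 * 65536)  -- at 10pt
--   local f2 = font.read_tfm("cmr10", -1200)       -- scaled 1200
\stoptyping

Under these assumptions, a font with a design size of 10pt that is loaded
with \type{s} equal to $-1200$ ends up at 12pt.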
2591 \subsection{Loading a \VF\ file}
2593 The behavior documented in this subsection is considered stable in the
2594 sense that there will not be backward-incompatible changes any more.
2596 \startfunctioncall
2597 <table> vf_fnt = font.read_vf(<string> name, <number> s)
2598 \stopfunctioncall
2600 The meaning of the number \type{s} and the format of the returned
2601 table are similar to the ones in the \luatex{read_tfm()} function.
2603 \subsection{The fonts array}
2605 The whole table of \TEX\ fonts is accessible from \LUA\ using a virtual array.
2607 \starttyping
2608 font.fonts[n] = { ... }
2609 <table> f = font.fonts[n]
2610 \stoptyping
2612 See \in{chapter}[fonts] for the structure of the tables. Because this
2613 is a virtual array, you cannot call \type{pairs} on it, but see below
2614 for the \type{font.each} iterator.
2616 The two metatable functions implementing the virtual array are:
2618 \startfunctioncall
2619 <table> f = font.getfont(<number> n)
2620 font.setfont(<number> n, <table> f)
2621 \stopfunctioncall
Note that at the moment, each access to \type{font.fonts} or call
to \type{font.getfont} creates a \LUA\ table for the whole font. This
process can be quite slow. In a later version of \LUATEX, this
interface will change (it will start using userdata objects instead of
actual tables).
Also note the following: assignments can only be made to fonts that
have already been defined in \TEX, but have not been accessed {\it at
all\/} since that definition. This limits the usability of the write
access to \type{font.fonts} quite a lot; a less stringent ruleset will
likely be implemented later.
2635 \subsection{Checking a font's status}
2637 You can test for the status of a font by calling this function:
2639 \startfunctioncall
2640 <boolean> f = font.frozen(<number> n)
2641 \stopfunctioncall
2643 The return value is one of \type{true} (unassignable), \type{false} (can be changed)
2644 or \type{nil} (not a valid font at all).
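Because assignments are only possible for fonts that are not yet frozen, it
can be convenient to test the status before writing. The following is a
minimal sketch; the wrapper name \type{try_setfont} and its error strings are
made up for this illustration.

\starttyping
-- Hypothetical wrapper around font.frozen and font.setfont.
-- Returns true on success, or false plus a reason.
function try_setfont(id, f)
  local frozen = font.frozen(id)
  if frozen == nil then
    return false, "not a valid font"
  elseif frozen then
    return false, "font can no longer be assigned to"
  end
  font.setfont(id, f)
  return true
end
\stoptyping

The three-way test mirrors the three possible return values of
\type{font.frozen}.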
2646 \subsection{Defining a font directly}
2648 You can define your own font into \luatex{font.fonts} by calling this function:
2650 \startfunctioncall
2651 <number> i = font.define(<table> f)
2652 \stopfunctioncall
2654 The return value is the internal id number of the defined font (the
2655 index into \luatex{font.fonts}). If the font creation fails, an error is
2656 raised. The table is a font structure, as explained in
2657 \in{chapter}[fonts].
2659 \subsection{Projected next font id}
2661 \startfunctioncall
2662 <number> i = font.nextid()
2663 \stopfunctioncall
2665 This returns the font id number that would be returned by a
2666 \type{font.define} call if it was executed at this spot in the code
2667 flow. This is useful for virtual fonts that need to reference
2668 themselves.
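A possible use is sketched below. The function name
\type{define_self_referencing} is hypothetical, and the \type{type} and
\type{fonts} fields are assumed to follow the virtual font layout described
in \in{chapter}[fonts]; they are not defined in this section.

\starttyping
-- Sketch: reserve the upcoming id, store it in the font table,
-- then define the font.
function define_self_referencing(f)
  local id = font.nextid()
  f.type  = "virtual"          -- assumed field, see the fonts chapter
  f.fonts = { { id = id } }    -- assumed field: local font 1 is the font itself
  local defined = font.define(f)
  return defined, id
end
\stoptyping

If no other font is defined between the two calls, both returned numbers
should be the same.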
2670 \subsection{Font id (0.47)}
2672 \startfunctioncall
2673 <number> i = font.id(<string> csname)
2674 \stopfunctioncall
2676 This returns the font id associated with \type{csname} string, or $-1$
2677 if \type{csname} is not defined; new in 0.47.
2679 \subsection{Currently active font}
2681 \startfunctioncall
2682 <number> i = font.current()
2683 font.current(<number> i)
2684 \stopfunctioncall
2686 This gets or sets the currently used font number.
2688 \subsection{Maximum font id}
2690 \startfunctioncall
2691 <number> i = font.max()
2692 \stopfunctioncall
2694 This is the largest used index in \type{font.fonts}.
2696 \subsection{Iterating over all fonts}
2698 \startfunctioncall
for i,v in font.each() do
  ...
end
2702 \stopfunctioncall
2704 This is an iterator over each of the defined \TEX\ fonts. The first
2705 returned value is the index in \type{font.fonts}, the second the font
2706 itself, as a \LUA\ table. The indices are listed incrementally, but they
2707 do not always form an array of consecutive numbers: in some cases
2708 there can be holes in the sequence.
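As a small illustration, the iterator can be used to collect the ids that
are actually in use. The function name \type{defined_font_ids} is chosen for
this sketch only.

\starttyping
-- Collect the ids of all defined fonts, in the order the iterator
-- delivers them. Holes in the id sequence are simply skipped.
function defined_font_ids()
  local ids = {}
  for id, f in font.each() do
    ids[#ids + 1] = id
  end
  return ids
end
\stoptyping

Keep in mind that the second loop variable is a complete font table, so a
loop like this can be slow when many large fonts are defined.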
2710 \section{The \luatex{fontloader} library (0.36)}
2712 \subsection{Getting quick information on a font}
2714 \startfunctioncall
2715 <table> info = fontloader.info(<string> filename)
2716 \stopfunctioncall
2718 This function returns either \type{nil}, or a \type{table}, or an
2719 array of small tables (in the case of a TrueType collection). The
2720 returned table(s) will contain some fairly interesting information
2721 items from the font(s) defined by the file:
2723 \starttabulate[|lT|l|p|]
2724 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
2725 \NC fontname \NC string \NC the \POSTSCRIPT\ name of the font\NC\NR
2726 \NC fullname \NC string \NC the formal name of the font\NC\NR
2727 \NC familyname \NC string \NC the family name this font belongs to\NC\NR
2728 \NC weight \NC string \NC a string indicating the color value of the font\NC\NR
2729 \NC version \NC string \NC the internal font version\NC\NR
2730 \NC italicangle \NC float \NC the slant angle\NC\NR
2731 \NC units_per_em \NC number \NC (since 0.78.2) 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC\NR
2732 \NC pfminfo \NC table \NC (since 0.78.2) (see \in{section}[fontloaderpfminfotable])\NC\NR
2733 \stoptabulate
Getting information through this function is (sometimes much) more
efficient than loading the font properly, and is therefore handy when
you want to create a dictionary of available fonts based on the
contents of a directory.
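Such a dictionary could be built along the following lines. This is only a
sketch: the function name \type{build_font_index} is hypothetical, the list
of file names has to be supplied by the caller, and a single font is told
apart from a collection by the presence of a \type{fontname} key, which is
an assumption.

\starttyping
-- Sketch: map PostScript font names to the files that provide them.
-- 'filenames' is an array of font file paths supplied by the caller.
function build_font_index(filenames)
  local index = {}
  for _, filename in ipairs(filenames) do
    local info = fontloader.info(filename)
    if info then
      -- assumption: a single font yields a table with a fontname key,
      -- a collection yields an array of such tables
      local list = info.fontname and { info } or info
      for _, entry in ipairs(list) do
        if entry.fontname then
          index[entry.fontname] = filename
        end
      end
    end
  end
  return index
end
\stoptyping

Files for which \type{fontloader.info} returns \type{nil} are silently
skipped.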
2740 \subsection{Loading an \OPENTYPE\ or \TRUETYPE\ file}
2742 If you want to use an \OPENTYPE\ font, you have to get the metric
2743 information from somewhere. Using the \type{fontloader} library, the
2744 simplest way to get that information is thus:
2746 \starttyping
function load_font (filename)
  local metrics = nil
  local font = fontloader.open(filename)
  if font then
     -- convert the userdata object to a plain table, then free it
     metrics = fontloader.to_table(font)
     fontloader.close(font)
  end
  return metrics
end

myfont = load_font('/opt/tex/texmf/fonts/data/arial.ttf')
2758 \stoptyping
2760 The main function call is
2762 \startfunctioncall
2763 <userdata> f, <table> w = fontloader.open(<string> filename)
2764 <userdata> f, <table> w = fontloader.open(<string> filename, <string> fontname)
2765 \stopfunctioncall
The first return value is a userdata representation of the font. The
second return value is a table containing any warnings and errors
reported by fontloader while opening the font. In normal typesetting,
you would probably ignore the second return value, but it can be useful
for debugging purposes.
2773 For \TRUETYPE\ collections (when filename ends in 'ttc') and \DFONT\
2774 collections, you have to use a second string argument to specify which
2775 font you want from the collection. Use the \type{fontname}
2776 strings that are returned by \type{fontloader.info} for that.
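A minimal sketch of that procedure follows. The function name
\type{open_member} is hypothetical, and it assumes that
\type{fontloader.info} returns an array of tables for a collection.

\starttyping
-- Sketch: open the n-th member of a collection, using the fontname
-- reported by fontloader.info to select it.
function open_member(filename, n)
  local info = fontloader.info(filename)
  local entry = info and info[n]
  if not entry then
    return nil, "no such font in " .. filename
  end
  local f, warnings = fontloader.open(filename, entry.fontname)
  return f, warnings
end
\stoptyping

For a file that holds a single font, the lookup by position finds nothing
and the sketch returns \type{nil} with a message.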
2778 To turn the font into a table, \type{fontloader.to_table} is used on
2779 the font returned by \type{fontloader.open}.
2781 \startfunctioncall
2782 <table> f = fontloader.to_table(<userdata> font)
2783 \stopfunctioncall
2785 This table cannot be used directly by \LUATEX\ and should be turned
2786 into another one as described in~\in{chapter}[fonts].
2787 Do not forget to store the \type{fontname} value in the \type{psname}
2788 field of the metrics table to be returned to \LUATEX, otherwise the
2789 font inclusion backend will not be able to find the correct font in
2790 the collection.
2792 See \in{section}[fontloadertables] for details on the userdata object
2793 returned by \type{fontloader.open()} and the layout of the
2794 \type{metrics} table returned by \type{fontloader.to_table()}.
2796 The font file is parsed and partially interpreted by the font
2797 loading routines from \FONTFORGE. The file format can be \OPENTYPE,
2798 \TRUETYPE, \TRUETYPE\ Collection, \CFF, or \TYPEONE.
2800 There are a few advantages to this approach compared to reading the
2801 actual font file ourselves:
2803 \startitemize
2805 \item The font is automatically re|-|encoded, so that the \type{metrics}
2806 table for \TRUETYPE\ and \OPENTYPE\ fonts is using \UNICODE\ for
2807 the character indices.
2809 \item Many features are pre|-|processed into a format that is easier to handle
2810 than just the bare tables would be.
2812 \item \POSTSCRIPT|-|based \OPENTYPE\ fonts do not store the character height and
2813 depth in the font file, so the character boundingbox has to be
2814 calculated in some way.
2816 \item In the future, it may be interesting to allow \LUA\ scripts access to
2817 the font program itself, perhaps even creating or changing the font.
2819 \stopitemize
2821 A loaded font is discarded with:
2823 \startfunctioncall
2824 fontloader.close(<userdata> font)
2825 \stopfunctioncall
2827 \subsection{Applying a \quote{feature file}}
2829 You can apply a \quote{feature file} to a loaded font:
2831 \startfunctioncall
2832 <table> errors = fontloader.apply_featurefile(<userdata> font, <string> filename)
2833 \stopfunctioncall
2835 A \quote{feature file} is a textual representation of the features in an
2836 \OPENTYPE\ font. See\crlf
2837 \hyphenatedurl {http://www.adobe.com/devnet/opentype/afdko/topic_feature_file_syntax.html}\crlf
2838 and\crlf
2839 \hyphenatedurl {http://fontforge.sourceforge.net/featurefile.html}\crlf
2840 for a more detailed description of feature files.
If the function fails, the return value is a table containing any
errors reported by fontloader while applying the feature file. On
success, \type{nil} is returned. (The return value is new in 0.65.)
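The return convention can be wrapped as in the following sketch. The name
\type{apply_features} is hypothetical, and the entries of the error table
are assumed to be strings.

\starttyping
-- Sketch: apply a feature file and turn the result into a
-- success flag plus a printable message.
function apply_features(f, featurefile)
  local errors = fontloader.apply_featurefile(f, featurefile)
  if errors then
    -- assumption: the table is an array of message strings
    return false, table.concat(errors, "\n")
  end
  return true
end
\stoptyping

The same pattern would apply to any call that returns \type{nil} on success
and an error table on failure.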
2848 \subsection{Applying an \quote{\AFM\ file}}
2850 You can apply an \quote{\AFM\ file} to a loaded font:
2852 \startfunctioncall
2853 <table> errors = fontloader.apply_afmfile(<userdata> font, <string> filename)
2854 \stopfunctioncall
An \AFM\ file is a textual representation of (some of) the meta information
in a \TYPEONE\ font. See \hyphenatedurl{ftp://ftp.math.utah.edu/u/ma/hohn/linux/postscript/5004.AFM_Spec.pdf}
for more information about \AFM\ files.
2860 Note: If you \type{fontloader.open()} a \TYPEONE\ file named \type{font.pfb},
2861 the library will automatically search for and apply \type{font.afm}
2862 if it exists in the same directory as the file \type{font.pfb}. In that case,
2863 there is no need for an explicit call to \type{apply_afmfile()}.
If the function fails, the return value is a table containing any
errors reported by fontloader while applying the \AFM\ file. On
success, \type{nil} is returned. (The return value is new in 0.65.)
2869 \subsection[fontloadertables]{Fontloader font tables}
2871 As mentioned earlier, the return value of \type{fontloader.open()} is
2872 a userdata object. In \LUATEX\ versions before 0.63, the only way to
2873 have access to the actual metrics was to call
2874 \type{fontloader.to_table()} on this object, returning the table
2875 structure that is explained in the following subsections.
2877 However, it turns out that the result from
2878 \type{fontloader.to_table()} sometimes needs very large amounts of memory
2879 (depending on the font's complexity and size) so starting with \LUATEX\ 0.63,
2880 it is possible to access the userdata object directly.
In \LUATEX\ 0.63.0, the following is implemented:
2884 \startitemize
2885 \item all top-level keys that would be returned by \type{to_table()}
2886 can also be accessed directly.
2887 \item the top-level key \quote{glyphs} returns a {\it virtual\/} array that
2888 allows indices from \type{0} to ($\type{f.glyphmax}-1$).
2889 \item the items in that virtual array (the actual glyphs) are themselves also
2890 userdata objects, and each has accessors for all of the keys
2891 explained in the section \quote{Glyph items} below.
2892 \item the top-level key \quote{subfonts} returns an {\it actual} array of
2893 userdata objects, one for each of the subfonts (or nil, if there are no subfonts).
2894 \stopitemize
2897 A short example may be helpful. This code generates a printout of all
2898 the glyph names in the font \type{PunkNova.kern.otf}:
2900 \starttyping
local f = fontloader.open('PunkNova.kern.otf')
print (f.fontname)
local i = 0
while (i < f.glyphmax) do
   local g = f.glyphs[i]
   if g then
      print(g.name)
   end
   i = i + 1
end
fontloader.close(f)
2912 \stoptyping
In this case, the \LUATEX\ memory requirement stays below 100MB on the
test computer, while the internal structure generated by
\type{to_table()} needs more than 2GB of memory (the font itself is
6.9MB in disk size).
In \LUATEX\ 0.63 only the top-level font, the subfont table entries,
and the glyphs are virtual objects; everything else still produces
normal \LUA\ values and tables. In future versions, more return values
may be replaced by userdata objects (as much as needed to keep the
memory requirements in check).
2925 If you want to know the valid fields in a font or glyph
2926 structure, call the \type{fields} function on an object of a
2927 particular type (either glyph or font for now, more will be
2928 implemented later):
2930 \startfunctioncall
2931 <table> fields = fontloader.fields(<userdata> font)
2932 <table> fields = fontloader.fields(<userdata> font_glyph)
2933 \stopfunctioncall
2935 For instance:
2937 \startfunctioncall
2938 local fields = fontloader.fields(f)
2939 local fields = fontloader.fields(f.glyphs[0])
2940 \stopfunctioncall
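Since many fields are only present when they are set, the field list can be
combined with direct access to find out what a given object actually
provides. This is a sketch under the assumption that
\type{fontloader.fields} returns an array of key names; the function name
\type{present_fields} is hypothetical.

\starttyping
-- Sketch: list the fields that are actually set on a font or
-- glyph object.
function present_fields(object)
  local present = {}
  for _, key in ipairs(fontloader.fields(object)) do
    if object[key] ~= nil then
      present[#present + 1] = key
    end
  end
  return present
end
\stoptyping

Called on a glyph, this would typically return a much shorter list than the
full set of valid glyph fields.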
2943 \subsubsection{Table types}
2945 \subsubsubsection{Top-level}
2947 The top|-|level keys in the returned table are (the explanations in
2948 this part of the documentation are not yet finished):
2950 \starttabulate[|lT|l|p|]
2951 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
2952 \NC table_version \NC number \NC indicates the metrics version (currently~0.3)\NC\NR
2953 \NC fontname \NC string \NC \POSTSCRIPT\ font name\NC\NR
2954 \NC fullname \NC string \NC official (human-oriented) font name\NC\NR
2955 \NC familyname \NC string \NC family name\NC\NR
2956 \NC weight \NC string \NC weight indicator\NC\NR
2957 \NC copyright \NC string \NC copyright information\NC\NR
2958 \NC filename \NC string \NC the file name\NC\NR
2959 \NC version \NC string \NC font version\NC\NR
2960 \NC italicangle \NC float \NC slant angle\NC\NR
2961 \NC units_per_em \NC number \NC 1000 for \POSTSCRIPT-based fonts, usually 2048 for \TRUETYPE\NC\NR
2962 \NC ascent \NC number \NC height of ascender in \type{units_per_em}\NC\NR
2963 \NC descent \NC number \NC depth of descender in \type{units_per_em}\NC\NR
2964 \NC upos \NC float \NC \NC\NR
2965 \NC uwidth \NC float \NC \NC\NR
2966 \NC uniqueid \NC number \NC \NC\NR
2967 \NC glyphcnt \NC number \NC number of included glyphs\NC\NR
2968 \NC glyphs \NC array \NC \NC\NR
\NC glyphmax \NC number \NC maximum used index in the glyphs array\NC\NR
2970 \NC hasvmetrics \NC number \NC \NC\NR
2971 \NC onlybitmaps \NC number \NC \NC\NR
2972 \NC serifcheck \NC number \NC \NC\NR
2973 \NC isserif \NC number \NC \NC\NR
2974 \NC issans \NC number \NC \NC\NR
2975 \NC encodingchanged \NC number \NC \NC\NR
2976 \NC strokedfont \NC number \NC \NC\NR
2977 \NC use_typo_metrics \NC number \NC \NC\NR
2978 \NC weight_width_slope_only \NC number \NC \NC\NR
2979 \NC head_optimized_for_cleartype \NC number \NC \NC\NR
2980 \NC uni_interp \NC enum \NC \type {unset}, \type {none}, \type {adobe},
2981 \type {greek}, \type {japanese}, \type {trad_chinese},
2982 \type {simp_chinese}, \type {korean}, \type {ams}\NC\NR
2983 \NC origname \NC string \NC the file name, as supplied by the user\NC\NR
2984 \NC map \NC table \NC \NC\NR
2985 \NC private \NC table \NC \NC\NR
2986 \NC xuid \NC string \NC \NC\NR
2987 \NC pfminfo \NC table \NC \NC\NR
2988 \NC names \NC table \NC \NC\NR
2989 \NC cidinfo \NC table \NC \NC\NR
2990 \NC subfonts \NC array \NC \NC\NR
\NC comments \NC string \NC \NC\NR
2992 \NC fontlog \NC string \NC \NC\NR
2993 \NC cvt_names \NC string \NC \NC\NR
2994 \NC anchor_classes \NC table \NC \NC\NR
2995 \NC ttf_tables \NC table \NC \NC\NR
2996 \NC ttf_tab_saved \NC table \NC \NC\NR
2997 \NC kerns \NC table \NC \NC\NR
2998 \NC vkerns \NC table \NC \NC\NR
2999 \NC texdata \NC table \NC \NC\NR
3000 \NC lookups \NC table \NC \NC\NR
3001 \NC gpos \NC table \NC \NC\NR
3002 \NC gsub \NC table \NC \NC\NR
3003 \NC mm \NC table \NC \NC\NR
3004 \NC chosenname \NC string \NC \NC\NR
3005 \NC macstyle \NC number \NC \NC\NR
3006 \NC fondname \NC string \NC \NC\NR
3007 %\NC design_size \NC number \NC \NC\NR
3008 \NC fontstyle_id \NC number \NC \NC\NR
3009 \NC fontstyle_name \NC table \NC \NC\NR
3010 %\NC design_range_bottom \NC number \NC \NC\NR
3011 %\NC design_range_top \NC number \NC \NC\NR
3012 \NC strokewidth \NC float \NC \NC\NR
3013 \NC mark_classes \NC table \NC \NC\NR
3014 \NC creationtime \NC number \NC \NC\NR
3015 \NC modificationtime \NC number \NC \NC\NR
3016 \NC os2_version \NC number \NC \NC\NR
3017 \NC sfd_version \NC number \NC \NC\NR
3018 \NC math \NC table \NC \NC\NR
3019 \NC validation_state \NC table \NC \NC\NR
3020 \NC horiz_base \NC table \NC \NC\NR
3021 \NC vert_base \NC table \NC \NC\NR
3022 \NC extrema_bound \NC number \NC \NC\NR
3023 \stoptabulate
3025 \subsubsubsection{Glyph items}
The \type{glyphs} entry is an array containing the per|-|character
information (quite a few of these are only present if nonzero).
3030 \starttabulate[|lT|l|p|]
3031 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3032 \NC name \NC string \NC the glyph name\NC\NR
3033 \NC unicode \NC number \NC unicode code point, or -1\NC\NR
3034 \NC boundingbox \NC array \NC array of four numbers, see note below\NC\NR
3035 \NC width \NC number \NC only for horizontal fonts\NC\NR
3036 \NC vwidth \NC number \NC only for vertical fonts\NC\NR
3037 \NC tsidebearing \NC number \NC only for vertical ttf/otf fonts, and only if nonzero (0.79.0)\NC\NR
3038 \NC lsidebearing \NC number \NC only if nonzero and not equal to boundingbox[1]\NC\NR
3039 \NC class \NC string \NC one of "none", "base", "ligature", "mark", "component"
3040 (if not present, the glyph class is \quote{automatic})\NC\NR
3041 \NC kerns \NC array \NC only for horizontal fonts, if set\NC\NR
3042 \NC vkerns \NC array \NC only for vertical fonts, if set\NC\NR
3043 \NC dependents \NC array \NC linear array of glyph name strings, only if nonempty\NC\NR
3044 \NC lookups \NC table \NC only if nonempty\NC\NR
3045 \NC ligatures \NC table \NC only if nonempty\NC\NR
3046 \NC anchors \NC table \NC only if set\NC\NR
3047 \NC comment \NC string \NC only if set\NC\NR
3048 \NC tex_height \NC number \NC only if set\NC\NR
3049 \NC tex_depth \NC number \NC only if set\NC\NR
3050 \NC italic_correction \NC number \NC only if set\NC\NR
3051 \NC top_accent \NC number \NC only if set\NC\NR
3052 \NC is_extended_shape \NC number \NC only if this character is part of a math extension list\NC\NR
3053 \NC altuni \NC table \NC alternate \UNICODE\ items \NC\NR
3054 \NC vert_variants \NC table \NC \NC \NR
3055 \NC horiz_variants \NC table \NC \NC \NR
3056 \NC mathkern \NC table \NC \NC \NR
3057 \stoptabulate
3059 On \type{boundingbox}: The boundingbox information for \TRUETYPE\ fonts and \TRUETYPE-based \OTF\ fonts is read
3060 directly from the font file. \POSTSCRIPT-based fonts do not have this information, so the boundingbox of
3061 traditional \POSTSCRIPT\ fonts is generated by interpreting the actual bezier curves to find the exact
3062 boundingbox. This can be a slow process, so starting from \LUATEX\ 0.45, the boundingboxes of \POSTSCRIPT-based
3063 \OTF\ fonts (and raw \CFF\ fonts) are calculated using an approximation of the glyph shape based on the actual
3064 glyph points only, instead of taking the whole curve into account. This means that glyphs that have missing
3065 points at extrema will have a too-tight boundingbox, but the processing is so much faster that in our opinion
3066 the tradeoff is worth it.
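The boundingbox is what a macro package would typically use to derive the
height and depth of a character. The sketch below assumes that the four
numbers are ordered as lower left x, lower left y, upper right x, upper
right y, expressed in font units; this section does not spell that order
out. The function name \type{glyph_height_depth} is hypothetical.

\starttyping
-- Sketch: derive height and depth (in scaled points) from a glyph's
-- boundingbox, assumed to be { llx, lly, urx, ury } in font units.
-- size is the requested font size in scaled points.
function glyph_height_depth(glyph, units_per_em, size)
  local bb = glyph.boundingbox
  if not bb then
    return 0, 0
  end
  local height = math.max(bb[4], 0) * size / units_per_em
  local depth  = math.max(-bb[2], 0) * size / units_per_em
  return height, depth
end
\stoptyping

Negative results are clipped to zero, so a glyph that sits entirely above
the baseline gets a depth of zero.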
3069 The \type{kerns} and \type{vkerns} are linear arrays of small hashes:
3071 \starttabulate[|lT|l|p|]
3072 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3073 \NC char \NC string \NC \NC\NR
3074 \NC off \NC number \NC \NC\NR
3075 \NC lookup \NC string \NC \NC\NR
3076 \stoptabulate
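As an illustration, the array can be folded into a simple lookup table. The
table above leaves the three keys unexplained, so the reading used here
(\type{char} as the name of the following glyph, \type{off} as the kern
amount in font units) is an assumption, as is the function name
\type{kern_map}.

\starttyping
-- Sketch: turn a glyph's kerns array into a map from glyph name
-- to kern amount. Works the same way for vkerns.
function kern_map(glyph)
  local map = {}
  for _, kern in ipairs(glyph.kerns or {}) do
    map[kern.char] = kern.off
  end
  return map
end
\stoptyping

If the same glyph name occurs more than once (for instance under different
lookups), the last entry wins in this sketch.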
The \type{lookups} entry is a hash keyed by lookup subtable name. The
value of each key is a linear array of small hashes:
3081 % TODO: fix this description
3082 \starttabulate[|lT|l|p|]
3083 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3084 \NC type \NC enum \NC \type {position}, \type {pair}, \type {substitution}, \type {alternate},
3085 \type {multiple}, \type {ligature}, \type {lcaret}, \type {kerning}, \type {vkerning}, \type {anchors},
3086 \type {contextpos}, \type {contextsub}, \type {chainpos}, \type {chainsub},
3087 \type {reversesub}, \type {max}, \type {kernback}, \type {vkernback} \NC\NR
3088 \NC specification \NC table \NC extra data \NC\NR
3089 \stoptabulate
3091 For the first seven values of \type{type}, there can be additional
3092 sub|-|information, stored in the sub-table \type{specification}:
3094 \starttabulate[|lT|l|p|]
3095 \NC \ssbf value \NC \bf type \NC \bf explanation \NC\NR
3096 \NC position \NC table \NC a table of the \type {offset_specs} type\NC\NR
3097 \NC pair \NC table \NC one string: \type {paired}, and an array of one or
3098 two \type {offset_specs} tables: \type{offsets}\NC\NR
3099 \NC substitution \NC table \NC one string: \type {variant}\NC\NR
3100 \NC alternate \NC table \NC one string: \type {components}\NC\NR
3101 \NC multiple \NC table \NC one string: \type {components}\NC\NR
3102 \NC ligature \NC table \NC two strings: \type {components}, \type {char}\NC\NR
3103 \NC lcaret \NC array \NC linear array of numbers\NC\NR
3104 \stoptabulate
3106 Tables for \type{offset_specs} contain up to four number|-|valued
3107 fields: \type{x} (a horizontal offset), \type{y} (a vertical offset),
3108 \type{h} (an advance width correction) and \type{v} (an advance height
3109 correction).
3111 The \type{ligatures} is a linear array of small hashes:
3113 \starttabulate[|lT|l|p|]
3114 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3115 \NC lig \NC table \NC uses the same substructure as a single item in the \type{lookups} table explained above\NC\NR
3116 \NC char \NC string \NC \NC\NR
3117 \NC components \NC array \NC linear array of named components\NC\NR
3118 \NC ccnt \NC number \NC \NC\NR
3119 \stoptabulate
3121 The \type{anchor} table is indexed by a string signifying the
3122 anchor type, which is one of
3124 \starttabulate[|lT|l|p|]
3125 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3126 \NC mark \NC table \NC placement mark\NC\NR
3127 \NC basechar \NC table \NC mark for attaching combining items to a base char\NC\NR
3128 \NC baselig \NC table \NC mark for attaching combining items to a ligature\NC\NR
3129 \NC basemark \NC table \NC generic mark for attaching combining items to connect to\NC\NR
3130 \NC centry \NC table \NC cursive entry point\NC\NR
3131 \NC cexit \NC table \NC cursive exit point\NC\NR
3132 \stoptabulate
3134 The content of these is a short array of defined anchors, with the
3135 entry keys being the anchor names. For all except \type{baselig}, the
3136 value is a single table with this definition:
3138 \starttabulate[|lT|l|p|]
3139 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3140 \NC x \NC number \NC x location\NC\NR
3141 \NC y \NC number \NC y location\NC\NR
3142 \NC ttf_pt_index \NC number \NC truetype point index, only if given\NC\NR
3143 \stoptabulate
For \type{baselig}, the value is a small array of such anchor sets,
one for each constituent item of the ligature.

For clarification, an anchor table could for example look like this:
3150 \starttyping
['anchor'] = {
    ['basemark'] = {
        ['Anchor-7'] = { ['x']=170, ['y']=1080 }
    },
    ['mark'] ={
        ['Anchor-1'] = { ['x']=160, ['y']=810 },
        ['Anchor-4'] = { ['x']=160, ['y']=800 }
    },
    ['baselig'] = {
        [1] = { ['Anchor-2'] = { ['x']=160, ['y']=650 } },
        [2] = { ['Anchor-2'] = { ['x']=460, ['y']=640 } }
    }
}
3164 \stoptyping
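Code that walks such a table has to treat \type{baselig} separately because
of its extra array level. The following sketch flattens an anchor table into
a list of records; the function name \type{list_anchors} and the record
layout are made up for this illustration.

\starttyping
-- Sketch: flatten an anchor table into a list of records.
-- For 'baselig' the position of the ligature component is stored
-- in the 'component' field; for other kinds that field is nil.
function list_anchors(anchors)
  local result = {}
  local function add(kind, name, a, component)
    result[#result + 1] = {
      kind = kind, name = name, x = a.x, y = a.y,
      component = component,
    }
  end
  for kind, set in pairs(anchors) do
    if kind == "baselig" then
      for component, names in ipairs(set) do
        for name, a in pairs(names) do
          add(kind, name, a, component)
        end
      end
    else
      for name, a in pairs(set) do
        add(kind, name, a)
      end
    end
  end
  return result
end
\stoptyping

The order of the records is not defined, because the traversal relies on
\type{pairs}.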
3166 \subsubsubsection{map table}
3168 The top|-|level map is a list of encoding mappings. Each of those is a table itself.
3170 \starttabulate[|lT|l|p|]
3171 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3172 \NC enccount \NC number \NC \NC\NR
3173 \NC encmax \NC number \NC \NC\NR
3174 \NC backmax \NC number \NC \NC\NR
3175 \NC remap \NC table \NC \NC\NR
3176 \NC map \NC array \NC non|-|linear array of mappings\NC\NR
3177 \NC backmap \NC array \NC non|-|linear array of backward mappings\NC\NR
3178 \NC enc \NC table \NC \NC\NR
3179 \stoptabulate
3181 The \type{remap} table is very small:
3183 \starttabulate[|lT|l|p|]
3184 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3185 \NC firstenc \NC number \NC \NC\NR
3186 \NC lastenc \NC number \NC \NC\NR
3187 \NC infont \NC number \NC \NC\NR
3188 \stoptabulate
3190 The \type{enc} table is a bit more verbose:
3192 \starttabulate[|lT|l|p|]
3193 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3194 \NC enc_name \NC string \NC \NC\NR
3195 \NC char_cnt \NC number \NC \NC\NR
3196 \NC char_max \NC number \NC \NC\NR
3197 \NC unicode \NC array \NC of \UNICODE\ position numbers\NC\NR
3198 \NC psnames \NC array \NC of \POSTSCRIPT\ glyph names\NC\NR
3199 \NC builtin \NC number \NC \NC\NR
3200 \NC hidden \NC number \NC \NC\NR
3201 \NC only_1byte \NC number \NC \NC\NR
3202 \NC has_1byte \NC number \NC \NC\NR
3203 \NC has_2byte \NC number \NC \NC\NR
3204 \NC is_unicodebmp \NC number \NC only if nonzero\NC\NR
3205 \NC is_unicodefull \NC number \NC only if nonzero\NC\NR
3206 \NC is_custom \NC number \NC only if nonzero\NC\NR
3207 \NC is_original \NC number \NC only if nonzero\NC\NR
3208 \NC is_compact \NC number \NC only if nonzero\NC\NR
3209 \NC is_japanese \NC number \NC only if nonzero\NC\NR
3210 \NC is_korean \NC number \NC only if nonzero\NC\NR
3211 \NC is_tradchinese \NC number \NC only if nonzero [name?]\NC\NR
3212 \NC is_simplechinese \NC number \NC only if nonzero\NC\NR
3213 \NC low_page \NC number \NC \NC\NR
3214 \NC high_page \NC number \NC \NC\NR
3215 \NC iconv_name \NC string \NC \NC\NR
3216 \NC iso_2022_escape \NC string \NC \NC\NR
3217 \stoptabulate
3219 \subsubsubsection{private table}
3221 This is the font's private \POSTSCRIPT\ dictionary, if any. Keys and
3222 values are both strings.
3224 \subsubsubsection{cidinfo table}
3226 \starttabulate[|lT|l|p|]
3227 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3228 \NC registry \NC string \NC \NC\NR
3229 \NC ordering \NC string \NC \NC\NR
3230 \NC supplement \NC number \NC \NC\NR
3231 \NC version \NC number \NC \NC\NR
3232 \stoptabulate
3234 \subsubsubsection[fontloaderpfminfotable]{pfminfo table}
3236 The \type{pfminfo} table contains most of the OS/2 information:
3238 \starttabulate[|lT|l|p|]
3239 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3240 \NC pfmset \NC number \NC \NC\NR
3241 \NC winascent_add \NC number \NC \NC\NR
3242 \NC windescent_add \NC number \NC \NC\NR
3243 \NC hheadascent_add \NC number \NC \NC\NR
3244 \NC hheaddescent_add \NC number \NC \NC\NR
3245 \NC typoascent_add \NC number \NC \NC\NR
3246 \NC typodescent_add \NC number \NC \NC\NR
3247 \NC subsuper_set \NC number \NC \NC\NR
3248 \NC panose_set \NC number \NC \NC\NR
3249 \NC hheadset \NC number \NC \NC\NR
3250 \NC vheadset \NC number \NC \NC\NR
3251 \NC pfmfamily \NC number \NC \NC\NR
3252 \NC weight \NC number \NC \NC\NR
3253 \NC width \NC number \NC \NC\NR
3254 \NC avgwidth \NC number \NC \NC\NR
3255 \NC firstchar \NC number \NC \NC\NR
3256 \NC lastchar \NC number \NC \NC\NR
3257 \NC fstype \NC number \NC \NC\NR
3258 \NC linegap \NC number \NC \NC\NR
3259 \NC vlinegap \NC number \NC \NC\NR
3260 \NC hhead_ascent \NC number \NC \NC\NR
3261 \NC hhead_descent \NC number \NC \NC\NR
3263 \NC os2_typoascent \NC number \NC \NC\NR
3264 \NC os2_typodescent \NC number \NC \NC\NR
3265 \NC os2_typolinegap \NC number \NC \NC\NR
3266 \NC os2_winascent \NC number \NC \NC\NR
3267 \NC os2_windescent \NC number \NC \NC\NR
3268 \NC os2_subxsize \NC number \NC \NC\NR
3269 \NC os2_subysize \NC number \NC \NC\NR
3270 \NC os2_subxoff \NC number \NC \NC\NR
3271 \NC os2_subyoff \NC number \NC \NC\NR
3272 \NC os2_supxsize \NC number \NC \NC\NR
3273 \NC os2_supysize \NC number \NC \NC\NR
3274 \NC os2_supxoff \NC number \NC \NC\NR
3275 \NC os2_supyoff \NC number \NC \NC\NR
3276 \NC os2_strikeysize \NC number \NC \NC\NR
3277 \NC os2_strikeypos \NC number \NC \NC\NR
3278 \NC os2_family_class \NC number \NC \NC\NR
3279 \NC os2_xheight \NC number \NC \NC\NR
3280 \NC os2_capheight \NC number \NC \NC\NR
3281 \NC os2_defaultchar \NC number \NC \NC\NR
3282 \NC os2_breakchar \NC number \NC \NC\NR
3283 \NC os2_vendor \NC string \NC \NC\NR
3284 \NC codepages \NC table \NC A two-number array of encoded code pages\NC\NR
3285 \NC unicoderanges \NC table \NC A four-number array of encoded \UNICODE\ ranges\NC\NR
3286 \NC panose \NC table \NC \NC\NR
3287 \stoptabulate
3289 The \type{panose} subtable has exactly 10 string keys:
3291 \starttabulate[|lT|l|p|]
3292 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3293 \NC familytype \NC string \NC Values as in the \OPENTYPE\ font specification:
3294 \type {Any}, \type {No Fit}, \type {Text and Display}, \type {Script},
3295 \type {Decorative}, \type {Pictorial} \NC\NR
3296 \NC serifstyle \NC string \NC See the \OPENTYPE\ font specification for values\NC\NR
3297 \NC weight \NC string \NC id. \NC\NR
3298 \NC proportion \NC string \NC id. \NC\NR
3299 \NC contrast \NC string \NC id. \NC\NR
3300 \NC strokevariation \NC string \NC id. \NC\NR
3301 \NC armstyle \NC string \NC id. \NC\NR
3302 \NC letterform \NC string \NC id. \NC\NR
3303 \NC midline \NC string \NC id. \NC\NR
3304 \NC xheight \NC string \NC id. \NC\NR
3305 \stoptabulate
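
As a minimal sketch of how these fields might be read, the function below takes a
font table as produced by \type{fontloader.to_table} and collects a few
\type{pfminfo} values. The helper name \type{summarize_pfminfo} and the choice of
fields are assumptions made for illustration; they are not part of the library.
Fields that are absent simply come back as \type{nil}.

\starttyping
-- Hypothetical helper: summarize a few pfminfo fields of a font table.
local function summarize_pfminfo(fonttable)
    local pfm = fonttable.pfminfo or {}
    local panose = pfm.panose or {}
    return {
        weight     = pfm.weight,
        width      = pfm.width,
        vendor     = pfm.os2_vendor,
        familytype = panose.familytype,
    }
end

-- Possible use inside LuaTeX (the file name is a placeholder):
--   local f = fontloader.open("somefont.otf")
--   local s = summarize_pfminfo(fontloader.to_table(f))
--   fontloader.close(f)
\stoptyping
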
3307 \subsubsubsection[fontloadernamestable]{names table}
3309 Each item has two top|-|level keys:
3311 \starttabulate[|lT|l|p|]
3312 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3313 \NC lang \NC string \NC language for this entry \NC\NR
3314 \NC names \NC table \NC \NC\NR
3315 \stoptabulate
3317 The \type{names} keys are the actual \TRUETYPE\ name strings. The
3318 possible keys are:
3320 \starttabulate[|lT|p|]
3321 \NC \ssbf key \NC \bf explanation \NC\NR
3322 \NC copyright \NC \NC\NR
3323 \NC family \NC \NC\NR
3324 \NC subfamily \NC \NC\NR
3325 \NC uniqueid \NC \NC\NR
3326 \NC fullname \NC \NC\NR
3327 \NC version \NC \NC\NR
3328 \NC postscriptname \NC \NC\NR
3329 \NC trademark \NC \NC\NR
3330 \NC manufacturer \NC \NC\NR
3331 \NC designer \NC \NC\NR
3332 \NC descriptor \NC \NC\NR
3333 \NC venderurl \NC \NC\NR
3334 \NC designerurl \NC \NC\NR
3335 \NC license \NC \NC\NR
3336 \NC licenseurl \NC \NC\NR
3337 \NC idontknow \NC \NC\NR
3338 \NC preffamilyname \NC \NC\NR
3339 \NC prefmodifiers \NC \NC\NR
3340 \NC compatfull \NC \NC\NR
3341 \NC sampletext \NC \NC\NR
3342 \NC cidfindfontname \NC \NC\NR
3343 \NC wwsfamily \NC \NC\NR
3344 \NC wwssubfamily \NC \NC\NR
3345 \stoptabulate
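
The two levels can be walked as in the following sketch, which assumes that the
top|-|level \type{names} entry of the font table is an array of such items. The
function name \type{find_name} and the language string in the closing comment
are illustrative assumptions.

\starttyping
-- Hypothetical helper: return the name string stored under 'key'
-- (for example "family") for the first item whose lang matches.
local function find_name(fonttable, key, lang)
    for _, item in ipairs(fonttable.names or {}) do
        if item.lang == lang and item.names then
            return item.names[key]
        end
    end
    return nil
end

-- e.g. find_name(t, "family", "English (US)")
-- (the exact language strings depend on the font and the loader)
\stoptyping
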
3347 \subsubsubsection{anchor_classes table}
3349 Each anchor class has the following keys:
3351 \starttabulate[|lT|l|p|]
3352 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3353 \NC name \NC string \NC a descriptive id of this anchor class\NC\NR
3354 \NC lookup \NC string \NC \NC\NR
3355 \NC type \NC string \NC one of \type {mark}, \type {mkmk}, \type {curs}, \type {mklg} \NC\NR
3356 \stoptabulate
3358 % type is actually a lookup subtype, not a feature name. Officially, these strings
3359 % should be gpos_mark2mark etc.
3361 \subsubsubsection{gpos table}
3363 The \type{gpos} table has one array entry for each lookup. (The \type {gpos_} prefix is somewhat redundant.)
3365 \starttabulate[|lT|l|p|]
3366 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3367 \NC type \NC string \NC one of
3368 \type {gpos_single}, \type {gpos_pair}, \type {gpos_cursive},
3369 \type {gpos_mark2base},\crlf \type {gpos_mark2ligature}, \type {gpos_mark2mark}, \type {gpos_context},\crlf
3370 \type {gpos_contextchain}
3371 \NC\NR
3372 \NC flags \NC table \NC \NC\NR
3373 \NC name \NC string \NC \NC\NR
3374 \NC features \NC array \NC \NC\NR
3375 \NC subtables \NC array \NC \NC\NR
3376 \stoptabulate
3378 The flags table has a true value for each of the lookup flags that is
3379 actually set:
3381 \starttabulate[|lT|l|p|]
3382 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3383 \NC r2l \NC boolean \NC \NC\NR
3384 \NC ignorebaseglyphs \NC boolean \NC \NC\NR
3385 \NC ignoreligatures \NC boolean \NC \NC\NR
3386 \NC ignorecombiningmarks \NC boolean \NC \NC\NR
3387 \NC mark_class \NC string \NC (new in 0.44)\NC\NR
3388 \stoptabulate
3391 The items of the \type{features} subtable of \type{gpos} have:
3393 \starttabulate[|lT|l|p|]
3394 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3395 \NC tag \NC string \NC \NC\NR
3396 \NC scripts \NC table \NC \NC\NR
3397 \stoptabulate
3399 The scripts table within features has:
3401 \starttabulate[|lT|l|p|]
3402 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3403 \NC script \NC string \NC \NC\NR
3404 \NC langs \NC array of strings \NC \NC\NR
3405 \stoptabulate
3408 The subtables table has:
3410 \starttabulate[|lT|l|p|]
3411 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3412 \NC name \NC string \NC \NC\NR
3413 \NC suffix \NC string \NC (only if used)\NC\NR % used by gpos_single to get a default
3414 \NC anchor_classes \NC number \NC (only if used)\NC\NR
3415 \NC vertical_kerning \NC number \NC (only if used)\NC\NR
3416 \NC kernclass \NC table \NC (only if used)\NC\NR
3417 \stoptabulate
3420 The \type{kernclass} table within a subtable has:
3422 \starttabulate[|lT|l|p|]
3423 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3424 \NC firsts \NC array of strings \NC \NC\NR
3425 \NC seconds \NC array of strings \NC \NC\NR
3426 \NC lookup \NC string or array \NC associated lookup(s) \NC \NR
3427 \NC offsets \NC array of numbers \NC \NC\NR
3428 \stoptabulate
3430 \subsubsubsection{gsub table}
3432 This has identical layout to the \type{gpos} table, except for the
3433 type:
3435 \starttabulate[|lT|l|p|]
3436 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3437 \NC type \NC string \NC one of \type {gsub_single}, \type {gsub_multiple}, \type {gsub_alternate},
3438 \type {gsub_ligature},\crlf \type {gsub_context}, \type {gsub_contextchain}, \type {gsub_reversecontextchain}
3439 \NC\NR
3440 \stoptabulate
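
Because \type{gpos} and \type{gsub} share one layout, the same code can walk
either of them. The sketch below collects the names of all lookups of one type;
the function name \type{lookups_of_type} is an assumption made for illustration.

\starttyping
-- Hypothetical helper: names of all lookups of a given type in a
-- gpos or gsub array, e.g. lookups_of_type(t.gpos, "gpos_pair").
local function lookups_of_type(lookups, wanted)
    local found = {}
    for _, lookup in ipairs(lookups or {}) do
        if lookup.type == wanted then
            found[#found + 1] = lookup.name
        end
    end
    return found
end
\stoptyping
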
3444 \subsubsubsection{ttf_tables and ttf_tab_saved tables}
3446 \starttabulate[|lT|l|p|]
3447 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3448 \NC tag \NC string \NC \NC\NR
3449 \NC len \NC number \NC \NC\NR
3450 \NC maxlen \NC number \NC \NC\NR
3451 \NC data \NC number \NC \NC\NR
3452 \stoptabulate
3454 \subsubsubsection{mm table}
3456 \starttabulate[|lT|l|p|]
3457 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3458 \NC axes \NC table \NC array of axis names \NC \NR
3459 \NC instance_count \NC number \NC \NC \NR
3460 \NC positions \NC table \NC array of instance positions
3461 (\#axes * instances )\NC \NR
3462 \NC defweights \NC table \NC array of default weights for instances \NC \NR
3463 \NC cdv \NC string \NC \NC \NR
3464 \NC ndv \NC string \NC \NC \NR
3465 \NC axismaps \NC table \NC \NC \NR
3466 \stoptabulate
3468 The \type{axismaps}:
3470 \starttabulate[|lT|l|p|]
3471 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3472 \NC blends \NC table \NC an array of blend points \NC \NR
3473 \NC designs \NC table \NC an array of design values \NC \NR
3474 \NC min \NC number \NC \NC \NR
3475 \NC def \NC number \NC \NC \NR
3476 \NC max \NC number \NC \NC \NR
3477 \stoptabulate
3480 \subsubsubsection{mark_classes table (0.44)}
3482 The keys in this table are mark class names, and each value
3483 is a space-separated string of the glyph names in that class.
3485 Note: This table is indeed new in 0.44. The manual said it existed
3486 before then, but in practice it was missing due to a bug.
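
Since the glyph names of a class arrive as one string, they usually have to be
split before use. A minimal sketch, in which the function name
\type{glyphs_in_class} is an assumption:

\starttyping
-- Hypothetical helper: split the space-separated glyph names of one
-- mark class into an array.
local function glyphs_in_class(mark_classes, classname)
    local glyphs = {}
    local names = mark_classes[classname] or ""
    for name in string.gmatch(names, "%S+") do
        glyphs[#glyphs + 1] = name
    end
    return glyphs
end
\stoptyping
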
3488 \subsubsubsection{math table}
3490 \starttabulate[|lT|p|]
3491 \NC ScriptPercentScaleDown \NC \NC \NR
3492 \NC ScriptScriptPercentScaleDown \NC \NC \NR
3493 \NC DelimitedSubFormulaMinHeight \NC \NC \NR
3494 \NC DisplayOperatorMinHeight \NC \NC \NR
3495 \NC MathLeading \NC \NC \NR
3496 \NC AxisHeight \NC \NC \NR
3497 \NC AccentBaseHeight \NC \NC \NR
3498 \NC FlattenedAccentBaseHeight \NC \NC \NR
3499 \NC SubscriptShiftDown \NC \NC \NR
3500 \NC SubscriptTopMax \NC \NC \NR
3501 \NC SubscriptBaselineDropMin \NC \NC \NR
3502 \NC SuperscriptShiftUp \NC \NC \NR
3503 \NC SuperscriptShiftUpCramped \NC \NC \NR
3504 \NC SuperscriptBottomMin \NC \NC \NR
3505 \NC SuperscriptBaselineDropMax \NC \NC \NR
3506 \NC SubSuperscriptGapMin \NC \NC \NR
3507 \NC SuperscriptBottomMaxWithSubscript \NC \NC \NR
3508 \NC SpaceAfterScript \NC \NC \NR
3509 \NC UpperLimitGapMin \NC \NC \NR
3510 \NC UpperLimitBaselineRiseMin \NC \NC \NR
3511 \NC LowerLimitGapMin \NC \NC \NR
3512 \NC LowerLimitBaselineDropMin \NC \NC \NR
3513 \NC StackTopShiftUp \NC \NC \NR
3514 \NC StackTopDisplayStyleShiftUp \NC \NC \NR
3515 \NC StackBottomShiftDown \NC \NC \NR
3516 \NC StackBottomDisplayStyleShiftDown \NC \NC \NR
3517 \NC StackGapMin \NC \NC \NR
3518 \NC StackDisplayStyleGapMin \NC \NC \NR
3519 \NC StretchStackTopShiftUp \NC \NC \NR
3520 \NC StretchStackBottomShiftDown \NC \NC \NR
3521 \NC StretchStackGapAboveMin \NC \NC \NR
3522 \NC StretchStackGapBelowMin \NC \NC \NR
3523 \NC FractionNumeratorShiftUp \NC \NC \NR
3524 \NC FractionNumeratorDisplayStyleShiftUp \NC \NC \NR
3525 \NC FractionDenominatorShiftDown \NC \NC \NR
3526 \NC FractionDenominatorDisplayStyleShiftDown \NC \NC \NR
3527 \NC FractionNumeratorGapMin \NC \NC \NR
3528 \NC FractionNumeratorDisplayStyleGapMin \NC \NC \NR
3529 \NC FractionRuleThickness \NC \NC \NR
3530 \NC FractionDenominatorGapMin \NC \NC \NR
3531 \NC FractionDenominatorDisplayStyleGapMin \NC \NC \NR
3532 \NC SkewedFractionHorizontalGap \NC \NC \NR
3533 \NC SkewedFractionVerticalGap \NC \NC \NR
3534 \NC OverbarVerticalGap \NC \NC \NR
3535 \NC OverbarRuleThickness \NC \NC \NR
3536 \NC OverbarExtraAscender \NC \NC \NR
3537 \NC UnderbarVerticalGap \NC \NC \NR
3538 \NC UnderbarRuleThickness \NC \NC \NR
3539 \NC UnderbarExtraDescender \NC \NC \NR
3540 \NC RadicalVerticalGap \NC \NC \NR
3541 \NC RadicalDisplayStyleVerticalGap \NC \NC \NR
3542 \NC RadicalRuleThickness \NC \NC \NR
3543 \NC RadicalExtraAscender \NC \NC \NR
3544 \NC RadicalKernBeforeDegree \NC \NC \NR
3545 \NC RadicalKernAfterDegree \NC \NC \NR
3546 \NC RadicalDegreeBottomRaisePercent \NC \NC \NR
3547 \NC MinConnectorOverlap \NC \NC \NR
3548 \NC FractionDelimiterSize \NC (new in 0.47.0)\NC \NR
3549 \NC FractionDelimiterDisplayStyleSize \NC (new in 0.47.0)\NC \NR
3550 \stoptabulate
3552 \subsubsubsection{validation_state table}
3554 \starttabulate[|lT|p|]
3555 \NC \ssbf key \NC \bf explanation \NC\NR
3556 \NC bad_ps_fontname \NC \NC \NR
3557 \NC bad_glyph_table \NC \NC \NR
3558 \NC bad_cff_table \NC \NC \NR
3559 \NC bad_metrics_table \NC \NC \NR
3560 \NC bad_cmap_table \NC \NC \NR
3561 \NC bad_bitmaps_table \NC \NC \NR
3562 \NC bad_gx_table \NC \NC \NR
3563 \NC bad_ot_table \NC \NC \NR
3564 \NC bad_os2_version \NC \NC \NR
3565 \NC bad_sfnt_header \NC \NC \NR
3566 \stoptabulate
3568 \subsubsubsection{horiz_base and vert_base table}
3570 \starttabulate[|lT|l|p|]
3571 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3572 \NC tags \NC table \NC an array of script list tags\NC \NR
3573 \NC scripts \NC table \NC \NC \NR
3574 \stoptabulate
3577 The \type{scripts} subtable:
3579 \starttabulate[|lT|l|p|]
3580 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3581 \NC baseline \NC table \NC \NC \NR
3582 \NC default_baseline \NC number \NC \NC \NR
3583 \NC lang \NC table \NC \NC \NR
3584 \stoptabulate
3587 The \type{lang} subtable:
3589 \starttabulate[|lT|l|p|]
3590 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3591 \NC tag \NC string \NC a script tag \NC \NR
3592 \NC ascent \NC number \NC \NC \NR
3593 \NC descent \NC number \NC \NC \NR
3594 \NC features \NC table \NC \NC \NR
3595 \stoptabulate
3597 The \type{features} key points to an array of tables with the same layout,
3598 except that in those nested tables the tag represents a language.
3600 \subsubsubsection{altuni table}
3602 An array of alternate \UNICODE\ values. Inside that array
3603 are hashes with:
3605 \starttabulate[|lT|l|p|]
3606 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3607 \NC unicode \NC number \NC this glyph is also used for this unicode\NC \NR
3608 \NC variant \NC number \NC the alternative is driven by this unicode selector\NC \NR
3609 \stoptabulate
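
As a sketch of how this array might be combined with a glyph's primary code
point, the function below gathers every \UNICODE\ value that maps to one glyph.
It assumes that the per|-|glyph table carries its primary value in a
\type{unicode} key next to \type{altuni}; the name \type{all_unicodes} is made
up, and entries that are driven by a variation selector are not treated
specially here.

\starttyping
-- Hypothetical helper: all code points that map to one glyph table.
local function all_unicodes(glyph)
    local result = {}
    if glyph.unicode then
        result[#result + 1] = glyph.unicode
    end
    for _, alt in ipairs(glyph.altuni or {}) do
        result[#result + 1] = alt.unicode
    end
    return result
end
\stoptyping
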
3611 \subsubsubsection{vert_variants and horiz_variants table}
3613 \starttabulate[|lT|l|p|]
3614 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3615 \NC variants \NC string \NC \NC \NR
3616 \NC italic_correction \NC number \NC \NC \NR
3617 \NC parts \NC table \NC \NC \NR
3618 \stoptabulate
3620 The \type{parts} table is an array of smaller tables:
3622 \starttabulate[|lT|l|p|]
3623 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3624 \NC component \NC string \NC \NC \NR
3625 \NC extender \NC number \NC \NC \NR
3626 \NC start \NC number \NC \NC \NR
3627 \NC end \NC number \NC \NC \NR
3628 \NC advance \NC number \NC \NC \NR
3629 \stoptabulate
3632 \subsubsubsection{mathkern table}
3634 \starttabulate[|lT|l|p|]
3635 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3636 \NC top_right \NC table \NC \NC \NR
3637 \NC bottom_right \NC table \NC \NC \NR
3638 \NC top_left \NC table \NC \NC \NR
3639 \NC bottom_left \NC table \NC \NC \NR
3640 \stoptabulate
3642 Each of the subtables is an array of small hashes with two keys:
3644 \starttabulate[|lT|l|p|]
3645 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3646 \NC height \NC number \NC \NC \NR
3647 \NC kern \NC number \NC \NC \NR
3648 \stoptabulate
3650 \subsubsubsection{kerns table}
3652 Substructure is identical to the per|-|glyph subtable.
3654 \subsubsubsection{vkerns table}
3656 Substructure is identical to the per|-|glyph subtable.
3658 \subsubsubsection{texdata table}
3661 \starttabulate[|lT|l|p|]
3662 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3663 \NC type \NC string \NC \type {unset}, \type {text}, \type {math}, \type {mathext}\NC\NR
3664 \NC params \NC array \NC 22 numeric font parameters\NC\NR
3665 \stoptabulate
3667 \subsubsubsection{lookups table}
3669 The top|-|level \type{lookups} table is quite different from the ones at
3670 character level. The keys in this hash are strings, and the values are the
3671 actual lookups, represented as dictionary tables.
3673 \starttabulate[|lT|l|p|]
3674 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3675 \NC type \NC string \NC \NC\NR
3676 \NC format \NC enum \NC one of \type {glyphs}, \type {class}, \type {coverage}, \type {reversecoverage} \NC\NR
3677 \NC tag \NC string \NC \NC\NR
3678 \NC current_class \NC array \NC \NC\NR
3679 \NC before_class \NC array \NC \NC\NR
3680 \NC after_class \NC array \NC \NC\NR
3681 \NC rules \NC array \NC an array of rule items\NC\NR
3682 \stoptabulate
3684 Rule items have one common item and one specialized item:
3686 \starttabulate[|lT|l|p|]
3687 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3688 \NC lookups \NC array \NC a linear array of lookup names\NC\NR
3689 \NC glyphs \NC array \NC only if the parent's format is \type{glyphs}\NC\NR
3690 \NC class \NC array \NC only if the parent's format is \type{class}\NC\NR
3691 \NC coverage \NC array \NC only if the parent's format is \type{coverage}\NC\NR
3692 \NC reversecoverage \NC array \NC only if the parent's format is \type{reversecoverage}\NC\NR
3693 \stoptabulate
3695 A glyph table is:
3697 \starttabulate[|lT|l|p|]
3698 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3699 \NC names \NC string \NC \NC\NR
3700 \NC back \NC string \NC \NC\NR
3701 \NC fore \NC string \NC \NC\NR
3702 \stoptabulate
3704 A class table is:
3706 \starttabulate[|lT|l|p|]
3707 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3708 \NC current \NC array \NC of numbers \NC\NR
3709 \NC before \NC array \NC of numbers \NC\NR
3710 \NC after \NC array \NC of numbers \NC\NR
3711 \stoptabulate
3713 A coverage table is:
3715 \starttabulate[|lT|l|p|]
3716 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3717 \NC current \NC array \NC of strings \NC\NR
3718 \NC before \NC array \NC of strings\NC\NR
3719 \NC after \NC array \NC of strings \NC\NR
3720 \stoptabulate
3722 A reversecoverage table is:
3724 \starttabulate[|lT|l|p|]
3725 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
3726 \NC current \NC array \NC of strings \NC\NR
3727 \NC before \NC array \NC of strings\NC\NR
3728 \NC after \NC array \NC of strings \NC\NR
3729 \NC replacements \NC string \NC \NC\NR
3730 \stoptabulate
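
Because the key of the specialized item in a rule equals the parent's
\type{format} value, it can be fetched by indexing the rule with that value.
The following is only a sketch under that reading of the tables above, with
\type{rule_details} as a made|-|up name:

\starttyping
-- Hypothetical helper: return the specialized item of each rule of a
-- top-level lookup, selected by the parent's format.
local function rule_details(lookup)
    local details = {}
    for i, rule in ipairs(lookup.rules or {}) do
        details[i] = rule[lookup.format]
    end
    return details
end
\stoptyping
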
3732 %***********************************************************************
3734 \section{The \luatex{img} library}
3736 The \type{img} library can be used as an alternative to
3737 \tex{pdfximage} and \tex{pdfrefximage}, and the associated \quote {satellite}
3738 commands like \tex{pdfximagebbox}.
3739 Image objects can also be used within virtual fonts
3740 via the \type{image} command listed in~\in{section}[virtualfonts].
3742 \subsection{\luatex{img.new}}
3744 \startfunctioncall
3745 <image> var = img.new()
3746 <image> var = img.new(<table> image_spec)
3747 \stopfunctioncall
3749 This function creates a userdata object of type \quote {image}. The
3750 \type{image_spec} argument is optional. If it is given, it must be
3751 a table, and that table must contain a \type{filename} key. A number of
3752 other keys can also be useful; these are explained below.
3754 You can either say
3756 \starttyping
3757 a = img.new()
3758 \stoptyping
3760 followed by
3762 \starttyping
3763 a.filename = "foo.png"
3764 \stoptyping
3766 or you can put the file name (and some or all of the other keys)
3767 into a table directly, like so:
3769 \starttyping
3770 a = img.new({filename='foo.pdf', page=1})
3771 \stoptyping
3773 The generated \type{<image>} userdata object allows access to a set of
3774 user|-|specified values as well as a set of values that are normally
3775 filled in and updated automatically by \LUATEX\ itself. Some of those
3776 are derived from the actual image file, others are updated to reflect
3777 the \PDF\ output status of the object.
3779 There is one required user-specified field: the file name
3780 (\type{filename}). It can optionally be augmented by the requested
3781 image dimensions (\type{width}, \type{depth}, \type{height}),
3782 user-specified image attributes (\type{attr}), the requested \PDF\ page
3783 identifier (\type{page}), the requested boundingbox (\type{pagebox})
3784 for \PDF\ inclusion, and the requested color space object (\type{colorspace}).
3786 The function \type{img.new} does not access the actual image file; it
3787 just creates the \type{<image>} userdata object and initializes some
3788 memory structures. The \type{<image>} object and its internal
3789 structures are automatically garbage collected.
3791 Once the image is scanned, all the values in the \type{<image>} object
3792 except \type{width}, \type{height} and \type{depth} become frozen,
3793 and you cannot change them any more.
3795 \subsection{\luatex{img.keys}}
3797 \startfunctioncall
3798 <table> keys = img.keys()
3799 \stopfunctioncall
3801 This function returns a list of all the possible \type{image_spec}
3802 keys, both user-supplied and automatic ones.
3804 % hahe: i need to add r/w ro column...
3805 \starttabulate[|l|l|p|]
3806 \NC \bf field name\NC \bf type \NC \bf description \NC \NR
3807 \NC attr \NC string \NC the image attributes for \LUATEX \NC \NR
3808 \NC bbox \NC table \NC table with 4 boundingbox dimensions
3809 \type{llx}, \type{lly}, \type{urx},
3810 and \type{ury} overruling the \type{pagebox}
3811 entry\NC \NR
3812 \NC colordepth \NC number \NC the number of bits used by the color space\NC \NR
3813 \NC colorspace \NC number \NC the color space object number \NC \NR
3814 \NC depth \NC number \NC the image depth for \LUATEX\
3815 (in scaled points)\NC \NR
3816 \NC filename \NC string \NC the image file name \NC \NR
3817 \NC filepath \NC string \NC the full (expanded) file name of the image\NC \NR
3818 \NC height \NC number \NC the image height for \LUATEX\
3819 (in scaled points)\NC \NR
3820 \NC imagetype \NC string \NC one of \type{pdf}, \type{png}, \type{jpg}, \type{jp2},
3821 \type{jbig2}, or \type{nil} \NC \NR
3822 \NC index \NC number \NC the \PDF\ image name suffix \NC \NR
3823 \NC objnum \NC number \NC the \PDF\ image object number \NC \NR
3824 \NC page \NC number or string \NC the identifier for the requested image page
3825 (default is the number~1)\NC \NR
3827 \NC pagebox \NC string \NC the requested bounding box, one of
3828 \type {none}, \type {media}, \type {crop},
3829 \type {bleed}, \type {trim}, \type {art} \NC \NR
3830 \NC pages \NC number \NC the total number of available pages \NC \NR
3831 \NC rotation \NC number \NC the image rotation from included \PDF\ file,
3832 in multiples of 90~deg. \NC \NR
3833 \NC stream \NC string \NC the raw stream data for an \type{/XObject}
3834 \type{/Form} object\NC \NR
3835 \NC transform \NC number \NC the image transform, integer number 0..7\NC \NR
3836 \NC width \NC number \NC the image width for \LUATEX\
3837 (in scaled points)\NC \NR
3838 \NC xres \NC number \NC the horizontal natural image resolution
3839 (in \DPI) \NC \NR
3840 \NC xsize \NC number \NC the natural image width \NC \NR
3841 \NC yres \NC number \NC the vertical natural image resolution
3842 (in \DPI) \NC \NR
3843 \NC ysize \NC number \NC the natural image height \NC \NR
3844 \stoptabulate
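
To see which keys a particular binary actually reports, the returned list can be
written to the log. This is a small sketch that is meant to run inside
\LUATEX\ (for instance from \type{\directlua}); \type{pairs} is used so that
nothing is assumed about how the list is indexed.

\starttyping
-- Write every key name reported by img.keys() to the log and terminal.
for _, key in pairs(img.keys()) do
    texio.write_nl(key)
end
\stoptyping
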
3846 A running (undefined) dimension in \type{width}, \type{height}, or \type{depth} is
3847 represented as \type{nil} in \LUA, so if you want to load an image at
3848 its \quote {natural} size, you do not have to specify any of those three fields.
3850 The \type{stream} parameter allows you to fabricate an \type{/XObject} \type{/Form}
3851 object from a string giving the stream contents,
3852 e.\,g., for a filled rectangle:
3854 \startfunctioncall
3855 a.stream = "0 0 20 10 re f"
3856 \stopfunctioncall
3858 When writing the image, an \type{/XObject} \type{/Form} object is created,
3859 like with embedded \PDF\ file writing. The object is written out only once.
3860 The \type{stream} key requires that the \type{bbox} table is given as well.
3861 The \type{stream} key conflicts with the \type{filename} key.
3862 The \type{transform} key also works as usual with \type{stream}.
3864 The \type{bbox} key needs a table with four boundingbox values, e.\,g.:
3866 \startfunctioncall
3867 a.bbox = {"30bp", 0, "225bp", "200bp"}
3868 \stopfunctioncall
3870 This replaces and overrules any given \type{pagebox} value;
3871 with given \type{bbox} the box dimensions coming with an embedded \PDF\ file
3872 are ignored.
3873 The \type{xsize} and \type{ysize} dimensions are set accordingly,
3874 when the image is scaled.
3875 The \type{bbox} parameter is ignored for non-\PDF\ images.
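
Putting the \type{stream} and \type{bbox} keys together, a filled rectangle
might be produced as in the sketch below. It only combines the assignments
shown above with \type{img.node} and \type{node.write}; the dimensions are
arbitrary, and the code is assumed to run at a point where a node can be
appended to the current list.

\starttyping
-- Sketch: a 20bp by 10bp filled rectangle as a form XObject.
-- stream and bbox are given together; no filename is set.
local a = img.new()
a.stream = "0 0 20 10 re f"
a.bbox   = { 0, 0, "20bp", "10bp" }
node.write(img.node(a))
\stoptyping
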
3877 The \type{transform} key allows you to mirror and rotate the image in steps of 90~deg.
3878 The default value~0 gives an unmirrored, unrotated image.
3879 Values 1|--|3 give counterclockwise rotation by 90, 180, or 270~degrees,
3880 whereas with values 4|--|7 the image is first mirrored
3881 and then rotated counterclockwise by 0, 90, 180, or 270~degrees.
3882 The \type{transform} operation gives the same visual result
3883 as if you had preprocessed the image externally with a graphics tool
3884 and then used it in \LUATEX.
3885 If a \PDF\ file to be embedded already contains a \type{/Rotate} specification,
3886 the rotation result is the combination of the \type{/Rotate} rotation
3887 followed by the \type{transform} operation.
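
The numbering can be read as a mirror flag plus a rotation count. The helper
below is a sketch of that reading, not library code: the name
\type{decode_transform} is made up, and it assumes that value~4 means
\quote{mirrored, not rotated}.

\starttyping
-- Hypothetical helper: decode a transform value (0..7) into a mirror
-- flag and a counterclockwise rotation in degrees.
local function decode_transform(t)
    assert(t >= 0 and t <= 7, "transform must be in the range 0..7")
    local mirrored = t >= 4
    local rotation = (t % 4) * 90
    return mirrored, rotation
end
\stoptyping
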
3889 \subsection{\luatex{img.scan}}
3891 \startfunctioncall
3892 <image> var = img.scan(<image> var)
3893 <image> var = img.scan(<table> image_spec)
3894 \stopfunctioncall
3896 When you say \type{img.scan(a)} for a new image, the file is scanned,
3897 and variables such as \type{xsize}, \type{ysize}, \type{imagetype}, number of
3898 \type{pages}, and the resolution are extracted. Each of the \type{width},
3899 \type{height}, \type{depth} fields is set up according to the image dimensions,
3900 if it was not given an explicit value already.
3901 An image file will never be scanned more than once for a given image variable.
3902 With all subsequent \type{img.scan(a)} calls only the dimensions are again
3903 set up (if they have been changed by the user in the meantime).
3905 For ease of use, you can directly do a
3907 \starttyping
3908 a = img.scan ({ filename = "foo.png" })
3909 \stoptyping
3911 without a prior \type{img.new}.
3913 Nothing is written yet at this point, so you can do \type{a=img.scan},
3914 retrieve the available info like image width and height, and then
3915 throw away \type{a} again by saying \type{a=nil}. In that case no
3916 image object will be reserved in the \PDF, and the used memory will be
3917 cleaned up automatically.
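
A sketch of such a \quote{look but do not embed} query follows. The file name
is a placeholder, and the values are passed through \type{tostring} because
some of them may be \type{nil} for a given image type.

\starttyping
-- Sketch: inspect an image without reserving a PDF object for it.
local a = img.scan({ filename = "foo.png" })
texio.write_nl(string.format("%s: natural size %s x %s, resolution %s x %s",
    a.filename, tostring(a.xsize), tostring(a.ysize),
    tostring(a.xres), tostring(a.yres)))
a = nil -- nothing has been written; the object can be collected
\stoptyping
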
3919 \subsection{\luatex{img.copy}}
3921 \startfunctioncall
3922 <image> var = img.copy(<image> var)
3923 <image> var = img.copy(<table> image_spec)
3924 \stopfunctioncall
3926 If you say \type{a = b}, then both variables point to the same
3927 \type{<image>} object. If you want to write out an image with
3928 different sizes, you can do a \type{b=img.copy(a)}.
3930 Afterwards, \type{a} and \type{b} still reference the same actual
3931 image dictionary, but the dimensions for \type{b} can now be changed
3932 from their initial values that were just copies from \type{a}.
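
For instance, the same file could be placed at two sizes as in the sketch
below. The file name is a placeholder, the halving is arbitrary, and the code
is assumed to run where \type{img.write} can append to the current list.

\starttyping
-- Sketch: place the same image twice, the second time at half size.
local a = img.scan({ filename = "foo.png" })
local b = img.copy(a)
b.width  = math.floor(a.width / 2)
b.height = math.floor(a.height / 2)
img.write(a)
img.write(b)
\stoptyping
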
3934 % Hartmut, I don't know if this makes sense. An example of what
3935 % can, and what cannot be changed would be helpful.
3936 % -- will think about it...
3938 \subsection{\luatex{img.write}}
3940 \startfunctioncall
3941 <image> var = img.write(<image> var)
3942 <image> var = img.write(<table> image_spec)
3943 \stopfunctioncall
3945 With \type{img.write(a)}, a \PDF\ object number is allocated,
3946 and a whatsit node of subtype \type{pdf_refximage} is generated
3947 and put into the output list.
3948 This places the image \type{a} into the page stream,
3949 and the image file is written out into an image stream object
3950 after the shipping of the current page is finished.
3952 Again you can do a terse call like
3954 \starttyping
3955 img.write ({ filename = "foo.png" })
3956 \stoptyping
3958 The \type{<image>} variable is returned in case you want it for later
3959 processing.
3961 \subsection{\luatex{img.immediatewrite}}
3963 \startfunctioncall
3964 <image> var = img.immediatewrite(<image> var)
3965 <image> var = img.immediatewrite(<table> image_spec)
3966 \stopfunctioncall
3968 By \type{img.immediatewrite(a)} a \PDF\ object number is
3969 allocated, and the image file for image \type{a} is written out
3970 immediately into the \PDF\ file as an image stream object (like
3971 with \tex{immediate}\tex{pdfximage}). The object number of the image
3972 stream dictionary is then available by the \type{objnum} key. No
3973 \type{pdf_refximage} whatsit node is generated. You will need an
3974 \luatex{img.write(a)} or \luatex{img.node(a)} call to let the
3975 image appear on the page, or reference it by another trick; else
3976 you will have a dangling image object in the \PDF\ file.
3978 Also here you can do a terse call like
3980 \starttyping
3981 a = img.immediatewrite ({ filename = "foo.png" })
3982 \stoptyping
3984 The \type{<image>} variable is returned and you will most likely need it.
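
The following sketch shows one way to use the returned variable: it reports the
object number and then places the image so that the object does not stay
dangling. The file name is a placeholder.

\starttyping
-- Sketch: write the image data right away, note the object number,
-- and then place the image on the page.
local a = img.immediatewrite({ filename = "foo.png" })
texio.write_nl("image object number: " .. tostring(a.objnum))
node.write(img.node(a))
\stoptyping
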
3986 \subsection{\luatex{img.node}}
3988 \startfunctioncall
3989 <node> n = img.node(<image> var)
3990 <node> n = img.node(<table> image_spec)
3991 \stopfunctioncall
3993 This function allocates a \PDF\ object number and returns a
3994 whatsit node of subtype \type{pdf_refximage}, filled with the
3995 image parameters \type{width}, \type{height}, \type{depth}, and
3996 \type{objnum}. Also here you can do a terse call like:
3998 \starttyping
3999 n = img.node ({ filename = "foo.png" })
4000 \stoptyping
4002 This example outputs an image:
4004 \starttyping
4005 node.write(img.node{filename="foo.png"})
4006 \stoptyping
4008 \subsection{\luatex{img.types}}
4010 \startfunctioncall
4011 <table> types = img.types()
4012 \stopfunctioncall
4014 This function returns a list with the supported image file type names.
4015 Currently these are \type{pdf}, \type{png}, \type{jpg}, \type{jp2} (JPEG~2000),
4016 and \type{jbig2}.
4018 \subsection{\luatex{img.boxes}}
4020 \startfunctioncall
4021 <table> boxes = img.boxes()
4022 \stopfunctioncall
4024 This function returns a list with the supported \PDF\ page box names.
4025 Currently these are \type {media}, \type {crop}, \type {bleed}, \type {trim}, and \type {art}
4026 (all in lowercase letters).
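
Both \type{img.types} and \type{img.boxes} return plain lists, so a membership
test is a short loop. The helper name \type{contains} below is an assumption
made for illustration.

\starttyping
-- Hypothetical helper: is 'name' one of the entries of 'list'?
local function contains(list, name)
    for _, entry in pairs(list) do
        if entry == name then
            return true
        end
    end
    return false
end

-- Possible use inside LuaTeX:
--   if contains(img.types(), "jp2") then ... end
--   if contains(img.boxes(), "crop") then ... end
\stoptyping
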
4028 %***********************************************************************
4030 \section{The \luatex{kpse} library}
4032 This library provides two separate, but nearly identical interfaces to
4033 the \KPATHSEA\ file search functionality: there is a \quote{normal}
4034 procedural interface that shares its \KPATHSEA\ instance with \LUATEX\
4035 itself, and an object oriented interface that is completely on its
4036 own. The object oriented interface and \type{kpse.new} have been added
4037 in \LUATEX\ 0.37.
4039 \subsection{\luatex{kpse.set_program_name} and \luatex{kpse.new}}
4041 Before the search library can be used at all, its database has to be
4042 initialized. There are three possibilities, two of which belong to the
4043 procedural interface.
4045 First, when \LUATEX\ is used to typeset documents, this initialization
4046 happens automatically and the \KPATHSEA\ executable and program names
4047 are set to \type{luatex} (that is, unless explicitly prohibited by the
4048 user's startup script; see~\in{section}[init] for more details).
4050 Second, in \TEXLUA\ mode, the initialization has to be done explicitly
4051 via the \luatex{kpse.set_program_name} function, which sets the
4052 \KPATHSEA\ executable (and optionally program) name.
4054 \startfunctioncall
4055 kpse.set_program_name(<string> name)
4056 kpse.set_program_name(<string> name, <string> progname)
4057 \stopfunctioncall
4059 The second argument controls the use of the \quote{dotted} values in the
4060 \type{texmf.cnf} configuration file, and defaults to the first argument.
4062 Third, if you prefer the object oriented interface, you have to call a
4063 different function. It has the same arguments, but it returns a
4064 userdata variable.
4066 \startfunctioncall
4067 local kpathsea = kpse.new(<string> name)
4068 local kpathsea = kpse.new(<string> name, <string> progname)
4069 \stopfunctioncall
4071 Apart from these two functions, the calling conventions of the
4072 interfaces are identical. Depending on the chosen interface, you
4073 either call \type{kpse.find_file()} or \type{kpathsea:find_file()},
4074 with identical arguments and return values.
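
As a sketch for \TEXLUA\ mode, the same lookup can be done through both
interfaces. The file name \type{plain.tex} is only an example.

\starttyping
-- Procedural interface: initialize once, then call kpse.* functions.
kpse.set_program_name("luatex")
print(kpse.find_file("plain.tex"))

-- Object oriented interface: an independent instance with methods.
local kpathsea = kpse.new("luatex")
print(kpathsea:find_file("plain.tex"))
\stoptyping
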
4076 \subsection{\luatex{find_file}}
4078 The most often used function in the library is \type{find_file}:
4080 \startfunctioncall
4081 <string> f = kpse.find_file(<string> filename)
4082 <string> f = kpse.find_file(<string> filename, <string> ftype)
4083 <string> f = kpse.find_file(<string> filename, <boolean> mustexist)
4084 <string> f = kpse.find_file(<string> filename, <string> ftype, <boolean> mustexist)
4085 <string> f = kpse.find_file(<string> filename, <string> ftype, <number> dpi)
4086 \stopfunctioncall
4088 Arguments:
4089 \startitemize[intro]
4091 \sym{filename}
4093 the name of the file you want to find, with or without extension.
4095 \sym{ftype}
4097 maps to the \type {-format} argument of \KPSEWHICH. The supported
4098 \type{ftype} values are the same as the ones supported by the
4099 standalone \type{kpsewhich} program:
4101 \startsimplecolumns
4102 \starttyping
4103 'gf'
4104 'pk'
4105 'bitmap font'
4106 'tfm'
4107 'afm'
4108 'base'
4109 'bib'
4110 'bst'
4111 'cnf'
4112 'ls-R'
4113 'fmt'
4114 'map'
4115 'mem'
4116 'mf'
4117 'mfpool'
4118 'mft'
4119 'mp'
4120 'mppool'
4121 'MetaPost support'
4122 'ocp'
4123 'ofm'
4124 'opl'
4125 'otp'
4126 'ovf'
4127 'ovp'
4128 'graphic/figure'
4129 'tex'
4130 'TeX system documentation'
4131 'texpool'
4132 'TeX system sources'
4133 'PostScript header'
4134 'Troff fonts'
4135 'type1 fonts'
4136 'vf'
4137 'dvips config'
4138 'ist'
4139 'truetype fonts'
4140 'type42 fonts'
4141 'web2c files'
4142 'other text files'
4143 'other binary files'
4144 'misc fonts'
4145 'web'
4146 'cweb'
4147 'enc files'
4148 'cmap files'
4149 'subfont definition files'
4150 'opentype fonts'
4151 'pdftex config'
4152 'lig files'
4153 'texmfscripts'
4154 'lua'
4155 'font feature files'
4156 'cid maps'
4157 'mlbib'
4158 'mlbst'
4159 'clua'
4160 \stoptyping
4161 \stopsimplecolumns
4163 The default type is \type{tex}. Note: this is different from
4164 \KPSEWHICH, which tries to deduce the file type itself by
4165 looking at the supplied extension. The types
4166 \type{font feature files}, \type{cid maps}, \type{mlbib} and \type{mlbst}
4167 were new additions in \LUATEX\ 0.40.2.
4170 \sym{mustexist}
4172 is similar to \KPSEWHICH's \type{-must-exist}, and the default is \type{false}.
4173 If you specify \type{true} (or a non|-|zero integer), then the \KPSE\ library
4174 will search the disk as well as the \type {ls-R} databases.
4176 \sym{dpi}
4178 This is used for the size argument of the formats \type{pk}, \type{gf}, and \type{bitmap font}.
4179 \stopitemize
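A minimal sketch of a lookup with an explicit file type follows. The helper name \type{locate} and its optional third argument (a replacement finder, convenient when trying the code outside \LUATEX) are illustrative assumptions and not part of the library. The sketch also assumes that an unsuccessful search returns \type{nil}.

```lua
-- Hypothetical helper: wrap kpse.find_file so that a failed search
-- yields nil plus a readable message.
local function locate(filename, ftype, finder)
  finder = finder or kpse.find_file   -- default: the real library call
  ftype = ftype or "tex"              -- "tex" is the documented default type
  local path = finder(filename, ftype)
  if not path then
    return nil, string.format("cannot find '%s' (type '%s')", filename, ftype)
  end
  return path
end

-- Inside LuaTeX: local path, err = locate("plain.tex")
```

Passing the type explicitly matters here because, unlike \KPSEWHICH, the library does not deduce it from the extension.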
4181 \subsection{\luatex{lookup}}
4183 A more powerful (but slower) generic method for finding files is also
4184 available (since 0.51). It returns a string for each found file.
4186 \startfunctioncall
4187 <string> f, ... = kpse.lookup(<string> filename, <table> options)
4188 \stopfunctioncall
4190 The options match commandline arguments from \type{kpsewhich}:
4192 \starttabulate[|l|l|p|]
4193 \NC \ssbf key \NC \ssbf type \NC \ssbf description \NC \NR
4194 \NC debug \NC number \NC set debugging flags for this lookup\NC \NR
4195 \NC format \NC string \NC use specific file type (see list above)\NC \NR
4196 \NC dpi \NC number \NC use this resolution for this lookup; default 600\NC \NR
4197 \NC path \NC string \NC search in the given path\NC \NR
4198 \NC all \NC boolean \NC output all matches, not just the first\NC \NR
4199 \NC mustexist \NC boolean \NC (0.65 and higher) search the disk as well as ls-R if necessary\NC \NR
4200 \NC must-exist\NC boolean \NC (0.64 and lower) search the disk as well as ls-R if necessary\NC \NR
4201 \NC mktexpk \NC boolean \NC disable/enable mktexpk generation for this lookup\NC \NR
4202 \NC mktextex \NC boolean \NC disable/enable mktextex generation for this lookup\NC \NR
4203 \NC mktexmf \NC boolean \NC disable/enable mktexmf generation for this lookup\NC \NR
4204 \NC mktextfm \NC boolean \NC disable/enable mktextfm generation for this lookup\NC \NR
4205 \NC subdir \NC string
4206 or table \NC only output matches whose directory part
4207 ends with the given string(s) \NC \NR
4208 \stoptabulate
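As an illustration of the options table, the following sketch collects every match for a file into a \LUA\ array. The helper name \type{lookup_all} and its optional third argument (a stand-in for \type{kpse.lookup}) are assumptions made for this example only.

```lua
-- Hypothetical helper: return all matches for a file as an array.
local function lookup_all(filename, format, lookup)
  lookup = lookup or kpse.lookup      -- default: the real library call
  -- kpse.lookup returns one string per match; the table constructor
  -- gathers all of them.
  return { lookup(filename, { format = format, all = true }) }
end

-- Inside LuaTeX: local matches = lookup_all("plain.tex", "tex")
```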
4210 \subsection{\luatex{init_prog}}
4212 Extra initialization for programs that need to generate bitmap fonts.
4214 \startfunctioncall
4215 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode)
4216 kpse.init_prog(<string> prefix, <number> base_dpi, <string> mfmode, <string> fallback)
4217 \stopfunctioncall
4220 \subsection{\luatex{readable_file}}
4222 Test if an (absolute) file name is a readable file.
4224 \startfunctioncall
4225 <string> f = kpse.readable_file(<string> name)
4226 \stopfunctioncall
4228 The return value is the actual absolute filename you should use,
4229 because the disk name is not always the same as the requested name,
4230 due to aliases and system|-|specific handling under e.\,g.\ \MSDOS.
4232 Returns \lua {nil} if the file does not exist or is not readable.
4234 \subsection{\luatex{expand_path}}
4236 Like kpsewhich's \type {-expand-path}:
4238 \startfunctioncall
4239 <string> r = kpse.expand_path(<string> s)
4240 \stopfunctioncall
4242 \subsection{\luatex{expand_var}}
4244 Like kpsewhich's \type{-expand-var}:
4246 \startfunctioncall
4247 <string> r = kpse.expand_var(<string> s)
4248 \stopfunctioncall
4250 \subsection{\luatex{expand_braces}}
4252 Like kpsewhich's \type{-expand-braces}:
4254 \startfunctioncall
4255 <string> r = kpse.expand_braces(<string> s)
4256 \stopfunctioncall
4258 \subsection{\luatex{show_path}}
4260 Like kpsewhich's \type{-show-path}:
4262 \startfunctioncall
4263 <string> r = kpse.show_path(<string> ftype)
4264 \stopfunctioncall
4267 \subsection{\luatex{var_value}}
4269 Like kpsewhich's \type{-var-value}:
4271 \startfunctioncall
4272 <string> r = kpse.var_value(<string> s)
4273 \stopfunctioncall
4275 \subsection{\luatex{version}}
4277 Returns the kpathsea version string (new in 0.51).
4279 \startfunctioncall
4280 <string> r = kpse.version()
4281 \stopfunctioncall
4284 \section{The \luatex{lang} library}
4286 This library provides the interface to \LUATEX's structure
4287 representing a language, and the associated functions.
4289 \startfunctioncall
4290 <language> l = lang.new()
4291 <language> l = lang.new(<number> id)
4292 \stopfunctioncall
4294 This function creates a new userdata object. An object of type
4295 \type{<language>} is the first argument to most of the other functions
4296 in the \luatex{lang} library. These functions can also be used as if
4297 they were object methods, using the colon syntax.
4299 Without an argument, the next available internal id number will be
4300 assigned to this object. With argument, an object will be created that
4301 links to the internal language with that id number.
4303 \startfunctioncall
4304 <number> n = lang.id(<language> l)
4305 \stopfunctioncall
4307 Returns the internal \tex{language} id number this object refers to.
4309 \startfunctioncall
4310 <string> n = lang.hyphenation(<language> l)
4311 lang.hyphenation(<language> l, <string> n)
4312 \stopfunctioncall
4314 Either returns the current hyphenation exceptions for this language,
4315 or adds new ones. The syntax of the string is explained in~\in{section}[patternsexceptions].
4317 \startfunctioncall
4318 lang.clear_hyphenation(<language> l)
4319 \stopfunctioncall
4321 Clears the exception dictionary for this language.
4323 \startfunctioncall
4324 <string> n = lang.clean(<string> o)
4325 \stopfunctioncall
4327 Creates a hyphenation key from the supplied hyphenation value. The
4328 syntax of the argument string is explained in~\in{section}[patternsexceptions].
4329 This function is useful if
4330 you want to do something else based on the words in a dictionary file,
4331 like spell-checking.
4333 \startfunctioncall
4334 <string> n = lang.patterns(<language> l)
4335 lang.patterns(<language> l, <string> n)
4336 \stopfunctioncall
4338 Adds additional patterns for this language object, or returns the
4339 current set. The syntax of this string is explained in~\in{section}[patternsexceptions].
4341 \startfunctioncall
4342 lang.clear_patterns(<language> l)
4343 \stopfunctioncall
4345 Clears the pattern dictionary for this language.
4347 \startfunctioncall
4348 <number> n = lang.prehyphenchar(<language> l)
4349 lang.prehyphenchar(<language> l, <number> n)
4350 \stopfunctioncall
4352 Gets or sets the \quote{pre|-|break} hyphen character for implicit
4353 hyphenation in this language (initially the hyphen, decimal 45).
4355 \startfunctioncall
4356 <number> n = lang.posthyphenchar(<language> l)
4357 lang.posthyphenchar(<language> l, <number> n)
4358 \stopfunctioncall
4360 Gets or sets the \quote{post|-|break} hyphen character for implicit
4361 hyphenation in this language (initially null, decimal~0, indicating
4362 emptiness).
4365 \startfunctioncall
4366 <number> n = lang.preexhyphenchar(<language> l)
4367 lang.preexhyphenchar(<language> l, <number> n)
4368 \stopfunctioncall
4370 Gets or sets the \quote{pre|-|break} hyphen character for explicit
4371 hyphenation in this language (initially null, decimal~0, indicating
4372 emptiness).
4374 \startfunctioncall
4375 <number> n = lang.postexhyphenchar(<language> l)
4376 lang.postexhyphenchar(<language> l, <number> n)
4377 \stopfunctioncall
4379 Gets or sets the \quote{post|-|break} hyphen character for explicit
4380 hyphenation in this language (initially null, decimal~0, indicating
4381 emptiness).
4383 \startfunctioncall
4384 <boolean> success = lang.hyphenate(<node> head)
4385 <boolean> success = lang.hyphenate(<node> head, <node> tail)
4386 \stopfunctioncall
4388 Inserts hyphenation points (discretionary nodes) in a node list. If
4389 \type{tail} is given as argument, processing stops on that node.
4390 Currently, \type{success} is always true if \type{head} (and \type{tail}, if
4391 specified) are proper nodes, regardless of possible other errors.
4393 Hyphenation works only on \quote{characters}, a special subtype of
4394 glyph nodes for which the node subtype has the value \type{1}. Glyph
4395 nodes with different subtypes are not processed. See
4396 \in{section~}[charsandglyphs] for more details.
4399 \section{The \luatex{lua} library}
4401 This library contains one read|-|only item:
4403 \starttyping
4404 <string> s = lua.version
4405 \stoptyping
4407 This returns the \LUA\ version identifier string. The value is
4408 currently \directlua {tex.print(lua.version)}.
4410 \subsection{\LUA\ bytecode registers}
4412 \LUA\ registers can be used to communicate \LUA\ functions across \LUA\
4413 chunks. The accepted values for assignments are functions and
4414 \type{nil}. Likewise, the retrieved value is either a function or \type{nil}.
4416 \starttyping
4417 lua.bytecode[<number> n] = <function> f
4418 lua.bytecode[<number> n]()
4419 \stoptyping
4421 The contents of the \luatex{lua.bytecode} array are stored inside the format
4422 file as actual \LUA\ bytecode, so the array can also be used to preload \LUA\ code.
4424 Note: The function must not contain any upvalues. Currently, functions
4425 containing upvalues can be stored (and their upvalues are set to
4426 \type{nil}), but this is an artifact of the current \LUA\
4427 implementation and thus subject to change.
4429 The associated function calls are
4431 \startfunctioncall
4432 <function> f = lua.getbytecode(<number> n)
4433 lua.setbytecode(<number> n, <function> f)
4434 \stopfunctioncall
4436 Note: Since a \LUA\ file loaded using \luatex{loadfile(filename)} is
4437 essentially an anonymous function, a complete file can be stored in a
4438 bytecode register like this:
4440 \startfunctioncall
4441 lua.bytecode[n] = loadfile(filename)
4442 \stopfunctioncall
4444 Now all definitions (functions, variables) contained in the file can be
4445 created by executing this bytecode register:
4447 \startfunctioncall
4448 lua.bytecode[n]()
4449 \stopfunctioncall
4451 Note that the path of the file is stored in the \LUA\ bytecode to be
4452 used in stack backtraces and is therefore dumped into the format file if
4453 the above code is used in \INITEX. If the path contains private information, e.g.\
4454 the user name, this information is then contained in the format file as
4455 well. This should be kept in mind when preloading files into a bytecode
4456 register in \INITEX.
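The store-then-execute pattern of this subsection can be wrapped in two small functions. This is only a sketch: the names \type{preload} and \type{run} are hypothetical, and error handling is limited to passing on the message from \type{loadfile}.

```lua
-- Hypothetical helper: compile a file and store it in a bytecode register.
local function preload(slot, filename)
  local chunk, err = loadfile(filename)
  if not chunk then
    return nil, err                   -- missing file or syntax error
  end
  lua.bytecode[slot] = chunk          -- dumped into the format in INITEX
  return true
end

-- Hypothetical helper: execute a register if it holds a function.
local function run(slot)
  local f = lua.bytecode[slot]
  if f then
    return f()
  end
end
```

Checking the register before calling it avoids an error when the slot was never filled.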
4458 \subsection{\LUA\ chunk name registers}
4460 There is an array of 65536 (0--65535) potential chunk names for use with
4461 the \type{\directlua} and \type{\latelua} primitives.
4463 \startfunctioncall
4464 lua.name[<number> n] = <string> s
4465 <string> s = lua.name[<number> n]
4466 \stopfunctioncall
4468 If you want to unset a \LUA\ name, you can assign \type{nil} to it.
4471 \section{The \luatex{mplib} library}
4473 The \MP\ library interface registers itself in the table \type{mplib}. It
4474 is based on \MPLIB\ version \ctxlua{tex.sprint(mplib.version())}.
4476 \subsection{\luatex{mplib.new}}
4478 To create a new \METAPOST\ instance, call
4480 \startfunctioncall
4481 <mpinstance> mp = mplib.new({...})
4482 \stopfunctioncall
4484 This creates the \type{mp} instance object. The argument hash can have a number of
4485 different fields, as follows:
4487 \starttabulate[|lT|l|p|p|]
4488 \NC \ssbf name \NC \bf type \NC \bf description \NC \bf default \NC\NR
4489 \NC error_line \NC number \NC error line width \NC 79 \NC\NR
4490 \NC print_line \NC number \NC line length in ps output \NC 100\NC\NR
4491 \NC random_seed \NC number \NC the initial random seed \NC variable\NC\NR
4492 \NC interaction \NC string \NC the interaction mode, one of
4493 \type {batch}, \type {nonstop}, \type {scroll}, \type {errorstop} \NC \type {errorstop}\NC\NR
4494 \NC job_name \NC string \NC \type {--jobname} \NC \type {mpout} \NC\NR
4495 \NC find_file \NC function \NC a function to find files \NC only local files\NC\NR
4496 \stoptabulate
4498 The \type{find_file} function should be of this form:
4500 \starttyping
4501 <string> found = finder (<string> name, <string> mode, <string> type)
4502 \stoptyping
4504 with:
4506 \starttabulate[|lT|l|p|]
4507 \NC name \NC the requested file \NC \NR
4508 \NC mode \NC the file mode: \type {r} or \type {w} \NC \NR
4509 \NC type \NC the kind of file, one of: \type {mp}, \type {tfm}, \type {map}, \type {pfb}, \type {enc} \NC \NR
4510 \stoptabulate
4512 Return either the full pathname of the found file, or \type{nil} if
4513 the file cannot be found.
4515 Note that the new version of \MPLIB\ no longer uses binary mem files,
4516 so the way to preload a set of macros is simply to start off with
4517 an \type{input} command in the first \type{mp:execute()} call.
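A finder of the form described above could be sketched as follows. It assumes that files opened for writing are simply used under the requested name, and it maps the \MPLIB\ kinds \type{pfb} and \type{enc} to the corresponding \type{kpse} type names from the list given for \type{kpse.find_file}. The names \type{finder}, \type{kpse_type} and \type{new_instance}, as well as the job name \type{example}, are illustrative.

```lua
-- MPlib file kinds that need a different kpse type name (assumed mapping).
local kpse_type = { pfb = "type1 fonts", enc = "enc files" }

-- Finder in the form expected by the find_file field.
local function finder(name, mode, ftype)
  if mode == "w" then
    return name                       -- output files: use the name as given
  end
  return kpse.find_file(name, kpse_type[ftype] or ftype)
end

-- Hypothetical wrapper: create an instance and preload the plain macros.
local function new_instance()
  local mp = mplib.new({ find_file = finder, job_name = "example" })
  local result = mp:execute("input plain;")
  return mp, result
end
```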
4520 \subsection{\luatex{mp:statistics}}
4522 You can request statistics with:
4524 \startfunctioncall
4525 <table> stats = mp:statistics()
4526 \stopfunctioncall
4528 This function returns the vital statistics for an \MPLIB\ instance. There are four
4529 fields, giving the maximum number of used items in each of four
4530 allocated object classes:
4532 \starttabulate[|lT|l|p|]
4533 \NC main_memory \NC number \NC memory size \NC\NR
4534 \NC hash_size \NC number \NC hash size\NC\NR
4535 \NC param_size \NC number \NC simultaneous macro parameters\NC\NR
4536 \NC max_in_open \NC number \NC input file nesting levels\NC\NR
4537 \stoptabulate
4539 Note that in the new version of \MPLIB, this is informational only. The
4540 objects are all allocated dynamically, so there is no chance of running
4541 out of space unless the available system memory is exhausted.
4543 \subsection{\luatex{mp:execute}}
4545 You can ask the \METAPOST\ interpreter to run a chunk of code by calling
4547 \startfunctioncall
4548 <table> rettable = mp:execute('metapost language chunk')
4549 \stopfunctioncall
4551 for various bits of \METAPOST\ language input. Be sure to check the
4552 \type{rettable.status} (see below) because when a fatal \METAPOST\
4553 error occurs the \MPLIB\ instance will become unusable thereafter.
4555 Generally speaking, it is best to keep your chunks small, but beware
4556 that every chunk has to obey proper syntax, as if each of them were a
4557 small file. For instance, you cannot split a single statement over
4558 multiple chunks.
4560 In contrast with the normal standalone \type{mpost} command, there is
4561 {\em no\/} implied \quote{input} at the start of the first chunk.
4563 \subsection{\luatex{mp:finish}}
4565 \startfunctioncall
4566 <table> rettable = mp:finish()
4567 \stopfunctioncall
4569 If for some reason you want to stop using an \MPLIB\ instance while
4570 processing is not yet actually done, you can call \type{mp:finish}.
4571 Eventually, used memory will be freed and open files will be closed by
4572 the \LUA\ garbage collector, but an explicit \type{mp:finish} is the
4573 only way to capture the final part of the output streams.
4575 \subsection{Result table}
4577 The return value of \type{mp:execute} and \type{mp:finish} is a table
4578 with a few possible keys (only \type {status} is always guaranteed to be present).
4580 \starttabulate[|l|l|p|]
4581 \NC log \NC string \NC output to the \quote {log} stream \NC \NR
4582 \NC term \NC string \NC output to the \quote {term} stream \NC \NR
4583 \NC error \NC string \NC output to the \quote {error} stream (only used for \quote {out of memory})\NC \NR
4584 \NC status \NC number \NC the return value: 0=good, 1=warning, 2=errors, 3=fatal error \NC \NR
4585 \NC fig \NC table \NC an array of generated figures (if any)\NC \NR
4586 \stoptabulate
4588 When \type{status} equals~3, you should stop using this \MPLIB\ instance
4589 immediately: it is no longer capable of processing input.
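A result table can be inspected along these lines. The function name \type{check} is hypothetical, and picking the first available stream as the message is a simplification made for this sketch.

```lua
-- Names for the documented status values 0..3.
local status_names = { [0] = "good", "warning", "errors", "fatal error" }

-- Hypothetical helper: report whether the instance is still usable,
-- the status name, and the first output stream that is present.
local function check(rettable)
  local status = rettable.status
  local usable = status < 3           -- status 3 means: stop using the instance
  local message = rettable.log or rettable.term or rettable.error or ""
  return usable, status_names[status], message
end

-- Inside LuaTeX: local usable, name, msg = check(mp:execute("..."))
```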
4591 If it is present, each of the entries in the \type{fig} array is a
4592 userdata representing a figure object, and each of those has a number of
4593 object methods you can call:
4595 \starttabulate[|l|l|p|]
4596 \NC boundingbox \NC function \NC returns the bounding box, as an array of 4 values\NC \NR
4597 \NC postscript \NC function \NC returns a string that is the ps output of the \type{fig}.
4598 This function accepts two optional integer arguments for
4599 specifying the values of \type{prologues} (first argument)
4600 and \type{procset} (second argument)\NC \NR
4601 \NC svg \NC function \NC returns a string that is the svg output of the \type{fig}.
4602 This function accepts an optional integer argument for
4603 specifying the value of \type{prologues}\NC \NR
4604 \NC objects \NC function \NC returns the actual array of graphic objects in this \type{fig} \NC \NR
4605 \NC copy_objects \NC function \NC returns a deep copy of the array of graphic objects in this \type{fig} \NC \NR
4606 \NC filename \NC function \NC the filename this \type{fig}'s \POSTSCRIPT\ output
4607 would have been written to in standalone mode\NC \NR
4608 \NC width \NC function \NC the \type{charwd} value \NC \NR
4609 \NC height \NC function \NC the \type{charht} value \NC \NR
4610 \NC depth \NC function \NC the \type{chardp} value \NC \NR
4611 \NC italcorr \NC function \NC the \type{charit} value \NC \NR
4612 \NC charcode \NC function \NC the (rounded) \type{charcode} value \NC \NR
4613 \stoptabulate
4615 {\bf NOTE:} you can call \type{fig:objects()} only once for any one \type{fig} object!
4617 When the boundingbox represents a \quote {negated rectangle}, i.e.\ when the first set
4618 of coordinates is larger than the second set, the picture is empty.
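The emptiness test can be written as a one-line predicate. The sketch assumes that the four values are ordered as lower left $x$, lower left $y$, upper right $x$, upper right $y$, and it treats a box as empty as soon as either coordinate is inverted; the name \type{is_empty} is illustrative.

```lua
-- Hypothetical helper: true when the bounding box is a "negated rectangle".
-- Assumed order of the array: llx, lly, urx, ury.
local function is_empty(bbox)
  return bbox[1] > bbox[3] or bbox[2] > bbox[4]
end

-- Inside LuaTeX: if not is_empty(fig:boundingbox()) then ... end
```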
4620 Graphical objects come in various types that each has a different list of
4621 accessible values. The types are: \type{fill}, \type{outline}, \type{text},
4622 \type{start_clip}, \type{stop_clip}, \type{start_bounds}, \type{stop_bounds}, \type{special}.
4624 There is a helper function (\type{mplib.fields(obj)}) to get the list of
4625 accessible values for a particular object, but you can just as easily
4626 use the tables given below.
4628 All graphical objects have a field \type{type} that gives the object
4629 type as a string value; it is not explicitly mentioned in the following tables.
4630 In the following, \type{number}s are \POSTSCRIPT\ points represented as
4631 a floating point number, unless stated otherwise. Field values that
4632 are of type \type{table} are explained in the next section.
4634 \subsubsection{fill}
4636 \starttabulate[|l|l|p|]
4637 \NC path \NC table \NC the list of knots \NC \NR
4638 \NC htap \NC table \NC the list of knots for the reversed trajectory \NC \NR
4639 \NC pen \NC table \NC knots of the pen \NC \NR
4640 \NC color \NC table \NC the object's color \NC \NR
4641 \NC linejoin \NC number \NC line join style (bare number)\NC \NR
4642 \NC miterlimit \NC number \NC miterlimit\NC \NR
4643 \NC prescript \NC string \NC the prescript text \NC \NR
4644 \NC postscript \NC string \NC the postscript text \NC \NR
4645 \stoptabulate
4647 The entries \type{htap} and \type{pen} are optional.
4649 There is a helper function (\type{mplib.pen_info(obj)}) that returns
4650 a table containing a bunch of vital characteristics of the used pen
4651 (all values are floats):
4653 \starttabulate[|l|l|p|]
4654 \NC width \NC number \NC width of the pen\NC \NR
4655 \NC sx \NC number \NC $x$ scale \NC \NR
4656 \NC rx \NC number \NC $xy$ multiplier \NC \NR
4657 \NC ry \NC number \NC $yx$ multiplier \NC \NR
4658 \NC sy \NC number \NC $y$ scale \NC \NR
4659 \NC tx \NC number \NC $x$ offset \NC \NR
4660 \NC ty \NC number \NC $y$ offset \NC \NR
4661 \stoptabulate
4663 \subsubsection{outline}
4665 \starttabulate[|l|l|p|]
4666 \NC path \NC table \NC the list of knots \NC \NR
4667 \NC pen \NC table \NC knots of the pen \NC \NR
4668 \NC color \NC table \NC the object's color \NC \NR
4669 \NC linejoin \NC number \NC line join style (bare number)\NC \NR
4670 \NC miterlimit \NC number \NC miterlimit \NC \NR
4671 \NC linecap \NC number \NC line cap style (bare number)\NC \NR
4672 \NC dash \NC table \NC representation of a dash list\NC \NR
4673 \NC prescript \NC string \NC the prescript text \NC \NR
4674 \NC postscript \NC string \NC the postscript text \NC \NR
4675 \stoptabulate
4677 The entry \type{dash} is optional.
4679 \subsubsection{text}
4681 \starttabulate[|l|l|p|]
4682 \NC text \NC string \NC the text \NC \NR
4683 \NC font \NC string \NC font tfm name \NC \NR
4684 \NC dsize \NC number \NC font size\NC \NR
4685 \NC color \NC table \NC the object's color \NC \NR
4686 \NC width \NC number \NC \NC \NR
4687 \NC height \NC number \NC \NC \NR
4688 \NC depth \NC number \NC \NC \NR
4689 \NC transform \NC table \NC a text transformation \NC \NR
4690 \NC prescript \NC string \NC the prescript text \NC \NR
4691 \NC postscript \NC string \NC the postscript text \NC \NR
4692 \stoptabulate
4694 \subsubsection{special}
4696 \starttabulate[|l|l|p|]
4697 \NC prescript \NC string \NC special text \NC \NR
4698 \stoptabulate
4700 \subsubsection{start_bounds, start_clip}
4702 \starttabulate[|l|l|p|]
4703 \NC path \NC table \NC the list of knots \NC \NR
4704 \stoptabulate
4706 \subsubsection{stop_bounds, stop_clip}
4708 No fields are available here.
4710 \subsection{Subsidiary table formats}
4712 \subsubsection{Paths and pens}
4714 Paths and pens (that are really just a special type of paths as far as
4715 \MPLIB\ is concerned) are represented by an array where each entry
4716 is a table that represents a knot.
4718 \starttabulate[|lT|l|p|]
4719 \NC left_type \NC string \NC when present: 'endpoint', but usually absent \NC \NR
4720 \NC right_type \NC string \NC like \type{left_type}\NC \NR
4721 \NC x_coord \NC number \NC X coordinate of this knot\NC \NR
4722 \NC y_coord \NC number \NC Y coordinate of this knot\NC \NR
4723 \NC left_x \NC number \NC X coordinate of the precontrol point of this knot\NC \NR
4724 \NC left_y \NC number \NC Y coordinate of the precontrol point of this knot\NC \NR
4725 \NC right_x \NC number \NC X coordinate of the postcontrol point of this knot\NC \NR
4726 \NC right_y \NC number \NC Y coordinate of the postcontrol point of this knot\NC \NR
4727 \stoptabulate
4729 There is one special case: pens that are (possibly transformed)
4730 ellipses have an extra string-valued key \type{type} with value
4731 \type{elliptical} besides the array part containing the knot list.
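To show how the knot fields fit together, the following sketch converts a knot array into \POSTSCRIPT\ path operators. It rests on two assumptions that are not stated above: a path whose first knot has no \type{left_type} of \type{endpoint} is treated as cyclic, and a segment without control points is drawn as a straight line. All function names are hypothetical, and \type{%g} keeps only six significant digits.

```lua
-- Format numbers compactly, separated by spaces.
local function num(...)
  local t = { ... }
  for i = 1, #t do
    t[i] = string.format("%g", t[i])
  end
  return table.concat(t, " ")
end

-- One segment: the postcontrol point of `from`, the precontrol point
-- of `to`, then the knot `to` itself.
local function segment(from, to)
  if from.right_x and to.left_x then
    return num(from.right_x, from.right_y, to.left_x, to.left_y,
               to.x_coord, to.y_coord) .. " curveto"
  end
  return num(to.x_coord, to.y_coord) .. " lineto"
end

local function path_to_ps(path)
  if #path == 0 then
    return ""
  end
  local out = { num(path[1].x_coord, path[1].y_coord) .. " moveto" }
  for i = 2, #path do
    out[#out + 1] = segment(path[i - 1], path[i])
  end
  -- Assumption: no 'endpoint' marker on the first knot means a cycle.
  if path[1].left_type ~= "endpoint" then
    out[#out + 1] = segment(path[#path], path[1])
    out[#out + 1] = "closepath"
  end
  return table.concat(out, " ")
end
```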
4733 \subsubsection{Colors}
4735 A color is an integer-indexed array with 0, 1, 3 or 4 values:
4737 \starttabulate[|l|l|p|]
4738 \NC 0 \NC marking only \NC no values \NC\NR
4739 \NC 1 \NC greyscale \NC one value in the range $[0,1]$, \quote {black} is $0$ \NC\NR
4740 \NC 3 \NC \RGB \NC three values in the range $[0,1]$, \quote {black} is $0,0,0$ \NC\NR
4741 \NC 4 \NC \CMYK \NC four values in the range $[0,1]$, \quote {black} is $0,0,0,1$ \NC\NR
4742 \stoptabulate
4744 If the color model of the internal object was \type{uninitialized}, then
4745 it was initialized to the values representing \quote {black} in the colorspace
4746 \type{defaultcolormodel} that was in effect at the time of the \type{shipout}.
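Since the color model follows from the number of values alone, it can be recovered with the length operator. The function name \type{color_model} and the returned strings are chosen for this sketch only.

```lua
-- Map the number of values in a color array to a model name.
local models = {
  [0] = "marking only",
  [1] = "greyscale",
  [3] = "rgb",
  [4] = "cmyk",
}

-- Hypothetical helper: returns nil for an unexpected number of values.
local function color_model(color)
  return models[#color]
end
```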
4748 \subsubsection{Transforms}
4750 Each transform is a six-item array.
4752 \starttabulate[|l|l|p|]
4753 \NC 1 \NC number \NC represents x \NC\NR
4754 \NC 2 \NC number \NC represents y \NC\NR
4755 \NC 3 \NC number \NC represents xx \NC\NR
4756 \NC 4 \NC number \NC represents yx \NC\NR
4757 \NC 5 \NC number \NC represents xy \NC\NR
4758 \NC 6 \NC number \NC represents yy \NC\NR
4759 \stoptabulate
4761 Note that the translation (index 1 and 2) comes first. This differs
4762 from the ordering in \POSTSCRIPT, where the translation comes last.
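Reordering a transform for use as a \POSTSCRIPT\ matrix is therefore a matter of moving the first two items to the end. A sketch, with a hypothetical function name:

```lua
-- MPlib order:      x, y, xx, yx, xy, yy
-- PostScript order: xx, yx, xy, yy, x, y
local function to_postscript_matrix(t)
  return { t[3], t[4], t[5], t[6], t[1], t[2] }
end
```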
4764 \subsubsection{Dashes}
4766 Each \type{dash} is a two-item hash, using the same model as \POSTSCRIPT\
4767 for the representation of the dashlist. \type{dashes} is an array of
4768 \quote {on} and \quote {off} values, and \type{offset} is the phase of the pattern.
4770 \starttabulate[|l|l|p|]
4771 \NC dashes \NC hash \NC an array of on-off numbers \NC\NR
4772 \NC offset \NC number \NC the starting offset value \NC\NR
4773 \stoptabulate
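Because the model is the same as in \POSTSCRIPT, a dash table maps directly onto a \type{setdash} operator. The following sketch uses a hypothetical function name and formats numbers with \type{%g}, which keeps six significant digits.

```lua
-- Hypothetical helper: turn a dash table into a PostScript setdash command.
local function setdash(dash)
  local parts = {}
  for i, v in ipairs(dash.dashes) do
    parts[i] = string.format("%g", v)
  end
  return string.format("[%s] %g setdash", table.concat(parts, " "), dash.offset)
end
```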
4775 \subsection{Character size information}
4777 These functions find the size of a glyph in a defined font. The
4778 \type{fontname} is the same name as the argument to \type{infont};
4779 the \type{char} is a glyph id in the range 0 to 255; the returned
4780 \type{w} is in AFM units.
4782 \subsubsection{\luatex{mp:char_width}}
4784 \startfunctioncall
4785 <number> w = mp:char_width(<string> fontname, <number> char)
4786 \stopfunctioncall
4788 \subsubsection{\luatex{mp:char_height}}
4790 \startfunctioncall
4791 <number> w = mp:char_height(<string> fontname, <number> char)
4792 \stopfunctioncall
4794 \subsubsection{\luatex{mp:char_depth}}
4796 \startfunctioncall
4797 <number> w = mp:char_depth(<string> fontname, <number> char)
4798 \stopfunctioncall
4800 \section{The \luatex{node} library}
4802 The \luatex{node} library contains functions that facilitate dealing
4803 with (lists of) nodes and their values. They allow you to create, alter,
4804 copy, delete, and insert \LUATEX\ node objects, the core
4805 objects within the typesetter.
4807 \LUATEX\ nodes are represented in \LUA\ as userdata with
4808 the metadata type \luatex{luatex.node}. The various parts within
4809 a node can be accessed using named fields.
4811 Each node has at least the three fields \type{next}, \type{id}, and
4812 \type{subtype}:
4814 \startitemize[intro]
4816 \item The \type{next} field returns the userdata
4817 object for the next node in a linked list of nodes, or
4818 \type{nil}, if there is no next node.
4820 \item The \type{id} indicates \TEX's \quote{node type}. The field \type{id}
4821 has a numeric value for efficiency reasons, but some of the library
4822 functions also accept a string value instead of \type{id}.
4824 \item The \type{subtype} is another number. It often gives further information
4825 about a node of a particular \type{id}, but it is most important when dealing
4826 with \quote{whatsits}, because they are differentiated solely based on their
4827 \type{subtype}.
4828 \stopitemize
4830 The other available fields depend on the \type{id} (and for \quote{whatsits}, the
4831 \type{subtype}) of the node. Further details on the various fields and their
4832 meanings are given in~\in{chapter}[nodes].
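The \type{next} and \type{id} fields are enough to walk a list and count nodes of a given type. In this sketch the function name \type{count_nodes} is hypothetical; the loop only reads fields, so it does not affect memory management.

```lua
-- Hypothetical helper: count the nodes with a given id in a list.
local function count_nodes(head, wanted_id)
  local count = 0
  local n = head
  while n do
    if n.id == wanted_id then
      count = count + 1
    end
    n = n.next                        -- nil at the end of the list
  end
  return count
end

-- Inside LuaTeX: count_nodes(head, node.id("glyph"))
```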
4834 Support for \type{unset} (alignment) nodes is partial:
4835 they can be queried and modified from \LUA\ code, but not created.
4837 Nodes can be compared to each other, but keep in mind that you are actually comparing
4838 indices into the node memory. This means that equality tests can only
4839 be trusted under very limited conditions. They will not work correctly
4840 in any situation where one of the two nodes has been freed and|/|or
4841 reallocated: in that case, there will be false positives.
4843 At the moment, memory management of nodes should still be done
4844 explicitly by the user. Nodes are not \quote{seen} by the \LUA\
4845 garbage collector, so you have to call the node freeing functions
4846 yourself when you are no longer in need of a node (list). Nodes form
4847 linked lists without reference counting, so you have to be careful
4848 that when control returns back to \LUATEX\ itself, you have not
4849 deleted nodes that are still referenced from a \type{next} pointer
4850 elsewhere, and that you did not create nodes that are referenced more
4851 than once.
4853 There are statistics available with regard to the allocated node memory,
4854 which can be handy for tracing.
4856 \subsection{Node handling functions}
4858 \subsubsection{\luatex{node.is_node}}
4860 \startfunctioncall
4861 <boolean> t = node.is_node(<any> item)
4862 \stopfunctioncall
4864 This function returns true if the argument is a userdata object of
4865 type \type{<node>}.
4867 \subsubsection{\luatex{node.types}}
4869 \startfunctioncall
4870 <table> t = node.types()
4871 \stopfunctioncall
4873 This function returns an array that maps node id numbers to node type
4874 strings, providing an overview of the possible top|-|level \type{id}
4875 types.
4877 \subsubsection{\luatex{node.whatsits}}
4879 \startfunctioncall
4880 <table> t = node.whatsits()
4881 \stopfunctioncall
4883 \TEX's \quote{whatsits} all have the same \type{id}. The various subtypes
4884 are defined by their \type{subtype} fields. The function is much like
4885 \luatex{node.types}, except that it provides an array of \type{subtype}
4886 mappings.
4888 \subsubsection{\luatex{node.id}}
4890 \startfunctioncall
4891 <number> id = node.id(<string> type)
4892 \stopfunctioncall
4894 This converts a single type name to its internal numeric
4895 representation.
4897 \subsubsection{\luatex{node.subtype}}
4899 \startfunctioncall
4900 <number> subtype = node.subtype(<string> type)
4901 \stopfunctioncall
4903 This converts a single whatsit name to its internal numeric
4904 representation (\type{subtype}).
4906 \subsubsection{\luatex{node.type}}
4908 \startfunctioncall
4909 <string> type = node.type(<any> n)
4910 \stopfunctioncall
4912 If the argument is a number, then this function converts an internal
4913 numeric representation to an external string representation.
4914 Otherwise, it will return the string \type{node} if the object
4915 represents a node (this is new in 0.65), and \type{nil} otherwise.
4917 \subsubsection{\luatex{node.fields}}
4919 \startfunctioncall
4920 <table> t = node.fields(<number> id)
4921 <table> t = node.fields(<number> id, <number> subtype)
4922 \stopfunctioncall
4924 This function returns an array of valid field names for a particular
4925 type of node. If you want to get the valid fields for a
4926 \quote{whatsit}, you have to supply the second argument also. In other
4927 cases, any given second argument will be silently ignored.
4929 This function accepts string \type{id} and \type{subtype} values as
4930 well.
4932 \subsubsection{\luatex{node.has_field}}
4934 \startfunctioncall
4935 <boolean> t = node.has_field(<node> n, <string> field)
4936 \stopfunctioncall
4938 This function returns a boolean that is only true if \type{n} is
4939 actually a node, and it has the field.
4941 \subsubsection{\luatex{node.new}}
4943 \startfunctioncall
4944 <node> n = node.new(<number> id)
4945 <node> n = node.new(<number> id, <number> subtype)
4946 \stopfunctioncall
4948 Creates a new node. All of the new node's fields are initialized to
4949 either zero or \type{nil} except for \type{id} and \type{subtype} (if
4950 supplied). If you want to create a new whatsit, then the second
4951 argument is required, otherwise it need not be present. As with all
4952 node functions, this function creates a node on the \TEX\ level.
4954 This function accepts string \type{id} and \type{subtype} values as
4955 well.
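As a small illustration, a glyph node can be created with a string \type{id} and then filled in. The function name \type{new_glyph} is hypothetical; \type{font} and \type{char} are fields of glyph nodes. The caller remains responsible for freeing the node or handing it over to \TEX.

```lua
-- Hypothetical helper: create a glyph node for a character in a font.
local function new_glyph(font_id, char)
  local g = node.new("glyph")         -- string ids are accepted
  g.font = font_id
  g.char = char
  return g
end

-- Inside LuaTeX: local g = new_glyph(font.current(), 65)
```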
4957 \subsubsection{\luatex{node.free}}
4959 \startfunctioncall
4960 node.free(<node> n)
4961 \stopfunctioncall
4963 Removes the node \type{n} from \TEX's memory. Be careful: no checks
4964 are done on whether this node is still pointed to from a register or some
4965 \type{next} field: it is up to you to make sure that the internal data
4966 structures remain correct.
4968 \subsubsection{\luatex{node.flush_list}}
4970 \startfunctioncall
4971 node.flush_list(<node> n)
4972 \stopfunctioncall
4974 Removes the node \type{n} and the complete node list following
4975 \type{n} from \TEX's memory. Be careful: no checks are done on whether
4976 any of these nodes is still pointed to from a register or some
4977 \type{next} field: it is up to you to make sure that the internal data
4978 structures remain correct.
4980 \subsubsection{\luatex{node.copy}}
4982 \startfunctioncall
4983 <node> m = node.copy(<node> n)
4984 \stopfunctioncall
4986 Creates a deep copy of node \type{n}, including all nested lists as in
4987 the case of a hlist or vlist node. Only the \type{next} field is not
4988 copied.
4990 \subsubsection{\luatex{node.copy_list}}
4992 \startfunctioncall
4993 <node> m = node.copy_list(<node> n)
4994 <node> m = node.copy_list(<node> n, <node> m)
4995 \stopfunctioncall
4997 Creates a deep copy of the node list that starts at \type{n}. If
4998 \type{m} is also given, the copy stops just before node \type{m}.
Note that you cannot copy attribute lists this way. Specialized functions for
dealing with attribute lists will be provided later but are not there yet.
However, there is normally no need to copy attribute lists: when you assign
to the \type{attr} field or make changes to specific attributes, the
needed copying and freeing takes place automatically.
5006 \subsubsection{\luatex{node.next} (0.65)}
5008 \startfunctioncall
5009 <node> m = node.next(<node> n)
5010 \stopfunctioncall
5012 Returns the node following this node, or \type{nil} if there is no
5013 such node.
5015 \subsubsection{\luatex{node.prev} (0.65)}
5017 \startfunctioncall
5018 <node> m = node.prev(<node> n)
5019 \stopfunctioncall
5021 Returns the node preceding this node, or \type{nil} if there is no
5022 such node.
5025 \subsubsection{\luatex{node.current_attr} (0.66)}
5027 \startfunctioncall
5028 <node> m = node.current_attr()
5029 \stopfunctioncall
5031 Returns the currently active list of attributes, if there is one.
5033 Note: this function is somewhat experimental, and it returns the {\it
5034 actual} attribute list, not a copy thereof.
5035 Therefore, changing any of the attributes in the list will change
5036 these values for all nodes that have the current attribute list
5037 assigned to them.
5040 \subsubsection{\luatex{node.hpack}}
5042 \startfunctioncall
5043 <node> h, <number> b = node.hpack(<node> n)
5044 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info)
5045 <node> h, <number> b = node.hpack(<node> n, <number> w, <string> info, <string> dir)
5046 \stopfunctioncall
5048 This function creates a new hlist by packaging the list that begins at node
5049 \type{n} into a horizontal box. With only a single argument, this box
5050 is created using the natural width of its components. In the three
5051 argument form, \type{info} must be either \type{additional} or
5052 \type{exactly}, and \type{w} is the additional (\tex{hbox spread})
5053 or exact (\tex{hbox to}) width to be used.
5055 Direction support added in \LUATEX\ 0.45.
The second return value is the badness of the generated box;
this extension was added in 0.51.
5060 Caveat: at this moment, there can be unexpected side|-|effects to this
5061 function, like updating some of the \tex{marks} and \tex{inserts}.
5062 Also note that the content of \type{h} is the original node list
5063 \type{n}: if you call \type{node.free(h)} you will also free the
5064 node list itself, unless you explicitly set the \type{list} field
5065 to \type{nil} beforehand. And in a similar way, calling
5066 \type{node.free(n)} will invalidate \type{h} as well!
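
The ownership caveat can be illustrated with a small helper. This is only
a sketch: \type{badness_at} is a hypothetical name, and \type{head} is
assumed to be a horizontal list that is not linked into any other list.
The side|-|effects mentioned above still apply.

\starttyping
-- report how badly the list fits the given width (in scaled points)
local function badness_at(head, width)
  local box, badness = node.hpack(head, width, "exactly")
  -- box.list is the original list, not a copy: detach it first,
  -- so that freeing the wrapper leaves the content alone
  box.list = nil
  node.free(box)
  return badness
end
\stoptyping
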
5068 \subsubsection{\luatex{node.vpack} (since 0.36)}
5070 \startfunctioncall
5071 <node> h, <number> b = node.vpack(<node> n)
5072 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info)
5073 <node> h, <number> b = node.vpack(<node> n, <number> w, <string> info, <string> dir)
5074 \stopfunctioncall
5076 This function creates a new vlist by packaging the list that begins at node
5077 \type{n} into a vertical box. With only a single argument, this box
5078 is created using the natural height of its components. In the three
5079 argument form, \type{info} must be either \type{additional} or
5080 \type{exactly}, and \type{w} is the additional (\tex{vbox spread}) or exact (\tex{vbox to}) height to be used.
5082 Direction support added in \LUATEX\ 0.45.
The second return value is the badness of the generated box;
this extension was added in 0.51.
5087 See the description of \type{node.hpack()} for a few memory allocation
5088 caveats.
5090 \subsubsection{\luatex{node.dimensions} (0.43)}
5092 \startfunctioncall
5093 <number> w, <number> h, <number> d = node.dimensions(<node> n)
5094 <number> w, <number> h, <number> d = node.dimensions(<node> n, <string> dir)
5095 <number> w, <number> h, <number> d = node.dimensions(<node> n, <node> t)
5096 <number> w, <number> h, <number> d = node.dimensions(<node> n, <node> t, <string> dir)
5097 \stopfunctioncall
5099 This function calculates the natural in-line dimensions of the node
5100 list starting at node \type{n} and terminating just before node \type{t}
5101 (or the end of the list, if there is no second argument). The return values are scaled
5102 points. An alternative format that starts with glue parameters as the
5103 first three arguments is also possible:
5105 \startfunctioncall
5106 <number> w, <number> h, <number> d =
5107 node.dimensions(<number> glue_set, <number> glue_sign,
5108 <number> glue_order, <node> n)
5109 <number> w, <number> h, <number> d =
5110 node.dimensions(<number> glue_set, <number> glue_sign,
5111 <number> glue_order, <node> n, <string> dir)
5112 <number> w, <number> h, <number> d =
5113 node.dimensions(<number> glue_set, <number> glue_sign,
5114 <number> glue_order, <node> n, <node> t)
5115 <number> w, <number> h, <number> d =
5116 node.dimensions(<number> glue_set, <number> glue_sign,
5117 <number> glue_order, <node> n, <node> t, <string> dir)
5118 \stopfunctioncall
5120 This calling method takes glue settings into account and is especially
5121 useful for finding the actual width of a sublist of nodes that are
5122 already boxed, for example in code like this, which prints the
width of the space in between the \type{a} and \type{b} as it would
5124 be if \type{\box0} was used as-is:
5126 \starttyping
\setbox0 = \hbox to 20pt {a b}

\directlua{print (node.dimensions(tex.box[0].glue_set,
                                  tex.box[0].glue_sign,
                                  tex.box[0].glue_order,
                                  tex.box[0].head.next,
                                  node.tail(tex.box[0].head))) }
5134 \stoptyping
5136 Direction support added in \LUATEX\ 0.45.
5138 \subsubsection{\luatex{node.mlist_to_hlist}}
5140 \startfunctioncall
5141 <node> h = node.mlist_to_hlist(<node> n,
5142 <string> display_type, <boolean> penalties)
5143 \stopfunctioncall
5145 This runs the internal mlist to hlist conversion, converting the math list in
5146 \type{n} into the horizontal list \type{h}. The interface is exactly the same as
5147 for the callback \type{mlist_to_hlist}.
5149 \subsubsection{\luatex{node.slide}}
5151 \startfunctioncall
5152 <node> m = node.slide(<node> n)
5153 \stopfunctioncall
5155 Returns the last node of the node list that starts at \type{n}. As a
5156 side|-|effect, it also creates a reverse chain of \type{prev} pointers
5157 between nodes.
5159 \subsubsection{\luatex{node.tail}}
5161 \startfunctioncall
5162 <node> m = node.tail(<node> n)
5163 \stopfunctioncall
5165 Returns the last node of the node list that starts at \type{n}.
5168 \subsubsection{\luatex{node.length}}
5170 \startfunctioncall
5171 <number> i = node.length(<node> n)
5172 <number> i = node.length(<node> n, <node> m)
5173 \stopfunctioncall
5175 Returns the number of nodes contained in the node list that starts at
5176 \type{n}. If \type{m} is also supplied it stops at \type{m} instead of
5177 at the end of the list. The node \type{m} is not counted.
5179 \subsubsection{\luatex{node.count}}
5181 \startfunctioncall
5182 <number> i = node.count(<number> id, <node> n)
5183 <number> i = node.count(<number> id, <node> n, <node> m)
5184 \stopfunctioncall
5186 Returns the number of nodes contained in the node list that starts at
5187 \type{n} that have a matching \type{id} field.
5188 If \type{m} is also supplied, counting stops at \type{m} instead of at
5189 the end of the list. The node \type{m} is not counted.
This function also accepts string \type{id}s.
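
For example, assuming \type{head} is the first node of some list, the
following sketch reports how many of its nodes are glyphs:

\starttyping
local glyph_id = node.id("glyph")

local total  = node.length(head)            -- all nodes in the list
local glyphs = node.count(glyph_id, head)   -- glyph nodes only
local same   = node.count("glyph", head)    -- string id, same result

texio.write_nl(glyphs .. " of " .. total .. " nodes are glyphs")
\stoptyping
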
5193 \subsubsection{\luatex{node.traverse}}
5195 \startfunctioncall
5196 <node> t = node.traverse(<node> n)
5197 \stopfunctioncall
This is a \LUA\ iterator that loops over the node list that starts at \type{n}.
5200 Typical input code like this
5202 \starttyping
for n in node.traverse(head) do
  -- ... (loop body)
end
5206 \stoptyping
5208 is functionally equivalent to:
5210 \starttyping
do
  local n
  local function f (head,var)
    local t
    if var == nil then
      t = head
    else
      t = var.next
    end
    return t
  end
  while true do
    n = f (head, n)
    if n == nil then break end
    -- ... (loop body)
  end
end
5230 It should be clear from the definition of the function \type{f} that
5231 even though it is possible to add or remove nodes from the node list while
5232 traversing, you have to take great care to make sure all the \type{next}
5233 (and \type{prev}) pointers remain valid.
5235 If the above is unclear to you, see the section \quote{For Statement}
5236 in the Lua Reference Manual.
5238 \subsubsection{\luatex{node.traverse_id}}
5240 \startfunctioncall
5241 <node> t = node.traverse_id(<number> id, <node> n)
5242 \stopfunctioncall
5244 This is an iterator that loops over all the nodes in the list that
5245 starts at \type{n} that have a matching \type{id} field.
5247 See the previous section for details. The change is in the local
5248 function \type{f}, which now does an extra while loop checking
5249 against the upvalue \type{id}:
5251 \starttyping
local function f (head,var)
  local t
  if var == nil then
    t = head
  else
    t = var.next
  end
  -- skip nodes whose id does not match the upvalue id
  while t and t.id ~= id do
    t = t.next
  end
  return t
end
5264 \stoptyping
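
A typical use, sketched here under the assumption that \type{head} is a
list that contains glyph nodes, is to visit only the glyphs:

\starttyping
local glyph_id = node.id("glyph")

for g in node.traverse_id(glyph_id, head) do
  texio.write_nl("character " .. g.char .. " in font " .. g.font)
end
\stoptyping
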
5266 \subsubsection{\luatex{node.end_of_math} (0.76)}
5268 \startfunctioncall
5269 <node> t = node.end_of_math(<node> start)
5270 \stopfunctioncall
5272 Looks for and returns the next \type{math_node} following the \type{start}.
5274 \subsubsection{\luatex{node.remove}}
5276 \startfunctioncall
5277 <node> head, current = node.remove(<node> head, <node> current)
5278 \stopfunctioncall
5280 This function removes the node \type{current} from the list following
5281 \type{head}. It is your responsibility to make sure it is really part
5282 of that list. The return values are the new \type{head} and
5283 \type{current} nodes. The returned \type{current} is the node
5284 following the \type{current} in the calling argument, and is only
5285 passed back as a convenience (or \type{nil}, if there is no such node). The
5286 returned \type{head} is more important, because if the function is
5287 called with \type{current} equal to \type{head}, it will be changed.
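
A common pattern is to remove nodes while walking over a list. The sketch
below (\type{strip_kerns} is a hypothetical helper, not part of the
library) removes and frees all kern nodes, and uses both return values:

\starttyping
local kern_id = node.id("kern")

local function strip_kerns(head)
  local current = head
  while current do
    if current.id == kern_id then
      local doomed = current
      -- head may change; current becomes the node after the removed one
      head, current = node.remove(head, current)
      node.free(doomed)
    else
      current = current.next
    end
  end
  return head
end
\stoptyping
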
5289 \subsubsection{\luatex{node.insert_before}}
5291 \startfunctioncall
5292 <node> head, new = node.insert_before(<node> head, <node> current, <node> new)
5293 \stopfunctioncall
5295 This function inserts the node \type{new} before \type{current} into
5296 the list following \type{head}. It is your responsibility to make sure
5297 that \type{current} is really part of that list. The return values are
5298 the (potentially mutated) \type{head} and the node \type{new}, set up to
5299 be part of the list (with correct \type{next} field). If \type{head}
5300 is initially \type{nil}, it will become \type{new}.
5302 \subsubsection{\luatex{node.insert_after}}
5304 \startfunctioncall
5305 <node> head, new = node.insert_after(<node> head, <node> current, <node> new)
5306 \stopfunctioncall
5308 This function inserts the node \type{new} after \type{current} into
5309 the list following \type{head}. It is your responsibility to make sure
5310 that \type{current} is really part of that list. The return values are
5311 the \type{head} and the node \type{new}, set up to be part of the list
5312 (with correct \type{next} field). If \type{head} is initially
5313 \type{nil}, it will become \type{new}.
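
As an illustration, the following sketch adds a kern after every glyph
that has a successor. The name \type{letterspace} and the idea of passing
the amount in scaled points are assumptions made for this example only.

\starttyping
local glyph_id = node.id("glyph")

local function letterspace(head, amount)
  local current = head
  while current do
    if current.id == glyph_id and current.next then
      local k = node.new("kern")
      k.kern = amount
      -- the second return value is the freshly inserted kern
      head, current = node.insert_after(head, current, k)
    end
    current = current.next
  end
  return head
end
\stoptyping

Continuing from the inserted node, instead of from the glyph, makes sure
that the new kern itself is skipped by the loop.
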
5315 \subsubsection{\luatex{node.first_glyph} (0.65)}
5317 \startfunctioncall
5318 <node> n = node.first_glyph(<node> n)
5319 <node> n = node.first_glyph(<node> n, <node> m)
5320 \stopfunctioncall
5322 Returns the first node in the list starting at \type{n} that is a
5323 glyph node with a subtype indicating it is a glyph, or \type{nil}.
If \type{m} is given, processing stops at (and includes) that node,
5325 otherwise processing stops at the end of the list.
5327 Note: this function used to be called \type{first_character}. It has
5328 been renamed in \LUATEX\ 0.65, and the old name is deprecated now.
5330 \subsubsection{\luatex{node.ligaturing}}
5332 \startfunctioncall
5333 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n)
5334 <node> h, <node> t, <boolean> success = node.ligaturing(<node> n, <node> m)
5335 \stopfunctioncall
5337 Apply \TEX-style ligaturing to the specified nodelist. The tail node
5338 \type{m} is optional. The two returned nodes \type{h} and \type{t} are
5339 the new head and tail (both \type{n} and \type{m} can change into
5340 a new ligature).
5342 \subsubsection{\luatex{node.kerning}}
5344 \startfunctioncall
5345 <node> h, <node> t, <boolean> success = node.kerning(<node> n)
5346 <node> h, <node> t, <boolean> success = node.kerning(<node> n, <node> m)
5347 \stopfunctioncall
5349 Apply \TEX|-|style kerning to the specified nodelist. The tail node
5350 \type{m} is optional. The two returned nodes \type{h} and \type{t} are
5351 the head and tail (either one of these can be an inserted kern node,
5352 because special kernings with word boundaries are possible).
5354 \subsubsection{\luatex{node.unprotect_glyphs}}
5356 \startfunctioncall
5357 node.unprotect_glyphs(<node> n)
5358 \stopfunctioncall
5360 Subtracts 256 from all glyph node subtypes. This and the next
5361 function are helpers to convert from \type{characters} to
5362 \type{glyphs} during node processing.
5364 \subsubsection{\luatex{node.protect_glyphs}}
5366 \startfunctioncall
5367 node.protect_glyphs(<node> n)
5368 \stopfunctioncall
5370 Adds 256 to all glyph node subtypes in the node list starting at
5371 \type{n}, except that if the value is 1, it adds only 255. The special
5372 handling of 1 means that \type{characters} will become \type{glyphs}
5373 after subtraction of 256.
5375 \subsubsection{\luatex{node.last_node}}
5377 \startfunctioncall
5378 <node> n = node.last_node()
5379 \stopfunctioncall
5381 This function pops the last node from \TEX's \quote{current list}.
5382 It returns that node, or \type{nil} if the current list is empty.
5384 \subsubsection{\luatex{node.write}}
5386 \startfunctioncall
5387 node.write(<node> n)
5388 \stopfunctioncall
5390 This is an experimental function that will append a node list to
5391 \TEX's \quote {current list} (the node list is not deep-copied
5392 any more since version 0.38). There is no error checking yet!
5394 \subsubsection{\luatex{node.protrusion_skippable} (0.60.1)}
5395 \startfunctioncall
5396 <boolean> skippable = node.protrusion_skippable(<node> n)
5397 \stopfunctioncall
5399 Returns \type{true} if, for the purpose of line boundary discovery
5400 when character protrusion is active, this node can be skipped.
5402 \subsection{Attribute handling}
Attributes appear as a linked list of userdata objects in the
5405 \type{attr} field of individual nodes. They can be handled
5406 individually, but it is much safer and more efficient to use the
5407 dedicated functions associated with them.
5409 \subsubsection{\luatex{node.has_attribute}}
5411 \startfunctioncall
5412 <number> v = node.has_attribute(<node> n, <number> id)
5413 <number> v = node.has_attribute(<node> n, <number> id, <number> val)
5414 \stopfunctioncall
5416 Tests if a node has the attribute with number \type{id} set. If
5417 \type{val} is also supplied, also tests if the value matches \type{val}.
5418 It returns the value, or, if no match is found, \type{nil}.
5420 \subsubsection{\luatex{node.set_attribute}}
5422 \startfunctioncall
5423 node.set_attribute(<node> n, <number> id, <number> val)
5424 \stopfunctioncall
5426 Sets the attribute with number \type{id} to the value
5427 \type{val}. Duplicate assignments are ignored. {\em [needs explanation]}
5429 \subsubsection{\luatex{node.unset_attribute}}
5431 \startfunctioncall
5432 <number> v = node.unset_attribute(<node> n, <number> id)
5433 <number> v = node.unset_attribute(<node> n, <number> id, <number> val)
5434 \stopfunctioncall
5436 Unsets the attribute with number \type{id}. If \type{val} is also supplied,
5437 it will only perform this operation if the value matches \type{val}.
5438 Missing attributes or attribute|-|value pairs are ignored.
5440 If the attribute was actually deleted, returns its old
5441 value. Otherwise, returns \type{nil}.
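
The three functions can be combined into small helpers, as in the sketch
below. The attribute number 200 and the helper names are hypothetical; a
real macro package will normally allocate the attribute number for you.

\starttyping
local marker = 200   -- hypothetical attribute number

local function tag(n)
  node.set_attribute(n, marker, 1)
end

local function is_tagged(n)
  -- has_attribute returns the value or nil, so compare explicitly
  return node.has_attribute(n, marker, 1) ~= nil
end

local function untag(n)
  -- returns the old value, or nil if the attribute was not set
  return node.unset_attribute(n, marker)
end
\stoptyping
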
5443 \section{The \luatex{pdf} library}
5445 This contains variables and functions that are related to the \PDF\ backend.
5447 %***********************************************************************
5449 \subsection{\luatex{pdf.mapfile}, \luatex{pdf.mapline} (new in 0.53.0)}
5451 \startfunctioncall
5452 pdf.mapfile(<string> map file)
pdf.mapline(<string> map line)
5454 \stopfunctioncall
5456 These two functions can be used to replace primitives \type{\pdfmapfile}
5457 and \type{\pdfmapline} from \PDFTEX. They expect a string as only parameter
5458 and have no return value.
These functions also replace the former variables
5461 \luatex{pdf.pdfmapfile} and \luatex{pdf.pdfmapline}.
5463 %***********************************************************************
5465 \subsection{\luatex{pdf.catalog}, \luatex{pdf.info},
5466 \luatex{pdf.names}, \luatex{pdf.trailer} (new in 0.53.0)}
5468 These variables offer a read-write interface to the corresponding
5469 \PDFTEX\ token lists. The value types are strings.
5471 The corresponding \quote{\type{pdf}} parameter names
5472 \luatex{pdf.pdfcatalog}, \luatex{pdf.pdfinfo}, \luatex{pdf.pdfnames},
5473 and \luatex{pdf.pdftrailer} (all new in 0.47.0)
5474 still work, but are obsolescent (since 0.53.0).
5476 Note: this interface will almost certainly change in the future.
5478 %***********************************************************************
5480 \subsection{\luatex{pdf.pageattributes}, \luatex{pdf.pageresources},
5481 \luatex{pdf.pagesattributes} (new in 0.53.0)}
5483 These variables offer a read-write interface to related
5484 token lists. The value types are strings. The variables have no
5485 interaction with the corresponding \PDFTEX\ token registers
5486 \tex{pdfpageattr}, \tex{pdfpageresources}, and \tex{pdfpagesattr},
5487 but they are written out to the \PDF\ file directly after
5488 the \PDFTEX\ token registers.
5490 %***********************************************************************
5492 \subsection{\luatex{pdf.h}, \luatex{pdf.v}}
5495 These are the \type{h} and \type{v} values that define the current location
5496 on the output page, measured from its lower left corner. The values can be queried
5497 using scaled points as units.
5499 \starttyping
local h = pdf.h
local v = pdf.v
5502 \stoptyping
5504 \subsection{\luatex{pdf.getpos}, \luatex{pdf.gethpos}, \luatex{pdf.getvpos}}
These are the function variants of \type {pdf.h} and \type {pdf.v}. Sometimes
using a function is preferred over a key, so this saves wrapping. Also, these
functions are faster than the key based access, as the \type {h} and \type {v}
keys are not real variables but are looked up using a metatable call. The
\type {getpos} function returns two values, the others return one.
5512 \starttyping
local h, v = pdf.getpos()
5514 \stoptyping
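
The sketch below assumes that it is called from within \tex{latelua}, so
that there is a current location to report. The helper
\type{report_position} is hypothetical; it converts the scaled points to
big points (1bp is 65781.76sp).

\starttyping
local sp_per_bp = 65781.76   -- 65536 * 72.27 / 72

function report_position(label)
  local h, v = pdf.getpos()
  texio.write_nl(label .. ": " .. h / sp_per_bp .. "bp, " ..
                 v / sp_per_bp .. "bp")
end
\stoptyping
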
5516 \subsection{\luatex{pdf.hasmatrix}, \luatex{pdf.getmatrix}}
5518 The current matrix transformation is available via the \type {getmatrix} command,
5519 which returns 6 values: \type {sx}, \type {rx}, \type {ry}, \type {sy}, \type {tx},
5520 and \type {ty}. The \type {hasmatrix} function returns \type {true} when a matrix is
5521 applied.
5523 \starttyping
if pdf.hasmatrix() then
    local sx, rx, ry, sy, tx, ty = pdf.getmatrix()
    -- do something useful or not
end
5532 \subsection{\luatex{pdf.print}}
5534 A print function to write stuff to the \PDF\ document
5535 that can be used from within a \tex{latelua} argument.
5536 This function is not to be used inside \tex{directlua}
5537 unless you know {\it exactly} what you are doing.
5539 \startfunctioncall
5540 pdf.print(<string> s)
5541 pdf.print(<string> type, <string> s)
5542 \stopfunctioncall
5544 The optional parameter can be used to mimic the behavior of
5545 \tex{pdfliteral}: the \type{type} is \type{direct} or \type{page}.
5547 \subsection{\luatex{pdf.immediateobj}}
5549 This function creates a \PDF\ object
5550 and immediately writes it to the \PDF\ file.
5551 It is modelled after \PDFTEX's \tex{immediate}\tex{pdfobj} primitives.
5552 All function variants return the object number
5553 of the newly generated object.
5555 \startfunctioncall
5556 <number> n = pdf.immediateobj(<string> objtext)
5557 <number> n = pdf.immediateobj("file", <string> filename)
5558 <number> n = pdf.immediateobj("stream", <string> streamtext, <string> attrtext)
5559 <number> n = pdf.immediateobj("streamfile", <string> filename, <string> attrtext)
5560 \stopfunctioncall
5562 The first version puts the \type{objtext} raw into an object.
5563 Only the object wrapper is automatically generated,
5564 but any internal structure (like \type{<< >>} dictionary markers)
needs to be provided by the user.
5566 The second version with keyword \type{"file"} as 1st argument
5567 puts the contents of the file with name \type{filename} raw into the object.
5568 The third version with keyword \type{"stream"} creates a stream object
5569 and puts the \type{streamtext} raw into the stream.
5570 The stream length is automatically calculated.
5571 The optional \type{attrtext} goes into the dictionary of that object.
5572 The fourth version with keyword \type{"streamfile"} does the same as the 3rd one,
5573 it just reads the stream data raw from a file.
5575 An optional first argument can be given to make the function use a
5576 previously reserved \PDF\ object.
5578 \startfunctioncall
5579 <number> n = pdf.immediateobj(<integer> n, <string> objtext)
5580 <number> n = pdf.immediateobj(<integer> n, "file", <string> filename)
5581 <number> n = pdf.immediateobj(<integer> n, "stream", <string> streamtext, <string> attrtext)
5582 <number> n = pdf.immediateobj(<integer> n, "streamfile", <string> filename, <string> attrtext)
5583 \stopfunctioncall
5585 %***********************************************************************
5587 \subsection{\luatex{pdf.obj}}
5589 This function creates a \PDF\ object,
5590 which is written to the \PDF\ file only when referenced,
5591 e.\,g., by \luatex{pdf.refobj()}.
5593 All function variants return the object number of the newly generated
5594 object, and there are two separate calling modes.
5596 The first mode is modelled after \PDFTEX's \tex{pdfobj} primitive.
5598 \startfunctioncall
5599 <number> n = pdf.obj(<string> objtext)
5600 <number> n = pdf.obj("file", <string> filename)
5601 <number> n = pdf.obj("stream", <string> streamtext, <string> attrtext)
5602 <number> n = pdf.obj("streamfile", <string> filename, <string> attrtext)
5603 \stopfunctioncall
5605 An optional first argument can be given to make the function use a
5606 previously reserved \PDF\ object.
5608 \startfunctioncall
5609 <number> n = pdf.obj(<integer> n, <string> objtext)
5610 <number> n = pdf.obj(<integer> n, "file", <string> filename)
5611 <number> n = pdf.obj(<integer> n, "stream", <string> streamtext, <string> attrtext)
5612 <number> n = pdf.obj(<integer> n, "streamfile", <string> filename, <string> attrtext)
5613 \stopfunctioncall
5615 The second mode accepts a single argument table with key--value pairs.
5617 \startfunctioncall
5618 <number> n = pdf.obj{ type = <string>,
                      immediate = <boolean>,
5620 objnum = <number>,
5621 attr = <string>,
5622 compresslevel = <number>,
5623 objcompression = <boolean>,
5624 file = <string>,
5625 string = <string>}
5626 \stopfunctioncall
The \type{type} field can have the values \type{raw} and
\type{stream}. This field is required; the others are optional
(within constraints).
5632 Note: this mode makes \type{pdf.obj} look more flexible than it
5633 actually is: the constraints from the separate parameter version
5634 still apply, so for example you can't have both \type{string} and
5635 \type{file} at the same time.
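
As a sketch of the table form (the stream data and the attribute text are
made up for this example), the following creates a stream object and makes
sure that it ends up in the file:

\starttyping
local n = pdf.obj {
  type   = "stream",
  string = "some stream data",
  attr   = "/Type /Demo",
}

-- without a reference the object would never be written out
pdf.refobj(n)
\stoptyping
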
5637 %***********************************************************************
5639 \subsection{\luatex{pdf.refobj}}
5641 This function,
5642 the \LUA\ version of the \tex{pdfrefobj} primitive,
5643 references an object by its object number,
5644 so that the object will be written out.
5646 \startfunctioncall
5647 pdf.refobj(<integer> n)
5648 \stopfunctioncall
5650 This function works in both the \tex{directlua} and \tex{latelua} environment.
Inside \tex{directlua} a new whatsit node
\quote{pdf_refobj} is created, which will be marked for flushing during
page output; the object is then written directly after the page,
at the point where the resource objects are also written out.
Inside \tex{latelua} the object will be marked for flushing.
5657 This function has no return values.
5659 %***********************************************************************
5661 \subsection{\luatex{pdf.reserveobj}}
5663 This function creates an empty \PDF\ object and returns its number.
5665 \startfunctioncall
5666 <number> n = pdf.reserveobj()
5667 <number> n = pdf.reserveobj("annot")
5668 \stopfunctioncall
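
A reserved number is useful when other objects have to refer to an object
before its content is known. A minimal sketch (the dictionary content is
made up for this example):

\starttyping
local n = pdf.reserveobj()

-- the reference "n 0 R" can already be used in other objects
local ref = n .. " 0 R"

-- later on, fill the reserved object and write it out
pdf.immediateobj(n, "<< /Type /Demo /Count 1 >>")
\stoptyping
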
5670 \subsection{\luatex{pdf.registerannot} (new in 0.47.0)}
5672 This function adds an object number to the \type{/Annots} array for the
5673 current page without doing anything else. This function can only be
5674 used from within \type{\latelua}.
5676 \startfunctioncall
5677 pdf.registerannot (<number> objnum)
5678 \stopfunctioncall
5680 \section{The \luatex{pdfscanner} library (new in 0.72.0)}
5682 The \luatex{pdfscanner} library allows interpretation of PDF content streams
5683 and \type{/ToUnicode} (cmap) streams. You can get those streams from the
5684 \luatex{epdf} library, as explained in an earlier section. There is only
5685 a single top|-|level function in this library:
5687 \startfunctioncall
5688 pdfscanner.scan (<Object> stream, <table> operatortable, <table> info)
5689 \stopfunctioncall
5691 The first argument, \type{stream}, should be either a PDF stream
5692 object, or a PDF array of PDF stream objects (those options comprise
5693 the possible return values of \type{<Page>:getContents()}
5694 and \type{<Object>:getStream()} in the \type{epdf} library).
5696 The second argument, \type{operatortable}, should be a Lua table where
5697 the keys are PDF operator name strings and the values are Lua
5698 functions (defined by you) that are used to process those
5699 operators. The functions are called whenever the scanner finds one
5700 of these PDF operators in the content stream(s). The functions are
5701 called with two arguments: the \type{scanner} object itself, and
the \type{info} table that was passed as the third argument
5703 to \type{pdfscanner.scan}.
5705 Internally, \type{pdfscanner.scan} loops over the PDF operators in the
5706 stream(s), collecting operands on an internal stack until it finds a
5707 PDF operator. If that PDF operator's name exists
5708 in \type{operatortable}, then the associated function is
5709 executed. After the function has run (or when there is no function to
5710 execute) the internal operand stack is cleared in preparation for the
5711 next operator, and processing continues.
5713 The \type{scanner} argument to the processing functions is needed
5714 because it offers various methods to get the actual operands from the
internal operand stack. The most important of those methods is
\type{pop}, which is described below.
5718 A simple example of processing a PDF's document stream
5719 could look like this:
5721 \starttyping
function Do (scanner, info)
  local val     = scanner:pop()
  local name    = val[2] -- val[1] == 'name'
  print (info.space ..'Use XObject '.. name)
  local resources  = info.resources
  local xobject    = resources:lookup("XObject"):getDict():lookup(name)
  if (xobject and xobject:isStream()) then
    local dict = xobject:getStream():getDict()
    if dict then
      local name = dict:lookup('Subtype')
      if name:getName() == 'Form' then
        -- scan the nested form with deeper indentation and its own resources
        local newinfo =  { space = info.space .. "  " ,
                           resources = dict:lookup('Resources'):getDict() }
        pdfscanner.scan(xobject, operatortable, newinfo)
      end
    end
  end
end

operatortable = {Do = Do}

doc = epdf.open(arg[1])
pagenum = 1
while pagenum <= doc:getNumPages() do
   local page = doc:getCatalog():getPage(pagenum)
   local info = { space = "  " , resources = page:getResourceDict()}
   print ('Page ' .. pagenum)
   pdfscanner.scan(page:getContents(), operatortable, info)
   pagenum = pagenum + 1
end
5753 This example iterates over all the actual content in the PDF, and
5754 prints out the found XObject names. While the code demonstrates quite
some of the \type{epdf} functions, let's focus on the code that is
specific to \type{pdfscanner} instead.
5758 From the bottom up, the line
5760 \starttyping
5761 pdfscanner.scan(page:getContents(), operatortable, info)
5762 \stoptyping
5764 runs the scanner with the PDF page's top-level content.
5766 The third argument, \type{info}, contains two entries: \type{space} is
5767 used to indent the printed output, and \type{resources} is needed so
5768 that embedded \type{XForms} can find their own content.
The second argument, \type{operatortable}, defines a processing function
5771 for a single PDF operator, \type{Do}.
5773 The function \type{Do} prints the name of the current XObject, and
5774 then starts a new scanner for that object's content stream, under the
5775 condition that the XObject is in fact a \type{/Form}. That nested
scanner is called with a new \type{info} argument with an
updated \type{space} value so that the indentation of the output nicely
nests, and with a new \type{resources} field to help the next
5779 iteration down to properly process any other, embedded XObjects.
Of course, this is not a very useful example in practice, but for the
5782 purpose of demonstrating \type{pdfscanner}, it is just long enough.
5783 It makes use of only one \type{scanner} method: \type{scanner:pop()}.
5784 That function pops the top operand of the internal stack, and returns
a Lua table where the object at index one is a string representing
5786 the type of the operand, and object two is its value.
The list of possible operand types and associated Lua value types is:
5790 \starttabulate[|lT|p|]
5791 \NC integer \NC <number> \NC \NR
5792 \NC real \NC <number> \NC \NR
5793 \NC boolean \NC <boolean> \NC \NR
5794 \NC name \NC <string> \NC \NR
5795 \NC operator \NC <string> \NC \NR
5796 \NC string \NC <string> \NC \NR
5797 \NC array \NC <table> \NC \NR
5798 \NC dict \NC <table> \NC \NR
5799 \stoptabulate
5801 In case of \type{integer} or \type{real}, the value is always
5802 a Lua (floating point) number.
5804 In case of \type{name}, the leading slash is always stripped.
In case of \type{string}, please bear in mind that PDF actually
supports different types of strings (with different encodings) in
different parts of the PDF document, so you may need to reencode some of
the results; \type{pdfscanner} always outputs the byte stream without
reencoding anything. \type{pdfscanner} does not differentiate between
literal strings and hexadecimal strings (the hexadecimal values are
decoded), and it treats the stream data for inline images as a string
that is the single operand for \type{EI}.
5815 In case of \type{array}, the table content is a list of \type{pop}
5816 return values.
5818 In case of \type{dict}, the table keys are PDF name strings
5819 and the values are \type{pop} return values.
5821 \blank
There are a few more methods that \type{scanner} offers:
5825 \starttabulate[|lT|p|]
5826 \NC pop \NC as explained above\NC \NR
5827 \NC popNumber \NC return only the value of a \type{real} or \type{integer}\NC \NR
5828 \NC popName \NC return only the value of a \type{name} \NC \NR
5829 \NC popString \NC return only the value of a \type{string} \NC \NR
5830 \NC popArray \NC return only the value of a \type{array} \NC \NR
5831 \NC popDict \NC return only the value of a \type{dict} \NC \NR
5832 \NC popBool \NC return only the value of a \type{boolean} \NC \NR
5833 \NC done \NC abort further processing of this \type{scan()} call\NC \NR
5834 \stoptabulate
The \type{popXXX} methods are convenience functions, and come in handy when
5837 you know the type of the operands beforehand (which you usually do, in
5838 PDF). For example, the \type{Do} function could have used \type{local
5839 name = scanner:popName()} instead, because the single operand
5840 to the \type{Do} operator is always a PDF name object.
5842 The \type{done} function allows you to abort processing of a stream
5843 once you have learned everything you want to learn. This comes in handy
5844 while parsing \type{/ToUnicode}, because there usually is trailing
garbage that you are not interested in. Without \type{done}, processing
only ends at the end of the stream, possibly wasting CPU cycles.
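
As a minimal sketch of the convenience methods and of \type{done}, an
operator table could be set up as follows. It assumes that handlers are
called with the scanner and the info table, as in the \type{Do} example;
the field \type{info.fonts} and the choice of operators are purely
illustrative and not part of the library.

\starttyping
-- Sketch only: 'info.fonts' is a hypothetical table supplied by the caller.
local operatortable = { }

-- "/F1 12 Tf": operands are popped from the top of the stack,
-- so the size comes off first and the font name second.
operatortable.Tf = function(scanner, info)
    local size = scanner:popNumber()
    local name = scanner:popName()   -- the leading slash is already stripped
    info.fonts[#info.fonts + 1] = { name = name, size = size }
end

-- Stop scanning as soon as the interesting part of a stream is over,
-- here at the 'endcmap' operator of a /ToUnicode stream.
operatortable.endcmap = function(scanner, info)
    scanner:done()
end
\stoptyping
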
5848 \section{The \luatex{status} library}
5850 This contains a number of run|-|time configuration items that
5851 you may find useful in message reporting, as well as an iterator
5852 function that gets all of the names and values as a table.
5854 \startfunctioncall
5855 <table> info = status.list()
5856 \stopfunctioncall
5858 The keys in the table are the known items, the value is the
5859 current value. Almost all of the values in \type{status} are
5860 fetched through a metatable at run|-|time whenever they are
5861 accessed, so you cannot use \type{pairs} on \type{status}, but you
5862 {\it can\/} use \type{pairs} on \type{info}, of course. If you do
5863 not need the full list, you can also ask for a single item by
5864 using its name as an index into \type{status}.
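
The difference between the two access methods can be sketched like this.
The example assumes that \type{texio.write_nl} is used for reporting; the
selection of keys is arbitrary.

\starttyping
-- status.list() returns an ordinary table, so pairs() works on it.
local info = status.list()
for name, value in pairs(info) do
    texio.write_nl("log", name .. " = " .. tostring(value))
end

-- Single items are fetched on demand by indexing status directly.
texio.write_nl("log", "pages so far: " .. tostring(status.total_pages))
texio.write_nl("log", "log file: "     .. tostring(status.log_name))
\stoptyping
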
5866 The current list is:
5868 \starttabulate[|lT|p|]
5869 \NC \ssbf key \NC \bf explanation \NC\NR
5870 \NC pdf_gone\NC written \PDF\ bytes \NC \NR
5871 \NC pdf_ptr\NC not yet written \PDF\ bytes \NC \NR
5872 \NC dvi_gone\NC written \DVI\ bytes \NC \NR
5873 \NC dvi_ptr\NC not yet written \DVI\ bytes \NC \NR
5874 \NC total_pages\NC number of written pages \NC \NR
5875 \NC output_file_name\NC name of the \PDF\ or \DVI\ file \NC \NR
5876 \NC log_name\NC name of the log file \NC \NR
5877 \NC banner\NC terminal display banner \NC \NR
5878 \NC var_used\NC variable (one|-|word) memory in use \NC \NR
5879 \NC dyn_used\NC token (multi|-|word) memory in use \NC \NR
5880 \NC str_ptr\NC number of strings \NC \NR
5881 \NC init_str_ptr\NC number of \INITEX\ strings \NC \NR
5882 \NC max_strings\NC maximum allowed strings \NC \NR
5883 \NC pool_ptr\NC string pool index \NC \NR
5884 \NC init_pool_ptr\NC \INITEX\ string pool index \NC \NR
5885 \NC pool_size\NC current size allocated for string characters \NC \NR
5886 \NC node_mem_usage\NC a string giving insight into currently used nodes\NC\NR
5887 \NC var_mem_max\NC number of allocated words for nodes\NC \NR
5888 \NC fix_mem_max\NC number of allocated words for tokens\NC \NR
5889 \NC fix_mem_end\NC maximum number of used tokens\NC \NR
5890 \NC cs_count\NC number of control sequences \NC \NR
5891 \NC hash_size\NC size of hash \NC \NR
5892 \NC hash_extra\NC extra allowed hash \NC \NR
5893 \NC font_ptr\NC number of active fonts \NC \NR
5894 \NC max_in_stack\NC max used input stack entries \NC \NR
5895 \NC max_nest_stack\NC max used nesting stack entries \NC \NR
5896 \NC max_param_stack\NC max used parameter stack entries \NC \NR
5897 \NC max_buf_stack\NC max used buffer position \NC \NR
5898 \NC max_save_stack\NC max used save stack entries \NC \NR
5899 \NC stack_size\NC input stack size \NC \NR
5900 \NC nest_size\NC nesting stack size \NC \NR
5901 \NC param_size\NC parameter stack size \NC \NR
5902 \NC buf_size\NC current allocated size of the line buffer \NC \NR
5903 \NC save_size\NC save stack size \NC \NR
5904 \NC obj_ptr\NC max \PDF\ object pointer \NC \NR
5905 \NC obj_tab_size\NC \PDF\ object table size \NC \NR
5906 \NC pdf_os_cntr\NC max \PDF\ object stream pointer \NC \NR
5907 \NC pdf_os_objidx\NC \PDF\ object stream index \NC \NR
5908 \NC pdf_dest_names_ptr\NC max \PDF\ destination pointer \NC \NR
5909 \NC dest_names_size\NC \PDF\ destination table size \NC \NR
5910 \NC pdf_mem_ptr\NC max \PDF\ memory used \NC \NR
5911 \NC pdf_mem_size\NC \PDF\ memory size \NC \NR
5912 \NC largest_used_mark\NC max referenced marks class \NC \NR
5913 \NC filename\NC name of the current input file \NC \NR
5914 \NC inputid\NC numeric id of the current input \NC \NR
5915 \NC linenumber\NC location in the current input file\NC \NR
5916 \NC lasterrorstring\NC last error string\NC \NR
5917 \NC luabytecodes\NC number of active \LUA\ bytecode registers\NC \NR
5918 \NC luabytecode_bytes\NC number of bytes in \LUA\ bytecode registers\NC \NR
5919 \NC luastate_bytes\NC number of bytes in use by \LUA\ interpreters\NC \NR
5920 \NC output_active\NC \type{true} if the \tex{output} routine is active\NC \NR
5921 \NC callbacks\NC total number of executed callbacks so far\NC \NR
5922 \NC indirect_callbacks\NC number of those that were themselves
5923 a result of other callbacks (e.g. file readers)\NC \NR
5924 \NC luatex_svn\NC the luatex repository id (added in 0.51)\NC\NR
5925 \NC luatex_version\NC the luatex version number (added in 0.38)\NC\NR
5926 \NC luatex_revision\NC the luatex revision string (added in 0.38)\NC\NR
5927 \NC ini_version\NC \type{true} if this is an \INITEX\ run (added in 0.38)\NC\NR
5928 \stoptabulate
5931 \section{The \luatex{tex} library}
5933 The \luatex{tex} table contains a large list of virtual internal \TEX\
5934 parameters that are partially writable.
5936 The designation \quote{virtual} means that these items are not properly
5937 defined in \LUA, but are only front\-ends that are handled by a metatable
5938 that operates on the actual \TEX\ values. As a result, most of the \LUA\
5939 table operators (like \type{pairs} and \type{#}) do not work on such
5940 items.
5942 At the moment, it is possible to access almost every parameter
5943 that has these characteristics:
5945 \startitemize[packed]
5946 \item You can use it after \tex{the}
5947 \item It is a single token.
5948 \item Some special others, see the list below
5949 \stopitemize
5951 This excludes parameters that need extra arguments, like
5952 \tex{the}\tex{scriptfont}.
The subset comprising simple integer and dimension registers is
writable as well as readable (parameters like \tex{tracingcommands} and
\tex{parindent}).
5958 \subsection{Internal parameter values}
5960 For all the parameters in this section, it is possible to access them
5961 directly using their names as index in the \type{tex} table, or by
5962 using one of the functions \type{tex.get()} and \type{tex.set()}.
5964 The exact parameters and return values differ depending on the actual
5965 parameter, and so does whether \type{tex.set} has any effect. For the
5966 parameters that {\it can\/} be set, it is possible to use
5967 \type{'global'} as the first argument to \type{tex.set}; this makes
5968 the assignment global instead of local.
5970 \startfunctioncall
5971 tex.set (<string> n, ...)
5972 tex.set ('global', <string> n, ...)
5973 ... = tex.get (<string> n)
5974 \stopfunctioncall
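
A short sketch of both access styles, using the integer parameter
\tex{tolerance} as an arbitrary example:

\starttyping
-- Local assignment: undone at the end of the current TeX group.
tex.set("tolerance", 500)

-- Global assignment: 'global' comes first.
tex.set("global", "tolerance", 500)

-- Reading, either through the function or by indexing the tex table.
local a = tex.get("tolerance")
local b = tex.tolerance

-- Writing by indexing is equivalent to the local tex.set() form.
tex.tolerance = 800
\stoptyping
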
5976 \subsubsection{Integer parameters}
5978 The integer parameters accept and return \LUA\ numbers.
Read|-|write:
5982 \startcolumns[n=2]
5983 \starttyping
5984 tex.adjdemerits
5985 tex.binoppenalty
5986 tex.brokenpenalty
5987 tex.catcodetable
5988 tex.clubpenalty
5989 tex.day
5990 tex.defaulthyphenchar
5991 tex.defaultskewchar
5992 tex.delimiterfactor
5993 tex.displaywidowpenalty
5994 tex.doublehyphendemerits
5995 tex.endlinechar
5996 tex.errorcontextlines
5997 tex.escapechar
5998 tex.exhyphenpenalty
5999 tex.fam
6000 tex.finalhyphendemerits
6001 tex.floatingpenalty
6002 tex.globaldefs
6003 tex.hangafter
6004 tex.hbadness
6005 tex.holdinginserts
6006 tex.hyphenpenalty
6007 tex.interlinepenalty
6008 tex.language
6009 tex.lastlinefit
6010 tex.lefthyphenmin
6011 tex.linepenalty
6012 tex.localbrokenpenalty
6013 tex.localinterlinepenalty
6014 tex.looseness
6015 tex.mag
6016 tex.maxdeadcycles
6017 tex.month
6018 tex.newlinechar
6019 tex.outputpenalty
6020 tex.pausing
6021 tex.pdfadjustspacing
6022 tex.pdfcompresslevel
6023 tex.pdfdecimaldigits
6024 tex.pdfgamma
6025 tex.pdfgentounicode
6026 tex.pdfimageapplygamma
6027 tex.pdfimagegamma
6028 tex.pdfimagehicolor
6029 tex.pdfimageresolution
6030 tex.pdfinclusionerrorlevel
6031 tex.pdfminorversion
6032 tex.pdfobjcompresslevel
6033 tex.pdfoutput
6034 tex.pdfpagebox
6035 tex.pdfpkresolution
6036 tex.pdfprotrudechars
6037 tex.pdftracingfonts
6038 tex.pdfuniqueresname
6039 tex.postdisplaypenalty
6040 tex.predisplaydirection
6041 tex.predisplaypenalty
6042 tex.pretolerance
6043 tex.relpenalty
6044 tex.righthyphenmin
6045 tex.savinghyphcodes
6046 tex.savingvdiscards
6047 tex.showboxbreadth
6048 tex.showboxdepth
6049 tex.time
6050 tex.tolerance
6051 tex.tracingassigns
6052 tex.tracingcommands
6053 tex.tracinggroups
6054 tex.tracingifs
6055 tex.tracinglostchars
6056 tex.tracingmacros
6057 tex.tracingnesting
6058 tex.tracingonline
6059 tex.tracingoutput
6060 tex.tracingpages
6061 tex.tracingparagraphs
6062 tex.tracingrestores
6063 tex.tracingscantokens
6064 tex.tracingstats
6065 tex.uchyph
6066 tex.vbadness
6067 tex.widowpenalty
6068 tex.year
6069 \stoptyping
6070 \stopcolumns
6072 Read|-|only:
6074 \startcolumns[n=3]
6075 \starttyping
6076 tex.deadcycles
6077 tex.insertpenalties
6078 tex.parshape
6079 tex.prevgraf
6080 tex.spacefactor
6081 \stoptyping
6082 \stopcolumns
6084 \subsubsection{Dimension parameters}
6086 The dimension parameters accept \LUA\ numbers (signifying scaled points)
6087 or strings (with included dimension). The result is always a number in
6088 scaled points.
6090 Read|-|write:
6092 \startcolumns[n=3]
6093 \starttyping
6094 tex.boxmaxdepth
6095 tex.delimitershortfall
6096 tex.displayindent
6097 tex.displaywidth
6098 tex.emergencystretch
6099 tex.hangindent
6100 tex.hfuzz
6101 tex.hoffset
6102 tex.hsize
6103 tex.lineskiplimit
6104 tex.mathsurround
6105 tex.maxdepth
6106 tex.nulldelimiterspace
6107 tex.overfullrule
6108 tex.pagebottomoffset
6109 tex.pageheight
6110 tex.pageleftoffset
6111 tex.pagerightoffset
6112 tex.pagetopoffset
6113 tex.pagewidth
6114 tex.parindent
6115 tex.pdfdestmargin
6116 tex.pdfeachlinedepth
6117 tex.pdfeachlineheight
6118 tex.pdffirstlineheight
6119 tex.pdfhorigin
6120 tex.pdflastlinedepth
6121 tex.pdflinkmargin
6122 tex.pdfpageheight
6123 tex.pdfpagewidth
6124 tex.pdfpxdimen
6125 tex.pdfthreadmargin
6126 tex.pdfvorigin
6127 tex.predisplaysize
6128 tex.scriptspace
6129 tex.splitmaxdepth
6130 tex.vfuzz
6131 tex.voffset
6132 tex.vsize
6133 \stoptyping
6134 \stopcolumns
6136 Read|-|only:
6138 \startcolumns[n=3]
6139 \starttyping
6140 tex.pagedepth
6141 tex.pagefilllstretch
6142 tex.pagefillstretch
6143 tex.pagefilstretch
6144 tex.pagegoal
6145 tex.pageshrink
6146 tex.pagestretch
6147 tex.pagetotal
6148 tex.prevdepth
6149 \stoptyping
6150 \stopcolumns
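
The following sketch illustrates the two accepted input forms and the
numeric result; the chosen parameters and values are only examples.
Recall that one point equals 65536 scaled points.

\starttyping
-- A number is taken to be in scaled points: this is 20pt.
tex.parindent = 20 * 65536

-- A string carries its own unit.
tex.hsize = "300pt"

-- Reading always gives a number in scaled points.
local width = tex.hsize

-- Read-only page parameters can be combined freely on the Lua side.
local room = tex.pagegoal - tex.pagetotal
texio.write_nl("log", "hsize = " .. width .. "sp, room left = " .. room .. "sp")
\stoptyping
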
6152 \subsubsection{Direction parameters}
6154 The direction parameters are read|-|only and return a \LUA\ string.
6156 \startcolumns[n=3]
6157 \starttyping
6158 tex.bodydir
6159 tex.mathdir
6160 tex.pagedir
6161 tex.pardir
6162 tex.textdir
6163 \stoptyping
6164 \stopcolumns
6166 \subsubsection{Glue parameters}
6168 The glue parameters accept and return a userdata object that
6169 represents a \type{glue_spec} node.
6171 \startcolumns[n=3]
6172 \starttyping
6173 tex.abovedisplayshortskip
6174 tex.abovedisplayskip
6175 tex.baselineskip
6176 tex.belowdisplayshortskip
6177 tex.belowdisplayskip
6178 tex.leftskip
6179 tex.lineskip
6180 tex.parfillskip
6181 tex.parskip
6182 tex.rightskip
6183 tex.spaceskip
6184 tex.splittopskip
6185 tex.tabskip
6186 tex.topskip
6187 tex.xspaceskip
6188 \stoptyping
6189 \stopcolumns
6191 \subsubsection{Muglue parameters}
6193 All muglue parameters are to be used read|-|only and return a \LUA\ string.
6195 \startcolumns[n=3]
6196 \starttyping
6197 tex.medmuskip
6198 tex.thickmuskip
6199 tex.thinmuskip
6200 \stoptyping
6201 \stopcolumns
6203 \subsubsection{Tokenlist parameters}
6205 The tokenlist parameters accept and return \LUA\ strings. \LUA\ strings are
6206 converted to and from token lists using \tex{the}\tex{toks} style
6207 expansion: all category codes are either space (10) or other (12).
6208 It follows that assigning to some of these, like \quote{tex.output},
6209 is actually useless, but it feels bad to make exceptions in view
6210 of a coming extension that will accept full-blown token strings.
6212 \startcolumns[n=3]
6213 \starttyping
6214 tex.errhelp
6215 tex.everycr
6216 tex.everydisplay
6217 tex.everyeof
6218 tex.everyhbox
6219 tex.everyjob
6220 tex.everymath
6221 tex.everypar
6222 tex.everyvbox
6223 tex.output
6224 tex.pdfpageattr
6225 tex.pdfpageresources
6226 tex.pdfpagesattr
6227 tex.pdfpkmode
6228 \stoptyping
6229 \stopcolumns
6232 \subsection{Convert commands}
6234 All \quote{convert} commands are read|-|only and return a \LUA\ string.
6235 The supported commands at this moment are:
6237 \startcolumns[n=2]
6238 \starttyping
6239 tex.eTeXVersion
6240 tex.eTeXrevision
6241 tex.formatname
6242 tex.jobname
6243 tex.luatexrevision
6244 tex.pdfnormaldeviate
6245 tex.pdftexbanner
6246 tex.pdftexrevision
6247 tex.fontname(number)
6248 tex.pdffontname(number)
6249 tex.pdffontobjnum(number)
6250 tex.pdffontsize(number)
6251 tex.uniformdeviate(number)
6252 tex.number(number)
6253 tex.romannumeral(number)
6254 tex.pdfpageref(number)
6255 tex.pdfxformname(number)
6256 tex.fontidentifier(number)
6257 \stoptyping
6258 \stopcolumns
If you are wondering why this list looks haphazard: these are all the
cases of the \quote{convert} internal command that do not require an
argument, as well as the ones that require only a simple numeric
value.
6265 The special (lua-only) case of \type{tex.fontidentifier} returns the
6266 \type{csname} string that matches a font id number (if there is one).
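
A small sketch of the two calling forms. It assumes that
\type{font.current()} from the \type{font} library is used to obtain a
valid font id; everything is converted with \type{tostring} because
\type{tex.fontidentifier} need not find a match.

\starttyping
-- Commands without an argument are read like fields.
local job = tex.jobname

-- Commands with a numeric argument are called like functions.
local roman = tex.romannumeral(2014)        -- a string, as \romannumeral gives
local id    = font.current()
local name  = tex.fontname(id)
local cs    = tex.fontidentifier(id)

texio.write_nl("log", job .. ": " .. roman .. ", font " ..
    tostring(name) .. " is known as " .. tostring(cs))
\stoptyping
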
6268 \subsection{Last item commands}
6270 All \quote{last item} commands are read|-|only and return a number.
6272 The supported commands at this moment are:
6274 \startcolumns[n=3]
6275 \starttyping
6276 tex.lastpenalty
6277 tex.lastkern
6278 tex.lastskip
6279 tex.lastnodetype
6280 tex.inputlineno
6281 tex.pdftexversion
6282 tex.pdflastobj
6283 tex.pdflastxform
6284 tex.pdflastximage
6285 tex.pdflastximagepages
6286 tex.pdflastannot
6287 tex.pdflastxpos
6288 tex.pdflastypos
6289 tex.pdfrandomseed
6290 tex.pdflastlink
6291 tex.luatexversion
6292 tex.eTeXminorversion
6293 tex.eTeXversion
6294 tex.currentgrouplevel
6295 tex.currentgrouptype
6296 tex.currentiflevel
6297 tex.currentiftype
6298 tex.currentifbranch
6299 tex.pdflastximagecolordepth
6300 \stoptyping
6301 \stopcolumns
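
Because these items are plain numbers, they are convenient for
diagnostic messages. A possible use, with an arbitrary message format:

\starttyping
texio.write_nl("log", string.format(
    "line %d, group level %d, conditional level %d",
    tex.inputlineno, tex.currentgrouplevel, tex.currentiflevel))
\stoptyping
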
6303 \subsection{Attribute, count, dimension, skip and token registers}
6305 \TEX's attributes (\tex{attribute}), counters (\tex{count}),
6306 dimensions (\tex{dimen}), skips (\tex{skip}) and token (\tex{toks})
6307 registers can be accessed and written to using two times five virtual
6308 sub|-|tables of the \luatex{tex} table:
6310 \startcolumns[n=3]
6311 \starttyping
6312 tex.attribute
6313 tex.count
6314 tex.dimen
6315 tex.skip
6316 tex.toks
6317 \stoptyping
6318 \stopcolumns
6320 It is possible to use the names of relevant \tex{attributedef}, \tex{countdef},
6321 \tex{dimendef}, \tex{skipdef}, or \tex{toksdef} control sequences as indices
6322 to these tables:
6324 \starttyping
6325 tex.count.scratchcounter = 0
6326 enormous = tex.dimen['maxdimen']
6327 \stoptyping
6329 In this case, \LUATEX\ looks up the value for you on the fly. You have
6330 to use a valid \tex{countdef} (or \tex{attributedef}, or
6331 \tex{dimendef}, or \tex{skipdef}, or \tex{toksdef}), anything else
6332 will generate an error (the intent is to eventually also allow
6333 \type{<chardef tokens>} and even macros that expand into a number).
6335 The attribute and count registers accept and return \LUA\ numbers.
6337 The dimension registers accept \LUA\ numbers (in scaled points) or
6338 strings (with an included absolute dimension; \type {em} and \type {ex} and \type {px}
6339 are forbidden). The result is always a number in scaled points.
6341 The token registers accept and return \LUA\ strings. \LUA\ strings are
6342 converted to and from token lists using \tex{the}\tex{toks} style
6343 expansion: all category codes are either space (10) or other (12).
6345 The skip registers accept and return \type{glue_spec} userdata node
6346 objects (see the description of the node interface elsewhere in this
6347 manual).
6349 As an alternative to array addressing, there are also accessor
6350 functions defined for all cases, for example, here is the set
6351 of possibilities for \type{\skip} registers:
6353 \startfunctioncall
6354 tex.setskip (<number> n, <node> s)
6355 tex.setskip (<string> s, <node> s)
6356 tex.setskip ('global',<number> n, <node> s)
6357 tex.setskip ('global',<string> s, <node> s)
6358 <node> s = tex.getskip (<number> n)
6359 <node> s = tex.getskip (<string> s)
6360 \stopfunctioncall
6362 In the function-based interface, it is possible to define values
6363 globally by using the string \type{'global'} as the first function argument.
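
The same pattern applies to the other register types. The following
sketch uses arbitrary register numbers and assumes that the accessor
functions for counters are named \type{tex.setcount} and
\type{tex.getcount}, in line with the \type{\skip} variants shown above.

\starttyping
-- Array style: local assignments.
tex.count[100] = 42
tex.dimen[0]   = "12.5pt"      -- or a number in scaled points
tex.toks[0]    = "some text"

-- Function style: 'global' as first argument makes the assignment global.
tex.setcount("global", 100, 42)

-- Reading back.
local n = tex.getcount(100)    -- a number
local d = tex.dimen[0]         -- a number in scaled points
local t = tex.toks[0]          -- a string
\stoptyping
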
6365 \subsection{Character code registers (0.63)}
6367 \TEX's character code tables (\tex{lccode}, \tex{uccode},
6368 \tex{sfcode}, \tex{catcode}, \tex{mathcode}, \tex{delcode}) can be
accessed and written to using six virtual subtables of the \type{tex}
table:
6372 \startcolumns[n=3]
6373 \starttyping
6374 tex.lccode
6375 tex.uccode
6376 tex.sfcode
6377 tex.catcode
6378 tex.mathcode
6379 tex.delcode
6380 \stoptyping
6381 \stopcolumns
6383 The function call interfaces are roughly as above, but there are a few twists.
6384 \type{sfcode}s are the simple ones:
6386 \startfunctioncall
6387 tex.setsfcode (<number> n, <number> s)
6388 tex.setsfcode ('global', <number> n, <number> s)
6389 <number> s = tex.getsfcode (<number> n)
6390 \stopfunctioncall
6392 The function call interface for \type{lccode} and \type{uccode} additionally allows you to set the associated sibling at the same time:
6394 \startfunctioncall
6395 tex.setlccode (['global'], <number> n, <number> lc)
6396 tex.setlccode (['global'], <number> n, <number> lc, <number> uc)
6397 <number> lc = tex.getlccode (<number> n)
6398 tex.setuccode (['global'], <number> n, <number> uc)
6399 tex.setuccode (['global'], <number> n, <number> uc, <number> lc)
6400 <number> uc = tex.getuccode (<number> n)
6401 \stopfunctioncall
6403 The function call interface for \type{catcode} also allows you to
6404 specify a category table to use on assignment or on query (default in
6405 both cases is the current one):
6407 \startfunctioncall
6408 tex.setcatcode (['global'], <number> n, <number> c)
6409 tex.setcatcode (['global'], <number> cattable, <number> n, <number> c)
<number> c = tex.getcatcode (<number> n)
<number> c = tex.getcatcode (<number> cattable, <number> n)
6412 \stopfunctioncall
6415 The interfaces for \type{delcode} and \type{mathcode} use small array tables to
6416 set and retrieve values:
6418 \startfunctioncall
6419 tex.setmathcode (['global'], <number> n, <table> mval )
6420 <table> mval = tex.getmathcode (<number> n)
6421 tex.setdelcode (['global'], <number> n, <table> dval )
6422 <table> dval = tex.getdelcode (<number> n)
6423 \stopfunctioncall
6425 Where the table for \type{mathcode} is an array of 3 numbers, like this:
6427 \starttyping
6428 {<number> mathclass, <number> family, <number> character}
6429 \stoptyping
6431 And the table for \type{delcode} is an array with 4 numbers, like this:
6433 \starttyping
6434 {<number> small_fam, <number> small_char, <number> large_fam, <number> large_char}
6435 \stoptyping
6437 Normally, the third and fourth values in a delimiter code assignment
6438 will be zero according to \tex{Udelcode} usage, but the returned table can have
6439 values there (if the delimiter code was set using \type{\delcode}, for
example). Unset \type{delcode}s can be recognized because
\type{dval[1]} is $-1$.
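
A sketch that exercises each of these interfaces; the character codes
are only examples.

\starttyping
-- Make '@' (64) a letter in the current catcode table, then read it back.
tex.setcatcode(64, 11)
local cc = tex.getcatcode(64)

-- Set the lccode of U+00C9 and, in the same call, its uccode sibling.
tex.setlccode(0xC9, 0xE9, 0xC9)

-- Math codes come back as a three-element array.
local m = tex.getmathcode(43)               -- '+'
local mathclass, family, character = m[1], m[2], m[3]

-- Delimiter codes come back as a four-element array.
local d = tex.getdelcode(40)                -- '('
if d[1] == -1 then
    texio.write_nl("log", "no delcode set for character 40")
end
\stoptyping
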
6443 \subsection{Box registers}
6445 It is possible to set and query actual boxes, using the node
6446 interface as defined in the \luatex{node} library:
6448 \starttyping
6449 tex.box
6450 \stoptyping
6452 for array access, or
6454 \starttyping
6455 tex.setbox(<number> n, <node> s)
6456 tex.setbox(<string> cs, <node> s)
6457 tex.setbox('global', <number> n, <node> s)
6458 tex.setbox('global', <string> cs, <node> s)
6459 <node> n = tex.getbox(<number> n)
6460 <node> n = tex.getbox(<string> cs)
6461 \stoptyping
6463 for function|-|based access.
6464 In the function-based interface, it is possible to define values
6465 globally by using the string \type{'global'} as the first function argument.
6467 Be warned that an assignment like
6469 \starttyping
6470 tex.box[0] = tex.box[2]
6471 \stoptyping
does not copy the node list, it just duplicates a node pointer. If
\tex{box2} is cleared by \TEX\ commands later on, the contents
of \tex{box0} become invalid as well. To prevent this from
happening, always use \luatex{node.copy_list()} unless you are
assigning to a temporary variable:
6479 \starttyping
6480 tex.box[0] = node.copy_list(tex.box[2])
6481 \stoptyping
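
Since a box register holds a node (or nothing at all when the box is
void), its dimensions can be read from the node fields. A cautious
sketch, with arbitrary box numbers:

\starttyping
-- Inspect box 0 if it is not void.
local b = tex.box[0]
if b then
    texio.write_nl("log", string.format("box0: wd=%dsp ht=%dsp dp=%dsp",
        b.width, b.height, b.depth))
end

-- Copy box 2 into box 0 globally, using the function interface.
local source = tex.box[2]
if source then
    tex.setbox("global", 0, node.copy_list(source))
end
\stoptyping
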
6483 %{\bf note: In previous versions of \LUATEX\ there were also three
6484 %virtual tables called \type{tex.wd}, \type{tex.ht}, and \type{tex.dp}
6485 %along with an associated function call interface. These were
6486 %removed in version 0.63. You should switch to using \type{tex.box[].width}
6487 %etc. instead.}
6489 %If for some reason you want the functionality of these tables back,
6490 %you can add \LUA\ code to do that for you, like this:
6492 %\starttyping
6493 %local box = tex.box
6495 %local wd = {
6496 % __index = function(t,k) local bk = box[k] return bk and bk.width or 0 end,
6497 % __newindex = function(t,k,v) local bk = box[k] if bk then bk.width = v end end,
6499 %local ht = {
6500 % __index = function(t,k) local bk = box[k] return bk and bk.height or 0 end,
6501 % __newindex = function(t,k,v) local bk = box[k] if bk then bk.height = v end end,
6503 %local dp = {
6504 % __index = function(t,k) local bk = box[k] return bk and bk.depth or 0 end,
6505 % __newindex = function(t,k,v) local bk = box[k] if bk then bk.depth = v end end,
6508 %tex.wd = { } setmetatable(tex.wd,wd)
6509 %tex.ht = { } setmetatable(tex.ht,ht)
6510 %tex.dp = { } setmetatable(tex.dp,dp)
6511 %\stoptyping
6514 \subsection{Math parameters}
6516 It is possible to set and query the internal math parameters
6517 using:
6519 \startfunctioncall
6520 tex.setmath(<string> n, <string> t, <number> n)
6521 tex.setmath('global', <string> n, <string> t, <number> n)
6522 <number> n = tex.getmath(<string> n, <string> t)
6523 \stopfunctioncall
As before, an optional first parameter \type{'global'} indicates a
global assignment.
6528 The first string is the parameter name minus the leading \quote{Umath},
6529 and the second string is the style name minus the trailing \quote{style}.
6531 Just to be complete, the values for the math parameter name are:
6533 \starttyping
6534 quad axis operatorsize
6535 overbarkern overbarrule overbarvgap
6536 underbarkern underbarrule underbarvgap
6537 radicalkern radicalrule radicalvgap
6538 radicaldegreebefore radicaldegreeafter radicaldegreeraise
6539 stackvgap stacknumup stackdenomdown
6540 fractionrule fractionnumvgap fractionnumup
6541 fractiondenomvgap fractiondenomdown fractiondelsize
6542 limitabovevgap limitabovebgap limitabovekern
6543 limitbelowvgap limitbelowbgap limitbelowkern
6544 underdelimitervgap underdelimiterbgap
6545 overdelimitervgap overdelimiterbgap
6546 subshiftdrop supshiftdrop subshiftdown
6547 subsupshiftdown subtopmax supshiftup
6548 supbottommin supsubbottommax subsupvgap
6549 spaceafterscript connectoroverlapmin
6550 ordordspacing ordopspacing ordbinspacing ordrelspacing
6551 ordopenspacing ordclosespacing ordpunctspacing ordinnerspacing
6552 opordspacing opopspacing opbinspacing oprelspacing
6553 opopenspacing opclosespacing oppunctspacing opinnerspacing
6554 binordspacing binopspacing binbinspacing binrelspacing
6555 binopenspacing binclosespacing binpunctspacing bininnerspacing
6556 relordspacing relopspacing relbinspacing relrelspacing
6557 relopenspacing relclosespacing relpunctspacing relinnerspacing
6558 openordspacing openopspacing openbinspacing openrelspacing
6559 openopenspacing openclosespacing openpunctspacing openinnerspacing
6560 closeordspacing closeopspacing closebinspacing closerelspacing
6561 closeopenspacing closeclosespacing closepunctspacing closeinnerspacing
6562 punctordspacing punctopspacing punctbinspacing punctrelspacing
6563 punctopenspacing punctclosespacing punctpunctspacing punctinnerspacing
6564 innerordspacing inneropspacing innerbinspacing innerrelspacing
6565 inneropenspacing innerclosespacing innerpunctspacing innerinnerspacing
6566 \stoptyping
6568 The values for the style parameter name are:
6570 \starttyping
6571 display crampeddisplay
6572 text crampedtext
6573 script crampedscript
6574 scriptscript crampedscriptscript
6575 \stoptyping
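
For instance, the parameter \type{\Umathaxis} in \type{\textstyle} is
addressed as \type{"axis"} and \type{"text"}. The adjustments below are
arbitrary and only illustrate the calling convention; values are in
scaled points.

\starttyping
-- Read the axis height used in text style.
local axis = tex.getmath("axis", "text")

-- Raise it locally by 6554sp (about 0.1pt).
tex.setmath("axis", "text", axis + 6554)

-- Set the display style quad globally to 10pt.
tex.setmath("global", "quad", "display", 10 * 65536)
\stoptyping
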
6578 \subsection{Special list heads}
6580 The virtual table \luatex{tex.lists} contains the set of internal
6581 registers that keep track of building page lists.
6584 \starttabulate[|lT|p|]
6585 \NC \bf field \NC \bf description \NC \NR
6586 \NC page_ins_head \NC circular list of pending insertions \NC \NR
6587 \NC contrib_head \NC the recent contributions \NC \NR
6588 \NC page_head \NC the current page content\NC \NR
6589 %\NC temp_head \NC \NC \NR
6590 \NC hold_head \NC used for held-over items for next page\NC \NR
6591 \NC adjust_head \NC head of the current \tex{vadjust} list \NC \NR
6592 \NC pre_adjust_head \NC head of the current \tex{vadjust pre} list\NC \NR
6593 % \NC align_head \NC \NC \NR
6594 \stoptabulate
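
These heads are ordinary node lists, so the functions from the
\luatex{node} library apply. A minimal sketch that counts the nodes
waiting in the contribution list:

\starttyping
local count = 0
local head  = tex.lists.contrib_head
if head then
    for _ in node.traverse(head) do
        count = count + 1
    end
end
texio.write_nl("log", "recent contributions: " .. count .. " nodes")
\stoptyping
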
6596 \subsection{Semantic nest levels (0.51)}
6598 The virtual table \luatex{tex.nest} contains the currently active
6599 semantic nesting state. It has two main parts: a zero-based array of
6600 userdata for the semantic nest itself, and the numerical value
6601 \type{tex.nest.ptr}, which gives the highest available index. Neither
6602 the array items in \type{tex.nest[]} nor \type{tex.nest.ptr} can be
6603 assigned to (as this would confuse the typesetting engine beyond
6604 repair), but you can assign to the individual values inside the array
6605 items, e.g. \type{tex.nest[tex.nest.ptr].prevdepth}.
6607 \type{tex.nest[tex.nest.ptr]} is the current nest state, \type{tex.nest[0]}
6608 the outermost (main vertical list) level.
6610 The known fields are:
6612 \starttabulate[|lT|l|l|p|]
6613 \NC \ssbf key \NC \bf type \NC \bf modes \NC \bf explanation \NC\NR
6614 \NC mode \NC number \NC all \NC The current mode. This is a number representing the
6615 main mode at this level:\crlf
0 = no mode (this happens during \type{\write}),\crlf
1 = vertical,\crlf
127 = horizontal,\crlf
253 = display math,\crlf
$-1$ = internal vertical,\crlf
$-127$ = restricted horizontal,\crlf
$-253$ = inline math.\NC\NR
\NC modeline \NC number \NC all \NC source input line where this mode was entered,
negative inside the output routine.\NC\NR
6625 \NC head \NC node \NC all \NC the head of the current list\NC\NR
6626 \NC tail \NC node \NC all \NC the tail of the current list\NC\NR
6627 \NC prevgraf \NC number \NC vmode \NC number of lines in the previous paragraph\NC\NR
6628 \NC prevdepth \NC number \NC vmode \NC depth of the previous paragraph (equal to \type{\pdfignoreddimen}
6629 when it is to be ignored)\NC\NR
6630 \NC spacefactor \NC number \NC hmode \NC the current space factor\NC\NR
6631 \NC dirs \NC node \NC hmode \NC used for temporary storage by the line break algorithm\NC\NR
6632 \NC noad \NC node \NC mmode \NC used for temporary storage of a pending fraction numerator,
6633 for \type{\over} etc.\NC\NR
6634 \NC delimptr \NC node \NC mmode \NC used for temporary storage of the previous math delimiter,
6635 for \type{\middle}.\NC\NR
6636 \NC mathdir \NC boolean \NC mmode \NC true when during math processing the \type{\mathdir} is not
6637 the same as the surrounding \type{\textdir}\NC\NR
6638 \NC mathstyle \NC number \NC mmode \NC the current \type{\mathstyle} \NC\NR
6639 \stoptabulate
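
A sketch of how the current level might be inspected. The lookup table
\type{modenames} is a helper invented for this example; the numbers are
the mode values listed above.

\starttyping
-- Hypothetical helper: map mode numbers to readable names.
local modenames = {
    [0]    = "no mode",
    [1]    = "vertical",          [-1]   = "internal vertical",
    [127]  = "horizontal",        [-127] = "restricted horizontal",
    [253]  = "display math",      [-253] = "inline math",
}

local top = tex.nest[tex.nest.ptr]
texio.write_nl("log", "level " .. tex.nest.ptr .. ": " ..
    (modenames[top.mode] or "unknown") ..
    ", entered at line " .. top.modeline)

-- Fields inside a level are assignable, the level itself is not.
if top.mode == 1 or top.mode == -1 then
    top.prevdepth = 0
end
\stoptyping
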
6642 \subsection[sec:luaprint]{Print functions}
6644 The \luatex{tex} table also contains the three print functions that
6645 are the major interface from \LUA\ scripting to \TEX.
6647 The arguments to these three functions are all stored in an in|-|memory
6648 virtual file that is fed to the \TEX\ scanner as the result of the
6649 expansion of \tex{directlua}.
6651 The total amount of returnable text from a \tex{directlua} command
6652 is only limited by available system \RAM. However, each separate
6653 printed string has to fit completely in \TEX's input buffer.
6655 The result of using these functions from inside callbacks is undefined
6656 at the moment.
6658 \subsubsection{\luatex{tex.print}}
6660 \startfunctioncall
6661 tex.print(<string> s, ...)
6662 tex.print(<number> n, <string> s, ...)
6663 tex.print(<table> t)
6664 tex.print(<number> n, <table> t)
6665 \stopfunctioncall
6667 Each string argument is treated by \TEX\ as a separate input line.
6668 If there is a table argument instead of a list of strings, this has to
6669 be a consecutive array of strings to print (the first non-string value
6670 will stop the printing process). This syntax was added in 0.36.
6672 The optional parameter can be used to print the strings using the
6673 catcode regime defined by \tex{catcodetable}~\type{n}. If \type{n} is
6674 $-1$, the currently active catcode regime is used. If \type{n} is
6675 $-2$, the resulting catcodes are the result of \type{\the\toks}: all
6676 category codes are 12 (other) except for the space character, that has
6677 category code 10 (space). Otherwise, if \type{n} is not
6678 a valid catcode table, then it is ignored, and the currently
6679 active catcode regime is used instead.
6681 The very last string of the very last \luatex{tex.print()} command in a
6682 \tex{directlua} will not have the \tex{endlinechar} appended, all
6683 others do.
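
The calling forms can be sketched as follows; the strings are arbitrary.

\starttyping
-- Two separate input lines under the current catcode regime.
tex.print("first line", "second line")

-- The same, with the lines collected in an array.
tex.print({ "third line", "fourth line" })

-- With -2 every character is 'other' (or 'space'), so the percent sign
-- and the ampersand lose their special meaning.
tex.print(-2, "100% literal & harmless")
\stoptyping
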
6685 \subsubsection{\luatex{tex.sprint}}
6687 \startfunctioncall
6688 tex.sprint(<string> s, ...)
6689 tex.sprint(<number> n, <string> s, ...)
6690 tex.sprint(<table> t)
6691 tex.sprint(<number> n, <table> t)
6692 \stopfunctioncall
6694 Each string argument is treated by \TEX\ as a special kind of input line
6695 that makes it suitable for use as a partial line input mechanism:
6697 \startitemize[packed]
6698 \item \TEX\ does not switch to the \quote{new line} state, so
6699 that leading spaces are not ignored.
6700 \item No \tex{endlinechar} is inserted.
6701 \item Trailing spaces are not removed.
Note that this does not prevent \TEX\ itself from eating spaces as
a result of interpreting the line. For example, in
6706 \starttyping
6707 before\directlua{tex.sprint("\\relax")tex.sprint(" inbetween")}after
6708 \stoptyping
6710 the space before \type{inbetween} will be gobbled as a result of
6711 the \quote{normal} scanning of \tex{relax}.
6712 \stopitemize
6714 If there is a table argument instead of a list of strings, this has to
6715 be a consecutive array of strings to print (the first non-string value
6716 will stop the printing process). This syntax was added in 0.36.
6718 The optional argument sets the catcode regime, as with \type{tex.print()}.
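
Because no line ends are inserted, successive calls can build up a single
piece of input while switching catcode regimes in between. A sketch, with
arbitrary content:

\starttyping
tex.sprint("\\hbox{")          -- current catcodes: a real control sequence
tex.sprint(-2, "x_1 & y_2")    -- verbatim-like: '_' and '&' are just characters
tex.sprint("}")                -- current catcodes again
\stoptyping
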
6720 \subsubsection{\luatex{tex.tprint}}
6722 \startfunctioncall
6723 tex.tprint({<number> n, <string> s, ...}, {...})
6724 \stopfunctioncall
6726 This function is basically a shortcut for repeated calls to
6727 \luatex{tex.sprint(<number> n, <string> s, ...)}, once for each of
6728 the supplied argument tables.
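
As a sketch, three pieces printed under alternating catcode regimes in a
single call (the content is arbitrary):

\starttyping
tex.tprint(
    { -1, "\\hbox{" },         -- currently active catcodes
    { -2, "x_1 & y_2" },       -- all characters 'other' or 'space'
    { -1, "}" }
)
\stoptyping
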
6730 \subsubsection{\luatex{tex.write}}
6732 \startfunctioncall
6733 tex.write(<string> s, ...)
6734 tex.write(<table> t)
6735 \stopfunctioncall
6737 Each string argument is treated by \TEX\ as a special kind of input
6738 line that makes it suitable for use as a quick way to dump
6739 information:
6741 \startitemize
6742 \item All catcodes on that line are either \quote{space} (for '~') or
6743 \quote{character} (for all others).
6744 \item There is no \tex{endlinechar} appended.
6745 \stopitemize
6747 If there is a table argument instead of a list of strings, this has to
6748 be a consecutive array of strings to print (the first non-string value
6749 will stop the printing process). This syntax was added in 0.36.
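
For instance, characters that normally need escaping can be dumped as
they are. This is only a sketch with an arbitrary string; whether a
glyph actually shows up still depends on the current font.

\starttyping
-- %, $, & and _ are all plain characters here
tex.write("50% of $100 & more_or_less")
\stoptyping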
6752 \subsection{Helper functions}
6754 \subsubsection{\luatex{tex.round}}
6756 \startfunctioncall
6757 <number> n = tex.round(<number> o)
6758 \stopfunctioncall
6760 Rounds \LUA\ number \type{o}, and returns a number that is in the range
6761 of a valid \TEX\ register value. If the number starts out of range, it
generates a \quote{number too big} error as well.
6764 \subsubsection{\luatex{tex.scale}}
6766 \startfunctioncall
6767 <number> n = tex.scale(<number> o, <number> delta)
<table> n = tex.scale(<table> o, <number> delta)
6769 \stopfunctioncall
6771 Multiplies the \LUA\ numbers \type{o} and \type{delta}, and returns a
6772 rounded number that is in the range of a valid \TEX\ register value.
6773 In the table version, it creates a copy of the table with all numeric
top||level values scaled in that manner. If the multiplied number(s) are
out of range, it generates \quote{number too big} error(s) as well.
6777 Note: the precision of the output of this function will depend on your
6778 computer's architecture and operating system, so use with care! An
6779 interface to \LUATEX's internal, 100\% portable scale function will be
6780 added at a later date.
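
A short usage sketch of both helpers; the values are arbitrary and the
variable names are only illustrative:

\starttyping
local n = tex.round(12.6)             -- rounded to an integer
local s = tex.scale(65536, 1.5)       -- product of both, rounded
local t = tex.scale({ 100, 200 }, 2)  -- a scaled copy of the table
\stoptyping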
6782 \subsubsection{\luatex{tex.sp} (0.51)}
6784 \startfunctioncall
6785 <number> n = tex.sp(<number> o)
6786 <number> n = tex.sp(<string> s)
6787 \stopfunctioncall
6789 Converts the number \type{o} or a string \type{s} that represents
6790 an explicit dimension into an integer number of scaled points.
For parsing the string, the same scanning and conversion rules are used
that \LUATEX\ would use if it were scanning a dimension specifier in
its \TEX-like input language (this includes generating errors for bad
values), except for the following:
6797 \startitemize[n]
6798 \item only explicit values are allowed, control sequences are not handled
6799 \item infinite dimension units (\type{fil...}) are forbidden
6800 \item \type{mu} units do not generate an error (but may not be useful either)
6801 \stopitemize
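
A few calls as an illustration (the variable names are arbitrary):

\starttyping
local a = tex.sp("1pt")    -- 65536, since 1pt is 65536 scaled points
local b = tex.sp("2.5cm")  -- physical units are converted as well
local c = tex.sp(1000)     -- a number is converted to an integer
\stoptyping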
6803 \subsubsection{\luatex{tex.definefont}}
6805 \startfunctioncall
6806 tex.definefont(<string> csname, <number> fontid)
6807 tex.definefont(<boolean> global, <string> csname, <number> fontid)
6808 \stopfunctioncall
6810 Associates \type{csname} with the internal font number \type{fontid}.
6811 The definition is global if (and only if) \type{global} is specified
6812 and true (the setting of \type{globaldefs} is not taken into account).
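
A minimal sketch, assuming that the font to be named is the current
one (its id is fetched with \type{font.current()} from the
\type{font} library) and using the made-up name \type{mainfont}:

\starttyping
-- make \mainfont select the font that is current right now
tex.definefont("mainfont", font.current())
-- the same, but as a global definition
tex.definefont(true, "mainfont", font.current())
\stoptyping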
6815 \subsubsection{\luatex{tex.error} (0.61)}
6817 \startfunctioncall
6818 tex.error(<string> s)
6819 tex.error(<string> s, <table> help)
6820 \stopfunctioncall
6822 This creates an error somewhat like the combination of \tex{errhelp}
6823 and \tex{errmessage} would. During this error, deletions are disabled.
6825 The array part of the \type{help} table has to contain strings,
6826 one for each line of error help.
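
For example, with a made-up message and two lines of help text:

\starttyping
tex.error("Missing value for key 'width'", {
    "A dimension was expected after 'width='.",
    "The default value will be used instead.",
})
\stoptyping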
6829 \subsubsection{\luatex{tex.hashtokens} (0.25)}
6831 \startfunctioncall
6832 for i,v in pairs (tex.hashtokens()) do ... end
6833 \stopfunctioncall
6835 Returns a name and token table pair (see~\in{section}[luatokens] about
6836 token tables) iterator for every non-zero entry in the hash table.
6837 This can be useful for debugging, but note that this also reports
6838 control sequences that may be unreachable at this moment due to local
6839 redefinitions: it is strictly a dump of the hash table.
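
As a debugging sketch, the following counts the reported entries and
writes the total to the log file (the message text is arbitrary):

\starttyping
local count = 0
for name, tok in pairs(tex.hashtokens()) do
    -- name is the csname, tok the matching token table
    count = count + 1
end
texio.write_nl("log", "hash entries: " .. count)
\stoptyping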
\subsection[luaprimitives]{Functions for dealing with primitives}
6843 \subsubsection{\luatex{tex.enableprimitives}}
6845 \startfunctioncall
6846 tex.enableprimitives(<string> prefix, <table> primitive names)
6847 \stopfunctioncall
6849 This function accepts a prefix string and an array of primitive names.
For each combination of \quote{prefix} and \quote{name},
\type{tex.enableprimitives} first verifies that \quote{name} is
6853 an actual primitive (it must be returned by one of the
6854 \type{tex.extraprimitives()} calls explained below, or part of
6855 \TEX82, or \type{\directlua}). If it is not,
6856 \type{tex.enableprimitives} does nothing and skips to the next pair.
6858 But if it is, then it will construct a csname variable by concatenating the
6859 \quote{prefix} and \quote{name}, unless the \quote{prefix} is already the actual
6860 prefix of \quote{name}. In the latter case, it will discard the \quote{prefix},
6861 and just use \quote{name}.
6863 Then it will check for the existence of the constructed csname.
6864 If the csname is currently undefined (note: that is not the same as
6865 \type{\relax}), it will globally define the csname to have the
6866 meaning: run code belonging to the primitive \quote{name}. If for some
6867 reason the csname is already defined, it does nothing and tries the
6868 next pair.
6870 An example:
6872 \starttyping
6873 tex.enableprimitives('LuaTeX', {'formatname'})
6874 \stoptyping
6876 will define \type{\LuaTeXformatname} with the same intrinsic meaning
6877 as the documented primitive \type{\formatname}, provided that the
control sequence \type{\LuaTeXformatname} is currently undefined.
6880 Second example:
6882 \starttyping
6883 tex.enableprimitives('Omega',tex.extraprimitives ('omega'))
6884 \stoptyping
6886 will define a whole series of csnames like \type{\Omegatextdir},
6887 \type{\Omegapardir}, etc., but it will stick with \type{\OmegaVersion}
6888 instead of creating the doubly-prefixed \type{\OmegaOmegaVersion}.
6890 Starting with version 0.39.0 (and this is why the above two functions
6891 are needed), \LUATEX\ in \type{--ini} mode contains only the \TEX82
6892 primitives and \type{\directlua}, no extra primitives {\bf at all}.
6894 So, if you want to have all the new functionality available using
6895 their default names, as it is now, you will have to add
6897 \starttyping
\ifx\directlua\undefined \else
    \directlua {tex.enableprimitives('',tex.extraprimitives ())}
\fi
6901 \stoptyping
6903 near the beginning of your format generation file. Or you can choose
6904 different prefixes for different subsets, as you see fit.
6906 Calling some form of \type{tex.enableprimitives()} is highly important
6907 though, because if you do not, you will end up with a \TEX82-lookalike
that can run \LUA\ code but not do much else. The defined csnames are
6909 (of course) saved in the format and will be available at runtime.
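
As a sketch of the \quote{different prefixes} approach, where the
prefix \type{luatex} is only an example choice:

\starttyping
\directlua {
    tex.enableprimitives('', tex.extraprimitives('etex', 'pdftex'))
    tex.enableprimitives('luatex', tex.extraprimitives('luatex'))
}
\stoptyping

Following the rules given above, a primitive such as
\type{\catcodetable} would then be available as
\type{\luatexcatcodetable}, while \type{\luatexversion} keeps its name
because it already starts with the prefix.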
6912 \subsubsection{\luatex{tex.extraprimitives}}
6914 \startfunctioncall
6915 <table> t = tex.extraprimitives(<string> s, ...)
6916 \stopfunctioncall
6918 This function returns a list of the primitives that originate
6919 from the engine(s) given by the requested string value(s). The
6920 possible values and their (current) return values are:
6922 \startluacode
function out_prim (a)
  local v = tex.extraprimitives(a)
  table.sort(v)
  for _,n in pairs(v) do
    if n == ' ' then
      n = '\\normalcontrolspace'
    end
    tex.print(n .. '\\hskip 4pt plus 5em')
  end
end
6933 \stopluacode
6935 \starttabulate[|l|p|]
6936 \NC \bf name\NC \bf values \NC \NR
6937 \NC tex \NC \ctxlua{out_prim('tex') } \NC \NR
6938 \NC core \NC \ctxlua{out_prim('core') } \NC \NR
6939 \NC etex \NC \ctxlua{out_prim('etex') } \NC \NR
6940 \NC pdftex \NC \ctxlua{out_prim('pdftex') } \NC \NR
6941 \NC omega \NC \ctxlua{out_prim('omega') } \NC \NR
6942 \NC aleph \NC \ctxlua{out_prim('aleph') } \NC \NR
6943 \NC luatex \NC \ctxlua{out_prim('luatex') } \NC \NR
6944 \NC umath \NC \ctxlua{out_prim('umath') } \NC \NR
6945 \stoptabulate
6947 Note that \type{'luatex'} does not contain \type{directlua}, as that is
6948 considered to be a core primitive, along with all the \TEX82
6949 primitives, so it is part of the list that is returned from \type{'core'}.
\type{'umath'} is a subset of \type{'luatex'} that covers the Unicode math
primitives. It was added in \LUATEX\ 0.75.0 because it might be desirable to
handle the prefixing of that subset differently.
Running \type{tex.extraprimitives()} will give you the complete list
of primitives that are not defined at \LUATEX\ 0.39.0 \type{--ini}
startup. It is exactly equivalent to \type{tex.extraprimitives('etex',
'pdftex', 'omega', 'aleph', 'luatex')}.
6960 \subsubsection{\luatex{tex.primitives}}
6962 \startfunctioncall
6963 <table> t = tex.primitives()
6964 \stopfunctioncall
This function returns a hash table listing all primitives that \LUATEX\
knows about. The keys in the hash are primitive names, the values are
tables representing tokens (see~\in{section}[luatokens]). The third value
of each token table is always zero.
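
A small sketch that lists every primitive together with the name of
its command code in the log file; it relies on
\type{token.command_name} from the \type{token} library:

\starttyping
for name, tok in pairs(tex.primitives()) do
    -- tok[1] is the command code, tok[2] the command modifier
    texio.write_nl("log", name .. ": " .. token.command_name(tok))
end
\stoptyping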
6971 \subsection{Core functionality interfaces}
6973 \subsubsection{\luatex{tex.badness} (0.53)}
6975 \startfunctioncall
6976 <number> b = tex.badness(<number> t, <number> s)
6977 \stopfunctioncall
This helper function is useful during linebreak calculations. \type{t}
and \type{s} are scaled values; the function returns the badness for
when total \type{t} is supposed to be made from amounts that sum to
\type{s}. The returned number is a reasonable approximation of
$100(t/s)^3$.
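
Two calls as an illustration; the comments give the value of the
formula, the returned integers are approximations of it:

\starttyping
-- t equals s, so 100 * (1)^3
local b1 = tex.badness(65536, 65536)  -- about 100
-- t is half of s, so 100 * (1/2)^3
local b2 = tex.badness(32768, 65536)  -- about 100 / 8
\stoptyping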
6984 \subsubsection{\luatex{tex.linebreak} (0.53)}
6986 \startfunctioncall
6987 local <node> nodelist, <table> info =
6988 tex.linebreak(<node> listhead, <table> parameters)
6989 \stopfunctioncall
6991 The understood parameters are as follows:
6993 \starttabulate[|l|l|p|]
6994 \NC \bf name \NC \bf type \NC \bf description \NC \NR
6995 \NC pardir \NC string \NC \NC \NR
6996 \NC pretolerance \NC number \NC \NC \NR
6997 \NC tracingparagraphs \NC number \NC \NC \NR
6998 \NC tolerance \NC number \NC \NC \NR
6999 \NC looseness \NC number \NC \NC \NR
7000 \NC hyphenpenalty \NC number \NC \NC \NR
7001 \NC exhyphenpenalty \NC number \NC \NC \NR
7002 \NC pdfadjustspacing \NC number \NC \NC \NR
7003 \NC adjdemerits \NC number \NC \NC \NR
7004 \NC pdfprotrudechars \NC number \NC \NC \NR
7005 \NC linepenalty \NC number \NC \NC \NR
7006 \NC lastlinefit \NC number \NC \NC \NR
7007 \NC doublehyphendemerits \NC number \NC \NC \NR
7008 \NC finalhyphendemerits \NC number \NC \NC \NR
7009 \NC hangafter \NC number \NC \NC \NR
7010 \NC interlinepenalty \NC number or table \NC if a table, then it is an array like \type{\interlinepenalties}\NC \NR
7011 \NC clubpenalty \NC number or table \NC if a table, then it is an array like \type{\clubpenalties}\NC \NR
7012 \NC widowpenalty \NC number or table \NC if a table, then it is an array like \type{\widowpenalties}\NC \NR
7013 \NC brokenpenalty \NC number \NC \NC \NR
7014 \NC emergencystretch \NC number \NC in scaled points \NC \NR
7015 \NC hangindent \NC number \NC in scaled points \NC \NR
7016 \NC hsize \NC number \NC in scaled points \NC \NR
7017 \NC leftskip \NC glue_spec node \NC \NC \NR
7018 \NC rightskip \NC glue_spec node \NC \NC \NR
7019 \NC pdfeachlineheight \NC number \NC in scaled points \NC \NR
7020 \NC pdfeachlinedepth \NC number \NC in scaled points \NC \NR
7021 \NC pdffirstlineheight \NC number \NC in scaled points \NC \NR
7022 \NC pdflastlinedepth \NC number \NC in scaled points \NC \NR
7023 \NC pdfignoreddimen \NC number \NC in scaled points \NC \NR
7024 \NC parshape \NC table \NC \NC \NR
7025 \stoptabulate
Note that there is no interface for \type{\displaywidowpenalties}; you
have to pass the right choice for \type{widowpenalty} yourself.
7030 The meaning of the various keys should be fairly obvious from the
7031 table (the names match the \TEX\ and \PDFTEX\ primitives) except for
7032 the last 5 entries. The four \type{pdf...line...} keys are ignored if
7033 their value equals \type{pdfignoreddimen}.
7035 It is your own job to make sure that \type{listhead} is a proper
7036 paragraph list: this function does not add any nodes to it. To be
7037 exact, if you want to replace the core line breaking, you may have to
7038 do the following (when you are not actually working in the
\type{pre_linebreak_filter} or \type{linebreak_filter} callbacks, or when the
original list starting at \type{listhead} was generated in horizontal mode):
7042 \startitemize
7043 \item add an \quote{indent box} and perhaps a \type{local_par} node at
7044 the start (only if you need them)
7045 \item replace any found final glue by an infinite penalty (or add such
7046 a penalty, if the last node is not a glue)
7047 \item add a glue node for the \type{\parfillskip} after that penalty node
7048 \item make sure all the \type{prev} pointers are OK
7049 \stopitemize
The result is a node list; it still needs to be vpacked if you
want to assign it to a \tex{vbox}.
7055 The returned \type{info} table contains four values that are all numbers:
7057 \starttabulate[|l|p|]
7058 \NC prevdepth \NC depth of the last line in the broken paragraph \NC \NR
7059 \NC prevgraf \NC number of lines in the broken paragraph \NC \NR
7060 \NC looseness \NC the actual looseness value in the broken paragraph \NC \NR
7061 \NC demerits \NC the total demerits of the chosen solution \NC \NR
7062 \stoptabulate
Note that there are a few things you cannot interface using this
function. You cannot influence font expansion other than via
\type{pdfadjustspacing}, because the settings for that take place
elsewhere. The same is true for \type{\hbadness}, \type{\hfuzz}, etc. All
these are handled in the \type{hpack()} routine, and that fetches its own
variables via globals.
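
The call itself could look like the following sketch. It assumes that
\type{head} already is a proper paragraph list as described above, and
that keys left out of the parameter table fall back to the current
values of the matching \TEX\ parameters. The helper name
\type{break_paragraph} and the parameter values are made up for the
illustration.

\starttyping
local function break_paragraph(head, width)
    local lines, info = tex.linebreak(head, {
        hsize     = width,   -- in scaled points
        tolerance = 500,
    })
    -- report what the line breaker came up with
    texio.write_nl("log", "lines: " .. info.prevgraf ..
                          ", demerits: " .. info.demerits)
    -- vpack the result so that it can be assigned to a box
    return node.vpack(lines)
end

tex.box[0] = break_paragraph(head, tex.sp("10cm"))
\stoptyping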
7071 \subsubsection{\luatex{tex.shipout} (0.51)}
7073 \startfunctioncall
7074 tex.shipout(<number> n)
7075 \stopfunctioncall
7077 Ships out box number \type{n} to the output file, and clears the box
7078 register.
7081 \section[texconfig]{The \luatex{texconfig} table}
7083 This is a table that is created empty. A startup \LUA\ script could
7084 fill this table with a number of settings that are read out by
7085 the executable after loading and executing the startup file.
7087 \starttabulate[|lT|l|l|p|]
7088 \NC \ssbf key \NC \bf type \NC \bf default \NC \bf explanation \NC\NR
\NC kpse_init \NC boolean \NC true \NC \type{false} totally disables \KPATHSEA\ initialisation,
and enables interpretation of the following numeric key--value pairs
(only ever unset this if you implement {\it all\/} file
find callbacks!)\NC \NR
7093 \NC shell_escape \NC string\NC \type{'f'}\NC Use \type{'y'} or \type{'t'} or \type{'1'} to enable \type{\write18} unconditionally,
7094 \type{'p'} to enable the commands that are listed in \type{shell_escape_commands} (new in 0.37)\NC\NR
7095 \NC shell_escape_commands \NC string\NC \NC Comma-separated list of command names that may be executed by \type{\write18} even
7096 if \type{shell_escape} is set to \type{'p'}. Do {\it not\/} use spaces around commas,
7097 separate any required command arguments by using a space, and use the ASCII double quote
7098 (\type{"}) for any needed argument or path quoting (new in 0.37)\NC\NR
7099 \NC string_vacancies \NC number\NC 75000\NC cf.\ web2c docs \NC \NR
7100 \NC pool_free \NC number\NC 5000\NC cf.\ web2c docs \NC \NR
7101 \NC max_strings \NC number\NC 15000\NC cf.\ web2c docs \NC \NR
7102 \NC strings_free \NC number\NC 100\NC cf.\ web2c docs \NC \NR
7103 \NC nest_size \NC number\NC 50\NC cf.\ web2c docs \NC \NR
7104 \NC max_in_open \NC number\NC 15\NC cf.\ web2c docs \NC \NR
7105 \NC param_size \NC number\NC 60\NC cf.\ web2c docs \NC \NR
7106 \NC save_size \NC number\NC 4000\NC cf.\ web2c docs \NC \NR
7107 \NC stack_size \NC number\NC 300\NC cf.\ web2c docs \NC \NR
7108 \NC dvi_buf_size \NC number\NC 16384\NC cf.\ web2c docs \NC \NR
7109 \NC error_line \NC number\NC 79\NC cf.\ web2c docs \NC \NR
7110 \NC half_error_line \NC number\NC 50\NC cf.\ web2c docs \NC \NR
7111 \NC max_print_line \NC number\NC 79\NC cf.\ web2c docs \NC \NR
7112 \NC hash_extra \NC number\NC 0\NC cf.\ web2c docs \NC \NR
7113 \NC pk_dpi \NC number\NC 72\NC cf.\ web2c docs \NC \NR
7114 \NC trace_file_names \NC boolean \NC true \NC \type{false} disables \TEX's normal file open|-|close
7115 feedback (the assumption is that callbacks will take care of
7116 that) \NC \NR
7117 \NC file_line_error \NC boolean \NC false \NC do \type{file:line} style error messages\NC \NR
7118 \NC halt_on_error \NC boolean \NC false \NC abort run on the first encountered error\NC \NR
7119 \NC formatname \NC string \NC \NC if no format name was given
7120 on the commandline, this key will be tested first
7121 instead of simply quitting\NC \NR
7122 \NC jobname \NC string \NC \NC if no input file name was given
7123 on the commandline, this key will be tested first
7124 instead of simply giving up\NC \NR
7125 \stoptabulate
7127 {\bf Note:} the numeric values that match web2c parameters are only used if
7128 \type{kpse_init} is explicitly set to \type{false}. In all other cases, the normal values from
7129 \type{texmf.cnf} are used.
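
A startup script could, for example, contain settings like these. This
is a sketch only: the list of commands is an arbitrary choice, and the
script is assumed to be the one passed with the \type{--lua} command
line option.

\starttyping
-- file:line style errors, and stop at the first one
texconfig.file_line_error = true
texconfig.halt_on_error   = true
-- restricted shell escape with an explicit list of commands
texconfig.shell_escape          = 'p'
texconfig.shell_escape_commands = 'bibtex,kpsewhich'
\stoptyping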
7131 \section{The \luatex{texio} library}
7133 This library takes care of the low|-|level I/O interface.
7135 \subsection{Printing functions}
7137 \subsubsection{\luatex{texio.write}}
7139 \startfunctioncall
7140 texio.write(<string> target, <string> s, ...)
7141 texio.write(<string> s, ...)
7142 \stopfunctioncall
7144 Without the \type{target} argument, writes all given strings to the same
7145 location(s) \TEX\ writes messages to at this moment. If
7146 \tex{batchmode} is in effect, it writes only to the log,
7147 otherwise it writes to the log and the terminal.
7148 The optional \type{target} can be one of three possibilities:
7149 \type{term}, \type{log} or \type {term and log}.
7151 Note: If several strings are given, and if the first of these strings
7152 is or might be one of the targets above, the \type{target} must be
7153 specified explicitly to prevent \LUA\ from interpreting the first
7154 string as the target.
7156 \subsubsection{\luatex{texio.write_nl}}
7158 \startfunctioncall
7159 texio.write_nl(<string> target, <string> s, ...)
7160 texio.write_nl(<string> s, ...)
7161 \stopfunctioncall
This function behaves like \luatex{texio.write}, but makes sure that the given strings will
7164 appear at the beginning of a new line. You can pass a single empty string
7165 if you only want to move to the next line.
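
Some typical calls of both printing functions, with arbitrary message
texts:

\starttyping
texio.write("log", "this goes to the log file only")
texio.write_nl("term and log", "this starts on a new line")
-- the first string looks like a target, so the target is given explicitly
texio.write("term and log", "log", " is printed as ordinary text")
-- just move to the next line
texio.write_nl("")
\stoptyping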
7167 %***********************************************************************
7169 \section[luatokens]{The \luatex{token} library}
7171 The \luatex{token} table contains interface functions to \TEX's
7172 handling of tokens. These functions are most useful when combined with
7173 the \luatex{token_filter} callback, but they could be used standalone
7174 as well.
7176 A token is represented in \LUA\ as a small table. For the moment, this
7177 table consists of three numeric entries:
7179 \starttabulate[|l|l|p|]
7180 \NC \bf index\NC \bf meaning \NC \bf description \NC \NR
7181 \NC 1 \NC command code \NC this is a value between~$0$ and~$130$ (approximately)\NC \NR
7182 \NC 2 \NC command modifier \NC this is a value between~$0$ and~$2^{21}$ \NC \NR
7183 \NC 3 \NC control sequence id \NC for commands that are not the result of control
7184 sequences, like letters and characters, it is zero,
7185 otherwise, it is a number pointing into the \quote
7186 {equivalence table} \NC \NR
7187 \stoptabulate
7189 \subsection{\luatex{token.get_next}}
7191 \startfunctioncall
<token> t = token.get_next()
7193 \stopfunctioncall
7195 This fetches the next input token from the current input source,
7196 without expansion.
7198 \subsection{\luatex{token.is_expandable}}
7200 \startfunctioncall
7201 <boolean> b = token.is_expandable(<token> t)
7202 \stopfunctioncall
7204 This tests if the token \type{t} could be expanded.
7206 \subsection{\luatex{token.expand}}
7208 \startfunctioncall
7209 token.expand(<token> t)
7210 \stopfunctioncall
7212 If a token is expandable, this will expand one level of it, so that
7213 the first token of the expansion will now be the next token to be read
7214 by \luatex{token.get_next()}.
7216 \subsection{\luatex{token.is_activechar}}
7218 \startfunctioncall
7219 <boolean> b = token.is_activechar(<token> t)
7220 \stopfunctioncall
7222 This is a special test that is sometimes handy. Discovering whether
7223 some control sequence is the result of an active character turned out
7224 to be very hard otherwise.
7226 \subsection{\luatex{token.create}}
7228 \startfunctioncall
<token> t = token.create(<string> csname)
<token> t = token.create(<number> charcode)
<token> t = token.create(<number> charcode, <number> catcode)
7232 \stopfunctioncall
7234 This is the token factory. If you feed it a string, then it is the
7235 name of a control sequence (without leading backslash), and it will be
7236 looked up in the equivalence table.
If you feed it a number, then this is assumed to be an input character,
7239 and an optional second number gives its category code. This means it
7240 is possible to overrule a character's category code, with a few
7241 exceptions: the category codes~0 (escape), 9~(ignored), 13~(active),
7242 14~(comment), and 15 (invalid) cannot occur inside a token. The values~0, 9, 14
7243 and~15 are therefore illegal as input to \luatex{token.create()}, and
7244 active characters will be resolved immediately.
7246 Note: unknown string sequences and never defined active characters
7247 will result in a token representing an \quote{undefined control sequence}
7248 with a near|-|random name. It is {\em not} possible to define brand
7249 new control sequences using \luatex{token.create}!
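
The three calling conventions could be used as follows (a sketch with
arbitrary variable names):

\starttyping
-- the \relax control sequence
local relax  = token.create("relax")
-- the character "a" with category code 11 (letter)
local letter = token.create(string.byte("a"), 11)
-- the same character with category code 12 (other)
local other  = token.create(string.byte("a"), 12)
\stoptyping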
7251 \subsection{\luatex{token.command_name}}
7253 \startfunctioncall
7254 <string> commandname = token.command_name(<token> t)
7255 \stopfunctioncall
7257 This returns the name associated with the \quote{command} value of the token
7258 in \LUATEX. There is not always a direct connection between these names and
7259 primitives. For instance, all \tex{ifxxx} tests are grouped under
7260 \type {if_test}, and the \quote{command modifier} defines which test is to be run.
7262 \subsection{\luatex{token.command_id}}
7264 \startfunctioncall
7265 <number> i = token.command_id(<string> commandname)
7266 \stopfunctioncall
This is the inverse of the previous function: it returns the number
that is to be used as the first item in a token table.
7271 \subsection{\luatex{token.csname_name}}
7273 \startfunctioncall
7274 <string> csname = token.csname_name(<token> t)
7275 \stopfunctioncall
7277 This returns the name associated with the \quote{equivalence table} value of
7278 the token in \LUATEX. It returns the string value of the command used
7279 to create the current token, or an empty string if there is no
7280 associated control sequence.
7282 Keep in mind that there are potentially two control sequences that
7283 return the same csname string: single character control sequences
7284 and active characters have the same \quote{name}.
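
Combined with the \luatex{token_filter} callback, the naming functions
can be used for a simple token trace. The following sketch passes
every token on unchanged and only logs it, so it is best tried on
small inputs:

\starttyping
callback.register("token_filter", function ()
    local t = token.get_next()
    -- log the command name and, if there is one, the csname
    texio.write_nl("log", token.command_name(t) ..
                          " " .. token.csname_name(t))
    return t
end)
\stoptyping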
7286 \subsection{\luatex{token.csname_id}}
7288 \startfunctioncall
7289 <number> i = token.csname_id(<string> csname)
7290 \stopfunctioncall
This is the inverse of the previous function: it returns the number
that is to be used as the third item in a token table.
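
Going by the descriptions above, the lookup functions are expected to
round||trip for a defined control sequence. A sketch of that check,
using \type{\relax} as an arbitrary example:

\starttyping
local t = token.create("relax")
-- t[1] is the command code, t[3] the control sequence id
assert(token.command_id(token.command_name(t)) == t[1])
assert(token.csname_id(token.csname_name(t)) == t[3])
\stoptyping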
7296 \chapter[math]{Math}
7298 The handling of mathematics in \LUATEX\ differs quite a bit from how
7299 \TEX82 (and therefore \PDFTEX) handles math. First, \LUATEX\ adds primitives and
7300 extends some others so that \UNICODE\ input can be used easily. Second, all
7301 of \TEX82's internal special values (for example for operator spacing) have
7302 been made accessible and changeable via control sequences. Third, there are
7303 extensions that make it easier to use \OPENTYPE\ math fonts. And finally,
7304 there are some extensions that have been proposed in the past that are now
7305 added to the engine.
7307 \section{The current math style}
7309 Starting with \LUATEX\ 0.39.0, it is possible to discover the math
7310 style that will be used for a formula in an expandable fashion
7311 (while the math list is still being read). To make this possible,
7312 \LUATEX\ adds the new primitive: \type{\mathstyle}. This is a
7313 \quote{convert command} like e.g. \type{\romannumeral}: its value can
7314 only be read, not set.
7316 \subsection{\tex{mathstyle}}
7318 The returned value is between 0 and 7 (in math mode), or $-1$
7319 (all other modes). For easy testing, the eight math style commands
have been altered so that they can be used as numeric values, so you
7321 can write code like this:
7323 \starttyping
7324 \ifnum\mathstyle=\textstyle
7325 \message{normal text style}
7326 \else \ifnum\mathstyle=\crampedtextstyle
7327 \message{cramped text style}
7328 \fi \fi
7329 \stoptyping
7331 \subsection{\tex{Ustack}}
7333 There are a few math commands in \TEX\ where the style that will be used
7334 is not known straight from the start. These commands (\tex{over},
7335 \tex{atop}, \tex{overwithdelims}, \tex{atopwithdelims}) would
7336 therefore normally return wrong values for \type{\mathstyle}. To
7337 fix this, \LUATEX\ introduces a special prefix command:
7338 \type{\Ustack}:
7340 \starttyping
7341 $\Ustack {a \over b}$
7342 \stoptyping
7344 The \type{\Ustack} command will scan the next brace and start a new
7345 math group with the correct (numerator) math style.
7347 \section{Unicode math characters}
7349 Character handling is now extended up to the full \UNICODE\ range
7350 (the \type{\U} prefix), which is compatible with \XETEX.
7352 The math primitives from \TEX\ are kept as they are, except for
the ones that convert from input to math commands: \tex{mathcode}
and \tex{delcode}. These two now allow
7355 for a 21-bit character argument on the left hand side of the equals sign.
7357 Some of the new \LUATEX\ primitives read
7358 more than one separate value. This is shown in the tables below by a plus
7359 sign in the second column.
7361 The input for such primitives would look like this:
7363 \starttyping
7364 \def\overbrace {\Umathaccent 0 1 "23DE }
7365 \stoptyping
7368 Altered \TEX82 primitives:
7370 \starttabulate[|l|l|l|]
7371 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7372 \NC \tex{mathcode} \NC 0--10FFFF = 0--8000 \NC\NR
7373 \NC \tex{delcode} \NC 0--10FFFF = 0--FFFFFF \NC\NR
7374 \stoptabulate
7376 Unaltered:
7378 \starttabulate[|l|l|l|]
7379 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7380 \NC \tex{mathchardef} \NC 0--8000 \NC\NR
7381 \NC \tex{mathchar} \NC 0--7FFF \NC\NR
7382 \NC \tex{mathaccent} \NC 0--7FFF \NC\NR
7383 \NC \tex{delimiter} \NC 0--7FFFFFF \NC\NR
7384 \NC \tex{radical} \NC 0--7FFFFFF \NC\NR
7385 \stoptabulate
7387 New primitives that are compatible with \XETEX:
7389 \starttabulate[|l|l|l|l|]
7390 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7391 \NC \tex{Umathchardef} \NC 0+0+0--7+FF+10FFFF$^1$ \NC\NR
7392 \NC \tex{Umathcharnumdef}$^5$ \NC -80000000--7FFFFFFF$^3$ \NC\NR
7393 \NC \tex{Umathcode} \NC 0--10FFFF = 0+0+0--7+FF+10FFFF$^1$ \NC\NR
7394 \NC \tex{Udelcode} \NC 0--10FFFF = 0+0--FF+10FFFF$^2$ \NC\NR
7395 \NC \tex{Umathchar} \NC 0+0+0--7+FF+10FFFF \NC\NR
7396 \NC \tex{Umathaccent} \NC 0+0+0--7+FF+10FFFF$^{2,4}$ \NC\NR
7397 \NC \tex{Udelimiter} \NC 0+0+0--7+FF+10FFFF$^2$ \NC\NR
7398 \NC \tex{Uradical} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7399 \NC \tex{Umathcharnum} \NC -80000000--7FFFFFFF$^3$ \NC\NR
7400 \NC \tex{Umathcodenum} \NC 0--10FFFF = -80000000--7FFFFFFF$^3$ \NC\NR
7401 \NC \tex{Udelcodenum} \NC 0--10FFFF = -80000000--7FFFFFFF$^3$ \NC\NR
7402 \stoptabulate
Note 1: \type{\Umathchardef<csname>="8"0"0} and \type{\Umathcode<number>="8"0"0}
are also accepted.
7407 Note 2: The new primitives that deal with delimiter-style objects do not
7408 set up a \quote{large family}. Selecting a suitable size for display
7409 purposes is expected to be dealt with by the font via the
\tex{Umathoperatorsize} parameter (more information in a following section).
7412 Note 3: For these three primitives, all information is packed into a single
7413 signed integer. For the first two (\tex{Umathcharnum} and
7414 \tex{Umathcodenum}), the lowest 21 bits are the character code, the 3
7415 bits above that represent the math class, and the family data is kept in
the topmost bits (this means that the values for math families 128--255 are
7417 actually negative). For \tex{Udelcodenum} there is no math class; the
7418 math family information is stored in the bits directly on top of the
7419 character code. Using these three commands is not as natural as using the
7420 two- and three-value commands, so unless you know exactly what you are
7421 doing and absolutely require the speedup resulting from the faster input
7422 scanning, it is better to use the verbose commands instead.
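
The packing for \tex{Umathcharnum} and \tex{Umathcodenum} can be
worked out in \LUA\ as follows. This is a sketch based only on the bit
layout given in this note; \type{pack_mathcharnum} is a hypothetical
helper and not part of \LUATEX.

\starttyping
-- hypothetical helper, not part of LuaTeX
local function pack_mathcharnum(class, family, char)
    -- 21 bits of character code, 3 bits of class, family on top
    local n = char + class * 2^21 + family * 2^24
    -- wrap into the signed 32-bit range
    if n >= 2^31 then
        n = n - 2^32
    end
    return n
end

-- class 2, family 1, character "2B: 43 + 4194304 + 16777216
local plus = pack_mathcharnum(2, 1, 0x2B)   -- 20971563
-- families from 128 upwards end up as negative numbers
local high = pack_mathcharnum(0, 128, 0x41)
\stoptyping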
7424 Note 4: As of \LUATEX\ 0.65, \tex{Umathaccent} accepts optional
7425 keywords to control various details regarding math accents. See
7426 \in{section}[mathacc] below for details.
7428 Note 5: \tex{Umathcharnumdef} was added in release 0.72.
7431 New primitives that exist in \LUATEX\ only (all of these will be explained
7432 in following sections):
7435 \starttabulate[|l|l|l|l|]
7436 \NC \bf primitive \NC \bf value range (in hex) \NC\NR
7437 \NC \tex{Uroot} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7438 \NC \tex{Uoverdelimiter} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7439 \NC \tex{Uunderdelimiter} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7440 \NC \tex{Udelimiterover} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7441 \NC \tex{Udelimiterunder} \NC 0+0--FF+10FFFF$^2$ \NC\NR
7442 \stoptabulate
7444 \section{Cramped math styles}
7446 \LUATEX\ has four new primitives to set the cramped math styles
7447 directly:
7449 \starttyping
7450 \crampeddisplaystyle
7451 \crampedtextstyle
7452 \crampedscriptstyle
7453 \crampedscriptscriptstyle
7454 \stoptyping
7456 These additional commands are not all that valuable on their own, but
7457 they come in handy as arguments to the math parameter settings that
7458 will be added shortly.
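
Used on their own, they simply switch the style. In the following
minimal comparison (arbitrary content), the second exponent is set in
cramped text style, where superscripts are raised less:

\starttyping
$ x^2 \quad {\crampedtextstyle x^2} $
\stoptyping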
7460 \section{Math parameter settings}
7462 In \LUATEX, the font dimension parameters that \TEX\ used in math
7463 typesetting are now accessible via primitive commands. In fact,
7464 refactoring of the math engine has resulted in many more parameters
7465 than were accessible before.
7467 \starttabulate
7468 \NC \bf primitive name \NC \bf description \NC \NR
7469 \NC \type{\Umathquad} \NC the width of 18mu's\NC \NR
7470 \NC \type{\Umathaxis} \NC height of the vertical center axis of
7471 the math formula above the baseline\NC \NR
7472 \NC \type{\Umathoperatorsize} \NC minimum size of large operators in display mode \NC \NR
7473 \NC \type{\Umathoverbarkern} \NC vertical clearance above the rule \NC \NR
7474 \NC \type{\Umathoverbarrule} \NC the width of the rule \NC \NR
7475 \NC \type{\Umathoverbarvgap} \NC vertical clearance below the rule \NC \NR
7476 \NC \type{\Umathunderbarkern} \NC vertical clearance below the rule \NC \NR
7477 \NC \type{\Umathunderbarrule} \NC the width of the rule \NC \NR
7478 \NC \type{\Umathunderbarvgap} \NC vertical clearance above the rule \NC \NR
7479 \NC \type{\Umathradicalkern} \NC vertical clearance above the rule \NC \NR
7480 \NC \type{\Umathradicalrule} \NC the width of the rule \NC \NR
7481 \NC \type{\Umathradicalvgap} \NC vertical clearance below the rule \NC \NR
7482 \NC \type{\Umathradicaldegreebefore}\NC the forward kern that takes place before placement of
7483 the radical degree \NC \NR
7484 \NC \type{\Umathradicaldegreeafter} \NC the backward kern that takes place after placement of
7485 the radical degree \NC \NR
7486 \NC \type{\Umathradicaldegreeraise} \NC this is the percentage of the total height and depth of
7487 the radical sign that the degree is raised by. It is
7488 expressed in \type{percents}, so 60\% is expressed as the
7489 integer $60$.\NC \NR
7490 \NC \type{\Umathstackvgap} \NC vertical clearance between the two
7491 elements in a \type{\atop} stack \NC \NR
7492 \NC \type{\Umathstacknumup} \NC numerator shift upward in \type{\atop} stack \NC \NR
7493 \NC \type{\Umathstackdenomdown} \NC denominator shift downward in \type{\atop} stack\NC \NR
7494 \NC \type{\Umathfractionrule} \NC the width of the rule in a \type{\over}\NC \NR
7495 \NC \type{\Umathfractionnumvgap} \NC vertical clearance between the numerator and the rule\NC \NR
7496 \NC \type{\Umathfractionnumup} \NC numerator shift upward in \type{\over} \NC \NR
7497 \NC \type{\Umathfractiondenomvgap} \NC vertical clearance between the denominator and the rule\NC \NR
7498 \NC \type{\Umathfractiondenomdown} \NC denominator shift downward in \type{\over} \NC \NR
7499 \NC \type{\Umathfractiondelsize} \NC minimum delimiter size for \type{\...withdelims}\NC \NR
7500 \NC \type{\Umathlimitabovevgap} \NC vertical clearance for limits above operators\NC \NR
7501 \NC \type{\Umathlimitabovebgap} \NC vertical baseline clearance for limits above operators\NC \NR
7502 \NC \type{\Umathlimitabovekern} \NC space reserved at the top of the limit\NC \NR
7503 \NC \type{\Umathlimitbelowvgap} \NC vertical clearance for limits below operators\NC \NR
7504 \NC \type{\Umathlimitbelowbgap} \NC vertical baseline clearance for limits below operators\NC \NR
7505 \NC \type{\Umathlimitbelowkern} \NC space reserved at the bottom of the limit\NC \NR
7506 \NC \type{\Umathoverdelimitervgap} \NC vertical clearance for limits above delimiters\NC \NR
7507 \NC \type{\Umathoverdelimiterbgap} \NC vertical baseline clearance for limits above delimiters\NC \NR
7508 \NC \type{\Umathunderdelimitervgap} \NC vertical clearance for limits below delimiters\NC \NR
7509 \NC \type{\Umathunderdelimiterbgap} \NC vertical baseline clearance for limits below delimiters\NC \NR
7510 \NC \type{\Umathsubshiftdrop} \NC subscript drop for boxes and subformulas\NC \NR
7511 \NC \type{\Umathsubshiftdown} \NC subscript drop for characters\NC \NR
7512 \NC \type{\Umathsupshiftdrop} \NC superscript drop (raise, actually) for boxes and subformulas\NC \NR
7513 \NC \type{\Umathsupshiftup} \NC superscript raise for characters\NC \NR
7514 \NC \type{\Umathsubsupshiftdown} \NC subscript drop in the presence of a superscript\NC \NR
7515 \NC \type{\Umathsubtopmax} \NC the top of standalone subscripts cannot be higher than this above the baseline\NC \NR
7516 \NC \type{\Umathsupbottommin} \NC the bottom of standalone superscripts cannot be less than this above the baseline\NC \NR
\NC \type{\Umathsupsubbottommax} \NC the bottom of the superscript of a combined super- and subscript
must be at least as high as this above the baseline\NC \NR
7519 \NC \type{\Umathsubsupvgap} \NC vertical clearance between super- and subscript\NC \NR
7520 \NC \type{\Umathspaceafterscript} \NC additional space added after a super- or subscript\NC \NR
7521 \NC \type{\Umathconnectoroverlapmin}\NC minimum overlap between parts in an extensible recipe\NC \NR
7522 \stoptabulate
7524 Each of the parameters in this section can be set by a command like this:
7526 \starttyping
7527 \Umathquad\displaystyle=1em
7528 \stoptyping
These settings obey grouping, and you can use \type{\the\Umathquad\displaystyle} to read a value back if needed.
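
As a hedged illustration of both points, the following sketch assumes that the primitives are available under their unprefixed names and that the cramped style commands from the previous section are defined. The values are arbitrary.

\starttyping
\begingroup
  % dimension parameters take a style argument followed by a dimension
  \Umathaxis\textstyle=3pt
  \Umathaxis\crampedtextstyle=3pt
  % \Umathradicaldegreeraise is a percentage, so it takes an integer
  \Umathradicaldegreeraise\textstyle=60
  \message{\the\Umathaxis\textstyle}% reports the value set inside the group
\endgroup
\message{\the\Umathaxis\textstyle}% reports the value from before the group
\stoptyping

Each style has its own copy of a parameter, so a change that is meant for all styles has to be repeated for each of the eight styles.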
7532 \section{Font-based Math Parameters}
7534 While it is nice to have these math parameters available for tweaking, it
7535 would be tedious to have to set each of them by hand. For this reason,
7536 \LUATEX\ initializes a bunch of these parameters whenever you assign a font
7537 identifier to a math family based on either the traditional math font
7538 dimensions in the font (for assignments to math family~2 and~3 using
7539 \TFM|-|based fonts like \type{cmsy} and \type{cmex}), or based on the named
7540 values in a potential \type{MathConstants} table when the font is loaded
7541 via Lua. If there is a \type{MathConstants} table, this takes precedence
7542 over font dimensions, and in that case no attention is paid to which
family is being assigned to: the \type{MathConstants} table in the last
assigned family sets all parameters.
7546 In the table below, the one-letter style abbreviations and symbolic tfm
font dimension names match those used in the \TeX book. Assignments to
7548 \tex{textfont} set the values for the cramped and uncramped display and
7549 text styles. Use \tex{scriptfont} for the script styles, and
7550 \tex{scriptscriptfont} for the scriptscript styles (totalling eight
7551 parameters for three font sizes). In the \TFM\ case, assignments only happen
7552 in family~2 and family~3 (and of course only for the parameters for which
7553 there are font dimensions).
7555 Besides the parameters below, \LUATEX\ also looks at the \quote{space}
7556 font dimension parameter. For math fonts, this should be set to zero.
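
A minimal sketch of this initialization for the \TFM\ case, assuming that the traditional \type{cmsy10} and \type{cmex10} fonts can be found. The font identifiers \type{\mysy} and \type{\myex} are hypothetical names chosen for this example.

\starttyping
\font\mysy=cmsy10
\font\myex=cmex10
\textfont2=\mysy   % initializes parameters for the display and text styles
\textfont3=\myex
\scriptfont2=\mysy % initializes parameters for the script styles
\message{\the\Umathaxis\textstyle}% now derived from axis_height
\stoptyping

Since the initialization happens at assignment time, explicit \type{\Umath...} settings presumably need to come after the family assignments that they are meant to override.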
7558 \start
7560 \switchtobodyfont[8pt]
7562 \starttabulate[|l|l|l|p|]
7563 \NC \bf variable \NC \bf style \NC \bf default value opentype \NC \bf default value tfm \NC\NR
7564 \NC \tex{Umathaxis} \NC -- \NC AxisHeight \NC axis_height \NC\NR
7565 \NC \tex{Umathoperatorsize} \NC D, D' \NC DisplayOperatorMinHeight \NC $^6$ \NC\NR
7566 \NC \tex{Umathfractiondelsize} \NC D, D' \NC FractionDelimiterDisplayStyleSize$^9$ \NC delim1 \NC\NR
7567 \NC " \NC T, T', S, S', SS, SS' \NC FractionDelimiterSize$^9$ \NC delim2 \NC\NR
7568 \NC \tex{Umathfractiondenomdown}\NC D, D' \NC FractionDenominatorDisplayStyleShiftDown \NC denom1 \NC\NR
7569 \NC " \NC T, T', S, S', SS, SS' \NC FractionDenominatorShiftDown \NC denom2 \NC\NR
7570 \NC \tex{Umathfractiondenomvgap}\NC D, D' \NC FractionDenominatorDisplayStyleGapMin \NC 3*default_rule_thickness \NC\NR
7571 \NC " \NC T, T', S, S', SS, SS' \NC FractionDenominatorGapMin \NC default_rule_thickness \NC\NR
7572 \NC \tex{Umathfractionnumup} \NC D, D' \NC FractionNumeratorDisplayStyleShiftUp \NC num1 \NC\NR
7573 \NC " \NC T, T', S, S', SS, SS' \NC FractionNumeratorShiftUp \NC num2 \NC\NR
7574 \NC \tex{Umathfractionnumvgap} \NC D, D' \NC FractionNumeratorDisplayStyleGapMin \NC 3*default_rule_thickness \NC\NR
7575 \NC " \NC T, T', S, S', SS, SS' \NC FractionNumeratorGapMin \NC default_rule_thickness \NC\NR
7576 \NC \tex{Umathfractionrule} \NC -- \NC FractionRuleThickness \NC default_rule_thickness \NC\NR
7577 \NC \tex{Umathlimitabovebgap} \NC -- \NC UpperLimitBaselineRiseMin \NC big_op_spacing3 \NC\NR
7578 \NC \tex{Umathlimitabovekern} \NC -- \NC 0$^1$ \NC big_op_spacing5 \NC\NR
7579 \NC \tex{Umathlimitabovevgap} \NC -- \NC UpperLimitGapMin \NC big_op_spacing1 \NC\NR
7580 \NC \tex{Umathlimitbelowbgap} \NC -- \NC LowerLimitBaselineDropMin \NC big_op_spacing4 \NC\NR
7581 \NC \tex{Umathlimitbelowkern} \NC -- \NC 0$^1$ \NC big_op_spacing5 \NC\NR
7582 \NC \tex{Umathlimitbelowvgap} \NC -- \NC LowerLimitGapMin \NC big_op_spacing2 \NC\NR
7583 \NC \tex{Umathoverdelimitervgap}\NC -- \NC StretchStackGapBelowMin \NC big_op_spacing1 \NC\NR
7584 \NC \tex{Umathoverdelimiterbgap}\NC -- \NC StretchStackTopShiftUp \NC big_op_spacing3 \NC\NR
7585 \NC \tex{Umathunderdelimitervgap}\NC-- \NC StretchStackGapAboveMin \NC big_op_spacing2 \NC\NR
7586 \NC \tex{Umathunderdelimiterbgap}\NC-- \NC StretchStackBottomShiftDown \NC big_op_spacing4 \NC\NR
7587 \NC \tex{Umathoverbarkern} \NC -- \NC OverbarExtraAscender \NC default_rule_thickness \NC\NR
7588 \NC \tex{Umathoverbarrule} \NC -- \NC OverbarRuleThickness \NC default_rule_thickness \NC\NR
7589 \NC \tex{Umathoverbarvgap} \NC -- \NC OverbarVerticalGap \NC 3*default_rule_thickness \NC\NR
7590 \NC \tex{Umathquad} \NC -- \NC <font_size(f)>$^1$ \NC math_quad \NC\NR
7591 \NC \tex{Umathradicalkern} \NC -- \NC RadicalExtraAscender \NC default_rule_thickness \NC\NR
7592 \NC \tex{Umathradicalrule} \NC -- \NC RadicalRuleThickness \NC <not set>$^2$ \NC\NR
7593 \NC \tex{Umathradicalvgap} \NC D, D' \NC RadicalDisplayStyleVerticalGap \NC (default_rule_thickness+\crlf
7594 (abs(math_x_height)/4))$^3$ \NC\NR
7595 \NC " \NC T, T', S, S', SS, SS' \NC RadicalVerticalGap \NC (default_rule_thickness+\crlf
7596 (abs(default_rule_thickness)/4))$^3$ \NC\NR
7597 \NC \tex{Umathradicaldegreebefore}\NC -- \NC RadicalKernBeforeDegree \NC <not set>$^2$ \NC\NR
7598 \NC \tex{Umathradicaldegreeafter}\NC -- \NC RadicalKernAfterDegree \NC <not set>$^2$ \NC\NR
7599 \NC \tex{Umathradicaldegreeraise}\NC -- \NC RadicalDegreeBottomRaisePercent \NC <not set>$^{2,7}$ \NC\NR
7600 \NC \tex{Umathspaceafterscript} \NC -- \NC SpaceAfterScript \NC script_space$^4$ \NC\NR
7601 \NC \tex{Umathstackdenomdown} \NC D, D' \NC StackBottomDisplayStyleShiftDown \NC denom1 \NC\NR
7602 \NC " \NC T, T', S, S', SS, SS' \NC StackBottomShiftDown \NC denom2 \NC\NR
7603 \NC \tex{Umathstacknumup} \NC D, D' \NC StackTopDisplayStyleShiftUp \NC num1 \NC\NR
7604 \NC " \NC T, T', S, S', SS, SS' \NC StackTopShiftUp \NC num3 \NC\NR
7605 \NC \tex{Umathstackvgap} \NC D, D' \NC StackDisplayStyleGapMin \NC 7*default_rule_thickness \NC\NR
7606 \NC " \NC T, T', S, S', SS, SS' \NC StackGapMin \NC 3*default_rule_thickness \NC\NR
7607 \NC \tex{Umathsubshiftdown} \NC -- \NC SubscriptShiftDown \NC sub1 \NC\NR
7608 \NC \tex{Umathsubshiftdrop} \NC -- \NC SubscriptBaselineDropMin \NC sub_drop \NC\NR
7609 \NC \tex{Umathsubsupshiftdown} \NC -- \NC SubscriptShiftDownWithSuperscript$^8$ \NC \NC\NR
7610 \NC \NC \NC \quad\ or SubscriptShiftDown \NC sub2 \NC\NR
7611 \NC \tex{Umathsubtopmax} \NC -- \NC SubscriptTopMax \NC (abs(math_x_height * 4) / 5) \NC\NR
7612 \NC \tex{Umathsubsupvgap} \NC -- \NC SubSuperscriptGapMin \NC 4*default_rule_thickness \NC\NR
7613 \NC \tex{Umathsupbottommin} \NC -- \NC SuperscriptBottomMin \NC (abs(math_x_height) / 4) \NC\NR
7614 \NC \tex{Umathsupshiftdrop} \NC -- \NC SuperscriptBaselineDropMax \NC sup_drop \NC\NR
7615 \NC \tex{Umathsupshiftup} \NC D \NC SuperscriptShiftUp \NC sup1 \NC\NR
\NC " \NC T, S, SS \NC SuperscriptShiftUp \NC sup2 \NC\NR
7617 \NC " \NC D', T', S', SS' \NC SuperscriptShiftUpCramped \NC sup3 \NC\NR
7618 \NC \tex{Umathsupsubbottommax} \NC -- \NC SuperscriptBottomMaxWithSubscript \NC (abs(math_x_height * 4) / 5) \NC\NR
7619 \NC \tex{Umathunderbarkern} \NC -- \NC UnderbarExtraDescender \NC default_rule_thickness \NC\NR
7620 \NC \tex{Umathunderbarrule} \NC -- \NC UnderbarRuleThickness \NC default_rule_thickness \NC\NR
7621 \NC \tex{Umathunderbarvgap} \NC -- \NC UnderbarVerticalGap \NC 3*default_rule_thickness \NC\NR
7622 \NC \tex{Umathconnectoroverlapmin}\NC -- \NC MinConnectorOverlap \NC 0$^5$ \NC\NR
7623 \stoptabulate
7625 \stop
Note 1: \OPENTYPE\ fonts set \tex{Umathlimitabovekern} and
\tex{Umathlimitbelowkern} to zero and set \tex{Umathquad} to the font size of the font in use,
because these are not supported in the MATH table.
7631 Note 2: \TFM\ fonts do not set \tex{Umathradicalrule} because \TeX82\ uses the height of the radical
instead. If this parameter is still unset when \LUATEX\ has to typeset a radical, a backward
compatibility mode kicks in that assumes that an oldstyle \TeX\ font is used. Also, \TFM\ fonts do
7634 not set \tex{Umathradicaldegreebefore}, \tex{Umathradicaldegreeafter}, and
7635 \tex{Umathradicaldegreeraise}. These are then automatically initialized to $5/18$quad, $-10/18$quad, and 60.
7637 Note 3: If tfm fonts are used, then the \tex{Umathradicalvgap} is not set until the first time
7638 \LUATEX\ has to typeset a formula because this needs parameters from both family2 and family3.
This provides backward compatibility with \TEX82, but that compatibility is only partial:
7640 once the \tex{Umathradicalvgap} is set, it will not be recalculated any more.
Note 4: (also if tfm fonts are used) A similar situation arises with respect to \tex{Umathspaceafterscript}: it is not
7643 set until the first time \LUATEX\ has to typeset a formula. This provides some backward compatibility with
7644 \TEX82. But once the \tex{Umathspaceafterscript} is set, \tex{scriptspace} will never be looked at again.
7646 Note 5: Tfm fonts set \tex{Umathconnectoroverlapmin} to zero because
7647 \TeX82\ always stacks extensibles without any overlap.
7649 Note 6: The \tex{Umathoperatorsize} is only used in \type{\displaystyle}, and is only set
7650 in \OPENTYPE\ fonts. In \TFM\ font mode, it is artificially set to one scaled point more than the
initial attempt's size, so that the \quote{first next} will always be tried, just like in \TEX82.
7653 Note 7: The \tex{Umathradicaldegreeraise} is a special case because it is the only parameter that is
7654 expressed in a percentage instead of as a number of scaled points.
7656 Note 8: \type{SubscriptShiftDownWithSuperscript} does not actually exist in the \quote{standard}
7657 Opentype Math font Cambria, but it is useful enough to be added. New in version 0.38.
7659 Note 9: \type{FractionDelimiterDisplayStyleSize} and \type{FractionDelimiterSize} do not actually exist in the \quote{standard}
7660 Opentype Math font Cambria, but were useful enough to be added. New in version 0.47.
7663 \section{Math spacing setting}
7665 Besides the parameters mentioned in the previous sections, there are
7666 also 64 new primitives to control the math spacing table (as explained in
7667 Chapter~18 of the \TeX book). The primitive names are a simple matter
7668 of combining two math atom types, but for completeness' sake, here is
7669 the whole list:
7671 \startcolumns[n=2]
7672 \starttyping
7673 \Umathordordspacing
7674 \Umathordopspacing
7675 \Umathordbinspacing
7676 \Umathordrelspacing
7677 \Umathordopenspacing
7678 \Umathordclosespacing
7679 \Umathordpunctspacing
7680 \Umathordinnerspacing
7681 \Umathopordspacing
7682 \Umathopopspacing
7683 \Umathopbinspacing
7684 \Umathoprelspacing
7685 \Umathopopenspacing
7686 \Umathopclosespacing
7687 \Umathoppunctspacing
7688 \Umathopinnerspacing
7689 \Umathbinordspacing
7690 \Umathbinopspacing
7691 \Umathbinbinspacing
7692 \Umathbinrelspacing
7693 \Umathbinopenspacing
7694 \Umathbinclosespacing
7695 \Umathbinpunctspacing
7696 \Umathbininnerspacing
7697 \Umathrelordspacing
7698 \Umathrelopspacing
7699 \Umathrelbinspacing
7700 \Umathrelrelspacing
7701 \Umathrelopenspacing
7702 \Umathrelclosespacing
7703 \Umathrelpunctspacing
7704 \Umathrelinnerspacing
7705 \Umathopenordspacing
7706 \Umathopenopspacing
7707 \Umathopenbinspacing
7708 \Umathopenrelspacing
7709 \Umathopenopenspacing
7710 \Umathopenclosespacing
7711 \Umathopenpunctspacing
7712 \Umathopeninnerspacing
7713 \Umathcloseordspacing
7714 \Umathcloseopspacing
7715 \Umathclosebinspacing
7716 \Umathcloserelspacing
7717 \Umathcloseopenspacing
7718 \Umathcloseclosespacing
7719 \Umathclosepunctspacing
7720 \Umathcloseinnerspacing
7721 \Umathpunctordspacing
7722 \Umathpunctopspacing
7723 \Umathpunctbinspacing
7724 \Umathpunctrelspacing
7725 \Umathpunctopenspacing
7726 \Umathpunctclosespacing
7727 \Umathpunctpunctspacing
7728 \Umathpunctinnerspacing
7729 \Umathinnerordspacing
7730 \Umathinneropspacing
7731 \Umathinnerbinspacing
7732 \Umathinnerrelspacing
7733 \Umathinneropenspacing
7734 \Umathinnerclosespacing
7735 \Umathinnerpunctspacing
7736 \Umathinnerinnerspacing
7737 \stoptyping
7738 \stopcolumns
7740 These parameters are of type \type{\muskip}, so setting a parameter
7741 can be done like this:
7743 \starttyping
7744 \Umathopordspacing\displaystyle=4mu plus 2mu
7745 \stoptyping
7747 They are all initialized by initex to the values mentioned in the
7748 table in Chapter~18 of the \TeX book.
7750 Note 1: for ease of use as well as for backward compatibility, \type{\thinmuskip},
\type{\medmuskip} and \type{\thickmuskip} are treated specially. In their case a pointer to
7752 the corresponding internal parameter is saved, not the actual \type{\muskip} value. This
7753 means that any later changes to one of these three parameters will be taken into account.
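
The effect described in Note 1 can be sketched as follows. The values are arbitrary and only meant as an illustration.

\starttyping
% stores a reference to \thinmuskip, not its current value
\Umathordopspacing\textstyle=\thinmuskip
% stores a fixed value
\Umathordbinspacing\textstyle=4mu plus 2mu minus 4mu
% a later change: picked up by the ord-op spacing, but not by the ord-bin spacing
\thinmuskip=2mu
\stoptyping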
7755 Note 2: Careful readers will realise that there are also primitives
7756 for the items marked \type{*} in the \TeX book. These will not
7757 actually be used as those combinations of atoms cannot actually
7758 happen, but it seemed better not to break orthogonality. They are initialized to zero.
7761 \section[mathacc]{Math accent handling}
7763 \LUATEX\ supports both top accents and bottom accents in math mode,
7764 and math accents stretch automatically (if this is supported by the
7765 font the accent comes from, of course). Bottom and combined accents as
7766 well as fixed-width math accents are controlled by optional keywords
7767 following \tex{Umathaccent}.
7769 The keyword \type{bottom} after \tex{Umathaccent} signals that a bottom
7770 accent is needed, and the keyword \type{both} signals that both a top
7771 and a bottom accent are needed (in this case two accents need to be
7772 specified, of course).
7774 Then the set of three integers defining the accent is read. This set
7775 of integers can be prefixed by the \type{fixed} keyword to indicate
7776 that a non-stretching variant is requested (in case of both accents,
7777 this step is repeated).
7779 A simple example:
7780 \starttyping
7781 \Umathaccent both fixed 0 0 "20D7 fixed 0 0 "20D7 {example}
7782 \stoptyping
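
A few further combinations of the keywords, as a sketch that follows the syntax described above. It assumes that math family~0 holds a font that provides the accent at slot \type{"20D7}.

\starttyping
$\Umathaccent 0 0 "20D7 {x}$                      % stretching top accent
$\Umathaccent fixed 0 0 "20D7 {x}$                % non-stretching top accent
$\Umathaccent bottom 0 0 "20D7 {x}$               % stretching bottom accent
$\Umathaccent both 0 0 "20D7 fixed 0 0 "20D7 {x}$ % first accent stretches, second is fixed
\stoptyping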
7784 If a math top accent has to be placed and the accentee is a character and has a non-zero
7785 \type{top_accent} value, then this value will be used to place the accent instead of
7786 the \type{\skewchar} kern used by \TEX82.
7788 The \type{top_accent} value represents a vertical line somewhere in the accentee. The
7789 accent will be shifted horizontally such that its own \type{top_accent} line coincides
7790 with the one from the accentee. If the \type{top_accent} value of the accent is zero,
7791 then half the width of the accent followed by its italic correction is used instead.
7793 The vertical placement of a top accent depends on the \type{x_height} of the font of the
accentee (as explained in the \TEX book), but if that value turns out to be zero and the
font has a MathConstants table, then \type{AccentBaseHeight} is used instead.
7797 If a math bottom accent has to be placed, the \type{bot_accent} value is checked instead
7798 of \type{top_accent}. Because bottom accents do not exist in \TEX82, the \type{\skewchar}
7799 kern is ignored.
7801 The vertical placement of a bottom accent is straight below the accentee, no correction
7802 takes place.
7804 \section{Math root extension}
7806 The new primitive \type{\Uroot} allows the construction of a radical
7807 noad including a degree field. Its syntax is an extension of \type{\Uradical}:
7809 \starttyping
7810 \Uradical <fam integer> <char integer> <radicand>
7811 \Uroot <fam integer> <char integer> <degree> <radicand>
7812 \stoptyping
7814 The placement of the degree is controlled by the math parameters
7815 \type{\Umathradicaldegreebefore}, \type{\Umathradicaldegreeafter}, and
7816 \type{\Umathradicaldegreeraise}. The degree will be typeset in \type{\scriptscriptstyle}.
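
As an illustration, a cube root could be entered as in the following sketch. It assumes that math family~0 holds a font with a radical at slot \type{"221A}, and the parameter values are arbitrary.

\starttyping
\Umathradicaldegreebefore\textstyle=2pt
\Umathradicaldegreeafter\textstyle=-5pt
\Umathradicaldegreeraise\textstyle=60 % an integer percentage, not a dimension
$\Uroot 0 "221A {3} {x+1}$
\stoptyping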
7819 \section{Math kerning in super- and subscripts}
7821 The character fields in a lua-loaded OpenType math font can have a \quote{mathkern} table.
7822 The format of this table is the same as the \quote{mathkern} table that is returned by
7823 the \type{fontloader} library, except that all height and kern values have to
7824 be specified in actual scaled points.
7826 When a super- or subscript has to be placed next to a math item, \LUATEX\ checks
7827 whether the super- or subscript and the nucleus are both simple character items. If
they are, and if the fonts of both character items are OpenType fonts (as opposed to
7829 legacy \TEX\ fonts), then \LUATEX\ will use the OpenType MATH algorithm for deciding
7830 on the horizontal placement of the super- or subscript.
7832 This works as follows:
7834 \startitemize
7835 \item The vertical position of the script is calculated.
7836 \item The default horizontal position is flat next to the base character.
7837 \item For superscripts, the italic correction of the base character is added.
7838 \item For a superscript, two vertical values are calculated: the bottom of the
7839 script (after shifting up), and the top of the base. For a subscript,
7840 the two values are the top of the (shifted down) script, and the bottom
7841 of the base.
7842 \item For each of these two locations:
7843 \startitemize
7844 \item find the mathkern value at this height for the base
7845 (for a subscript placement, this is the bottom_right corner,
7846 for a superscript placement the top_right corner)
7847 \item find the mathkern value at this height for the script
7848 (for a subscript placement, this is the top_left corner,
7849 for a superscript placement the bottom_left corner)
7850 \item add the found values together to get a preliminary result.
7851 \stopitemize
\item The horizontal kern to be applied is the smaller of the two results from
the previous step.
7854 \stopitemize
7856 The mathkern value at a specific height is the kern value that is specified by the
7857 next higher height and kern pair, or the highest one in the character (if there is no
7858 value high enough in the character), or simply zero (if the character has no mathkern
7859 pairs at all).
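
As a worked illustration with hypothetical numbers, suppose that the relevant corner of a character has the height and kern pairs $(300, 20)$ and $(600, 50)$, all in scaled points.

\startitemize[packed]
\item At height 100, the next higher pair is $(300, 20)$, so the value is 20.
\item At height 450, the next higher pair is $(600, 50)$, so the value is 50.
\item At height 700, no pair is high enough, so the highest pair applies and the value is again 50.
\stopitemize

If the base and the script contribute 50 and $-10$ at the first of the two heights, and 20 and $-10$ at the second, then the preliminary results are 40 and 10, and the kern that is applied is 10.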
7861 \section{Scripts on horizontally extensible items like arrows}
7863 The new primitives \tex{Uunderdelimiter} and \tex{Uoverdelimiter}
7864 (both from 0.35) allow the placement of a subscript or superscript on
7865 an automatically extensible item and \tex{Udelimiterunder} and
7866 \tex{Udelimiterover} (both from 0.37) allow the placement of
7867 an automatically extensible item as a subscript or superscript on a
7868 nucleus.
7870 The vertical placements are controlled by
7871 \tex{Umathunderdelimiterbgap}, \tex{Umathunderdelimitervgap},
7872 \tex{Umathoverdelimiterbgap}, and \tex{Umathoverdelimitervgap} in a similar way as limit
placements on large operators. The superscript in \tex{Uoverdelimiter} is typeset in
a suitable scripted style; the subscript in \tex{Uunderdelimiter} is cramped as well.
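
The following sketch shows the four primitives side by side. It assumes that math family~0 holds a font in which the arrow at slot \type{"2194} is horizontally extensible, and that \type{\strut} is provided by the macro package.

\starttyping
$\Uoverdelimiter  0 "2194 {\hbox{\strut overdelimiter}}$
$\Uunderdelimiter 0 "2194 {\hbox{\strut underdelimiter}}$
$\Udelimiterover  0 "2194 {\hbox{\strut delimiterover}}$
$\Udelimiterunder 0 "2194 {\hbox{\strut delimiterunder}}$
\stoptyping

In the first two lines the braced material is the script that is placed on the extensible item. In the last two lines it is the nucleus that carries the extensible item.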
7876 \section {Extensible delimiters}
7878 \LUATEX\ internally uses a structure that supports \OPENTYPE\ \quote{MathVariants} as well
7879 as \TFM\ \quote{extensible recipes}.
7882 \section{Other Math changes}
7884 \subsection {Verbose versions of single-character math commands}
7886 \LUATEX\ defines six new primitives that have the same function as
7887 \type{^}, \type{_}, \type{$}, and \type{$$}. %$
7889 \starttabulate[|l|l|l|l|]
7890 \NC \bf primitive \NC \bf explanation \NC\NR
7891 \NC \tex{Usuperscript} \NC Duplicates the functionality of \type{^} \NC\NR
7892 \NC \tex{Usubscript} \NC Duplicates the functionality of \type{_} \NC\NR
7893 \NC \tex{Ustartmath} \NC Duplicates the functionality of \type{$}, % $
7894 when used in non-math mode. \NC\NR
7895 \NC \tex{Ustopmath} \NC Duplicates the functionality of \type{$}, % $
7896 when used in inline math mode. \NC\NR
7897 \NC \tex{Ustartdisplaymath}\NC Duplicates the functionality of \type{$$}, % $$
7898 when used in non-math mode. \NC\NR
7899 \NC \tex{Ustopdisplaymath} \NC Duplicates the functionality of \type{$$}, % $$
7900 when used in display math mode. \NC\NR
7901 \stoptabulate
7903 All are new in version 0.38. The \tex{Ustopmath} and \tex{Ustopdisplaymath}
7904 primitives check if the current math mode is the correct one (inline
7905 vs. displayed), but you can freely intermix the four mathon|/|mathoff
7906 commands with explicit dollar sign(s).
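
A short sketch of their use, assuming that the primitives are available under the names shown:

\starttyping
\Ustartmath x\Usuperscript 2 + y\Usubscript 1 \Ustopmath
\Ustartdisplaymath a\Usuperscript{n} \Ustopdisplaymath
\Ustartmath x\Usuperscript 2 $ % a verbose start closed by an explicit dollar sign
\stoptyping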
7909 \subsection{Allowed math commands in non-math modes}
The commands \type{\mathchar} and \type{\Umathchar}, and control
7912 sequences that are the result of \type{\mathchardef} or
7913 \type{\Umathchardef} are also acceptable in the horizontal and vertical modes.
7914 In those cases, the \type{\textfont} from the requested math family is used.
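
A minimal sketch, assuming that math families 0 and 1 have text fonts assigned that contain the requested slots. The control sequence names \type{\myalpha} and \type{\myplus} are hypothetical.

\starttyping
\Umathchardef\myalpha="0 "1 "03B1 % class 0, family 1, slot "03B1
\mathchardef\myplus="202B         % class 2, family 0, slot "2B
\hbox{\myalpha\myplus}
\stoptyping

Here \type{\myalpha} is taken from \type{\textfont1} and \type{\myplus} from \type{\textfont0}, although math mode is never entered.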
7916 \section{Math todo}
7918 The following items are still todo.
7920 \startitemize
7921 \item Pre-scripts.
7922 \item Multi-story stacks.
7923 \item Flattened accents for high characters (?).
7924 \item Better control over the spacing around displays and handling of equation numbers.
7925 \item Support for multi-line displays using \MATHML\ style alignment points.
7926 \stopitemize
7928 \chapter[languages]{Languages and characters, fonts and glyphs}
7930 \LUATEX's internal handling of the characters and glyphs that eventually
7931 become typeset is quite different from the way \TEX82 handles those
7932 same objects. The easiest way to explain the difference is to focus on
7933 unrestricted horizontal mode (i.\,e.\ paragraphs) and hyphenation first.
7934 Later on, it will be easy to deal with the differences that occur in
7935 horizontal and math modes.
7937 In \TEX82, the characters you type are converted into \type{char_node}
7938 records when they are encountered by the main control loop. \TEX\
7939 attaches and processes the font information while creating those
7940 records, so that the resulting \quote{horizontal list} contains the final
7941 forms of ligatures and implicit kerning. This packaging is needed because
7942 we may want to get the effective width of for instance a horizontal box.
7944 When it becomes necessary to hyphenate words in a paragraph, \TEX\
converts (one word at a time) the \type{char_node} records into a
7946 string array by replacing ligatures with their components and
7947 ignoring the kerning. Then it runs the hyphenation algorithm on this
7948 string, and converts the hyphenated result back into a
7949 \quote{horizontal list} that is consecutively spliced back into
7950 the paragraph stream. Keep in mind that the paragraph may contain unboxed horizontal material,
7951 which then already contains ligatures and kerns and the words therein
7952 are part of the hyphenation process.
7954 The \type{char_node} records are somewhat misnamed, as they are glyph
7955 positions in specific fonts, and therefore not really \quote{characters}
7956 in the linguistic sense. There is no language information inside the
7957 \type{char_node} records. Instead, language information is passed along
7958 using \type{language whatsit} records inside the horizontal list.
7960 In \LUATEX, the situation is quite different. The characters you
7961 type are always converted into \type{glyph_node} records with a
7962 special subtype to identify them as being intended as linguistic
7963 characters. \LUATEX\ stores the needed language information in those
7964 records, but does not do any font|-|related processing at the time of
7965 node creation. It only stores the index of the font.
7967 When it becomes necessary to typeset a paragraph, \LUATEX\ first
7968 inserts all hyphenation points right into the whole node list.
7969 Next, it processes all the font information in the whole list
7970 (creating ligatures and adjusting kerning), and finally it adjusts
7971 all the subtype identifiers so that the records are \quote{glyph
7972 nodes} from now on.
7974 That was the broad overview. The rest of this chapter will deal with the
7975 minutiae of the new process.
7977 \section[charsandglyphs]{Characters and glyphs}
7979 \TEX82 (including \PDFTEX) differentiated between \type{char_node}s
7980 and \type{lig_node}s. The former are simple items that contained
7981 nothing but a \quote{character} and a \quote{font} field, and they
7982 lived in the same memory as tokens did. The latter also contained a
7983 list of components, and a subtype indicating whether this ligature was
7984 the result of a word boundary, and it was stored in the same place as
7985 other nodes like boxes and kerns and glues.
7987 In \LUATEX, these two types are merged into one, somewhat larger
7988 structure called a \type{glyph_node}. Besides having the old
7989 character, font, and component fields, and the new special fields like
7990 \quote{attr} (see~\in{section}[glyphnodes]), these nodes also contain:
7992 \startitemize
7994 \item A subtype, split into four main types:
7996 \startitemize
7997 \item \type{character}, for characters to be hyphenated: the lowest
7998 bit (bit 0) is set to 1.
7999 \item \type{glyph}, for specific font glyphs: the lowest bit
8000 (bit 0) is not set.
8001 \item \type{ligature}, for ligatures (bit 1 is set)
8002 \item \type{ghost}, for \quote{ghost objects} (bit 2 is set)
8003 \stopitemize
8005 The latter two make further use of two extra fields (bits 3 and 4):
8007 \startitemize
8008 \item \type{left}, for ligatures created from a left word boundary and
8009 for ghosts created from \tex{leftghost}
8010 \item \type{right}, for ligatures created from a right word boundary and
8011 for ghosts created from \tex{rightghost}
8012 \stopitemize
8014 For ligatures, both bits can be set at the same time (in case of a single|-|glyph word).
8016 \item \type{glyph_node}s of type \quote{character} also contain language data,
8017 split into four items that were current when the node was created:
8018 the \tex{setlanguage} (15 bits), \tex{lefthyphenmin} (8 bits),
8019 \tex{righthyphenmin} (8 bits), and \tex{uchyph} (1 bit).
8021 \stopitemize
8023 Incidentally, \LUATEX\ allows 16383 separate languages, and words can
8024 be 256 characters long.
8026 Because the \tex{uchyph} value is saved in the actual nodes, its
8027 handling is subtly different from \TEX82: changes to \tex{uchyph}
8028 become effective immediately, not at the end of the current partial
8029 paragraph.
8031 Typeset boxes now always have their language information embedded in
8032 the nodes themselves, so there is no longer a possible dependency on
8033 the surrounding language settings. In \TEX82, a mid-paragraph
8034 statement like \tex{unhbox0} would process the box using the current
8035 paragraph language unless there was a \tex{setlanguage} issued inside
8036 the box. In \LUATEX, all language variables are already frozen.
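
A sketch of the difference, using the hypothetical language numbers 1 and 2, both of which are assumed to have patterns loaded:

\starttyping
\setbox0=\hbox{\language=1 hyphenation}
\language=2
Some text \unhbox0\ and more text.
\stoptyping

In \TEX82 the word from the box is expected to be hyphenated with the patterns of language~2, because there is no \tex{setlanguage} inside the box. In \LUATEX\ the nodes were created while \tex{language} was 1, so language~1 is expected to be used.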
8039 \section{The main control loop}
8041 In \LUATEX's main loop, almost all input characters that are to be
8042 typeset are converted into \type{glyph_node} records with subtype
8043 \quote{character}, but there are a few small exceptions.
First, the \tex{accent} primitive creates nodes with subtype \quote{glyph}
instead of \quote{character}: one for the actual accent and one for the
accentee. The primary reason for this is that \tex{accent} in \TEX82
is explicitly dependent on the current font encoding, so it would not
make much sense to attach a new meaning to the primitive's name, as
that would invalidate many old documents and macro packages. A
secondary reason is that in \TEX82, \tex{accent} prohibits hyphenation
of the current word. Since in \LUATEX\ hyphenation only takes place on
\quote{character} nodes, it is possible to achieve the same effect.

This change of meaning did happen with \tex{char}, which now generates
\quote{character} nodes, consistent with its changed meaning in \XETEX.
The changed status of \tex{char} is not yet finalized, but if it stays
as it is now, a new primitive \tex{glyph} should be added to directly
insert a font glyph id.

Second, all the results of processing in math mode eventually become
nodes with \quote{glyph} subtypes.

Third, the \ALEPH-derived commands \tex{leftghost} and
\tex{rightghost} create nodes of a third subtype: \quote{ghost}. These nodes
are ignored completely by all further processing until the stage where
inter-glyph kerning is added.

Fourth, automatic discretionaries are handled differently. \TEX82
inserts an empty discretionary after sensing an input character that
matches the \tex{hyphenchar} in the current font. This test is wrong,
in our opinion: whether or not hyphenation takes place should not
depend on the current font; it is a language property.

In \LUATEX, it works like this: if \LUATEX\ senses a string of input
characters that matches the value of the new integer parameter
\tex{exhyphenchar}, it will insert an explicit discretionary after that
series of nodes. Initex sets \tex{exhyphenchar=`\-}.
Incidentally, this is a global parameter instead of a
language-specific one because it may be useful to change the value
depending on the document structure instead of the text language.

Note: as of \LUATEX\ 0.63.0, the insertion of discretionaries after
a sequence of explicit hyphens happens at the same time as the other
hyphenation processing, {\it not\/} inside the main control loop.

The only use \LUATEX\ has for \tex{hyphenchar} is in the check
whether a word should be considered for hyphenation at all. If the
\tex{hyphenchar} of the font attached to the first character node in a
word is negative, then hyphenation of that word is abandoned
immediately. {\bf This behavior is added for backward
compatibility only, and \type{\hyphenchar=-1} should not be used as a
means of preventing hyphenation in new \LUATEX\ documents.}

Fifth, \tex{setlanguage} no longer creates whatsits. The meaning of
\tex{setlanguage} is changed so that it is now an integer parameter
like all others. That integer parameter is used in \type{glyph_node}
creation to add language information to the glyph nodes. In
conjunction, the \tex{language} primitive is extended so that it
always also updates the value of \tex{setlanguage}.

Sixth, the \tex{noboundary} command (this command prohibits word
boundary processing where that would normally take place) now does
create whatsits. These whatsits are needed because the exact place of
the \tex{noboundary} command in the input stream has to be retained
until after the ligature and font processing stages.

Finally, there is no longer a \type{main_loop} label in the
code. Remember that \TEX82 did quite a lot of processing while adding
\type{char_nodes} to the horizontal list? For speed reasons, it handled
that processing code outside of the \quote{main control} loop, and only the
first character of any \quote{word} was handled by that \quote{main control} loop.
In \LUATEX, there is no longer a need for that (all hard work is done
later), and the (now very small) bits of character-handling code have
been moved back inline. When \tex{tracingcommands} is on, this is
visible because the full word is reported, instead of just the initial
character.


\section[patternsexceptions]{Loading patterns and exceptions}

The hyphenation algorithm in \LUATEX\ is quite different from the one
in \TEX82, although it uses essentially the same user input.

After expansion, the argument for \tex{patterns} has to be proper
UTF-8 with individual patterns separated by spaces; no \tex{char} or
\tex{chardef}-ed commands are allowed. (The current implementation is
even more strict, and will reject all non|-|\UNICODE\ characters, but
that will be changed in the future. For now, the generated errors are
a valuable tool in discovering font-encoding specific pattern files.)

Likewise, the expanded argument for \tex{hyphenation} also has to be
proper UTF-8, but here a tiny little bit of extra syntax is provided:

\startitemize[n]
\item Three sets of arguments in curly braces (\type{{}{}{}})
      indicate a desired complex discretionary, with arguments
      as in the \tex{discretionary} command in normal document input.
\item \type{-} indicates a desired simple discretionary, cf. \tex{-} and
      \type{\discretionary{-}{}{}} in normal document input.
\item Internal command names are ignored. This rule is provided
      especially for \tex{discretionary}, but it also helps to deal with
      \tex{relax} commands that may sneak in.
\item \type{=} indicates a (non-discretionary) hyphen in the document input.
\stopitemize

The expanded argument is first converted back to a space-separated
string while dropping the internal command names. This string is then
converted into a dictionary by a routine that creates key||value pairs
by converting the other listed items. It is important to note that the
keys in an exception dictionary can always be generated from the
values. Here are a few examples:

\starttabulate[|l|l|l|]
\NC \ssbf value \NC \ssbf implied key (input) \NC \ssbf effect\NC\NR
\NC \type{ta-ble} \NC table \NC \type{ta\-ble}
    ($=$ \type{ta\discretionary{-}{}{}ble})\NC\NR
\NC \type{ba{k-}{}{c}ken}\NC backen \NC \type{ba\discretionary{k-}{}{c}ken}\NC\NR
\stoptabulate
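
The way a key follows from its value can be modelled in a few lines of
plain \LUA. This is only an illustrative sketch, not the routine that
\LUATEX\ uses internally: the function name \type{exception_key} is made
up for this example, and mapping \type{=} to a literal hyphen in the key
is an assumption based on the list of syntax rules.

\starttyping
-- Hypothetical helper: derive the dictionary key from an exception value.
local function exception_key(value)
    local key = { }
    local i = 1
    while i <= #value do
        -- {pre}{post}{replace}: only the replace text ends up in the key
        local pre, post, replace, nxt =
            value:match("^{([^{}]*)}{([^{}]*)}{([^{}]*)}()", i)
        if pre then
            key[#key + 1] = replace
            i = nxt
        else
            local c = value:sub(i, i)
            if c == "=" then
                key[#key + 1] = "-"   -- assumption: a literal hyphen
            elseif c ~= "-" then
                key[#key + 1] = c     -- ordinary byte, copied as-is
            end                       -- a bare "-" adds nothing to the key
            i = i + 1
        end
    end
    return table.concat(key)
end

print(exception_key("ta-ble"))
print(exception_key("ba{k-}{}{c}ken"))
\stoptyping

The two calls print \type{table} and \type{backen}, the implied keys
shown in the table.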

The resultant patterns and exception dictionary will be stored under
the language code that is the present value of \tex{language}.

In the last line of the table, you see there is no \tex{discretionary}
command in the value: the command is optional in the \TEX-based input
syntax. The underlying reason for that is that it is conceivable that
a whole dictionary of words is stored as a plain text file and loaded
into \LUATEX\ using one of the functions in the \LUA\ \luatex{lang}
library. This loading method is quite a bit faster than going through
the \TEX\ language primitives, but some (most?) of that speed gain
would be lost if it had to interpret command sequences while doing so.
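
A minimal sketch of that loading method follows. It has to run inside
\LUATEX\ (for instance from \type{\directlua}). The file name
\type{exceptions.txt} is hypothetical, and the file is assumed to contain
exceptions in the syntax described in this section.

\starttyping
-- Hypothetical plain text file with entries such as: ta-ble ba{k-}{}{c}ken
local f = assert(io.open("exceptions.txt", "r"))
local words = f:read("*a")
f:close()

-- Precaution (an assumption, not a documented requirement):
-- turn line breaks and tabs into single spaces.
words = words:gsub("%s+", " ")

-- Language object for the current value of \language.
local l = lang.new(tex.language)

-- Add the exceptions to that language's dictionary.
lang.hyphenation(l, words)
\stoptyping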

Starting with \LUATEX\ 0.63.0, it is possible to specify extra hyphenation
points in compound words by using \type{{-}{}{-}} for the explicit hyphen
character (replace \type{-} by the actual explicit hyphen character if needed).
For example, this matches the word \quote{multi-word-boundaries} and allows
an extra break between \quote{boun} and \quote{daries}:

\starttyping
\hyphenation{multi{-}{}{-}word{-}{}{-}boun-daries}
\stoptyping

The motivation behind the \ETEX\ extension \tex{savinghyphcodes} was
that hyphenation heavily depended on font encodings. This is no longer
true in \LUATEX, and the corresponding primitive is ignored pending
complete removal. The future semantics of \tex{uppercase} and
\tex{lowercase} are still under consideration; no changes have taken
place yet.


\section{Applying hyphenation}

The internal structures \LUATEX\ uses for the insertion of
discretionaries in words are very different from the ones in \TEX82,
and that means there are some noticeable differences in handling as
well.

First and foremost, there is no \quote{compressed trie} involved in
hyphenation. The algorithm still reads \PATGEN-generated pattern
files, but \LUATEX\ uses a finite state hash to match the patterns
against the word to be hyphenated. This algorithm is based on the
\quote{libhnj} library used by OpenOffice, which in turn is inspired
by \TEX.
The memory allocation for this new implementation is completely
dynamic, so the \WEBC\ setting for \type{trie_size} is ignored.

Differences between \LUATEX\ and \TEX82 that are a direct result of that:

\startitemize
\item \LUATEX\ happily hyphenates the full \UNICODE\ character range.
\item Pattern and exception dictionary size is limited by the
      available memory only; all allocations are done dynamically.
      The trie-related settings in \type{texmf.cnf} are ignored.
\item Because there is no \quote{trie preparation} stage, language patterns
      never become frozen. This means that the primitive \tex{patterns}
      (and its \LUA\ counterpart \luatex{lang.patterns}) can be used at any
      time, not only in initex.
\item Only the string representation of \tex{patterns} and
      \tex{hyphenation} is stored in the format file. At format load time,
      they are simply re-evaluated. It follows that there is no real
      reason to preload languages in the format file. In fact, it is
      usually not a good idea to do so. It is much smarter to load
      patterns no sooner than the first time they are actually needed.
\item \LUATEX\ uses the language-specific variables
      \tex{prehyphenchar} and \tex{posthyphenchar} in the creation of
      implicit discretionaries, instead of \TEX82's \tex{hyphenchar}, and
      the values of the language-specific variables \tex{preexhyphenchar} and
      \tex{postexhyphenchar} for explicit discretionaries (instead of
      \TEX82's empty discretionary).
\stopitemize

Inserted characters and ligatures inherit their attributes from the
nearest glyph node item (usually the preceding one, but the following
one for the items inserted at the left-hand side of a word).

Word boundaries are no longer implied by font switches, but by
language switches. One word can have two separate fonts and still be
hyphenated correctly (but it cannot have two different languages:
the \tex{setlanguage} command forces a word boundary).

All languages start out with \tex{prehyphenchar=`\-},
\tex{posthyphenchar=0}, \tex{preexhyphenchar=0} and
\tex{postexhyphenchar=0}.
When you assign a value to one of these four parameters, you are
actually changing the settings for the current \tex{language}; this
behavior is compatible with \tex{patterns} and \tex{hyphenation}.
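
The same per-language settings can also be reached from \LUA\ through
the \luatex{lang} library. The following is a small sketch to be run
inside \LUATEX; the choice of U+2010 as the new hyphen character is
only an example.

\starttyping
local l = lang.new(tex.language)   -- object for the current language
print(lang.prehyphenchar(l))       -- query the pre-break hyphen character
lang.prehyphenchar(l, 0x2010)      -- example: switch to U+2010 HYPHEN
lang.posthyphenchar(l, 0)          -- keep the initial value described above
\stoptyping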

\LUATEX\ also hyphenates the first word in a paragraph.

Words can be up to 256 characters long (up from 64 in \TEX82). Longer
words generate an error right now, but eventually either the
limitation will be removed or perhaps it will become possible to
silently ignore the excess characters (this is what happens in \TEX82,
but there the behavior cannot be controlled).

If you are using the \LUA\ function \type{lang.hyphenate}, you should be
aware that this function expects to receive a list of \quote{character}
nodes. It will not operate properly in the presence of \quote{glyph},
\quote{ligature}, or \quote{ghost} nodes, nor does it know how to deal with
kerning. In the near future, it will be able to skip over \quote{ghost}
nodes, and we may add a less fuzzy function you can call as well.
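
A typical place to call \type{lang.hyphenate} is the \type{hyphenate}
callback. The sketch below (inside \LUATEX\ only) simply hands the list
to the built-in routine, so it is meant as a starting point for code
that wants to inspect or filter the list first, not as a replacement
with different behavior.

\starttyping
callback.register("hyphenate", function(head, tail)
    -- The list from head to tail should still consist of
    -- "character" nodes at this stage.
    lang.hyphenate(head, tail)
end)
\stoptyping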

The hyphenation exception dictionary is maintained as a key-value
hash, and that is also dynamic, so the \type{hyph_size} setting is not
used either.

A technical paper detailing the new algorithm will be released as a
separate document.

\section{Applying ligatures and kerning}

After all possible hyphenation points have been inserted in the list,
\LUATEX\ will process the list to convert the \quote{character} nodes into
\quote{glyph} and \quote{ligature} nodes. This is actually done in two stages:
first all ligatures are processed, then all kerning information is
applied to the result list. But those two stages are somewhat
dependent on each other: if the font in use makes it possible to do so,
the ligaturing stage adds virtual \quote{character} nodes to the word
boundaries in the list. While doing so, it removes and interprets
\type{noboundary} nodes. The kerning stage deletes those word boundary
items after it is done with them, and it does the same for \quote{ghost}
nodes. Finally, at the end of the kerning stage, all remaining
\quote{character} nodes are converted to \quote{glyph} nodes.

This work separation is worth mentioning because, if you overrule from
\LUA\ only one of the two callbacks related to font handling, then you
have to make sure you perform the tasks normally done by \LUATEX\
itself in order to make sure that the other, non|-|overruled, routine
continues to function properly.
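
One cautious way to stay compatible is to overrule a callback but still
delegate to the built-in routine. The sketch below (inside \LUATEX\
only) does that for both stages with \type{node.ligaturing} and
\type{node.kerning}; any custom processing would go before or after
those calls.

\starttyping
callback.register("ligaturing", function(head, tail)
    -- custom work on the "character" nodes could go here
    node.ligaturing(head, tail)
end)

callback.register("kerning", function(head, tail)
    node.kerning(head, tail)
    -- custom work on the resulting list could go here
end)
\stoptyping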

Work in this area is not yet complete, but most of the possible cases
are handled by our rewritten ligaturing engine. We are working hard to
make sure all of the possible inputs will become supported soon.

For example, take the word \type{office}, hyphenated \type{of-fice},
using a \quote{normal} font with all the \type{f}-\type{f} and
\type{f}-\type{i} type ligatures:

\starttabulate[|l|l|]
\NC Initial: \NC \type{{o}{f}{f}{i}{c}{e}}\NC\NR
\NC After hyphenation: \NC \type{{o}{f}{{-},{},{}}{f}{i}{c}{e}}\NC\NR
\NC First ligature stage: \NC \type{{o}{{f-},{f},{<ff>}}{i}{c}{e}}\NC\NR
\NC Final result: \NC \type{{o}{{f-},{<fi>},{<ffi>}}{c}{e}} \NC\NR
\stoptabulate

That's bad enough, but let us assume that there is also a hyphenation
point between the \type{f} and the \type{i}, to create
\type{of-f-ice}. Then the final result should be:

\starttyping
{o}{{f-},
    {{f-},
     {i},
     {<fi>}},
    {{<ff>-},
     {i},
     {<ffi>}}}{c}{e}
\stoptyping

with discretionaries in the post-break text as well as in the
replacement text of the top-level discretionary that resulted from the
first hyphenation point.

Here is that nested solution again, in a different representation:

\starttabulate[|l|l|l|l|]
\NC \NC pre \NC post \NC replace \NC \NR
\NC topdisc \NC \type{f-}$^1$ \NC sub1 \NC sub2 \NC \NR
\NC sub1 \NC \type{f-}$^2$ \NC \type{i}$^3$ \NC \type{<fi>}$^4$ \NC \NR
\NC sub2 \NC \type{<ff>-}$^5$\NC \type{i}$^6$ \NC \type{<ffi>}$^7$\NC \NR
\stoptabulate

When line breaking is choosing its breakpoints, the following fields will eventually
be selected:

\starttabulate[|l|l|l|]
\NC \type{of-f-ice} \NC \type{f-}$^1$ \NC \NR
\NC \NC \type{f-}$^2$ \NC \NR
\NC \NC \type{i}$^3$ \NC \NR
\NC \type{of-fice} \NC \type{f-}$^1$ \NC \NR
\NC \NC \type{<fi>}$^4$ \NC \NR
\NC \type{off-ice} \NC \type{<ff>-}$^5$ \NC \NR
\NC \NC \type{i}$^6$ \NC \NR
\NC \type{office} \NC \type{<ffi>}$^7$ \NC \NR
\stoptabulate

The current solution in \LUATEX\ is not able to handle nested
discretionaries, but it is in fact smart enough to handle this
fictional \type{of-f-ice} example. It does so by combining two
sequential discretionary nodes as if they were a single object
(where the second discretionary node is treated as an extension
of the first node).

One can observe that the \type{of-f-ice} and \type{off-ice} cases both end
with the same actual post-break list (\type{i}), and that this
would be the case even if that \type{i} was the first item of a
potential following ligature like \type{ic}. This allows \LUATEX\
to do away with one of the fields, and thus make the whole thing fit
into just two discretionary nodes.

The mapping of the seven list fields to the six fields in this
discretionary node pair is as follows:

\starttabulate[|l|p|]
\NC \bf field \NC \bf description \NC \NR
\NC \type{disc1.pre} \NC \type{f-}$^1$ \NC \NR
\NC \type{disc1.post} \NC \type{<fi>}$^4$ \NC \NR
\NC \type{disc1.replace} \NC \type{<ffi>}$^7$ \NC \NR
\NC \type{disc2.pre} \NC \type{f-}$^2$ \NC \NR
\NC \type{disc2.post} \NC \type{i}$^{3{,}6}$\NC \NR
\NC \type{disc2.replace} \NC \type{<ff>-}$^5$\NC \NR
\stoptabulate

What is actually generated after ligaturing has been applied is
therefore:

\starttyping
{o}{{f-},
    {<fi>},
    {<ffi>}}
   {{f-},
    {i},
    {<ff>-}}{c}{e}
\stoptyping

The two discretionaries have different subtypes from a discretionary
appearing on its own: the first has subtype 4, and the second has
subtype 5. The need for these special subtypes stems from the fact
that not all of the fields appear in their \quote{normal} location.
The second discretionary especially looks odd, with things like the
\type{<ff>-} appearing in \type{disc2.replace}. The fact that some of
the fields have different meanings (and different processing code
internally) is what makes it necessary to have different subtypes:
this enables \LUATEX\ to distinguish this sequence of two joined
discretionary nodes from the case of two standalone discretionaries
appearing in a row.
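
As a reading aid, the field mapping and the breakpoint table can be
combined into a small plain \LUA\ model. This is not engine code: the
tables \type{disc1} and \type{disc2} and the function \type{render} are
made up for this example, and the strings merely stand in for node
lists. The character \type{|} marks a chosen line break.

\starttyping
local disc1 = { pre = "f-", post = "<fi>", replace = "<ffi>" }
local disc2 = { pre = "f-", post = "i",    replace = "<ff>-" }

-- Hypothetical helper: the text that results from breaking (or not)
-- at the first and at the second hyphenation point.
local function render(break1, break2)
    local middle
    if break1 and break2 then       -- of-f-ice: fields 1, 2 and 3
        middle = disc1.pre .. "|" .. disc2.pre .. "|" .. disc2.post
    elseif break1 then              -- of-fice: fields 1 and 4
        middle = disc1.pre .. "|" .. disc1.post
    elseif break2 then              -- off-ice: fields 5 and 6
        middle = disc2.replace .. "|" .. disc2.post
    else                            -- office: field 7
        middle = disc1.replace
    end
    return "o" .. middle .. "ce"
end

print(render(true, true), render(true, false))
print(render(false, true), render(false, false))
\stoptyping

Note how \type{disc2.replace} is only used together with a break, which
is one way to see why the joined pair cannot be processed like two
ordinary discretionaries.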


\section{Breaking paragraphs into lines}

This code is still almost unchanged, but because of the
above|-|mentioned changes with respect to discretionaries and ligatures,
line breaking will potentially be different from traditional \TEX.
The actual line breaking code is still based on the \TEX82 algorithms,
and it does not expect there to be discretionaries inside of
discretionaries.

But that situation is now fairly common in \LUATEX, due to the changes
to the ligaturing mechanism. Also, the \LUATEX\ discretionary
nodes are implemented slightly differently from the \TEX82 nodes: the
\type{no_break} text is now embedded inside the disc node, where
previously these nodes kept their place in the horizontal list (the
discretionary node contained a counter indicating how many nodes to
skip).

The combined effect of these two differences is that \LUATEX\ does not
always use all of the potential breakpoints in a paragraph, especially
when fonts with many ligatures are used.

% TODO:
% Check \sfcode handling
% Implement \glyph
% Remove \savinghyphcodes
% Allow non-UCS characters in \patterns

\chapter[fonts]{Font structure}

All \TEX\ fonts are represented to \LUA\ code as tables, and
internally as C~structures. All keys in the table below are saved in
the internal font structure if they are present in the table returned
by the
\luatex{define_font} callback, or if they result from the normal \TFM|/|\VF\
reading routines if there is no \luatex{define_font} callback defined.

The column \quote{from \VF} means that this key will be created by the
\luatex{font.read_vf()} routine, \quote{from \TFM} means that the key will be created
by the \luatex{font.read_tfm()} routine, and \quote{used} means whether or not the
\LUATEX\ engine itself will do something with the key.

The top|-|level keys in the table are as follows:

\starttabulate[|Tl|l|l|l|l|p|]
\NC \ssbf key \NC \bf from vf \NC \bf from tfm \NC \bf used\NC \bf value type \NC \bf description \NC\NR
\NC name \NC yes \NC yes \NC yes \NC string \NC metric (file) name\NC\NR
\NC area \NC no \NC yes \NC yes \NC string \NC (directory) location, typically empty\NC\NR
\NC used \NC no \NC yes \NC yes \NC boolean\NC used already? (initial: false)\NC \NR
\NC characters \NC yes \NC yes \NC yes \NC table \NC the defined glyphs of this font \NC \NR
\NC checksum \NC yes \NC yes \NC no \NC number \NC default: 0 \NC \NR
\NC designsize \NC no \NC yes \NC yes \NC number \NC expected size (default: 655360 == 10pt) \NC \NR
\NC direction \NC no \NC yes \NC yes \NC number \NC default: 0 (TLT) \NC \NR
\NC encodingbytes \NC no \NC no \NC yes \NC number \NC default: depends on \type {format}\NC\NR
\NC encodingname \NC no \NC no \NC yes \NC string \NC encoding name\NC\NR
\NC fonts \NC yes \NC no \NC yes \NC table \NC locally used fonts\NC \NR
\NC psname \NC no \NC no \NC yes \NC string
    \NC actual (\POSTSCRIPT) name (this is the PS fontname in the
        incoming font source, also used as fontname identifier in the \PDF\ output, new in 0.43)\NC\NR
\NC fullname \NC no \NC no \NC yes \NC string \NC output font name, used as a fallback in the \PDF\ output if the psname is not set\NC\NR
\NC header \NC yes \NC no \NC no \NC string \NC header comments, if any\NC \NR
\NC hyphenchar \NC no \NC no \NC yes \NC number \NC default: TeX's \tex{hyphenchar} \NC \NR
\NC parameters \NC no \NC yes \NC yes \NC hash \NC default: 7 parameters, all zero \NC \NR
\NC size \NC no \NC yes \NC yes \NC number \NC loaded (at) size. (default: same as designsize) \NC \NR
\NC skewchar \NC no \NC no \NC yes \NC number \NC default: TeX's \tex{skewchar} \NC \NR
\NC type \NC yes \NC no \NC yes \NC string \NC basic type of this font\NC \NR
\NC format \NC no \NC no \NC yes \NC string \NC disk format type\NC \NR
\NC embedding \NC no \NC no \NC yes \NC string \NC \PDF\ inclusion\NC \NR
\NC filename \NC no \NC no \NC yes \NC string \NC disk file name\NC\NR
\NC tounicode \NC no \NC yes \NC yes \NC number \NC if 1, \LUATEX\ assumes per-glyph tounicode entries are
    present in the font\NC\NR
\NC stretch \NC no \NC no \NC yes \NC number \NC the \quote {stretch} value from \tex{pdffontexpand}\NC\NR
\NC shrink \NC no \NC no \NC yes \NC number \NC the \quote {shrink} value from \tex{pdffontexpand}\NC\NR
\NC step \NC no \NC no \NC yes \NC number \NC the \quote {step} value from \tex{pdffontexpand}\NC\NR
\NC auto_expand \NC no \NC no \NC yes \NC boolean\NC the \quote {autoexpand} keyword from\crlf \tex{pdffontexpand}\NC\NR
\NC expansion_factor \NC no \NC no \NC no \NC number \NC the actual expansion factor of an expanded font\NC\NR
\NC attributes \NC no \NC no \NC yes \NC string \NC the \tex{pdffontattr}\NC\NR
\NC cache \NC no \NC no \NC yes \NC string \NC this key controls caching of the \LUA\ table on the \type{tex}
    end. \type{yes}: use a reference to the table that is
    passed to \LUATEX\ (this is the default). \type{no}: don't store the table
    reference, don't cache any \LUA\ data for this font.
    \type{renew}: don't store the table reference, but
    save a reference to the table that is created at the
    first access to one of its fields in font.fonts.
    (new in 0.40.0, before that caching was always \type{yes}).
    Note: the saved reference is thread-local, so be careful when you are using coroutines: an error will be thrown if the table
    has been cached in one thread, but you reference it from another thread ($\approx$ coroutine)\NC\NR
\NC nomath \NC no \NC no \NC yes \NC boolean\NC this key allows a minor speedup for text fonts. If it is
    present and true, then \LUATEX\ will not check the
    character entries for math-specific keys. (0.42.0)\NC\NR
\NC slant \NC no \NC no \NC yes \NC number \NC This has the same semantics as the \type{SlantFont} operator
    in font map files. (0.47.0)\NC\NR
\NC extent \NC no \NC no \NC yes \NC number \NC This has the same semantics as the \type{ExtendFont} operator
    in font map files. (0.50.0)\NC\NR
\stoptabulate
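
To make the role of these keys a little more concrete, here is a
minimal sketch of a \luatex{define_font} callback (inside \LUATEX\
only). It assumes that \type{name} is a plain \TFM\ name that
\luatex{font.read_tfm()} can locate; real font loaders have to handle
many more cases.

\starttyping
callback.register("define_font", function(name, size)
    -- Start from the keys in the "from tfm" column.
    local f = font.read_tfm(name, size)
    -- Add a key that the TFM reader does not provide: this font is
    -- declared to be a text font, so math-specific keys are skipped.
    f.nomath = true
    return f
end)
\stoptyping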

The key \type{name} is always required. The keys \type{stretch},
\type{shrink}, \type{step} and optionally \type{auto_expand} only
have meaning when used together: they can be used to replace a
post-loading \tex{pdffontexpand} command. The
\type{expansion_factor} is a value that can be present inside a font
in \type{font.fonts}. It is the actual expansion factor (a value
between \type{-shrink} and \type{stretch}, with step \type{step})
of a font that was automatically generated by the font expansion
algorithm. The key \type{attributes} can be used to replace
\tex{pdffontattr}. The key \type{used} is set by the engine when a
font is actively in use; this makes sure that the font's
definition is written to the output file (\DVI\ or \PDF). The
\TFM\ reader sets it to false. The \type{direction} is a number
signalling the \quote{normal} direction for this font. There are
sixteen possibilities:

\starttabulate[|Tc|c|c|c|]
\NC \ssbf number \NC \bf meaning \NC \bf number \NC \bf meaning \NC\NR
\NC 0 \NC LT \NC 8 \NC TT \NC\NR
\NC 1 \NC LL \NC 9 \NC TL \NC\NR
\NC 2 \NC LB \NC 10 \NC TB \NC\NR
\NC 3 \NC LR \NC 11 \NC TR \NC\NR
\NC 4 \NC RT \NC 12 \NC BT \NC\NR
\NC 5 \NC RL \NC 13 \NC BL \NC\NR
\NC 6 \NC RB \NC 14 \NC BB \NC\NR
\NC 7 \NC RR \NC 15 \NC BR \NC\NR
\stoptabulate

These are \OMEGA|-|style direction abbreviations: the first character
indicates the \quote{first} edge of the character glyphs (the edge that is
seen first in the writing direction), the second the \quote{top} side.
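
The numbering in the table is regular: the first letter changes every
four entries and the second letter cycles within each group of four.
The following plain \LUA\ sketch reproduces the table from that
observation; the function name \type{direction_name} is made up for
this example.

\starttyping
local first  = { "L", "R", "T", "B" }   -- selected by the number div 4
local second = { "T", "L", "B", "R" }   -- selected by the number mod 4

-- Hypothetical helper: abbreviation for a direction number 0..15.
local function direction_name(n)
    return first[math.floor(n / 4) + 1] .. second[n % 4 + 1]
end

for n = 0, 15 do
    print(n, direction_name(n))
end
\stoptyping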

The \type{parameters} is a hash with mixed key types. There are seven
possible string keys, as well as a number of integer indices (these
start from 8 up). The seven strings are actually used instead of the
bottom seven indices, because that gives a nicer user interface.

The names and their internal remapping are:

\starttabulate[|lT|c|]
\NC \ssbf name \NC \bf internal remapped number \NC\NR
\NC slant \NC 1 \NC\NR
\NC space \NC 2 \NC\NR
\NC space_stretch \NC 3 \NC\NR
\NC space_shrink \NC 4 \NC\NR
\NC x_height \NC 5 \NC\NR
\NC quad \NC 6 \NC\NR
\NC extra_space \NC 7 \NC\NR
\stoptabulate
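
As an illustration of the mixed key types, the plain \LUA\ sketch below
converts a purely positional array of font parameters into a hash of
this shape. The helper \type{named_parameters} and the numbers passed
to it are made up for this example.

\starttyping
local names = {
    "slant", "space", "space_stretch", "space_shrink",
    "x_height", "quad", "extra_space",
}

-- Hypothetical helper: indices 1..7 become string keys, higher
-- indices stay numeric.
local function named_parameters(positional)
    local result = { }
    for index, value in pairs(positional) do
        result[names[index] or index] = value
    end
    return result
end

-- Illustrative values in scaled points, plus one extra parameter.
local p = named_parameters {
    0, 218453, 109226, 72818, 282168, 655361, 72818, 1000
}
print(p.quad, p[8])
\stoptyping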

The keys \type{type}, \type{format}, \type{embedding}, \type{fullname} and
\type{filename} are used to embed \OPENTYPE\ fonts in the result \PDF.

The \type{characters} table is a list of character hashes indexed by
an integer number. The number is the \quote{internal code} \TEX\ knows this
character by.

Two very special string indexes can be used also: \type{left_boundary} is a
virtual character whose ligatures and kerns are used to handle word
boundary processing. \type{right_boundary} is similar but not actually
used for anything (yet!).

Other index keys are ignored.

Each entry in the \type{characters} table is itself a hash. For example, here is the
character \quote{f} (decimal 102) in the font cmr10 at 10 points:

\starttyping
[102] = {
    ['width'] = 200250,
    ['height'] = 455111,
    ['depth'] = 0,
    ['italic'] = 50973,
    ['kerns'] = {
        [63] = 50973,
        [93] = 50973,
        [39] = 50973,
        [33] = 50973,
        [41] = 50973
    },
    ['ligatures'] = {
        [102] = {
            ['char'] = 11,
            ['type'] = 0
        },
        [108] = {
            ['char'] = 13,
            ['type'] = 0
        },
        [105] = {
            ['char'] = 12,
            ['type'] = 0
        }
    }
}
\stoptyping

The following top|-|level keys can be present inside a character hash:

\starttabulate[|lT|c|c|c|l|p|]
\NC \ssbf key \NC \bf from vf \NC \bf from tfm \NC \bf used \NC \bf value type \NC \bf description \NC\NR
\NC width \NC yes \NC yes \NC yes \NC number \NC character's width, in sp (default 0) \NC\NR
\NC height \NC no \NC yes \NC yes \NC number \NC character's height, in sp (default 0) \NC\NR
\NC depth \NC no \NC yes \NC yes \NC number \NC character's depth, in sp (default 0) \NC\NR
\NC italic \NC no \NC yes \NC yes \NC number \NC character's italic correction, in sp (default zero) \NC\NR
\NC top_accent \NC no \NC no \NC maybe \NC number \NC character's top accent alignment place, in sp (default zero) \NC\NR
\NC bot_accent \NC no \NC no \NC maybe \NC number \NC character's bottom accent alignment place, in sp (default zero) \NC\NR
\NC left_protruding \NC no \NC no \NC maybe \NC number \NC character's \tex{lpcode}\NC\NR
\NC right_protruding \NC no \NC no \NC maybe \NC number \NC character's \tex{rpcode}\NC\NR
\NC expansion_factor \NC no \NC no \NC maybe \NC number \NC character's \tex{efcode}\NC\NR
\NC tounicode \NC no \NC no \NC maybe \NC string \NC character's Unicode equivalent(s), in UTF-16BE hexadecimal format\NC\NR
\NC next \NC no \NC yes \NC yes \NC number \NC the \quote{next larger} character index \NC\NR
\NC extensible \NC no \NC yes \NC yes \NC table \NC the constituent parts of an extensible recipe \NC\NR
\NC vert_variants \NC no \NC no \NC yes \NC table \NC constituent parts of a vertical variant set\NC \NR
\NC horiz_variants\NC no \NC no \NC yes \NC table \NC constituent parts of a horizontal variant set\NC \NR
\NC kerns \NC no \NC yes \NC yes \NC table \NC kerning information \NC\NR
\NC ligatures \NC no \NC yes \NC yes \NC table \NC ligaturing information \NC\NR
\NC commands \NC yes \NC no \NC yes \NC array \NC virtual font commands \NC\NR
\NC name \NC no \NC no \NC no \NC string \NC the character (\POSTSCRIPT) name \NC\NR
\NC index \NC no \NC no \NC yes \NC number \NC the (\OPENTYPE\ or \TRUETYPE) font glyph index \NC\NR
\NC used \NC no \NC yes \NC yes \NC boolean \NC typeset already (default: false)? \NC\NR
\NC mathkern \NC no \NC no \NC yes \NC table \NC math cut-in specifications \NC\NR
\stoptabulate

The values of \type{top_accent}, \type{bot_accent} and \type{mathkern} are used only for math
accent and superscript placement, see the \at{math chapter}[math] in this manual for details.

The values of \type{left_protruding} and \type{right_protruding} are used only when
\tex{pdfprotrudechars} is non-zero.

Whether or not \type{expansion_factor} is used depends on the font's global expansion
settings, as well as on the value of \tex{pdfadjustspacing}.

The usage of \type{tounicode} is this: if this font specifies a \type{tounicode=1} at
the top level, then \LUATEX\ will construct a \type{/ToUnicode} entry for the \PDF\
font (or font subset) based on the character-level \type{tounicode} strings, where
they are available. If a character does not have a sensible \UNICODE\ equivalent,
do not provide a string either (no empty strings).

If the font-level \type{tounicode} is not set, then \LUATEX\ will build up
\type{/ToUnicode} based on the \TEX\ code points you used, and any character-level
\type{tounicode} entries will be ignored. {\it At the moment, the string format is exactly the
format that is expected by Adobe \CMAP\ files (\UTF-16BE in hexadecimal encoding), minus
the enclosing angle brackets. This may change in the future.} Small example: the
\type{tounicode} for a \type{fi} ligature would be \type{00660069}.
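
Such strings are easy to generate. The plain \LUA\ sketch below builds
the hexadecimal UTF-16BE form for a list of \UNICODE\ code points,
using a surrogate pair for code points above \type{0xFFFF}. The
function name \type{tounicode_string} is made up for this example and
is not part of any \LUATEX\ library.

\starttyping
-- Hypothetical helper: UTF-16BE in hexadecimal for a list of code points.
local function tounicode_string(codepoints)
    local parts = { }
    for _, c in ipairs(codepoints) do
        if c < 0x10000 then
            parts[#parts + 1] = string.format("%04X", c)
        else
            -- outside the basic plane: encode as a surrogate pair
            c = c - 0x10000
            parts[#parts + 1] = string.format("%04X%04X",
                0xD800 + math.floor(c / 0x400),
                0xDC00 + c % 0x400)
        end
    end
    return table.concat(parts)
end

-- The fi ligature from the text: U+0066 followed by U+0069.
print(tounicode_string { 0x66, 0x69 })
\stoptyping

The call prints \type{00660069}, the value given in the example above.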

The presence of \type{extensible} will overrule \type{next}, if that is also present.
It in turn can be overruled by \type{vert_variants}.

The \type{extensible} table is very simple:

\starttabulate[|lT|l|p|]
\NC \ssbf key \NC \bf type \NC \bf description \NC\NR
\NC top \NC number \NC \quote{top} character index \NC\NR
\NC mid \NC number \NC \quote{middle} character index \NC\NR
\NC bot \NC number \NC \quote{bottom} character index \NC\NR
\NC rep \NC number \NC \quote{repeatable} character index \NC\NR
\stoptabulate

The \type{horiz_variants} and \type{vert_variants} are arrays of components. Each of those
components is itself a hash of up to five keys:

\starttabulate[|lT|l|p|]
\NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
\NC glyph \NC number \NC The character index (note that this is an encoding number, not a name).\NC \NR
\NC extender \NC number \NC One (1) if this part is repeatable, zero (0) otherwise.\NC \NR
\NC start \NC number \NC Maximum overlap at the starting side (in scaled points).\NC \NR
\NC end \NC number \NC Maximum overlap at the ending side (in scaled points).\NC \NR
\NC advance \NC number \NC Total advance width of this item (can be zero or missing,
    then the natural size of the glyph for character \type{component}
    is used).\NC \NR
\stoptabulate
8668 The \type{kerns} table is a hash indexed by character index (and
8669 \quote{character index} is defined as either a non|-|negative integer or the
8670 string value \type {right_boundary}), with the values the kerning to be
8671 applied, in scaled points.
8673 The \type{ligatures} table is a hash indexed by character index (and
8674 \quote{character index} is defined as either a non|-|negative integer or the
8675 string value \type {right_boundary}), with the values being yet another small
8676 hash, with two fields:
8678 \starttabulate[|lT|l|p|]
8679 \NC \ssbf key \NC \bf type \NC \bf description \NC\NR
8680 \NC type \NC number \NC the type of this ligature command, default 0 \NC\NR
8681 \NC char \NC number \NC the character index of the resultant ligature \NC\NR
8682 \stoptabulate
8684 The \type{char} field in a ligature is required.
8686 The \type{type} field inside a ligature is the numerical or string value of one of the eight
8687 possible ligature types supported by \TEX. When \TEX\ inserts a new ligature, it puts the new
8688 glyph in the middle of the left and right glyphs. The original left and right glyphs can
8689 optionally be retained, and when at least one of them is kept, it is also possible to move the
8690 new \quote{insertion point} forward one or two places. The glyph that ends up to the right of the
8691 insertion point will become the next \quote{left}.
8693 \starttabulate[|l|c|l|l|]
\NC \bf textual (Knuth) \NC \bf number \NC \bf string \NC \bf result \NC\NR
8695 \NC l + r =: n \NC 0 \NC \type{=:} \NC \|n \NC\NR
8696 \NC l + r =:\| n \NC 1 \NC \type{=:|} \NC \|nr \NC\NR
8697 \NC l + r \|=: n \NC 2 \NC \type{|=:} \NC \|ln \NC\NR
8698 \NC l + r \|=:\| n \NC 3 \NC \type{|=:|} \NC \|lnr \NC\NR
8699 \NC l + r =:\|\> n \NC 5 \NC \type{=:|>} \NC n\|r \NC\NR
8700 \NC l + r \|=:\> n \NC 6 \NC \type{|=:>} \NC l\|n \NC\NR
8701 \NC l + r \|=:\|\> n \NC 7 \NC \type{|=:|>} \NC l\|nr \NC\NR
8702 \NC l + r \|=:\|\>\> n \NC 11 \NC \type{|=:|>>} \NC ln\|r \NC\NR
8703 \stoptabulate
8705 The default value is~0, and can be left out. That signifies a \quote{normal}
8706 ligature where the ligature replaces both original glyphs. In this table
8707 the~\| indicates the final insertion point.
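
The fragment below sketches how \type{kerns} and \type{ligatures} could look
for a single character. The slot numbers follow the classic \type{cmr10}
layout as far as we know, and the kern amounts are invented, so treat all
values as placeholders.

\starttyping
-- fragment of a characters table (illustrative values only)
local characters = {
  [102] = {                               -- 'f'
    kerns = {
      [39]           = 45000,             -- f followed by an apostrophe
      right_boundary = 10000,             -- f at the right boundary
    },
    ligatures = {
      [105] = { char = 12 },              -- f + i, type defaults to 0
      [108] = { char = 13, type = '=:' }, -- f + l, type given as a string
      [102] = { char = 11, type = 0 },    -- f + f, type given as a number
    },
  },
}
\stoptyping
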
8709 The \type{commands} array is explained below.
8711 \section {Real fonts}
8713 Whether or not a \TEX\ font is a \quote{real} font that should be written to
8714 the \PDF\ document is decided by the \type{type} value in the top|-|level
8715 font structure. If the value is \type{real}, then this is a proper
8716 font, and the inclusion mechanism will attempt to add the needed
8717 font object definitions to the \PDF.
8719 Values for \type{type}:
8721 \starttabulate[|Tl|p|]
8722 \NC \ssbf value \NC \bf description \NC\NR
8723 \NC real \NC this is a base font \NC\NR
8724 \NC virtual \NC this is a virtual font \NC\NR
8725 \stoptabulate
8727 The actions to be taken depend on a number of different variables:
8729 \startitemize[packed]
8730 \item Whether the used font fits in an 8-bit encoding scheme or not
8731 \item The type of the disk font file
8732 \item The level of embedding requested
8733 \stopitemize
8735 A font that uses anything other than an 8-bit encoding vector has to
8736 be written to the \PDF\ in a different way.
8738 The rule is: if the font table has \type {encodingbytes} set to~2,
8739 then this is a wide font, in all other cases it isn't. The value~2 is
8740 the default for \OPENTYPE\ and \TRUETYPE\ fonts loaded via \LUA. For
8741 \TYPEONE\ fonts, you have to set \type {encodingbytes} to~2
8742 explicitly. For \PK\ bitmap fonts, wide font encoding is not
8743 supported at all.
8745 If no special care is needed, \LUATEX\ currently falls back to the
8746 mapfile|-|based solution used by \PDFTEX\ and \DVIPS. This behavior
8747 will be removed in the future, when the existing code becomes
8748 integrated in the new subsystem.
8750 But if this is a \quote{wide} font, then the new subsystem kicks in, and
8751 some extra fields have to be present in the font structure. In this
8752 case, \LUATEX\ does not use a map file at all.
8754 The extra fields are: \type{format}, \type{embedding}, \type{fullname},
8755 \type{cidinfo} (as explained above), \type{filename}, and the
8756 \type{index} key in the separate characters.
8758 Values for \type{format} are:
8760 \starttabulate[|Tl|p|]
8761 \NC \ssbf value \NC \bf description \NC\NR
8762 \NC type1 \NC this is a \POSTSCRIPT\ \TYPEONE\ font \NC\NR
8763 \NC type3 \NC this is a bitmapped (\PK) font \NC\NR
8764 \NC truetype \NC this is a \TRUETYPE\ or \TRUETYPE|-|based \OPENTYPE\ font \NC\NR
8765 \NC opentype \NC this is a \POSTSCRIPT|-|based \OPENTYPE\ font \NC\NR
8766 \stoptabulate
8768 (\type{type3} fonts are provided for backward compatibility only, and do not
8769 support the new wide encoding options.)
8771 Values for \type{embedding} are:
8773 \starttabulate[|Tl|p|]
8774 \NC \ssbf value \NC \bf description \NC\NR
8775 \NC no \NC don't embed the font at all \NC\NR
\NC subset \NC include and attempt to subset the font \NC\NR
8777 \NC full \NC include this font in its entirety \NC\NR
8778 \stoptabulate
8780 It is not possible to artificially modify the transformation matrix
8781 for the font at the moment.
8783 The other fields are used as follows: The \type{fullname} will be the
8784 \POSTSCRIPT|/|\PDF\ font name. The \type{cidinfo} will be used as the
8785 character set (the CID \type{/Ordering} and \type{/Registry} keys). The
\type{filename} points to the actual font file. If you include the
full path in the \type{filename} or if the file is in the local
directory, \LUATEX\ will run a little more efficiently because it
will not have to re|-|run the \type{find_xxx_file} callback in that
case.
8792 Be careful: when mixing old and new fonts in one document, it is possible to
8793 create \POSTSCRIPT\ name clashes that can result in printing
8794 errors. When this happens, you have to change the \type{fullname}
8795 of the font.
8797 Typeset strings are written out in a wide format using 2~bytes per
8798 glyph, using the \type{index} key in the character information as
8799 value. The overall effect is like having an encoding based on numbers
8800 instead of traditional (\POSTSCRIPT) name|-|based reencoding. The way
8801 to get the correct \type{index} numbers for \TYPEONE\ fonts is by
8802 loading the font via \type{fontloader.open}; use the table indices as
8803 \type{index} fields.
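
Putting the pieces together, a table for a \quote{wide} font could look roughly
like the sketch below. The file name, the font names, the metrics and the
\type{index} values are placeholders, and the \type{cidinfo} values shown are
only the commonly used identity settings, given here as an assumption.

\starttyping
local f = {
  name          = 'demo-wide',            -- hypothetical TeX-side name
  type          = 'real',
  encodingbytes = 2,                      -- marks this as a wide font
  format        = 'opentype',
  embedding     = 'subset',
  fullname      = 'DemoSerif-Regular',    -- hypothetical PostScript name
  filename      = '/fonts/demoserif.otf', -- full path avoids a file search
  cidinfo       = { registry = 'Adobe', ordering = 'Identity',
                    supplement = 0, version = 1 },  -- assumed values
  size          = 655360,                 -- 10pt in scaled points
  characters    = {
    -- 'index' is the glyph number written to the PDF for this slot
    [65] = { index = 34, width = 450000, height = 460000, depth = 0 },
    [66] = { index = 35, width = 420000, height = 460000, depth = 0 },
  },
}
\stoptyping
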
8805 This type of reencoding means that there is no longer a clear
8806 connection between the text in your input file and the strings in the
8807 output \PDF\ file. Dealing with this is high on the agenda.
8809 \section[virtualfonts]{Virtual fonts}
8811 You have to take the following steps if you want \LUATEX\ to treat the
8812 returned table from \luatex{define_font} as a virtual font:
8814 \startitemize[packed]
8815 \item Set the top|-|level key \type {type} to \type {virtual}.
8816 \item Make sure there is at least one valid entry in \luatex{fonts} (see below).
8817 \item Give a \type {commands} array to every character (see below).
8818 \stopitemize
The presence of the top|-|level \type {type} key with the specific value
\type {virtual} will trigger handling of the rest of the special virtual
font fields in the table, but the mere existence of \type {type} is enough
to prevent \LUATEX\ from looking for a virtual font on its own.
8825 Therefore, this also works \quote{in reverse}: if you are absolutely certain
8826 that a font is not a virtual font, assigning the value \type{base} or
8827 \type{real} to \type{type} will inhibit \LUATEX\ from looking for a virtual font
8828 file, thereby saving you a disk search.
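
A minimal sketch of this \quote{reverse} use, assuming that every font is
loaded from a \type{tfm} file and that none of them has a matching virtual
font file:

\starttyping
callback.register('define_font',
  function (name, size)
    local f = font.read_tfm(name, size)
    f.type = 'real'   -- tells LuaTeX not to search for a virtual font file
    return f
  end
)
\stoptyping
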
The \luatex{fonts} key holds another \LUA\ array. The values are one- or two|-|key
hashes themselves, each entry indicating one of the base fonts in a
virtual font. In case your font is referring to itself, you can use the
\type {font.nextid()} function, which returns the index of the next font to be
defined; that is probably the font currently being defined.

An example makes this easy to understand:
\starttyping
fonts = {
  { name = 'ptmr8a', size = 655360 },
  { name = 'psyr', size = 600000 },
  { id = 38 }
}
\stoptyping
says that the first referenced font (index 1) in this virtual font is
\type{ptmr8a} loaded at 10pt, and the second is \type{psyr} loaded
at a little over 9pt. The third one is a previously defined font that
is known to \LUATEX\ as fontid \quote{38}.
8851 The array index numbers are used by the character command definitions
8852 that are part of each character.
Each item in the \luatex{commands} array is another small array, with the first
entry representing a command and the extra items being the parameters to that command. The
allowed commands and their arguments are:
8858 \starttabulate[|Tl|l|l|p|]
8859 \NC \ssbf command name \NC \bf arguments \NC \bf arg type \NC \bf description \NC\NR
8860 \NC font \NC 1 \NC number \NC select a new font from the local \luatex{fonts} table\NC\NR
8861 \NC char \NC 1 \NC number \NC typeset this character number from the current font,
8862 and move right by the character's width\NC\NR
8863 \NC node \NC 1 \NC node \NC output this node (list), and move right
8864 by the width of this list\NC\NR
8865 \NC slot \NC 2 \NC number \NC a shortcut for the combination of a font and char command\NC\NR
8866 \NC push \NC 0 \NC \NC save current position\NC\NR
8867 \NC nop \NC 0 \NC \NC do nothing \NC\NR
8868 \NC pop \NC 0 \NC \NC pop position \NC\NR
8869 \NC rule \NC 2 \NC 2 numbers \NC output a rule $ht*wd$, and move right.\NC\NR
8870 \NC down \NC 1 \NC number \NC move down on the page\NC\NR
8871 \NC right \NC 1 \NC number \NC move right on the page\NC\NR
8872 \NC special \NC 1 \NC string \NC output a \tex{special} command\NC\NR
8873 \NC lua \NC 1 \NC string \NC execute a \LUA\ script (at \tex{latelua} time)\NC\NR
8874 \NC image \NC 1 \NC image \NC output an image (the argument can be either an \type{<image>}
8875 variable or an \type{image_spec} table)\NC\NR
8876 \NC comment \NC any \NC any \NC the arguments of this command are ignored\NC\NR
8877 \stoptabulate
8879 Here is a rather elaborate glyph commands example:
\starttyping
commands = {
  {'push'},                     -- remember where we are
  {'right', 5000},              -- move right about 0.08pt
  {'font', 3},                  -- select the fonts[3] entry
  {'char', 97},                 -- place character 97 (ASCII 'a')
  {'pop'},                      -- go all the way back
  {'down', -200000},            -- move upwards by about 3pt
  {'special', 'pdf: 1 0 0 rg'}, -- switch to red color
  {'rule', 500000, 20000},      -- draw a bar
  {'special','pdf: 0 g'}        -- back to black
}
\stoptyping
The default value for \type {font} is always~1 at the start of the \type{commands} array.
Therefore, if the virtual font is essentially only a re|-|encoding, then you usually do not
have to create an explicit \quote{font} command in the array.
8900 Rules inside of \type{commands} arrays are built up using only two dimensions:
8901 they do not have depth. For correct vertical placement, an extra \type{down} command
8902 may be needed.
8904 Regardless of the amount of movement you create within the \type {commands},
8905 the output pointer will always move by exactly the width that was given in
8906 the \type {width} key of the character hash. Any movements that take place
8907 inside the \type{commands} array are ignored on the upper level.
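
The following sketch shows how these rules interact. It draws a thin bar under
a character. Here \type{v} is assumed to be one entry of a font's
\type{characters} table and \type{i} its slot number; both names, and the
dimensions used, are chosen only for this example.

\starttyping
-- v: character hash, i: its slot number (both supplied by the caller)
local function underline (v, i)
  v.commands = {
    {'push'},                  -- remember the start position
    {'down', 100000},          -- rules have no depth, so move down first
    {'rule', 20000, v.width},  -- height first, then width
    {'pop'},                   -- back to the start, undoing the moves
    {'char', i},               -- the glyph itself
  }
  -- the output pointer advances by v.width, whatever happened above
  return v
end
\stoptyping
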
8909 \subsection{Artificial fonts}
8911 Even in a \quote{real} font, there can be virtual characters. When \LUATEX\ encounters a \type {commands}
8912 field inside a character when it becomes time to typeset the character, it will interpret the commands, just
8913 like for a true virtual character. In this case, if you have created no \quote{fonts} array, then the default
8914 (and only) \quote{base} font is taken to be the current font itself. In practice, this means that you can
8915 create virtual duplicates of existing characters which is useful if you want to create composite characters.
Note: this feature does {\it not\/} work the other way around. There cannot be \quote{real} characters in a
virtual font! You cannot use this technique for font re|-|encoding either; you need a truly virtual
font for that (because characters that are already present cannot be altered).
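
A sketch of such a virtual duplicate is given below. It assumes a
\type{define_font} callback that loads fonts from \type{tfm} files; the target
slot~200 is an arbitrary choice and is only used when it is still free.

\starttyping
callback.register('define_font',
  function (name, size)
    local f = font.read_tfm(name, size)
    if name == 'cmr10' then
      local a = f.characters[97]          -- the existing 'a'
      if a and not f.characters[200] then
        f.characters[200] = {             -- a new, virtual character
          width    = a.width,
          height   = a.height,
          depth    = a.depth,
          commands = { {'char', 97} },    -- no fonts array: base font is f
        }
      end
    end
    return f
  end
)
\stoptyping
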
8921 \subsection{Example virtual font}
8923 Finally, here is a plain \TEX\ input file with a virtual font demonstration:
8925 \startbuffer
\directlua {
  callback.register('define_font',
    function (name,size)
      if name == 'cmr10-red' then
        f = font.read_tfm('cmr10',size)
        f.name = 'cmr10-red'
        f.type = 'virtual'
        f.fonts = {{ name = 'cmr10', size = size }}
        for i,v in pairs(f.characters) do
          if (string.char(i)):find('[tacohanshartmut]') then
            v.commands = {
              {'special','pdf: 1 0 0 rg'},
              {'char',i},
              {'special','pdf: 0 g'},
            }
          else
            v.commands = {{'char',i}}
          end
        end
      else
        f = font.read_tfm(name,size)
      end
      return f
    end
  )
}

8953 \font\myfont = cmr10-red at 10pt \myfont This is a line of text \par
8954 \font\myfontx= cmr10 at 10pt \myfontx Here is another line of text \par
8955 \stopbuffer
8957 \typebuffer
8959 %\getbuffer
8961 \chapter[nodes]{Nodes}
8963 \section{\LUA\ node representation}
\TEX's nodes are represented in \LUA\ as userdata objects with a variable
set of fields. In the following syntax tables, the type of such a
userdata object is represented as \syntax{<node>}.
8970 The current return value of \luatex{node.types()} is:
\ctxlua {
  local d = node.types()
  tex.print('\\type{' .. d[0] .. '} (' .. 0 .. '), ')
  for _,v in pairs(d) do
    if _ > 0 then
      tex.print('\\type{' .. v .. '} (' .. _ .. '), ')
    end
  end
}

8981 NOTE: The \type {\lastnodetype} primitive is \ETEX\ compliant. The valid
8982 range is still -1 .. 15 and glyph nodes have number 0 (used to be
8983 char node) and ligature nodes are mapped to 7. That way macro
8984 packages can use the same symbolic names as in traditional \ETEX.
8985 Keep in mind that the internal node numbers are different and that
8986 there are more node types than 15.
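
Because the internal numbers differ from the \ETEX\ ones, it is usually safer
to look up an id by name than to hard|-|code it. A minimal sketch, where
\type{head} is assumed to be a node list obtained elsewhere (for example in a
callback):

\starttyping
local glyph_id = node.id('glyph')    -- from name to number

local function count_glyphs (head)
  local n = 0
  for g in node.traverse_id(glyph_id, head) do
    n = n + 1
  end
  return n
end

-- node.type goes the other way, from number to name
texio.write_nl('id ' .. glyph_id .. ' is ' .. node.type(glyph_id))
\stoptyping
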
8988 \subsection{Auxiliary items}
8990 A few node|-|typed userdata objects do not occur in the \quote{normal}
8991 list of nodes, but can be pointed to from within that list. They are
8992 not quite the same as regular nodes, but it is easier for the library
8993 routines to treat them as if they were.
8995 \subsubsection{glue_spec items}
8997 Skips are about the only type of data objects in traditional \TEX\
8998 that are not a simple value. The structure that represents the glue
8999 components of a skip is called a \type {glue_spec}, and it has the following
9000 accessible fields:
9002 \starttabulate[|lT|l|p|]
9003 \NC \ssbf key \NC \bf type \NC \bf explanation \NC\NR
9004 \NC width \NC number \NC \NC\NR
9005 \NC stretch \NC number \NC \NC\NR
9006 \NC stretch_order \NC number \NC \NC\NR
9007 \NC shrink \NC number \NC \NC\NR
9008 \NC shrink_order \NC number \NC \NC\NR
9009 \NC writable \NC boolean \NC If this is true, you can't assign to this \type{glue_spec}
9010 because it is one of the preallocated special cases. New in 0.52\NC\NR
9011 \stoptabulate
9013 These objects are reference counted, so there is actually an extra
9014 read-only field named \type {ref_count} as well. This item type will likely
9015 disappear in the future, and the glue fields themselves will
9016 become part of the nodes referencing glue items.
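
As a sketch of how these fields can be read, the function below reports the
glue specification of every glue node in a list. The function name and the
message format are made up for the example.

\starttyping
local glue_id = node.id('glue')

local function show_glue (head)
  for g in node.traverse_id(glue_id, head) do
    local s = g.spec                 -- the glue_spec item, if any
    if s then
      texio.write_nl(string.format(
        'glue: %dsp plus %d (order %d) minus %d (order %d)',
        s.width, s.stretch, s.stretch_order, s.shrink, s.shrink_order))
    end
  end
end
\stoptyping
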
9018 \subsubsection{attribute{\_}list and attribute items}
9020 The newly introduced attribute registers are non|-|trivial, because
9021 the value that is attached to a node is essentially a sparse array of
9022 key|-|value pairs.
9024 It is generally easiest to deal with attribute lists and attributes
9025 by using the dedicated functions in the \luatex{node} library, but
9026 for completeness, here is the low|-|level interface.
9028 An \type{attribute_list} item is used as a head pointer for a list
9029 of attribute items. It has only one user-visible field:
9031 \starttabulate[|lT|l|p|]
9032 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9033 \NC next \NC \syntax{<node>} \NC pointer to the first attribute\NC\NR
9034 \stoptabulate
9036 A normal node's attribute field will point to an item of type
9037 \type{attribute_list}, and the \type{next} field in that item will point
9038 to the first defined \quote{attribute} item, whose \type {next} will
9039 point to the second \quote{attribute} item, etc.
9041 Valid fields in \type{attribute} items:
9043 \starttabulate[|lT|l|p|]
9044 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9045 \NC next \NC \syntax{<node>} \NC pointer to the next attribute\NC\NR
9046 \NC number \NC number \NC the attribute type id\NC\NR
9047 \NC value \NC number \NC the attribute value\NC\NR
9048 \stoptabulate
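
The two approaches can be sketched side by side. The attribute number~100 is
an arbitrary choice for the example, and the function names are hypothetical.

\starttyping
-- high level: set and query attribute 100 through the node library
local function mark (n)
  node.set_attribute(n, 100, 1)
  return node.has_attribute(n, 100)   -- the value, or nil when unset
end

-- low level: walk the attribute list by hand
local function dump_attributes (n)
  local a = n.attr                    -- the attribute_list head, or nil
  a = a and a.next                    -- the first attribute item
  while a do
    texio.write_nl('attribute ' .. a.number .. ' = ' .. a.value)
    a = a.next
  end
end
\stoptyping
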
9050 \subsubsection{action item}
9052 Valid fields: \showfields{action}\crlf
9053 Id: \showid{action}
9055 These are a special kind of item that only appears inside
9056 pdf start link objects.
9058 \starttabulate[|lT|l|p|]
9059 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9060 \NC action_type \NC number \NC \NC\NR
9061 \NC action_id \NC number or string \NC \NC\NR
9062 \NC named_id \NC number \NC \NC\NR
9063 \NC file \NC string \NC \NC\NR
9064 \NC new_window \NC number \NC \NC\NR
9065 \NC data \NC string \NC \NC\NR
9066 \NC ref_count \NC number \NC (read-only)\NC\NR
9067 \stoptabulate
9069 \subsection{Main text nodes}
9071 These are the nodes that comprise actual typesetting commands.
9073 A few fields are present in all nodes regardless of their type, these are:
9075 \starttabulate[|lT|l|p|]
9076 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9077 \NC next \NC \syntax{<node>} \NC The next node in a list, or nil\NC\NR
9078 \NC id \NC number \NC The node's type (\type{id}) number \NC\NR
9079 \NC subtype \NC number \NC The node \type{subtype} identifier\NC\NR
9080 \stoptabulate
9082 The \type{subtype} is sometimes just a stub entry. Not all nodes
9083 actually use the \type{subtype}, but this way you can be sure that all
9084 nodes accept it as a valid field name, and that is often handy in node
9085 list traversal. In the following tables \type{next} and \type{id} are
9086 not explicitly mentioned.
Besides these three fields, almost all nodes also have an \type {attr}
field, and there is also a field called \type{prev}. That last field
is always present, but only initialized on explicit request: when the
function \type{node.slide()} is called, it will set up the \type{prev}
fields to be a backwards pointer in the argument node list.
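
A short sketch of a backwards traversal that relies on this behavior; the
function name is made up, and \type{head} is assumed to be a valid node list.

\starttyping
local function walk_backwards (head)
  local tail = node.slide(head)   -- returns the tail and sets up prev
  local n = tail
  while n do
    texio.write_nl(node.type(n.id))
    n = n.prev
  end
end
\stoptyping
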
9095 \subsubsection{hlist nodes}
9097 Valid fields: \showfields{hlist}\crlf
9098 Id: \showid{hlist}
9100 \starttabulate[|lT|l|p|]
9101 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9102 \NC subtype \NC number \NC 0 = unknown origin, 1 = created by
9103 linebreaking, 2 = explicit box command. (0.46.0),
9104 3 = paragraph indentation box, 4 = alignment column or row, 5 = alignment cell (0.62.0)\NC\NR
9105 \NC attr \NC \syntax{<node>} \NC The head of the associated attribute list \NC\NR
9106 \NC width \NC number \NC \NC\NR
9107 \NC height \NC number \NC \NC\NR
9108 \NC depth \NC number \NC \NC\NR
9109 \NC shift \NC number \NC a displacement perpendicular to the
9110 character progression direction \NC\NR
9111 \NC glue_order \NC number \NC a number in the range 0--4, indicating
9112 the glue order\NC\NR
9113 \NC glue_set \NC number \NC the calculated glue ratio\NC\NR
\NC glue_sign \NC number \NC 0 = normal, 1 = stretching, 2 = shrinking \NC\NR
9115 \NC head \NC \syntax{<node>} \NC the first node of the body of this list\NC\NR
9116 \NC dir \NC string \NC the direction of this box. see~\in{}[dirnodes]\NC\NR
9117 \stoptabulate
9119 A warning: never assign a node list to the \type{head} field
9120 unless you are sure its internal link structure is correct, otherwise
9121 an error may result.
9123 Note: the new field name \type{head} was introduced in 0.65 to replace
9124 the old name \type{list}. Use of the name \type{list} is now
9125 deprecated, but it will stay available until at least version 0.80.
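
The \type{subtype} field makes it possible to pick out the lines made by the
paragraph builder. The sketch below counts them in a
\type{post_linebreak_filter} callback; the log message is only an
illustration.

\starttyping
local hlist_id = node.id('hlist')

callback.register('post_linebreak_filter',
  function (head, groupcode)
    local lines = 0
    for line in node.traverse_id(hlist_id, head) do
      if line.subtype == 1 then       -- 1 = created by linebreaking
        lines = lines + 1
      end
    end
    texio.write_nl('lines in this paragraph: ' .. lines)
    return true                       -- the list itself is unchanged
  end
)
\stoptyping
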
9127 \subsubsection{vlist nodes}
9129 Valid fields: As for hlist, except that \quote{shift} is a displacement
9130 perpendicular to the line progression direction, and \quote{subtype} only
9131 has subtypes 0, 4, and 5.
9133 \subsubsection{rule nodes}
9135 Valid fields: \showfields{rule}\crlf
9136 Id: \showid{rule}
9138 \starttabulate[|lT|l|p|]
9139 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9140 \NC subtype \NC number \NC unused\NC\NR
9141 \NC attr \NC \syntax{<node>} \NC \NC\NR
9142 \NC width \NC number \NC the width of the rule; the special value $-1073741824$
9143 is used for \quote{running} glue dimensions\NC\NR
9144 \NC height \NC number \NC the height of the rule (can be negative)\NC\NR
9145 \NC depth \NC number \NC the depth of the rule (can be negative)\NC\NR
9146 \NC dir \NC string \NC the direction of this rule. see~\in{}[dirnodes]\NC\NR
9147 \stoptabulate
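
A small sketch of creating a rule node and of recognizing the special
\quote{running} value mentioned in the table; the sizes are arbitrary.

\starttyping
local RUNNING = -1073741824   -- the special value from the table

local r = node.new('rule')
r.width  = RUNNING            -- leave the width running
r.height = 26214              -- about 0.4pt
r.depth  = 0

local function is_running (dimension)
  return dimension == RUNNING
end
\stoptyping
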
9149 \subsubsection{ins nodes}
9151 Valid fields: \showfields{ins}\crlf
9152 Id: \showid{ins}
9154 \starttabulate[|lT|l|p|]
9155 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9156 \NC subtype \NC number \NC the insertion class\NC\NR
9157 \NC attr \NC \syntax{<node>} \NC \NC\NR
9158 \NC cost \NC number \NC the penalty associated with this insert\NC\NR
9159 \NC height \NC number \NC \NC\NR
9160 \NC depth \NC number \NC \NC\NR
9161 \NC head \NC \syntax{<node>} \NC the first node of the body of this insert\NC\NR
9162 \NC spec \NC \syntax{<node>} \NC a pointer to the \tex{splittopskip} glue spec\NC\NR
9163 \stoptabulate
9165 A warning: never assign a node list to the \type{head} field
9166 unless you are sure its internal link structure is correct, otherwise
an error may result.
9169 Note: the new field name \type{head} was introduced in 0.65 to replace
9170 the old name \type{list}. Use of the name \type{list} is now
9171 deprecated, but it will stay available until at least version 0.80.
9174 \subsubsection{mark nodes}
9176 Valid fields: \showfields{mark}\crlf
9177 Id: \showid{mark}
9179 \starttabulate[|lT|l|p|]
9180 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9181 \NC subtype \NC number \NC unused\NC\NR
9182 \NC attr \NC \syntax{<node>} \NC \NC\NR
9183 \NC class \NC number \NC the mark class\NC\NR
9184 \NC mark \NC table \NC a table representing a token list\NC\NR
9185 \stoptabulate
9187 \subsubsection{adjust nodes}
9189 Valid fields: \showfields{adjust}\crlf
9190 Id: \showid{adjust}
9192 \starttabulate[|lT|l|p|]
9193 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9194 \NC subtype \NC number \NC 0 = normal, 1 = \quote{pre}\NC\NR
9195 \NC attr \NC \syntax{<node>} \NC \NC\NR
9196 \NC head \NC \syntax{<node>} \NC adjusted material\NC\NR
9197 \stoptabulate
9199 A warning: never assign a node list to the \type{head} field
9200 unless you are sure its internal link structure is correct, otherwise
an error may result.
9203 Note: the new field name \type{head} was introduced in 0.65 to replace
9204 the old name \type{list}. Use of the name \type{list} is now
9205 deprecated, but it will stay available until at least version 0.80.
9208 \subsubsection{disc nodes}
9210 Valid fields: \showfields{disc}\crlf
9211 Id: \showid{disc}
9213 \starttabulate[|lT|l|p|]
9214 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9215 \NC subtype \NC number \NC indicates the source of a discretionary.
9216 0 = the \tex{discretionary} command,
9217 1 = the \tex{-} command,
9218 2 = added automatically following a \type{-},
9219 3 = added by the hyphenation algorithm (simple),
9220 4 = added by the hyphenation algorithm (hard, first item),
9221 5 = added by the hyphenation algorithm (hard, second item)\NC\NR
9222 \NC attr \NC \syntax{<node>} \NC \NC\NR
9223 \NC pre \NC \syntax{<node>} \NC pointer to the pre|-|break text\NC\NR
9224 \NC post \NC \syntax{<node>} \NC pointer to the post|-|break text\NC\NR
9225 \NC replace \NC \syntax{<node>} \NC pointer to the no|-|break text\NC\NR
9226 \stoptabulate
9228 The subtype numbers~4 and~5 belong to the \quote{of-f-ice} explanation given elsewhere.
9230 A warning: never assign a node list to the pre, post or replace field
9231 unless you are sure its internal link structure is correct, otherwise
an error may result.
9234 \subsubsection{math nodes}
9236 Valid fields: \showfields{math}\crlf
9237 Id: \showid{math}
9239 \starttabulate[|lT|l|p|]
9240 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9241 \NC subtype \NC number \NC 0 = \quote{on}, 1 = \quote{off}\NC\NR
9242 \NC attr \NC \syntax{<node>} \NC \NC\NR
9243 \NC surround \NC number \NC width of the \tex{mathsurround} kern\NC\NR
9244 \stoptabulate
9246 \subsubsection{glue nodes}
9248 Valid fields: \showfields{glue}\crlf
9249 Id: \showid{glue}
9251 \starttabulate[|lT|l|p|]
9252 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9253 \NC subtype \NC number \NC 0 = \tex{skip},
9254 1--18 = internal glue parameters,
100--103 = \quote{leader} subtypes \NC\NR
9256 \NC attr \NC \syntax{<node>} \NC \NC\NR
9257 \NC spec \NC \syntax{<node>} \NC pointer to a glue{\_}spec item \NC\NR
9258 \NC leader \NC \syntax{<node>} \NC pointer to a box or rule for leaders\NC\NR
9259 \stoptabulate
9261 The exact meanings of the subtypes are as follows:
9263 \starttabulate[|rT|l|]
9264 \NC 1 \NC \tex{lineskip} \NC \NR
9265 \NC 2 \NC \tex{baselineskip} \NC \NR
9266 \NC 3 \NC \tex{parskip} \NC \NR
9267 \NC 4 \NC \tex{abovedisplayskip} \NC \NR
9268 \NC 5 \NC \tex{belowdisplayskip} \NC \NR
9269 \NC 6 \NC \tex{abovedisplayshortskip} \NC \NR
9270 \NC 7 \NC \tex{belowdisplayshortskip} \NC \NR
9271 \NC 8 \NC \tex{leftskip} \NC \NR
9272 \NC 9 \NC \tex{rightskip} \NC \NR
9273 \NC 10 \NC \tex{topskip} \NC \NR
9274 \NC 11 \NC \tex{splittopskip} \NC \NR
9275 \NC 12 \NC \tex{tabskip} \NC \NR
9276 \NC 13 \NC \tex{spaceskip} \NC \NR
9277 \NC 14 \NC \tex{xspaceskip} \NC \NR
9278 \NC 15 \NC \tex{parfillskip} \NC \NR
9279 \NC 16 \NC \tex{thinmuskip} \NC \NR
9280 \NC 17 \NC \tex{medmuskip} \NC \NR
9281 \NC 18 \NC \tex{thickmuskip} \NC \NR
9282 \NC 100 \NC \tex{leaders} \NC \NR
9283 \NC 101 \NC \tex{cleaders} \NC \NR
9284 \NC 102 \NC \tex{xleaders} \NC \NR
9285 \NC 103 \NC \tex{gleaders} \NC \NR
9286 \stoptabulate
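
With these numbers, a specific kind of glue can be located in a list. The
sketch below looks for the \tex{rightskip} glue in a line; \type{line} is
assumed to be an hlist node produced by the paragraph builder, and the
function name is made up.

\starttyping
local glue_id   = node.id('glue')
local RIGHTSKIP = 9                  -- subtype number from the table

local function find_rightskip (line)
  for g in node.traverse_id(glue_id, line.head) do
    if g.subtype == RIGHTSKIP then
      return g
    end
  end
  return nil                         -- no such glue in this line
end
\stoptyping
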
9288 \subsubsection{kern nodes}
9290 Valid fields: \showfields{kern}\crlf
9291 Id: \showid{kern}
9293 \starttabulate[|lT|l|p|]
9294 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9295 \NC subtype \NC number \NC 0 = from font,
9296 1 = from \tex{kern} or \tex{/},
9297 2 = from \tex{accent}\NC\NR
9298 \NC attr \NC \syntax{<node>} \NC \NC\NR
9299 \NC kern \NC number \NC \NC\NR
9300 \stoptabulate
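
As an example of using the \type{subtype}, this sketch adds up all font kerns
in a list and ignores the explicit ones. The function name is hypothetical.

\starttyping
local kern_id = node.id('kern')

local function font_kern_total (head)
  local total = 0
  for k in node.traverse_id(kern_id, head) do
    if k.subtype == 0 then           -- 0 = kern that came from the font
      total = total + k.kern
    end
  end
  return total                       -- in scaled points
end
\stoptyping
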
9303 \subsubsection{penalty nodes}
9305 Valid fields: \showfields{penalty}\crlf
9306 Id: \showid{penalty}
9308 \starttabulate[|lT|l|p|]
9309 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9310 \NC subtype \NC number \NC not used\NC\NR
9311 \NC attr \NC \syntax{<node>} \NC \NC\NR
9312 \NC penalty \NC number \NC \NC\NR
9313 \stoptabulate
9315 \subsubsection[glyphnodes]{glyph nodes}
9317 Valid fields: \showfields{glyph}\crlf
9318 Id: \showid{glyph}
9320 \starttabulate[|lT|l|p|]
9321 \NC \ssbf field \NC \ssbf type \NC \ssbf explanation \NC \NR
9322 \NC subtype \NC number \NC bitfield \NC \NR
9323 \NC attr \NC \syntax{<node>} \NC \NC \NR
9324 \NC char \NC number \NC \NC \NR
9325 \NC font \NC number \NC \NC \NR
9326 \NC lang \NC number \NC \NC \NR
9327 \NC left \NC number \NC \NC \NR
9328 \NC right \NC number \NC \NC \NR
9329 \NC uchyph \NC boolean \NC \NC \NR
9330 \NC components \NC \syntax{<node>} \NC pointer to ligature components \NC \NR
9331 \NC xoffset \NC number \NC \NC \NR
9332 \NC yoffset \NC number \NC \NC \NR
9333 \NC width \NC number \NC (new in 0.53) \NC \NR
9334 \NC height \NC number \NC (new in 0.53) \NC \NR
9335 \NC depth \NC number \NC (new in 0.53) \NC \NR
9336 \NC expansion_factor \NC number \NC (new in 0.78) \NC \NR
9337 \stoptabulate
9339 A warning: never assign a node list to the components field
9340 unless you are sure its internal link structure is correct, otherwise
an error may result.
9343 Valid bits for the \type{subtype} field are:
9345 \starttabulate[|c|l|]
9346 \NC \ssbf bit \NC \bf meaning \NC\NR
9347 \NC 0 \NC character \NC\NR
9348 \NC 1 \NC ligature \NC\NR
9349 \NC 2 \NC ghost \NC\NR
9350 \NC 3 \NC left \NC\NR
9351 \NC 4 \NC right \NC\NR
9352 \stoptabulate
9354 See \in{section}[charsandglyphs] for a detailed description of the
9355 \type{subtype} field.
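
Testing a single bit needs no special library. The helper below is a sketch
that uses plain arithmetic; both function names are made up, and the
referenced section remains the authority on how the bits are to be
interpreted.

\starttyping
-- true when bit number 'bit' (counting from 0) is set in 'value'
local function has_bit (value, bit)
  return math.floor(value / 2^bit) % 2 == 1
end

local function is_ligature (g)      -- g: a glyph node
  return has_bit(g.subtype, 1)      -- bit 1 = ligature
end
\stoptyping
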
The \type {expansion_factor} field is relatively new and is the result of extensive
experiments with a more efficient implementation of expansion. Early versions of
\LUATEX\ already replaced multiple instances of fonts in the backend by scaling.
Contrary to \PDFTEX, \LUATEX\ has now also got rid of font copies in the
frontend and replaced them by expansion factors that travel with glyph nodes. Apart
from being a cleaner approach, this is also a step towards a better separation between
front- and backend.
9365 \subsubsection{margin{\_}kern nodes}
9367 Valid fields: \showfields{margin_kern}\crlf
9368 Id: \showid{margin_kern}
9370 \starttabulate[|lT|l|p|]
9371 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9372 \NC subtype \NC number \NC 0 = left side,
9373 1 = right side\NC\NR
9374 \NC attr \NC \syntax{<node>} \NC \NC\NR
9375 \NC width \NC number \NC \NC\NR
9376 \NC glyph \NC \syntax{<node>} \NC \NC\NR
9377 \stoptabulate
9379 \subsection{Math nodes}
9381 These are the so||called \quote{noad}s and the nodes that are specifically
9382 associated with math processing. Most of these nodes contain sub-nodes so
9383 that the list of possible fields is actually quite small. First, the subnodes:
9385 \subsubsection{Math kernel subnodes}
9387 Many object fields in math mode are either simple characters in a
9388 specific family or math lists or node lists. There are four associated
9389 subnodes that represent these cases (in the following node
9390 descriptions these are indicated by the word \type{<kernel>}).
9392 The \type{next} and \type{prev} fields for these subnodes are unused.
9394 \subsubsubsection{math{\_}char and math{\_}text{\_}char subnodes}
9396 Valid fields: \showfields{math_char}\crlf
9397 Id: \showid{math_char}
9399 \starttabulate[|lT|l|p|]
9400 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9401 \NC attr \NC \syntax{<node>}\NC \NC\NR
9402 \NC char \NC number \NC \NC \NR
9403 \NC fam \NC number \NC \NC\NR
9404 \stoptabulate
The \type{math_char} is the simplest subnode field: it contains
the character and family for a single glyph object. The
\type{math_text_char} is a special case that you will not
normally encounter; it arises temporarily during math list conversion
(its sole function is to suppress a following italic correction).
9412 \subsubsubsection{sub{\_}box and sub{\_}mlist subnodes}
9414 Valid fields: \showfields{sub_box}\crlf
9415 Id: \showid{sub_box}
9417 \starttabulate[|lT|l|p|]
9418 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9419 \NC attr \NC \syntax{<node>}\NC \NC\NR
9420 \NC head \NC \syntax{<node>}\NC \NC \NR
9421 \stoptabulate
9423 These two subnode types are used for subsidiary list items. For
9424 \type{sub_box}, the \type{head} points to a \quote{normal} vbox or
9425 hbox. For \type{sub_mlist}, the \type{head} points to a math list
9426 that is yet to be converted.
9428 A warning: never assign a node list to the \type{head} field
9429 unless you are sure its internal link structure is correct, otherwise
an error may result.
9432 Note: the new field name \type{head} was introduced in 0.65 to replace
9433 the old name \type{list}. Use of the name \type{list} is now
9434 deprecated, but it will stay available until at least version 0.80.
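
The four kernel subnode types can be told apart by their id. The following
sketch returns a short description of a kernel; the function name and the
output format are made up for the example.

\starttyping
local function describe_kernel (k)
  if not k then
    return 'empty'
  end
  local kind = node.type(k.id)
  if kind == 'math_char' or kind == 'math_text_char' then
    return string.format('%s: fam %d, char %d', kind, k.fam, k.char)
  elseif kind == 'sub_box' or kind == 'sub_mlist' then
    return kind .. (k.head and ' with content' or ' without content')
  end
  return kind                        -- anything else is reported by name
end
\stoptyping
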
9436 \subsubsection{Math delimiter subnode}
9438 There is a fifth subnode type that is used exclusively for delimiter
9439 fields. As before, the \type{next} and \type{prev} fields are unused.
9441 \subsubsubsection{delim subnodes}
9443 Valid fields: \showfields{delim}\crlf
9444 Id: \showid{delim}
9446 \starttabulate[|lT|l|p|]
9447 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9448 \NC attr \NC \syntax{<node>}\NC \NC\NR
9449 \NC small_char \NC number \NC \NC \NR
9450 \NC small_fam \NC number \NC \NC\NR
9451 \NC large_char \NC number \NC \NC \NR
9452 \NC large_fam \NC number \NC \NC\NR
9453 \stoptabulate
The fields \type{large_char} and \type{large_fam} can be zero; in that
case the font that is used for the \type{small_fam} is expected to
provide the large version as an extension to the \type{small_char}.
9459 \subsubsection{Math core nodes}
First, there are the objects (the \TEX book calls them \quote{atoms})
that are associated with the simple math objects: Ord, Op, Bin, Rel,
Open, Close, Punct, Inner, Over, Under, Vcent. These all have
the same fields, and they are combined into a single node type with
separate subtypes for differentiation.
9467 \subsubsubsection{simple nodes}
9469 Valid fields: \showfields{noad}\crlf
9470 Id: \showid{noad}
9472 \starttabulate[|lT|l|p|]
9473 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9474 \NC subtype \NC number \NC see below \NC\NR
9475 \NC attr \NC \syntax{<node>} \NC \NC\NR
9476 \NC nucleus \NC \syntax{<kernel>}\NC \NC\NR
9477 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9478 \NC sup \NC \syntax{<kernel>}\NC \NC\NR
9479 \stoptabulate
9481 Operators are a bit special because they occupy three values of the
9482 \type{subtype} field:
9484 \starttabulate[|lT|p|]
9485 \NC \ssbf number \NC \bf node sub type \NC\NR
9486 \NC 0 \NC Ord \NC\NR
9487 \NC 1 \NC Op, \type{\displaylimits} \NC\NR
9488 \NC 2 \NC Op, \type{\limits} \NC\NR
9489 \NC 3 \NC Op, \type{\nolimits} \NC\NR
9490 \NC 4 \NC Bin \NC\NR
9491 \NC 5 \NC Rel \NC\NR
9492 \NC 6 \NC Open \NC\NR
9493 \NC 7 \NC Close \NC\NR
9494 \NC 8 \NC Punct \NC\NR
9495 \NC 9 \NC Inner \NC\NR
9496 \NC 10 \NC Under \NC\NR
9497 \NC 11 \NC Over \NC\NR
9498 \NC 12 \NC Vcent \NC\NR
9499 \stoptabulate
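
The subtype numbers in the table above can be turned into readable names from \LUA. The following is a minimal sketch, not part of the \LUATEX\ interface: the lookup table and the helper name \type{noad_subtype_name} are made up for illustration, and the numbers are simply copied from the table.

\starttyping
-- Illustrative lookup table; the numbers are copied from the table above.
local noad_subtypes = {
    [0]  = "Ord",
    [1]  = "Op (displaylimits)",
    [2]  = "Op (limits)",
    [3]  = "Op (nolimits)",
    [4]  = "Bin",
    [5]  = "Rel",
    [6]  = "Open",
    [7]  = "Close",
    [8]  = "Punct",
    [9]  = "Inner",
    [10] = "Under",
    [11] = "Over",
    [12] = "Vcent",
}

local noad_id = node.id("noad")

-- Hypothetical helper: returns a name for a simple noad, or nil for
-- any other node type.
local function noad_subtype_name(n)
    if n.id ~= noad_id then
        return nil
    end
    return noad_subtypes[n.subtype] or "unknown"
end
\stoptyping
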
9501 \subsubsubsection{accent nodes}
9503 Valid fields: \showfields{accent}\crlf
9504 Id: \showid{accent}
9506 \starttabulate[|lT|l|p|]
9507 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9508 \NC subtype \NC number \NC the first bit is used for a fixed top accent flag (if the \type{accent} field is present),
9509 the second bit for a fixed bottom accent flag (if the \type{bot_accent} field is present).
9510 Example: the actual value \type{3} means: do not stretch either accent\NC\NR
9511 \NC attr \NC \syntax{<node>}\NC \NC\NR
9512 \NC nucleus \NC \syntax{<kernel>}\NC \NC \NR
9513 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9514 \NC sup \NC \syntax{<kernel>}\NC \NC \NR
9515 \NC accent \NC \syntax{<kernel>}\NC \NC\NR
9516 \NC bot_accent \NC \syntax{<kernel>}\NC \NC\NR
9517 \stoptabulate
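
The two flags in the \type{subtype} of an accent node can be separated with plain arithmetic. This sketch assumes that the \quote{first bit} has the value~1 and the \quote{second bit} the value~2, which is consistent with the example value \type{3} given in the table; the function name is hypothetical.

\starttyping
-- Hypothetical helper: split an accent subtype into its two flags.
-- Assumption: bit value 1 = fixed top accent, bit value 2 = fixed bottom accent.
local function accent_flags(subtype)
    local fixed_top    = subtype % 2 == 1
    local fixed_bottom = math.floor(subtype / 2) % 2 == 1
    return fixed_top, fixed_bottom
end

-- accent_flags(3) yields true, true (neither accent is stretched)
-- accent_flags(0) yields false, false
\stoptyping
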
9519 \subsubsubsection{style nodes}
9521 Valid fields: \showfields{style}\crlf
9522 Id: \showid{style}
9524 \starttabulate[|lT|l|p|]
9525 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9526 \NC style \NC string \NC contains the style \NC\NR
9527 \stoptabulate
9529 There are eight possibilities for the string value: one of
9530 \quote{display}, \quote{text}, \quote{script}, or \quote{scriptscript}.
9531 Each of these can have a trailing \type{'} to signify
9532 \quote{cramped} styles.
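
As an illustration, the eight strings can be generated from the four base names. This is only a sketch in plain \LUA; the table name \type{valid_styles} is an assumption and nothing in \LUATEX\ depends on it.

\starttyping
-- Build a set of the eight style strings: four base styles, each with
-- an optional trailing ' for the cramped variant.
local valid_styles = {}
for _, base in ipairs { "display", "text", "script", "scriptscript" } do
    valid_styles[base]        = true
    valid_styles[base .. "'"] = true
end

-- valid_styles["script'"] is true, valid_styles["cramped"] is nil
\stoptyping
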
9534 \subsubsubsection{choice nodes}
9536 Valid fields: \showfields{choice}\crlf
9537 Id: \showid{choice}
9539 \starttabulate[|lT|l|p|]
9540 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9541 \NC attr \NC \syntax{<node>}\NC \NC\NR
9542 \NC display \NC \syntax{<node>}\NC \NC\NR
9543 \NC text \NC \syntax{<node>}\NC \NC\NR
9544 \NC script \NC \syntax{<node>}\NC \NC\NR
9545 \NC scriptscript \NC \syntax{<node>}\NC \NC\NR
9546 \stoptabulate
9548 A warning: never assign a node list to the display, text, script, or
9549 scriptscript field unless you are sure its internal link structure is
9550 correct, otherwise an error may result.
9552 \subsubsubsection{radical nodes}
9554 Valid fields: \showfields{radical}\crlf
9555 Id: \showid{radical}
9557 \starttabulate[|lT|l|p|]
9558 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9559 \NC attr \NC \syntax{<node>}\NC \NC\NR
9560 \NC nucleus \NC \syntax{<kernel>}\NC \NC \NR
9561 \NC sub \NC \syntax{<kernel>}\NC \NC\NR
9562 \NC sup \NC \syntax{<kernel>}\NC \NC \NR
9563 \NC left \NC \syntax{<delim>}\NC \NC \NR
9564 \NC degree \NC \syntax{<kernel>}\NC Only set by \type{\Uroot} \NC \NR
9565 \stoptabulate
9567 A warning: never assign a node list to the nucleus, sub, sup, left, or
9568 degree field
9569 unless you are sure its internal link structure is correct, otherwise
9570 an error may result.
9572 \subsubsubsection{fraction nodes}
9574 Valid fields: \showfields{fraction}\crlf
9575 Id: \showid{fraction}
9577 \starttabulate[|lT|l|p|]
9578 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9579 \NC attr \NC \syntax{<node>}\NC \NC\NR
9580 \NC width \NC number \NC \NC \NR
9581 \NC num \NC \syntax{<kernel>}\NC \NC\NR
9582 \NC denom \NC \syntax{<kernel>}\NC \NC \NR
9583 \NC left \NC \syntax{<delim>}\NC \NC \NR
9584 \NC right \NC \syntax{<delim>}\NC \NC \NR
9585 \stoptabulate
9587 A warning: never assign a node list to the num or denom field
9588 unless you are sure its internal link structure is correct, otherwise
9589 an error may result.
9591 \subsubsubsection{fence nodes}
9593 Valid fields: \showfields{fence}\crlf
9594 Id: \showid{fence}
9596 \starttabulate[|lT|l|p|]
9597 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9598 \NC subtype \NC number \NC 1 = \type{\left},
9599 2 = \type{\middle},
9600 3 = \type{\right} \NC\NR
9601 \NC attr \NC \syntax{<node>}\NC \NC\NR
9602 \NC delim \NC \syntax{<delim>}\NC \NC \NR
9603 \stoptabulate
9605 \subsection{whatsit nodes}
9607 Whatsit nodes come in many subtypes that you can ask for by running
9608 \luatex{node.whatsits()}:
9609 \ctxlua {for n,name in table.sortedpairs(node.whatsits()) do
9610 if (n<100) then
9611 if (n>0) then tex.sprint (', ') end
9612 tex.sprint('\\type{' .. name .. '} (' .. n .. ')') end
9613 end }
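
The same table can be inverted at run time, so that code does not have to hard-code subtype numbers. A minimal sketch, assuming only that \type{node.whatsits()} returns a table that maps numbers to names (as the loop above relies on); the variable name is illustrative.

\starttyping
-- Build a name -> number lookup from the number -> name table.
local whatsit_number = { }
for number, name in pairs(node.whatsits()) do
    whatsit_number[name] = number
end

-- Example lookup; the value depends on the LuaTeX version in use.
local pdf_literal_subtype = whatsit_number["pdf_literal"]
\stoptyping
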
9615 \subsubsection{open nodes}
9617 Valid fields: \showfields{whatsit,open}\crlf
9618 Id: \showid{whatsit,open}
9620 \starttabulate[|lT|l|p|]
9621 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9622 \NC attr \NC \syntax{<node>} \NC \NC\NR
9623 \NC stream \NC number \NC \TEX's stream id number\NC\NR
9624 \NC name \NC string \NC file name \NC\NR
9625 \NC ext \NC string \NC file extension \NC\NR
9626 \NC area \NC string \NC file area (this may become obsolete) \NC\NR
9627 \stoptabulate
9629 \subsubsection{write nodes}
9631 Valid fields: \showfields{whatsit,write}\crlf
9632 Id: \showid{whatsit,write}
9634 \starttabulate[|lT|l|p|]
9635 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9636 \NC attr \NC \syntax{<node>} \NC \NC\NR
9637 \NC stream \NC number \NC \TEX's stream id number\NC\NR
9638 \NC data \NC table \NC a table representing the token list to be written\NC\NR
9639 \stoptabulate
9641 \subsubsection{close nodes}
9643 Valid fields: \showfields{whatsit,close}\crlf
9644 Id: \showid{whatsit,close}
9646 \starttabulate[|lT|l|p|]
9647 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9648 \NC attr \NC \syntax{<node>} \NC \NC\NR
9649 \NC stream \NC number \NC \TEX's stream id number\NC\NR
9650 \stoptabulate
9652 \subsubsection{special nodes}
9654 Valid fields: \showfields{whatsit,special}\crlf
9655 Id: \showid{whatsit,special}
9657 \starttabulate[|lT|l|p|]
9658 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9659 \NC attr \NC \syntax{<node>} \NC \NC\NR
9660 \NC data \NC string \NC the \tex{special} information\NC\NR
9661 \stoptabulate
9663 \subsubsection{language nodes}
9666 \LUATEX\ does not have language whatsits any more. All language
9667 information is already present inside the glyph nodes themselves.
9668 This whatsit subtype will be removed in the next release.
9671 \subsubsection{local_par nodes}
9673 Valid fields: \showfields{whatsit,local_par}\crlf
9674 Id: \showid{whatsit,local_par}
9676 \starttabulate[|lT|l|p|]
9677 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9678 \NC attr \NC \syntax{<node>} \NC \NC\NR
9679 \NC pen_inter \NC number \NC local interline penalty (from \tex{localinterlinepenalty})\NC\NR
9680 \NC pen_broken\NC number \NC local broken penalty (from \tex{localbrokenpenalty})\NC\NR
9681 \NC dir \NC string \NC the direction of this par. see~\in{}[dirnodes]\NC\NR
9682 \NC box_left \NC \syntax{<node>} \NC the \tex{localleftbox}\NC\NR
9683 \NC box_left_width\NC number\NC width of the \tex{localleftbox}\NC\NR
9684 \NC box_right \NC \syntax{<node>} \NC the \tex{localrightbox}\NC\NR
9685 \NC box_right_width\NC number\NC width of the \tex{localrightbox}\NC\NR
9686 \stoptabulate
9688 A warning: never assign a node list to the box_left or box_right field
9689 unless you are sure its internal link structure is correct, otherwise
9690 an error may result.
9695 \subsubsection[dirnodes]{dir nodes}
9697 Valid fields: \showfields{whatsit,dir}\crlf
9698 Id: \showid{whatsit,dir}
9700 \starttabulate[|lT|l|p|]
9701 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9702 \NC attr \NC \syntax{<node>} \NC \NC\NR
9703 \NC dir \NC string \NC the direction (but see below)\NC\NR
9704 \NC level \NC number \NC nesting level of this direction whatsit\NC\NR
9705 \NC dvi_ptr \NC number \NC a saved dvi buffer byte offset\NC\NR
9706 \NC dir_h \NC number \NC a saved dvi position\NC\NR
9707 \stoptabulate
9709 A note on \type{dir} strings. Direction specifiers are three-letter
9710 combinations of \type{T}, \type{B}, \type{R}, and \type{L}.
9712 These are built up out of three separate items:
9713 \startitemize
9714 \item the first is the direction of the \quote{top} of paragraphs.
9715 \item the second is the direction of the \quote{start} of lines.
9716 \item the third is the direction of the \quote{top} of glyphs.
9717 \stopitemize
9719 However, only four combinations are accepted: \type{TLT}, \type{TRT},
9720 \type{RTT}, and \type{LTL}.
9722 Inside actual \type{dir} whatsit nodes, the representation of
9723 \type{dir} is not a three-letter but a four-letter combination. The
9724 first character in this case is always either \type{+} or \type{-},
9725 indicating whether the value is pushed or popped from the direction
9726 stack.
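
A four-character \type{dir} value can be split into its sign and the three-letter specifier with a pattern match. This is a sketch with a hypothetical function name; it assumes that \type{+} stands for \quote{pushed} and \type{-} for \quote{popped}, following the order in which the text above names them.

\starttyping
-- Hypothetical helper: split e.g. "+TLT" into "push" and "TLT".
-- Returns nil when the string does not have the four-character form.
local function split_dir(dir)
    local sign, spec = string.match(dir, "^([%+%-])(%u%u%u)$")
    if not sign then
        return nil
    end
    return (sign == "+") and "push" or "pop", spec
end

-- split_dir("+TLT") yields "push", "TLT"
-- split_dir("-TRT") yields "pop",  "TRT"
-- split_dir("TLT")  yields nil
\stoptyping
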
9728 \subsubsection{pdf_literal nodes}
9730 Valid fields: \showfields{whatsit,pdf_literal}\crlf
9731 Id: \showid{whatsit,pdf_literal}
9733 \starttabulate[|lT|l|p|]
9734 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9735 \NC attr \NC \syntax{<node>} \NC \NC\NR
9736 \NC mode \NC number \NC the \quote{mode} setting of this literal\NC\NR
9737 \NC data \NC string \NC the \tex{pdfliteral} information\NC\NR
9738 \stoptabulate
9740 Mode values:
9742 \starttabulate[|lT|p|]
9743 \NC \ssbf value \NC \ssbf corresponding \tex{pdftex} keyword \NC \NR
9744 \NC 0 \NC setorigin \NC \NR
9745 \NC 1 \NC page \NC \NR
9746 \NC 2 \NC direct \NC \NR
9747 \stoptabulate
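
A literal node can be built from \LUA\ by setting the two fields described above. The following is a minimal sketch rather than a recommended recipe: the \PDF\ operator string is only an example, and the node still has to be inserted into a node list by the caller.

\starttyping
-- Create a pdf_literal whatsit; the subtype number is looked up by name.
local literal = node.new(node.id("whatsit"), node.subtype("pdf_literal"))

literal.mode = 1          -- 1 corresponds to the "page" keyword (see table)
literal.data = "0.5 g"    -- example PDF operator: set the fill gray level
\stoptyping
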
9749 \subsubsection{pdf_refobj nodes}
9751 Valid fields: \showfields{whatsit,pdf_refobj}\crlf
9752 Id: \showid{whatsit,pdf_refobj}
9754 \starttabulate[|lT|l|p|]
9755 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9756 \NC attr \NC \syntax{<node>} \NC \NC\NR
9757 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9758 \stoptabulate
9760 \subsubsection{pdf_refxform nodes}
9762 Valid fields: \showfields{whatsit,pdf_refxform}\crlf
9763 Id: \showid{whatsit,pdf_refxform}
9765 \starttabulate[|lT|l|p|]
9766 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9767 \NC attr \NC \syntax{<node>} \NC \NC\NR
9768 \NC width \NC number \NC \NC \NR
9769 \NC height \NC number \NC \NC \NR
9770 \NC depth \NC number \NC \NC \NR
9771 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9772 \stoptabulate
9774 Be aware that \type{pdf_refxform} nodes have dimensions that are used by \LUATEX.
9776 \subsubsection{pdf_refximage nodes}
9778 Valid fields: \showfields{whatsit,pdf_refximage}\crlf
9779 Id: \showid{whatsit,pdf_refximage}
9781 \starttabulate[|lT|l|p|]
9782 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9783 \NC attr \NC \syntax{<node>} \NC \NC\NR
9784 \NC width \NC number \NC \NC \NR
9785 \NC height \NC number \NC \NC \NR
9786 \NC depth \NC number \NC \NC \NR
9787 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9788 \stoptabulate
9790 Be aware that \type{pdf_refximage} nodes have dimensions that are used by \LUATEX.
9792 \subsubsection{pdf_annot nodes}
9794 Valid fields: \showfields{whatsit,pdf_annot}\crlf
9795 Id: \showid{whatsit,pdf_annot}
9797 \starttabulate[|lT|l|p|]
9798 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9799 \NC attr \NC \syntax{<node>} \NC \NC\NR
9800 \NC width \NC number \NC \NC \NR
9801 \NC height \NC number \NC \NC \NR
9802 \NC depth \NC number \NC \NC \NR
9803 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9804 \NC data \NC string \NC the annotation data\NC\NR
9805 \stoptabulate
9808 \subsubsection{pdf_start_link nodes}
9810 Valid fields: \showfields{whatsit,pdf_start_link}\crlf
9811 Id: \showid{whatsit,pdf_start_link}
9813 \starttabulate[|lT|l|p|]
9814 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9815 \NC attr \NC \syntax{<node>} \NC \NC\NR
9816 \NC width \NC number \NC \NC \NR
9817 \NC height \NC number \NC \NC \NR
9818 \NC depth \NC number \NC \NC \NR
9819 \NC objnum \NC number \NC the referenced \PDF\ object number\NC\NR
9820 \NC link_attr \NC table \NC the link attribute token list\NC\NR
9821 \NC action \NC \syntax{<node>} \NC the action to perform\NC\NR
9822 \stoptabulate
9824 \subsubsection{pdf_end_link nodes}
9826 Valid fields: \showfields{whatsit,pdf_end_link}\crlf
9827 Id: \showid{whatsit,pdf_end_link}
9829 \starttabulate[|lT|l|p|]
9830 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9831 \NC attr \NC \syntax{<node>} \NC \NC\NR
9832 \stoptabulate
9834 \subsubsection{pdf_dest nodes}
9836 Valid fields: \showfields{whatsit,pdf_dest}\crlf
9837 Id: \showid{whatsit,pdf_dest}
9839 \starttabulate[|lT|l|p|]
9840 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9841 \NC attr \NC \syntax{<node>} \NC \NC\NR
9842 \NC width \NC number \NC \NC \NR
9843 \NC height \NC number \NC \NC \NR
9844 \NC depth \NC number \NC \NC \NR
9845 \NC named_id \NC number \NC is the dest_id a string value?\NC\NR
9846 \NC dest_id \NC number or string \NC the destination id\NC\NR
9847 \NC dest_type \NC number\NC type of destination\NC\NR
9848 \NC xyz_zoom \NC number\NC \NC\NR
9849 \NC objnum \NC number \NC the \PDF\ object number\NC\NR
9850 \stoptabulate
9852 \subsubsection{pdf_thread nodes}
9854 Valid fields: \showfields{whatsit,pdf_thread}\crlf
9855 Id: \showid{whatsit,pdf_thread}
9857 \starttabulate[|lT|l|p|]
9858 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9859 \NC attr \NC \syntax{<node>} \NC \NC\NR
9860 \NC width \NC number \NC \NC \NR
9861 \NC height \NC number \NC \NC \NR
9862 \NC depth \NC number \NC \NC \NR
9863 \NC named_id \NC number \NC is the tread_id a string value?\NC\NR
9864 \NC tread_id \NC number or string \NC the thread id\NC\NR
9865 \NC thread_attr\NC number \NC extra thread information\NC\NR
9866 \stoptabulate
9868 \subsubsection{pdf_start_thread nodes}
9870 Valid fields: \showfields{whatsit,pdf_start_thread}\crlf
9871 Id: \showid{whatsit,pdf_start_thread}
9873 \starttabulate[|lT|l|p|]
9874 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9875 \NC attr \NC \syntax{<node>} \NC \NC\NR
9876 \NC width \NC number \NC \NC \NR
9877 \NC height \NC number \NC \NC \NR
9878 \NC depth \NC number \NC \NC \NR
9879 \NC named_id \NC number \NC is the tread_id a string value?\NC\NR
9880 \NC tread_id \NC number or string \NC the thread id\NC\NR
9881 \NC thread_attr\NC number \NC extra thread information\NC\NR
9882 \stoptabulate
9884 \subsubsection{pdf_end_thread nodes}
9886 Valid fields: \showfields{whatsit,pdf_end_thread}\crlf
9887 Id: \showid{whatsit,pdf_end_thread}
9889 \starttabulate[|lT|l|p|]
9890 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9891 \NC attr \NC \syntax{<node>} \NC \NC\NR
9892 \stoptabulate
9894 \subsubsection{pdf_save_pos nodes}
9896 Valid fields: \showfields{whatsit,pdf_save_pos}\crlf
9897 Id: \showid{whatsit,pdf_save_pos}
9899 \starttabulate[|lT|l|p|]
9900 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9901 \NC attr \NC \syntax{<node>} \NC \NC\NR
9902 \stoptabulate
9904 \subsubsection{late_lua nodes}
9906 Valid fields: \showfields{whatsit,late_lua}\crlf
9907 Id: \showid{whatsit,late_lua}
9909 \starttabulate[|lT|l|p|]
9910 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9911 \NC attr \NC \syntax{<node>} \NC \NC\NR
9912 \NC data \NC string \NC data to execute\NC\NR
9913 \NC string \NC string \NC data to execute (0.63)\NC\NR
9914 \NC name \NC string \NC the name to use for lua error reporting\NC\NR
9915 \stoptabulate
9917 The difference between \type{data} and \type{string} is that on
9918 assignment, the \type{data} field is converted to a token list, cf. use as
9919 \tex{latelua}. The \type{string} version is treated as a literal string.
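
The following sketch shows the two ways of filling a \type{late_lua} node. It assumes a version that has the \type{string} field (0.63 or later, according to the table); the \LUA\ code stored in the node and the name \type{demo} are arbitrary examples.

\starttyping
local late = node.new(node.id("whatsit"), node.subtype("late_lua"))

late.name = "demo"    -- used in Lua error messages

-- Stored as a literal string, no conversion to a token list:
late.string = "texio.write_nl('late_lua executed')"

-- Alternative: assigning to .data converts the text to a token list,
-- comparable to what \latelua does.
-- late.data = "texio.write_nl('late_lua executed')"
\stoptyping
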
9921 \subsubsection{pdf_colorstack nodes}
9923 Valid fields: \showfields{whatsit,pdf_colorstack}\crlf
9924 Id: \showid{whatsit,pdf_colorstack}
9926 \starttabulate[|lT|l|p|]
9927 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9928 \NC attr \NC \syntax{<node>} \NC \NC\NR
9929 \NC stack \NC number \NC colorstack id number\NC\NR
9930 \NC cmd \NC number \NC command to execute\NC\NR
9931 \NC data \NC string \NC data\NC\NR
9932 \stoptabulate
9934 \subsubsection{pdf_setmatrix nodes}
9936 Valid fields: \showfields{whatsit,pdf_setmatrix}\crlf
9937 Id: \showid{whatsit,pdf_setmatrix}
9939 \starttabulate[|lT|l|p|]
9940 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9941 \NC attr \NC \syntax{<node>} \NC \NC\NR
9942 \NC data \NC string \NC data\NC\NR
9943 \stoptabulate
9945 \subsubsection{pdf_save nodes}
9947 Valid fields: \showfields{whatsit,pdf_save}\crlf
9948 Id: \showid{whatsit,pdf_save}
9950 \starttabulate[|lT|l|p|]
9951 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9952 \NC attr \NC \syntax{<node>} \NC \NC\NR
9953 \stoptabulate
9955 \subsubsection{pdf_restore nodes}
9957 Valid fields: \showfields{whatsit,pdf_restore}\crlf
9958 Id: \showid{whatsit,pdf_restore}
9960 \starttabulate[|lT|l|p|]
9961 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9962 \NC attr \NC \syntax{<node>} \NC \NC\NR
9963 \stoptabulate
9965 \subsubsection{user_defined nodes}
9967 User|-|defined whatsit nodes can only be created and handled from \LUA\
9968 code. In effect, they are an extension to the extension
9969 mechanism. The \LUATEX\ engine will simply step over such whatsits
9970 without ever looking at the contents.
9972 Valid fields: \showfields{whatsit,user_defined}\crlf
9973 Id: \showid{whatsit,user_defined}
9975 \starttabulate[|lT|l|p|]
9976 \NC \ssbf field \NC \bf type \NC \bf explanation \NC\NR
9977 \NC attr \NC \syntax{<node>} \NC \NC\NR
9978 \NC user_id \NC number \NC id number\NC\NR
9979 \NC type \NC number \NC type of the value\NC\NR
9980 \NC value \NC number \NC \NC\NR
9981 \NC \NC string \NC \NC\NR
9982 \NC \NC \syntax{<node>} \NC \NC\NR
9983 \NC \NC table \NC \NC\NR
9984 \stoptabulate
9986 The \type{type} can have one of five distinct values:
9988 \starttabulate[|lT|p|]
9989 \NC \ssbf value \NC \bf explanation \NC\NR
9990 \NC 97 \NC the value is an attribute node list \NC\NR
9991 \NC 100 \NC the value is a number \NC\NR
9992 \NC 110 \NC the value is a node list \NC\NR
9993 \NC 115 \NC the value is a string\NC\NR
9994 \NC 116 \NC the value is a token list in \LUA\ table form\NC\NR
9995 \stoptabulate
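
A user|-|defined whatsit that carries a number could be created as follows. This is a minimal sketch: the \type{user_id} value is an arbitrary example chosen for illustration, and the type value \type{100} is taken from the table above (it is also the character code of the letter \type{d}, so \type{string.byte("d")} gives the same number).

\starttyping
local marker = node.new(node.id("whatsit"), node.subtype("user_defined"))

marker.user_id = 4711                -- hypothetical id, pick your own
marker.type    = string.byte("d")    -- 100: the value is a number
marker.value   = 42

-- Later, while walking a node list, such a node can be recognized by
-- comparing id, subtype and user_id with the values used here.
\stoptyping
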
9997 \section{Two access models}
9999 After doing lots of tests with \LUATEX\ and \LUAJITTEX, with and without just in
10000 time compilation enabled, and with and without using ffi, we came to the
10001 conclusion that userdata prevents a speedup. We also found that the checking of
10002 metatables as well as assignment comes with overhead that can't be neglected.
10003 This is normally not really a problem, but when processing fonts for more complex
10004 scripts the accumulated cost can become noticeable.
10006 Because the userdata approach has some benefits, this remains the recommended way
10007 to access nodes. We did several experiments with faster access using this model,
10008 but eventually settled for the \quote {direct} approach. For code that is proven
10009 to be okay, one can use this access model that operates on nodes more directly.
10011 Deep down in \TEX\ a node has a number which is an entry in a memory table. In
10012 fact, this model, where \TEX\ manages memory, is really fast and one of the reasons
10013 why plugging in callbacks that operate on nodes is quite fast. No matter what
10014 future memory model \LUATEX\ has, an internal reference will always be a simple data
10015 type (like a number or light userdata in \LUA\ speak). So, if you use the direct
10016 model, even if you know that you currently deal with numbers, you should not depend
10017 on that property but treat it as an abstraction just like traditional nodes. Using a
10018 simple basic datatype has the penalty that less checking can
10019 be done, but less checking is also the reason why it's somewhat faster. An
10020 important aspect is that one cannot mix both methods, but you can cast between both
10021 models.
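
Casting between the two models is done with \type{node.direct.todirect} and \type{node.direct.tonode}. The sketch below shows the general shape of a callback body that receives a userdata node, works on the direct reference, and hands back a userdata node; the function name \type{process} is hypothetical.

\starttyping
local todirect = node.direct.todirect
local tonode   = node.direct.tonode

-- Hypothetical callback body: head is a regular (userdata) node.
local function process(head)
    local d = todirect(head)
    -- ... operate on d with node.direct functions only ...
    return tonode(d)
end
\stoptyping
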
10023 So our advice is: use the indexed approach when possible and investigate the
10024 direct one when speed might be an issue. For that reason we also provide the
10025 \type {get*} and \type {set*} functions in the top level node namespace. There is
10026 a limited set of getters. When implementing this direct approach the regular
10027 index by key variant was also optimized, so direct access only makes sense when
10028 we're accessing nodes millions of times (which happens in some font processing
10029 for instance).
10031 We're talking mostly of getters because setters are less important. Documents
10032 do not have that many content|-|related nodes, and setting many thousands of properties
10033 is hardly a burden compared to millions of consultations.
10035 Normally you will access nodes like this:
10037 \starttyping
10038 local next = current.next
10039 if next then
10040 -- do something
10041 end
10042 \stoptyping
10044 Here \type {next} is not a real field, but a virtual one. Accessing it results in
10045 a metatable method being called. In practice it boils down to looking up the
10046 node type and based on the node type checking for the field name. In a worst case
10047 you have a node type that sits at the end of the lookup list and a field that is
10048 last in the lookup chain. However, in successive versions of \LUATEX\ these lookups
10049 have been optimized and the most frequently accessed nodes and fields have a higher
10050 priority.
10052 Because in practice the \type {next} accessor results in a function call, there
10053 is some overhead involved. The next code does the same and performs a tiny bit
10054 faster (but not that much because it is still a function call but one that
10055 knows what to look up).
10057 \starttyping
10058 local next = node.next(current)
10059 if next then
10060 -- do something
10061 end
10062 \stoptyping
10064 There are several such function based accessors now:
10066 \starttabulate[|T|p|]
10067 \NC getnext \NC parsing nodelist always involves this one \NC \NR
10068 \NC getprev \NC used less but is logical companion to getnext \NC \NR
10069 \NC getid \NC consulted a lot \NC \NR
10070 \NC getsubtype \NC consulted less but also a topper \NC \NR
10071 \NC getfont \NC used a lot in otf handling (glyph nodes are consulted a lot) \NC \NR
10072 \NC getchar \NC idem and also in other places \NC \NR
10073 \NC getlist \NC we often parse nested lists so this is a convenient one too
10074 (only works for hlist and vlist!) \NC \NR
10075 \NC getleader \NC comparable to list, seldom used in \TEX\ (but needs frequent consulting
10076 like lists; leaders could have been made a dedicated node type) \NC \NR
10077 \NC getfield \NC generic getter, sufficient for the rest (other field names are
10078 often shared so a specific getter makes no sense then) \NC \NR
10079 \stoptabulate
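
As an illustration of these getters in the direct model, the following sketch counts the glyphs of a given font in a node list. The function name and its arguments are assumptions made for the example; only functions listed in this section are used.

\starttyping
local direct   = node.direct
local todirect = direct.todirect
local getnext  = direct.getnext
local getid    = direct.getid
local getfont  = direct.getfont

local glyph_id = node.id("glyph")

-- Hypothetical helper: head is a userdata node, font a font id number.
local function count_glyphs(head, font)
    local total   = 0
    local current = todirect(head)
    while current do
        if getid(current) == glyph_id and getfont(current) == font then
            total = total + 1
        end
        current = getnext(current)
    end
    return total
end
\stoptyping
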
10081 It doesn't make sense to add more. Profiling demonstrated that these fields can
10082 get accessed far more often than other fields. Even in complex documents, many
10083 node and field types never get seen, or are seen only a few times. Most functions in the
10084 \type {node} namespace have a companion in \type {node.direct}, but of course not the
10085 ones that don't deal with nodes themselves. The following table summarizes this:
10087 \start \def\yes{$+$} \def\nop{$-$}
10089 \starttabulate[|T|c|c|]
10091 \NC \bf function \NC \bf node \NC \bf direct \NC \NR
10093 \NC copy \NC \yes \NC \yes \NC \NR
10094 \NC copy_list \NC \yes \NC \yes \NC \NR
10095 \NC count \NC \yes \NC \yes \NC \NR
10096 \NC current_attr \NC \yes \NC \yes \NC \NR
10097 \NC dimensions \NC \yes \NC \yes \NC \NR
10098 \NC do_ligature_n \NC \yes \NC \yes \NC \NR
10099 \NC end_of_math \NC \yes \NC \yes \NC \NR
10100 \NC family_font \NC \yes \NC \nop \NC \NR
10101 \NC fields \NC \yes \NC \nop \NC \NR
10102 \NC first_character \NC \yes \NC \nop \NC \NR
10103 \NC first_glyph \NC \yes \NC \yes \NC \NR
10104 \NC flush_list \NC \yes \NC \yes \NC \NR
10105 \NC flush_node \NC \yes \NC \yes \NC \NR
10106 \NC free \NC \yes \NC \yes \NC \NR
10107 \NC getbox \NC \nop \NC \yes \NC \NR
10108 \NC getchar \NC \yes \NC \yes \NC \NR
10109 \NC getfield \NC \yes \NC \yes \NC \NR
10110 \NC getfont \NC \yes \NC \yes \NC \NR
10111 \NC getid \NC \yes \NC \yes \NC \NR
10112 \NC getnext \NC \yes \NC \yes \NC \NR
10113 \NC getprev \NC \yes \NC \yes \NC \NR
10114 \NC getlist \NC \yes \NC \yes \NC \NR
10115 \NC getleader \NC \yes \NC \yes \NC \NR
10116 \NC getsubtype \NC \yes \NC \yes \NC \NR
10117 \NC has_glyph \NC \yes \NC \yes \NC \NR
10118 \NC has_attribute \NC \yes \NC \yes \NC \NR
10119 \NC has_field \NC \yes \NC \yes \NC \NR
10120 \NC hpack \NC \yes \NC \yes \NC \NR
10121 \NC id \NC \yes \NC \nop \NC \NR
10122 \NC insert_after \NC \yes \NC \yes \NC \NR
10123 \NC insert_before \NC \yes \NC \yes \NC \NR
10124 \NC is_direct \NC \nop \NC \yes \NC \NR
10125 \NC is_node \NC \yes \NC \yes \NC \NR
10126 \NC kerning \NC \yes \NC \nop \NC \NR
10127 \NC last_node \NC \yes \NC \yes \NC \NR
10128 \NC length \NC \yes \NC \yes \NC \NR
10129 \NC ligaturing \NC \yes \NC \nop \NC \NR
10130 \NC mlist_to_hlist \NC \yes \NC \nop \NC \NR
10131 \NC new \NC \yes \NC \yes \NC \NR
10132 \NC next \NC \yes \NC \nop \NC \NR
10133 \NC prev \NC \yes \NC \nop \NC \NR
10134 \NC tostring \NC \yes \NC \yes \NC \NR
10135 \NC protect_glyphs \NC \yes \NC \yes \NC \NR
10136 \NC protrusion_skippable \NC \yes \NC \yes \NC \NR
10137 \NC remove \NC \yes \NC \yes \NC \NR
10138 \NC set_attribute \NC \yes \NC \yes \NC \NR
10139 \NC setbox \NC \yes \NC \yes \NC \NR
10140 \NC setfield \NC \yes \NC \yes \NC \NR
10141 \NC slide \NC \yes \NC \yes \NC \NR
10142 \NC subtype \NC \yes \NC \nop \NC \NR
10143 \NC tail \NC \yes \NC \yes \NC \NR
10144 \NC todirect \NC \yes \NC \yes \NC \NR
10145 \NC tonode \NC \yes \NC \yes \NC \NR
10146 \NC traverse \NC \yes \NC \yes \NC \NR
10147 \NC traverse_id \NC \yes \NC \yes \NC \NR
10148 \NC type \NC \yes \NC \nop \NC \NR
10149 \NC types \NC \yes \NC \nop \NC \NR
10150 \NC unprotect_glyphs \NC \yes \NC \yes \NC \NR
10151 \NC unset_attribute \NC \yes \NC \yes \NC \NR
10152 \NC usedlist \NC \yes \NC \yes \NC \NR
10153 \NC vpack \NC \yes \NC \yes \NC \NR
10154 \NC whatsits \NC \yes \NC \nop \NC \NR
10155 \NC write \NC \yes \NC \yes \NC \NR
10156 \stoptabulate
10158 \stop
10160 The \type {node.next} and \type {node.prev} functions will stay but for
10161 consistency there are variants called \type {getnext} and \type {getprev}.
10162 We had to use \type{get} because \type {node.id} and \type {node.subtype} are
10163 already taken for providing meta information about nodes.
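
The function based getters are also available for regular (userdata) nodes in the \type {node} namespace. A minimal sketch, with a hypothetical function name, that counts all nodes in a list including those in nested hlists and vlists:

\starttyping
local getnext  = node.getnext
local getid    = node.getid
local getlist  = node.getlist

local hlist_id = node.id("hlist")
local vlist_id = node.id("vlist")

-- Hypothetical helper: counts nodes recursively. getlist is only
-- called for hlist and vlist nodes, as required.
local function count_nodes(head)
    local total   = 0
    local current = head
    while current do
        total = total + 1
        local id = getid(current)
        if id == hlist_id or id == vlist_id then
            total = total + count_nodes(getlist(current))
        end
        current = getnext(current)
    end
    return total
end
\stoptyping
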
10165 \chapter{Modifications}
10167 Besides the expected changes caused by new functionality, there are a
10168 number of not|-|so|-|expected changes. These are sometimes a side|-|effect
10169 of a new (conflicting) feature, or, more often than not, a change
10170 necessary to clean up the internal interfaces.
10172 \section{Changes from \TEX\ 3.1415926}
10174 \startitemize
10176 \item The current code base is written in C, not Pascal web (as of \LUATEX~0.42.0).
10178 \item See~\in{chapter}[languages] for many small changes related to paragraph
10179 building, language handling, and hyphenation. Most important change:
10180 adding a brace group in the middle of a word (like in \type{of{}fice})
10181 does not prevent ligature creation.
10183 \item There is no pool file, all strings are embedded during compilation.
10185 \item \type {plus 1 fillll} does not generate an error. The extra \quote{l} is
10186 simply typeset.
10188 \item The upper limit to \tex{endlinechar} and \tex{newlinechar} is 127.
10190 \stopitemize
10192 \section{Changes from \ETEX\ 2.2}
10194 \startitemize
10196 \item The \ETEX\ functionality is always present and enabled
10197 (but see below about \TEXXET), so the prepended asterisk or
10198 \type{-etex} switch for \INITEX\ is not needed.
10200 \item \TEXXET\ is not present, so the primitives
10202 \starttyping
10203 \TeXXeTstate
10204 \beginR
10205 \beginL
10206 \endR
10207 \endL
10208 \stoptyping
10210 are missing.
10212 \item Some of the tracing information that is output by \ETEX's \tex{tracingassigns} and
10213 \tex{tracingrestores} is not there.
10215 \item Register management in \LUATEX\ uses the \ALEPH\ model, so the maximum value is 65535
10216 and the implementation uses a flat array instead of the mixed
10217 flat|\&|sparse model from \ETEX.
10219 \item \type{savinghyphcodes} is a no-op.
10220 See~\in{chapter}[languages] for details.
10222 \item When kpathsea is used to find files, \LUATEX\ uses the
10223 \type{ofm} file format to search for font metrics. In turn, this means
10224 that \LUATEX\ looks at the \type{OFMFONTS} configuration variable
10225 (like \OMEGA\ and \ALEPH) instead of \type{TFMFONTS} (like \TEX\ and
10226 \PDFTEX). Likewise for virtual fonts (\LUATEX\ uses the variable
10227 \type{OVFFONTS} instead of \type{VFFONTS}).
10230 \stopitemize
10232 \section{Changes from \PDFTEX\ 1.40}
10234 \startitemize
10236 \item The (experimental) support for snap nodes has been removed, because
10237 it is much more natural to build this functionality on top of node
10238 processing and attributes. The associated primitives that are now gone
10239 are: \tex{pdfsnaprefpoint}, \tex{pdfsnapy}, and \tex{pdfsnapycomp}.
10241 \item The (experimental) support for specialized spacing around nodes
10242 has also been removed. The associated primitives that are now gone are:
10243 \tex{pdfadjustinterwordglue}, \tex{pdfprependkern}, and \tex{pdfappendkern},
10244 as well as the five supporting primitives \tex{knbscode}, \tex{stbscode},
10245 \tex{shbscode}, \tex{knbccode}, and \tex{knaccode}.
10247 \item A number of \quote{utility functions} have been removed:
10249 \startcolumns[n=3]
10250 \starttyping
10251 \pdfelapsedtime
10252 \pdfescapehex
10253 \pdfescapename
10254 \pdfescapestring
10255 \pdffiledump
10256 \pdffilemoddate
10257 \pdffilesize
10258 \pdflastmatch
10259 \pdfmatch
10260 \pdfmdfivesum
10261 \pdfresettimer
10262 \pdfshellescape
10263 \pdfstrcmp
10264 \pdfunescapehex
10265 \stoptyping
10266 \stopcolumns
10268 \item The four primitives that were already marked obsolete in \PDFTEX~1.40
10269 have been removed since \LUATEX~0.42:
10271 \startcolumns[n=2]
10272 \starttyping
10273 \pdfoptionalwaysusepdfpagebox
10274 \pdfoptionpdfinclusionerrorlevel
10275 \pdfforcepagebox
10276 \pdfmovechars
10277 \stoptyping
10278 \stopcolumns
10281 \item A few other experimental primitives are also provided without the
10282 extra \luatex {pdf} prefix, so they can also be called like this:
10284 \startcolumns[n=3]
10285 \starttyping
10286 \primitive
10287 \ifprimitive
10288 \ifabsnum
10289 \ifabsdim
10290 \stoptyping
10291 \stopcolumns
10293 \item The \tex{pdftexversion} is set to 200.
10295 \item The PNG transparency fix from 1.40.6 is not applied
10296 (high-level support is pending).
10298 \item LFS (\PDF\ Files larger than 2GiB) support is not working yet.
10300 \item \LUATEX~0.45.0 introduces two extra token lists, \tex{pdfxformresources}
10301 and \tex{pdfxformattr}, as an alternative to \tex{pdfxform} keywords.
10303 \item As of \LUATEX~0.50.0 it is no longer possible for fonts from embedded pdf files
10304 to be replaced by / merged with the document fonts of the enveloping
10305 pdf document. This regression may be temporary, depending on how the
10306 rewritten font backend will look after beta 0.60.
10309 \stopitemize
\section{Changes from \ALEPH\ RC4}

\startitemize

\item Starting with \LUATEX\ 0.75.0, the extended 16-bit math primitives
(\tex{omathcode} etc.) have been removed.

\item Starting with \LUATEX\ 0.63.0, OCP processing is no longer
supported at all. As a consequence, the following primitives have
been removed:

\startcolumns[n=2]
\starttyping
\ocp
\externalocp
\ocplist
\pushocplist
\popocplist
\clearocplists
\addbeforeocplist
\addafterocplist
\removebeforeocplist
\removeafterocplist
\ocptracelevel
\stoptyping
\stopcolumns

\item \LUATEX\ only understands 4~of the 16~direction
specifiers of \ALEPH: \type{TLT} (latin), \type{TRT} (arabic),
\type{RTT} (cjk), \type{LTL} (mongolian). All other direction
specifiers generate an error (\LUATEX\ 0.45).

\item The input translations from \ALEPH\ are not implemented, and the
related primitives are not available:

\startcolumns[n=2]
\starttyping
\DefaultInputMode
\noDefaultInputMode
\noInputMode
\InputMode
\DefaultOutputMode
\noDefaultOutputMode
\noOutputMode
\OutputMode
\DefaultInputTranslation
\noDefaultInputTranslation
\noInputTranslation
\InputTranslation
\DefaultOutputTranslation
\noDefaultOutputTranslation
\noOutputTranslation
\OutputTranslation
\stoptyping
\stopcolumns

\item The \tex{hoffset} bug that occurs with \tex{pagedir TRT} is fixed,
removing the need for an explicit fix to \tex{hoffset}.

\item A bug causing \tex{fam} to fail for family numbers above
15 is fixed.

\item A fair number of other minor bugs are fixed as well, most of them
related to \tex{tracingcommands} output.

\item The internal function \type{scan_dir()} has been renamed to
\type{scan_direction()} to prevent a naming clash, and it now allows
an optional space after the direction is completely parsed.

\item The \type{^^} notation can also come in five and six item repetitions, to
insert characters that do not fit in the BMP.

\item Glues {\it immediately after} direction change commands are not
legal breakpoints.

\stopitemize

\section{Changes from standard \WEBC}

\startitemize

\item There is no mltex.

\item There is no enctex.

\item The following commandline switches are silently ignored, even
in non|-|\LUA\ mode:

\starttyping
-8bit
-translate-file=TCXNAME
-mltex
-enc
-etex
\stoptyping

\item \tex{openout} whatsits are not written to the log file.

\item Some of the so|-|called web2c extensions are hard to set up
in non|-|\KPSE\ mode because \type{texmf.cnf} is not read: \type{shell-escape}
is off (but that is not a problem because of \LUA's
\lua{os.execute}), and the paranoia checks on \type{openin} and
\type{openout} do not happen (however, it is easy for a \LUA\ script
to do this itself by overloading \lua{io.open}).

\item The \quote{E} option does not do anything useful.

\stopitemize

\chapter{Implementation notes}

\section{Primitives overlap}

The primitives

\starttabulate[|l|l|]
\NC \tex{pdfpagewidth} \NC \tex{pagewidth} \NC \NR
\NC \tex{pdfpageheight}\NC \tex{pageheight} \NC \NR
\NC \tex{fontcharwd} \NC \tex{charwd} \NC \NR
\NC \tex{fontcharht} \NC \tex{charht} \NC \NR
\NC \tex{fontchardp} \NC \tex{chardp} \NC \NR
\NC \tex{fontcharic} \NC \tex{charit} \NC \NR
\stoptabulate

are pairwise aliases of each other: the two primitives in each row are
interchangeable.

\section{Memory allocation}

The single internal memory heap that traditional \TEX\ used for tokens
and nodes is split into two separate arrays. Each of these will grow
dynamically when needed.

The \type{texmf.cnf} settings related to main memory are no longer
used (these are: \type{main_memory}, \type{mem_bot},
\type{extra_mem_top} and \type{extra_mem_bot}). \quote{Out of main
memory} errors can still occur, but the limiting factor is now the
amount of RAM in your system, not a predefined limit.

Also, the memory (de)allocation routines for nodes have been completely
rewritten. The relevant code now lives in the C file \type{texnode.c},
and basically uses a dozen or so \quote{avail} lists instead of a
doubly|-|linked model. An extra function layer is added so that the
code can ask for nodes by type instead of directly requisitioning
a certain number of memory words.

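
As a rough illustration of this scheme, the following C sketch keeps one
free list per node size and puts a by|-|type layer on top of it. It is not
taken from \type{texnode.c}: the node types, their sizes, the growth policy
and all function bodies are assumptions made for the sake of the example.

\starttyping
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODE_WORDS 12             /* assumed: one free list per node size */

typedef struct { int link; int info; } memword;  /* simplified memory word */

static memword *varmem = NULL;        /* node memory, grows on demand */
static int var_size = 0;              /* words currently allocated */
static int var_used = 1;              /* index 0 is reserved as the null pointer */
static int avail[MAX_NODE_WORDS + 1]; /* avail[s]: free list of s-word nodes */

/* hypothetical node types and their sizes in memory words */
enum { GLUE_NODE, KERN_NODE, GLYPH_NODE, NODE_TYPES };
static const int node_words[NODE_TYPES] = { 4, 3, 5 };

/* low-level layer: hand out a block of the requested number of words */
static int get_node(int words)
{
    int p;
    assert(words >= 1 && words <= MAX_NODE_WORDS);
    p = avail[words];
    if (p != 0) {                     /* reuse a freed node of this size */
        avail[words] = varmem[p].link;
    } else {                          /* carve fresh words off the end */
        if (var_used + words > var_size) {
            int new_size = var_size == 0 ? 64 : 2 * var_size;
            memword *grown;
            while (new_size < var_used + words)
                new_size *= 2;
            grown = realloc(varmem, (size_t) new_size * sizeof(memword));
            if (grown == NULL)
                abort();
            varmem = grown;
            var_size = new_size;
        }
        p = var_used;
        var_used += words;
    }
    memset(&varmem[p], 0, (size_t) words * sizeof(memword));
    return p;
}

/* low-level layer: push a block onto the free list for its size */
static void free_node(int p, int words)
{
    varmem[p].link = avail[words];
    avail[words] = p;
}

/* by-type layer: callers name a node type, not a word count */
static int new_node(int type)
{
    int p = get_node(node_words[type]);
    varmem[p].info = type;
    return p;
}

static void flush_node(int p)
{
    free_node(p, node_words[varmem[p].info]);
}
\stoptyping

In this sketch callers only use \type{new_node} and \type{flush_node}; a
freed node is handed out again by the next request for a node of the same
size.
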
Because of the split into two arrays and the resulting differences in the data
structures, some of the macros have been duplicated. For instance, there are now
\type{vlink} and \type{vinfo} as well as \type{token_link} and \type{token_info}. All
access to the variable memory array is now hidden behind a macro called \type{vmem}.

Because the two arrays grow via reallocation, which may move them in
memory, there is a potential pitfall: an element of a memory array should
never be used as the left|-|hand side of a statement whose right|-|hand
side can modify (and thus reallocate) the array in question.

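
The pitfall can be shown with a small C sketch. The array \type{mem} and the
functions \type{alloc_slot} and \type{link_to_new_slot} are hypothetical
stand|-|ins, not names from the \LUATEX\ source. The point is that C does not
fix the order in which the two sides of an assignment are evaluated, so the
left|-|hand side may be addressed through a pointer that the right|-|hand
side has just invalidated.

\starttyping
#include <assert.h>
#include <stdlib.h>

static int *mem = NULL;     /* hypothetical growable array */
static int mem_size = 0;    /* slots allocated */
static int mem_used = 0;    /* slots handed out */

/* Returns the index of a fresh slot; may move mem via realloc. */
static int alloc_slot(void)
{
    if (mem_used == mem_size) {
        int new_size = mem_size == 0 ? 4 : 2 * mem_size;
        int *grown = realloc(mem, (size_t) new_size * sizeof(int));
        if (grown == NULL)
            abort();
        mem = grown;
        mem_size = new_size;
    }
    mem[mem_used] = 0;
    return mem_used++;
}

/* Risky: the compiler may load mem before calling alloc_slot(),
 * so the store can go through the old, freed block:
 *
 *     mem[p] = alloc_slot();
 */

/* Safe: finish the call first, then index the (possibly moved) array. */
static void link_to_new_slot(int p)
{
    int q;
    assert(p >= 0 && p < mem_used);
    q = alloc_slot();       /* any reallocation happens here */
    mem[p] = q;             /* mem is read only after it has settled */
}
\stoptyping
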
The input line buffer and pool size are now also reallocated when
needed, and the \type{texmf.cnf} settings \type{buf_size} and
\type{pool_size} are silently ignored.

\section{Sparse arrays}

The \tex{mathcode}, \tex{delcode}, \tex{catcode},
\tex{sfcode}, \tex{lccode} and \tex{uccode} tables are now
sparse arrays that are implemented in~C. They are no longer part of
the \TEX\ \quote{equivalence table} and because each had 1.1 million
entries with a few memory words each, this makes a major difference
in memory usage.

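
To give an idea of why a sparse representation saves memory, here is a
minimal two|-|level table in~C. It is only a sketch: the page size, the type
and function names are assumptions, the layout is not claimed to match
\type{mathcodes.c} or \type{textcodes.c}, and grouping (saving and restoring
values) is left out entirely.

\starttyping
#include <assert.h>
#include <stdlib.h>

#define CODE_POINTS 0x110000            /* about 1.1 million character codes */
#define PAGE_BITS   8
#define PAGE_SIZE   (1 << PAGE_BITS)
#define PAGE_COUNT  (CODE_POINTS >> PAGE_BITS)

typedef struct {
    int default_value;                  /* reported for entries never set */
    int *pages[PAGE_COUNT];             /* NULL until a page is written */
} sparse_table;

static sparse_table *sparse_new(int default_value)
{
    int i;
    sparse_table *t = malloc(sizeof *t);
    if (t == NULL)
        abort();
    t->default_value = default_value;
    for (i = 0; i < PAGE_COUNT; i++)
        t->pages[i] = NULL;
    return t;
}

static int sparse_get(const sparse_table *t, int c)
{
    const int *page;
    assert(c >= 0 && c < CODE_POINTS);
    page = t->pages[c >> PAGE_BITS];
    return page != NULL ? page[c & (PAGE_SIZE - 1)] : t->default_value;
}

static void sparse_set(sparse_table *t, int c, int value)
{
    int **slot;
    assert(c >= 0 && c < CODE_POINTS);
    slot = &t->pages[c >> PAGE_BITS];
    if (*slot == NULL) {                /* first write: allocate this page */
        int i;
        *slot = malloc(PAGE_SIZE * sizeof(int));
        if (*slot == NULL)
            abort();
        for (i = 0; i < PAGE_SIZE; i++)
            (*slot)[i] = t->default_value;
    }
    (*slot)[c & (PAGE_SIZE - 1)] = value;
}
\stoptyping

Only pages that contain at least one explicitly set entry take up storage;
all other lookups fall back to the default value.
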
The \tex{catcode}, \tex{sfcode}, \tex{lccode} and \tex{uccode} assignments
do not yet show up when using the etex tracing routines \tex{tracingassigns}
and \tex{tracingrestores} (the code has simply not been written yet).

A side|-|effect of the current implementation is that \tex{global}
assignments are now more expensive in terms of processing than
non|-|global ones.

See \type{mathcodes.c} and \type{textcodes.c} if you are interested in
the details.

Also, the glyph ids within a font are now managed by means
of a sparse array and glyph ids can go up to index $2^{21}-1$.

\section{Simple single-character csnames}

Single|-|character commands are no longer treated specially in the
internals: they are stored in the hash just like the multiletter
csnames.

The code that displays control sequences explicitly checks if
the length is one when it has to decide whether or not to add a
trailing space.

Active characters are internally implemented as a special type
of multiletter control sequence that uses a prefix that is
otherwise impossible to obtain.

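
The idea of such a prefix can be sketched in~C as follows. The actual prefix
is not stated here, so the byte used below is a placeholder chosen only
because it cannot occur in valid \type{UTF-8} input; the function names are
hypothetical as well.

\starttyping
#include <assert.h>
#include <string.h>

/* Placeholder prefix (an assumption): the byte 0xFF never occurs in
 * valid UTF-8, so no name typed by a user can start with it. */
#define ACTIVE_PREFIX "\xFF"

/* Builds the hash key for an active character given as a UTF-8 string.
 * Returns 0 on success, -1 if the key does not fit into out. */
static int active_key(const char *utf8_char, char *out, size_t out_size)
{
    size_t need;
    assert(utf8_char != NULL && out != NULL);
    need = strlen(ACTIVE_PREFIX) + strlen(utf8_char) + 1;
    if (need > out_size)
        return -1;
    strcpy(out, ACTIVE_PREFIX);
    strcat(out, utf8_char);
    return 0;
}

/* Tells whether a hash key denotes an active character. */
static int is_active_key(const char *key)
{
    return strncmp(key, ACTIVE_PREFIX, strlen(ACTIVE_PREFIX)) == 0;
}
\stoptyping

With a scheme like this, active characters and ordinary control sequences
can share one hash table without their names ever colliding.
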
\section{Compressed format}

The format is passed through zlib, allowing it to shrink to roughly
half of the size it would have had in uncompressed form. This takes a
few more CPU cycles but much less disk I/O, so it should still be
faster.

\section{Binary file reading}

All of the internal code is changed in such a way that if one of the
\type{read_xxx_file} callbacks is not set, then the file is read by
a C function using basically the same convention as the callback: a
single read into a buffer big enough to hold the entire file
contents. While this uses more memory than the previous code (that
mostly used \type{getc} calls), it can be quite a bit faster
(depending on your I/O subsystem).

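
The \quote{single read into one buffer} convention can be sketched in
standard~C as below. The function name \type{read_whole_file} and its
signature are assumptions made for this example; the sketch does not
reproduce the actual \LUATEX\ reader.

\starttyping
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads a complete file with one fread call. On success, returns a
 * malloc'ed buffer (to be freed by the caller) and stores the length
 * in *size; returns NULL on any failure. */
static unsigned char *read_whole_file(const char *name, size_t *size)
{
    FILE *f;
    unsigned char *buf = NULL;
    long len;
    assert(name != NULL && size != NULL);
    f = fopen(name, "rb");
    if (f == NULL)
        return NULL;
    if (fseek(f, 0, SEEK_END) == 0 && (len = ftell(f)) >= 0
        && fseek(f, 0, SEEK_SET) == 0) {
        /* allocate at least one byte so that an empty file is not an error */
        buf = malloc(len > 0 ? (size_t) len : 1);
        if (buf != NULL && fread(buf, 1, (size_t) len, f) != (size_t) len) {
            free(buf);              /* short read: give up */
            buf = NULL;
        }
        if (buf != NULL)
            *size = (size_t) len;
    }
    fclose(f);
    return buf;
}
\stoptyping

Compared with a loop of \type{getc} calls, this trades a buffer the size of
the file for a single large read request.
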
\chapter{Known bugs and limitations, TODO}

There used to be a list of bugs and planned features below here, but that did not
work out too well. There are lists of open bugs and feature requests in the tracker at
\hyphenatedurl{http://tracker.luatex.org}.

\stoptext