% language=uk

\environment luatex-style
\environment luatex-logos

\startcomponent luatex-enhancements

\startchapter[reference=enhancements,title={Basic \TEX\ enhancements}]

\section{Introduction}

From day one, \LUATEX\ has offered extra features compared to the superset of
\PDFTEX\ and \ALEPH. This has not been limited to the possibility to execute
\LUA\ code via \type {\directlua}, but \LUATEX\ also adds functionality via new
\TEX|-|side primitives or extensions to existing ones.

When \LUATEX\ starts up in \quote {iniluatex} mode (\type {luatex -ini}), it
defines only the primitive commands known by \TEX82 and the one extra command
\type {\directlua}. As is fitting, a \LUA\ function has to be called to add the
extra primitives to the user environment. The simplest method to get access to
all of the new primitive commands is by adding this line to the format generation
file:

\starttyping
\directlua { tex.enableprimitives('',tex.extraprimitives()) }
\stoptyping

But be aware that the curly braces may not have the proper \type {\catcode}
assigned to them at this early time (giving a \quote {Missing number} error), so
you may need to put these assignments before the above line:

\starttyping
\catcode `\{=1
\catcode `\}=2
\stoptyping

More fine|-|grained primitive control is possible and you can look up the
details in \in {section} [luaprimitives]. For simplicity's sake, this manual
assumes that you have executed the \type {\directlua} command as given above.

The startup behaviour documented above is considered stable in the sense that
there will not be backward|-|incompatible changes any more. We have promoted some
rather generic \PDFTEX\ primitives to core \LUATEX\ ones, and the ones inherited
from \ALEPH\ (\OMEGA) are also promoted. Effectively this means that we now only
have the \type {tex}, \type {etex} and \type {luatex} sets left.

In \in {Chapter} [modifications] we discuss several primitives that are derived
from \PDFTEX\ and \ALEPH\ (\OMEGA). Here we stick to the really new ones. In the
chapters on fonts and math we discuss a few more new ones.

\section{Version information}

There are three new primitives to test the version of \LUATEX:

\starttabulate[|l|pl|pl|]
\NC \bf primitive             \NC \bf explanation                           \NC \bf value          \NC \NR
\NC \type {\luatexbanner}     \NC the banner reported on the command line   \NC \luatexbanner      \NC \NR
\NC \type {\luatexversion}    \NC a combination of major and minor number   \NC \the\luatexversion \NC \NR
\NC \type {\luatexrevision}   \NC the revision number, the current value is \NC \luatexrevision    \NC \NR
\stoptabulate

The official \LUATEX\ version is defined as follows:

\startitemize
\startitem
    The major version is the integer result of \type {\luatexversion} divided by
    100. The primitive is an \quote {internal variable}, so you may need to prefix
    its use with \type {\the} depending on the context.
\stopitem
\startitem
    The minor version is the two-digit result of \type {\luatexversion} modulo 100.
\stopitem
\startitem
    The revision is given by \type {\luatexrevision}. This primitive expands to
    a positive integer.
\stopitem
\startitem
    The full version number consists of the major version, minor version and
    revision, separated by dots.
\stopitem
\stopitemize
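
The rules above can be sketched in plain \TEX\ as follows. The macro names
\type {\major}, \type {\minor} and \type {\fullversion}, and the use of \type
{\count255} as a scratch register, are assumptions made for this example only.
The division is done with \type {\divide}, which truncates; \ETEX's \type
{\numexpr} rounds its quotient and is therefore used for the remainder only.

\starttyping
\count255=\luatexversion
\divide\count255 by 100                             % truncating division: major
\edef\major{\the\count255}
\count255=\numexpr\luatexversion-100*\major\relax   % remainder: minor
\edef\minor{\ifnum\count255<10 0\fi\the\count255}   % pad to two digits
\edef\fullversion{\major.\minor.\luatexrevision}
\message{\fullversion}
\stoptyping
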
\section{\UNICODE\ text support}

Text input and output is now considered to be \UNICODE\ text, so input characters
can use the full range of \UNICODE\ ($2^{20}+2^{16}-1 = \hbox{0x10FFFF}$). Later
chapters will talk of characters and glyphs. Although these are not
interchangeable, they are closely related. During typesetting, a character is
always converted to a suitable graphic representation of that character in a
specific font. However, while processing a list of to|-|be|-|typeset nodes, its
contents may still be seen as a character. Inside \LUATEX\ there is no clear
separation between the two concepts. Because the subtype of a glyph node can be
changed in \LUA\ it is up to the user: subtypes larger than 255 indicate that
font processing has happened.

A few primitives are affected by this, all in a similar fashion: each of them has
to accommodate a larger range of acceptable numbers. For instance, \type
{\char} now accepts values between~0 and $1{,}114{,}111$. This should not be a
problem for well|-|behaved input files, but it could create incompatibilities for
input that would have generated an error when processed by older \TEX|-|based
engines. The affected commands with an altered initial (left of the equals sign)
or secondary (right of the equals sign) value are: \type {\char}, \type
{\lccode}, \type {\uccode}, \type {\catcode}, \type {\sfcode}, \type {\efcode},
\type {\lpcode}, \type {\rpcode}, \type {\chardef}.

As far as the core engine is concerned, all input and output to text files is
\UTF-8 encoded. Input files can be pre|-|processed using the \type {reader}
callback. This will be explained in a later chapter.

Output in byte|-|sized chunks can be achieved by using characters just outside of
the valid \UNICODE\ range, starting at the value $1{,}114{,}112$ (0x110000). When
the time comes to print a character $c \geq 1{,}114{,}112$, \LUATEX\ will actually
print the single byte corresponding to $c - 1{,}114{,}112$.

Output to the terminal uses \type {^^} notation for the lower control range
($c<32$), with the exception of \type {^^I}, \type {^^J} and \type {^^M}. These
are considered \quote {safe} and therefore printed as|-|is. You can disable
escaping with \type {texio.setescape(false)} in which case you get the normal
characters on the console.

Normalization of the \UNICODE\ input can be handled by a macro package during
callback processing (this will be explained in \in {section} [iocallback]).

\section{Extended tables}

All traditional \TEX\ and \ETEX\ registers can be addressed by 16-bit numbers. The
affected commands are:

\startfourcolumns
\starttyping
\count
\dimen
\skip
\muskip
\marks
\toks
\countdef
\dimendef
\skipdef
\muskipdef
\toksdef
\insert
\box
\unhbox
\unvbox
\copy
\unhcopy
\unvcopy
\wd
\ht
\dp
\setbox
\vsplit
\stoptyping
\stopfourcolumns

Because font memory management has been rewritten, character properties in fonts
are no longer shared among font instances that originate from the same metric
file.

\section{Attributes}

\subsection{Attribute registers}

Attributes are a completely new concept in \LUATEX. Syntactically, they behave a
lot like counters: attributes obey \TEX's nesting stack and can be used after
\type {\the} etc.\ just like the normal \type {\count} registers.

\startsyntax
\attribute <16-bit number> <optional equals> <32-bit number>!crlf
\attributedef <csname> <optional equals> <16-bit number>
\stopsyntax

Conceptually, an attribute is either \quote {set} or \quote {unset}. Unset
attributes have a special negative value to indicate that they are unset; that
value is the lowest legal value: \type {-"7FFFFFFF} in hexadecimal, a.k.a.
$-2147483647$ in decimal. It follows that the value \type {-"7FFFFFFF} cannot be
used as a legal attribute value, but you {\it can\/} assign \type {-"7FFFFFFF} to
\quote {unset} an attribute. All attributes start out in this \quote {unset}
state in \INITEX.

Attributes can be used as extra counter values, but their usefulness comes mostly
from the fact that the numbers and values of all \quote {set} attributes are
attached to all nodes created in their scope. These can then be queried from any
\LUA\ code that deals with node processing. Further information about how to use
attributes for node list processing from \LUA\ is given in~\in {chapter}[nodes].

Attributes are stored in sorted (sparse) linked lists that are shared when
possible. This permits efficient testing and updating.
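
As a minimal sketch of this mechanism, the following fragment sets an attribute,
builds a box while it is set, and reads the value back from the first node of
that box. The attribute number~100 and the name \type {\myattr} are arbitrary
choices for this example; \type {node.has_attribute} returns the value of the
attribute on the given node, or \type {nil} when it is unset there.

\starttyping
\attributedef\myattr=100
\myattr=7
\setbox0=\hbox{x}        % the glyph node is created while \myattr is 7
\myattr=-"7FFFFFFF       % unset the attribute again
\directlua {
    local n = tex.box[0].head
    texio.write_nl("attribute 100 on first node: " ..
        tostring(node.has_attribute(n, 100)))
}
\stoptyping
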
\subsection{Box attributes}

Nodes typically receive the list of attributes that is in effect when they are
created. This moment can be quite asynchronous. For example: in paragraph
building, the individual line boxes are created after the \type {\par} command has
been processed, so they will receive the list of attributes that is in effect
then, not the attributes that were in effect in, say, the first or third line of
the paragraph.

Similar situations happen in \LUATEX\ regularly. A few of the more obvious
problematic cases are dealt with: the attributes for nodes that are created
during hyphenation, kerning and ligaturing borrow their attributes from their
surrounding glyphs, and it is possible to influence box attributes directly.

When you assemble a box in a register, the attributes of the nodes contained in
the box are unchanged when such a box is placed, unboxed, or copied. In this
respect attributes act the same as characters that have been converted to
references to glyphs in fonts. For instance, when you use attributes to implement
color support, each node carries information about its eventual color. In that
case, unless you implement mechanisms that deal with it, applying a color to
already boxed material will have no effect. Keep in mind that this
incompatibility is mostly due to the fact that separate specials and literals are
a more unnatural approach to colors than attributes.

It is possible to fine-tune the list of attributes that are applied to a \type
{hbox}, \type {vbox} or \type {vtop} by the use of the keyword \type {attr}. An
example:

\starttyping
\attribute2=5
\setbox0=\hbox {Hello}
\setbox2=\hbox attr1=12 attr2=-"7FFFFFFF{Hello}
\stoptyping

This will set the attribute list of box~2 to $1=12$, and the attributes of box~0
will be $2=5$. As you can see, assigning the maximum negative value causes an
attribute to be ignored.

The \type {attr} keyword(s) should come before a \type {to} or \type {spread}, if
that is also specified.

\section{\LUA\ related primitives}

\subsection{\type {\directlua}}

In order to merge \LUA\ code with \TEX\ input, a few new primitives are needed.
The primitive \type {\directlua} is used to execute \LUA\ code immediately. The
syntax is

\startsyntax
\directlua <general text>!crlf
\directlua <16-bit number> <general text>
\stopsyntax

The \syntax {<general text>} is expanded fully, and then fed into the \LUA\
interpreter. After reading and expansion have been applied to the \syntax
{<general text>}, the resulting token list is converted to a string as if it was
displayed using \type {\the\toks}. On the \LUA\ side, each \type {\directlua}
block is treated as a separate chunk. In such a chunk you can use the \type
{local} directive to keep your variables from interfering with those used by the
macro package.

The conversion to and from a token list means that you normally cannot use \LUA\
line comments (starting with \type {--}) within the argument. As there typically
will be only one \quote {line} the first line comment will run on until the end
of the input. You will either need to use \TEX|-|style line comments (starting
with \%), or change the \TEX\ category codes locally. Another possibility is to
say:

\starttyping
\begingroup
\endlinechar=10
\directlua ...
\endgroup
\stoptyping

Then \LUA\ line comments can be used, since \TEX\ does not replace line endings
with spaces.

Likewise, the \syntax {<16-bit number>} designates a name of a \LUA\ chunk and is
taken from the \type {lua.name} array (see the documentation of the \type {lua}
table further in this manual). When a chunk name starts with a \type {@} it will
be displayed as a file name. This is a side effect of the way \LUA\ implements
error handling.

The \type {\directlua} command is expandable. Since it passes \LUA\ code to the
\LUA\ interpreter its expansion from the \TEX\ viewpoint is usually empty.
However, there are some \LUA\ functions that produce material to be read by \TEX,
the so|-|called print functions. The simplest use of these is \type
{tex.print(<string> s)}. The characters of the string \type {s} will be placed on
the \TEX\ input buffer, that is, \quote {before \TEX's eyes} to be read by \TEX\
immediately. For example:

\startbuffer
\count10=20
a\directlua{tex.print(tex.count[10]+5)}b
\stopbuffer

\typebuffer

expands to

\getbuffer

Here is another example:

\startbuffer
$\pi = \directlua{tex.print(math.pi)}$
\stopbuffer

\typebuffer

will result in

\getbuffer

Note that the expansion of \type {\directlua} is a sequence of characters, not of
tokens, contrary to all \TEX\ commands. So formally speaking its expansion is
null, but it places material on a pseudo-file to be immediately read by \TEX, like
\ETEX's \type {\scantokens}. For a description of print functions look at \in
{section} [sec:luaprint].

Because the \syntax {<general text>} is a chunk, the normal \LUA\ error handling
is triggered if there is a problem in the included code. The \LUA\ error messages
should be clear enough, but the contextual information is still pretty bad.
Often, you will only see the line number of the right brace at the end of the
code.

While on the subject of errors: some of the things you can do inside \LUA\ code
can break up \LUATEX\ pretty badly. If you are not careful while working with the
node list interface, you may even end up with assertion errors from within the
\TEX\ portion of the executable.

The behaviour documented in the above subsection is considered stable in the sense
that there will not be backward-incompatible changes any more.

\subsection{\type {\latelua}}

Contrary to \type {\directlua}, \type {\latelua} stores \LUA\ code in a whatsit
that will be processed at the time of shipping out. Its intended use is a cross
between \PDF\ literals (often available as \type {\pdfliteral}) and the
traditional \TEX\ extension \type {\write}. Within the \LUA\ code you can print
\PDF\ statements directly to the \PDF\ file via \type {pdf.print}, or you can
write to other output streams via \type {texio.write} or simply using \LUA\ \IO\
routines.

\startsyntax
\latelua <general text>!crlf
\latelua <16-bit number> <general text>
\stopsyntax

Expansion of macros in the final \type {<general text>} is delayed until just
before the whatsit is executed (like in \type {\write}). With regard to the \PDF\
output stream \type {\latelua} behaves as \PDF\ page literals. The \syntax
{name <general text>} and \syntax {<16-bit number>} behave in the same way as
they do for \type {\directlua}.
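
As a small illustration of the delayed execution, the following sketch writes a
line to the terminal and log at the moment the page that contains the whatsit is
shipped out. Treating \type {\count0} as the page number is the plain \TEX\
convention and is an assumption here; a macro package may keep its page number
elsewhere.

\starttyping
\latelua { texio.write_nl("shipping out page " .. tex.count[0]) }
\stoptyping
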
\subsection{\type {\luaescapestring}}

This primitive converts a \TEX\ token sequence so that it can be safely used as
the contents of a \LUA\ string: embedded backslashes, double and single quotes,
and newlines and carriage returns are escaped. This is done by prepending an
extra token consisting of a backslash with category code~12, and for the line
endings, converting them to \type {n} and \type {r} respectively. The token
sequence is fully expanded.

\startsyntax
\luaescapestring <general text>
\stopsyntax
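
A typical use, sketched here with a hypothetical macro \type {\mytext}, is
passing text that contains quotes to \LUA\ inside a string literal. Without the
escaping, the embedded double quotes would end the \LUA\ string prematurely.

\starttyping
\def\mytext{a "quoted" word}
\directlua { texio.write_nl("\luaescapestring{\mytext}") }
\stoptyping
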
Most often, this command is not actually the best way to deal with the
differences between \TEX\ and \LUA. In very short bits of \LUA\
code it is often not needed, and for longer stretches of \LUA\ code it
is easier to keep the code in a separate file and load it using \LUA's
\type {dofile}:

\starttyping
\directlua { dofile('mysetups.lua') }
\stoptyping

\subsection{\type {\luafunction}}

The \type {\directlua} command involves tokenization of its argument (after
picking up an optional name or number specification). The tokenlist is then
converted into a string and given to \LUA\ to turn into a function that is
called. The overhead is rather small but when you use this primitive hundreds of
thousands of times, it can become noticeable. For this reason there is a variant
call available: \type {\luafunction}. This command is used as follows:

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
    t[2] = function() tex.print("?") end
}

\luafunction1
\luafunction2
\stoptyping

Of course the functions can also be defined in a separate file. There is no limit
on the number of functions apart from normal \LUA\ limitations. Of course there
is the limitation of no arguments but that would involve parsing and thereby give
no gain. The function, when called, in fact gets one argument, being the index, so
in the following example the number \type {8} gets typeset.

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[8] = function(slot) tex.print(slot) end
}
\stoptyping

\section{\type {\clearmarks}}

This primitive complements the \ETEX\ mark primitives and clears a mark class
completely, resetting all three connected mark texts to empty. It is an
immediate command.

\startsyntax
\clearmarks <16-bit number>
\stopsyntax

\section{\type {\noligs} and \type {\nokerns}}

These primitives prohibit ligature and kerning insertion at the time when the
initial node list is built by \LUATEX's main control loop. You can enable these
primitives when you want to do node list processing of \quote {characters}, where
\TEX's normal processing would get in the way.

\startsyntax
\noligs <integer>!crlf
\nokerns <integer>
\stopsyntax

These primitives can also be implemented by overloading the ligature building and
kerning functions, i.e.\ by assigning dummy functions to their associated
callbacks. Keep in mind that when you define a font (using \LUA) you can also
omit the kern and ligature tables, which has the same effect as the above.

\section{\type {\formatname}}

The \type {\formatname} syntax is identical to \type {\jobname}. In \INITEX, the
expansion is empty. Otherwise, the expansion is the value that \type {\jobname} had
during the \INITEX\ run that dumped the currently loaded format.

\section{\type {\scantextokens}}

The syntax of \type {\scantextokens} is identical to \type {\scantokens}. This
primitive is a slightly adapted version of \ETEX's \type {\scantokens}. The
differences are:

\startitemize
\startitem
    The last (and usually only) line does not have a \type {\endlinechar}
    appended.
\stopitem
\startitem
    \type {\scantextokens} never raises an EOF error, and it does not execute
    \type {\everyeof} tokens.
\stopitem
\startitem
    There are no \quote {\unknown\ while end of file \unknown} error tests
    executed. This allows the expansion to end on a different grouping level or
    while a conditional is still incomplete.
\stopitem
\stopitemize

\section {Alignments}

\subsection{\tex {alignmark}}

This primitive duplicates the functionality of \type {#} inside alignment
preambles.

\subsection{\tex {aligntab}}

This primitive duplicates the functionality of \type {&} inside alignments and
preambles.

\section{Catcode tables}

Catcode tables are a new feature that allows you to switch to a predefined
catcode regime in a single statement. You can have a practically unlimited number
of different tables. This subsystem is backward compatible: if you never use the
following commands, your document will not notice any difference in behaviour
compared to traditional \TEX. The contents of each catcode table are independent
of any other catcode table, and their contents are stored in and retrieved from
the format file.

\subsection{\type {\catcodetable}}

\startsyntax
\catcodetable <15-bit number>
\stopsyntax

The primitive \type {\catcodetable} switches to a different catcode table. Such a
table has to be previously created using one of the two primitives below, or it
has to be zero. Table zero is initialized by \INITEX.

\subsection{\type {\initcatcodetable}}

\startsyntax
\initcatcodetable <15-bit number>
\stopsyntax

The primitive \type {\initcatcodetable} creates a new table with catcodes identical
to those defined by \INITEX:

\starttabulate[|r|l|l|l|]
\NC  0 \NC \tttf \letterbackslash     \NC         \NC \type {escape}       \NC\NR
\NC  5 \NC \tttf \letterhat\letterhat M \NC return \NC \type {car_ret}      \NC\NR
\NC  9 \NC \tttf \letterhat\letterhat @ \NC null   \NC \type {ignore}       \NC\NR
\NC 10 \NC \tttf <space>              \NC space   \NC \type {spacer}       \NC\NR
\NC 11 \NC {\tttf a} \endash\ {\tttf z} \NC        \NC \type {letter}       \NC\NR
\NC 11 \NC {\tttf A} \endash\ {\tttf Z} \NC        \NC \type {letter}       \NC\NR
\NC 12 \NC everything else            \NC         \NC \type {other}        \NC\NR
\NC 14 \NC \tttf \letterpercent       \NC         \NC \type {comment}      \NC\NR
\NC 15 \NC \tttf \letterhat\letterhat ? \NC delete \NC \type {invalid_char} \NC\NR
\stoptabulate

The new catcode table is allocated globally: it will not go away after the
current group has ended. If the supplied number is identical to the currently
active table, an error is raised.

\subsection{\type {\savecatcodetable}}

\startsyntax
\savecatcodetable <15-bit number>
\stopsyntax

\type {\savecatcodetable} copies the current set of catcodes to a new table with
the requested number. The definitions in this new table are all treated as if
they were made in the outermost level.

The new table is allocated globally: it will not go away after the current group
has ended. If the supplied number is the currently active table, an error is
raised.
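
The three primitives can be combined as in the following sketch. The table
numbers 10 and~11 are arbitrary choices, and it is assumed that neither of them
is the currently active table and that the switch is undone at the end of the
group, like other local assignments.

\starttyping
\initcatcodetable 10    % table 10: the IniTeX defaults listed above
\savecatcodetable 11    % table 11: a snapshot of the current catcodes

\begingroup
    \catcodetable 10    % braces, $ and & are now of category "other"
    % material that needs the IniTeX regime goes here
\endgroup               % back to the table that was active before
\stoptyping
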
\section{Suppressing errors}

\subsection{\type {\suppressfontnotfounderror}}

\startsyntax
\suppressfontnotfounderror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
font metrics that are not found. Instead it will silently skip the font
assignment, making the requested csname for the font \type {\ifx} equal to \type
{\nullfont}, so that it can be tested against that without bothering the user.
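
The test described above can be sketched as follows. The font name and the
macro name \type {\testfont} are hypothetical.

\starttyping
\suppressfontnotfounderror=1
\font\testfont=no-such-font-metric
\ifx\testfont\nullfont
    \message{font not found, using a fallback}
\else
    \testfont
\fi
\stoptyping
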
\subsection{\type {\suppresslongerror}}

\startsyntax
\suppresslongerror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
\type {\par} commands encountered in contexts where that is normally prohibited
(most prominently in the arguments of non-long macros).

\subsection{\type {\suppressifcsnameerror}}

\startsyntax
\suppressifcsnameerror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
non-expandable commands appearing in the middle of a \type {\ifcsname} expansion.
Instead, it will keep getting expanded tokens from the input until it encounters
an \type {\endcsname} command. If the input expansion is unbalanced with respect
to \type {\csname} \ldots \type {\endcsname} pairs, the \LUATEX\ process may hang
indefinitely.

\subsection{\type {\suppressoutererror}}

\startsyntax
\suppressoutererror = 1
\stopsyntax

If this new integer parameter is non|-|zero, then \LUATEX\ will not complain
about \type {\outer} commands encountered in contexts where that is normally
prohibited.

\subsection{\type {\suppressmathparerror}}

The following setting will permit \type {\par} tokens in a math formula:

\startsyntax
\suppressmathparerror = 1
\stopsyntax

So, the next code is valid then:

\starttyping
$ x + 1 =

a $
\stoptyping

\section{\type {\matheqnogapstep}}

By default \TEX\ will add one quad between the equation and the number. This is
hard coded. A new primitive can control this:

\startsyntax
\matheqnogapstep = 1000
\stopsyntax

Because a math quad from the math text font is used instead of a dimension, we
use a step to control the size. A value of zero will suppress the gap. The step
is divided by 1000 which is the usual way to mimic floating point factors in
\TEX.

\section{\type {\outputbox}}

\startsyntax
\outputbox = 65535
\stopsyntax

This new integer parameter allows you to alter the number of the box that will be
used to store the page sent to the output routine. Its default value is 255, and
the acceptable range is from 0 to 65535.

\section{\type {\fontid} and \type {\setfontid}}

\startsyntax
\fontid\font
\stopsyntax

This primitive expands into a number. It is not a register so there is no need to
prefix with \type {\number} (and using \type {\the} gives an error). The currently
used font id is \fontid\font. Here are some more:

\starttabulate[|l|c|]
\NC \type {\bf} \NC \bf \fontid\font \NC \NR
\NC \type {\it} \NC \it \fontid\font \NC \NR
\NC \type {\bi} \NC \bi \fontid\font \NC \NR
\stoptabulate

These numbers depend on the macro package used because each one has its own way
of dealing with fonts. They can also differ per run, as they can depend on the
order of loading fonts. For instance, when in \CONTEXT\ virtual math \UNICODE\
fonts are used, we can easily get over a hundred ids in use. Not all ids have to
be bound to a real font; after all it's just a number.

The primitive \type {\setfontid} can be used to enable a font with the given id
(which of course needs to be a valid one).

\section{\type {\gleaders}}

This type of leaders is anchored to the origin of the box to be shipped out. So
they are like normal \type {\leaders} in that they align nicely, except that the
alignment is based on the {\it largest\/} enclosing box instead of the {\it
smallest\/}. The \type {g} stresses this global nature.

\section{\type {\nohrule} and \type {\novrule}}

Because internally box resources and image resources are now stored as a special
kind of rule, we also introduced an empty rule variant. Because introducing a new
keyword can cause incompatibilities, two new primitives were introduced: \type
{\nohrule} and \type {\novrule}. These can be used to reserve space. This is
often more efficient than creating an empty box with fake dimensions.

\section{\type {\Uchar}}

The expandable command \type {\Uchar} reads a number between~0 and $1{,}114{,}111$
and expands to the associated \UNICODE\ character.
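
Because the command is expandable it can be used inside an \type {\edef}, as in
this small sketch; the macro names \type {\myeacute} and \type {\mygclef} are
hypothetical.

\starttyping
\edef\myeacute{\Uchar"E9}    % holds the single character U+00E9
\edef\mygclef{\Uchar"1D11E}  % also works beyond the basic multilingual plane
\stoptyping
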
\section{\type {\hyphenationmin}}

This primitive can be used to set the minimal word length, so setting it to a value
of~$5$ means that only words of 6 characters and more will be hyphenated, of course
within the constraints of the \type {\lefthyphenmin} and \type {\righthyphenmin}
values (as stored in the glyph node). This primitive accepts a number and stores
the value with the language.

\section{\type {\boundary}, \type {\noboundary}, \type {\protrusionboundary} and \type
{\wordboundary}}

The \type {\noboundary} command used to inject a whatsit node but now injects a normal
node with type \type {boundary} and subtype~0. In addition you can say:

\starttyping
x\boundary 123\relax y
\stoptyping

This has the same effect but the subtype is now~1 and the value~123 is stored.
The traditional ligature builder still sees this as a cancel boundary directive
but at the \LUA\ end you can implement different behaviour. The added benefit of
passing this value is a side effect of the generalization. The subtypes~2 and~3
are used to control protrusion and word boundaries in hyphenation.

\section{\type {\vpack}, \type {\hpack} and \type {\tpack}}

These three primitives are like \type {\vbox}, \type {\hbox} and \type {\vtop}
but don't apply the related callbacks.

\section{\type {\csstring}, \type {\begincsname} and \type {\lastnamedcs}}

These are somewhat special. The \type {\csstring} primitive is like
\type {\string} but it omits the leading escape character. This can be
somewhat more efficient than stripping it off afterwards.

The \type {\begincsname} primitive is like \type {\csname} but doesn't create
a relaxed equivalent when there is no such name. It is equivalent to

\starttyping
\ifcsname foo\endcsname
    \csname foo\endcsname
\fi
\stoptyping

The advantage is that it saves a lookup (don't expect much speedup) but more
important is that it avoids using the \type {\if}.

The \type {\lastnamedcs} is one that should be used with care. The above
example could be written as:

\starttyping
\ifcsname foo\endcsname
    \lastnamedcs
\fi
\stoptyping

This is slightly more efficient than constructing the string twice (deep down in
\LUATEX\ this also involves some \UTF8 juggling), but probably more relevant is
that it saves a few tokens and can make code a bit more readable.

\section{\type {\toksapp}, \type {\tokspre}, \type {\etoksapp} and \type {\etokspre}}

Instead of:

\starttyping
\toks0\expandafter{\the\toks0 foo}
\stoptyping

you can use:

\starttyping
\etoksapp0{foo}
\stoptyping

The \type {pre} variants prepend instead of append, and the \type {e} variants
expand the passed general text.
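
The difference between the variants can be sketched as follows, with the
comments tracking what the text above implies for the content of \type {\toks0}
after each step. The macro \type {\x} is only there to make the expansion
visible.

\starttyping
\toks0{b}
\toksapp0{c}      % \toks0 holds: bc
\tokspre0{a}      % \toks0 holds: abc
\def\x{d}
\etoksapp0{\x}    % \x is expanded first, \toks0 holds: abcd
\toksapp0{\x}     % no expansion, \toks0 holds: abcd\x
\stoptyping
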
\section{Debugging}

If \type {\tracingonline} is larger than~2, the node list display will also print
the node number of the nodes.

\section{Images and Forms}

These two concepts are now core concepts and no longer whatsits. They are in fact
now implemented as rules with special properties. Normal rules have subtype~0,
saved boxes have subtype~1 and images have subtype~2. This has the positive side
effect that whenever we need to take content with dimensions into account, when we
look at rule nodes, we automatically also deal with these two types.

The syntax of the \type {\save...resource} primitives is the same as in \PDFTEX\
but you should consider them to be backend specific. This means that a macro
package should treat them as such and check for the current output mode if
applicable. Here are the equivalents:

\starttabulate[|l|l|]
\NC \type {\saveboxresource}             \EQ \type {\pdfxform}            \NC \NR
\NC \type {\saveimageresource}           \EQ \type {\pdfximage}           \NC \NR
\NC \type {\useboxresource}              \EQ \type {\pdfrefxform}         \NC \NR
\NC \type {\useimageresource}            \EQ \type {\pdfrefximage}        \NC \NR
\NC \type {\lastsavedboxresourceindex}   \EQ \type {\pdflastxform}        \NC \NR
\NC \type {\lastsavedimageresourceindex} \EQ \type {\pdflastximage}       \NC \NR
\NC \type {\lastsavedimageresourcepages} \EQ \type {\pdflastximagepages}  \NC \NR
\stoptabulate

\LUATEX\ accepts optional dimension parameters for \type {\use...resource} in the
same format as for rules. With images, these dimensions are then used instead of
the ones given to \type {\saveimageresource} but the original dimensions are not
overwritten, so that a \type {\useimageresource} without dimensions still
provides the image with dimensions defined by \type {\saveimageresource}. These
optional parameters are not implemented for \type {\saveboxresource}.

\starttyping
\useimageresource width 20mm height 10mm depth 5mm \lastsavedimageresourceindex
\useboxresource   width 20mm height 10mm depth 5mm \lastsavedboxresourceindex
\stoptyping

\section{\type {\outputmode} and \type {\draftmode}}

The \type {\outputmode} variable tells \LUATEX\ what it has to produce:

\starttabulate[|l|l|]
\NC \type {0} \NC \DVI\ code \NC \NR
\NC \type {1} \NC \PDF\ code \NC \NR
\stoptabulate

The value of the \type {\draftmode} counter signals to the backend whether it
should output less. The \PDF\ backend accepts a value of~$1$, while the \DVI\
backend ignores the value.
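
A macro package that needs to know which backend is active can branch on the
value of \type {\outputmode}, for instance as in this sketch; the messages are
of course only placeholders for backend specific code.

\starttyping
\ifnum\outputmode>0
    \message{backend: PDF}
\else
    \message{backend: DVI}
\fi
\stoptyping
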
\section{File syntax}

\LUATEX\ will accept a braced argument as a file name:

\starttyping
\input {plain}
\openin 0 {plain}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.

\section{Font syntax}

\LUATEX\ will accept a braced argument as a font name:

\starttyping
\font\myfont = {cmr10}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.

\section{Writing to file}

You can now open up to 127 files with \type {\openout}. When no file is open,
writes will go to the console and log. As a consequence, a system command is
no longer possible, but one can use \type {os.execute} to do the same.

\section{\type{\nospaces}}

This new primitive can be used to overrule the usual \type {\spaceskip}
related heuristics when a space character is seen in a text flow. The
value~\type{1} triggers no injection while \type{2} results in injection of
a zero skip. Below we see the results for four characters separated by a
space.

\startlinecorrection
\startcombination[3*2]
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 1mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 1mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 1mm}}
\stopcombination
\stoplinecorrection

\section{\type{\letcharcode}}

This primitive is still experimental but can be used to assign a meaning to an active
character, as in:

\starttyping
\def\foo{bar} \letcharcode123\foo
\stoptyping

This can be a bit nicer than using the uppercase tricks (using the property of
\type {\uppercase} that it treats active characters specially).

\stopchapter

\stopcomponent