% language=uk

\environment luatex-style
\environment luatex-logos

\startcomponent luatex-enhancements

\startchapter[reference=enhancements,title={Basic \TEX\ enhancements}]

\section{Introduction}

From day one, \LUATEX\ has offered extra features compared to the superset of
\PDFTEX\ and \ALEPH. This has not been limited to the possibility to execute
\LUA\ code via \type {\directlua}, but \LUATEX\ also adds functionality via new
\TEX|-|side primitives or extensions to existing ones.

When \LUATEX\ starts up in \quote {iniluatex} mode (\type {luatex -ini}), it
defines only the primitive commands known by \TEX82 and the one extra command
\type {\directlua}. As is fitting, a \LUA\ function has to be called to add the
extra primitives to the user environment. The simplest method to get access to
all of the new primitive commands is by adding this line to the format generation
file:

\starttyping
\directlua { tex.enableprimitives('',tex.extraprimitives()) }
\stoptyping

But be aware that the curly braces may not have the proper \type {\catcode}
assigned to them at this early time (giving a \quote {Missing number} error), so
it may be necessary to put these assignments before the above line:

\starttyping
\catcode `\{=1
\catcode `\}=2
\stoptyping

More fine|-|grained control over primitives is possible and you can look up the
details in \in {section} [luaprimitives]. For simplicity's sake, this manual
assumes that you have executed the \type {\directlua} command as given above.

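As a hedged illustration of more selective enabling, the sketch below assumes
that \type {tex.enableprimitives} takes a prefix string followed by a table of
primitive names. The prefix \type {LuaTeX} is chosen here only for the example;
with it, \type {\formatname} would become available as \type
{\LuaTeXformatname}:

\starttyping
% sketch: enable a single primitive under a prefixed name (prefix is illustrative)
\directlua { tex.enableprimitives('LuaTeX',{'formatname'}) }
\stoptyping
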
The startup behaviour documented above is considered stable in the sense that
there will not be backward|-|incompatible changes any more. We have promoted some
rather generic \PDFTEX\ primitives to core \LUATEX\ ones, and the ones inherited
from \ALEPH\ (\OMEGA) are also promoted. Effectively this means that we now only
have the \type {tex}, \type {etex} and \type {luatex} sets left.

In \in {Chapter} [modifications] we discuss several primitives that are derived
from \PDFTEX\ and \ALEPH\ (\OMEGA). Here we stick to the really new ones. In the
chapters on fonts and math we discuss a few more new ones.

\section{Version information}

\subsection {\type {\luatexbanner}, \type {\luatexversion} and \type {\luatexrevision}}

There are three new primitives to test the version of \LUATEX:

\starttabulate[|l|pl|pl|]
\NC \bf primitive \NC \bf explanation \NC \bf value \NC \NR
\NC \type {\luatexbanner} \NC the banner reported on the command line \NC \luatexbanner \NC \NR
\NC \type {\luatexversion} \NC a combination of major and minor number \NC \the\luatexversion \NC \NR
\NC \type {\luatexrevision} \NC the revision number, the current value is \NC \luatexrevision \NC \NR
\stoptabulate

The official \LUATEX\ version is defined as follows:

\startitemize
\startitem
    The major version is the integer result of \type {\luatexversion} divided by
    100. The primitive is an \quote {internal variable}, so you may need to prefix
    its use with \type {\the} depending on the context.
\stopitem
\startitem
    The minor version is the two-digit result of \type {\luatexversion} modulo 100.
\stopitem
\startitem
    The revision is given by \type {\luatexrevision}. This primitive expands to
    a positive integer.
\stopitem
\startitem
    The full version number consists of the major version, minor version and
    revision, separated by dots.
\stopitem
\stopitemize

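The rules above can be sketched with plain register arithmetic. This is only an
illustration: it assumes that \type {\count255} and \type {\count254} are free
to use as scratch registers, and it relies on \type {\divide} truncating towards
zero. Note that a minor version below 10 is reported here without a leading
zero.

\starttyping
\count255=\luatexversion
\divide\count255 by 100             % major: integer part of version / 100
\count254=\count255
\multiply\count254 by -100
\advance\count254 by \luatexversion % minor: version modulo 100
\message{LuaTeX \the\count255.\the\count254.\luatexrevision}
\stoptyping
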
\subsection{\type {\formatname}}

The \type {\formatname} syntax is identical to \type {\jobname}. In \INITEX, the
expansion is empty. Otherwise, the expansion is the value that \type {\jobname} had
during the \INITEX\ run that dumped the currently loaded format. You can use this
token list to provide your own version info.

\section{\UNICODE\ text support}

\subsection {Extended ranges}

Text input and output is now considered to be \UNICODE\ text, so input characters
can use the full range of \UNICODE\ ($2^{20}+2^{16}-1 = \hbox{0x10FFFF}$). Later
chapters will talk of characters and glyphs. Although these are not
interchangeable, they are closely related. During typesetting, a character is
always converted to a suitable graphic representation of that character in a
specific font. However, while processing a list of to|-|be|-|typeset nodes, its
contents may still be seen as a character. Inside \LUATEX\ there is no clear
separation between the two concepts. Because the subtype of a glyph node can be
changed in \LUA\ it is up to the user: subtypes larger than 255 indicate that
font processing has happened.

A few primitives are affected by this, all in a similar fashion: each of them has
to accommodate a larger range of acceptable numbers. For instance, \type
{\char} now accepts values between~0 and $1{,}114{,}111$. This should not be a
problem for well|-|behaved input files, but it could create incompatibilities for
input that would have generated an error when processed by older \TEX|-|based
engines. The affected commands with an altered initial (left of the equals sign)
or secondary (right of the equals sign) value are: \type {\char}, \type
{\lccode}, \type {\uccode}, \type {\catcode}, \type {\sfcode}, \type {\efcode},
\type {\lpcode}, \type {\rpcode}, \type {\chardef}.

As far as the core engine is concerned, all input and output to text files is
\UTF-8 encoded. Input files can be pre|-|processed using the \type {reader}
callback. This will be explained in a later chapter.

Output in byte|-|sized chunks can be achieved by using characters just outside of
the valid \UNICODE\ range, starting at the value $1{,}114{,}112$ (0x110000). When
the time comes to print a character $c \geq 1{,}114{,}112$, \LUATEX\ will actually
print the single byte corresponding to $c$ minus $1{,}114{,}112$.

Output to the terminal uses \type {^^} notation for the lower control range
($c<32$), with the exception of \type {^^I}, \type {^^J} and \type {^^M}. These
are considered \quote {safe} and therefore printed as|-|is. You can disable
escaping with \type {texio.setescape(false)}, in which case you get the normal
characters on the console.

Normalization of the \UNICODE\ input can be handled by a macro package during
callback processing (this will be explained in \in {section} [iocallback]).

\subsection{\type {\Uchar}}

The expandable command \type {\Uchar} reads a number between~0 and $1{,}114{,}111$
and expands to the associated \UNICODE\ character.

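Because the command is expandable, it can be used inside an \type {\edef}. The
following minimal sketch stores the character U+20AC in a macro; the macro name
\type {\eurosign} is made up for this example.

\starttyping
\edef\eurosign{\Uchar"20AC}% \eurosign now holds the single character U+20AC
\message{\eurosign}
\stoptyping
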
\section{Extended tables}

All traditional \TEX\ and \ETEX\ registers can be 16-bit numbers. The affected
commands are:

\startfourcolumns
\starttyping
\count
\dimen
\skip
\muskip
\marks
\toks
\countdef
\dimendef
\skipdef
\muskipdef
\toksdef
\insert
\box
\unhbox
\unvbox
\copy
\unhcopy
\unvcopy
\wd
\ht
\dp
\setbox
\vsplit
\stoptyping
\stopfourcolumns

Because font memory management has been rewritten, character properties in fonts
are no longer shared among font instances that originate from the same metric
file.

\section{Attributes}

\subsection{Attribute registers}

Attributes are a completely new concept in \LUATEX. Syntactically, they behave a
lot like counters: attributes obey \TEX's nesting stack and can be used after
\type {\the} etc.\ just like the normal \type {\count} registers.

\startsyntax
\attribute <16-bit number> <optional equals> <32-bit number>!crlf
\attributedef <csname> <optional equals> <16-bit number>
\stopsyntax

Conceptually, an attribute is either \quote {set} or \quote {unset}. Unset
attributes have a special negative value to indicate that they are unset; that
value is the lowest legal value: \type {-"7FFFFFFF} in hexadecimal, a.k.a.
$-2147483647$ in decimal. It follows that the value \type {-"7FFFFFFF} cannot be
used as a legal attribute value, but you {\it can\/} assign \type {-"7FFFFFFF} to
\quote {unset} an attribute. All attributes start out in this \quote {unset}
state in \INITEX.

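A minimal sketch of the life cycle of an attribute, following the syntax given
above. The name \type {\myattr} and the register number 100 are arbitrary
choices made for this example.

\starttyping
\attributedef\myattr=100 % \myattr now refers to attribute register 100
\myattr=5                % the attribute is set
\message{\the\myattr}    % usable after \the, like a \count register
\myattr=-"7FFFFFFF       % the attribute is unset again
\stoptyping
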
Attributes can be used as extra counter values, but their usefulness comes mostly
from the fact that the numbers and values of all \quote {set} attributes are
attached to all nodes created in their scope. These can then be queried from any
\LUA\ code that deals with node processing. Further information about how to use
attributes for node list processing from \LUA\ is given in~\in {chapter}[nodes].

Attributes are stored in sorted (sparse) linked lists that are shared when
possible. This permits efficient testing and updating.

\subsection{Box attributes}

Nodes typically receive the list of attributes that is in effect when they are
created. This moment can be quite asynchronous. For example: in paragraph
building, the individual line boxes are created after the \type {\par} command has
been processed, so they will receive the list of attributes that is in effect
then, not the attributes that were in effect in, say, the first or third line of
the paragraph.

Similar situations happen in \LUATEX\ regularly. A few of the more obvious
problematic cases are dealt with: the attributes for nodes that are created
during hyphenation, kerning and ligaturing borrow their attributes from their
surrounding glyphs, and it is possible to influence box attributes directly.

When you assemble a box in a register, the attributes of the nodes contained in
the box are unchanged when such a box is placed, unboxed, or copied. In this
respect attributes act the same as characters that have been converted to
references to glyphs in fonts. For instance, when you use attributes to implement
color support, each node carries information about its eventual color. In that
case, unless you implement mechanisms that deal with it, applying a color to
already boxed material will have no effect. Keep in mind that this
incompatibility is mostly due to the fact that separate specials and literals are
a more unnatural approach to colors than attributes.

It is possible to fine-tune the list of attributes that are applied to a \type
{hbox}, \type {vbox} or \type {vtop} by the use of the keyword \type {attr}. An
example:

\starttyping
\attribute2=5
\setbox0=\hbox {Hello}
\setbox2=\hbox attr1=12 attr2=-"7FFFFFFF{Hello}
\stoptyping

This will set the attribute list of box~2 to $1=12$, and the attributes of box~0
will be $2=5$. As you can see, assigning the maximum negative value causes an
attribute to be ignored.

The \type {attr} keyword(s) should come before a \type {to} or \type {spread}, if
that is also specified.

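The ordering rule can be illustrated as follows; the box number and the
dimension are arbitrary values picked for this sketch.

\starttyping
% the attr specification comes first, the "to" specification second
\setbox4=\hbox attr1=12 to 5cm{Hello}
\stoptyping
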
\section{\LUA\ related primitives}

\subsection{\type {\directlua}}

In order to merge \LUA\ code with \TEX\ input, a few new primitives are needed.
The primitive \type {\directlua} is used to execute \LUA\ code immediately. The
syntax is

\startsyntax
\directlua <general text>!crlf
\directlua <16-bit number> <general text>
\stopsyntax

The \syntax {<general text>} is expanded fully, and then fed into the \LUA\
interpreter. After reading and expansion has been applied to the \syntax
{<general text>}, the resulting token list is converted to a string as if it was
displayed using \type {\the\toks}. On the \LUA\ side, each \type {\directlua}
block is treated as a separate chunk. In such a chunk you can use the \type
{local} directive to keep your variables from interfering with those used by the
macro package.

The conversion to and from a token list means that you normally cannot use \LUA\
line comments (starting with \type {--}) within the argument. As there typically
will be only one \quote {line} the first line comment will run on until the end
of the input. You will either need to use \TEX|-|style line comments (starting
with \%), or change the \TEX\ category codes locally. Another possibility is to
say:

\starttyping
\begingroup
\endlinechar=10
\directlua ...
\endgroup
\stoptyping

Then \LUA\ line comments can be used, since \TEX\ does not replace line endings
with spaces.

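A concrete instance of this pattern could look as follows. The sketch assumes
plain \TEX\ category codes, where character~10 is an \quote {other} character
and therefore reaches \LUA\ as a newline. The closing brace and \type
{\endgroup} share one line that ends in a \TEX\ comment, so that the line end
itself is not typeset.

\starttyping
\begingroup
\endlinechar=10
\directlua{
  -- a Lua line comment: it stops at the end of this line
  tex.print("comment handled")
}\endgroup%
\stoptyping
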
Likewise, the \syntax {<16-bit number>} designates a name of a \LUA\ chunk and is
taken from the \type {lua.name} array (see the documentation of the \type {lua}
table further in this manual). When a chunk name starts with a \type {@} it will
be displayed as a file name. This is a side effect of the way \LUA\ implements
error handling.

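As a small sketch of named chunks: the slot number~1 and the name \type
{@myfile.lua} are invented for this example, and the name only shows up when
\LUA\ reports an error in the chunk.

\starttyping
\directlua{lua.name[1] = "@myfile.lua"}    % register a name for slot 1
\directlua1{tex.print("run as chunk 1")}   % errors here would mention myfile.lua
\stoptyping
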
The \type {\directlua} command is expandable. Since it passes \LUA\ code to the
\LUA\ interpreter its expansion from the \TEX\ viewpoint is usually empty.
However, there are some \LUA\ functions that produce material to be read by \TEX,
the so|-|called print functions. The simplest use of these is \type
{tex.print(<string> s)}. The characters of the string \type {s} will be placed on
the \TEX\ input buffer, that is, \quote {before \TEX's eyes} to be read by \TEX\
immediately. For example:

\startbuffer
\count10=20
a\directlua{tex.print(tex.count[10]+5)}b
\stopbuffer

\typebuffer

expands to

\getbuffer

Here is another example:

\startbuffer
$\pi = \directlua{tex.print(math.pi)}$
\stopbuffer

\typebuffer

will result in

\getbuffer

Note that the expansion of \type {\directlua} is a sequence of characters, not of
tokens, contrary to all \TEX\ commands. So formally speaking its expansion is
null, but it places material on a pseudo-file to be immediately read by \TEX,
just like \ETEX's \type {\scantokens} does. For a description of print functions
look at \in {section} [sec:luaprint].

Because the \syntax {<general text>} is a chunk, the normal \LUA\ error handling
is triggered if there is a problem in the included code. The \LUA\ error messages
should be clear enough, but the contextual information is still pretty bad.
Often, you will only see the line number of the right brace at the end of the
code.

While on the subject of errors: some of the things you can do inside \LUA\ code
can break \LUATEX\ pretty badly. If you are not careful while working with the
node list interface, you may even end up with assertion errors from within the
\TEX\ portion of the executable.

The behaviour documented in the above subsection is considered stable in the sense
that there will not be backward-incompatible changes any more.

\subsection{\type {\latelua}}

Contrary to \type {\directlua}, \type {\latelua} stores \LUA\ code in a whatsit
that will be processed at the time of shipping out. Its intended use is a cross
between \PDF\ literals (often available as \type {\pdfliteral}) and the
traditional \TEX\ extension \type {\write}. Within the \LUA\ code you can print
\PDF\ statements directly to the \PDF\ file via \type {pdf.print}, or you can
write to other output streams via \type {texio.write} or simply using \LUA\ \IO\
routines.

\startsyntax
\latelua <general text>!crlf
\latelua <16-bit number> <general text>
\stopsyntax

Expansion of macros in the final \type {<general text>} is delayed until just
before the whatsit is executed (like in \type {\write}). With regard to the \PDF\
output stream \type {\latelua} behaves as \PDF\ page literals. The \syntax
{name <general text>} and \syntax {<16-bit number>} behave in the same way as
they do for \type {\directlua}.

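A minimal sketch of the delayed execution: the message below is only written
when the page that contains the whatsit is shipped out. It assumes that \type
{\count0} holds the page number, as in plain \TEX, and uses \type
{texio.write_nl} to write to the terminal and log.

\starttyping
\latelua{texio.write_nl("shipping out page " .. tex.count[0])}
\stoptyping
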
\subsection{\type {\luaescapestring}}

This primitive converts a \TEX\ token sequence so that it can be safely used as
the contents of a \LUA\ string: embedded backslashes, double and single quotes,
and newlines and carriage returns are escaped. This is done by prepending an
extra token consisting of a backslash with category code~12, and for the line
endings, converting them to \type {n} and \type {r} respectively. The token
sequence is fully expanded.

\startsyntax
\luaescapestring <general text>
\stopsyntax

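A short sketch of typical use: the embedded double quotes would otherwise end
the \LUA\ string prematurely. The text itself is arbitrary.

\starttyping
% the inner quotes are escaped before Lua sees the string
\directlua{texio.write_nl("\luaescapestring{She said "hello"}")}
\stoptyping
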
Most often, this command is not actually the best way to deal with the
differences between \TEX\ and \LUA. In very short bits of \LUA\
code it is often not needed, and for longer stretches of \LUA\ code it
is easier to keep the code in a separate file and load it using \LUA's
\type {dofile}:

\starttyping
\directlua { dofile('mysetups.lua') }
\stoptyping

\subsection{\type {\luafunction}}

The \type {\directlua} command involves tokenization of its argument (after
picking up an optional name or number specification). The token list is then
converted into a string and given to \LUA\ to turn into a function that is
called. The overhead is rather small but when you use this primitive hundreds of
thousands of times, it can become noticeable. For this reason there is a variant
call available: \type {\luafunction}. This command is used as follows:

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
    t[2] = function() tex.print("?") end
}

\luafunction1
\luafunction2
\stoptyping

Of course the functions can also be defined in a separate file. There is no limit
on the number of functions apart from normal \LUA\ limitations. Of course there
is the limitation of no arguments but that would involve parsing and thereby give
no gain. The function, when called, in fact gets one argument, being the index, so
in the following example the number \type {8} gets typeset.

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[8] = function(slot) tex.print(slot) end
}
\stoptyping

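The definition by itself typesets nothing. Assuming the function has been
stored in slot~8 as described, the call would be:

\starttyping
\luafunction8
\stoptyping
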
\section {Alignments}

\subsection{\tex {alignmark}}

This primitive duplicates the functionality of \type {#} inside alignment
preambles.

\subsection{\tex {aligntab}}

This primitive duplicates the functionality of \type {&} inside alignments and
preambles.

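As a sketch, a two|-|column alignment can be written without using either
character directly; the cell contents are arbitrary.

\starttyping
% same as: \halign{#\hfil&\hfil#\cr left&right\cr}
\halign{\alignmark\hfil\aligntab\hfil\alignmark\cr
  left\aligntab right\cr}
\stoptyping
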
\section{Catcode tables}

Catcode tables are a new feature that allows you to switch to a predefined
catcode regime in a single statement. You can have a practically unlimited number
of different tables. This subsystem is backward compatible: if you never use the
following commands, your document will not notice any difference in behaviour
compared to traditional \TEX. The contents of each catcode table are independent
of any other catcode table, and their contents are stored in and retrieved from
the format file.

\subsection{\type {\catcodetable}}

\startsyntax
\catcodetable <15-bit number>
\stopsyntax

The primitive \type {\catcodetable} switches to a different catcode table. Such a
table has to be previously created using one of the two primitives below, or it
has to be zero. Table zero is initialized by \INITEX.

\subsection{\type {\initcatcodetable}}

\startsyntax
\initcatcodetable <15-bit number>
\stopsyntax

The primitive \type {\initcatcodetable} creates a new table with catcodes identical
to those defined by \INITEX:

\starttabulate[|r|l|l|l|]
\NC  0 \NC \tttf \letterbackslash     \NC         \NC \type {escape}       \NC\NR
\NC  5 \NC \tttf \letterhat\letterhat M \NC return \NC \type {car_ret}      \NC\NR
\NC  9 \NC \tttf \letterhat\letterhat @ \NC null   \NC \type {ignore}       \NC\NR
\NC 10 \NC \tttf <space>              \NC space   \NC \type {spacer}       \NC\NR
\NC 11 \NC {\tttf a} \endash\ {\tttf z} \NC        \NC \type {letter}       \NC\NR
\NC 11 \NC {\tttf A} \endash\ {\tttf Z} \NC        \NC \type {letter}       \NC\NR
\NC 12 \NC everything else            \NC         \NC \type {other}        \NC\NR
\NC 14 \NC \tttf \letterpercent       \NC         \NC \type {comment}      \NC\NR
\NC 15 \NC \tttf \letterhat\letterhat ? \NC delete \NC \type {invalid_char} \NC\NR
\stoptabulate

The new catcode table is allocated globally: it will not go away after the
current group has ended. If the supplied number is identical to the currently
active table, an error is raised.

\subsection{\type {\savecatcodetable}}

\startsyntax
\savecatcodetable <15-bit number>
\stopsyntax

\type {\savecatcodetable} copies the current set of catcodes to a new table with
the requested number. The definitions in this new table are all treated as if
they were made in the outermost level.

The new table is allocated globally: it will not go away after the current group
has ended. If the supplied number is the currently active table, an error is
raised.

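The three primitives can be combined as in the following sketch. The table
numbers 10 and 11 are arbitrary and assumed to be unused; a macro package may
reserve specific numbers for its own purposes.

\starttyping
\savecatcodetable 10 % snapshot of the catcodes that are in effect right now
\initcatcodetable 11 % a fresh table with the IniTeX catcodes
\catcodetable 11     % switch: braces are now "other" characters
\catcodetable 10     % switch back to the snapshot
\stoptyping
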
\section{Suppressing errors}

\subsection{\type {\suppressfontnotfounderror}}

\startsyntax
\suppressfontnotfounderror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
font metrics that are not found. Instead it will silently skip the font
assignment, making the requested csname for the font \type {\ifx} equal to \type
{\nullfont}, so that it can be tested against that without bothering the user.

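The test mentioned above could be sketched like this. The font name \type
{nosuchfont} and the macro name \type {\wanted} are placeholders, and the
sketch assumes that no font loading callback changes the default behaviour.

\starttyping
\suppressfontnotfounderror=1
\font\wanted=nosuchfont
\ifx\wanted\nullfont
  \message{font not found, using a fallback}
\fi
\stoptyping
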
\subsection{\type {\suppresslongerror}}

\startsyntax
\suppresslongerror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
\type {\par} commands encountered in contexts where that is normally prohibited
(most prominently in the arguments of non-long macros).

\subsection{\type {\suppressifcsnameerror}}

\startsyntax
\suppressifcsnameerror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
non-expandable commands appearing in the middle of a \type {\ifcsname} expansion.
Instead, it will keep getting expanded tokens from the input until it encounters
an \type {\endcsname} command. If the input expansion is unbalanced with respect
to \type {\csname} \ldots \type {\endcsname} pairs, the \LUATEX\ process may hang
indefinitely.

\subsection{\type {\suppressoutererror}}

\startsyntax
\suppressoutererror = 1
\stopsyntax

If this new integer parameter is non|-|zero, then \LUATEX\ will not complain
about \type {\outer} commands encountered in contexts where that is normally
prohibited.

\subsection{\type {\suppressmathparerror}}

The following setting will permit \type {\par} tokens in a math formula:

\startsyntax
\suppressmathparerror = 1
\stopsyntax

The following code is then valid:

\starttyping
$ x + 1 =

a $
\stoptyping

\section {Math}

\subsection{Extensions}

We will cover math in its own chapter because not only the font subsystem and
spacing model have been enhanced (thereby introducing many new primitives) but
also because some more control has been added to existing functionality.

\subsection{\type {\matheqnogapstep}}

By default \TEX\ will add one quad between the equation and the number. This is
hard coded. A new primitive can control this:

\startsyntax
\matheqnogapstep = 1000
\stopsyntax

Because a math quad from the math text font is used instead of a dimension, we
use a step to control the size. A value of zero will suppress the gap. The step
is divided by 1000 which is the usual way to mimic floating point factors in
\TEX.

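As a worked example of this scaling: a step of 500 corresponds to a factor of
$500/1000 = 0.5$, so the gap becomes half a math quad. The formula below is
only there to have an equation number to look at.

\starttyping
\matheqnogapstep=500 % 500/1000 = 0.5 math quad between formula and number
$$ a = b \eqno(1) $$
\stoptyping
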
\section{Fonts}

\subsection{Font syntax}

\LUATEX\ will accept a braced argument as a font name:

\starttyping
\font\myfont = {cmr10}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.

\subsection{\type {\fontid}}

\startsyntax
\fontid\font
\stopsyntax

This primitive expands into a number. It is not a register so there is no need to
prefix with \type {\number} (and using \type {\the} gives an error). The currently
used font id is \fontid\font. Here are some more:

\starttabulate[|l|c|]
\NC \type {\bf} \NC \bf \fontid\font \NC \NR
\NC \type {\it} \NC \it \fontid\font \NC \NR
\NC \type {\bi} \NC \bi \fontid\font \NC \NR
\stoptabulate

These numbers depend on the macro package used because each one has its own way
of dealing with fonts. They can also differ per run, as they can depend on the
order of loading fonts. For instance, when in \CONTEXT\ virtual math \UNICODE\
fonts are used, we can easily get over a hundred ids in use. Not all ids have to
be bound to a real font; after all, it's just a number.

\subsection{\type {\setfontid}}

The primitive \type {\setfontid} can be used to enable a font with the given id
(which of course needs to be a valid one).

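The two primitives can be combined to remember a font and return to it later.
This is only a sketch: the macro name \type {\savedfontid} is invented, and
\type {\bf} stands for any font switch provided by the macro package.

\starttyping
\edef\savedfontid{\fontid\font} % remember the id of the current font
\bf some bold text
\setfontid\savedfontid\relax    % switch back to the remembered font
\stoptyping
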
\subsection{\type {\noligs} and \type {\nokerns}}

These primitives prohibit ligature and kerning insertion at the time when the
initial node list is built by \LUATEX's main control loop. You can enable these
primitives when you want to do node list processing of \quote {characters}, where
\TEX's normal processing would get in the way.

\startsyntax
\noligs <integer>!crlf
\nokerns <integer>
\stopsyntax

These primitives can also be implemented by overloading the ligature building and
kerning functions, i.e.\ by assigning dummy functions to their associated
callbacks. Keep in mind that when you define a font (using \LUA) you can also
omit the kern and ligature tables, which has the same effect as the above.

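A minimal sketch that keeps the effect local to one box, assuming that a
positive value switches the feature off. The words are chosen because they
normally trigger a ligature (\quote {ffi}) and kerns (\quote {AVA}).

\starttyping
\hbox{\noligs 1 \nokerns 1 difficult AVA}
\stoptyping
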
\subsection{\type{\nospaces}}

This new primitive can be used to overrule the usual \type {\spaceskip}
related heuristics when a space character is seen in a text flow. The
value~\type{1} triggers no injection while \type{2} results in injection of
a zero skip. Below we see the results for four characters separated by a
space.

\startlinecorrection
\startcombination[3*2]
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 1mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 1mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 1mm}}
\stopcombination
\stoplinecorrection

\section{Tokens, commands and strings}

\subsection{\type {\scantextokens}}

The syntax of \type {\scantextokens} is identical to \type {\scantokens}. This
primitive is a slightly adapted version of \ETEX's \type {\scantokens}. The
differences are:

\startitemize
\startitem
    The last (and usually only) line does not have a \type {\endlinechar}
    appended.
\stopitem
\startitem
    \type {\scantextokens} never raises an EOF error, and it does not execute
    \type {\everyeof} tokens.
\stopitem
\startitem
    There are no \quote {\unknown\ while end of file \unknown} error tests
    executed. This allows the expansion to end on a different grouping level or
    while a conditional is still incomplete.
\stopitem
\stopitemize

\subsection{\type {\toksapp}, \type {\tokspre}, \type {\etoksapp} and \type {\etokspre}}

Instead of:

\starttyping
\toks0\expandafter{\the\toks0 foo}
\stoptyping

you can use:

\starttyping
\etoksapp0{foo}
\stoptyping

The \type {pre} variants prepend instead of append, and the \type {e} variants
expand the passed general text.

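The difference between the variants can be sketched as follows; the macro
\type {\more} is made up for this example and the comments show the expected
content of the register after each step.

\starttyping
\toks0{b}
\toksapp0{c}      % \toks0 now holds: bc
\tokspre0{a}      % \toks0 now holds: abc
\def\more{d}
\etoksapp0{\more} % \more is expanded first, so \toks0 now holds: abcd
\stoptyping
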
\subsection{\type {\csstring}, \type {\begincsname} and \type {\lastnamedcs}}

These are somewhat special. The \type {\csstring} primitive is like
\type {\string} but it omits the leading escape character. This can be
somewhat more efficient than stripping it off afterwards.

The \type {\begincsname} primitive is like \type {\csname} but doesn't create
a relaxed equivalent when there is no such name. It is equivalent to

\starttyping
\ifcsname foo\endcsname
  \csname foo\endcsname
\fi
\stoptyping

The advantage is that it saves a lookup (don't expect much speedup) but more
important is that it avoids using the \type {\if}.

The \type {\lastnamedcs} is one that should be used with care. The above
example could be written as:

\starttyping
\ifcsname foo\endcsname
  \lastnamedcs
\fi
\stoptyping

This is slightly more efficient than constructing the string twice (deep down in
\LUATEX\ this also involves some \UTF8 juggling), but probably more relevant is
that it saves a few tokens and can make code a bit more readable.

\subsection{\type {\clearmarks}}

This primitive complements the \ETEX\ mark primitives and clears a mark class
completely, resetting all three connected mark texts to empty. It is an
immediate command.

\startsyntax
\clearmarks <16-bit number>
\stopsyntax

\subsection{\type{\letcharcode}}

This primitive is still experimental but can be used to assign a meaning to an active
character, as in:

\starttyping
\def\foo{bar} \letcharcode123\foo
\stoptyping

This can be a bit nicer than using the uppercase tricks (using the property of
\type {\uppercase} that it treats active characters specially).

\section{Boxes, rules and leaders}

\subsection{\type {\outputbox}}

\startsyntax
\outputbox = 65535
\stopsyntax

This new integer parameter allows you to alter the number of the box that will be
used to store the page sent to the output routine. Its default value is 255, and
the acceptable range is from 0 to 65535.

\subsection{\type {\vpack}, \type {\hpack} and \type {\tpack}}

These three primitives are like \type {\vbox}, \type {\hbox} and \type {\vtop}
but don't apply the related callbacks.

\subsection{\type {\vsplit}}

The \type {\vsplit} primitive has to be followed by a specification of the
required height. As an alternative to the \type {to} keyword you can use \type
{upto} to get a split of the given size, but the result then has its natural
dimensions.

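The two keywords can be compared as in this sketch; the box numbers and the
dimension are arbitrary, and box~0 is assumed to hold a vertical list.

\starttyping
\setbox2=\vsplit0 to 5cm   % classic: the resulting box is 5cm high
\setbox4=\vsplit0 upto 5cm % split at 5cm, the result keeps its natural height
\stoptyping
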
\subsection{Images and Forms}

These two concepts are now core concepts and no longer whatsits. They are in fact
now implemented as rules with special properties. Normal rules have subtype~0,
saved boxes have subtype~1 and images have subtype~2. This has the positive side
effect that whenever we need to take content with dimensions into account, when we
look at rule nodes, we automatically also deal with these two types.

The syntax of the \type {\save...resource} is the same as in \PDFTEX\ but you
should consider them to be backend specific. This means that a macro package
should treat them as such and check for the current output mode if applicable.
Here are the equivalents:

\starttabulate[|l|l|]
\NC \type {\saveboxresource}             \EQ \type {\pdfxform}           \NC \NR
\NC \type {\saveimageresource}           \EQ \type {\pdfximage}          \NC \NR
\NC \type {\useboxresource}              \EQ \type {\pdfrefxform}        \NC \NR
\NC \type {\useimageresource}            \EQ \type {\pdfrefximage}       \NC \NR
\NC \type {\lastsavedboxresourceindex}   \EQ \type {\pdflastxform}       \NC \NR
\NC \type {\lastsavedimageresourceindex} \EQ \type {\pdflastximage}      \NC \NR
\NC \type {\lastsavedimageresourcepages} \EQ \type {\pdflastximagepages} \NC \NR
\stoptabulate

\LUATEX\ accepts optional dimension parameters for \type {\use...resource} in the
same format as for rules. With images, these dimensions are then used instead of
the ones given to \type {\saveimageresource} but the original dimensions are not
overwritten, so that a \type {\useimageresource} without dimensions still
provides the image with dimensions defined by \type {\saveimageresource}. These
optional parameters are not implemented for \type {\saveboxresource}.

\starttyping
\useimageresource width 20mm height 10mm depth 5mm \lastsavedimageresourceindex
\useboxresource   width 20mm height 10mm depth 5mm \lastsavedboxresourceindex
\stoptyping

The box resources are of course implemented in the backend and therefore we do
support the \type {attr} and \type {resources} keys that accept a token list. New
is the \type {type} key. When set to non|-|zero the \type {/Type} entry is
omitted. A value of 1 or 3 still writes a \type {/BBox}, while 2 or 3 will write
a \type {/Matrix}.

\subsection{\type {\nohrule} and \type {\novrule}}

Because introducing a new keyword can cause incompatibilities, two new primitives
were introduced: \type {\nohrule} and \type {\novrule}. These can be used to
reserve space. This is often more efficient than creating an empty box with fake
dimensions.

\subsection{\type {\gleaders}}

This type of leaders is anchored to the origin of the box to be shipped out. So
they are like normal \type {\leaders} in that they align nicely, except that the
alignment is based on the {\it largest\/} enclosing box instead of the {\it
smallest\/}. The \type {g} stresses this global nature.

\section {Languages}

\subsection{\type {\hyphenationmin}}

This primitive can be used to set the minimal word length, so setting it to a value
of~$5$ means that only words of 6 characters and more will be hyphenated, of course
within the constraints of the \type {\lefthyphenmin} and \type {\righthyphenmin}
values (as stored in the glyph node). This primitive accepts a number and stores
the value with the language.

\subsection{\type {\boundary}, \type {\noboundary}, \type {\protrusionboundary} and \type
{\wordboundary}}

The \type {\noboundary} command used to inject a whatsit node but now injects a normal
node with type \type {boundary} and subtype~0. In addition you can say:

\starttyping
x\boundary 123\relax y
\stoptyping

This has the same effect but the subtype is now~1 and the value~123 is stored.
The traditional ligature builder still sees this as a cancel boundary directive
but at the \LUA\ end you can implement different behaviour. The added benefit of
passing this value is a side effect of the generalization. The subtypes~2 and~3
are used to control protrusion and word boundaries in hyphenation.

\section{Control and debugging}

\subsection {Tracing}

If \type {\tracingonline} is larger than~2, the node list display will also print
the node number of the nodes.

\subsection{\type {\outputmode} and \type {\draftmode}}

The \type {\outputmode} variable tells \LUATEX\ what it has to produce:

\starttabulate[|l|l|]
\NC \type {0} \NC \DVI\ code \NC \NR
\NC \type {1} \NC \PDF\ code \NC \NR
\stoptabulate

The value of the \type {\draftmode} counter signals the backend if it should
output less. The \PDF\ backend accepts a value of~$1$, while the \DVI\ backend
ignores the value.

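A macro package can use the variable both to select a backend and to test which
one is active. The sketch below assumes that the assignment happens before
anything has been shipped out.

\starttyping
\outputmode=1 % ask for PDF output
\draftmode=1  % tell the PDF backend to output less
\ifnum\outputmode=1
  \message{PDF backend selected}
\fi
\stoptyping
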
\section {Files}

\subsection{File syntax}

\LUATEX\ will accept a braced argument as a file name:

\starttyping
\input {plain}
\openin 0 {plain}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.

\subsection{Writing to file}

You can now open up to 127 files with \type {\openout}. When no file is open,
writes will go to the console and log. As a consequence a system command is
no longer possible but one can use \type {os.execute} to do the same.

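A minimal sketch of that replacement. The command \type {echo hello} is just an
example, and whether it actually runs depends on the shell escape settings of
the installation.

\starttyping
% sketch: run a system command from Lua (subject to shell escape restrictions)
\directlua{os.execute("echo hello")}
\stoptyping
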
\stopchapter

\stopcomponent