\environment luatex-style
\environment luatex-logos

\startcomponent luatex-nodes

\startchapter[reference=nodes,title={Nodes}]

\section{\LUA\ node representation}

\TEX's nodes are represented in \LUA\ as userdata objects with a variable set of
fields. In the following syntax tables, the type of such a userdata object
is represented as \syntax {<node>}.

The current return value of \type {node.types()} is:
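
\startluacode
    for id, name in table.sortedhash(node.types()) do
        context.type(name)      -- the node type name, typeset verbatim
        context(" (%s), ",id)   -- followed by its numeric id
    end
    context.removeunwantedspaces()
    context.removepunctuation()
\stopluacode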

The \type {\lastnodetype} primitive is \ETEX\ compliant. The valid range is still
$[-1,15]$ and glyph nodes (formerly known as char nodes) have number~0 while
ligature nodes are mapped to~7. That way macro packages can use the same symbolic
names as in traditional \ETEX. Keep in mind that these \ETEX\ node numbers are
different from the real internal ones and that there are more \ETEX\ node types

You can ask for a list of fields with \type {node.fields} (which takes an id)
and for valid subtypes with \type {node.subtypes} (which takes a string because
eventually we might support more used enumerations).
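
As an illustration, the calls just mentioned can be combined to describe one node
type. This is only a sketch: the helper name \type {describe} is hypothetical, the
library is passed in as an argument so that the function can also be tried with a
stand|-|in table outside \LUATEX, and it assumes that \type {node.id} maps a type
name to its numeric id.

\starttyping
-- Hypothetical helper: gather what the node library reports about one type.
-- `nodelib` is expected to be LuaTeX's `node` table (an assumption of this sketch).
local function describe(nodelib, name)
    local id = nodelib.id(name)             -- numeric id for the type name
    return {
        id       = id,
        fields   = nodelib.fields(id),      -- takes an id
        subtypes = nodelib.subtypes(name),  -- takes a string
    }
end
\stoptyping

Inside \LUATEX\ one would call this as \type {describe(node, "glue")}; whether a
given name is known to \type {node.subtypes} depends on the engine version.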

\subsection{Attributes}

The newly introduced attribute registers are non|-|trivial, because the value
that is attached to a node is essentially a sparse array of key|-|value pairs. It
is generally easiest to deal with attribute lists and attributes by using the
dedicated functions in the \type {node} library, but for completeness, here is
the low|-|level interface.

\subsubsection{attribute_list nodes}

An \type {attribute_list} item is used as a head pointer for a list of attribute
items. It has only one user|-|visible field:

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC next \NC node \NC pointer to the first attribute \NC \NR
\stoptabulate

\subsubsection{attribute nodes}

A normal node's attribute field will point to an item of type \type
{attribute_list}, and the \type {next} field in that item will point to the first
defined \quote {attribute} item, whose \type {next} will point to the second
\quote {attribute} item, etc.

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC next \NC node \NC pointer to the next attribute \NC \NR
\NC number \NC number \NC the attribute type id \NC \NR
\NC value \NC number \NC the attribute value \NC \NR
\stoptabulate

As mentioned, it is better to use the official helpers rather than edit these
fields directly. For instance, the \type {prev} field is used for other purposes,
so an attribute list is not a doubly linked list.
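
To make the layout described above concrete, the following sketch walks the
low|-|level list by hand. The function name \type {attribute_value} is
hypothetical, and the code only relies on the \type {attr}, \type {next}, \type
{number} and \type {value} fields from the tables above, so it can be tried on
plain \LUA\ tables as well.

\starttyping
-- Hypothetical helper: look up attribute `id` by walking the attribute list.
local function attribute_value(n, id)
    local head = n.attr              -- the attribute_list item, if any
    local item = head and head.next  -- the first attribute item
    while item do
        if item.number == id then
            return item.value
        end
        item = item.next
    end
    return nil                       -- attribute not set on this node
end
\stoptyping

In real code the dedicated helpers, such as \type {node.has_attribute}, remain the
preferred route; the sketch only shows what such a lookup has to traverse.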

\subsection{Main text nodes}

These are the nodes that comprise actual typesetting commands. A few fields are
present in all nodes regardless of their type:

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC next \NC node \NC the next node in a list, or nil \NC \NR
\NC id \NC number \NC the node's type (\type {id}) number \NC \NR
\NC subtype \NC number \NC the node \type {subtype} identifier \NC \NR
\stoptabulate

The \type {subtype} is sometimes just a stub entry. Not all nodes actually use
the \type {subtype}, but this way you can be sure that all nodes accept it as a
valid field name, and that is often handy in node list traversal. In the
following tables \type {next} and \type {id} are not explicitly mentioned.

Besides these three fields, almost all nodes also have an \type {attr} field, and
there is also a field called \type {prev}. That last field is always present,
but only initialized on explicit request: when the function \type {node.slide()}
is called, it will set up the \type {prev} fields to be a backwards pointer in
the argument node list. By now most of \TEX's node processing makes sure that the
\type {prev} nodes are valid but there can be exceptions, especially when the
internal magic uses a leading \type {temp} node to temporarily store a state.
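
The effect of setting up the \type {prev} fields can be sketched in a few lines.
This is not the engine's implementation, only an illustration of the idea on
anything that has \type {next} fields; the function name \type
{set_prev_pointers} is hypothetical.

\starttyping
-- Hypothetical sketch: walk a list via `next`, fill in the backward pointers
-- and return the last node (or nil for an empty list).
local function set_prev_pointers(head)
    local tail = head
    while tail and tail.next do
        tail.next.prev = tail   -- backward pointer to the current node
        tail = tail.next
    end
    return tail
end
\stoptyping

The \type {prev} field of the first node is left untouched here, which matches the
remark that a leading node may be used for other purposes.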

\subsubsection{hlist nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{list} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC width \NC number \NC the width of the box \NC \NR
\NC height \NC number \NC the height of the box \NC \NR
\NC depth \NC number \NC the depth of the box \NC \NR
\NC shift \NC number \NC a displacement perpendicular to the character progression direction \NC \NR
\NC glue_order \NC number \NC a number in the range $[0,4]$, indicating the glue order \NC \NR
\NC glue_set \NC number \NC the calculated glue ratio \NC \NR
\NC glue_sign \NC number \NC 0 = \type {normal}, 1 = \type {stretching}, 2 = \type {shrinking} \NC \NR
\NC head/list \NC node \NC the first node of the body of this list \NC \NR
\NC dir \NC string \NC the direction of this box, see~\in[dirnodes] \NC \NR
\stoptabulate

A warning: never assign a node list to the \type {head} field unless you are sure
its internal link structure is correct, otherwise an error may result.

Note: the field names \type {head} and \type {list} are both valid. Sometimes it
makes more sense to refer to a list by \type {head}, sometimes \type {list} makes
more sense.

\subsubsection{vlist nodes}

This node is similar to \type {hlist}, except that \quote {shift} is a displacement
perpendicular to the line progression direction, and \quote {subtype} only has
the values 0, 4, and~5.

\subsubsection{rule nodes}

Contrary to traditional \TEX, \LUATEX\ has more rule subtypes because we also use
rules to store reusable objects and images. User nodes are invisible and can be
intercepted by a callback.

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{rule} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC width \NC number \NC the width of the rule where the special value $-1073741824$ is used for \quote {running} glue dimensions \NC \NR
\NC height \NC number \NC the height of the rule (can be negative) \NC \NR
\NC depth \NC number \NC the depth of the rule (can be negative) \NC \NR
\NC dir \NC string \NC the direction of this rule, see~\in[dirnodes] \NC \NR
\NC index \NC number \NC an optional index that can be referred to \NC \NR
\stoptabulate
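
The special value in the table above equals $-2^{30}$. A minimal sketch of a test
for it follows; the names \type {running} and \type {is_running} are hypothetical
and only serve the illustration.

\starttyping
-- The special "running" value quoted in the rule table (equal to -2^30).
local running = -1073741824

-- Hypothetical helper: true when a rule dimension is the running value.
local function is_running(dimension)
    return dimension == running
end
\stoptyping

A call such as \type {is_running(rule.width)} then tells whether the width still
has to be taken from the surrounding box.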

\subsubsection{ins nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC the insertion class \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC cost \NC number \NC the penalty associated with this insert \NC \NR
\NC height \NC number \NC height of the insert \NC \NR
\NC depth \NC number \NC depth of the insert \NC \NR
\NC head/list \NC node \NC the first node of the body of this insert \NC \NR
\stoptabulate

There is a set of extra fields that concern the associated glue: \type {width},
\type {stretch}, \type {stretch_order}, \type {shrink} and \type {shrink_order}.
These are all numbers.

A warning: never assign a node list to the \type {head} field unless you are sure
its internal link structure is correct, otherwise an error may result. You can use
\type {list} instead (often in functions you want to use local variables with similar
names and both names are equally sensible).

\subsubsection{mark nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC unused \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC class \NC number \NC the mark class \NC \NR
\NC mark \NC table \NC a table representing a token list \NC \NR
\stoptabulate

\subsubsection{adjust nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{adjust} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC head/list \NC node \NC adjusted material \NC \NR
\stoptabulate

A warning: never assign a node list to the \type {head} field unless you are sure
its internal link structure is correct, otherwise an error may result.

\subsubsection{disc nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{disc} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC pre \NC node \NC pointer to the pre|-|break text \NC \NR
\NC post \NC node \NC pointer to the post|-|break text \NC \NR
\NC replace \NC node \NC pointer to the no|-|break text \NC \NR
\NC penalty \NC number \NC the penalty associated with the break, normally \type {\hyphenpenalty} or \type {\exhyphenpenalty} \NC \NR
\stoptabulate

The subtype numbers~4 and~5 belong to the \quote {of-f-ice} explanation given

\subsubsection{math nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{math} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC surround \NC number \NC width of the \type {\mathsurround} kern \NC \NR
\stoptabulate

There is a set of extra fields that concern the associated glue: \type {width},
\type {stretch}, \type {stretch_order}, \type {shrink} and \type {shrink_order}.
These are all numbers.

\subsubsection{glue nodes}

Skips are about the only type of data objects in traditional \TEX\ that are not a
simple value. The structure that represents the glue components of a skip is
called a \type {glue_spec}, and it has the following accessible fields:

\starttabulate[|lT|l|p|]
\NC \rmbf key \NC \bf type \NC \bf explanation \NC \NR
\NC width \NC number \NC the horizontal or vertical displacement \NC \NR
\NC stretch \NC number \NC extra (positive) displacement or stretch amount \NC \NR
\NC stretch_order \NC number \NC factor applied to stretch amount \NC \NR
\NC shrink \NC number \NC extra (negative) displacement or shrink amount \NC \NR
\NC shrink_order \NC number \NC factor applied to shrink amount \NC \NR
\stoptabulate

The effective width of some glue subtypes depends on the stretch or shrink needed
to make the encapsulating box fit its dimensions. For instance, in a paragraph
lines normally have glue representing spaces and these stretch or shrink to make
the content fit in the available space. The \type {effective_glue} function that
takes a glue node and a parent (hlist or vlist) returns the effective width of

A gluespec node is a special kind of node that is used for storing a set of glue
values in registers. Originally they were also used to store properties of glue
nodes (using a system of reference counts) but we now keep these properties in
the glue nodes themselves, which gives a cleaner interface to \LUA.

The indirect spec approach was in fact an optimization in the original \TEX\
code. First of all it can save quite some memory because all these spaces that
become glue now share the same specification (only the reference count is
incremented), and zero testing is also a bit faster because only the pointer has
to be checked (this is no longer true for engines that implement for instance
protrusion where we really need to ensure that zero is zero when we test for
bounds). Another side effect is that glue specifications are read|-|only, so in
the end copies need to be made when they are used from \LUA\ (each assignment to
a field can result in a new copy). So in the end the advantages of sharing are
not that high (and nowadays memory is less of an issue, also given that a glue node
is only a few memory words larger than a spec).

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{glue} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC leader \NC node \NC pointer to a box or rule for leaders \NC \NR
\stoptabulate

In addition there are the \type {width}, \type {stretch}, \type {stretch_order},
\type {shrink}, and \type {shrink_order} fields. Note that we use the key \type
{width} in both horizontal and vertical glue. This suits the \TEX\ internals well
so we decided to stick to that naming.

A regular word space also results in a \type {spaceskip} subtype (this used to be
a \type {userskip} with subtype zero).
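
Returning to the effective width of a glue item inside a box: the computation can
be sketched from the fields listed in this chapter. The sketch assumes the usual
\TEX\ rule that glue only takes part in stretching or shrinking when its order
matches the order recorded in the parent box; the function name \type
{effective_width} is hypothetical and no rounding is applied.

\starttyping
-- Hypothetical sketch: effective width of a glue item inside its parent box.
-- `glue` needs width, stretch, stretch_order, shrink and shrink_order;
-- `parent` needs glue_sign, glue_order and glue_set.
local function effective_width(glue, parent)
    local width = glue.width
    if parent.glue_sign == 1 and glue.stretch_order == parent.glue_order then
        width = width + glue.stretch * parent.glue_set   -- stretching
    elseif parent.glue_sign == 2 and glue.shrink_order == parent.glue_order then
        width = width - glue.shrink * parent.glue_set    -- shrinking
    end
    return width                                         -- normal: natural width
end
\stoptyping

In practice the \type {effective_glue} function of the node library is the one to
call; the sketch only spells out which fields are involved.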

\subsubsection{kern nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{kern} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC kern \NC number \NC fixed horizontal or vertical advance \NC \NR
\stoptabulate

\subsubsection{penalty nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC not used \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC penalty \NC number \NC the penalty value \NC \NR
\stoptabulate

\subsubsection[glyphnodes]{glyph nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \rmbf type \NC \rmbf explanation \NC \NR
\NC subtype \NC number \NC bitfield \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC char \NC number \NC the character index in the font \NC \NR
\NC font \NC number \NC the font identifier \NC \NR
\NC lang \NC number \NC the language identifier \NC \NR
\NC left \NC number \NC the frozen \type {\lefthyphenmin} value \NC \NR
\NC right \NC number \NC the frozen \type {\righthyphenmin} value \NC \NR
\NC uchyph \NC boolean \NC the frozen \type {\uchyph} value \NC \NR
\NC components \NC node \NC pointer to ligature components \NC \NR
\NC xoffset \NC number \NC a virtual displacement in horizontal direction \NC \NR
\NC yoffset \NC number \NC a virtual displacement in vertical direction \NC \NR
\NC xadvance \NC number \NC an additional advance after the glyph (experimental) \NC \NR
\NC width \NC number \NC the (original) width of the character \NC \NR
\NC height \NC number \NC the (original) height of the character \NC \NR
\NC depth \NC number \NC the (original) depth of the character \NC \NR
\NC expansion_factor \NC number \NC the expansion factor to be applied \NC \NR
\stoptabulate

The \type {width}, \type {height} and \type {depth} values are read|-|only. The
\type {expansion_factor} is assigned in the parbuilder and used in the backend.

A warning: never assign a node list to the \type {components} field unless you are
sure its internal link structure is correct, otherwise an error may result. Valid
bits for the \type {subtype} field are:

\starttabulate[|c|l|]
\NC \rmbf bit \NC \bf meaning \NC \NR
\NC 0 \NC character \NC \NR
\NC 1 \NC ligature \NC \NR
\NC 2 \NC ghost \NC \NR
\NC 3 \NC left \NC \NR
\NC 4 \NC right \NC \NR
\stoptabulate

See \in {section} [charsandglyphs] for a detailed description of the \type
{subtype} field.

The \type {expansion_factor} has been introduced as part of the separation
between front- and backend. It is the result of extensive experiments with a more
efficient implementation of expansion. Early versions of \LUATEX\ already
replaced multiple instances of fonts in the backend by scaling but contrary to
\PDFTEX\ in \LUATEX\ we now also got rid of font copies in the frontend and
replaced them by expansion factors that travel with glyph nodes. Apart from a
cleaner approach this is also a step towards a better separation between front-
and backend.

The \type {is_char} function checks if a node is a glyph node with a subtype still
less than 256. This function can be used to determine if applying font logic to a
glyph node makes sense. The value \type {nil} gets returned when the node is not
a glyph, a character number is returned if the node is still tagged as character
and \type {false} gets returned otherwise. When nil is returned, the id is also
returned. The \type {is_glyph} variant doesn't check for a subtype being less
than 256, so it returns either the character value or nil plus the id. These
helpers are not always faster than separate calls but they sometimes permit
making more readable tests.
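
The return convention described above can be spelled out on plain \LUA\ tables.
This is a sketch of the described behaviour, not the library code: the function
name \type {classify_glyph} is hypothetical and the numeric id of glyph nodes is
passed in as \type {glyph_id} (in \LUATEX\ it would come from the node library).

\starttyping
-- Hypothetical sketch of the three possible outcomes.
local function classify_glyph(n, glyph_id)
    if n.id ~= glyph_id then
        return nil, n.id    -- not a glyph: nil plus the id
    elseif n.subtype < 256 then
        return n.char       -- still tagged as character
    else
        return false        -- a glyph, but no longer a plain character
    end
end
\stoptyping

A caller can therefore tell the three cases apart with a test on the first
return value only.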

\subsubsection{boundary nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{boundary} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC value \NC number \NC values 0--255 are reserved \NC \NR
\stoptabulate

This node relates to the \type {\noboundary}, \type {\boundary}, \type
{\protrusionboundary} and \type {\wordboundary} primitives.

\subsubsection{local_par nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC pen_inter \NC number \NC local interline penalty (from \type {\localinterlinepenalty}) \NC \NR
\NC pen_broken \NC number \NC local broken penalty (from \type {\localbrokenpenalty}) \NC \NR
\NC dir \NC string \NC the direction of this par, see~\in [dirnodes] \NC \NR
\NC box_left \NC node \NC the \type {\localleftbox} \NC \NR
\NC box_left_width \NC number \NC width of the \type {\localleftbox} \NC \NR
\NC box_right \NC node \NC the \type {\localrightbox} \NC \NR
\NC box_right_width \NC number \NC width of the \type {\localrightbox} \NC \NR
\stoptabulate

A warning: never assign a node list to the \type {box_left} or \type {box_right}
field unless you are sure its internal link structure is correct, otherwise an
error may result.

\subsubsection[dirnodes]{dir nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC dir \NC string \NC the direction (but see below) \NC \NR
\NC level \NC number \NC nesting level of this direction whatsit \NC \NR
\stoptabulate

A note on \type {dir} strings. Direction specifiers are three|-|letter
combinations of \type {T}, \type {B}, \type {R}, and \type {L}.

These are built up out of three separate items:

\startitemize[packed]
\startitem
    the first is the direction of the \quote{top} of paragraphs.
\stopitem
\startitem
    the second is the direction of the \quote{start} of lines.
\stopitem
\startitem
    the third is the direction of the \quote{top} of glyphs.
\stopitem
\stopitemize

However, only four combinations are accepted: \type {TLT}, \type {TRT}, \type
{RTT}, and \type {LTL}.

Inside actual \type {dir} whatsit nodes, the representation of \type {dir} is not
a three|-|letter but a four|-|letter combination. The first character in this case
is always either \type {+} or \type {-}, indicating whether the value is pushed
or popped from the direction stack.
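
The two string forms described above can be taken apart with a small pattern. The
helper below is a sketch under the rules just stated; the names \type {accepted}
and \type {parse_dir} are hypothetical.

\starttyping
-- The four combinations that are accepted according to the text above.
local accepted = { TLT = true, TRT = true, RTT = true, LTL = true }

-- Hypothetical helper: returns the three-letter direction and, for the
-- four-character form, "push" or "pop"; returns nil for anything else.
local function parse_dir(dir)
    local sign, letters = dir:match("^([+-]?)(%u%u%u)$")
    if not letters or not accepted[letters] then
        return nil
    end
    local action = (sign == "+" and "push") or (sign == "-" and "pop") or nil
    return letters, action
end
\stoptyping

For a plain three|-|letter specifier the second return value is \type {nil}, so
the same helper can be used for box directions and for \type {dir} nodes.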

\subsubsection{margin_kern nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{margin_kern} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC width \NC number \NC the advance of the kern \NC \NR
\NC glyph \NC node \NC the glyph to be used \NC \NR
\stoptabulate

\subsection{Math nodes}

These are the so||called \quote {noad}s and the nodes that are specifically
associated with math processing. Most of these nodes contain subnodes so that the
list of possible fields is actually quite small. First, the subnodes:

\subsubsection{Math kernel subnodes}

Many object fields in math mode are either simple characters in a specific family
or math lists or node lists. There are four associated subnodes that represent
these cases (in the following node descriptions these are indicated by the word

The \type {next} and \type {prev} fields for these subnodes are unused.

\subsubsubsection{math_char and math_text_char subnodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC char \NC number \NC the character index \NC \NR
\NC fam \NC number \NC the family number \NC \NR
\stoptabulate

The \type {math_char} is the simplest subnode field: it contains the character
and family for a single glyph object. The \type {math_text_char} is a special
case that you will not normally encounter: it arises temporarily during math list
conversion (its sole function is to suppress a following italic correction).

\subsubsubsection{sub_box and sub_mlist subnodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC head/list \NC node \NC list of nodes \NC \NR
\stoptabulate

These two subnode types are used for subsidiary list items. For \type {sub_box},
the \type {head} points to a \quote {normal} vbox or hbox. For \type {sub_mlist},
the \type {head} points to a math list that is yet to be converted.

A warning: never assign a node list to the \type {head} field unless you are sure
its internal link structure is correct, otherwise an error may result.

\subsubsection{Math delimiter subnode}

There is a fifth subnode type that is used exclusively for delimiter fields. As
before, the \type {next} and \type {prev} fields are unused.

\subsubsubsection{delim subnodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC small_char \NC number \NC character index of base character \NC \NR
\NC small_fam \NC number \NC family number of base character \NC \NR
\NC large_char \NC number \NC character index of next larger character \NC \NR
\NC large_fam \NC number \NC family number of next larger character \NC \NR
\stoptabulate

The fields \type {large_char} and \type {large_fam} can be zero; in that case the
font that is used for the \type {small_fam} is expected to provide the large
version as an extension to the \type {small_char}.

\subsubsection{Math core nodes}

First, there are the objects (the \TEX book calls them \quote {atoms}) that are
associated with the simple math objects: ord, op, bin, rel, open, close, punct,
inner, over, under, vcent. These all have the same fields, and they are combined
into a single node type with separate subtypes for differentiation.

\subsubsubsection{simple nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{noad} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC nucleus \NC kernel node \NC base \NC \NR
\NC sub \NC kernel node \NC subscript \NC \NR
\NC sup \NC kernel node \NC superscript \NC \NR
\stoptabulate

\subsubsubsection{accent nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{accent} \NC \NR
\NC nucleus \NC kernel node \NC base \NC \NR
\NC sub \NC kernel node \NC subscript \NC \NR
\NC sup \NC kernel node \NC superscript \NC \NR
\NC accent \NC kernel node \NC top accent \NC \NR
\NC bot_accent \NC kernel node \NC bottom accent \NC \NR
\stoptabulate

\subsubsubsection{style nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC style \NC string \NC contains the style \NC \NR
\stoptabulate

There are eight possibilities for the string value: one of \quote {display},
\quote {text}, \quote {script}, or \quote {scriptscript}. Each of these can have
a trailing \type {'} to signify \quote {cramped} styles.
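
The eight strings follow directly from this rule, as the following sketch shows.
The names \type {base}, \type {styles} and \type {is_cramped} are hypothetical
and only used for the illustration.

\starttyping
-- Build the eight style strings: four base names, each with a cramped variant.
local base = { "display", "text", "script", "scriptscript" }
local styles = { }
for _, name in ipairs(base) do
    styles[#styles + 1] = name          -- normal variant
    styles[#styles + 1] = name .. "'"   -- cramped variant
end

-- Hypothetical helper: a style is cramped when it ends with a quote.
local function is_cramped(style)
    return style:sub(-1) == "'"
end
\stoptyping

A test such as \type {is_cramped(n.style)} can then be applied to the \type
{style} field of a style node.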

\subsubsubsection{choice nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC display \NC node \NC list of display size alternatives \NC \NR
\NC text \NC node \NC list of text size alternatives \NC \NR
\NC script \NC node \NC list of scriptsize alternatives \NC \NR
\NC scriptscript \NC node \NC list of scriptscriptsize alternatives \NC \NR
\stoptabulate

A warning: never assign a node list to the \type {display}, \type {text}, \type
{script}, or \type {scriptscript} field unless you are sure its internal link
structure is correct, otherwise an error may result.

\subsubsubsection{radical nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{radical} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC nucleus \NC kernel node \NC base \NC \NR
\NC sub \NC kernel node \NC subscript \NC \NR
\NC sup \NC kernel node \NC superscript \NC \NR
\NC left \NC delimiter node \NC \NC \NR
\NC degree \NC kernel node \NC only set by \type {\Uroot} \NC \NR
\stoptabulate

A warning: never assign a node list to the \type {nucleus}, \type {sub}, \type
{sup}, \type {left}, or \type {degree} field unless you are sure its internal
link structure is correct, otherwise an error may result.

\subsubsubsection{fraction nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC width \NC number \NC (optional) width of the fraction \NC \NR
\NC num \NC kernel node \NC numerator \NC \NR
\NC denom \NC kernel node \NC denominator \NC \NR
\NC left \NC delimiter node \NC left side symbol \NC \NR
\NC right \NC delimiter node \NC right side symbol \NC \NR
\stoptabulate

A warning: never assign a node list to the \type {num} or \type {denom} field
unless you are sure its internal link structure is correct, otherwise an error
may result.

\subsubsubsection{fence nodes}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC subtype \NC number \NC \showsubtypes{fence} \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC delim \NC delimiter node \NC delimiter specification \NC \NR
\stoptabulate

\subsection{Whatsit nodes}

Whatsit nodes come in many subtypes that you can ask for by running
\type {node.whatsits()}:

\startluacode
    for id, name in table.sortedpairs(node.whatsits()) do
        context.type(name)      -- the whatsit name, typeset verbatim
        context(" (%s), ",id)   -- followed by its numeric subtype
    end
    context.removeunwantedspaces()
    context.removepunctuation()
\stopluacode

\subsubsection{front|-|end whatsits}

\subsubsubsection{open whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC stream \NC number \NC \TEX's stream id number \NC \NR
\NC name \NC string \NC file name \NC \NR
\NC ext \NC string \NC file extension \NC \NR
\NC area \NC string \NC file area (this may become obsolete) \NC \NR
\stoptabulate

\subsubsubsection{write whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC stream \NC number \NC \TEX's stream id number \NC \NR
\NC data \NC table \NC a table representing the token list to be written \NC \NR
\stoptabulate

\subsubsubsection{close whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC stream \NC number \NC \TEX's stream id number \NC \NR
\stoptabulate

\subsubsubsection{user_defined whatsits}

User|-|defined whatsit nodes can only be created and handled from \LUA\ code. In
effect, they are an extension to the extension mechanism. The \LUATEX\ engine
will simply step over such whatsits without ever looking at the contents.

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC user_id \NC number \NC id number \NC \NR
\NC type \NC number \NC type of the value \NC \NR
\NC value \NC number \NC a \LUA\ number \NC \NR
\NC \NC node \NC a node list \NC \NR
\NC \NC string \NC a \LUA\ string \NC \NR
\NC \NC table \NC a \LUA\ table \NC \NR
\stoptabulate

The \type {type} can have one of five distinct values:

\starttabulate[|lT|p|]
\NC \rmbf value \NC \bf explanation \NC \NR
\NC 97 \NC list of attributes \NC \NR
\NC 100 \NC a \LUA\ number \NC \NR
\NC 110 \NC a node list \NC \NR
\NC 115 \NC a \LUA\ string \NC \NR
\NC 116 \NC a \LUA\ token list in \LUA\ table form \NC \NR
\stoptabulate
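
The five codes in the table above coincide with the character codes of the letters
a, d, n, s and t, which makes them easy to remember. A small lookup table can be
built on that observation; the names \type {type_codes} and \type {describe_type}
are hypothetical.

\starttyping
-- Map the numeric type codes to a short description, keyed via string.byte.
local type_codes = {
    [string.byte("a")] = "attribute list",  -- 97
    [string.byte("d")] = "number",          -- 100
    [string.byte("n")] = "node list",       -- 110
    [string.byte("s")] = "string",          -- 115
    [string.byte("t")] = "token list",      -- 116
}

-- Hypothetical helper: describe the `type` field of a user_defined whatsit.
local function describe_type(code)
    return type_codes[code] or "unknown"
end
\stoptyping

With this in place, \type {describe_type(n.type)} gives a readable label for the
value stored in a user|-|defined whatsit.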

\subsubsubsection{save_pos whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\stoptabulate

\subsubsubsection{late_lua whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC data \NC string \NC data to execute \NC \NR
\NC string \NC string \NC data to execute \NC \NR
\NC name \NC string \NC the name to use for \LUA\ error reporting \NC \NR
\stoptabulate

The difference between \type {data} and \type {string} is that on assignment, the
\type {data} field is converted to a token list, cf. use as \type {\latelua}. The
\type {string} version is treated as a literal string.

\subsubsection{\DVI\ backend whatsits}

\subsubsubsection{special whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC data \NC string \NC the \type {\special} information \NC \NR
\stoptabulate

\subsubsection{\PDF\ backend whatsits}

\subsubsubsection{pdf_literal whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC mode \NC number \NC the \quote {mode} setting of this literal \NC \NR
\NC data \NC string \NC the \type {\pdfliteral} information \NC \NR
\stoptabulate

Possible mode values are:

\starttabulate[|lT|p|]
\NC \rmbf value \NC \rmbf \PDFTEX\ keyword \NC \NR
\NC 0 \NC setorigin \NC \NR
\NC 1 \NC page \NC \NR
\NC 2 \NC direct \NC \NR
\stoptabulate

\subsubsubsection{pdf_refobj whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC objnum \NC number \NC the referenced \PDF\ object number \NC \NR
\stoptabulate

\subsubsubsection{pdf_annot whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC width \NC number \NC the width (not used in calculations) \NC \NR
\NC height \NC number \NC the height (not used in calculations) \NC \NR
\NC depth \NC number \NC the depth (not used in calculations) \NC \NR
\NC objnum \NC number \NC the referenced \PDF\ object number \NC \NR
\NC data \NC string \NC the annotation data \NC \NR
\stoptabulate

\subsubsubsection{pdf_start_link whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC width \NC number \NC the width (not used in calculations) \NC \NR
\NC height \NC number \NC the height (not used in calculations) \NC \NR
\NC depth \NC number \NC the depth (not used in calculations) \NC \NR
\NC objnum \NC number \NC the referenced \PDF\ object number \NC \NR
\NC link_attr \NC table \NC the link attribute token list \NC \NR
\NC action \NC node \NC the action to perform \NC \NR
\stoptabulate

\subsubsubsection{pdf_end_link whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\stoptabulate

\subsubsubsection{pdf_dest whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\NC width \NC number \NC the width (not used in calculations) \NC \NR
\NC height \NC number \NC the height (not used in calculations) \NC \NR
\NC depth \NC number \NC the depth (not used in calculations) \NC \NR
\NC named_id \NC number \NC is the \type {dest_id} a string value? \NC \NR
\NC dest_id \NC number \NC the destination id \NC \NR
\NC \NC string \NC the destination name \NC \NR
\NC dest_type \NC number \NC type of destination \NC \NR
\NC xyz_zoom \NC number \NC the zoom factor (times 1000) \NC \NR
\NC objnum \NC number \NC the \PDF\ object number \NC \NR
\stoptabulate
\subsubsubsection{pdf_action whatsits}

These are a special kind of item that only appears inside \PDF\ start link
objects.

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC action_type \NC number           \NC the kind of action involved \NC \NR
\NC action_id   \NC number or string \NC token list reference or string \NC \NR
\NC named_id    \NC number           \NC the index of the destination \NC \NR
\NC file        \NC string           \NC the target filename \NC \NR
\NC new_window  \NC number           \NC the window state of the target \NC \NR
\NC data        \NC string           \NC the name of the destination \NC \NR
\stoptabulate

Valid action types are:

\starttabulate[|lT|lT|]
\NC 0 \NC page   \NC \NR
\NC 1 \NC goto   \NC \NR
\NC 2 \NC thread \NC \NR
\NC 3 \NC user   \NC \NR
\stoptabulate

Valid window types are:

\starttabulate[|lT|lT|]
\NC 0 \NC notset \NC \NR
\NC 1 \NC new    \NC \NR
\NC 2 \NC nonew  \NC \NR
\stoptabulate

\subsubsubsection{pdf_thread whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr        \NC node   \NC list of attributes \NC \NR
\NC width       \NC number \NC the width (not used in calculations) \NC \NR
\NC height      \NC number \NC the height (not used in calculations) \NC \NR
\NC depth       \NC number \NC the depth (not used in calculations) \NC \NR
\NC named_id    \NC number \NC is \type {tread_id} a string value? \NC \NR
\NC tread_id    \NC number \NC the thread id \NC \NR
\NC             \NC string \NC the thread name \NC \NR
\NC thread_attr \NC number \NC extra thread information \NC \NR
\stoptabulate

\subsubsubsection{pdf_start_thread whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr        \NC node   \NC list of attributes \NC \NR
\NC width       \NC number \NC the width (not used in calculations) \NC \NR
\NC height      \NC number \NC the height (not used in calculations) \NC \NR
\NC depth       \NC number \NC the depth (not used in calculations) \NC \NR
\NC named_id    \NC number \NC is \type {tread_id} a string value? \NC \NR
\NC tread_id    \NC number \NC the thread id \NC \NR
\NC             \NC string \NC the thread name \NC \NR
\NC thread_attr \NC number \NC extra thread information \NC \NR
\stoptabulate

\subsubsubsection{pdf_end_thread whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\stoptabulate

\subsubsubsection{pdf_colorstack whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr    \NC node   \NC list of attributes \NC \NR
\NC stack   \NC number \NC colorstack id number \NC \NR
\NC command \NC number \NC command to execute \NC \NR
\NC data    \NC string \NC data \NC \NR
\stoptabulate

\subsubsubsection{pdf_setmatrix whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node   \NC list of attributes \NC \NR
\NC data \NC string \NC data \NC \NR
\stoptabulate

\subsubsubsection{pdf_save whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\stoptabulate

\subsubsubsection{pdf_restore whatsits}

\starttabulate[|lT|l|p|]
\NC \rmbf field \NC \bf type \NC \bf explanation \NC \NR
\NC attr \NC node \NC list of attributes \NC \NR
\stoptabulate

\section{Two access models}

Deep down in \TEX\ a node has a number which is a numeric entry in a memory
table. In fact, this model, where \TEX\ manages memory, is really fast and one of
the reasons why plugging in callbacks that operate on nodes is quite fast too.
Each node gets a number that is in fact an index in the memory table and that
number often gets reported when you print node related information.

There are two access models, a robust one using a so called user data object that
provides a virtual interface to the internal nodes, and a more direct access which
uses the node numbers directly. The first model provides key based access while
the second always accesses fields via functions:

\starttyping
nodeobject.char
getfield(nodenumber,"char")
\stoptyping

If you use the direct model, even if you know that you deal with numbers, you
should not depend on that property but treat it as an abstraction just like
traditional nodes, so multiplying a node number makes no sense. The fact that we
use a simple basic datatype has the penalty that less checking can be done, but
less checking is also the reason why it's somewhat faster. An important aspect is
that one cannot mix both methods, but you can cast between both models.

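As a minimal sketch of the direct model, and assuming that the \type
{node.direct} functions listed in the function table of this section behave as
their names suggest, the following helper counts the glyphs in a list. The name
\type {count_glyphs} is made up for this illustration. The node number is only
ever passed to library functions and is never inspected or calculated with.

\starttyping
local direct   = node.direct
local todirect = direct.todirect
local getid    = direct.getid
local getnext  = direct.getnext
local glyph_id = node.id("glyph")

-- hypothetical helper: count the glyph nodes in a userdata node list
local function count_glyphs(head)
    local d = todirect(head)  -- cast the userdata node to its direct form
    local n = 0
    while d do
        if getid(d) == glyph_id then
            n = n + 1
        end
        d = getnext(d)        -- yields nil at the end of the list
    end
    return n
end
\stoptyping

Inside a callback that receives a userdata \type {head}, this would be called as
\type {count_glyphs(head)}. A value obtained from \type {node.direct} can be cast
back with \type {node.direct.tonode} when it has to be handed to code that expects
userdata.
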
So our advice is: use the indexed (table) approach when possible and investigate
the direct one when speed might be a real issue. For that reason we also provide
the \type {get*} and \type {set*} functions in the top level node namespace.
There is a limited set of getters. When implementing this direct approach the
regular index by key variant was also optimized, so direct access only makes
sense when we're accessing nodes millions of times (which happens in some font
processing for instance).

We're talking mostly of getters because setters are less important. Documents
do not have that many content related nodes, and setting many thousands of
properties is hardly a burden, contrary to millions of consultations.

Normally you will access nodes like this:

\starttyping
local next = current.next
\stoptyping

Here \type {next} is not a real field, but a virtual one. Accessing it results in
a metatable method being called. In practice it boils down to looking up the node
type and based on the node type checking for the field name. In a worst case you
have a node type that sits at the end of the lookup list and a field that is last
in the lookup chain. However, in successive versions of \LUATEX\ these lookups
have been optimized and the most frequently accessed nodes and fields have a
higher priority.

Because in practice the \type {next} accessor results in a function call, there
is some overhead involved. The next code does the same and performs a tiny bit
faster (but not that much because it is still a function call but one that knows
what to look up).

\starttyping
local next = node.next(current)
\stoptyping

If performance matters you can use a function instead:

\starttabulate[|T|p|]
\NC getnext    \NC parsing a node list always involves this one \NC \NR
\NC getprev    \NC used less but is the logical companion to \type {getnext} \NC \NR
\NC getboth    \NC returns the next and prev pointer of a node \NC \NR
\NC getid      \NC consulted a lot \NC \NR
\NC getsubtype \NC consulted less but still a lot \NC \NR
\NC getfont    \NC used a lot in \OPENTYPE\ handling (glyph nodes are consulted a lot) \NC \NR
\NC getchar    \NC idem and also in other places \NC \NR
\NC getdisc    \NC returns the \type {pre}, \type {post} and \type {replace} fields and
                   optionally, when true is passed, also the tail fields \NC \NR
\NC getlist    \NC we often parse nested lists so this is a convenient one too
                   (only works for hlist and vlist!) \NC \NR
\NC getleader  \NC comparable to list, seldom used in \TEX\ (but needs frequent consulting
                   like lists; leaders could have been made a dedicated node type) \NC \NR
\NC getfield   \NC generic getter, sufficient for the rest (other field names are
                   often shared so a specific getter makes no sense then) \NC \NR
\stoptabulate

The direct variants also have setters, where the discretionary setter takes three
(optional) arguments plus an optional fourth indicating the subtype.

It doesn't make sense to add getters for all fields, also because some are not
unique to one node type. Profiling demonstrated that these fields can get
accessed way more often than other fields. Even in complex documents, many node
and field types never get seen, or are seen only a few times. Most functions in the
\type {node} namespace have a companion in \type {node.direct}, but of course not
the ones that don't deal with nodes themselves. The following table summarizes
this:

% \startcolumns[balance=yes]

\def\yes{$+$} \def\nop{$-$}

\starttabulate[|T|c|c|]
\NC \bf function \NC \bf node \NC \bf direct \NC \NR
\NC \type {copy_list}            \NC \yes \NC \yes \NC \NR
\NC \type {copy}                 \NC \yes \NC \yes \NC \NR
\NC \type {count}                \NC \yes \NC \yes \NC \NR
\NC \type {current_attr}         \NC \yes \NC \yes \NC \NR
\NC \type {dimensions}           \NC \yes \NC \yes \NC \NR
\NC \type {do_ligature_n}        \NC \yes \NC \yes \NC \NR
\NC \type {effective_glue}       \NC \yes \NC \yes \NC \NR
\NC \type {end_of_math}          \NC \yes \NC \yes \NC \NR
\NC \type {family_font}          \NC \yes \NC \nop \NC \NR
\NC \type {fields}               \NC \yes \NC \nop \NC \NR
\NC \type {first_character}      \NC \yes \NC \nop \NC \NR
\NC \type {first_glyph}          \NC \yes \NC \yes \NC \NR
\NC \type {flush_list}           \NC \yes \NC \yes \NC \NR
\NC \type {flush_node}           \NC \yes \NC \yes \NC \NR
\NC \type {free}                 \NC \yes \NC \yes \NC \NR
\NC \type {getboth}              \NC \yes \NC \yes \NC \NR
\NC \type {getbox}               \NC \nop \NC \yes \NC \NR
\NC \type {getchar}              \NC \yes \NC \yes \NC \NR
\NC \type {getdisc}              \NC \yes \NC \yes \NC \NR
\NC \type {getfield}             \NC \yes \NC \yes \NC \NR
\NC \type {getfont}              \NC \yes \NC \yes \NC \NR
\NC \type {getid}                \NC \yes \NC \yes \NC \NR
\NC \type {getleader}            \NC \yes \NC \yes \NC \NR
\NC \type {getlist}              \NC \yes \NC \yes \NC \NR
\NC \type {getnext}              \NC \yes \NC \yes \NC \NR
\NC \type {getprev}              \NC \yes \NC \yes \NC \NR
\NC \type {getsubtype}           \NC \yes \NC \yes \NC \NR
\NC \type {has_attribute}        \NC \yes \NC \yes \NC \NR
\NC \type {has_field}            \NC \yes \NC \yes \NC \NR
\NC \type {has_glyph}            \NC \yes \NC \yes \NC \NR
\NC \type {hpack}                \NC \yes \NC \yes \NC \NR
\NC \type {id}                   \NC \yes \NC \nop \NC \NR
\NC \type {insert_after}         \NC \yes \NC \yes \NC \NR
\NC \type {insert_before}        \NC \yes \NC \yes \NC \NR
\NC \type {is_char}              \NC \yes \NC \yes \NC \NR
\NC \type {is_glyph}             \NC \yes \NC \yes \NC \NR
\NC \type {is_direct}            \NC \nop \NC \yes \NC \NR
\NC \type {is_node}              \NC \yes \NC \yes \NC \NR
\NC \type {kerning}              \NC \yes \NC \yes \NC \NR
\NC \type {last_node}            \NC \yes \NC \yes \NC \NR
\NC \type {length}               \NC \yes \NC \yes \NC \NR
\NC \type {ligaturing}           \NC \yes \NC \yes \NC \NR
\NC \type {mlist_to_hlist}       \NC \yes \NC \nop \NC \NR
\NC \type {new}                  \NC \yes \NC \yes \NC \NR
\NC \type {next}                 \NC \yes \NC \nop \NC \NR
\NC \type {prev}                 \NC \yes \NC \nop \NC \NR
\NC \type {protect_glyph}        \NC \yes \NC \yes \NC \NR
\NC \type {protect_glyphs}       \NC \yes \NC \yes \NC \NR
\NC \type {protrusion_skippable} \NC \yes \NC \yes \NC \NR
\NC \type {remove}               \NC \yes \NC \yes \NC \NR
\NC \type {set_attribute}        \NC \yes \NC \yes \NC \NR
\NC \type {setboth}              \NC \yes \NC \yes \NC \NR
\NC \type {setbox}               \NC \yes \NC \yes \NC \NR
\NC \type {setchar}              \NC \yes \NC \yes \NC \NR
\NC \type {setdisc}              \NC \yes \NC \yes \NC \NR
\NC \type {setfield}             \NC \yes \NC \yes \NC \NR
\NC \type {setlink}              \NC \yes \NC \yes \NC \NR
\NC \type {setnext}              \NC \yes \NC \yes \NC \NR
\NC \type {setprev}              \NC \yes \NC \yes \NC \NR
\NC \type {slide}                \NC \yes \NC \yes \NC \NR
\NC \type {subtype}              \NC \yes \NC \nop \NC \NR
\NC \type {subtypes}             \NC \yes \NC \nop \NC \NR
\NC \type {tail}                 \NC \yes \NC \yes \NC \NR
\NC \type {todirect}             \NC \yes \NC \yes \NC \NR
\NC \type {tonode}               \NC \yes \NC \yes \NC \NR
\NC \type {tostring}             \NC \yes \NC \yes \NC \NR
\NC \type {traverse_id}          \NC \yes \NC \yes \NC \NR
\NC \type {traverse_char}        \NC \yes \NC \yes \NC \NR
\NC \type {traverse}             \NC \yes \NC \yes \NC \NR
\NC \type {types}                \NC \yes \NC \nop \NC \NR
\NC \type {type}                 \NC \yes \NC \nop \NC \NR
\NC \type {unprotect_glyphs}     \NC \yes \NC \yes \NC \NR
\NC \type {unset_attribute}      \NC \yes \NC \yes \NC \NR
\NC \type {usedlist}             \NC \yes \NC \yes \NC \NR
\NC \type {vpack}                \NC \yes \NC \yes \NC \NR
\NC \type {whatsits}             \NC \yes \NC \nop \NC \NR
\NC \type {whatsitsubtypes}      \NC \yes \NC \nop \NC \NR
\NC \type {write}                \NC \yes \NC \yes \NC \NR
\NC \type {setglue}              \NC \yes \NC \yes \NC \NR
\NC \type {getglue}              \NC \yes \NC \yes \NC \NR
\NC \type {glue_is_zero}         \NC \yes \NC \yes \NC \NR
\stoptabulate

The \type {node.next} and \type {node.prev} functions will stay but for
consistency there are variants called \type {getnext} and \type {getprev}. We had
to use \type {get} because \type {node.id} and \type {node.subtype} are already
taken for providing meta information about nodes. Note: the getters do only basic
checking for valid keys. You should just stick to the keys mentioned in the
sections that describe node properties.

Some nodes have indirect references. For instance a math character refers to a
family instead of a font. In that case we provide a virtual font field as
accessor. So, \type {getfont} and \type {.font} can be used on them. The same is
true for the \type {width}, \type {height} and \type {depth} of glue nodes. These
actually access the spec node properties, and here we can set as well as get the
values.

\section{The \type {node} library}

The \type {node} library contains functions that facilitate dealing with (lists
of) nodes and their values. They allow you to create, alter, copy, delete, and
insert \LUATEX\ node objects, the core objects within the typesetter.

\LUATEX\ nodes are represented in \LUA\ as userdata with the metadata type
\type {luatex.node}. The various parts within a node can be accessed using
named fields.

Each node has at least the three fields \type {next}, \type {id}, and \type
{subtype}:

\startitemize[intro]

\startitem
    The \type {next} field returns the userdata object for the next node in a
    linked list of nodes, or \type {nil}, if there is no next node.
\stopitem

\startitem
    The \type {id} indicates \TEX's \quote{node type}. The field \type {id} has a
    numeric value for efficiency reasons, but some of the library functions also
    accept a string value instead of \type {id}.
\stopitem

\startitem
    The \type {subtype} is another number. It often gives further information
    about a node of a particular \type {id}, but it is most important when
    dealing with \quote {whatsits}, because they are differentiated solely based
    on their \type {subtype}.
\stopitem

\stopitemize

The other available fields depend on the \type {id} (and for \quote {whatsits},
the \type {subtype}) of the node. Further details on the various fields and their
meanings are given in~\in{chapter}[nodes].

Support for \type {unset} (alignment) nodes is partial: they can be queried and
modified from \LUA\ code, but not created.

Nodes can be compared to each other, but you are actually comparing indices into
the node memory. This means that equality tests can only be trusted under very
limited conditions. It will not work correctly in any situation where one of the
two nodes has been freed and|/|or reallocated: in that case, there will be false
positives.

At the moment, memory management of nodes should still be done explicitly by the
user. Nodes are not \quote {seen} by the \LUA\ garbage collector, so you have to
call the node freeing functions yourself when you are no longer in need of a node
(list). Nodes form linked lists without reference counting, so you have to be
careful that when control returns back to \LUATEX\ itself, you have not deleted
nodes that are still referenced from a \type {next} pointer elsewhere, and that
you did not create nodes that are referenced more than once.

There are statistics available with regard to the allocated node memory, which
can be handy for tracing.

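To illustrate the explicit memory management described above, here is a small
sketch that creates two nodes, links them, and releases them again. It only uses
\type {node.new} and \type {node.flush_list} as documented in this chapter; the
variable names are arbitrary.

\starttyping
-- create two nodes; TeX owns the memory, Lua only holds references
local g1 = node.new("glyph")
local g2 = node.new("glyph")

-- link them: g2 is now referenced exactly once, namely by g1.next
g1.next = g2

-- when the list is not handed over to TeX, it has to be freed by hand;
-- flush_list frees g1 and every node that follows it
node.flush_list(g1)

-- the Lua variables still exist but no longer refer to valid nodes
g1, g2 = nil, nil
\stoptyping

Clearing the \LUA\ variables afterwards is not required, but it makes accidental
reuse of a freed node less likely.
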
\subsection{Node handling functions}

\subsubsection{\type {node.is_node}}

\startfunctioncall
node.is_node(<any> item)
\stopfunctioncall

This function returns true if the argument is a userdata object of
type \type {<node>}.

\subsubsection{\type {node.types}}

This function returns an array that maps node id numbers to node type strings,
providing an overview of the possible top|-|level \type {id} types.

\subsubsection{\type {node.whatsits}}

\TEX's \quote{whatsits} all have the same \type {id}. The various subtypes are
defined by their \type {subtype} fields. The function is much like \type
{node.types}, except that it provides an array of \type {subtype} mappings.

\subsubsection{\type {node.id}}

\startfunctioncall
node.id(<string> type)
\stopfunctioncall

This converts a single type name to its internal numeric representation.

\subsubsection{\type {node.subtype}}

\startfunctioncall
node.subtype(<string> type)
\stopfunctioncall

This converts a single whatsit name to its internal numeric representation (\type
{subtype}).

\subsubsection{\type {node.type}}

If the argument is a number, then this function converts an internal numeric
representation to an external string representation. Otherwise, it will return
the string \type {node} if the object represents a node, and \type {nil}
otherwise.

\subsubsection{\type {node.fields}}

\startfunctioncall
node.fields(<number> id)
node.fields(<number> id, <number> subtype)
\stopfunctioncall

This function returns an array of valid field names for a particular type of
node. If you want to get the valid fields for a \quote {whatsit}, you have to
supply the second argument also. In other cases, any given second argument will
be silently ignored.

This function accepts string \type {id} and \type {subtype} values as well.

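As a small illustration, the following sketch writes the field names of a glyph
node to the log. It iterates with \type {pairs} instead of \type {ipairs}, on the
assumption that the returned array does not necessarily start at index~1, so the
order of the output is not specified.

\starttyping
local glyph_id = node.id("glyph")

-- print every index and field name reported for glyph nodes
for index, name in pairs(node.fields(glyph_id)) do
    texio.write_nl(tostring(index) .. ": " .. name)
end
\stoptyping

The same loop works with \type {node.fields("glyph")}, because string ids are
accepted as well.
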
\subsubsection{\type {node.has_field}}

\startfunctioncall
node.has_field(<node> n, <string> field)
\stopfunctioncall

This function returns a boolean that is only true if \type {n} is
actually a node, and it has the field.

\subsubsection{\type {node.new}}

\startfunctioncall
node.new(<number> id)
node.new(<number> id, <number> subtype)
\stopfunctioncall

Creates a new node. All of the new node's fields are initialized to either zero
or \type {nil} except for \type {id} and \type {subtype} (if supplied). If you
want to create a new whatsit, then the second argument is required, otherwise it
need not be present. As with all node functions, this function creates a node on
the \TEX\ level.

This function accepts string \type {id} and \type {subtype} values as well.

\subsubsection{\type {node.free}}

Removes the node \type {n} from \TEX's memory. Be careful: no checks are done on
whether this node is still pointed to from a register or some \type {next} field:
it is up to you to make sure that the internal data structures remain correct.

\subsubsection{\type {node.flush_list}}

\startfunctioncall
node.flush_list(<node> n)
\stopfunctioncall

Removes the node \type {n} and the complete node list following \type {n}
from \TEX's memory. Be careful: no checks are done on whether any of these nodes
is still pointed to from a register or some \type {next} field: it is up to you
to make sure that the internal data structures remain correct.

\subsubsection{\type {node.copy}}

Creates a deep copy of node \type {n}, including all nested lists as in the case
of a hlist or vlist node. Only the \type {next} field is not copied.

\subsubsection{\type {node.copy_list}}

\startfunctioncall
node.copy_list(<node> n)
node.copy_list(<node> n, <node> m)
\stopfunctioncall

Creates a deep copy of the node list that starts at \type {n}. If \type {m} is
also given, the copy stops just before node \type {m}.

Note that you cannot copy attribute lists this way. Specialized functions for
dealing with attribute lists will be provided later but are not there yet.
However, there is normally no need to copy attribute lists, because when you
assign to the \type {attr} field or make changes to specific attributes, the
needed copying and freeing take place automatically.

\subsubsection{\type {node.next}}

Returns the node following this node, or \type {nil} if there is no such node.

\subsubsection{\type {node.prev}}

Returns the node preceding this node, or \type {nil} if there is no such node.

\subsubsection{\type {node.current_attr}}

Returns the currently active list of attributes, if there is one.

The intended usage of \type {current_attr} is as follows:

\starttyping
local x1 = node.new("glyph")
x1.attr = node.current_attr()
local x2 = node.new("glyph")
x2.attr = node.current_attr()
\stoptyping

or:

\starttyping
local x1 = node.new("glyph")
local x2 = node.new("glyph")
local ca = node.current_attr()
x1.attr = ca
x2.attr = ca
\stoptyping

The attribute lists are ref counted and the assignment takes care of incrementing
the refcount. You cannot expect the value \type {ca} to be valid any more when
you assign attributes (using \type {tex.setattribute}) or when control has been
passed back to \TEX.

Note: this function is somewhat experimental, and it returns the {\it actual}
attribute list, not a copy thereof. Therefore, changing any of the attributes in
the list will change these values for all nodes that have the current attribute
list assigned to them.

\subsubsection{\type {node.hpack}}

\startfunctioncall
<node> h, <number> b =
    node.hpack(<node> n)
<node> h, <number> b =
    node.hpack(<node> n, <number> w, <string> info)
<node> h, <number> b =
    node.hpack(<node> n, <number> w, <string> info, <string> dir)
\stopfunctioncall

This function creates a new hlist by packaging the list that begins at node \type
{n} into a horizontal box. With only a single argument, this box is created using
the natural width of its components. In the three argument form, \type {info}
must be either \type {additional} or \type {exactly}, and \type {w} is the
additional (\type {\hbox spread}) or exact (\type {\hbox to}) width to be used. The
second return value is the badness of the generated box.

Caveat: at this moment, there can be unexpected side|-|effects to this function,
like updating some of the \type {\marks} and \type {\inserts}. Also note that the
content of \type {h} is the original node list \type {n}: if you call \type
{node.free(h)} you will also free the node list itself, unless you explicitly set
the \type {list} field to \type {nil} beforehand. And in a similar way, calling
\type {node.free(n)} will invalidate \type {h} as well!

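The ownership caveat can be made concrete with a short sketch. It assumes that a
font is active, so that \type {font.current()} returns a usable font id; the
helper \type {glyph} is made up for this example.

\starttyping
-- hypothetical helper: a glyph node for the given character code
local function glyph(char)
    local g = node.new("glyph")
    g.font = font.current()
    g.char = char
    return g
end

local a, b = glyph(0x61), glyph(0x62)
a.next = b

-- package the two glyphs at their natural width
local box, badness = node.hpack(a)

-- the box now owns the list: detach it before freeing the wrapper,
-- otherwise freeing the box would free the glyphs too
local content = box.list
box.list = nil
node.free(box)

-- the glyphs are still valid here and have to be freed (or used) separately
node.flush_list(content)
\stoptyping

Without the assignment of \type {nil} to the \type {list} field, the final \type
{node.flush_list} call would operate on nodes that are already gone.
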
\subsubsection{\type {node.vpack}}

\startfunctioncall
<node> h, <number> b =
    node.vpack(<node> n)
<node> h, <number> b =
    node.vpack(<node> n, <number> w, <string> info)
<node> h, <number> b =
    node.vpack(<node> n, <number> w, <string> info, <string> dir)
\stopfunctioncall

This function creates a new vlist by packaging the list that begins at node \type
{n} into a vertical box. With only a single argument, this box is created using
the natural height of its components. In the three argument form, \type {info}
must be either \type {additional} or \type {exactly}, and \type {w} is the
additional (\type {\vbox spread}) or exact (\type {\vbox to}) height to be used.

The second return value is the badness of the generated box.

See the description of \type {node.hpack()} for a few memory allocation caveats.

\subsubsection{\type {node.dimensions}}

\startfunctioncall
<number> w, <number> h, <number> d =
    node.dimensions(<node> n)
<number> w, <number> h, <number> d =
    node.dimensions(<node> n, <string> dir)
<number> w, <number> h, <number> d =
    node.dimensions(<node> n, <node> t)
<number> w, <number> h, <number> d =
    node.dimensions(<node> n, <node> t, <string> dir)
\stopfunctioncall

This function calculates the natural in|-|line dimensions of the node list starting
at node \type {n} and terminating just before node \type {t} (or the end of the
list, if there is no second argument). The return values are scaled points. An
alternative format that starts with glue parameters as the first three arguments
is also possible:

\startfunctioncall
<number> w, <number> h, <number> d =
    node.dimensions(<number> glue_set, <number> glue_sign, <number> glue_order,
        <node> n)
<number> w, <number> h, <number> d =
    node.dimensions(<number> glue_set, <number> glue_sign, <number> glue_order,
        <node> n, <string> dir)
<number> w, <number> h, <number> d =
    node.dimensions(<number> glue_set, <number> glue_sign, <number> glue_order,
        <node> n, <node> t)
<number> w, <number> h, <number> d =
    node.dimensions(<number> glue_set, <number> glue_sign, <number> glue_order,
        <node> n, <node> t, <string> dir)
\stopfunctioncall

This calling method takes glue settings into account and is especially useful for
finding the actual width of a sublist of nodes that are already boxed, for
example in code like this, which prints the width of the space in between the
\type {a} and \type {b} as it would be if \type {\box0} was used as-is:

\starttyping
\setbox0 = \hbox to 20pt {a b}

\directlua{print (node.dimensions(
    tex.box[0].glue_set,
    tex.box[0].glue_sign,
    tex.box[0].glue_order,
    tex.box[0].head.next,
    node.tail(tex.box[0].head)
)) }
\stoptyping

You need to keep in mind that this is one of the few places in \TEX\ where floats
are used, which means that you can get small differences in rounding when you
compare the width reported by \type {hpack} with \type {dimensions}.

\subsubsection{\type {node.mlist_to_hlist}}

\startfunctioncall
<node> h =
    node.mlist_to_hlist(<node> n, <string> display_type, <boolean> penalties)
\stopfunctioncall

This runs the internal mlist to hlist conversion, converting the math list in
\type {n} into the horizontal list \type {h}. The interface is exactly the same
as for the callback \type {mlist_to_hlist}.

\subsubsection{\type {node.slide}}

\startfunctioncall
node.slide(<node> n)
\stopfunctioncall

Returns the last node of the node list that starts at \type {n}. As a
side|-|effect, it also creates a reverse chain of \type {prev} pointers between
nodes.

\subsubsection{\type {node.tail}}

Returns the last node of the node list that starts at \type {n}.

\subsubsection{\type {node.length}}

\startfunctioncall
node.length(<node> n)
node.length(<node> n, <node> m)
\stopfunctioncall

Returns the number of nodes contained in the node list that starts at \type {n}.
If \type {m} is also supplied it stops at \type {m} instead of at the end of the
list. The node \type {m} is not counted.

\subsubsection{\type {node.count}}

\startfunctioncall
node.count(<number> id, <node> n)
node.count(<number> id, <node> n, <node> m)
\stopfunctioncall

Returns the number of nodes contained in the node list that starts at \type {n}
that have a matching \type {id} field. If \type {m} is also supplied, counting
stops at \type {m} instead of at the end of the list. The node \type {m} is not
counted.

This function also accepts string \type {id}'s.

\subsubsection{\type {node.traverse}}

\startfunctioncall
node.traverse(<node> n)
\stopfunctioncall

This is a \LUA\ iterator that loops over the node list that starts at \type {n}.
Typically code looks like this:

\starttyping
for n in node.traverse(head) do
   ...
end
\stoptyping

is functionally equivalent to:

\starttyping
do
  local n
  local function f (head,var)
    local t
    if var == nil then
       t = head
    else
       t = var.next
    end
    return t
  end
  while true do
    n = f (head, n)
    if n == nil then break end
    ...
  end
end
\stoptyping

It should be clear from the definition of the function \type {f} that even though
it is possible to add or remove nodes from the node list while traversing, you
have to take great care to make sure all the \type {next} (and \type {prev})
pointers remain valid.

If the above is unclear to you, see the section \quote {For Statement} in the
\LUA\ Reference Manual.

\subsubsection{\type {node.traverse_id}}

\startfunctioncall
node.traverse_id(<number> id, <node> n)
\stopfunctioncall

This is an iterator that loops over all the nodes in the list that starts at
\type {n} that have a matching \type {id} field.

See the previous section for details. The change is in the local function \type
{f}, which now does an extra while loop checking against the upvalue \type {id}:

\starttyping
 local function f(head,var)
   local t
   if var == nil then
      t = head
   else
      t = var.next
   end
   -- skip nodes of other types; stop at the end of the list
   while t and t.id ~= id do
      t = t.next
   end
   return t
 end
\stoptyping

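In practice the iterator is used directly. The following sketch collects the
character codes of all glyph nodes in a list; the function name \type
{collect_chars} is only chosen for this example.

\starttyping
local glyph_id = node.id("glyph")

-- hypothetical helper: return the character codes of all glyphs in a list
local function collect_chars(head)
    local chars = { }
    for g in node.traverse_id(glyph_id, head) do
        chars[#chars + 1] = g.char
    end
    return chars
end
\stoptyping

Because the loop body only reads fields and does not relink nodes, the caveat
about keeping the \type {next} pointers valid does not apply here.
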
\subsubsection{\type {node.end_of_math}}

\startfunctioncall
node.end_of_math(<node> start)
\stopfunctioncall

Looks for and returns the next \type {math_node} following the \type {start}. If
the given node is a math endnode this helper returns that node, else it follows
the list and returns the next math endnode. If no such node is found nil is
returned.

\subsubsection{\type {node.remove}}

\startfunctioncall
<node> head, current =
    node.remove(<node> head, <node> current)
\stopfunctioncall

This function removes the node \type {current} from the list following \type
{head}. It is your responsibility to make sure it is really part of that list.
The return values are the new \type {head} and \type {current} nodes. The
returned \type {current} is the node following the \type {current} in the calling
argument, and is only passed back as a convenience (or \type {nil}, if there is
no such node). The returned \type {head} is more important, because if the
function is called with \type {current} equal to \type {head}, it will be
changed.

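A typical use is removing nodes while walking a list. The sketch below strips all
kern nodes from a list and frees them; \type {strip_kerns} is a made|-|up name,
and the example assumes that the removed node is not freed by \type {node.remove}
itself, so that freeing it is left to the caller.

\starttyping
local kern_id = node.id("kern")

-- hypothetical helper: remove and free every kern node in a list
local function strip_kerns(head)
    local current = head
    while current do
        if current.id == kern_id then
            local removed = current
            -- both return values are used: head may change when the
            -- first node is removed, current becomes the following node
            head, current = node.remove(head, current)
            node.free(removed)
        else
            current = current.next
        end
    end
    return head
end
\stoptyping

The function returns the possibly changed head, which the caller has to use
instead of the original one.
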
\subsubsection{\type {node.insert_before}}

\startfunctioncall
node.insert_before(<node> head, <node> current, <node> new)
\stopfunctioncall

This function inserts the node \type {new} before \type {current} into the list
following \type {head}. It is your responsibility to make sure that \type
{current} is really part of that list. The return values are the (potentially
mutated) \type {head} and the node \type {new}, set up to be part of the list
(with correct \type {next} field). If \type {head} is initially \type {nil}, it
will become \type {new}.

\subsubsection{\type {node.insert_after}}

\startfunctioncall
node.insert_after(<node> head, <node> current, <node> new)
\stopfunctioncall

This function inserts the node \type {new} after \type {current} into the list
following \type {head}. It is your responsibility to make sure that \type
{current} is really part of that list. The return values are the \type {head} and
the node \type {new}, set up to be part of the list (with correct \type {next}
field). If \type {head} is initially \type {nil}, it will become \type {new}.

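Because an initially empty head is accepted, \type {node.insert_after} can be used
to build a list from scratch. The following sketch appends two glyph nodes this
way. It assumes that a font is active so that \type {font.current()} returns a
usable font id; the character codes are arbitrary.

\starttyping
local head, tail

for _, char in ipairs { 0x61, 0x62 } do
    local g = node.new("glyph")
    g.font = font.current()
    g.char = char
    -- on the first pass head is nil, so the new node becomes the head;
    -- afterwards the new node is linked in after the current tail
    head, tail = node.insert_after(head, tail, g)
end
\stoptyping

After the loop \type {head} points to the first glyph and \type {tail} to the
last one, so further nodes can be appended without walking the list.
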
\subsubsection{\type {node.first_glyph}}

\startfunctioncall
node.first_glyph(<node> n)
node.first_glyph(<node> n, <node> m)
\stopfunctioncall

Returns the first node in the list starting at \type {n} that is a glyph node
with a subtype indicating it is a glyph, or \type {nil}. If \type {m} is given,
processing stops at (and including) that node, otherwise processing stops at the
end of the list.

1641 \subsubsection{\type {node.ligaturing
}}
1644 <node> h, <node> t, <boolean> success =
1645 node.ligaturing(<node> n)
1646 <node> h, <node> t, <boolean> success =
1647 node.ligaturing(<node> n, <node> m)
1650 Apply
\TEX-style ligaturing to the specified nodelist. The tail node
\type {m
} is
1651 optional. The two returned nodes
\type {h
} and
\type {t
} are the new head and
1652 tail (both
\type {n
} and
\type {m
} can change into a new ligature).
\subsubsection{\type {node.kerning}}

<node> h, <node> t, <boolean> success =
    node.kerning(<node> n)
<node> h, <node> t, <boolean> success =
    node.kerning(<node> n, <node> m)

Apply \TEX|-|style kerning to the specified node list. The tail node \type {m} is
optional. The two returned nodes \type {h} and \type {t} are the head and tail
(either one of these can be an inserted kern node, because special kernings with
word boundaries are possible).
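
The two functions are often used together. The helper below, whose name \type
{apply_font_passes} is hypothetical, is a sketch that runs ligaturing first and
kerning second on a list given by its head. It uses \type {node.tail} to find the
last node and ignores the third return value.

\starttyping
local function apply_font_passes(head)
    local tail = node.tail(head)
    -- Both calls may replace the head or the tail, so keep what they return.
    head, tail = node.ligaturing(head, tail)
    head, tail = node.kerning(head, tail)
    return head, tail
end
\stoptyping
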
\subsubsection{\type {node.unprotect_glyphs}}

node.unprotect_glyphs(<node> n)

Subtracts 256 from all glyph node subtypes. This and the next function are
helpers to convert from \type {characters} to \type {glyphs} during node
processing.
\subsubsection{\type {node.protect_glyphs}}

node.protect_glyphs(<node> n)

Adds 256 to all glyph node subtypes in the node list starting at \type {n},
except that if the value is 1, it adds only 255. The special handling of 1 means
that \type {characters} will become \type {glyphs} after subtraction of 256.
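
One possible way to combine the two helpers is sketched below: the glyph subtypes
are lowered, some processing of your own is applied, and the result is protected
again. The function name \type {with_unprotected_glyphs} and its \type {action}
argument are hypothetical, and whether you need such a round trip depends on
your own node processing.

\starttyping
local function with_unprotected_glyphs(head, action)
    node.unprotect_glyphs(head)   -- subtract 256 from the glyph subtypes
    head = action(head) or head   -- action may hand back a new head
    node.protect_glyphs(head)     -- add the offset back again
    return head
end
\stoptyping
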
\subsubsection{\type {node.last_node}}

This function pops the last node from \TEX's \quote {current list}. It returns
that node, or \type {nil} if the current list is empty.
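
A small sketch of how this can be used: the last node is taken off the current
list, its type is reported, and the node is handed back with \type {node.write}.
Putting the node back unchanged is only done for illustration, and the code has
to run at a point where \TEX\ has a current list, for instance inside \type
{\directlua}.

\starttyping
local n = node.last_node()
if n then
    -- n.id is the numeric node type; node.type maps it to a name.
    texio.write_nl("last node: " .. node.type(n.id))
    -- Hand the node back; node.write does not copy it, so do not free n.
    node.write(n)
end
\stoptyping
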
\subsubsection{\type {node.write}}

node.write(<node> n)

This is an experimental function that will append a node list to \TEX's \quote
{current list}. The node list is not deep|-|copied! There is no error checking
either!
\subsubsection{\type {node.protrusion_skippable}}

<boolean> skippable =
    node.protrusion_skippable(<node> n)

Returns \type {true} if, for the purpose of line boundary discovery when
character protrusion is active, this node can be skipped.
\subsection{Glue handling}

\subsubsection{\type {node.setglue}}

You can set the properties of a glue in one go. If you pass no values, the glue
will become a zero glue.

node.setglue(<node> n)
node.setglue(<node> n,width,stretch,shrink,stretch_order,shrink_order)

When you pass values, only arguments that are numbers are assigned, so

node.setglue(n,655360,false,65536)

will only adapt the width and shrink.
\subsubsection{\type {node.getglue}}

The next call will return five values (or nothing when no glue is passed).

<integer> width, <integer> stretch, <integer> shrink, <integer> stretch_order,
    <integer> shrink_order = node.getglue(<node> n)
\subsubsection{\type {node.is_zero_glue}}

This function returns \type {true} when the width, stretch and shrink properties
are zero.

node.is_zero_glue(<node> n)
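
The three glue helpers can be exercised together as in the following sketch. The
values are arbitrary and only serve as an example; the code assumes that it runs
inside \LUATEX.

\starttyping
local g = node.new("glue")

-- 10pt plus 1pt minus 0.5pt, in scaled points; both orders are set to 0.
node.setglue(g, 655360, 65536, 32768, 0, 0)

-- Read all five properties back in one call.
local width, stretch, shrink, stretch_order, shrink_order = node.getglue(g)

-- Without values the glue is reset to a zero glue.
node.setglue(g)
if node.is_zero_glue(g) then
    texio.write_nl("glue is zero")
end

node.free(g)
\stoptyping
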
\subsection{Attribute handling}

Attributes appear as a linked list of userdata objects in the \type {attr} field
of individual nodes. They can be handled individually, but it is much safer and
more efficient to use the dedicated functions associated with them.
\subsubsection{\type {node.has_attribute}}

node.has_attribute(<node> n, <number> id)
node.has_attribute(<node> n, <number> id, <number> val)

Tests if a node has the attribute with number \type {id} set. If \type {val} is
also supplied, also tests if the value matches \type {val}. It returns the value,
or, if no match is found, \type {nil}.
\subsubsection{\type {node.set_attribute}}

node.set_attribute(<node> n, <number> id, <number> val)

Sets the attribute with number \type {id} to the value \type {val}. Duplicate
assignments are ignored. {\em [needs explanation]}
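
The following sketch sets an attribute on a fresh node and then queries it. The
attribute number 100 and the values are arbitrary choices for this example, and
the comments state what the descriptions above lead one to expect.

\starttyping
local n = node.new("glyph")

-- Attach attribute 100 with value 1 to the node.
node.set_attribute(n, 100, 1)

-- Query by number only: the stored value is returned.
local v = node.has_attribute(n, 100)

-- Query with a value that does not match: nil is returned.
local w = node.has_attribute(n, 100, 2)

node.free(n)
\stoptyping
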
\subsubsection{\type {node.unset_attribute}}

node.unset_attribute(<node> n, <number> id)
node.unset_attribute(<node> n, <number> id, <number> val)

Unsets the attribute with number \type {id}. If \type {val} is also supplied, it
will only perform this operation if the value matches \type {val}. Missing
attributes or attribute|-|value pairs are ignored.

If the attribute was actually deleted, returns its old value. Otherwise, returns