1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
2 <html>
3 <head>
4 <title>FFI Semantics</title>
5 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
6 <meta name="Author" content="Mike Pall">
7 <meta name="Copyright" content="Copyright (C) 2005-2011, Mike Pall">
8 <meta name="Language" content="en">
9 <link rel="stylesheet" type="text/css" href="bluequad.css" media="screen">
10 <link rel="stylesheet" type="text/css" href="bluequad-print.css" media="print">
11 <style type="text/css">
12 table.convtable { line-height: 1.2; }
13 tr.convhead td { font-weight: bold; }
14 td.convin { width: 11em; }
15 td.convop { font-style: italic; width: 16em; }
16 </style>
17 </head>
18 <body>
19 <div id="site">
20 <a href="http://luajit.org"><span>Lua<span id="logo">JIT</span></span></a>
21 </div>
22 <div id="head">
23 <h1>FFI Semantics</h1>
24 </div>
25 <div id="nav">
26 <ul><li>
27 <a href="luajit.html">LuaJIT</a>
28 <ul><li>
29 <a href="install.html">Installation</a>
30 </li><li>
31 <a href="running.html">Running</a>
32 </li></ul>
33 </li><li>
34 <a href="extensions.html">Extensions</a>
35 <ul><li>
36 <a href="ext_ffi.html">FFI Library</a>
37 <ul><li>
38 <a href="ext_ffi_tutorial.html">FFI Tutorial</a>
39 </li><li>
40 <a href="ext_ffi_api.html">ffi.* API</a>
41 </li><li>
42 <a class="current" href="ext_ffi_semantics.html">FFI Semantics</a>
43 </li></ul>
44 </li><li>
45 <a href="ext_jit.html">jit.* Library</a>
46 </li><li>
47 <a href="ext_c_api.html">Lua/C API</a>
48 </li></ul>
49 </li><li>
50 <a href="status.html">Status</a>
51 <ul><li>
52 <a href="changes.html">Changes</a>
53 </li></ul>
54 </li><li>
55 <a href="faq.html">FAQ</a>
56 </li><li>
57 <a href="http://luajit.org/performance.html">Performance <span class="ext">&raquo;</span></a>
58 </li><li>
59 <a href="http://luajit.org/download.html">Download <span class="ext">&raquo;</span></a>
60 </li></ul>
61 </div>
62 <div id="main">
63 <p>
64 This page describes the detailed semantics underlying the FFI library
65 and its interaction with both Lua and C&nbsp;code.
66 </p>
67 <p>
68 Given that the FFI library is designed to interface with C&nbsp;code
69 and that declarations can be written in plain C&nbsp;syntax, <b>it
70 closely follows the C&nbsp;language semantics</b>, wherever possible.
71 Some minor concessions are needed for smoother interoperation with Lua
72 language semantics.
73 </p>
<p>
Please don't be overwhelmed by the contents of this page: it's a
reference to consult when in doubt. It doesn't hurt to skim it, but
most of the semantics "just work" as you'd expect. Developers with a
C or C++ background should find it straightforward to write
applications using the LuaJIT FFI.
</p>
82 <p class="indent" style="color: #c00000;">
83 Please note: this doesn't comprise the final specification for the FFI
84 semantics, yet. Some semantics may need to be changed, based on your
85 feedback. Please <a href="contact.html">report</a> any problems you may
86 encounter or any improvements you'd like to see &mdash; thank you!
87 </p>
89 <h2 id="clang">C Language Support</h2>
90 <p>
91 The FFI library has a built-in C&nbsp;parser with a minimal memory
92 footprint. It's used by the <a href="ext_ffi_api.html">ffi.* library
93 functions</a> to declare C&nbsp;types or external symbols.
94 </p>
<p>
Its only purpose is to parse C&nbsp;declarations, e.g. as found in
C&nbsp;header files. Although it does evaluate constant expressions,
it's <em>not</em> a C&nbsp;compiler. The body of <tt>inline</tt>
C&nbsp;function definitions is simply ignored.
</p>
<p>
Also, this is <em>not</em> a validating C&nbsp;parser. It expects and
accepts correctly formed C&nbsp;declarations, but it may choose to
ignore bad declarations or show rather generic error messages. If in
doubt, please check the input against your favorite C&nbsp;compiler.
</p>
<p>
The C&nbsp;parser complies with the <b>C99 language standard</b> plus
the following extensions:
</p>
111 <ul>
113 <li>The <tt>'\e'</tt> escape in character and string literals.</li>
115 <li>The C99/C++ boolean type, declared with the keywords <tt>bool</tt>
116 or <tt>_Bool</tt>.</li>
118 <li>Complex numbers, declared with the keywords <tt>complex</tt> or
119 <tt>_Complex</tt>.</li>
121 <li>Two complex number types: <tt>complex</tt> (aka
122 <tt>complex&nbsp;double</tt>) and <tt>complex&nbsp;float</tt>.</li>
124 <li>Vector types, declared with the GCC <tt>mode</tt> or
125 <tt>vector_size</tt> attribute.</li>
127 <li>Unnamed ('transparent') <tt>struct</tt>/<tt>union</tt> fields
128 inside a <tt>struct</tt>/<tt>union</tt>.</li>
130 <li>Incomplete <tt>enum</tt> declarations, handled like incomplete
131 <tt>struct</tt> declarations.</li>
133 <li>Unnamed <tt>enum</tt> fields inside a
134 <tt>struct</tt>/<tt>union</tt>. This is similar to a scoped C++
135 <tt>enum</tt>, except that declared constants are visible in the
136 global namespace, too.</li>
138 <li>Scoped <tt>static&nbsp;const</tt> declarations inside a
139 <tt>struct</tt>/<tt>union</tt> (from C++).</li>
141 <li>Zero-length arrays (<tt>[0]</tt>), empty
142 <tt>struct</tt>/<tt>union</tt>, variable-length arrays (VLA,
143 <tt>[?]</tt>) and variable-length structs (VLS, with a trailing
144 VLA).</li>
146 <li>C++ reference types (<tt>int&nbsp;&amp;x</tt>).</li>
148 <li>Alternate GCC keywords with '<tt>__</tt>', e.g.
149 <tt>__const__</tt>.</li>
151 <li>GCC <tt>__attribute__</tt> with the following attributes:
152 <tt>aligned</tt>, <tt>packed</tt>, <tt>mode</tt>,
153 <tt>vector_size</tt>, <tt>cdecl</tt>, <tt>fastcall</tt>,
154 <tt>stdcall</tt>.</li>
156 <li>The GCC <tt>__extension__</tt> keyword and the GCC
157 <tt>__alignof__</tt> operator.</li>
159 <li>GCC <tt>__asm__("symname")</tt> symbol name redirection for
160 function declarations.</li>
162 <li>MSVC keywords for fixed-length types: <tt>__int8</tt>,
163 <tt>__int16</tt>, <tt>__int32</tt> and <tt>__int64</tt>.</li>
165 <li>MSVC <tt>__cdecl</tt>, <tt>__fastcall</tt>, <tt>__stdcall</tt>,
166 <tt>__ptr32</tt>, <tt>__ptr64</tt>, <tt>__declspec(align(n))</tt>
167 and <tt>#pragma&nbsp;pack</tt>.</li>
169 <li>All other GCC/MSVC-specific attributes are ignored.</li>
171 </ul>
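<p>
As a rough illustration of a few of these extensions, the sketch below
declares a packed struct with a transparent union and a
variable-length struct. The type names <tt>demo_value_t</tt> and
<tt>demo_buf_t</tt> are made up for this example and are not part of
any library.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
typedef struct {
  union { int32_t i; float f; };  /* Unnamed ('transparent') union. */
  uint8_t tag;
} __attribute__((packed)) demo_value_t;

typedef struct {
  int32_t len;
  uint8_t data[?];                /* Trailing VLA makes this a VLS. */
} demo_buf_t;
]]

local v = ffi.new("demo_value_t")
v.i = 42     -- Members of the transparent union are accessed directly.
v.tag = 1

local buf = ffi.new("demo_buf_t", 16)  -- VLS with 16 data bytes.
buf.len = 16
print(v.i, v.tag, buf.len)
</pre>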
<p>
The following C&nbsp;types are pre-defined by the C&nbsp;parser (like
a <tt>typedef</tt>, except re-declarations will be ignored):
</p>
176 <ul>
178 <li>Vararg handling: <tt>va_list</tt>, <tt>__builtin_va_list</tt>,
179 <tt>__gnuc_va_list</tt>.</li>
181 <li>From <tt>&lt;stddef.h&gt;</tt>: <tt>ptrdiff_t</tt>,
182 <tt>size_t</tt>, <tt>wchar_t</tt>.</li>
184 <li>From <tt>&lt;stdint.h&gt;</tt>: <tt>int8_t</tt>, <tt>int16_t</tt>,
185 <tt>int32_t</tt>, <tt>int64_t</tt>, <tt>uint8_t</tt>,
186 <tt>uint16_t</tt>, <tt>uint32_t</tt>, <tt>uint64_t</tt>,
187 <tt>intptr_t</tt>, <tt>uintptr_t</tt>.</li>
189 </ul>
<p>
You're encouraged to use these types in preference to the
compiler-specific extensions or the target-dependent standard types.
E.g. <tt>char</tt> differs in signedness and <tt>long</tt> differs in
size, depending on the target architecture and platform ABI.
195 </p>
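<p>
A small sketch of why the fixed-width types are the safer choice: their
sizes are the same everywhere, while the results for <tt>long</tt>,
<tt>size_t</tt> and plain <tt>char</tt> depend on the target. No
output is shown for the target-dependent lines for that reason.
</p>
<pre class="code">
local ffi = require("ffi")

-- Fixed-width types: 4 and 8 bytes on every target.
print(ffi.sizeof("int32_t"), ffi.sizeof("uint64_t"))

-- Target-dependent sizes (platform ABI).
print(ffi.sizeof("long"), ffi.sizeof("size_t"))

-- Target-dependent signedness: storing 200 into a plain char reads
-- back as -56 where char is signed and as 200 where it is unsigned.
local c = ffi.new("char[1]")
c[0] = 200
print(c[0])
</pre>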
<p>
The following C&nbsp;features are <b>not</b> supported:
</p>
199 <ul>
201 <li>A declaration must always have a type specifier; it doesn't
202 default to an <tt>int</tt> type.</li>
204 <li>Old-style empty function declarations (K&amp;R) are not allowed.
205 All C&nbsp;functions must have a proper prototype declaration. A
206 function declared without parameters (<tt>int&nbsp;foo();</tt>) is
207 treated as a function taking zero arguments, like in C++.</li>
209 <li>The <tt>long double</tt> C&nbsp;type is parsed correctly, but
210 there's no support for the related conversions, accesses or arithmetic
211 operations.</li>
213 <li>Wide character strings and character literals are not
214 supported.</li>
216 <li><a href="#status">See below</a> for features that are currently
217 not implemented.</li>
219 </ul>
221 <h2 id="convert">C Type Conversion Rules</h2>
223 <h3 id="convert_tolua">Conversions from C&nbsp;types to Lua objects</h3>
<p>
These conversion rules apply for <em>read accesses</em> to
C&nbsp;types: indexing pointers, arrays or
<tt>struct</tt>/<tt>union</tt> types; reading external variables or
constant values; retrieving return values from C&nbsp;calls:
</p>
230 <table class="convtable">
231 <tr class="convhead">
232 <td class="convin">Input</td>
233 <td class="convop">Conversion</td>
234 <td class="convout">Output</td>
235 </tr>
236 <tr class="odd separate">
237 <td class="convin"><tt>int8_t</tt>, <tt>int16_t</tt></td><td class="convop">&rarr;<sup>sign-ext</sup> <tt>int32_t</tt> &rarr; <tt>double</tt></td><td class="convout">number</td></tr>
238 <tr class="even">
239 <td class="convin"><tt>uint8_t</tt>, <tt>uint16_t</tt></td><td class="convop">&rarr;<sup>zero-ext</sup> <tt>int32_t</tt> &rarr; <tt>double</tt></td><td class="convout">number</td></tr>
240 <tr class="odd">
241 <td class="convin"><tt>int32_t</tt>, <tt>uint32_t</tt></td><td class="convop">&rarr; <tt>double</tt></td><td class="convout">number</td></tr>
242 <tr class="even">
243 <td class="convin"><tt>int64_t</tt>, <tt>uint64_t</tt></td><td class="convop">boxed value</td><td class="convout">64 bit int cdata</td></tr>
244 <tr class="odd separate">
245 <td class="convin"><tt>double</tt>, <tt>float</tt></td><td class="convop">&rarr; <tt>double</tt></td><td class="convout">number</td></tr>
246 <tr class="even separate">
247 <td class="convin"><tt>bool</tt></td><td class="convop">0 &rarr; <tt>false</tt>, otherwise <tt>true</tt></td><td class="convout">boolean</td></tr>
248 <tr class="odd separate">
249 <td class="convin">Complex number</td><td class="convop">boxed value</td><td class="convout">complex cdata</td></tr>
250 <tr class="even">
251 <td class="convin">Vector</td><td class="convop">boxed value</td><td class="convout">vector cdata</td></tr>
252 <tr class="odd">
253 <td class="convin">Pointer</td><td class="convop">boxed value</td><td class="convout">pointer cdata</td></tr>
254 <tr class="even separate">
255 <td class="convin">Array</td><td class="convop">boxed reference</td><td class="convout">reference cdata</td></tr>
256 <tr class="odd">
257 <td class="convin"><tt>struct</tt>/<tt>union</tt></td><td class="convop">boxed reference</td><td class="convout">reference cdata</td></tr>
258 </table>
<p>
Bitfields or <tt>enum</tt> types are treated like their underlying
type.
</p>
<p>
Reference types are dereferenced <em>before</em> a conversion can take
place: the conversion is applied to the C&nbsp;type pointed to
by the reference.
267 </p>
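<p>
The read conversions in the table above can be observed directly with
<tt>type()</tt>. This is a minimal sketch; the struct names
<tt>demo_inner</tt> and <tt>demo_outer</tt> are hypothetical and only
serve the example.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
struct demo_inner { int x; };
struct demo_outer { int64_t big; bool flag; struct demo_inner inner; };
]]

local o = ffi.new("struct demo_outer")
local small = ffi.new("int8_t[1]", -1)

print(type(small[0]), small[0])  -- number  -1 (sign-extended)
print(type(o.flag))              -- boolean
print(type(o.big))               -- cdata (boxed 64 bit integer)

-- Aggregates are returned as references, not as copies:
-- writing through the reference changes the original struct.
local ref = o.inner
ref.x = 7
print(o.inner.x)                 -- 7
</pre>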
269 <h3 id="convert_fromlua">Conversions from Lua objects to C&nbsp;types</h3>
<p>
These conversion rules apply for <em>write accesses</em> to
C&nbsp;types: indexing pointers, arrays or
<tt>struct</tt>/<tt>union</tt> types; initializing cdata objects;
casts to C&nbsp;types; writing to external variables; passing
arguments to C&nbsp;calls:
</p>
277 <table class="convtable">
278 <tr class="convhead">
279 <td class="convin">Input</td>
280 <td class="convop">Conversion</td>
281 <td class="convout">Output</td>
282 </tr>
283 <tr class="odd separate">
284 <td class="convin">number</td><td class="convop">&rarr;</td><td class="convout"><tt>double</tt></td></tr>
285 <tr class="even">
286 <td class="convin">boolean</td><td class="convop"><tt>false</tt> &rarr; 0, <tt>true</tt> &rarr; 1</td><td class="convout"><tt>bool</tt></td></tr>
287 <tr class="odd separate">
288 <td class="convin">nil</td><td class="convop"><tt>NULL</tt> &rarr;</td><td class="convout"><tt>(void *)</tt></td></tr>
289 <tr class="even">
290 <td class="convin">userdata</td><td class="convop">userdata payload &rarr;</td><td class="convout"><tt>(void *)</tt></td></tr>
291 <tr class="odd">
292 <td class="convin">lightuserdata</td><td class="convop">lightuserdata address &rarr;</td><td class="convout"><tt>(void *)</tt></td></tr>
293 <tr class="even separate">
294 <td class="convin">string</td><td class="convop">match against <tt>enum</tt> constant</td><td class="convout"><tt>enum</tt></td></tr>
295 <tr class="odd">
296 <td class="convin">string</td><td class="convop">copy string data + zero-byte</td><td class="convout"><tt>int8_t[]</tt>, <tt>uint8_t[]</tt></td></tr>
297 <tr class="even">
298 <td class="convin">string</td><td class="convop">string data &rarr;</td><td class="convout"><tt>const char[]</tt></td></tr>
299 <tr class="odd separate">
300 <td class="convin">table</td><td class="convop"><a href="#init_table">table initializer</a></td><td class="convout">Array</td></tr>
301 <tr class="even">
302 <td class="convin">table</td><td class="convop"><a href="#init_table">table initializer</a></td><td class="convout"><tt>struct</tt>/<tt>union</tt></td></tr>
303 <tr class="odd separate">
304 <td class="convin">cdata</td><td class="convop">cdata payload &rarr;</td><td class="convout">C type</td></tr>
305 </table>
<p>
If the result type of this conversion doesn't match the
C&nbsp;type of the destination, the
<a href="#convert_between">conversion rules between C&nbsp;types</a>
are applied.
311 </p>
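<p>
A short sketch of several write conversions from the table above. The
types <tt>demo_color_t</tt> and <tt>demo_item_t</tt> are invented for
illustration.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
typedef enum { DEMO_RED, DEMO_GREEN, DEMO_BLUE } demo_color_t;
typedef struct {
  demo_color_t color;
  double weight;
  const char *label;
} demo_item_t;
]]

local item = ffi.new("demo_item_t")
item.color = "DEMO_GREEN"  -- String is matched against the enum constants.
item.weight = 3            -- Lua number is stored as a double.
item.label = nil           -- nil is stored as a NULL pointer.

-- A string copied into a byte array includes the terminating zero byte.
local buf = ffi.new("uint8_t[8]", "abc")

print(tonumber(item.color), item.weight, item.label == nil)
print(ffi.string(buf), buf[3])
</pre>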
<p>
Reference types are immutable after initialization ("no re-seating of
references"). For initialization purposes or when passing values to
reference parameters, they are treated like pointers. Note that unlike
in C++, there's no way to implement automatic reference generation of
variables under the Lua language semantics. If you want to call a
function with a reference parameter, you need to explicitly pass a
one-element array.
320 </p>
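<p>
The one-element array idiom looks like this in practice. Plain C has
no reference parameters, so this sketch uses the pointer out-parameter
of the standard <tt>frexp()</tt> function instead; the calling pattern
is the same. It assumes <tt>frexp</tt> can be resolved through the
default <tt>ffi.C</tt> namespace on your platform.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
double frexp(double x, int *exp);
]]

-- The one-element array provides storage the callee can write to.
local e = ffi.new("int[1]")
local m = ffi.C.frexp(8, e)

print(m, e[0])  -- 0.5  4, since 8 = 0.5 * 2^4
</pre>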
322 <h3 id="convert_between">Conversions between C&nbsp;types</h3>
<p>
These conversion rules are more or less the same as the standard
C&nbsp;conversion rules. Some rules only apply to casts, or require
pointer or type compatibility:
</p>
328 <table class="convtable">
329 <tr class="convhead">
330 <td class="convin">Input</td>
331 <td class="convop">Conversion</td>
332 <td class="convout">Output</td>
333 </tr>
334 <tr class="odd separate">
335 <td class="convin">Signed integer</td><td class="convop">&rarr;<sup>narrow or sign-extend</sup></td><td class="convout">Integer</td></tr>
336 <tr class="even">
337 <td class="convin">Unsigned integer</td><td class="convop">&rarr;<sup>narrow or zero-extend</sup></td><td class="convout">Integer</td></tr>
338 <tr class="odd">
339 <td class="convin">Integer</td><td class="convop">&rarr;<sup>round</sup></td><td class="convout"><tt>double</tt>, <tt>float</tt></td></tr>
340 <tr class="even">
341 <td class="convin"><tt>double</tt>, <tt>float</tt></td><td class="convop">&rarr;<sup>trunc</sup> <tt>int32_t</tt> &rarr;<sup>narrow</sup></td><td class="convout"><tt>(u)int8_t</tt>, <tt>(u)int16_t</tt></td></tr>
342 <tr class="odd">
343 <td class="convin"><tt>double</tt>, <tt>float</tt></td><td class="convop">&rarr;<sup>trunc</sup></td><td class="convout"><tt>(u)int32_t</tt>, <tt>(u)int64_t</tt></td></tr>
344 <tr class="even">
345 <td class="convin"><tt>double</tt>, <tt>float</tt></td><td class="convop">&rarr;<sup>round</sup></td><td class="convout"><tt>float</tt>, <tt>double</tt></td></tr>
346 <tr class="odd separate">
347 <td class="convin">Number</td><td class="convop">n == 0 &rarr; 0, otherwise 1</td><td class="convout"><tt>bool</tt></td></tr>
348 <tr class="even">
349 <td class="convin"><tt>bool</tt></td><td class="convop"><tt>false</tt> &rarr; 0, <tt>true</tt> &rarr; 1</td><td class="convout">Number</td></tr>
350 <tr class="odd separate">
351 <td class="convin">Complex number</td><td class="convop">convert real part</td><td class="convout">Number</td></tr>
352 <tr class="even">
353 <td class="convin">Number</td><td class="convop">convert real part, imag = 0</td><td class="convout">Complex number</td></tr>
354 <tr class="odd">
355 <td class="convin">Complex number</td><td class="convop">convert real and imag part</td><td class="convout">Complex number</td></tr>
356 <tr class="even separate">
357 <td class="convin">Number</td><td class="convop">convert scalar and replicate</td><td class="convout">Vector</td></tr>
358 <tr class="odd">
359 <td class="convin">Vector</td><td class="convop">copy (same size)</td><td class="convout">Vector</td></tr>
360 <tr class="even separate">
361 <td class="convin"><tt>struct</tt>/<tt>union</tt></td><td class="convop">take base address (compat)</td><td class="convout">Pointer</td></tr>
362 <tr class="odd">
363 <td class="convin">Array</td><td class="convop">take base address (compat)</td><td class="convout">Pointer</td></tr>
364 <tr class="even">
365 <td class="convin">Function</td><td class="convop">take function address</td><td class="convout">Function pointer</td></tr>
366 <tr class="odd separate">
367 <td class="convin">Number</td><td class="convop">convert via <tt>uintptr_t</tt> (cast)</td><td class="convout">Pointer</td></tr>
368 <tr class="even">
369 <td class="convin">Pointer</td><td class="convop">convert address (compat/cast)</td><td class="convout">Pointer</td></tr>
370 <tr class="odd">
371 <td class="convin">Pointer</td><td class="convop">convert address (cast)</td><td class="convout">Integer</td></tr>
372 <tr class="even">
373 <td class="convin">Array</td><td class="convop">convert base address (cast)</td><td class="convout">Integer</td></tr>
374 <tr class="odd separate">
375 <td class="convin">Array</td><td class="convop">copy (compat)</td><td class="convout">Array</td></tr>
376 <tr class="even">
377 <td class="convin"><tt>struct</tt>/<tt>union</tt></td><td class="convop">copy (identical type)</td><td class="convout"><tt>struct</tt>/<tt>union</tt></td></tr>
378 </table>
<p>
Bitfields or <tt>enum</tt> types are treated like their underlying
type.
</p>
<p>
Conversions not listed above will raise an error. E.g. it's not
possible to convert a pointer to a complex number or vice versa.
386 </p>
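<p>
Some of the rules above, sketched with <tt>ffi.new()</tt> and
<tt>ffi.cast()</tt>. The values are chosen only to make truncation,
narrowing and rounding visible.
</p>
<pre class="code">
local ffi = require("ffi")

print(tonumber(ffi.new("int", 3.9)))      -- 3 (truncated)
print(tonumber(ffi.new("int", -3.9)))     -- -3 (truncation is toward zero)
print(tonumber(ffi.new("uint8_t", 257)))  -- 1 (narrowed to 8 bits)

-- A double stored as a float is rounded to float precision.
print(tonumber(ffi.new("float", 0.1)) == 0.1)  -- false

local arr = ffi.new("int[3]", {10, 20, 30})
local p = ffi.cast("int *", arr)       -- Array: take the base address.
print(p[1])                            -- 20
local addr = ffi.cast("intptr_t", p)   -- Pointer to integer needs a cast.

-- A conversion that is not listed raises an error.
local ok = pcall(ffi.new, "complex", p)
print(ok)                              -- false
</pre>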
388 <h3 id="convert_vararg">Conversions for vararg C&nbsp;function arguments</h3>
<p>
The following default conversion rules apply when passing Lua objects
to the variable argument part of vararg C&nbsp;functions:
</p>
393 <table class="convtable">
394 <tr class="convhead">
395 <td class="convin">Input</td>
396 <td class="convop">Conversion</td>
397 <td class="convout">Output</td>
398 </tr>
399 <tr class="odd separate">
400 <td class="convin">number</td><td class="convop">&rarr;</td><td class="convout"><tt>double</tt></td></tr>
401 <tr class="even">
402 <td class="convin">boolean</td><td class="convop"><tt>false</tt> &rarr; 0, <tt>true</tt> &rarr; 1</td><td class="convout"><tt>bool</tt></td></tr>
403 <tr class="odd separate">
404 <td class="convin">nil</td><td class="convop"><tt>NULL</tt> &rarr;</td><td class="convout"><tt>(void *)</tt></td></tr>
405 <tr class="even">
406 <td class="convin">userdata</td><td class="convop">userdata payload &rarr;</td><td class="convout"><tt>(void *)</tt></td></tr>
407 <tr class="odd">
408 <td class="convin">lightuserdata</td><td class="convop">lightuserdata address &rarr;</td><td class="convout"><tt>(void *)</tt></td></tr>
409 <tr class="even separate">
410 <td class="convin">string</td><td class="convop">string data &rarr;</td><td class="convout"><tt>const char *</tt></td></tr>
411 <tr class="odd separate">
412 <td class="convin"><tt>float</tt> cdata</td><td class="convop">&rarr;</td><td class="convout"><tt>double</tt></td></tr>
413 <tr class="even">
414 <td class="convin">Array cdata</td><td class="convop">take base address</td><td class="convout">Element pointer</td></tr>
415 <tr class="odd">
416 <td class="convin"><tt>struct</tt>/<tt>union</tt> cdata</td><td class="convop">take base address</td><td class="convout"><tt>struct</tt>/<tt>union</tt> pointer</td></tr>
417 <tr class="even">
418 <td class="convin">Function cdata</td><td class="convop">take function address</td><td class="convout">Function pointer</td></tr>
419 <tr class="odd">
420 <td class="convin">Any other cdata</td><td class="convop">no conversion</td><td class="convout">C type</td></tr>
421 </table>
<p>
To pass a Lua object other than a cdata object as a specific type,
you need to override the conversion rules: create a temporary cdata
object with a constructor or a cast and initialize it with the value
to pass.
</p>
<p>
Assuming <tt>x</tt> is a Lua number, here's how to pass it as an
integer to a vararg function:
</p>
<pre class="code">
local ffi = require("ffi")
ffi.cdef[[
int printf(const char *fmt, ...);
]]
-- Box x as a C int so that %d receives an integer, not a double.
ffi.C.printf("integer value: %d\n", ffi.new("int", x))
</pre>
<p>
If you don't do this, the default Lua number &rarr; <tt>double</tt>
conversion rule applies. A vararg C&nbsp;function expecting an integer
will see a garbled or uninitialized value.
</p>
444 <h2 id="init">Initializers</h2>
<p>
Creating a cdata object with
<a href="ext_ffi_api.html#ffi_new"><tt>ffi.new()</tt></a> or the
equivalent constructor syntax always initializes its contents, too.
Different rules apply, depending on the number of optional
initializers and the C&nbsp;types involved:
</p>
452 <ul>
453 <li>If no initializers are given, the object is filled with zero bytes.</li>
455 <li>Scalar types (numbers and pointers) accept a single initializer.
456 The Lua object is <a href="#convert_fromlua">converted to the scalar
457 C&nbsp;type</a>.</li>
459 <li>Valarrays (complex numbers and vectors) are treated like scalars
460 when a single initializer is given. Otherwise they are treated like
461 regular arrays.</li>
463 <li>Aggregate types (arrays and structs) accept either a single
464 <a href="#init_table">table initializer</a> or a flat list of
465 initializers.</li>
467 <li>The elements of an array are initialized, starting at index zero.
468 If a single initializer is given for an array, it's repeated for all
469 remaining elements. This doesn't happen if two or more initializers
470 are given: all remaining uninitialized elements are filled with zero
471 bytes.</li>
473 <li>Byte arrays may also be initialized with a Lua string. This copies
474 the whole string plus a terminating zero-byte. The copy stops early only
475 if the array has a known, fixed size.</li>
477 <li>The fields of a <tt>struct</tt> are initialized in the order of
478 their declaration. Uninitialized fields are filled with zero
479 bytes.</li>
481 <li>Only the first field of a <tt>union</tt> can be initialized with a
482 flat initializer.</li>
484 <li>Elements or fields which are aggregates themselves are initialized
485 with a <em>single</em> initializer, but this may be a table
486 initializer or a compatible aggregate.</li>
488 <li>Excess initializers cause an error.</li>
490 </ul>
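<p>
The rules for flat initializers can be sketched as follows. The struct
name <tt>demo_point</tt> is hypothetical.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
struct demo_point { int x, y; };
]]

local a = ffi.new("int[4]")         -- No initializer: 0, 0, 0, 0
local b = ffi.new("int[4]", 7)      -- Single initializer: 7, 7, 7, 7
local c = ffi.new("int[4]", 1, 2)   -- Two initializers: 1, 2, 0, 0
local s = ffi.new("char[8]", "hi")  -- String plus terminating zero byte
local pt = ffi.new("struct demo_point", 3, 4)  -- x = 3, y = 4

-- Excess initializers cause an error.
local ok = pcall(ffi.new, "struct demo_point", 1, 2, 3)

print(a[3], b[3], c[1], c[2], ffi.string(s), pt.x, pt.y, ok)
</pre>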
492 <h2 id="init_table">Table Initializers</h2>
<p>
The following rules apply if a Lua table is used to initialize an
Array or a <tt>struct</tt>/<tt>union</tt>:
</p>
497 <ul>
499 <li>If the table index <tt>[0]</tt> is non-<tt>nil</tt>, then the
500 table is assumed to be zero-based. Otherwise it's assumed to be
501 one-based.</li>
503 <li>Array elements, starting at index zero, are initialized one-by-one
504 with the consecutive table elements, starting at either index
505 <tt>[0]</tt> or <tt>[1]</tt>. This process stops at the first
506 <tt>nil</tt> table element.</li>
508 <li>If exactly one array element was initialized, it's repeated for
509 all the remaining elements. Otherwise all remaining uninitialized
510 elements are filled with zero bytes.</li>
512 <li>The above logic only applies to arrays with a known fixed size.
513 A VLA is only initialized with the element(s) given in the table.
514 Depending on the use case, you may need to explicitly add a
515 <tt>NULL</tt> or <tt>0</tt> terminator to a VLA.</li>
517 <li>If the table has a non-empty hash part, a
518 <tt>struct</tt>/<tt>union</tt> is initialized by looking up each field
519 name (as a string key) in the table. Each non-<tt>nil</tt> value is
520 used to initialize the corresponding field.</li>
522 <li>Otherwise a <tt>struct</tt>/<tt>union</tt> is initialized in the
523 order of the declaration of its fields. Each field is initialized with
524 the consecutive table elements, starting at either index <tt>[0]</tt>
525 or <tt>[1]</tt>. This process stops at the first <tt>nil</tt> table
526 element.</li>
528 <li>Uninitialized fields of a <tt>struct</tt> are filled with zero
529 bytes, except for the trailing VLA of a VLS.</li>
531 <li>Initialization of a <tt>union</tt> stops after one field has been
532 initialized. If no field has been initialized, the <tt>union</tt> is
533 filled with zero bytes.</li>
535 <li>Elements or fields which are aggregates themselves are initialized
536 with a <em>single</em> initializer, but this may be a nested table
537 initializer (or a compatible aggregate).</li>
539 <li>Excess initializers for an array cause an error. Excess
540 initializers for a <tt>struct</tt>/<tt>union</tt> are ignored.
541 Unrelated table entries are ignored, too.</li>
543 </ul>
<p>
Example:
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
struct foo { int a, b; };
union bar { int i; double d; };
struct nested { int x; struct foo y; };
]]

ffi.new("int[3]", {})            --> 0, 0, 0
ffi.new("int[3]", {1})           --> 1, 1, 1
ffi.new("int[3]", {1,2})         --> 1, 2, 0
ffi.new("int[3]", {1,2,3})       --> 1, 2, 3
ffi.new("int[3]", {[0]=1})       --> 1, 1, 1
ffi.new("int[3]", {[0]=1,2})     --> 1, 2, 0
ffi.new("int[3]", {[0]=1,2,3})   --> 1, 2, 3
ffi.new("int[3]", {[0]=1,2,3,4}) --> error: too many initializers

ffi.new("struct foo", {})            --> a = 0, b = 0
ffi.new("struct foo", {1})           --> a = 1, b = 0
ffi.new("struct foo", {1,2})         --> a = 1, b = 2
ffi.new("struct foo", {[0]=1,2})     --> a = 1, b = 2
ffi.new("struct foo", {b=2})         --> a = 0, b = 2
ffi.new("struct foo", {a=1,b=2,c=3}) --> a = 1, b = 2  'c' is ignored

ffi.new("union bar", {})        --> i = 0, d = 0.0
ffi.new("union bar", {1})       --> i = 1, d = ?
ffi.new("union bar", {[0]=1,2}) --> i = 1, d = ?    '2' is ignored
ffi.new("union bar", {d=2})     --> i = ?, d = 2.0

ffi.new("struct nested", {1,{2,3}})     --> x = 1, y.a = 2, y.b = 3
ffi.new("struct nested", {x=1,y={2,3}}) --> x = 1, y.a = 2, y.b = 3
</pre>
581 <h2 id="cdata_ops">Operations on cdata Objects</h2>
<p>
All of the standard Lua operators can be applied to cdata objects or a
mix of a cdata object and another Lua object. The following list shows
the valid combinations. All other combinations currently raise an
error.
</p>
<p>
Reference types are dereferenced <em>before</em> performing each of
the operations below: the operation is applied to the
C&nbsp;type pointed to by the reference.
</p>
<p>
The pre-defined operations are always tried before deferring to a
metamethod for a ctype (if defined).
</p>
598 <h3 id="cdata_array">Indexing a cdata object</h3>
599 <ul>
601 <li><b>Indexing a pointer/array</b>: a cdata pointer/array can be
602 indexed by a cdata number or a Lua number. The element address is
603 computed as the base address plus the number value multiplied by the
604 element size in bytes. A read access loads the element value and
605 <a href="#convert_tolua">converts it to a Lua object</a>. A write
606 access <a href="#convert_fromlua">converts a Lua object to the element
607 type</a> and stores the converted value to the element. An error is
608 raised if the element size is undefined or a write access to a
609 constant element is attempted.</li>
611 <li><b>Dereferencing a <tt>struct</tt>/<tt>union</tt> field</b>: a
612 cdata <tt>struct</tt>/<tt>union</tt> or a pointer to a
613 <tt>struct</tt>/<tt>union</tt> can be dereferenced by a string key,
614 giving the field name. The field address is computed as the base
615 address plus the relative offset of the field. A read access loads the
616 field value and <a href="#convert_tolua">converts it to a Lua
617 object</a>. A write access <a href="#convert_fromlua">converts a Lua
618 object to the field type</a> and stores the converted value to the
619 field. An error is raised if a write access to a constant
620 <tt>struct</tt>/<tt>union</tt> or a constant field is attempted.</li>
<li><b>Indexing a complex number</b>: a complex number can be indexed
either by a cdata number or a Lua number with the values 0 or 1, or by
the strings <tt>"re"</tt> or <tt>"im"</tt>. A read access loads the
real part (<tt>[0]</tt>, <tt>.re</tt>) or the imaginary part
(<tt>[1]</tt>, <tt>.im</tt>) of a complex number and
<a href="#convert_tolua">converts it to a Lua number</a>. The
sub-parts of a complex number are immutable: assigning to an
index of a complex number raises an error. Accessing out-of-bound
indexes returns unspecified results, but is guaranteed not to trigger
memory access violations.</li>
633 <li><b>Indexing a vector</b>: a vector is treated like an array for
634 indexing purposes, except the vector elements are immutable &mdash;
635 assigning to an index of a vector raises an error.</li>
637 </ul>
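<p>
A minimal sketch of array, pointer and field indexing, including the
error raised for a write through a pointer to a constant
<tt>struct</tt>. The struct name <tt>demo_rec</tt> is made up for this
example.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
struct demo_rec { int id; double score; };
]]

local recs = ffi.new("struct demo_rec[2]")
local p = ffi.cast("struct demo_rec *", recs)

recs[1].id = 5     -- Index the array, then dereference the field.
p.id = 1           -- A string key on a pointer accesses the pointed-to struct.
p[1].score = 2.5   -- Pointers can be indexed like arrays.

print(recs[0].id, p[1].id, recs[1].score)  -- 1  5  2.5

-- Writing through a pointer to a constant struct raises an error.
local cp = ffi.cast("const struct demo_rec *", recs)
local ok = pcall(function() cp.id = 9 end)
print(ok, recs[0].id)                      -- false  1
</pre>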
<p>
Note: since there's (deliberately) no address-of operator, a cdata
object holding a value type is effectively immutable after
initialization. The JIT compiler benefits from this fact when applying
certain optimizations.
</p>
<p>
As a consequence, the <em>elements</em> of complex numbers and
vectors are immutable. But the elements of an aggregate holding these
types <em>may</em> be modified, of course. I.e. you cannot assign to
<tt>foo.c.im</tt>, but you can assign a (newly created) complex number
to <tt>foo.c</tt>.
650 </p>
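<p>
Sketched in code, assuming a hypothetical struct
<tt>demo_signal</tt> with a single complex field <tt>c</tt>:
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
struct demo_signal { complex c; };
]]

local z = ffi.new("complex", 1, 2)
print(z.re, z.im, z[0], z[1])  -- 1  2  1  2

-- The sub-parts of a complex number cannot be assigned to.
local ok_part = pcall(function() z.im = 5 end)

-- The complex field of an aggregate can be replaced as a whole.
local foo = ffi.new("struct demo_signal")
foo.c = ffi.new("complex", 3, 4)
local ok_field = pcall(function() foo.c.im = 0 end)

print(ok_part, ok_field, foo.c.re, foo.c.im)  -- false  false  3  4
</pre>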
652 <h3 id="cdata_call">Calling a cdata object</h3>
653 <ul>
655 <li><b>Constructor</b>: a ctype object can be called and used as a
656 <a href="ext_ffi_api.html#ffi_new">constructor</a>.</li>
<li><b>C&nbsp;function call</b>: a cdata function or cdata function
pointer can be called. The passed arguments are
<a href="#convert_fromlua">converted to the C&nbsp;types</a> of the
parameters given by the function declaration. Arguments passed to the
variable argument part of a vararg C&nbsp;function use
<a href="#convert_vararg">special conversion rules</a>. The
C&nbsp;function is called and the return value (if any) is
<a href="#convert_tolua">converted to a Lua object</a>.<br>
On Windows/x86 systems, <tt>__stdcall</tt> functions are automatically
detected and a function declared as <tt>__cdecl</tt> (the default) is
silently fixed up after the first call.</li>
670 </ul>
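<p>
Both kinds of calls in a short sketch. It assumes the standard C
functions <tt>abs()</tt> and <tt>strlen()</tt> can be resolved through
<tt>ffi.C</tt>; the struct name <tt>demo_vec2</tt> is hypothetical.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
int abs(int x);
size_t strlen(const char *s);
struct demo_vec2 { double x, y; };
]]

-- Calling a ctype object constructs a new cdata object.
local vec2 = ffi.typeof("struct demo_vec2")
local v = vec2(1.5, 2.5)

-- Arguments are converted to the declared parameter types,
-- the return value is converted back to a Lua object.
print(ffi.C.abs(-5))             -- 5

-- A size_t result is a boxed 64 bit integer on 64 bit targets,
-- so convert it explicitly before using it as a Lua number.
local n = ffi.C.strlen("hello")
print(tonumber(n), v.x, v.y)     -- 5  1.5  2.5
</pre>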
672 <h3 id="cdata_arith">Arithmetic on cdata objects</h3>
673 <ul>
675 <li><b>Pointer arithmetic</b>: a cdata pointer/array and a cdata
676 number or a Lua number can be added or subtracted. The number must be
677 on the right hand side for a subtraction. The result is a pointer of
678 the same type with an address plus or minus the number value
679 multiplied by the element size in bytes. An error is raised if the
680 element size is undefined.</li>
682 <li><b>Pointer difference</b>: two compatible cdata pointers/arrays
683 can be subtracted. The result is the difference between their
684 addresses, divided by the element size in bytes. An error is raised if
685 the element size is undefined or zero.</li>
687 <li><b>64&nbsp;bit integer arithmetic</b>: the standard arithmetic
688 operators (<tt>+&nbsp;-&nbsp;*&nbsp;/&nbsp;%&nbsp;^</tt> and unary
689 minus) can be applied to two cdata numbers, or a cdata number and a
690 Lua number. If one of them is an <tt>uint64_t</tt>, the other side is
691 converted to an <tt>uint64_t</tt> and an unsigned arithmetic operation
692 is performed. Otherwise both sides are converted to an
693 <tt>int64_t</tt> and a signed arithmetic operation is performed. The
694 result is a boxed 64&nbsp;bit cdata object.<br>
696 These rules ensure that 64&nbsp;bit integers are "sticky". Any
697 expression involving at least one 64&nbsp;bit integer operand results
698 in another one. The undefined cases for the division, modulo and power
699 operators return <tt>2LL&nbsp;^&nbsp;63</tt> or
700 <tt>2ULL&nbsp;^&nbsp;63</tt>.<br>
702 You'll have to explicitly convert a 64&nbsp;bit integer to a Lua
703 number (e.g. for regular floating-point calculations) with
<tt>tonumber()</tt>. Note that this may lose precision.</li>
706 </ul>
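<p>
A short sketch of the three kinds of arithmetic described above. The
variable names are only illustrative.
</p>
<pre class="code">
local ffi = require("ffi")

local a = ffi.new("int[10]")

-- Pointer arithmetic: array + number yields an int * advanced by
-- 3 elements (not 3 bytes).
local p = a + 3
p[0] = 42            -- Stores to the same element as a[3].

-- Pointer difference: the result is measured in elements.
local q = a + 7
local diff = q - p

-- 64 bit integers are "sticky": the Lua number is converted first.
local x = 1LL + 2    -- Signed 64 bit arithmetic.
local y = 2ULL - 3   -- Unsigned 64 bit arithmetic wraps around.

-- Explicit conversion back to a Lua number.
local z = tonumber(x) / 2
</pre>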
708 <h3 id="cdata_comp">Comparisons of cdata objects</h3>
709 <ul>
711 <li><b>Pointer comparison</b>: two compatible cdata pointers/arrays
712 can be compared. The result is the same as an unsigned comparison of
713 their addresses. <tt>nil</tt> is treated like a <tt>NULL</tt> pointer,
714 which is compatible with any other pointer type.</li>
716 <li><b>64&nbsp;bit integer comparison</b>: two cdata numbers, or a
717 cdata number and a Lua number can be compared with each other. If one
of them is a <tt>uint64_t</tt>, the other side is converted to a
<tt>uint64_t</tt> and an unsigned comparison is performed. Otherwise
720 both sides are converted to an <tt>int64_t</tt> and a signed
721 comparison is performed.</li>
723 </ul>
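<p>
The comparison rules can be illustrated with a few expressions. This
is a sketch only; the last line shows the perhaps surprising result of
mixing signed and unsigned 64&nbsp;bit operands.
</p>
<pre class="code">
local ffi = require("ffi")

local a = ffi.new("int[4]")
local p, q = a + 1, a + 2
local null = ffi.new("int *")      -- Zero-initialized: a NULL pointer.

local ptr_less = p &lt; q             -- Addresses are compared.
local is_null  = (null == nil)     -- nil is treated like NULL.
local not_null = (p ~= nil)

local signed_less   = -1LL &lt; 1LL   -- Signed comparison.
local unsigned_less = -1LL &lt; 1ULL  -- -1 is converted to uint64_t first.
</pre>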
725 <h3 id="cdata_key">cdata objects as table keys</h3>
<p>
727 Lua tables may be indexed by cdata objects, but this doesn't provide
728 any useful semantics &mdash; <b>cdata objects are unsuitable as table
729 keys!</b>
730 </p>
<p>
732 A cdata object is treated like any other garbage-collected object and
733 is hashed and compared by its address for table indexing. Since
734 there's no interning for cdata value types, the same value may be
735 boxed in different cdata objects with different addresses. Thus
736 <tt>t[1LL+1LL]</tt> and <tt>t[2LL]</tt> usually <b>do not</b> point to
737 the same hash slot and they certainly <b>do not</b> point to the same
738 hash slot as <tt>t[2]</tt>.
739 </p>
<p>
Adding extra handling for by-value hashing and comparisons to Lua
tables would seriously drive up implementation complexity and slow
down the common case. Given the ubiquity of their use inside the VM,
this is not acceptable.
745 </p>
<p>
747 There are three viable alternatives, if you really need to use cdata
748 objects as keys:
749 </p>
750 <ul>
752 <li>If you can get by with the precision of Lua numbers
753 (52&nbsp;bits), then use <tt>tonumber()</tt> on a cdata number or
754 combine multiple fields of a cdata aggregate to a Lua number. Then use
755 the resulting Lua number as a key when indexing tables.<br>
756 One obvious benefit: <tt>t[tonumber(2LL)]</tt> <b>does</b> point to
757 the same slot as <tt>t[2]</tt>.</li>
759 <li>Otherwise use either <tt>tostring()</tt> on 64&nbsp;bit integers
760 or complex numbers or combine multiple fields of a cdata aggregate to
761 a Lua string (e.g. with
762 <a href="ext_ffi_api.html#ffi_string"><tt>ffi.string()</tt></a>). Then
763 use the resulting Lua string as a key when indexing tables.</li>
765 <li>Create your own specialized hash table implementation using the
766 C&nbsp;types provided by the FFI library, just like you would in
767 C&nbsp;code. Ultimately this may give much better performance than the
768 other alternatives or what a generic by-value hash table could
769 possibly provide.</li>
771 </ul>
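<p>
The first two alternatives might look like this in practice. The
<tt>point_t</tt> type and the <tt>point_key()</tt> helper are
hypothetical and only serve to show how several fields can be combined
into a single Lua number.
</p>
<pre class="code">
local ffi = require("ffi")

local t = {}

-- Alternative 1: convert the 64 bit integer to a Lua number.
t[tonumber(2LL)] = "number key"     -- Same slot as t[2].

-- Alternative 2: convert it to a Lua string.
t[tostring(1LL + 1LL)] = "string key"  -- Same slot as t["2LL"].

-- Combining the fields of an aggregate into one Lua number.
ffi.cdef[[
typedef struct { uint16_t x, y; } point_t;
]]
local function point_key(p)
  return p.x * 65536 + p.y
end
t[point_key(ffi.new("point_t", 3, 4))] = "point key"
</pre>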
773 <h2 id="gc">Garbage Collection of cdata Objects</h2>
<p>
All explicitly (<tt>ffi.new()</tt>, <tt>ffi.cast()</tt> etc.) or
implicitly (accessors) created cdata objects are garbage collected.
You need to make sure to retain valid references to cdata objects
somewhere on a Lua stack, in an upvalue or in a Lua table while they
are still in use. Once the last reference to a cdata object is gone,
the garbage collector will automatically free the memory used by it
(at the end of the next GC cycle).
782 </p>
<p>
Please note that pointers themselves are cdata objects, but they
are <b>not</b> followed by the garbage collector. So e.g. if you
assign a cdata array to a pointer, you must keep the cdata object
holding the array alive as long as the pointer is still in use:
788 </p>
<pre class="code">
ffi.cdef[[
typedef struct { int *a; } foo_t;
]]

local s = ffi.new("foo_t", ffi.new("int[10]")) -- <span style="color:#c00000;">WRONG!</span>

local a = ffi.new("int[10]") -- <span style="color:#00a000;">OK</span>
local s = ffi.new("foo_t", a)
-- Now do something with 's', but keep 'a' alive until you're done.
</pre>
<p>
Similar rules apply for Lua strings which are implicitly converted to
<tt>"const&nbsp;char&nbsp;*"</tt>: the string object itself must be
referenced somewhere or it'll be garbage collected eventually. The
pointer will then point to stale data, which may have already been
overwritten. Note that <em>string literals</em> are automatically kept
alive as long as the function containing them (actually its prototype)
is not garbage collected.
808 </p>
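<p>
For strings, the same pattern might look like the following sketch.
The <tt>named_t</tt> type and the <tt>make_name()</tt> helper are
hypothetical; the point is that a dynamically created string needs its
own reference for as long as the pointer is in use.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
typedef struct { const char *name; } named_t;
]]

-- Returns a dynamically created string (not a string literal).
local function make_name(i)
  return "item-" .. i
end

local s = ffi.new("named_t")
-- s.name = make_name(1)   -- WRONG: nothing references the string.

local name = make_name(1)  -- OK: 'name' keeps the string alive.
s.name = name              -- Implicit conversion to const char *.
local copy = ffi.string(s.name)
-- Keep 'name' alive until you're done with 's'.
</pre>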
<p>
Objects which are passed as an argument to an external C&nbsp;function
are kept alive until the call returns. So it's generally safe to
create temporary cdata objects in argument lists. This is a common
idiom for <a href="#convert_vararg">passing specific C&nbsp;types to
vararg functions</a>.
815 </p>
<p>
817 Memory areas returned by C functions (e.g. from <tt>malloc()</tt>)
818 must be manually managed, of course (or use
819 <a href="ext_ffi_api.html#ffi_gc"><tt>ffi.gc()</tt></a>). Pointers to
820 cdata objects are indistinguishable from pointers returned by C
821 functions (which is one of the reasons why the GC cannot follow them).
822 </p>
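<p>
A minimal sketch of both options, assuming <tt>malloc()</tt> and
<tt>free()</tt> are available from the default namespace and that the
allocation succeeds. Note that the finalizer is attached to the cdata
object <tt>p</tt> only, not to other pointers derived from it.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

-- Let the garbage collector call free() once 'p' is unreachable.
local p = ffi.gc(ffi.C.malloc(64), ffi.C.free)

-- A derived pointer is a separate cdata object without a finalizer.
-- 'p' must stay referenced for as long as 'bytes' is in use.
local bytes = ffi.cast("uint8_t *", p)
bytes[0] = 255
local first = bytes[0]

-- Manual management: remove the finalizer, then free explicitly.
ffi.C.free(ffi.gc(p, nil))
p, bytes = nil, nil
</pre>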
824 <h2 id="clib">C Library Namespaces</h2>
<p>
826 A C&nbsp;library namespace is a special kind of object which allows
827 access to the symbols contained in shared libraries or the default
828 symbol namespace. The default
829 <a href="ext_ffi_api.html#ffi_C"><tt>ffi.C</tt></a> namespace is
830 automatically created when the FFI library is loaded. C&nbsp;library
831 namespaces for specific shared libraries may be created with the
832 <a href="ext_ffi_api.html#ffi_load"><tt>ffi.load()</tt></a> API
833 function.
834 </p>
<p>
836 Indexing a C&nbsp;library namespace object with a symbol name (a Lua
837 string) automatically binds it to the library. First the symbol type
838 is resolved &mdash; it must have been declared with
839 <a href="ext_ffi_api.html#ffi_cdef"><tt>ffi.cdef</tt></a>. Then the
840 symbol address is resolved by searching for the symbol name in the
841 associated shared libraries or the default symbol namespace. Finally,
842 the resulting binding between the symbol name, the symbol type and its
843 address is cached. Missing symbol declarations or nonexistent symbol
844 names cause an error.
845 </p>
<p>
847 This is what happens on a <b>read access</b> for the different kinds of
848 symbols:
849 </p>
850 <ul>
852 <li>External functions: a cdata object with the type of the function
853 and its address is returned.</li>
855 <li>External variables: the symbol address is dereferenced and the
856 loaded value is <a href="#convert_tolua">converted to a Lua object</a>
857 and returned.</li>
859 <li>Constant values (<tt>static&nbsp;const</tt> or <tt>enum</tt>
860 constants): the constant is <a href="#convert_tolua">converted to a
861 Lua object</a> and returned.</li>
863 </ul>
<p>
865 This is what happens on a <b>write access</b>:
866 </p>
867 <ul>
869 <li>External variables: the value to be written is
870 <a href="#convert_fromlua">converted to the C&nbsp;type</a> of the
871 variable and then stored at the symbol address.</li>
873 <li>Writing to constant variables or to any other symbol type causes
874 an error, like any other attempted write to a constant location.</li>
876 </ul>
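<p>
The read and write rules above can be tried out with a few
declarations. All <tt>DEMO_*</tt> and <tt>demo_*</tt> names are made
up for this sketch; <tt>abs()</tt> is assumed to be resolvable through
the default namespace.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
int abs(int x);
enum { DEMO_ANSWER = 42 };
static const int DEMO_LIMIT = 10;
int demo_undefined_symbol(void);
]]

local C = ffi.C

-- Read access to an external function: returns a function cdata.
local fn = C.abs
local kind = type(fn)
local five = fn(-5)

-- Read access to constants: converted to Lua objects.
local answer = C.DEMO_ANSWER
local limit = C.DEMO_LIMIT

-- Write access to a constant raises an error.
local write_ok = pcall(function() C.DEMO_LIMIT = 11 end)

-- A missing declaration or a nonexistent symbol raises an error.
local undeclared_ok = pcall(function() return C.demo_never_declared end)
local unresolved_ok = pcall(function() return C.demo_undefined_symbol end)
</pre>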
<p>
C&nbsp;library namespaces themselves are garbage collected objects. If
the last reference to the namespace object is gone, the garbage
collector will eventually release the shared library reference and
remove all memory associated with the namespace. Since this may
trigger the removal of the shared library from the memory of the
running process, it's generally <em>not safe</em> to use function
cdata objects obtained from a library if the namespace object may be
unreferenced.
886 </p>
<p>
Performance notice: the JIT compiler specializes to the identity of
namespace objects and to the strings used to index them. This
effectively turns function cdata objects into constants. It's not
useful and actually counter-productive to explicitly cache these
function objects, e.g. <tt>local strlen = ffi.C.strlen</tt>. OTOH it
<em>is</em> useful to cache the namespace itself, e.g. <tt>local C =
ffi.C</tt>.
895 </p>
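<p>
In a loop, the recommended pattern might look like this sketch. The
<tt>total_length()</tt> helper is hypothetical and <tt>strlen()</tt>
is assumed to be resolvable through the default namespace.
</p>
<pre class="code">
local ffi = require("ffi")

ffi.cdef[[
size_t strlen(const char *s);
]]

local C = ffi.C   -- Cache the namespace, not the function.

local function total_length(strings)
  local total = 0
  for i = 1, #strings do
    -- Index the namespace at the call site. The size_t result may be
    -- a boxed 64 bit integer, so convert it explicitly.
    total = total + tonumber(C.strlen(strings[i]))
  end
  return total
end

local n = total_length({ "a", "bc", "def" })
</pre>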
897 <h2 id="policy">No Hand-holding!</h2>
<p>
899 The FFI library has been designed as <b>a low-level library</b>. The
900 goal is to interface with C&nbsp;code and C&nbsp;data types with a
901 minimum of overhead. This means <b>you can do anything you can do
902 from&nbsp;C</b>: access all memory, overwrite anything in memory, call
903 machine code at any memory address and so on.
904 </p>
<p>
906 The FFI library provides <b>no memory safety</b>, unlike regular Lua
907 code. It will happily allow you to dereference a <tt>NULL</tt>
908 pointer, to access arrays out of bounds or to misdeclare
909 C&nbsp;functions. If you make a mistake, your application might crash,
910 just like equivalent C&nbsp;code would.
911 </p>
<p>
913 This behavior is inevitable, since the goal is to provide full
914 interoperability with C&nbsp;code. Adding extra safety measures, like
915 bounds checks, would be futile. There's no way to detect
916 misdeclarations of C&nbsp;functions, since shared libraries only
917 provide symbol names, but no type information. Likewise there's no way
918 to infer the valid range of indexes for a returned pointer.
919 </p>
<p>
Again: the FFI library is a low-level library. This implies it needs
to be used with care, but its flexibility and performance often
outweigh this concern. If you're a C or C++ developer, it'll be easy
to apply your existing knowledge. OTOH writing code for the FFI
library is not for the faint of heart and probably shouldn't be the
first exercise for someone with little experience in Lua, C or C++.
927 </p>
<p>
929 As a corollary of the above, the FFI library is <b>not safe for use by
930 untrusted Lua code</b>. If you're sandboxing untrusted Lua code, you
931 definitely don't want to give this code access to the FFI library or
932 to <em>any</em> cdata object (except 64&nbsp;bit integers or complex
933 numbers). Any properly engineered Lua sandbox needs to provide safety
934 wrappers for many of the standard Lua library functions &mdash;
935 similar wrappers need to be written for high-level operations on FFI
936 data types, too.
937 </p>
939 <h2 id="status">Current Status</h2>
<p>
941 The initial release of the FFI library has some limitations and is
942 missing some features. Most of these will be fixed in future releases.
943 </p>
<p>
945 <a href="#clang">C language support</a> is
946 currently incomplete:
947 </p>
948 <ul>
949 <li>C&nbsp;declarations are not passed through a C&nbsp;pre-processor,
950 yet.</li>
951 <li>The C&nbsp;parser is able to evaluate most constant expressions
952 commonly found in C&nbsp;header files. However it doesn't handle the
953 full range of C&nbsp;expression semantics and may fail for some
954 obscure constructs.</li>
955 <li><tt>static const</tt> declarations only work for integer types
956 up to 32&nbsp;bits. Neither declaring string constants nor
957 floating-point constants is supported.</li>
958 <li>Packed <tt>struct</tt> bitfields that cross container boundaries
959 are not implemented.</li>
960 <li>Native vector types may be defined with the GCC <tt>mode</tt> or
961 <tt>vector_size</tt> attribute. But no operations other than loading,
962 storing and initializing them are supported, yet.</li>
963 <li>The <tt>volatile</tt> type qualifier is currently ignored by
964 compiled code.</li>
965 <li><a href="ext_ffi_api.html#ffi_cdef"><tt>ffi.cdef</tt></a> silently
966 ignores all re-declarations.</li>
967 </ul>
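<p>
Since declarations are not passed through a C&nbsp;pre-processor,
<tt>#define</tt> constants from a header have to be rewritten by hand.
One possible way to do this, using a made-up header with
<tt>DEMO_*</tt> names, is to turn them into <tt>enum</tt> or
<tt>static&nbsp;const</tt> declarations:
</p>
<pre class="code">
local ffi = require("ffi")

-- Hypothetical C header contents:
--   #define DEMO_BUF_SIZE   4096
--   #define DEMO_MAX_ITEMS  16
--   #define DEMO_FLAG_A     (1 &lt;&lt; 0)
--   #define DEMO_FLAG_B     (1 &lt;&lt; 4)

ffi.cdef[[
enum {
  DEMO_BUF_SIZE = 4096,
  DEMO_FLAG_A   = 1 &lt;&lt; 0,
  DEMO_FLAG_B   = 1 &lt;&lt; 4,
  DEMO_FLAG_AB  = DEMO_FLAG_A | DEMO_FLAG_B
};
static const int DEMO_MAX_ITEMS = 16;
typedef struct { char data[DEMO_BUF_SIZE]; } demo_buf_t;
]]

local flags = ffi.C.DEMO_FLAG_AB
local max_items = ffi.C.DEMO_MAX_ITEMS
local size = ffi.sizeof("demo_buf_t")
</pre>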
<p>
The JIT compiler already handles a large subset of all FFI operations.
It automatically falls back to the interpreter for unimplemented
operations (you can check for this with the
<a href="running.html#opt_j"><tt>-jv</tt></a> command line option).
The following operations are currently not compiled and may exhibit
suboptimal performance, especially when used in inner loops:
975 </p>
976 <ul>
977 <li>Array/<tt>struct</tt> copies and bulk initializations.</li>
978 <li>Bitfield accesses and initializations.</li>
979 <li>Vector operations.</li>
980 <li>Table initializers.</li>
981 <li>Initialization of nested <tt>struct</tt>/<tt>union</tt> types.</li>
982 <li>Allocations of variable-length arrays or structs.</li>
983 <li>Allocations of C&nbsp;types with a size &gt; 64&nbsp;bytes or an
984 alignment &gt; 8&nbsp;bytes.</li>
985 <li>Conversions from lightuserdata to <tt>void&nbsp;*</tt>.</li>
986 <li>Pointer differences for element sizes that are not a power of
987 two.</li>
988 <li>Calls to C&nbsp;functions with aggregates passed or returned by
989 value.</li>
990 <li>Calls to ctype metamethods which are not plain functions.</li>
991 <li>ctype <tt>__newindex</tt> tables and non-string lookups in ctype
992 <tt>__index</tt> tables.</li>
993 <li><tt>tostring()</tt> for cdata types.</li>
994 <li>Calls to the following <a href="ext_ffi_api.html">ffi.* API</a>
995 functions: <tt>cdef</tt>, <tt>load</tt>, <tt>typeof</tt>,
996 <tt>metatype</tt>, <tt>gc</tt>, <tt>sizeof</tt>, <tt>alignof</tt>,
997 <tt>offsetof</tt>, <tt>errno</tt>.</li>
998 </ul>
<p>
1000 Other missing features:
1001 </p>
1002 <ul>
1003 <li>Bit operations for 64&nbsp;bit types.</li>
1004 <li>Arithmetic for <tt>complex</tt> numbers.</li>
1005 <li>Callbacks from C&nbsp;code to Lua functions.</li>
1006 <li>Passing structs by value to vararg C&nbsp;functions.</li>
1007 <li><a href="extensions.html#exceptions">C++ exception interoperability</a>
1008 does not extend to C&nbsp;functions called via the FFI, if the call is
1009 compiled.</li>
1010 </ul>
1011 <br class="flush">
1012 </div>
1013 <div id="foot">
1014 <hr class="hide">
1015 Copyright &copy; 2005-2011 Mike Pall
1016 <span class="noprint">
1017 &middot;
1018 <a href="contact.html">Contact</a>
1019 </span>
1020 </div>
1021 </body>
1022 </html>