1 # extension.rdoc - -*- RDoc -*- created at: Mon Aug 7 16:45:54 JST 1995
3 = Creating extension libraries for Ruby
5 This document explains how to make extension libraries for Ruby.
9 In C, variables have types and data do not have types. In contrast,
10 Ruby variables do not have a static type, and data themselves have
11 types, so data will need to be converted between the languages.
13 Objects in Ruby are represented by the C type `VALUE'. Each VALUE
14 data has its data type.
16 To retrieve C data from a VALUE, you need to:
18 1. Identify the VALUE's data type
19 2. Convert the VALUE into C data
21 Converting to the wrong data type may cause serious problems.
25 The Ruby interpreter has the following data types:
28 T_OBJECT :: ordinary object
31 T_FLOAT :: floating point number
33 T_REGEXP :: regular expression
35 T_HASH :: associative array
36 T_STRUCT :: (Ruby) structure
37 T_BIGNUM :: multi precision integer
38 T_FIXNUM :: Fixnum(31bit or 63bit integer)
39 T_COMPLEX :: complex number
40 T_RATIONAL :: rational number
47 In addition, there are several other types used internally:
49 T_ICLASS :: included module
50 T_MATCH :: MatchData object
52 T_NODE :: syntax tree node
53 T_ZOMBIE :: object awaiting finalization
55 Most of the types are represented by C structures.
57 === Check type of the VALUE data
59 The macro TYPE() defined in ruby.h shows the data type of the VALUE.
60 TYPE() returns the constant number T_XXXX described above. To handle
61 data types, your code will look something like this:
75 rb_raise(rb_eTypeError, "not valid value");
There is also a data type check function

  void Check_Type(VALUE value, int type)

which raises an exception if the VALUE does not have the type
specified.
86 There are also faster check macros for fixnums and nil.
91 === Convert VALUE into C data
93 The data for type T_NIL, T_FALSE, T_TRUE are nil, false, true
94 respectively. They are singletons for the data type.
95 The equivalent C constants are: Qnil, Qfalse, Qtrue.
96 RTEST() will return true if a VALUE is neither Qfalse nor Qnil.
97 If you need to differentiate Qfalse from Qnil,
98 specifically test against Qfalse.
The T_FIXNUM data is a 31bit or 63bit length fixed integer.
This size depends on the size of long: if long is 32bit then
T_FIXNUM is 31bit, if long is 64bit then T_FIXNUM is 63bit.
T_FIXNUM can be converted to a C integer by using the FIX2INT() or
FIX2LONG() macro. These macros are fast, but they do not check the
type, so you have to confirm that the data is really a FIXNUM before
using them. FIX2LONG() never raises exceptions, but FIX2INT() raises
RangeError if the result does not fit in an int.
There are also NUM2INT() and NUM2LONG(), which convert any Ruby
number into a C integer. These macros include a type check,
so an exception will be raised if the conversion fails. NUM2DBL()
can be used to retrieve the double float value in the same way.
113 You can use the macros
114 StringValue() and StringValuePtr() to get a char* from a VALUE.
115 StringValue(var) replaces var's value with the result of "var.to_str()".
116 StringValuePtr(var) does the same replacement and returns the char*
117 representation of var. These macros will skip the replacement if var
118 is a String. Notice that the macros take only the lvalue as their
119 argument, to change the value of var in place.
You can also use the macro named StringValueCStr(). This is just
like StringValuePtr(), but always adds a NUL character at the end of
the result. If the result contains a NUL character, this macro raises
an ArgumentError exception.
StringValuePtr() doesn't guarantee the existence of a NUL at the end
of the result, and the result may contain NUL.
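
Why an embedded NUL matters can be seen without Ruby at all. The
sketch below is a standalone model, not Ruby's implementation:
struct byte_string, has_embedded_nul() and c_visible_length() are
hypothetical names standing in for a length-counted string buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Ruby String: bytes plus an explicit length. */
struct byte_string {
    const char *ptr;
    size_t len;
};

/* Returns 1 when a NUL byte appears inside the first len bytes, else 0. */
static int has_embedded_nul(struct byte_string s)
{
    assert(s.ptr != NULL);
    return memchr(s.ptr, '\0', s.len) != NULL;
}

/* Number of bytes that a NUL-terminated consumer would see:
 * everything up to the first NUL, or all len bytes if there is none. */
static size_t c_visible_length(struct byte_string s)
{
    const char *nul;
    assert(s.ptr != NULL);
    nul = memchr(s.ptr, '\0', s.len);
    return nul ? (size_t)(nul - s.ptr) : s.len;
}
```

In this model a 5-byte buffer holding "ab", a NUL, and "cd" is visible
to a C string function only as "ab". Silent truncation of this kind is
what raising an exception on an embedded NUL prevents.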
Other data types have corresponding C structures, e.g. struct RArray
for T_ARRAY etc. The VALUE of the type which has the corresponding
structure can be cast to retrieve the pointer to the struct. The
casting macro will be of the form RXXXX for each data type; for
instance, RARRAY(obj). See "ruby.h". However, we do not recommend
accessing RXXXX data directly because these data structures are complex.
Use the corresponding rb_xxx() functions to access the internal struct.
For example, to access an entry of an array, use rb_ary_entry(ary, offset)
and rb_ary_store(ary, offset, obj).
138 There are some accessing macros for structure members, for example
139 `RSTRING_LEN(str)' to get the size of the Ruby String object. The
140 allocated region can be accessed by `RSTRING_PTR(str)'.
Notice: Do not change the value of the structure directly, unless you
are responsible for the result. This ends up being the cause of
bugs.
146 === Convert C data into VALUE
148 To convert C data to Ruby values:
152 left shift 1 bit, and turn on its least significant bit (LSB).
154 Other pointer values ::
158 You can determine whether a VALUE is a pointer or not by checking its LSB.
160 Notice: Ruby does not allow arbitrary pointer values to be a VALUE. They
161 should be pointers to the structures which Ruby knows about. The known
162 structures are defined in <ruby.h>.
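
The integer tagging described above can be modeled in plain C without
ruby.h. The sketch below is an illustration only and assumes two's
complement integers; model_value, MODEL_FIXNUM_MAX, MODEL_FIXNUM_MIN
and the model_* functions are hypothetical names, not part of Ruby's
API.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for VALUE; the real type is defined in ruby.h. */
typedef intptr_t model_value;

/* One bit is spent on the tag, so the usable range is half that of long. */
#define MODEL_FIXNUM_MAX (LONG_MAX / 2)
#define MODEL_FIXNUM_MIN (LONG_MIN / 2)

/* Encode: shift left by one bit (written as * 2) and turn on the LSB. */
static model_value model_int2fix(long n)
{
    assert(n >= MODEL_FIXNUM_MIN && n <= MODEL_FIXNUM_MAX);
    return (model_value)n * 2 + 1;
}

/* Decode: clear the tag bit, then undo the shift. */
static long model_fix2long(model_value v)
{
    assert(v & 1);
    return (long)((v - 1) / 2);
}

/* A set LSB means "tagged integer"; an aligned pointer has a clear LSB. */
static int model_is_fixnum(model_value v)
{
    return (int)(v & 1);
}
```

With this encoding 5 is stored as 11 and -3 as -5, and the address of
any suitably aligned object fails the model_is_fixnum() test. In real
extension code, use INT2FIX(), FIX2LONG() and FIXNUM_P() rather than
doing the arithmetic by hand.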
164 To convert C numbers to Ruby values, use these macros:
166 INT2FIX() :: for integers within 31bits.
167 INT2NUM() :: for arbitrary sized integers.
169 INT2NUM() converts an integer into a Bignum if it is out of the FIXNUM
170 range, but is a bit slower.
172 === Manipulating Ruby object
As I already mentioned, it is not recommended to modify an object's
internal structure. To manipulate objects, use the functions supplied
by the Ruby interpreter. Some (not all) of the useful functions are
listed below:
179 ==== String functions
181 rb_str_new(const char *ptr, long len) ::
183 Creates a new Ruby string.
185 rb_str_new2(const char *ptr) ::
186 rb_str_new_cstr(const char *ptr) ::
188 Creates a new Ruby string from a C string. This is equivalent to
189 rb_str_new(ptr, strlen(ptr)).
191 rb_str_new_literal(const char *ptr) ::
193 Creates a new Ruby string from a C string literal.
195 rb_sprintf(const char *format, ...) ::
196 rb_vsprintf(const char *format, va_list ap) ::
198 Creates a new Ruby string with printf(3) format.
200 Note: In the format string, "%"PRIsVALUE can be used for Object#to_s
201 (or Object#inspect if '+' flag is set) output (and related argument
202 must be a VALUE). Since it conflicts with "%i", for integers in
203 format strings, use "%d".
205 rb_str_append(VALUE str1, VALUE str2) ::
207 Appends Ruby string str2 to Ruby string str1.
209 rb_str_cat(VALUE str, const char *ptr, long len) ::
211 Appends len bytes of data from ptr to the Ruby string.
213 rb_str_cat2(VALUE str, const char* ptr) ::
214 rb_str_cat_cstr(VALUE str, const char* ptr) ::
216 Appends C string ptr to Ruby string str. This function is
217 equivalent to rb_str_cat(str, ptr, strlen(ptr)).
219 rb_str_catf(VALUE str, const char* format, ...) ::
220 rb_str_vcatf(VALUE str, const char* format, va_list ap) ::
222 Appends C string format and successive arguments to Ruby string
223 str according to a printf-like format. These functions are
224 equivalent to rb_str_append(str, rb_sprintf(format, ...)) and
225 rb_str_append(str, rb_vsprintf(format, ap)), respectively.
227 rb_enc_str_new(const char *ptr, long len, rb_encoding *enc) ::
228 rb_enc_str_new_cstr(const char *ptr, rb_encoding *enc) ::
230 Creates a new Ruby string with the specified encoding.
232 rb_enc_str_new_literal(const char *ptr, rb_encoding *enc) ::
  Creates a new Ruby string from a C string literal with the specified
  encoding.
237 rb_usascii_str_new(const char *ptr, long len) ::
238 rb_usascii_str_new_cstr(const char *ptr) ::
240 Creates a new Ruby string with encoding US-ASCII.
242 rb_usascii_str_new_literal(const char *ptr) ::
  Creates a new Ruby string from a C string literal with encoding
  US-ASCII.
247 rb_utf8_str_new(const char *ptr, long len) ::
248 rb_utf8_str_new_cstr(const char *ptr) ::
250 Creates a new Ruby string with encoding UTF-8.
252 rb_utf8_str_new_literal(const char *ptr) ::
  Creates a new Ruby string from a C string literal with encoding
  UTF-8.
257 rb_str_resize(VALUE str, long len) ::
  Resizes a Ruby string to len bytes. If str is not modifiable, this
  function raises an exception. The length of str must be set in
  advance. If len is less than the old length, the content beyond
  len bytes is discarded; if len is greater than the old length,
  the content beyond the old length will not be preserved but
  will be garbage. Note that RSTRING_PTR(str) may change by calling
  this function.
267 rb_str_set_len(VALUE str, long len) ::
  Sets the length of a Ruby string. If str is not modifiable, this
  function raises an exception. This function preserves the content
  up to len bytes, regardless of RSTRING_LEN(str). len must not exceed
  the capacity of str.
274 rb_str_modify(VALUE str) ::
  Prepares a Ruby string to modify. If str is not modifiable, this
  function raises an exception, or if the buffer of str is shared,
  this function allocates a new buffer to make it unshared. You MUST
  always call this function before modifying the contents using
  RSTRING_PTR and/or rb_str_set_len.
286 Creates an array with no elements.
288 rb_ary_new2(long len) ::
289 rb_ary_new_capa(long len) ::
  Creates an array with no elements, allocating internal buffer
  for len elements.
294 rb_ary_new3(long n, ...) ::
295 rb_ary_new_from_args(long n, ...) ::
297 Creates an n-element array from the arguments.
299 rb_ary_new4(long n, VALUE *elts) ::
300 rb_ary_new_from_values(long n, VALUE *elts) ::
302 Creates an n-element array from a C array.
304 rb_ary_to_ary(VALUE obj) ::
306 Converts the object into an array.
307 Equivalent to Object#to_ary.
There are many functions to operate on an array. They may dump core if other
types are given.
312 rb_ary_aref(int argc, const VALUE *argv, VALUE ary) ::
314 Equivalent to Array#[].
316 rb_ary_entry(VALUE ary, long offset) ::
320 rb_ary_store(VALUE ary, long offset, VALUE obj) ::
324 rb_ary_subseq(VALUE ary, long beg, long len) ::
328 rb_ary_push(VALUE ary, VALUE val) ::
329 rb_ary_pop(VALUE ary) ::
330 rb_ary_shift(VALUE ary) ::
331 rb_ary_unshift(VALUE ary, VALUE val) ::
333 ary.push, ary.pop, ary.shift, ary.unshift
335 rb_ary_cat(VALUE ary, const VALUE *ptr, long len) ::
337 Appends len elements of objects from ptr to the array.
339 == Extending Ruby with C
341 === Adding new features to Ruby
343 You can add new features (classes, methods, etc.) to the Ruby
344 interpreter. Ruby provides APIs for defining the following things:
347 - Methods, singleton methods
350 ==== Class and Module Definition
352 To define a class or module, use the functions below:
354 VALUE rb_define_class(const char *name, VALUE super)
355 VALUE rb_define_module(const char *name)
357 These functions return the newly created class or module. You may
358 want to save this reference into a variable to use later.
360 To define nested classes or modules, use the functions below:
362 VALUE rb_define_class_under(VALUE outer, const char *name, VALUE super)
363 VALUE rb_define_module_under(VALUE outer, const char *name)
365 ==== Method and singleton method definition
367 To define methods or singleton methods, use these functions:
369 void rb_define_method(VALUE klass, const char *name,
370 VALUE (*func)(ANYARGS), int argc)
372 void rb_define_singleton_method(VALUE object, const char *name,
373 VALUE (*func)(ANYARGS), int argc)
375 The `argc' represents the number of the arguments to the C function,
376 which must be less than 17. But I doubt you'll need that many.
If `argc' is negative, it specifies the calling sequence, not the number of
arguments.
381 If argc is -1, the function will be called as:
383 VALUE func(int argc, VALUE *argv, VALUE obj)
385 where argc is the actual number of arguments, argv is the C array of
386 the arguments, and obj is the receiver.
If argc is -2, the arguments are passed in a Ruby array. The function
will be called like:
391 VALUE func(VALUE obj, VALUE args)
where obj is the receiver, and args is the Ruby array containing
the actual arguments.
396 There are some more functions to define methods. One takes an ID
397 as the name of method to be defined. See also ID or Symbol below.
399 void rb_define_method_id(VALUE klass, ID name,
400 VALUE (*func)(ANYARGS), int argc)
402 There are two functions to define private/protected methods:
404 void rb_define_private_method(VALUE klass, const char *name,
405 VALUE (*func)(ANYARGS), int argc)
406 void rb_define_protected_method(VALUE klass, const char *name,
407 VALUE (*func)(ANYARGS), int argc)
Finally, rb_define_module_function defines module functions,
which are both private methods and singleton methods of the module.
For example, sqrt is a module function defined in the Math module.
It can be called in the following way:
421 To define module functions, use:
423 void rb_define_module_function(VALUE module, const char *name,
424 VALUE (*func)(ANYARGS), int argc)
426 In addition, function-like methods, which are private methods defined
427 in the Kernel module, can be defined using:
429 void rb_define_global_function(const char *name, VALUE (*func)(ANYARGS), int argc)
431 To define an alias for the method,
433 void rb_define_alias(VALUE module, const char* new, const char* old);
435 To define a reader/writer for an attribute,
437 void rb_define_attr(VALUE klass, const char *name, int read, int write)
439 To define and undefine the `allocate' class method,
441 void rb_define_alloc_func(VALUE klass, VALUE (*func)(VALUE klass));
442 void rb_undef_alloc_func(VALUE klass);
444 func has to take the klass as the argument and return a newly
445 allocated instance. This instance should be as empty as possible,
446 without any expensive (including external) resources.
If you are overriding an existing method of any ancestor of your class,
you can call the overridden method with:
451 VALUE rb_call_super(int argc, const VALUE *argv)
453 To specify whether keyword arguments are passed when calling super:
455 VALUE rb_call_super_kw(int argc, const VALUE *argv, int kw_splat)
457 +kw_splat+ can have these possible values (used by all methods that accept
458 +kw_splat+ argument):
460 RB_NO_KEYWORDS :: Do not pass keywords
461 RB_PASS_KEYWORDS :: Pass keywords, final argument should be a hash of keywords
462 RB_PASS_CALLED_KEYWORDS :: Pass keywords if current method was called with
463 keywords, useful for argument delegation
To obtain the receiver of the current scope (if no other way is
available), you can use:
468 VALUE rb_current_receiver(void)
470 ==== Constant definition
472 We have 2 functions to define constants:
474 void rb_define_const(VALUE klass, const char *name, VALUE val)
475 void rb_define_global_const(const char *name, VALUE val)
477 The former is to define a constant under specified class/module. The
478 latter is to define a global constant.
480 === Use Ruby features from C
482 There are several ways to invoke Ruby's features from C code.
484 ==== Evaluate Ruby programs in a string
The easiest way to use Ruby's functionality from a C program is to
evaluate a string as a Ruby program. This function will do the job:
489 VALUE rb_eval_string(const char *str)
491 Evaluation is done under the current context, thus current local variables
492 of the innermost method (which is defined by Ruby) can be accessed.
Note that the evaluation can raise an exception. There is a safer
function:
497 VALUE rb_eval_string_protect(const char *str, int *state)
499 It returns nil when an error occurred. Moreover, *state is zero if str was
500 successfully evaluated, or nonzero otherwise.
You can invoke methods directly, without parsing the string. First I
need to explain about ID. An ID is an integer that represents one of
Ruby's identifiers, such as a variable name. The Ruby data type
corresponding to ID is Symbol. It can be accessed from Ruby in the
form:
514 :"any kind of string"
516 You can get the ID value from a string within C code by using
518 rb_intern(const char *name)
519 rb_intern_str(VALUE name)
You can retrieve ID from Ruby object (Symbol or String) given as an
argument by using
524 rb_to_id(VALUE symbol)
525 rb_check_id(volatile VALUE *name)
526 rb_check_id_cstr(const char *name, long len, rb_encoding *enc)
These functions try to convert the argument to a String if it is
neither a Symbol nor a String. The second function stores the converted
result into *name, and returns 0 if the string is not a known symbol.
If it returns a non-zero value, *name is a Symbol or a String;
if it returns 0, *name is a String.
The third function takes a NUL-terminated C string, not a Ruby VALUE.
You can retrieve Symbol from Ruby object (Symbol or String) given as
an argument by using
538 rb_to_symbol(VALUE name)
539 rb_check_symbol(volatile VALUE *namep)
540 rb_check_symbol_cstr(const char *ptr, long len, rb_encoding *enc)
542 These functions are similar to above functions except that these
543 return a Symbol instead of an ID.
You can convert C ID to Ruby Symbol by using

  VALUE ID2SYM(ID id)

549 and to convert Ruby Symbol object to ID, use
551 ID SYM2ID(VALUE symbol)
553 ==== Invoke Ruby method from C
555 To invoke methods directly, you can use the function below
557 VALUE rb_funcall(VALUE recv, ID mid, int argc, ...)
559 This function invokes a method on the recv, with the method name
560 specified by the symbol mid.
562 ==== Accessing the variables and constants
564 You can access class variables and instance variables using access
565 functions. Also, global variables can be shared between both
566 environments. There's no way to access Ruby's local variables.
568 The functions to access/modify instance variables are below:
570 VALUE rb_ivar_get(VALUE obj, ID id)
571 VALUE rb_ivar_set(VALUE obj, ID id, VALUE val)
573 id must be the symbol, which can be retrieved by rb_intern().
575 To access the constants of the class/module:
577 VALUE rb_const_get(VALUE obj, ID id)
579 See also Constant Definition above.
581 == Information sharing between Ruby and C
583 === Ruby constants that can be accessed from C
As stated in section 1.3,
the following Ruby constants can be referred to from C.
591 Boolean values. Qfalse is false in C also (i.e. 0).
597 === Global variables shared between C and Ruby
599 Information can be shared between the two environments using shared global
600 variables. To define them, you can use functions listed below:
602 void rb_define_variable(const char *name, VALUE *var)
604 This function defines the variable which is shared by both environments.
605 The value of the global variable pointed to by `var' can be accessed
606 through Ruby's global variable named `name'.
608 You can define read-only (from Ruby, of course) variables using the
611 void rb_define_readonly_variable(const char *name, VALUE *var)
613 You can define hooked variables. The accessor functions (getter and
614 setter) are called on access to the hooked variables.
616 void rb_define_hooked_variable(const char *name, VALUE *var,
617 VALUE (*getter)(), void (*setter)())
If you need to supply only a setter or only a getter, just supply 0 for the
hook you don't need. If both hooks are 0, rb_define_hooked_variable()
works just like rb_define_variable().
623 The prototypes of the getter and setter functions are as follows:
625 VALUE (*getter)(ID id, VALUE *var);
626 void (*setter)(VALUE val, ID id, VALUE *var);
628 Also you can define a Ruby global variable without a corresponding C
629 variable. The value of the variable will be set/get only by hooks.
631 void rb_define_virtual_variable(const char *name,
632 VALUE (*getter)(), void (*setter)())
634 The prototypes of the getter and setter functions are as follows:
636 VALUE (*getter)(ID id);
637 void (*setter)(VALUE val, ID id);
639 === Encapsulate C data into a Ruby object
Sometimes you need to expose your struct in the C world as a Ruby
object.
In a situation like this, making use of the TypedData_XXX macro
family, the pointer to the struct and the Ruby object can be mutually
converted.
The old (non-Typed) Data_XXX macro family has been deprecated.
In a future version of Ruby, it is possible the old macros will not
work.
653 ==== C struct to Ruby object
You can convert sval, a pointer to your struct, into a Ruby object
with the following macro:
658 TypedData_Wrap_Struct(klass, data_type, sval)
660 TypedData_Wrap_Struct() returns a created Ruby object as a VALUE.
662 The klass argument is the class for the object. The klass should
663 derive from rb_cObject, and the allocator must be set by calling
664 rb_define_alloc_func or rb_undef_alloc_func.
666 data_type is a pointer to a const rb_data_type_t which describes
667 how Ruby should manage the struct.
669 rb_data_type_t is defined like this. Let's take a look at each
670 member of the struct.
  typedef struct rb_data_type_struct rb_data_type_t;

  struct rb_data_type_struct {
    const char *wrap_struct_name;
    struct {
      void (*dmark)(void*);
      void (*dfree)(void*);
      size_t (*dsize)(const void *);
      void (*dcompact)(void*);
      void *reserved[1];
    } function;
    const rb_data_type_t *parent;
    void *data;
    VALUE flags;
  };
688 wrap_struct_name is an identifier of this instance of the struct.
689 It is basically used for collecting and emitting statistics.
690 So the identifier must be unique in the process, but doesn't need
691 to be valid as a C or Ruby identifier.
These dmark / dfree functions are invoked during GC execution. No
object allocations are allowed during it, so do not allocate ruby
objects inside them.
697 dmark is a function to mark Ruby objects referred from your struct.
698 It must mark all references from your struct with rb_gc_mark or
699 its family if your struct keeps such references.
702 Note that it is recommended to avoid such a reference.
705 dfree is a function to free the pointer allocation.
706 If this is RUBY_DEFAULT_FREE, the pointer will be just freed.
708 dsize calculates memory consumption in bytes by the struct.
709 Its parameter is a pointer to your struct.
710 You can pass 0 as dsize if it is hard to implement such a function.
711 But it is still recommended to avoid 0.
dcompact is invoked when memory compaction takes place.
Referred Ruby objects that were marked by rb_gc_mark_movable()
can be updated here using rb_gc_location().
717 You have to fill reserved with 0.
parent can point to another C type definition that the Ruby object
is inherited from. Then TypedData_Get_Struct() also accepts
derived objects.
723 You can fill "data" with an arbitrary value for your use.
724 Ruby does nothing with the member.
726 flags is a bitwise-OR of the following flag values.
727 Since they require deep understanding of garbage collector in Ruby,
728 you can just set 0 to flags if you are not sure.
730 RUBY_TYPED_FREE_IMMEDIATELY ::
  This flag makes the garbage collector immediately invoke dfree()
  during GC when it needs to free your struct.
  You can specify this flag if the dfree never unlocks Ruby's
  internal lock (GVL).
737 If this flag is not set, Ruby defers invocation of dfree()
738 and invokes dfree() at the same time as finalizers.
740 RUBY_TYPED_WB_PROTECTED ::
  It shows that implementation of the object supports write barriers.
  If this flag is set, Ruby is better able to do garbage collection
  of the object.
746 When it is set, however, you are responsible for putting write
747 barriers in all implementations of methods of that object as
748 appropriate. Otherwise Ruby might crash while running.
750 More about write barriers can be found in {Generational
751 GC}[rdoc-ref:@Appendix+D.+Generational+GC].
753 RUBY_TYPED_FROZEN_SHAREABLE ::
  This flag indicates that the object is a shareable object if the object
  is frozen. See {Ractor support}[rdoc-ref:@Appendix+F.+Ractor+support]
  for more details.
759 If this flag is not set, the object can not become a shareable
760 object by Ractor.make_shareable() method.
Note that this macro can raise an exception. If sval to be wrapped
holds a resource that needs to be released (e.g., allocated memory, a handle
from an external library, etc.), you will have to use rb_protect.

You can allocate and wrap the structure in one step, in a more
preferable manner.
769 TypedData_Make_Struct(klass, type, data_type, sval)
771 This macro returns an allocated T_DATA object, wrapping the pointer to
772 the structure, which is also allocated. This macro works like:
774 (sval = ZALLOC(type), TypedData_Wrap_Struct(klass, data_type, sval))
However, you should use this macro instead of "allocation then wrap"
like the above code if it is simply allocated, because the latter can
raise a NoMemoryError and sval will be leaked in that case.
780 Arguments klass and data_type work like their counterparts in
781 TypedData_Wrap_Struct(). A pointer to the allocated structure will
782 be assigned to sval, which should be a pointer of the type specified.
784 ==== Declaratively marking/compacting struct references
When your struct refers to Ruby objects as simple values, not wrapped
in conditional logic or complex data structures, an alternative
approach to marking and reference updating is available: declaring offset
references to the VALUEs in your struct.

Doing this allows the Ruby GC to support marking these references and GC
compaction without the need to define the +dmark+ and +dcompact+ callbacks.

You must define a static list of the offsets within your struct where the
VALUE references are located, and set the +dmark+ member to point to
this reference list. The reference list must end with +RUBY_REF_END+.
Some macros have been provided to make edge referencing easier:

* <code>RUBY_TYPED_DECL_MARKING</code> - A flag that can be set on the +rb_data_type_t+ to indicate that references are being declared as edges.

* <code>RUBY_REFERENCES(ref_list_name)</code> - Define _ref_list_name_ as a list of references.

* <code>RUBY_REF_END</code> - The end mark of the references list.

* <code>RUBY_REF_EDGE(struct, member)</code> - Declare _member_ as a VALUE edge from _struct_. Use this after +RUBY_REFERENCES+.

* +RUBY_REFS_LIST_PTR+ - Coerce the reference list into a format that can be
  accepted by the existing +dmark+ interface.
811 The example below is from Dir (defined in +dir.c+)
813 // The struct being wrapped. Notice this contains 3 members of which the second
814 // is a VALUE reference to another ruby object.
821 // Define a reference list `dir_refs` containing a single entry to `path`.
822 // Needs terminating with RUBY_REF_END
823 RUBY_REFERENCES(dir_refs) = {
824 RUBY_REF_EDGE(dir_data, path),
828 // Override the "dmark" field with the defined reference list now that we
829 // no longer need a marking callback and add RUBY_TYPED_DECL_MARKING to the
831 static const rb_data_type_t dir_data_type = {
833 {RUBY_REFS_LIST_PTR(dir_refs), dir_free, dir_memsize,},
834 0, NULL, RUBY_TYPED_WB_PROTECTED | RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_DECL_MARKING
837 Declaring simple references declaratively in this manner allows the GC to both
838 mark, and move the underlying object, and automatically update the reference to
839 it during compaction.
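
The idea behind an offset-based reference list can be sketched in
plain C. This is a standalone model under stated assumptions, not
Ruby's implementation: model_value, struct model_dir, MODEL_REF_END,
model_dir_refs, model_each_ref() and model_relocate() are all
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VALUE; the real type is defined in ruby.h. */
typedef uintptr_t model_value;

/* Three members, of which the second is a reference to another object. */
struct model_dir {
    void *handle;
    model_value path;
    int flags;
};

/* Sentinel that closes an offset list (hypothetical name). */
#define MODEL_REF_END SIZE_MAX

/* The declared edges: one offset per reference member, then the end mark. */
static const size_t model_dir_refs[] = {
    offsetof(struct model_dir, path),
    MODEL_REF_END,
};

/* Visit every declared reference slot of one struct instance.
 * Returns the number of slots visited. */
static size_t model_each_ref(void *sval, const size_t *refs,
                             void (*visit)(model_value *slot, void *arg),
                             void *arg)
{
    size_t count = 0;
    assert(sval != NULL && refs != NULL && visit != NULL);
    for (; *refs != MODEL_REF_END; refs++) {
        visit((model_value *)((char *)sval + *refs), arg);
        count++;
    }
    return count;
}

/* Example visitor: rewrite a slot by a fixed delta, the way a compacting
 * collector would update a reference after moving its target. */
static void model_relocate(model_value *slot, void *arg)
{
    *slot += *(const model_value *)arg;
}
```

Because the collector only needs the offsets, the same walker can serve
both marking and reference updating, which is why no per-type callback
has to be written for simple cases.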
841 ==== Ruby object to C struct
843 To retrieve the C pointer from the T_DATA object, use the macro
844 TypedData_Get_Struct().
846 TypedData_Get_Struct(obj, type, &data_type, sval)
848 A pointer to the structure will be assigned to the variable sval.
850 See the example below for details.
852 == Example - Creating the dbm Extension
OK, here's an example of making an extension library. This is the
extension to access DBMs. The full source is included in the ext/
directory in Ruby's source tree.
858 === Make the directory
862 Make a directory for the extension library under ext directory.
864 === Design the Library
You need to design the library's features before making it.
870 You need to write C code for your extension library. If your library
871 has only one source file, choosing ``LIBRARY.c'' as a file name is
872 preferred. On the other hand, in case your library has multiple source
873 files, avoid choosing ``LIBRARY.c'' for a file name. It may conflict
874 with an intermediate file ``LIBRARY.o'' on some platforms.
875 Note that some functions in mkmf library described below generate
876 a file ``conftest.c'' for checking with compilation. You shouldn't
877 choose ``conftest.c'' as a name of a source file.
Ruby will execute the initializing function named ``Init_LIBRARY'' in
the library. For example, ``Init_dbm()'' will be executed when loading
the library.

Here's an example of an initializing function.
    /* define DBM class */
    VALUE cDBM = rb_define_class("DBM", rb_cObject);
    /* Redefine DBM.allocate */
    rb_define_alloc_func(cDBM, fdbm_alloc);
    /* DBM includes Enumerable module */
    rb_include_module(cDBM, rb_mEnumerable);

    /* DBM has class method open(): arguments are received as C array */
    rb_define_singleton_method(cDBM, "open", fdbm_s_open, -1);

    /* DBM instance method close(): no args */
    rb_define_method(cDBM, "close", fdbm_close, 0);
    /* DBM instance method []: 1 argument */
    rb_define_method(cDBM, "[]", fdbm_aref, 1);

    /* ID for an instance variable to store DBM data */
    id_dbm = rb_intern("dbm");
910 The dbm extension wraps the dbm struct in the C environment using
911 TypedData_Make_Struct.
918 static const rb_data_type_t dbm_type = {
920 {0, free_dbm, memsize_dbm,},
922 RUBY_TYPED_FREE_IMMEDIATELY,
926 fdbm_alloc(VALUE klass)
928 struct dbmdata *dbmp;
929 /* Allocate T_DATA object and C struct and fill struct with zero bytes */
930 return TypedData_Make_Struct(klass, struct dbmdata, &dbm_type, dbmp);
933 This code wraps the dbmdata structure into a Ruby object. We avoid
934 wrapping DBM* directly, because we want to cache size information.
935 Since Object.allocate allocates an ordinary T_OBJECT type (instead
936 of T_DATA), it's important to either use rb_define_alloc_func() to
937 overwrite it or rb_undef_alloc_func() to delete it.
To retrieve the dbmdata structure from a Ruby object, we define the
following macro:

  #define GetDBM(obj, dbmp) do {\
      TypedData_Get_Struct((obj), struct dbmdata, &dbm_type, (dbmp));\
      if ((dbmp) == 0) closed_dbm();\
      if ((dbmp)->di_dbm == 0) closed_dbm();\
  } while (0)

This sort of complicated macro does the retrieving and close checking
for the DBM.
There are three ways to receive method arguments. First,
methods with a fixed number of arguments receive arguments like this:
955 fdbm_aref(VALUE obj, VALUE keystr)
957 struct dbmdata *dbmp;
959 /* Use dbmp to access the key */
960 dbm_fetch(dbmp->di_dbm, StringValueCStr(keystr));
The first argument of the C function is self; the rest are the
arguments to the method.

Second, methods with an arbitrary number of arguments receive
arguments like this:
971 fdbm_s_open(int argc, VALUE *argv, VALUE klass)
974 if (rb_scan_args(argc, argv, "11", &file, &vmode) == 1) {
975 mode = 0666; /* default value */
980 The first argument is the number of method arguments, the second
981 argument is the C array of the method arguments, and the third
982 argument is the receiver of the method.
You can use the function rb_scan_args() to check and retrieve the
arguments. The third argument is a string that specifies how to
capture method arguments and assign them to the following VALUE
references.

You can just check the argument number with rb_check_arity(); this is
handy in the case you want to treat the arguments as a list.
The following is an example of a method that takes arguments by Ruby's
array:
996 thread_initialize(VALUE thread, VALUE args)
1001 The first argument is the receiver, the second one is the Ruby array
1002 which contains the arguments to the method.
1004 <b>Notice</b>: GC should know about global variables which refer to Ruby's objects,
1005 but are not exported to the Ruby world. You need to protect them by
1007 void rb_global_variable(VALUE *var)
1009 or the objects themselves by
1011 void rb_gc_register_mark_object(VALUE object)
1013 === Prepare extconf.rb
If the file named extconf.rb exists, it will be executed to generate
Makefile.

extconf.rb is the file for checking compilation conditions etc. You
need to put

  require 'mkmf'

at the top of the file. You can use the functions below to check
various conditions.
1026 append_cppflags(array-of-flags[, opt]): append each flag to $CPPFLAGS if usable
1027 append_cflags(array-of-flags[, opt]): append each flag to $CFLAGS if usable
1028 append_ldflags(array-of-flags[, opt]): append each flag to $LDFLAGS if usable
1029 have_macro(macro[, headers[, opt]]): check whether macro is defined
1030 have_library(lib[, func[, headers[, opt]]]): check whether library containing function exists
1031 find_library(lib[, func, *paths]): find library from paths
1032 have_func(func[, headers[, opt]]): check whether function exists
1033 have_var(var[, headers[, opt]]): check whether variable exists
1034 have_header(header[, preheaders[, opt]]): check whether header file exists
1035 find_header(header, *paths): find header from paths
1036 have_framework(fw): check whether framework exists (for macOS)
1037 have_struct_member(type, member[, headers[, opt]]): check whether struct has member
1038 have_type(type[, headers[, opt]]): check whether type exists
1039 find_type(type, opt, *headers): check whether type exists in headers
1040 have_const(const[, headers[, opt]]): check whether constant is defined
1041 check_sizeof(type[, headers[, opts]]): check size of type
1042 check_signedness(type[, headers[, opts]]): check signedness of type
1043 convertible_int(type[, headers[, opts]]): find convertible integer type
1044 find_executable(bin[, path]): find executable file path
1045 create_header(header): generate configured header
1046 create_makefile(target[, target_prefix]): generate Makefile
1048 See MakeMakefile for full documentation of these functions.
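As a rough illustration of how these checks fit together, here is a minimal extconf.rb sketch. The library name "foo", the header "foo.h", and the functions "foo_init" and "foo_reset" are hypothetical placeholders, not part of any real library.

```ruby
# extconf.rb: a minimal sketch. "foo", "foo.h", "foo_init" and "foo_reset"
# are hypothetical names used only for illustration.
require 'mkmf'

# Add a warning flag only if the compiler accepts it.
append_cflags(['-Wall'])

# Stop before create_makefile when a required condition is not met,
# so that no Makefile is generated.
abort 'foo.h is required' unless have_header('foo.h')
abort 'libfoo is required' unless have_library('foo', 'foo_init')

# Optional feature: when found, HAVE_FOO_RESET is defined for the C sources.
have_func('foo_reset', 'foo.h')

create_makefile('foo')
```

The sketch aborts on a missing requirement instead of calling create_makefile, so a failed check leaves no Makefile behind.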
1050 The value of the variables below will affect the Makefile.
1052 $CFLAGS: included in CFLAGS make variable (such as -O)
1053 $CPPFLAGS: included in CPPFLAGS make variable (such as -I, -D)
1054 $LDFLAGS: included in LDFLAGS make variable (such as -L)
1055 $objs: list of object file names
1057 Compiler/linker flags are usually not portable, so you should use
1058 +append_cppflags+, +append_cflags+ and +append_ldflags+ respectively
1059 instead of appending to the above variables directly.
1061 Normally, the object files list is automatically generated by searching
1062 source files, but you must define them explicitly if any sources will
1063 be generated while building.
1065 If a compilation condition is not fulfilled, you should not call
1066 ``create_makefile''. The Makefile will not be generated, compilation will
1069 === Prepare depend (Optional)
1071 If the file named depend exists, Makefile will include that file to
1072 check dependencies. You can make this file by invoking
1074 % gcc -MM *.c > depend
1076 It's harmless. Prepare it.
1078 === Generate Makefile
1080 Try generating the Makefile by:
1084 If the library should be installed under the vendor_ruby directory
1085 instead of the site_ruby directory, use the --vendor option as follows.
1087 ruby extconf.rb --vendor
1089 You don't need this step if you put the extension library under the ext
1090 directory of the ruby source tree. In that case, compilation of the
1091 interpreter will do this step for you.
1099 to compile your extension. You don't need this step either if you have
1100 put the extension library under the ext directory of the ruby source tree.
1104 You may need to debug the extension. Extensions can be linked
1105 statically by adding the directory name in the ext/Setup file so that
1106 you can inspect the extension with the debugger.
1108 === Done! Now you have the extension library
1110 You can do anything you want with your library. The author of Ruby
1111 will not claim any restrictions on your code depending on the Ruby API.
1112 Feel free to use, modify, distribute or sell your program.
1114 == Appendix A. Ruby header and source files overview
1116 === Ruby header files
1118 Everything under <tt>$repo_root/include/ruby</tt> is installed with
1119 <tt>make install</tt>.
1120 It should be included per <tt>#include <ruby.h></tt> from C extensions.
1121 All symbols are public API with the exception of symbols prefixed with
1122 +rbimpl_+ or +RBIMPL_+. They are implementation details and shouldn't
1123 be used by C extensions.
1125 Only <tt>$repo_root/include/ruby/*.h</tt> whose corresponding macros
1126 are defined in the <tt>$repo_root/include/ruby.h</tt> header are
1127 allowed to be <tt>#include</tt>-d by C extensions.
1129 Header files under <tt>$repo_root/internal/</tt> or directly under the
1130 root <tt>$repo_root/*.h</tt> are not make-installed.
1131 They are internal headers with only internal APIs.
1133 === Ruby language core
1135 class.c :: classes and modules
1136 error.c :: exception classes and exception mechanism
1137 gc.c :: memory management
1138 load.c :: library loading
1140 variable.c :: variables and constants
1142 === Ruby syntax parser
1144 parse.y :: grammar definition
1145 parse.c :: automatically generated from parse.y
1146 defs/keywords :: reserved keywords
1147 lex.c :: automatically generated from keywords
1149 === Ruby evaluator (a.k.a. YARV)
1156 insns.def : definition of VM instructions
1157 iseq.c : implementation of VM::ISeq
1158 thread.c : thread management and context switching
1159 thread_win32.c : thread implementation
1160 thread_pthread.c : ditto
1168 defs/opt_insns_unif.def : instruction unification
1169 defs/opt_operand.def : definitions for optimization
1171 -> insn*.inc : automatically generated
1172 -> opt*.inc : automatically generated
1173 -> vm.inc : automatically generated
1175 === Regular expression engine (Onigmo)
1184 === Utility functions
1186 debug.c :: debug symbols for C debugger
1187 dln.c :: dynamic loading
1188 st.c :: general purpose hash table
1189 strftime.c :: formatting times
1190 util.c :: misc utilities
1192 === Ruby interpreter implementation
1210 compar.c :: Comparable
1211 complex.c :: Complex
1212 cont.c :: Fiber, Continuation
1214 enum.c :: Enumerable
1215 enumerator.c :: Enumerator
1219 marshal.c :: Marshal
1221 numeric.c :: Numeric, Integer, Fixnum, Float
1222 pack.c :: Array#pack, String#unpack
1223 proc.c :: Binding, Proc
1224 process.c :: Process
1225 random.c :: random number
1227 rational.c :: Rational
1228 re.c :: Regexp, MatchData
1230 sprintf.c :: Kernel#sprintf
1235 defs/known_errors.def :: Errno::* exception classes
1236 -> known_errors.inc :: automatically generated
1238 === Multilingualization
1240 encoding.c :: Encoding
1241 transcode.c :: Encoding::Converter
1242 enc/*.c :: encoding classes
1243 enc/trans/* :: codepoint mapping tables
1245 === goruby interpreter implementation
1248 golf_prelude.rb : goruby specific libraries.
1249 -> golf_prelude.c : automatically generated
1251 == Appendix B. Ruby extension API reference
1257 The type for the Ruby object. Actual structures are defined in ruby.h,
1258 such as struct RString, etc. To refer to the values in structures, use
1259 casting macros like RSTRING(obj).
1261 === Variables and constants
1269 true object (default true value)
1275 === C pointer wrapping
1277 Data_Wrap_Struct(VALUE klass, void (*mark)(), void (*free)(), void *sval) ::
1279 Wrap a C pointer into a Ruby object. If the object has references to other
1280 Ruby objects, they should be marked by using the mark function during
1281 the GC process. Otherwise, mark should be 0. When this object is no
1282 longer referenced from anywhere, the pointer will be discarded by the free
1285 Data_Make_Struct(klass, type, mark, free, sval) ::
1287 This macro allocates memory using malloc(), assigns it to the variable
1288 sval, and returns the DATA encapsulating the pointer to the memory region.
1290 Data_Get_Struct(data, type, sval) ::
1292 This macro retrieves the pointer value from DATA, and assigns it to
1295 === Checking VALUE types
1297 RB_TYPE_P(value, type) ::
1299 Is +value+ an internal type (T_NIL, T_FIXNUM, etc.)?
1303 Internal type (T_NIL, T_FIXNUM, etc.)
1307 Is +value+ a Fixnum?
1313 RB_INTEGER_TYPE_P(value) ::
1315 Is +value+ an Integer?
1317 RB_FLOAT_TYPE_P(value) ::
1321 void Check_Type(VALUE value, int type) ::
1323 Ensures +value+ is of the given internal +type+ or raises a TypeError
1325 === VALUE type conversion
1327 FIX2INT(value), INT2FIX(i) ::
1331 FIX2LONG(value), LONG2FIX(l) ::
1335 NUM2INT(value), INT2NUM(i) ::
1339 NUM2UINT(value), UINT2NUM(ui) ::
1341 Numeric <-> unsigned integer
1343 NUM2LONG(value), LONG2NUM(l) ::
1347 NUM2ULONG(value), ULONG2NUM(ul) ::
1349 Numeric <-> unsigned long
1351 NUM2LL(value), LL2NUM(ll) ::
1353 Numeric <-> long long
1355 NUM2ULL(value), ULL2NUM(ull) ::
1357 Numeric <-> unsigned long long
1359 NUM2OFFT(value), OFFT2NUM(off) ::
1363 NUM2SIZET(value), SIZET2NUM(size) ::
1367 NUM2SSIZET(value), SSIZET2NUM(ssize) ::
1371 rb_integer_pack(value, words, numwords, wordsize, nails, flags), rb_integer_unpack(words, numwords, wordsize, nails, flags) ::
1373 Numeric <-> Arbitrary size integer buffer
1385 String -> length of String data in bytes
1389 String -> pointer to String data
1390 Note that the result pointer may not be NUL-terminated
1392 StringValue(value) ::
1394 Object with \#to_str -> String
1396 StringValuePtr(value) ::
1398 Object with \#to_str -> pointer to String data
1400 StringValueCStr(value) ::
1402 Object with \#to_str -> pointer to String data without NUL bytes
1403 It is guaranteed that the result data is NUL-terminated
1409 === Defining classes and modules
1411 VALUE rb_define_class(const char *name, VALUE super) ::
1413 Defines a new Ruby class as a subclass of super.
1415 VALUE rb_define_class_under(VALUE module, const char *name, VALUE super) ::
1417 Creates a new Ruby class as a subclass of super, under the module's
1420 VALUE rb_define_module(const char *name) ::
1422 Defines a new Ruby module.
1424 VALUE rb_define_module_under(VALUE module, const char *name) ::
1426 Defines a new Ruby module under the module's namespace.
1428 void rb_include_module(VALUE klass, VALUE module) ::
1430 Includes the module into the class. If the class already includes it, the call is ignored.
1432 void rb_extend_object(VALUE object, VALUE module) ::
1434 Extend the object with the module's attributes.
1436 === Defining global variables
1438 void rb_define_variable(const char *name, VALUE *var) ::
1440 Defines a global variable which is shared between C and Ruby. If name
1441 contains a character which is not allowed to be part of the symbol,
1442 it can't be seen from Ruby programs.
1444 void rb_define_readonly_variable(const char *name, VALUE *var) ::
1446 Defines a read-only global variable. Works just like
1447 rb_define_variable(), except the defined variable is read-only.
1449 void rb_define_virtual_variable(const char *name, VALUE (*getter)(), void (*setter)()) ::
1451 Defines a virtual variable, whose behavior is defined by a pair of C
1452 functions. The getter function is called when the variable is
1453 referenced. The setter function is called when the variable is set to a
1454 value. The prototypes for the getter/setter functions are:
1457 void setter(VALUE val, ID id)
1459 The getter function must return the value for the access.
1461 void rb_define_hooked_variable(const char *name, VALUE *var, VALUE (*getter)(), void (*setter)()) ::
1463 Defines a hooked variable. It's a virtual variable backed by a C variable.
1464 The getter is called as
1466 VALUE getter(ID id, VALUE *var)
1468 returning a new value. The setter is called as
1470 void setter(VALUE val, ID id, VALUE *var)
1472 void rb_global_variable(VALUE *var) ::
1474 Tells GC to protect a C global variable that holds a Ruby value, so that the value is marked.
1476 void rb_gc_register_mark_object(VALUE object) ::
1478 Tells GC to protect the +object+, which may not be referenced anywhere.
1480 === Constant definition
1482 void rb_define_const(VALUE klass, const char *name, VALUE val) ::
1484 Defines a new constant under the class/module.
1486 void rb_define_global_const(const char *name, VALUE val) ::
1488 Defines a global constant. This is just the same as
1490 rb_define_const(rb_cObject, name, val)
1492 === Method definition
1494 rb_define_method(VALUE klass, const char *name, VALUE (*func)(ANYARGS), int argc) ::
1496 Defines a method for the class. func is the function pointer. argc
1497 is the number of arguments. If argc is -1, the function will receive
1498 3 arguments: argc, argv, and self. If argc is -2, the function will
1499 receive 2 arguments, self and args, where args is a Ruby array of
1500 the method arguments.
1502 rb_define_private_method(VALUE klass, const char *name, VALUE (*func)(ANYARGS), int argc) ::
1504 Defines a private method for the class. Arguments are the same as
1507 rb_define_singleton_method(VALUE klass, const char *name, VALUE (*func)(ANYARGS), int argc) ::
1509 Defines a singleton method. Arguments are the same as rb_define_method().
1511 rb_check_arity(int argc, int min, int max) ::
1513 Checks that the number of arguments, argc, is in the range min..max. If
1514 max is UNLIMITED_ARGUMENTS, the upper bound is not checked. If argc is
1515 out of bounds, an ArgumentError will be raised.
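The range check described above can be modelled in plain Ruby. This is only a sketch of the documented behavior, not the C implementation: the constant UNLIMITED is a local stand-in for UNLIMITED_ARGUMENTS, and returning argc and the wording of the error message are choices made for this example.

```ruby
# Local stand-in for the C constant UNLIMITED_ARGUMENTS (name is specific
# to this sketch).
UNLIMITED = -1

# Raises ArgumentError unless argc is within min..max; the upper bound is
# skipped when max is UNLIMITED.
def check_arity(argc, min, max)
  if argc < min || (max != UNLIMITED && argc > max)
    raise ArgumentError, "wrong number of arguments (given #{argc})"
  end
  argc
end
```

For instance, check_arity(2, 1, 3) passes, while check_arity(0, 1, 3) raises ArgumentError.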
1517 rb_scan_args(int argc, VALUE *argv, const char *fmt, ...) ::
1519 Retrieves arguments from argc and argv into the given VALUE references
1520 according to the format string. The format can be described in ABNF
1523 scan-arg-spec := param-arg-spec [keyword-arg-spec] [block-arg-spec]
1525 param-arg-spec := pre-arg-spec [post-arg-spec] / post-arg-spec /
1526 pre-opt-post-arg-spec
1527 pre-arg-spec := num-of-leading-mandatory-args [num-of-optional-args]
1528 post-arg-spec := sym-for-variable-length-args
1529 [num-of-trailing-mandatory-args]
1530 pre-opt-post-arg-spec := num-of-leading-mandatory-args num-of-optional-args
1531 num-of-trailing-mandatory-args
1532 keyword-arg-spec := sym-for-keyword-arg
1533 block-arg-spec := sym-for-block-arg
1535 num-of-leading-mandatory-args := DIGIT ; The number of leading
1536 ; mandatory arguments
1537 num-of-optional-args := DIGIT ; The number of optional
1539 sym-for-variable-length-args := "*" ; Indicates that variable
1540 ; length arguments are
1541 ; captured as a ruby array
1542 num-of-trailing-mandatory-args := DIGIT ; The number of trailing
1543 ; mandatory arguments
1544 sym-for-keyword-arg := ":" ; Indicates that keyword
1545 ; argument captured as a hash.
1546 ; If keyword arguments are not
1547 ; provided, returns nil.
1548 sym-for-block-arg := "&" ; Indicates that an iterator
1549 ; block should be captured if
1552 For example, "12" means that the method requires at least one
1553 argument, and at most receives three (1+2) arguments. So, the format
1554 string must be followed by three variable references, which are to be
1555 assigned to captured arguments. For omitted arguments, variables are
1556 set to Qnil. NULL can be put in place of a variable reference, which
1557 means the corresponding captured argument(s) should be just dropped.
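Seen from the Ruby side, a C method that scans its arguments with the format "12" should behave roughly like the Ruby method below. The method name scan_12 is hypothetical, and this is an analogue for illustration rather than what rb_scan_args() does internally.

```ruby
# Ruby-level analogue of the format "12": one mandatory argument followed
# by two optional ones. Omitted optional arguments come back as nil, just
# as the C variables are set to Qnil.
def scan_12(a, b = nil, c = nil)
  [a, b, c]
end
```

Calling scan_12 with no arguments or with more than three raises ArgumentError, which matches the "at least one, at most three" rule stated above.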
1559 The number of given arguments, excluding an option hash or iterator
1562 rb_scan_args_kw(int kw_splat, int argc, VALUE *argv, const char *fmt, ...) ::
1564 The same as +rb_scan_args+, except the +kw_splat+ argument specifies whether
1565 keyword arguments are provided (instead of being determined by the call
1566 from Ruby to the C function). +kw_splat+ should be one of the following
1569 RB_SCAN_ARGS_PASS_CALLED_KEYWORDS :: Same behavior as +rb_scan_args+.
1570 RB_SCAN_ARGS_KEYWORDS :: The final argument should be a hash treated as
1572 RB_SCAN_ARGS_LAST_HASH_KEYWORDS :: Treat a final argument as keywords if it
1573 is a hash, and not as keywords otherwise.
1575 int rb_get_kwargs(VALUE keyword_hash, const ID *table, int required, int optional, VALUE *values) ::
1577 Retrieves argument VALUEs bound to keywords, as directed by +table+,
1578 into +values+, deleting retrieved entries from +keyword_hash+ along
1579 the way. The first +required+ number of IDs referred to by +table+ are
1580 mandatory, and the succeeding +optional+ (- +optional+ - 1 if
1581 +optional+ is negative) number of IDs are optional. If a
1582 mandatory key is not contained in +keyword_hash+, raises "missing
1583 keyword" +ArgumentError+. If an optional key is not present in
1584 +keyword_hash+, the corresponding element in +values+ is set to +Qundef+.
1585 If +optional+ is negative, the rest of +keyword_hash+ is ignored; otherwise
1586 an "unknown keyword" +ArgumentError+ is raised.
1588 Be warned, handling keyword arguments in the C API is less efficient
1589 than handling them in Ruby. Consider using a Ruby wrapper method
1590 around a non-keyword C function.
1591 ref: https://bugs.ruby-lang.org/issues/11339
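The wrapper approach suggested above might look like the following sketch. The class Resizer and both method names are hypothetical, and resize_impl is written in Ruby here only as a stand-in for a method that would really be defined in C with two fixed positional arguments.

```ruby
class Resizer
  # Stand-in for a method implemented in C with fixed positional arguments
  # (hypothetical; in a real extension this body would live in C).
  def resize_impl(width, height)
    [width, height]
  end
  private :resize_impl

  # Ruby wrapper: keyword handling happens here, so the C side only ever
  # sees plain positional arguments.
  def resize(width:, height: width)
    resize_impl(width, height)
  end
end
```

With this layout, missing or unknown keywords are rejected by Ruby itself before the positional method is reached.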
1593 VALUE rb_extract_keywords(VALUE *original_hash) ::
1595 Extracts pairs whose key is a symbol into a new hash from a hash
1596 object referred by +original_hash+. If the original hash contains
1597 non-symbol keys, then they are copied to another hash and the new hash
1598 is stored through +original_hash+, else 0 is stored.
1600 === Invoking Ruby method
1602 VALUE rb_funcall(VALUE recv, ID mid, int narg, ...) ::
1604 Invokes a method. To retrieve mid from a method name, use rb_intern().
1605 Able to call even private/protected methods.
1607 VALUE rb_funcall2(VALUE recv, ID mid, int argc, VALUE *argv) ::
1608 VALUE rb_funcallv(VALUE recv, ID mid, int argc, VALUE *argv) ::
1610 Invokes a method, passing arguments as an array of values.
1611 Able to call even private/protected methods.
1613 VALUE rb_funcallv_kw(VALUE recv, ID mid, int argc, VALUE *argv, int kw_splat) ::
1615 Same as rb_funcallv, using +kw_splat+ to determine whether keyword
1616 arguments are passed.
1618 VALUE rb_funcallv_public(VALUE recv, ID mid, int argc, VALUE *argv) ::
1620 Invokes a method, passing arguments as an array of values.
1621 Able to call only public methods.
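The visibility difference between the two families of functions has a close Ruby-level analogue in send and public_send. The sketch below uses a hypothetical Vault class and two hypothetical helper methods to show the contrast; it illustrates the behavior and is not the C API itself.

```ruby
class Vault
  def label
    "vault"
  end

  private

  def secret
    42
  end
end

# Like rb_funcall / rb_funcallv: visibility is not checked.
def call_any(recv, name, *args)
  recv.send(name, *args)
end

# Like rb_funcallv_public: only public methods may be called.
def call_public(recv, name, *args)
  recv.public_send(name, *args)
end
```

Here call_any(Vault.new, :secret) succeeds, while call_public(Vault.new, :secret) raises NoMethodError.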
1623 VALUE rb_funcallv_public_kw(VALUE recv, ID mid, int argc, VALUE *argv, int kw_splat) ::
1625 Same as rb_funcallv_public, using +kw_splat+ to determine whether keyword
1626 arguments are passed.
1628 VALUE rb_funcall_passing_block(VALUE recv, ID mid, int argc, const VALUE* argv) ::
1630 Same as rb_funcallv_public, except it passes the currently active block as
1631 the block when calling the method.
1633 VALUE rb_funcall_passing_block_kw(VALUE recv, ID mid, int argc, const VALUE* argv, int kw_splat) ::
1635 Same as rb_funcall_passing_block, using +kw_splat+ to determine whether
1636 keyword arguments are passed.
1638 VALUE rb_funcall_with_block(VALUE recv, ID mid, int argc, const VALUE *argv, VALUE passed_procval) ::
1640 Same as rb_funcallv_public, except +passed_procval+ specifies the block to
1643 VALUE rb_funcall_with_block_kw(VALUE recv, ID mid, int argc, const VALUE *argv, VALUE passed_procval, int kw_splat) ::
1645 Same as rb_funcall_with_block, using +kw_splat+ to determine whether
1646 keyword arguments are passed.
1648 VALUE rb_eval_string(const char *str) ::
1650 Compiles and executes the string as a Ruby program.
1652 ID rb_intern(const char *name) ::
1654 Returns ID corresponding to the name.
1656 char *rb_id2name(ID id) ::
1658 Returns the name corresponding to the ID.
1660 char *rb_class2name(VALUE klass) ::
1662 Returns the name of the class.
1664 int rb_respond_to(VALUE obj, ID id) ::
1666 Returns true if the object responds to the message specified by id.
1668 === Instance variables
1670 VALUE rb_iv_get(VALUE obj, const char *name) ::
1672 Retrieve the value of the instance variable. If the name is not
1673 prefixed by `@', that variable shall be inaccessible from Ruby.
1675 VALUE rb_iv_set(VALUE obj, const char *name, VALUE val) ::
1677 Sets the value of the instance variable.
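The remark about names without a `@' prefix can be observed from Ruby: the reflection methods reject such names, so a value stored under one from C cannot be reached by Ruby code. The class Box and the helper below are hypothetical names for this sketch.

```ruby
class Box
  def initialize
    @visible = 1
  end
end

# Reports whether Ruby's reflection API accepts +name+ as an instance
# variable name on +obj+.
def readable_from_ruby?(obj, name)
  obj.instance_variable_get(name)
  true
rescue NameError
  false
end
```

readable_from_ruby?(Box.new, "@visible") is true, whereas a name such as "hidden" is rejected with NameError and the helper returns false.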
1679 === Control structure
1681 VALUE rb_block_call(VALUE recv, ID mid, int argc, VALUE * argv, VALUE (*func) (ANYARGS), VALUE data2) ::
1683 Calls a method on the recv, with the method name specified by the
1684 symbol mid, with argc arguments in argv, supplying func as the
1685 block. When func is called as the block, it will receive the value
1686 from yield as the first argument, and data2 as the second argument.
1687 When yielded with multiple values (in C, rb_yield_values(),
1688 rb_yield_values2() and rb_yield_splat()), data2 is packed as an Array,
1689 whereas yielded values can be gotten via argc/argv of the third/fourth
1692 VALUE rb_block_call_kw(VALUE recv, ID mid, int argc, VALUE * argv, VALUE (*func) (ANYARGS), VALUE data2, int kw_splat) ::
1694 Same as rb_block_call, using +kw_splat+ to determine whether
1695 keyword arguments are passed.
1697 \[OBSOLETE] VALUE rb_iterate(VALUE (*func1)(), VALUE arg1, VALUE (*func2)(), VALUE arg2) ::
1699 Calls the function func1, supplying func2 as the block. func1 will be
1700 called with the argument arg1. func2 receives the value from yield as
1701 the first argument, arg2 as the second argument.
1703 When rb_iterate is used in 1.9, func1 has to call some Ruby-level method.
1704 This function is obsolete since 1.9; use rb_block_call instead.
1706 VALUE rb_yield(VALUE val) ::
1708 Yields val as a single argument to the block.
1710 VALUE rb_yield_values(int n, ...) ::
1712 Yields +n+ number of arguments to the block, using one C argument per Ruby
1715 VALUE rb_yield_values2(int n, VALUE *argv) ::
1717 Yields +n+ number of arguments to the block, with all Ruby arguments in the
1720 VALUE rb_yield_values_kw(int n, VALUE *argv, int kw_splat) ::
1722 Same as rb_yield_values2, using +kw_splat+ to determine whether
1723 keyword arguments are passed.
1725 VALUE rb_yield_splat(VALUE args) ::
1727 Same as rb_yield_values2, except arguments are specified by the Ruby
1730 VALUE rb_yield_splat_kw(VALUE args, int kw_splat) ::
1732 Same as rb_yield_splat, using +kw_splat+ to determine whether
1733 keyword arguments are passed.
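For orientation, the yield functions above correspond roughly to the following Ruby forms. The three method names are hypothetical and the sketch ignores the +kw_splat+ variants.

```ruby
# rb_yield(val): a single value is passed to the block.
def yield_one(val)
  yield val
end

# rb_yield_values / rb_yield_values2: several values, one per block parameter.
def yield_many(*vals)
  yield(*vals)
end

# rb_yield_splat(args): the values come from a Ruby array.
def yield_splat(args)
  yield(*args)
end
```

Each method hands its values to the block the same way the matching C function is described to do.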
1735 VALUE rb_rescue(VALUE (*func1)(ANYARGS), VALUE arg1, VALUE (*func2)(ANYARGS), VALUE arg2) ::
1737 Calls the function func1, with arg1 as the argument. If an exception
1738 occurs during func1, it calls func2 with arg2 as the first argument
1739 and the exception object as the second argument. The return value
1740 of rb_rescue() is the return value from func1 if no exception occurs,
1741 from func2 otherwise.
1743 VALUE rb_ensure(VALUE (*func1)(ANYARGS), VALUE arg1, VALUE (*func2)(ANYARGS), VALUE arg2) ::
1745 Calls the function func1 with arg1 as the argument, then calls func2
1746 with arg2 when execution of func1 terminates, whether normally or by an
1747 exception. The return value from rb_ensure() is that of func1 when no exception occurred.
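The control flow of rb_rescue() and rb_ensure() mirrors Ruby's own rescue and ensure clauses. The sketch below models both with lambdas standing in for the C function pointers; the method names rescue_like and ensure_like are hypothetical.

```ruby
# Analogue of rb_rescue(func1, arg1, func2, arg2): func2 receives arg2 and
# the exception object when func1 raises a StandardError.
def rescue_like(func1, arg1, func2, arg2)
  func1.call(arg1)
rescue StandardError => e
  func2.call(arg2, e)
end

# Analogue of rb_ensure(func1, arg1, func2, arg2): func2 always runs, and
# the result of func1 is returned when no exception occurred.
def ensure_like(func1, arg1, func2, arg2)
  func1.call(arg1)
ensure
  func2.call(arg2)
end
```

Note that the value produced by the ensure step is discarded, just as the return value of func2 is not the result of rb_ensure().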
1749 VALUE rb_protect(VALUE (*func) (VALUE), VALUE arg, int *state) ::
1751 Calls the function func with arg as the argument. If no exception
1752 occurred during func, it returns the result of func and *state is zero.
1753 Otherwise, it returns Qnil and sets *state to nonzero. If state is
1754 NULL, it is not set in either case.
1755 You have to clear the error info with rb_set_errinfo(Qnil) when
1756 ignoring the caught exception.
1758 void rb_jump_tag(int state) ::
1760 Continues the exception caught by rb_protect() and rb_eval_string_protect().
1761 state must be the returned value from those functions. This function
1762 never returns to the caller.
1764 void rb_iter_break() ::
1766 Exits from the current innermost block. This function never returns to
1769 void rb_iter_break_value(VALUE value) ::
1771 Exits from the current innermost block with the value. The block will
1772 return the given argument value. This function never returns to the
1775 === Exceptions and errors
1777 void rb_warn(const char *fmt, ...) ::
1779 Prints a warning message according to a printf-like format.
1781 void rb_warning(const char *fmt, ...) ::
1783 Prints a warning message according to a printf-like format, if
1786 void rb_raise(rb_eRuntimeError, const char *fmt, ...) ::
1788 Raises RuntimeError. The fmt is a format string just like printf().
1790 void rb_raise(VALUE exception, const char *fmt, ...) ::
1792 Raises a class exception. The fmt is a format string just like printf().
1794 void rb_fatal(const char *fmt, ...) ::
1796 Raises a fatal error, terminates the interpreter. No exception handling
1797 will be done for fatal errors, but ensure blocks will be executed.
1799 void rb_bug(const char *fmt, ...) ::
1801 Terminates the interpreter immediately. This function should be
1802 called in situations caused by a bug in the interpreter. Neither
1803 exception handling nor ensure execution will be done.
1805 Note: In the format string, "%"PRIsVALUE can be used for Object#to_s
1806 (or Object#inspect if '+' flag is set) output (and related argument
1807 must be a VALUE). Since it conflicts with "%i", for integers in
1808 format strings, use "%d".
1812 As of Ruby 1.9, Ruby supports native 1:1 threading with one kernel
1813 thread per Ruby Thread object. Currently, there is a GVL (Global VM Lock)
1814 which prevents simultaneous execution of Ruby code; it may be released
1815 by the rb_thread_call_without_gvl and rb_thread_call_without_gvl2 functions.
1816 These functions are tricky to use and documented in thread.c; do not
1817 use them before reading the comments in thread.c.
1819 void rb_thread_schedule(void) ::
1821 Give the scheduler a hint to pass execution to another thread.
1823 === Input/Output (IO) on a single file descriptor
1825 int rb_io_wait_readable(int fd) ::
1827 Wait indefinitely for the given FD to become readable, allowing other
1828 threads to be scheduled. Returns a true value if a read may be
1829 performed, false if there is an unrecoverable error.
1831 int rb_io_wait_writable(int fd) ::
1833 Like rb_io_wait_readable, but for writability.
1835 int rb_wait_for_single_fd(int fd, int events, struct timeval *timeout) ::
1837 Allows waiting on a single FD for one or multiple events with a
1840 +events+ is a mask of any combination of the following values:
1842 * RB_WAITFD_IN - wait for readability of normal data
1843 * RB_WAITFD_OUT - wait for writability
1844 * RB_WAITFD_PRI - wait for readability of urgent data
1846 Use a NULL +timeout+ to wait indefinitely.
1848 === I/O multiplexing
1850 Ruby supports I/O multiplexing based on the select(2) system call.
1851 The Linux select_tut(2) manpage
1852 <http://man7.org/linux/man-pages/man2/select_tut.2.html>
1853 provides a good overview on how to use select(2), and the Ruby API has
1854 analogous functions and data structures to the well-known select API.
1855 Understanding of select(2) is required to understand this section.
1857 typedef struct rb_fdset_t ::
1859 The data structure which wraps the fd_set bitmap used by select(2).
1860 This allows Ruby to use FD sets larger than that allowed by
1861 historic limitations on modern platforms.
1863 void rb_fd_init(rb_fdset_t *) ::
1865 Initializes the rb_fdset_t; it must be initialized before any other rb_fd_*
1866 operation. Analogous to calling malloc(3) to allocate an fd_set.
1868 void rb_fd_term(rb_fdset_t *) ::
1870 Destroys the rb_fdset_t, releasing any memory and resources it used.
1871 It must be reinitialized using rb_fd_init before future use.
1872 Analogous to calling free(3) to release memory for an fd_set.
1874 void rb_fd_zero(rb_fdset_t *) ::
1876 Clears all FDs from the rb_fdset_t, analogous to FD_ZERO(3).
1878 void rb_fd_set(int fd, rb_fdset_t *) ::
1880 Adds a given FD to the rb_fdset_t, analogous to FD_SET(3).
1882 void rb_fd_clr(int fd, rb_fdset_t *) ::
1884 Removes a given FD from the rb_fdset_t, analogous to FD_CLR(3).
1886 int rb_fd_isset(int fd, const rb_fdset_t *) ::
1888 Returns true if a given FD is set in the rb_fdset_t, false if not.
1889 Analogous to FD_ISSET(3).
1891 int rb_thread_fd_select(int nfds, rb_fdset_t *readfds, rb_fdset_t *writefds, rb_fdset_t *exceptfds, struct timeval *timeout) ::
1893 Analogous to the select(2) system call, but allows other Ruby
1894 threads to be scheduled while waiting.
1896 When only waiting on a single FD, favor rb_io_wait_readable,
1897 rb_io_wait_writable, or rb_wait_for_single_fd functions since
1898 they can be optimized for specific platforms (currently, only Linux).
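For readers more familiar with the Ruby side, IO.select is the Ruby-level counterpart of this select-style API. The sketch below waits on a single IO with a timeout; the helper name ready_to_read? is hypothetical, and the example shows the select semantics only, not the C functions themselves.

```ruby
# Returns true if +io+ becomes readable within +timeout+ seconds.
# IO.select returns nil when the timeout expires with nothing ready.
def ready_to_read?(io, timeout)
  !IO.select([io], nil, nil, timeout).nil?
end
```

With a pipe, ready_to_read? on the read end is false until something has been written to the write end.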
1900 === Initialize and start the interpreter
1902 The embedding API functions are below (not needed for extension libraries):
1906 Initializes the interpreter.
1908 void *ruby_options(int argc, char **argv) ::
1910 Processes command line arguments for the interpreter
1911 and compiles the Ruby source to execute.
1912 It returns an opaque pointer to the compiled source
1913 or an internal special value.
1915 int ruby_run_node(void *n) ::
1917 Runs the given compiled source and exits this process.
1918 It returns EXIT_SUCCESS if it successfully runs the source.
1919 Otherwise, it returns another value.
1921 void ruby_script(char *name) ::
1923 Specifies the name of the script ($0).
1925 === Hooks for the interpreter events
1927 void rb_add_event_hook(rb_event_hook_func_t func, rb_event_flag_t events, VALUE data) ::
1929 Adds a hook function for the specified interpreter events.
1930 events should be OR'ed value of:
1942 The definition of rb_event_hook_func_t is below:
1944 typedef void (*rb_event_hook_func_t)(rb_event_t event, VALUE data,
1945 VALUE self, ID id, VALUE klass)
1947 The third argument `data' to rb_add_event_hook() is passed to the hook
1948 function as the second argument, which was the pointer to the current
1949 NODE in 1.8. See RB_EVENT_HOOKS_HAVE_CALLBACK_DATA below.
1951 int rb_remove_event_hook(rb_event_hook_func_t func) ::
1953 Removes the specified hook function.
1957 void rb_gc_adjust_memory_usage(ssize_t diff) ::
1959 Adjusts the amount of registered external memory. You can tell GC how
1960 much memory is used by an external library by this function. Calling
1961 this function with positive diff means the memory usage is increased;
1962 new memory block is allocated or a block is reallocated as larger
1963 size. Calling this function with negative diff means the memory usage
1964 is decreased; a memory block is freed or a block is reallocated as
1965 smaller size. This function may trigger the GC.
1967 === Macros for compatibility
1969 Some macros to check API compatibilities are available by default.
1971 NORETURN_STYLE_NEW ::
1973 Means that NORETURN macro is functional style instead of prefix.
1975 HAVE_RB_DEFINE_ALLOC_FUNC ::
1977 Means that function rb_define_alloc_func() is provided, that means the
1978 allocation framework is used. This is the same as the result of
1979 have_func("rb_define_alloc_func", "ruby.h").
1981 HAVE_RB_REG_NEW_STR ::
1983 Means that function rb_reg_new_str() is provided, that creates Regexp
1984 object from String object. This is the same as the result of
1985 have_func("rb_reg_new_str", "ruby.h").
1989 Means that type rb_io_t is provided.
1991 USE_SYMBOL_AS_METHOD_NAME ::
1993 Means that Symbols will be returned as method names, e.g.,
1994 Module#methods, \#singleton_methods and so on.
1998 Defined in ruby.h and means corresponding header is available. For
1999 instance, when HAVE_RUBY_ST_H is defined you should use ruby/st.h not
2002 Header files corresponding to these macros may be <tt>#include</tt>-d
2003 directly from extension libraries.
2005 RB_EVENT_HOOKS_HAVE_CALLBACK_DATA ::
2007 Means that rb_add_event_hook() takes the third argument `data', to be
2008 passed to the given event hook function.
2010 === Defining backward compatible macros for keyword argument functions
2012 Most Ruby C extensions are designed to support multiple Ruby versions.
2013 In order to correctly support Ruby 2.7+ with regard to keyword
2014 argument separation, C extensions need to use <code>*_kw</code>
2015 functions. However, these functions do not exist in Ruby 2.6 and
2016 below, so in those cases macros should be defined to allow you to use
2017 the same code on multiple Ruby versions. Here are example macros
2018 you can use in extensions that support Ruby 2.6 (or below) when using
2019 the <code>*_kw</code> functions introduced in Ruby 2.7.
2021 #ifndef RB_PASS_KEYWORDS
2022 /* Only define macros on Ruby <2.7 */
2023 #define rb_funcallv_kw(o, m, c, v, kw) rb_funcallv(o, m, c, v)
2024 #define rb_funcallv_public_kw(o, m, c, v, kw) rb_funcallv_public(o, m, c, v)
2025 #define rb_funcall_passing_block_kw(o, m, c, v, kw) rb_funcall_passing_block(o, m, c, v)
2026 #define rb_funcall_with_block_kw(o, m, c, v, b, kw) rb_funcall_with_block(o, m, c, v, b)
2027 #define rb_scan_args_kw(kw, c, v, s, ...) rb_scan_args(c, v, s, __VA_ARGS__)
2028 #define rb_call_super_kw(c, v, kw) rb_call_super(c, v)
2029 #define rb_yield_values_kw(c, v, kw) rb_yield_values2(c, v)
2030 #define rb_yield_splat_kw(a, kw) rb_yield_splat(a)
2031 #define rb_block_call_kw(o, m, c, v, f, p, kw) rb_block_call(o, m, c, v, f, p)
2032 #define rb_fiber_resume_kw(o, c, v, kw) rb_fiber_resume(o, c, v)
2033 #define rb_fiber_yield_kw(c, v, kw) rb_fiber_yield(c, v)
2034 #define rb_enumeratorize_with_size_kw(o, m, c, v, f, kw) rb_enumeratorize_with_size(o, m, c, v, f)
2035 #define SIZED_ENUMERATOR_KW(obj, argc, argv, size_fn, kw_splat) \
2036 rb_enumeratorize_with_size((obj), ID2SYM(rb_frame_this_func()), \
2037 (argc), (argv), (size_fn))
2038 #define RETURN_SIZED_ENUMERATOR_KW(obj, argc, argv, size_fn, kw_splat) do { \
2039 if (!rb_block_given_p()) \
2040 return SIZED_ENUMERATOR(obj, argc, argv, size_fn); \
2041 } while (0)
2042 #define RETURN_ENUMERATOR_KW(obj, argc, argv, kw_splat) RETURN_SIZED_ENUMERATOR(obj, argc, argv, 0)
2043 #define rb_check_funcall_kw(o, m, c, v, kw) rb_check_funcall(o, m, c, v)
2044 #define rb_obj_call_init_kw(o, c, v, kw) rb_obj_call_init(o, c, v)
2045 #define rb_class_new_instance_kw(c, v, k, kw) rb_class_new_instance(c, v, k)
2046 #define rb_proc_call_kw(p, a, kw) rb_proc_call(p, a)
2047 #define rb_proc_call_with_block_kw(p, c, v, b, kw) rb_proc_call_with_block(p, c, v, b)
2048 #define rb_method_call_kw(c, v, m, kw) rb_method_call(c, v, m)
2049 #define rb_method_call_with_block_kw(c, v, m, b, kw) rb_method_call_with_block(c, v, m, b)
2050 #define rb_eval_cmd_kwd(c, a, kw) rb_eval_cmd(c, a, 0)
2051 #endif
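One property of the fallback macros above is worth spelling out: each
macro drops its final `kw' argument, so a call site may pass a constant
such as RB_PASS_KEYWORDS even on a Ruby where that name is not defined,
because the preprocessor discards the token before the compiler sees it.
The following self-contained sketch shows the same pattern outside of
Ruby. All names in it (demo_sum, demo_sum_kw, DEMO_PASS_KEYWORDS,
call_site) are hypothetical stand-ins and are not part of the Ruby C API.

```c
/* Hypothetical stand-in for an "old" API function that has no
 * keyword-aware variant.  It simply adds up its arguments. */
static int
demo_sum(int argc, const int *argv)
{
    int i, total = 0;

    for (i = 0; i < argc; i++) total += argv[i];
    return total;
}

/* DEMO_PASS_KEYWORDS plays the role of RB_PASS_KEYWORDS.  It is
 * deliberately left undefined, as it would be on an old version. */
#ifndef DEMO_PASS_KEYWORDS
/* Fallback: route the *_kw name to the old function and drop `kw'. */
#define demo_sum_kw(argc, argv, kw) demo_sum(argc, argv)
#endif

/* The call site is written once, in the "new" style.  The undefined
 * DEMO_PASS_KEYWORDS token is discarded during macro expansion. */
int
call_site(void)
{
    int args[] = {1, 2, 3};

    return demo_sum_kw(3, args, DEMO_PASS_KEYWORDS);
}
```

In this sketch call_site() returns 6. The same reasoning is why a call
such as rb_funcallv_kw(obj, mid, argc, argv, RB_PASS_KEYWORDS) can still
compile on Ruby 2.6 once the fallback macros are in place.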
2053 == Appendix C. Functions available for use in extconf.rb
2055 See documentation for {mkmf}[rdoc-ref:MakeMakefile].
2057 == Appendix D. Generational GC
2059 Ruby 2.1 introduced a generational garbage collector (called RGenGC).
2060 RGenGC (mostly) keeps compatibility with existing extension libraries.
2062 Generally, the use of the technique called write barriers is required in
2063 extension libraries for generational GC
2064 (https://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29).
2065 However, RGenGC works fine without write barriers in extension libraries.
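To make the term concrete, the following is a toy model of what a write
barrier does in a generational collector in general: when a reference
from an old object to a young object is stored, the old object is
recorded in a "remembered set", so that a minor GC, which scans only
young objects, can still find the young object. This is a conceptual
sketch only. It is not RGenGC's implementation, and every name in it
(toy_obj, toy_write, toy_write_raw, toy_remembered, toy_minor_gc_sees)
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_REMEMBERED_MAX 16

/* A toy heap object: generation flags and a single reference slot. */
struct toy_obj {
    int is_old;              /* 1 if promoted to the old generation */
    int remembered;          /* 1 if already in the remembered set */
    struct toy_obj *ref;     /* the only outgoing reference */
};

/* Remembered set: old objects that may point at young objects. */
static struct toy_obj *toy_remembered[TOY_REMEMBERED_MAX];
static size_t toy_remembered_len = 0;

/* Store without a barrier: the collector learns nothing about it. */
static void
toy_write_raw(struct toy_obj *parent, struct toy_obj *child)
{
    parent->ref = child;
}

/* Store with a write barrier: an old-to-young reference is recorded. */
static void
toy_write(struct toy_obj *parent, struct toy_obj *child)
{
    parent->ref = child;
    if (parent->is_old && child != NULL && !child->is_old &&
        !parent->remembered) {
        assert(toy_remembered_len < TOY_REMEMBERED_MAX);
        toy_remembered[toy_remembered_len++] = parent;
        parent->remembered = 1;
    }
}

/* A minor GC skips old objects, so in this model a young object is only
 * known to be referenced from the old generation if its parent was
 * recorded in the remembered set. */
static int
toy_minor_gc_sees(const struct toy_obj *young)
{
    size_t i;

    for (i = 0; i < toy_remembered_len; i++) {
        if (toy_remembered[i]->ref == young) return 1;
    }
    return 0;
}
```

In this model, a young object stored through toy_write_raw() is
invisible to the minor GC, which is the kind of bug a missing write
barrier can cause in a collector that relies on them.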
2067 If your library adheres to the following tips, performance can
2068 be further improved. In particular, the "Don't touch pointers directly" section is
2069 important.
2071 === Incompatibility
2073 You can't write the RBASIC(obj)->klass field directly because it is a const
2074 value now.
2076 Basically you should not write this field because MRI expects it to be
2077 an immutable field, but if you want to do it in your extension you can
2078 use the following functions:
2080 VALUE rb_obj_hide(VALUE obj) ::
2082 Clears the RBasic::klass field. The object becomes an internal object,
2083 which ObjectSpace::each_object can't find.
2085 VALUE rb_obj_reveal(VALUE obj, VALUE klass) ::
2087 Resets RBasic::klass to klass.
2088 `klass' is expected to be the class that was hidden by rb_obj_hide().
2090 === Write barriers
2092 RGenGC doesn't require write barriers to support generational GC.
2093 However, paying attention to write barriers can improve the performance of
2094 RGenGC. Please check the following tips.
2096 ==== Don't touch pointers directly
2098 MRI (include/ruby/ruby.h) provides some macros for acquiring pointers to
2099 internal data structures, such as RARRAY_PTR(),
2100 RSTRUCT_PTR() and so on.
2102 DO NOT USE THESE MACROS and instead use the corresponding C-APIs such as
2103 rb_ary_aref(), rb_ary_store() and so on.
2105 ==== Consider whether to insert write barriers
2107 You don't need to care about write barriers if you only use built-in
2108 types.
2110 If you support T_DATA objects, you may consider using write barriers.
2112 Inserting write barriers into T_DATA objects only pays off for the
2113 following kinds of objects: (a) long-lived objects, (b) objects that are
2114 generated in huge numbers and \(c) container-type objects that have
2115 references to other objects. If your extension provides such a type of
2116 T_DATA objects, consider inserting write barriers.
2118 (a): short-lived objects don't become old generation objects.
2119 (b): having only a few oldgen objects has no performance impact.
2120 \(c): having only a few references has no performance impact.
2122 Inserting write barriers is a very difficult hack, and it is easy to
2123 introduce critical bugs. Inserting write barriers also has several areas
2124 of overhead. Basically we don't recommend that you insert write barriers.
2125 Please carefully consider the risks.
2127 ==== Combine with built-in types
2129 Please consider utilizing built-in types. Most built-in types support
2130 write barriers, so you can use them to avoid manually inserting write
2131 barriers.
2133 For example, if your T_DATA has references to other objects, then you
2134 can move these references to an Array. The T_DATA object then only has a
2135 reference to the array object. Alternatively, you can use a Struct object to
2136 gather a T_DATA object (without any references) and an Array that contains
2137 the references.
2139 With such techniques, you don't need to insert write barriers.
2142 ==== Insert write barriers
2144 \[AGAIN] Inserting write barriers is a very difficult hack, and it is
2145 easy to introduce critical bugs. And inserting write barriers has
2146 several areas of overhead. Basically we don't recommend you insert write
2147 barriers. Please carefully consider the risks.
2149 Before inserting write barriers, you need to know about the RGenGC algorithm
2150 (gc.c will help you). Macros and functions to insert write barriers are
2151 available in include/ruby/ruby.h. An example is available in iseq.c.
2153 For a complete guide for RGenGC and write barriers, please refer to
2154 <https://bugs.ruby-lang.org/projects/ruby-master/wiki/RGenGC>.
2156 == Appendix E. RB_GC_GUARD to protect from premature GC
2158 C Ruby currently uses conservative garbage collection, thus VALUE
2159 variables must remain visible on the stack or registers to ensure any
2160 associated data remains usable. Optimizing C compilers are not designed
2161 with conservative garbage collection in mind, so they may optimize away
2162 the original VALUE even if the code depends on data associated with that
2163 VALUE.
2165 The following example illustrates the use of RB_GC_GUARD to ensure
2166 the contents of sptr remain valid while the second invocation of
2167 rb_str_new_cstr is running.
2169 VALUE s, w;
2170 const char *sptr;
2172 s = rb_str_new_cstr("hello world!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!");
2173 sptr = RSTRING_PTR(s);
2174 w = rb_str_new_cstr(sptr + 6); /* Possible GC invocation */
2176 RB_GC_GUARD(s); /* ensure s (and thus sptr) do not get GC-ed */
2178 In the above example, RB_GC_GUARD must be placed _after_ the last use of
2179 sptr. Placing RB_GC_GUARD before dereferencing sptr would be of no use.
2180 RB_GC_GUARD is only effective on the VALUE data type, not on converted C
2181 data structures.
2183 RB_GC_GUARD would not be necessary at all in the above example if
2184 non-inlined function calls were made on the `s' VALUE after sptr is
2185 dereferenced. Thus, in the above example, calling any un-inlined
2186 function on `s' after the last use of sptr
2190 will ensure `s' stays on the stack or in a register, preventing a
2191 GC invocation from prematurely freeing it.
2193 Using the RB_GC_GUARD macro is preferable to using the "volatile"
2194 keyword in C. RB_GC_GUARD has the following advantages:
2196 1. the intent of the macro use is clear
2198 2. RB_GC_GUARD only affects its call site, whereas "volatile" generates some
2199 extra code every time the variable is used, hurting optimization.
2201 3. "volatile" implementations may be buggy/inconsistent in some
2202 compilers and architectures. RB_GC_GUARD is customizable for broken
2203 systems/compilers without negatively affecting other systems.
2205 == Appendix F. Ractor support
2207 Ractors are the parallel execution mechanism introduced in Ruby 3.0.
2208 Ractors can run in parallel on different OS threads (using underlying
2209 system-provided threads), so a C extension should be thread-safe. A C extension that
2210 can run in multiple ractors is called "Ractor-safe".
2212 Ractor safety around C extensions has the following properties:
2213 1. By default, all C extensions are recognized as Ractor-unsafe.
2214 2. Ractor-unsafe C-methods may only be called from the main Ractor. If invoked
2215 by a non-main Ractor, then a Ractor::UnsafeError is raised.
2216 3. If an extension wants to be marked as Ractor-safe, the extension should
2217 call rb_ext_ractor_safe(true) in the Init_ function for the extension, and
2218 all defined methods will be marked as Ractor-safe.
2220 To make a "Ractor-safe" C extension, we need to check the following points:
2222 1. Do not share unshareable objects between ractors
2224 For example, a C global variable can lead to sharing an unshareable object
2225 between ractors.
2227 VALUE g_var;
2228 VALUE set(VALUE self, VALUE v){ return g_var = v; }
2229 VALUE get(VALUE self){ return g_var; }
2231 The set() and get() pair can share an unshareable object through g_var,
2232 which makes it Ractor-unsafe.
2234 Besides direct use of global variables, indirect data structures
2235 such as a global st_table can also share objects, so please take care.
2237 Note that class and module objects are shareable objects, so you can
2238 keep the code "cFoo = rb_define_class(...)" with C's global variables.
2240 2. Check the thread-safety of the extension
2242 An extension should be thread-safe. For example, the following code is
2243 not thread-safe:
2245 bool g_called = false;
2246 VALUE call(VALUE self) {
2247 if (g_called) rb_raise(rb_eRuntimeError, "recursive call is not allowed.");
2248 g_called = true;
2249 VALUE ret = do_something();
2250 g_called = false;
2251 return ret;
2252 }
2254 because accesses to the g_called global variable must be synchronized with
2255 the threads of other ractors. To avoid such a data race, some synchronization
2256 should be used. Check include/ruby/thread_native.h and include/ruby/atomic.h.
2258 With Ractors, all objects given as method parameters and the receiver (self)
2259 are guaranteed to be from the current Ractor or to be shareable. As a
2260 consequence, it is easier to make code ractor-safe than to make code generally
2261 thread-safe. For example, we don't need to lock an array object to access its
2262 elements.
2264 3. Check the thread-safety of any used library
2266 If the extension relies on an external library, such as a function foo() from
2267 a library libfoo, the function foo() of libfoo should be thread-safe.
2269 4. Make an object shareable
2271 This is not required to make an extension Ractor-safe.
2273 If an extension provides special objects defined by rb_data_type_t,
2274 consider whether these objects can become shareable.
2276 The RUBY_TYPED_FROZEN_SHAREABLE flag indicates that these objects can be
2277 shareable objects if they are frozen. This means that once the object
2278 is frozen, mutation of the wrapped data is not allowed.
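For point 2 above, one possible way to make the g_called guard free of
data races is an atomic test-and-set. The sketch below uses C11
<stdatomic.h> rather than the helpers in include/ruby/atomic.h, purely so
that it compiles without Ruby headers; that substitution is an
assumption of this example. guarded_call() and this do_something() are
hypothetical names, and the function returns -1 where a real extension
would raise an exception.

```c
#include <stdatomic.h>

/* Process-wide "in use" flag, shared by every thread and ractor. */
static atomic_flag g_called = ATOMIC_FLAG_INIT;

/* Hypothetical workload standing in for do_something() in the text. */
static int
do_something(void)
{
    return 42;
}

/* Returns the result of do_something(), or -1 if a call is already in
 * progress (a real extension would raise an exception here instead). */
int
guarded_call(void)
{
    int ret;

    /* Atomically set the flag and learn its previous value.  Only one
     * caller at a time can observe "previously clear". */
    if (atomic_flag_test_and_set(&g_called)) {
        return -1;
    }
    ret = do_something();
    atomic_flag_clear(&g_called);   /* allow the next call */
    return ret;
}
```

Note that in a real extension do_something() may raise, in which case
the flag would have to be cleared on that path as well, for example
with rb_ensure(). Also note that this design rejects a concurrent call
from another ractor rather than making it wait.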
2282 There are possibly other points or requirements which must be considered in the
2283 making of a Ractor-safe extension. This document will be extended as they are
2284 discovered.