2003-12-26 Guilhem Lavaux <guilhem@kaffe.org>
1 <!DOCTYPE article PUBLIC "-//Davenport//DTD DocBook V3.0//EN">
2 <article>
3 <artheader>
4 <title>The Cygnus Native Interface for C++/Java Integration</title>
5 <subtitle>Writing native Java methods in natural C++</subtitle>
6 <authorgroup>
7 <corpauthor>Cygnus Solutions</corpauthor>
8 </authorgroup>
9 <date>March, 2000</date>
10 </artheader>
12 <abstract><para>
13 This documents CNI, the Cygnus Native Interface,
which is a convenient way to write Java native methods using C++.
15 This is a more efficient, more convenient, but less portable
16 alternative to the standard JNI (Java Native Interface).</para>
17 </abstract>
19 <sect1><title>Basic Concepts</title>
20 <para>
In terms of language features, Java is mostly a subset
22 of C++. Java has a few important extensions, plus a powerful standard
23 class library, but on the whole that does not change the basic similarity.
24 Java is a hybrid object-oriented language, with a few native types,
25 in addition to class types. It is class-based, where a class may have
26 static as well as per-object fields, and static as well as instance methods.
27 Non-static methods may be virtual, and may be overloaded. Overloading is
28 resolved at compile time by matching the actual argument types against
29 the parameter types. Virtual methods are implemented using indirect calls
30 through a dispatch table (virtual function table). Objects are
31 allocated on the heap, and initialized using a constructor method.
32 Classes are organized in a package hierarchy.
33 </para>
34 <para>
35 All of the listed attributes are also true of C++, though C++ has
36 extra features (for example in C++ objects may be allocated not just
37 on the heap, but also statically or in a local stack frame). Because
38 <acronym>gcj</acronym> uses the same compiler technology as
39 <acronym>g++</acronym> (the GNU C++ compiler), it is possible
40 to make the intersection of the two languages use the same
41 <acronym>ABI</acronym> (object representation and calling conventions).
42 The key idea in <acronym>CNI</acronym> is that Java objects are C++ objects,
43 and all Java classes are C++ classes (but not the other way around).
44 So the most important task in integrating Java and C++ is to
45 remove gratuitous incompatibilities.
46 </para>
47 <para>
48 You write CNI code as a regular C++ source file. (You do have to use
49 a Java/CNI-aware C++ compiler, specifically a recent version of G++.)</para>
50 <para>
51 You start with:
52 <programlisting>
53 #include &lt;gcj/cni.h&gt;
54 </programlisting></para>
56 <para>
57 You then include header files for the various Java classes you need
58 to use:
59 <programlisting>
60 #include &lt;java/lang/Character.h&gt;
61 #include &lt;java/util/Date.h&gt;
62 #include &lt;java/lang/IndexOutOfBoundsException.h&gt;
63 </programlisting></para>
65 <para>
66 In general, <acronym>CNI</acronym> functions and macros start with the
67 `<literal>Jv</literal>' prefix, for example the function
68 `<literal>JvNewObjectArray</literal>'. This convention is used to
69 avoid conflicts with other libraries.
70 Internal functions in <acronym>CNI</acronym> start with the prefix
71 `<literal>_Jv_</literal>'. You should not call these;
72 if you find a need to, let us know and we will try to come up with an
73 alternate solution. (This manual lists <literal>_Jv_AllocBytes</literal>
74 as an example; <acronym>CNI</acronym> should instead provide
75 a <literal>JvAllocBytes</literal> function.)</para>
76 <para>
77 These header files are automatically generated by <command>gcjh</command>.
78 </para>
79 </sect1>
81 <sect1><title>Packages</title>
82 <para>
83 The only global names in Java are class names, and packages.
84 A <firstterm>package</firstterm> can contain zero or more classes, and
85 also zero or more sub-packages.
86 Every class belongs to either an unnamed package or a package that
87 has a hierarchical and globally unique name.
88 </para>
89 <para>
90 A Java package is mapped to a C++ <firstterm>namespace</firstterm>.
91 The Java class <literal>java.lang.String</literal>
92 is in the package <literal>java.lang</literal>, which is a sub-package
93 of <literal>java</literal>. The C++ equivalent is the
94 class <literal>java::lang::String</literal>,
95 which is in the namespace <literal>java::lang</literal>,
96 which is in the namespace <literal>java</literal>.
97 </para>
98 <para>
99 Here is how you could express this:
100 <programlisting>
// Declare the class(es), possibly in a header file:
namespace java {
  namespace lang {
    class Object;
    class String;
  }
}

class java::lang::String : public java::lang::Object
{ ... };
114 </programlisting>
115 </para>
116 <para>
The <literal>gcjh</literal> tool automatically generates the
necessary namespace declarations.</para>
120 <sect2><title>Nested classes as a substitute for namespaces</title>
121 <para>
g++ gained complete namespace support only fairly recently, and
<literal>libgcj</literal> was changed to use namespaces at the end of
February 1999. Earlier releases used nested classes, which are the
C++ equivalent of Java inner classes. They provide similar
(though less convenient) functionality.
128 The old syntax is:
129 <programlisting>
class java {
  class lang {
    class Object;
    class String;
  };
};
136 </programlisting>
137 The obvious difference is the use of <literal>class</literal> instead
138 of <literal>namespace</literal>. The more important difference is
139 that all the members of a nested class have to be declared inside
140 the parent class definition, while namespaces can be defined in
multiple places in the source. The namespace form is more convenient, since it
corresponds more closely to how Java packages are defined.
143 The main difference is in the declarations; the syntax for
144 using a nested class is the same as with namespaces:
145 <programlisting>
146 class java::lang::String : public java::lang::Object
147 { ... }
148 </programlisting>
149 Note that the generated code (including name mangling)
150 using nested classes is the same as that using namespaces.</para>
151 </sect2>
153 <sect2><title>Leaving out package names</title>
154 <para>
Always typing the fully-qualified class name is verbose, and it
makes it harder to move a class to a different package.
158 The Java <literal>package</literal> declaration specifies that the
159 following class declarations are in the named package, without having
160 to explicitly name the full package qualifiers.
161 The <literal>package</literal> declaration can be followed by zero or
more <literal>import</literal> declarations, which allow either
163 a single class or all the classes in a package to be named by a simple
164 identifier. C++ provides something similar
165 with the <literal>using</literal> declaration and directive.
166 </para>
167 <para>
168 A Java simple-type-import declaration:
169 <programlisting>
170 import <replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable>;
171 </programlisting>
172 allows using <replaceable>TypeName</replaceable> as a shorthand for
173 <literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>.
174 The C++ (more-or-less) equivalent is a <literal>using</literal>-declaration:
175 <programlisting>
176 using <replaceable>PackageName</replaceable>::<replaceable>TypeName</replaceable>;
177 </programlisting>
178 </para>
179 <para>
180 A Java import-on-demand declaration:
181 <programlisting>
182 import <replaceable>PackageName</replaceable>.*;
183 </programlisting>
184 allows using <replaceable>TypeName</replaceable> as a shorthand for
<literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>.
186 The C++ (more-or-less) equivalent is a <literal>using</literal>-directive:
187 <programlisting>
188 using namespace <replaceable>PackageName</replaceable>;
189 </programlisting>
190 </para>
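<para>
The two forms can be compared side by side in plain C++. The following is
only a sketch: the <literal>demo::util</literal> namespace and its classes are
hypothetical stand-ins for a Java package, not part of <literal>libgcj</literal>.
</para>

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a Java package hierarchy (not real libgcj classes).
namespace demo {
  namespace util {
    struct Date   { std::string name() const { return "demo.util.Date"; } };
    struct Vector { std::string name() const { return "demo.util.Vector"; } };
  }
}

// Like "import demo.util.Date;": only Date becomes visible unqualified.
std::string single_type_import()
{
  using demo::util::Date;
  Date d;
  return d.name();
}

// Like "import demo.util.*;": every name in the namespace becomes visible.
std::string import_on_demand()
{
  using namespace demo::util;
  Vector v;
  return v.name();
}
```

<para>
Unlike a Java <literal>import</literal>, both C++ forms may also be written
inside a function body, which keeps the shortened names local to that function.
</para>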
191 </sect2>
192 </sect1>
194 <sect1><title>Primitive types</title>
195 <para>
Java provides eight <quote>primitive</quote> types:
197 <literal>byte</literal>, <literal>short</literal>, <literal>int</literal>,
198 <literal>long</literal>, <literal>float</literal>, <literal>double</literal>,
199 <literal>char</literal>, and <literal>boolean</literal>.
200 These are the same as the following C++ <literal>typedef</literal>s
201 (which are defined by <literal>gcj/cni.h</literal>):
202 <literal>jbyte</literal>, <literal>jshort</literal>, <literal>jint</literal>,
203 <literal>jlong</literal>, <literal>jfloat</literal>,
204 <literal>jdouble</literal>,
205 <literal>jchar</literal>, and <literal>jboolean</literal>.
206 You should use the C++ typenames
207 (<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>jint</literal>),
and not the Java type names
209 (<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>int</literal>),
210 even if they are <quote>the same</quote>.
211 This is because there is no guarantee that the C++ type
212 <literal>int</literal> is a 32-bit type, but <literal>jint</literal>
213 <emphasis>is</emphasis> guaranteed to be a 32-bit type.
215 <informaltable frame="all" colsep="1" rowsep="0">
216 <tgroup cols="3">
217 <thead>
218 <row>
219 <entry>Java type</entry>
220 <entry>C/C++ typename</entry>
221 <entry>Description</entry>
</row>
</thead>
223 <tbody>
224 <row>
225 <entry>byte</entry>
226 <entry>jbyte</entry>
227 <entry>8-bit signed integer</entry>
228 </row>
229 <row>
230 <entry>short</entry>
231 <entry>jshort</entry>
232 <entry>16-bit signed integer</entry>
233 </row>
234 <row>
235 <entry>int</entry>
236 <entry>jint</entry>
237 <entry>32-bit signed integer</entry>
238 </row>
239 <row>
240 <entry>long</entry>
241 <entry>jlong</entry>
242 <entry>64-bit signed integer</entry>
243 </row>
244 <row>
245 <entry>float</entry>
246 <entry>jfloat</entry>
247 <entry>32-bit IEEE floating-point number</entry>
248 </row>
249 <row>
250 <entry>double</entry>
251 <entry>jdouble</entry>
252 <entry>64-bit IEEE floating-point number</entry>
253 </row>
254 <row>
255 <entry>char</entry>
256 <entry>jchar</entry>
257 <entry>16-bit Unicode character</entry>
258 </row>
259 <row>
260 <entry>boolean</entry>
261 <entry>jboolean</entry>
262 <entry>logical (Boolean) values</entry>
263 </row>
264 <row>
265 <entry>void</entry>
266 <entry>void</entry>
267 <entry>no value</entry>
268 </row>
269 </tbody></tgroup>
270 </informaltable>
271 </para>
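<para>
The reason for preferring the fixed-width names can be illustrated without
<literal>libgcj</literal>. In this sketch the <literal>my_j*</literal> typedefs
are hypothetical stand-ins built on <literal>&lt;cstdint&gt;</literal>; real
CNI code should use the typedefs from <literal>gcj/cni.h</literal> instead.
</para>

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-ins; real CNI code gets jbyte, jint, ... from <gcj/cni.h>.
typedef std::int8_t   my_jbyte;
typedef std::int16_t  my_jshort;
typedef std::int32_t  my_jint;
typedef std::int64_t  my_jlong;
typedef std::uint16_t my_jchar;

// Width of a type in bits, as listed in the table above.
template <class T>
int bit_width()
{
  return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// The C++ standard only promises that plain int has at least 16 bits,
// so code that needs exactly 32 bits should not rely on it.
static_assert(sizeof(int) * CHAR_BIT >= 16, "int is at least 16 bits wide");
static_assert(sizeof(my_jint) * CHAR_BIT == 32, "my_jint is exactly 32 bits wide");
```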
273 <para>
274 <funcsynopsis>
275 <funcdef><function>JvPrimClass</function></funcdef>
276 <paramdef><parameter>primtype</parameter></paramdef>
277 </funcsynopsis>
278 This is a macro whose argument should be the name of a primitive
279 type, <ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase>
280 <literal>byte</literal>.
281 The macro expands to a pointer to the <literal>Class</literal> object
282 corresponding to the primitive type.
283 <ForeignPhrase><Abbrev>E.g.</Abbrev></ForeignPhrase>,
284 <literal>JvPrimClass(void)</literal>
285 has the same value as the Java expression
286 <literal>Void.TYPE</literal> (or <literal>void.class</literal>).
287 </para>
289 </sect1>
291 <sect1><title>Objects and Classes</title>
292 <sect2><title>Classes</title>
293 <para>
294 All Java classes are derived from <literal>java.lang.Object</literal>.
C++ does not have a unique <quote>root</quote> class, but we use
296 a C++ <literal>java::lang::Object</literal> as the C++ version
297 of the <literal>java.lang.Object</literal> Java class. All
298 other Java classes are mapped into corresponding C++ classes
299 derived from <literal>java::lang::Object</literal>.</para>
300 <para>
301 Interface inheritance (the <quote><literal>implements</literal></quote>
302 keyword) is currently not reflected in the C++ mapping.</para>
303 </sect2>
304 <sect2><title>Object references</title>
305 <para>
306 We implement a Java object reference as a pointer to the start
307 of the referenced object. It maps to a C++ pointer.
308 (We cannot use C++ references for Java references, since
309 once a C++ reference has been initialized, you cannot change it to
310 point to another object.)
311 The <literal>null</literal> Java reference maps to the <literal>NULL</literal>
312 C++ pointer.
313 </para>
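<para>
The difference between pointers and C++ references can be seen in a few
lines. This is a sketch in plain C++; <literal>Node</literal> is a hypothetical
stand-in for a Java class and does not derive from
<literal>java::lang::Object</literal>.
</para>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a Java class.
struct Node { int value; };

// A pointer models a Java reference: it may be null, like Java's null.
int follow(Node *ref)
{
  return ref == NULL ? -1 : ref->value;
}

int reseat_demo()
{
  Node a = { 1 };
  Node b = { 2 };
  Node *ref = &a;   // like the Java statement: Node ref = a;
  ref = &b;         // like: ref = b;  (a C++ reference cannot be re-pointed)
  return follow(ref);
}
```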
314 <para>
315 Note that in some Java implementations an object reference is implemented as
316 a pointer to a two-word <quote>handle</quote>. One word of the handle
317 points to the fields of the object, while the other points
318 to a method table. Gcj does not use this extra indirection.
319 </para>
320 </sect2>
321 <sect2><title>Object fields</title>
322 <para>
323 Each object contains an object header, followed by the instance
324 fields of the class, in order. The object header consists of
325 a single pointer to a dispatch or virtual function table.
326 (There may be extra fields <quote>in front of</quote> the object,
327 for example for
328 memory management, but this is invisible to the application, and
329 the reference to the object points to the dispatch table pointer.)
330 </para>
331 <para>
332 The fields are laid out in the same order, alignment, and size
as in C++. Specifically, 8-bit and 16-bit native types
334 (<literal>byte</literal>, <literal>short</literal>, <literal>char</literal>,
335 and <literal>boolean</literal>) are <emphasis>not</emphasis>
336 widened to 32 bits.
337 Note that the Java VM does extend 8-bit and 16-bit types to 32 bits
338 when on the VM stack or temporary registers.</para>
339 <para>
340 If you include the <literal>gcjh</literal>-generated header for a
341 class, you can access fields of Java classes in the <quote>natural</quote>
342 way. Given the following Java class:
343 <programlisting>
public class Int
{
  public int i;
  public Int (int i) { this.i = i; }
  public static Int zero = new Int(0);
}
351 you can write:
352 <programlisting>
#include &lt;gcj/cni.h&gt;
#include &lt;Int.h&gt;

Int*
mult (Int *p, jint k)
{
  if (k == 0)
    return Int::zero;  // static member access.
  return new Int(p-&gt;i * k);
}
363 </para>
364 <para>
365 <acronym>CNI</acronym> does not strictly enforce the Java access
366 specifiers, because Java permissions cannot be directly mapped
into C++ permissions. Private Java fields and methods are mapped
368 to private C++ fields and methods, but other fields and methods
369 are mapped to public fields and methods.
370 </para>
371 </sect2>
372 </sect1>
374 <sect1><title>Arrays</title>
375 <para>
376 While in many ways Java is similar to C and C++,
377 it is quite different in its treatment of arrays.
378 C arrays are based on the idea of pointer arithmetic,
379 which would be incompatible with Java's security requirements.
380 Java arrays are true objects (array types inherit from
381 <literal>java.lang.Object</literal>). An array-valued variable
382 is one that contains a reference (pointer) to an array object.
383 </para>
384 <para>
385 Referencing a Java array in C++ code is done using the
<literal>JArray</literal> template, which is defined as follows:
387 <programlisting>
class __JArray : public java::lang::Object
{
public:
  int length;
};

template&lt;class T&gt;
class JArray : public __JArray
{
  T data[0];
public:
  T&amp; operator[](jint i) { return data[i]; }
};
401 </programlisting></para>
402 <para>
403 <funcsynopsis>
404 <funcdef>template&lt;class T&gt; T *<function>elements</function></funcdef>
405 <paramdef>JArray&lt;T&gt; &amp;<parameter>array</parameter></paramdef>
406 </funcsynopsis>
407 This template function can be used to get a pointer to the
408 elements of the <parameter>array</parameter>.
409 For instance, you can fetch a pointer
410 to the integers that make up an <literal>int[]</literal> like so:
411 <programlisting>
412 extern jintArray foo;
413 jint *intp = elements (foo);
414 </programlisting>
415 The name of this function may change in the future.</para>
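<para>
The layout idea behind <literal>JArray</literal> (a length word followed
directly by the elements) can be modelled in portable C++. The following is a
simplified sketch, not the <literal>libgcj</literal> implementation:
<literal>ArrayHeader</literal>, <literal>Array</literal>,
<literal>new_array</literal> and <literal>my_jint</literal> are hypothetical
names, the memory comes from <literal>calloc</literal> rather than the garbage
collector, and this <literal>elements</literal> takes a pointer.
</para>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

typedef std::int32_t my_jint;   // hypothetical stand-in for jint

// Stand-in for __JArray: just the length word. The real class also
// derives from java::lang::Object.
struct ArrayHeader { my_jint length; };

template <class T>
struct Array : ArrayHeader
{
  // Byte offset of the first element: header size rounded up to T's alignment.
  static std::size_t data_offset()
  {
    return (sizeof(ArrayHeader) + alignof(T) - 1) / alignof(T) * alignof(T);
  }

  // The elements live in the same allocation, right after the header.
  T *data()
  {
    return reinterpret_cast<T *>(reinterpret_cast<char *>(this) + data_offset());
  }

  T &operator[](my_jint i) { return data()[i]; }
};

// Counterpart of the elements() helper: a raw pointer to the first element.
template <class T>
T *elements(Array<T> *array)
{
  return array->data();
}

// Allocate the header and zero-filled elements as one block.
template <class T>
Array<T> *new_array(my_jint length)
{
  std::size_t bytes = Array<T>::data_offset()
                      + sizeof(T) * static_cast<std::size_t>(length);
  Array<T> *a = static_cast<Array<T> *>(std::calloc(1, bytes));
  if (a != NULL)
    a->length = length;
  return a;
}
```

<para>
As with the real template, <literal>operator[]</literal> here performs no
bounds check, so the caller is expected to compare the index against
<literal>length</literal> first.
</para>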
416 <para>
417 There are a number of typedefs which correspond to typedefs from JNI.
418 Each is the type of an array holding objects of the appropriate type:
419 <programlisting>
420 typedef __JArray *jarray;
421 typedef JArray&lt;jobject&gt; *jobjectArray;
422 typedef JArray&lt;jboolean&gt; *jbooleanArray;
423 typedef JArray&lt;jbyte&gt; *jbyteArray;
424 typedef JArray&lt;jchar&gt; *jcharArray;
425 typedef JArray&lt;jshort&gt; *jshortArray;
426 typedef JArray&lt;jint&gt; *jintArray;
427 typedef JArray&lt;jlong&gt; *jlongArray;
428 typedef JArray&lt;jfloat&gt; *jfloatArray;
429 typedef JArray&lt;jdouble&gt; *jdoubleArray;
430 </programlisting>
431 </para>
432 <para>
433 You can create an array of objects using this function:
434 <funcsynopsis>
435 <funcdef>jobjectArray <function>JvNewObjectArray</function></funcdef>
436 <paramdef>jint <parameter>length</parameter></paramdef>
437 <paramdef>jclass <parameter>klass</parameter></paramdef>
438 <paramdef>jobject <parameter>init</parameter></paramdef>
439 </funcsynopsis>
440 Here <parameter>klass</parameter> is the type of elements of the array;
441 <parameter>init</parameter> is the initial
442 value to be put into every slot in the array.
443 </para>
444 <para>
445 For each primitive type there is a function which can be used
446 to create a new array holding that type. The name of the function
447 is of the form
448 `<literal>JvNew&lt;<replaceable>Type</replaceable>&gt;Array</literal>',
449 where `&lt;<replaceable>Type</replaceable>&gt;' is the name of
450 the primitive type, with its initial letter in upper-case. For
451 instance, `<literal>JvNewBooleanArray</literal>' can be used to create
452 a new array of booleans.
453 Each such function follows this example:
454 <funcsynopsis>
455 <funcdef>jbooleanArray <function>JvNewBooleanArray</function></funcdef>
456 <paramdef>jint <parameter>length</parameter></paramdef>
457 </funcsynopsis>
458 </para>
459 <para>
460 <funcsynopsis>
461 <funcdef>jsize <function>JvGetArrayLength</function></funcdef>
462 <paramdef>jarray <parameter>array</parameter></paramdef>
463 </funcsynopsis>
464 Returns the length of <parameter>array</parameter>.</para>
465 </sect1>
467 <sect1><title>Methods</title>
469 <para>
470 Java methods are mapped directly into C++ methods.
471 The header files generated by <literal>gcjh</literal>
472 include the appropriate method definitions.
473 Basically, the generated methods have the same names and
474 <quote>corresponding</quote> types as the Java methods,
475 and are called in the natural manner.</para>
477 <sect2><title>Overloading</title>
478 <para>
479 Both Java and C++ provide method overloading, where multiple
480 methods in a class have the same name, and the correct one is chosen
481 (at compile time) depending on the argument types.
482 The rules for choosing the correct method are (as expected) more complicated
483 in C++ than in Java, but given a set of overloaded methods
484 generated by <literal>gcjh</literal> the C++ compiler will choose
485 the expected one.</para>
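<para>
A small sketch of compile-time overload selection follows. The
<literal>Printer</literal> class and the <literal>my_j*</literal> typedefs are
hypothetical stand-ins, not <literal>gcjh</literal> output; the casts at the
call site make the argument type explicit, which is one more reason to use the
<literal>j</literal>-prefixed type names.
</para>

```cpp
#include <cassert>
#include <cstdint>
#include <string>

typedef std::int32_t my_jint;     // hypothetical stand-ins for jint, jlong, jdouble
typedef std::int64_t my_jlong;
typedef double       my_jdouble;

// Three methods share one name; the compiler picks one from the argument type.
struct Printer
{
  static std::string describe(my_jint)    { return "int"; }
  static std::string describe(my_jlong)   { return "long"; }
  static std::string describe(my_jdouble) { return "double"; }
};

// Each call is resolved at compile time, before the program runs.
std::string describe_all()
{
  return Printer::describe(my_jint(1)) + ","
       + Printer::describe(my_jlong(1)) + ","
       + Printer::describe(my_jdouble(1.0));
}
```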
486 <para>
487 Common assemblers and linkers are not aware of C++ overloading,
488 so the standard implementation strategy is to encode the
489 parameter types of a method into its assembly-level name.
490 This encoding is called <firstterm>mangling</firstterm>,
491 and the encoded name is the <firstterm>mangled name</firstterm>.
492 The same mechanism is used to implement Java overloading.
493 For C++/Java interoperability, it is important that both the Java
494 and C++ compilers use the <emphasis>same</emphasis> encoding scheme.
495 </para>
496 </sect2>
498 <sect2><title>Static methods</title>
499 <para>
500 Static Java methods are invoked in <acronym>CNI</acronym> using the standard
501 C++ syntax, using the `<literal>::</literal>' operator rather
502 than the `<literal>.</literal>' operator. For example:
503 </para>
504 <programlisting>
505 jint i = java::lang::Math::round((jfloat) 2.3);
506 </programlisting>
<para>
A static native method is defined using standard C++ method
definition syntax. For example:
<programlisting>
#include &lt;java/lang/Integer.h&gt;

java::lang::Integer*
java::lang::Integer::getInteger(jstring str)
{
  ...
}
</programlisting>
</para>
519 </sect2>
521 <sect2><title>Object Constructors</title>
522 <para>
523 Constructors are called implicitly as part of object allocation
524 using the <literal>new</literal> operator. For example:
525 <programlisting>
java::lang::Integer *x = new java::lang::Integer(234);
527 </programlisting>
528 </para>
529 <para>
Java does not allow a constructor to be a native method.
Instead, define a private native method and have the
constructor call it.
534 </para>
535 </sect2>
537 <sect2><title>Instance methods</title>
538 <para>
Virtual method dispatch is handled essentially the same way
in C++ and Java, <abbrev>i.e.</abbrev> by an indirect call through
a function pointer stored in a per-class virtual function table.
C++ is more complicated because it has to support multiple
inheritance, but this does not affect Java classes.
However, G++ has historically used a different calling convention
that is not compatible with the one used by <acronym>gcj</acronym>.
During 1999, G++ will switch to a new ABI that is compatible with
<acronym>gcj</acronym>. Some platforms (including Linux) have already
changed. On other platforms, you will have to pass
the <literal>-fvtable-thunks</literal> flag to g++ when
compiling <acronym>CNI</acronym> code. Note that you must also compile
your C++ source code with <literal>-fno-rtti</literal>.
553 </para>
554 <para>
555 Calling a Java instance method in <acronym>CNI</acronym> is done
556 using the standard C++ syntax. For example:
557 <programlisting>
558 java::lang::Number *x;
559 if (x-&gt;doubleValue() &gt; 0.0) ...
560 </programlisting>
561 </para>
562 <para>
563 Defining a Java native instance method is also done the natural way:
564 <programlisting>
#include &lt;java/lang/Integer.h&gt;

jdouble
java::lang::Integer::doubleValue()
{
  return (jdouble) value;
}
572 </para>
573 </sect2>
575 <sect2><title>Interface method calls</title>
576 <para>
577 In Java you can call a method using an interface reference.
578 This is not yet supported in <acronym>CNI</acronym>.</para>
579 </sect2>
580 </sect1>
582 <sect1><title>Object allocation</title>
584 <para>
585 New Java objects are allocated using a
586 <firstterm>class-instance-creation-expression</firstterm>:
587 <programlisting>
588 new <replaceable>Type</replaceable> ( <replaceable>arguments</replaceable> )
589 </programlisting>
590 The same syntax is used in C++. The main difference is that
591 C++ objects have to be explicitly deleted; in Java they are
592 automatically deleted by the garbage collector.
593 Using <acronym>CNI</acronym>, you can allocate a new object
594 using standard C++ syntax. The C++ compiler is smart enough to
595 realize the class is a Java class, and hence it needs to allocate
596 memory from the garbage collector. If you have overloaded
597 constructors, the compiler will choose the correct one
598 using standard C++ overload resolution rules. For example:
599 <programlisting>
600 java::util::Hashtable *ht = new java::util::Hashtable(120);
601 </programlisting>
602 </para>
603 <para>
604 <funcsynopsis>
605 <funcdef>void *<function>_Jv_AllocBytes</function></funcdef>
606 <paramdef>jsize <parameter>size</parameter></paramdef>
607 </funcsynopsis>
608 Allocate <parameter>size</parameter> bytes. This memory is not
609 scanned by the garbage collector. However, it will be freed by
610 the GC if no references to it are discovered.
611 </para>
612 </sect1>
614 <sect1><title>Interfaces</title>
615 <para>
616 A Java class can <firstterm>implement</firstterm> zero or more
617 <firstterm>interfaces</firstterm>, in addition to inheriting from
618 a single base class.
619 An interface is a collection of constants and method specifications;
620 it is similar to the <firstterm>signatures</firstterm> available
621 as a G++ extension. An interface provides a subset of the
622 functionality of C++ abstract virtual base classes, but they
623 are currently implemented differently.
624 CNI does not currently provide any support for interfaces,
625 or calling methods from an interface pointer.
626 This is partly because we are planning to re-do how
627 interfaces are implemented in <acronym>gcj</acronym>.
628 </para>
629 </sect1>
631 <sect1><title>Strings</title>
632 <para>
633 <acronym>CNI</acronym> provides a number of utility functions for
634 working with Java <literal>String</literal> objects.
635 The names and interfaces are analogous to those of <acronym>JNI</acronym>.
636 </para>
638 <para>
639 <funcsynopsis>
640 <funcdef>jstring <function>JvNewString</function></funcdef>
641 <paramdef>const jchar *<parameter>chars</parameter></paramdef>
642 <paramdef>jsize <parameter>len</parameter></paramdef>
643 </funcsynopsis>
644 Creates a new Java String object, where
645 <parameter>chars</parameter> are the contents, and
646 <parameter>len</parameter> is the number of characters.
647 </para>
649 <para>
650 <funcsynopsis>
651 <funcdef>jstring <function>JvNewStringLatin1</function></funcdef>
652 <paramdef>const char *<parameter>bytes</parameter></paramdef>
653 <paramdef>jsize <parameter>len</parameter></paramdef>
654 </funcsynopsis>
655 Creates a new Java String object, where <parameter>bytes</parameter>
656 are the Latin-1 encoded
657 characters, and <parameter>len</parameter> is the length of
658 <parameter>bytes</parameter>, in bytes.
659 </para>
661 <para>
662 <funcsynopsis>
663 <funcdef>jstring <function>JvNewStringLatin1</function></funcdef>
664 <paramdef>const char *<parameter>bytes</parameter></paramdef>
665 </funcsynopsis>
666 Like the first JvNewStringLatin1, but computes <parameter>len</parameter>
667 using <literal>strlen</literal>.
668 </para>
670 <para>
671 <funcsynopsis>
672 <funcdef>jstring <function>JvNewStringUTF</function></funcdef>
673 <paramdef>const char *<parameter>bytes</parameter></paramdef>
674 </funcsynopsis>
675 Creates a new Java String object, where <parameter>bytes</parameter> are
676 the UTF-8 encoded characters of the string, terminated by a null byte.
677 </para>
679 <para>
680 <funcsynopsis>
681 <funcdef>jchar *<function>JvGetStringChars</function></funcdef>
682 <paramdef>jstring <parameter>str</parameter></paramdef>
683 </funcsynopsis>
684 Returns a pointer to the array of characters which make up a string.
685 </para>
687 <para>
688 <funcsynopsis>
689 <funcdef> int <function>JvGetStringUTFLength</function></funcdef>
690 <paramdef>jstring <parameter>str</parameter></paramdef>
691 </funcsynopsis>
Returns the number of bytes required to encode the contents
693 of <parameter>str</parameter> as UTF-8.
694 </para>
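<para>
To make the byte count concrete, here is a sketch of how such a length could
be computed from 16-bit characters. It is not the <literal>libgcj</literal>
implementation: <literal>utf8_length</literal> and
<literal>my_jchar</literal> are hypothetical names, and the sketch assumes
standard UTF-8, in which a surrogate pair takes four bytes. The rules that
<literal>libgcj</literal> applies to surrogates and to the zero character may
differ.
</para>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint16_t my_jchar;  // hypothetical stand-in for jchar

// Number of bytes needed to encode `len` 16-bit characters as standard UTF-8.
std::size_t utf8_length(const my_jchar *chars, std::size_t len)
{
  std::size_t bytes = 0;
  for (std::size_t i = 0; i < len; ++i)
    {
      my_jchar c = chars[i];
      if (c < 0x80)
        bytes += 1;                 // ASCII
      else if (c < 0x800)
        bytes += 2;                 // e.g. Latin-1 letters with accents
      else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < len
               && chars[i + 1] >= 0xDC00 && chars[i + 1] <= 0xDFFF)
        {
          bytes += 4;               // surrogate pair: one character above U+FFFF
          ++i;                      // the low surrogate is consumed as well
        }
      else
        bytes += 3;                 // rest of the 16-bit range
    }
  return bytes;
}
```

<para>
A count of this kind is what you would use to size a buffer before asking for
the UTF-8 form of a string.
</para>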
696 <para>
697 <funcsynopsis>
698 <funcdef> jsize <function>JvGetStringUTFRegion</function></funcdef>
699 <paramdef>jstring <parameter>str</parameter></paramdef>
700 <paramdef>jsize <parameter>start</parameter></paramdef>
701 <paramdef>jsize <parameter>len</parameter></paramdef>
702 <paramdef>char *<parameter>buf</parameter></paramdef>
703 </funcsynopsis>
704 This puts the UTF-8 encoding of a region of the
705 string <parameter>str</parameter> into
706 the buffer <parameter>buf</parameter>.
The region of the string to fetch is specified by
708 <parameter>start</parameter> and <parameter>len</parameter>.
709 It is assumed that <parameter>buf</parameter> is big enough
710 to hold the result. Note
711 that <parameter>buf</parameter> is <emphasis>not</emphasis> null-terminated.
712 </para>
713 </sect1>
715 <sect1><title>Class Initialization</title>
716 <para>
717 Java requires that each class be automatically initialized at the time
718 of the first active use. Initializing a class involves
719 initializing the static fields, running code in class initializer
720 methods, and initializing base classes. There may also be
some implementation-specific actions, such as allocating
722 <classname>String</classname> objects corresponding to string literals in
723 the code.</para>
724 <para>
725 The Gcj compiler inserts calls to <literal>JvInitClass</literal> (actually
726 <literal>_Jv_InitClass</literal>) at appropriate places to ensure that a
class is initialized when required. The C++ compiler does not
insert these calls automatically; it is the programmer's
729 responsibility to make sure classes are initialized. However,
730 this is fairly painless because of the conventions assumed by the Java
731 system.</para>
732 <para>
733 First, <literal>libgcj</literal> will make sure a class is initialized
before an instance of that class is created. This is one
735 of the responsibilities of the <literal>new</literal> operation. This is
736 taken care of both in Java code, and in C++ code. (When the G++
737 compiler sees a <literal>new</literal> of a Java class, it will call
738 a routine in <literal>libgcj</literal> to allocate the object, and that
739 routine will take care of initializing the class.) It follows that you can
740 access an instance field, or call an instance (non-static)
741 method and be safe in the knowledge that the class and all
742 of its base classes have been initialized.</para>
743 <para>
744 Invoking a static method is also safe. This is because the
745 Java compiler adds code to the start of a static method to make sure
746 the class is initialized. However, the C++ compiler does not
747 add this extra code. Hence, if you write a native static method
748 using CNI, you are responsible for calling <literal>JvInitClass</literal>
749 before doing anything else in the method (unless you are sure
750 it is safe to leave it out).</para>
751 <para>
752 Accessing a static field also requires the class of the
753 field to be initialized. The Java compiler will generate code
754 to call <literal>_Jv_InitClass</literal> before getting or setting the field.
755 However, the C++ compiler will not generate this extra code,
756 so it is your responsibility to make sure the class is
757 initialized before you access a static field.</para>
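<para>
The pattern of initializing before first use can be sketched in plain C++.
Everything here is hypothetical: <literal>Config</literal> stands in for a Java
class with a static field, and <literal>init_class</literal> plays the role
that <literal>JvInitClass</literal> plays in real CNI code. The sketch is not
thread-safe.
</para>

```cpp
#include <cassert>

// Hypothetical stand-in for a Java class with a static field and initializer.
struct Config
{
  static bool initialized;
  static int  init_count;   // how many times the initializer actually ran
  static int  limit;        // like a Java static field set by a static initializer

  // Plays the role of JvInitClass: runs the initializer at most once.
  static void init_class()
  {
    if (initialized)
      return;
    initialized = true;
    ++init_count;
    limit = 120;
  }

  // Shape of a native static method: initialize first, then use static state.
  static int get_limit()
  {
    init_class();
    return limit;
  }
};

bool Config::initialized = false;
int  Config::init_count  = 0;
int  Config::limit       = 0;
```

<para>
Calling the initializer on every entry is cheap once the class is set up,
because the call returns immediately after the first time.
</para>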
758 </sect1>
759 <sect1><title>Exception Handling</title>
760 <para>
761 While C++ and Java share a common exception handling framework,
762 things are not yet perfectly integrated. The main issue is that the
763 <quote>run-time type information</quote> facilities of the two
764 languages are not integrated.</para>
765 <para>
766 Still, things work fairly well. You can throw a Java exception from
767 C++ using the ordinary <literal>throw</literal> construct, and this
768 exception can be caught by Java code. Similarly, you can catch an
769 exception thrown from Java using the C++ <literal>catch</literal>
construct.</para>
771 <para>
772 Note that currently you cannot mix C++ catches and Java catches in
773 a single C++ translation unit. We do intend to fix this eventually.
774 </para>
775 <para>
776 Here is an example:
777 <programlisting>
if (i &gt;= count)
  throw new java::lang::IndexOutOfBoundsException();
780 </programlisting>
781 </para>
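<para>
Note that the exception is thrown as a pointer, so a C++ handler has to catch
a pointer as well. The sketch below shows this in plain C++ with hypothetical
classes (<literal>Throwable</literal> and <literal>IndexError</literal> are
stand-ins, not the <literal>libgcj</literal> classes). Because plain C++ has no
garbage collector, the handler deletes the object; CNI code would leave that to
the collector.
</para>

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for a small Java exception hierarchy.
struct Throwable
{
  virtual ~Throwable() {}
  virtual std::string name() const { return "Throwable"; }
};

struct IndexError : Throwable
{
  std::string name() const { return "IndexError"; }
};

int checked_get(const int *data, int count, int i)
{
  if (i < 0 || i >= count)
    throw new IndexError();     // thrown by pointer, as in the example above
  return data[i];
}

std::string try_get(int i)
{
  static const int data[] = { 10, 20, 30 };
  try
    {
      checked_get(data, 3, i);
      return "ok";
    }
  catch (Throwable *e)          // an IndexError* is caught through its base class
    {
      std::string what = e->name();
      delete e;                 // needed only because this sketch has no collector
      return what;
    }
}
```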
782 <para>
783 Normally, GNU C++ will automatically detect when you are writing C++
784 code that uses Java exceptions, and handle them appropriately.
785 However, if C++ code only needs to execute destructors when Java
786 exceptions are thrown through it, GCC will guess incorrectly. Sample
787 problematic code:
788 <programlisting>
789 struct S { ~S(); };
790 extern void bar(); // is implemented in Java and may throw exceptions
791 void foo()
792 {
793   S s;
794   bar();
795 }
796 </programlisting>
797 The usual effect of an incorrect guess is a link failure, complaining of
798 a missing routine called <literal>__gxx_personality_v0</literal>.
799 </para>
800 <para>
801 You can inform the compiler that Java exceptions are to be used in a
802 translation unit, irrespective of what it might think, by writing
803 <literal>#pragma GCC java_exceptions</literal> at the head of the
804 file. This <literal>#pragma</literal> must appear before any
805 functions that throw or catch exceptions, or run destructors when
806 exceptions are thrown through them.</para>
807 </sect1>
809 <sect1><title>Synchronization</title>
810 <para>
811 Each Java object has an implicit monitor.
812 The Java VM uses the instruction <literal>monitorenter</literal> to acquire
813 and lock a monitor, and <literal>monitorexit</literal> to release it.
814 The JNI has corresponding methods <literal>MonitorEnter</literal>
815 and <literal>MonitorExit</literal>. The corresponding CNI macros
816 are <literal>JvMonitorEnter</literal> and <literal>JvMonitorExit</literal>.
817 </para>
818 <para>
819 The Java source language does not provide direct access to these primitives.
820 Instead, there is a <literal>synchronized</literal> statement that does an
821 implicit <literal>monitorenter</literal> before entry to the block,
822 and does a <literal>monitorexit</literal> on exit from the block.
823 Note that the lock has to be released even if the block is terminated
824 abnormally by an exception, which means there is an implicit
825 <literal>try</literal>-<literal>finally</literal>.
826 </para>
827 <para>
828 From C++, it makes sense to use a destructor to release a lock.
829 CNI defines the following utility class.
830 <programlisting>
831 class JvSynchronize {
832   jobject obj;
833 public:
834   JvSynchronize(jobject o) { obj = o; JvMonitorEnter(o); }
835   ~JvSynchronize() { JvMonitorExit(obj); }
836 };
836 </programlisting>
837 The equivalent of Java's:
838 <programlisting>
839 synchronized (OBJ) { CODE; }
840 </programlisting>
841 can be simply expressed:
842 <programlisting>
843 { JvSynchronize dummy(OBJ); CODE; }
844 </programlisting>
845 </para>
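<para>
The reason a destructor suits this job is that it runs on every exit
from the block, including exit by exception. The following sketch
shows this in plain C++, without the gcj runtime.
<literal>FakeObject</literal>, <literal>monitor_enter</literal>,
<literal>monitor_exit</literal>, <literal>Synchronize</literal> and
<literal>guarded</literal> are hypothetical stand-ins for a Java
object, <literal>JvMonitorEnter</literal>,
<literal>JvMonitorExit</literal> and <literal>JvSynchronize</literal>.
A recursive mutex is assumed as the model because Java monitors are
re-entrant.</para>

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for a Java object with its implicit monitor.
struct FakeObject {
  std::recursive_mutex monitor;
  int held = 0;   // nesting depth, kept only so the effect is visible
};

// Stand-ins for JvMonitorEnter and JvMonitorExit.
inline void monitor_enter(FakeObject *o) { o->monitor.lock(); ++o->held; }
inline void monitor_exit(FakeObject *o) { --o->held; o->monitor.unlock(); }

// Same shape as the JvSynchronize class shown above: lock in the
// constructor, unlock in the destructor.
class Synchronize {
  FakeObject *obj;
public:
  explicit Synchronize(FakeObject *o) : obj(o) { monitor_enter(obj); }
  ~Synchronize() { monitor_exit(obj); }
  Synchronize(const Synchronize &) = delete;
  Synchronize &operator=(const Synchronize &) = delete;
};

// Returns the nesting depth observed while the monitor is held, or
// throws if asked to. The monitor is released on both paths.
int guarded(FakeObject *o, bool fail) {
  Synchronize dummy(o);
  if (fail)
    throw std::runtime_error("abnormal exit");
  return o->held;
}
```

<para>
Copying is disabled in the sketch so that a guard cannot release the
same monitor twice. Whether the real <literal>JvSynchronize</literal>
takes the same precaution is not stated here.</para>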
846 <para>
847 Java also has methods with the <literal>synchronized</literal> attribute.
848 This is equivalent to wrapping the entire method body in a
849 <literal>synchronized</literal> statement.
850 (Alternatively, an implementation could require the caller to do
851 the synchronization. This is not practical for a compiler, because
852 each virtual method call would have to test at run-time if
853 synchronization is needed.) Since in <literal>gcj</literal>
854 the <literal>synchronized</literal> attribute is handled by the
855 method implementation, it is up to the programmer
856 of a synchronized native method to handle the synchronization
857 (in the C++ implementation of the method).
858 In other words, you need to manually add <literal>JvSynchronize</literal>
859 in a <literal>native synchronized</literal> method.</para>
860 </sect1>
862 <sect1><title>Reflection</title>
863 <para>The types <literal>jfieldID</literal> and <literal>jmethodID</literal>
864 are as in JNI.</para>
865 <para>
866 The functions <literal>JvFromReflectedField</literal>,
867 <literal>JvFromReflectedMethod</literal>,
868 <literal>JvToReflectedField</literal>, and
869 <literal>JvToReflectedMethod</literal> (as in Java 2 JNI)
870 will be added shortly, as will other functions corresponding to JNI.</para>
871 </sect1>
872 <sect1><title>Using gcjh</title>
873 <para>
874 The <command>gcjh</command> program is used to generate C++ header files from
875 Java class files. By default, <command>gcjh</command> generates
876 a relatively straightforward C++ header file. However, there
877 are a few caveats to its use, and a few options which can be
878 used to change how it operates:
879 </para>
880 <variablelist>
881 <varlistentry>
882 <term><literal>--classpath</literal> <replaceable>path</replaceable></term>
883 <term><literal>--CLASSPATH</literal> <replaceable>path</replaceable></term>
884 <term><literal>-I</literal> <replaceable>dir</replaceable></term>
885 <listitem><para>
886 These options can be used to set the class path for gcjh.
887 Gcjh searches the class path the same way the compiler does;
888 these options have their familiar meanings.</para>
889 </listitem>
890 </varlistentry>
892 <varlistentry>
893 <term><literal>-d <replaceable>directory</replaceable></literal></term>
894 <listitem><para>
895 Puts the generated <literal>.h</literal> files
896 beneath <replaceable>directory</replaceable>.</para>
897 </listitem>
898 </varlistentry>
900 <varlistentry>
901 <term><literal>-o <replaceable>file</replaceable></literal></term>
902 <listitem><para>
903 Sets the name of the <literal>.h</literal> file to be generated.
904 By default the <literal>.h</literal> file is named after the class.
905 This option only really makes sense if just a single class file
906 is specified.</para>
907 </listitem>
908 </varlistentry>
910 <varlistentry>
911 <term><literal>--verbose</literal></term>
912 <listitem><para>
913 gcjh will print information to stderr as it works.</para>
914 </listitem>
915 </varlistentry>
917 <varlistentry>
918 <term><literal>-M</literal></term>
919 <term><literal>-MM</literal></term>
920 <term><literal>-MD</literal></term>
921 <term><literal>-MMD</literal></term>
922 <listitem><para>
923 These options can be used to generate dependency information
924 for the generated header file. They work the same way as the
925 corresponding compiler options.</para>
926 </listitem>
927 </varlistentry>
929 <varlistentry>
930 <term><literal>-prepend <replaceable>text</replaceable></literal></term>
931 <listitem><para>
932 This causes the <replaceable>text</replaceable> to be put into the generated
933 header just after class declarations (but before declaration
934 of the current class). This option should be used with caution.</para>
935 </listitem>
936 </varlistentry>
938 <varlistentry>
939 <term><literal>-friend <replaceable>text</replaceable></literal></term>
940 <listitem><para>
941 This causes the <replaceable>text</replaceable> to be put into the class
942 declaration after a <literal>friend</literal> keyword.
943 This can be used to declare some
944 other class or function to be a friend of this class.
945 This option should be used with caution.</para>
946 </listitem>
947 </varlistentry>
949 <varlistentry>
950 <term><literal>-add <replaceable>text</replaceable></literal></term>
951 <listitem><para>
952 The <replaceable>text</replaceable> is inserted into the class declaration.
953 This option should be used with caution.</para>
954 </listitem>
955 </varlistentry>
957 <varlistentry>
958 <term><literal>-append <replaceable>text</replaceable></literal></term>
959 <listitem><para>
960 The <replaceable>text</replaceable> is inserted into the header file
961 after the class declaration. One use for this is to generate
962 inline functions. This option should be used with caution.</para>
963 </listitem>
964 </varlistentry>
965 </variablelist>
966 <para>
967 Any argument that does not begin with a <literal>-</literal> is treated
968 as the name of a class for which a header should be generated.</para>
969 <para>
970 gcjh will generate all the required namespace declarations and
971 <literal>#include</literal> directives for the header file.
972 In some situations, gcjh will generate simple inline member
973 functions. Note that, while gcjh puts <literal>#pragma
974 interface</literal> in the generated header file, you should
975 <emphasis>not</emphasis> put <literal>#pragma implementation</literal>
976 into your C++ source file. If you do, duplicate definitions of
977 inline functions will sometimes be created, leading to link-time
978 errors.
979 </para>
980 <para>
981 There are a few cases where gcjh will fail to work properly:</para>
982 <para>
983 gcjh assumes that all the methods and fields of a class have ASCII
984 names. The C++ compiler cannot correctly handle non-ASCII
985 identifiers. gcjh does not currently diagnose this problem.</para>
986 <para>
987 gcjh also cannot fully handle classes where a field and a method have
988 the same name. If the field is static, an error will result.
989 Otherwise, the field will be renamed in the generated header;
990 <literal>__</literal> will be appended to the field name.</para>
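<para>
The renaming is needed because a C++ class cannot have a data member
and a member function with the same name. The fragment below is a
hand-written illustration of the result, not actual gcjh output. It
assumes a hypothetical Java class <literal>Buffer</literal> that
declares both a field and a method called <literal>size</literal>;
<literal>jint_t</literal> is a local stand-in for CNI's
<literal>jint</literal>.</para>

```cpp
// Hand-written illustration, not actual gcjh output.
typedef int jint_t;   // stand-in for CNI's jint

class Buffer {
public:
  // Java field "size", renamed by appending "__". (Names containing a
  // double underscore are formally reserved in C++; the name is used
  // here only to mirror the renaming described in the text.)
  jint_t size__;

  // Java method "size" keeps its original name.
  jint_t size() { return size__; }
};
```

<para>
Native code that touches the field must therefore use the renamed
member, while calls to the method are written as usual.</para>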
991 <para>
992 Eventually we hope to change the C++ compiler so that these
993 restrictions can be lifted.</para>
994 </sect1>
996 </article>