1 \input texinfo @c -*-texinfo-*-
4 @setfilename hacking.info
5 @settitle GNU Classpath Hacker's Guide
11 This file contains important information you will need to know if you
12 are going to hack on the GNU Classpath project code.
14 Copyright (C) 1998,1999,2000,2001,2002,2003,2004, 2005 Free Software Foundation, Inc.
17 @dircategory GNU Libraries
19 * Classpath Hacking: (hacking). GNU Classpath Hacker's Guide
25 @title GNU Classpath Hacker's Guide
27 @author Paul N. Fisher
29 @author C. Brian Jones
30 @author Mark J. Wielaard
33 @vskip 0pt plus 1filll
34 Copyright @copyright{} 1998,1999,2000,2001,2002,2003,2004 Free Software Foundation, Inc.
36 Permission is granted to make and distribute verbatim copies of
37 this document provided the copyright notice and this permission notice
38 are preserved on all copies.
40 Permission is granted to copy and distribute modified versions of this
41 document under the conditions for verbatim copying, provided that the
42 entire resulting derived work is distributed under the terms of a
43 permission notice identical to this one.
45 Permission is granted to copy and distribute translations of this manual
46 into another language, under the above conditions for modified versions,
47 except that this permission notice may be stated in a translation
48 approved by the Free Software Foundation.
53 @node Top, Introduction, (dir), (dir)
54 @top GNU Classpath Hacker's Guide
56 This document contains important information you'll want to know if
57 you want to hack on GNU Classpath, Essential Libraries for Java, to
58 help create free core class libraries for use with virtual machines
59 and compilers for the java programming language.
63 * Introduction:: An introduction to the GNU Classpath project
64 * Requirements:: Very important rules that must be followed
65 * Volunteering:: So you want to help out
66 * Project Goals:: Goals of the GNU Classpath project
67 * Needed Tools and Libraries:: A list of programs and libraries you will need
68 * Programming Standards:: Standards to use when writing code
69 * Hacking Code:: Working on code, Working with others
70 * Programming Goals:: What to consider when writing code
71 * API Compatibility:: How to handle serialization and deprecated methods
72 * Specification Sources:: Where to find class library specs
73 * Naming Conventions:: How files and directories are named
74 * Character Conversions:: Working on Character conversions
75 * Localization:: How to handle localization/internationalization
78 --- The Detailed Node Listing ---
82 * Source Code Style Guide::
84 Working on the code, Working with others
87 * Writing ChangeLogs::
91 * Writing ChangeLogs::
95 * Portability:: Writing Portable Software
96 * Utility Classes:: Reusing Software
97 * Robustness:: Writing Robust Software
98 * Java Efficiency:: Writing Efficient Java
99 * Native Efficiency:: Writing Efficient JNI
100 * Security:: Writing Secure Software
104 * Serialization:: Serialization
105 * Deprecated Methods:: Deprecated methods
109 * String Collation:: Sorting strings in different locales
110 * Break Iteration:: Breaking up text into words, sentences, and lines
111 * Date Formatting and Parsing:: Locale specific date handling
112 * Decimal/Currency Formatting and Parsing:: Locale specific number handling
117 @node Introduction, Requirements, Top, Top
118 @comment node-name, next, previous, up
119 @chapter Introduction
121 The GNU Classpath Project is dedicated to providing a 100% free,
122 clean room implementation of the standard core class libraries for
123 compilers and runtime environments for the java programming language.
124 It offers free software developers an alternative core library
125 implementation upon which larger java-like programming environments
126 can be built. The GNU Classpath Project was started in the Spring of
127 1998 as an official Free Software Foundation project. Most of the
128 volunteers working on GNU Classpath do so in their spare time, but a
129 couple of projects based on GNU Classpath have paid programmers to
130 improve the core libraries. We appreciate everyone's efforts in the
131 past to improve and help the project and look forward to future
132 contributions by old and new members alike.
134 @node Requirements, Volunteering, Introduction, Top
135 @comment node-name, next, previous, up
136 @chapter Requirements
138 Although GNU Classpath is following an open development model where input
139 from developers is welcome, there are certain base requirements that
140 need to be met by anyone who wants to contribute code to this project.
141 They are mostly dictated by legal requirements and are not arbitrary
142 restrictions chosen by the GNU Classpath team.
144 You will need to adhere to the following things if you want to donate
145 code to the GNU Classpath project:
149 @strong{Never under any circumstances refer to proprietary code while
150 working on GNU Classpath.} It is best if you have never looked at
151 alternative proprietary core library code at all. To reduce
152 temptation, it would be best if you deleted the @file{src.zip} file
153 from your proprietary JDK distribution (note that recent versions of
154 GNU Classpath and the compilers and environments built on it are
155 mature enough to not need any proprietary implementation at all when
156 working on GNU Classpath, except in exceptional cases where you need
157 to test compatibility issues pointed out by users). If you have
158 signed Sun's non-disclosure statement, then you unfortunately cannot
159 work on Classpath code at all. If you have any reason to believe that
160 your code might be ``tainted'', please say something on the mailing
161 list before writing anything. If it turns out that your code was not
162 developed in a clean room environment, we could be very embarrassed
163 someday in court. Please don't let that happen.
166 @strong{Never decompile proprietary class library implementations.} While
167 the wording of the license in Sun's Java 2 releases has changed, it is
168 not acceptable, under any circumstances, for a person working on
169 GNU Classpath to decompile Sun's class libraries. Allowing the use of
170 decompilation in the GNU Classpath project would open up a giant can of
171 legal worms, which we wish to avoid.
174 Classpath is licensed under the terms of the
175 @uref{http://www.fsf.org/copyleft/gpl.html,GNU General Public
176 License}, with a special exception included to allow linking with
177 non-GPL licensed works as long as no other license would restrict such
178 linking. To preserve freedom for all users and to maintain uniform
179 licensing of Classpath, we will not accept code into the main
180 distribution that is not licensed under these terms. The exact
181 wording of the license of the current version of GNU Classpath can be
182 found online from the
183 @uref{http://www.gnu.org/software/classpath/license.html, GNU
184 Classpath license page} and is of course distributed with the current
185 snapshot release from @uref{ftp://ftp.gnu.org/gnu/classpath/} or by
186 obtaining a copy of the current CVS tree.
189 GNU Classpath is GNU software and this project is being officially sponsored
190 by the @uref{http://www.fsf.org/,Free Software Foundation}. Because of
191 this, the FSF will hold copyright to all code developed as part of
192 GNU Classpath. This will allow them to pursue copyright violators in court,
193 something an individual developer may have neither the time nor
194 resources to do. Everyone contributing code to GNU Classpath will need to
195 sign a copyright assignment statement. Additionally, if you are
196 employed as a programmer, your employer may need to sign a copyright
197 waiver disclaiming all interest in the software. This may sound harsh,
198 but unfortunately, it is the only way to ensure that the code you write
199 is legally yours to distribute.
202 @node Volunteering, Project Goals, Requirements, Top
203 @comment node-name, next, previous, up
204 @chapter Volunteering to Help
206 The GNU Classpath project needs volunteers to help us out. People are
207 needed to write unimplemented core packages, to test GNU Classpath on
208 free software programs written in the java programming language, to
209 test it on various platforms, and to port it to platforms that are
210 currently unsupported.
212 While pretty much all contributions are welcome (but
213 @pxref{Requirements}) it is always preferable that volunteers do the
214 whole job when volunteering for a task. So when you volunteer to write
215 a Java package, please be willing to do the following:
219 Implement a complete drop-in replacement for the particular package.
220 That means implementing any ``internal'' classes. For example, in the
221 java.net package, there are non-public classes for implementing sockets.
222 Without those classes, the public socket interface is useless. But do
223 not feel obligated to completely implement all of the functionality at
224 once. For example, in the java.net package, there are different types
225 of protocol handlers for different types of URLs. Not all of these
226 need to be written at once.
229 Please write complete and thorough API documentation comments for
230 every public and protected method and variable. These should be
231 superior to Sun's and cover everything about the item being
235 Please write a regression test package that can be used to run tests
236 of your package's functionality. GNU Classpath uses the
237 @uref{http://sources.redhat.com/mauve/,Mauve project} for testing the
238 functionality of the core class libraries. The Classpath Project is
239 fast approaching the point in time where all modifications to the
240 source code repository will require appropriate test cases in Mauve to
241 ensure correctness and prevent regressions.
244 Writing good documentation, tests and fixing bugs should be every
245 developer's top priority in order to reach the elusive release of
248 @node Project Goals, Needed Tools and Libraries, Volunteering, Top
249 @comment node-name, next, previous, up
250 @chapter Project Goals
252 The goal of the Classpath project is to produce a
253 @uref{http://www.fsf.org/philosophy/free-sw.html,free} implementation of
254 the standard class library for Java. However, there are other more
255 specific goals as to which platforms should be supported.
257 Classpath is targeted to support the following operating systems:
261 Free operating systems. This includes GNU/Linux, GNU/Hurd, and the free
265 Other Unix-like operating systems.
268 Platforms which currently have no Java support at all.
271 Other platforms such as MS-Windows.
274 While free operating systems are the top priority, the other priorities
275 can shift depending on whether or not there is a volunteer to port
276 Classpath to those platforms and to test releases.
278 Eventually we hope that Classpath will support all JVMs that provide
279 JNI or CNI support. However, the top priority is free JVMs. A small
280 list of Compiler/VM environments that are currently actively
281 incorporating GNU Classpath is below. A more complete overview of
282 projects based on GNU classpath can be found online at
283 @uref{http://www.gnu.org/software/classpath/stories.html,the GNU
284 Classpath stories page}.
288 @uref{http://gcc.gnu.org/java/,GCJ}
290 @uref{http://jamvm.sourceforge.net/,jamvm}
292 @uref{http://kissme.sourceforge.net/,Kissme}
294 @uref{http://www.ibm.com/developerworks/oss/jikesrvm/,Jikes RVM}
296 @uref{http://www.sablevm.org/,SableVM}
298 @uref{http://www.kaffe.org/,Kaffe}
301 As with OS platform support, this priority list could change if a
302 volunteer comes forward to port, maintain, and test releases for a
303 particular JVM. Since gcj is part of the GNU Compiler Collection it
304 is one of the most important targets. But since it doesn't currently
305 work out of the box with GNU Classpath it is currently not the easiest
306 target. When hacking on GNU Classpath it is easiest to use
307 compilers and runtime environments that work out of the box with
308 it, such as the jikes compiler and the runtime environments jamvm and
309 kissme. But you can also work directly with targets like gcj and
310 kaffe that have their own copy of GNU Classpath currently. In that
311 case changes have to be merged back into GNU Classpath proper though,
312 which is sometimes more work. SableVM is starting to migrate from an
313 integrated GNU Classpath version to being usable with GNU Classpath
317 The initial target version for Classpath is the 1.1 spec. Higher
318 versions can be implemented (and have been implemented, including lots
319 of 1.4 functionality) if desired, but please do not create classes
320 that depend on features in those packages unless GNU Classpath already
321 contains those features. GNU Classpath has been free of any
322 proprietary dependencies for a long time now and we like to keep it
323 that way. But finishing, polishing up, documenting, testing and
324 debugging current functionality is of higher priority than adding new
327 @node Needed Tools and Libraries, Programming Standards, Project Goals, Top
328 @comment node-name, next, previous, up
329 @chapter Needed Tools and Libraries
331 If you want to hack on Classpath, you should at least download and
332 install the following tools, and try to familiarize yourself with
333 them, although in most cases having these tools installed will be all
334 you really need to know about them. Also note that when working on
335 (snapshot) releases only GCC 3.3+ (plus a free VM from the list above
336 and the libraries listed below) is needed. The other tools are only
337 needed when working directly on the CVS version.
356 All of these tools are available from
357 @uref{ftp://gnudist.gnu.org/pub/gnu/,gnudist.gnu.org} via anonymous
358 ftp, except CVS which is available from
359 @uref{http://www.cvshome.org/,www.cvshome.org}. They are fully
360 documented with texinfo manuals. Texinfo can be browsed with the
361 Emacs editor, or with the text editor of your choice, or transformed
362 into nicely printable Postscript.
364 Here is a brief description of the purpose of those tools.
369 The GNU Compiler Collection. This contains a C compiler (gcc) for
370 compiling the native C code and a compiler for the java programming
371 language (gcj). You will need at least gcj version 3.3 or higher. If
372 that version is not available for your platform you can try the
373 @uref{http://www.jikes.org/, jikes compiler}. We try to keep all code
374 compilable with both gcj and jikes at all times.
377 A version control system that maintains a centralized Internet
378 repository of all code in the Classpath system.
381 This tool automatically creates Makefile.in files from Makefile.am
382 files. The Makefile.in is turned into a Makefile by autoconf. Why
383 use this? Because it automatically generates every makefile target
384 you would ever want (clean, install, dist, etc) in full compliance
385 with the GNU coding standards. It also simplifies Makefile creation
386 in a number of ways that cannot be described here. Read the docs for
390 Automatically configures a package for the platform on which it is
391 being built and generates the Makefile for that platform.
394 Handles all of the zillions of hairy platform specific options needed
395 to build shared libraries.
398 The free GNU replacement for the standard Unix macro processor.
399 Proprietary m4 programs are broken and so GNU m4 is required for
400 autoconf to work though knowing a lot about GNU m4 is not required to
404 Larry Wall's scripting language. It is used internally by automake.
407 Manuals and documentation (like this guide) are written in texinfo.
408 Texinfo is the official documentation format of the GNU project.
409 Texinfo uses a single source file to produce output in a number of formats,
410 both online and printed (dvi, info, html, xml, etc.). This means that
411 instead of writing one document for online information and another
412 for a printed manual, you need write only one document. And when the work
413 is revised, you need revise only that one document.
418 For compiling the native AWT libraries you need to have the following
423 @uref{http://www.gtk.org/,GTK+} is a multi-platform toolkit for
424 creating graphical user interfaces. It is used as the basis of the
425 GNU desktop project GNOME.
428 @uref{http://www.gnome.org/start/,gdk-pixbuf} is a GNOME library for
433 GNU Classpath comes with a couple of libraries included in the source
434 that are not part of GNU Classpath proper, but that have been included
435 to provide certain needed functionality. All these external libraries
436 should be clearly marked as such. In general we try to use as much as
437 possible the clean upstream versions of these sources. That way
438 merging in new versions will be easiest. You should always try to get
439 bug fixes to these files accepted upstream first. Currently we
440 include the following 'external' libraries. Most of these sources are
441 included in the @file{external} directory. That directory also
442 contains a @file{README} file explaining how to import newer versions.
447 Can be found in @file{external/jaxp}. Provides javax.xml, org.w3c and
448 org.xml packages. Upstream is
449 @uref{http://www.gnu.org/software/classpathx/,GNU ClasspathX}.
452 Can be found in @file{native/fdlibm}. Provides native implementations
453 of some of the Float and Double operations. Upstream is
454 @uref{http://gcc.gnu.org/java/,libgcj}, which in turn syncs with the
455 'real' upstream @uref{http://www.netlib.org/fdlibm/readme}. See also
456 java.lang.StrictMath.
461 @node Programming Standards, Hacking Code, Needed Tools and Libraries, Top
462 @comment node-name, next, previous, up
463 @chapter Programming Standards
465 For C source code, follow the
466 @uref{http://www.gnu.org/prep/standards/,GNU Coding Standards}.
467 The standards also specify various things like the install directory
468 structure. These should be followed if possible.
470 For Java source code, please follow the
471 @uref{http://www.gnu.org/prep/standards/,GNU Coding
472 Standards}, as much as possible. There are a number of exceptions to
473 the GNU Coding Standards that we make for GNU Classpath as documented
474 in this guide. We will hopefully be providing developers with a code
475 formatting tool that closely matches those rules soon.
477 For API documentation comments, please follow
478 @uref{http://java.sun.com/products/jdk/javadoc/writingdoccomments.html,How
479 to Write Doc Comments for Javadoc}. We would like to have a set of
480 guidelines more tailored to GNU Classpath as part of this document.
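
As an illustration of the kind of documentation comment meant here, the
following sketch documents one small method. The class @code{DocExample}
and its @code{clamp} method are hypothetical and are not part of GNU
Classpath. The comment describes every parameter, the return value and
the unchecked exception that is part of the method's contract.

```java
/**
 * Hypothetical utility class, used only to illustrate doc comments.
 */
final class DocExample
{
  private DocExample()
  {
    // Not instantiable; all members are static.
  }

  /**
   * Restricts a value to the closed range <code>[low, high]</code>.
   *
   * @param value the value to restrict
   * @param low the smallest result that may be returned
   * @param high the largest result that may be returned
   * @return <code>low</code> if <code>value</code> is smaller than
   *         <code>low</code>, <code>high</code> if it is larger than
   *         <code>high</code>, and <code>value</code> otherwise
   * @throws IllegalArgumentException if <code>low</code> is greater
   *         than <code>high</code>
   */
  public static int clamp(int value, int low, int high)
  {
    if (low > high)
      throw new IllegalArgumentException("low > high");
    if (value < low)
      return low;
    return value > high ? high : value;
  }
}
```

Note that the unchecked exception appears in the comment but not in a
@code{throws} clause, which matches the style rules given later in this
guide.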
483 * Source Code Style Guide::
486 @node Source Code Style Guide, , Programming Standards, Programming Standards
487 @comment node-name, next, previous, up
488 @section Java source coding style
490 Here is a list of some specific rules used when hacking on GNU
491 Classpath java source code. We try to follow the standard
492 @uref{http://www.gnu.org/prep/standards/,GNU Coding Standards}
493 for that. There are lots of tools that can automatically generate it
494 (although most tools assume C source, not java source code) and it
495 seems as good a standard as any. There are a couple of exceptions and
496 specific rules when hacking on GNU Classpath java source code however.
497 The following lists how code is formatted (and some other code
504 Java source files in GNU Classpath are encoded using UTF-8. However,
505 ordinarily it is considered best practice to use the ASCII subset of
506 UTF-8 and write non-ASCII characters using \u escapes.
509 If possible, prefer specific (expanded) imports over java.io.* type
510 imports. Order by gnu, java, javax, org. There must be one blank line
511 between each group. The imports themselves are ordered alphabetically by
512 package name. Classes and interfaces occur before sub-packages. The
513 classes/interfaces are then also sorted alphabetically. Note that uppercase
514 characters occur before lowercase characters.
517 import gnu.java.awt.EmbeddedWindow;
519 import java.io.IOException;
520 import java.io.InputStream;
522 import javax.swing.JFrame;
526 Blank line after package statement, last import statement, classes,
530 Opening/closing brace for class and method is at the same level of
531 indent as the declaration. All other braces are indented and content
532 between braces indented again.
535 Since method definitions don't start in column zero anyway (since they
536 are always inside a class definition), the rationale for easy grepping
537 for ``^method_def'' is mostly gone already. Since it is customary for
538 almost everybody who writes java source code to put modifiers, return
539 value and method name on the same line, we do too.
541 @c fixme Another rationale for always indenting the method definition is that it makes it a bit easier to distinguish methods in inner and anonymous classes from code in their enclosing context. NEED EXAMPLE.
544 Implements and extends on separate lines, throws too. Indent extends,
545 implements, throws. Apply deep indentation for method arguments.
547 @c fixme Needs example.
550 Don't add a space between a method or constructor call/definition and
551 the open-bracket. This is because often the return value is an object on
552 which you want to apply another method or from which you want to access
558 getToolkit ().createWindow (this);
563 getToolkit().createWindow(this);
567 The GNU Coding Standards give examples for almost every construct
568 (if, switch, do, while, etc.). One missing is the try-catch construct
569 which should be formatted as:
583 Wrap lines at 80 characters after assignments and before operators.
584 Always wrap before extends, implements, throws, and labels.
587 Don't put multiple class definitions in the same file, except for
588 inner classes. File names (plus .java) and class names should be the
592 Don't catch a @code{NullPointerException} as an alternative to simply
593 checking for @code{null}. It is clearer and usually more efficient
594 to simply write an explicit check.
596 For instance, don't write:
603 catch (NullPointerException _)
609 If your intent above is to check whether @samp{foo} is @code{null},
620 Don't use redundant modifiers or other redundant constructs. Here is
621 some sample code that shows various redundant items in comments:
624 /*import java.lang.Integer;*/
625 /*abstract*/ interface I @{
626 /*public abstract*/ void m();
627 /*public static final*/ int i = 1;
628 /*public static*/ class Inner @{@}
630 final class C /*extends Object*/ @{
631 /*final*/ void m() @{@}
635 Note that Jikes will generate warnings for redundant modifiers if you
636 use @code{+Predundant-modifiers} on the command line.
639 Modifiers should be listed in the standard order recommended by the
640 JLS. Jikes will warn for this when given @code{+Pmodifier-order}.
643 Because the output of different compilers differs, we have
644 standardized on explicitly specifying @code{serialVersionUID} in
645 @code{Serializable} classes in Classpath. This field should be
646 declared as @code{private static final}. Note that a class may be
647 @code{Serializable} without being explicitly marked as such, due to
648 inheritance. For instance, all subclasses of @code{Throwable} need to
649 have @code{serialVersionUID} declared.
651 @c fixme link to the discussion
654 Don't declare unchecked exceptions in the @code{throws} clause of a
655 method. However, if throwing an unchecked exception is part of the
656 method's API, you should mention it in the Javadoc.
659 When overriding @code{Object.equals}, remember that @code{instanceof}
660 filters out @code{null}, so an explicit check is not needed.
663 When catching an exception and rethrowing a new exception you should
664 ``chain'' the Throwables. Don't just add the String representation of
665 the caught exception.
670 // Some code that can throw
672 catch (IOException ioe)
674 throw (SQLException) new SQLException("Database corrupt").initCause(ioe);
679 Avoid the use of reserved words for identifiers. This is obvious with those
680 such as @code{if} and @code{while} which have always been part of the Java
681 programming language, but you should be careful about accidentally using
682 words which have been added in later versions. Notable examples are
683 @code{assert} (added in 1.4) and @code{enum} (added in 1.5). Jikes will warn
684 of the use of the word @code{enum}, but, as it doesn't yet support the 1.5
685 version of the language, it will still allow this usage through. A
686 compiler which supports 1.5 (e.g. the Eclipse compiler, ecj) will simply
687 fail to compile the offending source code.
689 @c fixme Describe Anonymous classes (example).
690 @c fixme Describe Naming conventions when different from GNU Coding Standards.
691 @c fixme Describe API doc javadoc tags used.
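
Several of the rules above can be seen together in the following sketch.
It is only an illustration under our own assumptions: the class
@code{StyleDemo}, its nested classes @code{Point} and
@code{LoadException}, and the methods @code{readRecord} and @code{load}
are hypothetical names that do not exist in GNU Classpath. The sketch
shows brace placement, @code{implements}, @code{extends} and
@code{throws} on their own indented lines, the try-catch layout, an
explicit @code{serialVersionUID}, an @code{equals} method that relies on
@code{instanceof}, an explicit @code{null} check, and exception
chaining.

```java
import java.io.IOException;
import java.io.Serializable;

/** Hypothetical class; every name in it is invented for illustration. */
class StyleDemo
{
  /** Value class showing serialVersionUID and equals. */
  static final class Point
    implements Serializable
  {
    // Given explicitly, since compiler-computed values can differ.
    private static final long serialVersionUID = 1L;

    private final int x;
    private final int y;

    Point(int x, int y)
    {
      this.x = x;
      this.y = y;
    }

    public boolean equals(Object other)
    {
      // instanceof is false for null, so no separate null check.
      if (! (other instanceof Point))
        return false;
      Point p = (Point) other;
      return x == p.x && y == p.y;
    }

    public int hashCode()
    {
      return 31 * x + y;
    }
  }

  /** Checked exception that wraps lower-level failures. */
  static class LoadException
    extends Exception
  {
    private static final long serialVersionUID = 1L;

    LoadException(String message)
    {
      super(message);
    }
  }

  /** Stand-in for any operation that can fail with an IOException. */
  static void readRecord(boolean fail)
    throws IOException
  {
    if (fail)
      throw new IOException("short read");
  }

  static String load(String name, boolean fail)
    throws LoadException
  {
    // Explicit null check instead of catching NullPointerException.
    if (name == null)
      return "unnamed";

    try
      {
        readRecord(fail);
        return name.trim();
      }
    catch (IOException ioe)
      {
        // Chain the cause instead of appending ioe.toString().
        throw (LoadException) new LoadException("Cannot load " + name)
          .initCause(ioe);
      }
  }

  public static void main(String[] args)
    throws LoadException
  {
    System.out.println(load(" record ", false));
    System.out.println(new Point(1, 2).equals(new Point(1, 2)));
  }
}
```

The cast in the @code{throw} statement is needed because
@code{initCause} is declared to return @code{Throwable}. The two helper
types are nested so that the file still contains a single top-level
class.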
695 Some things are the same as in the normal GNU Coding Standards:
700 Unnecessary braces can be removed, one line after an if, for, while as
704 Space around operators (assignment, logical, relational, bitwise,
705 mathematical, shift).
708 Blank line before single-line comments, multi-line comments, javadoc
712 If more than 2 blank lines, trim to 2.
715 Don't keep commented out code. Just remove it or add a real comment
716 describing what it used to do and why it is changed to the current
721 @node Hacking Code, Programming Goals, Programming Standards, Top
722 @comment node-name, next, previous, up
723 @chapter Working on the code, Working with others
725 There are a lot of people helping out with GNU Classpath. Here are a
726 couple of practical guidelines to make working together on the code
729 The main thing is to always discuss what you are up to on the
730 mailing list. Making sure that everybody knows who is working on what
731 is the most important thing to make sure we cooperate most
735 @uref{http://www.gnu.org/software/classpath/tasks.html,Task List}
736 which contains items that you might want to work on.
738 Before starting to work on something please make sure you read this
739 complete guide. And discuss it on list to make sure your work does
740 not duplicate or interfere with work someone else is already doing.
741 Always make sure that you submit things that are your own work. And
742 that you have paperwork on file (as stated in the requirements
743 section) with the FSF authorizing the use of your additions.
745 Technically the GNU Classpath project is hosted on
746 @uref{http://savannah.gnu.org/,Savannah}, a central point for
747 development, distribution and maintenance of GNU Software. Here you
749 @uref{https://savannah.gnu.org/projects/classpath/,project page}, bug
750 reports, pending patches, links to mailing lists, news items and CVS.
752 You can find instructions on getting a CVS checkout for classpath at
753 @uref{https://savannah.gnu.org/cvs/?group=classpath}.
755 You don't have to get CVS commit write access to contribute, but it is
756 sometimes more convenient to be able to add your changes directly to
757 the project CVS. Please contact the GNU Classpath savannah admins to
758 arrange CVS access if you would like to have it.
760 Make sure to be subscribed to the commit-classpath mailing list while
761 you are actively hacking on Classpath. You have to send patches (cvs
762 diff -uN) to this list before committing.
764 We really want to have a pretty open check-in policy. But this means
765 that you should be extra careful if you check something in. If at all
766 in doubt or if you think that something might need extra explaining
767 since it is not completely obvious please make a little announcement
768 about the change on the mailing list. And if you do commit something
769 without discussing it first and another GNU Classpath hacker asks for
770 extra explanation or suggests reverting a certain commit then please
771 reply to the request by explaining why something should be so or if
772 you agree to revert it. (Just reverting immediately is OK without
773 discussion, but then please don't mix it with other changes and please
776 Patches that are already approved for libgcj are also OK for Classpath.
777 (But you still have to send a patch/diff to the list.) All other
778 patches require you to think whether or not they are really OK and
779 non-controversial, or if you would like some feedback first on them
780 before committing. We might get real commit rules in the future, for
781 now use your own judgment, but be a bit conservative.
783 Always contact the GNU Classpath maintainer before adding anything
784 non-trivial that you didn't write yourself and that does not come from
785 libgcj or from another known GNU Classpath or libgcj hacker. If you
786 have been assigned to commit changes on behalf of another project or
787 a company always make sure they come from people who have signed the
788 papers for the FSF and/or fall under the arrangement your company made
789 with the FSF for contributions. Mention in the ChangeLog who actually
792 Completely unrelated changes should be committed
793 separately (especially when doing a formatting change and a logical
794 change, do them in two separate commits). But do try to commit
795 together all the things/files that are done at the same time and can
796 logically be seen as part of the same change/cleanup etc.
798 When the change fixes an important bug or adds nice new functionality
799 please write a short entry for inclusion in the @file{NEWS} file. If it
800 changes the VM interface you must mention that in both the @file{NEWS} file
801 and the VM Integration Guide.
803 All the ``rules'' are really meant to make sure that GNU Classpath
804 will be maintainable in the long run and to give all the projects that
805 are now using GNU Classpath an accurate view of the changes we make to
806 the code and to see what changed when. If you think the requirements
807 are ``unworkable'' please try it first for a couple of weeks. If you
808 still feel the same after having some more experience with the project
809 please feel free to bring up suggestions for improvements on the list.
810 But don't just ignore the rules! Other hackers depend on them being
811 followed to be the most productive they can be (given the above
816 * Writing ChangeLogs::
819 @node Branches, Writing ChangeLogs, Hacking Code, Hacking Code
820 @comment node-name, next, previous, up
821 @section Working with branches
823 Sometimes it is necessary to create a branch of the source for doing new
824 work that is disruptive to the other hackers, or that needs new
825 language or libraries not yet (easily) available.
827 After discussing the need for a branch on the main mailing list with
828 the other hackers, explaining the need for a branch and suggesting
829 the particular branch rules (what will be done on the branch, who will
830 work on it, will there be different commit guidelines than for the
831 mainline trunk and when is the branch estimated to be finished and
832 merged back into the trunk), every GNU Classpath hacker with commit
833 access should feel free to create a branch. There are however a couple
834 of rules that every branch should follow:
838 @item All branches ought to be documented in the developer wiki at
839 @uref{http://developer.classpath.org/mediation/ClasspathBranches}, so
840 we can know which are live, who owns them, and when they die.
842 @item Some rules can be changed on a branch. In particular the branch
843 maintainer can change the review requirements, and the requirement of
844 keeping things building, testing, etc, can also be lifted. (These
845 should be documented along with the branch name and owner if they
846 differ from the trunk.)
848 @item Requirements for patch email to classpath-patches and for paperwork
849 @strong{cannot} be lifted. See @ref{Requirements}.
@item A branch should not be seen as ``private'' or
``may be completely broken''. It should be as much as possible
something that you work on with a team (and if there is no team yet,
nothing discourages others from helping out as much as a completely
broken build). There can of course be occasional breakage, but
it should be planned and explained. And you can certainly have a rule
like ``please ask me before committing to this branch''.
859 @item Merges from the trunk to a branch are at the discretion of the
862 @item A merge from a branch to the trunk is treated like any other patch.
863 In particular, it has to go through review, it must satisfy all the
864 trunk requirements (build, regression test, documentation).
866 @item There may be additional timing requirements on merging a branch to
867 the trunk depending on the release schedule, etc. For instance we may
868 not want to do a branch merge just before a release.
872 If any of these rules are unclear please discuss on the list first.
875 * Writing ChangeLogs::
878 @node Writing ChangeLogs, , Branches, Hacking Code
879 @comment node-name, next, previous, up
880 @section Documenting what changed when with ChangeLog entries
882 To keep track of who did what when we keep an explicit ChangeLog entry
883 together with the code. This mirrors the CVS commit messages and in
884 general the ChangeLog entry is the same as the CVS commit message.
885 This provides an easy way for people getting a (snapshot) release or
886 without access to the CVS server to see what happened when. We do not
887 generate the ChangeLog file automatically from the CVS server since
888 that is not reliable.
890 A good ChangeLog entry guideline can be found in the Guile Manual at
891 @uref{http://www.gnu.org/software/guile/changelogs/guile-changelogs_3.html}.
Here are some examples to explain what should or shouldn't be in a
ChangeLog entry (and the corresponding commit message):
899 The first line of a ChangeLog entry should be:
902 [date] <two spaces> [full name] <two spaces> [email-contact]
905 The second line should be blank. All other lines should be indented
909 Just state what was changed. Why something is done as it is done in
910 the current code should be either stated in the code itself or be
911 added to one of the documentation files (like this Hacking Guide).
916 * java/awt/font/OpenType.java: Remove 'public static final'
917 from OpenType tags, reverting the change of 2003-08-11. See
918 Classpath discussion list of 2003-08-11.
924 * java/awt/font/OpenType.java: Remove 'public static final' from
928 In this case the reason for the change was added to this guide.
Just as with the normal code style guide, don't make lines longer than
Just as with comments in the code, the ChangeLog entry should be a
full sentence, starting with a capital and ending with a period.
939 Be precise in what changed, not the effect of the change (which should
940 be clear from the code/patch). So don't write:
943 * java/io/ObjectOutputStream.java : Allow putFields be called more
947 But explain what changed and in which methods it was changed:
950 * java/io/ObjectOutputStream.java (putFields): Don't call
951 markFieldsWritten(). Only create new PutField when
952 currentPutField is null.
953 (writeFields): Call markFieldsWritten().
The above are all just guidelines. We all appreciate the fact that writing
ChangeLog entries, using a coding style that is not ``your own'' and the
CVS, patch and diff tools take some time to get used to. So don't
feel like you have to do it perfectly right away or that contributions
aren't welcome if they aren't ``perfect''. We all learn by doing and
interacting with each other.
966 @node Programming Goals, API Compatibility, Hacking Code, Top
967 @comment node-name, next, previous, up
968 @chapter Programming Goals
970 When you write code for Classpath, write with three things in mind, and
971 in the following order: portability, robustness, and efficiency.
973 If efficiency breaks portability or robustness, then don't do it the
974 efficient way. If robustness breaks portability, then bye-bye robust
975 code. Of course, as a programmer you would probably like to find sneaky
976 ways to get around the issue so that your code can be all three ... the
977 following chapters will give some hints on how to do this.
980 * Portability:: Writing Portable Software
981 * Utility Classes:: Reusing Software
982 * Robustness:: Writing Robust Software
983 * Java Efficiency:: Writing Efficient Java
984 * Native Efficiency:: Writing Efficient JNI
985 * Security:: Writing Secure Software
988 @node Portability, Utility Classes, Programming Goals, Programming Goals
989 @comment node-name, next, previous, up
992 The portability goal for Classpath is the following:
996 native functions for each platform that work across all VMs on that
a single classfile set that works across all VMs on all platforms that
1000 support the native functions.
1003 For almost all of Classpath, this is a very feasible goal, using a
1004 combination of JNI and native interfaces. This is what you should shoot
1005 for. For those few places that require knowledge of the Virtual Machine
1006 beyond that provided by the Java standards, the VM Interface was designed.
1007 Read the Virtual Machine Integration Guide for more information.
1009 Right now the only supported platform is Linux. This will change as that
1010 version stabilizes and we begin the effort to port to many other
1011 platforms. Jikes RVM runs Classpath on AIX, and generally the Jikes
1012 RVM team fixes Classpath to work on that platform.
1014 @node Utility Classes, Robustness, Portability, Programming Goals
1015 @comment node-name, next, previous, up
1016 @section Utility Classes
1018 At the moment, we are not very good at reuse of the JNI code. There
1019 have been some attempts, called @dfn{libclasspath}, to
1020 create generally useful utility classes. The utility classes are in
1021 the directory @file{native/jni/classpath} and they are mostly declared
1022 in @file{native/jni/classpath/jcl.h}. These utility classes are
1023 currently only discussed in @ref{Robustness} and in @ref{Native
1026 There are more utility classes available that could be factored out if
1027 a volunteer wants something nice to hack on. The error reporting and
1028 exception throwing functions and macros in
1029 @file{native/jni/gtk-peer/gthread-jni.c} might be good
1030 candidates for reuse. There are also some generally useful utility
1031 functions in @file{gnu_java_awt_peer_gtk_GtkMainThread.c} that could
1032 be split out and put into libclasspath.
1034 @node Robustness, Java Efficiency, Utility Classes, Programming Goals
1035 @comment node-name, next, previous, up
1038 Native code is very easy to make non-robust. (That's one reason Java is
1039 so much better!) Here are a few hints to make your native code more
Always check return values for standard functions. It's sometimes easy
to forget to check the return value of malloc() for an error. Don't make that
mistake. (In fact, use JCL_malloc() in the jcl library instead; it will
check the return value and throw an exception if necessary.)
1047 Always check the return values of JNI functions, or call
1048 @code{ExceptionOccurred} to check whether an error occurred. You must
1049 do this after @emph{every} JNI call. JNI does not work well when an
1050 exception has been raised, and can have unpredictable behavior.
1052 Throw exceptions using @code{JCL_ThrowException}. This guarantees that if
1053 something is seriously wrong, the exception text will at least get out
1054 somewhere (even if it is stderr).
1056 Check for null values of @code{jclass}es before you send them to JNI functions.
1057 JNI does not behave nicely when you pass a null class to it: it
terminates Java with a ``JNI Panic.''
1060 In general, try to use functions in @file{native/jni/classpath/jcl.h}. They
1061 check exceptions and return values and throw appropriate exceptions.
1063 @node Java Efficiency, Native Efficiency, Robustness, Programming Goals
1064 @comment node-name, next, previous, up
1065 @section Java Efficiency
1067 For methods which explicitly throw a @code{NullPointerException} when an
1068 argument is passed which is null, per a Sun specification, do not write
1073 strlen (String foo) throws NullPointerException
1076 throw new NullPointerException ("foo is null");
1077 return foo.length ();
1081 Instead, the code should be written as:
1085 strlen (String foo) throws NullPointerException
1087 return foo.length ();
1091 Explicitly comparing foo to null is unnecessary, as the virtual machine
1092 will throw a NullPointerException when length() is invoked. Classpath
1093 is designed to be as fast as possible -- every optimization, no matter
1094 how small, is important.
1096 @node Native Efficiency, Security, Java Efficiency, Programming Goals
1097 @comment node-name, next, previous, up
1098 @section Native Efficiency
1100 You might think that using native methods all over the place would give
1101 our implementation of Java speed, speed, blinding speed. You'd be
1102 thinking wrong. Would you believe me if I told you that an empty
1103 @emph{interpreted} Java method is typically about three and a half times
1104 @emph{faster} than the equivalent native method?
1106 Bottom line: JNI is overhead incarnate. In Sun's implementation, even
1107 the JNI functions you use once you get into Java are slow.
1109 A final problem is efficiency of native code when it comes to things
1110 like method calls, fields, finding classes, etc. Generally you should
1111 cache things like that in static C variables if you're going to use them
1112 over and over again. GetMethodID(), GetFieldID(), and FindClass() are
1113 @emph{slow}. Classpath provides utility libraries for caching methodIDs
1114 and fieldIDs in @file{native/jni/classpath/jnilink.h}. Other native data can
1115 be cached between method calls using functions found in
1116 @file{native/jni/classpath/native_state.h}.
1118 Here are a few tips on writing native code efficiently:
1120 Make as few native method calls as possible. Note that this is not the
1121 same thing as doing less in native method calls; it just means that, if
1122 given the choice between calling two native methods and writing a single
1123 native method that does the job of both, it will usually be better to
1124 write the single native method. You can even call the other two native
1125 methods directly from your native code and not incur the overhead of a
1126 method call from Java to C.
1128 Cache @code{jmethodID}s and @code{jfieldID}s wherever you can. String
1130 expensive. The best way to do this is to use the
1131 @file{native/jni/classpath/jnilink.h}
1132 library. It will ensure that @code{jmethodID}s are always valid, even if the
1133 class is unloaded at some point. In 1.1, jnilink simply caches a
1134 @code{NewGlobalRef()} to the method's underlying class; however, when 1.2 comes
1135 along, it will use a weak reference to allow the class to be unloaded
1136 and then re-resolve the @code{jmethodID} the next time it is used.
1138 Cache classes that you need to access often. jnilink will help with
1139 this as well. The issue here is the same as the methodID and fieldID
1140 issue--how to make certain the class reference remains valid.
1142 If you need to associate native C data with your class, use Paul
1143 Fisher's native_state library (NSA). It will allow you to get and set
1144 state fairly efficiently. Japhar now supports this library, making
1145 native state get and set calls as fast as accessing a C variable
1148 If you are using native libraries defined outside of Classpath, then
1149 these should be wrapped by a Classpath function instead and defined
1150 within a library of their own. This makes porting Classpath's native
1151 libraries to new platforms easier in the long run. It would be nice
1152 to be able to use Mozilla's NSPR or Apache's APR, as these libraries
1153 are already ported to numerous systems and provide all the necessary
1154 system functions as well.
1156 @node Security, , Native Efficiency, Programming Goals
1157 @comment node-name, next, previous, up
1160 Security is such a huge topic it probably deserves its own chapter.
1161 Most of the current code needs to be audited for security to ensure
1162 all of the proper security checks are in place within the Java
1163 platform, but also to verify that native code is reasonably secure and
1164 avoids common pitfalls, buffer overflows, etc. A good source for
1165 information on secure programming is the excellent HOWTO by David
1167 @uref{http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/index.html,Secure
1168 Programming for Linux and Unix HOWTO}.
1170 @node API Compatibility, Specification Sources, Programming Goals, Top
1171 @comment node-name, next, previous, up
1172 @chapter API Compatibility
1175 * Serialization:: Serialization
1176 * Deprecated Methods:: Deprecated methods
1179 @node Serialization, Deprecated Methods, API Compatibility, API Compatibility
1180 @comment node-name, next, previous, up
1181 @section Serialization
Sun has produced documentation concerning much of the information
needed to make Classpath serialization-compatible with Sun
implementations. Part of doing this is to make sure that every class
that is Serializable actually defines a field named serialVersionUID
with a value that matches the output of serialver on Sun's
implementation. The reason for doing this is explained below.
If a class has a field (of any accessibility) named serialVersionUID
of type long, that is what serialver uses. Otherwise it computes a
value by hashing a description of the class that includes the
signatures of the methods in the .class file. The fact that different compilers
create different synthetic method signatures, such as access$0() if an
inner class needs access to a private member of an enclosing class,
makes it impossible for two distinct compilers to reliably generate the
same serial number, because their .class files differ. However, once you
have a .class file, its serial number is fixed, and the computation will
give the same result no matter what platform you execute on.
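As a minimal sketch of the point above, the following declares the field
explicitly and reads back the value the serialization runtime will use,
through the standard @code{java.io.ObjectStreamClass} API. The class
@code{ExamplePoint} and the number assigned to it are made up for
illustration; a real Classpath class would use the value reported by
serialver for the corresponding class in Sun's implementation.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class, used only for illustration.
final class ExamplePoint implements Serializable {
    // Declared explicitly, so the value no longer depends on which
    // compiler produced the class file. The number itself is made up.
    private static final long serialVersionUID = 1234567890123456789L;

    int x;
    int y;
}

final class SerialVersionDemo {
    // Returns the UID the serialization runtime associates with the class.
    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(ExamplePoint.class));
    }
}
```

Because the field is present, the runtime reports the declared value
instead of computing a hash over the class file.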
1201 Serialization compatibility can be tested using tools provided with
1202 @uref{http://www.kaffe.org/~stuart/japi/,Japitools}. These
1203 tools can test binary serialization compatibility and also provide
1204 information about unknown serialized formats by writing these in XML
1205 instead. Japitools is also the primary means of checking API
1206 compatibility for GNU Classpath with Sun's Java Platform.
1208 @node Deprecated Methods, , Serialization, API Compatibility
1209 @comment node-name, next, previous, up
1210 @section Deprecated Methods
1212 Sun has a practice of creating ``alias'' methods, where a public or
1213 protected method is deprecated in favor of a new one that has the same
1214 function but a different name. Sun's reasons for doing this vary; as
1215 an example, the original name may contain a spelling error or it may
1216 not follow Java naming conventions.
1218 Unfortunately, this practice complicates class library code that calls
1219 these aliased methods. Library code must still call the deprecated
1220 method so that old client code that overrides it continues to work.
1221 But library code must also call the new version, because new code is
1222 expected to override the new method.
1224 The correct way to handle this (and the way Sun does it) may seem
1225 counterintuitive because it means that new code is less efficient than
1226 old code: the new method must call the deprecated method, and throughout
1227 the library code calls to the old method must be replaced with calls to
1230 Take the example of a newly-written container laying out a component and
1231 wanting to know its preferred size. The Component class has a
1232 deprecated preferredSize method and a new method, getPreferredSize.
1233 Assume that the container is laying out an old component that overrides
1234 preferredSize and a new component that overrides getPreferredSize. If
1235 the container calls getPreferredSize and the default implementation of
1236 getPreferredSize calls preferredSize, then the old component will have
1237 its preferredSize method called and new code will have its
1238 getPreferredSize method called.
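The calling scheme described above can be sketched as follows. This is
an illustration only: @code{Widget}, @code{OldWidget}, @code{NewWidget}
and @code{LayoutDemo} are hypothetical stand-ins for @code{Component},
old and new client components, and a container, and sizes are reduced
to plain integers.

```java
// Hypothetical stand-in for a library class such as Component.
class Widget {
    /** @deprecated replaced by getPreferredSize(). */
    @Deprecated
    public int preferredSize() {
        return 10;
    }

    // The new method delegates to the deprecated one, so an override
    // of either method is honoured.
    public int getPreferredSize() {
        return preferredSize();
    }
}

// Old client code: only knows the deprecated name.
class OldWidget extends Widget {
    @Override
    public int preferredSize() {
        return 20;
    }
}

// New client code: overrides the new name.
class NewWidget extends Widget {
    @Override
    public int getPreferredSize() {
        return 30;
    }
}

final class LayoutDemo {
    // Library code always calls the new method.
    static int layoutSize(Widget w) {
        return w.getPreferredSize();
    }
}
```

With this arrangement @code{LayoutDemo.layoutSize} reaches the override
in both the old and the new subclass, at the cost of one extra call for
components that override neither method.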
1240 Even using this calling scheme, an old component may still be laid out
1241 improperly if it implements a method, getPreferredSize, that has the
1242 same signature as the new Component.getPreferredSize. But that is a
1243 general problem -- adding new public or protected methods to a
1244 widely-used class that calls those methods internally is risky, because
1245 existing client code may have already declared methods with the same
1248 The solution may still seem counterintuitive -- why not have the
1249 deprecated method call the new method, then have the library always call
1250 the old method? One problem with that, using the preferred size example
1251 again, is that new containers, which will use the non-deprecated
1252 getPreferredSize, will not get the preferred size of old components.
1254 @node Specification Sources, Naming Conventions, API Compatibility, Top
1255 @comment node-name, next, previous, up
1256 @chapter Specification Sources
1258 There are a number of specification sources to use when working on
1259 Classpath. In general, the only place you'll find your classes
1260 specified is in the JavaDoc documentation or possibly in the
1261 corresponding white paper. In the case of java.lang, java.io and
1262 java.util, you should look at the Java Language Specification.
1264 Here, however, is a list of specs, in order of canonicality:
1268 @uref{http://java.sun.com/docs/books/jls/clarify.html,Clarifications and Amendments to the JLS - 1.1}
1270 @uref{http://java.sun.com/docs/books/jls/html/1.1Update.html,JLS Updates
1273 @uref{http://java.sun.com/docs/books/jls/html/index.html,The 1.0 JLS}
1275 @uref{http://java.sun.com/docs/books/vmspec/index.html,JVM spec - 1.1}
1277 @uref{http://java.sun.com/products/jdk/1.1/docs/guide/jni/spec/jniTOC.doc.html,JNI spec - 1.1}
1279 @uref{http://java.sun.com/products/jdk/1.1/docs/api/packages.html,Sun's javadoc - 1.1}
1280 (since Sun's is the reference implementation, the javadoc is
1281 documentation for the Java platform itself.)
1283 @uref{http://java.sun.com/products/jdk/1.2/docs/guide/jvmdi/jvmdi.html,JVMDI spec - 1.2},
1284 @uref{http://java.sun.com/products/jdk/1.2/docs/guide/jni/jni-12.html,JNI spec - 1.2}
1285 (sometimes gives clues about unspecified things in 1.1; if
1286 it was not specified accurately in 1.1, then use the spec
1287 for 1.2; also, we are using JVMDI in this project.)
1289 @uref{http://java.sun.com/products/jdk/1.2/docs/api/frame.html,Sun's javadoc - 1.2}
1290 (sometimes gives clues about unspecified things in 1.1; if
1291 it was not specified accurately in 1.1, then use the spec
1294 @uref{http://developer.java.sun.com/developer/bugParade/index.html,The
1295 Bug Parade}: I have obtained a ton of useful information about how
things do work and how they @emph{should} work from the Bug Parade just by
1297 searching for related bugs. The submitters are very careful about their
1298 use of the spec. And if something is unspecified, usually you can find
1299 a request for specification or a response indicating how Sun thinks it
1300 should be specified here.
1303 You'll notice that in this document, white papers and specification
1304 papers are more canonical than the JavaDoc documentation. This is true
1308 @node Naming Conventions, Character Conversions, Specification Sources, Top
1309 @comment node-name, next, previous, up
1310 @chapter Directory and File Naming Conventions
1312 The Classpath directory structure is laid out in the following manner:
1354 Here is a brief description of the toplevel directories and their contents.
1359 Contains the source code to the Java packages that make up the core
1360 class library. Because this is the public interface to Java, it is
1361 important that the public classes, interfaces, methods, and variables
1362 are exactly the same as specified in Sun's documentation. The directory
1363 structure is laid out just like the java package names. For example,
1364 the class java.util.zip would be in the directory java-util.
1367 Internal classes (roughly analogous to Sun's sun.* classes) should go
1368 under the @file{gnu/java} directory. Classes related to a particular public
1369 Java package should go in a directory named like that package. For
1370 example, classes related to java.util.zip should go under a directory
1371 @file{gnu/java/util/zip}. Sub-packages under the main package name are
1372 allowed. For classes spanning multiple public Java packages, pick an
1373 appropriate name and see what everybody else thinks.
1376 This directory holds native code needed by the public Java packages.
1377 Each package has its own subdirectory, which is the ``flattened'' name
1378 of the package. For example, native method implementations for
1379 java.util.zip should go in @file{native/classpath/java-util}. Classpath
1380 actually includes an all Java version of the zip classes, so no native
Each person working on a package gets his or her own ``directory
1386 space'' underneath each of the toplevel directories. In addition to the
1387 general guidelines above, the following standards should be followed:
1392 Classes that need to load native code should load a library with the
1393 same name as the flattened package name, with all hyphens removed. For
1394 example, the native library name specified in LoadLibrary for
1395 java-util would be ``javautil''.
1398 Each package has its own shared library for native code (if any).
The main native method implementation for a given method in a class should
go in a file with the same name as the class with a ``.c'' extension.
1403 For example, the JNI implementation of the native methods in
1404 java.net.InetAddress would go in @file{native/jni/java-net/InetAddress.c}.
1405 ``Internal'' native functions called from the main native method can
1406 reside in files of any name.
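The library naming rule above can be sketched as a small helper. The
class and method names here are hypothetical and not part of Classpath;
the only real API used is @code{System.loadLibrary}, which is assumed to
be what ``LoadLibrary'' refers to.

```java
// Hypothetical helper, shown only to illustrate the naming rule.
final class NativeLibraryName {
    // Removes all hyphens from a flattened package name,
    // so that "java-util" becomes "javautil".
    static String fromFlattenedPackage(String flattened) {
        return flattened.replace("-", "");
    }

    // A class with native methods would typically do this once,
    // from a static initializer.
    static void load(String flattened) {
        System.loadLibrary(fromFlattenedPackage(flattened));
    }
}
```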
1409 @node Character Conversions, Localization, Naming Conventions, Top
1410 @comment node-name, next, previous, up
1411 @chapter Character Conversions
1413 Java uses the Unicode character encoding system internally. This is a
1414 sixteen bit (two byte) collection of characters encompassing most of the
world's written languages. However, Java programs must often deal with
outside interfaces that are byte (eight bit) oriented, such as a
Unix file or a stream of data from a network socket. Beginning with
1418 Java 1.1, the @code{Reader} and @code{Writer} classes provide functionality
1419 for dealing with character oriented streams. The classes
1420 @code{InputStreamReader} and @code{OutputStreamWriter} bridge the gap
1421 between byte streams and character streams by converting bytes to
1422 Unicode characters and vice versa.
1424 In Classpath, @code{InputStreamReader} and @code{OutputStreamWriter}
1425 rely on an internal class called @code{gnu.java.io.EncodingManager} to load
translators that perform the actual conversion. There are two types of
1427 converters, encoders and decoders. Encoders are subclasses of
1428 @code{gnu.java.io.encoder.Encoder}. This type of converter takes a Java
1429 (Unicode) character stream or buffer and converts it to bytes using
a specified encoding scheme. Decoders are subclasses of
1431 @code{gnu.java.io.decoder.Decoder}. This type of converter takes a
1432 byte stream or buffer and converts it to Unicode characters. The
1433 @code{Encoder} and @code{Decoder} classes are subclasses of
1434 @code{Writer} and @code{Reader} respectively, and so can be used in
1435 contexts that require character streams, but the Classpath implementation
1436 currently does not make use of them in this fashion.
1438 The @code{EncodingManager} class searches for requested encoders and
1439 decoders by name. Since encoders and decoders are separate in Classpath,
1440 it is possible to have a decoder without an encoder for a particular
1441 encoding scheme, or vice versa. @code{EncodingManager} searches the
1442 package path specified by the @code{file.encoding.pkg} property. The
1443 name of the encoder or decoder is appended to the search path to
1444 produce the required class name. Note that @code{EncodingManager} knows
1445 about the default system encoding scheme, which it retrieves from the
1446 system property @code{file.encoding}, and it will return the proper
1447 translator for the default encoding if no scheme is specified. Also, the
1448 Classpath standard translator library, which is the @code{gnu.java.io} package,
1449 is automatically appended to the end of the path.
1451 For efficiency, @code{EncodingManager} maintains a cache of translators
1452 that it has loaded. This eliminates the need to search for a commonly
1453 used translator each time it is requested.
Finally, @code{EncodingManager} supports aliasing of encoding scheme names.
For example, the ISO Latin-1 encoding scheme can be referred to as
``8859_1'' or ``ISO-8859-1''. @code{EncodingManager} searches for
aliases by looking for the existence of a system property called
@code{gnu.java.io.encoding_scheme_alias.<encoding name>}. If such a
property exists, the value of that property is assumed to be the
canonical name of the encoding scheme, and a translator with that name is
looked up instead of one with the original name.
Here is an example of how @code{EncodingManager} works. A class requests
a decoder for the ``UTF-8'' encoding scheme by calling
@code{EncodingManager.getDecoder("UTF-8")}. First, an alias is searched
for by looking for the system property
@code{gnu.java.io.encoding_scheme_alias.UTF-8}. In our example, this
property exists and has the value ``UTF8''. That is the actual
decoder that will be searched for. Next, @code{EncodingManager} looks
in its cache for this translator. Assuming it does not find it, it
searches the translator path, which in this example consists only of
the default @code{gnu.java.io}. The ``decoder'' package name is
appended since we are looking for a decoder. (``encoder'' would be
used if we were looking for an encoder.) Then the name of the translator
is appended. So @code{EncodingManager} attempts to load a translator
class called @code{gnu.java.io.decoder.UTF8}. If that class is found,
an instance of it is returned. If it is not found, an
@code{UnsupportedEncodingException} is thrown.
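The name resolution walked through above can be sketched roughly as
follows. This is not the @code{EncodingManager} source: the class
@code{TranslatorLookupSketch} and its methods are hypothetical, the
cache and the multi-entry search path are left out, and the properties
are passed in as an argument (where the real class is described as
reading system properties) so that the sketch is easy to try out.

```java
import java.io.UnsupportedEncodingException;
import java.util.Properties;

// Hypothetical sketch of the lookup steps; not the Classpath class.
final class TranslatorLookupSketch {
    static final String ALIAS_PREFIX = "gnu.java.io.encoding_scheme_alias.";

    // Step 1: replace the requested name by its alias, if one is defined.
    static String canonicalName(String scheme, Properties props) {
        return props.getProperty(ALIAS_PREFIX + scheme, scheme);
    }

    // Step 2: search package + ".decoder." + canonical scheme name.
    static String decoderClassName(String scheme, String searchPackage,
                                   Properties props) {
        return searchPackage + ".decoder." + canonicalName(scheme, props);
    }

    // Step 3: try to load the class, reporting failure in the same way
    // the text describes.
    static Class<?> loadDecoder(String scheme, String searchPackage,
                                Properties props)
            throws UnsupportedEncodingException {
        String name = decoderClassName(scheme, searchPackage, props);
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new UnsupportedEncodingException(scheme);
        }
    }
}
```

With the alias property from the example set to ``UTF8'', a request for
``UTF-8'' against the package @code{gnu.java.io} resolves to the class
name @code{gnu.java.io.decoder.UTF8}.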
1481 To write a new translator, it is only necessary to subclass
1482 @code{Encoder} and/or @code{Decoder}. Only a handful of abstract
1483 methods need to be implemented. In general, no methods need to be
1484 overridden. The needed methods calculate the number of bytes/chars
1485 that the translation will generate, convert buffers to/from bytes,
1486 and read/write a requested number of characters to/from a stream.
1488 Many common encoding schemes use only eight bits to encode characters.
1489 Writing a translator for these encodings is very easy. There are
1490 abstract translator classes @code{gnu.java.io.decode.DecoderEightBitLookup}
1491 and @code{gnu.java.io.encode.EncoderEightBitLookup}. These classes
implement all of the necessary methods. All that is necessary is to
create a lookup table array that maps bytes to Unicode characters and
set the class variable @code{lookup_table} equal to it in a static
1495 initializer. Also, a single constructor that takes an appropriate
1496 stream as an argument must be supplied. These translators are
1497 exceptionally easy to create and there are several of them supplied
1498 in the Classpath distribution.
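To illustrate the lookup table idea, here is a standalone sketch that
does not use the Classpath base classes at all. The class
@code{EightBitLookupSketch} and its @code{decode} method are
hypothetical; only the name @code{lookup_table} is taken from the text.
The table shown is the identity mapping, which is what ISO Latin-1
amounts to, since its 256 byte values correspond to the first 256
Unicode characters.

```java
// Hypothetical standalone sketch; a real translator would subclass the
// Classpath eight-bit lookup classes instead.
final class EightBitLookupSketch {
    // Maps each possible byte value (0..255) to a Unicode character.
    static final char[] lookup_table = new char[256];

    static {
        // ISO Latin-1: byte value n is Unicode character n.
        for (int i = 0; i < lookup_table.length; i++)
            lookup_table[i] = (char) i;
    }

    // Converts a buffer of bytes to characters with one table lookup
    // per byte. The mask turns the signed byte into an index 0..255.
    static char[] decode(byte[] in) {
        char[] out = new char[in.length];
        for (int i = 0; i < in.length; i++)
            out[i] = lookup_table[in[i] & 0xFF];
        return out;
    }
}
```

A translator for another eight-bit encoding would differ only in the
contents of the table.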
1500 Writing multi-byte or variable-byte encodings is more difficult, but
1501 often not especially challenging. The Classpath distribution ships with
1502 translators for the UTF8 encoding scheme which uses from one to three
1503 bytes to encode Unicode characters. This can serve as an example of
1504 how to write such a translator.
1506 Many more translators are needed. All major character encodings should
1507 eventually be supported.
1509 @node Localization, , Character Conversions, Top
1510 @comment node-name, next, previous, up
1511 @chapter Localization
1513 There are many parts of the Java standard runtime library that must
1514 be customized to the particular locale the program is being run in.
1515 These include the parsing and display of dates, times, and numbers;
1516 sorting words alphabetically; breaking sentences into words, etc.
1517 In general, Classpath uses general classes for performing these tasks,
1518 and customizes their behavior with configuration data specific to a
1522 * String Collation:: Sorting strings in different locales
1523 * Break Iteration:: Breaking up text into words, sentences, and lines
1524 * Date Formatting and Parsing:: Locale specific date handling
* Decimal/Currency Formatting and Parsing:: Locale specific number handling
1528 In Classpath, all locale specific data is stored in a
1529 @code{ListResourceBundle} class in the package @code{gnu/java/locale}.
1530 The basename of the bundle is @code{LocaleInformation}. See the
1531 documentation for the @code{java.util.ResourceBundle} class for details
1532 on how the specific locale classes should be named.
@code{ListResourceBundle}s are used instead of
@code{PropertyResourceBundle}s because data more complex than simple
strings needs to be provided to configure certain Classpath components.
Because @code{ListResourceBundle} allows an arbitrary Java object to
be associated with a given configuration option, it provides the
needed flexibility to accommodate Classpath's needs.
1541 Each Java library component that can be localized requires that certain
1542 configuration options be specified in the resource bundle for it. It is
1543 important that each and every option be supplied for a specific
1544 component or a critical runtime error will most likely result.
1546 As a standard, each option should be assigned a name that is a string.
If the value is stored in a class or instance variable, then the option
name should be the same as the variable name. Also, the value
1549 associated with each option should be a Java object with the same name
1550 as the option name (unless a simple scalar value is used). Here is an
1553 A class loads a value for the @code{format_string} variable from the
1554 resource bundle in the specified locale. Here is the code in the
ResourceBundle rb =
  ResourceBundle.getBundle ("gnu/java/locale/LocaleInformation", locale);
String format_string = rb.getString ("format_string");
1563 In the actual resource bundle class, here is how the configuration option
1568 * This is the format string used for displaying values
1570 private static final String format_string = "%s %d %i";
1572 private static final Object[][] contents =
1574 @{ "format_string", format_string @}
1578 Note that each variable should be @code{private}, @code{final}, and
1579 @code{static}. Each variable should also have a description of what it
1580 does as a documentation comment. The @code{getContents()} method returns
1581 the @code{contents} array.
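Put together, a complete bundle class following these conventions might
look like the sketch below. The class names @code{DemoInformation} and
@code{BundleDemo} are hypothetical, and the bundle is instantiated
directly to keep the example self-contained; library code would
normally obtain it through @code{ResourceBundle.getBundle}.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

// Hypothetical bundle; the real data lives in gnu/java/locale.
class DemoInformation extends ListResourceBundle {
    /**
     * This is the format string used for displaying values.
     */
    private static final String format_string = "%s %d %i";

    private static final Object[][] contents = {
        { "format_string", format_string }
    };

    // Hands the key/value table to the ListResourceBundle machinery.
    @Override
    protected Object[][] getContents() {
        return contents;
    }
}

final class BundleDemo {
    // Looks up the option by the same name as the variable that holds it.
    static String formatString(ResourceBundle rb) {
        return rb.getString("format_string");
    }

    public static void main(String[] args) {
        System.out.println(formatString(new DemoInformation()));
    }
}
```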
1583 There are many functional areas of the standard class library that are
1584 configured using this mechanism. A given locale does not need to support
1585 each functional area. But if a functional area is supported, then all
1586 of the specified entries for that area must be supplied. In order to
1587 determine which functional areas are supported, there is a special key
1588 that is queried by the affected class or classes. If this key exists,
1589 and has a value that is a @code{Boolean} object wrapping the
1590 @code{true} value, then full support is assumed. Otherwise it is
1591 assumed that no support exists for this functional area. Every class
1592 using resources for configuration must use this scheme and define a special
1593 key that indicates the functional area is supported. Simply checking
1594 for the resource bundle's existence is not sufficient to ensure that a
1595 given functional area is supported.
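The support check described above could be sketched roughly as follows.
The helper name @code{isSupported} and the @code{DemoInformation} bundle
are hypothetical and are not taken from the GNU Classpath sources; the
sketch only assumes the standard @code{java.util.ResourceBundle} API.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

class SupportCheckDemo {
  /** Hypothetical bundle that supports collation only. */
  static class DemoInformation extends ListResourceBundle {
    private static final Object[][] contents = {
      { "RuleBasedCollator", Boolean.TRUE },
      { "collation_rules", "< a < b < c" }
    };

    @Override
    protected Object[][] getContents() {
      return contents;
    }
  }

  /**
   * Returns true only if the special key exists and holds Boolean.TRUE.
   * A missing key, or any other value, is treated as "not supported".
   */
  static boolean isSupported(ResourceBundle rb, String functionalArea) {
    try {
      return Boolean.TRUE.equals(rb.getObject(functionalArea));
    } catch (MissingResourceException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    ResourceBundle rb = new DemoInformation();
    System.out.println(isSupported(rb, "RuleBasedCollator"));
    System.out.println(isSupported(rb, "BreakIterator"));
  }
}
```

Note that the bundle exists in both cases; only the presence of the key
decides whether the functional area is considered supported.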
1597 The following sections define the functional areas that use resources
1598 for locale-specific configuration in GNU Classpath. Please refer to the
1599 documentation for the classes mentioned for details on how these values
1600 are used. You may also wish to look at the source file for
1601 @file{gnu/java/locale/LocaleInformation_en} as an example.
1603 @node String Collation, Break Iteration, Localization, Localization
1604 @comment node-name, next, previous, up
1605 @section String Collation
1607 Collation involves the sorting of strings. The Java class library provides
1608 a public class called @code{java.text.RuleBasedCollator} that performs
1609 sorting based on a set of sorting rules.
1612 @item RuleBasedCollator - A @code{Boolean} wrapping @code{true} to indicate
1613 that this functional area is supported.
1614 @item collation_rules - The rules that specify how string collation is to
1618 Note that some languages might be too complex for @code{RuleBasedCollator}
1619 to handle. In this case an entirely new class might need to be written in
1620 lieu of defining this rule string.
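As an illustration of what a @code{collation_rules} value does, here is a
small sketch that feeds a rule string to @code{java.text.RuleBasedCollator}.
The rule string and the class name @code{CollationDemo} are made up for this
example; a real locale would supply a much longer rule set through the
resource bundle.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

class CollationDemo {
  // Hypothetical collation_rules value: c sorts before b, and b before a.
  static final String COLLATION_RULES = "< c < b < a";

  /** Returns a sorted copy of the input using the rule string above. */
  static String[] sort(String[] input) {
    RuleBasedCollator collator;
    try {
      collator = new RuleBasedCollator(COLLATION_RULES);
    } catch (ParseException e) {
      throw new IllegalArgumentException("bad collation rules", e);
    }
    String[] result = input.clone();
    Arrays.sort(result, collator);
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(sort(new String[] { "a", "b", "c" })));
  }
}
```

With these rules the three letters come out in reverse alphabetical order,
which shows that the ordering is driven entirely by the rule string.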
1622 @node Break Iteration, Date Formatting and Parsing, String Collation, Localization
1623 @comment node-name, next, previous, up
1624 @section Break Iteration
1626 The class @code{java.text.BreakIterator} breaks text into words, sentences,
1627 and lines. It is configured with the following resource bundle entries:
1630 @item BreakIterator - A @code{Boolean} wrapping @code{true} to indicate
1631 that this functional area is supported.
1632 @item word_breaks - A @code{String} array of word break character sequences.
1633 @item sentence_breaks - A @code{String} array of sentence break character
1635 @item line_breaks - A @code{String} array of line break character sequences.
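For readers unfamiliar with the class, the following sketch shows how client
code typically uses @code{java.text.BreakIterator} to split text into words.
It uses only the public API and says nothing about how the entries above are
consumed internally; the class name @code{WordBreakDemo} is hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WordBreakDemo {
  /** Returns the pieces between word boundaries that start with a letter or digit. */
  static List<String> words(String text) {
    BreakIterator it = BreakIterator.getWordInstance(Locale.US);
    it.setText(text);
    List<String> result = new ArrayList<>();
    int start = it.first();
    for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
      String piece = text.substring(start, end);
      // Boundaries also delimit spaces and punctuation; skip those pieces.
      if (Character.isLetterOrDigit(piece.charAt(0))) {
        result.add(piece);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(words("Hello, world."));
  }
}
```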
1638 @node Date Formatting and Parsing, Decimal/Currency Formatting and Parsing, Break Iteration, Localization
1639 @comment node-name, next, previous, up
1640 @section Date Formatting and Parsing
1642 Date formatting and parsing is handled by the
1643 @code{java.text.SimpleDateFormat} class in most locales. This class is
1644 configured by attaching an instance of the @code{java.text.DateFormatSymbols}
1645 class. That class simply reads properties from our locale-specific
1646 resource bundle. The following items are required (refer to the
1647 documentation of the @code{java.text.DateFormatSymbols} class for details
1648 on what the actual values should be):
1651 @item DateFormatSymbols - A @code{Boolean} wrapping @code{true} to indicate
1652 that this functional area is supported.
1653 @item months - A @code{String} array of month names.
1654 @item shortMonths - A @code{String} array of abbreviated month names.
1655 @item weekdays - A @code{String} array of weekday names.
1656 @item shortWeekdays - A @code{String} array of abbreviated weekday names.
1657 @item ampms - A @code{String} array containing AM/PM names.
1658 @item eras - A @code{String} array containing era (i.e., BC/AD) names.
1659 @item zoneStrings - An array of information about valid timezones for this
1661 @item localPatternChars - A @code{String} defining date/time pattern symbols.
1662 @item shortDateFormat - The format string for dates used by
1663 @code{DateFormat.SHORT}
1664 @item mediumDateFormat - The format string for dates used by
1665 @code{DateFormat.MEDIUM}
1666 @item longDateFormat - The format string for dates used by
1667 @code{DateFormat.LONG}
1668 @item fullDateFormat - The format string for dates used by
1669 @code{DateFormat.FULL}
1670 @item shortTimeFormat - The format string for times used by
1671 @code{DateFormat.SHORT}
1672 @item mediumTimeFormat - The format string for times used by
1673 @code{DateFormat.MEDIUM}
1674 @item longTimeFormat - The format string for times used by
1675 @code{DateFormat.LONG}
1676 @item fullTimeFormat - The format string for times used by
1677 @code{DateFormat.FULL}
1680 Note that it may not be possible to use this mechanism for all locales.
1681 In those cases a special purpose class may need to be written to handle
1682 date/time processing.
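To show how such values end up affecting output, here is a small sketch in
which a @code{months} array is attached to a @code{SimpleDateFormat} through
@code{DateFormatSymbols}. The Dutch month names and the class name
@code{DateSymbolsDemo} are illustrative assumptions; in GNU Classpath the
array would come from the locale's resource bundle rather than a constant.

```java
import java.text.DateFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class DateSymbolsDemo {
  // Hypothetical "months" entry. The trailing empty slot is an assumption
  // that mirrors the 13-element arrays used by the reference JDK.
  static final String[] MONTHS = {
    "januari", "februari", "maart", "april", "mei", "juni", "juli",
    "augustus", "september", "oktober", "november", "december", ""
  };

  /** Formats a date with the month names above instead of the defaults. */
  static String format(Date date) {
    DateFormatSymbols symbols = new DateFormatSymbols(Locale.ROOT);
    symbols.setMonths(MONTHS);
    return new SimpleDateFormat("d MMMM yyyy", symbols).format(date);
  }

  public static void main(String[] args) {
    Date date = new GregorianCalendar(2005, Calendar.MARCH, 15).getTime();
    System.out.println(format(date));
  }
}
```

The pattern letter @code{MMMM} selects the full month name, so the custom
array is what appears in the formatted result.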
1684 @node Decimal/Currency Formatting and Parsing, , Date Formatting and Parsing, Localization
1685 @comment node-name, next, previous, up
1686 @section Decimal/Currency Formatting and Parsing
1688 @code{NumberFormat} is an abstract class for formatting and parsing numbers.
1689 The class @code{DecimalFormat} provides a concrete subclass that handles
1690 this in a locale-independent manner. As with @code{SimpleDateFormat},
1691 this class gets information on how to format numbers from a class that
1692 wraps a collection of locale-specific formatting values. In this case,
1693 the class is @code{DecimalFormatSymbols}. That class reads its default
1694 values for a locale from the resource bundle. The required entries are:
1697 @item DecimalFormatSymbols - A @code{Boolean} wrapping @code{true} to
1698 indicate that this functional area is supported.
1699 @item currencySymbol - The string representing the local currency.
1700 @item intlCurrencySymbol - The string representing the local currency in an
1701 international context.
1702 @item decimalSeparator - The character to use as the decimal point as a
1704 @item digit - The character used to represent digits in a format string,
1706 @item exponential - The character used to represent the exponent separator of a
1707 number written in scientific notation, as a @code{String}.
1708 @item groupingSeparator - The character used to separate groups of numbers
1709 in a large number, such as the ``,'' separator for thousands in the US, as
1711 @item infinity - The string representing infinity.
1712 @item NaN - The string representing the Java not-a-number value.
1713 @item minusSign - The character representing the negative sign, as a
1715 @item monetarySeparator - The decimal point used in currency values, as a
1717 @item patternSeparator - The character used to separate positive and
1718 negative format patterns, as a @code{String}.
1719 @item percent - The percent sign, as a @code{String}.
1720 @item perMill - The per mille sign, as a @code{String}.
1721 @item zeroDigit - The character representing the digit zero, as a @code{String}.
1724 Note that several of these values are individual characters. These should
1725 be wrapped in a @code{String} at character position 0, not in a
1726 @code{Character} object.
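The single-character convention can be sketched as follows: each value is
stored as a one-character @code{String} and read back with
@code{charAt(0)}. The bundle @code{DemoInformation}, its values, and the
class name @code{DecimalSymbolsDemo} are hypothetical; the sketch applies
the values by hand through the public setters of
@code{DecimalFormatSymbols} instead of relying on any internal lookup.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

class DecimalSymbolsDemo {
  /** Hypothetical bundle with European-style separators. */
  static class DemoInformation extends ListResourceBundle {
    private static final Object[][] contents = {
      { "DecimalFormatSymbols", Boolean.TRUE },
      { "decimalSeparator", "," },   // a String, not a Character
      { "groupingSeparator", "." }
    };

    @Override
    protected Object[][] getContents() {
      return contents;
    }
  }

  /** Formats a value using the separators stored in the bundle. */
  static String format(double value) {
    ResourceBundle rb = new DemoInformation();
    DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
    // The character lives at position 0 of the stored String.
    symbols.setDecimalSeparator(rb.getString("decimalSeparator").charAt(0));
    symbols.setGroupingSeparator(rb.getString("groupingSeparator").charAt(0));
    return new DecimalFormat("#,##0.00", symbols).format(value);
  }

  public static void main(String[] args) {
    System.out.println(format(1234567.891));
  }
}
```

Storing the values as strings lets every entry be read with
@code{getString()}, at the cost of one @code{charAt(0)} call per character.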