release/src/router/gettext/gettext-tools/doc/gettext_13.html

   1 <HTML>
   2 <HEAD>
   3 <!-- This HTML file has been created by texi2html 1.52a
   4      from gettext.texi on 23 May 2005 -->
   5
   6 <TITLE>GNU gettext utilities - 13  Other Programming Languages</TITLE>
   7 </HEAD>
   8 <BODY>
   9 Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_12.html">previous</A>, <A HREF="gettext_14.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
  10 <P><HR><P>
  11
  12
  13 <H1><A NAME="SEC221" HREF="gettext_toc.html#TOC221">13  Other Programming Languages</A></H1>
  14
  15 <P>
  16 While the presentation of <CODE>gettext</CODE> focuses mostly on C and
  17 implicitly applies to C++ as well, its scope is far broader than that:
  18 Many programming languages, scripting languages and other textual data
  19 like GUI resources or package descriptions can make use of the gettext
  20 approach.
  21
  22 </P>
  23
  24
  25
  26 <H2><A NAME="SEC222" HREF="gettext_toc.html#TOC222">13.1  The Language Implementor's View</A></H2>
  27 <P>
  28 <A NAME="IDX1072"></A>
  29 <A NAME="IDX1073"></A>
  30
  31 </P>
  32 <P>
  33 All programming and scripting languages that have the notion of strings
   34 are eligible to support <CODE>gettext</CODE>.  Supporting <CODE>gettext</CODE>
  35 means the following:
  36
  37 </P>
  38
  39 <OL>
  40 <LI>
  41
  42 You should add to the language a syntax for translatable strings.  In
  43 principle, a function call of <CODE>gettext</CODE> would do, but a shorthand
   44 syntax helps keep internationalized programs legible.  For
  45 example, in C we use the syntax <CODE>_("string")</CODE>, and in GNU awk we use
  46 the shorthand <CODE>_"string"</CODE>.
  47
  48 <LI>
  49
  50 You should arrange that evaluation of such a translatable string at
  51 runtime calls the <CODE>gettext</CODE> function, or performs equivalent
  52 processing.
  53
  54 <LI>
  55
  56 Similarly, you should make the functions <CODE>ngettext</CODE>,
  57 <CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> available from within the language.
  58 These functions are less often used, but are nevertheless necessary for
  59 particular purposes: <CODE>ngettext</CODE> for correct plural handling, and
   60 <CODE>dcgettext</CODE> and <CODE>dcngettext</CODE> for obeying locale environment
   61 variables other than <CODE>LC_MESSAGES</CODE>, such as <CODE>LC_TIME</CODE> or
  62 <CODE>LC_MONETARY</CODE>.  For these latter functions, you need to make the
  63 <CODE>LC_*</CODE> constants, available in the C header <CODE>&#60;locale.h&#62;</CODE>,
  64 referenceable from within the language, usually either as enumeration
  65 values or as strings.
  66
  67 <LI>
  68
  69 You should allow the programmer to designate a message domain, either by
  70 making the <CODE>textdomain</CODE> function available from within the
  71 language, or by introducing a magic variable called <CODE>TEXTDOMAIN</CODE>.
  72 Similarly, you should allow the programmer to designate where to search
  73 for message catalogs, by providing access to the <CODE>bindtextdomain</CODE>
  74 function.
  75
  76 <LI>
  77
  78 You should either perform a <CODE>setlocale (LC_ALL, "")</CODE> call during
  79 the startup of your language runtime, or allow the programmer to do so.
  80 Remember that gettext will act as a no-op if the <CODE>LC_MESSAGES</CODE> and
  81 <CODE>LC_CTYPE</CODE> locale facets are not both set.
  82
  83 <LI>
  84
  85 A programmer should have a way to extract translatable strings from a
  86 program into a PO file.  The GNU <CODE>xgettext</CODE> program is being
  87 extended to support very different programming languages.  Please
   88 contact the GNU <CODE>gettext</CODE> maintainers to help them do this.  If
  89 the string extractor is best integrated into your language's parser, GNU
  90 <CODE>xgettext</CODE> can function as a front end to your string extractor.
  91
  92 <LI>
  93
  94 The language's library should have a string formatting facility where
  95 the arguments of a format string are denoted by a positional number or a
  96 name.  This is needed because for some languages and some messages with
  97 more than one substitutable argument, the translation will need to
   98 output the substituted arguments in a different order.  See section <A HREF="gettext_3.html#SEC18">3.5  Special Comments preceding Keywords</A>.
  99
 100 <LI>
 101
 102 If the language has more than one implementation, and not all of the
 103 implementations use <CODE>gettext</CODE>, but the programs should be portable
  104 across implementations, you should provide a no-i18n emulation that
 105 makes the other implementations accept programs written for yours,
 106 without actually translating the strings.
 107
 108 <LI>
 109
 110 To help the programmer in the task of marking translatable strings,
 111 which is usually performed using the Emacs PO mode, you are welcome to
 112 contact the GNU <CODE>gettext</CODE> maintainers, so they can add support for
 113 your language to <TT>`po-mode.el&acute;</TT>.
 114 </OL>
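<P>
As one illustration of the checklist above, the Python standard library
exposes these entry points through its <CODE>gettext</CODE> module.  The
following is a minimal sketch, not a reference binding: the domain name
<CODE>example-sketch-domain</CODE> and the catalog directory are hypothetical,
and because no catalog exists there, every lookup falls back to the original
string.
</P>

```python
import gettext

# Hypothetical domain and catalog directory, chosen only for illustration.
DOMAIN = "example-sketch-domain"
LOCALEDIR = "/nonexistent/locale"

# Designate where catalogs are searched and which domain is the default.
gettext.bindtextdomain(DOMAIN, LOCALEDIR)
gettext.textdomain(DOMAIN)

# Conventional shorthand for marking and translating a string.
_ = gettext.gettext


def files_message(count):
    """Return a count-dependent message using plural handling."""
    # ngettext picks the singular or plural form for the given count.
    template = gettext.ngettext("%(n)d file", "%(n)d files", count)
    return template % {"n": count}


greeting = _("Hello, world!")
```

<P>
With a real catalog installed under the bound directory, the same calls
would return translated text without any change to the program.
</P>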
 115
 116 <P>
 117 On the implementation side, three approaches are possible, with
 118 different effects on portability and copyright:
 119
 120 </P>
 121
 122 <UL>
 123 <LI>
 124
 125 You may integrate the GNU <CODE>gettext</CODE>'s <TT>`intl/&acute;</TT> directory in
 126 your package, as described in section <A HREF="gettext_12.html#SEC192">12  The Maintainer's View</A>.  This allows you to
 127 have internationalization on all kinds of platforms.  Note that when you
 128 then distribute your package, it legally falls under the GNU General
 129 Public License, and the GNU project will be glad about your contribution
 130 to the Free Software pool.
 131
 132 <LI>
 133
 134 You may link against GNU <CODE>gettext</CODE> functions if they are found in
 135 the C library.  For example, an autoconf test for <CODE>gettext()</CODE> and
 136 <CODE>ngettext()</CODE> will detect this situation.  For the moment, this test
 137 will succeed on GNU systems and not on other platforms.  No severe
 138 copyright restrictions apply.
 139
 140 <LI>
 141
 142 You may emulate or reimplement the GNU <CODE>gettext</CODE> functionality.
 143 This has the advantage of full portability and no copyright
 144 restrictions, but also the drawback that you have to reimplement the GNU
 145 <CODE>gettext</CODE> features (such as the <CODE>LANGUAGE</CODE> environment
 146 variable, the locale aliases database, the automatic charset conversion,
 147 and plural handling).
 148 </UL>
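<P>
To give a feel for the reimplementation effort mentioned in the last item,
here is a rough Python sketch of one small piece, the language priority
list.  It assumes that <CODE>LANGUAGE</CODE> holds a colon-separated list and
otherwise falls back to <CODE>LC_ALL</CODE>, <CODE>LC_MESSAGES</CODE> and
<CODE>LANG</CODE>.  The function name <CODE>language_priority</CODE> is
hypothetical, and the sketch deliberately ignores refinements such as locale
aliases and the special treatment of the C locale.
</P>

```python
def language_priority(environ):
    """Return candidate locale names in order of preference (sketch)."""
    # LANGUAGE, when set and non-empty, is a colon-separated priority list.
    language = environ.get("LANGUAGE", "")
    if language:
        return [name for name in language.split(":") if name]
    # Otherwise consult the usual locale variables in order of precedence.
    for variable in ("LC_ALL", "LC_MESSAGES", "LANG"):
        value = environ.get(variable, "")
        if value:
            return [value]
    # Nothing set: behave as in the untranslated C locale.
    return ["C"]
```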
 149
 150
 151
 152 <H2><A NAME="SEC223" HREF="gettext_toc.html#TOC223">13.2  The Programmer's View</A></H2>
 153
 154 <P>
 155 For the programmer, the general procedure is the same as for the C
 156 language.  The Emacs PO mode supports other languages, and the GNU
 157 <CODE>xgettext</CODE> string extractor recognizes other languages based on the
 158 file extension or a command-line option.  In some languages,
 159 <CODE>setlocale</CODE> is not needed because it is already performed by the
 160 underlying language runtime.
 161
 162 </P>
 163
 164
 165 <H2><A NAME="SEC224" HREF="gettext_toc.html#TOC224">13.3  The Translator's View</A></H2>
 166
 167 <P>
 168 The translator works exactly as in the C language case.  The only
 169 difference is that when translating format strings, she has to be aware
 170 of the language's particular syntax for positional arguments in format
 171 strings.
 172
 173 </P>
 174
 175
 176
 177 <H3><A NAME="SEC225" HREF="gettext_toc.html#TOC225">13.3.1  C Format Strings</A></H3>
 178
 179 <P>
 180 C format strings are described in POSIX (IEEE P1003.1 2001), section
 181 XSH 3 fprintf(),
 182 <A HREF="http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html">http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html</A>.
 183 See also the fprintf(3) manual page,
 184 <A HREF="http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php">http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php</A>,
 185 <A HREF="http://informatik.fh-wuerzburg.de/student/i510/man/printf.html">http://informatik.fh-wuerzburg.de/student/i510/man/printf.html</A>.
 186
 187 </P>
 188 <P>
 189 Although format strings with positions that reorder arguments, such as
 190
 191 </P>
 192
 193 <PRE>
 194 "Only %2$d bytes free on '%1$s'."
 195 </PRE>
 196
 197 <P>
 198 which is semantically equivalent to
 199
 200 </P>
 201
 202 <PRE>
 203 "'%s' has only %d bytes free."
 204 </PRE>
 205
 206 <P>
 207 are a POSIX/XSI feature and not specified by ISO C 99, translators can rely
 208 on this reordering ability: On the few platforms where <CODE>printf()</CODE>,
 209 <CODE>fprintf()</CODE> etc. don't support this feature natively, <TT>`libintl.a&acute;</TT>
 210 or <TT>`libintl.so&acute;</TT> provides replacement functions, and GNU <CODE>&#60;libintl.h&#62;</CODE>
 211 activates these replacement functions automatically.
 212
 213 </P>
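<P>
The effect of numbered directives can be sketched in Python.  The helper
<CODE>format_positional</CODE> below is hypothetical and handles only
<CODE>%<VAR>n</VAR>$s</CODE> and <CODE>%<VAR>n</VAR>$d</CODE>; it is meant to
show how a position selects an argument, not to reproduce
<CODE>printf()</CODE>.
</P>

```python
import re

# Matches a numbered directive such as %1$s or %2$d (only s and d here).
_POSITIONAL = re.compile(r"%(\d+)\$([sd])")


def format_positional(fmt, *args):
    """Expand numbered %n$s and %n$d directives against args (sketch)."""
    def expand(match):
        value = args[int(match.group(1)) - 1]  # positions are 1-based
        if match.group(2) == "d":
            return str(int(value))
        return str(value)
    return _POSITIONAL.sub(expand, fmt)


# The same two arguments, consumed in a different order by each string.
english = format_positional("'%1$s' has only %2$d bytes free.", "/home", 512)
reordered = format_positional("Only %2$d bytes free on '%1$s'.", "/home", 512)
```

<P>
Both calls receive the arguments in the same order; only the format string
decides where each one appears in the output.
</P>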
 214 <P>
 215 <A NAME="IDX1074"></A>
 216 <A NAME="IDX1075"></A>
 217 As a special feature for Farsi (Persian) and maybe Arabic, translators can
 218 insert an <SAMP>`I&acute;</SAMP> flag into numeric format directives.  For example, the
 219 translation of <CODE>"%d"</CODE> can be <CODE>"%Id"</CODE>.  The effect of this flag,
 220 on systems with GNU <CODE>libc</CODE>, is that in the output, the ASCII digits are
 221 replaced with the <SAMP>`outdigits&acute;</SAMP> defined in the <CODE>LC_CTYPE</CODE> locale
 222 facet.  On other systems, the <CODE>gettext</CODE> function removes this flag,
 223 so that it has no effect.
 224
 225 </P>
 226 <P>
 227 Note that the programmer should <EM>not</EM> put this flag into the
 228 untranslated string.  (Putting the <SAMP>`I&acute;</SAMP> format directive flag into an
 229 <VAR>msgid</VAR> string would lead to undefined behaviour on platforms without
 230 glibc when NLS is disabled.)
 231
 232 </P>
 233
 234
 235 <H3><A NAME="SEC226" HREF="gettext_toc.html#TOC226">13.3.2  Objective C Format Strings</A></H3>
 236
 237 <P>
 238 Objective C format strings are like C format strings.  They support an
  239 additional format directive: "%@", which when executed consumes an argument
 240 of type <CODE>Object *</CODE>.
 241
 242 </P>
 243
 244
 245 <H3><A NAME="SEC227" HREF="gettext_toc.html#TOC227">13.3.3  Shell Format Strings</A></H3>
 246
 247 <P>
 248 Shell format strings, as supported by GNU gettext and the <SAMP>`envsubst&acute;</SAMP>
 249 program, are strings with references to shell variables in the form
 250 <CODE>$<VAR>variable</VAR></CODE> or <CODE>${<VAR>variable</VAR>}</CODE>.  References of the form
 251 <CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>,
 252 <CODE>${<VAR>variable</VAR>:-<VAR>default</VAR>}</CODE>,
 253 <CODE>${<VAR>variable</VAR>=<VAR>default</VAR>}</CODE>,
 254 <CODE>${<VAR>variable</VAR>:=<VAR>default</VAR>}</CODE>,
 255 <CODE>${<VAR>variable</VAR>+<VAR>replacement</VAR>}</CODE>,
 256 <CODE>${<VAR>variable</VAR>:+<VAR>replacement</VAR>}</CODE>,
 257 <CODE>${<VAR>variable</VAR>?<VAR>ignored</VAR>}</CODE>,
 258 <CODE>${<VAR>variable</VAR>:?<VAR>ignored</VAR>}</CODE>,
 259 that would be valid inside shell scripts, are not supported.  The
 260 <VAR>variable</VAR> names must consist solely of alphanumeric or underscore
 261 ASCII characters, not start with a digit and be nonempty; otherwise such
 262 a variable reference is ignored.
 263
 264 </P>
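<P>
The rules above can be approximated by a short Python sketch.  The function
name <CODE>substitute</CODE> is hypothetical, and replacing an unset variable
with the empty string is an assumption of this sketch rather than something
stated in this section.
</P>

```python
import re

# $name or ${name}; the name is ASCII, nonempty and does not start with a digit.
_REFERENCE = re.compile(
    r"\$(?:([A-Za-z_][A-Za-z0-9_]*)|\{([A-Za-z_][A-Za-z0-9_]*)\})"
)


def substitute(fmt, variables):
    """Replace supported variable references; leave everything else alone."""
    def expand(match):
        name = match.group(1) or match.group(2)
        # Assumption: an unset variable expands to the empty string.
        return variables.get(name, "")
    return _REFERENCE.sub(expand, fmt)
```

<P>
Unsupported forms such as <CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>
and references that start with a digit do not match the pattern, so they
pass through unchanged.
</P>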
 265
 266
 267 <H3><A NAME="SEC228" HREF="gettext_toc.html#TOC228">13.3.4  Python Format Strings</A></H3>
 268
 269 <P>
 270 Python format strings are described in
 271 Python Library reference /
 272 2. Built-in Types, Exceptions and Functions /
 273 2.2. Built-in Types /
 274 2.2.6. Sequence Types /
 275 2.2.6.2. String Formatting Operations.
 276 <A HREF="http://www.python.org/doc/2.2.1/lib/typesseq-strings.html">http://www.python.org/doc/2.2.1/lib/typesseq-strings.html</A>.
 277
 278 </P>
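<P>
For translators, the relevant Python feature is the named argument form
<CODE>%(<VAR>name</VAR>)s</CODE>, which lets a translation reorder arguments
freely.  A small sketch follows; the helper <CODE>disk_report</CODE> and the
second string, which stands in for a translation, are illustrative only.
</P>

```python
def disk_report(template, path, free_bytes):
    """Fill a format string that refers to its arguments by name."""
    return template % {"path": path, "free": free_bytes}


# The original string and a stand-in "translation" that swaps the order.
msgid = "'%(path)s' has only %(free)d bytes free."
reordered = "Only %(free)d bytes free on '%(path)s'."
```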
 279
 280
 281 <H3><A NAME="SEC229" HREF="gettext_toc.html#TOC229">13.3.5  Lisp Format Strings</A></H3>
 282
 283 <P>
 284 Lisp format strings are described in the Common Lisp HyperSpec,
 285 chapter 22.3 Formatted Output,
 286 <A HREF="http://www.lisp.org/HyperSpec/Body/sec_22-3.html">http://www.lisp.org/HyperSpec/Body/sec_22-3.html</A>.
 287
 288 </P>
 289
 290
 291 <H3><A NAME="SEC230" HREF="gettext_toc.html#TOC230">13.3.6  Emacs Lisp Format Strings</A></H3>
 292
 293 <P>
 294 Emacs Lisp format strings are documented in the Emacs Lisp reference,
 295 section Formatting Strings,
 296 <A HREF="http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75">http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75</A>.
 297 Note that as of version 21, XEmacs supports numbered argument specifications
 298 in format strings while FSF Emacs doesn't.
 299
 300 </P>
 301
 302
 303 <H3><A NAME="SEC231" HREF="gettext_toc.html#TOC231">13.3.7  librep Format Strings</A></H3>
 304
 305 <P>
 306 librep format strings are documented in the librep manual, section
 307 Formatted Output,
 308 <A HREF="http://librep.sourceforge.net/librep-manual.html#Formatted%20Output">http://librep.sourceforge.net/librep-manual.html#Formatted%20Output</A>,
 309 <A HREF="http://www.gwinnup.org/research/docs/librep.html#SEC122">http://www.gwinnup.org/research/docs/librep.html#SEC122</A>.
 310
 311 </P>
 312
 313
 314 <H3><A NAME="SEC232" HREF="gettext_toc.html#TOC232">13.3.8  Scheme Format Strings</A></H3>
 315
 316 <P>
 317 Scheme format strings are documented in the SLIB manual, section
 318 Format Specification.
 319
 320 </P>
 321
 322
 323 <H3><A NAME="SEC233" HREF="gettext_toc.html#TOC233">13.3.9  Smalltalk Format Strings</A></H3>
 324
 325 <P>
 326 Smalltalk format strings are described in the GNU Smalltalk documentation,
 327 class <CODE>CharArray</CODE>, methods <SAMP>`bindWith:&acute;</SAMP> and
 328 <SAMP>`bindWithArguments:&acute;</SAMP>.
 329 <A HREF="http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238">http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238</A>.
 330 In summary, a directive starts with <SAMP>`%&acute;</SAMP> and is followed by <SAMP>`%&acute;</SAMP>
 331 or a nonzero digit (<SAMP>`1&acute;</SAMP> to <SAMP>`9&acute;</SAMP>).
 332
 333 </P>
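<P>
The directive syntax summarized above can be modelled in Python as follows.
This is a sketch of the syntax only, not of GNU Smalltalk itself: the name
<CODE>bind_with</CODE> is hypothetical, and treating <SAMP>`%%&acute;</SAMP>
as a literal percent sign is an assumption.
</P>

```python
import re

# A directive is % followed by % or by a digit from 1 to 9.
_DIRECTIVE = re.compile(r"%([%1-9])")


def bind_with(fmt, *args):
    """Expand %1 to %9 against args; assume %% yields a literal percent."""
    def expand(match):
        token = match.group(1)
        if token == "%":
            return "%"
        return str(args[int(token) - 1])  # %1 refers to the first argument
    return _DIRECTIVE.sub(expand, fmt)
```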
 334
 335
 336 <H3><A NAME="SEC234" HREF="gettext_toc.html#TOC234">13.3.10  Java Format Strings</A></H3>
 337
 338 <P>
 339 Java format strings are described in the JDK documentation for class
 340 <CODE>java.text.MessageFormat</CODE>,
 341 <A HREF="http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html">http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html</A>.
 342 See also the ICU documentation
 343 <A HREF="http://oss.software.ibm.com/icu/apiref/classMessageFormat.html">http://oss.software.ibm.com/icu/apiref/classMessageFormat.html</A>.
 344
 345 </P>
 346
 347
 348 <H3><A NAME="SEC235" HREF="gettext_toc.html#TOC235">13.3.11  C# Format Strings</A></H3>
 349
 350 <P>
 351 C# format strings are described in the .NET documentation for class
 352 <CODE>System.String</CODE> and in
 353 <A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpConFormattingOverview.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpConFormattingOverview.asp</A>.
 354
 355 </P>
 356
 357
 358 <H3><A NAME="SEC236" HREF="gettext_toc.html#TOC236">13.3.12  awk Format Strings</A></H3>
 359
 360 <P>
 361 awk format strings are described in the gawk documentation, section
 362 Printf,
 363 <A HREF="http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf">http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf</A>.
 364
 365 </P>
 366
 367
 368 <H3><A NAME="SEC237" HREF="gettext_toc.html#TOC237">13.3.13  Object Pascal Format Strings</A></H3>
 369
 370 <P>
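  371 Object Pascal format strings are described in the documentation of the Free Pascal runtime library, unit <CODE>sysutils</CODE>, function <CODE>Format</CODE>.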
 372
 373 </P>
 374
 375
 376 <H3><A NAME="SEC238" HREF="gettext_toc.html#TOC238">13.3.14  YCP Format Strings</A></H3>
 377
 378 <P>
 379 YCP sformat strings are described in the libycp documentation
 380 <A HREF="file:/usr/share/doc/packages/libycp/YCP-builtins.html">file:/usr/share/doc/packages/libycp/YCP-builtins.html</A>.
 381 In summary, a directive starts with <SAMP>`%&acute;</SAMP> and is followed by <SAMP>`%&acute;</SAMP>
 382 or a nonzero digit (<SAMP>`1&acute;</SAMP> to <SAMP>`9&acute;</SAMP>).
 383
 384 </P>
 385
 386
 387 <H3><A NAME="SEC239" HREF="gettext_toc.html#TOC239">13.3.15  Tcl Format Strings</A></H3>
 388
 389 <P>
 390 Tcl format strings are described in the <TT>`format.n&acute;</TT> manual page,
 391 <A HREF="http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm">http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm</A>.
 392
 393 </P>
 394
 395
 396 <H3><A NAME="SEC240" HREF="gettext_toc.html#TOC240">13.3.16  Perl Format Strings</A></H3>
 397
 398 <P>
  399 There are two kinds of format strings in Perl: those acceptable to the
 400 Perl built-in function <CODE>printf</CODE>, labelled as <SAMP>`perl-format&acute;</SAMP>,
 401 and those acceptable to the <CODE>libintl-perl</CODE> function <CODE>__x</CODE>,
 402 labelled as <SAMP>`perl-brace-format&acute;</SAMP>.
 403
 404 </P>
 405 <P>
 406 Perl <CODE>printf</CODE> format strings are described in the <CODE>sprintf</CODE>
 407 section of <SAMP>`man perlfunc&acute;</SAMP>.
 408
 409 </P>
 410 <P>
 411 Perl brace format strings are described in the
 412 <TT>`Locale::TextDomain(3pm)&acute;</TT> manual page of the CPAN package
 413 libintl-perl.  In brief, Perl format uses placeholders put between
 414 braces (<SAMP>`{&acute;</SAMP> and <SAMP>`}&acute;</SAMP>).  The placeholder must have the syntax
 415 of simple identifiers.
 416
 417 </P>
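<P>
Brace placeholders can be sketched in Python as shown below.  The function
<CODE>expand_braces</CODE> is hypothetical, and leaving unknown placeholders
as written is an assumption of this sketch, not a statement about
<CODE>__x</CODE>.
</P>

```python
import re

# A placeholder is a simple identifier enclosed in braces.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_braces(fmt, **values):
    """Replace {name} placeholders with the matching keyword argument."""
    def expand(match):
        name = match.group(1)
        # Assumption: a placeholder without a value is left as written.
        return str(values.get(name, match.group(0)))
    return _PLACEHOLDER.sub(expand, fmt)
```

<P>
Because placeholders are named, a translator may move them anywhere in the
translated string without affecting which value is inserted.
</P>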
 418
 419
 420 <H3><A NAME="SEC241" HREF="gettext_toc.html#TOC241">13.3.17  PHP Format Strings</A></H3>
 421
 422 <P>
 423 PHP format strings are described in the documentation of the PHP function
 424 <CODE>sprintf</CODE>, in <TT>`phpdoc/manual/function.sprintf.html&acute;</TT> or
 425 <A HREF="http://www.php.net/manual/en/function.sprintf.php">http://www.php.net/manual/en/function.sprintf.php</A>.
 426
 427 </P>
 428
 429
 430 <H3><A NAME="SEC242" HREF="gettext_toc.html#TOC242">13.3.18  GCC internal Format Strings</A></H3>
 431
 432 <P>
 433 These format strings are used inside the GCC sources.  In such a format
 434 string, a directive starts with <SAMP>`%&acute;</SAMP>, is optionally followed by a
 435 size specifier <SAMP>`l&acute;</SAMP>, an optional flag <SAMP>`+&acute;</SAMP>, another optional flag
 436 <SAMP>`#&acute;</SAMP>, and is finished by a specifier: <SAMP>`%&acute;</SAMP> denotes a literal
 437 percent sign, <SAMP>`c&acute;</SAMP> denotes a character, <SAMP>`s&acute;</SAMP> denotes a string,
 438 <SAMP>`i&acute;</SAMP> and <SAMP>`d&acute;</SAMP> denote an integer, <SAMP>`o&acute;</SAMP>, <SAMP>`u&acute;</SAMP>, <SAMP>`x&acute;</SAMP>
 439 denote an unsigned integer, <SAMP>`.*s&acute;</SAMP> denotes a string preceded by a
 440 width specification, <SAMP>`H&acute;</SAMP> denotes a <SAMP>`location_t *&acute;</SAMP> pointer,
 441 <SAMP>`D&acute;</SAMP> denotes a general declaration, <SAMP>`F&acute;</SAMP> denotes a function
 442 declaration, <SAMP>`T&acute;</SAMP> denotes a type, <SAMP>`A&acute;</SAMP> denotes a function argument,
 443 <SAMP>`C&acute;</SAMP> denotes a tree code, <SAMP>`E&acute;</SAMP> denotes an expression, <SAMP>`L&acute;</SAMP>
 444 denotes a programming language, <SAMP>`O&acute;</SAMP> denotes a binary operator,
 445 <SAMP>`P&acute;</SAMP> denotes a function parameter, <SAMP>`Q&acute;</SAMP> denotes an assignment
 446 operator, <SAMP>`V&acute;</SAMP> denotes a const/volatile qualifier.
 447
 448 </P>
 449
 450
 451 <H3><A NAME="SEC243" HREF="gettext_toc.html#TOC243">13.3.19  Qt Format Strings</A></H3>
 452
 453 <P>
 454 Qt format strings are described in the documentation of the QString class
 455 <A HREF="file:/usr/lib/qt-3.0.5/doc/html/qstring.html">file:/usr/lib/qt-3.0.5/doc/html/qstring.html</A>.
 456 In summary, a directive consists of a <SAMP>`%&acute;</SAMP> followed by a digit. The same
 457 directive cannot occur more than once in a format string.
 458
 459 </P>
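<P>
The uniqueness rule lends itself to a simple check.  The following Python
sketch uses a hypothetical helper, <CODE>duplicate_directives</CODE>, that
reports every directive occurring more than once; it considers only
single-digit directives, as in the summary above.
</P>

```python
import re
from collections import Counter


def duplicate_directives(fmt):
    """Return the directives that occur more than once, sorted (sketch)."""
    counts = Counter(re.findall(r"%(\d)", fmt))
    # Every counted directive occurs at least once, so "not 1" means repeated.
    return sorted("%" + digit for digit, n in counts.items() if n != 1)
```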
 460
 461
 462 <H2><A NAME="SEC244" HREF="gettext_toc.html#TOC244">13.4  The Maintainer's View</A></H2>
 463
 464 <P>
 465 For the maintainer, the general procedure differs from the C language
 466 case in two ways.
 467
 468 </P>
 469
 470 <UL>
 471 <LI>
 472
 473 For those languages that don't use GNU gettext, the <TT>`intl/&acute;</TT> directory
 474 is not needed and can be omitted.  This means that the maintainer calls the
 475 <CODE>gettextize</CODE> program without the <SAMP>`--intl&acute;</SAMP> option, and that he
 476 invokes the <CODE>AM_GNU_GETTEXT</CODE> autoconf macro via
 477 <SAMP>`AM_GNU_GETTEXT([external])&acute;</SAMP>.
 478
 479 <LI>
 480
 481 If only a single programming language is used, the <CODE>XGETTEXT_OPTIONS</CODE>
 482 variable in <TT>`po/Makevars&acute;</TT> (see section <A HREF="gettext_12.html#SEC199">12.4.3  <TT>`Makevars&acute;</TT> in <TT>`po/&acute;</TT></A>) should be adjusted to
 483 match the <CODE>xgettext</CODE> options for that particular programming language.
 484 If the package uses more than one programming language with <CODE>gettext</CODE>
 485 support, it becomes necessary to change the POT file construction rule
 486 in <TT>`po/Makefile.in.in&acute;</TT>.  It is recommended to make one <CODE>xgettext</CODE>
 487 invocation per programming language, each with the options appropriate for
 488 that language, and to combine the resulting files using <CODE>msgcat</CODE>.
 489 </UL>
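<P>
The recommended POT construction for a multi-language package can be
sketched as a list of command lines.  The Python helper below only builds
the argument lists and does not run anything; the function name
<CODE>pot_commands</CODE> and the intermediate file names are hypothetical.
It relies on the <SAMP>`--language&acute;</SAMP> and <SAMP>`-o&acute;</SAMP>
options of <CODE>xgettext</CODE> and the <SAMP>`-o&acute;</SAMP> option of
<CODE>msgcat</CODE>.
</P>

```python
def pot_commands(sources_by_language, output="combined.pot"):
    """Build one xgettext command per language plus a final msgcat."""
    commands = []
    partial_files = []
    for language, sources in sorted(sources_by_language.items()):
        # Hypothetical name for the per-language intermediate file.
        partial = "partial-%s.pot" % language.lower()
        partial_files.append(partial)
        commands.append(
            ["xgettext", "--language=" + language, "-o", partial] + list(sources)
        )
    # Combine the per-language files into a single POT file.
    commands.append(["msgcat", "-o", output] + partial_files)
    return commands
```

<P>
In a real package these commands would replace the single
<CODE>xgettext</CODE> invocation in the POT file construction rule.
</P>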
 490
 491
 492
 493 <H2><A NAME="SEC245" HREF="gettext_toc.html#TOC245">13.5  Individual Programming Languages</A></H2>
 494
 495
 496
 497 <H3><A NAME="SEC246" HREF="gettext_toc.html#TOC246">13.5.1  C, C++, Objective C</A></H3>
 498 <P>
 499 <A NAME="IDX1076"></A>
 500
 501 </P>
 502 <DL COMPACT>
 503
 504 <DT>RPMs
 505 <DD>
 506 gcc, gpp, gobjc, glibc, gettext
 507
 508 <DT>File extension
 509 <DD>
 510 For C: <CODE>c</CODE>, <CODE>h</CODE>.
 511 <BR>For C++: <CODE>C</CODE>, <CODE>c++</CODE>, <CODE>cc</CODE>, <CODE>cxx</CODE>, <CODE>cpp</CODE>, <CODE>hpp</CODE>.
 512 <BR>For Objective C: <CODE>m</CODE>.
 513
 514 <DT>String syntax
 515 <DD>
 516 <CODE>"abc"</CODE>
 517
 518 <DT>gettext shorthand
 519 <DD>
 520 <CODE>_("abc")</CODE>
 521
 522 <DT>gettext/ngettext functions
 523 <DD>
 524 <CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
 525 <CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
 526
 527 <DT>textdomain
 528 <DD>
 529 <CODE>textdomain</CODE> function
 530
 531 <DT>bindtextdomain
 532 <DD>
 533 <CODE>bindtextdomain</CODE> function
 534
 535 <DT>setlocale
 536 <DD>
 537 Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>
 538
 539 <DT>Prerequisite
 540 <DD>
 541 <CODE>#include &#60;libintl.h&#62;</CODE>
 542 <BR><CODE>#include &#60;locale.h&#62;</CODE>
 543 <BR><CODE>#define _(string) gettext (string)</CODE>
 544
 545 <DT>Use or emulate GNU gettext
 546 <DD>
 547 Use
 548
 549 <DT>Extractor
 550 <DD>
 551 <CODE>xgettext -k_</CODE>
 552
 553 <DT>Formatting with positions
 554 <DD>
 555 <CODE>fprintf "%2$d %1$d"</CODE>
 556 <BR>In C++: <CODE>autosprintf "%2$d %1$d"</CODE>
 557 (see section `Introduction' in <CITE>GNU autosprintf</CITE>)
 558
 559 <DT>Portability
 560 <DD>
 561 autoconf (gettext.m4) and #if ENABLE_NLS
 562
 563 <DT>po-mode marking
 564 <DD>
 565 yes
 566 </DL>
 567
 568 <P>
 569 The following examples are available in the <TT>`examples&acute;</TT> directory:
 570 <CODE>hello-c</CODE>, <CODE>hello-c-gnome</CODE>, <CODE>hello-c++</CODE>, <CODE>hello-c++-qt</CODE>,
 571 <CODE>hello-c++-kde</CODE>, <CODE>hello-c++-gnome</CODE>, <CODE>hello-objc</CODE>,
 572 <CODE>hello-objc-gnustep</CODE>, <CODE>hello-objc-gnome</CODE>.
 573
 574 </P>
 575
 576
 577 <H3><A NAME="SEC247" HREF="gettext_toc.html#TOC247">13.5.2  sh - Shell Script</A></H3>
 578 <P>
 579 <A NAME="IDX1077"></A>
 580
 581 </P>
 582 <DL COMPACT>
 583
 584 <DT>RPMs
 585 <DD>
 586 bash, gettext
 587
 588 <DT>File extension
 589 <DD>
 590 <CODE>sh</CODE>
 591
 592 <DT>String syntax
 593 <DD>
 594 <CODE>"abc"</CODE>, <CODE>'abc'</CODE>, <CODE>abc</CODE>
 595
 596 <DT>gettext shorthand
 597 <DD>
 598 <CODE>"`gettext \"abc\"`"</CODE>
 599
 600 <DT>gettext/ngettext functions
 601 <DD>
 602 <A NAME="IDX1078"></A>
 603 <A NAME="IDX1079"></A>
 604 <CODE>gettext</CODE>, <CODE>ngettext</CODE> programs
 605 <BR><CODE>eval_gettext</CODE>, <CODE>eval_ngettext</CODE> shell functions
 606
 607 <DT>textdomain
 608 <DD>
 609 <A NAME="IDX1080"></A>
 610 environment variable <CODE>TEXTDOMAIN</CODE>
 611
 612 <DT>bindtextdomain
 613 <DD>
 614 <A NAME="IDX1081"></A>
 615 environment variable <CODE>TEXTDOMAINDIR</CODE>
 616
 617 <DT>setlocale
 618 <DD>
 619 automatic
 620
 621 <DT>Prerequisite
 622 <DD>
 623 <CODE>. gettext.sh</CODE>
 624
 625 <DT>Use or emulate GNU gettext
 626 <DD>
 627 use
 628
 629 <DT>Extractor
 630 <DD>
 631 <CODE>xgettext</CODE>
 632
 633 <DT>Formatting with positions
 634 <DD>
 635 ---
 636
 637 <DT>Portability
 638 <DD>
 639 fully portable
 640
 641 <DT>po-mode marking
 642 <DD>
 643 ---
 644 </DL>
 645
 646 <P>
 647 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-sh</CODE>.
 648
 649 </P>
 650
 651
 652
 653 <H4><A NAME="SEC248" HREF="gettext_toc.html#TOC248">13.5.2.1  Preparing Shell Scripts for Internationalization</A></H4>
 654 <P>
 655 <A NAME="IDX1082"></A>
 656
 657 </P>
 658 <P>
 659 Preparing a shell script for internationalization is conceptually similar
 660 to the steps described in section <A HREF="gettext_3.html#SEC13">3  Preparing Program Sources</A>.  The concrete steps for shell
 661 scripts are as follows.
 662
 663 </P>
 664
 665 <OL>
 666 <LI>
 667
 668 Insert the line
 669
 670
 671 <PRE>
 672 . gettext.sh
 673 </PRE>
 674
 675 near the top of the script.  <CODE>gettext.sh</CODE> is a shell function library
 676 that provides the functions
 677 <CODE>eval_gettext</CODE> (see section <A HREF="gettext_13.html#SEC253">13.5.2.6  Invoking the <CODE>eval_gettext</CODE> function</A>) and
 678 <CODE>eval_ngettext</CODE> (see section <A HREF="gettext_13.html#SEC254">13.5.2.7  Invoking the <CODE>eval_ngettext</CODE> function</A>).
 679 You have to ensure that <CODE>gettext.sh</CODE> can be found in the <CODE>PATH</CODE>.
 680
 681 <LI>
 682
 683 Set and export the <CODE>TEXTDOMAIN</CODE> and <CODE>TEXTDOMAINDIR</CODE> environment
 684 variables.  Usually <CODE>TEXTDOMAIN</CODE> is the package or program name, and
 685 <CODE>TEXTDOMAINDIR</CODE> is the absolute pathname corresponding to
 686 <CODE>$prefix/share/locale</CODE>, where <CODE>$prefix</CODE> is the installation location.
 687
 688
 689 <PRE>
 690 TEXTDOMAIN=@PACKAGE@
 691 export TEXTDOMAIN
 692 TEXTDOMAINDIR=@LOCALEDIR@
 693 export TEXTDOMAINDIR
 694 </PRE>
 695
 696 <LI>
 697
 698 Prepare the strings for translation, as described in section <A HREF="gettext_3.html#SEC15">3.2  Preparing Translatable Strings</A>.
 699
 700 <LI>
 701
 702 Simplify translatable strings so that they don't contain command substitution
 703 (<CODE>"`...`"</CODE> or <CODE>"$(...)"</CODE>), variable access with defaulting (like
 704 <CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>), access to positional arguments
 705 (like <CODE>$0</CODE>, <CODE>$1</CODE>, ...) or highly volatile shell variables (like
 706 <CODE>$?</CODE>). This can always be done through simple local code restructuring.
 707 For example,
 708
 709
 710 <PRE>
 711 echo "Usage: $0 [OPTION] FILE..."
 712 </PRE>
 713
 714 becomes
 715
 716
 717 <PRE>
 718 program_name=$0
 719 echo "Usage: $program_name [OPTION] FILE..."
 720 </PRE>
 721
 722 Similarly,
 723
 724
 725 <PRE>
 726 echo "Remaining files: `ls | wc -l`"
 727 </PRE>
 728
 729 becomes
 730
 731
 732 <PRE>
 733 filecount="`ls | wc -l`"
 734 echo "Remaining files: $filecount"
 735 </PRE>
 736
 737 <LI>
 738
 739 For each translatable string, change the output command <SAMP>`echo&acute;</SAMP> or
 740 <SAMP>`$echo&acute;</SAMP> to <SAMP>`gettext&acute;</SAMP> (if the string contains no references to
 741 shell variables) or to <SAMP>`eval_gettext&acute;</SAMP> (if it refers to shell variables),
 742 followed by a no-argument <SAMP>`echo&acute;</SAMP> command (to account for the terminating
 743 newline). Similarly, for cases with plural handling, replace a conditional
 744 <SAMP>`echo&acute;</SAMP> command with an invocation of <SAMP>`ngettext&acute;</SAMP> or
 745 <SAMP>`eval_ngettext&acute;</SAMP>, followed by a no-argument <SAMP>`echo&acute;</SAMP> command.
 746
 747 When doing this, you also need to add an extra backslash before the dollar
 748 sign in references to shell variables, so that the <SAMP>`eval_gettext&acute;</SAMP>
 749 function receives the translatable string before the variable values are
 750 substituted into it. For example,
 751
 752
 753 <PRE>
 754 echo "Remaining files: $filecount"
 755 </PRE>
 756
 757 becomes
 758
 759
 760 <PRE>
 761 eval_gettext "Remaining files: \$filecount"; echo
 762 </PRE>
 763
 764 If the output command is not <SAMP>`echo&acute;</SAMP>, you can make it use <SAMP>`echo&acute;</SAMP>
 765 nevertheless, through the use of backquotes. However, note that inside
 766 backquotes, backslashes must be doubled to be effective (because the
 767 backquoting eats one level of backslashes). For example, assuming that
 768 <SAMP>`error&acute;</SAMP> is a shell function that signals an error,
 769
 770
 771 <PRE>
 772 error "file not found: $filename"
 773 </PRE>
 774
 775 is first transformed into
 776
 777
 778 <PRE>
 779 error "`echo \"file not found: \$filename\"`"
 780 </PRE>
 781
 782 which then becomes
 783
 784
 785 <PRE>
 786 error "`eval_gettext \"file not found: \\\$filename\"`"
 787 </PRE>
 788
 789 </OL>
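<P>
The reason for the extra backslash in the last step is an ordering
constraint: the message must be looked up while it still contains the
variable reference, and the value is substituted only afterwards.  The
Python sketch below models that order.  It is not the implementation in
<CODE>gettext.sh</CODE>; the dictionary standing in for a message catalog
and its German entry are purely illustrative.
</P>

```python
import re

# $name or ${name}, with the same naming rules as shell format strings.
_REFERENCE = re.compile(
    r"\$(?:([A-Za-z_][A-Za-z0-9_]*)|\{([A-Za-z_][A-Za-z0-9_]*)\})"
)


def eval_gettext(msgid, variables, catalog):
    """Translate first, then substitute variable values (sketch)."""
    # Step 1: look up the msgid while it still contains $filecount.
    translated = catalog.get(msgid, msgid)

    # Step 2: only now replace the references with their values.
    def expand(match):
        return variables.get(match.group(1) or match.group(2), "")
    return _REFERENCE.sub(expand, translated)


# Illustrative catalog: the key is the msgid with the reference intact.
catalog = {"Remaining files: $filecount": "Verbleibende Dateien: $filecount"}
```

<P>
If the shell had already expanded <CODE>$filecount</CODE>, the lookup key
would be a string such as <CODE>Remaining files: 7</CODE>, which no catalog
contains, and the message would stay untranslated.
</P>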
 790
 791
 792
 793 <H4><A NAME="SEC249" HREF="gettext_toc.html#TOC249">13.5.2.2  Contents of <CODE>gettext.sh</CODE></A></H4>
 794
 795 <P>
 796 <CODE>gettext.sh</CODE>, contained in the run-time package of GNU gettext, provides
 797 the following:
 798
 799 </P>
 800
 801 <UL>
 802 <LI>$echo
 803
 804 The variable <CODE>echo</CODE> is set to a command that outputs its first argument
 805 and a newline, without interpreting backslashes in the argument string.
 806
 807 <LI>eval_gettext
 808
 809 See section <A HREF="gettext_13.html#SEC253">13.5.2.6  Invoking the <CODE>eval_gettext</CODE> function</A>.
 810
 811 <LI>eval_ngettext
 812
 813 See section <A HREF="gettext_13.html#SEC254">13.5.2.7  Invoking the <CODE>eval_ngettext</CODE> function</A>.
 814 </UL>
 815
 816
 817
 818 <H4><A NAME="SEC250" HREF="gettext_toc.html#TOC250">13.5.2.3  Invoking the <CODE>gettext</CODE> program</A></H4>
 819
 820 <P>
 821 <A NAME="IDX1083"></A>
 822 <A NAME="IDX1084"></A>
 823
 824 <PRE>
 825 gettext [<VAR>option</VAR>] [[<VAR>textdomain</VAR>] <VAR>msgid</VAR>]
 826 gettext [<VAR>option</VAR>] -s [<VAR>msgid</VAR>]...
 827 </PRE>
 828
 829 <P>
 830 <A NAME="IDX1085"></A>
 831 The <CODE>gettext</CODE> program displays the native language translation of a
 832 textual message.
 833
 834 </P>
 835 <P>
 836 <STRONG>Arguments</STRONG>
 837
 838 </P>
 839 <DL COMPACT>
 840
 841 <DT><SAMP>`-d <VAR>textdomain</VAR>&acute;</SAMP>
 842 <DD>
 843 <DT><SAMP>`--domain=<VAR>textdomain</VAR>&acute;</SAMP>
 844 <DD>
 845 <A NAME="IDX1086"></A>
 846 <A NAME="IDX1087"></A>
 847 Retrieve translated messages from <VAR>textdomain</VAR>.  Usually a <VAR>textdomain</VAR>
 848 corresponds to a package, a program, or a module of a program.
 849
 850 <DT><SAMP>`-e&acute;</SAMP>
 851 <DD>
 852 <A NAME="IDX1088"></A>
 853 Enable expansion of some escape sequences.  This option is for compatibility
 854 with the <SAMP>`echo&acute;</SAMP> program or shell built-in.  The escape sequences
 855 <SAMP>`\a&acute;</SAMP>, <SAMP>`\b&acute;</SAMP>, <SAMP>`\c&acute;</SAMP>, <SAMP>`\f&acute;</SAMP>, <SAMP>`\n&acute;</SAMP>, <SAMP>`\r&acute;</SAMP>, <SAMP>`\t&acute;</SAMP>,
 856 <SAMP>`\v&acute;</SAMP>, <SAMP>`\\&acute;</SAMP>, and <SAMP>`\&acute;</SAMP> followed by one to three octal digits, are
interpreted in the same way as by the System V <SAMP>`echo&acute;</SAMP> program.
 858
 859 <DT><SAMP>`-E&acute;</SAMP>
 860 <DD>
 861 <A NAME="IDX1089"></A>
 862 This option is only for compatibility with the <SAMP>`echo&acute;</SAMP> program or shell
 863 built-in.  It has no effect.
 864
 865 <DT><SAMP>`-h&acute;</SAMP>
 866 <DD>
 867 <DT><SAMP>`--help&acute;</SAMP>
 868 <DD>
 869 <A NAME="IDX1090"></A>
 870 <A NAME="IDX1091"></A>
 871 Display this help and exit.
 872
 873 <DT><SAMP>`-n&acute;</SAMP>
 874 <DD>
 875 <A NAME="IDX1092"></A>
 876 Suppress trailing newline.  By default, <CODE>gettext</CODE> adds a newline to
 877 the output.
 878
 879 <DT><SAMP>`-V&acute;</SAMP>
 880 <DD>
 881 <DT><SAMP>`--version&acute;</SAMP>
 882 <DD>
 883 <A NAME="IDX1093"></A>
 884 <A NAME="IDX1094"></A>
 885 Output version information and exit.
 886
 887 <DT><SAMP>`[<VAR>textdomain</VAR>] <VAR>msgid</VAR>&acute;</SAMP>
 888 <DD>
 889 Retrieve translated message corresponding to <VAR>msgid</VAR> from <VAR>textdomain</VAR>.
 890
 891 </DL>
 892
 893 <P>
 894 If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from
 895 the environment variable <CODE>TEXTDOMAIN</CODE>.  If the message catalog is not
 896 found in the regular directory, another location can be specified with the
 897 environment variable <CODE>TEXTDOMAINDIR</CODE>.
 898
 899 </P>
 900 <P>
When used with the <CODE>-s</CODE> option, the program behaves like the <SAMP>`echo&acute;</SAMP>
command, except that it does not simply copy its arguments to standard output:
each argument that is found in the selected catalog is replaced by its translation.
 904
 905 </P>
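<P>
As an illustration of the synopsis above, the following minimal sketch
calls the <CODE>gettext</CODE> program from a Python script.  The helper
names <CODE>gettext_argv</CODE> and <CODE>translate</CODE> are hypothetical;
only the <CODE>--domain</CODE> and <CODE>-e</CODE> options and the
<CODE>TEXTDOMAINDIR</CODE> variable come from the description above.

</P>

```python
import os
import shutil
import subprocess


def gettext_argv(msgid, textdomain=None, expand_escapes=False):
    """Build the argument vector for one call of the gettext program."""
    argv = ["gettext"]
    if textdomain is not None:
        argv.append("--domain=" + textdomain)
    if expand_escapes:
        argv.append("-e")
    argv.append(msgid)
    return argv


def translate(msgid, textdomain=None, localedir=None):
    """Return what gettext prints for msgid (hypothetical helper)."""
    if shutil.which("gettext") is None:
        return msgid  # program not installed: act as if untranslated
    env = dict(os.environ)
    if localedir is not None:
        # Point the program at a catalog outside the regular directory.
        env["TEXTDOMAINDIR"] = localedir
    done = subprocess.run(gettext_argv(msgid, textdomain), env=env,
                          capture_output=True, text=True, check=True)
    return done.stdout
```

<P>
Passing the domain explicitly keeps the call independent of whatever
<CODE>TEXTDOMAIN</CODE> happens to be set to in the caller's environment.

</P>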
 906
 907
 908 <H4><A NAME="SEC251" HREF="gettext_toc.html#TOC251">13.5.2.4  Invoking the <CODE>ngettext</CODE> program</A></H4>
 909
 910 <P>
 911 <A NAME="IDX1095"></A>
 912 <A NAME="IDX1096"></A>
 913
 914 <PRE>
 915 ngettext [<VAR>option</VAR>] [<VAR>textdomain</VAR>] <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR>
 916 </PRE>
 917
 918 <P>
 919 <A NAME="IDX1097"></A>
 920 The <CODE>ngettext</CODE> program displays the native language translation of a
 921 textual message whose grammatical form depends on a number.
 922
 923 </P>
 924 <P>
 925 <STRONG>Arguments</STRONG>
 926
 927 </P>
 928 <DL COMPACT>
 929
 930 <DT><SAMP>`-d <VAR>textdomain</VAR>&acute;</SAMP>
 931 <DD>
 932 <DT><SAMP>`--domain=<VAR>textdomain</VAR>&acute;</SAMP>
 933 <DD>
 934 <A NAME="IDX1098"></A>
 935 <A NAME="IDX1099"></A>
 936 Retrieve translated messages from <VAR>textdomain</VAR>.  Usually a <VAR>textdomain</VAR>
 937 corresponds to a package, a program, or a module of a program.
 938
 939 <DT><SAMP>`-e&acute;</SAMP>
 940 <DD>
 941 <A NAME="IDX1100"></A>
 942 Enable expansion of some escape sequences.  This option is for compatibility
 943 with the <SAMP>`gettext&acute;</SAMP> program.  The escape sequences
 944 <SAMP>`\a&acute;</SAMP>, <SAMP>`\b&acute;</SAMP>, <SAMP>`\c&acute;</SAMP>, <SAMP>`\f&acute;</SAMP>, <SAMP>`\n&acute;</SAMP>, <SAMP>`\r&acute;</SAMP>, <SAMP>`\t&acute;</SAMP>,
 945 <SAMP>`\v&acute;</SAMP>, <SAMP>`\\&acute;</SAMP>, and <SAMP>`\&acute;</SAMP> followed by one to three octal digits, are
interpreted in the same way as by the System V <SAMP>`echo&acute;</SAMP> program.
 947
 948 <DT><SAMP>`-E&acute;</SAMP>
 949 <DD>
 950 <A NAME="IDX1101"></A>
 951 This option is only for compatibility with the <SAMP>`gettext&acute;</SAMP> program.  It has
 952 no effect.
 953
 954 <DT><SAMP>`-h&acute;</SAMP>
 955 <DD>
 956 <DT><SAMP>`--help&acute;</SAMP>
 957 <DD>
 958 <A NAME="IDX1102"></A>
 959 <A NAME="IDX1103"></A>
 960 Display this help and exit.
 961
 962 <DT><SAMP>`-V&acute;</SAMP>
 963 <DD>
 964 <DT><SAMP>`--version&acute;</SAMP>
 965 <DD>
 966 <A NAME="IDX1104"></A>
 967 <A NAME="IDX1105"></A>
 968 Output version information and exit.
 969
 970 <DT><SAMP>`<VAR>textdomain</VAR>&acute;</SAMP>
 971 <DD>
 972 Retrieve translated message from <VAR>textdomain</VAR>.
 973
 974 <DT><SAMP>`<VAR>msgid</VAR> <VAR>msgid-plural</VAR>&acute;</SAMP>
 975 <DD>
 976 Translate <VAR>msgid</VAR> (English singular) / <VAR>msgid-plural</VAR> (English plural).
 977
 978 <DT><SAMP>`<VAR>count</VAR>&acute;</SAMP>
 979 <DD>
 980 Choose singular/plural form based on this value.
 981
 982 </DL>
 983
 984 <P>
 985 If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from
 986 the environment variable <CODE>TEXTDOMAIN</CODE>.  If the message catalog is not
 987 found in the regular directory, another location can be specified with the
 988 environment variable <CODE>TEXTDOMAINDIR</CODE>.
 989
 990 </P>
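<P>
The role of the <VAR>msgid</VAR>, <VAR>msgid-plural</VAR> and <VAR>count</VAR>
arguments can be sketched with the <CODE>ngettext</CODE> method of Python's
standard <CODE>gettext</CODE> module, which takes them in the same order.
The sketch uses a <CODE>NullTranslations</CODE> object, which has no catalog,
so only the untranslated fallback rule is visible; the message strings and
the function name are made up for the example.

</P>

```python
import gettext

# NullTranslations carries no catalog, so ngettext falls back to the
# English rule: singular when count == 1, plural otherwise.
catalog = gettext.NullTranslations()


def removed_message(count):
    """Pick the singular or plural template, then fill in the number."""
    template = catalog.ngettext("%d file removed", "%d files removed", count)
    return template % count


for n in (0, 1, 3):
    print(removed_message(n))
```

<P>
With a real catalog, the choice among the translated forms would be made by
the plural rule stored in that catalog rather than by the English rule.

</P>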
 991
 992
 993 <H4><A NAME="SEC252" HREF="gettext_toc.html#TOC252">13.5.2.5  Invoking the <CODE>envsubst</CODE> program</A></H4>
 994
 995 <P>
 996 <A NAME="IDX1106"></A>
 997 <A NAME="IDX1107"></A>
 998
 999 <PRE>
1000 envsubst [<VAR>option</VAR>] [<VAR>shell-format</VAR>]
1001 </PRE>
1002
1003 <P>
1004 <A NAME="IDX1108"></A>
1005 <A NAME="IDX1109"></A>
1006 <A NAME="IDX1110"></A>
1007 The <CODE>envsubst</CODE> program substitutes the values of environment variables.
1008
1009 </P>
1010 <P>
1011 <STRONG>Operation mode</STRONG>
1012
1013 </P>
1014 <DL COMPACT>
1015
1016 <DT><SAMP>`-v&acute;</SAMP>
1017 <DD>
1018 <DT><SAMP>`--variables&acute;</SAMP>
1019 <DD>
1020 <A NAME="IDX1111"></A>
1021 <A NAME="IDX1112"></A>
1022 Output the variables occurring in <VAR>shell-format</VAR>.
1023
1024 </DL>
1025
1026 <P>
1027 <STRONG>Informative output</STRONG>
1028
1029 </P>
1030 <DL COMPACT>
1031
1032 <DT><SAMP>`-h&acute;</SAMP>
1033 <DD>
1034 <DT><SAMP>`--help&acute;</SAMP>
1035 <DD>
1036 <A NAME="IDX1113"></A>
1037 <A NAME="IDX1114"></A>
1038 Display this help and exit.
1039
1040 <DT><SAMP>`-V&acute;</SAMP>
1041 <DD>
1042 <DT><SAMP>`--version&acute;</SAMP>
1043 <DD>
1044 <A NAME="IDX1115"></A>
1045 <A NAME="IDX1116"></A>
1046 Output version information and exit.
1047
1048 </DL>
1049
1050 <P>
1051 In normal operation mode, standard input is copied to standard output,
1052 with references to environment variables of the form <CODE>$VARIABLE</CODE> or
1053 <CODE>${VARIABLE}</CODE> being replaced with the corresponding values.  If a
1054 <VAR>shell-format</VAR> is given, only those environment variables that are
1055 referenced in <VAR>shell-format</VAR> are substituted; otherwise all environment
variable references occurring in standard input are substituted.
1057
1058 </P>
1059 <P>
1060 These substitutions are a subset of the substitutions that a shell performs
1061 on unquoted and double-quoted strings.  Other kinds of substitutions done
1062 by a shell, such as <CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE> or
1063 <CODE>$(<VAR>command-list</VAR>)</CODE> or <CODE>`<VAR>command-list</VAR>`</CODE>, are not performed
by the <CODE>envsubst</CODE> program, for security reasons.
1065
1066 </P>
1067 <P>
1068 When <CODE>--variables</CODE> is used, standard input is ignored, and the output
1069 consists of the environment variables that are referenced in
1070 <VAR>shell-format</VAR>, one per line.
1071
1072 </P>
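<P>
The substitution rules described above can be modelled in a few lines of
Python.  This is only a sketch of the documented behaviour, not the
<CODE>envsubst</CODE> implementation: the function names are hypothetical,
the variable-name syntax is assumed to be a letter or underscore followed by
letters, digits or underscores, and unset variables are assumed to expand to
the empty string.

</P>

```python
import re

# Matches $NAME or ${NAME}.  Other shell forms, such as ${NAME-default}
# or $(command), do not match and are therefore copied through unchanged.
_REFERENCE = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def variables(shell_format):
    """Names referenced in shell_format, in order of appearance."""
    return [braced or plain
            for braced, plain in _REFERENCE.findall(shell_format)]


def substitute(text, env, shell_format=None):
    """Replace variable references in text with values taken from env."""
    allowed = None if shell_format is None else set(variables(shell_format))

    def replace(match):
        name = match.group(1) or match.group(2)
        if allowed is not None and name not in allowed:
            return match.group(0)   # not listed in shell_format: keep as is
        return env.get(name, "")    # assumption: unset expands to nothing

    return _REFERENCE.sub(replace, text)


env = {"USER": "alice", "HOME": "/home/alice"}
print(substitute("Hi $USER, home is ${HOME}", env))
print(substitute("Hi $USER, home is ${HOME}", env, "$USER"))
print(substitute("${USER-nobody} $(id)", env))
```

<P>
The second call shows the effect of a <VAR>shell-format</VAR> argument:
<CODE>${HOME}</CODE> is left alone because only <CODE>$USER</CODE> was listed.

</P>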
1073
1074
1075 <H4><A NAME="SEC253" HREF="gettext_toc.html#TOC253">13.5.2.6  Invoking the <CODE>eval_gettext</CODE> function</A></H4>
1076
1077 <P>
1078 <A NAME="IDX1117"></A>
1079
1080 <PRE>
1081 eval_gettext <VAR>msgid</VAR>
1082 </PRE>
1083
1084 <P>
1085 <A NAME="IDX1118"></A>
1086 This function outputs the native language translation of a textual message,
1087 performing dollar-substitution on the result.  Note that only shell variables
1088 mentioned in <VAR>msgid</VAR> will be dollar-substituted in the result.
1089
1090 </P>
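<P>
The restriction to variables mentioned in <VAR>msgid</VAR> is what keeps a
translation from pulling in arbitrary shell variables.  The following Python
sketch models that rule under stated assumptions: the catalog is a plain
dictionary, the shell variables are a second dictionary, and
<CODE>eval_gettext_sketch</CODE> is a hypothetical name, not part of
<CODE>gettext.sh</CODE>.

</P>

```python
import re

# $NAME or ${NAME}, with an assumed shell-like variable-name syntax.
_REFERENCE = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def eval_gettext_sketch(msgid, translations, shell_vars):
    """Translate msgid, then expand only the variables msgid itself names."""
    allowed = {braced or plain
               for braced, plain in _REFERENCE.findall(msgid)}
    translated = translations.get(msgid, msgid)  # no translation: keep msgid

    def replace(match):
        name = match.group(1) or match.group(2)
        if name not in allowed:
            return match.group(0)        # introduced by the translation only
        return shell_vars.get(name, "")

    return _REFERENCE.sub(replace, translated)


translations = {
    "Removing $file": "Suppression de $file",
    "Done": "Fini, $HOME",               # references a variable msgid lacks
}
shell_vars = {"file": "a.txt", "HOME": "/root"}
print(eval_gettext_sketch("Removing $file", translations, shell_vars))
print(eval_gettext_sketch("Done", translations, shell_vars))
```

<P>
In the second call the reference to <CODE>$HOME</CODE> survives unexpanded,
because it appears only in the translation and not in <VAR>msgid</VAR>.

</P>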
1091
1092
1093 <H4><A NAME="SEC254" HREF="gettext_toc.html#TOC254">13.5.2.7  Invoking the <CODE>eval_ngettext</CODE> function</A></H4>
1094
1095 <P>
1096 <A NAME="IDX1119"></A>
1097
1098 <PRE>
1099 eval_ngettext <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR>
1100 </PRE>
1101
1102 <P>
1103 <A NAME="IDX1120"></A>
1104 This function outputs the native language translation of a textual message
1105 whose grammatical form depends on a number, performing dollar-substitution
1106 on the result.  Note that only shell variables mentioned in <VAR>msgid</VAR> or
1107 <VAR>msgid-plural</VAR> will be dollar-substituted in the result.
1108
1109 </P>
1110
1111
1112 <H3><A NAME="SEC255" HREF="gettext_toc.html#TOC255">13.5.3  bash - Bourne-Again Shell Script</A></H3>
1113 <P>
1114 <A NAME="IDX1121"></A>
1115
1116 </P>
1117 <P>
1118 GNU <CODE>bash</CODE> 2.0 or newer has a special shorthand for translating a
1119 string and substituting variable values in it: <CODE>$"msgid"</CODE>.  But
1120 the use of this construct is <STRONG>discouraged</STRONG>, due to the security
1121 holes it opens and due to its portability problems.
1122
1123 </P>
1124 <P>
The security holes of <CODE>$"..."</CODE> come from the fact that after looking up
the translation of the string, <CODE>bash</CODE> processes it like any other
double-quoted string: it performs dollar and backquote processing, as
<SAMP>`eval&acute;</SAMP> does.  There are two such holes:
1129
1130 </P>
1131
1132 <OL>
1133 <LI>
1134
1135 In a locale whose encoding is one of BIG5, BIG5-HKSCS, GBK, GB18030, SHIFT_JIS,
1136 JOHAB, some double-byte characters have a second byte whose value is
1137 <CODE>0x60</CODE>.  For example, the byte sequence <CODE>\xe0\x60</CODE> is a single
1138 character in these locales.  Many versions of <CODE>bash</CODE> (all versions
1139 up to bash-2.05, and newer versions on platforms without <CODE>mbsrtowcs()</CODE>
1140 function) don't know about character boundaries and see a backquote character
1141 where there is only a particular Chinese character.  Thus it can start
1142 executing part of the translation as a command list.  This situation can occur
1143 even without the translator being aware of it: if the translator provides
1144 translations in the UTF-8 encoding, it is the <CODE>gettext()</CODE> function which
1145 will, during its conversion from the translator's encoding to the user's
1146 locale's encoding, produce the dangerous <CODE>\x60</CODE> bytes.
1147
1148 <LI>
1149
A translator could - voluntarily or inadvertently - use backquotes
1151 <CODE>"`...`"</CODE> or dollar-parentheses <CODE>"$(...)"</CODE> in her translations.
1152 The enclosed strings would be executed as command lists by the shell.
1153 </OL>
1154
1155 <P>
1156 The portability problem is that <CODE>bash</CODE> must be built with
1157 internationalization support; this is normally not the case on systems
1158 that don't have the <CODE>gettext()</CODE> function in libc.
1159
1160 </P>
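<P>
The first hole can be made concrete with a short Python sketch.  It takes the
byte sequence <CODE>\xe0\x60</CODE> mentioned above, decodes it in two of the
listed encodings, and compares what a byte-at-a-time scanner would see in the
locale encoding with what it would see in UTF-8.  The function name is
hypothetical, and the scanner is a deliberately naive stand-in for a shell
that ignores character boundaries.

</P>

```python
BACKQUOTE = 0x60


def naive_backquote_offsets(data):
    """Offsets at which a byte-at-a-time scanner would see a backquote."""
    return [i for i, byte in enumerate(data) if byte == BACKQUOTE]


raw = b"\xe0\x60"
for encoding in ("big5", "gbk"):
    text = raw.decode(encoding)          # one double-byte character
    as_utf8 = text.encode("utf-8")       # what a translator might supply
    print(encoding,
          "characters:", len(text),
          "real backquote:", "`" in text,
          "seen in locale bytes:", naive_backquote_offsets(raw),
          "seen in UTF-8 bytes:", naive_backquote_offsets(as_utf8))
```

<P>
The character itself contains no backquote and its UTF-8 form contains no
<CODE>0x60</CODE> byte; the byte appears only after conversion to the
locale's encoding, which is why the translator cannot be expected to notice it.

</P>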
1161
1162
1163 <H3><A NAME="SEC256" HREF="gettext_toc.html#TOC256">13.5.4  Python</A></H3>
1164 <P>
1165 <A NAME="IDX1122"></A>
1166
1167 </P>
1168 <DL COMPACT>
1169
1170 <DT>RPMs
1171 <DD>
1172 python
1173
1174 <DT>File extension
1175 <DD>
1176 <CODE>py</CODE>
1177
1178 <DT>String syntax
1179 <DD>
1180 <CODE>'abc'</CODE>, <CODE>u'abc'</CODE>, <CODE>r'abc'</CODE>, <CODE>ur'abc'</CODE>,
1181 <BR><CODE>"abc"</CODE>, <CODE>u"abc"</CODE>, <CODE>r"abc"</CODE>, <CODE>ur"abc"</CODE>,
<BR><CODE>'''abc'''</CODE>, <CODE>u'''abc'''</CODE>, <CODE>r'''abc'''</CODE>, <CODE>ur'''abc'''</CODE>,
1183 <BR><CODE>"""abc"""</CODE>, <CODE>u"""abc"""</CODE>, <CODE>r"""abc"""</CODE>, <CODE>ur"""abc"""</CODE>
1184
1185 <DT>gettext shorthand
1186 <DD>
1187 <CODE>_('abc')</CODE> etc.
1188
1189 <DT>gettext/ngettext functions
1190 <DD>
1191 <CODE>gettext.gettext</CODE>, <CODE>gettext.dgettext</CODE>,
1192 <CODE>gettext.ngettext</CODE>, <CODE>gettext.dngettext</CODE>,
1193 also <CODE>ugettext</CODE>, <CODE>ungettext</CODE>
1194
1195 <DT>textdomain
1196 <DD>
1197 <CODE>gettext.textdomain</CODE> function, or
1198 <CODE>gettext.install(<VAR>domain</VAR>)</CODE> function
1199
1200 <DT>bindtextdomain
1201 <DD>
1202 <CODE>gettext.bindtextdomain</CODE> function, or
1203 <CODE>gettext.install(<VAR>domain</VAR>,<VAR>localedir</VAR>)</CODE> function
1204
1205 <DT>setlocale
1206 <DD>
1207 not used by the gettext emulation
1208
1209 <DT>Prerequisite
1210 <DD>
1211 <CODE>import gettext</CODE>
1212
1213 <DT>Use or emulate GNU gettext
1214 <DD>
1215 emulate
1216
1217 <DT>Extractor
1218 <DD>
1219 <CODE>xgettext</CODE>
1220
1221 <DT>Formatting with positions
1222 <DD>
1223 <CODE>'...%(ident)d...' % { 'ident': value }</CODE>
1224
1225 <DT>Portability
1226 <DD>
1227 fully portable
1228
1229 <DT>po-mode marking
1230 <DD>
1231 ---
1232 </DL>
1233
1234 <P>
1235 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-python</CODE>.
1236
1237 </P>
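<P>
The entries above fit together roughly as in the following sketch.  The
domain name <CODE>hello</CODE> and the directory <CODE>locale</CODE> are
placeholder values.  The sketch uses <CODE>gettext.translation</CODE> with
<CODE>fallback=True</CODE> instead of <CODE>gettext.install</CODE>, so that
<CODE>_</CODE> stays a module-level name rather than being installed
globally; when no catalog is found, the msgids are returned unchanged.

</P>

```python
import gettext

# "hello" and "locale" are placeholders for the text domain and the
# directory holding <lang>/LC_MESSAGES/hello.mo.  With fallback=True a
# missing catalog yields an object that returns the msgids unchanged.
translation = gettext.translation("hello", localedir="locale", fallback=True)
_ = translation.gettext
ngettext = translation.ngettext


def greeting(name, count):
    """Build a message using named placeholders, as recommended above."""
    line = _("Hello, %(name)s!") % {"name": name}
    mail = ngettext("%(count)d new message",
                    "%(count)d new messages", count) % {"count": count}
    return line + " " + mail


print(greeting("Ada", 2))
```

<P>
Named placeholders such as <CODE>%(name)s</CODE> let a translator reorder the
arguments freely, which plain <CODE>%s</CODE> placeholders would not allow.

</P>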
1238
1239
1240 <H3><A NAME="SEC257" HREF="gettext_toc.html#TOC257">13.5.5  GNU clisp - Common Lisp</A></H3>
1241 <P>
1242 <A NAME="IDX1123"></A>
1243 <A NAME="IDX1124"></A>
1244 <A NAME="IDX1125"></A>
1245
1246 </P>
1247 <DL COMPACT>
1248
1249 <DT>RPMs
1250 <DD>
1251 clisp 2.28 or newer
1252
1253 <DT>File extension
1254 <DD>
1255 <CODE>lisp</CODE>
1256
1257 <DT>String syntax
1258 <DD>
1259 <CODE>"abc"</CODE>
1260
1261 <DT>gettext shorthand
1262 <DD>
1263 <CODE>(_ "abc")</CODE>, <CODE>(ENGLISH "abc")</CODE>
1264
1265 <DT>gettext/ngettext functions
1266 <DD>
1267 <CODE>i18n:gettext</CODE>, <CODE>i18n:ngettext</CODE>
1268
1269 <DT>textdomain
1270 <DD>
1271 <CODE>i18n:textdomain</CODE>
1272
1273 <DT>bindtextdomain
1274 <DD>
1275 <CODE>i18n:textdomaindir</CODE>
1276
1277 <DT>setlocale
1278 <DD>
1279 automatic
1280
1281 <DT>Prerequisite
1282 <DD>
1283 ---
1284
1285 <DT>Use or emulate GNU gettext
1286 <DD>
1287 use
1288
1289 <DT>Extractor
1290 <DD>
1291 <CODE>xgettext -k_ -kENGLISH</CODE>
1292
1293 <DT>Formatting with positions
1294 <DD>
1295 <CODE>format "~1@*~D ~0@*~D"</CODE>
1296
1297 <DT>Portability
1298 <DD>
1299 On platforms without gettext, no translation.
1300
1301 <DT>po-mode marking
1302 <DD>
1303 ---
1304 </DL>
1305
1306 <P>
1307 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-clisp</CODE>.
1308
1309 </P>
1310
1311
1312 <H3><A NAME="SEC258" HREF="gettext_toc.html#TOC258">13.5.6  GNU clisp C sources</A></H3>
1313 <P>
1314 <A NAME="IDX1126"></A>
1315
1316 </P>
1317 <DL COMPACT>
1318
1319 <DT>RPMs
1320 <DD>
1321 clisp
1322
1323 <DT>File extension
1324 <DD>
1325 <CODE>d</CODE>
1326
1327 <DT>String syntax
1328 <DD>
1329 <CODE>"abc"</CODE>
1330
1331 <DT>gettext shorthand
1332 <DD>
1333 <CODE>ENGLISH ? "abc" : ""</CODE>
1334 <BR><CODE>GETTEXT("abc")</CODE>
1335 <BR><CODE>GETTEXTL("abc")</CODE>
1336
1337 <DT>gettext/ngettext functions
1338 <DD>
1339 <CODE>clgettext</CODE>, <CODE>clgettextl</CODE>
1340
1341 <DT>textdomain
1342 <DD>
1343 ---
1344
1345 <DT>bindtextdomain
1346 <DD>
1347 ---
1348
1349 <DT>setlocale
1350 <DD>
1351 automatic
1352
1353 <DT>Prerequisite
1354 <DD>
1355 <CODE>#include "lispbibl.c"</CODE>
1356
1357 <DT>Use or emulate GNU gettext
1358 <DD>
1359 use
1360
1361 <DT>Extractor
1362 <DD>
1363 <CODE>clisp-xgettext</CODE>
1364
1365 <DT>Formatting with positions
1366 <DD>
1367 <CODE>fprintf "%2$d %1$d"</CODE>
1368
1369 <DT>Portability
1370 <DD>
1371 On platforms without gettext, no translation.
1372
1373 <DT>po-mode marking
1374 <DD>
1375 ---
1376 </DL>
1377
1378
1379
1380 <H3><A NAME="SEC259" HREF="gettext_toc.html#TOC259">13.5.7  Emacs Lisp</A></H3>
1381 <P>
1382 <A NAME="IDX1127"></A>
1383
1384 </P>
1385 <DL COMPACT>
1386
1387 <DT>RPMs
1388 <DD>
1389 emacs, xemacs
1390
1391 <DT>File extension
1392 <DD>
1393 <CODE>el</CODE>
1394
1395 <DT>String syntax
1396 <DD>
1397 <CODE>"abc"</CODE>
1398
1399 <DT>gettext shorthand
1400 <DD>
1401 <CODE>(_"abc")</CODE>
1402
1403 <DT>gettext/ngettext functions
1404 <DD>
1405 <CODE>gettext</CODE>, <CODE>dgettext</CODE> (xemacs only)
1406
1407 <DT>textdomain
1408 <DD>
1409 <CODE>domain</CODE> special form (xemacs only)
1410
1411 <DT>bindtextdomain
1412 <DD>
1413 <CODE>bind-text-domain</CODE> function (xemacs only)
1414
1415 <DT>setlocale
1416 <DD>
1417 automatic
1418
1419 <DT>Prerequisite
1420 <DD>
1421 ---
1422
1423 <DT>Use or emulate GNU gettext
1424 <DD>
1425 use
1426
1427 <DT>Extractor
1428 <DD>
1429 <CODE>xgettext</CODE>
1430
1431 <DT>Formatting with positions
1432 <DD>
1433 <CODE>format "%2$d %1$d"</CODE>
1434
1435 <DT>Portability
1436 <DD>
1437 Only XEmacs.  Without <CODE>I18N3</CODE> defined at build time, no translation.
1438
1439 <DT>po-mode marking
1440 <DD>
1441 ---
1442 </DL>
1443
1444
1445
1446 <H3><A NAME="SEC260" HREF="gettext_toc.html#TOC260">13.5.8  librep</A></H3>
1447 <P>
1448 <A NAME="IDX1128"></A>
1449
1450 </P>
1451 <DL COMPACT>
1452
1453 <DT>RPMs
1454 <DD>
1455 librep 0.15.3 or newer
1456
1457 <DT>File extension
1458 <DD>
1459 <CODE>jl</CODE>
1460
1461 <DT>String syntax
1462 <DD>
1463 <CODE>"abc"</CODE>
1464
1465 <DT>gettext shorthand
1466 <DD>
1467 <CODE>(_"abc")</CODE>
1468
1469 <DT>gettext/ngettext functions
1470 <DD>
1471 <CODE>gettext</CODE>
1472
1473 <DT>textdomain
1474 <DD>
1475 <CODE>textdomain</CODE> function
1476
1477 <DT>bindtextdomain
1478 <DD>
1479 <CODE>bindtextdomain</CODE> function
1480
1481 <DT>setlocale
1482 <DD>
1483 ---
1484
1485 <DT>Prerequisite
1486 <DD>
1487 <CODE>(require 'rep.i18n.gettext)</CODE>
1488
1489 <DT>Use or emulate GNU gettext
1490 <DD>
1491 use
1492
1493 <DT>Extractor
1494 <DD>
1495 <CODE>xgettext</CODE>
1496
1497 <DT>Formatting with positions
1498 <DD>
1499 <CODE>format "%2$d %1$d"</CODE>
1500
1501 <DT>Portability
1502 <DD>
1503 On platforms without gettext, no translation.
1504
1505 <DT>po-mode marking
1506 <DD>
1507 ---
1508 </DL>
1509
1510 <P>
1511 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-librep</CODE>.
1512
1513 </P>
1514
1515
1516 <H3><A NAME="SEC261" HREF="gettext_toc.html#TOC261">13.5.9  GNU guile - Scheme</A></H3>
1517 <P>
1518 <A NAME="IDX1129"></A>
1519 <A NAME="IDX1130"></A>
1520
1521 </P>
1522 <DL COMPACT>
1523
1524 <DT>RPMs
1525 <DD>
1526 guile
1527
1528 <DT>File extension
1529 <DD>
1530 <CODE>scm</CODE>
1531
1532 <DT>String syntax
1533 <DD>
1534 <CODE>"abc"</CODE>
1535
1536 <DT>gettext shorthand
1537 <DD>
1538 <CODE>(_ "abc")</CODE>
1539
1540 <DT>gettext/ngettext functions
1541 <DD>
1542 <CODE>gettext</CODE>, <CODE>ngettext</CODE>
1543
1544 <DT>textdomain
1545 <DD>
1546 <CODE>textdomain</CODE>
1547
1548 <DT>bindtextdomain
1549 <DD>
1550 <CODE>bindtextdomain</CODE>
1551
1552 <DT>setlocale
1553 <DD>
1554 <CODE>(catch #t (lambda () (setlocale LC_ALL "")) (lambda args #f))</CODE>
1555
1556 <DT>Prerequisite
1557 <DD>
1558 <CODE>(use-modules (ice-9 format))</CODE>
1559
1560 <DT>Use or emulate GNU gettext
1561 <DD>
1562 use
1563
1564 <DT>Extractor
1565 <DD>
1566 <CODE>xgettext -k_</CODE>
1567
1568 <DT>Formatting with positions
1569 <DD>
1570 ---
1571
1572 <DT>Portability
1573 <DD>
1574 On platforms without gettext, no translation.
1575
1576 <DT>po-mode marking
1577 <DD>
1578 ---
1579 </DL>
1580
1581 <P>
1582 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-guile</CODE>.
1583
1584 </P>
1585
1586
1587 <H3><A NAME="SEC262" HREF="gettext_toc.html#TOC262">13.5.10  GNU Smalltalk</A></H3>
1588 <P>
1589 <A NAME="IDX1131"></A>
1590
1591 </P>
1592 <DL COMPACT>
1593
1594 <DT>RPMs
1595 <DD>
1596 smalltalk
1597
1598 <DT>File extension
1599 <DD>
1600 <CODE>st</CODE>
1601
1602 <DT>String syntax
1603 <DD>
1604 <CODE>'abc'</CODE>
1605
1606 <DT>gettext shorthand
1607 <DD>
1608 <CODE>NLS ? 'abc'</CODE>
1609
1610 <DT>gettext/ngettext functions
1611 <DD>
1612 <CODE>LcMessagesDomain&#62;&#62;#at:</CODE>, <CODE>LcMessagesDomain&#62;&#62;#at:plural:with:</CODE>
1613
1614 <DT>textdomain
1615 <DD>
1616 <CODE>LcMessages&#62;&#62;#domain:localeDirectory:</CODE> (returns a <CODE>LcMessagesDomain</CODE>
1617 object).<BR>
Example: <CODE>I18N Locale default messages domain: 'gettext' localeDirectory: '/usr/local/share/locale'</CODE>
1619
1620 <DT>bindtextdomain
1621 <DD>
1622 <CODE>LcMessages&#62;&#62;#domain:localeDirectory:</CODE>, see above.
1623
1624 <DT>setlocale
1625 <DD>
1626 Automatic if you use <CODE>I18N Locale default</CODE>.
1627
1628 <DT>Prerequisite
1629 <DD>
1630 <CODE>PackageLoader fileInPackage: 'I18N'!</CODE>
1631
1632 <DT>Use or emulate GNU gettext
1633 <DD>
1634 emulate
1635
1636 <DT>Extractor
1637 <DD>
1638 <CODE>xgettext</CODE>
1639
1640 <DT>Formatting with positions
1641 <DD>
1642 <CODE>'%1 %2' bindWith: 'Hello' with: 'world'</CODE>
1643
1644 <DT>Portability
1645 <DD>
1646 fully portable
1647
1648 <DT>po-mode marking
1649 <DD>
1650 ---
1651 </DL>
1652
1653 <P>
1654 An example is available in the <TT>`examples&acute;</TT> directory:
1655 <CODE>hello-smalltalk</CODE>.
1656
1657 </P>
1658
1659
1660 <H3><A NAME="SEC263" HREF="gettext_toc.html#TOC263">13.5.11  Java</A></H3>
1661 <P>
1662 <A NAME="IDX1132"></A>
1663
1664 </P>
1665 <DL COMPACT>
1666
1667 <DT>RPMs
1668 <DD>
1669 java, java2
1670
1671 <DT>File extension
1672 <DD>
1673 <CODE>java</CODE>
1674
1675 <DT>String syntax
1676 <DD>
<CODE>"abc"</CODE>
1678
1679 <DT>gettext shorthand
1680 <DD>
<CODE>_("abc")</CODE>
1682
1683 <DT>gettext/ngettext functions
1684 <DD>
1685 <CODE>GettextResource.gettext</CODE>, <CODE>GettextResource.ngettext</CODE>
1686
1687 <DT>textdomain
1688 <DD>
---, use <CODE>ResourceBundle.getBundle</CODE> instead
1690
1691 <DT>bindtextdomain
1692 <DD>
1693 ---, use CLASSPATH instead
1694
1695 <DT>setlocale
1696 <DD>
1697 automatic
1698
1699 <DT>Prerequisite
1700 <DD>
1701 ---
1702
1703 <DT>Use or emulate GNU gettext
1704 <DD>
1705 ---, uses a Java specific message catalog format
1706
1707 <DT>Extractor
1708 <DD>
1709 <CODE>xgettext -k_</CODE>
1710
1711 <DT>Formatting with positions
1712 <DD>
1713 <CODE>MessageFormat.format "{1,number} {0,number}"</CODE>
1714
1715 <DT>Portability
1716 <DD>
1717 fully portable
1718
1719 <DT>po-mode marking
1720 <DD>
1721 ---
1722 </DL>
1723
1724 <P>
1725 Before marking strings as internationalizable, uses of the string
1726 concatenation operator need to be converted to <CODE>MessageFormat</CODE>
1727 applications.  For example, <CODE>"file "+filename+" not found"</CODE> becomes
1728 <CODE>MessageFormat.format("file {0} not found", new Object[] { filename })</CODE>.
Only after this is done can the strings be marked and extracted.
1730
1731 </P>
1732 <P>
1733 GNU gettext uses the native Java internationalization mechanism, namely
1734 <CODE>ResourceBundle</CODE>s.  There are two formats of <CODE>ResourceBundle</CODE>s:
1735 <CODE>.properties</CODE> files and <CODE>.class</CODE> files.  The <CODE>.properties</CODE>
1736 format is a text file which the translators can directly edit, like PO
files, but which doesn't support plural forms.  The <CODE>.class</CODE>
format, by contrast, is compiled from <CODE>.java</CODE> source code and can
support plural forms (provided it is accessed through an appropriate API, see below).
1740
1741 </P>
1742 <P>
1743 To convert a PO file to a <CODE>.properties</CODE> file, the <CODE>msgcat</CODE>
1744 program can be used with the option <CODE>--properties-output</CODE>.  To convert
1745 a <CODE>.properties</CODE> file back to a PO file, the <CODE>msgcat</CODE> program
1746 can be used with the option <CODE>--properties-input</CODE>.  All the tools
1747 that manipulate PO files can work with <CODE>.properties</CODE> files as well,
1748 if given the <CODE>--properties-input</CODE> and/or <CODE>--properties-output</CODE>
1749 option.
1750
1751 </P>
1752 <P>
1753 To convert a PO file to a ResourceBundle class, the <CODE>msgfmt</CODE> program
1754 can be used with the option <CODE>--java</CODE> or <CODE>--java2</CODE>.  To convert a
1755 ResourceBundle back to a PO file, the <CODE>msgunfmt</CODE> program can be used
1756 with the option <CODE>--java</CODE>.
1757
1758 </P>
1759 <P>
1760 Two different programmatic APIs can be used to access ResourceBundles.
1761 Note that both APIs work with all kinds of ResourceBundles, whether
1762 GNU gettext generated classes, or other <CODE>.class</CODE> or <CODE>.properties</CODE>
1763 files.
1764
1765 </P>
1766
1767 <OL>
1768 <LI>
1769
1770 The <CODE>java.util.ResourceBundle</CODE> API.
1771
1772 In particular, its <CODE>getString</CODE> function returns a string translation.
1773 Note that a missing translation yields a <CODE>MissingResourceException</CODE>.
1774
1775 This has the advantage of being the standard API.  And it does not require
1776 any additional libraries, only the <CODE>msgcat</CODE> generated <CODE>.properties</CODE>
1777 files or the <CODE>msgfmt</CODE> generated <CODE>.class</CODE> files.  But it cannot do
1778 plural handling, even if the resource was generated by <CODE>msgfmt</CODE> from
1779 a PO file with plural handling.
1780
1781 <LI>
1782
1783 The <CODE>gnu.gettext.GettextResource</CODE> API.
1784
1785 Reference documentation in Javadoc 1.1 style format
1786 is in the <A HREF="javadoc1/tree.html">javadoc1 directory</A> and
1787 in Javadoc 2 style format
1788 in the <A HREF="javadoc2/index.html">javadoc2 directory</A>.
1789
1790 Its <CODE>gettext</CODE> function returns a string translation.  Note that when
1791 a translation is missing, the <VAR>msgid</VAR> argument is returned unchanged.
1792
1793 This has the advantage of having the <CODE>ngettext</CODE> function for plural
1794 handling.
1795
1796 <A NAME="IDX1133"></A>
1797 To use this API, one needs the <CODE>libintl.jar</CODE> file which is part of
1798 the GNU gettext package and distributed under the LGPL.
1799 </OL>
1800
1801 <P>
1802 Three examples, using the second API, are available in the <TT>`examples&acute;</TT>
1803 directory: <CODE>hello-java</CODE>, <CODE>hello-java-awt</CODE>, <CODE>hello-java-swing</CODE>.
1804
1805 </P>
1806 <P>
1807 Now, to make use of the API and define a shorthand for <SAMP>`getString&acute;</SAMP>,
1808 there are two idioms that you can choose from:
1809
1810 </P>
1811
1812 <UL>
1813 <LI>
1814
1815 In a unique class of your project, say <SAMP>`Util&acute;</SAMP>, define a static variable
1816 holding the <CODE>ResourceBundle</CODE> instance:
1817
1818
1819 <PRE>
1820 public static ResourceBundle myResources =
1821   ResourceBundle.getBundle("domain-name");
1822 </PRE>
1823
1824 All classes containing internationalized strings then contain
1825
1826
1827 <PRE>
1828 private static ResourceBundle res = Util.myResources;
1829 private static String _(String s) { return res.getString(s); }
1830 </PRE>
1831
1832 and the shorthand is used like this:
1833
1834
1835 <PRE>
1836 System.out.println(_("Operation completed."));
1837 </PRE>
1838
1839 <LI>
1840
1841 You add a class with a very short name, say <SAMP>`S&acute;</SAMP>, containing just the
1842 definition of the resource bundle and of the shorthand:
1843
1844
1845 <PRE>
import java.util.ResourceBundle;

public class S {
1847   public static ResourceBundle myResources =
1848     ResourceBundle.getBundle("domain-name");
1849   public static String _(String s) {
1850     return myResources.getString(s);
1851   }
1852 }
1853 </PRE>
1854
1855 and the shorthand is used like this:
1856
1857
1858 <PRE>
1859 System.out.println(S._("Operation completed."));
1860 </PRE>
1861
1862 </UL>
1863
1864 <P>
Which of the two idioms you choose will depend on whether copying two lines
of code into every class is more acceptable in your project than a class
with a single-letter name.
1868
1869 </P>
1870
1871
1872 <H3><A NAME="SEC264" HREF="gettext_toc.html#TOC264">13.5.12  C#</A></H3>
1873 <P>
1874 <A NAME="IDX1134"></A>
1875
1876 </P>
1877 <DL COMPACT>
1878
1879 <DT>RPMs
1880 <DD>
1881 pnet, pnetlib 0.6.2 or newer, or mono 0.29 or newer
1882
1883 <DT>File extension
1884 <DD>
1885 <CODE>cs</CODE>
1886
1887 <DT>String syntax
1888 <DD>
1889 <CODE>"abc"</CODE>, <CODE>@"abc"</CODE>
1890
1891 <DT>gettext shorthand
1892 <DD>
<CODE>_("abc")</CODE>
1894
1895 <DT>gettext/ngettext functions
1896 <DD>
1897 <CODE>GettextResourceManager.GetString</CODE>,
1898 <CODE>GettextResourceManager.GetPluralString</CODE>
1899
1900 <DT>textdomain
1901 <DD>
1902 <CODE>new GettextResourceManager(domain)</CODE>
1903
1904 <DT>bindtextdomain
1905 <DD>
1906 ---, compiled message catalogs are located in subdirectories of the directory
1907 containing the executable
1908
1909 <DT>setlocale
1910 <DD>
1911 automatic
1912
1913 <DT>Prerequisite
1914 <DD>
1915 ---
1916
1917 <DT>Use or emulate GNU gettext
1918 <DD>
1919 ---, uses a C# specific message catalog format
1920
1921 <DT>Extractor
1922 <DD>
1923 <CODE>xgettext -k_</CODE>
1924
1925 <DT>Formatting with positions
1926 <DD>
1927 <CODE>String.Format "{1} {0}"</CODE>
1928
1929 <DT>Portability
1930 <DD>
1931 fully portable
1932
1933 <DT>po-mode marking
1934 <DD>
1935 ---
1936 </DL>
1937
1938 <P>
1939 Before marking strings as internationalizable, uses of the string
1940 concatenation operator need to be converted to <CODE>String.Format</CODE>
1941 invocations.  For example, <CODE>"file "+filename+" not found"</CODE> becomes
1942 <CODE>String.Format("file {0} not found", filename)</CODE>.
Only after this is done can the strings be marked and extracted.
1944
1945 </P>
1946 <P>
1947 GNU gettext uses the native C#/.NET internationalization mechanism, namely
1948 the classes <CODE>ResourceManager</CODE> and <CODE>ResourceSet</CODE>.  Applications
1949 use the <CODE>ResourceManager</CODE> methods to retrieve the native language
1950 translation of strings.  An instance of <CODE>ResourceSet</CODE> is the in-memory
1951 representation of a message catalog file.  The <CODE>ResourceManager</CODE> loads
1952 and accesses <CODE>ResourceSet</CODE> instances as needed to look up the
1953 translations.
1954
1955 </P>
1956 <P>
1957 There are two formats of <CODE>ResourceSet</CODE>s that can be directly loaded by
1958 the C# runtime: <CODE>.resources</CODE> files and <CODE>.dll</CODE> files.
1959
1960 </P>
1961
1962 <UL>
1963 <LI>
1964
1965 The <CODE>.resources</CODE> format is a binary file usually generated through the
1966 <CODE>resgen</CODE> or <CODE>monoresgen</CODE> utility, but which doesn't support plural
1967 forms.  <CODE>.resources</CODE> files can also be embedded in .NET <CODE>.exe</CODE> files.
1968 This only affects whether a file system access is performed to load the message
1969 catalog; it doesn't affect the contents of the message catalog.
1970
1971 <LI>
1972
1973 On the other hand, the <CODE>.dll</CODE> format is a binary file that is compiled
1974 from <CODE>.cs</CODE> source code and can support plural forms (provided it is
1975 accessed through the GNU gettext API, see below).
1976 </UL>
1977
1978 <P>
1979 Note that these .NET <CODE>.dll</CODE> and <CODE>.exe</CODE> files are not tied to a
1980 particular platform; their file format and GNU gettext for C# can be used
1981 on any platform.
1982
1983 </P>
1984 <P>
1985 To convert a PO file to a <CODE>.resources</CODE> file, the <CODE>msgfmt</CODE> program
1986 can be used with the option <SAMP>`--csharp-resources&acute;</SAMP>.  To convert a
1987 <CODE>.resources</CODE> file back to a PO file, the <CODE>msgunfmt</CODE> program can be
1988 used with the option <SAMP>`--csharp-resources&acute;</SAMP>.  You can also, in some cases,
1989 use the <CODE>resgen</CODE> program (from the <CODE>pnet</CODE> package) or the
1990 <CODE>monoresgen</CODE> program (from the <CODE>mono</CODE>/<CODE>mcs</CODE> package).  These
1991 programs can also convert a <CODE>.resources</CODE> file back to a PO file.  But
1992 beware: as of this writing (January 2004), the <CODE>monoresgen</CODE> converter is
1993 quite buggy and the <CODE>resgen</CODE> converter ignores the encoding of the PO
1994 files.
1995
1996 </P>
1997 <P>
1998 To convert a PO file to a <CODE>.dll</CODE> file, the <CODE>msgfmt</CODE> program can be
1999 used with the option <CODE>--csharp</CODE>.  The result will be a <CODE>.dll</CODE> file
2000 containing a subclass of <CODE>GettextResourceSet</CODE>, which itself is a subclass
2001 of <CODE>ResourceSet</CODE>.  To convert a <CODE>.dll</CODE> file containing a
2002 <CODE>GettextResourceSet</CODE> subclass back to a PO file, the <CODE>msgunfmt</CODE>
2003 program can be used with the option <CODE>--csharp</CODE>.
2004
2005 </P>
2006 <P>
2007 The advantages of the <CODE>.dll</CODE> format over the <CODE>.resources</CODE> format
2008 are:
2009
2010 </P>
2011
2012 <OL>
2013 <LI>
2014
Freedom to localize: Users can add their own translations to an application
after it has been built and distributed.  By contrast, when the programmer
uses a <CODE>ResourceManager</CODE> constructor provided by the system, the set
of <CODE>.resources</CODE> files for an application must be specified when the
application is built and cannot be extended afterwards.
2020
2021 <LI>
2022
Plural handling: A message catalog in <CODE>.dll</CODE> format supports the plural
handling function <CODE>GetPluralString</CODE>.  By contrast, <CODE>.resources</CODE> files
can only contain data and only support lookups that depend on a single string.
2026
2027 <LI>
2028
2029 The <CODE>GettextResourceManager</CODE> that loads the message catalogs in
2030 <CODE>.dll</CODE> format also provides for inheritance on a per-message basis.
2031 For example, in Austrian (<CODE>de_AT</CODE>) locale, translations from the German
2032 (<CODE>de</CODE>) message catalog will be used for messages not found in the
Austrian message catalog.  As a consequence, the Austrian translators need
only translate those few messages for which the Austrian translation differs
from the German one.  By contrast, when working with
<CODE>.resources</CODE> files, each message catalog must provide the translations
of all messages by itself.
2038
2039 <LI>
2040
The <CODE>GettextResourceManager</CODE> that loads the message catalogs in
<CODE>.dll</CODE> format also provides for a fallback: The English <VAR>msgid</VAR> is
returned when no translation can be found.  By contrast, when working with
<CODE>.resources</CODE> files, a language-neutral <CODE>.resources</CODE> file must
explicitly be provided as a fallback.
2046 </OL>
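<P>
The per-message inheritance and the <VAR>msgid</VAR> fallback described in the
last two items can be pictured with a small model.  The following Python
sketch is only an illustration under stated assumptions: the names
<CODE>locale_chain</CODE>, <CODE>lookup</CODE> and <CODE>CATALOGS</CODE>, the dictionary
layout and the sample translations are all hypothetical and are not part of
GNU gettext or of the .NET API.
</P>

```python
# Hypothetical model of per-message inheritance with msgid fallback.
# This is not the GNU.Gettext implementation, only a sketch of the idea.


def locale_chain(locale):
    """Return the locale followed by its parent, e.g. 'de_AT' -> ['de_AT', 'de']."""
    chain = [locale]
    if "_" in locale:
        chain.append(locale.split("_", 1)[0])
    return chain


def lookup(catalogs, locale, msgid):
    """Search each catalog in the chain; fall back to the msgid itself."""
    for name in locale_chain(locale):
        translation = catalogs.get(name, {}).get(msgid)
        if translation is not None:
            return translation
    return msgid


# Sample data, invented for this example.
CATALOGS = {
    "de": {"January": "Januar", "Yes": "Ja"},
    "de_AT": {"January": "Jänner"},
}
```

<P>
In this model the Austrian catalog carries only the one message that differs;
every other message is found in the German catalog, and a message that is in
neither catalog comes back unchanged.
</P>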
2047
2048 <P>
On the side of the programmatic APIs, the programmer can use either the
standard <CODE>ResourceManager</CODE> API or the GNU <CODE>GettextResourceManager</CODE>
API.  The latter is an extension of the former, because
<CODE>GettextResourceManager</CODE> is a subclass of <CODE>ResourceManager</CODE>.
2053
2054 </P>
2055
2056 <OL>
2057 <LI>
2058
2059 The <CODE>System.Resources.ResourceManager</CODE> API.
2060
2061 This API works with resources in <CODE>.resources</CODE> format.
2062
2063 The creation of the <CODE>ResourceManager</CODE> is done through
2064
2065 <PRE>
2066   new ResourceManager(domainname, Assembly.GetExecutingAssembly())
2067 </PRE>
2068
2069
2070 The <CODE>GetString</CODE> function returns a string's translation.  Note that this
2071 function returns null when a translation is missing (i.e. not even found in
2072 the fallback resource file).
2073
2074 <LI>
2075
2076 The <CODE>GNU.Gettext.GettextResourceManager</CODE> API.
2077
2078 This API works with resources in <CODE>.dll</CODE> format.
2079
2080 Reference documentation is in the
2081 <A HREF="csharpdoc/index.html">csharpdoc directory</A>.
2082
2083 The creation of the <CODE>ResourceManager</CODE> is done through
2084
2085 <PRE>
2086   new GettextResourceManager(domainname)
2087 </PRE>
2088
2089 The <CODE>GetString</CODE> function returns a string's translation.  Note that when
2090 a translation is missing, the <VAR>msgid</VAR> argument is returned unchanged.
2091
2092 The <CODE>GetPluralString</CODE> function returns a string translation with plural
2093 handling, like the <CODE>ngettext</CODE> function in C.
2094
2095 <A NAME="IDX1135"></A>
2096 To use this API, one needs the <CODE>GNU.Gettext.dll</CODE> file which is part of
2097 the GNU gettext package and distributed under the LGPL.
2098 </OL>
2099
2100 <P>
2101 You can also mix both approaches: use the
2102 <CODE>GNU.Gettext.GettextResourceManager</CODE> constructor, but otherwise use
2103 only the <CODE>ResourceManager</CODE> type and only the <CODE>GetString</CODE> method.
This is appropriate when you want to profit from the tools for PO files,
but don't want to change existing source code that uses
<CODE>ResourceManager</CODE> and don't (yet) need the <CODE>GetPluralString</CODE> method.
2107
2108 </P>
2109 <P>
2110 Two examples, using the second API, are available in the <TT>`examples&acute;</TT>
2111 directory: <CODE>hello-csharp</CODE>, <CODE>hello-csharp-forms</CODE>.
2112
2113 </P>
2114 <P>
2115 Now, to make use of the API and define a shorthand for <SAMP>`GetString&acute;</SAMP>,
2116 there are two idioms that you can choose from:
2117
2118 </P>
2119
2120 <UL>
2121 <LI>
2122
2123 In a unique class of your project, say <SAMP>`Util&acute;</SAMP>, define a static variable
2124 holding the <CODE>ResourceManager</CODE> instance:
2125
2126
2127 <PRE>
2128 public static GettextResourceManager MyResourceManager =
2129   new GettextResourceManager("domain-name");
2130 </PRE>
2131
2132 All classes containing internationalized strings then contain
2133
2134
2135 <PRE>
2136 private static GettextResourceManager Res = Util.MyResourceManager;
2137 private static String _(String s) { return Res.GetString(s); }
2138 </PRE>
2139
2140 and the shorthand is used like this:
2141
2142
2143 <PRE>
2144 Console.WriteLine(_("Operation completed."));
2145 </PRE>
2146
2147 <LI>
2148
2149 You add a class with a very short name, say <SAMP>`S&acute;</SAMP>, containing just the
2150 definition of the resource manager and of the shorthand:
2151
2152
2153 <PRE>
2154 public class S {
2155   public static GettextResourceManager MyResourceManager =
2156     new GettextResourceManager("domain-name");
2157   public static String _(String s) {
2158      return MyResourceManager.GetString(s);
2159   }
2160 }
2161 </PRE>
2162
2163 and the shorthand is used like this:
2164
2165
2166 <PRE>
2167 Console.WriteLine(S._("Operation completed."));
2168 </PRE>
2169
2170 </UL>
2171
2172 <P>
Which of the two idioms you choose will depend on whether copying two lines
of code into every class is more acceptable in your project than a class
with a single-letter name.
2176
2177 </P>
2178
2179
2180 <H3><A NAME="SEC265" HREF="gettext_toc.html#TOC265">13.5.13  GNU awk</A></H3>
2181 <P>
2182 <A NAME="IDX1136"></A>
2183 <A NAME="IDX1137"></A>
2184
2185 </P>
2186 <DL COMPACT>
2187
2188 <DT>RPMs
2189 <DD>
2190 gawk 3.1 or newer
2191
2192 <DT>File extension
2193 <DD>
2194 <CODE>awk</CODE>
2195
2196 <DT>String syntax
2197 <DD>
2198 <CODE>"abc"</CODE>
2199
2200 <DT>gettext shorthand
2201 <DD>
2202 <CODE>_"abc"</CODE>
2203
2204 <DT>gettext/ngettext functions
2205 <DD>
2206 <CODE>dcgettext</CODE>, missing <CODE>dcngettext</CODE> in gawk-3.1.0
2207
2208 <DT>textdomain
2209 <DD>
2210 <CODE>TEXTDOMAIN</CODE> variable
2211
2212 <DT>bindtextdomain
2213 <DD>
2214 <CODE>bindtextdomain</CODE> function
2215
2216 <DT>setlocale
2217 <DD>
2218 automatic, but missing <CODE>setlocale (LC_MESSAGES, "")</CODE> in gawk-3.1.0
2219
2220 <DT>Prerequisite
2221 <DD>
2222 ---
2223
2224 <DT>Use or emulate GNU gettext
2225 <DD>
2226 use
2227
2228 <DT>Extractor
2229 <DD>
2230 <CODE>xgettext</CODE>
2231
2232 <DT>Formatting with positions
2233 <DD>
2234 <CODE>printf "%2$d %1$d"</CODE> (GNU awk only)
2235
2236 <DT>Portability
2237 <DD>
2238 On platforms without gettext, no translation.  On non-GNU awks, you must
2239 define <CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> and <CODE>bindtextdomain</CODE>
2240 yourself.
2241
2242 <DT>po-mode marking
2243 <DD>
2244 ---
2245 </DL>
2246
2247 <P>
2248 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-gawk</CODE>.
2249
2250 </P>
2251
2252
2253 <H3><A NAME="SEC266" HREF="gettext_toc.html#TOC266">13.5.14  Pascal - Free Pascal Compiler</A></H3>
2254 <P>
2255 <A NAME="IDX1138"></A>
2256 <A NAME="IDX1139"></A>
2257 <A NAME="IDX1140"></A>
2258
2259 </P>
2260 <DL COMPACT>
2261
2262 <DT>RPMs
2263 <DD>
2264 fpk
2265
2266 <DT>File extension
2267 <DD>
2268 <CODE>pp</CODE>, <CODE>pas</CODE>
2269
2270 <DT>String syntax
2271 <DD>
2272 <CODE>'abc'</CODE>
2273
2274 <DT>gettext shorthand
2275 <DD>
2276 automatic
2277
2278 <DT>gettext/ngettext functions
2279 <DD>
2280 ---, use <CODE>ResourceString</CODE> data type instead
2281
2282 <DT>textdomain
2283 <DD>
2284 ---, use <CODE>TranslateResourceStrings</CODE> function instead
2285
2286 <DT>bindtextdomain
2287 <DD>
2288 ---, use <CODE>TranslateResourceStrings</CODE> function instead
2289
2290 <DT>setlocale
2291 <DD>
2292 automatic, but uses only LANG, not LC_MESSAGES or LC_ALL
2293
2294 <DT>Prerequisite
2295 <DD>
2296 <CODE>{$mode delphi}</CODE> or <CODE>{$mode objfpc}</CODE><BR><CODE>uses gettext;</CODE>
2297
2298 <DT>Use or emulate GNU gettext
2299 <DD>
2300 emulate partially
2301
2302 <DT>Extractor
2303 <DD>
2304 <CODE>ppc386</CODE> followed by <CODE>xgettext</CODE> or <CODE>rstconv</CODE>
2305
2306 <DT>Formatting with positions
2307 <DD>
2308 <CODE>uses sysutils;</CODE><BR><CODE>format "%1:d %0:d"</CODE>
2309
2310 <DT>Portability
2311 <DD>
2312 ?
2313
2314 <DT>po-mode marking
2315 <DD>
2316 ---
2317 </DL>
2318
2319 <P>
2320 The Pascal compiler has special support for the <CODE>ResourceString</CODE> data
2321 type.  It generates a <CODE>.rst</CODE> file.  This is then converted to a
2322 <CODE>.pot</CODE> file by use of <CODE>xgettext</CODE> or <CODE>rstconv</CODE>.  At runtime,
2323 a <CODE>.mo</CODE> file corresponding to translations of this <CODE>.pot</CODE> file
2324 can be loaded using the <CODE>TranslateResourceStrings</CODE> function in the
2325 <CODE>gettext</CODE> unit.
2326
2327 </P>
2328 <P>
2329 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-pascal</CODE>.
2330
2331 </P>
2332
2333
2334 <H3><A NAME="SEC267" HREF="gettext_toc.html#TOC267">13.5.15  wxWindows library</A></H3>
2335 <P>
2336 <A NAME="IDX1141"></A>
2337
2338 </P>
2339 <DL COMPACT>
2340
2341 <DT>RPMs
2342 <DD>
2343 wxGTK, gettext
2344
2345 <DT>File extension
2346 <DD>
2347 <CODE>cpp</CODE>
2348
2349 <DT>String syntax
2350 <DD>
2351 <CODE>"abc"</CODE>
2352
2353 <DT>gettext shorthand
2354 <DD>
2355 <CODE>_("abc")</CODE>
2356
2357 <DT>gettext/ngettext functions
2358 <DD>
2359 <CODE>wxLocale::GetString</CODE>, <CODE>wxGetTranslation</CODE>
2360
2361 <DT>textdomain
2362 <DD>
2363 <CODE>wxLocale::AddCatalog</CODE>
2364
2365 <DT>bindtextdomain
2366 <DD>
2367 <CODE>wxLocale::AddCatalogLookupPathPrefix</CODE>
2368
2369 <DT>setlocale
2370 <DD>
2371 <CODE>wxLocale::Init</CODE>, <CODE>wxSetLocale</CODE>
2372
2373 <DT>Prerequisite
2374 <DD>
2375 <CODE>#include &#60;wx/intl.h&#62;</CODE>
2376
2377 <DT>Use or emulate GNU gettext
2378 <DD>
2379 emulate, see <CODE>include/wx/intl.h</CODE> and <CODE>src/common/intl.cpp</CODE>
2380
2381 <DT>Extractor
2382 <DD>
2383 <CODE>xgettext</CODE>
2384
2385 <DT>Formatting with positions
2386 <DD>
2387 ---
2388
2389 <DT>Portability
2390 <DD>
2391 fully portable
2392
2393 <DT>po-mode marking
2394 <DD>
2395 yes
2396 </DL>
2397
2398
2399
2400 <H3><A NAME="SEC268" HREF="gettext_toc.html#TOC268">13.5.16  YCP - YaST2 scripting language</A></H3>
2401 <P>
2402 <A NAME="IDX1142"></A>
2403 <A NAME="IDX1143"></A>
2404
2405 </P>
2406 <DL COMPACT>
2407
2408 <DT>RPMs
2409 <DD>
2410 libycp, libycp-devel, yast2-core, yast2-core-devel
2411
2412 <DT>File extension
2413 <DD>
2414 <CODE>ycp</CODE>
2415
2416 <DT>String syntax
2417 <DD>
2418 <CODE>"abc"</CODE>
2419
2420 <DT>gettext shorthand
2421 <DD>
2422 <CODE>_("abc")</CODE>
2423
2424 <DT>gettext/ngettext functions
2425 <DD>
2426 <CODE>_()</CODE> with 1 or 3 arguments
2427
2428 <DT>textdomain
2429 <DD>
2430 <CODE>textdomain</CODE> statement
2431
2432 <DT>bindtextdomain
2433 <DD>
2434 ---
2435
2436 <DT>setlocale
2437 <DD>
2438 ---
2439
2440 <DT>Prerequisite
2441 <DD>
2442 ---
2443
2444 <DT>Use or emulate GNU gettext
2445 <DD>
2446 use
2447
2448 <DT>Extractor
2449 <DD>
2450 <CODE>xgettext</CODE>
2451
2452 <DT>Formatting with positions
2453 <DD>
2454 <CODE>sformat "%2 %1"</CODE>
2455
2456 <DT>Portability
2457 <DD>
2458 fully portable
2459
2460 <DT>po-mode marking
2461 <DD>
2462 ---
2463 </DL>
2464
2465 <P>
2466 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-ycp</CODE>.
2467
2468 </P>
2469
2470
2471 <H3><A NAME="SEC269" HREF="gettext_toc.html#TOC269">13.5.17  Tcl - Tk's scripting language</A></H3>
2472 <P>
2473 <A NAME="IDX1144"></A>
2474 <A NAME="IDX1145"></A>
2475
2476 </P>
2477 <DL COMPACT>
2478
2479 <DT>RPMs
2480 <DD>
2481 tcl
2482
2483 <DT>File extension
2484 <DD>
2485 <CODE>tcl</CODE>
2486
2487 <DT>String syntax
2488 <DD>
2489 <CODE>"abc"</CODE>
2490
2491 <DT>gettext shorthand
2492 <DD>
2493 <CODE>[_ "abc"]</CODE>
2494
2495 <DT>gettext/ngettext functions
2496 <DD>
2497 <CODE>::msgcat::mc</CODE>
2498
2499 <DT>textdomain
2500 <DD>
2501 ---
2502
2503 <DT>bindtextdomain
2504 <DD>
2505 ---, use <CODE>::msgcat::mcload</CODE> instead
2506
2507 <DT>setlocale
2508 <DD>
2509 automatic, uses LANG, but ignores LC_MESSAGES and LC_ALL
2510
2511 <DT>Prerequisite
2512 <DD>
2513 <CODE>package require msgcat</CODE>
2514 <BR><CODE>proc _ {s} {return [::msgcat::mc $s]}</CODE>
2515
2516 <DT>Use or emulate GNU gettext
2517 <DD>
2518 ---, uses a Tcl specific message catalog format
2519
2520 <DT>Extractor
2521 <DD>
2522 <CODE>xgettext -k_</CODE>
2523
2524 <DT>Formatting with positions
2525 <DD>
2526 <CODE>format "%2\$d %1\$d"</CODE>
2527
2528 <DT>Portability
2529 <DD>
2530 fully portable
2531
2532 <DT>po-mode marking
2533 <DD>
2534 ---
2535 </DL>
2536
2537 <P>
2538 Two examples are available in the <TT>`examples&acute;</TT> directory:
2539 <CODE>hello-tcl</CODE>, <CODE>hello-tcl-tk</CODE>.
2540
2541 </P>
2542 <P>
2543 Before marking strings as internationalizable, substitutions of variables
2544 into the string need to be converted to <CODE>format</CODE> applications.  For
2545 example, <CODE>"file $filename not found"</CODE> becomes
2546 <CODE>[format "file %s not found" $filename]</CODE>.
Only after this is done can the strings be marked and extracted.
2548 After marking, this example becomes
2549 <CODE>[format [_ "file %s not found"] $filename]</CODE> or
2550 <CODE>[msgcat::mc "file %s not found" $filename]</CODE>.  Note that the
2551 <CODE>msgcat::mc</CODE> function implicitly calls <CODE>format</CODE> when more than one
2552 argument is given.
2553
2554 </P>
2555
2556
2557 <H3><A NAME="SEC270" HREF="gettext_toc.html#TOC270">13.5.18  Perl</A></H3>
2558 <P>
2559 <A NAME="IDX1146"></A>
2560
2561 </P>
2562 <DL COMPACT>
2563
2564 <DT>RPMs
2565 <DD>
2566 perl
2567
2568 <DT>File extension
2569 <DD>
2570 <CODE>pl</CODE>, <CODE>PL</CODE>, <CODE>pm</CODE>, <CODE>cgi</CODE>
2571
2572 <DT>String syntax
2573 <DD>
2574
2575 <UL>
2576
2577 <LI><CODE>"abc"</CODE>
2578
2579 <LI><CODE>'abc'</CODE>
2580
2581 <LI><CODE>qq (abc)</CODE>
2582
2583 <LI><CODE>q (abc)</CODE>
2584
2585 <LI><CODE>qr /abc/</CODE>
2586
2587 <LI><CODE>qx (/bin/date)</CODE>
2588
2589 <LI><CODE>/pattern match/</CODE>
2590
2591 <LI><CODE>?pattern match?</CODE>
2592
2593 <LI><CODE>s/substitution/operators/</CODE>
2594
2595 <LI><CODE>$tied_hash{"message"}</CODE>
2596
2597 <LI><CODE>$tied_hash_reference-&#62;{"message"}</CODE>
2598
2599 <LI>etc., issue the command <SAMP>`man perlsyn&acute;</SAMP> for details
2600
2601 </UL>
2602
2603 <DT>gettext shorthand
2604 <DD>
2605 <CODE>__</CODE> (double underscore)
2606
2607 <DT>gettext/ngettext functions
2608 <DD>
2609 <CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
2610 <CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
2611
2612 <DT>textdomain
2613 <DD>
2614 <CODE>textdomain</CODE> function
2615
2616 <DT>bindtextdomain
2617 <DD>
2618 <CODE>bindtextdomain</CODE> function
2619
2620 <DT>bind_textdomain_codeset
2621 <DD>
2622 <CODE>bind_textdomain_codeset</CODE> function
2623
2624 <DT>setlocale
2625 <DD>
2626 Use <CODE>setlocale (LC_ALL, "");</CODE>
2627
2628 <DT>Prerequisite
2629 <DD>
2630 <CODE>use POSIX;</CODE>
2631 <BR><CODE>use Locale::TextDomain;</CODE> (included in the package libintl-perl
2632 which is available on the Comprehensive Perl Archive Network CPAN,
2633 http://www.cpan.org/).
2634
2635 <DT>Use or emulate GNU gettext
2636 <DD>
2637 platform dependent: gettext_pp emulates, gettext_xs uses GNU gettext
2638
2639 <DT>Extractor
2640 <DD>
2641 <CODE>xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2 -kN__ -k</CODE>
2642
2643 <DT>Formatting with positions
2644 <DD>
2645 Both kinds of format strings support formatting with positions.
2646 <BR><CODE>printf "%2\$d %1\$d", ...</CODE> (requires Perl 5.8.0 or newer)
2647 <BR><CODE>__expand("[new] replaces [old]", old =&#62; $oldvalue, new =&#62; $newvalue)</CODE>
2648
2649 <DT>Portability
2650 <DD>
2651 The <CODE>libintl-perl</CODE> package is platform independent but is not
2652 part of the Perl core.  The programmer is responsible for
2653 providing a dummy implementation of the required functions if the
2654 package is not installed on the target system.
2655
2656 <DT>po-mode marking
2657 <DD>
2658 ---
2659
2660 <DT>Documentation
2661 <DD>
2662 Included in <CODE>libintl-perl</CODE>, available on CPAN
2663 (http://www.cpan.org/).
2664
2665 </DL>
2666
2667 <P>
2668 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-perl</CODE>.
2669
2670 </P>
2671 <P>
2672 <A NAME="IDX1147"></A>
2673
2674 </P>
2675 <P>
2676 The <CODE>xgettext</CODE> parser backend for Perl differs significantly from
2677 the parser backends for other programming languages, just as Perl
2678 itself differs significantly from other programming languages.  The
Perl parser backend offers many more string marking facilities than
the other backends, but it also has some Perl-specific limitations, the
worst probably being that it is imperfect.
2682
2683 </P>
2684
2685
2686
2687 <H4><A NAME="SEC271" HREF="gettext_toc.html#TOC271">13.5.18.1  General Problems Parsing Perl Code</A></H4>
2688
2689 <P>
2690 It is often heard that only Perl can parse Perl.  This is not true.
2691 Perl cannot be <EM>parsed</EM> at all, it can only be <EM>executed</EM>.
2692 Perl has various built-in ambiguities that can only be resolved at runtime.
2693
2694 </P>
2695 <P>
2696 The following example may illustrate one common problem:
2697
2698 </P>
2699
2700 <PRE>
2701 print gettext "Hello World!";
2702 </PRE>
2703
2704 <P>
2705 Although this example looks like a bullet-proof case of a function
2706 invocation, it is not:
2707
2708 </P>
2709
2710 <PRE>
2711 open gettext, "&#62;testfile" or die;
2712 print gettext "Hello world!"
2713 </PRE>
2714
2715 <P>
2716 In this context, the string <CODE>gettext</CODE> looks more like a
2717 file handle.  But not necessarily:
2718
2719 </P>
2720
2721 <PRE>
2722 use Locale::Messages qw (:libintl_h);
2723 open gettext "&#62;testfile" or die;
2724 print gettext "Hello world!";
2725 </PRE>
2726
2727 <P>
2728 Now, the file is probably syntactically incorrect, provided that the module
2729 <CODE>Locale::Messages</CODE> found first in the Perl include path exports a
2730 function <CODE>gettext</CODE>.  But what if the module
2731 <CODE>Locale::Messages</CODE> really looks like this?
2732
2733 </P>
2734
2735 <PRE>
2736 use vars qw (*gettext);
2737
2738 1;
2739 </PRE>
2740
2741 <P>
2742 In this case, the string <CODE>gettext</CODE> will be interpreted as a file
2743 handle again, and the above example will create a file <TT>`testfile&acute;</TT>
2744 and write the string "Hello world!" into it.  Even advanced
2745 control flow analysis will not really help:
2746
2747 </P>
2748
2749 <PRE>
2750 if (0.5 &#60; rand) {
2751    eval "use Sane";
2752 } else {
2753    eval "use InSane";
2754 }
2755 print gettext "Hello world!";
2756 </PRE>
2757
2758 <P>
2759 If the module <CODE>Sane</CODE> exports a function <CODE>gettext</CODE> that does
2760 what we expect, and the module <CODE>InSane</CODE> opens a file for writing
2761 and associates the <EM>handle</EM> <CODE>gettext</CODE> with this output
2762 stream, we are clueless again about what will happen at runtime.  It is
2763 completely unpredictable.  The truth is that Perl has so many ways to
2764 fill its symbol table at runtime that it is impossible to interpret a
2765 particular piece of code without executing it.
2766
2767 </P>
2768 <P>
2769 Of course, <CODE>xgettext</CODE> will not execute your Perl sources while
2770 scanning for translatable strings, but rather use heuristics in order
2771 to guess what you meant.
2772
2773 </P>
2774 <P>
2775 Another problem is the ambiguity of the slash and the question mark.
2776 Their interpretation depends on the context:
2777
2778 </P>
2779
2780 <PRE>
2781 # A pattern match.
2782 print "OK\n" if /foobar/;
2783
2784 # A division.
2785 print 1 / 2;
2786
2787 # Another pattern match.
2788 print "OK\n" if ?foobar?;
2789
2790 # Conditional.
2791 print $x ? "foo" : "bar";
2792 </PRE>
2793
2794 <P>
2795 The slash may either act as the division operator or introduce a
2796 pattern match, whereas the question mark may act as the ternary
2797 conditional operator or as a pattern match, too.  Other programming
2798 languages like <CODE>awk</CODE> present similar problems, but the consequences of a
2799 misinterpretation are particularly nasty with Perl sources.  In <CODE>awk</CODE>
2800 for instance, a statement can never exceed one line and the parser
2801 can recover from a parsing error at the next newline and interpret
2802 the rest of the input stream correctly.  Perl is different, as a
2803 pattern match is terminated by the next appearance of the delimiter
2804 (the slash or the question mark) in the input stream, regardless of
2805 the semantic context.  If a slash is really a division sign but
2806 mis-interpreted as a pattern match, the rest of the input file is most
2807 probably parsed incorrectly.
2808
2809 </P>
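<P>
To give an idea of what such a guess can look like, the following Python
sketch decides whether a slash opens a pattern match by looking only at the
text that precedes it.  It is an assumption-laden illustration, not the
algorithm used by <CODE>xgettext</CODE>: the function name
<CODE>slash_starts_pattern</CODE> and the word list are hypothetical, and the
word list is deliberately incomplete.
</P>

```python
import re

# Words after which an operand (and hence a pattern) is expected.
# This list is illustrative and far from complete.
OPERAND_EXPECTED_AFTER = {
    "if", "unless", "while", "until", "and", "or", "not",
    "return", "split", "grep", "map",
}


def slash_starts_pattern(code_before):
    """Guess whether a '/' that follows code_before opens a pattern match."""
    stripped = code_before.rstrip()
    if not stripped:
        return True  # start of input: there is nothing to divide
    last_word = re.search(r"[A-Za-z_]\w*$", stripped)
    if last_word and last_word.group(0) in OPERAND_EXPECTED_AFTER:
        return True
    # After an identifier, a number, or a closing bracket a value has just
    # ended, so the slash is taken to be the division operator.
    return not re.search(r"[\w)\]}]$", stripped)
```

<P>
A guess of this kind is easily fooled.  For instance, it treats the slash in
<CODE>print /foobar/</CODE> as a division because <CODE>print</CODE> is not in the
word list, and once a slash has been misclassified, everything up to the next
slash is read in the wrong mode.
</P>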
2810 <P>
2811 If you find that <CODE>xgettext</CODE> fails to extract strings from
2812 portions of your sources, you should therefore look out for slashes
2813 and/or question marks preceding these sections.  You may have come
2814 across a bug in <CODE>xgettext</CODE>'s Perl parser (and of course you
should report that bug).  In the meantime you should consider
reformulating your code in a manner less challenging to <CODE>xgettext</CODE>.
2817
2818 </P>
2819
2820
2821 <H4><A NAME="SEC272" HREF="gettext_toc.html#TOC272">13.5.18.2  Which keywords will xgettext look for?</A></H4>
2822 <P>
2823 <A NAME="IDX1148"></A>
2824
2825 </P>
2826 <P>
2827 Unless you instruct <CODE>xgettext</CODE> otherwise by invoking it with one
2828 of the options <CODE>--keyword</CODE> or <CODE>-k</CODE>, it will recognize the
2829 following keywords in your Perl sources:
2830
2831 </P>
2832
2833 <UL>
2834
2835 <LI><CODE>gettext</CODE>
2836
2837 <LI><CODE>dgettext</CODE>
2838
2839 <LI><CODE>dcgettext</CODE>
2840
2841 <LI><CODE>ngettext:1,2</CODE>
2842
2843 The first (singular) and the second (plural) argument will be
2844 extracted.
2845
2846 <LI><CODE>dngettext:1,2</CODE>
2847
2848 The first (singular) and the second (plural) argument will be
2849 extracted.
2850
2851 <LI><CODE>dcngettext:1,2</CODE>
2852
2853 The first (singular) and the second (plural) argument will be
2854 extracted.
2855
2856 <LI><CODE>gettext_noop</CODE>
2857
2858 <LI><CODE>%gettext</CODE>
2859
2860 The keys of lookups into the hash <CODE>%gettext</CODE> will be extracted.
2861
2862 <LI><CODE>$gettext</CODE>
2863
2864 The keys of lookups into the hash reference <CODE>$gettext</CODE> will be extracted.
2865
2866 </UL>
2867
2868
2869
2870 <H4><A NAME="SEC273" HREF="gettext_toc.html#TOC273">13.5.18.3  How to Extract Hash Keys</A></H4>
2871 <P>
2872 <A NAME="IDX1149"></A>
2873
2874 </P>
2875 <P>
2876 Translating messages at runtime is normally performed by looking up the
2877 original string in the translation database and returning the
2878 translated version.  The "natural" Perl implementation is a hash
2879 lookup, and, of course, <CODE>xgettext</CODE> supports such practice.
2880
2881 </P>
2882
2883 <PRE>
2884 print __"Hello world!";
2885 print $__{"Hello world!"};
2886 print $__-&#62;{"Hello world!"};
2887 print $$__{"Hello world!"};
2888 </PRE>
2889
2890 <P>
2891 The above four lines all do the same thing.  The Perl module
2892 <CODE>Locale::TextDomain</CODE> exports by default a hash <CODE>%__</CODE> that
2893 is tied to the function <CODE>__()</CODE>.  It also exports a reference
2894 <CODE>$__</CODE> to <CODE>%__</CODE>.
2895
2896 </P>
2897 <P>
If an argument to the <CODE>xgettext</CODE> option <CODE>--keyword</CODE>
(or its short form <CODE>-k</CODE>) starts with a percent sign, the rest of the
keyword is interpreted as the name of a hash.  If it starts with a dollar
sign, the rest of the keyword is interpreted as a reference to a
hash.
2903
2904 </P>
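<P>
The prefix rule just described can be restated as a few lines of Python.  The
function name <CODE>classify_keyword</CODE> and its return values are
hypothetical and serve only to illustrate the rule; argument-number suffixes
such as <CODE>:1,2</CODE> are not handled in this sketch.
</P>

```python
def classify_keyword(spec):
    """Classify a Perl keyword as it would be passed to the -k option.

    Returns a (kind, name) pair, where kind is 'hash', 'hash reference'
    or 'function'.  Illustration only; not taken from the xgettext sources.
    """
    if spec.startswith("%"):
        return ("hash", spec[1:])
    if spec.startswith("$"):
        return ("hash reference", spec[1:])
    return ("function", spec)
```

<P>
With this reading, <CODE>-k%__</CODE> names the hash <CODE>%__</CODE>,
<CODE>-k\$__</CODE> (the backslash only protects the dollar sign from the
shell) names the hash reference <CODE>$__</CODE>, and <CODE>-k__</CODE> names an
ordinary function.
</P>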
2905 <P>
2906 Note that you can omit the quotation marks (single or double) around
2907 the hash key (almost) whenever Perl itself allows it:
2908
2909 </P>
2910
2911 <PRE>
2912 print $gettext{Error};
2913 </PRE>
2914
2915 <P>
The exact rule is: You can omit the surrounding quotes when the hash
key is a valid C (!) identifier, i.e. when it starts with an
underscore or an ASCII letter and is followed by an arbitrary number
of underscores, ASCII letters or digits.  Other Unicode characters
are <EM>not</EM> allowed, regardless of the <CODE>use utf8</CODE> pragma.
2921
2922 </P>
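<P>
The identifier rule above translates directly into a regular expression.  The
following Python sketch checks a candidate hash key against it; the function
name <CODE>may_omit_quotes</CODE> is hypothetical and the code is an
illustration of the rule as stated, not an excerpt from <CODE>xgettext</CODE>.
</P>

```python
import re

# An underscore or ASCII letter, followed by underscores, ASCII letters
# or digits.  The explicit character ranges keep the match ASCII-only.
_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def may_omit_quotes(key):
    """Return True if key is a valid C identifier as described above."""
    return _C_IDENTIFIER.fullmatch(key) is not None
```

<P>
A key such as <CODE>Error</CODE> passes this check, whereas a key containing a
space, a leading digit or a non-ASCII letter does not and therefore needs its
quotes.
</P>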
2923
2924
2925 <H4><A NAME="SEC274" HREF="gettext_toc.html#TOC274">13.5.18.4  What are Strings And Quote-like Expressions?</A></H4>
2926 <P>
2927 <A NAME="IDX1150"></A>
2928
2929 </P>
2930 <P>
2931 Perl offers a plethora of different string constructs.  Those that can
2932 be used either as arguments to functions or inside braces for hash
2933 lookups are generally supported by <CODE>xgettext</CODE>.
2934
2935 </P>
2936
2937 <UL>
2938 <LI><STRONG>double-quoted strings</STRONG>
2939
2940 <BR>
2941
2942 <PRE>
2943 print gettext "Hello World!";
2944 </PRE>
2945
2946 <LI><STRONG>single-quoted strings</STRONG>
2947
2948 <BR>
2949
2950 <PRE>
2951 print gettext 'Hello World!';
2952 </PRE>
2953
2954 <LI><STRONG>the operator qq</STRONG>
2955
2956 <BR>
2957
2958 <PRE>
2959 print gettext qq |Hello World!|;
2960 print gettext qq &#60;E-mail: &#60;guido\@imperia.net&#62;&#62;;
2961 </PRE>
2962
2963 The operator <CODE>qq</CODE> is fully supported.  You can use arbitrary
2964 delimiters, including the four bracketing delimiters (round, angle,
2965 square, curly) that nest.
2966
2967 <LI><STRONG>the operator q</STRONG>
2968
2969 <BR>
2970
2971 <PRE>
2972 print gettext q |Hello World!|;
2973 print gettext q &#60;E-mail: &#60;guido@imperia.net&#62;&#62;;
2974 </PRE>
2975
2976 The operator <CODE>q</CODE> is fully supported.  You can use arbitrary
2977 delimiters, including the four bracketing delimiters (round, angle,
2978 square, curly) that nest.
2979
2980 <LI><STRONG>the operator qx</STRONG>
2981
2982 <BR>
2983
2984 <PRE>
2985 print gettext qx ;LANGUAGE=C /bin/date;
2986 print gettext qx [/usr/bin/ls | grep '^[A-Z]*'];
2987 </PRE>
2988
2989 The operator <CODE>qx</CODE> is fully supported.  You can use arbitrary
2990 delimiters, including the four bracketing delimiters (round, angle,
2991 square, curly) that nest.
2992
2993 The example is actually a useless use of <CODE>gettext</CODE>.  It will
2994 invoke the <CODE>gettext</CODE> function on the output of the command
2995 specified with the <CODE>qx</CODE> operator.  The feature was included
2996 in order to make the interface consistent (the parser will extract
2997 all strings and quote-like expressions).
2998
2999 <LI><STRONG>here documents</STRONG>
3000
3001 <BR>
3002
3003 <PRE>
3004 print gettext &#60;&#60;'EOF';
3005 program not found in $PATH
3006 EOF
3007
3008 print ngettext &#60;&#60;EOF, &#60;&#60;"EOF";
3009 one file deleted
3010 EOF
3011 several files deleted
3012 EOF
3013 </PRE>
3014
3015 Here-documents are recognized.  If the delimiter is enclosed in single
3016 quotes, the string is not interpolated.  If it is enclosed in double
3017 quotes or has no quotes at all, the string is interpolated.
3018
3019 Delimiters that start with a digit are not supported!
3020
3021 </UL>
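<P>
The nesting behaviour of the four bracketing delimiters, as opposed to
ordinary delimiters such as <CODE>|</CODE>, can be sketched as follows.  This
Python function is a simplified illustration: the names <CODE>CLOSERS</CODE> and
<CODE>read_quoted</CODE> are hypothetical, backslash escapes are ignored, and
nothing here is taken from the <CODE>xgettext</CODE> sources.
</P>

```python
# Opening bracket -> closing bracket for the four delimiters that nest.
CLOSERS = {"(": ")", "<": ">", "[": "]", "{": "}"}


def read_quoted(text):
    """Return the body of a quote-like operand such as '<a <b> c>' or '|abc|'.

    text must start with the opening delimiter.  Bracketing delimiters
    nest; any other delimiter ends at its next occurrence.  Backslash
    escapes are not handled in this sketch.
    """
    opener = text[0]
    closer = CLOSERS.get(opener, opener)
    depth = 1
    for index in range(1, len(text)):
        char = text[index]
        if char == closer:
            depth -= 1
            if depth == 0:
                return text[1:index]
        elif char == opener:
            depth += 1
    raise ValueError("unterminated quote-like expression")
```

<P>
Applied to the operand of the <CODE>q</CODE> example above, the function returns
the text between the outermost angle brackets, with the inner pair kept as
part of the string.
</P>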
3022
3023
3024
3025 <H4><A NAME="SEC275" HREF="gettext_toc.html#TOC275">13.5.18.5  Invalid Uses Of String Interpolation</A></H4>
3026 <P>
3027 <A NAME="IDX1151"></A>
3028
3029 </P>
3030 <P>
3031 Perl is capable of interpolating variables into strings.  This offers
3032 some nice features in localized programs but can also lead to
3033 problems.
3034
3035 </P>
3036 <P>
3037 A common error is a construct like the following:
3038
3039 </P>
3040
3041 <PRE>
3042 print gettext "This is the program $0!\n";
3043 </PRE>
3044
3045 <P>
3046 Perl will interpolate at runtime the value of the variable <CODE>$0</CODE>
3047 into the argument of the <CODE>gettext()</CODE> function.  Hence, this
3048 argument is not a string constant but a variable argument (<CODE>$0</CODE>
3049 is a global variable that holds the name of the Perl script being
3050 executed).  The interpolation is performed by Perl before the string
3051 argument is passed to <CODE>gettext()</CODE> and will therefore depend on
3052 the name of the script which can only be determined at runtime.
Consequently, it is almost impossible for a translation to be found
at runtime (except if, by accident, the interpolated string is found
in the message catalog).
3056
3057 </P>
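<P>
The effect of interpolating before the lookup can be modelled independently
of Perl.  In the following Python sketch the catalog, the German translation
and the function names are all invented for illustration; the point is only
the order of the two steps.
</P>

```python
# Hypothetical one-entry catalog standing in for a compiled message catalog.
CATALOG = {"This is the program %s!\n": "Dies ist das Programm %s!\n"}


def translate(msgid):
    """Return the translation of msgid, or msgid itself if none exists."""
    return CATALOG.get(msgid, msgid)


def greet_wrong(program_name):
    # The name is substituted first, so the lookup key is not in the catalog.
    return translate("This is the program %s!\n" % program_name)


def greet_right(program_name):
    # The constant template is looked up first, then the name is filled in.
    return translate("This is the program %s!\n") % program_name
```

<P>
Only the second variant passes a constant string to the lookup, which is also
the only form an extractor can collect from the source code.
</P>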
3058 <P>
3059 The <CODE>xgettext</CODE> program will therefore terminate parsing with a fatal
3060 error if it encounters a variable inside of an extracted string.  In
3061 general, this will happen for all kinds of string interpolations that
3062 cannot be safely performed at compile time.  If you absolutely know
3063 what you are doing, you can always circumvent this behavior:
3064
3065 </P>
3066
3067 <PRE>
3068 my $know_what_i_am_doing = "This is program $0!\n";
3069 print gettext $know_what_i_am_doing;
3070 </PRE>
3071
3072 <P>
3073 Since the parser only recognizes strings and quote-like expressions,
3074 but not variables or other terms, the above construct will be
3075 accepted.  You will have to find another way, however, to let your
3076 original string make it into your message catalog.
3077
3078 </P>
3079 <P>
If <CODE>xgettext</CODE> is invoked with the option <CODE>--extract-all</CODE>
(or its short form <CODE>-a</CODE>), variable interpolation will be accepted.
Rationale: You will generally use this option in order to prepare your
sources for internationalization.
3084
3085 </P>
3086 <P>
3087 Please see the manual page <SAMP>`man perlop&acute;</SAMP> for details of strings and
3088 quote-like expressions that are subject to interpolation and those
3089 that are not.  Safe interpolations (that will not lead to a fatal
3090 error) are:
3091
3092 </P>
3093
3094 <UL>
3095
3096 <LI>the escape sequences <CODE>\t</CODE> (tab, HT, TAB), <CODE>\n</CODE>
3097
3098 (newline, NL), <CODE>\r</CODE> (return, CR), <CODE>\f</CODE> (form feed, FF),
3099 <CODE>\b</CODE> (backspace, BS), <CODE>\a</CODE> (alarm, bell, BEL), and <CODE>\e</CODE>
3100 (escape, ESC).
3101
3102 <LI>octal chars, like <CODE>\033</CODE>
3103
3104 <BR>
3105 Note that octal escapes in the range of 400-777 are translated into a
3106 UTF-8 representation, regardless of the presence of the <CODE>use utf8</CODE> pragma.
3107
3108 <LI>hex chars, like <CODE>\x1b</CODE>
3109
3110 <LI>wide hex chars, like <CODE>\x{263a}</CODE>
3111
3112 <BR>
3113 Note that this escape is translated into a UTF-8 representation,
3114 regardless of the presence of the <CODE>use utf8</CODE> pragma.
3115
3116 <LI>control chars, like <CODE>\c[</CODE> (CTRL-[)
3117
3118 <LI>named Unicode chars, like <CODE>\N{LATIN CAPITAL LETTER C WITH CEDILLA}</CODE>
3119
3120 <BR>
3121 Note that this escape is translated into a UTF-8 representation,
3122 regardless of the presence of the <CODE>use utf8</CODE> pragma.
3123 </UL>
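<P>
For the escapes that are translated into a UTF-8 representation, the
resulting bytes can be checked with a few lines of Python.  The function names
below are hypothetical, and the sketch shows only the UTF-8 encoding of the
code point that an escape denotes; it makes no claim about how
<CODE>xgettext</CODE> performs the conversion internally.
</P>

```python
import unicodedata


def escape_to_utf8(code_point):
    """Return the UTF-8 bytes for the code point denoted by an escape."""
    return chr(code_point).encode("utf-8")


def named_char_to_utf8(name):
    """Return the UTF-8 bytes for a character given by its Unicode name."""
    return unicodedata.lookup(name).encode("utf-8")
```

<P>
For example, <CODE>escape_to_utf8(0x263A)</CODE> corresponds to
<CODE>\x{263a}</CODE> and yields the three bytes E2 98 BA, while
<CODE>escape_to_utf8(0o400)</CODE> corresponds to the octal escape
<CODE>\400</CODE> and yields the two bytes C4 80.
</P>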
3124
3125 <P>
3126 The following escapes are considered partially safe:
3127
3128 </P>
3129
3130 <UL>
3131
3132 <LI><CODE>\l</CODE> lowercase next char
3133
3134 <LI><CODE>\u</CODE> uppercase next char
3135
3136 <LI><CODE>\L</CODE> lowercase till \E
3137
3138 <LI><CODE>\U</CODE> uppercase till \E
3139
3140 <LI><CODE>\E</CODE> end case modification
3141
3142 <LI><CODE>\Q</CODE> quote non-word characters till \E
3143
3144 </UL>
3145
3146 <P>
3147 These escapes are only considered safe if the string consists of
3148 ASCII characters only.  Translation of characters outside the range
3149 defined by ASCII is locale-dependent and can actually only be performed
3150 at runtime; <CODE>xgettext</CODE> doesn't do these locale-dependent translations
3151 at extraction time.
3152
3153 </P>
3154 <P>
3155 Except for the modifier <CODE>\Q</CODE>, these translations, albeit valid,
3156 are generally useless and only obfuscate your sources.  If a
3157 translation can be safely performed at compile time you can just as
3158 well write what you mean.
3159
3160 </P>
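
<P>
As a small illustration of the point above, the following sketch shows what
the case-modifying escapes and <CODE>\Q</CODE> evaluate to for pure ASCII
strings.  The variable names are chosen for this example only.
</P>

```perl
use strict;
use warnings;

# On ASCII text the result is fixed, so you can just as well write it out.
my $shouted = "\Uhello\E world";   # same as "HELLO world"
my $capital = "\uhello";           # same as "Hello"

# \Q is the useful one: it backslash-escapes all non-word characters,
# so that the string can be matched literally in a regular expression.
my $literal = "\Q1+1=2\E";         # same as '1\+1\=2'
my $matches = "1+1=2" =~ /\A$literal\z/ ? 1 : 0;

print "$shouted, $capital, $literal, $matches\n";
```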
3161
3162
3163 <H4><A NAME="SEC276" HREF="gettext_toc.html#TOC276">13.5.18.6  Valid Uses Of String Interpolation</A></H4>
3164 <P>
3165 <A NAME="IDX1152"></A>
3166
3167 </P>
3168 <P>
Perl is often used to generate sources for other programming languages
or arbitrary file formats.  Web applications that output HTML code
are a prominent example of such usage.
3172
3173 </P>
3174 <P>
3175 You will often come across situations where you want to intersperse
3176 code written in the target (programming) language with translatable
3177 messages, like in the following HTML example:
3178
3179 </P>
3180
3181 <PRE>
3182 print gettext &#60;&#60;EOF;
3183 &#60;h1&#62;My Homepage&#60;/h1&#62;
3184 &#60;script language="JavaScript"&#62;&#60;!--
3185 for (i = 0; i &#60; 100; ++i) {
3186     alert ("Thank you so much for visiting my homepage!");
3187 }
3188 //--&#62;&#60;/script&#62;
3189 EOF
3190 </PRE>
3191
3192 <P>
The parser will extract the entire here document, and it will appear
entirely in the resulting PO file, including the JavaScript snippet
embedded in the HTML code.  If you overuse constructs like
the above, you run the risk that the translators of your package
will look for a less challenging project.  You should consider an
alternative expression here:
3199
3200 </P>
3201
3202 <PRE>
3203 print &#60;&#60;EOF;
3204 &#60;h1&#62;$gettext{"My Homepage"}&#60;/h1&#62;
3205 &#60;script language="JavaScript"&#62;&#60;!--
3206 for (i = 0; i &#60; 100; ++i) {
3207     alert ("$gettext{'Thank you so much for visiting my homepage!'}");
3208 }
3209 //--&#62;&#60;/script&#62;
3210 EOF
3211 </PRE>
3212
3213 <P>
Only the translatable portions of the code will be extracted here, and
the resulting PO file will be considerably more readable.
3216
3217 </P>
3218 <P>
3219 You can interpolate hash lookups in all strings or quote-like
3220 expressions that are subject to interpolation (see the manual page
3221 <SAMP>`man perlop&acute;</SAMP> for details).  Double interpolation is invalid, however:
3222
3223 </P>
3224
3225 <PRE>
3226 # TRANSLATORS: Replace "the earth" with the name of your planet.
3227 print gettext qq{Welcome to $gettext-&#62;{"the earth"}};
3228 </PRE>
3229
3230 <P>
The <CODE>qq</CODE>-quoted string is recognized first as an argument to <CODE>gettext</CODE>,
and is checked for invalid variable interpolation.  The
dollar sign of the hash dereference will therefore terminate the parser
with an "invalid interpolation" error.
3235
3236 </P>
3237 <P>
3238 It is valid to interpolate hash lookups in regular expressions:
3239
3240 </P>
3241
3242 <PRE>
3243 if ($var =~ /$gettext{"the earth"}/) {
3244    print gettext "Match!\n";
3245 }
3246 s/$gettext{"U. S. A."}/$gettext{"U. S. A."} $gettext{"(dial +0)"}/g;
3247 </PRE>
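
<P>
The examples in this section rely on <CODE>%gettext</CODE> being a tied hash
whose lookups return translations.  As a rough, self-contained sketch of how
such a hash could be set up: the package name <CODE>My::GettextHash</CODE> and
the in-memory <CODE>%catalog</CODE> are hypothetical stand-ins, and a real
program would call <CODE>gettext</CODE> inside <CODE>FETCH</CODE> instead.
</P>

```perl
use strict;
use warnings;

package My::GettextHash;    # hypothetical package name

# Stand-in for a real message catalog, so that the sketch runs anywhere.
my %catalog = ('the earth' => 'die Erde');

sub TIEHASH {
    my ($class) = @_;
    return bless {}, $class;
}

# Every lookup $gettext{...} ends up here.  A real program would call
# gettext ($msgid) instead of consulting %catalog.
sub FETCH {
    my ($self, $msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

package main;

tie my %gettext, 'My::GettextHash';

# Interpolation in a quoted string and in a regular expression.
my $greeting = "Welcome to $gettext{'the earth'}!";
my $found = "Wir lieben die Erde" =~ /$gettext{"the earth"}/ ? 1 : 0;

print "$greeting ($found)\n";
```

<P>
Untranslated keys fall back to the original string, which mirrors what
<CODE>gettext</CODE> does when no translation is found.
</P>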
3248
3249
3250
3251 <H4><A NAME="SEC277" HREF="gettext_toc.html#TOC277">13.5.18.7  When To Use Parentheses</A></H4>
3252 <P>
3253 <A NAME="IDX1153"></A>
3254
3255 </P>
3256 <P>
In Perl, parentheses around function arguments are mostly optional.
<CODE>xgettext</CODE> will always assume that all
recognized keywords (except for hashes and hash references) are names
of properly prototyped functions, and will (hopefully) only require
parentheses where Perl itself requires them.  All constructs in the
following example are therefore OK to use:
3263
3264 </P>
3265
3266 <PRE>
3267 print gettext ("Hello World!\n");
3268 print gettext "Hello World!\n";
3269 print dgettext ($package =&#62; "Hello World!\n");
3270 print dgettext $package, "Hello World!\n";
3271
3272 # The "fat comma" =&#62; turns the left-hand side argument into a
3273 # single-quoted string!
3274 print dgettext smellovision =&#62; "Hello World!\n";
3275
3276 # The following assignment only works with prototyped functions.
3277 # Otherwise, the functions will act as "greedy" list operators and
3278 # eat up all following arguments.
3279 my $anonymous_hash = {
3280    planet =&#62; gettext "earth",
3281    cakes =&#62; ngettext "one cake", "several cakes", $n,
3282    still =&#62; $works,
3283 };
3284 # The same without fat comma:
3285 my $other_hash = {
3286    'planet', gettext "earth",
3287    'cakes', ngettext "one cake", "several cakes", $n,
3288    'still', $works,
3289 };
3290
3291 # Parentheses are only significant for the first argument.
3292 print dngettext 'package', ("one cake", "several cakes", $n), $discarded;
3293 </PRE>
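
<P>
The difference between a prototyped function and a "greedy" list operator
can be observed directly in Perl.  The following sketch uses two made-up
functions, <CODE>mark_unary</CODE> and <CODE>mark_list</CODE>, which are
stand-ins for illustration and do not call <CODE>gettext</CODE> at all.
</P>

```perl
use strict;
use warnings;

# Hypothetical stand-ins that return their first argument in upper case.
sub mark_unary ($) { return uc $_[0] }   # prototyped: exactly one argument
sub mark_list      { return uc $_[0] }   # unprototyped: a list operator

# The prototyped function takes "earth" only; "extra" stays in the list.
my @with_prototype = (mark_unary "earth", "extra");

# The list operator swallows both strings and returns a single value.
my @without_prototype = (mark_list "earth", "extra");

print scalar (@with_prototype), " ", scalar (@without_prototype), "\n";   # 2 1
```

<P>
<CODE>xgettext</CODE> assumes the first behavior for its keywords, so a
keyword that is really an unprototyped function may be extracted as
expected and still behave differently at runtime.
</P>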
3294
3295
3296
3297 <H4><A NAME="SEC278" HREF="gettext_toc.html#TOC278">13.5.18.8  How To Grok with Long Lines</A></H4>
3298 <P>
3299 <A NAME="IDX1154"></A>
3300
3301 </P>
3302 <P>
3303 The necessity of long messages can often lead to a cumbersome or
3304 unreadable coding style.  Perl has several options that may prevent
3305 you from writing unreadable code, and
3306 <CODE>xgettext</CODE> does its best to do likewise.  This is where the dot
3307 operator (the string concatenation operator) may come in handy:
3308
3309 </P>
3310
3311 <PRE>
3312 print gettext ("This is a very long"
3313                . " message that is still"
3314                . " readable, because"
3315                . " it is split into"
3316                . " multiple lines.\n");
3317 </PRE>
3318
3319 <P>
3320 Perl is smart enough to concatenate these constant string fragments
3321 into one long string at compile time, and so is
3322 <CODE>xgettext</CODE>.  You will only find one long message in the resulting
3323 POT file.
3324
3325 </P>
3326 <P>
3327 Note that the future Perl 6 will probably use the underscore
3328 (<SAMP>`_&acute;</SAMP>) as the string concatenation operator, and the dot
3329 (<SAMP>`.&acute;</SAMP>) for dereferencing.  This new syntax is not yet supported by
3330 <CODE>xgettext</CODE>.
3331
3332 </P>
3333 <P>
3334 If embedded newline characters are not an issue, or even desired, you
3335 may also insert newline characters inside quoted strings wherever you
3336 feel like it:
3337
3338 </P>
3339
3340 <PRE>
3341 print gettext ("&#60;em&#62;In HTML output
3342 embedded newlines are generally no
3343 problem, since adjacent whitespace
3344 is always rendered into a single
3345 space character.&#60;/em&#62;");
3346 </PRE>
3347
3348 <P>
You may also consider using here documents:
3350
3351 </P>
3352
3353 <PRE>
3354 print gettext &#60;&#60;EOF;
3355 &#60;em&#62;In HTML output
3356 embedded newlines are generally no
3357 problem, since adjacent whitespace
3358 is always rendered into a single
3359 space character.&#60;/em&#62;
3360 EOF
3361 </PRE>
3362
3363 <P>
Please do not forget that the line breaks are real, i.e. they
translate into newline characters that will consequently show up in
the resulting POT file.
3367
3368 </P>
3369
3370
3371 <H4><A NAME="SEC279" HREF="gettext_toc.html#TOC279">13.5.18.9  Bugs, Pitfalls, And Things That Do Not Work</A></H4>
3372 <P>
3373 <A NAME="IDX1155"></A>
3374
3375 </P>
3376 <P>
The foregoing sections should have shown that
<CODE>xgettext</CODE> is quite smart in extracting translatable strings from
Perl sources.  Yet some more or less exotic constructs that could be
expected to work actually do not.
3381
3382 </P>
3383 <P>
3384 One of the more relevant limitations can be found in the
3385 implementation of variable interpolation inside quoted strings.  Only
3386 simple hash lookups can be used there:
3387
3388 </P>
3389
<PRE>
print &#60;&#60;EOF;
$gettext{"The dot operator"
          . " does not work"
          . " here!"}
Likewise, you cannot @{[ gettext ("interpolate function calls") ]}
inside quoted strings or quote-like expressions.
EOF
</PRE>
3399
3400 <P>
3401 This is valid Perl code and will actually trigger invocations of the
3402 <CODE>gettext</CODE> function at runtime.  Yet, the Perl parser in
3403 <CODE>xgettext</CODE> will fail to recognize the strings.  A less obvious
3404 example can be found in the interpolation of regular expressions:
3405
3406 </P>
3407
3408 <PRE>
3409 s/&#60;!--START_OF_WEEK--&#62;/gettext ("Sunday")/e;
3410 </PRE>
3411
3412 <P>
3413 The modifier <CODE>e</CODE> will cause the substitution to be interpreted as
3414 an evaluable statement.  Consequently, at runtime the function
3415 <CODE>gettext()</CODE> is called, but again, the parser fails to extract the
3416 string "Sunday".  Use a temporary variable as a simple workaround if
3417 you really happen to need this feature:
3418
3419 </P>
3420
3421 <PRE>
3422 my $sunday = gettext "Sunday";
3423 s/&#60;!--START_OF_WEEK--&#62;/$sunday/;
3424 </PRE>
3425
3426 <P>
3427 Hash slices would also be handy but are not recognized:
3428
3429 </P>
3430
3431 <PRE>
3432 my @weekdays = @gettext{'Sunday', 'Monday', 'Tuesday', 'Wednesday',
3433                         'Thursday', 'Friday', 'Saturday'};
3434 # Or even:
3435 @weekdays = @gettext{qw (Sunday Monday Tuesday Wednesday Thursday
3436                          Friday Saturday) };
3437 </PRE>
3438
3439 <P>
3440 This is perfectly valid usage of the tied hash <CODE>%gettext</CODE> but the
3441 strings are not recognized and therefore will not be extracted.
3442
3443 </P>
3444 <P>
3445 Another caveat of the current version is its rudimentary support for
3446 non-ASCII characters in identifiers.  You may encounter serious
3447 problems if you use identifiers with characters outside the range of
3448 'A'-'Z', 'a'-'z', '0'-'9' and the underscore '_'.
3449
3450 </P>
3451 <P>
3452 Maybe some of these missing features will be implemented in future
3453 versions, but since you can always make do without them at minimal effort,
3454 these todos have very low priority.
3455
3456 </P>
3457 <P>
A nasty problem is posed by brace format strings that already contain braces
as part of the normal text, for example the usage strings typically
encountered in programs:
3461
3462 </P>
3463
3464 <PRE>
3465 die "usage: $0 {OPTIONS} FILENAME...\n";
3466 </PRE>
3467
3468 <P>
3469 If you want to internationalize this code with Perl brace format strings,
3470 you will run into a problem:
3471
3472 </P>
3473
3474 <PRE>
3475 die __x ("usage: {program} {OPTIONS} FILENAME...\n", program =&#62; $0);
3476 </PRE>
3477
3478 <P>
3479 Whereas <SAMP>`{program}&acute;</SAMP> is a placeholder, <SAMP>`{OPTIONS}&acute;</SAMP>
3480 is not and should probably be translated. Yet, there is no way to teach
3481 the Perl parser in <CODE>xgettext</CODE> to recognize the first one, and leave
3482 the other one alone.
3483
3484 </P>
3485 <P>
There are two possible workarounds for this problem.  You can mark the
string as <CODE>no-perl-brace-format</CODE> and use <CODE>printf()</CODE> or
<CODE>sprintf()</CODE> instead, provided that one of two conditions holds:
either you are sure that your program will run under Perl 5.8.0 or newer
(these Perl versions handle positional parameters in <CODE>printf()</CODE>),
or you are sure that the translator will not have to reorder the arguments
in her translation (for example because the string has only one
placeholder, or because it describes a syntax, as this one does):
3493
3494 </P>
3495
<PRE>
# xgettext: no-perl-brace-format
die sprintf (__ ("usage: %s {OPTIONS} FILENAME...\n"), $0);
</PRE>
3500
3501 <P>
If you want to use the more portable Perl brace format, you will have to
put placeholders in place of the literal braces:
3504
3505 </P>
3506
3507 <PRE>
3508 die __x ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
3509          program =&#62; $0, '[' =&#62; '{', ']' =&#62; '}');
3510 </PRE>
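
<P>
To see why the placeholders <SAMP>`{[}&acute;</SAMP> and <SAMP>`{]}&acute;</SAMP>
produce literal braces, consider the following simplified sketch of the
substitution step.  The function <CODE>expand_braces</CODE> is a hypothetical
stand-in written for this example; it does not look up any translation and
is not the actual implementation of <CODE>__x</CODE>.
</P>

```perl
use strict;
use warnings;

# Hypothetical stand-in for the placeholder expansion done by __x:
# every {name} for which an argument was passed is replaced by its value.
sub expand_braces {
    my ($format, %args) = @_;
    my $names = join '|', map { quotemeta ($_) } keys %args;
    $format =~ s/\{($names)\}/$args{$1}/g;
    return $format;
}

my $usage = expand_braces ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
                           program => 'frob', '[' => '{', ']' => '}');

print $usage;   # usage: frob {OPTIONS} FILENAME...
```

<P>
Because the replacement text is not scanned again, the braces that result
from <SAMP>`{[}&acute;</SAMP> and <SAMP>`{]}&acute;</SAMP> are not mistaken
for a new placeholder around <SAMP>`OPTIONS&acute;</SAMP>.
</P>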
3511
3512 <P>
Perl brace format strings have no escaping mechanism.  Whatever such a
mechanism looked like, it would either give the programmer a
hard time, make translating Perl brace format strings heavy going, or
result in a performance penalty at runtime, when the format directives
get executed.  Most of the time you will get along fine with
<CODE>printf()</CODE> for this special case.
3519
3520 </P>
3521
3522
3523 <H3><A NAME="SEC280" HREF="gettext_toc.html#TOC280">13.5.19  PHP Hypertext Preprocessor</A></H3>
3524 <P>
3525 <A NAME="IDX1156"></A>
3526
3527 </P>
3528 <DL COMPACT>
3529
3530 <DT>RPMs
3531 <DD>
3532 mod_php4, mod_php4-core, phpdoc
3533
3534 <DT>File extension
3535 <DD>
3536 <CODE>php</CODE>, <CODE>php3</CODE>, <CODE>php4</CODE>
3537
3538 <DT>String syntax
3539 <DD>
3540 <CODE>"abc"</CODE>, <CODE>'abc'</CODE>
3541
3542 <DT>gettext shorthand
3543 <DD>
3544 <CODE>_("abc")</CODE>
3545
3546 <DT>gettext/ngettext functions
3547 <DD>
3548 <CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>; starting with PHP 4.2.0
3549 also <CODE>ngettext</CODE>, <CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
3550
3551 <DT>textdomain
3552 <DD>
3553 <CODE>textdomain</CODE> function
3554
3555 <DT>bindtextdomain
3556 <DD>
3557 <CODE>bindtextdomain</CODE> function
3558
3559 <DT>setlocale
3560 <DD>
3561 Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>
3562
3563 <DT>Prerequisite
3564 <DD>
3565 ---
3566
3567 <DT>Use or emulate GNU gettext
3568 <DD>
3569 use
3570
3571 <DT>Extractor
3572 <DD>
3573 <CODE>xgettext</CODE>
3574
3575 <DT>Formatting with positions
3576 <DD>
3577 <CODE>printf "%2\$d %1\$d"</CODE>
3578
3579 <DT>Portability
3580 <DD>
3581 On platforms without gettext, the functions are not available.
3582
3583 <DT>po-mode marking
3584 <DD>
3585 ---
3586 </DL>
3587
3588 <P>
3589 An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-php</CODE>.
3590
3591 </P>
3592
3593
3594 <H3><A NAME="SEC281" HREF="gettext_toc.html#TOC281">13.5.20  Pike</A></H3>
3595 <P>
3596 <A NAME="IDX1157"></A>
3597
3598 </P>
3599 <DL COMPACT>
3600
3601 <DT>RPMs
3602 <DD>
3603 roxen
3604
3605 <DT>File extension
3606 <DD>
3607 <CODE>pike</CODE>
3608
3609 <DT>String syntax
3610 <DD>
3611 <CODE>"abc"</CODE>
3612
3613 <DT>gettext shorthand
3614 <DD>
3615 ---
3616
3617 <DT>gettext/ngettext functions
3618 <DD>
3619 <CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>
3620
3621 <DT>textdomain
3622 <DD>
3623 <CODE>textdomain</CODE> function
3624
3625 <DT>bindtextdomain
3626 <DD>
3627 <CODE>bindtextdomain</CODE> function
3628
3629 <DT>setlocale
3630 <DD>
3631 <CODE>setlocale</CODE> function
3632
3633 <DT>Prerequisite
3634 <DD>
3635 <CODE>import Locale.Gettext;</CODE>
3636
3637 <DT>Use or emulate GNU gettext
3638 <DD>
3639 use
3640
3641 <DT>Extractor
3642 <DD>
3643 ---
3644
3645 <DT>Formatting with positions
3646 <DD>
3647 ---
3648
3649 <DT>Portability
3650 <DD>
3651 On platforms without gettext, the functions are not available.
3652
3653 <DT>po-mode marking
3654 <DD>
3655 ---
3656 </DL>
3657
3658
3659
3660 <H3><A NAME="SEC282" HREF="gettext_toc.html#TOC282">13.5.21  GNU Compiler Collection sources</A></H3>
3661 <P>
3662 <A NAME="IDX1158"></A>
3663
3664 </P>
3665 <DL COMPACT>
3666
3667 <DT>RPMs
3668 <DD>
3669 gcc
3670
3671 <DT>File extension
3672 <DD>
<CODE>c</CODE>, <CODE>h</CODE>
3674
3675 <DT>String syntax
3676 <DD>
3677 <CODE>"abc"</CODE>
3678
3679 <DT>gettext shorthand
3680 <DD>
3681 <CODE>_("abc")</CODE>
3682
3683 <DT>gettext/ngettext functions
3684 <DD>
3685 <CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
3686 <CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
3687
3688 <DT>textdomain
3689 <DD>
3690 <CODE>textdomain</CODE> function
3691
3692 <DT>bindtextdomain
3693 <DD>
3694 <CODE>bindtextdomain</CODE> function
3695
3696 <DT>setlocale
3697 <DD>
3698 Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>
3699
3700 <DT>Prerequisite
3701 <DD>
3702 <CODE>#include "intl.h"</CODE>
3703
3704 <DT>Use or emulate GNU gettext
3705 <DD>
use
3707
3708 <DT>Extractor
3709 <DD>
3710 <CODE>xgettext -k_</CODE>
3711
3712 <DT>Formatting with positions
3713 <DD>
3714 ---
3715
3716 <DT>Portability
3717 <DD>
3718 Uses autoconf macros
3719
3720 <DT>po-mode marking
3721 <DD>
3722 yes
3723 </DL>
3724
3725
3726
3727 <H2><A NAME="SEC283" HREF="gettext_toc.html#TOC283">13.6  Internationalizable Data</A></H2>
3728
3729 <P>
3730 Here is a list of other data formats which can be internationalized
3731 using GNU gettext.
3732
3733 </P>
3734
3735
3736
3737 <H3><A NAME="SEC284" HREF="gettext_toc.html#TOC284">13.6.1  POT - Portable Object Template</A></H3>
3738
3739 <DL COMPACT>
3740
3741 <DT>RPMs
3742 <DD>
3743 gettext
3744
3745 <DT>File extension
3746 <DD>
3747 <CODE>pot</CODE>, <CODE>po</CODE>
3748
3749 <DT>Extractor
3750 <DD>
3751 <CODE>xgettext</CODE>
3752 </DL>
3753
3754
3755
3756 <H3><A NAME="SEC285" HREF="gettext_toc.html#TOC285">13.6.2  Resource String Table</A></H3>
3757 <P>
3758 <A NAME="IDX1159"></A>
3759
3760 </P>
3761 <DL COMPACT>
3762
3763 <DT>RPMs
3764 <DD>
3765 fpk
3766
3767 <DT>File extension
3768 <DD>
3769 <CODE>rst</CODE>
3770
3771 <DT>Extractor
3772 <DD>
3773 <CODE>xgettext</CODE>, <CODE>rstconv</CODE>
3774 </DL>
3775
3776
3777
3778 <H3><A NAME="SEC286" HREF="gettext_toc.html#TOC286">13.6.3  Glade - GNOME user interface description</A></H3>
3779
3780 <DL COMPACT>
3781
3782 <DT>RPMs
3783 <DD>
3784 glade, libglade, glade2, libglade2, intltool
3785
3786 <DT>File extension
3787 <DD>
3788 <CODE>glade</CODE>, <CODE>glade2</CODE>
3789
3790 <DT>Extractor
3791 <DD>
3792 <CODE>xgettext</CODE>, <CODE>libglade-xgettext</CODE>, <CODE>xml-i18n-extract</CODE>, <CODE>intltool-extract</CODE>
3793 </DL>
3794
3797 </BODY>
3798 </HTML>