<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
   <meta name="AUTHOR" content="bkoz@redhat.com (Benjamin Kosnik)" />
   <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
   <meta name="DESCRIPTION" content="Notes on the messages implementation." />
   <title>Notes on the messages implementation.</title>
<link rel="StyleSheet" href="../lib3styles.css" type="text/css" />
<link rel="Start" href="../documentation.html" type="text/html"
  title="GNU C++ Standard Library" />
<link rel="Bookmark" href="howto.html" type="text/html" title="Localization" />
<link rel="Copyright" href="../17_intro/license.html" type="text/html" />
<link rel="Help" href="../faq/index.html" type="text/html" title="F.A.Q." />
</head>
<body>
<h1>
Notes on the messages implementation.
</h1>
<em>
prepared by Benjamin Kosnik (bkoz@redhat.com) on August 8, 2001
</em>

<h2>
1. Abstract
</h2>
<p>
The std::messages facet implements message retrieval functionality
equivalent to Java's java.text.MessageFormat, using either GNU gettext
or IEEE 1003.1-200x functions.
</p>

<h2>
2. What the standard says
</h2>
<p>
The std::messages facet is probably the most vaguely defined facet in
the standard library. It's assumed that this facility was built into
the standard library in order to convert string literals from one
locale to another. For instance, it might convert the "C" locale's
<code>const char* c = "please"</code> to a German-localized <code>"bitte"</code>
during program execution.
</p>

<blockquote>
22.2.7.1 - Template class messages [lib.locale.messages]
</blockquote>

<p>
This class has three public member functions, which directly
correspond to three protected virtual member functions.
</p>

<p>
The public member functions are:
</p>

<p>
<code>catalog open(const string&amp;, const locale&amp;) const</code>
</p>

<p>
<code>string_type get(catalog, int, int, const string_type&amp;) const</code>
</p>

<p>
<code>void close(catalog) const</code>
</p>

<p>
While the virtual functions are:
</p>

<p>
<code>catalog do_open(const string&amp;, const locale&amp;) const</code>
</p>
<blockquote>
<em>
-1- Returns: A value that may be passed to get() to retrieve a
message, from the message catalog identified by the string name
according to an implementation-defined mapping. The result can be used
until it is passed to close(). Returns a value less than 0 if no such
catalog can be opened.
</em>
</blockquote>

<p>
<code>string_type do_get(catalog, int, int, const string_type&amp;) const</code>
</p>
<blockquote>
<em>
-3- Requires: A catalog cat obtained from open() and not yet closed.<br />
-4- Returns: A message identified by arguments set, msgid, and dfault,
according to an implementation-defined mapping. If no such message can
be found, returns dfault.
</em>
</blockquote>

<p>
<code>void do_close(catalog) const</code>
</p>
<blockquote>
<em>
-5- Requires: A catalog cat obtained from open() and not yet closed.<br />
-6- Effects: Releases unspecified resources associated with cat.<br />
-7- Notes: The limit on such resources, if any, is implementation-defined.
</em>
</blockquote>

<h2>
3. Problems with &quot;C&quot; messages: thread safety,
over-specification, and assumptions.
</h2>
<p>
A couple of notes on the standard.
</p>

<p>
First, why is <code>messages_base::catalog</code> specified as a typedef
to int? This makes sense for implementations that use
<code>catopen</code>, but not for others. Fortunately, it's not heavily
used, so it is only a minor irritant.
</p>

<p>
Second, by making the member functions <code>const</code>, it is
impossible to save state in them. Thus, storing away information used
in the 'open' member function for use in 'get' is impossible. This is
unfortunate.
</p>

<p>
The 'open' member function in particular seems to be oddly
designed. The signature seems quite peculiar. Why specify a <code>const
string&amp;</code> argument, for instance, instead of just <code>const
char*</code>? Or, why specify a <code>const locale&amp;</code> argument that is
to be used in the 'get' member function? How, exactly, is this locale
argument useful? What was the intent? It might make sense if a locale
argument were associated with a given default message string in the
'open' member function, for instance. Quite murky and unclear, on
reflection.
</p>

<p>
Lastly, it seems odd that messages, which explicitly require code
conversion, don't use the codecvt facet. Because the messages facet
has only one template parameter, it is assumed that ctype, and not
codecvt, is to be used to convert between character sets.
</p>

<p>
It is implicitly assumed that the default message string passed to
'get' is in the "C" locale. Thus, all source code is assumed
to be written in English, so translations are always from "en_US" to
other, explicitly named locales.
</p>

<h2>
4. Design and Implementation Details
</h2>
<p>
This is a relatively simple class, on the face of it. The standard
specifies very little in concrete terms, so generic implementations
that are conforming yet do very little are the norm. Adding
functionality that would be useful to programmers and comparable to
Java's java.text.MessageFormat takes a bit of work, and is highly
dependent on the capabilities of the underlying operating system.
</p>

<p>
Three different mechanisms have been provided, selectable via
configure flags:
</p>

<ul>
   <li> generic
   <p>
   This model does very little, and is what is used by default.
   </p>
   </li>

   <li> gnu
   <p>
   The gnu model is complete and fully tested. It's based on the
   GNU gettext package, which is part of glibc. It uses the functions
   <code>textdomain, bindtextdomain, gettext</code>
   to implement full functionality. Creating message
   catalogs is a relatively straightforward process and is
   lightly documented below, and fully documented in gettext's
   distributed documentation.
   </p>
   </li>

   <li> ieee_1003.1-200x
   <p>
   This is a complete, though untested, implementation based on
   the IEEE standard. The functions
   <code>catopen, catgets, catclose</code>
   are used to retrieve locale-specific messages given the
   appropriate message catalogs that have been constructed for
   their use. Note that the script <code>po2msg.sed</code>, which is part
   of the gettext distribution, can convert gettext catalogs into
   catalogs that <code>catopen</code> can use.
   </p>
   </li>
</ul>

<p>
A new, standards-conformant non-virtual member function signature was
added for 'open' so that a directory could be specified with a given
message catalog. This simplifies calling conventions for the gnu
model.
</p>

<p>
The rest of this document discusses details of the GNU model.
</p>

<p>
The messages facet, because it is retrieving and converting between
character sets, depends on the ctype and perhaps the codecvt facet in
a given locale. In addition, underlying "C" library locale support is
necessary for more than just the <code>LC_MESSAGES</code> mask:
<code>LC_CTYPE</code> is also necessary. To avoid any unpleasantness, all
bits of the "C" mask (i.e., <code>LC_ALL</code>) are set before retrieving
messages.
</p>

<p>
Making the message catalogs can be initially tricky, but becomes quite
simple with practice. For complete info, see the gettext
documentation. Here's an idea of what is required:
</p>

<ul>
   <li> Make a source file with the required string literals
   that need to be translated. See
   <code>intl/string_literals.cc</code> for an example.
   </li>

   <li> Make the initial catalog (see "4 Making the PO Template File"
   from the gettext docs).
   <p>
   <code>xgettext --c++ --debug string_literals.cc -o libstdc++.pot</code>
   </p>
   </li>

   <li> Make language and country-specific locale catalogs.
   <p>
   <code>cp libstdc++.pot fr_FR.po</code>
   </p>
   <p>
   <code>cp libstdc++.pot de_DE.po</code>
   </p>
   </li>

   <li> Edit the localized catalogs in emacs so that the strings are
   translated.
   <p>
   <code>emacs fr_FR.po</code>
   </p>
   </li>

   <li> Make the binary mo files.
   <p>
   <code>msgfmt fr_FR.po -o fr_FR.mo</code>
   </p>
   <p>
   <code>msgfmt de_DE.po -o de_DE.mo</code>
   </p>
   </li>

   <li> Copy the binary files into the correct directory structure.
   <p>
   <code>cp fr_FR.mo (dir)/fr_FR/LC_MESSAGES/libstdc++-v3.mo</code>
   </p>
   <p>
   <code>cp de_DE.mo (dir)/de_DE/LC_MESSAGES/libstdc++-v3.mo</code>
   </p>
   </li>

   <li> Use the new message catalogs.
   <p>
   <code>locale loc_de("de_DE");</code>
   </p>
   <p>
   <code>
   use_facet&lt;messages&lt;char&gt; &gt;(loc_de).open("libstdc++", locale(), dir);
   </code>
   </p>
   </li>
</ul>

<h2>
5. Examples
</h2>

<ul>
   <li> Message conversion: a simple example using the GNU model.

<pre>
#include &lt;iostream&gt;
#include &lt;locale&gt;
#include &lt;string&gt;
using namespace std;

void test01()
{
  typedef messages&lt;char&gt;::catalog catalog;
  // Directory holding the installed catalogs; adjust for your build tree.
  const char* dir =
  "/mnt/egcs/build/i686-pc-linux-gnu/libstdc++-v3/po/share/locale";
  const locale loc_de("de_DE");
  const messages&lt;char&gt;&amp; mssg_de = use_facet&lt;messages&lt;char&gt; &gt;(loc_de);

  // The three-argument open is the libstdc++ extension described in section 4.
  catalog cat_de = mssg_de.open("libstdc++", loc_de, dir);
  string s01 = mssg_de.get(cat_de, 0, 0, "please");
  string s02 = mssg_de.get(cat_de, 0, 0, "thank you");
  cout &lt;&lt; "please in german: " &lt;&lt; s01 &lt;&lt; '\n';
  cout &lt;&lt; "thank you in german: " &lt;&lt; s02 &lt;&lt; '\n';
  mssg_de.close(cat_de);
}

int main()
{
  test01();
  return 0;
}
</pre>
   </li>
</ul>

<p>
More information can be found in the following testcases:
</p>
<ul>
<li> testsuite/22_locale/messages.cc </li>
<li> testsuite/22_locale/messages_byname.cc </li>
<li> testsuite/22_locale/messages_char_members.cc </li>
</ul>

<h2>
6. Unresolved Issues
</h2>
<ul>
<li> Things that are sketchy, or remain unimplemented:
   <ul>
      <li> _M_convert_from_char and _M_convert_to_char are in
      flux, depending on how the library ends up doing
      character set conversions. It might not be possible to
      do a real character set based conversion, because
      the single template parameter for messages is not
      enough to instantiate the codecvt facet (1 supplied,
      need at least 2 but would prefer 3).
      </li>

      <li> gettext needs the global locale to be set in order
      to extract a message. This dependence on
      the global locale makes the current "gnu" model
      non-MT-safe. Future versions of glibc, i.e., glibc 2.3.x, will
      fix this, and the C++ library bits are already in
      place.
      </li>
   </ul>
</li>

<li> Development versions of the GNU "C" library, glibc 2.3, will allow
   a more efficient, MT-safe implementation of std::messages, and will
   allow the removal of the _M_name_messages data member. If this
   is done, it will change the library ABI. The C++ parts to
   support glibc 2.3 have already been coded, but are not in use:
   once this version of the "C" library is released, the marked
   parts of the messages implementation can be switched over to
   the new "C" library functionality.
</li>
<li> At some point in the near future, std::numpunct will probably use
   std::messages facilities to implement truename/falsename
   correctly. This is currently not done, but entries in
   libstdc++.pot have already been made for the "true" and "false"
   string literals, so all that remains is the std::numpunct
   coding and the configure/make hassles to make the installed
   library search its own catalog. Currently the libstdc++.mo
   catalog is only searched for the testsuite cases involving
   messages members.
</li>

<li> In the "gnu" model, the following member functions do not
   actually return a "value less than 0 if no such catalog
   can be opened", as required by the standard:

   <p>
   <code>
   catalog
   open(const basic_string&lt;char&gt;&amp; __s, const locale&amp; __loc) const
   </code>
   </p>

   <p>
   <code>
   catalog
   open(const basic_string&lt;char&gt;&amp;, const locale&amp;, const char*) const;
   </code>
   </p>

   <p>
   As of this writing, it is unknown how to query whether
   a specified message catalog exists using the gettext
   package.
   </p>
</li>
</ul>

<h2>
7. Acknowledgments
</h2>
<p>
Ulrich Drepper for the character set explanations, gettext details,
and patient answering of late-night questions; Tom Tromey for the Java details.
</p>

<h2>
8. Bibliography / Referenced Documents
</h2>

<p>
Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapter
&quot;7 Locales and Internationalization&quot;.
</p>

<p>
Drepper, Ulrich, Thread-Aware Locale Model, A proposal. This is a
draft document describing the design of glibc 2.3 MT locale
functionality.
</p>

<p>
Drepper, Ulrich, Numerous, late-night email correspondence.
</p>

<p>
ISO/IEC 9899:1999 Programming languages - C
</p>

<p>
ISO/IEC 14882:1998 Programming languages - C++
</p>

<p>
Java 2 Platform, Standard Edition, v 1.3.1 API Specification. In
particular, java.util.Properties, java.text.MessageFormat,
java.util.Locale, java.util.ResourceBundle.
http://java.sun.com/j2se/1.3/docs/api
</p>

<p>
System Interface Definitions, Issue 7 (IEEE Std. 1003.1-200x)
The Open Group/The Institute of Electrical and Electronics Engineers, Inc.
In particular see lines 5268-5427.
http://www.opennc.org/austin/docreg.html
</p>

<p> GNU gettext tools, version 0.10.38, Native Language Support
Library and Tools.
http://sources.redhat.com/gettext
</p>

<p>
Langer, Angelika and Klaus Kreft, Standard C++ IOStreams and Locales,
Advanced Programmer's Guide and Reference, Addison Wesley Longman,
Inc. 2000. See page 725, Internationalized Messages.
</p>

<p>
Stroustrup, Bjarne, Appendix D, The C++ Programming Language, Special Edition, Addison Wesley, Inc. 2000
</p>

</body>
</html>