/*
 * This file is part of gtkD.
 *
 * gtkD is free software; you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or
 * (at your option) any later version.
 *
 * gtkD is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with gtkD; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */

// generated automatically - do not change
// find conversion definition on APILookup.txt
// implement new conversion functionalities on the wrap.utils package

/*
 * Conversion parameters:
 * inFile  = glib-Character-Set-Conversion.html
 * outFile = CharacterSet
 * 	- g_convert_with_iconv
 */

module glib.CharacterSet;

import tango.io.Stdout;	// use the tango logging?

private import gtkc.glibtypes;
private import gtkc.glib;

private import glib.ErrorG;
private import glib.Str;

/**
 * Historically, Unix has not had a defined encoding for file
 * names: a file name is valid as long as it does not have path
 * separators in it ("/"). However, displaying file names may
 * require conversion: from the character set in which they were
 * created, to the character set in which the application
 * operates. Consider the Spanish file name
 * "Presentación.sxi". If the
 * application which created it uses ISO-8859-1 for its encoding,
 * then the actual file name on disk would look like this:
 *
 * Character:  P  r  e  s  e  n  t  a  c  i  ó  n  .  s  x  i
 * Hex code:   50 72 65 73 65 6e 74 61 63 69 f3 6e 2e 73 78 69
 *
 * However, if the application uses UTF-8, the actual file name on
 * disk would look like this:
 *
 * Character:  P  r  e  s  e  n  t  a  c  i  ó     n  .  s  x  i
 * Hex code:   50 72 65 73 65 6e 74 61 63 69 c3 b3 6e 2e 73 78 69
 *
 * GLib uses UTF-8 for its strings, and GUI toolkits like GTK+
 * that use GLib do the same thing. If you get a file name from
 * the file system, for example, from
 * readdir(3) or from g_dir_read_name(),
 * and you wish to display the file name to the user, you
 * will need to convert it into UTF-8. The
 * opposite case is when the user types the name of a file they
 * wish to save: the toolkit will give you that string in
 * UTF-8 encoding, and you will need to convert it to the
 * character set used for file names before you can create the
 * file with open(2) or fopen(3).
 *
 * By default, GLib assumes that file names on disk are in UTF-8
 * encoding. This is a valid assumption for file systems which
 * were created relatively recently: most applications use UTF-8
 * encoding for their strings, and that is also what they use for
 * the file names they create. However, older file systems may
 * still contain file names created in "older" encodings, such as
 * ISO-8859-1. In this case, for compatibility reasons, you may
 * want to instruct GLib to use that particular encoding for file
 * names rather than UTF-8. You can do this by specifying the
 * encoding for file names in the G_FILENAME_ENCODING
 * environment variable. For example, if your installation uses
 * ISO-8859-1 for file names, you can set:
 *
 * export G_FILENAME_ENCODING=ISO-8859-1
 *
 * GLib provides the functions g_filename_to_utf8()
 * and g_filename_from_utf8()
 * to perform the necessary conversions. These functions convert
 * file names from the encoding specified in
 * G_FILENAME_ENCODING to UTF-8 and vice-versa.
 * Figure 1, Conversion between File Name Encodings, illustrates how
 * these functions are used to convert between UTF-8 and the
 * encoding for file names in the file system.
 *
 * Figure 1. Conversion between File Name Encodings
 *
 * Checklist for Application Writers
 *
 * This section is a practical summary of the detailed
 * description above. You can use this as a checklist of
 * things to do to make sure your applications process file
 * name encodings correctly.
 *
 * - If you get a file name from the file system from a
 *   function such as readdir(3) or
 *   gtk_file_chooser_get_filename(),
 *   you do not need to do any conversion to pass that
 *   file name to functions like open(2) or
 *   fopen(3): those are "raw"
 *   file names which the file system understands.
 *
 * - If you need to display a file name, convert it to UTF-8
 *   first by using g_filename_to_utf8().
 *   If conversion fails, display a string like
 *   "Unknown file name". Do
 *   not convert this string back into the
 *   encoding used for file names if you wish to pass it to
 *   the file system; use the original file name instead.
 *   For example, the document window of a word processor
 *   could display "Unknown file name" in its title bar but
 *   still let the user save the file, as it would keep the
 *   raw file name internally. This can happen if the user
 *   has not set the G_FILENAME_ENCODING
 *   environment variable even though they have files whose
 *   names are not encoded in UTF-8.
 *
 * - If your user interface lets the user type a file name
 *   for saving or renaming, convert it to the encoding used
 *   for file names in the file system by using g_filename_from_utf8().
 *   Pass the converted file name to functions like
 *   fopen(3). If conversion fails, ask
 *   the user to enter a different file name. This can
 *   happen if the user types Japanese characters when
 *   G_FILENAME_ENCODING is set to
 *   ISO-8859-1, for example.
 */
public class CharacterSet
{

	/**
	 * Converts a string from one character set to another.
	 * Note that you should use g_iconv() for streaming
	 * conversions.
	 * str:
	 *  the string to convert
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated.
	 * to_codeset:
	 *  name of character set into which to convert str
	 * from_codeset:
	 *  character set of str.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  If the conversion was successful, a newly allocated
	 *  nul-terminated string, which must be freed with
	 *  g_free(). Otherwise NULL and error will be set.
	 */
	public static char[] convert(char[] str, int len, char[] toCodeset, char[] fromCodeset, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_convert (const gchar *str, gssize len, const gchar *to_codeset, const gchar *from_codeset, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_convert(Str.toStringz(str), len, Str.toStringz(toCodeset), Str.toStringz(fromCodeset), bytesRead, bytesWritten, error) );
	}
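
	// Illustrative sketch, not part of the generated wrapper: the
	// "Presentación.sxi" example from the module description, run through
	// CharacterSet.convert(). Assumptions: the GLib library has been loaded,
	// its iconv backend knows both "UTF-8" and "ISO-8859-1", and the code is
	// compiled as D1 like the rest of this file (string literals are char[]).
	unittest
	{
		GError* err = null;
		char[] utf8Name = "Presentaci\u00f3n.sxi";	// the o-acute is stored as c3 b3
		char[] latin1Name = CharacterSet.convert(utf8Name, -1, "ISO-8859-1", "UTF-8", null, null, &err);

		assert(err is null);
		assert(utf8Name.length == 17);
		assert(latin1Name.length == 16);	// the single byte f3 replaces c3 b3
		assert(latin1Name[10] == 0xF3);
	}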

	/**
	 * Converts a string from one character set to another, possibly
	 * including fallback sequences for characters not representable
	 * in the output. Note that it is not guaranteed that the specification
	 * for the fallback sequences in fallback will be honored. Some
	 * systems may do an approximate conversion from from_codeset
	 * to to_codeset in their iconv() functions,
	 * in which case GLib will simply return that approximate conversion.
	 * Note that you should use g_iconv() for streaming
	 * conversions.
	 * str:
	 *  the string to convert
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated.
	 * to_codeset:
	 *  name of character set into which to convert str
	 * from_codeset:
	 *  character set of str.
	 * fallback:
	 *  UTF-8 string to use in place of characters not
	 *  present in the target encoding. (The string must be
	 *  representable in the target encoding).
	 *  If NULL, characters not in the target encoding will
	 *  be represented as Unicode escapes \uxxxx or \Uxxxxyyyy.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  If the conversion was successful, a newly allocated
	 *  nul-terminated string, which must be freed with
	 *  g_free(). Otherwise NULL and error will be set.
	 */
	public static char[] convertWithFallback(char[] str, int len, char[] toCodeset, char[] fromCodeset, char[] fallback, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_convert_with_fallback (const gchar *str, gssize len, const gchar *to_codeset, const gchar *from_codeset, gchar *fallback, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_convert_with_fallback(Str.toStringz(str), len, Str.toStringz(toCodeset), Str.toStringz(fromCodeset), Str.toStringz(fallback), bytesRead, bytesWritten, error) );
	}

	/**
	 * Converts a string which is in the encoding used for strings by
	 * the C runtime (usually the same as that used by the operating
	 * system) in the current locale into a
	 * UTF-8 string.
	 * opsysstring:
	 *  a string in the encoding of the current locale. On Windows
	 *  this means the system codepage.
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] localeToUtf8(char[] opsysstring, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_locale_to_utf8 (const gchar *opsysstring, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_locale_to_utf8(Str.toStringz(opsysstring), len, bytesRead, bytesWritten, error) );
	}

	/**
	 * Converts a string which is in the encoding used by GLib for
	 * filenames into a UTF-8 string. Note that on Windows GLib uses UTF-8
	 * for filenames; on other platforms, this function indirectly depends on
	 * the current locale.
	 * opsysstring:
	 *  a string in the encoding for filenames
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] filenameToUtf8(char[] opsysstring, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_filename_to_utf8 (const gchar *opsysstring, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_filename_to_utf8(Str.toStringz(opsysstring), len, bytesRead, bytesWritten, error) );
	}
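
	// Illustrative sketch, not part of the generated wrapper: the "display a
	// file name" item of the checklist in the module description. `rawName`
	// is a hypothetical stand-in for bytes obtained from the file system, and
	// g_error_free() is assumed to be exposed by gtkc.glib. On failure only
	// the displayed text changes; the raw name is kept for open(2)/fopen(3).
	unittest
	{
		char[] rawName = "report.txt";
		GError* err = null;
		char[] shown = CharacterSet.filenameToUtf8(rawName, -1, null, null, &err);

		if ( err !is null )
		{
			g_error_free(err);
			shown = "Unknown file name";	// for display only, never converted back
		}

		assert(shown.length > 0);
	}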

	/**
	 * Converts a string from UTF-8 to the encoding GLib uses for
	 * filenames. Note that on Windows GLib uses UTF-8 for filenames;
	 * on other platforms, this function indirectly depends on the
	 * current locale.
	 * utf8string:
	 *  a UTF-8 encoded string.
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] filenameFromUtf8(char[] utf8string, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_filename_from_utf8 (const gchar *utf8string, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_filename_from_utf8(Str.toStringz(utf8string), len, bytesRead, bytesWritten, error) );
	}
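
	// Illustrative sketch, not part of the generated wrapper: the "user types
	// a file name" item of the checklist in the module description.
	// `typedName` is a hypothetical stand-in for UTF-8 text coming from the
	// user interface, and g_error_free() is assumed to be exposed by
	// gtkc.glib. A failed conversion means the name cannot be represented in
	// the file name encoding, so the user should be asked for another one.
	unittest
	{
		char[] typedName = "notes.txt";
		GError* err = null;
		char[] onDiskName = CharacterSet.filenameFromUtf8(typedName, -1, null, null, &err);

		if ( err !is null )
		{
			// For example Japanese text while G_FILENAME_ENCODING=ISO-8859-1.
			g_error_free(err);
			return;
		}

		// onDiskName is what would be handed to functions such as fopen(3).
		assert(onDiskName.length > 0);
	}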

	/**
	 * Converts an escaped ASCII-encoded URI to a local filename in the
	 * encoding used for filenames.
	 * uri:
	 *  a uri describing a filename (escaped, encoded in ASCII).
	 * hostname:
	 *  Location to store hostname for the URI, or NULL.
	 *  If there is no hostname in the URI, NULL will be
	 *  stored in this location.
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  a newly-allocated string holding the resulting
	 *  filename, or NULL on an error.
	 */
	public static char[] filenameFromUri(char[] uri, char** hostname, GError** error)
	{
		// gchar* g_filename_from_uri (const gchar *uri, gchar **hostname, GError **error);
		return Str.toString(g_filename_from_uri(Str.toStringz(uri), hostname, error) );
	}

	/**
	 * Converts an absolute filename to an escaped ASCII-encoded URI, with the path
	 * component following Section 3.3. of RFC 2396.
	 * filename:
	 *  an absolute filename specified in the GLib file name encoding,
	 *  which is the on-disk file name bytes on Unix, and UTF-8 on
	 *  Windows
	 * hostname:
	 *  A UTF-8 encoded hostname, or NULL for none.
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  a newly-allocated string holding the resulting
	 *  URI, or NULL on an error.
	 */
	public static char[] filenameToUri(char[] filename, char[] hostname, GError** error)
	{
		// gchar* g_filename_to_uri (const gchar *filename, const gchar *hostname, GError **error);
		return Str.toString(g_filename_to_uri(Str.toStringz(filename), Str.toStringz(hostname), error) );
	}

	/**
	 * Determines the preferred character sets used for filenames.
	 * The first character set from the charsets is the filename encoding, the
	 * subsequent character sets are used when trying to generate a displayable
	 * representation of a filename, see g_filename_display_name().
	 * On Unix, the character sets are determined by consulting the
	 * environment variables G_FILENAME_ENCODING and
	 * G_BROKEN_FILENAMES. On Windows, the character set
	 * used in the GLib API is always UTF-8 and said environment variables
	 * have no effect.
	 * G_FILENAME_ENCODING may be set to a comma-separated list
	 * of character set names. The special token "@locale" is taken to
	 * mean the character set for the current
	 * locale. If G_FILENAME_ENCODING is not set, but
	 * G_BROKEN_FILENAMES is, the character set of the current
	 * locale is taken as the filename encoding. If neither environment variable
	 * is set, UTF-8 is taken as the filename encoding, but the character
	 * set of the current locale is also put in the list of encodings.
	 * The returned charsets belong to GLib and must not be freed.
	 * Note that on Unix, regardless of the locale character set or
	 * G_FILENAME_ENCODING value, the actual file names present
	 * on a system might be in any random encoding or just gibberish.
	 * charsets:
	 *  return location for the NULL-terminated list of encoding names
	 * Returns:
	 *  TRUE if the filename encoding is UTF-8.
	 */
	public static int getFilenameCharsets(char*** charsets)
	{
		// gboolean g_get_filename_charsets (G_CONST_RETURN gchar ***charsets);
		return g_get_filename_charsets(charsets);
	}

	/**
	 * Converts a filename into a valid UTF-8 string. The conversion is
	 * not necessarily reversible, so you should keep the original around
	 * and use the return value of this function only for display purposes.
	 * Unlike g_filename_to_utf8(), the result is guaranteed to be non-NULL
	 * even if the filename actually isn't in the GLib file name encoding.
	 * If GLib cannot make sense of the encoding of filename, as a last resort it
	 * replaces unknown characters with U+FFFD, the Unicode replacement character.
	 * You can search the result for the UTF-8 encoding of this character (which is
	 * "\357\277\275" in octal notation) to find out if filename was in an invalid
	 * encoding.
	 * If you know the whole pathname of the file you should use
	 * g_filename_display_basename(), since that allows location-based
	 * translation of filenames.
	 * filename:
	 *  a pathname hopefully in the GLib file name encoding
	 * Returns:
	 *  a newly allocated string containing
	 *  a rendition of the filename in valid UTF-8
	 */
	public static char[] filenameDisplayName(char[] filename)
	{
		// gchar* g_filename_display_name (const gchar *filename);
		return Str.toString(g_filename_display_name(Str.toStringz(filename)) );
	}
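
	// Illustrative sketch, not part of the generated wrapper: searching the
	// display name for U+FFFD, whose UTF-8 encoding "\357\277\275" (octal) is
	// written below as the hex escapes ef bf bd. `rawName` is a hypothetical
	// ISO-8859-1 file name. Whether a replacement character actually appears
	// depends on the locale and on G_FILENAME_ENCODING, so the flag is only
	// computed here, not asserted.
	unittest
	{
		char[] rawName = "caf\xE9.txt";	// e9 alone is not valid UTF-8
		char[] shown = CharacterSet.filenameDisplayName(rawName);

		bool hasReplacement = false;
		for ( size_t i = 0; i + 3 <= shown.length; i++ )
		{
			if ( shown[i .. i + 3] == "\xEF\xBF\xBD" )
			{
				hasReplacement = true;	// rawName was not in a known encoding
				break;
			}
		}

		assert(shown.length > 0);	// the result is documented as never NULL
	}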

	/**
	 * Returns the display basename for the particular filename, guaranteed
	 * to be valid UTF-8. The display name might not be identical to the filename,
	 * for instance there might be problems converting it to UTF-8, and some files
	 * can be translated in the display.
	 * If GLib cannot make sense of the encoding of filename, as a last resort it
	 * replaces unknown characters with U+FFFD, the Unicode replacement character.
	 * You can search the result for the UTF-8 encoding of this character (which is
	 * "\357\277\275" in octal notation) to find out if filename was in an invalid
	 * encoding.
	 * You must pass the whole absolute pathname to this function so that
	 * translation of well known locations can be done.
	 * This function is preferred over g_filename_display_name() if you know the
	 * whole path, as it allows translation.
	 * filename:
	 *  an absolute pathname in the GLib file name encoding
	 * Returns:
	 *  a newly allocated string containing
	 *  a rendition of the basename of the filename in valid UTF-8
	 */
	public static char[] filenameDisplayBasename(char[] filename)
	{
		// gchar* g_filename_display_basename (const gchar *filename);
		return Str.toString(g_filename_display_basename(Str.toStringz(filename)) );
	}

	/**
	 * Splits a URI list conforming to the text/uri-list
	 * mime type defined in RFC 2483 into individual URIs,
	 * discarding any comments. The URIs are not validated.
	 * uri_list:
	 *  a URI list
	 * Returns:
	 *  a newly allocated NULL-terminated list of
	 *  strings holding the individual URIs. The array should
	 *  be freed with g_strfreev().
	 */
	public static char** uriListExtractUris(char[] uriList)
	{
		// gchar** g_uri_list_extract_uris (const gchar *uri_list);
		return g_uri_list_extract_uris(Str.toStringz(uriList));
	}
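
	// Illustrative sketch, not part of the generated wrapper: walking the
	// NULL-terminated array returned by uriListExtractUris() and releasing it
	// afterwards. The URI list is a made-up sample, and g_strfreev() is
	// assumed to be exposed by gtkc.glib. The "#" line is a comment in the
	// text/uri-list format and is therefore discarded.
	unittest
	{
		char[] list = "# a comment\r\nfile:///tmp/a.txt\r\nfile:///tmp/b.txt\r\n";
		char** uris = CharacterSet.uriListExtractUris(list);

		int count = 0;
		for ( char** p = uris; *p !is null; p++ )
		{
			count++;	// Str.toString(*p) would copy this URI into a D string
		}

		g_strfreev(uris);
		assert(count == 2);
	}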

	/**
	 * Converts a string from UTF-8 to the encoding used for strings by
	 * the C runtime (usually the same as that used by the operating
	 * system) in the current locale.
	 * utf8string:
	 *  a UTF-8 encoded string
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] localeFromUtf8(char[] utf8string, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_locale_from_utf8 (const gchar *utf8string, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_locale_from_utf8(Str.toStringz(utf8string), len, bytesRead, bytesWritten, error) );
	}

	/**
	 * Obtains the character set for the current
	 * locale; you might use this character set as an argument to
	 * g_convert(), to convert from the current locale's encoding to some
	 * other encoding. (Frequently g_locale_to_utf8() and g_locale_from_utf8()
	 * are nice shortcuts, though.)
	 * The return value is TRUE if the locale's encoding is UTF-8; in that
	 * case you can perhaps avoid calling g_convert().
	 * The string returned in charset is not allocated, and should not be
	 * freed.
	 * charset:
	 *  return location for character set name
	 * Returns:
	 *  TRUE if the returned charset is UTF-8
	 *
	 * Note that some encodings may allow nul bytes to
	 * occur inside strings. In that case, using -1 for
	 * the len parameter is unsafe.
	 *
	 * Despite the fact that bytes_read can return information about partial
	 * characters, the g_convert_... functions
	 * are not generally suitable for streaming. If the underlying converter
	 * being used maintains internal state, then this won't be preserved
	 * across successive calls to g_convert(), g_convert_with_iconv() or
	 * g_convert_with_fallback(). (An example of this is the GNU C converter
	 * for CP1255 which does not emit a base character until it knows that
	 * the next character is not a mark that could combine with the base
	 * character.)
	 */
	public static int getCharset(char** charset)
	{
		// gboolean g_get_charset (G_CONST_RETURN char **charset);
		return g_get_charset(charset);
	}
}