/*
 * This file is part of gtkD.
 *
 * gtkD is free software; you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or
 * (at your option) any later version.
 *
 * gtkD is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with gtkD; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */

// generated automatically - do not change
// find conversion definition on APILookup.txt
// implement new conversion functionalities on the wrap.utils package

/*
 * Conversion parameters:
 * inFile  = glib-Character-Set-Conversion.html
 * outPack = glib
 * outFile = CharacterSet
 * strct   =
 * realStrct=
 * ctorStrct=
 * clss    = CharacterSet
 * interf  =
 * class Code: No
 * interface Code: No
 * template for:
 * extend  =
 * implements:
 * prefixes:
 * 	- g_
 * omit structs:
 * omit prefixes:
 * 	- g_convert_with_iconv
 * 	- g_iconv_open
 * 	- g_iconv
 * 	- g_iconv_close
 * omit code:
 * imports:
 * 	- glib.ErrorG
 * 	- glib.Str
 * structWrap:
 * module aliases:
 * local aliases:
 */

module glib.CharacterSet;

version(noAssert)
{
	version(Tango)
	{
		import tango.io.Stdout;	// use the tango logging?
	}
}

private import gtkc.glibtypes;

private import gtkc.glib;


private import glib.ErrorG;
private import glib.Str;

/**
 * Description
 *
 * File Name Encodings
 * Historically, Unix has not had a defined encoding for file
 * names: a file name is valid as long as it does not have path
 * separators in it ("/"). However, displaying file names may
 * require conversion: from the character set in which they were
 * created, to the character set in which the application
 * operates. Consider the Spanish file name
 * "Presentación.sxi". If the
 * application which created it uses ISO-8859-1 for its encoding,
 * then the actual file name on disk would look like this:
 * Character: P  r  e  s  e  n  t  a  c  i  ó  n  .  s  x  i
 * Hex code:  50 72 65 73 65 6e 74 61 63 69 f3 6e 2e 73 78 69
 * However, if the application uses UTF-8, the actual file name on
 * disk would look like this:
 * Character: P  r  e  s  e  n  t  a  c  i  ó     n  .  s  x  i
 * Hex code:  50 72 65 73 65 6e 74 61 63 69 c3 b3 6e 2e 73 78 69
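 * The two hex dumps above differ only in how "ó" is stored. The C sketch
 * below shows why: a minimal hand-written ISO-8859-1 to UTF-8 conversion,
 * using a hypothetical helper named latin1_to_utf8 that is not part of GLib
 * or gtkD. Real code would call g_convert() or g_filename_to_utf8() instead.

```c
#include <stddef.h>

// Hypothetical helper for illustration only (not a GLib or gtkD function).
// Converts len bytes of ISO-8859-1 text to UTF-8.
// out must have room for 2 * len + 1 bytes. Returns the number of bytes written.
static size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            // ASCII bytes are identical in both encodings.
            out[n++] = in[i];
        } else {
            // Code points 0x80..0xFF become a two-byte UTF-8 sequence.
            out[n++] = (unsigned char)(0xC0 | (in[i] >> 6));    // lead byte
            out[n++] = (unsigned char)(0x80 | (in[i] & 0x3F));  // continuation byte
        }
    }
    out[n] = '\0';
    return n;
}
```

 * Applied to the 16 ISO-8859-1 bytes shown above, the single byte f3 becomes
 * the pair c3 b3, which gives the 17-byte UTF-8 file name.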
 * GLib uses UTF-8 for its strings, and GUI toolkits like GTK+
 * that use GLib do the same thing. If you get a file name from
 * the file system, for example, from
 * readdir(3) or from g_dir_read_name(),
 * and you wish to display the file name to the user, you
 * will need to convert it into UTF-8. The
 * opposite case is when the user types the name of a file he
 * wishes to save: the toolkit will give you that string in
 * UTF-8 encoding, and you will need to convert it to the
 * character set used for file names before you can create the
 * file with open(2) or
 * fopen(3).
 * By default, GLib assumes that file names on disk are in UTF-8
 * encoding. This is a valid assumption for file systems which
 * were created relatively recently: most applications use UTF-8
 * encoding for their strings, and that is also what they use for
 * the file names they create. However, older file systems may
 * still contain file names created in "older" encodings, such as
 * ISO-8859-1. In this case, for compatibility reasons, you may
 * want to instruct GLib to use that particular encoding for file
 * names rather than UTF-8. You can do this by specifying the
 * encoding for file names in the G_FILENAME_ENCODING
 * environment variable. For example, if your installation uses
 * ISO-8859-1 for file names, you can put this in your
 * ~/.profile:
 * export G_FILENAME_ENCODING=ISO-8859-1
 * GLib provides the functions g_filename_to_utf8()
 * and g_filename_from_utf8()
 * to perform the necessary conversions. These functions convert
 * file names from the encoding specified in
 * G_FILENAME_ENCODING to UTF-8 and vice-versa.
 * Figure 1, Conversion between File Name Encodings, illustrates how
 * these functions are used to convert between UTF-8 and the
 * encoding for file names in the file system.
 * Figure 1. Conversion between File Name Encodings
 *
 * Checklist for Application Writers
 * This section is a practical summary of the detailed
 * description above. You can use this as a checklist of
 * things to do to make sure your applications process file
 * name encodings correctly.
 * 	- If you get a file name from the file system from a
 * 	  function such as readdir(3) or
 * 	  gtk_file_chooser_get_filename(),
 * 	  you do not need to do any conversion to pass that
 * 	  file name to functions like open(2),
 * 	  rename(2), or
 * 	  fopen(3); those are "raw"
 * 	  file names which the file system understands.
 * 	- If you need to display a file name, convert it to UTF-8
 * 	  first by using g_filename_to_utf8().
 * 	  If conversion fails, display a string like
 * 	  "Unknown file name". Do
 * 	  not convert this string back into the
 * 	  encoding used for file names if you wish to pass it to
 * 	  the file system; use the original file name instead.
 * 	  For example, the document window of a word processor
 * 	  could display "Unknown file name" in its title bar but
 * 	  still let the user save the file, as it would keep the
 * 	  raw file name internally. This can happen if the user
 * 	  has not set the G_FILENAME_ENCODING
 * 	  environment variable even though he has files whose
 * 	  names are not encoded in UTF-8.
 * 	- If your user interface lets the user type a file name
 * 	  for saving or renaming, convert it to the encoding used
 * 	  for file names in the file system by using g_filename_from_utf8().
 * 	  Pass the converted file name to functions like
 * 	  fopen(3). If conversion fails, ask
 * 	  the user to enter a different file name. This can
 * 	  happen if the user types Japanese characters when
 * 	  G_FILENAME_ENCODING is set to
 * 	  ISO-8859-1, for example.
 */
public class CharacterSet
{

	/**
	 * Converts a string from one character set to another.
	 * Note that you should use g_iconv() for streaming
	 * conversions[2].
	 * str:
	 *  the string to convert
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated[1].
	 * to_codeset:
	 *  name of character set into which to convert str
	 * from_codeset:
	 *  character set of str.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  If the conversion was successful, a newly allocated
	 *  nul-terminated string, which must be freed with
	 *  g_free(). Otherwise NULL and error will be set.
	 */
	public static char[] convert(char[] str, int len, char[] toCodeset, char[] fromCodeset, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_convert (const gchar *str, gssize len, const gchar *to_codeset, const gchar *from_codeset, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_convert(Str.toStringz(str), len, Str.toStringz(toCodeset), Str.toStringz(fromCodeset), bytesRead, bytesWritten, error) );
	}

	/**
	 * Converts a string from one character set to another, possibly
	 * including fallback sequences for characters not representable
	 * in the output. Note that it is not guaranteed that the specification
	 * for the fallback sequences in fallback will be honored. Some
	 * systems may do an approximate conversion from from_codeset
	 * to to_codeset in their iconv() functions,
	 * in which case GLib will simply return that approximate conversion.
	 * Note that you should use g_iconv() for streaming
	 * conversions[2].
	 * str:
	 *  the string to convert
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated[1].
	 * to_codeset:
	 *  name of character set into which to convert str
	 * from_codeset:
	 *  character set of str.
	 * fallback:
	 *  UTF-8 string to use in place of characters not
	 *  present in the target encoding. (The string must be
	 *  representable in the target encoding).
	 *  If NULL, characters not in the target encoding will
	 *  be represented as Unicode escapes \uxxxx or \Uxxxxyyyy.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  If the conversion was successful, a newly allocated
	 *  nul-terminated string, which must be freed with
	 *  g_free(). Otherwise NULL and error will be set.
	 */
	public static char[] convertWithFallback(char[] str, int len, char[] toCodeset, char[] fromCodeset, char[] fallback, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_convert_with_fallback (const gchar *str, gssize len, const gchar *to_codeset, const gchar *from_codeset, gchar *fallback, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_convert_with_fallback(Str.toStringz(str), len, Str.toStringz(toCodeset), Str.toStringz(fromCodeset), Str.toStringz(fallback), bytesRead, bytesWritten, error) );
	}

	/**
	 * Converts a string which is in the encoding used for strings by
	 * the C runtime (usually the same as that used by the operating
	 * system) in the current locale into a
	 * UTF-8 string.
	 * opsysstring:
	 *  a string in the encoding of the current locale. On Windows
	 *  this means the system codepage.
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated[1].
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] localeToUtf8(char[] opsysstring, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_locale_to_utf8 (const gchar *opsysstring, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_locale_to_utf8(Str.toStringz(opsysstring), len, bytesRead, bytesWritten, error) );
	}

	/**
	 * Converts a string which is in the encoding used by GLib for
	 * filenames into a UTF-8 string. Note that on Windows GLib uses UTF-8
	 * for filenames; on other platforms, this function indirectly depends on
	 * the current locale.
	 * opsysstring:
	 *  a string in the encoding for filenames
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated[1].
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] filenameToUtf8(char[] opsysstring, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_filename_to_utf8 (const gchar *opsysstring, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_filename_to_utf8(Str.toStringz(opsysstring), len, bytesRead, bytesWritten, error) );
	}

	/**
	 * Converts a string from UTF-8 to the encoding GLib uses for
	 * filenames. Note that on Windows GLib uses UTF-8 for filenames;
	 * on other platforms, this function indirectly depends on the
	 * current locale.
	 * utf8string:
	 *  a UTF-8 encoded string.
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated.
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] filenameFromUtf8(char[] utf8string, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_filename_from_utf8 (const gchar *utf8string, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_filename_from_utf8(Str.toStringz(utf8string), len, bytesRead, bytesWritten, error) );
	}

	/**
	 * Converts an escaped ASCII-encoded URI to a local filename in the
	 * encoding used for filenames.
	 * uri:
	 *  a URI describing a filename (escaped, encoded in ASCII).
	 * hostname:
	 *  Location to store hostname for the URI, or NULL.
	 *  If there is no hostname in the URI, NULL will be
	 *  stored in this location.
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  a newly-allocated string holding the resulting
	 *  filename, or NULL on an error.
	 */
	public static char[] filenameFromUri(char[] uri, char** hostname, GError** error)
	{
		// gchar* g_filename_from_uri (const gchar *uri, gchar **hostname, GError **error);
		return Str.toString(g_filename_from_uri(Str.toStringz(uri), hostname, error) );
	}

	/**
	 * Converts an absolute filename to an escaped ASCII-encoded URI, with the path
	 * component following Section 3.3 of RFC 2396.
	 * filename:
	 *  an absolute filename specified in the GLib file name encoding,
	 *  which is the on-disk file name bytes on Unix, and UTF-8 on
	 *  Windows
	 * hostname:
	 *  A UTF-8 encoded hostname, or NULL for none.
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  a newly-allocated string holding the resulting
	 *  URI, or NULL on an error.
	 */
	public static char[] filenameToUri(char[] filename, char[] hostname, GError** error)
	{
		// gchar* g_filename_to_uri (const gchar *filename, const gchar *hostname, GError **error);
		return Str.toString(g_filename_to_uri(Str.toStringz(filename), Str.toStringz(hostname), error) );
	}

	/**
	 * Determines the preferred character sets used for filenames.
	 * The first character set from the charsets is the filename encoding, the
	 * subsequent character sets are used when trying to generate a displayable
	 * representation of a filename, see g_filename_display_name().
	 * On Unix, the character sets are determined by consulting the
	 * environment variables G_FILENAME_ENCODING and
	 * G_BROKEN_FILENAMES. On Windows, the character set
	 * used in the GLib API is always UTF-8 and said environment variables
	 * have no effect.
	 * G_FILENAME_ENCODING may be set to a comma-separated list
	 * of character set names. The special token "@locale" is taken to
	 * mean the character set for the current
	 * locale. If G_FILENAME_ENCODING is not set, but
	 * G_BROKEN_FILENAMES is, the character set of the current
	 * locale is taken as the filename encoding. If neither environment variable
	 * is set, UTF-8 is taken as the filename encoding, but the character
	 * set of the current locale is also put in the list of encodings.
	 * The returned charsets belong to GLib and must not be freed.
	 * Note that on Unix, regardless of the locale character set or
	 * G_FILENAME_ENCODING value, the actual file names present
	 * on a system might be in any random encoding or just gibberish.
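	 * The selection rules above can be sketched in C as follows. This is only
	 * an illustration of the documented rules, not GLib's implementation:
	 * filename_encoding is a hypothetical helper, the environment values are
	 * passed in as plain arguments, and treating an empty
	 * G_FILENAME_ENCODING like an unset one is an assumption.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper for illustration only (not a GLib or gtkD function).
// Writes the filename encoding (the first entry of the charset list) to buf.
// fn_encoding and broken stand for the values of G_FILENAME_ENCODING and
// G_BROKEN_FILENAMES (NULL when unset); locale_cs is the character set of
// the current locale.
static void filename_encoding(const char *fn_encoding, const char *broken,
                              const char *locale_cs, char *buf, size_t size)
{
    if (fn_encoding != NULL && *fn_encoding != '\0') {
        // Only the first entry of the comma-separated list is the filename encoding.
        size_t first = strcspn(fn_encoding, ",");
        if (first == 7 && strncmp(fn_encoding, "@locale", 7) == 0)
            snprintf(buf, size, "%s", locale_cs);
        else
            snprintf(buf, size, "%.*s", (int)first, fn_encoding);
    } else if (broken != NULL) {
        // G_BROKEN_FILENAMES alone selects the locale character set.
        snprintf(buf, size, "%s", locale_cs);
    } else {
        // Neither variable is set: UTF-8 is the filename encoding.
        snprintf(buf, size, "UTF-8");
    }
}
```

	 * With G_FILENAME_ENCODING set to "ISO-8859-1,UTF-8" the sketch yields
	 * ISO-8859-1; with neither variable set it yields UTF-8.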
	 * charsets:
	 *  return location for the NULL-terminated list of encoding names
	 * Returns:
	 *  TRUE if the filename encoding is UTF-8.
	 * Since 2.6
	 */
	public static int getFilenameCharsets(char*** charsets)
	{
		// gboolean g_get_filename_charsets (G_CONST_RETURN gchar ***charsets);
		return g_get_filename_charsets(charsets);
	}

	/**
	 * Converts a filename into a valid UTF-8 string. The conversion is
	 * not necessarily reversible, so you should keep the original around
	 * and use the return value of this function only for display purposes.
	 * Unlike g_filename_to_utf8(), the result is guaranteed to be non-NULL
	 * even if the filename actually isn't in the GLib file name encoding.
	 * If GLib cannot make sense of the encoding of filename, as a last resort it
	 * replaces unknown characters with U+FFFD, the Unicode replacement character.
	 * You can search the result for the UTF-8 encoding of this character (which is
	 * "\357\277\275" in octal notation) to find out if filename was in an invalid
	 * encoding.
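	 * The search described above can be sketched in C with a plain substring
	 * test. has_replacement_char is a hypothetical helper name, not a GLib or
	 * gtkD function.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper for illustration only.
// Returns true if the nul-terminated UTF-8 string contains U+FFFD,
// whose UTF-8 encoding is the byte sequence EF BF BD (octal 357 277 275).
static bool has_replacement_char(const char *utf8)
{
    return strstr(utf8, "\357\277\275") != NULL;
}
```

	 * A display name for which this returns true came from a filename that
	 * could not be fully decoded, so the original bytes should be kept for
	 * any later file system call.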
	 * If you know the whole pathname of the file you should use
	 * g_filename_display_basename(), since that allows location-based
	 * translation of filenames.
	 * filename:
	 *  a pathname hopefully in the GLib file name encoding
	 * Returns:
	 *  a newly allocated string containing
	 *  a rendition of the filename in valid UTF-8
	 * Since 2.6
	 */
	public static char[] filenameDisplayName(char[] filename)
	{
		// gchar* g_filename_display_name (const gchar *filename);
		return Str.toString(g_filename_display_name(Str.toStringz(filename)) );
	}

	/**
	 * Returns the display basename for the particular filename, guaranteed
	 * to be valid UTF-8. The display name might not be identical to the filename,
	 * for instance there might be problems converting it to UTF-8, and some files
	 * can be translated in the display.
	 * If GLib cannot make sense of the encoding of filename, as a last resort it
	 * replaces unknown characters with U+FFFD, the Unicode replacement character.
	 * You can search the result for the UTF-8 encoding of this character (which is
	 * "\357\277\275" in octal notation) to find out if filename was in an invalid
	 * encoding.
	 * You must pass the whole absolute pathname to this function so that
	 * translation of well-known locations can be done.
	 * This function is preferred over g_filename_display_name() if you know the
	 * whole path, as it allows translation.
	 * filename:
	 *  an absolute pathname in the GLib file name encoding
	 * Returns:
	 *  a newly allocated string containing
	 *  a rendition of the basename of the filename in valid UTF-8
	 * Since 2.6
	 */
	public static char[] filenameDisplayBasename(char[] filename)
	{
		// gchar* g_filename_display_basename (const gchar *filename);
		return Str.toString(g_filename_display_basename(Str.toStringz(filename)) );
	}

	/**
	 * Splits a URI list conforming to the text/uri-list
	 * mime type defined in RFC 2483 into individual URIs,
	 * discarding any comments. The URIs are not validated.
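	 * As a rough illustration of what such a split involves, the C sketch
	 * below walks a text/uri-list string line by line and skips comment
	 * lines, which start with '#'. extract_uris is a hypothetical helper
	 * with fixed-size output slots; it is not how g_uri_list_extract_uris()
	 * is implemented and it does not trim surrounding whitespace.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper for illustration only.
// Copies up to max URIs from a text/uri-list string into out (each entry is
// truncated to 255 bytes) and returns how many were stored. Lines starting
// with '#' are comments; the URIs themselves are not validated.
static size_t extract_uris(const char *list, char out[][256], size_t max)
{
    size_t count = 0;
    const char *p = list;
    while (*p != '\0' && count < max) {
        size_t len = strcspn(p, "\r\n");   // length of the current line
        if (len > 0 && p[0] != '#') {
            size_t n = len < 255 ? len : 255;
            memcpy(out[count], p, n);
            out[count][n] = '\0';
            count++;
        }
        p += len;
        p += strspn(p, "\r\n");            // skip the line terminator(s)
    }
    return count;
}
```

	 * For a list holding one comment line followed by two URIs, the sketch
	 * stores the two URIs and drops the comment.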
	 * uri_list:
	 *  a URI list
	 * Returns:
	 *  a newly allocated NULL-terminated list of
	 *  strings holding the individual URIs. The array should
	 *  be freed with g_strfreev().
	 * Since 2.6
	 */
	public static char** uriListExtractUris(char[] uriList)
	{
		// gchar** g_uri_list_extract_uris (const gchar *uri_list);
		return g_uri_list_extract_uris(Str.toStringz(uriList));
	}

	/**
	 * Converts a string from UTF-8 to the encoding used for strings by
	 * the C runtime (usually the same as that used by the operating
	 * system) in the current locale.
	 * utf8string:
	 *  a UTF-8 encoded string
	 * len:
	 *  the length of the string, or -1 if the string is
	 *  nul-terminated[1].
	 * bytes_read:
	 *  location to store the number of bytes in the
	 *  input string that were successfully converted, or NULL.
	 *  Even if the conversion was successful, this may be
	 *  less than len if there were partial characters
	 *  at the end of the input. If the error
	 *  G_CONVERT_ERROR_ILLEGAL_SEQUENCE occurs, the value
	 *  stored will be the byte offset after the last valid
	 *  input sequence.
	 * bytes_written:
	 *  the number of bytes stored in the output buffer (not
	 *  including the terminating nul).
	 * error:
	 *  location to store the error occurring, or NULL to ignore
	 *  errors. Any of the errors in GConvertError may occur.
	 * Returns:
	 *  The converted string, or NULL on an error.
	 */
	public static char[] localeFromUtf8(char[] utf8string, int len, uint* bytesRead, uint* bytesWritten, GError** error)
	{
		// gchar* g_locale_from_utf8 (const gchar *utf8string, gssize len, gsize *bytes_read, gsize *bytes_written, GError **error);
		return Str.toString(g_locale_from_utf8(Str.toStringz(utf8string), len, bytesRead, bytesWritten, error) );
	}

	/**
	 * Obtains the character set for the current
	 * locale; you might use this character set as an argument to
	 * g_convert(), to convert from the current locale's encoding to some
	 * other encoding. (Frequently g_locale_to_utf8() and g_locale_from_utf8()
	 * are nice shortcuts, though.)
	 * The return value is TRUE if the locale's encoding is UTF-8; in that
	 * case you can perhaps avoid calling g_convert().
	 * The string returned in charset is not allocated, and should not be
	 * freed.
	 * charset:
	 *  return location for character set name
	 * Returns:
	 *  TRUE if the returned charset is UTF-8
	 * [1]
	 *  Note that some encodings may allow nul bytes to
	 *  occur inside strings. In that case, using -1 for
	 *  the len parameter is unsafe.
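	 * A small C sketch of the problem described in note [1], assuming
	 * UTF-16LE as the example encoding: every second byte of ASCII text is a
	 * nul, so a nul-terminated scan stops after the first byte.
	 * scanned_length is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

// "Hi" encoded as UTF-16LE: 48 00 69 00.
static const char utf16le_hi[] = { 'H', 0x00, 'i', 0x00 };

// Hypothetical helper for illustration only.
// Returns the length a nul-terminated scan would see, which is what
// passing len = -1 asks the conversion functions to assume.
static size_t scanned_length(const char *s)
{
    return strlen(s);
}
```

	 * The scan reports 1 byte although the buffer holds 4, so the explicit
	 * byte count has to be passed as len for such encodings.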
	 * [2]
	 *  Despite the fact that bytes_read can return information about partial
	 *  characters, the g_convert_... functions
	 *  are not generally suitable for streaming. If the underlying converter
	 *  being used maintains internal state, then this won't be preserved
	 *  across successive calls to g_convert(), g_convert_with_iconv() or
	 *  g_convert_with_fallback(). (An example of this is the GNU C converter
	 *  for CP1255 which does not emit a base character until it knows that
	 *  the next character is not a mark that could combine with the base
	 *  character.)
	 */
	public static int getCharset(char** charset)
	{
		// gboolean g_get_charset (G_CONST_RETURN char **charset);
		return g_get_charset(charset);
	}
}