<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995-7 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>17.2: Binary Data Files</title>
<link href="sx3a.html" rev=precedes>
<link href="sx4.html" rel=precedes>
<link href="sx3.html" rev=subdocument>
</head>
<body>
<H2>17.2: Binary Data Files</H2>
<p>Normally,
when writing notes like these,
I progress from the easy to the hard,
or the boring to the interesting,
or the deficient to the recommended.
This chapter is the reverse;
I heartily recommend
that you use the text data files of the previous section
whenever possible.
This section on binary data files is included for completeness,
and you're welcome to skip it
if you're not interested in using binary data files
or if it doesn't make sense.
</p><p>We've already seen two examples
of writing and reading binary data files,
in section
16.7
of the previous chapter.
To write out an array of integers,
we called
<pre>
	fwrite(array, sizeof(int), na, fp);
</pre>
To read them back in,
we called
<pre>
	na = fread(array, sizeof(int), 10, fp);
</pre>
To write out a structure,
we called
<pre>
	fwrite(&amp;x, sizeof(struct s), 1, fp);
</pre>
To read it back in, we called
<pre>
	fread(&amp;x, sizeof(struct s), 1, fp);
</pre>
(which returns 1 if it succeeds).
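Since both <TT>fwrite</TT> and <TT>fread</TT>
return the number of items actually transferred,
those calls are easy to wrap with a check.
Here is a minimal sketch;
the function names <TT>write_ints</TT> and <TT>read_ints</TT>
are made up for this illustration
and do not appear in the earlier chapter.

```c
#include <stdio.h>

/* Write n ints to fp as raw bytes.
   Returns 1 if all n were written, 0 on a short write. */
int write_ints(FILE *fp, const int *array, size_t n)
{
    return fwrite(array, sizeof(int), n, fp) == n;
}

/* Read up to max ints from fp into array.
   Returns the number of ints actually read, which may be
   less than max at end-of-file or on a read error. */
size_t read_ints(FILE *fp, int *array, size_t max)
{
    return fread(array, sizeof(int), max, fp);
}
```

A caller would compare the return value of <TT>read_ints</TT>
against the count it expected,
just as the <TT>na = fread(...)</TT> call above does.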
</p><p>These examples certainly seem attractive:
they will result in compact data files,
they will probably be quite efficient,
and they are certainly simple for the programmer to write.
However,
data files created in this way fare quite badly
when evaluated against our other criteria.
They will not be human-readable;
they will contain sets of inscrutable byte values
which are exact copies of the memory regions
used to contain the data structures.
They will not be at all portable;
they cannot be correctly read
(at least, not with the simple calls to <TT>fread</TT>)
on machines where basic types such as <TT>int</TT> have different sizes,
or where the basic types are laid out differently in memory
(e.g. ``big endian'' vs. ``little endian'',
or different floating-point representations).
They may not even be readable by the same code
compiled under a different compiler on the same machine,
since different compilers may
use different sizes for integers,
or lay out
the fields of structures
differently in memory.
(The fields will always be in the order you expect,
but different compilers may, for various reasons,
leave different amounts of empty space or ``padding''
between certain fields.)
These binary files will have no provision whatsoever
for backwards or forwards compatibility;
any change to the structure definition
will completely change the implied format of the data file,
with no hope of reading older (or newer) files.
The only other benefit these files have
is that if the data is for any reason sensitive,
it will certainly be a bit better concealed from prying eyes.
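To see the padding for yourself,
you can compare the size of a structure
with the sum of the sizes of its fields.
The following is only a sketch;
<TT>struct rec</TT> and the two functions are hypothetical,
and the numbers they return depend entirely on your compiler and machine,
which is exactly the problem.

```c
#include <stddef.h>

/* Hypothetical structure, used only for illustration. */
struct rec {
    char c;
    int i;
    char d;
};

/* Number of bytes in struct rec that belong to no field,
   i.e. the padding the compiler chose to insert. */
size_t padding_in_rec(void)
{
    return sizeof(struct rec)
         - (sizeof(char) + sizeof(int) + sizeof(char));
}

/* Byte offset of the field i from the start of struct rec. */
size_t offset_of_i(void)
{
    return offsetof(struct rec, i);
}
```

Whatever <TT>padding_in_rec</TT> returns,
those bytes get copied into the data file
by a whole-structure <TT>fwrite</TT>,
and a program built with a different compiler
may expect a different number of them.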
</p><p>We can get around these disadvantages of binary data files,
but in so doing we'll lose many of the advantages,
such as blinding efficiency or programmer convenience.
If we care about data file portability
or backwards or forwards compatibility,
we will have to write structures one field at a time,
not in one fell swoop.
Furthermore,
if we have an <TT>int</TT> to write,
we may choose not to write it using <TT>fwrite</TT>:
<pre>
	fwrite(&amp;i, sizeof(int), 1, fp);
</pre>
but rather a byte at a time, using <TT>putc</TT>:
<pre>
	putc(i / 256, fp);
	putc(i % 256, fp);
</pre>
In this way,
we'd have precise control over the order
in which the two halves of the <TT>int</TT> are being written.
(We're assuming here that there's no more than
two bytes' worth of data in the <TT>int</TT>,
which is a safe assumption
if we're portably assuming
that <TT>int</TT>s can only hold up to +-32767.)
When it came time to read the <TT>int</TT> back in,
we might do that a byte at a time, too:
<pre>
	i = getc(fp);
	i = 256 * i + getc(fp);
</pre>
(We could <em>not</em> collapse this to
<TT>i = 256 * getc(fp) + getc(fp)</TT>,
because we wouldn't know which order
the two calls to <TT>getc</TT> would occur in.)
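Packaged as a pair of functions with error checking,
the byte-at-a-time technique might look like the sketch below.
The names <TT>put16</TT> and <TT>get16</TT> are invented here,
and the code deliberately handles only unsigned values from 0 to 65535;
the simple divide-and-remainder fragments above
do not round-trip negative values either,
so a real format would have to pick an encoding for those.

```c
#include <stdio.h>

/* Write the low 16 bits of v, most significant byte first.
   Returns 1 on success, 0 on a write error. */
int put16(unsigned int v, FILE *fp)
{
    if (putc((v >> 8) & 0xff, fp) == EOF)
        return 0;
    if (putc(v & 0xff, fp) == EOF)
        return 0;
    return 1;
}

/* Read two bytes, most significant first, and store the
   combined value in *vp.
   Returns 1 on success, 0 on end-of-file or a read error. */
int get16(unsigned int *vp, FILE *fp)
{
    int hi, lo;

    hi = getc(fp);          /* separate statements, so the  */
    if (hi == EOF)          /* order of the two reads is    */
        return 0;           /* never in doubt               */
    lo = getc(fp);
    if (lo == EOF)
        return 0;
    *vp = ((unsigned int)hi << 8) | (unsigned int)lo;
    return 1;
}
```

Writing a structure ``one field at a time''
then amounts to one such call per field, in an order you choose,
so the file layout no longer depends on the compiler's padding
or the machine's byte order.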
</p><p>We might also choose to use tags
to mark the various ``fields'' within our binary data file;
the fields would be more likely to be byte codes
such as <TT>0x00</TT>, <TT>0x01</TT>, and <TT>0x02</TT>
than the character or string codes
we used in the tagged text data file of the previous section.
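As a rough illustration of what a tagged binary format could look like,
here is one possible arrangement.
Everything about it is an assumption made for the example:
the tag names and values,
the rule that every field is a tag byte followed by a two-byte value
(high byte first),
and the use of tag <TT>0x00</TT> to mark the end of the record.

```c
#include <stdio.h>

/* Hypothetical one-byte tags; the values are arbitrary. */
#define TAG_END    0x00
#define TAG_WIDTH  0x01
#define TAG_HEIGHT 0x02

/* Write one field: a tag byte, then a 16-bit value, high byte first.
   Returns 1 on success, 0 on a write error. */
int put_field(int tag, unsigned int v, FILE *fp)
{
    if (putc(tag, fp) == EOF)
        return 0;
    if (putc((v >> 8) & 0xff, fp) == EOF)
        return 0;
    if (putc(v & 0xff, fp) == EOF)
        return 0;
    return 1;
}

/* Read fields until TAG_END, storing the ones we recognize.
   Returns 1 on success, 0 if the file ends prematurely. */
int read_fields(FILE *fp, unsigned int *width, unsigned int *height)
{
    int tag, hi, lo;
    unsigned int v;

    while ((tag = getc(fp)) != EOF) {
        if (tag == TAG_END)
            return 1;
        hi = getc(fp);
        if (hi == EOF)
            return 0;
        lo = getc(fp);
        if (lo == EOF)
            return 0;
        v = ((unsigned int)hi << 8) | (unsigned int)lo;
        if (tag == TAG_WIDTH)
            *width = v;
        else if (tag == TAG_HEIGHT)
            *height = v;
        /* any other tag: its value has been consumed; ignore it */
    }
    return 0;
}
```

Because every field has the same length in this sketch,
a reader can skip tags it does not recognize,
which is what gives a tagged format
some backwards and forwards compatibility.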
</p><p>If you do choose to use binary data files,
you <em>must</em> open them for writing with
<TT>fopen</TT> mode <TT>"wb"</TT>
and for reading with <TT>"rb"</TT>
(or perhaps one of the <TT>+</TT> modes;
the point is that you do need the <TT>b</TT>).
Remember that,
in the default mode,
the standard I/O functions all assume text files,
and translate between <TT>\n</TT>
and the operating system's end-of-line representation.
If you try to read or write a binary data file in text mode,
whenever
your internal data happens to contain a byte
which matches the code for <TT>\n</TT>,
or your external data happens to contain bytes
which match the operating system's end-of-line representation,
they may
be translated out from under you,
screwing up your data.
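One way to convince yourself that the <TT>b</TT> matters
is to write some troublesome bytes, read them back, and compare.
The function below is a sketch written for this purpose;
its name, the 256-byte limit, and the file name passed in by the caller
are all arbitrary choices.

```c
#include <stdio.h>
#include <string.h>

/* Write n bytes to the named file in binary mode, read them
   back, and return 1 if they survive unchanged, 0 otherwise.
   Handles at most 256 bytes, to keep the example small. */
int roundtrip_binary(const char *filename,
                     const unsigned char *data, size_t n)
{
    unsigned char buf[256];
    FILE *fp;
    size_t got;

    if (n > sizeof buf)
        return 0;

    fp = fopen(filename, "wb");     /* note the b */
    if (fp == NULL)
        return 0;
    if (fwrite(data, 1, n, fp) != n) {
        fclose(fp);
        return 0;
    }
    if (fclose(fp) != 0)
        return 0;

    fp = fopen(filename, "rb");     /* and again here */
    if (fp == NULL)
        return 0;
    got = fread(buf, 1, sizeof buf, fp);
    fclose(fp);

    return got == n && memcmp(buf, data, n) == 0;
}
```

If you change the two modes to <TT>"w"</TT> and <TT>"r"</TT>
and feed the function data containing the byte value of <TT>\n</TT>,
the result depends on the operating system:
on systems whose text files use a different end-of-line representation
the comparison may fail,
while on systems where text and binary files are the same
you will see no difference at all.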
</p><hr>
<p>
Read sequentially:
<a href="sx3a.html" rev=precedes>prev</a>
<a href="sx4.html" rel=precedes>next</a>
<a href="sx3.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1996-1999
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>