<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995-7 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>Chapter 19: Returning Arrays</title>
<link href="sx4cc.html" rev=precedes>
<link href="sx6.html" rel=precedes>
<link href="top.html" rev=subdocument>
</head>
<body>
<H1>Chapter 19: Returning Arrays</H1>
<p>Arrays are ``second-class citizens'' in C.
Related to the fact that arrays can't be assigned
is the fact that
they can't be returned by functions, either;
that is,
there is no such type as ``function returning array of ...''.
In this chapter we'll study three workarounds,
three ways to implement a function
which attempts to return a string
(that is, an array of <TT>char</TT>)
or an array of some other type.
</p><p>In the last chapter, we looked at some code for converting an integer
into a string of digits representing its value.
This operation is the inverse of the conversion performed
by the standard function <TT>atoi</TT>.
Suppose we wanted to wrap our digit-generating code
up in a function and call it <TT>itoa</TT>.
How would it return the generated string of digits?
We'll use this example to demonstrate all three techniques.
For simplicity, though,
we won't repeat the <TT>do</TT>/<TT>while</TT> loop
in each example function;
instead, we'll simply call <TT>sprintf</TT>.
(In fact, since calling <TT>sprintf</TT> is so easy,
most C programs call it directly when they need to
convert integers to strings,
and consequently there is no standard <TT>itoa</TT> function.)
</p><p>First, let's look at the way that <em>won't</em> work,
so that we can set it aside and make sure we never use it.
What if we wrote <TT>itoa</TT> like this?
<pre>
char *itoa(int n)
{
    char retbuf[25];            /* automatic: gone when itoa returns */
    sprintf(retbuf, "%d", n);
    return retbuf;              /* WRONG: pointer to a dead array */
}
</pre>
This looks superficially reasonable,
and it might well be what we'd write at first if we weren't
being careful.
(It might even seem to work, at first.)
However, it has a fatal flaw:
let's think about that local array, <TT>retbuf</TT>.
Since it's a regular local variable,
it has <dfn>automatic</dfn> duration,
which means that it springs into existence when the function is called
<em>and disappears when the function returns</em>.
Therefore,
the pointer that this version of <TT>itoa</TT> returns
is to an array which no longer exists by the time the caller
receives the pointer.
(Remember that the statement <TT>return retbuf;</TT>
returns a pointer to the first character in <TT>retbuf</TT>;
by the ``equivalence of arrays and pointers,''
the mention of the array <TT>retbuf</TT> in this context
is equivalent to <TT>&amp;retbuf[0]</TT>.)
When the caller tries to use the pointer,
the string created by <TT>itoa</TT> might still be there,
or the memory might have been re-used by some other function;
the language makes no promise either way.
Therefore, this first version of <TT>itoa</TT> is <em>not</em>
adequate and <em>not</em> acceptable.
Functions must never return pointers to local,
automatic-duration arrays.
</p><p>Since the problem with returning a pointer to a local array
is that the array has automatic duration by default,
the simplest fix to the above non-working version of <TT>itoa</TT>,
and the first of our three working methods of returning arrays from functions,
is to declare the array <TT>static</TT>, instead:
<pre>
char *itoa(int n)
{
    static char retbuf[25];     /* static: persists between calls */
    sprintf(retbuf, "%d", n);
    return retbuf;
}
</pre>
Now, the <TT>retbuf</TT> array does not disappear when <TT>itoa</TT>
returns, so the pointer is still valid by the time the caller uses it.
</p><p>Returning a pointer to a <TT>static</TT> array
is a practical and popular solution
to the problem of ``returning'' an array,
but it has one drawback.
Each time you call the function,
it re-uses the same array and returns the same pointer.
Therefore,
when you call the function a second time,
whatever information it ``returned'' to you last time
will be overwritten.
(More precisely, the data
that the returned pointer points to
will be overwritten.)
For example,
suppose we had occasion
to save the pointer returned by <TT>itoa</TT> for a little while,
with the intention of using it later,
after calling <TT>itoa</TT> again in the meantime:
<pre>
int i = 23;
char *p1, *p2;
p1 = itoa(i);
i = i + 10;
p2 = itoa(i);
printf("old i = %s, new i = %s\n", p1, p2);
</pre>
But this won't work as we expect:
the second call to <TT>itoa</TT> will overwrite the string
(stored in <TT>itoa</TT>'s
static
<TT>retbuf</TT> array)
which was stored by the first call.
Instead of printing <TT>i</TT>'s old and new value,
the last line will print the new value, twice.
Both <TT>p1</TT> and <TT>p2</TT> will point to the same place,
to the <TT>retbuf</TT> array down inside <TT>itoa</TT>,
because each call to <TT>itoa</TT> always returns
the same pointer to that same array.
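</p><p>One way for the caller to live with this restriction
is to copy the string out of the shared array before calling the function again.
The following is only a sketch of that idea:
the names <TT>itoa_static</TT>, <TT>itoa_saved</TT>, <TT>print_old_and_new</TT>,
and <TT>ITOA_BUFSIZE</TT> are made up for this illustration,
and the helper assumes the caller's array holds at least 25 characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ITOA_BUFSIZE 25   /* same size as retbuf in the text */

/* The static-array technique from the text, renamed for this sketch. */
char *itoa_static(int n)
{
    static char retbuf[ITOA_BUFSIZE];
    sprintf(retbuf, "%d", n);
    return retbuf;
}

/*
 * Hypothetical helper: convert n, then immediately copy the result out
 * of the shared static array into the caller's array, which must hold
 * at least ITOA_BUFSIZE characters.  Returns the caller's array.
 */
char *itoa_saved(int n, char saved[])
{
    assert(saved != NULL);
    return strcpy(saved, itoa_static(n));
}

/* Prints "old i = 23, new i = 33" when called with i = 23. */
void print_old_and_new(int i)
{
    char old[ITOA_BUFSIZE];
    char *p1, *p2;

    p1 = itoa_saved(i, old);     /* p1 points at old, not at retbuf */
    i = i + 10;
    p2 = itoa_static(i);         /* overwrites retbuf, but not old */
    printf("old i = %s, new i = %s\n", p1, p2);
}
```

Here <TT>p1</TT> and <TT>p2</TT> point to different arrays,
so the second conversion no longer clobbers the first.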
</p><p>We can see the same problem in an even simpler example.
Suppose we had never heard of
the <TT>%d</TT> format specifier in <TT>printf</TT>.
We might try to call something like this:
<pre>
printf("i = %s, j = %s\n", itoa(i), itoa(j));
</pre>
where <TT>i</TT> and <TT>j</TT> are
two different <TT>int</TT> variables.
What will happen?
Either the compiler will make
the first call to <TT>itoa</TT> first,
or the second.
(C does not specify the order
in which function arguments are evaluated;
different compilers behave differently in this respect.)
Whichever call to <TT>itoa</TT> happens <em>second</em>
will be the one that
gets to keep its return value in
<TT>retbuf</TT>.
The <TT>printf</TT> call will either print <TT>i</TT>'s value twice,
or <TT>j</TT>'s value twice,
but it won't be able to print two distinct values.
</p><p>The moral is that
although the <TT>static</TT> return array technique will work,
the caller has to be a little bit careful,
and must never expect the return pointer from one call to the function
to be usable after a later call to the function.
Sometimes this restriction is a real problem;
other times it's perfectly acceptable.
(Some of the functions in the standard C library use this technique;
one example is <TT>ctime</TT>,
which converts timestamp values to printable strings.
When you see a cryptic sentence like
``The returned pointer is to static data
which is overwritten with each call''
in the documentation for a library function,
it means that the function is using this technique.)
When this restriction <em>would</em> be too onerous on the caller,
we should use one of the other two techniques, described next.
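</p><p>The same care applies to library functions such as <TT>ctime</TT>.
As a rough sketch,
a caller that needs a timestamp string to survive a later call
might copy it into its own array first.
The helper name <TT>ctime_copy</TT> and its size check are assumptions made for this illustration,
not part of the standard library.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/*
 * ctime returns a pointer to static data, so a result that must outlive
 * the next call has to be copied.  Hypothetical helper: copies the
 * string for t into buf (size bytes) and returns buf, or NULL if the
 * conversion fails or the string does not fit.
 */
char *ctime_copy(time_t t, char buf[], size_t size)
{
    const char *s;

    assert(buf != NULL);
    s = ctime(&t);               /* points into the library's static data */
    if (s == NULL || strlen(s) + 1 > size)
        return NULL;
    return strcpy(buf, s);
}
```

Two calls with two different arrays then leave two independent strings,
each ending in the newline that <TT>ctime</TT> appends.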
</p><p>If the function can't use a local or local <TT>static</TT> array
to hold the return value,
the next option is to have the <em>caller</em> allocate an array,
and use that.
In this case,
the function accepts
at least one additional argument
(in addition to any data to be operated on):
a pointer to the location to write the result back to.
Our familiar <TT>getline</TT> function has worked this way all along.
If we rewrote <TT>itoa</TT>
along these lines,
it might look like this:
<pre>
char *itoa(int n, char buf[])
{
    sprintf(buf, "%d", n);      /* writes into the caller's array */
    return buf;
}
</pre>
Now the caller must pass an <TT>int</TT> value to be converted
<em>and</em> an array to hold the converted result:
<pre>
int i = 23;
char buf[25];
char *str = itoa(i, buf);
</pre>
There are two differences between this
version of <TT>itoa</TT> and our old <TT>getline</TT> function.
(Well, three, really;
of course
the two functions do totally different things.)
One difference is that
<TT>getline</TT> accepted another extra argument
which was the <em>size</em> of the array in the caller,
so that <TT>getline</TT> could promise not to overflow that array.
Our latest version of <TT>itoa</TT> does not accept such an argument,
which is a deficiency.
If the caller ever passes an array
which is too small to hold all the digits of the converted integer
(plus any minus sign and the terminating <TT>\0</TT>),
<TT>itoa</TT> (actually, <TT>sprintf</TT>)
will sail off the end of the array
and scribble on some other part of memory.
(Needless to say, this can be a disaster.)
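</p><p>One possible way to remove that deficiency
is to pass the array size as well, as <TT>getline</TT> did.
The sketch below is not from the original notes:
the name <TT>itoa_n</TT> and its null-pointer-on-overflow convention are assumptions,
and it relies on <TT>snprintf</TT>, which is part of C99 rather than the original ANSI C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical size-checked variant of the caller-passes-an-array itoa.
 * Returns buf on success, or NULL if the digits (plus sign and '\0')
 * do not fit in size characters.  Assumes C99 snprintf.
 */
char *itoa_n(int n, char buf[], size_t size)
{
    int len;

    assert(buf != NULL);
    len = snprintf(buf, size, "%d", n);   /* never writes more than size chars */
    if (len < 0 || (size_t)len >= size)
        return NULL;                      /* encoding error or truncated */
    return buf;
}
```

A caller would write something like <TT>itoa_n(i, buf, sizeof buf)</TT>
and check the result for a null pointer before using the string.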
</p><p>Another difference is that the return value
of this latest version of <TT>itoa</TT>
isn't terribly useful.
The pointer which this version of <TT>itoa</TT> returns
is always the same as the pointer you handed it.
Even if this version of <TT>itoa</TT> didn't return anything
as its formal return value,
you could still get your hands on the string it created,
since it would be sitting right there in your own array
(the one that you passed to
<TT>itoa</TT>).
In the case of <TT>getline</TT>,
we had a second thing to return as the formal return value,
namely the length of the line we'd just read.
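</p><p>If we wanted <TT>itoa</TT>'s return value to carry information of that kind,
one option would be to return the length of the generated string,
which <TT>sprintf</TT> already reports as its own return value.
This is only a sketch;
the name <TT>itoa_len</TT> is made up here,
and like the version above it trusts the caller's array to be big enough.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical variant: write the digits of n into the caller's array
 * and return the length of the resulting string (not counting the
 * '\0'), the way getline returned the length of the line.
 * Like the version in the text, it trusts buf to be big enough.
 */
int itoa_len(int n, char buf[])
{
    assert(buf != NULL);
    return sprintf(buf, "%d", n);   /* sprintf returns the character count */
}
```

The caller then knows where the string ends without calling <TT>strlen</TT>,
which is handy when appending more text to the same array.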
</p><p>However, this second strategy is also popular and workable.
Besides our own <TT>getline</TT> function,
the standard library functions <TT>fgets</TT> and <TT>fread</TT>
both use this technique.
</p><p>When the limit of a single static return array within the function
would be unacceptable,
and when it would be a nuisance for the caller
to have to declare or otherwise allocate return arrays,
a third option
is for the function to dynamically allocate some memory
for the returned array
by calling <TT>malloc</TT>.
Here is our last version of <TT>itoa</TT>,
demonstrating this technique:
<pre>
char *itoa(int n)
{
    char *retbuf = malloc(25);  /* a fresh array for every call */
    if(retbuf == NULL)
        return NULL;
    sprintf(retbuf, "%d", n);
    return retbuf;
}
</pre>
Now the caller can go back to saying simple things like
<pre>
char *p = itoa(i);
</pre>
and it no longer has to worry about the possibility that
a later call to <TT>itoa</TT>
will overwrite the results of the first.
However, the caller now has two <em>new</em> things to worry about:
<OL><li>This version of <TT>itoa</TT> returns a null pointer if
<TT>malloc</TT> fails to return the memory that <TT>itoa</TT> needs.
The caller should really be checking for this null pointer return
each time it calls <TT>itoa</TT>,
before using the pointer.
<li>If the caller calls <TT>itoa</TT> 10,000 times,
we'll have allocated
25 <TT>*</TT> 10,000 = 250,000 bytes of memory,
or about a quarter of a megabyte.
Unless someone is careful to call <TT>free</TT>
to deallocate all of that memory,
it will be wasted.
Few programs can afford to waste that much memory.
(Once upon a time,
few programs could get that much memory, period.)
The ``someone''
who is going to have to call <TT>free</TT>
isn't <TT>itoa</TT>;
it has no idea when the caller is done
with the memory returned by a previous call to <TT>itoa</TT>,
and in fact <TT>itoa</TT> might never get called again.
So it will be the caller's responsibility
to keep track of each pointer returned by <TT>itoa</TT>,
and to free it when it's no longer needed,
or else memory will gradually leak away.
</OL>We can work around the first problem:
if we expect that there will usually be enough memory,
such that the call to <TT>malloc</TT> will rarely if ever fail,
and if all the caller would do in an out-of-memory situation is
print an error message and abort,
we can move the test down into the function:
<pre>
char *retbuf = malloc(25);
if(retbuf == NULL)
{
    fprintf(stderr, "out of memory\n");
    exit(EXIT_FAILURE);
}
</pre>
Now the function never returns a null pointer,
so the caller doesn't have to check.
(When <TT>malloc</TT> fails, the function doesn't return at all.)
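</p><p>The second problem stays with the caller either way.
As a minimal sketch of what that responsibility looks like,
here is a caller that checks each pointer and frees it as soon as it is done with it.
The names <TT>itoa_alloc</TT> and <TT>print_values</TT> are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* The malloc-based technique from the text, renamed for this sketch. */
char *itoa_alloc(int n)
{
    char *retbuf = malloc(25);
    if (retbuf == NULL)
        return NULL;
    sprintf(retbuf, "%d", n);
    return retbuf;
}

/*
 * Hypothetical caller: prints the values 0 .. count-1, one per line.
 * Returns the number of values printed, stopping early if memory
 * runs out.  Every pointer from itoa_alloc is freed exactly once.
 */
int print_values(int count)
{
    int i;

    assert(count >= 0);
    for (i = 0; i < count; i++) {
        char *p = itoa_alloc(i);
        if (p == NULL)
            break;               /* out of memory: stop early */
        printf("%s\n", p);
        free(p);                 /* done with this string */
    }
    return i;
}
```

Because each string is freed inside the loop,
at most one 25-byte array is allocated at a time,
no matter how many values are printed.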
</p><p>In summary, we've seen three ways
of ``returning'' arrays from functions,
none of which is perfect.
The <TT>static</TT> array technique is usually convenient for the caller,
but only for functions
that the caller is unlikely to call several times
while retaining each return value.
(The <TT>static</TT> array technique is also definitely imperfect
in that it violates the notion
that calling code shouldn't need to know
about the inner, implementation details of a called function.)
The caller-passes-an-array technique
is useful when the caller might have a number of calls to the function active,
but when that number is small and fixed,
so that the caller can easily declare and keep track
of a number of return arrays
(if necessary).
Finally,
when there might be an arbitrary number of calls to the function,
or when maximum flexibility is otherwise needed,
the function-calls-<TT>malloc</TT> technique is appropriate,
but with its extra flexibility come some costs,
the most important of which is that the caller must remember to
free the returned pointers.
</p><hr>
<p>
Read sequentially:
<a href="sx4cc.html" rev=precedes>prev</a>
<a href="sx6.html" rel=precedes>next</a>
<a href="top.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1996-1999
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>