<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995-7 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>11.1 Allocating Memory with <TT>malloc</TT></title>
<link href="sx11.html" rev=precedes>
<link href="sx11b.html" rel=precedes>
<link href="sx11.html" rev=subdocument>
</head>
<body>
<H2>11.1 Allocating Memory with <TT>malloc</TT></H2>
<p>[This section corresponds to parts of K&amp;R Secs. 5.4, 5.6, 6.5, and 7.8.5]
</p><p>A problem with many simple programs,
including in particular
little teaching programs
such as we've been writing so far,
is that they tend to use fixed-size arrays which may or may not be big enough.
We have an array of 100 <TT>int</TT>s
for the numbers which the user enters and wishes to find the average
of--what
if the user enters 101 numbers?
We have an array of 100 <TT>char</TT>s
which we pass to <TT>getline</TT> to receive the user's
input--what
if the user types a line of 200 characters?
If we're lucky,
the relevant parts of the program check how much of an array they've used,
and print an error message or otherwise gracefully abort
before overflowing the array.
If we're not so lucky, a program may sail off the end of an array,
overwriting other data and behaving quite badly.
In either case, the user doesn't get his job done.
How can we avoid the restrictions of fixed-size arrays?
</p><p>The answers all involve the standard library function <TT>malloc</TT>.
Very simply, a call to <TT>malloc(<I>n</I>)</TT> returns a pointer to <I>n</I> bytes of memory
which we can use however we like.
If we didn't want to read a line of input into a fixed-size array,
we could use <TT>malloc</TT> instead.
Here's the first step:
<pre>
	#include &lt;stdlib.h&gt;

	char *line;
	int linelen = 100;
	line = malloc(linelen);
	/* incomplete -- malloc's return value not checked */
	getline(line, linelen);
</pre>
<TT>malloc</TT> is declared in <TT>&lt;stdlib.h&gt;</TT>,
so we <TT>#include</TT> that header in any program that calls <TT>malloc</TT>.
A ``byte'' in C is, by definition,
an amount of storage suitable for storing one character,
so the above invocation of <TT>malloc</TT>
gives us exactly as many <TT>char</TT>s as we ask for.
We could illustrate the resulting pointer like this:
<br>
<img src="fig11.1.gif">
<br>
The 100 bytes of memory (not all of which are shown)
pointed to by <TT>line</TT> are those allocated by <TT>malloc</TT>.
(They are brand-new memory,
conceptually a bit different from
the memory which the compiler arranges to have allocated automatically
for our conventional variables.
The 100 boxes in the figure
don't have a name next to them,
because they're not storage for a variable we've declared.)
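As a rough sketch of where this first step is heading,
the fragment below wraps the allocation and the read in one function.
The name <TT>read_line_alloc</TT> is hypothetical,
the standard <TT>fgets</TT> stands in for our own <TT>getline</TT>,
and the check of <TT>malloc</TT>'s return value
(which the fragment above still omits)
is an assumption about how a complete version might look.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a linelen-byte buffer and read one line
 * from fp into it.  Returns the buffer (the caller must free it), or
 * NULL if the size is not positive, the allocation fails, or nothing
 * could be read. */
char *read_line_alloc(FILE *fp, int linelen)
{
    char *line;

    if (linelen <= 0)
        return NULL;

    line = malloc(linelen);
    if (line == NULL)
        return NULL;            /* malloc could not supply the memory */

    if (fgets(line, linelen, fp) == NULL) {
        free(line);             /* nothing read: give the memory back */
        return NULL;
    }
    return line;
}
```

A caller might write <TT>char *line = read_line_alloc(stdin, 100);</TT>
and must test the result against <TT>NULL</TT> before using it.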
</p><p>As a second example,
we might have occasion to
allocate a piece of memory,
and to copy a string into it with <TT>strcpy</TT>:
<pre>
	char *p = malloc(15);
	/* incomplete -- malloc's return value not checked */
	strcpy(p, "Hello, world!");
</pre>
</p><p>When copying strings,
remember that all strings have a terminating <TT>\0</TT> character.
If you use <TT>strlen</TT> to count the characters in a string for you,
that count will <em>not</em> include the trailing <TT>\0</TT>,
so you must add one before calling <TT>malloc</TT>:
<pre>
	char *somestring, *copy;
	...
	copy = malloc(strlen(somestring) + 1); /* +1 for \0 */
	/* incomplete -- malloc's return value not checked */
	strcpy(copy, somestring);
</pre>
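The same pattern can be packaged as a small function.
This is only a sketch:
the name <TT>copy_string</TT> is hypothetical,
and the null-pointer test is an assumption about what a finished version would do.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of s,
 * or NULL if malloc cannot supply the memory.
 * The caller is responsible for freeing the copy. */
char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);   /* +1 for the terminating \0 */

    if (copy == NULL)
        return NULL;
    strcpy(copy, s);
    return copy;
}
```

For <TT>"Hello, world!"</TT>, <TT>strlen</TT> returns 13,
so the function asks <TT>malloc</TT> for 14 bytes.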
</p><p>What if we're not allocating characters, but integers?
If we want to allocate 100 <TT>int</TT>s, how many bytes is that?
If we know how big <TT>int</TT>s are on our machine
(typically 2 bytes on a 16-bit machine, and 4 bytes on most 32- and 64-bit ones)
we could try to compute it ourselves,
but it's much safer and more portable to let C compute it for us.
C has a <TT>sizeof</TT> operator,
which
computes
the size, in bytes, of a variable or type.
It's just what we need when calling <TT>malloc</TT>.
To allocate space for 100 <TT>int</TT>s, we could call
<pre>
	int *ip = malloc(100 * sizeof(int));
</pre>
The use of the <TT>sizeof</TT> operator
tends to look like a function call,
but it's really an operator,
and it does its work at compile time.
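Because <TT>sizeof</TT> accepts an expression as well as a type name,
the size can also be taken from the pointer itself.
The sketch below uses <TT>sizeof *ip</TT>,
which is the size of whatever <TT>ip</TT> points to;
the function name <TT>alloc_ints</TT> is hypothetical,
and the code makes no attempt to guard against an enormous <TT>n</TT>.

```c
#include <stdlib.h>

/* Hypothetical helper: ask malloc for room for n ints.
 * Returns NULL if the memory is not available.
 * Assumption: n is small enough that n * sizeof *ip does not overflow. */
int *alloc_ints(size_t n)
{
    int *ip = malloc(n * sizeof *ip);   /* same as n * sizeof(int) */

    return ip;
}
```

Writing the size this way means the call stays correct
if the type of <TT>ip</TT> is later changed.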
</p><p>Since we can use array indexing syntax on pointers,
we can treat a pointer variable after a call to <TT>malloc</TT>
almost exactly as if it were an array.
In particular,
after the above call to <TT>malloc</TT>
initializes <TT>ip</TT> to point at storage for 100 <TT>int</TT>s,
we can access
<TT>ip[0]</TT>, <TT>ip[1]</TT>, ... up to <TT>ip[99]</TT>.
This way,
we can get the effect of an array
even if we don't know until run time how big the ``array'' should be.
(In a later section we'll see how we might deal with the case
where we're not even sure at the point we begin using it
how big an ``array'' will eventually have to be.)
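To make this concrete, here is a minimal sketch of an ``array''
whose size is chosen only at run time.
The function <TT>average_1_to_n</TT> is hypothetical and deliberately trivial:
it fills the allocated storage with the values 1 through <I>n</I>
and averages them, using ordinary array indexing on the pointer.

```c
#include <stdlib.h>

/* Hypothetical example: store the values 1..n in a malloc'ed "array"
 * whose size is known only at run time, and return their average.
 * Returns -1.0 if n is not positive or the memory cannot be allocated. */
double average_1_to_n(int n)
{
    int *ip;
    int i;
    double sum = 0.0;

    if (n <= 0)
        return -1.0;

    ip = malloc(n * sizeof(int));
    if (ip == NULL)
        return -1.0;

    for (i = 0; i < n; i++)
        ip[i] = i + 1;          /* array indexing on a pointer */
    for (i = 0; i < n; i++)
        sum += ip[i];

    free(ip);                   /* done with the memory */
    return sum / n;
}
```

Nothing in the function depends on a compile-time array size;
the same code serves for 10 values or 10,000.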
</p><p>Our examples so far have all had a significant omission:
they have not checked <TT>malloc</TT>'s return value.
Obviously,
no real computer has an infinite amount of memory available,
so there is no guarantee that <TT>malloc</TT> will be able to
give us as much memory as we ask for.
If we call <TT>malloc(100000000)</TT>,
or if we call <TT>malloc(10)</TT> 10,000,000 times,
we're probably going to run out of memory.
</p><p>When <TT>malloc</TT> is unable to allocate the requested memory,
it returns a <dfn>null pointer</dfn>.
A null pointer, remember, points definitively nowhere.
It's a ``not a pointer'' marker;
it's not a pointer you can use.
(As we said in section
9.4,
a null pointer
can be used as a failure return from a function that returns pointers,
and <TT>malloc</TT> is a perfect example.)
Therefore,
whenever you call <TT>malloc</TT>,
it's vital to check the returned pointer before using it!
If you call <TT>malloc</TT>, and it returns a null pointer,
and you go off and use that null pointer as if it pointed somewhere,
your program probably won't last long.
Instead, a program should immediately check for a null pointer,
and if it receives one,
it should at the very least
print an error message and exit,
or perhaps figure out some way of proceeding
without the memory it asked for.
But it cannot
go on to use
the null pointer it got back from <TT>malloc</TT>
in any way,
because that null pointer by definition points nowhere.
(``It cannot use a null pointer in any way''
means that
the program
cannot use
the <TT>*</TT> or <TT>[]</TT> operators
on such a pointer value,
or pass it to any function that expects a valid pointer.)
</p><p>A call to <TT>malloc</TT>,
with an error check,
typically looks something like this:
<pre>
	int *ip = malloc(100 * sizeof(int));
	if(ip == NULL)
	{
		printf("out of memory\n");
		<I>exit or return</I>
	}
</pre>
After printing the error message,
this code should return to its caller,
or exit from the program entirely;
it cannot proceed with the code that would have used <TT>ip</TT>.
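One way to avoid repeating this check at every call,
sketched here under the assumption that exiting is an acceptable response,
is a small wrapper function.
The name <TT>checked_malloc</TT> is hypothetical;
it is not part of the standard library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: like malloc, but it does not return on failure.
 * Instead it prints a message and exits.
 * Assumption: size is nonzero (malloc(0) is allowed to return NULL
 * even when memory is plentiful). */
void *checked_malloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}
```

With this in place, <TT>int *ip = checked_malloc(100 * sizeof(int));</TT>
either yields a usable pointer or ends the program.
A program that can carry on without the memory
would call <TT>malloc</TT> directly and handle the null pointer itself.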
</p><p>Of course,
in our examples so far,
we've still limited ourselves to ``fixed size'' regions of memory,
because we've been calling <TT>malloc</TT> with fixed arguments
like 10 or 100.
(Our call to <TT>getline</TT> is
still limited to 100-character lines,
or whatever number we set the <TT>linelen</TT> variable to;
our <TT>ip</TT> variable still points at only 100 <TT>int</TT>s.)
However, since
the sizes are now values which
can in principle be determined at run time,
we've at least moved beyond having to recompile the program
(with a bigger array)
to accommodate longer lines,
and with a little more work,
we could arrange that
the ``arrays'' automatically grew to be as large as required.
(For example,
we could write something like <TT>getline</TT> which could
read the longest input line actually seen.)
We'll begin to explore this possibility in a later section.
</p><hr>
<p>
Read sequentially:
<a href="sx11.html" rev=precedes>prev</a>
<a href="sx11b.html" rel=precedes>next</a>
<a href="sx11.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1995-1997
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>