<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995, 1996 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>section 5.1: Pointers and Addresses</title>
<link href="sx8.html" rev=precedes>
<link href="sx8b.html" rel=precedes>
<link href="sx8.html" rev=subdocument>
</head>
<body>
<H2>section 5.1: Pointers and Addresses</H2>
<p>If you like to use concrete examples
and to think about exactly what's going on at the machine level,
you'll want to know how many bytes are occupied by
<TT>short</TT>s, <TT>long</TT>s, pointers, etc.
It's equally possible, though,
to understand pointers at a more abstract level,
thinking about them only in terms of boxes and arrows,
as in the figures on pages 96, 98, 104, 107, and 114-5.
(Not worrying about the exact size in bytes basically means not
worrying about how big the boxes are.)
The figure at the bottom of page 93
is probably the least pretty pointer picture in the whole book;
don't worry if it doesn't mean much to you.
</p><p>When we say that a pointer holds an ``address,''
and that unary <TT>&amp;</TT> is the ``address of'' operator,
our language is of course influenced by the fact that the
underlying hardware assigns addresses to memory locations,
but again,
it is not necessary
(nor necessarily desirable)
to think about actual machine addresses when working with pointers.
Thinking about the machine addresses
can make certain aspects of pointers easier to understand,
but doing so can also make certain mistakes and misunderstandings easier.
In particular,
a pointer in C is more than just an address;
as we'll see on the next page,
a pointer also carries the notion of what <em>type</em> of data it points to.
</p><p>page 94
</p><p>The presentation on this page is going to seem very artificial at first.
At best, you're going to say,
``This makes sense, but what's it <em>for</em>?''
In fact,
it <em>is</em> artificial,
and no real program would ever do meaningless little pointer operations
such as are embodied in the example on this page.
However,
this is the traditional way to introduce pointers from scratch,
and once we've moved past it,
we'll be able to talk about some more meaningful uses of pointers,
and to forget about these artificial ones.
(Once we're done talking about
the traditional, artificial introduction on page 94,
we'll also attempt a slightly more elaborate,
slightly less traditional,
slightly more meaningful
parallel introduction,
so stay tuned.)
</p><p>Deep sentence:
<blockquote>The declaration of the pointer <TT>ip</TT>,
<pre>    int *ip;
</pre>is intended as a mnemonic;
it says that the expression <TT>*ip</TT> is an <TT>int</TT>.
</blockquote>We'll have more to say about this sentence in a bit.
</p><p>As an even more traditional,
even less meaningful,
even simpler example,
we could say
<pre>    int i = 1;      /* an integer */
    int *ip;        /* a pointer-to-int */
    ip = &amp;i;        /* ip points to i */
    printf("%d\n", *ip);    /* prints i, which is 1 */
    *ip = 5;        /* sets i to 5 */
</pre>(The obvious questions are,
``if you want to print <TT>i</TT>,
or set it to 5,
why not just <em>do</em> it?
Why mess around with this `pointer' thing?''
More on that in a minute.)
</p><p>The unary <TT>&amp;</TT> and <TT>*</TT> operators are complementary.
Given an object (i.e. a variable),
<TT>&amp;</TT> generates a pointer to it;
given a pointer, <TT>*</TT> ``returns'' the value of the pointed-to object.
``Returns'' is in quotes because,
as you may have noticed in the examples,
you're not restricted to fetching values via pointers:
you can also store values via pointers.
In an assignment like
<pre>    *ip = 0;
</pre>the subexpression <TT>*ip</TT> is conceptually
``replaced'' by the object which <TT>ip</TT> points to,
and since <TT>*ip</TT> appears on the left-hand side of the
assignment operator,
what happens to the pointed-to object is that it gets assigned to.
</p><p>One of the things that's hard about pointers is simply talking
about what's going on.
We've been using the words ``return'' and ``replace'' in quotes,
because they don't quite reflect what's actually going on,
and we've been using clumsy locutions like
``fetch via pointers'' and ``store via pointers.''
There is some jargon for referring to pointer use;
one word you'll often see is <dfn>dereference</dfn>,
a term which,
though its derivation is suspect,
is used to mean
``follow a pointer to get at, and use, the object it points to.''
Thus, we sometimes call unary <TT>*</TT>
the ``pointer dereferencing operator,''
and we may say that the expressions
<pre>    printf("%d\n", *ip);
</pre>and
<pre>    *ip = 5;
</pre>both ``dereference the pointer <TT>ip</TT>.''
We may also talk about <dfn>indirecting</dfn> on a pointer:
to <dfn>indirect</dfn> on a pointer
is again to follow it to see what it points to;
and <TT>*</TT> may also be called the ``pointer indirection operator.''
</p><p>Our examples of pointers so far have been,
admittedly,
artificial and rather meaningless.
Let's try a slightly more realistic example.
In the previous chapter,
we used the routines <TT>atoi</TT> and <TT>atof</TT>
to convert strings representing numbers to the actual numbers represented.
Often the strings were typed by the user,
and read with <TT>getline</TT>.
As you may have noticed,
neither <TT>atoi</TT> nor <TT>atof</TT> does any validity or error checking:
both simply stop reading
when they reach a character
that can't be part of the number they're converting,
and if there aren't <em>any</em> numeric characters in the string,
they simply return 0.
(For example, <TT>atoi("49er")</TT> is 49,
and <TT>atoi("three")</TT> is 0,
and <TT>atof("1.2.3")</TT> is 1.2.)
These attributes make <TT>atoi</TT> and <TT>atof</TT>
easy to write and easy
(for the programmer)
to use,
but they are not the most user-friendly routines possible.
A good user interface would warn the user
and prompt again
in case of invalid, non-numeric input.
</p><p>Suppose we were writing a simple inventory-control system.
For each part stored in our warehouse,
we might record the part number,
location,
and number of parts on hand.
For simplicity,
we'll assume that the location is always a simple bin number.
</p><p>Somewhere in the inventory-control program,
we might find the variables
<pre>    int part_number;
    int location;
    int number_on_hand;
</pre>and there might be a routine that lets the user enter any of these numbers.
Suppose that there is another variable,
<pre>    int which_entry;
</pre>which indicates which of the three numbers is being entered
(1 for <TT>part_number</TT>,
2 for <TT>location</TT>,
or 3 for <TT>number_on_hand</TT>).
We might have code like this:
<pre>    char instring[30];
<br>
<br>
    switch (which_entry) {
    case 1:
        printf("enter part number:\n");
        getline(instring, 30);
        part_number = atoi(instring);
        break;
<br>
<br>
    case 2:
        printf("enter location:\n");
        getline(instring, 30);
        location = atoi(instring);
        break;
<br>
<br>
    case 3:
        printf("enter number on hand:\n");
        getline(instring, 30);
        number_on_hand = atoi(instring);
        break;
    }
</pre>Suppose that we now begin to add
a bit of rudimentary verification to the input routines.
The first case might look like
<pre>    case 1:
        do {
            printf("enter part number:\n");
            getline(instring, 30);
            if(!isdigit(instring[0]))
                continue;
            part_number = atoi(instring);
        } while (part_number == 0);
        break;
</pre>If the first character is not a digit,
or if <TT>atoi</TT> returns 0,
the code
goes around the loop another time,
and prompts the user again,
in hopes that the user will type some proper numeric input this time.
(The
tests
for numeric input
are not sufficient,
nor even wise if 0 is a possible input value,
as it presumably is for number on hand.
In fact, the two tests really do the same thing!
But please overlook these faults.
If you're curious,
you can learn about a new ANSI function, <TT>strtol</TT>,
which is like <TT>atoi</TT> but gives you a bit more control,
and would be a better routine to use here.)
</p><p>The code fragment above is for just one of the three input cases.
The obvious way to perform the same checking
for the other two cases
would be to repeat the same code two more times,
changing the prompt string
and the name of the variable assigned to
(<TT>location</TT> or <TT>number_on_hand</TT> instead of <TT>part_number</TT>).
Duplicating the code is a nuisance,
though,
especially if we later come up with a better way to do input verification
(perhaps one not suffering from the imperfections mentioned above).
Is there a better way?
</p><p>One way would be to use a temporary variable in the input loop,
and then set one of the three real variables
to the value of the temporary variable,
depending on <TT>which_entry</TT>:
<pre>    int temp;
<br>
<br>
    do {
        printf("enter the number:\n");
        getline(instring, 30);
        if(!isdigit(instring[0]))
            continue;
        temp = atoi(instring);
    } while (temp == 0);
<br>
<br>
    switch (which_entry) {
    case 1:
        part_number = temp;
        break;
<br>
<br>
    case 2:
        location = temp;
        break;
<br>
<br>
    case 3:
        number_on_hand = temp;
        break;
    }
</pre></p><p>Another way, however,
would be to use a <em>pointer</em>
to keep track of which variable we're setting.
(In this example, we'll also get the prompt right.)
<pre>    char instring[30];
    int *numpointer;
    char *prompt;
<br>
<br>
    switch (which_entry) {
    case 1:
        numpointer = &amp;part_number;
        prompt = "part number";
        break;
<br>
<br>
    case 2:
        numpointer = &amp;location;
        prompt = "location";
        break;
<br>
<br>
    case 3:
        numpointer = &amp;number_on_hand;
        prompt = "number on hand";
        break;
    }
<br>
<br>
    do {
        printf("enter %s:\n", prompt);
        getline(instring, 30);
        if(!isdigit(instring[0]))
            continue;
        *numpointer = atoi(instring);
    } while (*numpointer == 0);
</pre>The idea here is that
<TT>prompt</TT> is the prompt string
and
<TT>numpointer</TT> points to the
particular numeric value we're entering.
That way, a single input verification loop can
print any of the three prompts
and
set any of the
three numeric variables,
depending on where <TT>numpointer</TT> points.
(We
won't officially see
character pointers and strings until section 5.5,
so don't worry if
the use of the <TT>prompt</TT> pointer seems
new or inexplicable.)
</p><p>This example is, in its own ways, quite artificial.
(In a real inventory-control program,
we'd obviously need to keep track of many parts;
we couldn't use single variables for the part number, location, and quantity.
We probably wouldn't really have a <TT>which_entry</TT> variable
telling us which number to prompt for,
and we'd do the numeric validation quite differently.
We might well do numeric entry and validation in a separate function,
removing this need for the pointers.)
However,
the pointer aspect of this
example--using
a pointer to refer to one of several different things,
so that one generic piece of code can access any of the
things--is
a very typical
(i.e. realistic)
use of pointers.
</p><p>There's one nuance of pointer declarations which deserves mention.
We've seen that
<pre>    int *ip;
</pre>declares the variable <TT>ip</TT> as a pointer to an <TT>int</TT>.
We might look at that declaration and imagine that
<TT>int *</TT> is the type
and <TT>ip</TT> is the name of the variable being declared.
(Actually, so far, these assumptions are both true.)
We might therefore imagine that a more ``obvious'' way of writing
the declaration would be
<pre>    int* ip;
</pre>This would work,
but it is misleading,
as we'll see if we try to declare two <TT>int</TT> pointers at once.
How shall we do it?
If we try
<pre>    int* ip1, ip2;      /* WRONG */
</pre>we don't succeed;
this would declare <TT>ip1</TT> as a pointer-to-<TT>int</TT>,
but <TT>ip2</TT> as an <TT>int</TT>
(not a pointer).
The correct declaration for two pointers is
<pre>    int *ip1, *ip2;
</pre>As the authors said in the middle of page 94,
the intent of pointer
(and in fact all)
declarations is that they give little miniature expressions
indicating what type
a certain use
of the variables
will have.
The declaration
<pre>    int *ip;
</pre>doesn't so much say that <TT>ip</TT> is a pointer-to-<TT>int</TT>;
it says that <TT>*ip</TT> is an <TT>int</TT>.
(To be sure, <TT>ip</TT> <em>is</em> a pointer-to-<TT>int</TT>.)
In the declaration
<pre>    int *ip1, *ip2;
</pre>both <TT>*ip1</TT> and <TT>*ip2</TT> are <TT>int</TT>s;
so <TT>ip1</TT> and <TT>ip2</TT> are both pointers-to-<TT>int</TT>.
You'll hear this aspect of C declarations referred to as
``declaration mimics use.''
If it bothers you,
or if you think you might accidentally write things like
<pre>    int *ip1, ip2;
</pre>then
to stay on the safe side
you might want to get in the habit of writing declarations
on separate lines:
<pre>    int *ip1;
    int *ip2;
</pre></p><p>I promised to point out the safe techniques for ensuring that
pointers always point where they should.
The examples in this section,
which have all involved pointers pointing to single variables,
are relatively safe;
a single variable is not a very risky thing to point to,
so code like the examples in this section is relatively
unlikely to go awry and result in invalid pointers.
(One potential problem, though,
which we'll talk more about later,
is that since local, ``automatic'' variables
are automatically deallocated when the function containing them returns,
any pointer to a local variable also becomes invalid.
Therefore, a function which returns a pointer
must never return a pointer to one of its own local variables,
and it would also be invalid to take
a pointer to a local variable
and assign it
to a global pointer variable.)
</p><hr>
<p>
Read sequentially:
<a href="sx8.html" rev=precedes>prev</a>
<a href="sx8b.html" rel=precedes>next</a>
<a href="sx8.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1995, 1996
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>