<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
<!-- This collection of hypertext pages is Copyright 1995-7 by Steve Summit. -->
<!-- This material may be freely redistributed and used -->
<!-- but may not be republished or sold without permission. -->
<html>
<head>
<link rev="owner" href="mailto:scs@eskimo.com">
<link rev="made" href="mailto:scs@eskimo.com">
<title>16.6: Formatted Input (<TT>scanf</TT>)</title>
<link href="sx2e.html" rev=precedes>
<link href="sx2g.html" rel=precedes>
<link href="sx2.html" rev=subdocument>
</head>
<body>
<H2>16.6: Formatted Input (<TT>scanf</TT>)</H2>
<p>Just as <TT>putchar</TT> has its <TT>getchar</TT>
and <TT>fputs</TT> has its <TT>fgets</TT>,
there's an input analog to <TT>printf</TT>,
namely <TT>scanf</TT>.
<TT>scanf</TT> reads characters
from standard input,
under control of a format string,
perhaps converting some components of the string
and storing them into variables.
For example,
just as you could use the call
<pre>
printf("(%d, %d)", x, y);
</pre>
to print two integer values and some surrounding punctuation,
you could use the call
<pre>
scanf("(%d, %d)", &amp;x, &amp;y);
</pre>
to attempt to extract two integer values
from some input containing similar punctuation.
</p><p><TT>scanf</TT> interprets a format string,
much like <TT>printf</TT>,
with the first difference being
that <TT>scanf</TT> attempts to read characters
and match them against the format string,
rather than printing under control of the format string.
For each ordinary character in the format string,
<TT>scanf</TT> expects to see that character on the input;
if not, it fails.
For each format specifier in the format string,
<TT>scanf</TT> attempts to match and convert
a string appropriate to the format specifier,
storing the converted result into a variable
pointed to by the corresponding argument.
If it can't find any characters matching the format specifier,
it fails.
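As a rough sketch of these matching rules,
the hypothetical helper <TT>parse_point</TT> below
(an illustration, not part of the original notes)
applies the <TT>"(%d, %d)"</TT> format to a fixed string.
It uses <TT>sscanf</TT>, described later in this section,
so that the input is known in advance;
literal characters and conversions are matched the same way as by <TT>scanf</TT>.

```c
#include <stdio.h>

/* Hypothetical helper: try to read "(x, y)" from the string s.
   Returns the number of values converted: 2 on success, fewer if
   a literal character or a conversion failed to match. */
int parse_point(const char *s, int *x, int *y)
{
    return sscanf(s, "(%d, %d)", x, y);
}
```

Given <TT>"(3, 4)"</TT> the call converts both values.
Given <TT>"[3, 4]"</TT> the very first literal character fails to match,
so nothing is converted.
Given <TT>"(3 4)"</TT> the 3 is converted,
but the comma in the format does not match the space in the input,
so the second value is never reached.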
</p><p>Since <TT>scanf</TT> ``returns'' many values
(one for each format specifier in the format string),
it must do so using pointers which the caller passes.
For each value to be converted,
the caller passes a pointer to the variable
(or other location)
where <TT>scanf</TT> should write the converted value.
All arguments passed to <TT>scanf</TT> must be pointers.
</p><p>The format strings used by <TT>scanf</TT>
are similar to those used by <TT>printf</TT>,
but there are several differences.
</p><p>The optional <I>width</I>
gives the maximum number of characters to read
while performing the conversion requested by a particular format specifier.
(If there are many adjacent characters which could satisfy
a request--many
digits for one of the numeric conversions,
or many characters for <TT>%s</TT>
conversion--the
<I>width</I> keeps <TT>scanf</TT> from gobbling all of them up at once.)
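One place a <I>width</I> is useful is splitting a run of adjacent digits.
The sketch below assumes a packed date such as <TT>19991231</TT>;
the helper name <TT>parse_packed_date</TT> and the date layout
are inventions for this example.

```c
#include <stdio.h>

/* Hypothetical helper: split a packed date such as "19991231"
   into year, month, and day using explicit field widths.
   Returns the number of fields converted (3 on success). */
int parse_packed_date(const char *s, int *year, int *month, int *day)
{
    return sscanf(s, "%4d%2d%2d", year, month, day);
}
```

Without the widths, the first <TT>%d</TT> would consume all eight digits
and leave nothing for the other two conversions.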
</p><p>There is no equivalent to the <I>precision</I> modifier.
</p><p>If the <TT>*</TT> flag appears,
it indicates that the converted value should be discarded,
not written to a location
pointed to by one
of the pointers in the argument list.
(In other words,
there is no corresponding argument.)
Since <TT>*</TT> is usurped for this function,
there is no way to use a variable field width
from the argument list
with <TT>scanf</TT>.
There are no other <I>flags</I>.
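A small sketch of the <TT>*</TT> flag in use,
assuming input of the form "word number";
the helper name <TT>second_field_as_int</TT> is made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: skip the first whitespace-delimited word
   of line and convert the integer that follows it.
   %*s matches a word but stores nothing, so there is no argument
   for it and it is not counted in the return value. */
int second_field_as_int(const char *line, int *value)
{
    return sscanf(line, "%*s %d", value);
}
```

For the line <TT>"width 80"</TT> the call returns 1, not 2,
because only the <TT>%d</TT> conversion stored a value.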
</p><p>The <I>modifier</I> characters are more significant.
An <TT>h</TT> indicates that the corresponding integer pointer argument
(for <TT>%d</TT>, <TT>%u</TT>, <TT>%o</TT>, or <TT>%x</TT>)
is a <TT>short int *</TT> or <TT>unsigned short int *</TT>.
An <TT>l</TT> indicates that the corresponding integer pointer argument
(for <TT>%d</TT>, <TT>%u</TT>, <TT>%o</TT>, or <TT>%x</TT>)
is a <TT>long int *</TT> or
<TT>unsigned long int *</TT>,
or that the floating-point pointer argument
(for <TT>%e</TT>, <TT>%f</TT>, or <TT>%g</TT>)
is a <TT>double *</TT> rather than a <TT>float *</TT>.
(Similarly,
an <TT>L</TT> indicates a <TT>long double *</TT>.)
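The following sketch pairs each modifier with a pointer of the matching type.
The helper <TT>read_sized</TT> is hypothetical;
what matters is that <TT>%hd</TT>, <TT>%ld</TT>, and <TT>%lf</TT>
line up with <TT>short *</TT>, <TT>long *</TT>, and <TT>double *</TT> respectively.

```c
#include <stdio.h>

/* Hypothetical helper: read a short, a long, and a double from s.
   Each modifier must agree with the type the pointer points to;
   a mismatch is not diagnosed by scanf and leads to wrong results. */
int read_sized(const char *s, short *sv, long *lv, double *dv)
{
    return sscanf(s, "%hd %ld %lf", sv, lv, dv);
}
```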
</p><p>The <TT>%c</TT> format will read more than one character
if an explicit <I>width</I> greater than 1 is specified.
The corresponding argument must be a pointer to enough space
to hold all the characters read.
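A sketch of <TT>%c</TT> with a width,
assuming the input holds at least three characters.
Note that <TT>%c</TT>, unlike <TT>%s</TT>, does not append a <TT>\0</TT>,
so the hypothetical helper <TT>first_three</TT> adds one itself.

```c
#include <stdio.h>

/* Hypothetical helper: copy exactly three characters from s into
   out (which must have room for four) and terminate the result.
   Assumes s holds at least three characters.
   %3c stores the characters but does not append a '\0', so the
   caller adds one. Returns 1 on success, 0 otherwise. */
int first_three(const char *s, char out[4])
{
    int n = sscanf(s, "%3c", out);
    if (n == 1)
        out[3] = '\0';
    return n == 1;
}
```

Because <TT>%c</TT> does not skip whitespace,
a leading space in the input is copied like any other character.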
</p><p>The <TT>%e</TT>, <TT>%f</TT>, and <TT>%g</TT> formats
all read strings in either scientific notation
or conventional decimal fraction <TT>m.n</TT> notation.
(In other words,
the three formats behave
just
the same.)
However,
they assume a <TT>float *</TT> argument
unless the <TT>l</TT> modifier appears,
in which case they expect a <TT>double *</TT>.
(This is in contrast to <TT>printf</TT>,
which accepts either <TT>float</TT> or <TT>double</TT> arguments
for <TT>%e</TT>, <TT>%f</TT>, and <TT>%g</TT>,
due to the default argument promotions.)
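To see that the three formats accept the same notations,
the hypothetical helper <TT>read_three_floats</TT> below
reads one number with each of them.
The arguments are <TT>float *</TT> because no <TT>l</TT> modifier is used.

```c
#include <stdio.h>

/* Hypothetical helper: read three numbers written in different
   notations; %e, %f, and %g each accept any of them.
   The arguments are float *, since no l modifier is used. */
int read_three_floats(const char *s, float *a, float *b, float *c)
{
    return sscanf(s, "%e %f %g", a, b, c);
}
```

Given <TT>"1.5e2 150.0 150"</TT>, all three variables receive the value 150.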
</p><p>The <TT>%i</TT> format
will read a number in decimal, octal, or hexadecimal,
taking a leading <TT>0</TT> to indicate octal
and a leading <TT>0x</TT> (or <TT>0X</TT>) to indicate hexadecimal,
i.e. the same rules as used by C constants.
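A brief sketch of <TT>%i</TT>,
using a hypothetical helper named <TT>parse_any_base</TT>:

```c
#include <stdio.h>

/* Hypothetical helper: convert s with %i, which picks the base
   from the prefix as a C constant would. Returns 1 on success. */
int parse_any_base(const char *s, int *value)
{
    return sscanf(s, "%i", value) == 1;
}
```

The strings <TT>"42"</TT>, <TT>"052"</TT>, and <TT>"0x2A"</TT>
all yield the value 42.
One consequence worth keeping in mind is that a user who types
a leading zero out of habit gets an octal interpretation.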
</p><p>The <TT>%n</TT> format causes the number of characters read so far
(by this call to <TT>scanf</TT>)
to be stored in the integer pointed to by the corresponding argument.
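One use for <TT>%n</TT> is finding out where a conversion stopped.
The helper <TT>leading_int</TT> below is a hypothetical illustration.

```c
#include <stdio.h>

/* Hypothetical helper: convert a leading integer from s and report
   how many characters of s were consumed in doing so.
   %n stores a count but is not itself counted in the return value. */
int leading_int(const char *s, int *value, int *consumed)
{
    return sscanf(s, "%d%n", value, consumed);
}
```

After a successful call,
the unread remainder of the string begins at <TT>s + consumed</TT>.
The count includes any leading whitespace that <TT>%d</TT> skipped.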
</p><p>The <TT>%s</TT> format will read a string,
up to the next whitespace character,
and copy the string,
terminated by a <TT>\0</TT>,
to the corresponding argument,
which must be a <TT>char *</TT>.
The caller must ensure (perhaps by using an explicit <I>width</I>)
that there is enough space to hold the received characters.
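A sketch of bounding <TT>%s</TT> with a width.
The buffer size of 8 and the helper name <TT>first_word</TT>
are arbitrary choices for the example;
the point is that the width is one less than the buffer size,
leaving room for the <TT>\0</TT>.

```c
#include <stdio.h>

/* Hypothetical helper: copy the first word of s into word, which
   must have room for 8 chars. The width 7 leaves one byte for the
   terminating '\0' that %s appends. Returns 1 on success. */
int first_word(const char *s, char word[8])
{
    return sscanf(s, "%7s", word) == 1;
}
```

A word longer than seven characters is cut off at the width
rather than overflowing the buffer.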
</p><p><TT>scanf</TT> has a special format specifier <TT>%[</TT>...<TT>]</TT>,
which matches any string composed of characters specified in the <TT>[]</TT>.
For example,
<TT>%[abc]</TT>
would match any string composed of a's, b's, and c's.
The corresponding argument is a <TT>char *</TT>;
the matched string is written to the location pointed to,
followed by a <TT>\0</TT>.
The caller must ensure
(perhaps by using an explicit <I>width</I>)
that there is enough space to hold the received characters.
A second form,
<TT>%[^</TT>...<TT>]</TT>,
matches a string of characters <em>not</em> found in the set.
For example,
<TT>scanf("(%[^)])", s)</TT> reads, into the string <TT>s</TT>,
a string of characters (possibly including whitespace)
from an input in which the string appears enclosed in parentheses.
It may also be possible to specify ranges of characters
(e.g. <TT>%[a-z]</TT>, <TT>%[0-9]</TT>, etc.),
but these are not as portable.
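The parenthesized example can be combined with a <I>width</I>
to keep the destination from overflowing.
In this sketch the 32-byte buffer and the helper name
<TT>between_parens</TT> are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: extract the text between parentheses in s
   into out, which must have room for 32 chars (31 plus '\0').
   %31[^)] matches up to 31 characters that are not ')'.
   Returns 1 on success, 0 otherwise. */
int between_parens(const char *s, char out[32])
{
    return sscanf(s, "(%31[^)])", out) == 1;
}
```

The set must match at least one character,
so an empty pair of parentheses counts as a failure here.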
</p><p>With the exception of <TT>%c</TT>, <TT>%n</TT>, and <TT>%[</TT>,
all of the conversion specifiers skip any leading whitespace
(spaces, tabs, or newlines)
which might precede the value or string converted.
Also,
any whitespace character in the format string
matches any number of whitespace characters in the input.
Therefore,
the format <TT>"%d %d"</TT>
would match the input <TT>"12 34"</TT>
or <TT>"12&nbsp;&nbsp;&nbsp;34"</TT>
or <TT>"12\t34"</TT>.
However,
the format <TT>"%d%d"</TT> would match all of these inputs as well,
since the second <TT>%d</TT> first
scans past any whitespace preceding the <TT>34</TT>.
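The two hypothetical helpers below sketch both halves of this rule:
<TT>two_ints</TT> relies on <TT>%d</TT> skipping whitespace by itself,
while <TT>int_then_char</TT> needs a space in the format
because <TT>%c</TT> does no skipping of its own.

```c
#include <stdio.h>

/* Hypothetical helper: read two integers with no whitespace in the
   format; each %d skips leading whitespace by itself. */
int two_ints(const char *s, int *a, int *b)
{
    return sscanf(s, "%d%d", a, b);
}

/* Hypothetical helper: read an integer and the next visible
   character. %c does not skip whitespace, so the space in the
   format is what discards any blanks between the two. */
int int_then_char(const char *s, int *n, char *c)
{
    return sscanf(s, "%d %c", n, c);
}
```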
</p><p><TT>scanf</TT> returns the number of items
it successfully converts and stores.
It will return a number less than expected
(less than the number of format specifiers not containing <TT>*</TT>,
or less than the number of corresponding pointer arguments)
if the conversion fails at any point,
and it will leave any unrecognized characters
(i.e. the ones that caused the last match to fail)
waiting in the input for next time.
<TT>scanf</TT> returns <TT>EOF</TT>
if it encounters end-of-file before converting anything.
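The possible return values can be sketched with a helper
that simply passes the count back.
<TT>count_converted</TT> is a made-up name;
with <TT>sscanf</TT>, reaching the end of the string
plays the role of end-of-file.

```c
#include <stdio.h>

/* Hypothetical helper: report how many of two integers could be
   converted from s: 2, 1, 0, or EOF if the input ran out before
   anything was converted. */
int count_converted(const char *s)
{
    int a, b;
    return sscanf(s, "%d %d", &a, &b);
}
```

A caller that needs both values should therefore compare the result
against 2,
not merely test that it is nonzero,
since <TT>EOF</TT> is nonzero as well.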
</p><p>If you want to read characters from an arbitrary stream,
you can use <TT>fscanf</TT>,
which takes an initial <TT>FILE *</TT> argument.
</p><p>You can scan and convert characters from a string
(rather than from a stream)
using <TT>sscanf</TT>.
For example,
<pre>
int x, y;
sscanf("12 34", "%d %d", &amp;x, &amp;y);
</pre>
would place 12 in <TT>x</TT> and 34 in <TT>y</TT>.
</p><p><TT>scanf</TT> and <TT>fscanf</TT> are seductively useful,
but they have a number of drawbacks in practice.
They seem to make it very easy to,
say,
prompt the user for a number:
<pre>
int x;
printf("Type a number:\n");
scanf("%d", &amp;x);
</pre>
But what happens if the user fumbles,
and types something other than a number?
Even if the code checks <TT>scanf</TT>'s return value,
and prompts the user again if <TT>scanf</TT> returns 0,
the non-numeric input remains on the input,
and will be encountered by the next call to <TT>scanf</TT>
unless some other steps are taken.
(That is,
<TT>scanf</TT> will rediscover the user's old, bad input
before it gets to any new input.)
It's also easy to write things like
<pre>
scanf("%d\n", &amp;x);
</pre>
but this code does <em>not</em> work as intended;
the <TT>\n</TT> in the format string is a whitespace character,
which asks <TT>scanf</TT> to discard zero or more whitespace characters,
so it will <em>keep reading</em> characters
as long as they are whitespace characters,
that is,
it will read characters
until it finds something that is not a whitespace character.
It won't consume that eventual non-whitespace character once it finds it,
but in the process of looking for it
it will seem to jam your program,
since the call to <TT>scanf</TT> won't return
right after the user types a number.
</p><p>Therefore,
it's much better to read interactive user input
a line at a time,
and then use functions like <TT>atoi</TT>
(or perhaps <TT>sscanf</TT>)
to interpret the line that the user typed.
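One way this advice might look in code is sketched below.
Both helper names, the 100-byte line buffer, and the wording of the messages
are assumptions made for the example,
and a line longer than the buffer would be seen as several short lines.

```c
#include <stdio.h>

/* Hypothetical helper: interpret one line of text as an integer.
   The trailing " %c" detects leftover non-whitespace characters,
   so a line such as "12abc" is rejected. Returns 1 on success. */
int parse_int_line(const char *line, int *value)
{
    char extra;
    return sscanf(line, "%d %c", value, &extra) == 1;
}

/* Hypothetical helper: prompt until a whole line parses as an
   integer. Each bad line is consumed by fgets, so it cannot be
   rediscovered by the next attempt. Returns 1 on success,
   0 at end of input. */
int prompt_for_int(FILE *in, FILE *out, int *value)
{
    char line[100];

    for (;;) {
        fprintf(out, "Type a number:\n");
        if (fgets(line, sizeof line, in) == NULL)
            return 0;
        if (parse_int_line(line, value))
            return 1;
        fprintf(out, "That was not a number.\n");
    }
}
```

A program would call <TT>prompt_for_int(stdin, stdout, &amp;x)</TT>.
Because <TT>fgets</TT> returns as soon as the user presses return,
the program does not appear to jam,
and because the whole line is removed from the input before it is examined,
a bad line cannot interfere with the next prompt.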
</p><hr>
<p>
Read sequentially:
<a href="sx2e.html" rev=precedes>prev</a>
<a href="sx2g.html" rel=precedes>next</a>
<a href="sx2.html" rev=subdocument>up</a>
<a href="top.html">top</a>
</p>
<p>
This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
// <a href="copyright.html">Copyright</a> 1996-1999
// <a href="mailto:scs@eskimo.com">mail feedback</a>
</p>
</body>
</html>