   1 <!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
   2 <!-- This collection of hypertext pages is Copyright 1995, 1996 by Steve Summit. -->
   3 <!-- This material may be freely redistributed and used -->
   4 <!-- but may not be republished or sold without permission. -->
   5 <html>
   6 <head>
   7 <link rev="owner" href="mailto:scs@eskimo.com">
   8 <link rev="made" href="mailto:scs@eskimo.com">
   9 <title>section 5.7: Multi-dimensional Arrays</title>
  10 <link href="sx8f.html" rev=precedes>
  11 <link href="sx8h.html" rel=precedes>
  12 <link href="sx8.html" rev=subdocument>
  13 </head>
  14 <body>
  15 <H2>section 5.7: Multi-dimensional Arrays</H2>
  16
  17 page 111
  18 <p>The <TT>month_day</TT> function is another example of a
  19 function which simulates having multiple return values by using
  20 pointer parameters.
  21 <TT>month_day</TT> is declared as <TT>void</TT>,
  22 so it has no formal return value,
  23 but two of its parameters,
  24 <TT>pmonth</TT> and <TT>pday</TT>,
  25 are pointers,
  26 and it fills in the locations
  27 pointed to by these two pointers
  28 with the two values it wants to ``return.''
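</p><p>A minimal sketch of the technique, modeled on the book's function:
the <TT>daytab</TT> table is repeated here only so that the fragment stands alone,
and <TT>yearday</TT> is assumed to be valid for the given year.</p>

```c
/* Days per month; row 0 is for ordinary years, row 1 for leap years.
   Column 0 is unused so that months can be numbered 1..12. */
static char daytab[2][13] = {
    {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
    {0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
};

/* month_day: set month and day from day of year.
   The two results are stored through the pointers pmonth and pday.
   Assumes yearday is in range for the year (1..365 or 1..366). */
void month_day(int year, int yearday, int *pmonth, int *pday)
{
    int i;
    int leap = (year%4 == 0 && year%100 != 0) || year%400 == 0;

    for (i = 1; yearday > daytab[leap][i]; i++)
        yearday -= daytab[leap][i];
    *pmonth = i;        /* first "return value" */
    *pday = yearday;    /* second "return value" */
}
```

<p>A caller passes the addresses of two of its own variables,
as in <TT>month_day(1988, 60, &amp;m, &amp;d)</TT>,
after which <TT>m</TT> is 2 and <TT>d</TT> is 29.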
  29 One line of the definition of <TT>month_day</TT> on page 111
  30 is cut off in all printings I have seen:
  31 it should read
  32 <pre>   void month_day(int year, int yearday, int *pmonth, int *pday)
  33 </pre></p><p>As we've said,
  34 although any nonzero value is considered ``true'' in C,
  35 the built-in relational and Boolean operators
  36 always ``return'' 0 or 1.
  37 Therefore,
  38 the line
  39 <pre>   int leap = year%4 == 0 &amp;&amp; year%100 != 0 || year%400 == 0;
  40 </pre>sets <TT>leap</TT> to 1 or 0
  41 (``true'' or ``false'')
  42 depending on the condition
  43 <pre>   year%4 == 0 &amp;&amp; year%100 != 0 || year%400 == 0
  44 </pre>which is the condition for leap years in the Gregorian calendar.
  45
  46 (It's a little-known fact that century years are not leap years
  47 unless they are also divisible by 400.
  48 Thus, 2000 <em>was</em> a leap year, although 1900 was not.)
  49 The 1/0 value that <TT>leap</TT> receives
  50 is what
  51 the authors are referring to
  52 when they say that
  53 ``the arithmetic value of a logical expression... can
  54 be used as a subscript of the array <TT>daytab</TT>.''
  55 This line could also have been written
  56 <pre>   int leap;
  57         if (year%4 == 0 &amp;&amp; year%100 != 0 || year%400 == 0)
  58                 leap = 1;
  59         else
  60                 leap = 0;
  61 </pre>or
  62 <pre>   int leap = (year%4 == 0 &amp;&amp; year%100 != 0 || year%400 == 0) ? 1 : 0;
  63 </pre></p><p>page 112
  64 </p><p>The <TT>daytab</TT> array holds small integers (in the range 0-31),
  65 so it can legally be made an array of <TT>char</TT>,
  66 though whether this is a legitimate use is a question of style.
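</p><p>To see the 0-or-1 value of the leap-year test used directly as a subscript
into a <TT>char</TT> table, here is a small sketch.
The helper <TT>days_in_month</TT> is hypothetical (it is not in the book),
and it assumes that <TT>month</TT> is in the range 1 to 12.</p>

```c
/* Same layout as the book's daytab: the values are small, so char suffices. */
static char daytab[2][13] = {
    {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
    {0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
};

/* days_in_month: hypothetical helper; month is assumed to be 1..12. */
int days_in_month(int year, int month)
{
    /* The logical expression yields exactly 0 or 1 ... */
    int leap = (year%4 == 0 && year%100 != 0) || year%400 == 0;

    /* ... so it can select row 0 or row 1 of the table. */
    return daytab[leap][month];
}
```

<p>With this sketch, <TT>days_in_month(1900, 2)</TT> is 28
and <TT>days_in_month(2000, 2)</TT> is 29.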
  67 </p><p>Deep sentence:
  68 <blockquote>In C,
  69 a two-dimensional array is really a one-dimensional array,
  70 each of whose elements is an array.
  71 </blockquote>Earlier we said that ``array-of-<I>type</I> is another type,''
  72 and here we must believe it:
  73 since array-of-<I>type</I> is a type,
  74 array-of-(array-of-<I>type</I>) is yet another type.
  75 </p><p>The statement that
  76 ``Elements are stored by rows,
  77 so the rightmost subscript, or column,
  78 varies fastest as elements are accessed in storage order''
  79 probably won't make much sense
  80 unless you've done a lot of work with other languages,
  81 such as FORTRAN,
  82 which do have true multi-dimensional arrays.
  83 It's pretty arbitrary what you call a ``row'' and what you call a ``column'';
  84 the most important thing to know is which subscript goes with which dimension.
  85 If you have
  86 <pre>   int a[10][20];
  87 </pre>then in the reference <TT>a[i][j]</TT>,
  88 <TT>i</TT> can range from 0 to 9
  89 and <TT>j</TT> can range from 0 to 19.
  90 In other words, you might write
  91 <pre>   for (i = 0; i &lt; 10; i++)
  92                 for (j = 0; j &lt; 20; j++)
  93                         <I>do something with</I> a[i][j]
  94 </pre></p><p>We also want to know what <TT>a</TT> actually is.
  95 Is it an array of 10 arrays, each of size 20,
  96 or is it an array of 20 arrays, each of size 10?
  97 There are other ways of convincing ourselves of the answer,
  98 but for now let's just say that the dimension written closest to the name
  99 <TT>a</TT> is the one that describes what <TT>a</TT> itself is.
 100 Therefore, <TT>a</TT> is first an array of size 10,
 101 and what it's an array of is arrays of 20 <TT>int</TT>s.
 102 This also tells us that if we ever refer to <TT>a[i]</TT>
 103 (without a second subscript),
 104 then we're referring to just one of those 10 arrays
 105 (of size 20)
 106 in its entirety.
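</p><p>One of those ``other ways'' is to ask the compiler, using <TT>sizeof</TT>.
The following is only a sketch;
the function names are made up for illustration.</p>

```c
#include <stddef.h>

int a[10][20];

/* Number of elements of a itself (each element is an array of 20 ints). */
size_t outer_count(void)
{
    return sizeof a / sizeof a[0];
}

/* Number of ints in one element a[i]. */
size_t inner_count(void)
{
    return sizeof a[0] / sizeof a[0][0];
}

/* a[i] alone names one whole inner array; used as a value here,
   it decays to a pointer to that array's first int. */
int *row_start(int i)
{
    return a[i];
}
```

<p>Here <TT>outer_count()</TT> yields 10 and <TT>inner_count()</TT> yields 20,
which matches the reading that <TT>a</TT> is an array of 10 arrays of 20 <TT>int</TT>s.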
 107 </p><p>When we look back
 108 at the initialization of the <TT>daytab</TT> array on page 111,
 109 everything lines up.
 110 <TT>daytab</TT> is defined as
 111 <pre>   char daytab[2][13]
 112 </pre>and we can see from the initializer that there are two (sub)arrays,
 113 each of size 13.
 114 (We can also see that there is some justification
 115 for saying that the first subscript refers to ``rows''
 116 and the second to ``columns.'')
 117 </p><p>The authors illustrate one way of dealing with C's 0-based
 118 arrays when you have an algorithm that really wants to treat an
 119 array as if it were 1-based.
 120 Here, rather than remembering
 121 to subtract one from the 1-based month number each time,
 122 they chose to waste a ``column'' of the array,
 123 and declare it one larger than necessary,
 124 so that they could refer to subscripts from [1] to [12].
 125 </p><p>One last note about the initialization of <TT>daytab</TT>:
 126 you may have seen code in other programming books that kept an
 127 array of the cumulative days of all the months:
 128 <pre>   {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365}
 129 </pre>Precomputing an array like that
 130 might make things a tiny bit easier on the computer
 131 (it wouldn't have to loop through the entire array each time,
 132 as it does in the <TT>day_of_year</TT> function),
 133 but it makes it considerably harder to see what the numbers mean,
 134 and to verify that they are correct.
 135 The simple table of individual month lengths is much clearer,
 136 and if the computer has to do a bit more grunge work,
 137 well, that's what computers are for.
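</p><p>If the cumulative numbers are ever wanted,
the computer can derive them from the clear table.
A minimal sketch, with the hypothetical names <TT>month_len</TT> and
<TT>fill_cumulative</TT> (neither is from the book),
and covering only an ordinary, non-leap year:</p>

```c
/* Days per month in an ordinary (non-leap) year; element 0 is unused. */
static const char month_len[13] =
    {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

/* fill_cumulative: hypothetical helper.  Sets cum[m] to the number of
   days in months 1..m, so cum[0] is 0 and cum[12] is 365. */
void fill_cumulative(int cum[13])
{
    int m;

    cum[0] = 0;
    for (m = 1; m <= 12; m++)
        cum[m] = cum[m-1] + month_len[m];
}
```

<p>The result can also be compared against a hand-written cumulative table,
which is one way of checking such a table without doing the sums by hand.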
  138 As explained in <I>The Elements of Programming Style</I>, by Brian Kernighan and P. J. Plauger:
 139
 140 <blockquote>A cumulative table of days must be calculated by someone
 141 and checked by someone else.
 142 Since few people are familiar
 143 with the number of days up to the end of a particular month,
 144 neither writing nor checking is easy.
 145 But if instead we use a table of days per month,
 146 we can let the computer count them for us.
 147 (``Let the machine do the dirty work.'')
 148 </blockquote></p><p>The bottom of page 112 begins to get confusing.
 149 The ``number of rows'' of an array like <TT>daytab</TT>
 150 ``is irrelevant'' when passed to a function such as the
 151 hypothetical <TT>f</TT> because the compiler doesn't need to
 152 know the number of rows when calculating subscripts.
 153 It does need to know the number of columns or ``width,''
 154 because that's how it knows that the second element on the
  155 second row of a 10-column array is actually 11 cells past the
  156 beginning of the array (one full row of 10, plus 1), which is essentially what it needs to
 157 know when it goes off and actually accesses the array in memory.
 158 But it doesn't need to know how long the overall array is,
 159 as long as we promise not to run off the end of it,
 160 and that's always up to us.
 161 (This is why we haven't specified the array sizes
 162 in the definitions of functions such as
 163 <TT>getline</TT> on pages 29 and 69,
 164 or <TT>atoi</TT> on pages 43, 61, and 73,
 165 or <TT>readlines</TT> on page 109,
 166 although we did carry the array size as a separate argument
 167 to <TT>getline</TT> and <TT>readlines</TT>,
 168 to assist us in our promise not to run off the end.)
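</p><p>The subscript arithmetic can be imitated by hand on a flat,
one-dimensional array.
This is only a sketch of the idea, not a description of any particular compiler;
<TT>cell</TT> and its parameter names are hypothetical.</p>

```c
/* cell: hypothetical helper that mimics the subscript arithmetic for
   a two-dimensional array laid out row by row in flat storage.
   ncols is the width; the number of rows is never needed.
   The caller promises that i * ncols + j is within the storage. */
int cell(const int *flat, int ncols, int i, int j)
{
    return flat[i * ncols + j];
}
```

<p>For a 10-column layout, <TT>cell(flat, 10, 1, 1)</TT> reads <TT>flat[11]</TT>:
one full row of 10, plus 1.
Notice that the function has no use for a row count.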
 169 </p><p>The third version
 170 of <TT>f</TT>
 171 on page 112
 172 comes about because of the ``gentle fiction''
 173 involving array parameters.
 174 We learned on page 99
 175 that functions don't really receive arrays as parameters;
  176 they receive pointers
 177 (since any array passed by the caller decayed immediately to a pointer).
 178 On page 39 we wrote a <TT>strlen</TT> function as
 179 <pre>   int strlen(char s[])
 180 </pre>but on page 99 we rewrote it as
 181 <pre>   int strlen(char *s)
 182 </pre>which is closer to the way the compiler sees the situation.
 183 (In fact, when we write <TT>int strlen(char s[])</TT>,
 184 the compiler essentially rewrites it as
 185 <TT>int strlen(char *s)</TT> for us.)
 186 In the same way,
 187 a function declared as
 188 <pre>   f(int daytab[][13])
 189 </pre>can be rewritten by us
 190 (or if not, is rewritten by the compiler)
 191 to
 192 <pre>   f(int (*daytab)[13])
 193 </pre>which declares the <TT>daytab</TT> parameter as a
 194 pointer-to-array-of-13-<TT>int</TT>s.
 195 Here we see two things:
 196 (1)
 197 the rewrite
 198 which changes an array parameter to a pointer parameter
 199 happens only once
 200 (we end up with a pointer to an array,
 201 not a pointer to a pointer),
 202 and
 203 (2)
 204 the syntax for pointers to arrays is a bit messy,
 205 because of some required extra parentheses,
 206 as explained in the text.
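</p><p>A short sketch may make the equivalence concrete.
The function <TT>total_days</TT> is hypothetical,
and it uses an <TT>int</TT> table to match the declarations of <TT>f</TT> above.</p>

```c
/* These three declarations all describe the same function;
   the compiler treats the first two as if they were the third. */
int total_days(int daytab[2][13], int row);
int total_days(int daytab[][13], int row);
int total_days(int (*daytab)[13], int row);

/* total_days: hypothetical helper; sums months 1..12 of one row. */
int total_days(int (*daytab)[13], int row)
{
    int m, sum = 0;

    for (m = 1; m <= 12; m++)
        sum += daytab[row][m];    /* subscripting works as usual */
    return sum;
}
```

<p>Without the parentheses, <TT>int *daytab[13]</TT> would instead declare
an array of 13 pointers to <TT>int</TT>,
which is a different type altogether.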
 207 </p><p>If this seems obscure, don't worry about it too much;
 208 just declare functions with array parameters matching the
 209 arrays you call them with,
 210 like
 211 <pre>   f(int daytab[2][13])
 212 </pre>and let the compiler worry about the rewriting.
 213 </p><p>Deep sentence:
 214 <blockquote>More generally,
 215 only the first dimension (subscript) of an array is free;
 216 all the others have to be specified.
 217 </blockquote>This just says what we said already:
 218 when declaring an array as a function parameter,
 219 you can leave off the first dimension
 220 because it is the overall length
 221 and not knowing it causes no immediate problems
 222 (unless you accidentally go off the end).
 223 But the compiler always needs to know the other dimensions,
  224 so that it knows how the rows and columns line up.
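</p><p>The same rule extends past two dimensions.
In the following sketch (the name <TT>sum_blocks</TT> and the sizes 3 and 4
are invented for illustration),
only the first dimension is left open,
and it is passed separately as <TT>n</TT>.</p>

```c
/* sum_blocks: hypothetical helper.  The first dimension is left open
   and supplied by the caller as n; the 3 and the 4 must be spelled out
   so that the compiler can locate cube[i][j][k]. */
long sum_blocks(int cube[][3][4], int n)
{
    int i, j, k;
    long sum = 0;

    for (i = 0; i < n; i++)
        for (j = 0; j < 3; j++)
            for (k = 0; k < 4; k++)
                sum += cube[i][j][k];
    return sum;
}
```

<p>The same function accepts an <TT>int [2][3][4]</TT> or an <TT>int [5][3][4]</TT>;
only the value the caller passes for <TT>n</TT> changes.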
 225 </p><hr>
 226 <p>
 227 Read sequentially:
 228 <a href="sx8f.html" rev=precedes>prev</a>
 229 <a href="sx8h.html" rel=precedes>next</a>
 230 <a href="sx8.html" rev=subdocument>up</a>
 231 <a href="top.html">top</a>
 232 </p>
 233 <p>
 234 This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
 235 // <a href="copyright.html">Copyright</a> 1995, 1996
 236 // <a href="mailto:scs@eskimo.com">mail feedback</a>
 237 </p>
 238 </body>
 239 </html>