docs/CodingStandards.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   2                       "http://www.w3.org/TR/html4/strict.dtd">
   3 <html>
   4 <head>
   5   <link rel="stylesheet" href="llvm.css" type="text/css">
   6   <title>LLVM Coding Standards</title>
   7 </head>
   8 <body>
   9
  10 <div class="doc_title">
  11   LLVM Coding Standards
  12 </div>
  13
  14 <ol>
  15   <li><a href="#introduction">Introduction</a></li>
  16   <li><a href="#mechanicalissues">Mechanical Source Issues</a>
  17     <ol>
  18       <li><a href="#sourceformating">Source Code Formatting</a>
  19         <ol>
  20           <li><a href="#scf_commenting">Commenting</a></li>
  21           <li><a href="#scf_commentformat">Comment Formatting</a></li>
  22           <li><a href="#scf_includes"><tt>#include</tt> Style</a></li>
  23           <li><a href="#scf_codewidth">Source Code Width</a></li>
  24           <li><a href="#scf_spacestabs">Use Spaces Instead of Tabs</a></li>
  25           <li><a href="#scf_indentation">Indent Code Consistently</a></li>
  26         </ol></li>
  27       <li><a href="#compilerissues">Compiler Issues</a>
  28         <ol>
  29           <li><a href="#ci_warningerrors">Treat Compiler Warnings Like
  30               Errors</a></li>
  31           <li><a href="#ci_portable_code">Write Portable Code</a></li>
  32           <li><a href="#ci_class_struct">Use of class/struct Keywords</a></li>
  33         </ol></li>
  34     </ol></li>
  35   <li><a href="#styleissues">Style Issues</a>
  36     <ol>
  37       <li><a href="#macro">The High Level Issues</a>
  38         <ol>
  39           <li><a href="#hl_module">A Public Header File <b>is</b> a
  40               Module</a></li>
  41           <li><a href="#hl_dontinclude">#include as Little as Possible</a></li>
  42           <li><a href="#hl_privateheaders">Keep "internal" Headers
  43               Private</a></li>
  44           <li><a href="#hl_earlyexit">Use Early Exits and 'continue' to Simplify
  45               Code</a></li>
  46           <li><a href="#hl_else_after_return">Don't use "else" after a
  47               return</a></li>
  48           <li><a href="#hl_predicateloops">Turn Predicate Loops into Predicate
  49               Functions</a></li>
  50         </ol></li>
  51       <li><a href="#micro">The Low Level Issues</a>
  52         <ol>
  53           <li><a href="#ll_assert">Assert Liberally</a></li>
  54           <li><a href="#ll_ns_std">Do not use 'using namespace std'</a></li>
  55           <li><a href="#ll_virtual_anch">Provide a virtual method anchor for
  56               classes in headers</a></li>
  57           <li><a href="#ll_end">Don't evaluate end() every time through a
  58               loop</a></li>
  59           <li><a href="#ll_iostream"><tt>#include &lt;iostream&gt;</tt> is
  60               <em>forbidden</em></a></li>
  61           <li><a href="#ll_avoidendl">Avoid <tt>std::endl</tt></a></li>
  62           <li><a href="#ll_raw_ostream">Use <tt>raw_ostream</tt></a></li>
  63         </ol></li>
  64
  65       <li><a href="#nano">Microscopic Details</a>
  66         <ol>
  67           <li><a href="#micro_spaceparen">Spaces Before Parentheses</a></li>
  68           <li><a href="#micro_preincrement">Prefer Preincrement</a></li>
  69           <li><a href="#micro_namespaceindent">Namespace Indentation</a></li>
  70           <li><a href="#micro_anonns">Anonymous Namespaces</a></li>
  71         </ol></li>
  72
  73
  74     </ol></li>
  75   <li><a href="#seealso">See Also</a></li>
  76 </ol>
  77
  78 <div class="doc_author">
  79   <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
  80 </div>
  81
  82
  83 <!-- *********************************************************************** -->
  84 <div class="doc_section">
  85   <a name="introduction">Introduction</a>
  86 </div>
  87 <!-- *********************************************************************** -->
  88
  89 <div class="doc_text">
  90
  91 <p>This document attempts to describe a few coding standards that are being used
  92 in the LLVM source tree.  Although no coding standards should be regarded as
  93 absolute requirements to be followed in all instances, coding standards can be
  94 useful.</p>
  95
  96 <p>This document intentionally does not prescribe fixed standards for religious
  97 issues such as brace placement and space usage.  For issues like this, follow
  98 the golden rule:</p>
  99
 100 <blockquote>
 101
 102 <p><b><a name="goldenrule">If you are adding a significant body of source to a
 103 project, feel free to use whatever style you are most comfortable with.  If you
 104 are extending, enhancing, or bug fixing already implemented code, use the style
 105 that is already being used so that the source is uniform and easy to
 106 follow.</a></b></p>
 107
 108 </blockquote>
 109
 110 <p>The ultimate goal of these guidelines is to increase readability and
 111 maintainability of our common source base. If you have suggestions for topics to
 112 be included, please mail them to <a
 113 href="mailto:sabre@nondot.org">Chris</a>.</p>
 114
 115 </div>
 116
 117 <!-- *********************************************************************** -->
 118 <div class="doc_section">
 119   <a name="mechanicalissues">Mechanical Source Issues</a>
 120 </div>
 121 <!-- *********************************************************************** -->
 122
 123 <!-- ======================================================================= -->
 124 <div class="doc_subsection">
 125   <a name="sourceformating">Source Code Formatting</a>
 126 </div>
 127
 128 <!-- _______________________________________________________________________ -->
 129 <div class="doc_subsubsection">
 130   <a name="scf_commenting">Commenting</a>
 131 </div>
 132
 133 <div class="doc_text">
 134
 135 <p>Comments are one critical part of readability and maintainability.  Everyone
 136 knows they should comment, so should you.  When writing comments, write them as
 137 English prose, which means they should use proper capitalization, punctuation,
 138 etc.  Although we all should probably
 139 comment our code more than we do, there are a few critical places where
 140 documentation is very useful:</p>
 141
 142 <b>File Headers</b>
 143
 144 <p>Every source file should have a header on it that describes the basic
 145 purpose of the file.  If a file does not have a header, it should not be
 146 checked into Subversion.  Most source trees will probably have a standard
 147 file header format.  The standard format for the LLVM source tree looks like
 148 this:</p>
 149
 150 <div class="doc_code">
 151 <pre>
 152 //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//
 153 //
 154 //                     The LLVM Compiler Infrastructure
 155 //
 156 // This file is distributed under the University of Illinois Open Source
 157 // License. See LICENSE.TXT for details.
 158 //
 159 //===----------------------------------------------------------------------===//
 160 //
 161 // This file contains the declaration of the Instruction class, which is the
 162 // base class for all of the VM instructions.
 163 //
 164 //===----------------------------------------------------------------------===//
 165 </pre>
 166 </div>
 167
 168 <p>A few things to note about this particular format:  The "<tt>-*- C++
 169 -*-</tt>" string on the first line is there to tell Emacs that the source file
 170 is a C++ file, not a C file (Emacs assumes .h files are C files by default).
 171 Note that this tag is not necessary in .cpp files.  The name of the file is also
 172 on the first line, along with a very short description of the purpose of the
 173 file.  This is important when printing out code and flipping through lots of
 174 pages.</p>
 175
 176 <p>The next section in the file is a concise note that defines the license
 177 that the file is released under.  This makes it perfectly clear what terms the
 178 source code can be distributed under and should not be modified in any way.</p>
 179
 180 <p>The main body of the description does not have to be very long in most cases.
 181 Here it's only two lines.  If an algorithm is being implemented or something
 182 tricky is going on, a reference to the paper where it is published should be
 183 included, as well as any notes or "gotchas" in the code to watch out for.</p>
 184
 185 <b>Class overviews</b>
 186
 187 <p>Classes are one fundamental part of a good object oriented design.  As such,
 188 a class definition should have a comment block that explains what the class is
 189 used for... if it's not obvious.  If it's so completely obvious your grandma
 190 could figure it out, it's probably safe to leave it out.  Naming classes
 191 something sane goes a long way towards avoiding writing documentation.</p>
 192
 193
 194 <b>Method information</b>
 195
 196 <p>Methods defined in a class (as well as any global functions) should also be
 197 documented properly.  A quick note about what it does and a description of the
 198 borderline behaviour is all that is necessary here (unless something
 199 particularly tricky or insidious is going on).  The hope is that people can
 200 figure out how to use your interfaces without reading the code itself... that is
 201 the goal metric.</p>
 202
 203 <p>Good things to talk about here are what happens when something unexpected
 204 happens: does the method return null?  Abort?  Format your hard disk?</p>
 205
 206 </div>
 207
 208 <!-- _______________________________________________________________________ -->
 209 <div class="doc_subsubsection">
 210   <a name="scf_commentformat">Comment Formatting</a>
 211 </div>
 212
 213 <div class="doc_text">
 214
 215 <p>In general, prefer C++ style (<tt>//</tt>) comments.  They take less space,
 216 require less typing, don't have nesting problems, etc.  There are a few cases
 217 when it is useful to use C style (<tt>/* */</tt>) comments however:</p>
 218
 219 <ol>
 220   <li>When writing C code: Obviously if you are writing C code, use C style
 221       comments.</li>
 222   <li>When writing a header file that may be <tt>#include</tt>d by a C source
 223       file.</li>
 224   <li>When writing a source file that is used by a tool that only accepts C
 225       style comments.</li>
 226 </ol>
 227
 228 <p>To comment out a large block of code, use <tt>#if 0</tt> and <tt>#endif</tt>.
 229 These nest properly and are better behaved in general than C style comments.</p>
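<p>As a small sketch of the nesting behavior described above (the function
<tt>computeAnswer</tt> is hypothetical and exists only for illustration), the
outer <tt>#if 0</tt> below disables a region that itself contains another
<tt>#if 0</tt> block and a C style comment, which a surrounding C style
comment could not do:</p>

```cpp
// Hypothetical function, used only to illustrate disabling code.
int computeAnswer() {
  int Result = 42;
#if 0
  // Everything up to the matching #endif is skipped by the preprocessor.
  Result = 0;
#if 0
  Result = -1;  // A nested #if 0 / #endif pair is matched correctly.
#endif
  /* An existing C style comment does not end the disabled region early. */
#endif
  return Result;
}
```

<p>Because the whole region is skipped, the function still returns its
initial value.</p>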
 230
 231 </div>
 232
 233 <!-- _______________________________________________________________________ -->
 234 <div class="doc_subsubsection">
 235   <a name="scf_includes"><tt>#include</tt> Style</a>
 236 </div>
 237
 238 <div class="doc_text">
 239
 240 <p>Immediately after the <a href="#scf_commenting">header file comment</a> (and
 241 include guards if working on a header file), the <a
 242 href="#hl_dontinclude">minimal</a> list of <tt>#include</tt>s required by the
 243 file should be listed.  We prefer these <tt>#include</tt>s to be listed in this
 244 order:</p>
 245
 246 <ol>
 247   <li><a href="#mmheader">Main Module header</a></li>
 248   <li><a href="#hl_privateheaders">Local/Private Headers</a></li>
 249   <li><tt>llvm/*</tt></li>
 250   <li><tt>llvm/Analysis/*</tt></li>
 251   <li><tt>llvm/Assembly/*</tt></li>
 252   <li><tt>llvm/Bytecode/*</tt></li>
 253   <li><tt>llvm/CodeGen/*</tt></li>
 254   <li>...</li>
 255   <li><tt>Support/*</tt></li>
 256   <li><tt>Config/*</tt></li>
 257   <li>System <tt>#includes</tt></li>
 258 </ol>
 259
 260 <p>... and each category should be sorted by name.</p>
 261
 262 <p><a name="mmheader">The "Main Module Header"</a> file applies to .cpp files
 263 which implement an interface defined by a .h file.  This <tt>#include</tt>
 264 should always be included <b>first</b> regardless of where it lives on the file
 265 system.  By including a header file first in the .cpp files that implement the
 266 interfaces, we ensure that the header does not have any hidden dependencies
 267 which are not explicitly #included in the header, but should be.  It is also a
 268 form of documentation in the .cpp file to indicate where the interfaces it
 269 implements are defined.</p>
 270
 271 </div>
 272
 273 <!-- _______________________________________________________________________ -->
 274 <div class="doc_subsubsection">
 275   <a name="scf_codewidth">Source Code Width</a>
 276 </div>
 277
 278 <div class="doc_text">
 279
 280 <p>Write your code to fit within 80 columns of text.  This helps those of us who
 281 like to print out code and look at your code in an xterm without resizing
 282 it.</p>
 283
 284 <p>The longer answer is that there must be some limit to the width of the code
 285 in order to reasonably allow developers to have multiple files side-by-side in
 286 windows on a modest display.  If you are going to pick a width limit, it is
 287 somewhat arbitrary but you might as well pick something standard.  Going with
 288 90 columns (for example) instead of 80 columns wouldn't add any significant
 289 value and would be detrimental to printing out code.  Also many other projects
 290 have standardized on 80 columns, so some people have already configured their
 291 editors for it (vs something else, like 90 columns).</p>
 292
 293 <p>This is one of many contentious issues in coding standards, but is not up
 294 for debate.</p>
 295
 296 </div>
 297
 298 <!-- _______________________________________________________________________ -->
 299 <div class="doc_subsubsection">
 300   <a name="scf_spacestabs">Use Spaces Instead of Tabs</a>
 301 </div>
 302
 303 <div class="doc_text">
 304
 305 <p>In all cases, prefer spaces to tabs in source files.  People have different
 306 preferred indentation levels, and different styles of indentation that they
 307 like... this is fine.  What isn't is that different editors/viewers expand tabs
 308 out to different tab stops.  This can cause your code to look completely
 309 unreadable, and it is not worth dealing with.</p>
 310
 311 <p>As always, follow the <a href="#goldenrule">Golden Rule</a> above: follow the
 312 style of existing code if you are modifying and extending it.  If you like four
 313 spaces of indentation, <b>DO NOT</b> do that in the middle of a chunk of code
 314 with two spaces of indentation.  Also, do not reindent a whole source file: it
 315 makes for incredible diffs that are absolutely worthless.</p>
 316
 317 </div>
 318
 319 <!-- _______________________________________________________________________ -->
 320 <div class="doc_subsubsection">
 321   <a name="scf_indentation">Indent Code Consistently</a>
 322 </div>
 323
 324 <div class="doc_text">
 325
 326 <p>Okay, in your first year of programming you were told that indentation is
 327 important.  If you didn't believe and internalize this then, now is the time.
 328 Just do it.</p>
 329
 330 </div>
 331
 332
 333 <!-- ======================================================================= -->
 334 <div class="doc_subsection">
 335   <a name="compilerissues">Compiler Issues</a>
 336 </div>
 337
 338
 339 <!-- _______________________________________________________________________ -->
 340 <div class="doc_subsubsection">
 341   <a name="ci_warningerrors">Treat Compiler Warnings Like Errors</a>
 342 </div>
 343
 344 <div class="doc_text">
 345
 346 <p>If your code has compiler warnings in it, something is wrong: you aren't
 347 casting values correctly, you have "questionable" constructs in your code, or
 348 you are doing something legitimately wrong.  Compiler warnings can cover up
 349 legitimate errors in output and make dealing with a translation unit
 350 difficult.</p>
 351
 352 <p>It is not possible to prevent all warnings from all compilers, nor is it
 353 desirable.  Instead, pick a standard compiler (like <tt>gcc</tt>) that provides
 354 a good thorough set of warnings, and stick to them.  At least in the case of
 355 <tt>gcc</tt>, it is possible to work around any spurious warnings by changing the
 356 syntax of the code slightly.  For example, a warning that annoys me occurs when
 357 I write code like this:</p>
 358
 359 <div class="doc_code">
 360 <pre>
 361 if (V = getValue()) {
 362   ...
 363 }
 364 </pre>
 365 </div>
 366
 367 <p><tt>gcc</tt> will warn me that I probably want to use the <tt>==</tt>
 368 operator, and that I probably mistyped it.  In most cases, I haven't, and I
 369 really don't want the spurious warnings.  To fix this particular problem, I
 370 rewrite the code like this:</p>
 371
 372 <div class="doc_code">
 373 <pre>
 374 if ((V = getValue())) {
 375   ...
 376 }
 377 </pre>
 378 </div>
 379
 380 <p>...which shuts <tt>gcc</tt> up.  Any <tt>gcc</tt> warning that annoys you can
 381 be fixed by massaging the code appropriately.</p>
 382
 383 <p>These are the <tt>gcc</tt> warnings that I prefer to enable: <tt>-Wall
 384 -Winline -W -Wwrite-strings -Wno-unused</tt></p>
 385
 386 </div>
 387
 388 <!-- _______________________________________________________________________ -->
 389 <div class="doc_subsubsection">
 390   <a name="ci_portable_code">Write Portable Code</a>
 391 </div>
 392
 393 <div class="doc_text">
 394
 395 <p>In almost all cases, it is possible and within reason to write completely
 396 portable code.  If there are cases where it isn't possible to write portable
 397 code, isolate it behind a well defined (and well documented) interface.</p>
 398
 399 <p>In practice, this means that you shouldn't assume much about the host
 400 compiler, including its support for "high tech" features like partial
 401 specialization of templates.  If these features are used, they should only be
 402 an implementation detail of a library which has a simple exposed API.</p>
 403
 404 </div>
 405
 406 <!-- _______________________________________________________________________ -->
 407 <div class="doc_subsubsection">
 408 <a name="ci_class_struct">Use of <tt>class</tt> and <tt>struct</tt> Keywords</a>
 409 </div>
 410 <div class="doc_text">
 411
 412 <p>In C++, the <tt>class</tt> and <tt>struct</tt> keywords can be used almost
 413 interchangeably. The only difference is when they are used to declare a class:
 414 <tt>class</tt> makes all members private by default while <tt>struct</tt> makes
 415 all members public by default.</p>
 416
 417 <p>Unfortunately, not all compilers follow the rules and some will generate
 418 different symbols based on whether <tt>class</tt> or <tt>struct</tt> was used to
 419 declare the symbol.  This can lead to problems at link time.</p>
 420
 421 <p>So, the rule for LLVM is to always use the <tt>class</tt> keyword, unless
 422 <b>all</b> members are public and the type is a C++ "POD" type, in which case
 423 <tt>struct</tt> is allowed.</p>
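<p>A minimal sketch of this rule, using hypothetical <tt>Point</tt> and
<tt>Counter</tt> types that are not part of LLVM: the first type has only
public data members and is a POD, so <tt>struct</tt> is acceptable; the second
has private state, so it is declared with <tt>class</tt>.</p>

```cpp
// Hypothetical POD type with only public data members: 'struct' is allowed.
struct Point {
  int X;
  int Y;
};

// Hypothetical type with private state, so it is declared with 'class'.
class Counter {
  unsigned Count;  // Private by default in a class.
public:
  Counter() : Count(0) {}
  void increment() { ++Count; }
  unsigned getCount() const { return Count; }
};
```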
 424
 425 </div>
 426
 427 <!-- *********************************************************************** -->
 428 <div class="doc_section">
 429   <a name="styleissues">Style Issues</a>
 430 </div>
 431 <!-- *********************************************************************** -->
 432
 433
 434 <!-- ======================================================================= -->
 435 <div class="doc_subsection">
 436   <a name="macro">The High Level Issues</a>
 437 </div>
 438 <!-- ======================================================================= -->
 439
 440
 441 <!-- _______________________________________________________________________ -->
 442 <div class="doc_subsubsection">
 443   <a name="hl_module">A Public Header File <b>is</b> a Module</a>
 444 </div>
 445
 446 <div class="doc_text">
 447
 448 <p>C++ doesn't do too well in the modularity department.  There is no real
 449 encapsulation or data hiding (unless you use expensive protocol classes), but it
 450 is what we have to work with.  When you write a public header file (in the LLVM
 451 source tree, they live in the top level "include" directory), you are defining a
 452 module of functionality.</p>
 453
 454 <p>Ideally, modules should be completely independent of each other, and their
 455 header files should only include the absolute minimum number of headers
 456 possible. A module is not just a class, a function, or a namespace: <a
 457 href="http://www.cuj.com/articles/2000/0002/0002c/0002c.htm">it's a collection
 458 of these</a> that defines an interface.  This interface may be several
 459 functions, classes or data structures, but the important issue is how they work
 460 together.</p>
 461
 462 <p>In general, a module should be implemented with one or more <tt>.cpp</tt>
 463 files.  Each of these <tt>.cpp</tt> files should include the header that defines
 464 their interface first.  This ensures that all of the dependencies of the module
 465 header have been properly added to the module header itself, and are not
 466 implicit.  System headers should be included after user headers for a
 467 translation unit.</p>
 468
 469 </div>
 470
 471 <!-- _______________________________________________________________________ -->
 472 <div class="doc_subsubsection">
 473   <a name="hl_dontinclude"><tt>#include</tt> as Little as Possible</a>
 474 </div>
 475
 476 <div class="doc_text">
 477
 478 <p><tt>#include</tt> hurts compile time performance.  Don't do it unless you
 479 have to, especially in header files.</p>
 480
 481 <p>But wait, sometimes you need to have the definition of a class to use it, or
 482 to inherit from it.  In these cases go ahead and <tt>#include</tt> that header
 483 file.  Be aware however that there are many cases where you don't need to have
 484 the full definition of a class.  If you are using a pointer or reference to a
 485 class, you don't need the header file.  If you are simply returning a class
 486 instance from a prototyped function or method, you don't need it.  In fact, for
 487 most cases, you simply don't need the definition of a class... and not
 488 <tt>#include</tt>'ing speeds up compilation.</p>
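<p>The cases above can be sketched with a forward declaration.  The names
<tt>Widget</tt>, <tt>getWidgetSize</tt> and <tt>makeWidget</tt> are
hypothetical, and everything is shown in one file only so that the example is
self-contained; in a real module the class definition would live in its own
header, which only the implementation file would need to <tt>#include</tt>.</p>

```cpp
// Forward declaration: no header for Widget is needed to declare functions
// that take a pointer to it or that return it by value.
class Widget;
unsigned getWidgetSize(const Widget *W);
Widget makeWidget(unsigned Size);

// The full definition is only required where members are accessed.  In a real
// project this would come from the header that defines Widget.
class Widget {
public:
  unsigned Size;
};

unsigned getWidgetSize(const Widget *W) { return W ? W->Size : 0; }

Widget makeWidget(unsigned Size) {
  Widget W;
  W.Size = Size;
  return W;
}
```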
 489
 490 <p>It is easy to go overboard on this recommendation, however.  You
 491 <b>must</b> include all of the header files that you are using -- you can
 492 include them either directly
 493 or indirectly (through another header file).  To make sure that you don't
 494 accidentally forget to include a header file in your module header, make sure to
 495 include your module header <b>first</b> in the implementation file (as mentioned
 496 above).  This way there won't be any hidden dependencies that you'll find out
 497 about later...</p>
 498
 499 </div>
 500
 501 <!-- _______________________________________________________________________ -->
 502 <div class="doc_subsubsection">
 503   <a name="hl_privateheaders">Keep "internal" Headers Private</a>
 504 </div>
 505
 506 <div class="doc_text">
 507
 508 <p>Many modules have a complex implementation that causes them to use more than
 509 one implementation (<tt>.cpp</tt>) file.  It is often tempting to put the
 510 internal communication interface (helper classes, extra functions, etc) in the
 511 public module header file.  Don't do this.</p>
 512
 513 <p>If you really need to do something like this, put a private header file in
 514 the same directory as the source files, and include it locally.  This ensures
 515 that your private interface remains private and undisturbed by outsiders.</p>
 516
 517 <p>Note, however, that it's okay to put extra implementation methods in a public
 518 class itself... just make them private (or protected), and all is well.</p>
 519
 520 </div>
 521
 522 <!-- _______________________________________________________________________ -->
 523 <div class="doc_subsubsection">
 524   <a name="hl_earlyexit">Use Early Exits and 'continue' to Simplify Code</a>
 525 </div>
 526
 527 <div class="doc_text">
 528
 529 <p>When reading code, keep in mind how much state and how many previous
 530 decisions have to be remembered by the reader to understand a block of code.
 531 Aim to reduce indentation where possible when it doesn't make it more difficult
 532 to understand the code.  One great way to do this is by making use of early
 533 exits and the 'continue' keyword in long loops.  As an example of using an early
 534 exit from a function, consider this "bad" code:</p>
 535
 536 <div class="doc_code">
 537 <pre>
 538 Value *DoSomething(Instruction *I) {
 539   if (!isa&lt;TerminatorInst&gt;(I) &amp;&amp;
 540       I-&gt;hasOneUse() &amp;&amp; SomeOtherThing(I)) {
 541     ... some long code ....
 542   }
 543
 544   return 0;
 545 }
 546 </pre>
 547 </div>
 548
 549 <p>This code has several problems if the body of the 'if' is large.  When you're
 550 looking at the top of the function, it isn't immediately clear that this
 551 <em>only</em> does interesting things with non-terminator instructions, and only
 552 applies to things with the other predicates.  Second, it is relatively difficult
 553 to describe (in comments) why these predicates are important because the if
 554 statement makes it difficult to lay out the comments.  Third, when you're deep
 555 within the body of the code, it is indented an extra level.   Finally, when
 556 reading the top of the function, it isn't clear what the result is if the
 557 predicate isn't true; you have to read to the end of the function to know that
 558 it returns null.</p>
 559
 560 <p>It is much preferred to format the code like this:</p>
 561
 562 <div class="doc_code">
 563 <pre>
 564 Value *DoSomething(Instruction *I) {
 565   // Terminators never need 'something' done to them because, ...
 566   if (isa&lt;TerminatorInst&gt;(I))
 567     return 0;
 568
 569   // We conservatively avoid transforming instructions with multiple uses
 570   // because goats like cheese.
 571   if (!I-&gt;hasOneUse())
 572     return 0;
 573
 574   // This is really just here for example.
 575   if (!SomeOtherThing(I))
 576     return 0;
 577
 578   ... some long code ....
 579 }
 580 </pre>
 581 </div>
 582
 583 <p>This fixes these problems.  A similar problem frequently happens in for
 584 loops.  A silly example is something like this:</p>
 585
 586 <div class="doc_code">
 587 <pre>
 588   for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; ++II) {
 589     if (BinaryOperator *BO = dyn_cast&lt;BinaryOperator&gt;(II)) {
 590       Value *LHS = BO-&gt;getOperand(0);
 591       Value *RHS = BO-&gt;getOperand(1);
 592       if (LHS != RHS) {
 593         ...
 594       }
 595     }
 596   }
 597 </pre>
 598 </div>
 599
 600 <p>When you have very small loops, this sort of structure is fine, but if
 601 a loop exceeds 10-15 lines, it becomes difficult for people to read and
 602 understand at a glance.
 603 The problem with this sort of code is that it gets very nested very quickly,
 604 meaning that the reader of the code has to keep a lot of context in their brain
 605 to remember what is going on immediately in the loop, because they don't know
 606 if/when the if conditions will have elses etc.  It is strongly preferred to
 607 structure the loop like this:</p>
 608
 609 <div class="doc_code">
 610 <pre>
 611   for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; ++II) {
 612     BinaryOperator *BO = dyn_cast&lt;BinaryOperator&gt;(II);
 613     if (!BO) continue;
 614     Value *LHS = BO-&gt;getOperand(0);
 615     Value *RHS = BO-&gt;getOperand(1);
 616     if (LHS == RHS) continue;
 617     ...
 618   }
 619 </pre>
 620 </div>
 621
 622 <p>This has all the benefits of using early exits from functions: it reduces
 623 nesting of the loop, it makes it easier to describe why the conditions are true,
 624 and it makes it obvious to the reader that there is no "else" coming up that
 625 they have to push context into their brain for.  If a loop is large, this can
 626 be a big understandability win.</p>
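<p>The same structure can be sketched without any LLVM types.  The function
<tt>countPositiveEvens</tt> below is a hypothetical example, not LLVM code;
each uninteresting case is dismissed with <tt>continue</tt> so that the real
work stays at a single level of indentation.</p>

```cpp
#include <vector>

// Hypothetical example: count the values that are both positive and even.
unsigned countPositiveEvens(const std::vector<int> &Values) {
  unsigned Count = 0;
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I) {
    // Zero and negative values are never counted.
    if (*I <= 0) continue;

    // Odd values are not interesting either.
    if (*I % 2 != 0) continue;

    ++Count;
  }
  return Count;
}
```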
 627
 628 </div>
 629
 630 <!-- _______________________________________________________________________ -->
 631 <div class="doc_subsubsection">
 632   <a name="hl_else_after_return">Don't use "else" after a return</a>
 633 </div>
 634
 635 <div class="doc_text">
 636
 637 <p>For reasons similar to those above (reduced indentation and easier reading),
 638    please do not use "else" or "else if" after something that interrupts
 639    control flow like return, break, continue, goto, etc.  For example, this is
 640    "bad":</p>
 641
 642 <div class="doc_code">
 643 <pre>
 644   case 'J': {
 645     if (Signed) {
 646       Type = Context.getsigjmp_bufType();
 647       if (Type.isNull()) {
 648         Error = ASTContext::GE_Missing_sigjmp_buf;
 649         return QualType();
 650       } else {
 651         break;
 652       }
 653     } else {
 654       Type = Context.getjmp_bufType();
 655       if (Type.isNull()) {
 656         Error = ASTContext::GE_Missing_jmp_buf;
 657         return QualType();
 658       } else {
 659         break;
 660       }
 661     }
 662   }
 663   }
 664 </pre>
 665 </div>
 666
 667 <p>It is better to write this something like:</p>
 668
 669 <div class="doc_code">
 670 <pre>
 671   case 'J':
 672     if (Signed) {
 673       Type = Context.getsigjmp_bufType();
 674       if (Type.isNull()) {
 675         Error = ASTContext::GE_Missing_sigjmp_buf;
 676         return QualType();
 677       }
 678     } else {
 679       Type = Context.getjmp_bufType();
 680       if (Type.isNull()) {
 681         Error = ASTContext::GE_Missing_jmp_buf;
 682         return QualType();
 683       }
 684     }
 685     break;
 686 </pre>
 687 </div>
 688
 689 <p>Or better yet (in this case), as:</p>
 690
 691 <div class="doc_code">
 692 <pre>
 693   case 'J':
 694     if (Signed)
 695       Type = Context.getsigjmp_bufType();
 696     else
 697       Type = Context.getjmp_bufType();
 698
 699     if (Type.isNull()) {
 700       Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :
 701                        ASTContext::GE_Missing_jmp_buf;
 702       return QualType();
 703     }
 704     break;
 705 </pre>
 706 </div>
 707
 708 <p>The idea is to reduce indentation and the amount of code you have to keep
 709    track of when reading the code.</p>
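<p>The same idea applies to <tt>return</tt>.  As a minimal sketch (the
function <tt>signOf</tt> is hypothetical and only illustrates the shape of the
code), each case leaves the function as soon as it is decided, so no
"else" is needed:</p>

```cpp
// Hypothetical helper: returns -1, 0, or 1 depending on the sign of V.
int signOf(int V) {
  if (V < 0)
    return -1;

  // No 'else' is needed here: the 'return' above already left the function.
  if (V == 0)
    return 0;

  return 1;
}
```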
 710
 711 </div>
 712
 713 <!-- _______________________________________________________________________ -->
 714 <div class="doc_subsubsection">
 715   <a name="hl_predicateloops">Turn Predicate Loops into Predicate Functions</a>
 716 </div>
 717
 718 <div class="doc_text">
 719
 720 <p>It is very common to write small loops that just compute a boolean
 721    value.  There are a number of ways that people commonly write these, but an
 722    example of this sort of thing is:</p>
 723
 724 <div class="doc_code">
 725 <pre>
 726   <b>bool FoundFoo = false;</b>
 727   for (unsigned i = 0, e = BarList.size(); i != e; ++i)
 728     if (BarList[i]-&gt;isFoo()) {
 729       <b>FoundFoo = true;</b>
 730       break;
 731     }
 732
 733   <b>if (FoundFoo) {</b>
 734     ...
 735   }
 736 </pre>
 737 </div>
 738
 739 <p>This sort of code is awkward to write, and is almost always a bad sign.
 740 Instead of this sort of loop, we strongly prefer to use a predicate function
 741 (which may be <a href="#micro_anonns">static</a>) that uses
 742 <a href="#hl_earlyexit">early exits</a> to compute the predicate.  We prefer
 743 the code to be structured like this:
 744 </p>
 745
 746
 747 <div class="doc_code">
 748 <pre>
 749 /// ListContainsFoo - Return true if the specified list has an element that is
 750 /// a foo.
 751 static bool ListContainsFoo(const std::vector&lt;Bar*&gt; &amp;List) {
 752   for (unsigned i = 0, e = List.size(); i != e; ++i)
 753     if (List[i]-&gt;isFoo())
 754       return true;
 755   return false;
 756 }
 757 ...
 758
 759   <b>if (ListContainsFoo(BarList)) {</b>
 760     ...
 761   }
 762 </pre>
 763 </div>
 764
 765 <p>There are many reasons for doing this: it reduces indentation and factors out
 766 code which can often be shared by other code that checks for the same predicate.
 767 More importantly, it <em>forces you to pick a name</em> for the function, and
 768 forces you to write a comment for it.  In this silly example, this doesn't add
 769 much value.  However, if the condition is complex, this can make it a lot easier
 770 for the reader to understand the code that queries for this predicate.  Instead
 771 of being faced with the in-line details of how we check to see if the BarList
 772 contains a foo, we can trust the function name and continue reading with better
 773 locality.</p>
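<p>As a rough sketch of the "complex condition" case, the predicate below
combines two checks and uses <tt>continue</tt> and an early <tt>return</tt> to
stay flat.  The <tt>Bar</tt> struct and its <tt>isFoo</tt>/<tt>isDead</tt>
members are hypothetical, invented only for this illustration:</p>

```cpp
#include <vector>

// Hypothetical element type, invented for this sketch.
struct Bar {
  bool Foo;
  bool Dead;
  bool isFoo() const { return Foo; }
  bool isDead() const { return Dead; }
};

/// ListContainsLiveFoo - Return true if the specified list has an element
/// that is a foo and has not been marked dead.
static bool ListContainsLiveFoo(const std::vector<Bar*> &List) {
  for (unsigned i = 0, e = List.size(); i != e; ++i) {
    if (List[i]->isDead())
      continue;                 // Dead elements never satisfy the predicate.
    if (List[i]->isFoo())
      return true;              // Early exit on the first live foo.
  }
  return false;
}
```

<p>A caller then reads as <tt>if (ListContainsLiveFoo(BarList))</tt>, and the
name and comment carry the meaning that would otherwise be buried in the
loop.</p>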
 774
 775 </div>
 776
 777
 778 <!-- ======================================================================= -->
 779 <div class="doc_subsection">
 780   <a name="micro">The Low Level Issues</a>
 781 </div>
 782 <!-- ======================================================================= -->
 783
 784
 785 <!-- _______________________________________________________________________ -->
 786 <div class="doc_subsubsection">
 787   <a name="ll_assert">Assert Liberally</a>
 788 </div>
 789
 790 <div class="doc_text">
 791
 792 <p>Use the "<tt>assert</tt>" macro to its fullest.  Check all of your
 793 preconditions and assumptions; you never know when a bug (not necessarily even
 794 yours) might be caught early by an assertion, which reduces debugging time
 795 dramatically.  The "<tt>&lt;cassert&gt;</tt>" header file is probably already
 796 included by the header files you are using, so it doesn't cost anything to use
 797 it.</p>
 798
 799 <p>To further assist with debugging, make sure to put some kind of error message
 800 in the assertion statement (which is printed if the assertion is tripped). This
 801 helps the poor debugger make sense of why an assertion is being made and
 802 enforced, and hopefully what to do about it.  Here is one complete example:</p>
 803
 804 <div class="doc_code">
 805 <pre>
 806 inline Value *getOperand(unsigned i) {
 807   assert(i &lt; Operands.size() &amp;&amp; "getOperand() out of range!");
 808   return Operands[i];
 809 }
 810 </pre>
 811 </div>
 812
 813 <p>Here are some more examples:</p>
 814
 815 <div class="doc_code">
 816 <pre>
 817 assert(Ty-&gt;isPointerType() &amp;&amp; "Can't allocate a non pointer type!");
 818
 819 assert((Opcode == Shl || Opcode == Shr) &amp;&amp; "ShiftInst Opcode invalid!");
 820
 821 assert(idx &lt; getNumSuccessors() &amp;&amp; "Successor # out of range!");
 822
 823 assert(V1.getType() == V2.getType() &amp;&amp; "Constant types must be identical!");
 824
 825 assert(isa&lt;PHINode&gt;(Succ-&gt;front()) &amp;&amp; "Only works on PHId BBs!");
 826 </pre>
 827 </div>
 828
 829 <p>You get the idea...</p>
 830
 831 <p>Please be aware when adding assert statements that not all compilers are aware of
 832 the semantics of the assert.  In some places, asserts are used to indicate a piece of
 833 code that should not be reached.  These are typically of the form:</p>
 834
 835 <div class="doc_code">
 836 <pre>
 837 assert(0 &amp;&amp; "Some helpful error message");
 838 </pre>
 839 </div>
 840
 841 <p>When used in a function that returns a value, they should be followed with a return
 842 statement and a comment indicating that this line is never reached.  This will prevent
 843 a compiler which is unable to deduce that the assert statement never returns from
 844 generating a warning.</p>
 845
 846 <div class="doc_code">
 847 <pre>
 848 assert(0 &amp;&amp; "Some helpful error message");
 849 // Not reached
 850 return 0;
 851 </pre>
 852 </div>
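<p>Put together, a value-returning function that follows this pattern might
look like the sketch below.  The <tt>ShiftKind</tt> enum and
<tt>getShiftName</tt> function are hypothetical names used only for
illustration:</p>

```cpp
#include <cassert>

// Hypothetical enum, invented for this sketch.
enum ShiftKind { Shl, Shr };

/// getShiftName - Return a printable name for the specified shift kind.
static const char *getShiftName(ShiftKind K) {
  switch (K) {
  case Shl: return "shl";
  case Shr: return "shr";
  }
  assert(0 && "Unknown shift kind!");
  // Not reached
  return 0;
}
```

<p>When <tt>NDEBUG</tt> is defined, the <tt>assert</tt> expands to nothing, so
the trailing <tt>return</tt> is also what keeps every path through the function
returning a value.</p>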
 853
 854 </div>
 855
 856 <!-- _______________________________________________________________________ -->
 857 <div class="doc_subsubsection">
 858   <a name="ll_ns_std">Do not use '<tt>using namespace std</tt>'</a>
 859 </div>
 860
 861 <div class="doc_text">
 862 <p>In LLVM, we prefer to explicitly prefix all identifiers from the standard
 863 namespace with an "<tt>std::</tt>" prefix, rather than rely on
 864 "<tt>using namespace std;</tt>".</p>
 865
 866 <p>In header files, adding a '<tt>using namespace XXX</tt>' directive pollutes
 867 the namespace of any source file that <tt>#include</tt>s the header.  This is
 868 clearly a bad thing.</p>
 869
 870 <p>In implementation files (e.g. .cpp files), the rule is more of a stylistic
 871 rule, but is still important.  Basically, using explicit namespace prefixes
 872 makes the code <b>clearer</b>, because it is immediately obvious what facilities
 873 are being used and where they are coming from, and <b>more portable</b>, because
 874 namespace clashes cannot occur between LLVM code and other namespaces.  The
 875 portability rule is important because different standard library implementations
 876 expose different symbols (potentially ones they shouldn't), and future revisions
 877 to the C++ standard will add more symbols to the <tt>std</tt> namespace.  As
 878 such, we never use '<tt>using namespace std;</tt>' in LLVM.</p>
 879
 880 <p>The exception to the general rule (i.e. it's not an exception for
 881 the <tt>std</tt> namespace) is for implementation files.  For example, all of
 882 the code in the LLVM project implements code that lives in the 'llvm' namespace.
 883 As such, it is ok, and actually clearer, for the .cpp files to have a '<tt>using
 884 namespace llvm</tt>' directive at their top, after the <tt>#include</tt>s.  The
 885 general form of this rule is that any .cpp file that implements code in any
 886 namespace may use that namespace (and its parents'), but should not use any
 887 others.</p>
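<p>The general form of the rule can be sketched as follows.  The
<tt>mylib</tt> namespace and the <tt>JoinNames</tt>/<tt>JoinTwo</tt> functions
are hypothetical; the header and the .cpp file are shown in one block only so
that the example is self-contained:</p>

```cpp
#include <string>
#include <vector>

// --- What would normally live in a header (hypothetical names) ---
namespace mylib {
  /// JoinNames - Concatenate the names, separated by commas.
  std::string JoinNames(const std::vector<std::string> &Names);
}

// --- What would normally live in the matching .cpp file ---
using namespace mylib;   // OK: this file implements code in 'mylib'.

std::string mylib::JoinNames(const std::vector<std::string> &Names) {
  std::string Result;    // Standard library names keep their std:: prefix.
  for (unsigned i = 0, e = Names.size(); i != e; ++i) {
    if (i != 0)
      Result += ",";
    Result += Names[i];
  }
  return Result;
}

/// JoinTwo - File-local helper that joins exactly two names.
static std::string JoinTwo(const std::string &A, const std::string &B) {
  std::vector<std::string> Names;
  Names.push_back(A);
  Names.push_back(B);
  return JoinNames(Names);   // Found through the using directive above.
}
```

<p>Note that there is no '<tt>using namespace std</tt>' anywhere: every
standard library name is still spelled with its prefix.</p>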
 888
 889 </div>
 890
 891 <!-- _______________________________________________________________________ -->
 892 <div class="doc_subsubsection">
 893   <a name="ll_virtual_anch">Provide a virtual method anchor for classes
 894   in headers</a>
 895 </div>
 896
 897 <div class="doc_text">
 898
 899 <p>If a class is defined in a header file and has a v-table (either it has
 900 virtual methods or it derives from classes with virtual methods), it must
 901 always have at least one out-of-line virtual method in the class.  Without
 902 this, the compiler will copy the vtable and RTTI into every <tt>.o</tt> file
 903 that <tt>#include</tt>s the header, bloating <tt>.o</tt> file sizes and
 904 increasing link times.</p>
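<p>One way to satisfy this rule, sketched below, is to declare a private
virtual method whose only job is to be defined in the .cpp file.  The
<tt>Shape</tt> and <tt>Triangle</tt> classes and the file names are
hypothetical, and the method name <tt>anchor</tt> is simply the convention
assumed for this sketch.  On ABIs that use a "key function" (the Itanium C++
ABI, for example), the vtable is then emitted only in the object file that
defines that method:</p>

```cpp
// --- Shape.h (hypothetical header) ---
class Shape {
public:
  virtual ~Shape() {}
  virtual unsigned getNumSides() const { return 0; }
private:
  virtual void anchor();   // Declared here, defined out-of-line below.
};

class Triangle : public Shape {
public:
  virtual unsigned getNumSides() const { return 3; }
private:
  virtual void anchor();   // Each class with a vtable gets its own anchor.
};

// --- Shape.cpp (hypothetical implementation file) ---
// These definitions are the only out-of-line virtual methods of their
// classes, which gives each vtable a single home.
void Shape::anchor() {}
void Triangle::anchor() {}
```

<p>The anchor method is never called; it exists only so that each class has a
non-inline virtual method.</p>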
 905
 906 </div>
 907
 908 <!-- _______________________________________________________________________ -->
 909 <div class="doc_subsubsection">
 910   <a name="ll_end">Don't evaluate end() every time through a loop</a>
 911 </div>
 912
 913 <div class="doc_text">
 914
 915 <p>Because C++ doesn't have a standard "foreach" loop (though it can be emulated
 916 with macros and may be coming in C++0x), we end up writing a lot of loops that
 917 manually iterate from begin to end on a variety of containers or through other
 918 data structures.  One common mistake is to write a loop in this style:</p>
 919
 920 <div class="doc_code">
 921 <pre>
 922   BasicBlock *BB = ...
 923   for (BasicBlock::iterator I = BB-&gt;begin(); I != <b>BB-&gt;end()</b>; ++I)
 924      ... use I ...
 925 </pre>
 926 </div>
 927
 928 <p>The problem with this construct is that it evaluates "<tt>BB-&gt;end()</tt>"
 929 every time through the loop.  Instead of writing the loop like this, we strongly
 930 prefer loops to be written so that they evaluate it once before the loop starts.
 931 A convenient way to do this is like so:</p>
 932
 933 <div class="doc_code">
 934 <pre>
 935   BasicBlock *BB = ...
 936   for (BasicBlock::iterator I = BB-&gt;begin(), E = <b>BB-&gt;end()</b>; I != E; ++I)
 937      ... use I ...
 938 </pre>
 939 </div>
 940
 941 <p>The observant may quickly point out that these two loops may have different
 942 semantics: if the container (a basic block in this case) is being mutated, then
 943 "<tt>BB-&gt;end()</tt>" may change its value every time through the loop and the
 944 second loop may not in fact be correct.  If you actually do depend on this
 945 behavior, please write the loop in the first form and add a comment indicating
 946 that you did it intentionally.</p>
 947
 948 <p>Why do we prefer the second form (when correct)?  Writing the loop in the
 949 first form has two problems: first, it may be less efficient than evaluating it
 950 at the start of the loop.  In this case, the cost is probably minor: a few extra
 951 loads every time through the loop.  However, if the base expression is more
 952 complex, then the cost can rise quickly.  I've seen loops where the end
 953 expression was actually something like: "<tt>SomeMap[x]-&gt;end()</tt>" and map
 954 lookups really aren't cheap.  By writing it in the second form consistently, you
 955 eliminate the issue entirely and don't even have to think about it.</p>
 956
 957 <p>The second (even bigger) issue is that writing the loop in the first form
 958 hints to the reader that the loop is mutating the container (a fact that a
 959 comment would handily confirm!).  If you write the loop in the second form, it
 960 is immediately obvious without even looking at the body of the loop that the
 961 container isn't being modified, which makes it easier to read the code and
 962 understand what it does.</p>
 963
 964 <p>While the second form of the loop is a few extra keystrokes, we do strongly
 965 prefer it.</p>
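<p>For the more expensive case mentioned above, where the end expression
involves a map lookup, a minimal sketch using standard containers might look
like this.  The <tt>IntList</tt> typedef and <tt>SumList</tt> function are
hypothetical names chosen for illustration:</p>

```cpp
#include <map>
#include <vector>

typedef std::vector<int> IntList;

/// SumList - Return the sum of the list stored under Key.  The map lookup is
/// done once, and the end iterator is computed once before the loop starts.
static int SumList(std::map<int, IntList> &SomeMap, int Key) {
  IntList &List = SomeMap[Key];   // One map lookup, not one per iteration.
  int Sum = 0;
  for (IntList::iterator I = List.begin(), E = List.end(); I != E; ++I)
    Sum += *I;
  return Sum;
}
```

<p>Because the loop body never modifies <tt>List</tt>, caching <tt>E</tt> is
safe here, and the loop header says so at a glance.</p>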
 966
 967 </div>
 968
 969 <!-- _______________________________________________________________________ -->
 970 <div class="doc_subsubsection">
 971   <a name="ll_iostream"><tt>#include &lt;iostream&gt;</tt> is forbidden</a>
 972 </div>
 973
 974 <div class="doc_text">
 975
 976 <p>The use of <tt>#include &lt;iostream&gt;</tt> in library files is
 977 hereby <b><em>forbidden</em></b>. The primary reason for doing this is to
 978 support clients using LLVM libraries as part of larger systems. In particular,
 979 we statically link LLVM into some dynamic libraries. Even if LLVM isn't used,
 980 the static c'tors are run whenever an application that uses the dynamic library
 981 starts up. There are two problems with this:</p>
 982
 983 <ol>
 984   <li>The time to run the static c'tors impacts startup time of
 985       applications&mdash;a critical time for GUI apps.</li>
 986   <li>The static c'tors cause the app to pull many extra pages of memory off the
 987       disk: both the code for the static c'tors in each <tt>.o</tt> file and the
 988       small amount of data that gets touched. In addition, touched/dirty pages
 989       put more pressure on the VM system on low-memory machines.</li>
 990 </ol>
 991
 992 <p>Note that using the other stream headers (<tt>&lt;sstream&gt;</tt> for
 993 example) is not problematic in this regard (just <tt>&lt;iostream&gt;</tt>).
 994 However, raw_ostream provides various APIs that are better performing for almost
 995 every use than std::ostream style APIs, so you should just use it for new
 996 code.</p>
 997
 998 <p><b>New code should always
 999 use <a href="#ll_raw_ostream"><tt>raw_ostream</tt></a> for writing, or
1000 the <tt>llvm::MemoryBuffer</tt> API for reading files.</b></p>
1001
1002 </div>
1003
1004
1005 <!-- _______________________________________________________________________ -->
1006 <div class="doc_subsubsection">
1007   <a name="ll_avoidendl">Avoid <tt>std::endl</tt></a>
1008 </div>
1009
1010 <div class="doc_text">
1011
1012 <p>The <tt>std::endl</tt> modifier, when used with iostreams, outputs a newline
1013 to the output stream specified.  In addition to doing this, however, it also
1014 flushes the output stream.  In other words, these are equivalent:</p>
1015
1016 <div class="doc_code">
1017 <pre>
1018 std::cout &lt;&lt; std::endl;
1019 std::cout &lt;&lt; '\n' &lt;&lt; std::flush;
1020 </pre>
1021 </div>
1022
1023 <p>Most of the time, you probably have no reason to flush the output stream, so
1024 it's better to use a literal <tt>'\n'</tt>.</p>
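<p>As a small illustration, the two hypothetical helpers below write the same
text to a <tt>std::ostringstream</tt> (used here only so the result can be
inspected).  The characters produced are identical; the only difference is that
the first version also asks the stream to flush after every line:</p>

```cpp
#include <ostream>
#include <sstream>
#include <string>

/// WithEndl - Write two lines, flushing the stream after each one.
static std::string WithEndl() {
  std::ostringstream OS;
  OS << "one" << std::endl << "two" << std::endl;   // Newline plus flush.
  return OS.str();
}

/// WithNewline - Write the same two lines without any explicit flush.
static std::string WithNewline() {
  std::ostringstream OS;
  OS << "one" << '\n' << "two" << '\n';             // Newline only.
  return OS.str();
}
```

<p>On a string stream the extra flush is cheap, but on a stream backed by a
file or terminal each flush can force the buffered data to be written out.</p>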
1025
1026 </div>
1027
1028
1029 <!-- _______________________________________________________________________ -->
1030 <div class="doc_subsubsection">
1031   <a name="ll_raw_ostream">Use <tt>raw_ostream</tt></a>
1032 </div>
1033
1034 <div class="doc_text">
1035
1036 <p>LLVM includes a lightweight, simple, and efficient stream implementation
1037 in <tt>llvm/Support/raw_ostream.h</tt> which provides all of the common features
1038 of <tt>std::ostream</tt>.  All new code should use <tt>raw_ostream</tt> instead
1039 of <tt>ostream</tt>.</p>
1040
1041 <p>Unlike <tt>std::ostream</tt>, <tt>raw_ostream</tt> is not a template and can
1042 be forward declared as <tt>class raw_ostream</tt>.  Public headers should
1043 generally not include the <tt>raw_ostream</tt> header, but use forward
1044 declarations and constant references to <tt>raw_ostream</tt> instances.</p>
1045
1046 </div>
1047
1048
1049 <!-- ======================================================================= -->
1050 <div class="doc_subsection">
1051   <a name="nano">Microscopic Details</a>
1052 </div>
1053 <!-- ======================================================================= -->
1054
1055 <p>This section describes preferred low-level formatting guidelines along with
1056 reasoning on why we prefer them.</p>
1057
1058 <!-- _______________________________________________________________________ -->
1059 <div class="doc_subsubsection">
1060   <a name="micro_spaceparen">Spaces Before Parentheses</a>
1061 </div>
1062
1063 <div class="doc_text">
1064
1065 <p>We prefer to put a space before an open parenthesis only in control flow
1066 statements, but not in normal function call expressions and function-like
1067 macros.  For example, this is good:</p>
1068
1069 <div class="doc_code">
1070 <pre>
1071   <b>if (</b>x) ...
1072   <b>for (</b>i = 0; i != 100; ++i) ...
1073   <b>while (</b>llvm_rocks) ...
1074
1075   <b>somefunc(</b>42);
1076   <b><a href="#ll_assert">assert</a>(</b>3 != 4 &amp;&amp; "laws of math are failing me");
1077
1078   a = <b>foo(</b>42, 92) + <b>bar(</b>x);
1079 </pre>
1080 </div>
1081
1082 <p>... and this is bad:</p>
1083
1084 <div class="doc_code">
1085 <pre>
1086   <b>if(</b>x) ...
1087   <b>for(</b>i = 0; i != 100; ++i) ...
1088   <b>while(</b>llvm_rocks) ...
1089
1090   <b>somefunc (</b>42);
1091   <b><a href="#ll_assert">assert</a> (</b>3 != 4 &amp;&amp; "laws of math are failing me");
1092
1093   a = <b>foo (</b>42, 92) + <b>bar (</b>x);
1094 </pre>
1095 </div>
1096
1097 <p>The reason for doing this is not completely arbitrary.  This style makes
1098    control flow operators stand out more, and makes expressions flow better. The
1099    function call operator binds very tightly as a postfix operator.  Putting
1100    a space after a function name (as in the last example) makes it appear that
1101    the code might bind the argument list of the function on the left-hand side
1102    of a binary operator with the name of the function on the right side.  More
1103    specifically, it is easy to misread the "a" example as:</p>
1104
1105 <div class="doc_code">
1106 <pre>
1107   a = foo <b>(</b>(42, 92) + bar<b>)</b> (x);
1108 </pre>
1109 </div>
1110
1111 <p>... when skimming through the code.  By avoiding a space in a function
1112 call, we avoid this misinterpretation.</p>
1113
1114 </div>
1115
1116 <!-- _______________________________________________________________________ -->
1117 <div class="doc_subsubsection">
1118   <a name="micro_preincrement">Prefer Preincrement</a>
1119 </div>
1120
1121 <div class="doc_text">
1122
1123 <p>Hard and fast rule: Preincrement (<tt>++X</tt>) is no slower than
1124 postincrement (<tt>X++</tt>) and could very well be a lot faster than it.  Use
1125 preincrementation whenever possible.</p>
1126
1127 <p>The semantics of postincrement include making a copy of the value being
1128 incremented, returning it, and then preincrementing the "work value".  For
1129 primitive types, this isn't a big deal... but for iterators, it can be a huge
1130 issue (for example, some iterators contain stack and set objects in them...
1131 copying an iterator could invoke the copy ctors of these as well).  In general,
1132 get in the habit of always using preincrement, and you won't have a problem.</p>
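<p>To make the extra copy visible, here is a minimal sketch of an iterator-like
class that counts how often it is copied.  <tt>CountingIter</tt> is a
hypothetical class written for this example, not an LLVM type:</p>

```cpp
// Hypothetical iterator-like class that counts how often it is copied.
struct CountingIter {
  static unsigned NumCopies;
  int Pos;

  explicit CountingIter(int P) : Pos(P) {}
  CountingIter(const CountingIter &RHS) : Pos(RHS.Pos) { ++NumCopies; }

  // Preincrement: advance and return *this by reference; no copy is made.
  CountingIter &operator++() { ++Pos; return *this; }

  // Postincrement: copy the old value, advance, and return the copy by value.
  CountingIter operator++(int) {
    CountingIter Old(*this);
    ++*this;
    return Old;
  }
};

unsigned CountingIter::NumCopies = 0;
```

<p>With this class, <tt>++I</tt> leaves <tt>NumCopies</tt> untouched, while
<tt>I++</tt> bumps it at least once (the exact count depends on how much copy
elision the compiler performs).  For an iterator that owns a stack or a set,
each of those copies is correspondingly more expensive.</p>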
1133
1134 </div>
1135
1136 <!-- _______________________________________________________________________ -->
1137 <div class="doc_subsubsection">
1138   <a name="micro_namespaceindent">Namespace Indentation</a>
1139 </div>
1140
1141 <div class="doc_text">
1142
1143 <p>
1144 In general, we strive to reduce indentation wherever possible.  This is useful
1145 because we want code to <a href="#scf_codewidth">fit into 80 columns</a> without
1146 wrapping horribly, but also because it makes it easier to understand the code.
1147 Namespaces are a funny thing: they are often large, and we often desire to put
1148 lots of stuff into them (so they can be large).  Other times they are tiny,
1149 because they just hold an enum or something similar.  In order to balance this,
1150 we use different approaches for small versus large namespaces.
1151 </p>
1152
1153 <p>
1154 If a namespace definition is small and <em>easily</em> fits on a screen (say,
1155 less than 35 lines of code), then you should indent its body.  Here's an
1156 example:
1157 </p>
1158
1159 <div class="doc_code">
1160 <pre>
1161 namespace llvm {
1162   namespace X86 {
1163     /// RelocationType - An enum for the x86 relocation codes. Note that
1164     /// the terminology here doesn't follow x86 convention - word means
1165     /// 32-bit and dword means 64-bit.
1166     enum RelocationType {
1167       /// reloc_pcrel_word - PC relative relocation, add the relocated value to
1168       /// the value already in memory, after we adjust it for where the PC is.
1169       reloc_pcrel_word = 0,
1170
1171       /// reloc_picrel_word - PIC base relative relocation, add the relocated
1172       /// value to the value already in memory, after we adjust it for where the
1173       /// PIC base is.
1174       reloc_picrel_word = 1,
1175
1176       /// reloc_absolute_word, reloc_absolute_dword - Absolute relocation, just
1177       /// add the relocated value to the value already in memory.
1178       reloc_absolute_word = 2,
1179       reloc_absolute_dword = 3
1180     };
1181   }
1182 }
1183 </pre>
1184 </div>
1185
1186 <p>Since the body is small, indenting adds value because it makes it very clear
1187 where the namespace starts and ends, and it is easy to take the whole thing in
1188 in one "gulp" when reading the code.  If the blob of code in the namespace is
1189 larger (as it typically is in a header in the llvm or clang namespaces), do not
1190 indent the code, and add a comment indicating what namespace is being closed.
1191 For example:</p>
1192
1193 <div class="doc_code">
1194 <pre>
1195 namespace llvm {
1196 namespace knowledge {
1197
1198 /// Grokable - This class represents things that Smith can have an intimate
1199 /// understanding of and contains the data associated with it.
1200 class Grokable {
1201 ...
1202 public:
1203   explicit Grokable() { ... }
1204   virtual ~Grokable() = 0;
1205
1206   ...
1207
1208 };
1209
1210 } // end namespace knowledge
1211 } // end namespace llvm
1212 </pre>
1213 </div>
1214
1215 <p>Because the class is large, we don't expect that the reader can easily
1216 understand the entire concept in a glance, and the end of the file (where the
1217 namespaces end) may be a long way away from the place they open.  As such,
1218 indenting the contents of the namespace doesn't add any value, and detracts from
1219 the readability of the class.  In these cases it is best to <em>not</em> indent
1220 the contents of the namespace.</p>
1221
1222 </div>
1223
1224 <!-- _______________________________________________________________________ -->
1225 <div class="doc_subsubsection">
1226   <a name="micro_anonns">Anonymous Namespaces</a>
1227 </div>
1228
1229 <div class="doc_text">
1230
1231 <p>After talking about namespaces in general, you may be wondering about
1232 anonymous namespaces in particular.
1233 Anonymous namespaces are a great language feature that tells the C++ compiler
1234 that the contents of the namespace are only visible within the current
1235 translation unit, allowing more aggressive optimization and eliminating the
1236 possibility of symbol name collisions.  Anonymous namespaces are to C++ as
1237 "static" is to C functions and global variables.  While "static" is available
1238 in C++, anonymous namespaces are more general: they can make entire classes
1239 private to a file.</p>
1240
1241 <p>The problem with anonymous namespaces is that they naturally want to
1242 encourage indentation of their body, and they reduce locality of reference: if
1243 you see a random function definition in a C++ file, it is easy to see if it is
1244 marked static, but seeing if it is in an anonymous namespace requires scanning
1245 a big chunk of the file.</p>
1246
1247 <p>Because of this, we have a simple guideline: make anonymous namespaces as
1248 small as possible, and only use them for class declarations.  For example, this
1249 is good:</p>
1250
1251 <div class="doc_code">
1252 <pre>
1253 <b>namespace {</b>
1254   class StringSort {
1255   ...
1256   public:
1257     StringSort(...);
1258     bool operator&lt;(const char *RHS) const;
1259   };
1260 <b>} // end anonymous namespace</b>
1261
1262 static void Helper() {
1263   ...
1264 }
1265
1266 bool StringSort::operator&lt;(const char *RHS) const {
1267   ...
1268 }
1269
1270 </pre>
1271 </div>
1272
1273 <p>This is bad:</p>
1274
1275
1276 <div class="doc_code">
1277 <pre>
1278 <b>namespace {</b>
1279 class StringSort {
1280 ...
1281 public:
1282   StringSort(...);
1283   bool operator&lt;(const char *RHS) const;
1284 };
1285
1286 void Helper() {
1287   ...
1288 }
1289
1290 bool StringSort::operator&lt;(const char *RHS) const {
1291   ...
1292 }
1293
1294 <b>} // end anonymous namespace</b>
1295
1296 </pre>
1297 </div>
1298
1299
1300 <p>This is bad specifically because if you're looking at "Helper" in the middle
1301 of a large C++ file, you have no immediate way to tell if it is local to
1302 the file.  When it is marked static explicitly, this is immediately obvious.
1303 Also, there is no reason to enclose the definition of "operator&lt;" in the
1304 namespace just because it was declared there.
1305 </p>
1306
1307 </div>
1308
1309
1310
1311 <!-- *********************************************************************** -->
1312 <div class="doc_section">
1313   <a name="seealso">See Also</a>
1314 </div>
1315 <!-- *********************************************************************** -->
1316
1317 <div class="doc_text">
1318
1319 <p>A lot of these comments and recommendations have been culled from other
1320 sources.  Two particularly important books for our work are:</p>
1321
1322 <ol>
1323
1324 <li><a href="http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876">Effective
1325 C++</a> by Scott Meyers.  Also
1326 interesting and useful are "More Effective C++" and "Effective STL" by the same
1327 author.</li>
1328
1329 <li>Large-Scale C++ Software Design by John Lakos</li>
1330
1331 </ol>
1332
1333 <p>If you get some free time, and you haven't read them: do so; you might learn
1334 something.</p>
1335
1336 </div>
1337
1338 <!-- *********************************************************************** -->
1339
1340 <hr>
1341 <address>
1342   <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1343   src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
1344   <a href="http://validator.w3.org/check/referer"><img
1345   src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
1346
1347   <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1348   <a href="http://llvm.org">LLVM Compiler Infrastructure</a><br>
1349   Last modified: $Date$
1350 </address>
1351
1352 </body>
1353 </html>