<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<title>Clang - Expressive Diagnostics</title>
<link type="text/css" rel="stylesheet" href="menu.css" />
<link type="text/css" rel="stylesheet" href="content.css" />
<style type="text/css">
</style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">
<!--=======================================================================-->
<h1>Expressive Diagnostics</h1>
<!--=======================================================================-->
<p>In addition to being fast and functional, we aim to make Clang extremely user
friendly. As far as a command-line compiler goes, this basically boils down to
making the diagnostics (error and warning messages) generated by the compiler
as useful as possible. There are several ways that we do this. This section
talks about the experience provided by the command-line compiler, contrasting
Clang output to GCC 4.2's output in several examples.
<!--
Other clients
that embed Clang and extract equivalent information through internal APIs.-->
</p>
<h2>Column Numbers and Caret Diagnostics</h2>

<p>First, all diagnostics produced by Clang include full column number
information, and use this to print "caret diagnostics". This is a feature
provided by many commercial compilers, but is generally missing from open source
compilers. This is nice because it makes it very easy to understand exactly
what is wrong in a particular piece of code. For example:</p>
<pre>
$ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b>
format-strings.c:91: warning: too few arguments for format
$ <b>clang -fsyntax-only format-strings.c</b>
format-strings.c:91:13: <font color="magenta">warning:</font> '.*' specified field precision is missing a matching 'int' argument
<font color="darkgreen">  printf("%.*d");</font>
<font color="blue">            ^</font>
</pre>
<p>The caret (the blue "^" character) shows exactly where the problem is, even
inside of the string. This makes it really easy to jump to the problem and
helps when multiple instances of the same character occur on a line. We'll
revisit this more in the following examples.</p>
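<p>To make the warning above concrete, here is a minimal sketch of what the
"%.*d" conversion expects: an <code>int</code> precision followed by the
<code>int</code> value to print. The helper name
<code>format_with_precision</code> is hypothetical and is not part of
format-strings.c.</p>

```cpp
#include <cstdio>
#include <string>

// "%.*d" consumes two arguments: first an int giving the precision (the
// minimum number of digits), then the int to print. The call flagged in the
// transcript passed neither, which is why the caret sits under the '*'.
// format_with_precision is a hypothetical helper used only for illustration.
std::string format_with_precision(int precision, int value) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*d", precision, value);
    return std::string(buf);
}
```

<p>With a precision of 3 and a value of 42, the helper returns "042".</p>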
<h2>Range Highlighting for Related Text</h2>

<p>Clang captures and accurately tracks range information for expressions,
statements, and other constructs in your program and uses this to make
diagnostics highlight related information. Here's a somewhat
nonsensical example to illustrate this:</p>
<pre>
$ <b>gcc-4.2 -fsyntax-only t.c</b>
t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
$ <b>clang -fsyntax-only t.c</b>
t.c:7:39: <font color="red">error:</font> invalid operands to binary expression ('int' and 'struct A')
<font color="darkgreen">  return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</font>
<font color="blue">                       ~~~~~~~~~~~~~~ ^ ~~~~~</font>
</pre>
<p>Here you can see that you don't even need to see the original source code to
understand what is wrong based on the Clang error: because Clang prints a
caret, you know exactly <em>which</em> plus it is complaining about. The range
information highlights the left and right sides of the plus, which makes it
immediately obvious what the compiler is talking about. This is very useful for
cases involving precedence issues and many other cases.</p>
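<p>The full t.c is not shown, so the following is only a sketch of a file that
could produce a diagnostic of this shape, adapted to C++ (the exact wording of
the error may differ from the C transcript). The definitions of
<code>struct A</code>, <code>SomeA</code>, <code>func</code>, and the
<code>SHOW_DIAGNOSTIC</code> switch are assumptions made for illustration, as is
the "intended" form in the <code>#else</code> branch.</p>

```cpp
// Hypothetical surroundings for the expression quoted in the diagnostic.
struct A { int X; };
static A SomeA = { 2 };

static int func(int v) { return v * 2; }

int compute(int y) {
#ifdef SHOW_DIAGNOSTIC
    // Ill-formed: the second '+' adds an int, (SomeA.X + 40), to a whole
    // struct, SomeA. Those are the two ranges underlined by the compiler.
    return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);
#else
    // One plausible intended form (an assumption): use the member instead.
    return y + func(y ? ((SomeA.X + 40) + SomeA.X) / 42 + SomeA.X : SomeA.X);
#endif
}
```

<p>Compiling with <code>-DSHOW_DIAGNOSTIC</code> selects the ill-formed line;
without it the file compiles and <code>compute</code> can be called
normally.</p>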
<h2>Precision in Wording</h2>

<p>A related detail is that we have tried really hard to make the diagnostics
that come out of Clang contain exactly the pertinent information about what is
wrong and why. In the example above, we tell you what the inferred types are for
the left and right hand sides, and we don't repeat what is obvious from the
caret (that this is a "binary +"). Many other examples abound; here is a simple
one:</p>
<pre>
$ <b>gcc-4.2 -fsyntax-only t.c</b>
t.c:5: error: invalid type argument of 'unary *'
$ <b>clang -fsyntax-only t.c</b>
t.c:5:11: <font color="red">error:</font> indirection requires pointer operand ('int' invalid)
<font color="darkgreen">  int y = *SomeA.X;</font>
<font color="blue">          ^~~~~~~~</font>
</pre>
<p>In this example, not only do we tell you that there is a problem with the *
and point to it, we say exactly why and tell you what the type is (in case it is
a complicated subexpression, such as a call to an overloaded function). This
sort of attention to detail makes it much easier to understand and fix problems
quickly.</p>
<h2>No Pretty Printing of Expressions in Diagnostics</h2>

<p>Since Clang has range highlighting, it never needs to pretty print your code
back out to you. Pretty printing is particularly bad in G++ (which often emits
errors containing lowered vtable references), but even GCC can produce
inscrutable error messages in some cases when it tries to do this. In this
example P and Q have type "int*":</p>
<pre>
$ <b>gcc-4.2 -fsyntax-only t.c</b>
#'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object is not a function
$ <b>clang -fsyntax-only t.c</b>
t.c:12:8: <font color="red">error:</font> called object type 'int' is not a function or function pointer
<font color="darkgreen">  (P-Q)();</font>
<font color="blue">  ~~~~~^</font>
</pre>
<h2>Typedef Preservation and Selective Unwrapping</h2>

<p>Many programmers use high-level user defined types, typedefs, and other
syntactic sugar to refer to types in their program. This is useful because they
can abbreviate otherwise very long types, and it is useful to preserve the
typename in diagnostics. However, sometimes very simple typedefs can wrap
trivial types and it is important to strip off the typedef to understand what
is going on. Clang aims to handle both cases well.</p>

<p>Here is an example that shows where it is important to preserve
a typedef in C:</p>
<pre>
$ <b>gcc-4.2 -fsyntax-only t.c</b>
t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *')
$ <b>clang -fsyntax-only t.c</b>
t.c:15:11: <font color="red">error:</font> can't convert between vector values of different size ('__m128' and 'int const *')
<font color="darkgreen">  myvec[1]/P;</font>
<font color="blue">  ~~~~~~~~^~</font>
</pre>
<p>Here the type printed by GCC isn't even valid, and if the error were about a
very long and complicated type (as often happens in C++) the error message would
also be ugly just because it was long and hard to read. Here's an example where
it is useful for the compiler to expose underlying details of a typedef:</p>
<pre>
$ <b>gcc-4.2 -fsyntax-only t.c</b>
t.c:13: error: request for member 'x' in something not a structure or union
$ <b>clang -fsyntax-only t.c</b>
t.c:13:9: <font color="red">error:</font> member reference base type 'pid_t' (aka 'int') is not a structure or union
<font color="darkgreen">  myvar = myvar.x;</font>
<font color="blue">          ~~~~~ ^</font>
</pre>
<p>If the user was somehow confused about how the system "pid_t" typedef is
defined, Clang helpfully displays it with "aka".</p>
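<p>As a rough illustration of why the "aka" note helps, the sketch below uses a
hypothetical typedef, <code>my_pid_t</code>, standing in for the system
"pid_t" (whose underlying type is platform-defined). The function
<code>next_id</code> and the <code>SHOW_DIAGNOSTIC</code> switch are also
assumptions for illustration.</p>

```cpp
#include <type_traits>

// Hypothetical stand-in for the system typedef; here it is plainly an int.
typedef int my_pid_t;

// The typedef adds a name, not a new type: this is the fact "aka" surfaces.
static_assert(std::is_same<my_pid_t, int>::value,
              "my_pid_t is just another name for int");

int next_id(my_pid_t current) {
#ifdef SHOW_DIAGNOSTIC
    // Ill-formed: my_pid_t is an int, so it has no member named x.
    return current.x;
#else
    return current + 1;
#endif
}
```

<p>A reader who sees only the name <code>my_pid_t</code> in an error cannot
tell whether member access should work; seeing the underlying <code>int</code>
settles it.</p>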
<p>In C++, type preservation includes retaining any qualification written into type names. For example, if we take a small snippet of code such as:</p>

<blockquote>
<pre>
namespace services {
  struct WebService { };
}
namespace myapp {
  namespace servers {
    struct Server { };
  }
}

using namespace myapp;
void addHTTPService(servers::Server const &server, ::services::WebService const *http) {
  server += http;
}
</pre>
</blockquote>
<p>and then compile it, we see that Clang both provides more accurate information and retains the types as written by the user (e.g., "servers::Server", "::services::WebService"):</p>
<pre>
$ <b>g++-4.2 -fsyntax-only t.cpp</b>
t.cpp:9: error: no match for 'operator+=' in 'server += http'
$ <b>clang -fsyntax-only t.cpp</b>
t.cpp:9:10: <font color="red">error:</font> invalid operands to binary expression ('servers::Server const' and '::services::WebService const *')
<font color="darkgreen">server += http;</font>
<font color="blue">~~~~~~ ^  ~~~~</font>
</pre>
<p>Naturally, type preservation extends to uses of templates, and Clang retains information about how a particular template specialization (like <code>std::vector&lt;Real&gt;</code>) was spelled within the source code. For example:</p>
<pre>
$ <b>g++-4.2 -fsyntax-only t.cpp</b>
t.cpp:12: error: no match for 'operator=' in 'str = vec'
$ <b>clang -fsyntax-only t.cpp</b>
t.cpp:12:7: <font color="red">error:</font> incompatible type assigning 'vector&lt;Real&gt;', expected 'std::string' (aka 'class std::basic_string&lt;char&gt;')
<font color="darkgreen">str = vec</font>;
    <font color="blue">^ ~~~</font>
</pre>
<h2>Fix-it Hints</h2>

<p>"Fix-it" hints provide advice for fixing small, localized problems
in source code. When Clang produces a diagnostic about a particular
problem that it can work around (e.g., non-standard or redundant
syntax, missing keywords, common mistakes, etc.), it may also provide
specific guidance in the form of a code transformation to correct the
problem. For example, here Clang warns about the use of a GCC
extension that has been considered obsolete since 1993:</p>
<pre>
$ <b>clang t.c</b>
t.c:5:28: <font color="magenta">warning:</font> use of GNU old-style field designator extension
<font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font>
                        <font color="red">~~</font> <font color="blue">^</font>
                        <font color="darkgreen">.x = </font>
t.c:5:36: <font color="magenta">warning:</font> use of GNU old-style field designator extension
<font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font>
                                <font color="red">~~</font> <font color="blue">^</font>
                                <font color="darkgreen">.y = </font>
</pre>
<p>The underlined code should be removed, then replaced with the code below the
caret line (".x =" or ".y =", respectively). "Fix-it" hints are most useful for
working around common user errors and misconceptions. For example, C++ users
commonly forget the syntax for explicit specialization of class templates,
as in the following error:</p>
<pre>
$ <b>clang t.cpp</b>
t.cpp:9:3: <font color="red">error:</font> template specialization requires 'template&lt;&gt;'
struct iterator_traits&lt;file_iterator&gt; {
<font color="blue">^</font>
<font color="darkgreen">template&lt;&gt; </font>
</pre>
<p>Again, after describing the problem, Clang provides the fix--add <code>template&lt;&gt;</code>--as part of the diagnostic.</p>
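<p>For reference, here is a minimal sketch of an explicit specialization with
the fix-it applied. The original t.cpp is not shown, so the namespace
<code>mylib</code>, the primary template, and the empty
<code>file_iterator</code> type are all assumptions made for illustration.</p>

```cpp
#include <type_traits>

namespace mylib {  // hypothetical namespace, not from the original t.cpp

// Primary template: by default, ask the iterator for its value_type.
template <typename Iterator>
struct iterator_traits {
    typedef typename Iterator::value_type value_type;
};

// An iterator type that does not declare value_type itself.
struct file_iterator { };

// Explicit specialization. The leading "template<>" is exactly what the
// fix-it hint inserts; without it the declaration is rejected.
template<>
struct iterator_traits<file_iterator> {
    typedef char value_type;
};

}  // namespace mylib

// Reports whether the specialization was selected for file_iterator.
bool file_iterator_yields_char() {
    return std::is_same<
        mylib::iterator_traits<mylib::file_iterator>::value_type,
        char>::value;
}
```

<p>Removing the <code>template&lt;&gt;</code> line from the sketch should
reproduce an error of the kind shown in the transcript.</p>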
<h2>Automatic Macro Expansion</h2>

<p>Many errors happen in macros that are sometimes deeply nested. With
traditional compilers, you need to dig deep into the definition of the macro to
understand how you got into trouble. Here's a simple example that shows how
Clang helps you out:</p>
<pre>
$ <b>gcc-4.2 -fsyntax-only t.c</b>
t.c: In function 'test':
t.c:80: error: invalid operands to binary &lt; (have 'struct mystruct' and 'float')
$ <b>clang -fsyntax-only t.c</b>
t.c:80:3: <font color="red">error:</font> invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float'))
<font color="darkgreen"> X = MYMAX(P, F);</font>
<font color="blue"> ^~~~~~~~~~~</font>
t.c:76:94: note: instantiated from:
<font color="darkgreen">#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a &lt; __b ? __b : __a; })</font>
<font color="blue"> ~~~ ^ ~~~</font>
</pre>
<p>This shows how Clang automatically prints instantiation information and
nested range information for diagnostics as they are instantiated through
macros, and also shows how some of the other pieces work in a bigger example.
Here's another real world warning that occurs in the "window" Unix package
(which implements the "wwopen" class of APIs):</p>
<pre>
$ <b>clang -fsyntax-only t.c</b>
t.c:22:2: <font color="magenta">warning:</font> type specifier missing, defaults to 'int'
<font color="darkgreen"> ILPAD();</font>
<font color="blue"> ^</font>
t.c:17:17: note: instantiated from:
<font color="darkgreen">#define ILPAD() PAD((NROW - tt.tt_row) * 10) /* 1 ms per char */</font>
<font color="blue"> ^</font>
t.c:14:2: note: instantiated from:
<font color="darkgreen"> register i; \</font>
<font color="blue"> ^</font>
</pre>
<p>In practice, we've found that this is actually more useful in multiply nested
macros than in simple ones.</p>
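<p>For readers unfamiliar with the MYMAX macro from the first transcript, the
sketch below shows it used correctly, with two operands that can actually be
compared. It relies on GNU statement expressions and <code>__typeof__</code>,
which GCC and Clang accept as extensions but which are not standard C++. The
wrapper function <code>larger</code> is hypothetical.</p>

```cpp
// Macro text copied from the diagnostic above. It uses GNU extensions:
// a statement expression ({ ... }) and __typeof__.
#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a < __b ? __b : __a; })

// Hypothetical wrapper: both operands are floats, so the '<' inside the
// macro is well-formed. In the transcript, P was a 'struct mystruct', and
// that is the comparison the nested "instantiated from" note points at.
float larger(float p, float f) {
    return MYMAX(p, f);
}
```

<p>Because the failing comparison lives inside the macro body rather than at
the call site, a diagnostic that names only the call site leaves the reader to
expand the macro by hand.</p>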
<h2>Quality of Implementation and Attention to Detail</h2>

<p>Finally, we have put a lot of work into polishing the little things, because
little things add up over time and contribute to a great user experience. Three
examples are:</p>
<pre>
$ <b>gcc-4.2 t.c</b>
t.c: In function 'foo':
t.c:5: error: expected ';' before '}' token
$ <b>clang t.c</b>
t.c:4:8: <font color="red">error:</font> expected ';' after expression
<font color="darkgreen">  bar()</font>
<font color="blue">       ^</font>
<font color="blue">       ;</font>
</pre>
<p>This shows a trivial little tweak, where we tell you to put the semicolon at
the end of the line that is missing it (line 4) instead of at the beginning of
the following line (line 5). This is particularly important with fix-it hints
and caret diagnostics, because otherwise you don't get the important context.
</p>
<pre>
$ <b>gcc-4.2 t.c</b>
t.c:3: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
$ <b>clang t.c</b>
t.c:3:1: <font color="red">error:</font> unknown type name 'foo_t'
<font color="darkgreen">foo_t *P = 0;</font>
<font color="blue">^</font>
</pre>
<p>This shows an example of much better error recovery. The message coming out
of GCC is completely useless for diagnosing the problem; Clang tries much harder
and produces a much more useful diagnosis.</p>
<pre>
$ <b>cat t.cc</b>
template&lt;class T&gt;
class a {}
class temp {};
a&lt;temp&gt; b;
struct b {
}
$ <b>gcc-4.2 t.cc</b>
t.cc:3: error: multiple types in one declaration
t.cc:4: error: non-template type 'a' used as a template
t.cc:4: error: invalid type in declaration before ';' token
t.cc:6: error: expected unqualified-id at end of input
$ <b>clang t.cc</b>
t.cc:2:11: <font color="red">error:</font> expected ';' after class
<font color="darkgreen">class a {}</font>
<font color="blue">          ^</font>
<font color="blue">          ;</font>
t.cc:6:2: <font color="red">error:</font> expected ';' after struct
<font color="darkgreen">}</font>
<font color="blue"> ^</font>
<font color="blue"> ;</font>
</pre>
<p>This shows that we recover from the simple case of forgetting a ; after
a struct definition much better than GCC.</p>
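<p>As a quick check on the two fix-its, here is t.cc with both suggested
semicolons applied. The accessor <code>size_of_struct_b</code> is a
hypothetical addition so that the declarations are actually used; it is not
part of the original file.</p>

```cpp
#include <cstddef>

template<class T>
class a {};        // first fix-it: ';' added after the class definition
class temp {};
a<temp> b;
struct b {
};                 // second fix-it: ';' added after the struct definition

// Hypothetical helper. The variable b hides the class name b in this scope,
// so the type has to be named with the elaborated form "struct b".
std::size_t size_of_struct_b() {
    return sizeof(struct b);
}
```

<p>With both semicolons in place the file is well-formed C++: a variable and a
class may share a name in the same scope, with the variable hiding the class
name.</p>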
<p>While each of these details is minor, we feel that they all add up to provide
a much more polished experience.</p>
</div>
</body>
</html>