gcc/doc/analyzer.texi

   1 @c Copyright (C) 2019-2023 Free Software Foundation, Inc.
   2 @c This is part of the GCC manual.
   3 @c For copying conditions, see the file gcc.texi.
   4 @c Contributed by David Malcolm <dmalcolm@redhat.com>.
   5
   6 @node Static Analyzer
   7 @chapter Static Analyzer
   8 @cindex analyzer
   9 @cindex static analysis
  10 @cindex static analyzer
  11
  12 @menu
  13 * Analyzer Internals::       Analyzer Internals
  14 * Debugging the Analyzer::   Useful debugging tips
  15 @end menu
  16
  17 @node Analyzer Internals
  18 @section Analyzer Internals
  19 @cindex analyzer, internals
  20 @cindex static analyzer, internals
  21
  22 @subsection Overview
  23
  24 The analyzer implementation works on the gimple-SSA representation.
  25 (I chose this in the hopes of making it easy to work with LTO to
  26 do whole-program analysis).
  27
  28 The implementation is read-only: it doesn't attempt to change anything,
  29 just emit warnings.
  30
  31 The gimple representation can be seen using @option{-fdump-ipa-analyzer}.
  32 @quotation Tip
  33 If the analyzer ICEs before this is written out, one workaround is to use
  34 @option{--param=analyzer-bb-explosion-factor=0} to force the analyzer
  35 to bail out after analyzing the first basic block.
  36 @end quotation
  37
  38 First, we build a @code{supergraph} which combines the callgraph and all
  39 of the CFGs into a single directed graph, with both interprocedural and
  40 intraprocedural edges.  The nodes and edges in the supergraph are called
  41 ``supernodes'' and ``superedges'', and often referred to in code as
  42 @code{snodes} and @code{sedges}.  Basic blocks in the CFGs are split at
  43 interprocedural calls, so there can be more than one supernode per
  44 basic block.  Most statements will be in just one supernode, but a call
  45 statement can appear in two supernodes: at the end of one for the call,
  46 and again at the start of another for the return.
  47
  48 The supergraph can be seen using @option{-fdump-analyzer-supergraph}.
  49
  50 We then build an @code{analysis_plan} which walks the callgraph to
  51 determine which calls might be suitable for being summarized (rather
  52 than fully explored) and thus in what order to explore the functions.
  53
  54 Next is the heart of the analyzer: we use a worklist to explore state
  55 within the supergraph, building an "exploded graph".
  56 Nodes in the exploded graph correspond to <point,@w{ }state> pairs, as in
  57      "Precise Interprocedural Dataflow Analysis via Graph Reachability"
  58      (Thomas Reps, Susan Horwitz and Mooly Sagiv).
  59
  60 We reuse nodes for <point, state> pairs we've already seen, and avoid
  61 tracking state too closely, so that (hopefully) we rapidly converge
  62 on a final exploded graph, and terminate the analysis.  We also bail
  63 out if the number of exploded <end-of-basic-block, state> nodes gets
  64 larger than a particular multiple of the total number of basic blocks
  65 (to ensure termination in the face of pathological state-explosion
  66 cases, or bugs).  We also stop exploring a point once we hit a limit
  67 of states for that point.
  68
  69 We can identify problems directly when processing a <point,@w{ }state>
  70 instance.  For example, if we're finding the successors of
  71
  72 @smallexample
  73    <point: before-stmt: "free (ptr);",
  74     state: @{"ptr": freed@}>
  75 @end smallexample
  76
  77 then we can detect a double-free of "ptr".  We can then emit a path
  78 to reach the problem by finding the simplest route through the graph.
  79
  80 Program points in the analysis are much more fine-grained than in the
  81 CFG and supergraph, with points (and thus potentially exploded nodes)
  82 for various events, including before individual statements.
  83 By default the exploded graph merges multiple consecutive statements
  84 in a supernode into one exploded edge to minimize the size of the
  85 exploded graph.  This can be suppressed via
  86 @option{-fanalyzer-fine-grained}.
  87 The fine-grained approach seems to make things simpler and more debuggable
  88 that other approaches I tried, in that each point is responsible for one
  89 thing.
  90
  91 Program points in the analysis also have a "call string" identifying the
  92 stack of callsites below them, so that paths in the exploded graph
  93 correspond to interprocedurally valid paths: we always return to the
  94 correct call site, propagating state information accordingly.
  95 We avoid infinite recursion by stopping the analysis if a callsite
  96 appears more than @code{analyzer-max-recursion-depth} in a callstring
  97 (defaulting to 2).
  98
  99 @subsection Graphs
 100
 101 Nodes and edges in the exploded graph are called ``exploded nodes'' and
 102 ``exploded edges'' and often referred to in the code as
 103 @code{enodes} and @code{eedges} (especially when distinguishing them
 104 from the @code{snodes} and @code{sedges} in the supergraph).
 105
 106 Each graph numbers its nodes, giving unique identifiers - supernodes
 107 are referred to throughout dumps in the form @samp{SN': @var{index}} and
 108 exploded nodes in the form @samp{EN: @var{index}} (e.g. @samp{SN: 2} and
 109 @samp{EN:29}).
 110
 111 The supergraph can be seen using @option{-fdump-analyzer-supergraph-graph}.
 112
 113 The exploded graph can be seen using @option{-fdump-analyzer-exploded-graph}
 114 and other dump options.  Exploded nodes are color-coded in the .dot output
 115 based on state-machine states to make it easier to see state changes at
 116 a glance.
 117
 118 @subsection State Tracking
 119
 120 There's a tension between:
 121 @itemize @bullet
 122 @item
 123 precision of analysis in the straight-line case, vs
 124 @item
 125 exponential blow-up in the face of control flow.
 126 @end itemize
 127
 128 For example, in general, given this CFG:
 129
 130 @smallexample
 131       A
 132      / \
 133     B   C
 134      \ /
 135       D
 136      / \
 137     E   F
 138      \ /
 139       G
 140 @end smallexample
 141
 142 we want to avoid differences in state-tracking in B and C from
 143 leading to blow-up.  If we don't prevent state blowup, we end up
 144 with exponential growth of the exploded graph like this:
 145
 146 @smallexample
 147
 148            1:A
 149           /   \
 150          /     \
 151         /       \
 152       2:B       3:C
 153        |         |
 154       4:D       5:D        (2 exploded nodes for D)
 155      /   \     /   \
 156    6:E   7:F 8:E   9:F
 157     |     |   |     |
 158    10:G 11:G 12:G  13:G    (4 exploded nodes for G)
 159
 160 @end smallexample
 161
 162 Similar issues arise with loops.
 163
 164 To prevent this, we follow various approaches:
 165
 166 @enumerate a
 167 @item
 168 state pruning: which tries to discard state that won't be relevant
 169 later on withing the function.
 170 This can be disabled via @option{-fno-analyzer-state-purge}.
 171
 172 @item
 173 state merging.  We can try to find the commonality between two
 174 program_state instances to make a third, simpler program_state.
 175 We have two strategies here:
 176
 177   @enumerate
 178   @item
 179      the worklist keeps new nodes for the same program_point together,
 180      and tries to merge them before processing, and thus before they have
 181      successors.  Hence, in the above, the two nodes for D (4 and 5) reach
 182      the front of the worklist together, and we create a node for D with
 183      the merger of the incoming states.
 184
 185   @item
 186      try merging with the state of existing enodes for the program_point
 187      (which may have already been explored).  There will be duplication,
 188      but only one set of duplication; subsequent duplicates are more likely
 189      to hit the cache.  In particular, (hopefully) all merger chains are
 190      finite, and so we guarantee termination.
 191      This is intended to help with loops: we ought to explore the first
 192      iteration, and then have a "subsequent iterations" exploration,
 193      which uses a state merged from that of the first, to be more abstract.
 194   @end enumerate
 195
 196 We avoid merging pairs of states that have state-machine differences,
 197 as these are the kinds of differences that are likely to be most
 198 interesting.  So, for example, given:
 199
 200 @smallexample
 201       if (condition)
 202         ptr = malloc (size);
 203       else
 204         ptr = local_buf;
 205
 206       .... do things with 'ptr'
 207
 208       if (condition)
 209         free (ptr);
 210
 211       ...etc
 212 @end smallexample
 213
 214 then we end up with an exploded graph that looks like this:
 215
 216 @smallexample
 217
 218                    if (condition)
 219                      / T      \ F
 220             ---------          ----------
 221            /                             \
 222       ptr = malloc (size)             ptr = local_buf
 223           |                               |
 224       copy of                         copy of
 225         "do things with 'ptr'"          "do things with 'ptr'"
 226       with ptr: heap-allocated        with ptr: stack-allocated
 227           |                               |
 228       if (condition)                  if (condition)
 229           | known to be T                 | known to be F
 230       free (ptr);                         |
 231            \                             /
 232             -----------------------------
 233                          | ('ptr' is pruned, so states can be merged)
 234                         etc
 235
 236 @end smallexample
 237
 238 where some duplication has occurred, but only for the places where the
 239 the different paths are worth exploringly separately.
 240
 241 Merging can be disabled via @option{-fno-analyzer-state-merge}.
 242 @end enumerate
 243
 244 @subsection Region Model
 245
 246 Part of the state stored at a @code{exploded_node} is a @code{region_model}.
 247 This is an implementation of the region-based ternary model described in
 248 @url{https://www.researchgate.net/publication/221430855_A_Memory_Model_for_Static_Analysis_of_C_Programs,
 249 "A Memory Model for Static Analysis of C Programs"}
 250 (Zhongxing Xu, Ted Kremenek, and Jian Zhang).
 251
 252 A @code{region_model} encapsulates a representation of the state of
 253 memory, with a @code{store} recording a binding between @code{region}
 254 instances, to @code{svalue} instances.  The bindings are organized into
 255 clusters, where regions accessible via well-defined pointer arithmetic
 256 are in the same cluster.  The representation is graph-like because values
 257 can be pointers to regions.  It also stores a constraint_manager,
 258 capturing relationships between the values.
 259
 260 Because each node in the @code{exploded_graph} has a @code{region_model},
 261 and each of the latter is graph-like, the @code{exploded_graph} is in some
 262 ways a graph of graphs.
 263
 264 Here's an example of printing a @code{program_state}, showing the
 265 @code{region_model} within it, along with state for the @code{malloc}
 266 state machine.
 267
 268 @smallexample
 269 (gdb) call debug (*this)
 270 rmodel:
 271 stack depth: 1
 272   frame (index 0): frame: ‘test’@@1
 273 clusters within frame: ‘test’@@1
 274   cluster for: ptr_3: &HEAP_ALLOCATED_REGION(12)
 275 m_called_unknown_fn: FALSE
 276 constraint_manager:
 277   equiv classes:
 278   constraints:
 279 malloc:
 280   0x2e89590: &HEAP_ALLOCATED_REGION(12): unchecked ('ptr_3')
 281 @end smallexample
 282
 283 This is the state at the point of returning from @code{calls_malloc} back
 284 to @code{test} in the following:
 285
 286 @smallexample
 287 void *
 288 calls_malloc (void)
 289 @{
 290   void *result = malloc (1024);
 291   return result;
 292 @}
 293
 294 void test (void)
 295 @{
 296   void *ptr = calls_malloc ();
 297   /* etc.  */
 298 @}
 299 @end smallexample
 300
 301 Within the store, there is the cluster for @code{ptr_3} within the frame
 302 for @code{test}, where the whole cluster is bound to a pointer value,
 303 pointing at @code{HEAP_ALLOCATED_REGION(12)}.  Additionally, this pointer
 304 has the @code{unchecked} state for the @code{malloc} state machine
 305 indicating it hasn't yet been checked against NULL since the allocation
 306 call.
 307
 308 @subsection Analyzer Paths
 309
 310 We need to explain to the user what the problem is, and to persuade them
 311 that there really is a problem.  Hence having a @code{diagnostic_path}
 312 isn't just an incidental detail of the analyzer; it's required.
 313
 314 Paths ought to be:
 315 @itemize @bullet
 316 @item
 317 interprocedurally-valid
 318 @item
 319 feasible
 320 @end itemize
 321
 322 Without state-merging, all paths in the exploded graph are feasible
 323 (in terms of constraints being satisfied).
 324 With state-merging, paths in the exploded graph can be infeasible.
 325
 326 We collate warnings and only emit them for the simplest path
 327 e.g. for a bug in a utility function, with lots of routes to calling it,
 328 we only emit the simplest path (which could be intraprocedural, if
 329 it can be reproduced without a caller).
 330
 331 We thus want to find the shortest feasible path through the exploded
 332 graph from the origin to the exploded node at which the diagnostic was
 333 saved.  Unfortunately, if we simply find the shortest such path and
 334 check if it's feasible we might falsely reject the diagnostic, as there
 335 might be a longer path that is feasible.  Examples include the cases
 336 where the diagnostic requires us to go at least once around a loop for a
 337 later condition to be satisfied, or where for a later condition to be
 338 satisfied we need to enter a suite of code that the simpler path skips.
 339
 340 We attempt to find the shortest feasible path to each diagnostic by
 341 first constructing a ``trimmed graph'' from the exploded graph,
 342 containing only those nodes and edges from which there are paths to
 343 the target node, and using Dijkstra's algorithm to order the trimmed
 344 nodes by minimal distance to the target.
 345
 346 We then use a worklist to iteratively build a ``feasible graph''
 347 (actually a tree), capturing the pertinent state along each path, in
 348 which every path to a ``feasible node'' is feasible by construction,
 349 restricting ourselves to the trimmed graph to ensure we stay on target,
 350 and ordering the worklist so that the first feasible path we find to the
 351 target node is the shortest possible path.  Hence we start by trying the
 352 shortest possible path, but if that fails, we explore progressively
 353 longer paths, eventually trying iterations through loops.  The
 354 exploration is captured in the feasible_graph, which can be dumped as a
 355 .dot file via @option{-fdump-analyzer-feasibility} to visualize the
 356 exploration.  The indices of the feasible nodes show the order in which
 357 they were created.  We effectively explore the tree of feasible paths in
 358 order of shortest path until we either find a feasible path to the
 359 target node, or hit a limit and give up.
 360
 361 This is something of a brute-force approach, but the trimmed graph
 362 hopefully keeps the complexity manageable.
 363
 364 This algorithm can be disabled (for debugging purposes) via
 365 @option{-fno-analyzer-feasibility}, which simply uses the shortest path,
 366 and notes if it is infeasible.
 367
 368 The above gives us a shortest feasible @code{exploded_path} through the
 369 @code{exploded_graph} (a list of @code{exploded_edge *}).  We use this
 370 @code{exploded_path} to build a @code{diagnostic_path} (a list of
 371 @strong{events} for the diagnostic subsystem) - specifically a
 372 @code{checker_path}.
 373
 374 Having built the @code{checker_path}, we prune it to try to eliminate
 375 events that aren't relevant, to minimize how much the user has to read.
 376
 377 After pruning, we notify each event in the path of its ID and record the
 378 IDs of interesting events, allowing for events to refer to other events
 379 in their descriptions.  The @code{pending_diagnostic} class has various
 380 vfuncs to support emitting more precise descriptions, so that e.g.
 381
 382 @itemize @bullet
 383 @item
 384 a deref-of-unchecked-malloc diagnostic might use:
 385 @smallexample
 386   returning possibly-NULL pointer to 'make_obj' from 'allocator'
 387 @end smallexample
 388 for a @code{return_event} to make it clearer how the unchecked value moves
 389 from callee back to caller
 390 @item
 391 a double-free diagnostic might use:
 392 @smallexample
 393   second 'free' here; first 'free' was at (3)
 394 @end smallexample
 395 and a use-after-free might use
 396 @smallexample
 397   use after 'free' here; memory was freed at (2)
 398 @end smallexample
 399 @end itemize
 400
 401 At this point we can emit the diagnostic.
 402
 403 @subsection Limitations
 404
 405 @itemize @bullet
 406 @item
 407 Only for C so far
 408 @item
 409 The implementation of call summaries is currently very simplistic.
 410 @item
 411 Lack of function pointer analysis
 412 @item
 413 The constraint-handling code assumes reflexivity in some places
 414 (that values are equal to themselves), which is not the case for NaN.
 415 As a simple workaround, constraints on floating-point values are
 416 currently ignored.
 417 @item
 418 There are various other limitations in the region model (grep for TODO/xfail
 419 in the testsuite).
 420 @item
 421 The constraint_manager's implementation of transitivity is currently too
 422 expensive to enable by default and so must be manually enabled via
 423 @option{-fanalyzer-transitivity}).
 424 @item
 425 The checkers are currently hardcoded and don't allow for user extensibility
 426 (e.g. adding allocate/release pairs).
 427 @item
 428 Although the analyzer's test suite has a proof-of-concept test case for
 429 LTO, LTO support hasn't had extensive testing.  There are various
 430 lang-specific things in the analyzer that assume C rather than LTO.
 431 For example, SSA names are printed to the user in ``raw'' form, rather
 432 than printing the underlying variable name.
 433 @end itemize
 434
 435 @node Debugging the Analyzer
 436 @section Debugging the Analyzer
 437 @cindex analyzer, debugging
 438 @cindex static analyzer, debugging
 439
 440 When debugging the analyzer I normally use all of these options
 441 together:
 442
 443 @smallexample
 444 ./xgcc -B. \
 445   -S \
 446   -fanalyzer \
 447   OTHER_GCC_ARGS \
 448   -wrapper gdb,--args \
 449   -fdump-analyzer-stderr \
 450   -fanalyzer-fine-grained \
 451   -fdump-ipa-analyzer=stderr
 452 @end smallexample
 453
 454 where:
 455
 456 @itemize @bullet
 457 @item @code{./xgcc -B.}
 458 is the usual way to invoke a self-built GCC from within the @file{BUILDDIR/gcc}
 459 subdirectory.
 460
 461 @item @code{-S}
 462 so that the driver (@code{./xgcc}) invokes @code{cc1}, but doesn't bother
 463 running the assembler or linker (since the analyzer runs inside @code{cc1}).
 464
 465 @item @code{-fanalyzer}
 466 enables the analyzer, obviously.
 467
 468 @item @code{-wrapper gdb,--args}
 469 invokes @code{cc1} under the debugger so that I can debug @code{cc1} and
 470 set breakpoints and step through things.
 471
 472 @item @code{-fdump-analyzer-stderr}
 473 so that the logging interface is enabled and goes to stderr, which often
 474 gives valuable context into what's happening when stepping through the
 475 analyzer
 476
 477 @item @code{-fanalyzer-fine-grained}
 478 which splits the effect of every statement into its own
 479 exploded_node, rather than the default (which tries to combine
 480 successive stmts to reduce the size of the exploded_graph).  This makes
 481 it easier to see exactly where a particular change happens.
 482
 483 @item @code{-fdump-ipa-analyzer=stderr}
 484 which dumps the GIMPLE IR seen by the analyzer pass to stderr
 485
 486 @end itemize
 487
 488 Other useful options:
 489
 490 @itemize @bullet
 491 @item @code{-fdump-analyzer-exploded-graph}
 492 which dumps a @file{SRC.eg.dot} GraphViz file that I can look at (with
 493 python-xdot)
 494
 495 @item @code{-fdump-analyzer-exploded-nodes-2}
 496 which dumps a @file{SRC.eg.txt} file containing the full @code{exploded_graph}.
 497
 498 @end itemize
 499
 500 Assuming that you have the
 501 @uref{https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html,,python support scripts for gdb}
 502 installed, you can use:
 503
 504 @smallexample
 505 (gdb) break-on-saved-diagnostic
 506 @end smallexample
 507
 508 to put a breakpoint at the place where a diagnostic is saved during
 509 @code{exploded_graph} exploration, to see where a particular diagnostic
 510 is being saved, and:
 511
 512 @smallexample
 513 (gdb) break-on-diagnostic
 514 @end smallexample
 515
 516 to put a breakpoint at the place where diagnostics are actually emitted.
 517
 518 @subsection Special Functions for Debugging the Analyzer
 519
 520 The analyzer recognizes various special functions by name, for use
 521 in debugging the analyzer, and for use in DejaGnu tests.
 522
 523 The declarations of these functions can be seen in the testsuite
 524 in @file{analyzer-decls.h}.  None of these functions are actually
 525 implemented in terms of code, merely as @code{known_function} subclasses
 526 (in @file{gcc/analyzer/kf-analyzer.cc}).
 527
 528 @table @code
 529
 530 @item __analyzer_break
 531 Add:
 532 @smallexample
 533   __analyzer_break ();
 534 @end smallexample
 535 to the source being analyzed to trigger a breakpoint in the analyzer when
 536 that source is reached.  By putting a series of these in the source, it's
 537 much easier to effectively step through the program state as it's analyzed.
 538
 539 @item __analyzer_describe
 540 The analyzer handles:
 541
 542 @smallexample
 543 __analyzer_describe (0, expr);
 544 @end smallexample
 545
 546 by emitting a warning describing the 2nd argument (which can be of any
 547 type), at a verbosity level given by the 1st argument.  This is for use when
 548 debugging, and may be of use in DejaGnu tests.
 549
 550 @item __analyzer_dump
 551 @smallexample
 552 __analyzer_dump ();
 553 @end smallexample
 554
 555 will dump the copious information about the analyzer's state each time it
 556 reaches the call in its traversal of the source.
 557
 558 @item __analyzer_dump_capacity
 559 @smallexample
 560 extern void __analyzer_dump_capacity (const void *ptr);
 561 @end smallexample
 562
 563 will emit a warning describing the capacity of the base region of
 564 the region pointed to by the 1st argument.
 565
 566 @item __analyzer_dump_escaped
 567 @smallexample
 568 extern void __analyzer_dump_escaped (void);
 569 @end smallexample
 570
 571 will emit a warning giving the number of decls that have escaped on this
 572 analysis path, followed by a comma-separated list of their names,
 573 in alphabetical order.
 574
 575 @item __analyzer_dump_path
 576 @smallexample
 577 __analyzer_dump_path ();
 578 @end smallexample
 579
 580 will emit a placeholder ``note'' diagnostic with a path to that call site,
 581 if the analyzer finds a feasible path to it.  This can be useful for
 582 writing DejaGnu tests for constraint-tracking and feasibility checking.
 583
 584 @item __analyzer_dump_exploded_nodes
 585 For every callsite to @code{__analyzer_dump_exploded_nodes} the analyzer
 586 will emit a warning after it finished the analysis containing information
 587 on all of the exploded nodes at that program point.
 588
 589 @smallexample
 590   __analyzer_dump_exploded_nodes (0);
 591 @end smallexample
 592
 593 will output the number of ``processed'' nodes, and the IDs of
 594 both ``processed'' and ``merger'' nodes, such as:
 595
 596 @smallexample
 597 warning: 2 processed enodes: [EN: 56, EN: 58] merger(s): [EN: 54-55, EN: 57, EN: 59]
 598 @end smallexample
 599
 600 With a non-zero argument
 601
 602 @smallexample
 603   __analyzer_dump_exploded_nodes (1);
 604 @end smallexample
 605
 606 it will also dump all of the states within the ``processed'' nodes.
 607
 608 @item __analyzer_dump_named_constant
 609 When the analyzer sees a call to @code{__analyzer_dump_named_constant} it
 610 will emit a warning describing what is known about the value of a given
 611 named constant, for parts of the analyzer that interact with target
 612 headers.
 613
 614 For example:
 615
 616 @smallexample
 617 __analyzer_dump_named_constant ("O_RDONLY");
 618 @end smallexample
 619
 620 might lead to the analyzer emitting the warning:
 621
 622 @smallexample
 623 warning: named constant 'O_RDONLY' has value '1'
 624 @end smallexample
 625
 626 @item __analyzer_dump_region_model
 627 @smallexample
 628    __analyzer_dump_region_model ();
 629 @end smallexample
 630 will dump the region_model's state to stderr.
 631
 632 @item __analyzer_dump_state
 633 @smallexample
 634 __analyzer_dump_state ("malloc", ptr);
 635 @end smallexample
 636
 637 will emit a warning describing the state of the 2nd argument
 638 (which can be of any type) with respect to the state machine with
 639 a name matching the 1st argument (which must be a string literal).
 640 This is for use when debugging, and may be of use in DejaGnu tests.
 641
 642 @item __analyzer_eval
 643 @smallexample
 644 __analyzer_eval (expr);
 645 @end smallexample
 646 will emit a warning with text "TRUE", FALSE" or "UNKNOWN" based on the
 647 truthfulness of the argument.  This is useful for writing DejaGnu tests.
 648
 649 @item __analyzer_get_unknown_ptr
 650 @smallexample
 651 __analyzer_get_unknown_ptr ();
 652 @end smallexample
 653 will obtain an unknown @code{void *}.
 654
 655 @item __analyzer_get_strlen
 656 @smallexample
 657 __analyzer_get_strlen (buf);
 658 @end smallexample
 659 will emit a warning if PTR doesn't point to a null-terminated string.
 660 TODO: eventually get the strlen of the buffer (without the
 661 optimizer touching it).
 662
 663 @end table
 664
 665 @subsection Other Debugging Techniques
 666
 667 To compare two different exploded graphs, try
 668 @code{-fdump-analyzer-exploded-nodes-2 -fdump-noaddr -fanalyzer-fine-grained}.
 669 This will dump a @file{SRC.eg.txt} file containing the full
 670 @code{exploded_graph}. I use @code{diff -u50 -p} to compare two different
 671 such files (e.g. before and after a patch) to find the first place where the
 672 two graphs diverge.  The option @option{-fdump-noaddr} will suppress
 673 printing pointers withihn the dumps (which would otherwise hide the real
 674 differences with irrelevent churn).
 675
 676 The option @option{-fdump-analyzer-json} will dump both the supergraph
 677 and the exploded graph in compressed JSON form.
 678
 679 One approach when tracking down where a particular bogus state is
 680 introduced into the @code{exploded_graph} is to add custom code to
 681 @code{program_state::validate}.
 682
 683 The debug function @code{region::is_named_decl_p} can be used when debugging,
 684 such as for assertions and conditional breakpoints.  For example, when
 685 tracking down a bug in handling a decl called @code{yy_buffer_stack}, I
 686 temporarily added a:
 687 @smallexample
 688   gcc_assert (!m_base_region->is_named_decl_p ("yy_buffer_stack"));
 689 @end smallexample
 690 to @code{binding_cluster::mark_as_escaped} to trap a point where
 691 @code{yy_buffer_stack} was mistakenly being treated as having escaped.