gcc/doc/lto.texi

   1 @c Copyright (C) 2010-2017 Free Software Foundation, Inc.
   2 @c This is part of the GCC manual.
   3 @c For copying conditions, see the file gcc.texi.
   4 @c Contributed by Jan Hubicka <jh@suse.cz> and
   5 @c Diego Novillo <dnovillo@google.com>
   6
   7 @node LTO
   8 @chapter Link Time Optimization
   9 @cindex lto
  10 @cindex whopr
  11 @cindex wpa
  12 @cindex ltrans
  13
  14 Link Time Optimization (LTO) gives GCC the capability of
  15 dumping its internal representation (GIMPLE) to disk,
  16 so that all the different compilation units that make up
  17 a single executable can be optimized as a single module.
  18 This expands the scope of inter-procedural optimizations
  19 to encompass the whole program (or, rather, everything
  20 that is visible at link time).
  21
  22 @menu
  23 * LTO Overview::            Overview of LTO.
  24 * LTO object file layout::  LTO file sections in ELF.
  25 * IPA::                     Using summary information in IPA passes.
  26 * WHOPR::                   Whole program assumptions,
  27                             linker plugin and symbol visibilities.
  28 * Internal flags::          Internal flags controlling @code{lto1}.
  29 @end menu
  30
  31 @node LTO Overview
  32 @section Design Overview
  33
  34 Link time optimization is implemented as a GCC front end for a
  35 bytecode representation of GIMPLE that is emitted in special sections
  36 of @code{.o} files.  Currently, LTO support is enabled in most
  37 ELF-based systems, as well as darwin, cygwin and mingw systems.
  38
  39 Since GIMPLE bytecode is saved alongside final object code, object
  40 files generated with LTO support are larger than regular object files.
  41 This ``fat'' object format makes it easy to integrate LTO into
  42 existing build systems, as one can, for instance, produce archives of
  43 the files.  Additionally, one might be able to ship one set of fat
  44 objects which could be used both for development and the production of
  45 optimized builds.  A, perhaps surprising, side effect of this feature
  46 is that any mistake in the toolchain leads to LTO information not
  47 being used (e.g.@: an older @code{libtool} calling @code{ld} directly).
  48 This is both an advantage, as the system is more robust, and a
  49 disadvantage, as the user is not informed that the optimization has
  50 been disabled.
  51
  52 The current implementation only produces ``fat'' objects, effectively
  53 doubling compilation time and increasing file sizes up to 5x the
  54 original size.  This hides the problem that some tools, such as
  55 @code{ar} and @code{nm}, need to understand symbol tables of LTO
  56 sections.  These tools were extended to use the plugin infrastructure,
  57 and with these problems solved, GCC will also support ``slim'' objects
  58 consisting of the intermediate code alone.
  59
  60 At the highest level, LTO splits the compiler in two.  The first half
  61 (the ``writer'') produces a streaming representation of all the
  62 internal data structures needed to optimize and generate code.  This
  63 includes declarations, types, the callgraph and the GIMPLE representation
  64 of function bodies.
  65
  66 When @option{-flto} is given during compilation of a source file, the
  67 pass manager executes all the passes in @code{all_lto_gen_passes}.
  68 Currently, this phase is composed of two IPA passes:
  69
  70 @itemize @bullet
  71 @item @code{pass_ipa_lto_gimple_out}
  72 This pass executes the function @code{lto_output} in
  73 @file{lto-streamer-out.c}, which traverses the call graph encoding
  74 every reachable declaration, type and function.  This generates a
  75 memory representation of all the file sections described below.
  76
  77 @item @code{pass_ipa_lto_finish_out}
  78 This pass executes the function @code{produce_asm_for_decls} in
  79 @file{lto-streamer-out.c}, which takes the memory image built in the
  80 previous pass and encodes it in the corresponding ELF file sections.
  81 @end itemize
  82
  83 The second half of LTO support is the ``reader''.  This is implemented
  84 as the GCC front end @file{lto1} in @file{lto/lto.c}.  When
  85 @file{collect2} detects a link set of @code{.o}/@code{.a} files with
  86 LTO information and the @option{-flto} is enabled, it invokes
  87 @file{lto1} which reads the set of files and aggregates them into a
  88 single translation unit for optimization.  The main entry point for
  89 the reader is @file{lto/lto.c}:@code{lto_main}.
  90
  91 @subsection LTO modes of operation
  92
  93 One of the main goals of the GCC link-time infrastructure was to allow
  94 effective compilation of large programs.  For this reason GCC implements two
  95 link-time compilation modes.
  96
  97 @enumerate
  98 @item   @emph{LTO mode}, in which the whole program is read into the
  99 compiler at link-time and optimized in a similar way as if it
 100 were a single source-level compilation unit.
 101
 102 @item   @emph{WHOPR or partitioned mode}, designed to utilize multiple
 103 CPUs and/or a distributed compilation environment to quickly link
 104 large applications.  WHOPR stands for WHOle Program optimizeR (not to
 105 be confused with the semantics of @option{-fwhole-program}).  It
 106 partitions the aggregated callgraph from many different @code{.o}
 107 files and distributes the compilation of the sub-graphs to different
 108 CPUs.
 109
 110 Note that distributed compilation is not implemented yet, but since
 111 the parallelism is facilitated via generating a @code{Makefile}, it
 112 would be easy to implement.
 113 @end enumerate
 114
 115 WHOPR splits LTO into three main stages:
 116 @enumerate
 117 @item Local generation (LGEN)
 118 This stage executes in parallel.  Every file in the program is compiled
 119 into the intermediate language and packaged together with the local
 120 call-graph and summary information.  This stage is the same for both
 121 the LTO and WHOPR compilation mode.
 122
 123 @item Whole Program Analysis (WPA)
 124 WPA is performed sequentially.  The global call-graph is generated, and
 125 a global analysis procedure makes transformation decisions.  The global
 126 call-graph is partitioned to facilitate parallel optimization during
 127 phase 3.  The results of the WPA stage are stored into new object files
 128 which contain the partitions of program expressed in the intermediate
 129 language and the optimization decisions.
 130
 131 @item Local transformations (LTRANS)
 132 This stage executes in parallel.  All the decisions made during phase 2
 133 are implemented locally in each partitioned object file, and the final
 134 object code is generated.  Optimizations which cannot be decided
 135 efficiently during the phase 2 may be performed on the local
 136 call-graph partitions.
 137 @end enumerate
 138
 139 WHOPR can be seen as an extension of the usual LTO mode of
 140 compilation.  In LTO, WPA and LTRANS are executed within a single
 141 execution of the compiler, after the whole program has been read into
 142 memory.
 143
 144 When compiling in WHOPR mode, the callgraph is partitioned during
 145 the WPA stage.  The whole program is split into a given number of
 146 partitions of roughly the same size.  The compiler tries to
 147 minimize the number of references which cross partition boundaries.
 148 The main advantage of WHOPR is to allow the parallel execution of
 149 LTRANS stages, which are the most time-consuming part of the
 150 compilation process.  Additionally, it avoids the need to load the
 151 whole program into memory.
 152
 153
 154 @node LTO object file layout
 155 @section LTO file sections
 156
 157 LTO information is stored in several ELF sections inside object files.
 158 Data structures and enum codes for sections are defined in
 159 @file{lto-streamer.h}.
 160
 161 These sections are emitted from @file{lto-streamer-out.c} and mapped
 162 in all at once from @file{lto/lto.c}:@code{lto_file_read}.  The
 163 individual functions dealing with the reading/writing of each section
 164 are described below.
 165
 166 @itemize @bullet
 167 @item Command line options (@code{.gnu.lto_.opts})
 168
 169 This section contains the command line options used to generate the
 170 object files.  This is used at link time to determine the optimization
 171 level and other settings when they are not explicitly specified at the
 172 linker command line.
 173
 174 Currently, GCC does not support combining LTO object files compiled
 175 with different set of the command line options into a single binary.
 176 At link time, the options given on the command line and the options
 177 saved on all the files in a link-time set are applied globally.  No
 178 attempt is made at validating the combination of flags (other than the
 179 usual validation done by option processing).  This is implemented in
 180 @file{lto/lto.c}:@code{lto_read_all_file_options}.
 181
 182
 183 @item Symbol table (@code{.gnu.lto_.symtab})
 184
 185 This table replaces the ELF symbol table for functions and variables
 186 represented in the LTO IL.  Symbols used and exported by the optimized
 187 assembly code of ``fat'' objects might not match the ones used and
 188 exported by the intermediate code.  This table is necessary because
 189 the intermediate code is less optimized and thus requires a separate
 190 symbol table.
 191
 192 Additionally, the binary code in the ``fat'' object will lack a call
 193 to a function, since the call was optimized out at compilation time
 194 after the intermediate language was streamed out.  In some special
 195 cases, the same optimization may not happen during link-time
 196 optimization.  This would lead to an undefined symbol if only one
 197 symbol table was used.
 198
 199 The symbol table is emitted in
 200 @file{lto-streamer-out.c}:@code{produce_symtab}.
 201
 202
 203 @item Global declarations and types (@code{.gnu.lto_.decls})
 204
 205 This section contains an intermediate language dump of all
 206 declarations and types required to represent the callgraph, static
 207 variables and top-level debug info.
 208
 209 The contents of this section are emitted in
 210 @file{lto-streamer-out.c}:@code{produce_asm_for_decls}.  Types and
 211 symbols are emitted in a topological order that preserves the sharing
 212 of pointers when the file is read back in
 213 (@file{lto.c}:@code{read_cgraph_and_symbols}).
 214
 215
 216 @item The callgraph (@code{.gnu.lto_.cgraph})
 217
 218 This section contains the basic data structure used by the GCC
 219 inter-procedural optimization infrastructure.  This section stores an
 220 annotated multi-graph which represents the functions and call sites as
 221 well as the variables, aliases and top-level @code{asm} statements.
 222
 223 This section is emitted in
 224 @file{lto-streamer-out.c}:@code{output_cgraph} and read in
 225 @file{lto-cgraph.c}:@code{input_cgraph}.
 226
 227
 228 @item IPA references (@code{.gnu.lto_.refs})
 229
 230 This section contains references between function and static
 231 variables.  It is emitted by @file{lto-cgraph.c}:@code{output_refs}
 232 and read by @file{lto-cgraph.c}:@code{input_refs}.
 233
 234
 235 @item Function bodies (@code{.gnu.lto_.function_body.<name>})
 236
 237 This section contains function bodies in the intermediate language
 238 representation.  Every function body is in a separate section to allow
 239 copying of the section independently to different object files or
 240 reading the function on demand.
 241
 242 Functions are emitted in
 243 @file{lto-streamer-out.c}:@code{output_function} and read in
 244 @file{lto-streamer-in.c}:@code{input_function}.
 245
 246
 247 @item Static variable initializers (@code{.gnu.lto_.vars})
 248
 249 This section contains all the symbols in the global variable pool.  It
 250 is emitted by @file{lto-cgraph.c}:@code{output_varpool} and read in
 251 @file{lto-cgraph.c}:@code{input_cgraph}.
 252
 253 @item Summaries and optimization summaries used by IPA passes
 254 (@code{.gnu.lto_.<xxx>}, where @code{<xxx>} is one of @code{jmpfuncs},
 255 @code{pureconst} or @code{reference})
 256
 257 These sections are used by IPA passes that need to emit summary
 258 information during LTO generation to be read and aggregated at
 259 link time.  Each pass is responsible for implementing two pass manager
 260 hooks: one for writing the summary and another for reading it in.  The
 261 format of these sections is entirely up to each individual pass.  The
 262 only requirement is that the writer and reader hooks agree on the
 263 format.
 264 @end itemize
 265
 266
 267 @node IPA
 268 @section Using summary information in IPA passes
 269
 270 Programs are represented internally as a @emph{callgraph} (a
 271 multi-graph where nodes are functions and edges are call sites)
 272 and a @emph{varpool} (a list of static and external variables in
 273 the program).
 274
 275 The inter-procedural optimization is organized as a sequence of
 276 individual passes, which operate on the callgraph and the
 277 varpool.  To make the implementation of WHOPR possible, every
 278 inter-procedural optimization pass is split into several stages
 279 that are executed at different times during WHOPR compilation:
 280
 281 @itemize @bullet
 282 @item LGEN time
 283 @enumerate
 284 @item @emph{Generate summary} (@code{generate_summary} in
 285 @code{struct ipa_opt_pass_d}).  This stage analyzes every function
 286 body and variable initializer is examined and stores relevant
 287 information into a pass-specific data structure.
 288
 289 @item @emph{Write summary} (@code{write_summary} in
 290 @code{struct ipa_opt_pass_d}).  This stage writes all the
 291 pass-specific information generated by @code{generate_summary}.
 292 Summaries go into their own @code{LTO_section_*} sections that
 293 have to be declared in @file{lto-streamer.h}:@code{enum
 294 lto_section_type}.  A new section is created by calling
 295 @code{create_output_block} and data can be written using the
 296 @code{lto_output_*} routines.
 297 @end enumerate
 298
 299 @item WPA time
 300 @enumerate
 301 @item @emph{Read summary} (@code{read_summary} in
 302 @code{struct ipa_opt_pass_d}).  This stage reads all the
 303 pass-specific information in exactly the same order that it was
 304 written by @code{write_summary}.
 305
 306 @item @emph{Execute} (@code{execute} in @code{struct
 307 opt_pass}).  This performs inter-procedural propagation.  This
 308 must be done without actual access to the individual function
 309 bodies or variable initializers.  Typically, this results in a
 310 transitive closure operation over the summary information of all
 311 the nodes in the callgraph.
 312
 313 @item @emph{Write optimization summary}
 314 (@code{write_optimization_summary} in @code{struct
 315 ipa_opt_pass_d}).  This writes the result of the inter-procedural
 316 propagation into the object file.  This can use the same data
 317 structures and helper routines used in @code{write_summary}.
 318 @end enumerate
 319
 320 @item LTRANS time
 321 @enumerate
 322 @item @emph{Read optimization summary}
 323 (@code{read_optimization_summary} in @code{struct
 324 ipa_opt_pass_d}).  The counterpart to
 325 @code{write_optimization_summary}.  This reads the interprocedural
 326 optimization decisions in exactly the same format emitted by
 327 @code{write_optimization_summary}.
 328
 329 @item @emph{Transform} (@code{function_transform} and
 330 @code{variable_transform} in @code{struct ipa_opt_pass_d}).
 331 The actual function bodies and variable initializers are updated
 332 based on the information passed down from the @emph{Execute} stage.
 333 @end enumerate
 334 @end itemize
 335
 336 The implementation of the inter-procedural passes are shared
 337 between LTO, WHOPR and classic non-LTO compilation.
 338
 339 @itemize
 340 @item During the traditional file-by-file mode every pass executes its
 341 own @emph{Generate summary}, @emph{Execute}, and @emph{Transform}
 342 stages within the single execution context of the compiler.
 343
 344 @item In LTO compilation mode, every pass uses @emph{Generate
 345 summary} and @emph{Write summary} stages at compilation time,
 346 while the @emph{Read summary}, @emph{Execute}, and
 347 @emph{Transform} stages are executed at link time.
 348
 349 @item In WHOPR mode all stages are used.
 350 @end itemize
 351
 352 To simplify development, the GCC pass manager differentiates
 353 between normal inter-procedural passes and small inter-procedural
 354 passes.  A @emph{small inter-procedural pass}
 355 (@code{SIMPLE_IPA_PASS}) is a pass that does
 356 everything at once and thus it can not be executed during WPA in
 357 WHOPR mode.  It defines only the @emph{Execute} stage and during
 358 this stage it accesses and modifies the function bodies.  Such
 359 passes are useful for optimization at LGEN or LTRANS time and are
 360 used, for example, to implement early optimization before writing
 361 object files.  The simple inter-procedural passes can also be used
 362 for easier prototyping and development of a new inter-procedural
 363 pass.
 364
 365
 366 @subsection Virtual clones
 367
 368 One of the main challenges of introducing the WHOPR compilation
 369 mode was addressing the interactions between optimization passes.
 370 In LTO compilation mode, the passes are executed in a sequence,
 371 each of which consists of analysis (or @emph{Generate summary}),
 372 propagation (or @emph{Execute}) and @emph{Transform} stages.
 373 Once the work of one pass is finished, the next pass sees the
 374 updated program representation and can execute.  This makes the
 375 individual passes dependent on each other.
 376
 377 In WHOPR mode all passes first execute their @emph{Generate
 378 summary} stage.  Then summary writing marks the end of the LGEN
 379 stage.  At WPA time,
 380 the summaries are read back into memory and all passes run the
 381 @emph{Execute} stage.  Optimization summaries are streamed and
 382 sent to LTRANS, where all the passes execute the @emph{Transform}
 383 stage.
 384
 385 Most optimization passes split naturally into analysis,
 386 propagation and transformation stages.  But some do not.  The
 387 main problem arises when one pass performs changes and the
 388 following pass gets confused by seeing different callgraphs
 389 between the @emph{Transform} stage and the @emph{Generate summary}
 390 or @emph{Execute} stage.  This means that the passes are required
 391 to communicate their decisions with each other.
 392
 393 To facilitate this communication, the GCC callgraph
 394 infrastructure implements @emph{virtual clones}, a method of
 395 representing the changes performed by the optimization passes in
 396 the callgraph without needing to update function bodies.
 397
 398 A @emph{virtual clone} in the callgraph is a function that has no
 399 associated body, just a description of how to create its body based
 400 on a different function (which itself may be a virtual clone).
 401
 402 The description of function modifications includes adjustments to
 403 the function's signature (which allows, for example, removing or
 404 adding function arguments), substitutions to perform on the
 405 function body, and, for inlined functions, a pointer to the
 406 function that it will be inlined into.
 407
 408 It is also possible to redirect any edge of the callgraph from a
 409 function to its virtual clone.  This implies updating of the call
 410 site to adjust for the new function signature.
 411
 412 Most of the transformations performed by inter-procedural
 413 optimizations can be represented via virtual clones.  For
 414 instance, a constant propagation pass can produce a virtual clone
 415 of the function which replaces one of its arguments by a
 416 constant.  The inliner can represent its decisions by producing a
 417 clone of a function whose body will be later integrated into
 418 a given function.
 419
 420 Using @emph{virtual clones}, the program can be easily updated
 421 during the @emph{Execute} stage, solving most of pass interactions
 422 problems that would otherwise occur during @emph{Transform}.
 423
 424 Virtual clones are later materialized in the LTRANS stage and
 425 turned into real functions.  Passes executed after the virtual
 426 clone were introduced also perform their @emph{Transform} stage
 427 on new functions, so for a pass there is no significant
 428 difference between operating on a real function or a virtual
 429 clone introduced before its @emph{Execute} stage.
 430
 431 Optimization passes then work on virtual clones introduced before
 432 their @emph{Execute} stage as if they were real functions.  The
 433 only difference is that clones are not visible during the
 434 @emph{Generate Summary} stage.
 435
 436 To keep function summaries updated, the callgraph interface
 437 allows an optimizer to register a callback that is called every
 438 time a new clone is introduced as well as when the actual
 439 function or variable is generated or when a function or variable
 440 is removed.  These hooks are registered in the @emph{Generate
 441 summary} stage and allow the pass to keep its information intact
 442 until the @emph{Execute} stage.  The same hooks can also be
 443 registered during the @emph{Execute} stage to keep the
 444 optimization summaries updated for the @emph{Transform} stage.
 445
 446 @subsection IPA references
 447
 448 GCC represents IPA references in the callgraph.  For a function
 449 or variable @code{A}, the @emph{IPA reference} is a list of all
 450 locations where the address of @code{A} is taken and, when
 451 @code{A} is a variable, a list of all direct stores and reads
 452 to/from @code{A}.  References represent an oriented multi-graph on
 453 the union of nodes of the callgraph and the varpool.  See
 454 @file{ipa-reference.c}:@code{ipa_reference_write_optimization_summary}
 455 and
 456 @file{ipa-reference.c}:@code{ipa_reference_read_optimization_summary}
 457 for details.
 458
 459 @subsection Jump functions
 460 Suppose that an optimization pass sees a function @code{A} and it
 461 knows the values of (some of) its arguments.  The @emph{jump
 462 function} describes the value of a parameter of a given function
 463 call in function @code{A} based on this knowledge.
 464
 465 Jump functions are used by several optimizations, such as the
 466 inter-procedural constant propagation pass and the
 467 devirtualization pass.  The inliner also uses jump functions to
 468 perform inlining of callbacks.
 469
 470 @node WHOPR
 471 @section Whole program assumptions, linker plugin and symbol visibilities
 472
 473 Link-time optimization gives relatively minor benefits when used
 474 alone.  The problem is that propagation of inter-procedural
 475 information does not work well across functions and variables
 476 that are called or referenced by other compilation units (such as
 477 from a dynamically linked library).  We say that such functions
 478 and variables are @emph{externally visible}.
 479
 480 To make the situation even more difficult, many applications
 481 organize themselves as a set of shared libraries, and the default
 482 ELF visibility rules allow one to overwrite any externally
 483 visible symbol with a different symbol at runtime.  This
 484 basically disables any optimizations across such functions and
 485 variables, because the compiler cannot be sure that the function
 486 body it is seeing is the same function body that will be used at
 487 runtime.  Any function or variable not declared @code{static} in
 488 the sources degrades the quality of inter-procedural
 489 optimization.
 490
 491 To avoid this problem the compiler must assume that it sees the
 492 whole program when doing link-time optimization.  Strictly
 493 speaking, the whole program is rarely visible even at link-time.
 494 Standard system libraries are usually linked dynamically or not
 495 provided with the link-time information.  In GCC, the whole
 496 program option (@option{-fwhole-program}) asserts that every
 497 function and variable defined in the current compilation
 498 unit is static, except for function @code{main} (note: at
 499 link time, the current unit is the union of all objects compiled
 500 with LTO).  Since some functions and variables need to
 501 be referenced externally, for example by another DSO or from an
 502 assembler file, GCC also provides the function and variable
 503 attribute @code{externally_visible} which can be used to disable
 504 the effect of @option{-fwhole-program} on a specific symbol.
 505
 506 The whole program mode assumptions are slightly more complex in
 507 C++, where inline functions in headers are put into @emph{COMDAT}
 508 sections.  COMDAT function and variables can be defined by
 509 multiple object files and their bodies are unified at link-time
 510 and dynamic link-time.  COMDAT functions are changed to local only
 511 when their address is not taken and thus un-sharing them with a
 512 library is not harmful.  COMDAT variables always remain externally
 513 visible, however for readonly variables it is assumed that their
 514 initializers cannot be overwritten by a different value.
 515
 516 GCC provides the function and variable attribute
 517 @code{visibility} that can be used to specify the visibility of
 518 externally visible symbols (or alternatively an
 519 @option{-fdefault-visibility} command line option).  ELF defines
 520 the @code{default}, @code{protected}, @code{hidden} and
 521 @code{internal} visibilities.
 522
 523 The most commonly used is visibility is @code{hidden}.  It
 524 specifies that the symbol cannot be referenced from outside of
 525 the current shared library.  Unfortunately, this information
 526 cannot be used directly by the link-time optimization in the
 527 compiler since the whole shared library also might contain
 528 non-LTO objects and those are not visible to the compiler.
 529
 530 GCC solves this problem using linker plugins.  A @emph{linker
 531 plugin} is an interface to the linker that allows an external
 532 program to claim the ownership of a given object file.  The linker
 533 then performs the linking procedure by querying the plugin about
 534 the symbol table of the claimed objects and once the linking
 535 decisions are complete, the plugin is allowed to provide the
 536 final object file before the actual linking is made.  The linker
 537 plugin obtains the symbol resolution information which specifies
 538 which symbols provided by the claimed objects are bound from the
 539 rest of a binary being linked.
 540
 541 GCC is designed to be independent of the rest of the toolchain
 542 and aims to support linkers without plugin support.  For this
 543 reason it does not use the linker plugin by default.  Instead,
 544 the object files are examined by @command{collect2} before being
 545 passed to the linker and objects found to have LTO sections are
 546 passed to @command{lto1} first.  This mode does not work for
 547 library archives.  The decision on what object files from the
 548 archive are needed depends on the actual linking and thus GCC
 549 would have to implement the linker itself.  The resolution
 550 information is missing too and thus GCC needs to make an educated
 551 guess based on @option{-fwhole-program}.  Without the linker
 552 plugin GCC also assumes that symbols are declared @code{hidden}
 553 and not referred by non-LTO code by default.
 554
 555 @node Internal flags
 556 @section Internal flags controlling @code{lto1}
 557
 558 The following flags are passed into @command{lto1} and are not
 559 meant to be used directly from the command line.
 560
 561 @itemize
 562 @item -fwpa
 563 @opindex fwpa
 564 This option runs the serial part of the link-time optimizer
 565 performing the inter-procedural propagation (WPA mode).  The
 566 compiler reads in summary information from all inputs and
 567 performs an analysis based on summary information only.  It
 568 generates object files for subsequent runs of the link-time
 569 optimizer where individual object files are optimized using both
 570 summary information from the WPA mode and the actual function
 571 bodies.  It then drives the LTRANS phase.
 572
 573 @item -fltrans
 574 @opindex fltrans
 575 This option runs the link-time optimizer in the
 576 local-transformation (LTRANS) mode, which reads in output from a
 577 previous run of the LTO in WPA mode.  In the LTRANS mode, LTO
 578 optimizes an object and produces the final assembly.
 579
 580 @item -fltrans-output-list=@var{file}
 581 @opindex fltrans-output-list
 582 This option specifies a file to which the names of LTRANS output
 583 files are written.  This option is only meaningful in conjunction
 584 with @option{-fwpa}.
 585
 586 @item -fresolution=@var{file}
 587 @opindex fresolution
 588 This option specifies the linker resolution file.  This option is
 589 only meaningful in conjunction with @option{-fwpa} and as option
 590 to pass through to the LTO linker plugin.
 591 @end itemize