gcc/doc/lto.texi

   1 @c Copyright (C) 2010-2023 Free Software Foundation, Inc.
   2 @c This is part of the GCC manual.
   3 @c For copying conditions, see the file gcc.texi.
   4 @c Contributed by Jan Hubicka <jh@suse.cz> and
   5 @c Diego Novillo <dnovillo@google.com>
   6
   7 @node LTO
   8 @chapter Link Time Optimization
   9 @cindex lto
  10 @cindex whopr
  11 @cindex wpa
  12 @cindex ltrans
  13
  14 Link Time Optimization (LTO) gives GCC the capability of
  15 dumping its internal representation (GIMPLE) to disk,
  16 so that all the different compilation units that make up
  17 a single executable can be optimized as a single module.
  18 This expands the scope of inter-procedural optimizations
  19 to encompass the whole program (or, rather, everything
  20 that is visible at link time).
  21
  22 @menu
  23 * LTO Overview::            Overview of LTO.
  24 * LTO object file layout::  LTO file sections in ELF.
  25 * IPA::                     Using summary information in IPA passes.
  26 * WHOPR::                   Whole program assumptions,
  27                             linker plugin and symbol visibilities.
  28 * Internal flags::          Internal flags controlling @code{lto1}.
  29 @end menu
  30
  31 @node LTO Overview
  32 @section Design Overview
  33
  34 Link time optimization is implemented as a GCC front end for a
  35 bytecode representation of GIMPLE that is emitted in special sections
  36 of @code{.o} files.  Currently, LTO support is enabled in most
  37 ELF-based systems, as well as darwin, cygwin and mingw systems.
  38
  39 By default, object files generated with LTO support contain only GIMPLE
  40 bytecode.  Such objects are called ``slim'', and they require that
  41 tools like @code{ar} and @code{nm} understand symbol tables of LTO
  42 sections.  For most targets these tools have been extended to use the
  43 plugin infrastructure, so GCC can support ``slim'' objects consisting
  44 of the intermediate code alone.
  45
  46 GIMPLE bytecode could also be saved alongside final object code if
  47 the @option{-ffat-lto-objects} option is passed, or if no plugin support
  48 is detected for @code{ar} and @code{nm} when GCC is configured.  It makes
  49 the object files generated with LTO support larger than regular object
  50 files.  This ``fat'' object format allows to ship one set of fat
  51 objects which could be used both for development and the production of
  52 optimized builds.  A, perhaps surprising, side effect of this feature
  53 is that any mistake in the toolchain leads to LTO information not
  54 being used (e.g.@: an older @code{libtool} calling @code{ld} directly).
  55 This is both an advantage, as the system is more robust, and a
  56 disadvantage, as the user is not informed that the optimization has
  57 been disabled.
  58
  59 At the highest level, LTO splits the compiler in two.  The first half
  60 (the ``writer'') produces a streaming representation of all the
  61 internal data structures needed to optimize and generate code.  This
  62 includes declarations, types, the callgraph and the GIMPLE representation
  63 of function bodies.
  64
  65 When @option{-flto} is given during compilation of a source file, the
  66 pass manager executes all the passes in @code{all_lto_gen_passes}.
  67 Currently, this phase is composed of two IPA passes:
  68
  69 @itemize @bullet
  70 @item @code{pass_ipa_lto_gimple_out}
  71 This pass executes the function @code{lto_output} in
  72 @file{lto-streamer-out.cc}, which traverses the call graph encoding
  73 every reachable declaration, type and function.  This generates a
  74 memory representation of all the file sections described below.
  75
  76 @item @code{pass_ipa_lto_finish_out}
  77 This pass executes the function @code{produce_asm_for_decls} in
  78 @file{lto-streamer-out.cc}, which takes the memory image built in the
  79 previous pass and encodes it in the corresponding ELF file sections.
  80 @end itemize
  81
  82 The second half of LTO support is the ``reader''.  This is implemented
  83 as the GCC front end @file{lto1} in @file{lto/lto.cc}.  When
  84 @file{collect2} detects a link set of @code{.o}/@code{.a} files with
  85 LTO information and the @option{-flto} is enabled, it invokes
  86 @file{lto1} which reads the set of files and aggregates them into a
  87 single translation unit for optimization.  The main entry point for
  88 the reader is @file{lto/lto.cc}:@code{lto_main}.
  89
  90 @subsection LTO modes of operation
  91
  92 One of the main goals of the GCC link-time infrastructure was to allow
  93 effective compilation of large programs.  For this reason GCC implements two
  94 link-time compilation modes.
  95
  96 @enumerate
  97 @item   @emph{LTO mode}, in which the whole program is read into the
  98 compiler at link-time and optimized in a similar way as if it
  99 were a single source-level compilation unit.
 100
 101 @item   @emph{WHOPR or partitioned mode}, designed to utilize multiple
 102 CPUs and/or a distributed compilation environment to quickly link
 103 large applications.  WHOPR stands for WHOle Program optimizeR (not to
 104 be confused with the semantics of @option{-fwhole-program}).  It
 105 partitions the aggregated callgraph from many different @code{.o}
 106 files and distributes the compilation of the sub-graphs to different
 107 CPUs.
 108
 109 Note that distributed compilation is not implemented yet, but since
 110 the parallelism is facilitated via generating a @code{Makefile}, it
 111 would be easy to implement.
 112 @end enumerate
 113
 114 WHOPR splits LTO into three main stages:
 115 @enumerate
 116 @item Local generation (LGEN)
 117 This stage executes in parallel.  Every file in the program is compiled
 118 into the intermediate language and packaged together with the local
 119 call-graph and summary information.  This stage is the same for both
 120 the LTO and WHOPR compilation mode.
 121
 122 @item Whole Program Analysis (WPA)
 123 WPA is performed sequentially.  The global call-graph is generated, and
 124 a global analysis procedure makes transformation decisions.  The global
 125 call-graph is partitioned to facilitate parallel optimization during
 126 phase 3.  The results of the WPA stage are stored into new object files
 127 which contain the partitions of program expressed in the intermediate
 128 language and the optimization decisions.
 129
 130 @item Local transformations (LTRANS)
 131 This stage executes in parallel.  All the decisions made during phase 2
 132 are implemented locally in each partitioned object file, and the final
 133 object code is generated.  Optimizations which cannot be decided
 134 efficiently during the phase 2 may be performed on the local
 135 call-graph partitions.
 136 @end enumerate
 137
 138 WHOPR can be seen as an extension of the usual LTO mode of
 139 compilation.  In LTO, WPA and LTRANS are executed within a single
 140 execution of the compiler, after the whole program has been read into
 141 memory.
 142
 143 When compiling in WHOPR mode, the callgraph is partitioned during
 144 the WPA stage.  The whole program is split into a given number of
 145 partitions of roughly the same size.  The compiler tries to
 146 minimize the number of references which cross partition boundaries.
 147 The main advantage of WHOPR is to allow the parallel execution of
 148 LTRANS stages, which are the most time-consuming part of the
 149 compilation process.  Additionally, it avoids the need to load the
 150 whole program into memory.
 151
 152
 153 @node LTO object file layout
 154 @section LTO file sections
 155
 156 LTO information is stored in several ELF sections inside object files.
 157 Data structures and enum codes for sections are defined in
 158 @file{lto-streamer.h}.
 159
 160 These sections are emitted from @file{lto-streamer-out.cc} and mapped
 161 in all at once from @file{lto/lto.cc}:@code{lto_file_read}.  The
 162 individual functions dealing with the reading/writing of each section
 163 are described below.
 164
 165 @itemize @bullet
 166 @item Command line options (@code{.gnu.lto_.opts})
 167
 168 This section contains the command line options used to generate the
 169 object files.  This is used at link time to determine the optimization
 170 level and other settings when they are not explicitly specified at the
 171 linker command line.
 172
 173 Most options are recorded at a per function level and their setting
 174 restored when processing the functions at link time.  Global options
 175 are composed from options specified at compile time and link time.
 176 How exactly they are combined or mismatches diagnosed is implemented in
 177 @file{lto-wrapper.cc}:@code{find_and_merge_options}.
 178
 179
 180 @item Symbol table (@code{.gnu.lto_.symtab})
 181
 182 This table replaces the ELF symbol table for functions and variables
 183 represented in the LTO IL.  Symbols used and exported by the optimized
 184 assembly code of ``fat'' objects might not match the ones used and
 185 exported by the intermediate code.  This table is necessary because
 186 the intermediate code is less optimized and thus requires a separate
 187 symbol table.
 188
 189 Additionally, the binary code in the ``fat'' object will lack a call
 190 to a function, since the call was optimized out at compilation time
 191 after the intermediate language was streamed out.  In some special
 192 cases, the same optimization may not happen during link-time
 193 optimization.  This would lead to an undefined symbol if only one
 194 symbol table was used.
 195
 196 The symbol table is emitted in
 197 @file{lto-streamer-out.cc}:@code{produce_symtab}.
 198
 199
 200 @item Global declarations and types (@code{.gnu.lto_.decls})
 201
 202 This section contains an intermediate language dump of all
 203 declarations and types required to represent the callgraph, static
 204 variables and top-level debug info.
 205
 206 The contents of this section are emitted in
 207 @file{lto-streamer-out.cc}:@code{produce_asm_for_decls}.  Types and
 208 symbols are emitted in a topological order that preserves the sharing
 209 of pointers when the file is read back in
 210 (@file{lto.cc}:@code{read_cgraph_and_symbols}).
 211
 212
 213 @item The callgraph (@code{.gnu.lto_.cgraph})
 214
 215 This section contains the basic data structure used by the GCC
 216 inter-procedural optimization infrastructure.  This section stores an
 217 annotated multi-graph which represents the functions and call sites as
 218 well as the variables, aliases and top-level @code{asm} statements.
 219
 220 This section is emitted in
 221 @file{lto-streamer-out.cc}:@code{output_cgraph} and read in
 222 @file{lto-cgraph.cc}:@code{input_cgraph}.
 223
 224
 225 @item IPA references (@code{.gnu.lto_.refs})
 226
 227 This section contains references between function and static
 228 variables.  It is emitted by @file{lto-cgraph.cc}:@code{output_refs}
 229 and read by @file{lto-cgraph.cc}:@code{input_refs}.
 230
 231
 232 @item Function bodies (@code{.gnu.lto_.function_body.<name>})
 233
 234 This section contains function bodies in the intermediate language
 235 representation.  Every function body is in a separate section to allow
 236 copying of the section independently to different object files or
 237 reading the function on demand.
 238
 239 Functions are emitted in
 240 @file{lto-streamer-out.cc}:@code{output_function} and read in
 241 @file{lto-streamer-in.cc}:@code{input_function}.
 242
 243
 244 @item Static variable initializers (@code{.gnu.lto_.vars})
 245
 246 This section contains all the symbols in the global variable pool.  It
 247 is emitted by @file{lto-cgraph.cc}:@code{output_varpool} and read in
 248 @file{lto-cgraph.cc}:@code{input_cgraph}.
 249
 250 @item Summaries and optimization summaries used by IPA passes
 251 (@code{.gnu.lto_.<xxx>}, where @code{<xxx>} is one of @code{jmpfuncs},
 252 @code{pureconst} or @code{reference})
 253
 254 These sections are used by IPA passes that need to emit summary
 255 information during LTO generation to be read and aggregated at
 256 link time.  Each pass is responsible for implementing two pass manager
 257 hooks: one for writing the summary and another for reading it in.  The
 258 format of these sections is entirely up to each individual pass.  The
 259 only requirement is that the writer and reader hooks agree on the
 260 format.
 261 @end itemize
 262
 263
 264 @node IPA
 265 @section Using summary information in IPA passes
 266
 267 Programs are represented internally as a @emph{callgraph} (a
 268 multi-graph where nodes are functions and edges are call sites)
 269 and a @emph{varpool} (a list of static and external variables in
 270 the program).
 271
 272 The inter-procedural optimization is organized as a sequence of
 273 individual passes, which operate on the callgraph and the
 274 varpool.  To make the implementation of WHOPR possible, every
 275 inter-procedural optimization pass is split into several stages
 276 that are executed at different times during WHOPR compilation:
 277
 278 @itemize @bullet
 279 @item LGEN time
 280 @enumerate
 281 @item @emph{Generate summary} (@code{generate_summary} in
 282 @code{struct ipa_opt_pass_d}).  This stage analyzes every function
 283 body and variable initializer is examined and stores relevant
 284 information into a pass-specific data structure.
 285
 286 @item @emph{Write summary} (@code{write_summary} in
 287 @code{struct ipa_opt_pass_d}).  This stage writes all the
 288 pass-specific information generated by @code{generate_summary}.
 289 Summaries go into their own @code{LTO_section_*} sections that
 290 have to be declared in @file{lto-streamer.h}:@code{enum
 291 lto_section_type}.  A new section is created by calling
 292 @code{create_output_block} and data can be written using the
 293 @code{lto_output_*} routines.
 294 @end enumerate
 295
 296 @item WPA time
 297 @enumerate
 298 @item @emph{Read summary} (@code{read_summary} in
 299 @code{struct ipa_opt_pass_d}).  This stage reads all the
 300 pass-specific information in exactly the same order that it was
 301 written by @code{write_summary}.
 302
 303 @item @emph{Execute} (@code{execute} in @code{struct
 304 opt_pass}).  This performs inter-procedural propagation.  This
 305 must be done without actual access to the individual function
 306 bodies or variable initializers.  Typically, this results in a
 307 transitive closure operation over the summary information of all
 308 the nodes in the callgraph.
 309
 310 @item @emph{Write optimization summary}
 311 (@code{write_optimization_summary} in @code{struct
 312 ipa_opt_pass_d}).  This writes the result of the inter-procedural
 313 propagation into the object file.  This can use the same data
 314 structures and helper routines used in @code{write_summary}.
 315 @end enumerate
 316
 317 @item LTRANS time
 318 @enumerate
 319 @item @emph{Read optimization summary}
 320 (@code{read_optimization_summary} in @code{struct
 321 ipa_opt_pass_d}).  The counterpart to
 322 @code{write_optimization_summary}.  This reads the interprocedural
 323 optimization decisions in exactly the same format emitted by
 324 @code{write_optimization_summary}.
 325
 326 @item @emph{Transform} (@code{function_transform} and
 327 @code{variable_transform} in @code{struct ipa_opt_pass_d}).
 328 The actual function bodies and variable initializers are updated
 329 based on the information passed down from the @emph{Execute} stage.
 330 @end enumerate
 331 @end itemize
 332
 333 The implementation of the inter-procedural passes are shared
 334 between LTO, WHOPR and classic non-LTO compilation.
 335
 336 @itemize
 337 @item During the traditional file-by-file mode every pass executes its
 338 own @emph{Generate summary}, @emph{Execute}, and @emph{Transform}
 339 stages within the single execution context of the compiler.
 340
 341 @item In LTO compilation mode, every pass uses @emph{Generate
 342 summary} and @emph{Write summary} stages at compilation time,
 343 while the @emph{Read summary}, @emph{Execute}, and
 344 @emph{Transform} stages are executed at link time.
 345
 346 @item In WHOPR mode all stages are used.
 347 @end itemize
 348
 349 To simplify development, the GCC pass manager differentiates
 350 between normal inter-procedural passes (@pxref{Regular IPA passes}),
 351 small inter-procedural passes (@pxref{Small IPA passes})
 352 and late inter-procedural passes (@pxref{Late IPA passes}).
 353 A small or late IPA pass (@code{SIMPLE_IPA_PASS}) does
 354 everything at once and thus cannot be executed during WPA in
 355 WHOPR mode.  It defines only the @emph{Execute} stage and during
 356 this stage it accesses and modifies the function bodies.  Such
 357 passes are useful for optimization at LGEN or LTRANS time and are
 358 used, for example, to implement early optimization before writing
 359 object files.  The simple inter-procedural passes can also be used
 360 for easier prototyping and development of a new inter-procedural
 361 pass.
 362
 363
 364 @subsection Virtual clones
 365
 366 One of the main challenges of introducing the WHOPR compilation
 367 mode was addressing the interactions between optimization passes.
 368 In LTO compilation mode, the passes are executed in a sequence,
 369 each of which consists of analysis (or @emph{Generate summary}),
 370 propagation (or @emph{Execute}) and @emph{Transform} stages.
 371 Once the work of one pass is finished, the next pass sees the
 372 updated program representation and can execute.  This makes the
 373 individual passes dependent on each other.
 374
 375 In WHOPR mode all passes first execute their @emph{Generate
 376 summary} stage.  Then summary writing marks the end of the LGEN
 377 stage.  At WPA time,
 378 the summaries are read back into memory and all passes run the
 379 @emph{Execute} stage.  Optimization summaries are streamed and
 380 sent to LTRANS, where all the passes execute the @emph{Transform}
 381 stage.
 382
 383 Most optimization passes split naturally into analysis,
 384 propagation and transformation stages.  But some do not.  The
 385 main problem arises when one pass performs changes and the
 386 following pass gets confused by seeing different callgraphs
 387 between the @emph{Transform} stage and the @emph{Generate summary}
 388 or @emph{Execute} stage.  This means that the passes are required
 389 to communicate their decisions with each other.
 390
 391 To facilitate this communication, the GCC callgraph
 392 infrastructure implements @emph{virtual clones}, a method of
 393 representing the changes performed by the optimization passes in
 394 the callgraph without needing to update function bodies.
 395
 396 A @emph{virtual clone} in the callgraph is a function that has no
 397 associated body, just a description of how to create its body based
 398 on a different function (which itself may be a virtual clone).
 399
 400 The description of function modifications includes adjustments to
 401 the function's signature (which allows, for example, removing or
 402 adding function arguments), substitutions to perform on the
 403 function body, and, for inlined functions, a pointer to the
 404 function that it will be inlined into.
 405
 406 It is also possible to redirect any edge of the callgraph from a
 407 function to its virtual clone.  This implies updating of the call
 408 site to adjust for the new function signature.
 409
 410 Most of the transformations performed by inter-procedural
 411 optimizations can be represented via virtual clones.  For
 412 instance, a constant propagation pass can produce a virtual clone
 413 of the function which replaces one of its arguments by a
 414 constant.  The inliner can represent its decisions by producing a
 415 clone of a function whose body will be later integrated into
 416 a given function.
 417
 418 Using @emph{virtual clones}, the program can be easily updated
 419 during the @emph{Execute} stage, solving most of pass interactions
 420 problems that would otherwise occur during @emph{Transform}.
 421
 422 Virtual clones are later materialized in the LTRANS stage and
 423 turned into real functions.  Passes executed after the virtual
 424 clone were introduced also perform their @emph{Transform} stage
 425 on new functions, so for a pass there is no significant
 426 difference between operating on a real function or a virtual
 427 clone introduced before its @emph{Execute} stage.
 428
 429 Optimization passes then work on virtual clones introduced before
 430 their @emph{Execute} stage as if they were real functions.  The
 431 only difference is that clones are not visible during the
 432 @emph{Generate Summary} stage.
 433
 434 To keep function summaries updated, the callgraph interface
 435 allows an optimizer to register a callback that is called every
 436 time a new clone is introduced as well as when the actual
 437 function or variable is generated or when a function or variable
 438 is removed.  These hooks are registered in the @emph{Generate
 439 summary} stage and allow the pass to keep its information intact
 440 until the @emph{Execute} stage.  The same hooks can also be
 441 registered during the @emph{Execute} stage to keep the
 442 optimization summaries updated for the @emph{Transform} stage.
 443
 444 @subsection IPA references
 445
 446 GCC represents IPA references in the callgraph.  For a function
 447 or variable @code{A}, the @emph{IPA reference} is a list of all
 448 locations where the address of @code{A} is taken and, when
 449 @code{A} is a variable, a list of all direct stores and reads
 450 to/from @code{A}.  References represent an oriented multi-graph on
 451 the union of nodes of the callgraph and the varpool.  See
 452 @file{ipa-reference.cc}:@code{ipa_reference_write_optimization_summary}
 453 and
 454 @file{ipa-reference.cc}:@code{ipa_reference_read_optimization_summary}
 455 for details.
 456
 457 @subsection Jump functions
 458 Suppose that an optimization pass sees a function @code{A} and it
 459 knows the values of (some of) its arguments.  The @emph{jump
 460 function} describes the value of a parameter of a given function
 461 call in function @code{A} based on this knowledge.
 462
 463 Jump functions are used by several optimizations, such as the
 464 inter-procedural constant propagation pass and the
 465 devirtualization pass.  The inliner also uses jump functions to
 466 perform inlining of callbacks.
 467
 468 @node WHOPR
 469 @section Whole program assumptions, linker plugin and symbol visibilities
 470
 471 Link-time optimization gives relatively minor benefits when used
 472 alone.  The problem is that propagation of inter-procedural
 473 information does not work well across functions and variables
 474 that are called or referenced by other compilation units (such as
 475 from a dynamically linked library).  We say that such functions
 476 and variables are @emph{externally visible}.
 477
 478 To make the situation even more difficult, many applications
 479 organize themselves as a set of shared libraries, and the default
 480 ELF visibility rules allow one to overwrite any externally
 481 visible symbol with a different symbol at runtime.  This
 482 basically disables any optimizations across such functions and
 483 variables, because the compiler cannot be sure that the function
 484 body it is seeing is the same function body that will be used at
 485 runtime.  Any function or variable not declared @code{static} in
 486 the sources degrades the quality of inter-procedural
 487 optimization.
 488
 489 To avoid this problem the compiler must assume that it sees the
 490 whole program when doing link-time optimization.  Strictly
 491 speaking, the whole program is rarely visible even at link-time.
 492 Standard system libraries are usually linked dynamically or not
 493 provided with the link-time information.  In GCC, the whole
 494 program option (@option{-fwhole-program}) asserts that every
 495 function and variable defined in the current compilation
 496 unit is static, except for function @code{main} (note: at
 497 link time, the current unit is the union of all objects compiled
 498 with LTO).  Since some functions and variables need to
 499 be referenced externally, for example by another DSO or from an
 500 assembler file, GCC also provides the function and variable
 501 attribute @code{externally_visible} which can be used to disable
 502 the effect of @option{-fwhole-program} on a specific symbol.
 503
 504 The whole program mode assumptions are slightly more complex in
 505 C++, where inline functions in headers are put into @emph{COMDAT}
 506 sections.  COMDAT function and variables can be defined by
 507 multiple object files and their bodies are unified at link-time
 508 and dynamic link-time.  COMDAT functions are changed to local only
 509 when their address is not taken and thus un-sharing them with a
 510 library is not harmful.  COMDAT variables always remain externally
 511 visible, however for readonly variables it is assumed that their
 512 initializers cannot be overwritten by a different value.
 513
 514 GCC provides the function and variable attribute
 515 @code{visibility} that can be used to specify the visibility of
 516 externally visible symbols (or alternatively an
 517 @option{-fdefault-visibility} command line option).  ELF defines
 518 the @code{default}, @code{protected}, @code{hidden} and
 519 @code{internal} visibilities.
 520
 521 The most commonly used is visibility is @code{hidden}.  It
 522 specifies that the symbol cannot be referenced from outside of
 523 the current shared library.  Unfortunately, this information
 524 cannot be used directly by the link-time optimization in the
 525 compiler since the whole shared library also might contain
 526 non-LTO objects and those are not visible to the compiler.
 527
 528 GCC solves this problem using linker plugins.  A @emph{linker
 529 plugin} is an interface to the linker that allows an external
 530 program to claim the ownership of a given object file.  The linker
 531 then performs the linking procedure by querying the plugin about
 532 the symbol table of the claimed objects and once the linking
 533 decisions are complete, the plugin is allowed to provide the
 534 final object file before the actual linking is made.  The linker
 535 plugin obtains the symbol resolution information which specifies
 536 which symbols provided by the claimed objects are bound from the
 537 rest of a binary being linked.
 538
 539 GCC is designed to be independent of the rest of the toolchain
 540 and aims to support linkers without plugin support.  For this
 541 reason it does not use the linker plugin by default.  Instead,
 542 the object files are examined by @command{collect2} before being
 543 passed to the linker and objects found to have LTO sections are
 544 passed to @command{lto1} first.  This mode does not work for
 545 library archives.  The decision on what object files from the
 546 archive are needed depends on the actual linking and thus GCC
 547 would have to implement the linker itself.  The resolution
 548 information is missing too and thus GCC needs to make an educated
 549 guess based on @option{-fwhole-program}.  Without the linker
 550 plugin GCC also assumes that symbols are declared @code{hidden}
 551 and not referred by non-LTO code by default.
 552
 553 @node Internal flags
 554 @section Internal flags controlling @code{lto1}
 555
 556 The following flags are passed into @command{lto1} and are not
 557 meant to be used directly from the command line.
 558
 559 @itemize
 560 @opindex fwpa
 561 @item -fwpa
 562 This option runs the serial part of the link-time optimizer
 563 performing the inter-procedural propagation (WPA mode).  The
 564 compiler reads in summary information from all inputs and
 565 performs an analysis based on summary information only.  It
 566 generates object files for subsequent runs of the link-time
 567 optimizer where individual object files are optimized using both
 568 summary information from the WPA mode and the actual function
 569 bodies.  It then drives the LTRANS phase.
 570
 571 @opindex fltrans
 572 @item -fltrans
 573 This option runs the link-time optimizer in the
 574 local-transformation (LTRANS) mode, which reads in output from a
 575 previous run of the LTO in WPA mode.  In the LTRANS mode, LTO
 576 optimizes an object and produces the final assembly.
 577
 578 @opindex fltrans-output-list
 579 @item -fltrans-output-list=@var{file}
 580 This option specifies a file to which the names of LTRANS output
 581 files are written.  This option is only meaningful in conjunction
 582 with @option{-fwpa}.
 583
 584 @opindex fresolution
 585 @item -fresolution=@var{file}
 586 This option specifies the linker resolution file.  This option is
 587 only meaningful in conjunction with @option{-fwpa} and as option
 588 to pass through to the LTO linker plugin.
 589 @end itemize