hphp/hhbbc/README

   1 HHBBC: HipHop Bytecode to Bytecode Compiler
   2 ===========================================
   3
   4 Overview
   5 --------
   6
   7 This directory contains a bytecode-level optimizer for HHBC.  This is
   8 a short document to hopefully give an overview that makes it easier to
   9 understand the code.
  10
  11 Right now this is optionally hooked into the hphpc emitter code in
  12 compiler/analysis/emitter, and uses UnitEmitters as both input and
  13 output.  The HHBBC::whole_program entry point takes a set of
  14 UnitEmitters and returns a new set.  You can also run it as a
  15 standalone program by passing --hhbbc to hhvm, or running hhvm with
  16 argv[0] as 'hhbbc' (e.g. via a symlink).
  17
  18 HHBBC Entry Points
  19 ------------------
  20
  21 The main entry points to this module are currently exported in
  22 hhbbc.h.  Modules outside of hphp/hhbbc should not include any of its
  23 other headers.
  24
  25 The first thing either entry point does is parse UnitEmitters into an
  26 internal representation for HHBC programs.  The representation is in
  27 bc.h and representation.h, and the code that converts UnitEmitters
  28 into this representation is in parse.cpp.
  29
  30 Then, each of these does some sort of analysis and optimization
  31 (described more below).
  32
  33 Finally, they send the modified representation to emit.cpp to generate
  34 new UnitEmitters that can be used to either serialize to a bytecode
  35 repo or produce a runtime Unit.
  36
  37 Representation
  38 --------------
  39
  40 The representation of PHP programs is in bc.h and representation.h.
  41 Data that corresponds to metadata about the program is in the php
  42 namespace, while the structs that represent each HHBC instruction are
  43 in the bc namespace.  There's analogous php::Foo structures to most of
  44 the FooEmitter structures from the runtime, but the metadata contains
  45 mostly pointers or other things instead of raw offsets into an array
  46 of serialized bytecode.
  47
  48 In the current state of things, during analysis passes, the program
  49 representation should be considered immutable.  This simplifies
  50 parallelization/locking.  Information about the program that is
  51 derived from the analysis or that needs more efficient structures for
  52 querying (e.g. hashtables by name) is stored on the side in the Index
  53 structure (more later).
  54
  55 Each php::Func is analyzed at parse time into a "factored" control
  56 flow graph.  The function is divided into basic blocks at boundaries
  57 described in parse.cpp, and the blocks are linked together
  58 accordingly.  The "factored" part of it is that we don't stop a block
  59 on any instruction that could raise an exception.  HHBC opcodes,
  60 generally speaking, can raise exceptions---but often if you know the
  61 types of their arguments or function locals you can rule out this
  62 possibility.
  63
  64 Each php::Block in the graph has a vector of Bytecode objects.  The
  65 Bytecode class is essentially a sum type of all the bc::Foo
  66 structures, which are generated from the opcode table.
  67
  68 Index and res:: Handles
  69 -----------------------
  70
  71 As mentioned above, the php::Foo structures don't contain derived
  72 information that is a result of analysis, and also may not always have
  73 the information in an efficiently-queryable form.
  74
  75 Instead, on the side there is an Index structure that an answer
  76 questions like function or class lookups, or information that isn't
  77 present in php::Func like the inferred return type of a function.
  78
  79 One major functionality of the index is to 'resolve' names into
  80 abstract tokens that represent known information about that name.  The
  81 resolved versions of things have types in the res namespace,
  82 e.g. res::Class or res::Func.  A res::Func may know as little as the
  83 name of the function, or it may be able to provide information like
  84 the inferred return type of the function.  Queries on resolved objects
  85 go through other apis on the Index.
  86
  87 When the Index is queried for information about a resolved object, it
  88 records dependency information about what it was queried for.  This is
  89 used in whole program mode.
  90
  91 Whole Program Mode
  92 ------------------
  93
  94 The algorithm in whole program mode works by continually running
  95 analysis passes to refine information stored in the Index.
  96
  97 The idea is that some types in the system are only allowed to grow,
  98 and some are only allowed to shrink.  Types that are only allowed to
  99 shrink are stored in the Index, while types that only grow must be
 100 analyzed in a context that can see everything that can affect that
 101 type (e.g. a function, or a whole class for private properties).
 102
 103 The basic idea behind the algorithm is to make several passes over
 104 these work units that infer only-growing types, and use them to refine
 105 the only-shrinking types in the Index until the Index stops shrinking.
 106 It looks somewhat like this:
 107
 108   - Start with a list of all work units (functions or classes) in the
 109     program.
 110
 111   - Run an analysis pass in parallel on each work unit in the list.
 112     Thread safety model here: none of the analysis is allowed to do
 113     anything except query the Index (internally thread safe) or read
 114     the php metadata/Bytecode (unchanging, safe to read concurrently)
 115     in order to produce its results.
 116
 117   - Clear the list.
 118
 119   - Single threaded: if we learned more useful information for any
 120     function or class than was previously in the index, add that
 121     information to the Index, and add any function that tried to ask
 122     for that information back to the list.  Since only one thread is
 123     updating the Index in this step we don't need to lock it.
 124
 125   - Repeat starting from the second step if the list is non-empty.
 126
 127   - Run a final analysis pass in parallel with the Index now at a
 128     fixed-point, and send each analysis pass through the optimizer.
 129     The optimization functions are only allowed to modify the contents
 130     of the php::Blocks for the function they are asked to optimize, so
 131     no one needs to lock these accesses.
 132
 133 What makes this sound is that the information in the Index is never
 134 allowed to be incorrect---it just might not be as useful as it could
 135 be.  This means that no analysis pass should ever deduce a new fact
 136 that is incorrect.  Each piece of information in the Index must also
 137 start conservative, and become more refined upon additional iterations.
 138
 139 Type Inference
 140 --------------
 141
 142 Local analysis of a function is done as a data-flow analysis on the
 143 types of locals and eval stack locations.  The code for this is in
 144 interp.h, with state structures in interp-state.h, which contains a
 145 block-at-a-time abstract interpreter for HHBC.  Functions that drive
 146 it either to do analysis or optimization are in analyze.cpp and
 147 optimize.cpp---the same abstract interpreter that implements the
 148 analysis is also used during optimizations.
 149
 150 Type inference on HHBC is treated as an iterative forward dataflow
 151 problem.  Because of the presence of dynamic calls, we assume that
 152 anything could call the function, so the input states are initialized
 153 quite conservatively.
 154
 155 The algorithm is fairly standard after that, except for handling of exceptional
 156 edges.  Here is an overview (the code is in analyze_func):
 157
 158   - A work list of blocks is initialized with the entry blocks.
 159     (Functions may have multiple entries because of DV initializers.)
 160
 161   - Until the work list is non-empty, pull a block off the list, and
 162     run the abstract interpreter on it:
 163
 164      - If the abstract interpreter encounters an instruction that
 165        could potentially throw an exception, it propagates the current
 166        state from before the instruction across all the throw
 167        edges for the block.
 168
 169      - If it encounters an instruction that could take a branch, it
 170        propagates the state after the branching instruction across the
 171        branch's taken edge.
 172
 173      - If the block could fallthrough, the abstract interpreter
 174        propagates the state after the last instruction across the
 175        fallthrough edge.
 176
 177      - Add any blocks that had their input states changed back to the
 178        list and repeat.
 179
 180 The type system used here is designed to match closely to the type
 181 system we use at JIT time, and makes distinctions useful for HHVM
 182 (e.g. whether a value is potentially reference counted or not).  It
 183 also contains constant values, which allows the same analysis to
 184 perform constant propagation, and determine some unreachable blocks.
 185 Some details and an unmaintainable ascii-art diagram are available in
 186 type-system.h.