   1 Mono Ahead Of Time Compiler
   2 ===========================
   3
   4         The Ahead-of-Time (AOT) compilation feature in Mono allows Mono
   5         to precompile assemblies to minimize JIT time, reduce memory
   6         usage at runtime and increase code sharing across multiple
   7         running Mono applications.
   8
   9         To precompile an assembly use the following command:
  10
  11            mono --aot -O=all assembly.exe
  12
  13         The `--aot' flag instructs Mono to ahead-of-time compile your
  14         assembly, while the -O=all flag instructs Mono to use all the
  15         available optimizations.
  16
  17 * Position Independent Code
  18 ---------------------------
  19
  20         On x86 and x86-64 the code generated for Ahead-of-Time compiled
  21         images is position-independent code.  This allows the same
  22         precompiled image to be shared across multiple applications
  23         without each needing its own copy.  ELF shared libraries work
  24         the same way: the code produced can be relocated to any
  25         address.
  26
  27         The implementation of Position Independent Code had a
  28         performance impact on Ahead-of-Time compiled images, but
  29         compiler bootstraps are still faster with them than with
  30         JIT-compiled code, especially with all the new optimizations
  31         provided by the Mono engine.
  32
  33 * How to support Position Independent Code in new Mono Ports
  34 ------------------------------------------------------------
  35
  36         Generated native code needs to reference various runtime
  37         structures/functions whose address is only known at run
  38         time. JITted code can simply embed the address into the native
  39         code, but AOT code needs to do an indirection. This
  40         indirection is done through a table called the Global Offset
  41         Table (GOT), which is similar to the GOT in the ELF
  42         spec.  When the runtime saves the AOT image, it saves some
  43         information for each method describing the GOT entries
  44         used by that method. When loading a method from an AOT image,
  45         the runtime will fill out the GOT entries needed by the
  46         method.
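
        The load-time step described above can be sketched in C roughly as
        follows. This is an illustration only: MethodInfo, got,
        resolve_target, load_method and the small integer target ids are
        names and encodings assumed for the example, not the runtime's
        actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not Mono's real layout. */
#define GOT_SIZE 8
#define MAX_METHOD_GOT_ENTRIES 4

/* The table through which AOT code reaches runtime addresses. */
void *got[GOT_SIZE];

/* Stand-ins for runtime structures whose address is only known at run time. */
int runtime_counter;
int runtime_flag;

/* Per-method record saved in the AOT image: which GOT slots the method
 * uses and what each slot must point to (hypothetical encoding). */
typedef struct {
    size_t n_entries;
    size_t slots[MAX_METHOD_GOT_ENTRIES];   /* GOT indexes used by the method */
    int    targets[MAX_METHOD_GOT_ENTRIES]; /* ids of the runtime items needed */
} MethodInfo;

/* Map a target id to its run-time address; NULL for an unknown id. */
void *resolve_target(int id)
{
    switch (id) {
    case 0:  return &runtime_counter;
    case 1:  return &runtime_flag;
    default: return NULL;
    }
}

/* Fill the GOT entries a method needs when it is loaded from the image.
 * Slots that are already filled are left alone. Returns the number of
 * slots filled by this call. */
size_t load_method(const MethodInfo *info)
{
    size_t filled = 0;

    assert(info != NULL && info->n_entries <= MAX_METHOD_GOT_ENTRIES);
    for (size_t i = 0; i < info->n_entries; i++) {
        size_t slot = info->slots[i];
        if (slot >= GOT_SIZE || got[slot] != NULL)
            continue;
        got[slot] = resolve_target(info->targets[i]);
        if (got[slot] != NULL)
            filled++;
    }
    return filled;
}
```

        In this sketch, loading a second method that shares a slot with an
        already loaded one does no extra work for that slot, which is why
        the per-method information only needs to list the slots it uses.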
  47
  48    * Computing the address of the GOT
  49
  50         Methods which need to access the GOT first need to compute its
  51         address. On the x86 it is done by code like this:
  52
  53                 call <IP + 5>
  54                 pop ebx
  55                 add <OFFSET TO GOT>, ebx
  56                 <save got addr to a register>
  57
  58         The variable representing the GOT is stored in
  59         cfg->got_var. It is always allocated to a global register to
  60         prevent some problems with branches + basic blocks.
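
        The arithmetic behind the call/pop/add sequence can be modelled in
        C as below. The function names are made up for this sketch, and
        addresses are plain 32-bit integers rather than real pointers. The
        only hard fact used is that an x86 `call rel32' instruction is 5
        bytes long, so the value popped is the address of the call plus 5.

```c
#include <assert.h>
#include <stdint.h>

/* On x86, `call rel32` is 5 bytes: opcode E8 plus a 32-bit displacement. */
enum { CALL_INSN_SIZE = 5 };

/* AOT-compile time (hypothetical helper): distance from the instruction
 * following the call to the start of the GOT. */
int32_t offset_to_got(uint32_t call_addr, uint32_t got_addr)
{
    return (int32_t) (got_addr - (call_addr + CALL_INSN_SIZE));
}

/* Run time: the value the call/pop/add sequence leaves in the register. */
uint32_t materialize_got(uint32_t call_addr, int32_t offset)
{
    uint32_t popped = call_addr + CALL_INSN_SIZE; /* return address pushed by call */
    return popped + (uint32_t) offset;
}

/* Returns 1 if the offset computed for one load address still finds the
 * GOT after the whole image (code and GOT) is moved by `delta` bytes. */
int relocation_preserves_got(uint32_t call_addr, uint32_t got_addr, uint32_t delta)
{
    int32_t off = offset_to_got(call_addr, got_addr);
    return materialize_got(call_addr + delta, off) == got_addr + delta;
}
```

        Because only the distance between the code and the GOT is encoded,
        the same bytes work wherever the image is mapped.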
  61
  62    * Referencing GOT entries
  63
  64         Any time the native code needs to access some other runtime
  65         structure/function (i.e. any time the backend calls
  66         mono_add_patch_info ()), the code pointed to by the patch needs
  67         to load the value from the GOT. For example, instead of:
  68
  69                 call <ABSOLUTE ADDR>
  70         it needs to do:
  71                 call *<OFFSET>(<GOT REG>)
  72
  73         Here, the <OFFSET> can be 0; it will be fixed up by the AOT compiler.
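
        A minimal sketch of the emit-then-fix-up idea, assuming the GOT
        register is ebx and 4-byte GOT entries. emit_call_via_got and
        patch_got_offset are hypothetical helpers written for this example
        and are not functions from mini-x86.c. The byte sequence FF 93
        followed by a 32-bit displacement is the x86 encoding of
        `call *disp32(%ebx)'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Backend side (hypothetical): emit `call *0(%ebx)` with a placeholder
 * displacement. Returns the position of the displacement so that a patch
 * can be recorded for it. */
size_t emit_call_via_got(uint8_t *code, size_t pos)
{
    assert(code != NULL);
    code[pos]     = 0xFF;          /* opcode: indirect call */
    code[pos + 1] = 0x93;          /* ModRM: disp32 + ebx, /2 */
    memset(code + pos + 2, 0, 4);  /* <OFFSET> is 0 for now */
    return pos + 2;
}

/* AOT compiler side (hypothetical): once the GOT slot is known, write
 * its byte offset into the instruction, little-endian. */
void patch_got_offset(uint8_t *code, size_t disp_pos, uint32_t got_slot)
{
    uint32_t disp = got_slot * 4;  /* assumed 4-byte entries on 32-bit x86 */

    assert(code != NULL);
    for (int i = 0; i < 4; i++)
        code[disp_pos + i] = (uint8_t) (disp >> (8 * i));
}
```

        Emitting a fixed-size placeholder keeps the instruction length
        stable, so the later fix-up never has to move any code.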
  74
  75         For more examples on the changes required, see
  76
  77         svn diff -r 37739:38213 mini-x86.c
  78
  79 * The Precompiled File Format
  80 -----------------------------
  81
  82         We use the native object format of the platform. That way it
  83         is possible to reuse existing tools like objdump and the
  84         dynamic loader. All we need is a working assembler: we
  85         write out a text file which is then passed to gas (the GNU
  86         assembler) to generate the object file.
  87
  88         The precompiled image is stored in a file next to the original
  89         assembly, named after it with the platform's native shared
  90         library extension appended (on Linux this is ".so").
  91
  92         For example: basic.exe -> basic.exe.so; corlib.dll -> corlib.dll.so
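
        The naming rule amounts to appending the extension to the full
        assembly file name. A small sketch, where aot_image_name is a
        hypothetical helper and not a runtime function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build the AOT image name for an assembly by appending the platform's
 * shared library extension (".so" on Linux). Returns 0 on success, or
 * -1 if the result does not fit in `out`. */
int aot_image_name(const char *assembly, const char *ext,
                   char *out, size_t out_size)
{
    size_t alen, elen;

    assert(assembly != NULL && ext != NULL && out != NULL);
    alen = strlen(assembly);
    elen = strlen(ext);
    if (alen + elen + 1 > out_size)
        return -1;
    memcpy(out, assembly, alen);
    memcpy(out + alen, ext, elen + 1);  /* copies the terminating NUL too */
    return 0;
}
```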
  93
  94         The following things are saved in the object file and can be
  95         looked up using the equivalent of dlsym:
  96
  97                 mono_assembly_guid
  98
  99                         A copy of the assembly GUID.
 100
 101                 mono_aot_version
 102
 103                         The version of the AOT file format.
 104
 105                 mono_aot_opt_flags
 106
 107                         The optimization flags used to build this
 108                         precompiled image.
 109
 110                 method_infos
 111
 112                         Contains additional information needed by the runtime for using the
 113                         precompiled method, like the GOT entries it uses.
 114
 115                 method_info_offsets
 116
 117                         Maps method indexes to offsets in the method_infos array.
 118
 119                 mono_icall_table
 120
 121                         A table that lists all the internal calls
 122                         referenced by the precompiled image.
 123
 124                 mono_image_table
 125
 126                         A list of assemblies referenced by this AOT
 127                         module.
 128
 129                 method_offsets
 130
 131                         The equivalent to a procedure linkage table.
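
        As an illustration of how method_info_offsets and method_infos
        relate, the sketch below looks up the record for a method index.
        The AotTables struct, the find_method_info function and the idea
        that each record is a run of bytes are assumptions made for the
        example; the real record encoding is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the two tables after they have been
 * looked up in the loaded image. */
typedef struct {
    const uint8_t  *method_infos;        /* packed per-method records */
    size_t          method_infos_size;   /* size of that array in bytes */
    const uint32_t *method_info_offsets; /* indexed by method index */
    size_t          n_methods;
} AotTables;

/* Return a pointer to the record for `method_index`, or NULL if the
 * index or the stored offset is out of range. */
const uint8_t *find_method_info(const AotTables *t, size_t method_index)
{
    uint32_t off;

    assert(t != NULL);
    if (method_index >= t->n_methods)
        return NULL;
    off = t->method_info_offsets[method_index];
    if (off >= t->method_infos_size)
        return NULL;
    return t->method_infos + off;
}
```

        Keeping the offsets in a separate fixed-width array lets the
        records themselves be variable-length while lookup by method
        index stays a constant-time operation.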
 132
 133 * Performance considerations
 134 ----------------------------
 135
 136 Using AOT code is a trade-off which might lead to higher or lower performance,
 137 depending on many circumstances. Some of these are:
 138
 139 - AOT code needs to be loaded from disk before being used, so cold startup of
 140   an application using AOT code MIGHT be slower than using JITted code. Warm
 141   startup (when the code is already in the machine's cache) should be faster.
 142   Also, JITting code takes time, and the JIT compiler also needs to load
 143   additional metadata for the method from the disk, so AOT startup can be
 144   faster even in the cold startup case.
 145 - AOT code is usually compiled with all optimizations turned on, while JITted
 146   code is usually compiled with default optimizations, so the generated code
 147   in the AOT case should be faster.
 148 - JITted code can directly access runtime data structures and helper functions,
 149   while AOT code needs to go through an indirection (the GOT) to access them,
 150   so it will be slower and somewhat bigger as well.
 151 - When JITting code, the JIT compiler needs to load a lot of metadata about
 152   methods and types into memory, which AOT code can largely avoid.
 153 - JITted code has better locality, meaning that if method A calls method B,
 154   then the native code for A and B is usually quite close in memory, leading
 155   to better cache behaviour and thus improved performance. In contrast, the
 156   native code of methods inside the AOT file is in a somewhat random order.
 157
 158 * Future Work
 159 -------------
 160
 161 - Currently, the runtime needs to set up some data structures and fill out
 162   GOT entries before a method is first called. This means that even calls to
 163   a method whose code is in the same AOT image need to go through the GOT,
 164   instead of using a direct call.
 165 - On x86, the generated code uses call 0, pop REG, add GOTOFFSET, REG to
 166   materialize the GOT address. Newer versions of gcc use a separate function
 167   to do this; maybe we need to do the same.
 168 - Currently, we get vtable addresses from the GOT. Another solution would be
 169   to store the data from the vtables in the .bss section, so accessing them
 170   would involve less indirection.
 171
 172
 173