![Patch Generation](generation.png)

- courgette\_tool.cc:GenerateEnsemblePatch kicks off the patch
  generation by calling ensemble\_create.cc:GenerateEnsemblePatch.
- The files are read into courgette::SourceStream objects.
- ensemble\_create.cc:GenerateEnsemblePatch uses FindGenerators, which
  uses MakeGenerator to create
  patch\_generator\_x86\_32.h:PatchGeneratorX86\_32 objects.
- PatchGeneratorX86\_32's Transform method transforms the input file
  using Courgette's core techniques that make the bsdiff delta
  smaller. The steps it takes are the following:
  - _disassemble_ the old and new binaries into AssemblyProgram
    objects,
  - _adjust_ the new AssemblyProgram object, and
  - _encode_ the AssemblyProgram object back into raw bytes.
- The input to disassembly is a pointer to a buffer containing the raw
  bytes of the input file.
- Disassembly converts certain machine instructions that reference
  addresses to Courgette instructions. It is not actually
  disassembly, but this is the term the code base uses. Specifically,
  it detects instructions that use absolute addresses given by the
  binary file's relocation table, and relative addresses used in
  branch instructions.
- This is done by disassemble:ParseDetectedExecutable, which selects the
  appropriate Disassembler subclass by looking at the binary file's
  headers.
  - disassembler\_win32\_x86.h defines the PE/COFF x86 disassembler
  - disassembler\_elf\_32\_x86.h defines the ELF 32-bit x86 disassembler
  - disassembler\_elf\_32\_arm.h defines the ELF 32-bit ARM disassembler
- The Disassembler replaces the relocation table with a Courgette
  instruction that can regenerate the relocation table.
- The Disassembler builds a list of addresses referenced by the
  machine code, numbering each one.
- The Disassembler replaces each address used in machine instructions
  with its index number.
- The output is an assembly\_program.h:AssemblyProgram object, which
  contains a list of instructions, machine or Courgette, and a mapping
  of indices to actual addresses.
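
The indexing idea described above can be sketched with a toy model. This is a minimal Python illustration, not Courgette's code: the tuple-based instruction format, the `toy_disassemble` name, and the `"REL32"` tag are assumptions made for the example, and no real x86 parsing is attempted.

```python
def toy_disassemble(instructions):
    """Replace each referenced address with an index number.

    `instructions` is a list of (opcode, operand) tuples in a made-up
    format: ("JMP", address) references an address; every other tuple is
    treated as an opaque machine instruction and left untouched.
    """
    index_of = {}  # address -> index, numbered in order of first appearance
    program = []
    for opcode, operand in instructions:
        if opcode == "JMP":
            index = index_of.setdefault(operand, len(index_of))
            # Hypothetical Courgette-style instruction carrying an index.
            program.append(("REL32", index))
        else:
            program.append((opcode, operand))
    # Mapping of indices back to the actual addresses.
    addresses = {index: address for address, index in index_of.items()}
    return program, addresses


program, addresses = toy_disassemble([
    ("MOV", 1), ("JMP", 0x4010), ("ADD", 2), ("JMP", 0x4020), ("JMP", 0x4010),
])
```

In this sketch the two jumps to `0x4010` share index 0, so the instruction list no longer contains any raw address; the addresses live only in the separate mapping.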
- The adjust step takes the AssemblyProgram for the new file and
  reassigns the indices that map to actual addresses, using the
  AssemblyProgram for the old file as a reference. It is performed by
  adjustment_method.cc:Adjust().
- The goal is to match the indices from the old program to the new
  program as closely as possible.
- When matched correctly, machine instructions that jump to the same
  function in both the new and old binary will look the same to
  bsdiff, even if the function is located in a different part of the
  binary.
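
To see why matching indices helps, consider a toy example in Python. The matching rule used here (compare the i-th instruction of each program) is a deliberate simplification chosen for this sketch and is not the method used by adjustment_method.cc; the `toy_adjust` name and the tuple format are likewise hypothetical.

```python
def toy_adjust(old_program, new_program, new_addresses):
    """Renumber the new program's indices to agree with the old program.

    Matching is purely positional, which only works when both programs
    have the same shape. Indices with no match keep their number; a real
    implementation would also have to avoid index collisions.
    """
    remap = {}  # new index -> old index
    for (old_op, old_arg), (new_op, new_arg) in zip(old_program, new_program):
        if old_op == new_op == "REL32":
            remap.setdefault(new_arg, old_arg)
    adjusted_program = [
        (op, remap.get(arg, arg)) if op == "REL32" else (op, arg)
        for op, arg in new_program
    ]
    adjusted_addresses = {remap.get(i, i): a for i, a in new_addresses.items()}
    return adjusted_program, adjusted_addresses


# Old binary: foo at 0x1000 (index 0), bar at 0x2000 (index 1).
old_program = [("REL32", 0), ("MOV", 7), ("REL32", 1)]

# New binary: foo moved to 0x3000, so numbering by address order now
# gives bar index 0 and foo index 1.
new_program = [("REL32", 1), ("MOV", 7), ("REL32", 0)]
new_addresses = {0: 0x2000, 1: 0x3000}

adjusted_program, adjusted_addresses = toy_adjust(
    old_program, new_program, new_addresses
)
```

After adjustment the new instruction list is identical to the old one, and the move of `foo` shows up only as a changed entry in the address mapping.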
- The encode step takes an AssemblyProgram object and encodes both the
  instructions and the mapping of indices to addresses as byte
  vectors. This format can be written to a file directly, and is also
  more appropriate for bsdiffing. It is done by
  AssemblyProgram.Encode().
- encoded_program.h:EncodedProgram defines the binary format and a
  WriteTo method that writes to a file.
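
As a rough picture of what encoding into byte vectors means, the Python sketch below serializes a toy program into two separate byte strings. The layout (one opcode byte plus a 4-byte little-endian operand) and the opcode numbers are invented for this example; the real format is the one defined by EncodedProgram.

```python
import struct

OPCODES = {"MOV": 0, "REL32": 1}  # hypothetical opcode numbers


def toy_encode(program, addresses):
    """Serialize a toy program into two byte vectors."""
    # Instruction stream: opcode byte followed by a 4-byte little-endian operand.
    instruction_bytes = b"".join(
        struct.pack("<BI", OPCODES[op], arg) for op, arg in program
    )
    # Address table: 4-byte little-endian addresses, ordered by index.
    address_bytes = b"".join(
        struct.pack("<I", addresses[i]) for i in sorted(addresses)
    )
    return instruction_bytes, address_bytes


instruction_bytes, address_bytes = toy_encode(
    [("REL32", 0), ("MOV", 7)], {0: 0x3000}
)
```

Keeping the instruction stream and the address table in separate vectors means that, in this toy model, a moved function changes bytes only in the address vector.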
- bsdiff is performed by simple_delta.cc:GenerateSimpleDelta.

![Patch Application](application.png)
- courgette\_tool.cc:ApplyEnsemblePatch kicks off the patch application
  by calling ensemble\_apply.cc:ApplyEnsemblePatch.
- ensemble\_apply.cc:ApplyEnsemblePatch reads and verifies the
  patch's header, then calls the overloaded version of
  ensemble\_apply.cc:ApplyEnsemblePatch.
- The patch is read into an ensemble_apply.cc:EnsemblePatchApplication
  object, which generates a set of patcher_x86_32.h:PatcherX86_32
  objects for the sections in the patch.
- The original file is disassembled and encoded via a call to
  EnsemblePatchApplication.TransformUp, which in turn calls
  patcher_x86_32.h:PatcherX86_32.Transform.
- The transformed file is then bspatched via
  EnsemblePatchApplication.SubpatchTransformedElements, which calls
  EnsemblePatchApplication.SubpatchStreamSets, which calls
  simple_delta.cc:ApplySimpleDelta, Courgette's built-in
  implementation of bspatch.
- Finally, EnsemblePatchApplication.TransformDown assembles the
  patched binary data, i.e., reverses the encoding and disassembly.
  This is done by calling PatcherX86_32.Reform, which in turn calls
  the global function encoded_program.cc:Assemble, which calls
  EncodedProgram.AssembleTo.
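
The final assemble step can be pictured as the inverse of the index substitution done during disassembly. The Python sketch below uses a made-up tuple format and a hypothetical `toy_assemble` function; it illustrates the idea only and does not mirror EncodedProgram.AssembleTo.

```python
def toy_assemble(program, addresses):
    """Turn index-based instructions back into address-based ones."""
    return [
        # Evaluate the Courgette-style instruction by looking up its index;
        # leave ordinary machine instructions in place.
        ("JMP", addresses[arg]) if op == "REL32" else (op, arg)
        for op, arg in program
    ]


# Toy stand-ins for the patched program and its address mapping.
patched_program = [("REL32", 0), ("MOV", 7), ("REL32", 1)]
patched_addresses = {0: 0x3000, 1: 0x2000}

binary = toy_assemble(patched_program, patched_addresses)
```

Here every index is resolved through the patched address mapping, so the output again contains concrete addresses, as the new binary would.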

**Adjust**: Reassign address indices in the new program to match more
closely those from the old.

**Assembly program**: The output of _disassembly_. Contains a list of
_Courgette instructions_ and an index of branch target addresses.

**Assemble**: Convert an _assembly program_ back into an object file
by evaluating the _Courgette instructions_ and leaving the machine
instructions in place.

**Courgette instruction**: Replaces machine instructions in the
program. Courgette instructions replace branches with an index to
the target addresses and replace part of the relocation table.

**Disassembler**: Takes a binary file and produces an _assembly
program_.

**Encode**: Convert an _assembly program_ into an _encoded program_ by
serializing its data structures into byte vectors more appropriate
for storage in a file.

**Encoded Program**: The output of encoding.

**Ensemble**: A Courgette-style patch containing sections for the list
of branch addresses and the encoded program. It supports patching
multiple object files at once.

**Opcode**: The number corresponding to either a machine or _Courgette
instruction_.