TITLE: x86-optimization

AUTHOR: Eric Olinger <eric@supertux.com>

How to use GCC compiler optimization settings to optimize binaries for an x86 system.

Gerard Beekmans <gerard@linuxfromscratch.org>
One of the authors of the original Compiler-optimization hint.
I also paraphrased some of lfs-book 2.4.3 in the intro section.

Thomas -Balu- Walter <tw@itreff.de>
One of the authors of the original Compiler-optimization hint,
from which I got some of the information for this hint.

The people at the Athlon Linux Project <www.AthlonLinux.org>
They have one of the few pages I found, besides the GCC online
documentation, on optimization flags and what they mean.

Most binaries are compiled with the -O2 option and few, if any, other
optimization options. While this makes the binary portable, as it's
compiled for the i386 processor by default, it doesn't do much for the

There are a few ways to change the default compile options. One is to
manually edit or patch all the Makefiles in the source tree. This can
be time-consuming and inefficient. The second is to set the CFLAGS and
CXXFLAGS environment variables.
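
The environment-variable approach works because exported shell variables are
inherited by every process started from that shell, and most configure
scripts and Makefiles read CFLAGS and CXXFLAGS from there. A minimal sketch,
assuming a Bourne-compatible shell; the flag value is only a placeholder and
`sh -c` stands in for a real ./configure or make run:

```shell
# Placeholder flags for illustration only; substitute your own.
CFLAGS="-O2"
CXXFLAGS="$CFLAGS"
export CFLAGS CXXFLAGS

# A child process (standing in for ./configure or make) sees the same values.
child_cflags=$(sh -c 'printf "%s" "$CFLAGS"')
child_cxxflags=$(sh -c 'printf "%s" "$CXXFLAGS"')
echo "child sees CFLAGS=$child_cflags CXXFLAGS=$child_cxxflags"
```

Not every package honors these variables; some hard-code their flags in the
Makefile, in which case editing or patching is still needed.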

For a minimal set of optimizations, enter the following. 'unset' the
environment variables when you're done, or put the lines in your
.bashrc file if you plan to use them all the time.

export CFLAGS="-O3 -march=<cpu-type>" &&
export CXXFLAGS="$CFLAGS"

Or, for the maximum optimization possible, try the following:

export CFLAGS="-s -O3 -fomit-frame-pointer -Wall \
-march=<cpu-type> -malign-functions=4 \
-funroll-loops -fexpensive-optimizations -malign-double \
-fschedule-insns2 -mwide-multiply" &&
export CXXFLAGS="$CFLAGS"
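
The set-then-unset workflow mentioned above might look like the following
sketch. The flag values are placeholders (i686 is just an example cpu-type),
and the `${VAR-word}` expansion is used only to show that the variables are
really gone afterwards:

```shell
# Placeholder values; use whichever flag set you chose above.
export CFLAGS="-O3 -march=i686"
export CXXFLAGS="$CFLAGS"

# ... build the package here ...

# Remove the variables so later builds fall back to their own defaults.
unset CFLAGS CXXFLAGS

# ${VAR-word} expands to word only when VAR is unset.
cflags_state="${CFLAGS-unset}"
cxxflags_state="${CXXFLAGS-unset}"
echo "CFLAGS is $cflags_state, CXXFLAGS is $cxxflags_state"
```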

The minimal optimizations will almost always work on your system, but you
won't always be able to copy the binaries to other systems with a lower

Some packages don't like either of these optimization sets and either won't
build or will segfault when you try to run them. If you're having trouble
getting a package to compile or run properly, try turning off most if not all
of the options; the problem probably has something to do with your compiler
options.
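
One way to apply that advice is to retry the build with progressively fewer
flags. The sketch below assumes a Bourne-compatible shell; `try_build` and
`WORKING_FLAGS` are hypothetical names invented for this example, the flag
sets are placeholders, and the `sh -c` command is a stand-in for a real build
that happens to break under -O3:

```shell
# try_build: hypothetical helper, not part of any package.
# Runs the given command with progressively fewer flags until it succeeds,
# and records the flag set that worked in WORKING_FLAGS.
try_build() {
    for flags in "-O3 -march=i686 -funroll-loops" "-O3 -march=i686" "-O2" ""; do
        if CFLAGS="$flags" CXXFLAGS="$flags" "$@"; then
            WORKING_FLAGS=$flags
            return 0
        fi
    done
    return 1
}

# Stand-in for a real build command that fails whenever -O3 is in CFLAGS.
try_build sh -c 'case "$CFLAGS" in *-O3*) exit 1 ;; esac'
echo "build succeeded with: $WORKING_FLAGS"
```

With a real package you would pass something like your usual configure and
make commands instead, and clean the source tree between attempts so that
objects built with the old flags are not reused.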

The fact that you don't have any problems compiling everything with -O3
doesn't mean you won't have any problems in the future. Another problem
is that the Binutils version that's installed on your bootstrap system
often causes compilation problems in Glibc (most noticeable in RedHat,
because RedHat often uses beta software, which isn't always very

"RedHat likes living on the bleeding edge, but leaves the bleeding up to you"
(quoted from somebody on the lfs-discuss mailing list).

----------------------
----------------------

For more information on compiler optimization flags, see the GCC Command
Options page in the online GCC 3.0 docs at:

.gnu.org/onlinedocs/gcc-3.0/gcc_3.html

Section 3.10 deals with option flags for general compiler optimization.
Section 3.17.15 deals with compiler optimization flags specific to the

-s
A linker option that removes all symbol table and relocation
information from the binary.

-O
This flag sets the optimization level for the binary.
3   Highest level, machine specific code is generated.
    Auto-magically adds the -finline-functions and
    -frename-registers flags.
2   Most Makefiles have this set up as the default. Performs all
    supported optimizations that do not involve a space-speed
    tradeoff. Adds the -fforce-mem flag auto-magically.
1   Minimal optimizations are performed. Default for the compiler,
s   Same as -O2, but does additional optimizations for size.

-fomit-frame-pointer
Tells the compiler not to keep the frame pointer in
a register for functions that don't need one. This
avoids the instructions to save, set up and restore
frame pointers; it also makes an extra register available
in many functions. It also makes debugging impossible

-Wall
Enables all warning messages.

-march=<cpu-type>
Defines the instruction set to use when compiling. -mcpu is implied to
be the same as -march when only -march is used.
i386        Default cpu type
i486        Intel/AMD 486 processor
i586        First generation Pentium
i686        Second generation Pentium
pentiumpro  Same as i686
pentium4    Intel Pentium 4 processor

-mcpu=<cpu-type>
Sets the machine cpu-type to use when scheduling instructions.
The definitions are the same as for -march.
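
If you are unsure which cpu-type to use, the machine name reported by the
kernel can serve as a starting point. This is only a sketch: `march_flag` is
a hypothetical helper written for this example, and it assumes that
`uname -m` on a 32-bit x86 Linux system prints one of the i386 to i686 names
listed above. It may report a lower class than your processor actually
supports, so treat the result as a conservative guess.

```shell
# march_flag: hypothetical helper that maps a machine name to a -march flag.
# Prints nothing and returns 1 for names it does not recognize.
march_flag() {
    case "$1" in
        i386|i486|i586|i686) printf '%s\n' "-march=$1" ;;
        *) return 1 ;;
    esac
}

# Ask the running kernel for its machine name and map it.
march_flag "$(uname -m)" || echo "unrecognized machine type; leaving -march unset"
```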

-malign-functions=4
This is an i386 option. Aligns the start of functions to a boundary of
2 raised to the 4th power (16) bytes. If `-malign-functions' is not
specified, the default is 2 if optimizing for a 386, and 4 if optimizing
for a 486.
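
The value given to `-malign-functions' is an exponent, not a byte count. The
arithmetic can be checked with a short sketch; `align_bytes` is a
hypothetical helper name used only here:

```shell
# align_bytes: hypothetical helper; prints 2 raised to the given exponent.
align_bytes() {
    echo $((1 << $1))
}

# The two defaults mentioned above: 2 for a 386, 4 for a 486.
for n in 2 4; do
    echo "-malign-functions=$n -> $(align_bytes "$n")-byte boundary"
done
```

So the 386 default aligns functions on 4-byte boundaries and the 486 default
on 16-byte boundaries.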

-funroll-loops
This is an optimization option. Performs the optimization of loop
unrolling. This is only done for loops whose number of iterations
can be determined at compile time or run time. `-funroll-loops'
implies both `-fstrength-reduce' and `-frerun-cse-after-loop'.

-fexpensive-optimizations
Another optimization option that performs a number of minor
optimizations that are relatively expensive.

-malign-double
This is an i386 option. Controls whether GCC aligns double, long
double, and long long variables on a two word boundary or a one
word boundary. Aligning double variables on a two word boundary
will produce code that runs somewhat faster on a `Pentium' at the
expense of more memory. Warning: if you use the `-malign-double'
switch, structures containing the above types will be aligned
differently than the published application binary interface
specifications for the 386.

-fschedule-insns2
This is an optimization option. Similar to `-fschedule-insns',
but requests an additional pass of instruction scheduling after
register allocation has been done. This is especially useful on
machines with a relatively small number of registers and where
memory load instructions take more than one cycle.

-mwide-multiply
Controls whether GCC uses the mul and imul instructions that produce
64-bit results in eax:edx from 32-bit operands to do long long
multiplies and 32-bit division by constants.