This is the GNU profiler.  It is distributed with other "binary
utilities" which should be in ../binutils.  See ../binutils/README for
more general notes, including where to send bug reports.

This file documents the changes and new features available with this
version of GNU gprof.

 o Supports a generalized file format without breaking backward
   compatibility: the new file format supports basic-block execution
   counts and non-realtime histograms (see below)

 o Supports profiling at the line level: flat profiles, call-graph profiles,
   and execution counts can all be displayed at a level that identifies
   individual lines rather than just functions

 o Test-coverage support (similar to the Sun tcov program): source files
   can be annotated with the number of times a function was invoked
   or with the number of times each basic-block in a function was
   executed

 o Generalized histograms: not just execution-time, but arbitrary
   histograms are supported (for example, performance-counter-based
   profiles)

 o Powerful mechanism to select data to be included/excluded from
   analysis and/or output

 o Support for DEC OSF/1 v3.0

 o Full cross-platform profiling support: gprof uses BFD to support
   arbitrary, non-native object file formats and non-native byte-orders
   (this feature has not been tested yet)

 o In the call-graph function index, static function names are now
   printed together with the filename in which the function was defined
   (requires bfd_find_nearest_line() support and symbolic debugging
   information to be present in the executable file)

 o Major overhaul of source code (compiles cleanly with -Wall, etc.)

The current version is known to work on:

   All features supported.

   All features supported.

   Line-level profiling unsupported because bfd_find_nearest_line()
   is not fully implemented for ELF binaries.

   Line-level profiling unsupported because bfd_find_nearest_line()
   is not fully implemented for SOM binaries.

* Detailed Description

** User Interface Changes

The command-line interface is backwards compatible with earlier
versions of GNU gprof and Berkeley gprof.  The only exception is
the option to delete arcs from the call graph.  The old syntax

while the new syntax is:

This change was necessary to be compatible with long-option parsing.
Also, "fromname" and "toname" can now be arbitrary symspecs rather
than just function names (see below for an explanation of symspecs).
For example, option "-k gprof.c/" suppresses all arcs due to calls out
of file "gprof.c".

It is often necessary to apply gprof only to specific parts of a
program.  GNU gprof has a simple but powerful mechanism to achieve
this.  So-called symspecs provide the foundation for this
mechanism.  A symspec selects the parts of a profiled program to which
an operation should be applied.  The syntax of a symspec is:

          filename_containing_a_dot
        | funcname_not_containing_a_dot
        | ( [ any_filename ] `:' ( any_funcname | linenumber ) )

Here are some examples:

        main.c          Selects everything in file "main.c".  The
                        dot in the string tells gprof to interpret
                        the string as a filename, rather than as
                        a function name.  To select a file whose
                        name does not contain a dot, a trailing colon
                        should be specified.  For example, "odd:" is
                        interpreted as the file named "odd".

        main            Selects all functions named "main".  Notice
                        that there may be multiple instances of the
                        same function name because some of the
                        definitions may be local (i.e., static).
                        Unless a function name is unique in a program,
                        you must use the colon notation explained
                        below to specify a function from a specific
                        source file.  Sometimes, function names contain
                        dots.  In such cases, it is necessary to
                        add a leading colon to the name.  For example,
                        ":.mul" selects function ".mul".

        main.c:main     Selects function "main" in file "main.c".

        main.c:134      Selects line 134 in file "main.c".

IMPLEMENTATION NOTE: The source code uses the type sym_id for symspecs.
At some point, this probably ought to be changed to "sym_spec" to make
reading the code easier.
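
The symspec rules described above can be sketched in a few lines of
Python.  This is only an illustration of the rules as stated in this
file, not gprof's actual parser: the function name parse_symspec and
the (filename, funcname, line) tuple are made up for the example.

```python
def parse_symspec(spec):
    """Split a symspec into (filename, funcname, line); unused parts are None.

    Hypothetical helper that mirrors the rules in the text, not gprof code.
    """
    if ":" in spec:
        filename, _, rest = spec.partition(":")
        filename = filename or None          # ":.mul" has no file part
        if not rest:
            return (filename, None, None)    # "odd:" selects file "odd"
        if rest.isdigit():
            return (filename, None, int(rest))   # "main.c:134"
        return (filename, rest, None)        # "main.c:main" or ":.mul"
    if "." in spec:
        return (spec, None, None)            # a dot means a filename
    return (None, spec, None)                # otherwise a function name


for example in ("main.c", "odd:", "main", ":.mul", "main.c:main", "main.c:134"):
    print(example, "->", parse_symspec(example))
```

The colon is checked first because it overrides the dot heuristic in
both directions: "odd:" is a file without a dot, and ":.mul" is a
function with one.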

GNU gprof now supports long options.  The following is a list of all
supported options.  Options that are listed without description
operate in the same manner as the corresponding option in older
versions of gprof.

Short Form:     Long Form:
-----------     ----------

        Request profiling at the line-level rather
        than just at the function level.  Source
        lines are identified by symbols of the form:

        where "func" is the function name, "file" is the
        file name and "line" is the line-number that
        corresponds to the line.

        To work properly, the binary must contain symbolic
        debugging information.  This means that the source
        has to be compiled with option "-g" specified.
        Functions for which there is no symbolic debugging
        information available are treated as if "--line"
        had not been specified.  However, the line number
        printed with such symbols is usually incorrect
        and should be ignored.

-A[symspec]     --annotated-source[=symspec]
        Request output in the form of annotated source
        files.  If "symspec" is specified, print output only
        for symbols selected by "symspec".  If the option
        is specified multiple times, annotated output is
        generated for the union of all symspecs.

          -A            Prints annotated source for all
                        source files.
          -Agprof.c     Prints annotated source for file
                        gprof.c.
          -Afoobar      Prints annotated source for files
                        containing a function named "foobar".
                        The entire file will be printed, but
                        only the function itself will be
                        annotated with profile data.

-J[symspec]     --no-annotated-source[=symspec]
        Suppress annotated source output.  If specified
        without argument, annotated output is suppressed
        completely.  With an argument, annotated output
        is suppressed only for the symbols selected by
        "symspec".  If the option is specified multiple
        times, annotated output is suppressed for the
        union of all symspecs.  This option has lower
        precedence than --annotated-source.

-p[symspec]     --flat-profile[=symspec]
        Request output in the form of a flat profile
        (unless any other output-style option is specified,
        this option is turned on by default).  If
        "symspec" is specified, include only symbols
        selected by "symspec" in the flat profile.  If the
        option is specified multiple times, the flat
        profile includes symbols selected by the union
        of all symspecs.

-P[symspec]     --no-flat-profile[=symspec]
        Suppress output in the flat profile.  If given
        without an argument, the flat profile is suppressed
        completely.  If "symspec" is specified, suppress
        the selected symbols in the flat profile.  If the
        option is specified multiple times, the union of
        the selected symbols is suppressed.  This option
        has lower precedence than --flat-profile.

-q[symspec]     --graph[=symspec]
        Request output in the form of a call-graph
        (unless any other output-style option is specified,
        this option is turned on by default).  If "symspec"
        is specified, include only symbols selected by
        "symspec" in the call-graph.  If the option is
        specified multiple times, the call-graph includes
        symbols selected by the union of all symspecs.

-Q[symspec]     --no-graph[=symspec]
        Suppress output in the call-graph.  If given without
        an argument, the call-graph is suppressed completely.
        With a "symspec", suppress the selected symbols
        from the call-graph.  If the option is specified
        multiple times, the union of the selected symbols
        is suppressed.  This option has lower precedence
        than --graph.

-C[symspec]     --exec-counts[=symspec]
        Request output in the form of execution counts.
        If "symspec" is present, include only symbols
        selected by "symspec" in the execution count
        listing.  If the option is specified multiple
        times, the execution count listing includes
        symbols selected by the union of all symspecs.

-Z[symspec]     --no-exec-counts[=symspec]
        Suppress output in the execution count listing.
        If given without an argument, the listing is
        suppressed completely.  With a "symspec", suppress
        the selected symbols from the execution count
        listing.  If the option is specified multiple
        times, the union of the selected symbols is
        suppressed.  This option has lower precedence
        than --exec-counts.

        Print information about the profile files that
        are read.  The information consists of the
        number and types of records present in the
        profile file.  Currently, a profile file can
        contain any number and any combination of histogram,
        call-graph, or basic-block count records.

        This option affects annotated source output only.
        By default, only the lines at the beginning of
        a basic-block are annotated.  If this option is
        specified, every line in a basic-block is annotated
        by repeating the annotation for the first line.
        This option is identical to tcov's "-a".

-I dirs         --directory-path=dirs
        This option affects annotated source output only.
        Specifies the list of directories to be searched
        for source files.  The argument "dirs" is a
        colon-separated list of directories.  By default,
        gprof searches for source files relative to the
        current working directory only.

-z              --display-unused-functions

-m num          --min-count=num
        This option affects annotated source and execution
        count output only.  Symbols that are executed
        fewer than "num" times are suppressed.  For annotated
        source output, suppressed symbols are marked
        by five hash-marks (#####).  In execution count
        output, suppressed symbols do not appear at all.

        Normally, source filenames are printed with the path
        component suppressed.  With this option, gprof
        can be forced to print the full pathname of
        source filenames.  The full pathname is determined
        from symbolic debugging information in the image file
        and is relative to the directory in which the compiler
        was invoked.

        This option affects annotated source output only.
        Normally, gprof prints annotated source files
        to standard-output.  If this option is specified,
        annotated source for a file named "path/filename"
        is generated in the file "filename-ann".  That is,
        annotated output is always generated in
        gprof's current working directory.  Care has to
        be taken if a program consists of files that have
        identical filenames, but distinct paths.

-c              --static-call-graph

-t num          --table-length=num
        This option affects annotated source output only.
        After annotating a source file, gprof generates
        an execution count summary consisting of a table
        of lines with the top execution counts.  By
        default, this table is ten entries long.
        This option can be used to change the table length
        or, by specifying an argument value of 0, it can be
        suppressed completely.

-n symspec      --time=symspec
        Only symbols selected by "symspec" are considered
        in total and percentage time computations.
        However, this option does not affect percentage time
        computation for the flat profile.
        If the option is specified multiple times, the union
        of all selected symbols is used in time computations.

        Exclude the symbols selected by "symspec" from
        total and percentage time computations.
        However, this option does not affect percentage time
        computation for the flat profile.
        This option is ignored if any --time options are
        specified.

        Sets the output line width.  Currently, this option
        affects the printing of the call-graph function index
        only.

-e              <no long form---for backwards compatibility only>
-E              <no long form---for backwards compatibility only>
-f              <no long form---for backwards compatibility only>
-F              <no long form---for backwards compatibility only>
-k              <no long form---for backwards compatibility only>

        Prints a usage message.

-O name         --file-format=name
        Selects the format of the profile data files.
        Recognized formats are "auto", "bsd", "magic",
        and "prof".  The last one is not yet supported.
        Format "auto" attempts to detect the file format
        automatically (this is the default behavior).
        It attempts to read the profile data files as
        "magic" files and if this fails, falls back to
        the "bsd" format.  "bsd" forces gprof to read
        the data files in the BSD format.  "magic" forces
        gprof to read the data files in the "magic" format.
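
The "auto" fallback described for --file-format can be pictured as
follows.  This is a rough Python sketch, not gprof's reader: the
cookie value b"gmon" is an assumption made for the example (the
authoritative definition lives in gmon_out.h), and read_magic,
read_bsd, read_profile and FormatError are hypothetical names.

```python
class FormatError(Exception):
    """Raised when data does not look like the requested format."""


def read_magic(data, cookie=b"gmon"):
    # Assumed cookie value; check gmon_out.h for the real definition.
    if not data.startswith(cookie):
        raise FormatError("magic cookie not found")
    return ("magic", data[len(cookie):])


def read_bsd(data):
    # The old format has no cookie, so nothing can be verified here.
    return ("bsd", data)


def read_profile(data, fmt="auto"):
    """Dispatch on the format name the way the option text describes."""
    if fmt == "auto":
        try:
            return read_magic(data)      # try the new format first
        except FormatError:
            return read_bsd(data)        # fall back to the BSD format
    if fmt == "magic":
        return read_magic(data)
    if fmt == "bsd":
        return read_bsd(data)
    raise ValueError("unsupported format: " + fmt)


print(read_profile(b"gmon\x01\x00\x00\x00")[0])
print(read_profile(b"\x00\x10\x00\x00")[0])
```

Because the BSD format carries no cookie, "auto" can only prove that a
file is in the new format; anything else is treated as BSD data.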

** File Format Changes

The old BSD-derived format used for profile data does not contain a
magic cookie that would allow checking whether a data file really is a
gprof file.  Furthermore, it does not provide a version number, thus
rendering changes to the file format almost impossible.  GNU gprof
uses a new file format that provides these features.  For backward
compatibility, GNU gprof continues to support the old BSD-derived
format, but not all features are supported with it.  For example,
basic-block execution counts cannot be accommodated by the old file
format.

The new file format is defined in header file gmon_out.h.  It
consists of a header containing the magic cookie and a version number,
as well as some spare bytes available for future extensions.  All data
in a profile data file is in the native format of the host on which
the profile was collected.  GNU gprof adapts automatically to the
byte-order in use.

In the new file format, the header is followed by a sequence of
records.  Currently, there are three different record types: histogram
records, call-graph arc records, and basic-block execution count
records.  Each file can contain any number of each record type.  When
reading a file, GNU gprof will ensure records of the same type are
compatible with each other and compute the union of all records.  For
example, for basic-block execution counts, the union is simply the sum
of all execution counts for each basic-block.
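
As a small illustration of the "union is the sum" rule for basic-block
counts, the following Python sketch merges several records.  It
assumes each record has already been decoded into a dictionary that
maps a basic-block address to its count; merge_bb_counts is a
hypothetical name, not a gprof function.

```python
from collections import Counter


def merge_bb_counts(records):
    """Sum the execution counts of each basic-block over all records."""
    total = Counter()
    for record in records:
        total.update(record)    # Counter.update adds counts per key
    return dict(total)


first = {0x1000: 3, 0x1010: 1}
second = {0x1000: 2, 0x1020: 5}
print(merge_bb_counts([first, second]))
```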

*** Histogram Records

Histogram records consist of a header that is followed by an array of
bins.  The header contains the text-segment range that the histogram
spans, the size of the histogram in bytes (unlike in the old BSD
format, this does not include the size of the header), the rate of the
profiling clock, and the physical dimension that the bin counts
represent after being scaled by the profiling clock rate.  The
physical dimension is specified in two parts: a long name of up to 15
characters and a single-character abbreviation.  For example, a
histogram representing real-time would specify the long name as
"seconds" and the abbreviation as "s".  This feature is useful for
architectures that support performance monitor hardware (which,
fortunately, is becoming increasingly common).  For example, under DEC
OSF/1, the "uprofile" command can be used to produce a histogram of,
say, instruction cache misses.  In this case, the dimension in the
histogram header could be set to "i-cache misses" and the abbreviation
could be set to "1" (because it is simply a count, not a physical
dimension).  Also, the profiling rate would have to be set to 1 in
this case.
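
One way to read "scaled by the profiling clock rate" is that each bin
count is divided by the rate to obtain a value in the stated
dimension.  The Python sketch below works under that assumption;
scale_bins is a hypothetical helper, and the sample numbers are
invented for the example.

```python
def scale_bins(bins, prof_rate):
    """Convert raw bin counts into the histogram's physical dimension.

    Assumes the scaling is a plain division by the profiling rate.
    """
    return [count / prof_rate for count in bins]


# Real-time histogram: a 100 Hz clock, dimension "seconds" ("s").
print(scale_bins([100, 250, 0], 100))

# Counter-based histogram: rate 1, so the counts pass through unchanged.
print(scale_bins([7, 3], 1))
```

Under this reading, a rate of 1 is what makes a pure event count, such
as cache misses, come out unscaled.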

Histogram bins are 16-bit numbers and each bin represents an equal
amount of text-space.  For example, if the text-segment is one
thousand bytes long and if there are ten bins in the histogram, each
bin represents one hundred bytes.
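
The bin arithmetic in the example above can be written out as a short
Python sketch.  The names bin_index, low_pc and high_pc are chosen for
illustration only, and the half-open range [low_pc, high_pc) is an
assumption about how the text-segment range is interpreted.

```python
def bin_index(addr, low_pc, high_pc, num_bins):
    """Return the index of the histogram bin that covers addr."""
    if not low_pc <= addr < high_pc:
        raise ValueError("address outside the histogram range")
    # Integer arithmetic avoids rounding problems at bin boundaries.
    return (addr - low_pc) * num_bins // (high_pc - low_pc)


# A 1000-byte text-segment with ten bins: 100 bytes per bin.
print((1000 - 0) // 10)
print(bin_index(250, 0, 1000, 10))
print(bin_index(999, 0, 1000, 10))
```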

*** Call-Graph Records

Call-graph records have a format that is identical to the one used in
the BSD-derived file format.  Each record consists of an arc in the
call graph and a count indicating the number of times the arc was
traversed during program execution.  Arcs are specified by a pair of
addresses: the first must be within the caller's function and the
second must be within the callee's function.  When performing
profiling at the function level, these addresses can point anywhere
within the respective function.  However, when profiling at the
line-level, it is better if the addresses are as close to the
call-site/entry-point as possible.  This will ensure that the
line-level call-graph is able to identify exactly which line of source
code performed calls to a function.
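
To illustrate why any address within a function is enough at the
function level, here is a Python sketch that resolves an arc to a
(caller, callee, count) triple.  It assumes a sorted symbol table of
(start_address, name) pairs in which each function extends up to the
next symbol; function_at and resolve_arc are hypothetical names, and
the addresses are invented.

```python
import bisect


def function_at(addr, symbols):
    """Return the name of the function that contains addr, or None."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1   # last symbol at or below addr
    return symbols[i][1] if i >= 0 else None


def resolve_arc(arc, symbols):
    """Map an arc (from_pc, to_pc, count) to (caller, callee, count)."""
    from_pc, to_pc, count = arc
    return (function_at(from_pc, symbols), function_at(to_pc, symbols), count)


symbols = [(0x1000, "main"), (0x1100, "helper")]
print(resolve_arc((0x1040, 0x1100, 7), symbols))
```

The same lookup against a table of source lines instead of functions
shows why line-level profiling wants the addresses close to the call
site: a loosely placed address would resolve to the wrong line.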

*** Basic-Block Execution Count Records

Basic-block execution count records consist of a header followed by a
sequence of address/count pairs.  The header simply specifies the
length of the sequence.  In an address/count pair, the address
identifies a basic-block and the count specifies the number of times
that basic-block was executed.  Any address within the basic-block can
be used.

IMPLEMENTATION NOTE: gcc -a can be used to instrument a program to
record basic-block execution counts.  However, the __bb_exit_func()
that is currently present in libgcc2.c does not generate a gmon.out
file in a suitable format.  This should be fixed for future releases
of gcc.  In the meantime, contact davidm@cs.arizona.edu for a version
of __bb_exit_func() that is appropriate.