|
1 This is gprof.info, produced by makeinfo version 4.8 from gprof.texi. |
|
2 |
|
3 START-INFO-DIR-ENTRY |
|
4 * gprof: (gprof). Profiling your program's execution |
|
5 END-INFO-DIR-ENTRY |
|
6 |
|
7 This file documents the gprof profiler of the GNU system. |
|
8 |
|
9 Copyright (C) 1988, 92, 97, 98, 99, 2000, 2001, 2003, 2007 Free |
|
10 Software Foundation, Inc. |
|
11 |
|
12 Permission is granted to copy, distribute and/or modify this document |
|
13 under the terms of the GNU Free Documentation License, Version 1.1 or |
|
14 any later version published by the Free Software Foundation; with no |
|
15 Invariant Sections, with no Front-Cover Texts, and with no Back-Cover |
|
16 Texts. A copy of the license is included in the section entitled "GNU |
|
17 Free Documentation License". |
|
18 |
|
19 |
|
20 File: gprof.info, Node: Top, Next: Introduction, Up: (dir) |
|
21 |
|
22 Profiling a Program: Where Does It Spend Its Time? |
|
23 ************************************************** |
|
24 |
|
25 This manual describes the GNU profiler, `gprof', and how you can use it |
|
26 to determine which parts of a program are taking most of the execution |
|
27 time. We assume that you know how to write, compile, and execute |
|
28 programs. GNU `gprof' was written by Jay Fenlason. |
|
29 |
|
30 This manual is for `gprof' (GNU Binutils) version 2.18.50. |
|
31 |
|
32 This document is distributed under the terms of the GNU Free |
|
33 Documentation License. A copy of the license is included in the |
|
34 section entitled "GNU Free Documentation License". |
|
35 |
|
36 * Menu: |
|
37 |
|
38 * Introduction:: What profiling means, and why it is useful. |
|
39 |
|
40 * Compiling:: How to compile your program for profiling. |
|
41 * Executing:: Executing your program to generate profile data |
|
42 * Invoking:: How to run `gprof', and its options |
|
43 |
|
44 * Output:: Interpreting `gprof''s output |
|
45 |
|
46 * Inaccuracy:: Potential problems you should be aware of |
|
47 * How do I?:: Answers to common questions |
|
48 * Incompatibilities:: (between GNU `gprof' and Unix `gprof'.) |
|
49 * Details:: Details of how profiling is done |
|
50 * GNU Free Documentation License:: GNU Free Documentation License |
|
51 |
|
52 |
|
53 File: gprof.info, Node: Introduction, Next: Compiling, Prev: Top, Up: Top |
|
54 |
|
55 1 Introduction to Profiling |
|
56 *************************** |
|
57 |
|
58 Profiling allows you to learn where your program spent its time and |
|
59 which functions called which other functions while it was executing. |
|
60 This information can show you which pieces of your program are slower |
|
61 than you expected, and might be candidates for rewriting to make your |
|
62 program execute faster. It can also tell you which functions are being |
|
63 called more or less often than you expected. This may help you spot |
|
64 bugs that would otherwise have gone unnoticed. |
|
65 |
|
66 Since the profiler uses information collected during the actual |
|
67 execution of your program, it can be used on programs that are too |
|
68 large or too complex to analyze by reading the source. However, how |
|
69 your program is run will affect the information that shows up in the |
|
70 profile data. If you don't use some feature of your program while it |
|
71 is being profiled, no profile information will be generated for that |
|
72 feature. |
|
73 |
|
74 Profiling has several steps: |
|
75 |
|
76 * You must compile and link your program with profiling enabled. |
|
77 *Note Compiling a Program for Profiling: Compiling. |
|
78 |
|
79 * You must execute your program to generate a profile data file. |
|
80 *Note Executing the Program: Executing. |
|
81 |
|
82 * You must run `gprof' to analyze the profile data. *Note `gprof' |
|
83 Command Summary: Invoking. |
|
84 |
|
85 The next three chapters explain these steps in greater detail. |
|
86 |
|
87 Several forms of output are available from the analysis. |
|
88 |
|
89 The "flat profile" shows how much time your program spent in each |
|
90 function, and how many times that function was called. If you simply |
|
91 want to know which functions burn most of the cycles, it is stated |
|
92 concisely here. *Note The Flat Profile: Flat Profile. |
|
93 |
|
94 The "call graph" shows, for each function, which functions called |
|
95 it, which other functions it called, and how many times. There is also |
|
96 an estimate of how much time was spent in the subroutines of each |
|
97 function. This can suggest places where you might try to eliminate |
|
98 function calls that use a lot of time. *Note The Call Graph: Call |
|
99 Graph. |
|
100 |
|
101 The "annotated source" listing is a copy of the program's source |
|
102 code, labeled with the number of times each line of the program was |
|
103 executed. *Note The Annotated Source Listing: Annotated Source. |
|
104 |
|
105 To better understand how profiling works, you may wish to read a |
|
106 description of its implementation. *Note Implementation of Profiling: |
|
107 Implementation. |
|
108 |
|
109 |
|
110 File: gprof.info, Node: Compiling, Next: Executing, Prev: Introduction, Up: Top |
|
111 |
|
112 2 Compiling a Program for Profiling |
|
113 *********************************** |
|
114 |
|
115 The first step in generating profile information for your program is to |
|
116 compile and link it with profiling enabled. |
|
117 |
|
118 To compile a source file for profiling, specify the `-pg' option when |
|
119 you run the compiler. (This is in addition to the options you normally |
|
120 use.) |
|
121 |
|
122 To link the program for profiling, if you use a compiler such as `cc' |
|
123 to do the linking, simply specify `-pg' in addition to your usual |
|
124 options. The same option, `-pg', alters either compilation or linking |
|
125 to do what is necessary for profiling. Here are examples: |
|
126 |
|
127 cc -g -c myprog.c utils.c -pg |
|
128 cc -o myprog myprog.o utils.o -pg |
|
129 |
|
130 The `-pg' option also works with a command that both compiles and |
|
131 links: |
|
132 |
|
133 cc -o myprog myprog.c utils.c -g -pg |
|
134 |
|
135 Note: The `-pg' option must be part of your compilation options as |
|
136 well as your link options. If it is not, no call-graph data will |
|
137 be gathered, and when you run `gprof' you will get an error message like |
|
138 this: |
|
139 |
|
140 gprof: gmon.out file is missing call-graph data |
|
141 |
|
142 If you add the `-Q' switch to suppress the printing of the call |
|
143 graph data you will still be able to see the time samples: |
|
144 |
|
145 Flat profile: |
|
146 |
|
147 Each sample counts as 0.01 seconds. |
|
148   %   cumulative   self              self     total |
|
149  time   seconds   seconds    calls  Ts/call  Ts/call  name |
|
150  44.12      0.07     0.07                             zazLoop |
|
151  35.29      0.14     0.06                             main |
|
152  20.59      0.17     0.04                             bazMillion |
|
153 |
|
154 If you run the linker `ld' directly instead of through a compiler |
|
155 such as `cc', you may have to specify a profiling startup file |
|
156 `gcrt0.o' as the first input file instead of the usual startup file |
|
157 `crt0.o'. In addition, you would probably want to specify the |
|
158 profiling C library, `libc_p.a', by writing `-lc_p' instead of the |
|
159 usual `-lc'. This is not absolutely necessary, but doing this gives |
|
160 you number-of-calls information for standard library functions such as |
|
161 `read' and `open'. For example: |
|
162 |
|
163 ld -o myprog /lib/gcrt0.o myprog.o utils.o -lc_p |
|
164 |
|
165 If you compile only some of the modules of the program with `-pg', |
|
166 you can still profile the program, but you won't get complete |
|
167 information about the modules that were compiled without `-pg'. The |
|
168 only information you get for the functions in those modules is the |
|
169 total time spent in them; there is no record of how many times they |
|
170 were called, or from where. This will not affect the flat profile |
|
171 (except that the `calls' field for the functions will be blank), but |
|
172 will greatly reduce the usefulness of the call graph. |
|
173 |
|
174 If you wish to perform line-by-line profiling you should use the |
|
175 `gcov' tool instead of `gprof'. See that tool's manual or info pages |
|
176 for more details of how to do this. |
|
177 |
|
178 Note that older versions of `gcc' produce line-by-line profiling |
|
179 information that works with `gprof' rather than `gcov', so there is |
|
180 still support for displaying this kind of information in `gprof'. *Note |
|
181 Line-by-line Profiling: Line-by-line. |
|
182 |
|
183 It is also worth noting that `gcc' implements a |
|
184 `-finstrument-functions' command line option which will insert calls to |
|
185 special user-supplied instrumentation routines at the entry and exit of |
|
186 every function in your program. This can be used to implement an |
|
187 alternative profiling scheme. |
|
188 |
|
189 |
|
190 File: gprof.info, Node: Executing, Next: Invoking, Prev: Compiling, Up: Top |
|
191 |
|
192 3 Executing the Program |
|
193 *********************** |
|
194 |
|
195 Once the program is compiled for profiling, you must run it in order to |
|
196 generate the information that `gprof' needs. Simply run the program as |
|
197 usual, using the normal arguments, file names, etc. The program should |
|
198 run normally, producing the same output as usual. It will, however, run |
|
199 somewhat slower than normal because of the time spent collecting and |
|
200 writing the profile data. |
|
201 |
|
202 The way you run the program--the arguments and input that you give |
|
203 it--may have a dramatic effect on what the profile information shows. |
|
204 The profile data will describe the parts of the program that were |
|
205 activated for the particular input you use. For example, if the first |
|
206 command you give to your program is to quit, the profile data will show |
|
207 the time used in initialization and in cleanup, but not much else. |
|
208 |
|
209 Your program will write the profile data into a file called |
|
210 `gmon.out' just before exiting. If there is already a file called |
|
211 `gmon.out', its contents are overwritten. There is currently no way to |
|
212 tell the program to write the profile data under a different name, but |
|
213 you can rename the file afterwards if you are concerned that it may be |
|
214 overwritten. |
|
215 |
|
216 In order to write the `gmon.out' file properly, your program must |
|
217 exit normally: by returning from `main' or by calling `exit'. Calling |
|
218 the low-level function `_exit' does not write the profile data, and |
|
219 neither does abnormal termination due to an unhandled signal. |
|
220 |
|
221 The `gmon.out' file is written in the program's _current working |
|
222 directory_ at the time it exits. This means that if your program calls |
|
223 `chdir', the `gmon.out' file will be left in the last directory your |
|
224 program `chdir''d to. If you don't have permission to write in this |
|
225 directory, the file is not written, and you will get an error message. |
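Because each run overwrites `gmon.out' in whatever directory the program exits from, it can help to move the file aside after every run. The following Python sketch shows one way to do that. The function name `save_profile' and the `gmon.LABEL' naming scheme are illustrative assumptions; `gprof' does not require any particular name.

```python
from pathlib import Path


def save_profile(run_dir, label):
    """Rename run_dir/gmon.out to run_dir/gmon.LABEL, if it exists.

    Returns the new path, or None when the program left no gmon.out
    (for example because it called _exit or was killed by a signal).
    """
    source = Path(run_dir) / "gmon.out"
    if not source.is_file():
        return None
    target = source.with_name(f"gmon.{label}")
    source.replace(target)  # replaces any older file with the same label
    return target
```

Here `run_dir' must be the directory that was current when the program exited, since that is where `gmon.out' is written. The renamed files can later be given to `gprof' together, after the executable name.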
|
226 |
|
227 Older versions of the GNU profiling library may also write a file |
|
228 called `bb.out'. This file, if present, contains a human-readable |
|
229 listing of the basic-block execution counts. Unfortunately, the |
|
230 appearance of a human-readable `bb.out' means the basic-block counts |
|
231 didn't get written into `gmon.out'. The Perl script `bbconv.pl', |
|
232 included with the `gprof' source distribution, will convert a `bb.out' |
|
233 file into a format readable by `gprof'. Invoke it like this: |
|
234 |
|
235 bbconv.pl < bb.out > BB-DATA |
|
236 |
|
237 This translates the information in `bb.out' into a form that `gprof' |
|
238 can understand. But you still need to tell `gprof' about the existence |
|
239 of this translated information. To do that, include BB-DATA on the |
|
240 `gprof' command line, _along with `gmon.out'_, like this: |
|
241 |
|
242 gprof OPTIONS EXECUTABLE-FILE gmon.out BB-DATA [YET-MORE-PROFILE-DATA-FILES...] [> OUTFILE] |
|
243 |
|
244 |
|
245 File: gprof.info, Node: Invoking, Next: Output, Prev: Executing, Up: Top |
|
246 |
|
247 4 `gprof' Command Summary |
|
248 ************************* |
|
249 |
|
250 After you have a profile data file `gmon.out', you can run `gprof' to |
|
251 interpret the information in it. The `gprof' program prints a flat |
|
252 profile and a call graph on standard output. Typically you would |
|
253 redirect the output of `gprof' into a file with `>'. |
|
254 |
|
255 You run `gprof' like this: |
|
256 |
|
257 gprof OPTIONS [EXECUTABLE-FILE [PROFILE-DATA-FILES...]] [> OUTFILE] |
|
258 |
|
259 Here square brackets indicate optional arguments. |
|
260 |
|
261 If you omit the executable file name, the file `a.out' is used. If |
|
262 you give no profile data file name, the file `gmon.out' is used. If |
|
263 any file is not in the proper format, or if the profile data file does |
|
264 not appear to belong to the executable file, an error message is |
|
265 printed. |
|
266 |
|
267 You can give more than one profile data file by entering all their |
|
268 names after the executable file name; then the statistics in all the |
|
269 data files are summed together. |
|
270 |
|
271 The order of these options does not matter. |
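The three steps (compile with `-pg', run the program, run `gprof') can be scripted. The Python sketch below builds the three command lines and, optionally, runs them. It assumes a compiler named `cc' on the search path, a program that needs no arguments, and a report file named `gprof.output'; the function names are hypothetical helpers, not part of `gprof'.

```python
import subprocess


def profiling_commands(sources, program, gprof_options=()):
    """Return the three command lines: build, run, analyze."""
    build = ["cc", "-g", "-pg", "-o", program, *sources]
    run = [f"./{program}"]
    analyze = ["gprof", *gprof_options, program, "gmon.out"]
    return build, run, analyze


def profile(sources, program, report="gprof.output", gprof_options=()):
    """Run all three steps in the current directory and save the report."""
    build, run, analyze = profiling_commands(sources, program, gprof_options)
    subprocess.run(build, check=True)
    subprocess.run(run, check=True)  # gmon.out is written on normal exit
    with open(report, "w") as out:
        # Equivalent to redirecting gprof's standard output with `>'.
        subprocess.run(analyze, stdout=out, check=True)
```

For example, `profile(["myprog.c", "utils.c"], "myprog", gprof_options=["-b"])' would leave a brief report in `gprof.output'. Note that `check=True' raises an exception if the profiled program exits with a nonzero status.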
|
272 |
|
273 * Menu: |
|
274 |
|
275 * Output Options:: Controlling `gprof''s output style |
|
276 * Analysis Options:: Controlling how `gprof' analyzes its data |
|
277 * Miscellaneous Options:: |
|
278 * Deprecated Options:: Options you no longer need to use, but which |
|
279 have been retained for compatibility |
|
280 * Symspecs:: Specifying functions to include or exclude |
|
281 |
|
282 |
|
283 File: gprof.info, Node: Output Options, Next: Analysis Options, Up: Invoking |
|
284 |
|
285 4.1 Output Options |
|
286 ================== |
|
287 |
|
288 These options specify which of several output formats `gprof' should |
|
289 produce. |
|
290 |
|
291 Many of these options take an optional "symspec" to specify |
|
292 functions to be included or excluded. These options can be specified |
|
293 multiple times, with different symspecs, to include or exclude sets of |
|
294 symbols. *Note Symspecs: Symspecs. |
|
295 |
|
296 Specifying any of these options overrides the default (`-p -q'), |
|
297 which prints a flat profile and call graph analysis for all functions. |
|
298 |
|
299 `-A[SYMSPEC]' |
|
300 `--annotated-source[=SYMSPEC]' |
|
301 The `-A' option causes `gprof' to print annotated source code. If |
|
302 SYMSPEC is specified, print output only for matching symbols. |
|
303 *Note The Annotated Source Listing: Annotated Source. |
|
304 |
|
305 `-b' |
|
306 `--brief' |
|
307 If the `-b' option is given, `gprof' doesn't print the verbose |
|
308 blurbs that try to explain the meaning of all of the fields in the |
|
309 tables. This is useful if you intend to print out the output, or |
|
310 are tired of seeing the blurbs. |
|
311 |
|
312 `-C[SYMSPEC]' |
|
313 `--exec-counts[=SYMSPEC]' |
|
314 The `-C' option causes `gprof' to print a tally of functions and |
|
315 the number of times each was called. If SYMSPEC is specified, |
|
316 print tally only for matching symbols. |
|
317 |
|
318 If the profile data file contains basic-block count records, |
|
319 specifying the `-l' option, along with `-C', will cause basic-block |
|
320 execution counts to be tallied and displayed. |
|
321 |
|
322 `-i' |
|
323 `--file-info' |
|
324 The `-i' option causes `gprof' to display summary information |
|
325 about the profile data file(s) and then exit. The number of |
|
326 histogram, call graph, and basic-block count records is displayed. |
|
327 |
|
328 `-I DIRS' |
|
329 `--directory-path=DIRS' |
|
330 The `-I' option specifies a list of search directories in which to |
|
331 find source files. Environment variable GPROF_PATH can also be |
|
332 used to convey this information. Used mostly for annotated source |
|
333 output. |
|
334 |
|
335 `-J[SYMSPEC]' |
|
336 `--no-annotated-source[=SYMSPEC]' |
|
337 The `-J' option causes `gprof' not to print annotated source code. |
|
338 If SYMSPEC is specified, `gprof' prints annotated source, but |
|
339 excludes matching symbols. |
|
340 |
|
341 `-L' |
|
342 `--print-path' |
|
343 Normally, source filenames are printed with the path component |
|
344 suppressed. The `-L' option causes `gprof' to print the full |
|
345 pathname of source filenames, which is determined from symbolic |
|
346 debugging information in the image file and is relative to the |
|
347 directory in which the compiler was invoked. |
|
348 |
|
349 `-p[SYMSPEC]' |
|
350 `--flat-profile[=SYMSPEC]' |
|
351 The `-p' option causes `gprof' to print a flat profile. If |
|
352 SYMSPEC is specified, print flat profile only for matching symbols. |
|
353 *Note The Flat Profile: Flat Profile. |
|
354 |
|
355 `-P[SYMSPEC]' |
|
356 `--no-flat-profile[=SYMSPEC]' |
|
357 The `-P' option causes `gprof' to suppress printing a flat profile. |
|
358 If SYMSPEC is specified, `gprof' prints a flat profile, but |
|
359 excludes matching symbols. |
|
360 |
|
361 `-q[SYMSPEC]' |
|
362 `--graph[=SYMSPEC]' |
|
363 The `-q' option causes `gprof' to print the call graph analysis. |
|
364 If SYMSPEC is specified, print call graph only for matching symbols |
|
365 and their children. *Note The Call Graph: Call Graph. |
|
366 |
|
367 `-Q[SYMSPEC]' |
|
368 `--no-graph[=SYMSPEC]' |
|
369 The `-Q' option causes `gprof' to suppress printing the call graph. |
|
370 If SYMSPEC is specified, `gprof' prints a call graph, but excludes |
|
371 matching symbols. |
|
372 |
|
373 `-t NUM' |
|
374 `--table-length=NUM' |
|
375 The `-t' option causes the NUM most active source lines in each |
|
376 source file to be listed when source annotation is enabled. The |
|
377 default is 10. |
|
378 |
|
379 `-y' |
|
380 `--separate-files' |
|
381 This option affects annotated source output only. Normally, |
|
382 `gprof' prints annotated source files to standard output. If this |
|
383 option is specified, annotated source for a file named |
|
384 `path/FILENAME' is generated in the file `FILENAME-ann'. If the |
|
385 underlying file system would truncate `FILENAME-ann' so that it |
|
386 overwrites the original `FILENAME', `gprof' generates annotated |
|
387 source in the file `FILENAME.ann' instead (if the original file |
|
388 name has an extension, that extension is _replaced_ with `.ann'). |
|
389 |
|
390 `-Z[SYMSPEC]' |
|
391 `--no-exec-counts[=SYMSPEC]' |
|
392 The `-Z' option causes `gprof' not to print a tally of functions |
|
393 and the number of times each was called. If SYMSPEC is specified, |
|
394 print tally, but exclude matching symbols. |
|
395 |
|
396 `-r' |
|
397 `--function-ordering' |
|
398 The `--function-ordering' option causes `gprof' to print a |
|
399 suggested function ordering for the program based on profiling |
|
400 data. This option suggests an ordering which may improve paging, |
|
401 TLB and cache behavior for the program on systems which support |
|
402 arbitrary ordering of functions in an executable. |
|
403 |
|
404 The exact details of how to force the linker to place functions in |
|
405 a particular order are system-dependent and outside the scope of this |
|
406 manual. |
|
407 |
|
408 `-R MAP_FILE' |
|
409 `--file-ordering MAP_FILE' |
|
410 The `--file-ordering' option causes `gprof' to print a suggested |
|
411 .o link line ordering for the program based on profiling data. |
|
412 This option suggests an ordering which may improve paging, TLB and |
|
413 cache behavior for the program on systems which do not support |
|
414 arbitrary ordering of functions in an executable. |
|
415 |
|
416 Use of the `-a' argument is highly recommended with this option. |
|
417 |
|
418 The MAP_FILE argument is a pathname to a file which provides |
|
419 function name to object file mappings. The format of the file is |
|
420 similar to the output of the program `nm'. |
|
421 |
|
422 c-parse.o:00000000 T yyparse |
|
423 c-parse.o:00000004 C yyerrflag |
|
424 c-lang.o:00000000 T maybe_objc_method_name |
|
425 c-lang.o:00000000 T print_lang_statistics |
|
426 c-lang.o:00000000 T recognize_objc_keyword |
|
427 c-decl.o:00000000 T print_lang_identifier |
|
428 c-decl.o:00000000 T print_lang_type |
|
429 ... |
|
430 |
|
431 To create a MAP_FILE with GNU `nm', type a command like `nm |
|
432 --extern-only --defined-only -v --print-file-name program-name'. |
|
433 |
|
434 `-T' |
|
435 `--traditional' |
|
436 The `-T' option causes `gprof' to print its output in |
|
437 "traditional" BSD style. |
|
438 |
|
439 `-w WIDTH' |
|
440 `--width=WIDTH' |
|
441 Sets width of output lines to WIDTH. Currently only used when |
|
442 printing the function index at the bottom of the call graph. |
|
443 |
|
444 `-x' |
|
445 `--all-lines' |
|
446 This option affects annotated source output only. By default, |
|
447 only the lines at the beginning of a basic-block are annotated. |
|
448 If this option is specified, every line in a basic-block is |
|
449 annotated by repeating the annotation for the first line. This |
|
450 behavior is similar to `tcov''s `-a'. |
|
451 |
|
452 `--demangle[=STYLE]' |
|
453 `--no-demangle' |
|
454 These options control whether C++ symbol names should be demangled |
|
455 when printing output. The default is to demangle symbols. The |
|
456 `--no-demangle' option may be used to turn off demangling. |
|
457 Different compilers have different mangling styles. The optional |
|
458 demangling style argument can be used to choose an appropriate |
|
459 demangling style for your compiler. |
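The MAP_FILE format used by `-R' (`--file-ordering') is easy to check before passing it to `gprof'. The Python sketch below reads lines of the form shown in the `-R' entry and builds a mapping from symbol name to object file. It is only an illustration of the format; the function name `parse_map_file' is hypothetical, and the sketch assumes every line carries an address, as `nm --defined-only' output does.

```python
def parse_map_file(lines):
    """Map each symbol name to (object_file, address, type_letter)."""
    symbols = {}
    for raw in lines:
        line = raw.strip()
        if not line or line == "...":
            continue  # skip blank lines and the elision marker
        location, type_letter, name = line.split(None, 2)
        object_file, _, address = location.rpartition(":")
        symbols[name] = (object_file, int(address, 16), type_letter)
    return symbols


sample = [
    "c-parse.o:00000000 T yyparse",
    "c-parse.o:00000004 C yyerrflag",
    "c-lang.o:00000000 T maybe_objc_method_name",
]
symbols = parse_map_file(sample)
```

With the sample above, `symbols["yyparse"]' is the tuple `("c-parse.o", 0, "T")'.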
|
460 |
|
461 |
|
462 File: gprof.info, Node: Analysis Options, Next: Miscellaneous Options, Prev: Output Options, Up: Invoking |
|
463 |
|
464 4.2 Analysis Options |
|
465 ==================== |
|
466 |
|
467 `-a' |
|
468 `--no-static' |
|
469 The `-a' option causes `gprof' to suppress the printing of |
|
470 statically declared (private) functions. (These are functions |
|
471 whose names are not listed as global, and which are not visible |
|
472 outside the file/function/block where they were defined.) Time |
|
473 spent in these functions, calls to/from them, etc., will all be |
|
474 attributed to the function that was loaded directly before it in |
|
475 the executable file. This option affects both the flat profile |
|
476 and the call graph. |
|
477 |
|
478 `-c' |
|
479 `--static-call-graph' |
|
480 The `-c' option causes the call graph of the program to be |
|
481 augmented by a heuristic which examines the text space of the |
|
482 object file and identifies function calls in the binary machine |
|
483 code. Since normal call graph records are only generated when |
|
484 functions are entered, this option identifies children that could |
|
485 have been called, but never were. Calls to functions that were |
|
486 not compiled with profiling enabled are also identified, but only |
|
487 if symbol table entries are present for them. Calls to dynamic |
|
488 library routines are typically _not_ found by this option. |
|
489 Parents or children identified via this heuristic are indicated in |
|
490 the call graph with call counts of `0'. |
|
491 |
|
492 `-D' |
|
493 `--ignore-non-functions' |
|
494 The `-D' option causes `gprof' to ignore symbols which are not |
|
495 known to be functions. This option will give more accurate |
|
496 profile data on systems where it is supported (Solaris and HP-UX, for |
|
497 example). |
|
498 |
|
499 `-k FROM/TO' |
|
500 The `-k' option allows you to delete from the call graph any arcs |
|
501 from symbols matching symspec FROM to those matching symspec TO. |
|
502 |
|
503 `-l' |
|
504 `--line' |
|
505 The `-l' option enables line-by-line profiling, which causes |
|
506 histogram hits to be charged to individual source code lines, |
|
507 instead of functions. This feature only works with programs |
|
508 compiled by older versions of the `gcc' compiler. Newer versions |
|
509 of `gcc' are designed to work with the `gcov' tool instead. |
|
510 |
|
511 If the program was compiled with basic-block counting enabled, |
|
512 this option will also identify how many times each line of code |
|
513 was executed. While line-by-line profiling can help isolate where |
|
514 in a large function a program is spending its time, it also |
|
515 significantly increases the running time of `gprof', and magnifies |
|
516 statistical inaccuracies. *Note Statistical Sampling Error: |
|
517 Sampling Error. |
|
518 |
|
519 `-m NUM' |
|
520 `--min-count=NUM' |
|
521 This option affects execution count output only. Symbols that are |
|
522 executed fewer than NUM times are suppressed. |
|
523 |
|
524 `-nSYMSPEC' |
|
525 `--time=SYMSPEC' |
|
526 The `-n' option causes `gprof', in its call graph analysis, to |
|
527 only propagate times for symbols matching SYMSPEC. |
|
528 |
|
529 `-NSYMSPEC' |
|
530 `--no-time=SYMSPEC' |
|
531 The `-N' option causes `gprof', in its call graph analysis, not to |
|
532 propagate times for symbols matching SYMSPEC. |
|
533 |
|
534 `-z' |
|
535 `--display-unused-functions' |
|
536 If you give the `-z' option, `gprof' will mention all functions in |
|
537 the flat profile, even those that were never called, and that had |
|
538 no time spent in them. This is useful in conjunction with the |
|
539 `-c' option for discovering which routines were never called. |
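The attribution rule described for `-a' can be illustrated with a small model. The Python sketch below takes a list of functions in load order and folds the time of each static function into the nearest preceding function that is still printed. The tuple layout and the function name are assumptions made for this illustration; they do not reflect how `gprof' stores its symbol table.

```python
def fold_static_functions(symbols):
    """Attribute time of static functions to the preceding printed one.

    symbols: iterable of (address, name, is_static, self_seconds).
    Returns {name: seconds} for the functions that remain visible.
    """
    totals = {}
    previous = None  # most recent function that is still printed
    for address, name, is_static, seconds in sorted(symbols):
        if is_static and previous is not None:
            totals[previous] += seconds
        else:
            # Assumption: a static function with nothing before it is kept.
            previous = name
            totals[name] = totals.get(name, 0.0) + seconds
    return totals


example = [
    (0x1000, "main", False, 0.5),
    (0x1040, "helper", True, 0.25),    # static, follows main
    (0x1080, "report", False, 0.125),
    (0x10C0, "fmt", True, 0.125),      # static, follows report
]
folded = fold_static_functions(example)
```

In this example the time of `helper' is charged to `main' and the time of `fmt' to `report', which shows why `-a' can make the remaining entries look more expensive than they are.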
|
540 |
|
541 |
|
542 |
|
543 File: gprof.info, Node: Miscellaneous Options, Next: Deprecated Options, Prev: Analysis Options, Up: Invoking |
|
544 |
|
545 4.3 Miscellaneous Options |
|
546 ========================= |
|
547 |
|
548 `-d[NUM]' |
|
549 `--debug[=NUM]' |
|
550 The `-d NUM' option specifies debugging options. If NUM is not |
|
551 specified, enable all debugging. *Note Debugging `gprof': |
|
552 Debugging. |
|
553 |
|
554 `-h' |
|
555 `--help' |
|
556 The `-h' option prints command line usage. |
|
557 |
|
558 `-ONAME' |
|
559 `--file-format=NAME' |
|
560 Selects the format of the profile data files. Recognized formats |
|
561 are `auto' (the default), `bsd', `4.4bsd', `magic', and `prof' |
|
562 (not yet supported). |
|
563 |
|
564 `-s' |
|
565 `--sum' |
|
566 The `-s' option causes `gprof' to summarize the information in the |
|
567 profile data files it read in, and write out a profile data file |
|
568 called `gmon.sum', which contains all the information from the |
|
569 profile data files that `gprof' read in. The file `gmon.sum' may |
|
570 be one of the specified input files; the effect of this is to |
|
571 merge the data in the other input files into `gmon.sum'. |
|
572 |
|
573 Afterwards, you can run `gprof' again without `-s' to analyze the |
|
574 cumulative data in the file `gmon.sum'. |
|
575 |
|
576 `-v' |
|
577 `--version' |
|
578 The `-v' flag causes `gprof' to print the current version number, |
|
579 and then exit. |
|
580 |
|
581 |
|
582 |
|
583 File: gprof.info, Node: Deprecated Options, Next: Symspecs, Prev: Miscellaneous Options, Up: Invoking |
|
584 |
|
585 4.4 Deprecated Options |
|
586 ====================== |
|
587 |
|
588 These options have been replaced with newer versions that use |
|
589 symspecs. |
|
590 |
|
591 `-e FUNCTION_NAME' |
|
592 The `-e FUNCTION_NAME' option tells `gprof' not to print information |
|
593 about the function FUNCTION_NAME (and its children...) in the call |
|
594 graph. The function will still be listed as a child of any |
|
595 functions that call it, but its index number will be shown as |
|
596 `[not printed]'. More than one `-e' option may be given; only one |
|
597 FUNCTION_NAME may be indicated with each `-e' option. |
|
598 |
|
599 `-E FUNCTION_NAME' |
|
600 The `-E FUNCTION_NAME' option works like the `-e' option, but time |
|
601 spent in the function (and children who were not called from |
|
602 anywhere else), will not be used to compute the |
|
603 percentages-of-time for the call graph. More than one `-E' option |
|
604 may be given; only one FUNCTION_NAME may be indicated with each |
|
605 `-E' option. |
|
606 |
|
607 `-f FUNCTION_NAME' |
|
608 The `-f FUNCTION_NAME' option causes `gprof' to limit the call graph to |
|
609 the function FUNCTION_NAME and its children (and their |
|
610 children...). More than one `-f' option may be given; only one |
|
611 FUNCTION_NAME may be indicated with each `-f' option. |
|
612 |
|
613 `-F FUNCTION_NAME' |
|
614 The `-F FUNCTION_NAME' option works like the `-f' option, but only time |
|
615 spent in the function and its children (and their children...) |
|
616 will be used to determine total-time and percentages-of-time for |
|
617 the call graph. More than one `-F' option may be given; only one |
|
618 FUNCTION_NAME may be indicated with each `-F' option. The `-F' |
|
619 option overrides the `-E' option. |
|
620 |
|
621 |
|
622 Note that only one function can be specified with each `-e', `-E', |
|
623 `-f' or `-F' option. To specify more than one function, use multiple |
|
624 options. For example, this command: |
|
625 |
|
626 gprof -e boring -f foo -f bar myprogram > gprof.output |
|
627 |
|
628 lists in the call graph all functions that were reached from either |
|
629 `foo' or `bar' and were not reachable from `boring'. |
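The selection made by the example command can be modelled as two reachability searches over the call graph. The Python sketch below follows the wording of the example: it keeps the functions reached from the `-f' roots and drops those reachable from the `-e' roots. The call graph is invented for illustration, and the code is a model of the description rather than of `gprof''s implementation.

```python
def reachable(graph, roots):
    """Return every function reachable from roots, including the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        function = stack.pop()
        if function in seen:
            continue
        seen.add(function)
        stack.extend(graph.get(function, ()))
    return seen


def listed_functions(graph, include_roots, exclude_roots):
    """Functions reached from include_roots but not from exclude_roots."""
    return reachable(graph, include_roots) - reachable(graph, exclude_roots)


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "main": ["foo", "bar", "boring"],
    "foo": ["parse"],
    "bar": ["emit", "log"],
    "boring": ["log"],
}
result = listed_functions(call_graph, ["foo", "bar"], ["boring"])
```

Here `log' is dropped because it is reachable from `boring', even though `bar' also calls it; `result' contains `foo', `bar', `parse' and `emit'.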
|
630 |
|
631 |
|
632 File: gprof.info, Node: Symspecs, Prev: Deprecated Options, Up: Invoking |
|
633 |
|
634 4.5 Symspecs |
|
635 ============ |
|
636 |
|
637 Many of the output options allow functions to be included or excluded |
|
638 using "symspecs" (symbol specifications), which observe the following |
|
639 syntax: |
|
640 |
|
641 filename_containing_a_dot |
|
642 | funcname_not_containing_a_dot |
|
643 | linenumber |
|
644 | ( [ any_filename ] `:' ( any_funcname | linenumber ) ) |
|
645 |
|
646 Here are some sample symspecs: |
|
647 |
|
648 `main.c' |
|
649 Selects everything in file `main.c'--the dot in the string tells |
|
650 `gprof' to interpret the string as a filename, rather than as a |
|
651 function name. To select a file whose name does not contain a |
|
652 dot, a trailing colon should be specified. For example, `odd:' is |
|
653 interpreted as the file named `odd'. |
|
654 |
|
655 `main' |
|
656 Selects all functions named `main'. |
|
657 |
|
658 Note that there may be multiple instances of the same function name |
|
659 because some of the definitions may be local (i.e., static). |
|
660 Unless a function name is unique in a program, you must use the |
|
661 colon notation explained below to specify a function from a |
|
662 specific source file. |
|
663 |
|
664 Sometimes, function names contain dots. In such cases, it is |
|
665 necessary to add a leading colon to the name. For example, |
|
666 `:.mul' selects function `.mul'. |
|
667 |
|
668 In some object file formats, symbols have a leading underscore. |
|
669 `gprof' will normally not print these underscores. When you name a |
|
670 symbol in a symspec, you should type it exactly as `gprof' prints |
|
671 it in its output. For example, if the compiler produces a symbol |
|
672 `_main' from your `main' function, `gprof' still prints it as |
|
673 `main' in its output, so you should use `main' in symspecs. |
|
674 |
|
675 `main.c:main' |
|
676 Selects function `main' in file `main.c'. |
|
677 |
|
678 `main.c:134' |
|
679 Selects line 134 in file `main.c'. |
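The symspec rules above can be summarized in a few lines of code. The Python sketch below classifies a symspec string into an optional file name, function name and line number. It is written from the grammar and samples in this section, not from the `gprof' sources, so treat the names `Symspec' and `parse_symspec' and any edge-case behavior as assumptions.

```python
from typing import NamedTuple, Optional


class Symspec(NamedTuple):
    filename: Optional[str]
    function: Optional[str]
    line: Optional[int]


def parse_symspec(spec: str) -> Symspec:
    """Classify a symspec string according to the documented syntax."""
    if ":" in spec:
        # Colon form: optional file name, then a function name or line.
        filename, _, rest = spec.rpartition(":")
        if not rest:
            return Symspec(filename or None, None, None)
        if rest.isdecimal():
            return Symspec(filename or None, None, int(rest))
        return Symspec(filename or None, rest, None)
    if "." in spec:
        return Symspec(spec, None, None)        # a dot means a file name
    if spec.isdecimal():
        return Symspec(None, None, int(spec))   # a bare line number
    return Symspec(None, spec, None)            # otherwise a function name
```

For instance, `parse_symspec("main.c:134")' yields the file `main.c' with line 134, while `parse_symspec(":.mul")' yields the function `.mul' with no file name.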
|
680 |
|
681 |
|
682 File: gprof.info, Node: Output, Next: Inaccuracy, Prev: Invoking, Up: Top |
|
683 |
|
684 5 Interpreting `gprof''s Output |
|
685 ******************************* |
|
686 |
|
687 `gprof' can produce several different output styles, the most important |
|
688 of which are described below. The simplest output styles (file |
|
689 information, execution count, and function and file ordering) are not |
|
690 described here, but are documented with the respective options that |
|
691 trigger them. *Note Output Options: Output Options. |
|
692 |
|
693 * Menu: |
|
694 |
|
695 * Flat Profile:: The flat profile shows how much time was spent |
|
696 executing directly in each function. |
|
697 * Call Graph:: The call graph shows which functions called which |
|
698 others, and how much time each function used |
|
699 when its subroutine calls are included. |
|
700 * Line-by-line:: `gprof' can analyze individual source code lines |
|
701 * Annotated Source:: The annotated source listing displays source code |
|
702 labeled with execution counts |
|
703 |
|
704 |
|
705 File: gprof.info, Node: Flat Profile, Next: Call Graph, Up: Output |
|
706 |
|
707 5.1 The Flat Profile |
|
708 ==================== |
|
709 |
|
710 The "flat profile" shows the total amount of time your program spent |
|
711 executing each function. Unless the `-z' option is given, functions |
|
712 with no apparent time spent in them, and no apparent calls to them, are |
|
713 not mentioned. Note that if a function was not compiled for profiling, |
|
714 and didn't run long enough to show up on the program counter histogram, |
|
715 it will be indistinguishable from a function that was never called. |
|
716 |
|
717 This is part of a flat profile for a small program: |
|
718 |
|
719 Flat profile: |
|
720 |
|
721 Each sample counts as 0.01 seconds. |
|
722   %   cumulative   self              self     total |

723  time   seconds   seconds    calls  ms/call  ms/call  name |

724  33.34      0.02     0.02     7208     0.00     0.00  open |

725  16.67      0.03     0.01      244     0.04     0.12  offtime |

726  16.67      0.04     0.01        8     1.25     1.25  memccpy |

727  16.67      0.05     0.01        7     1.43     1.43  write |

728  16.67      0.06     0.01                             mcount |

729   0.00      0.06     0.00      236     0.00     0.00  tzset |

730   0.00      0.06     0.00      192     0.00     0.00  tolower |

731   0.00      0.06     0.00       47     0.00     0.00  strlen |

732   0.00      0.06     0.00       45     0.00     0.00  strchr |

733   0.00      0.06     0.00        1     0.00    50.00  main |

734   0.00      0.06     0.00        1     0.00     0.00  memcpy |

735   0.00      0.06     0.00        1     0.00    10.11  print |

736   0.00      0.06     0.00        1     0.00     0.00  profil |

737   0.00      0.06     0.00        1     0.00    50.00  report |

738  ... |
|
739 |
|
740 The functions are sorted first by decreasing run-time spent in them, |
|
741 then by decreasing number of calls, then alphabetically by name. The |
|
742 functions `mcount' and `profil' are part of the profiling apparatus and |
|
743 appear in every flat profile; their time gives a measure of the amount |
|
744 of overhead due to profiling. |
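
The ordering rule can be sketched in Python. This is only an
illustration, not `gprof''s own code: the tuple layout and the helper
name `flat_profile_order' are invented for this example, and treating a
blank `calls' field as zero is an assumption. The rows are a subset of
the listing above, deliberately shuffled.

```python
# Each row: (name, self_seconds, calls); calls is None when the field is blank.
rows = [
    ("report", 0.00, 1),
    ("mcount", 0.01, None),
    ("tzset", 0.00, 236),
    ("open", 0.02, 7208),
    ("write", 0.01, 7),
    ("main", 0.00, 1),
    ("offtime", 0.01, 244),
    ("memccpy", 0.01, 8),
]


def flat_profile_order(rows):
    """Sort by decreasing self time, then decreasing calls, then name."""
    # Assumption: a blank calls field (None) sorts as if it were zero.
    return sorted(rows, key=lambda r: (-r[1], -(r[2] or 0), r[0]))


for name, self_seconds, calls in flat_profile_order(rows):
    shown = "" if calls is None else calls
    print(f"{self_seconds:5.2f} {shown:>6} {name}")
```

Sorting the shuffled rows puts them back in the order in which they
appear in the example listing.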
|
745 |
|
746 Just before the column headers, a statement appears indicating how |
|
747 much time each sample counted as. This "sampling period" estimates the |
|
748 margin of error in each of the time figures. A time figure that is not |
|
749 much larger than this is not reliable. In this example, each sample |
|
750 counted as 0.01 seconds, suggesting a 100 Hz sampling rate. The |
|
751 program's total execution time was 0.06 seconds, as indicated by the |
|
752 `cumulative seconds' field. Since each sample counted for 0.01 |
|
753 seconds, this means only six samples were taken during the run. Two of |
|
754 the samples occurred while the program was in the `open' function, as |
|
755 indicated by the `self seconds' field. The other four samples |

756 occurred one each in `offtime', `memccpy', `write', and `mcount'. |
|
757 Since only six samples were taken, none of these values can be regarded |
|
758 as particularly reliable. In another run, the `self seconds' field for |
|
759 `mcount' might well be `0.00' or `0.02'. *Note Statistical Sampling |
|
760 Error: Sampling Error, for a complete discussion. |
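
The arithmetic in this paragraph can be restated as a short Python
sketch. The names `SAMPLING_PERIOD', `self_seconds' and `samples' are
chosen for this illustration only; the figures are taken from the
example listing.

```python
SAMPLING_PERIOD = 0.01  # seconds per sample, from the listing header

# `self seconds' for the functions that received samples in the example
self_seconds = {"open": 0.02, "offtime": 0.01, "memccpy": 0.01,
                "write": 0.01, "mcount": 0.01}


def samples(seconds, period=SAMPLING_PERIOD):
    """Number of histogram samples that a time figure represents."""
    return round(seconds / period)


counts = {name: samples(t) for name, t in self_seconds.items()}
total_samples = sum(counts.values())
print(counts, total_samples)
```

With so few samples, a change of a single sample moves any of these
figures by a large fraction of its value.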
|
761 |
|
762 The remaining functions in the listing (those whose `self seconds' |
|
763 field is `0.00') didn't appear in the histogram samples at all. |
|
764 However, the call graph indicated that they were called, so |
|
765 they are listed, sorted in decreasing order by the `calls' field. |
|
766 Clearly some time was spent executing these functions, but the paucity |
|
767 of histogram samples prevents any determination of how much time each |
|
768 took. |
|
769 |
|
770 Here is what the fields in each line mean: |
|
771 |
|
772 `% time' |
|
773 This is the percentage of the total execution time your program |
|
774 spent in this function. These should all add up to 100%. |
|
775 |
|
776 `cumulative seconds' |
|
777 This is the cumulative total number of seconds the computer spent |
|
778 executing this function, plus the time spent in all the functions |
|
779 above this one in this table. |
|
780 |
|
781 `self seconds' |
|
782 This is the number of seconds accounted for by this function alone. |
|
783 The flat profile listing is sorted first by this number. |
|
784 |
|
785 `calls' |
|
786 This is the total number of times the function was called. If the |
|
787 function was never called, or the number of times it was called |
|
788 cannot be determined (probably because the function was not |
|
789 compiled with profiling enabled), the "calls" field is blank. |
|
790 |
|
791 `self ms/call' |
|
792 This represents the average number of milliseconds spent in this |
|
793 function per call, if this function is profiled. Otherwise, this |
|
794 field is blank for this function. |
|
795 |
|
796 `total ms/call' |
|
797 This represents the average number of milliseconds spent in this |
|
798 function and its descendants per call, if this function is |
|
799 profiled. Otherwise, this field is blank for this function. This |
|
800 is the only field in the flat profile that uses call graph |
|
801 analysis. |
|
802 |
|
803 `name' |
|
804 This is the name of the function. The flat profile is sorted by |
|
805 this field alphabetically after the "self seconds" and "calls" |
|
806 fields are sorted. |
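
As a rough illustration of how these fields relate to one another, the
Python sketch below derives `cumulative seconds', `% time' and
`self ms/call' from the `self seconds' and `calls' columns. The function
names are hypothetical and the code does not reflect how `gprof' is
implemented; the printed figures in a real listing are rounded, so they
may differ from these results in the last digit.

```python
def cumulative_seconds(self_seconds):
    """Running total of the self seconds column, top to bottom."""
    totals, running = [], 0.0
    for t in self_seconds:
        running += t
        totals.append(round(running, 2))
    return totals


def percent_time(self_seconds):
    """Share of the total run time spent in each function."""
    total = sum(self_seconds)
    return [100.0 * t / total for t in self_seconds]


def self_ms_per_call(seconds, calls):
    """Average milliseconds per call; None models a blank field."""
    if not calls:
        return None
    return 1000.0 * seconds / calls


# self seconds for open, offtime, memccpy, write, mcount in the example
column = [0.02, 0.01, 0.01, 0.01, 0.01]
print(cumulative_seconds(column))
print(percent_time(column))
print(self_ms_per_call(0.01, 8))   # memccpy: 0.01 seconds over 8 calls
```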
|
807 |
|
808 |
|
809 File: gprof.info, Node: Call Graph, Next: Line-by-line, Prev: Flat Profile, Up: Output |
|
810 |
|
811 5.2 The Call Graph |
|
812 ================== |
|
813 |
|
814 The "call graph" shows how much time was spent in each function and its |
|
815 children. From this information, you can find functions that, while |
|
816 they themselves may not have used much time, called other functions |
|
817 that did use unusual amounts of time. |
|
818 |
|
819 Here is a sample call graph from a small program. It came from the |
|
820 same `gprof' run as the flat profile example in the previous section. |
|
821 |
|
822 granularity: each sample hit covers 2 byte(s) for 20.00% of 0.05 seconds |
|
823 |
|
824 index % time    self  children    called     name |

825                                                  <spontaneous> |

826 [1]    100.0    0.00    0.05                 start [1] |

827                 0.00    0.05       1/1           main [2] |

828                 0.00    0.00       1/2           on_exit [28] |

829                 0.00    0.00       1/1           exit [59] |

830 ----------------------------------------------- |

831                 0.00    0.05       1/1           start [1] |

832 [2]    100.0    0.00    0.05       1         main [2] |

833                 0.00    0.05       1/1           report [3] |

834 ----------------------------------------------- |

835                 0.00    0.05       1/1           main [2] |

836 [3]    100.0    0.00    0.05       1         report [3] |

837                 0.00    0.03       8/8           timelocal [6] |

838                 0.00    0.01       1/1           print [9] |

839                 0.00    0.01       9/9           fgets [12] |

840                 0.00    0.00      12/34          strncmp <cycle 1> [40] |

841                 0.00    0.00       8/8           lookup [20] |

842                 0.00    0.00       1/1           fopen [21] |

843                 0.00    0.00       8/8           chewtime [24] |

844                 0.00    0.00       8/16          skipspace [44] |

845 ----------------------------------------------- |

846 [4]     59.8    0.01    0.02       8+472     <cycle 2 as a whole> [4] |

847                 0.01    0.02     244+260         offtime <cycle 2> [7] |

848                 0.00    0.00     236+1           tzset <cycle 2> [26] |

849 ----------------------------------------------- |
|
850 |
|
851 The lines full of dashes divide this table into "entries", one for |
|
852 each function. Each entry has one or more lines. |
|
853 |
|
854 In each entry, the primary line is the one that starts with an index |
|
855 number in square brackets. The end of this line says which function |
|
856 the entry is for. The preceding lines in the entry describe the |
|
857 callers of this function and the following lines describe its |
|
858 subroutines (also called "children" when we speak of the call graph). |
|
859 |
|
860 The entries are sorted by time spent in the function and its |
|
861 subroutines. |
|
862 |
|
863 The internal profiling function `mcount' (*note The Flat Profile: |
|
864 Flat Profile.) is never mentioned in the call graph. |
|
865 |
|
866 * Menu: |
|
867 |
|
868 * Primary:: Details of the primary line's contents. |
|
869 * Callers:: Details of caller-lines' contents. |
|
870 * Subroutines:: Details of subroutine-lines' contents. |
|
871 * Cycles:: When there are cycles of recursion, |
|
872 such as `a' calls `b' calls `a'... |
|
873 |
|
874 |
|
875 File: gprof.info, Node: Primary, Next: Callers, Up: Call Graph |
|
876 |
|
877 5.2.1 The Primary Line |
|
878 ---------------------- |
|
879 |
|
880 The "primary line" in a call graph entry is the line that describes the |
|
881 function which the entry is about and gives the overall statistics for |
|
882 this function. |
|
883 |
|
884 For reference, we repeat the primary line from the entry for function |
|
885 `report' in our main example, together with the heading line that shows |
|
886 the names of the fields: |
|
887 |
|
888 index % time    self  children    called     name |

889 ... |

890 [3]    100.0    0.00    0.05       1         report [3] |
|
891 |
|
892 Here is what the fields in the primary line mean: |
|
893 |
|
894 `index' |
|
895 Entries are numbered with consecutive integers. Each function |
|
896 therefore has an index number, which appears at the beginning of |
|
897 its primary line. |
|
898 |
|
899 Each cross-reference to a function, as a caller or subroutine of |
|
900 another, gives its index number as well as its name. The index |
|
901 number guides you if you wish to look for the entry for that |
|
902 function. |
|
903 |
|
904 `% time' |
|
905 This is the percentage of the total time that was spent in this |
|
906 function, including time spent in subroutines called from this |
|
907 function. |
|
908 |
|
909 The time spent in this function is counted again for the callers of |
|
910 this function. Therefore, adding up these percentages is |
|
911 meaningless. |
|
912 |
|
913 `self' |
|
914 This is the total amount of time spent in this function. This |
|
915 should be identical to the number printed in the `self seconds' field |
|
916 for this function in the flat profile. |
|
917 |
|
918 `children' |
|
919 This is the total amount of time spent in the subroutine calls |
|
920 made by this function. This should be equal to the sum of all the |
|
921 `self' and `children' entries of the children listed directly |
|
922 below this function. |
|
923 |
|
924 `called' |
|
925 This is the number of times the function was called. |
|
926 |
|
927 If the function called itself recursively, there are two numbers, |
|
928 separated by a `+'. The first number counts non-recursive calls, |
|
929 and the second counts recursive calls. |
|
930 |
|
931 In the example above, the function `report' was called once from |
|
932 `main'. |
|
933 |
|
934 `name' |
|
935 This is the name of the current function. The index number is |
|
936 repeated after it. |
|
937 |
|
938 If the function is part of a cycle of recursion, the cycle number |
|
939 is printed between the function's name and the index number (*note |
|
940 How Mutually Recursive Functions Are Described: Cycles.). For |
|
941 example, if function `gnurr' is part of cycle number one, and has |
|
942 index number twelve, its primary line would end like this: |
|
943 |
|
944 gnurr <cycle 1> [12] |
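
The relationship stated for the `children' field can be checked by hand
against the entry for `report' in the main example. The Python sketch
below does the addition; the tuple layout and the name `children_time'
are assumptions made for this illustration, and because the printed
figures are rounded, an exact match should not be expected in general.

```python
# Subroutine lines from the entry for `report': (name, self, children)
report_subroutines = [
    ("timelocal", 0.00, 0.03),
    ("print", 0.00, 0.01),
    ("fgets", 0.00, 0.01),
    ("strncmp", 0.00, 0.00),
    ("lookup", 0.00, 0.00),
    ("fopen", 0.00, 0.00),
    ("chewtime", 0.00, 0.00),
    ("skipspace", 0.00, 0.00),
]


def children_time(subroutine_lines):
    """Sum the self and children fields of the lines below the primary line."""
    return sum(self_t + child_t for _, self_t, child_t in subroutine_lines)


# Compare with the `children' field on the primary line for `report'.
print(round(children_time(report_subroutines), 2))
```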
|
945 |
|
946 |
|
947 File: gprof.info, Node: Callers, Next: Subroutines, Prev: Primary, Up: Call Graph |
|
948 |
|
949 5.2.2 Lines for a Function's Callers |
|
950 ------------------------------------ |
|
951 |
|
952 A function's entry has a line for each function it was called by. |
|
953 These lines' fields correspond to the fields of the primary line, but |
|
954 their meanings are different because of the difference in context. |
|
955 |
|
956 For reference, we repeat two lines from the entry for the function |
|
957 `report', the primary line and one caller-line preceding it, together |
|
958 with the heading line that shows the names of the fields: |
|
959 |
|
960 index % time    self  children    called     name |

961 ... |

962                 0.00    0.05       1/1           main [2] |

963 [3]    100.0    0.00    0.05       1         report [3] |
|
964 |
|
965 Here are the meanings of the fields in the caller-line for `report' |
|
966 called from `main': |
|
967 |
|
968 `self' |
|
969 An estimate of the amount of time spent in `report' itself when it |
|
970 was called from `main'. |
|
971 |
|
972 `children' |
|
973 An estimate of the amount of time spent in subroutines of `report' |
|
974 when `report' was called from `main'. |
|
975 |
|
976 The sum of the `self' and `children' fields is an estimate of the |
|
977 amount of time spent within calls to `report' from `main'. |
|
978 |
|
979 `called' |
|
980 Two numbers: the number of times `report' was called from `main', |
|
981 followed by the total number of non-recursive calls to `report' |
|
982 from all its callers. |
|
983 |
|
984 `name and index number' |
|
985 The name of the caller of `report' to which this line applies, |
|
986 followed by the caller's index number. |
|
987 |
|
988 Not all functions have entries in the call graph; some options to |
|
989 `gprof' request the omission of certain functions. When a caller |
|
990 has no entry of its own, it still has caller-lines in the entries |
|
991 of the functions it calls. |
|
992 |
|
993 If the caller is part of a recursion cycle, the cycle number is |
|
994 printed between the name and the index number. |
|
995 |
|
996 If the identity of the callers of a function cannot be determined, a |
|
997 dummy caller-line is printed which has `<spontaneous>' as the "caller's |
|
998 name" and all other fields blank. This can happen for signal handlers. |
|
999 |
|
1000 |
|
1001 File: gprof.info, Node: Subroutines, Next: Cycles, Prev: Callers, Up: Call Graph |
|
1002 |
|
1003 5.2.3 Lines for a Function's Subroutines |
|
1004 ---------------------------------------- |
|
1005 |
|
1006 A function's entry has a line for each of its subroutines--in other |
|
1007 words, a line for each other function that it called. These lines' |
|
1008 fields correspond to the fields of the primary line, but their meanings |
|
1009 are different because of the difference in context. |
|
1010 |
|
1011 For reference, we repeat two lines from the entry for the function |
|
1012 `main', the primary line and a line for a subroutine, together with the |
|
1013 heading line that shows the names of the fields: |
|
1014 |
|
1015 index % time    self  children    called     name |

1016 ... |

1017 [2]    100.0    0.00    0.05       1         main [2] |

1018                 0.00    0.05       1/1           report [3] |
|
1019 |
|
1020 Here are the meanings of the fields in the subroutine-line for `main' |
|
1021 calling `report': |
|
1022 |
|
1023 `self' |
|
1024 An estimate of the amount of time spent directly within `report' |
|
1025 when `report' was called from `main'. |
|
1026 |
|
1027 `children' |
|
1028 An estimate of the amount of time spent in subroutines of `report' |
|
1029 when `report' was called from `main'. |
|
1030 |
|
1031 The sum of the `self' and `children' fields is an estimate of the |
|
1032 total time spent in calls to `report' from `main'. |
|
1033 |
|
1034 `called' |
|
1035 Two numbers, the number of calls to `report' from `main' followed |
|
1036 by the total number of non-recursive calls to `report'. This |
|
1037 ratio is used to determine how much of `report''s `self' and |
|
1038 `children' time gets credited to `main'. *Note Estimating |
|
1039 `children' Times: Assumptions. |
|
1040 |
|
1041 `name' |
|
1042 The name of the subroutine of `main' to which this line applies, |
|
1043 followed by the subroutine's index number. |
|
1044 |
|
1045 If the subroutine is part of a recursion cycle, the cycle number is |
|
1046 printed between the name and the index number. |
|
1047 |
|
1048 |
|
1049 File: gprof.info, Node: Cycles, Prev: Subroutines, Up: Call Graph |
|
1050 |
|
1051 5.2.4 How Mutually Recursive Functions Are Described |
|
1052 ---------------------------------------------------- |
|
1053 |
|
1054 The graph may be complicated by the presence of "cycles of recursion" |
|
1055 in the call graph. A cycle exists if a function calls another function |
|
1056 that (directly or indirectly) calls (or appears to call) the original |
|
1057 function. For example: if `a' calls `b', and `b' calls `a', then `a' |
|
1058 and `b' form a cycle. |
|
1059 |
|
1060 Whenever there are call paths both ways between a pair of functions, |
|
1061 they belong to the same cycle. If `a' and `b' call each other and `b' |
|
1062 and `c' call each other, all three make one cycle. Note that even if |
|
1063 `b' only calls `a' if it was not called from `a', `gprof' cannot |
|
1064 determine this, so `a' and `b' are still considered a cycle. |
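
The grouping rule can be sketched in Python as mutual reachability in
the call graph. This is a simplified illustration, not the algorithm
`gprof' uses internally; the function names and the dictionary
representation of call arcs are invented here, and the sketch assumes
that a function whose only recursion is a direct call to itself does
not form a cycle.

```python
def reachable(graph, start):
    """Return every function reachable from start by one or more calls."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        func = stack.pop()
        if func not in seen:
            seen.add(func)
            stack.extend(graph.get(func, ()))
    return seen


def find_cycles(graph):
    """Group functions that have call paths both ways between them."""
    funcs = set(graph) | {f for callees in graph.values() for f in callees}
    reach = {f: reachable(graph, f) for f in funcs}
    cycles, assigned = [], set()
    for f in sorted(funcs):
        if f in assigned:
            continue
        members = {g for g in funcs if g in reach[f] and f in reach[g]}
        members.add(f)
        if len(members) > 1:  # assumption: lone self-recursion is not a cycle
            cycles.append(sorted(members))
        assigned |= members
    return cycles


# Call arcs: `a' and `b' call each other, and `b' and `c' call each other.
call_graph = {"main": ["a"], "a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(find_cycles(call_graph))
```

Because only the existence of an arc is recorded, the sketch, like
`gprof', cannot tell whether `b' calls `a' only when it was not itself
called from `a'.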
|
1065 |
|
1066 The cycles are numbered with consecutive integers. When a function |
|
1067 belongs to a cycle, each time the function name appears in the call |
|
1068 graph it is followed by `<cycle NUMBER>'. |
|
1069 |
|
1070 The reason cycles matter is that they make the time values in the |
|
1071 call graph paradoxical. The "time spent in children" of `a' should |
|
1072 include the time spent in its subroutine `b' and in `b''s |
|
1073 subroutines--but one of `b''s subroutines is `a'! How much of `a''s |
|
1074 time should be included in the children of `a', when `a' is indirectly |
|
1075 recursive? |
|
1076 |
|
1077 The way `gprof' resolves this paradox is by creating a single entry |
|
1078 for the cycle as a whole. The primary line of this entry describes the |
|
1079 total time spent directly in the functions of the cycle. The |
|
1080 "subroutines" of the cycle are the individual functions of the cycle, |
|
1081 and all other functions that were called directly by them. The |
|
1082 "callers" of the cycle are the functions, outside the cycle, that |
|
1083 called functions in the cycle. |
|
1084 |
|
1085 Here is an example portion of a call graph which shows a cycle |
|
1086 containing functions `a' and `b'. The cycle was entered by a call to |
|
1087 `a' from `main'; both `a' and `b' called `c'. |
|
1088 |
|
1089 index  % time    self  children called     name |

1090 ---------------------------------------- |

1091                  1.77        0    1/1        main [2] |

1092 [3]     91.71    1.77        0    1+5    <cycle 1 as a whole> [3] |

1093                  1.02        0    3          b <cycle 1> [4] |

1094                  0.75        0    2          a <cycle 1> [5] |

1095 ---------------------------------------- |

1096                                   3          a <cycle 1> [5] |

1097 [4]     52.85    1.02        0    0      b <cycle 1> [4] |

1098                                   2          a <cycle 1> [5] |

1099                     0        0    3/6        c [6] |

1100 ---------------------------------------- |

1101                  1.77        0    1/1        main [2] |

1102                                   2          b <cycle 1> [4] |

1103 [5]     38.86    0.75        0    1      a <cycle 1> [5] |

1104                                   3          b <cycle 1> [4] |

1105                     0        0    3/6        c [6] |

1106 ---------------------------------------- |
|
1107 |
|
1108 (The entire call graph for this program contains in addition an entry |
|
1109 for `main', which calls `a', and an entry for `c', with callers `a' and |
|
1110 `b'.) |
|
1111 |
|
1112 index  % time    self  children called     name |

1113                                              <spontaneous> |

1114 [1]    100.00       0     1.93    0      start [1] |

1115                  0.16     1.77    1/1        main [2] |

1116 ---------------------------------------- |

1117                  0.16     1.77    1/1        start [1] |

1118 [2]    100.00    0.16     1.77    1      main [2] |

1119                  1.77        0    1/1        a <cycle 1> [5] |

1120 ---------------------------------------- |

1121                  1.77        0    1/1        main [2] |

1122 [3]     91.71    1.77        0    1+5    <cycle 1 as a whole> [3] |

1123                  1.02        0    3          b <cycle 1> [4] |

1124                  0.75        0    2          a <cycle 1> [5] |

1125                     0        0    6/6        c [6] |

1126 ---------------------------------------- |

1127                                   3          a <cycle 1> [5] |

1128 [4]     52.85    1.02        0    0      b <cycle 1> [4] |

1129                                   2          a <cycle 1> [5] |

1130                     0        0    3/6        c [6] |

1131 ---------------------------------------- |

1132                  1.77        0    1/1        main [2] |

1133                                   2          b <cycle 1> [4] |

1134 [5]     38.86    0.75        0    1      a <cycle 1> [5] |

1135                                   3          b <cycle 1> [4] |

1136                     0        0    3/6        c [6] |

1137 ---------------------------------------- |

1138                     0        0    3/6        b <cycle 1> [4] |

1139                     0        0    3/6        a <cycle 1> [5] |

1140 [6]      0.00       0        0    6      c [6] |

1141 ---------------------------------------- |
|
1142 |
|
1143 The `self' field of the cycle's primary line is the total time spent |
|
1144 in all the functions of the cycle. It equals the sum of the `self' |
|
1145 fields for the individual functions in the cycle, found in the |

1146 subroutine lines for these functions in the cycle's entry. |
|
1147 |
|
1148 The `children' fields of the cycle's primary line and subroutine |
|
1149 lines count only subroutines outside the cycle. Even though `a' calls |
|
1150 `b', the time spent in those calls to `b' is not counted in `a''s |
|
1151 `children' time. Thus, we do not encounter the problem of what to do |
|
1152 when the time in those calls to `b' includes indirect recursive calls |
|
1153 back to `a'. |
|
1154 |
|
1155 The `children' field of a caller-line in the cycle's entry estimates |
|
1156 the amount of time spent _in the whole cycle_, and its other |
|
1157 subroutines, on the times when that caller called a function in the |
|
1158 cycle. |
|
1159 |
|
1160 The `called' field in the primary line for the cycle has two numbers: |
|
1161 first, the number of times functions in the cycle were called by |
|
1162 functions outside the cycle; second, the number of times they were |
|
1163 called by functions in the cycle (including times when a function in |
|
1164 the cycle calls itself). This is a generalization of the usual split |
|
1165 into non-recursive and recursive calls. |
|
1166 |
|
1167 The `called' field of a subroutine-line for a cycle member in the |
|
1168 cycle's entry says how many times that function was called from |
|
1169 functions in the cycle. The total of all these is the second number in |
|
1170 the primary line's `called' field. |
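
These totals can be checked against the example entry for cycle 1. The
Python sketch below uses figures copied from that entry; the variable
names are chosen for illustration only.

```python
# Subroutine lines for cycle members in the cycle's entry:
# (name, self seconds, calls from functions inside the cycle)
cycle_members = [("b", 1.02, 3), ("a", 0.75, 2)]

cycle_self = sum(self_t for _, self_t, _ in cycle_members)
calls_from_inside = sum(calls for _, _, calls in cycle_members)
calls_from_outside = 1  # the single call to `a' from `main'

# Compare with the primary line of the cycle: self 1.77, called 1+5
print(f"self {cycle_self:.2f}, called {calls_from_outside}+{calls_from_inside}")
```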
|
1171 |
|
1172 In the individual entry for a function in a cycle, the other |
|
1173 functions in the same cycle can appear as subroutines and as callers. |
|
1174 These lines show how many times each function in the cycle called or |
|
1175 was called from each other function in the cycle. The `self' and |
|
1176 `children' fields in these lines are blank because of the difficulty of |
|
1177 defining meanings for them when recursion is going on. |
|
1178 |
|
1179 |
|
1180 File: gprof.info, Node: Line-by-line, Next: Annotated Source, Prev: Call Graph, Up: Output |
|
1181 |
|
1182 5.3 Line-by-line Profiling |
|
1183 ========================== |
|
1184 |
|
1185 `gprof''s `-l' option causes the program to perform "line-by-line" |
|
1186 profiling. In this mode, histogram samples are assigned not to |
|
1187 functions, but to individual lines of source code. This only works |
|
1188 with programs compiled with older versions of the `gcc' compiler. |
|
1189 Newer versions of `gcc' use a different program - `gcov' - to display |
|
1190 line-by-line profiling information. |
|
1191 |
|
1192 With the older versions of `gcc' the program usually has to be |
|
1193 compiled with a `-g' option, in addition to `-pg', in order to generate |
|
1194 debugging symbols for tracking source code lines. Note, in much older |
|
1195 versions of `gcc' the program had to be compiled with the `-a' command |
|
1196 line option as well. |
|
1197 |
|
1198 The flat profile is the most useful output table in line-by-line |
|
1199 mode. The call graph isn't as useful as normal, since the current |
|
1200 version of `gprof' does not propagate call graph arcs from source code |
|
1201 lines to the enclosing function. The call graph does, however, show |
|
1202 each line of code that called each function, along with a count. |
|
1203 |
|
1204 Here is a section of `gprof''s output, without line-by-line |
|
1205 profiling. Note that `ct_init' accounted for four histogram hits, and |
|
1206 13327 calls to `init_block'. |
|
1207 |
|
1208 Flat profile: |
|
1209 |
|
1210 Each sample counts as 0.01 seconds. |
|
1211   %   cumulative   self              self     total |

1212  time   seconds   seconds    calls  us/call  us/call  name |

1213  30.77      0.13     0.04     6335     6.31     6.31  ct_init |
|
1214 |
|
1215 |
|
1216 Call graph (explanation follows) |
|
1217 |
|
1218 |
|
1219 granularity: each sample hit covers 4 byte(s) for 7.69% of 0.13 seconds |
|
1220 |
|
1221 index % time    self  children    called     name |

1222 |

1223                 0.00    0.00       1/13496       name_too_long |

1224                 0.00    0.00      40/13496       deflate |

1225                 0.00    0.00     128/13496       deflate_fast |

1226                 0.00    0.00   13327/13496       ct_init |

1227 [7]      0.0    0.00    0.00   13496         init_block |
|
1228 |
|
1229 Now let's look at some of `gprof''s output from the same program run, |
|
1230 this time with line-by-line profiling enabled. Note that `ct_init''s |
|
1231 four histogram hits are broken down into four lines of source code--one |
|
1232 hit occurred on each of lines 349, 351, 382 and 385. In the call graph, |
|
1233 note how `ct_init''s 13327 calls to `init_block' are broken down into |
|
1234 one call from line 396, 3071 calls from line 384, 3730 calls from line |
|
1235 385, and 6525 calls from line 387. |
|
1236 |
|
1237 Flat profile: |
|
1238 |
|
1239 Each sample counts as 0.01 seconds. |
|
1240   %   cumulative   self |

1241  time   seconds   seconds    calls  name |

1242   7.69      0.10     0.01           ct_init (trees.c:349) |

1243   7.69      0.11     0.01           ct_init (trees.c:351) |

1244   7.69      0.12     0.01           ct_init (trees.c:382) |

1245   7.69      0.13     0.01           ct_init (trees.c:385) |
|
1246 |
|
1247 |
|
1248 Call graph (explanation follows) |
|
1249 |
|
1250 |
|
1251 granularity: each sample hit covers 4 byte(s) for 7.69% of 0.13 seconds |
|
1252 |
|
1253       % time    self  children    called     name |

1254 |

1255                 0.00    0.00       1/13496       name_too_long (gzip.c:1440) |

1256                 0.00    0.00       1/13496       deflate (deflate.c:763) |

1257                 0.00    0.00       1/13496       ct_init (trees.c:396) |

1258                 0.00    0.00       2/13496       deflate (deflate.c:727) |

1259                 0.00    0.00       4/13496       deflate (deflate.c:686) |

1260                 0.00    0.00       5/13496       deflate (deflate.c:675) |

1261                 0.00    0.00      12/13496       deflate (deflate.c:679) |

1262                 0.00    0.00      16/13496       deflate (deflate.c:730) |

1263                 0.00    0.00     128/13496       deflate_fast (deflate.c:654) |

1264                 0.00    0.00    3071/13496       ct_init (trees.c:384) |

1265                 0.00    0.00    3730/13496       ct_init (trees.c:385) |

1266                 0.00    0.00    6525/13496       ct_init (trees.c:387) |

1267 [6]      0.0    0.00    0.00   13496         init_block (trees.c:408) |
|
1268 |
|
1269 |
|
1270 File: gprof.info, Node: Annotated Source, Prev: Line-by-line, Up: Output |
|
1271 |
|
1272 5.4 The Annotated Source Listing |
|
1273 ================================ |
|
1274 |
|
1275 `gprof''s `-A' option triggers an annotated source listing, which lists |
|
1276 the program's source code, each function labeled with the number of |
|
1277 times it was called. You may also need to specify the `-I' option, if |
|
1278 `gprof' can't find the source code files. |
|
1279 |
|
1280 With older versions of `gcc', compiling with `gcc ... -g -pg -a' |
|
1281 augments your program with basic-block counting code, in addition to |
|
1282 function counting code. This enables `gprof' to determine how many |
|
1283 times each line of code was executed. With newer versions of `gcc', |
|
1284 support for displaying basic-block counts is provided by the `gcov' |
|
1285 program. |
|
1286 |
|
1287 For example, consider the following function, taken from gzip, with |
|
1288 line numbers added: |
|
1289 |
|
1290  1 ulg updcrc(s, n) |

1291  2     uch *s; |

1292  3     unsigned n; |

1293  4 { |

1294  5     register ulg c; |

1295  6 |

1296  7     static ulg crc = (ulg)0xffffffffL; |

1297  8 |

1298  9     if (s == NULL) { |

1299 10         c = 0xffffffffL; |

1300 11     } else { |

1301 12         c = crc; |

1302 13         if (n) do { |

1303 14             c = crc_32_tab[...]; |

1304 15         } while (--n); |

1305 16     } |

1306 17     crc = c; |

1307 18     return c ^ 0xffffffffL; |

1308 19 } |
|
1309 |
|
1310 `updcrc' has at least five basic-blocks. One is the function |
|
1311 itself. The `if' statement on line 9 generates two more basic-blocks, |
|
1312 one for each branch of the `if'. A fourth basic-block results from the |
|
1313 `if' on line 13, and the contents of the `do' loop form the fifth |
|
1314 basic-block. The compiler may also generate additional basic-blocks to |
|
1315 handle various special cases. |
|
1316 |
|
1317 A program augmented for basic-block counting can be analyzed with |
|
1318 `gprof -l -A'. The `-x' option is also helpful, to ensure that each |
|
1319 line of code is labeled at least once. Here is `updcrc''s annotated |
|
1320 source listing for a sample `gzip' run: |
|
1321 |
|
1322                 ulg updcrc(s, n) |

1323                     uch *s; |

1324                     unsigned n; |

1325             2 ->{ |

1326                     register ulg c; |

1327 |

1328                     static ulg crc = (ulg)0xffffffffL; |

1329 |

1330             2 ->    if (s == NULL) { |

1331             1 ->        c = 0xffffffffL; |

1332             1 ->    } else { |

1333             1 ->        c = crc; |

1334             1 ->        if (n) do { |

1335         26312 ->            c = crc_32_tab[...]; |

1336 26312,1,26311 ->        } while (--n); |

1337                     } |

1338             2 ->    crc = c; |

1339             2 ->    return c ^ 0xffffffffL; |

1340             2 ->} |
|
1341 |
|
1342 In this example, the function was called twice, passing once through |
|
1343 each branch of the `if' statement. The body of the `do' loop was |
|
1344 executed a total of 26312 times. Note how the `while' statement is |
|
1345 annotated. It began execution 26312 times, once for each iteration |
|
1346 through the loop. One of those times (the last time) it exited, while |
|
1347 it branched back to the beginning of the loop 26311 times. |
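
The three counts on the `while' line can be reproduced with a small
Python simulation of the loop's control flow. It is only a sketch: the
function name `do_while_counts' is hypothetical, it assumes that N is
greater than zero on entry (the `if (n)' test on line 13 guards the
loop), and it models only the counters, not the checksum computation.

```python
def do_while_counts(n):
    """Count executions for `if (n) do { body } while (--n);' with n > 0."""
    body = tests = exits = back_branches = 0
    while True:
        body += 1                # loop body (line 14 of the listing)
        tests += 1               # the `while (--n)' test begins execution
        n -= 1
        if n:
            back_branches += 1   # branch back to the top of the loop
        else:
            exits += 1           # fall out of the loop
            break
    return body, tests, exits, back_branches


print(do_while_counts(26312))
```

The last three values returned correspond to the three numbers printed
on the `while' line of the annotated listing.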
|
1348 |
|
1349 |
|
1350 File: gprof.info, Node: Inaccuracy, Next: How do I?, Prev: Output, Up: Top |
|
1351 |
|
1352 6 Inaccuracy of `gprof' Output |
|
1353 ****************************** |
|
1354 |
|
1355 * Menu: |
|
1356 |
|
1357 * Sampling Error:: Statistical margins of error |
|
1358 * Assumptions:: Estimating children times |
|
1359 |
|
1360 |
|
1361 File: gprof.info, Node: Sampling Error, Next: Assumptions, Up: Inaccuracy |
|
1362 |
|
1363 6.1 Statistical Sampling Error |
|
1364 ============================== |
|
1365 |
|
1366 The run-time figures that `gprof' gives you are based on a sampling |
|
1367 process, so they are subject to statistical inaccuracy. If a function |
|
1368 runs only a small amount of time, so that on the average the sampling |
|
1369 process ought to catch that function in the act only once, there is a |
|
1370 pretty good chance it will actually find that function zero times, or |
|
1371 twice. |
|
1372 |
|
1373 By contrast, the number-of-calls and basic-block figures are derived |
|
1374 by counting, not sampling. They are completely accurate and will not |
|
1375 vary from run to run if your program is deterministic. |
|
1376 |
|
1377 The "sampling period" that is printed at the beginning of the flat |
|
1378 profile says how often samples are taken. The rule of thumb is that a |
|
1379 run-time figure is accurate if it is considerably bigger than the |
|
1380 sampling period. |
|
1381 |
|
1382 The actual amount of error can be predicted. For N samples, the |
|
1383 _expected_ error is the square-root of N. For example, if the sampling |
|
1384 period is 0.01 seconds and `foo''s run-time is 1 second, N is 100 |
|
1385 samples (1 second/0.01 seconds), sqrt(N) is 10 samples, so the expected |
|
1386 error in `foo''s run-time is 0.1 seconds (10*0.01 seconds), or ten |
|
1387 percent of the observed value. Again, if the sampling period is 0.01 |
|
1388 seconds and `bar''s run-time is 100 seconds, N is 10000 samples, |
|
1389 sqrt(N) is 100 samples, so the expected error in `bar''s run-time is 1 |
|
1390 second, or one percent of the observed value. It is likely to vary |
|
1391 this much _on the average_ from one profiling run to the next. |
|
1392 (_Sometimes_ it will vary more.) |
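
The two worked examples can be reproduced with a few lines of Python.
The function name `expected_error' is an assumption made for this
sketch; the formula is the one given above (the square root of the
number of samples, multiplied by the sampling period).

```python
import math


def expected_error(run_time, sampling_period):
    """Return (error in seconds, error as a fraction of run_time)."""
    n = run_time / sampling_period          # number of samples
    error_seconds = math.sqrt(n) * sampling_period
    return error_seconds, error_seconds / run_time


for name, run_time in (("foo", 1.0), ("bar", 100.0)):
    seconds, fraction = expected_error(run_time, 0.01)
    print(f"{name}: +/- {seconds:.2f} s ({fraction:.0%})")
```

Note that the relative error shrinks as the run-time grows, which is
why longer runs give more trustworthy figures.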
|
1393 |
|
1394 This does not mean that a small run-time figure is devoid of |
|
1395 information. If the program's _total_ run-time is large, a small |
|
1396 run-time for one function does tell you that that function used an |
|
1397 insignificant fraction of the whole program's time. Usually this means |
|
1398 it is not worth optimizing. |
|
1399 |
|
1400 One way to get more accuracy is to give your program more (but |
|
1401 similar) input data so it will take longer. Another way is to combine |
|
1402 the data from several runs, using the `-s' option of `gprof'. Here is |
|
1403 how: |
|
1404 |
|
1405 1. Run your program once. |
|
1406 |
|
1407 2. Issue the command `mv gmon.out gmon.sum'. |
|
1408 |
|
1409 3. Run your program again, the same as before. |
|
1410 |
|
1411 4. Merge the new data in `gmon.out' into `gmon.sum' with this command: |
|
1412 |
|
1413 gprof -s EXECUTABLE-FILE gmon.out gmon.sum |
|
1414 |
|
1415 5. Repeat the last two steps as often as you wish. |
|
1416 |
|
1417 6. Analyze the cumulative data using this command: |
|
1418 |
|
1419 gprof EXECUTABLE-FILE gmon.sum > OUTPUT-FILE |
|
1420 |
|
1421 |
|
1422 File: gprof.info, Node: Assumptions, Prev: Sampling Error, Up: Inaccuracy |
|
1423 |
|
1424 6.2 Estimating `children' Times |
|
1425 =============================== |
|
1426 |
|
1427 Some of the figures in the call graph are estimates--for example, the |
|
1428 `children' time values and all the time figures in caller and |
|
1429 subroutine lines. |
|
1430 |
|
1431 There is no direct information about these measurements in the |
|
1432 profile data itself. Instead, `gprof' estimates them by making an |
|
1433 assumption about your program that might or might not be true. |
|
1434 |
|
1435 The assumption made is that the average time spent in each call to |
|
1436 any function `foo' is not correlated with who called `foo'. If `foo' |
|
1437 used 5 seconds in all, and 2/5 of the calls to `foo' came from `a', |
|
1438 then `foo' contributes 2 seconds to `a''s `children' time, by |
|
1439 assumption. |
|
1440 |
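A minimal sketch of this apportioning rule, in Python, may make the
assumption concrete.  The function name and the caller name `others'
are hypothetical, and the sketch ignores everything else `gprof' does
when propagating time; it only splits one function's time among its
callers in proportion to their call counts.

```python
def estimate_children_time(callee_time, calls_by_caller):
    """Split callee_time (seconds) among callers by share of calls.

    calls_by_caller maps a caller's name to the number of calls it
    made to the callee.  Every call is assumed to cost the same.
    """
    total_calls = sum(calls_by_caller.values())
    return {
        caller: callee_time * calls / total_calls
        for caller, calls in calls_by_caller.items()
    }


# `foo' used 5 seconds in all; 2/5 of the calls came from `a'.
shares = estimate_children_time(5.0, {"a": 2, "others": 3})
print(shares)
```

Note that only call counts enter the computation: nothing in the input
says how long any individual call took.
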
|
1441 This assumption is usually true enough, but for some programs it is |
|
1442 far from true. Suppose that `foo' returns very quickly when its |
|
1443 argument is zero; suppose that `a' always passes zero as an argument, |
|
1444 while other callers of `foo' pass other arguments. In this program, |
|
1445 all the time spent in `foo' is in the calls from callers other than `a'. |
|
1446 But `gprof' has no way of knowing this; it will blindly and incorrectly |
|
1447 charge 2 seconds of time in `foo' to the children of `a'. |
|
1448 |
|
1449 We hope some day to put more complete data into `gmon.out', so that |
|
1450 this assumption is no longer needed, if we can figure out how. For the |
|
1451 novice, the estimated figures are usually more useful than misleading. |
|
1452 |
|
1453 |
|
1454 File: gprof.info, Node: How do I?, Next: Incompatibilities, Prev: Inaccuracy, Up: Top |
|
1455 |
|
1456 7 Answers to Common Questions |
|
1457 ***************************** |
|
1458 |
|
1459 How can I get more exact information about hot spots in my program? |
|
1460 Looking at the per-line call counts only tells part of the story. |
|
1461 Because `gprof' can only report call times and counts by function, |
|
1462 the best way to get finer-grained information on where the program |
|
1463 is spending its time is to re-factor large functions into sequences |
|
1464 of calls to smaller ones. Beware, however, that this can introduce |
|
1465 artificial hot spots since compiling with `-pg' adds a significant |
|
1466 overhead to function calls. An alternative solution is to use a |
|
1467 non-intrusive profiler, e.g. oprofile. |
|
1468 |
|
1469 How do I find which lines in my program were executed the most times? |
|
1470 Use the `gcov' program. |
|
1471 |
|
1472 How do I find which lines in my program called a particular function? |
|
1473 Use `gprof -l' and look up the function in the call graph. The |
|
1474 callers will be broken down by function and line number. |
|
1475 |
|
1476 How do I analyze a program that runs for less than a second? |
|
1477 Try using a shell script like this one: |
|
1478 |
|
1479 for i in `seq 1 100`; do |
|
1480 fastprog |
|
1481 mv gmon.out gmon.out.$i |
|
1482 done |
|
1483 |
|
1484 gprof -s fastprog gmon.out.* |
|
1485 |
|
1486 gprof fastprog gmon.sum |
|
1487 |
|
1488 If your program is completely deterministic, all the call counts |
|
1489 will be simple multiples of 100 (i.e., a function called once in |
|
1490 each run will appear with a call count of 100). |
|
1491 |
|
1492 |
|
1493 |
|
1494 File: gprof.info, Node: Incompatibilities, Next: Details, Prev: How do I?, Up: Top |
|
1495 |
|
1496 8 Incompatibilities with Unix `gprof' |
|
1497 ************************************* |
|
1498 |
|
1499 GNU `gprof' and Berkeley Unix `gprof' use the same data file |
|
1500 `gmon.out', and provide essentially the same information. But there |
|
1501 are a few differences. |
|
1502 |
|
1503 * GNU `gprof' uses a new, generalized file format with support for |
|
1504 basic-block execution counts and non-realtime histograms. A magic |
|
1505 cookie and version number allow `gprof' to easily identify new |
|
1506 style files. Old BSD-style files can still be read. *Note |
|
1507 Profiling Data File Format: File Format. |
|
1508 |
|
1509 * For a recursive function, Unix `gprof' lists the function as a |
|
1510 parent and as a child, with a `calls' field that lists the number |
|
1511 of recursive calls. GNU `gprof' omits these lines and puts the |
|
1512 number of recursive calls in the primary line. |
|
1513 |
|
1514 * When a function is suppressed from the call graph with `-e', GNU |
|
1515 `gprof' still lists it as a subroutine of functions that call it. |
|
1516 |
|
1517 * GNU `gprof' accepts the `-k' option with its argument in the form |
|
1518 `from/to', instead of `from to'. |
|
1519 |
|
1520 * In the annotated source listing, if there are multiple basic |
|
1521 blocks on the same line, GNU `gprof' prints all of their counts, |
|
1522 separated by commas. |
|
1523 |
|
1524 * The blurbs, field widths, and output formats are different. GNU |
|
1525 `gprof' prints blurbs after the tables, so that you can see the |
|
1526 tables without skipping the blurbs. |
|
1527 |
|
1528 |
|
1529 File: gprof.info, Node: Details, Next: GNU Free Documentation License, Prev: Incompatibilities, Up: Top |
|
1530 |
|
1531 9 Details of Profiling |
|
1532 ********************** |
|
1533 |
|
1534 * Menu: |
|
1535 |
|
1536 * Implementation:: How a program collects profiling information |
|
1537 * File Format:: Format of `gmon.out' files |
|
1538 * Internals:: `gprof''s internal operation |
|
1539 * Debugging:: Using `gprof''s `-d' option |
|
1540 |
|
1541 |
|
1542 File: gprof.info, Node: Implementation, Next: File Format, Up: Details |
|
1543 |
|
1544 9.1 Implementation of Profiling |
|
1545 =============================== |
|
1546 |
|
1547 Profiling works by changing how every function in your program is |
|
1548 compiled so that when it is called, it will stash away some information |
|
1549 about where it was called from. From this, the profiler can figure out |
|
1550 what function called it, and can count how many times it was called. |
|
1551 This change is made by the compiler when your program is compiled with |
|
1552 the `-pg' option, which causes every function to call `mcount' (or |
|
1553 `_mcount', or `__mcount', depending on the OS and compiler) as one of |
|
1554 its first operations. |
|
1555 |
|
1556 The `mcount' routine, included in the profiling library, is |
|
1557 responsible for recording in an in-memory call graph table both its |
|
1558 parent routine (the child) and its parent's parent. This is typically |
|
1559 done by examining the stack frame to find both the address of the |
|
1560 child, and the return address in the original parent. Since this is a |
|
1561 very machine-dependent operation, `mcount' itself is typically a short |
|
1562 assembly-language stub routine that extracts the required information, |
|
1563 and then calls `__mcount_internal' (a normal C function) with two |
|
1564 arguments--`frompc' and `selfpc'. `__mcount_internal' is responsible |
|
1565 for maintaining the in-memory call graph, which records `frompc', |
|
1566 `selfpc', and the number of times each of these call arcs was traversed. |
|
1567 |
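The bookkeeping described above can be modelled in Python as a table
keyed by the pair of addresses.  This is a toy model under stated
assumptions, not the C code in the profiling library: the class name
`CallGraphTable' and the method `record_call' are hypothetical, and
the addresses are made up.

```python
from collections import Counter


class CallGraphTable:
    """In-memory call graph: one counter per (frompc, selfpc) arc."""

    def __init__(self):
        self.arcs = Counter()

    def record_call(self, frompc, selfpc):
        # frompc: return address in the caller (the parent).
        # selfpc: address of the function being entered (the child).
        self.arcs[(frompc, selfpc)] += 1


table = CallGraphTable()
for _ in range(3):
    table.record_call(frompc=0x4010, selfpc=0x5000)   # same call site
table.record_call(frompc=0x4abc, selfpc=0x5000)       # another caller

for (frompc, selfpc), count in sorted(table.arcs.items()):
    print(f"arc {frompc:#x} -> {selfpc:#x}: traversed {count} time(s)")
```
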
|
1568 GCC Version 2 provides a magical function |
|
1569 (`__builtin_return_address'), which allows a generic `mcount' function |
|
1570 to extract the required information from the stack frame. However, on |
|
1571 some architectures, most notably the SPARC, using this builtin can be |
|
1572 very computationally expensive, and an assembly language version of |
|
1573 `mcount' is used for performance reasons. |
|
1574 |
|
1575 Number-of-calls information for library routines is collected by |
|
1576 using a special version of the C library. The routines in it are the |
|
1577 same as in the usual C library, but they were compiled with `-pg'. If |
|
1578 you link your program with `gcc ... -pg', it automatically uses the |
|
1579 profiling version of the library. |
|
1580 |
|
1581 Profiling also involves watching your program as it runs, and |
|
1582 keeping a histogram of where the program counter happens to be every |
|
1583 now and then. Typically the program counter is looked at around 100 |
|
1584 times per second of run time, but the exact frequency may vary from |
|
1585 system to system. |
|
1586 |
|
1587 This is done in one of two ways. Most UNIX-like operating systems |
|
1588 provide a `profil()' system call, which registers a memory array with |
|
1589 the kernel, along with a scale factor that determines how the program's |
|
1590 address space maps into the array. Typical scaling values cause every |
|
1591 2 to 8 bytes of address space to map into a single array slot. On |
|
1592 every tick of the system clock (assuming the profiled program is |
|
1593 running), the value of the program counter is examined and the |
|
1594 corresponding slot in the memory array is incremented. Since this is |
|
1595 done in the kernel, which had to interrupt the process anyway to handle |
|
1596 the clock interrupt, very little additional system overhead is required. |
|
1597 |
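The mapping from program counter to array slot can be sketched as
follows.  This Python model is a simplification: it expresses the scale
as a whole number of bytes per slot, whereas the real `profil()' scale
factor is encoded differently, and the function names are hypothetical.

```python
def make_histogram(lowpc, highpc, bytes_per_slot):
    """Allocate one zeroed slot per bytes_per_slot bytes of [lowpc, highpc)."""
    size = highpc - lowpc
    n_slots = -(-size // bytes_per_slot)      # ceiling division
    return [0] * n_slots


def record_tick(histogram, lowpc, bytes_per_slot, pc):
    """Increment the slot covering pc; ignore addresses outside the range."""
    slot = (pc - lowpc) // bytes_per_slot
    if 0 <= slot < len(histogram):
        histogram[slot] += 1
        return True
    return False


# Every 4 bytes of a 64-byte text region share one slot.
hist = make_histogram(0x1000, 0x1040, 4)
for pc in (0x1000, 0x1003, 0x1004, 0x2000):   # the last is out of range
    record_tick(hist, 0x1000, 4, pc)
print(hist)
```
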
|
1598 However, some operating systems, most notably Linux 2.0 (and |
|
1599 earlier), do not provide a `profil()' system call. On such a system, |
|
1600 arrangements are made for the kernel to periodically deliver a signal |
|
1601 to the process (typically via `setitimer()'), which then performs the |
|
1602 same operation of examining the program counter and incrementing a slot |
|
1603 in the memory array. Since this method requires a signal to be |
|
1604 delivered to user space every time a sample is taken, it uses |
|
1605 considerably more overhead than kernel-based profiling. Also, due to |
|
1606 the added delay required to deliver the signal, this method is less |
|
1607 accurate as well. |
|
1608 |
|
1609 A special startup routine allocates memory for the histogram and |
|
1610 either calls `profil()' or sets up a clock signal handler. This |
|
1611 routine (`monstartup') can be invoked in several ways. On Linux |
|
1612 systems, a special profiling startup file `gcrt0.o', which invokes |
|
1613 `monstartup' before `main', is used instead of the default `crt0.o'. |
|
1614 Use of this special startup file is one of the effects of using `gcc |
|
1615 ... -pg' to link. On SPARC systems, no special startup files are used. |
|
1616 Rather, the `mcount' routine, when it is invoked for the first time |
|
1617 (typically when `main' is called), calls `monstartup'. |
|
1618 |
|
1619 If the compiler's `-a' option was used, basic-block counting is also |
|
1620 enabled. Each object file is then compiled with a static array of |
|
1621 counts, initially zero. In the executable code, every time a new |
|
1622 basic-block begins (e.g., when an `if' statement appears), an extra |
|
1623 instruction is inserted to increment the corresponding count in the |
|
1624 array. At compile time, a paired array was constructed that recorded |
|
1625 the starting address of each basic-block. Taken together, the two |
|
1626 arrays record the starting address of every basic-block, along with the |
|
1627 number of times it was executed. |
|
1628 |
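The pairing of the two arrays can be illustrated with a small Python
model.  The class and method names are hypothetical and the addresses
are invented; in a real object file the increment is a single inserted
machine instruction, not a method call.

```python
class BasicBlockCounters:
    """Two parallel arrays: block start addresses and execution counts."""

    def __init__(self, block_addresses):
        self.addresses = list(block_addresses)    # fixed at compile time
        self.counts = [0] * len(self.addresses)   # initially zero

    def enter_block(self, index):
        # Stands in for the extra instruction inserted at each block start.
        self.counts[index] += 1

    def records(self):
        """Pair every starting address with its execution count."""
        return list(zip(self.addresses, self.counts))


# Three blocks: loop test, `if' body, code after the `if'.
bb = BasicBlockCounters([0x100, 0x110, 0x124])
for i in range(5):
    bb.enter_block(0)
    if i % 2 == 0:
        bb.enter_block(1)
    bb.enter_block(2)
print(bb.records())
```
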
|
1629 The profiling library also includes a function (`mcleanup') which is |
|
1630 typically registered using `atexit()' to be called as the program |
|
1631 exits, and is responsible for writing the file `gmon.out'. Profiling |
|
1632 is turned off, various headers are output, and the histogram is |
|
1633 written, followed by the call-graph arcs and the basic-block counts. |
|
1634 |
|
1635 The output from `gprof' gives no indication of parts of your program |
|
1636 that are limited by I/O or swapping bandwidth. This is because samples |
|
1637 of the program counter are taken at fixed intervals of the program's |
|
1638 run time. Therefore, the time measurements in `gprof' output say |
|
1639 nothing about time that your program was not running. For example, a |
|
1640 part of the program that creates so much data that it cannot all fit in |
|
1641 physical memory at once may run very slowly due to thrashing, but |
|
1642 `gprof' will say it uses little time. On the other hand, sampling by |
|
1643 run time has the advantage that the amount of load due to other users |
|
1644 won't directly affect the output you get. |
|
1645 |
|
1646 |
|
1647 File: gprof.info, Node: File Format, Next: Internals, Prev: Implementation, Up: Details |
|
1648 |
|
1649 9.2 Profiling Data File Format |
|
1650 ============================== |
|
1651 |
|
1652 The old BSD-derived file format used for profile data does not contain a |
|
1653 magic cookie that allows one to check whether a data file really is a |
|
1654 `gprof' file. Furthermore, it does not provide a version number, thus |
|
1655 rendering changes to the file format almost impossible. GNU `gprof' |
|
1656 uses a new file format that provides these features. For backward |
|
1657 compatibility, GNU `gprof' continues to support the old BSD-derived |
|
1658 format, but not all features are supported with it. For example, |
|
1659 basic-block execution counts cannot be accommodated by the old file |
|
1660 format. |
|
1661 |
|
1662 The new file format is defined in header file `gmon_out.h'. It |
|
1663 consists of a header containing the magic cookie and a version number, |
|
1664 as well as some spare bytes available for future extensions. All data |
|
1665 in a profile data file is in the native format of the target for which |
|
1666 the profile was collected. GNU `gprof' adapts automatically to the |
|
1667 byte-order in use. |
|
1668 |
|
1669 In the new file format, the header is followed by a sequence of |
|
1670 records. Currently, there are three different record types: histogram |
|
1671 records, call-graph arc records, and basic-block execution count |
|
1672 records. Each file can contain any number of each record type. When |
|
1673 reading a file, GNU `gprof' will ensure records of the same type are |
|
1674 compatible with each other and compute the union of all records. For |
|
1675 example, for basic-block execution counts, the union is simply the sum |
|
1676 of all execution counts for each basic-block. |
|
1677 |
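For basic-block records, the union described above amounts to a
per-address sum.  The following Python sketch assumes each record has
already been decoded into a list of (address, count) pairs; the
function name is hypothetical and no file parsing is attempted.

```python
from collections import Counter


def merge_basic_block_records(*records):
    """Sum execution counts per basic-block address across records."""
    total = Counter()
    for record in records:
        for address, count in record:
            total[address] += count
    return dict(total)


run1 = [(0x100, 5), (0x110, 3)]
run2 = [(0x100, 7), (0x124, 1)]
print(merge_basic_block_records(run1, run2))
```
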
|
1678 9.2.1 Histogram Records |
|
1679 ----------------------- |
|
1680 |
|
1681 Histogram records consist of a header that is followed by an array of |
|
1682 bins. The header contains the text-segment range that the histogram |
|
1683 spans, the size of the histogram in bytes (unlike in the old BSD |
|
1684 format, this does not include the size of the header), the rate of the |
|
1685 profiling clock, and the physical dimension that the bin counts |
|
1686 represent after being scaled by the profiling clock rate. The physical |
|
1687 dimension is specified in two parts: a long name of up to 15 characters |
|
1688 and a single character abbreviation. For example, a histogram |
|
1689 representing real-time would specify the long name as "seconds" and the |
|
1690 abbreviation as "s". This feature is useful for architectures that |
|
1691 support performance monitor hardware (which, fortunately, is becoming |
|
1692 increasingly common). For example, under DEC OSF/1, the "uprofile" |
|
1693 command can be used to produce a histogram of, say, instruction cache |
|
1694 misses. In this case, the dimension in the histogram header could be |
|
1695 set to "i-cache misses" and the abbreviation could be set to "1" |
|
1696 (because it is simply a count, not a physical dimension). Also, the |
|
1697 profiling rate would have to be set to 1 in this case. |
|
1698 |
|
1699 Histogram bins are 16-bit numbers and each bin represents an equal |
|
1700 amount of text-space. For example, if the text-segment is one thousand |
|
1701 bytes long and if there are ten bins in the histogram, each bin |
|
1702 represents one hundred bytes. |
|
1703 |
|
1704 9.2.2 Call-Graph Records |
|
1705 ------------------------ |
|
1706 |
|
1707 Call-graph records have a format that is identical to the one used in |
|
1708 the BSD-derived file format. It consists of an arc in the call graph |
|
1709 and a count indicating the number of times the arc was traversed during |
|
1710 program execution. Arcs are specified by a pair of addresses: the |
|
1711 first must be within the caller's function and the second must be within |
|
1712 the callee's function. When performing profiling at the function |
|
1713 level, these addresses can point anywhere within the respective |
|
1714 function. However, when profiling at the line-level, it is better if |
|
1715 the addresses are as close to the call-site/entry-point as possible. |
|
1716 This will ensure that the line-level call-graph is able to identify |
|
1717 exactly which line of source code performed calls to a function. |
|
1718 |
|
1719 9.2.3 Basic-Block Execution Count Records |
|
1720 ----------------------------------------- |
|
1721 |
|
1722 Basic-block execution count records consist of a header followed by a |
|
1723 sequence of address/count pairs. The header simply specifies the |
|
1724 length of the sequence. In an address/count pair, the address |
|
1725 identifies a basic-block and the count specifies the number of times |
|
1726 that basic-block was executed. Any address within the basic-block can |
|
1727 be used. |
|
1728 |
|
1729 |
|
1730 File: gprof.info, Node: Internals, Next: Debugging, Prev: File Format, Up: Details |
|
1731 |
|
1732 9.3 `gprof''s Internal Operation |
|
1733 ================================ |
|
1734 |
|
1735 Like most programs, `gprof' begins by processing its options. During |
|
1736 this stage, it may build its symspec list (`sym_ids.c:sym_id_add'), |
|
1737 if options are specified which use symspecs. `gprof' maintains a |
|
1738 single linked list of symspecs, which will eventually get turned into |
|
1739 12 symbol tables, organized into six include/exclude pairs--one pair |
|
1740 each for the flat profile (INCL_FLAT/EXCL_FLAT), the call graph arcs |
|
1741 (INCL_ARCS/EXCL_ARCS), printing in the call graph |
|
1742 (INCL_GRAPH/EXCL_GRAPH), timing propagation in the call graph |
|
1743 (INCL_TIME/EXCL_TIME), the annotated source listing |
|
1744 (INCL_ANNO/EXCL_ANNO), and the execution count listing |
|
1745 (INCL_EXEC/EXCL_EXEC). |
|
1746 |
|
1747 After option processing, `gprof' finishes building the symspec list |
|
1748 by adding all the symspecs in `default_excluded_list' to the exclude |
|
1749 lists EXCL_TIME and EXCL_GRAPH, and if line-by-line profiling is |
|
1750 specified, EXCL_FLAT as well. These default excludes are not added to |
|
1751 EXCL_ANNO, EXCL_ARCS, and EXCL_EXEC. |
|
1752 |
|
1753 Next, the BFD library is called to open the object file, verify that |
|
1754 it is an object file, and read its symbol table (`core.c:core_init'), |
|
1755 using `bfd_canonicalize_symtab' after mallocing an appropriately sized |
|
1756 array of symbols. At this point, function mappings are read (if the |
|
1757 `--file-ordering' option has been specified), and the core text space |
|
1758 is read into memory (if the `-c' option was given). |
|
1759 |
|
1760 `gprof''s own symbol table, an array of Sym structures, is now built. |
|
1761 This is done in one of two ways, by one of two routines, depending on |
|
1762 whether line-by-line profiling (`-l' option) has been enabled. For |
|
1763 normal profiling, the BFD canonical symbol table is scanned. For |
|
1764 line-by-line profiling, every text space address is examined, and a new |
|
1765 symbol table entry gets created every time the line number changes. In |
|
1766 either case, two passes are made through the symbol table--one to count |
|
1767 the size of the symbol table required, and the other to actually read |
|
1768 the symbols. In between the two passes, a single array of type `Sym' |
|
1769 is created of the appropriate length. Finally, |
|
1770 `symtab.c:symtab_finalize' is called to sort the symbol table and |
|
1771 remove duplicate entries (entries with the same memory address). |
|
1772 |
|
1773 The symbol table must be a contiguous array for two reasons. First, |
|
1774 the `qsort' library function (which sorts an array) will be used to |
|
1775 sort the symbol table. Also, the symbol lookup routine |
|
1776 (`symtab.c:sym_lookup'), which finds symbols based on memory address, |
|
1777 uses a binary search algorithm which requires the symbol table to be a |
|
1778 sorted array. Function symbols are indicated with an `is_func' flag. |
|
1779 Line number symbols have no special flags set. Additionally, a symbol |
|
1780 can have an `is_static' flag to indicate that it is a local symbol. |
|
1781 |
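The reason a sorted, contiguous array is needed can be seen in a short
Python sketch of address lookup by binary search.  This is not the code
in `symtab.c'; the class name is hypothetical, and the sketch assumes
that each symbol extends up to the start of the next one.

```python
import bisect


class SymbolTable:
    """Sorted array of (start address, name) pairs with binary-search lookup."""

    def __init__(self, symbols):
        self.symbols = sorted(symbols)                 # sort by address
        self.starts = [start for start, _ in self.symbols]

    def lookup(self, address):
        """Return the name of the last symbol starting at or below address."""
        i = bisect.bisect_right(self.starts, address) - 1
        return self.symbols[i][1] if i >= 0 else None


table = SymbolTable([(0x400, "main"), (0x100, "foo"), (0x200, "bar")])
for address in (0x150, 0x200, 0x50):
    print(f"{address:#x} -> {table.lookup(address)}")
```
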
|
1782 With the symbol table read, the symspecs can now be translated into |
|
1783 Syms (`sym_ids.c:sym_id_parse'). Remember that a single symspec can |
|
1784 match multiple symbols. An array of symbol tables (`syms') is created, |
|
1785 each entry of which is a symbol table of Syms to be included or |
|
1786 excluded from a particular listing. The master symbol table and the |
|
1787 symspecs are examined by nested loops, and every symbol that matches a |
|
1788 symspec is inserted into the appropriate syms table. This is done |
|
1789 twice, once to count the size of each required symbol table, and again |
|
1790 to build the tables, which have been malloced between passes. From now |
|
1791 on, to determine whether a symbol is on an include or exclude symspec |
|
1792 list, `gprof' simply uses its standard symbol lookup routine on the |
|
1793 appropriate table in the `syms' array. |
|
1794 |
|
1795 Now the profile data file(s) themselves are read |
|
1796 (`gmon_io.c:gmon_out_read'), first by checking for a new-style |
|
1797 `gmon.out' header, then assuming this is an old-style BSD `gmon.out' if |
|
1798 the magic number test failed. |
|
1799 |
|
1800 New-style histogram records are read by `hist.c:hist_read_rec'. For |
|
1801 the first histogram record, allocate a memory array to hold all the |
|
1802 bins, and read them in. When multiple profile data files (or files |
|
1803 with multiple histogram records) are read, the memory ranges of each |
|
1804 pair of histogram records must be either equal, or non-overlapping. |
|
1805 For each pair of histogram records, the resolution (memory region size |
|
1806 divided by the number of bins) must be the same. The time unit must be |
|
1807 the same for all histogram records. If the above constraints are met, all |
|
1808 histograms for the same memory range are merged. |
|
1809 |
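The three constraints and the merge step can be sketched in Python as
follows.  This is a simplified model, not `hist.c': the class
`HistRecord' and both functions are hypothetical, and a record is
reduced to an address range, a list of bin counts, and a unit name.

```python
class HistRecord:
    """One histogram record: address range, bin counts, and time unit."""

    def __init__(self, lowpc, highpc, bins, unit="seconds"):
        self.lowpc, self.highpc = lowpc, highpc
        self.bins = list(bins)
        self.unit = unit

    def resolution(self):
        # Memory region size divided by the number of bins.
        return (self.highpc - self.lowpc) / len(self.bins)


def compatible(a, b):
    """Check the pairwise constraints stated in the text."""
    same_range = (a.lowpc, a.highpc) == (b.lowpc, b.highpc)
    disjoint = a.highpc <= b.lowpc or b.highpc <= a.lowpc
    return ((same_range or disjoint)
            and a.resolution() == b.resolution()
            and a.unit == b.unit)


def merge_histograms(records):
    """Merge records covering the same range; reject incompatible ones."""
    merged = {}
    for rec in records:
        for other in merged.values():
            if not compatible(rec, other):
                raise ValueError("incompatible histogram records")
        key = (rec.lowpc, rec.highpc)
        if key in merged:
            merged[key].bins = [x + y for x, y in zip(merged[key].bins, rec.bins)]
        else:
            merged[key] = HistRecord(rec.lowpc, rec.highpc, rec.bins, rec.unit)
    return merged


result = merge_histograms([HistRecord(0, 100, [1, 2]), HistRecord(0, 100, [3, 4])])
print(result[(0, 100)].bins)
```
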
|
1810 As each call graph record is read (`call_graph.c:cg_read_rec'), the |
|
1811 parent and child addresses are matched to symbol table entries, and a |
|
1812 call graph arc is created by `cg_arcs.c:arc_add', unless the arc fails |
|
1813 a symspec check against INCL_ARCS/EXCL_ARCS. As each arc is added, a |
|
1814 linked list is maintained of the parent's child arcs, and of the child's |
|
1815 parent arcs. Both the child's call count and the arc's call count are |
|
1816 incremented by the record's call count. |
|
1817 |
|
1818 Basic-block records are read (`basic_blocks.c:bb_read_rec'), but |
|
1819 only if line-by-line profiling has been selected. Each basic-block |
|
1820 address is matched to a corresponding line symbol in the symbol table, |
|
1821 and an entry made in the symbol's bb_addr and bb_calls arrays. Again, |
|
1822 if multiple basic-block records are present for the same address, the |
|
1823 call counts are cumulative. |
|
1824 |
|
1825 A gmon.sum file is dumped, if requested (`gmon_io.c:gmon_out_write'). |
|
1826 |
|
1827 If histograms were present in the data files, assign them to symbols |
|
1828 (`hist.c:hist_assign_samples') by iterating over all the sample bins |
|
1829 and assigning them to symbols. Since the symbol table is sorted in |
|
1830 order of ascending memory addresses, we can simply follow along in the |
|
1831 symbol table as we make our pass over the sample bins. This step |
|
1832 includes a symspec check against INCL_FLAT/EXCL_FLAT. Depending on the |
|
1833 histogram scale factor, a sample bin may span multiple symbols, in |
|
1834 which case a fraction of the sample count is allocated to each symbol, |
|
1835 proportional to the degree of overlap. This effect is rare for normal |
|
1836 profiling, but overlaps are more common during line-by-line profiling, |
|
1837 and can cause each of two adjacent lines to be credited with half a |
|
1838 hit, for example. |
|
1839 |
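The proportional allocation can be sketched in Python.  The sketch
follows the single pass described above, advancing through a sorted
symbol list alongside the bins, but it is only a model: the function
name is hypothetical, symbols are given as non-overlapping half-open
ranges, and the symspec check is left out.

```python
def assign_samples(bins, lowpc, bin_width, symbols):
    """Credit each symbol with its share of every bin's sample count.

    symbols is a list of (name, start, end) sorted by start address,
    with non-overlapping half-open ranges [start, end).
    """
    credit = {name: 0.0 for name, _, _ in symbols}
    first = 0                                   # first symbol still in play
    for i, count in enumerate(bins):
        bin_lo = lowpc + i * bin_width
        bin_hi = bin_lo + bin_width
        # Skip symbols that end before this bin begins.
        while first < len(symbols) and symbols[first][2] <= bin_lo:
            first += 1
        k = first
        while k < len(symbols) and symbols[k][1] < bin_hi:
            name, start, end = symbols[k]
            overlap = min(bin_hi, end) - max(bin_lo, start)
            if overlap > 0:
                credit[name] += count * overlap / bin_width
            k += 1
    return credit


# One 8-byte bin with a single hit, straddling two 4-byte source lines.
print(assign_samples([1], 0, 8, [("line 10", 0, 4), ("line 11", 4, 8)]))
```

With one hit in a bin shared equally by two lines, each line is
credited with half a hit, as in the example in the text.
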
|
1840 If call graph data is present, `cg_arcs.c:cg_assemble' is called. |
|
1841 First, if `-c' was specified, a machine-dependent routine (`find_call') |
|
1842 scans through each symbol's machine code, looking for subroutine call |
|
1843 instructions, and adding them to the call graph with a zero call count. |
|
1844 A topological sort is performed by depth-first numbering all the |
|
1845 symbols (`cg_dfn.c:cg_dfn'), so that children are always numbered less |
|
1846 than their parents, then making an array of pointers into the symbol |
|
1847 table and sorting it into numerical order, which is reverse topological |
|
1848 order (children appear before parents). Cycles are also detected at |
|
1849 this point, all members of which are assigned the same topological |
|
1850 number. Two passes are now made through this sorted array of symbol |
|
1851 pointers. The first pass, from end to beginning (parents to children), |
|
1852 computes the fraction of child time to propagate to each parent and a |
|
1853 print flag. The print flag reflects symspec handling of |
|
1854 INCL_GRAPH/EXCL_GRAPH, with a parent's include or exclude (print or no |
|
1855 print) property being propagated to its children, unless they |
|
1856 themselves explicitly appear in INCL_GRAPH or EXCL_GRAPH. A second |
|
1857 pass, from beginning to end (children to parents) actually propagates |
|
1858 the timings along the call graph, subject to a check against |
|
1859 INCL_TIME/EXCL_TIME. With the print flag, fractions, and timings now |
|
1860 stored in the symbol structures, the topological sort array is now |
|
1861 discarded, and a new array of pointers is assembled, this time sorted |
|
1862 by propagated time. |
|
1863 |
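The numbering and the second (children to parents) pass can be
modelled in Python.  This is a sketch under several assumptions, not
the code in `cg_dfn.c' or `cg_arcs.c': the call graph is assumed to
have no cycles, symspec checks and print flags are omitted, every
function named in an arc must also appear in `self_time', and both
function names are hypothetical.

```python
def depth_first_number(children):
    """Number functions so that every child gets a lower number than its parents."""
    number = {}
    counter = 0

    def visit(node):
        nonlocal counter
        if node in number:
            return
        number[node] = None                 # mark as being visited
        for child in children.get(node, ()):
            visit(child)
        counter += 1
        number[node] = counter              # assigned after all children

    for node in children:
        visit(node)
    return number


def propagate_times(arc_calls, self_time):
    """Return (order, child_time) for an acyclic call graph.

    arc_calls maps (parent, child) to a call count; self_time maps each
    function to its own seconds.  Child time is shared among parents in
    proportion to their call counts.
    """
    children = {name: [] for name in self_time}
    calls_into = {name: 0 for name in self_time}
    for (parent, child), calls in arc_calls.items():
        children[parent].append(child)
        calls_into[child] += calls

    number = depth_first_number(children)
    order = sorted(number, key=number.get)  # children before parents

    child_time = {name: 0.0 for name in self_time}
    for callee in order:
        total = self_time[callee] + child_time[callee]
        for (parent, child), calls in arc_calls.items():
            if child == callee:
                child_time[parent] += total * calls / calls_into[callee]
    return order, child_time


arcs = {("main", "a"): 1, ("main", "b"): 1, ("a", "foo"): 2, ("b", "foo"): 3}
times = {"main": 0.0, "a": 1.0, "b": 1.0, "foo": 5.0}
order, child_time = propagate_times(arcs, times)
print(order)
print(child_time)
```

Because each function is processed only after all of its children, its
own `children' time is complete by the time it is passed up to its
parents.
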
|
1864 Finally, print the various outputs the user requested, which is now |
|
1865 fairly straightforward. The call graph (`cg_print.c:cg_print') and |
|
1866 flat profile (`hist.c:hist_print') are regurgitations of values already |
|
1867 computed. The annotated source listing |
|
1868 (`basic_blocks.c:print_annotated_source') uses basic-block information, |
|
1869 if present, to label each line of code with call counts, otherwise only |
|
1870 the function call counts are presented. |
|
1871 |
|
1872 The function ordering code is marginally well documented in the |
|
1873 source code itself (`cg_print.c'). Basically, the functions with the |
|
1874 most use and the most parents are placed first, followed by other |
|
1875 functions with the most use, followed by lower use functions, followed |
|
1876 by unused functions at the end. |
|
1877 |
|
1878 |
|
1879 File: gprof.info, Node: Debugging, Prev: Internals, Up: Details |
|
1880 |
|
1881 9.4 Debugging `gprof' |
|
1882 ===================== |
|
1883 |
|
1884 If `gprof' was compiled with debugging enabled, the `-d' option |
|
1885 triggers debugging output (to stdout) which can be helpful in |
|
1886 understanding its operation. The debugging number specified is |
|
1887 interpreted as a sum of the following options: |
|
1888 |
|
1889 2 - Topological sort |
|
1890 Monitor depth-first numbering of symbols during call graph analysis |
|
1891 |
|
1892 4 - Cycles |
|
1893 Shows symbols as they are identified as cycle heads |
|
1894 |
|
1895 16 - Tallying |
|
1896 As the call graph arcs are read, show each arc and how the total |
|
1897 calls to each function are tallied |
|
1898 |
|
1899 32 - Call graph arc sorting |
|
1900 Details sorting individual parents/children within each call graph |
|
1901 entry |
|
1902 |
|
1903 64 - Reading histogram and call graph records |
|
1904 Shows address ranges of histograms as they are read, and each call |
|
1905 graph arc |
|
1906 |
|
1907 128 - Symbol table |
|
1908 Reading, classifying, and sorting the symbol table from the object |
|
1909 file. For line-by-line profiling (`-l' option), also shows line |
|
1910 numbers being assigned to memory addresses. |
|
1911 |
|
1912 256 - Static call graph |
|
1913 Trace operation of `-c' option |
|
1914 |
|
1915 512 - Symbol table and arc table lookups |
|
1916 Detail operation of lookup routines |
|
1917 |
|
1918 1024 - Call graph propagation |
|
1919 Shows how function times are propagated along the call graph |
|
1920 |
|
1921 2048 - Basic-blocks |
|
1922 Shows basic-block records as they are read from profile data (only |
|
1923 meaningful with `-l' option) |
|
1924 |
|
1925 4096 - Symspecs |
|
1926 Shows symspec-to-symbol pattern matching operation |
|
1927 |
|
1928 8192 - Annotate source |
|
1929 Tracks operation of `-A' option |
|
1930 |
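Since the debugging number is a sum of the values listed above, it can
be composed and decomposed with bit operations.  The Python sketch
below uses only the values and names from this list; the two helper
functions are hypothetical and are not part of `gprof'.

```python
DEBUG_FLAGS = {
    2: "Topological sort",
    4: "Cycles",
    16: "Tallying",
    32: "Call graph arc sorting",
    64: "Reading histogram and call graph records",
    128: "Symbol table",
    256: "Static call graph",
    512: "Symbol table and arc table lookups",
    1024: "Call graph propagation",
    2048: "Basic-blocks",
    4096: "Symspecs",
    8192: "Annotate source",
}


def debug_value(*names):
    """Sum the values of the named debugging options."""
    by_name = {name: value for value, name in DEBUG_FLAGS.items()}
    return sum(by_name[name] for name in names)


def decode_debug_value(value):
    """List the debugging options selected by a given number."""
    return [name for bit, name in DEBUG_FLAGS.items() if value & bit]


value = debug_value("Topological sort", "Call graph propagation")
print(value, decode_debug_value(value))
```
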
|
1931 |
|
1932 File: gprof.info, Node: GNU Free Documentation License, Prev: Details, Up: Top |
|
1933 |
|
1934 Appendix A GNU Free Documentation License |
|
1935 ***************************************** |
|
1936 |
|
1937 Version 1.1, March 2000 |
|
1938 |
|
1939 Copyright (C) 2000, 2003 Free Software Foundation, Inc. |
|
1940 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |
|
1941 |
|
1942 Everyone is permitted to copy and distribute verbatim copies |
|
1943 of this license document, but changing it is not allowed. |
|
1944 |
|
1945 |
|
1946 0. PREAMBLE |
|
1947 |
|
1948 The purpose of this License is to make a manual, textbook, or other |
|
1949 written document "free" in the sense of freedom: to assure everyone |
|
1950 the effective freedom to copy and redistribute it, with or without |
|
1951 modifying it, either commercially or noncommercially. Secondarily, |
|
1952 this License preserves for the author and publisher a way to get |
|
1953 credit for their work, while not being considered responsible for |
|
1954 modifications made by others. |
|
1955 |
|
1956 This License is a kind of "copyleft", which means that derivative |
|
1957 works of the document must themselves be free in the same sense. |
|
1958 It complements the GNU General Public License, which is a copyleft |
|
1959 license designed for free software. |
|
1960 |
|
1961 We have designed this License in order to use it for manuals for |
|
1962 free software, because free software needs free documentation: a |
|
1963 free program should come with manuals providing the same freedoms |
|
1964 that the software does. But this License is not limited to |
|
1965 software manuals; it can be used for any textual work, regardless |
|
1966 of subject matter or whether it is published as a printed book. |
|
1967 We recommend this License principally for works whose purpose is |
|
1968 instruction or reference. |
|
1969 |
|
1970 |
|
1971 1. APPLICABILITY AND DEFINITIONS |
|
1972 |
|
1973 This License applies to any manual or other work that contains a |
|
1974 notice placed by the copyright holder saying it can be distributed |
|
1975 under the terms of this License. The "Document", below, refers to |
|
1976 any such manual or work. Any member of the public is a licensee, |
|
1977 and is addressed as "you." |
|
1978 |
|
1979 A "Modified Version" of the Document means any work containing the |
|
1980 Document or a portion of it, either copied verbatim, or with |
|
1981 modifications and/or translated into another language. |
|
1982 |
|
1983 A "Secondary Section" is a named appendix or a front-matter |
|
1984 section of the Document that deals exclusively with the |
|
1985 relationship of the publishers or authors of the Document to the |
|
1986 Document's overall subject (or to related matters) and contains |
|
1987 nothing that could fall directly within that overall subject. |
|
1988 (For example, if the Document is in part a textbook of |
|
1989 mathematics, a Secondary Section may not explain any mathematics.) |
|
1990 The relationship could be a matter of historical connection with |
|
1991 the subject or with related matters, or of legal, commercial, |
|
1992 philosophical, ethical or political position regarding them. |
|
1993 |
|
1994 The "Invariant Sections" are certain Secondary Sections whose |
|
1995 titles are designated, as being those of Invariant Sections, in |
|
1996 the notice that says that the Document is released under this |
|
1997 License. |
|
1998 |
|
1999 The "Cover Texts" are certain short passages of text that are |
|
2000 listed, as Front-Cover Texts or Back-Cover Texts, in the notice |
|
2001 that says that the Document is released under this License. |
|
2002 |
|
2003 A "Transparent" copy of the Document means a machine-readable copy, |
|
2004 represented in a format whose specification is available to the |
|
2005 general public, whose contents can be viewed and edited directly |
|
2006 and straightforwardly with generic text editors or (for images |
|
2007 composed of pixels) generic paint programs or (for drawings) some |
|
2008 widely available drawing editor, and that is suitable for input to |
|
2009 text formatters or for automatic translation to a variety of |
|
2010 formats suitable for input to text formatters. A copy made in an |
|
2011 otherwise Transparent file format whose markup has been designed |
|
2012 to thwart or discourage subsequent modification by readers is not |
|
2013 Transparent. A copy that is not "Transparent" is called "Opaque." |
|
2014 |
|
2015 Examples of suitable formats for Transparent copies include plain |
|
2016 ASCII without markup, Texinfo input format, LaTeX input format, |
|
2017 SGML or XML using a publicly available DTD, and |
|
2018 standard-conforming simple HTML designed for human modification. |
|
2019 Opaque formats include PostScript, PDF, proprietary formats that |
|
2020 can be read and edited only by proprietary word processors, SGML |
|
2021 or XML for which the DTD and/or processing tools are not generally |
|
2022 available, and the machine-generated HTML produced by some word |
|
2023 processors for output purposes only. |
|
2024 |
|
2025 The "Title Page" means, for a printed book, the title page itself, |
|
2026 plus such following pages as are needed to hold, legibly, the |
|
2027 material this License requires to appear in the title page. For |
|
2028 works in formats which do not have any title page as such, "Title |
|
2029 Page" means the text near the most prominent appearance of the |
|
2030 work's title, preceding the beginning of the body of the text. |
|
2031 |
|
2032 2. VERBATIM COPYING |
|
2033 |
|
2034 You may copy and distribute the Document in any medium, either |
|
2035 commercially or noncommercially, provided that this License, the |
|
2036 copyright notices, and the license notice saying this License |
|
2037 applies to the Document are reproduced in all copies, and that you |
|
2038 add no other conditions whatsoever to those of this License. You |
|
2039 may not use technical measures to obstruct or control the reading |
|
2040 or further copying of the copies you make or distribute. However, |
|
2041 you may accept compensation in exchange for copies. If you |
|
2042 distribute a large enough number of copies you must also follow |
|
2043 the conditions in section 3. |
|
2044 |
|
2045 You may also lend copies, under the same conditions stated above, |
|
2046 and you may publicly display copies. |
|
2047 |
|
2048 3. COPYING IN QUANTITY |
|
2049 |
|
2050 If you publish printed copies of the Document numbering more than |
|
2051 100, and the Document's license notice requires Cover Texts, you |
|
2052 must enclose the copies in covers that carry, clearly and legibly, |
|
2053 all these Cover Texts: Front-Cover Texts on the front cover, and |
|
2054 Back-Cover Texts on the back cover. Both covers must also clearly |
|
2055 and legibly identify you as the publisher of these copies. The |
|
2056 front cover must present the full title with all words of the |
|
2057 title equally prominent and visible. You may add other material |
|
2058 on the covers in addition. Copying with changes limited to the |
|
2059 covers, as long as they preserve the title of the Document and |
|
2060 satisfy these conditions, can be treated as verbatim copying in |
|
2061 other respects. |
|
2062 |
|
2063 If the required texts for either cover are too voluminous to fit |
|
2064 legibly, you should put the first ones listed (as many as fit |
|
2065 reasonably) on the actual cover, and continue the rest onto |
|
2066 adjacent pages. |
|
2067 |
|
2068 If you publish or distribute Opaque copies of the Document |
|
2069 numbering more than 100, you must either include a |
|
2070 machine-readable Transparent copy along with each Opaque copy, or |
|
2071 state in or with each Opaque copy a publicly-accessible |
|
2072 computer-network location containing a complete Transparent copy |
|
2073 of the Document, free of added material, which the general |
|
2074 network-using public has access to download anonymously at no |
|
2075 charge using public-standard network protocols. If you use the |
|
2076 latter option, you must take reasonably prudent steps, when you |
|
2077 begin distribution of Opaque copies in quantity, to ensure that |
|
2078 this Transparent copy will remain thus accessible at the stated |
|
2079 location until at least one year after the last time you |
|
2080 distribute an Opaque copy (directly or through your agents or |
|
2081 retailers) of that edition to the public. |
|
2082 |
|
2083 It is requested, but not required, that you contact the authors of |
|
2084 the Document well before redistributing any large number of |
|
2085 copies, to give them a chance to provide you with an updated |
|
2086 version of the Document. |
|
2087 |
|
2088 4. MODIFICATIONS |
|
2089 |
|
2090 You may copy and distribute a Modified Version of the Document |
|
2091 under the conditions of sections 2 and 3 above, provided that you |
|
2092 release the Modified Version under precisely this License, with |
|
2093 the Modified Version filling the role of the Document, thus |
|
2094 licensing distribution and modification of the Modified Version to |
|
2095 whoever possesses a copy of it. In addition, you must do these |
|
2096 things in the Modified Version: |
|
2097 |
|
2098 A. Use in the Title Page (and on the covers, if any) a title |
|
2099 distinct from that of the Document, and from those of previous |
|
2100 versions (which should, if there were any, be listed in the |
|
2101 History section of the Document). You may use the same title |
|
2102 as a previous version if the original publisher of that version |
|
2103 gives permission. |
|
2104 B. List on the Title Page, as authors, one or more persons or |
|
2105 entities responsible for authorship of the modifications in the |
|
2106 Modified Version, together with at least five of the principal |
|
2107 authors of the Document (all of its principal authors, if it |
|
2108 has less than five). |
|
2109 C. State on the Title page the name of the publisher of the |
|
2110 Modified Version, as the publisher. |
|
2111 D. Preserve all the copyright notices of the Document. |
|
2112 E. Add an appropriate copyright notice for your modifications |
|
2113 adjacent to the other copyright notices. |
|
2114 F. Include, immediately after the copyright notices, a license |
|
2115 notice giving the public permission to use the Modified Version |
|
2116 under the terms of this License, in the form shown in the |
|
2117 Addendum below. |
|
2118 G. Preserve in that license notice the full lists of Invariant |
|
2119 Sections and required Cover Texts given in the Document's |
|
2120 license notice. |
|
2121 H. Include an unaltered copy of this License. |
|
2122 I. Preserve the section entitled "History", and its title, and add |
|
2123 to it an item stating at least the title, year, new authors, and |
|
2124 publisher of the Modified Version as given on the Title Page. |
|
2125 If there is no section entitled "History" in the Document, |
|
2126 create one stating the title, year, authors, and publisher of |
|
2127 the Document as given on its Title Page, then add an item |
|
2128 describing the Modified Version as stated in the previous |
|
2129 sentence. |
|
2130 J. Preserve the network location, if any, given in the Document for |
|
2131 public access to a Transparent copy of the Document, and |
|
2132 likewise the network locations given in the Document for |
|
2133 previous versions it was based on. These may be placed in the |
|
2134 "History" section. You may omit a network location for a work |
|
2135 that was published at least four years before the Document |
|
2136 itself, or if the original publisher of the version it refers |
|
2137 to gives permission. |
|
2138 K. In any section entitled "Acknowledgements" or "Dedications", |
|
2139 preserve the section's title, and preserve in the section all the |
|
2140 substance and tone of each of the contributor acknowledgements |
|
2141 and/or dedications given therein. |
|
2142 L. Preserve all the Invariant Sections of the Document, |
|
2143 unaltered in their text and in their titles. Section numbers |
|
2144 or the equivalent are not considered part of the section titles. |
|
2145 M. Delete any section entitled "Endorsements." Such a section |
|
2146 may not be included in the Modified Version. |
|
2147 N. Do not retitle any existing section as "Endorsements" or to |
|
2148 conflict in title with any Invariant Section. |
|
2149 |
|
2150 If the Modified Version includes new front-matter sections or |
|
2151 appendices that qualify as Secondary Sections and contain no |
|
2152 material copied from the Document, you may at your option |
|
2153 designate some or all of these sections as invariant. To do this, |
|
2154 add their titles to the list of Invariant Sections in the Modified |
|
2155 Version's license notice. These titles must be distinct from any |
|
2156 other section titles. |
|
2157 |
|
2158 You may add a section entitled "Endorsements", provided it contains |
|
2159 nothing but endorsements of your Modified Version by various |
|
2160 parties--for example, statements of peer review or that the text has |
|
2161 been approved by an organization as the authoritative definition |
|
2162 of a standard. |
|
2163 |
|
2164 You may add a passage of up to five words as a Front-Cover Text, |
|
2165 and a passage of up to 25 words as a Back-Cover Text, to the end |
|
2166 of the list of Cover Texts in the Modified Version. Only one |
|
2167 passage of Front-Cover Text and one of Back-Cover Text may be |
|
2168 added by (or through arrangements made by) any one entity. If the |
|
2169 Document already includes a cover text for the same cover, |
|
2170 previously added by you or by arrangement made by the same entity |
|
2171 you are acting on behalf of, you may not add another; but you may |
|
2172 replace the old one, on explicit permission from the previous |
|
2173 publisher that added the old one. |
|
2174 |
|
2175 The author(s) and publisher(s) of the Document do not by this |
|
2176 License give permission to use their names for publicity for or to |
|
2177 assert or imply endorsement of any Modified Version. |
|
2178 |
|
2179 5. COMBINING DOCUMENTS |
|
2180 |
|
2181 You may combine the Document with other documents released under |
|
2182 this License, under the terms defined in section 4 above for |
|
2183 modified versions, provided that you include in the combination |
|
2184 all of the Invariant Sections of all of the original documents, |
|
2185 unmodified, and list them all as Invariant Sections of your |
|
2186 combined work in its license notice. |
|
2187 |
|
2188 The combined work need only contain one copy of this License, and |
|
2189 multiple identical Invariant Sections may be replaced with a single |
|
2190 copy. If there are multiple Invariant Sections with the same name |
|
2191 but different contents, make the title of each such section unique |
|
2192 by adding at the end of it, in parentheses, the name of the |
|
2193 original author or publisher of that section if known, or else a |
|
2194 unique number. Make the same adjustment to the section titles in |
|
2195 the list of Invariant Sections in the license notice of the |
|
2196 combined work. |
|
2197 |
|
2198 In the combination, you must combine any sections entitled |
|
2199 "History" in the various original documents, forming one section |
|
2200 entitled "History"; likewise combine any sections entitled |
|
2201 "Acknowledgements", and any sections entitled "Dedications." You |
|
2202 must delete all sections entitled "Endorsements." |
|
2203 |
|
2204 6. COLLECTIONS OF DOCUMENTS |
|
2205 |
|
2206 You may make a collection consisting of the Document and other |
|
2207 documents released under this License, and replace the individual |
|
2208 copies of this License in the various documents with a single copy |
|
2209 that is included in the collection, provided that you follow the |
|
2210 rules of this License for verbatim copying of each of the |
|
2211 documents in all other respects. |
|
2212 |
|
2213 You may extract a single document from such a collection, and |
|
2214 distribute it individually under this License, provided you insert |
|
2215 a copy of this License into the extracted document, and follow |
|
2216 this License in all other respects regarding verbatim copying of |
|
2217 that document. |
|
2218 |
|
2219 7. AGGREGATION WITH INDEPENDENT WORKS |
|
2220 |
|
2221 A compilation of the Document or its derivatives with other |
|
2222 separate and independent documents or works, in or on a volume of |
|
2223 a storage or distribution medium, does not as a whole count as a |
|
2224 Modified Version of the Document, provided no compilation |
|
2225 copyright is claimed for the compilation. Such a compilation is |
|
2226 called an "aggregate", and this License does not apply to the |
|
2227 other self-contained works thus compiled with the Document, on |
|
2228 account of their being thus compiled, if they are not themselves |
|
2229 derivative works of the Document. |
|
2230 |
|
2231 If the Cover Text requirement of section 3 is applicable to these |
|
2232 copies of the Document, then if the Document is less than one |
|
2233 quarter of the entire aggregate, the Document's Cover Texts may be |
|
2234 placed on covers that surround only the Document within the |
|
2235 aggregate. Otherwise they must appear on covers around the whole |
|
2236 aggregate. |
|
2237 |
|
2238 8. TRANSLATION |
|
2239 |
|
2240 Translation is considered a kind of modification, so you may |
|
2241 distribute translations of the Document under the terms of section |
|
2242 4. Replacing Invariant Sections with translations requires special |
|
2243 permission from their copyright holders, but you may include |
|
2244 translations of some or all Invariant Sections in addition to the |
|
2245 original versions of these Invariant Sections. You may include a |
|
2246 translation of this License provided that you also include the |
|
2247 original English version of this License. In case of a |
|
2248 disagreement between the translation and the original English |
|
2249 version of this License, the original English version will prevail. |
|
2250 |
|
2251 9. TERMINATION |
|
2252 |
|
2253 You may not copy, modify, sublicense, or distribute the Document |
|
2254 except as expressly provided for under this License. Any other |
|
2255 attempt to copy, modify, sublicense or distribute the Document is |
|
2256 void, and will automatically terminate your rights under this |
|
2257 License. However, parties who have received copies, or rights, |
|
2258 from you under this License will not have their licenses |
|
2259 terminated so long as such parties remain in full compliance. |
|
2260 |
|
2261 10. FUTURE REVISIONS OF THIS LICENSE |
|
2262 |
|
2263 The Free Software Foundation may publish new, revised versions of |
|
2264 the GNU Free Documentation License from time to time. Such new |
|
2265 versions will be similar in spirit to the present version, but may |
|
2266 differ in detail to address new problems or concerns. See |
|
2267 http://www.gnu.org/copyleft/. |
|
2268 |
|
2269 Each version of the License is given a distinguishing version |
|
2270 number. If the Document specifies that a particular numbered |
|
2271 version of this License "or any later version" applies to it, you |
|
2272 have the option of following the terms and conditions either of |
|
2273 that specified version or of any later version that has been |
|
2274 published (not as a draft) by the Free Software Foundation. If |
|
2275 the Document does not specify a version number of this License, |
|
2276 you may choose any version ever published (not as a draft) by the |
|
2277 Free Software Foundation. |
|
2278 |
|
2279 |
|
2280 ADDENDUM: How to use this License for your documents |
|
2281 ==================================================== |
|
2282 |
|
2283 To use this License in a document you have written, include a copy of |
|
2284 the License in the document and put the following copyright and license |
|
2285 notices just after the title page: |
|
2286 |
|
2287 Copyright (C) YEAR YOUR NAME. |
|
2288 Permission is granted to copy, distribute and/or modify this document |
|
2289 under the terms of the GNU Free Documentation License, Version 1.1 |
|
2290 or any later version published by the Free Software Foundation; |
|
2291 with the Invariant Sections being LIST THEIR TITLES, with the |
|
2292 Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST. |
|
2293 A copy of the license is included in the section entitled "GNU |
|
2294 Free Documentation License." |
|
2295 |
|
2296 If you have no Invariant Sections, write "with no Invariant Sections" |
|
2297 instead of saying which ones are invariant. If you have no Front-Cover |
|
2298 Texts, write "no Front-Cover Texts" instead of "Front-Cover Texts being |
|
2299 LIST"; likewise for Back-Cover Texts. |
|
2300 |
|
2301 If your document contains nontrivial examples of program code, we |
|
2302 recommend releasing these examples in parallel under your choice of |
|
2303 free software license, such as the GNU General Public License, to |
|
2304 permit their use in free software. |
|
2305 |
|
2306 |
|
2307 |
|