Copyright (c) 1985, 2000, 2001, 2005 Free Software Foundation, Inc.

Permission is granted to anyone to make or distribute verbatim copies
of this document as received, in any medium, provided that the
copyright notice and permission notice are preserved,
and that the distributor grants the recipient permission
for further redistribution as permitted by this notice.

Permission is granted to distribute modified versions
of this document, or of portions of it,
under the above conditions, provided also that they
carry prominent notices stating who last changed them.

[People who debug Emacs on Windows using native Windows debuggers
should read the Windows-specific section near the end of this
document.]

** When you debug Emacs with GDB, you should start it in the directory
where the executable was made. That directory has a .gdbinit file
that defines various "user-defined" commands for debugging Emacs.

** When you are trying to analyze failed assertions, it will be
essential to compile Emacs either completely without optimizations or
at least (when using GCC) with the -fno-crossjumping option. Failure
to do so may make the compiler recycle the same abort call for all
assertions in a given function, rendering the stack backtrace useless
for identifying the specific failed assertion.

** It is a good idea to run Emacs under GDB (or some other suitable
debugger) *all the time*. Then, when Emacs crashes, you will be able
to debug the live process, not just a core dump. (This is especially
important on systems which don't support core files, and instead print
just the registers and some stack addresses.)

** If Emacs hangs, or seems to be stuck in some infinite loop, typing
"kill -TSTP PID", where PID is the Emacs process ID, will cause GDB to
kick in, provided that you run under GDB.

** Getting control to the debugger

`Fsignal' is a very useful place to put a breakpoint.
All Lisp errors go through there.

It is useful, when debugging, to have a guaranteed way to return to
the debugger at any time. When using X, this is easy: type C-z at the
window where Emacs is running under GDB, and it will stop Emacs just
as it would stop any ordinary program. When Emacs is running in a
terminal, things are not so easy.

The src/.gdbinit file in the Emacs distribution arranges for SIGINT
(C-g in Emacs) to be passed to Emacs and not give control back to GDB.
On modern POSIX systems, you can override that with this command:

    handle SIGINT stop nopass

After this `handle' command, SIGINT will return control to GDB. If
you want the C-g to cause a QUIT within Emacs as well, omit the
`nopass'.

A technique that can work when `handle SIGINT' does not is to store
the code for some character into the variable stop_character. Thus,

    set stop_character = 29

makes Control-] (decimal code 29) the stop character.
Typing Control-] will cause an immediate stop. You cannot
use the set command until the inferior process has been started.
Put a breakpoint early in `main', or suspend Emacs,
to get an opportunity to do the set command.

When Emacs is running in a terminal, it is useful to use a separate terminal
for the debug session. This can be done by starting Emacs as usual, then
attaching to it from GDB with the `attach' command, which is explained in the
node "Attach" of the GDB manual.

** Examining Lisp object values.

When you have a live process to debug, and it has not encountered a
fatal error, you can use the GDB command `pr'. First print the value
in the ordinary way, with the `p' command. Then type `pr' with no
arguments. This calls a subroutine which uses the Lisp printer.

You can also use `pp value' to print the Emacs value directly.

Note: It is not a good idea to try `pr' or `pp' if you know that Emacs
is in deep trouble: its stack smashed (e.g., if it encountered SIGSEGV
due to stack overflow), or crucial data structures, such as `obarray',
corrupted, etc. In such cases, the Emacs subroutine called by `pr'
might do more damage, such as overwriting data that is important for
debugging the original problem.

Also, on some systems it is impossible to use `pr' if you stopped
Emacs while it was inside `select'. This is in fact what happens if
you stop Emacs while it is waiting. In such a situation, don't try to
use `pr'. Instead, use `s' to step out of the system call. Then
Emacs will be between instructions and capable of handling `pr'.

If you can't use the `pr' command, for whatever reason, you can fall back
on lower-level commands. Use the `xtype' command to print out the
data type of the last data value. Once you know the data type, use
the command that corresponds to that type. Here are these commands:

    xint xptr xwindow xmarker xoverlay xmiscfree xintfwd xboolfwd xobjfwd
    xbufobjfwd xkbobjfwd xbuflocal xbuffer xsymbol xstring xvector xframe
    xwinconfig xcompiled xcons xcar xcdr xsubr xprocess xfloat xscrollbar

Each one of them applies to a certain type or class of types.
(Some of these types are not visible in Lisp, because they exist only
internally.)

Each x... command prints some information about the value, and
produces a GDB value (subsequently available in $) through which you
can get at the rest of the contents.

In general, most of the rest of the contents will be additional Lisp
objects which you can examine in turn with the x... commands.

Even with a live process, these x... commands are useful for
examining the fields in a buffer, window, process, frame or marker.
Here's an example using concepts explained in the node "Value History"
of the GDB manual to print values associated with the variable
called frame. First, use these commands:

    b set_frame_buffer_list

Then Emacs hits the breakpoint:

    $2 = (struct frame *) 0x8560258

Now we can use `pr' to print the name of the frame:

    "emacs@steenrod.math.nwu.edu"

The Emacs C code heavily uses macros defined in lisp.h. So suppose
we want the address of the l-value expression near the bottom of
`add_command_key' from keyboard.c:

    XVECTOR (this_command_keys)->contents[this_command_key_count++] = key;

XVECTOR is a macro, and therefore GDB does not know about it.
GDB cannot evaluate "p XVECTOR (this_command_keys)".

However, you can use the xvector command in GDB to get the same
result:

    (gdb) p this_command_keys
    $2 = (struct Lisp_Vector *) 0x411000
    (gdb) p $->contents[this_command_key_count]
    $4 = (int *) 0x411008

Here's a related example of macros and the GDB `define' command.
There are many Lisp vectors such as `recent_keys', which contains the
last 100 keystrokes. We can print this Lisp vector with `p recent_keys'
followed by `pr'.

But this may be inconvenient, since `recent_keys' is much more verbose
than `C-h l'. We might want to print only the last 10 elements of
this vector. `recent_keys' is updated in keyboard.c by the command

    XVECTOR (recent_keys)->contents[recent_keys_index] = c;

So we define a GDB command `xvector-elts', so the last 10 keystrokes
are printed by

    xvector-elts recent_keys recent_keys_index 10

where you can define xvector-elts as follows:

    p $foo->contents[$arg1-($i++)]
    document xvector-elts
    Prints a range of elements of a Lisp vector.
    xvector-elts v n i
    prints `i' elements of the vector `v' ending at the index `n'.

** Getting Lisp-level backtrace information within GDB

The most convenient way is to use the `xbacktrace' command. This
shows the names of the Lisp functions that are currently active.

If that doesn't work (e.g., because the `backtrace_list' structure is
corrupted), type "bt" at the GDB prompt, to produce the C-level
backtrace, and look for stack frames that call Ffuncall. Select them
one by one in GDB, by typing "up N", where N is the appropriate number
of frames to go up, and in each frame that calls Ffuncall type this:

    p args[0]
    pr

This will print the name of the Lisp function called by that level
of function calling.

By printing the remaining elements of args, you can see the argument
values. Here's how to print the first argument:

    p args[1]
    pr

If you do not have a live process, you can use xtype and the other
x... commands such as xsymbol to get such information, albeit less
conveniently. For example:

    p args[0]
    xtype

and, assuming that "xtype" says that args[0] is a symbol:

    xsymbol
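
When the C backtrace is long, picking out the Ffuncall frames by eye
can be tedious. Here is a minimal shell sketch of one way to list
them. It assumes that you have saved the text printed by "bt" into a
file, and that frame lines have GDB's usual `#N ADDRESS in FUNCTION
(...)' layout. The function name `ffuncall_frames' is made up for
this example and is not part of Emacs or GDB.

```shell
# Print the numbers of the stack frames that are calls to Ffuncall.
# $1 is a file holding the text printed by GDB's "bt" command.
# (Hypothetical helper; the name is an assumption of this sketch.)
ffuncall_frames () {
    # A frame line looks like "#3  0x0812d3a0 in Ffuncall (nargs=...".
    # The innermost frame may lack the "ADDRESS in" part, so look for
    # the function name in fields 2 through 4.
    awk '/^#[0-9]+/ {
        for (i = 2; i <= 4 && i <= NF; i++)
            if ($i == "Ffuncall") { print substr($1, 2); break }
    }' "$1"
}
```

Each number printed can then be given to GDB's `frame' command to
select that frame directly, instead of counting frames for "up N".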

** Debugging what happens while preloading and dumping Emacs

Type `gdb temacs' and start it with `r -batch -l loadup dump'.

If temacs actually succeeds when running under GDB in this way, do not
try to run the dumped Emacs, because it was dumped with the GDB
breakpoints in it.

** Debugging `temacs'

Debugging `temacs' is useful when you want to establish whether a
problem happens in an undumped Emacs. To run `temacs' under a
debugger, type "gdb temacs", then start it with `r -batch -l loadup'.

** If you encounter X protocol errors

Try evaluating (x-synchronize t). That puts Emacs into synchronous
mode, where each Xlib call checks for errors before it returns. This
mode is much slower, but when you get an error, you will see exactly
which call really caused the error.

You can start Emacs in a synchronous mode by invoking it with the -xrm
option, like this:

    emacs -xrm "emacs.synchronous: true"

Setting a breakpoint in the function `x_error_quitter' and looking at
the backtrace when Emacs stops inside that function will show what
code causes the X protocol errors.

Some bugs related to the X protocol disappear when Emacs runs in a
synchronous mode. To track down those bugs, we suggest the following
procedure:

  - Run Emacs under a debugger and put a breakpoint inside the
    primitive function which, when called from Lisp, triggers the X
    protocol errors. For example, if the errors happen when you
    delete a frame, put a breakpoint inside `Fdelete_frame'.

  - When the breakpoint breaks, step through the code, looking for
    calls to X functions (the ones whose names begin with "X" or
    "Xt" or "Xm").

  - Insert calls to `XSync' before and after each call to the X
    functions, like this:

       XSync (f->output_data.x->display_info->display, 0);

    where `f' is the pointer to the `struct frame' of the selected
    frame, normally available via XFRAME (selected_frame). (Most
    functions which call X already have some variable that holds the
    pointer to the frame, perhaps called `f' or `sf', so you shouldn't
    need to compute it.)

    If your debugger can call functions in the program being debugged,
    you should be able to issue the calls to `XSync' without recompiling
    Emacs. For example, with GDB, just type:

       call XSync (f->output_data.x->display_info->display, 0)

    before and immediately after the suspect X calls. If your
    debugger does not support this, you will need to add these pairs
    of calls in the source and rebuild Emacs.

    Either way, systematically step through the code and issue these
    calls until you find the first X function called by Emacs after
    which a call to `XSync' winds up in the function
    `x_error_quitter'. The first X function call for which this
    happens is the one that generated the X protocol error.

  - You should now look around this offending X call and try to figure
    out what is wrong with it.

** If Emacs causes errors or memory leaks in your X server

You can trace the traffic between Emacs and your X server with a tool
like xmon, available at ftp://ftp.x.org/contrib/devel_tools/.

Xmon can be used to see exactly what Emacs sends when X protocol errors
happen. If Emacs causes the X server memory usage to increase, you can
use xmon to see what items Emacs creates in the server (windows,
graphics contexts, pixmaps) and what items Emacs deletes. If there
are consistently more creations than deletions, the type of item
and the activity you do when the items get created can give a hint where
to start debugging.

** If the symptom of the bug is that Emacs fails to respond

Don't assume Emacs is `hung'--it may instead be in an infinite loop.
To find out which, make the problem happen under GDB and stop Emacs
once it is not responding. (If Emacs is using X Windows directly, you
can stop Emacs by typing C-z at the GDB job.) Then try stepping with
`step'. If Emacs is hung, the `step' command won't return. If it is
looping, `step' will return.

If this shows Emacs is hung in a system call, stop it again and
examine the arguments of the call. If you report the bug, it is very
important to state exactly where in the source the system call is, and
what the arguments are.

If Emacs is in an infinite loop, try to determine where the loop
starts and ends. The easiest way to do this is to use the GDB command
`finish'. Each time you use it, Emacs resumes execution until it
exits one stack frame. Keep typing `finish' until it doesn't
return--that means the infinite loop is in the stack frame which you
just tried to finish.

Stop Emacs again, and use `finish' repeatedly again until you get back
to that frame. Then use `next' to step through that frame. By
stepping, you will see where the loop starts and ends. Also, examine
the data being used in the loop and try to determine why the loop does
not exit when it should.

** If certain operations in Emacs are slower than they used to be, here
is some advice for how to find out why.

Stop Emacs repeatedly during the slow operation, and make a backtrace
each time. Compare the backtraces looking for a pattern--a specific
function that shows up more often than you'd expect.

If you don't see a pattern in the C backtraces, get some Lisp
backtrace information by typing "xbacktrace" or by looking at Ffuncall
frames (see above), and again look for a pattern.
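
Comparing many backtraces by eye is error-prone, so it can help to
count how often each function appears. Here is a minimal shell
sketch, assuming each backtrace was saved to its own file and uses
GDB's usual `#N ADDRESS in FUNCTION (...)' frame layout. The function
name `backtrace_histogram' is made up for this example.

```shell
# Count how often each function appears across several saved backtraces.
# Arguments: files, each holding the output of one GDB "bt" command.
# (Hypothetical helper; the name is an assumption of this sketch.)
backtrace_histogram () {
    # On a frame line the function name is field 4 when the line reads
    # "#1  0x... in func (...)", and field 2 when the address is
    # omitted, as in "#0  func (...)".
    awk '/^#[0-9]+/ { print ($3 == "in" ? $4 : $2) }' "$@" |
        sort | uniq -c | sort -rn
}
```

For example, `backtrace_histogram bt1 bt2 bt3' lists the most frequent
functions first. Functions near the bottom of the stack (such as
`main') appear in every backtrace, so look for unexpectedly frequent
functions further up.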

When using X, you can stop Emacs at any time by typing C-z at GDB.
When not using X, you can do this with C-g. On non-Unix platforms,
such as MS-DOS, you might need to press C-BREAK instead.

** If GDB does not run and your debuggers can't load Emacs.

On some systems, no debugger can load Emacs with a symbol table,
perhaps because they all have fixed limits on the number of symbols
and Emacs exceeds the limits. Here is a method that can be used
in such an extremity. Do

    :r -l loadup   (or whatever)

It is necessary to refer to the file `nmout' to convert
numeric addresses into symbols and vice versa.

It is useful to be running under a window system.
Then, if Emacs becomes hopelessly wedged, you can create
another window to do kill -9 in. kill -ILL is often
useful too, since that may make Emacs dump core or return
to adb.

** Debugging incorrect screen updating.

To debug problems where Emacs updates the screen incorrectly, it is
useful to have a record of what input you typed and what Emacs sent to
the screen. To make these records, do

    (open-dribble-file "~/.dribble")
    (open-termscript "~/.termscript")

The dribble file contains all characters read by Emacs from the
terminal, and the termscript file contains all characters it sent to
the terminal. The use of the directory `~/' prevents interference
with any other user.

If you have irreproducible display problems, put those two expressions
in your ~/.emacs file. When the problem happens, exit or kill the
Emacs you were running, and rename the two files. Then you can start
another Emacs without clobbering those files, and use it to examine them.
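
The renaming step is easy to forget in the heat of the moment, so you
may want to script it. Here is a minimal shell sketch, assuming the
two file names used above. The function name `save_display_records'
and the timestamp suffix are choices made for this example only.

```shell
# Move the dribble and termscript files aside under a timestamped name,
# so that the next Emacs session does not overwrite them.
# (Hypothetical helper; name and suffix format are assumptions.)
save_display_records () {
    stamp=$(date +%Y%m%d-%H%M%S)
    for f in "$HOME/.dribble" "$HOME/.termscript"; do
        # Skip a file that does not exist, e.g. when recording was off.
        if [ -f "$f" ]; then
            mv "$f" "$f.$stamp"
        fi
    done
}
```

Run `save_display_records' from a shell after the faulty Emacs has
exited and before you start the next one.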

An easy way to see if too much text is being redrawn on a terminal is to
evaluate `(setq inverse-video t)' before you try the operation you think
will cause too much redrawing. This doesn't refresh the screen, so only
newly drawn text is in inverse video.

The Emacs display code includes special debugging code, but it is
normally disabled. You can enable it by building Emacs with the
pre-processing symbol GLYPH_DEBUG defined. Here's one easy way,
suitable for Unix and GNU systems, to build such a debugging version:

    MYCPPFLAGS='-DGLYPH_DEBUG=1' make

Building Emacs like that activates many assertions which scrutinize
display code operation more than Emacs does normally. (To see the
code which tests these assertions, look for calls to the `xassert'
macros.) Any assertion that is reported to fail should be
investigated.

Building with GLYPH_DEBUG defined also defines several helper
functions which can help debugging display code. One such function is
`dump_glyph_matrix'. If you run Emacs under GDB, you can print the
contents of any glyph matrix by just calling that function with the
matrix as its argument. For example, the following command will print
the contents of the current matrix of the window whose pointer is in
`w':

    (gdb) p dump_glyph_matrix (w->current_matrix, 2)

(The second argument 2 tells dump_glyph_matrix to print the glyphs in
a long form.) You can dump the selected window's current glyph matrix
interactively with "M-x dump-glyph-matrix RET"; see the documentation
of this function for more details.

Several more functions for debugging display code are available in
Emacs compiled with GLYPH_DEBUG defined; type "C-h f dump- TAB" and
"C-h f trace- TAB" to see the full list.

When you debug display problems running Emacs under X, you can use
the `ff' command to flush all pending display updates to the screen.

** Debugging LessTif

If you encounter bugs whereby Emacs built with LessTif grabs all mouse
and keyboard events, or LessTif menus behave weirdly, it might be
helpful to set the `DEBUGSOURCES' and `DEBUG_FILE' environment
variables, so that one can see what LessTif was doing at this point.
For example,

    export DEBUGSOURCES="RowColumn.c:MenuShell.c:MenuUtil.c"
    export DEBUG_FILE=/usr/tmp/LESSTIF_TRACE

causes LessTif to print traces from the three named source files to a
file in `/usr/tmp' (that file can get pretty large). The above should
be typed at the shell prompt before invoking Emacs.

Running GDB from another terminal could also help with such problems.
You can arrange for GDB to run on one machine, with the Emacs display
appearing on another. Then, when the bug happens, you can go back to
the machine where you started GDB and use the debugger from there.

** Debugging problems which happen in GC

The array `last_marked' (defined in alloc.c) can be used to display up
to the last 500 objects marked by the garbage collection process.
Whenever the garbage collector marks a Lisp object, it records the
pointer to that object in the `last_marked' array. The variable
`last_marked_index' holds the index into the `last_marked' array one
place beyond where the pointer to the very last marked object is
recorded.
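
Because `last_marked_index' points one place beyond the newest entry,
the most recently marked object is at the previous index. Here is a
minimal shell sketch that prints GDB commands to walk the array from
newest to oldest. It assumes a 500-entry array whose index wraps
around to 0 when it reaches the end; check alloc.c in your sources
before relying on that. The function name `last_marked_commands' is
made up for this example.

```shell
# Emit GDB "p" commands that walk last_marked from the most recently
# marked object backwards.
# $1 = current value of last_marked_index, $2 = number of entries to show.
# Assumption: the array has 500 entries and is used as a ring buffer.
last_marked_commands () {
    size=500
    i=$1
    n=$2
    while [ "$n" -gt 0 ]; do
        # Step back one slot, wrapping from 0 to size-1.
        i=$(( (i + size - 1) % size ))
        echo "p last_marked[$i]"
        n=$(( n - 1 ))
    done
}
```

For instance, `last_marked_commands 1 3' prints commands for indices
0, 499 and 498, in that order. You can save the output to a file and
read it into GDB with the `source' command.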

The single most important goal in debugging GC problems is to find the
Lisp data structure that got corrupted. This is not easy, since GC
changes the tag bits and relocates strings, which makes it hard to look
at Lisp objects with commands such as `pr'. It is sometimes necessary
to convert Lisp_Object variables into pointers to C structs manually.
Use the `last_marked' array and the source to reconstruct the sequence
in which objects were marked.

Once you discover the corrupted Lisp object or data structure, it is
useful to look at it in a fresh Emacs session and compare its contents
with a session that you are debugging.

** Debugging problems with non-ASCII characters

If you experience problems which seem to be related to non-ASCII
characters, such as \201 characters appearing in the buffer or in your
files, set the variable byte-debug-flag to t. This causes Emacs to do
some extra checks, such as look for broken relations between byte and
character positions in buffers and strings; the resulting diagnostics
might pinpoint the cause of the problem.

** Debugging the TTY (non-windowed) version

The most convenient method of debugging the character-terminal display
is to do that on a window system such as X. Begin by starting an
xterm window, then type these commands inside that window:

    $ tty
    $ echo $TERM

Let's say these commands print "/dev/ttyp4" and "xterm", respectively.

Now start Emacs (the normal, windowed-display session, i.e. without
the `-nw' option), and invoke "M-x gdb RET emacs RET" from there. Now
type these commands at GDB's prompt:

    (gdb) set args -nw -t /dev/ttyp4
    (gdb) set environment TERM xterm
    (gdb) run

The debugged Emacs should now start in no-window mode with its display
directed to the xterm window you opened above.

A similar arrangement is possible on a character terminal by using the
`screen' package.
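
To avoid copying the terminal name by hand, you can have the target
xterm write the two GDB settings for you. Here is a minimal shell
sketch. The function name `tty_debug_commands' is made up for this
example; with no arguments it uses the terminal it runs in and the
current value of TERM.

```shell
# Print the GDB commands that direct a debugged "emacs -nw" to a terminal.
# Run this inside the xterm that should show the debugged Emacs.
# $1 = terminal device (default: this terminal, as printed by "tty")
# $2 = terminal type   (default: the current value of TERM)
# (Hypothetical helper; the name is an assumption of this sketch.)
tty_debug_commands () {
    term_dev=${1:-$(tty)}
    term_type=${2:-$TERM}
    echo "set args -nw -t $term_dev"
    echo "set environment TERM $term_type"
}
```

For example, run `tty_debug_commands > ~/tty.gdb' in the xterm, then
type "source ~/tty.gdb" followed by "run" at GDB's prompt.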

** Running Emacs built with malloc debugging packages

If Emacs exhibits bugs that seem to be related to use of memory
allocated off the heap, it might be useful to link Emacs with a
special debugging library, such as Electric Fence (a.k.a. efence) or
GNU Checker, which helps find such problems.

Emacs compiled with such packages might not run without some hacking,
because Emacs replaces the system's memory allocation functions with
its own versions, and because the dumping process might be
incompatible with the way these packages track allocated
memory. Here are some of the changes you might find necessary
(SYSTEM-NAME and MACHINE-NAME are the names of your OS- and
CPU-specific headers in the subdirectories of `src'):

  - In src/s/SYSTEM-NAME.h add "#define SYSTEM_MALLOC".

  - In src/m/MACHINE-NAME.h add "#define CANNOT_DUMP" and
    "#define CANNOT_UNEXEC".

  - Configure with a different --prefix= option. If you use GCC,
    version 2.7.2 is preferred, as some malloc debugging packages
    work a lot better with it than with 2.95 or later versions.

  - Type "make" then "make -k install".

  - If required, invoke the package-specific command to prepare
    src/temacs for execution.

  - Run src/temacs.

(Note that this runs `temacs' instead of the usual `emacs' executable.
This avoids problems with dumping Emacs mentioned above.)

Some malloc debugging libraries might print lots of false alarms for
bitfields used by Emacs in some data structures. If you want to get
rid of the false alarms, you will have to hack the definitions of
these data structures in the respective headers to remove the `:N'
bitfield definitions (which will cause each such field to use a full
int).

** Some suggestions for debugging on MS Windows:

   (written by Marc Fleischeuers, Geoff Voelker and Andrew Innes)

To debug Emacs with Microsoft Visual C++, you either start Emacs from
the debugger or attach the debugger to a running Emacs process.

To start Emacs from the debugger, you can use the file bin/debug.bat.
The Microsoft Developer studio will start and under Project, Settings,
Debug, General you can set the command-line arguments and Emacs's
startup directory. Set breakpoints (Edit, Breakpoints) at Fsignal and
other functions that you want to examine. Run the program (Build,
Start debug). Emacs will start and the debugger will take control as
soon as a breakpoint is hit.

You can also attach the debugger to an already running Emacs process.
To do this, start up the Microsoft Developer studio and select Build,
Start debug, Attach to process. Choose the Emacs process from the
list. Send a break to the running process (Debug, Break) and you will
find that execution is halted somewhere in user32.dll. Open the stack
trace window and go up the stack to w32_msg_pump. Now you can set
breakpoints in Emacs (Edit, Breakpoints). Continue the running Emacs
process (Debug, Step out) and control will return to Emacs, until a
breakpoint is hit.

To examine the contents of a Lisp variable, you can use the function
'debug_print'. Right-click on a variable, select QuickWatch (it has
an eyeglass symbol on its button in the toolbar), and in the text
field at the top of the window, place 'debug_print(' and ')' around
the expression. Press 'Recalculate' and the output is sent to stderr,
and to the debugger via the OutputDebugString routine. The output
sent to stderr should be displayed in the console window that was
opened when the emacs.exe executable was started. The output sent to
the debugger should be displayed in the 'Debug' pane in the Output
window. If Emacs was started from the debugger, a console window was
opened at Emacs' startup; this console window also shows the output of
'debug_print'.

For example, start and run Emacs in the debugger until it is waiting
for user input. Then click on the `Break' button in the debugger to
halt execution. Emacs should halt in `ZwUserGetMessage' waiting for
an input event. Use the `Call Stack' window to select the procedure
`w32_msg_pump' up the call stack (see below for why you have to do
this). Open the QuickWatch window and enter
"debug_print(Vexec_path)". Evaluating this expression will then print
out the contents of the Lisp variable `exec-path'.

If QuickWatch reports that the symbol is unknown, then check the call
stack in the `Call Stack' window. If the selected frame in the call
stack is not an Emacs procedure, then the debugger won't recognize
Emacs symbols. Instead, select a frame that is inside an Emacs
procedure and try using `debug_print' again.

If QuickWatch invokes debug_print but nothing happens, then check the
thread that is selected in the debugger. If the selected thread is
not the last thread to run (the "current" thread), then it cannot be
used to execute debug_print. Use the Debug menu to select the current
thread and try using debug_print again. Note that the debugger halts
execution (e.g., due to a breakpoint) in the context of the current
thread, so this should only be a problem if you've explicitly switched
threads.

It is also possible to keep appropriately masked and typecast Lisp
symbols in the Watch window; this is more convenient when stepping
through the code. For instance, on entering apply_lambda, you can
watch (struct Lisp_Symbol *) (0xfffffff & args[0]).
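
The mask in that expression keeps the low 28 bits of the Lisp_Object
and discards the bits above them, leaving a plain address. The
arithmetic can be sketched in the shell as follows. The function name
`untag' and the sample value are made up for this example, and the
mask is simply the one quoted above: the tag layout may differ in
other Emacs versions and builds.

```shell
# Clear the upper bits of a Lisp_Object value by keeping only the low
# 28 bits, using the same 0xfffffff mask as the Watch expression.
# $1 = raw Lisp_Object value (decimal, or hexadecimal with 0x prefix).
# (Hypothetical helper; the name is an assumption of this sketch.)
untag () {
    printf '0x%x\n' $(( $1 & 0xfffffff ))
}
```

For instance, `untag 0x18560258' prints 0x8560258: the leading hex
digit, which lies above the low 28 bits, is cleared and the rest is
unchanged.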

Optimizations often confuse the MS debugger. For example, the
debugger will sometimes report wrong line numbers, e.g., when it
prints the backtrace for a crash. It is usually best to look at the
disassembly to determine exactly what code is being run--the
disassembly will probably show several source lines followed by a
block of assembler for those lines. The actual point where Emacs
crashes will be one of those source lines, but not necessarily the one
that the debugger reports.

Another problematic area with the MS debugger is with variables that
are stored in registers: it will sometimes display wrong values for
those variables. Usually you will not be able to see any value for a
register variable, but if it is only being stored in a register
temporarily, you will see an old value for it. Again, you need to
look at the disassembly to determine which registers are being used,
and look at those registers directly, to see the actual current values
of these variables.

;;; arch-tag: fbf32980-e35d-481f-8e4c-a2eca2586e6b