<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>perldebguts - Guts of Perl debugging</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<link rev="made" href="mailto:" />
</head>
<body style="background-color: white">
<table border="0" width="100%" cellspacing="0" cellpadding="3">
<tr><td class="block" style="background-color: #cccccc" valign="middle">
<big><strong><span class="block"> perldebguts - Guts of Perl debugging</span></strong></big>
</td></tr>
</table>
<p><a name="__index__"></a></p>
<ul>
<li><a href="#name">NAME</a></li>
<li><a href="#description">DESCRIPTION</a></li>
<li><a href="#debugger_internals">Debugger Internals</a>
<ul>
<li><a href="#writing_your_own_debugger">Writing Your Own Debugger</a>
<ul>
<li><a href="#environment_variables">Environment Variables</a></li>
<li><a href="#debugger_internal_variables">Debugger internal variables</a></li>
<li><a href="#debugger_customization_functions">Debugger customization functions</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#frame_listing_output_examples">Frame Listing Output Examples</a></li>
<li><a href="#debugging_regular_expressions">Debugging regular expressions</a>
<ul>
<li><a href="#compiletime_output">Compile-time output</a></li>
<li><a href="#types_of_nodes">Types of nodes</a></li>
<li><a href="#runtime_output">Run-time output</a></li>
</ul>
</li>
<li><a href="#debugging_perl_memory_usage">Debugging Perl memory usage</a>
<ul>
<li><a href="#using__env_perl_debug_mstats_">Using <code>$ENV{PERL_DEBUG_MSTATS}</code></a></li>
</ul>
</li>
<li><a href="#see_also">SEE ALSO</a></li>
</ul>
<h1><a name="name">NAME</a></h1>
<p>perldebguts - Guts of Perl debugging</p>
<h1><a name="description">DESCRIPTION</a></h1>
<p>This is not the <code>perldebug(1)</code> manpage, which tells you how to use
the debugger. This manpage describes low-level details concerning
the debugger's internals, which range from difficult to impossible
to understand for anyone who isn't incredibly intimate with Perl's guts.</p>
<h1><a name="debugger_internals">Debugger Internals</a></h1>
<p>Perl has special debugging hooks at compile-time and run-time used
to create debugging environments. These hooks are not to be confused
with the <em>perl -Dxxx</em> command described in
<a href="file://C|\msysgit\mingw\html/pod/perlrun.html">the perlrun manpage</a>, which is
usable only if a special Perl is built per the instructions in the
<em>INSTALL</em> podpage in the Perl source tree.</p>
<p>For example, whenever you call Perl's built-in
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_caller"><code>caller</code></a> function
from the package <code>DB</code>, the arguments that the corresponding stack
frame was called with are copied to the <code>@DB::args</code> array. These
mechanisms are enabled by calling Perl with the <strong>-d</strong> switch.
Specifically, the following additional features are enabled
(cf. <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item___p">$^P in the perlvar manpage</a>):</p>
<p>Perl inserts the contents of <code>$ENV{PERL5DB}</code> (or
<code>BEGIN {require 'perl5db.pl'}</code> if not present) before the first line of your program.</p>
<p>Each array <code>@{"_&lt;$filename"}</code> holds the lines of $filename for a
file compiled by Perl. The same is also true for
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_eval"><code>eval</code></a>ed strings
that contain subroutines, or which are currently being executed.
The $filename for
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_eval"><code>eval</code></a>ed strings looks like <code>(eval 34)</code>.
Code assertions in regexes look like <code>(re_eval 19)</code>.</p>
<p>Values in this array are magical in numeric context: they compare
equal to zero only if the line is not breakable.</p>
<p>Each hash <code>%{"_&lt;$filename"}</code> contains breakpoints and actions keyed
by line number. Individual entries (as opposed to the whole hash)
are settable. Perl only cares about Boolean true here, although
the values used by <em>perl5db.pl</em> have the form
<code>"$break_condition\0$action"</code>.</p>
<p>The same holds for evaluated strings that contain subroutines, or
which are currently being executed. The $filename for
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_eval"><code>eval</code></a>ed strings
looks like <code>(eval 34)</code> or <code>(re_eval 19)</code>.</p>
<p>Each scalar <code>${"_&lt;$filename"}</code> contains <code>"_&lt;$filename"</code>. This is
also the case for evaluated strings that contain subroutines, or
which are currently being executed. The $filename for
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_eval"><code>eval</code></a>ed
strings looks like <code>(eval 34)</code> or <code>(re_eval 19)</code>.</p>
<p>After each
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_require"><code>require</code></a>d file is compiled, but before it is executed,
<code>DB::postponed(*{"_&lt;$filename"})</code> is called if the subroutine
<code>DB::postponed</code> exists. Here, the $filename is the expanded name of
the <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_require"><code>require</code></a>d file, as found in the values of %INC.</p>
<p>After each subroutine <code>subname</code> is compiled, the existence of
<code>$DB::postponed{subname}</code> is checked. If this key exists,
<code>DB::postponed(subname)</code> is called if the <code>DB::postponed</code> subroutine
also exists.</p>
<p>A hash <code>%DB::sub</code> is maintained, whose keys are subroutine names
and whose values have the form <code>filename:startline-endline</code>.
<code>filename</code> has the form <code>(eval 34)</code> for subroutines defined inside
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_eval"><code>eval</code></a>s, or <code>(re_eval 19)</code> for those within regex code assertions.</p>
<p>When the execution of your program reaches a point that can hold a
breakpoint, the <code>DB::DB()</code> subroutine is called if any of the variables
<code>$DB::trace</code>, <code>$DB::single</code>, or <code>$DB::signal</code> is true. These variables
are not <code>local</code>izable. This feature is disabled when executing
inside <code>DB::DB()</code>, including functions called from it,
unless <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item___d"><code>$^D &amp; (1&lt;&lt;30)</code></a> is true.</p>
<p>When execution of the program reaches a subroutine call, a call to
<code>&amp;DB::sub</code>(<em>args</em>) is made instead, with <code>$DB::sub</code> holding the
name of the called subroutine. (This doesn't happen if the subroutine
was compiled in the <code>DB</code> package.)</p>
<p>Note that if <code>&amp;DB::sub</code> needs external data for it to work, no
subroutine call is possible without it. As an example, the standard
debugger's <code>&amp;DB::sub</code> depends on the <code>$DB::deep</code> variable
(it defines how many levels of recursion deep into the debugger you can go
before a mandatory break). If <code>$DB::deep</code> is not defined, subroutine
calls are not possible, even though <code>&amp;DB::sub</code> exists.</p>
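<p>The <code>@DB::args</code> behaviour mentioned at the start of this section can be seen in isolation with a small sketch. The helper name <code>caller_args</code> and the sub <code>greet</code> are hypothetical and are not part of <em>perl5db.pl</em>. The sketch assumes that <code>caller</code> fills <code>@DB::args</code> whenever it is called with an argument, in list context, from code compiled in package <code>DB</code>, even without the <strong>-d</strong> switch (the Carp module relies on the same behaviour).</p>

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl5db.pl): report the arguments
# that the sub calling this helper was itself called with.
sub caller_args {
    package DB;              # caller() only fills @DB::args for code compiled in package DB
    my @frame = caller(1);   # frame 1 is the sub that called caller_args()
    return @DB::args;        # copy of the arguments recorded for that frame
}

sub greet {
    return join ' ', caller_args();
}

print greet('hello', 42), "\n";   # prints "hello 42"
```

<p>Note that <code>@DB::args</code> reflects the current state of the frame's argument list, so a sub that has already shifted its arguments may report fewer of them.</p>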
<h2><a name="writing_your_own_debugger">Writing Your Own Debugger</a></h2>
<h3><a name="environment_variables">Environment Variables</a></h3>
<p>The <code>PERL5DB</code> environment variable can be used to define a debugger.
For example, the minimal "working" debugger (it actually doesn't do anything)
consists of one line:</p>
<p>It can easily be defined like this:</p>
<pre>
  $ PERL5DB="sub DB::DB {}" perl -d your-script
</pre>
<p>Another brief debugger, slightly more useful, can be created
with only the line:</p>
<pre>
  sub DB::DB {print ++$i; scalar &lt;STDIN&gt;}
</pre>
<p>This debugger prints a number which increments for each statement
encountered and waits for you to hit a newline before continuing
to the next statement.</p>
<p>The following debugger is actually useful:</p>
<pre>
    sub sub {print ++$i, " $sub\n"; &amp;$sub}
</pre>
<p>It prints the sequence number of each subroutine call and the name of the
called subroutine. Note that <code>&amp;DB::sub</code> is being compiled into the
package <code>DB</code> through the use of the
<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_package"><code>package</code></a> directive.</p>
<p>When it starts, the debugger reads your rc file (<em>./.perldb</em> or
<em>~/.perldb</em> under Unix), which can set important options.
A subroutine (<code>&amp;afterinit</code>) can also be defined there; it is executed
after the debugger completes its own initialization.</p>
<p>After the rc file is read, the debugger reads the PERLDB_OPTS
environment variable and uses it to set debugger options. The
contents of this variable are treated as if they were the argument
of an <code>o ...</code> debugger command (q.v. in
<a href="file://C|\msysgit\mingw\html/pod/perldebug.html#options">Options in the perldebug manpage</a>).</p>
<h3><a name="debugger_internal_variables">Debugger internal variables</a></h3>
<p>In addition to the file and subroutine-related variables mentioned above,
the debugger also maintains various magical internal variables.</p>
<p><code>@DB::dbline</code> is an alias for <code>@{"::_&lt;current_file"}</code>, which
holds the lines of the currently-selected file (compiled by Perl), either
explicitly chosen with the debugger's
<a href="file://C|\msysgit\mingw\html/pod/perlguts.html#item_f"><code>f</code></a> command, or implicitly by flow
of execution.</p>
<p>Values in this array are magical in numeric context: they compare
equal to zero only if the line is not breakable.</p>
<p><code>%DB::dbline</code> is an alias for <code>%{"::_&lt;current_file"}</code>, which
contains breakpoints and actions keyed by line number in
the currently-selected file, either explicitly chosen with the
debugger's <a href="file://C|\msysgit\mingw\html/pod/perlguts.html#item_f"><code>f</code></a> command, or implicitly by flow of execution.</p>
<p>As previously noted, individual entries (as opposed to the whole hash)
are settable. Perl only cares about Boolean true here, although
the values used by <em>perl5db.pl</em> have the form
<code>"$break_condition\0$action"</code>.</p>
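<p>As a rough illustration of the <code>"$break_condition\0$action"</code> form, the following sketch builds one such value and splits it back into its two parts. The condition and action strings are made up for the example; only the NUL-separated layout comes from the text above.</p>

```perl
use strict;
use warnings;

# Hypothetical entry in the perl5db.pl style: condition, NUL byte, action.
my $entry = join "\0", '$x > 3', 'print "x=$x\n"';

# A limit of 2 keeps any later NUL bytes inside the action part.
my ($break_condition, $action) = split /\0/, $entry, 2;

print "condition: $break_condition\n";
print "action:    $action\n";
```

<p>Because Perl itself only tests such entries for truth, a custom debugger is free to store a different format in the same hash.</p>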
<h3><a name="debugger_customization_functions">Debugger customization functions</a></h3>
<p>Some functions are provided to simplify customization.</p>
<p><code>DB::parse_options(string)</code> parses debugger options; see
<a href="file://C|\msysgit\mingw\html/pod/perldebug.html#options">Options in the perldebug manpage</a> for a description of the options recognized.</p>
<p><code>DB::dump_trace(skip[,count])</code> skips the specified number of frames
and returns a list containing information about the calling frames (all
of them, if <code>count</code> is missing). Each entry is a reference to a hash
with keys <code>context</code> (either <code>.</code>, <code>$</code>, or <code>@</code>), <code>sub</code> (subroutine
name, or info about <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_eval"><code>eval</code></a>), <code>args</code>
(<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> or a reference to
an array), <code>file</code>, and <code>line</code>.</p>
<p><code>DB::print_trace(FH, skip[, count[, short]])</code> prints
formatted info about caller frames. The last two functions may be
convenient as arguments to the <code>&lt;</code> and <code>&lt;&lt;</code> commands.</p>
<p>Note that any variables and functions that are not documented in
this manpage (or in
<a href="file://C|\msysgit\mingw\html/pod/perldebug.html">the perldebug manpage</a>) are considered for internal
use only, and as such are subject to change without notice.</p>
<h1><a name="frame_listing_output_examples">Frame Listing Output Examples</a></h1>
<p>The <code>frame</code> option can be used to control the output of frame
information. For example, contrast this expression trace:</p>
<pre>
 Stack dump during die enabled outside of evals.
</pre>
<pre>
 Loading DB routines from perl5db.pl patch level 0.94
 Emacs support available.
</pre>
<pre>
 Enter h or `h h' for help.
</pre>
<pre>
   DB&lt;1&gt; sub foo { 14 }
</pre>
<pre>
   DB&lt;2&gt; sub bar { 3 }
</pre>
<pre>
   DB&lt;3&gt; t print foo() * bar()
 main::((eval 172):3): print foo() + bar();
 main::foo((eval 168):2):
 main::bar((eval 170):2):
</pre>
<p>with this one, once the
<a href="file://C|\msysgit\mingw\html/pod/perlguts.html#item_o"><code>o</code></a>ption <code>frame=2</code> has been set:</p>
<pre>
   DB&lt;5&gt; t print foo() * bar()
</pre>
<p>By way of demonstration, we present below a laborious listing
resulting from setting your <code>PERLDB_OPTS</code> environment variable to
the value <code>f=n N</code>, and running <em>perl -d -V</em> from the command line.
Examples using various values of
<a href="file://C|\msysgit\mingw\html/pod/perlguts.html#item_n"><code>n</code></a> are shown to give you a feel
for the difference between settings. Long though it may be, this
is not a complete listing, but only excerpts.</p>
293 entering Config::BEGIN
294 Package lib/Exporter.pm.
296 Package lib/Config.pm.
297 entering Config::TIEHASH
298 entering Exporter::import
299 entering Exporter::export
300 entering Config::myconfig
301 entering Config::FETCH
302 entering Config::FETCH
303 entering Config::FETCH
304 entering Config::FETCH
</pre>
308 entering Config::BEGIN
309 Package lib/Exporter.pm.
312 Package lib/Config.pm.
313 entering Config::TIEHASH
314 exited Config::TIEHASH
315 entering Exporter::import
316 entering Exporter::export
317 exited Exporter::export
318 exited Exporter::import
320 entering Config::myconfig
321 entering Config::FETCH
323 entering Config::FETCH
325 entering Config::FETCH
</pre>
328 in $=main::BEGIN() from /dev/null:
0
329 in $=Config::BEGIN() from lib/Config.pm:
2
330 Package lib/Exporter.pm.
332 Package lib/Config.pm.
333 in $=Config::TIEHASH('Config') from lib/Config.pm:
644
334 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:
0
335 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
336 in @=Config::myconfig() from /dev/null:
0
337 in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:
574
338 in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:
574
339 in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:
574
340 in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:
574
341 in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:
574
342 in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:
574</pre>
345 in $=main::BEGIN() from /dev/null:
0
346 in $=Config::BEGIN() from lib/Config.pm:
2
347 Package lib/Exporter.pm.
349 out $=Config::BEGIN() from lib/Config.pm:
0
350 Package lib/Config.pm.
351 in $=Config::TIEHASH('Config') from lib/Config.pm:
644
352 out $=Config::TIEHASH('Config') from lib/Config.pm:
644
353 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:
0
354 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
355 out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
356 out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:
0
357 out $=main::BEGIN() from /dev/null:
0
358 in @=Config::myconfig() from /dev/null:
0
359 in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:
574
360 out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:
574
361 in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:
574
362 out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:
574
363 in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:
574
364 out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:
574
365 in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:
574</pre>
368 in $=main::BEGIN() from /dev/null:
0
369 in $=Config::BEGIN() from lib/Config.pm:
2
370 Package lib/Exporter.pm.
372 out $=Config::BEGIN() from lib/Config.pm:
0
373 Package lib/Config.pm.
374 in $=Config::TIEHASH('Config') from lib/Config.pm:
644
375 out $=Config::TIEHASH('Config') from lib/Config.pm:
644
376 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:
0
377 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
378 out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
379 out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:
0
380 out $=main::BEGIN() from /dev/null:
0
381 in @=Config::myconfig() from /dev/null:
0
382 in $=Config::FETCH('Config=HASH(
0x1aa444)', 'package') from lib/Config.pm:
574
383 out $=Config::FETCH('Config=HASH(
0x1aa444)', 'package') from lib/Config.pm:
574
384 in $=Config::FETCH('Config=HASH(
0x1aa444)', 'baserev') from lib/Config.pm:
574
385 out $=Config::FETCH('Config=HASH(
0x1aa444)', 'baserev') from lib/Config.pm:
574</pre>
388 in $=CODE(
0x15eca4)() from /dev/null:
0
389 in $=CODE(
0x182528)() from lib/Config.pm:
2
390 Package lib/Exporter.pm.
391 out $=CODE(
0x182528)() from lib/Config.pm:
0
392 scalar context return from CODE(
0x182528): undef
393 Package lib/Config.pm.
394 in $=Config::TIEHASH('Config') from lib/Config.pm:
628
395 out $=Config::TIEHASH('Config') from lib/Config.pm:
628
396 scalar context return from Config::TIEHASH: empty hash
397 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:
0
398 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:
171
399 out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:
171
400 scalar context return from Exporter::export: ''
401 out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:
0
402 scalar context return from Exporter::import: ''
</pre>
<p>In all cases shown above, the line indentation shows the call tree.
If bit 2 of <code>frame</code> is set, a line is printed on exit from a
subroutine as well. If bit 4 is set, the arguments are printed
along with the caller info. If bit 8 is set, the arguments are
printed even if they are tied or references. If bit 16 is set, the
return value is printed, too.</p>
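<p>The bit meanings just listed can be summarized in a short sketch. It assumes that "bit 2", "bit 4" and so on refer to the mask values 2, 4, 8 and 16 of the <code>frame</code> option; the table and the <code>describe_frame</code> helper are hypothetical and only restate the paragraph above.</p>

```perl
use strict;
use warnings;

# Mask value => effect, as described in the text (assumed to be bit masks).
my @FRAME_BITS = (
    [  2 => 'print a line on exit from a subroutine' ],
    [  4 => 'print the arguments along with the caller info' ],
    [  8 => 'print arguments even if they are tied or references' ],
    [ 16 => 'print the return value' ],
);

# Hypothetical helper: list the effects enabled by a given frame value.
sub describe_frame {
    my ($frame) = @_;
    return map { $_->[1] } grep { $frame & $_->[0] } @FRAME_BITS;
}

print "frame=6 enables:\n";
print "  $_\n" for describe_frame(6);
```

<p>For instance, a value of 6 combines the exit lines (2) with the argument printing (4).</p>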
<p>When a package is compiled, a line like this</p>
<pre>
    Package lib/Carp.pm.
</pre>
<p>is printed with proper indentation.</p>
<h1><a name="debugging_regular_expressions">Debugging regular expressions</a></h1>
<p>There are two ways to enable debugging output for regular expressions.</p>
<p>If your perl is compiled with <code>-DDEBUGGING</code>, you may use the
<strong>-Dr</strong> flag on the command line.</p>
<p>Otherwise, one can <code>use re 'debug'</code>, which has effects at
compile time and run time. It is not lexically scoped.</p>
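<p>A minimal sketch of the second approach follows. It uses the pattern and target string that appear in the examples later in this document. The debugging output goes to standard error, and its exact layout, as well as the scoping of the pragma, may differ between Perl releases, so treat it as an illustration rather than a reference.</p>

```perl
use strict;
use warnings;
use re 'debug';   # regex debugging output is written to STDERR

# Pattern and string taken from the examples in this document.
my $matched = 'abcdefg__gh__' =~ /[bc]d(ef*g)+h[ij]k$/ ? 1 : 0;

print "matched: $matched\n";   # the string does not match, so this prints 0
```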
<h2><a name="compiletime_output">Compile-time output</a></h2>
<p>The debugging output at compile time looks like this:</p>
<pre>
  Compiling REx `[bc]d(ef*g)+h[ij]k$'
  size 45 Got 364 bytes for offset annotations.
    12: EXACT &lt;d&gt;(14)
    14: CURLYX[0] {1,32767}(28)
    18: EXACT &lt;e&gt;(20)
    21: EXACT &lt;f&gt;(0)
    23: EXACT &lt;g&gt;(25)
    29: EXACT &lt;h&gt;(31)
    42: EXACT &lt;k&gt;(44)
  anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
                                    stclass `ANYOF[bc]' minlen 7
  1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
  0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
  11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
  0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
  Omitting $` $&amp; $' support.
</pre>
<p>The first line shows the pre-compiled form of the regex. The second
shows the size of the compiled form (in arbitrary units, usually
4-byte words) and the total number of bytes allocated for the
offset/length table, usually
4+<a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_size"><code>size</code></a>*8. The next line shows the
label <em>id</em> of the first node that does a match.</p>
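<p>The "usual" size of the offset/length table can be checked against the figures in the sample output: a size of 45 gives 4 + 45 * 8 = 364 bytes, which is the number reported. As a tiny sketch (the variable names are arbitrary):</p>

```perl
use strict;
use warnings;

my $size  = 45;              # "size 45" from the sample output
my $bytes = 4 + $size * 8;   # the usual formula quoted in the text

print "$bytes\n";            # 364, matching "Got 364 bytes for offset annotations"
```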
<p>The</p>
<pre>
  anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
                                    stclass `ANYOF[bc]' minlen 7
</pre>
<p>line (split into two lines above) contains optimizer
information. In the example shown, the optimizer found that the match
should contain a substring <code>de</code> at offset 1, plus substring <code>gh</code>
at some offset between 3 and infinity. Moreover, when checking for
these substrings (to abandon impossible matches quickly), Perl will check
for the substring <code>gh</code> before checking for the substring <code>de</code>. The
optimizer may also use the knowledge that the match starts (at the
<code>first</code> <em>id</em>) with a character class, and no string
shorter than 7 characters can possibly match.</p>
475 <p>The fields of interest which may appear in this line are
</p>
477 <dt><strong><a name=
"item_anchored_string_at_pos"><a href=
"#item_anchored"><code>anchored
</code></a> <em>STRING
</em> <code>at
</code> <em>POS
</em></a></strong>
479 <dt><strong><a name=
"item_floating_string_at_pos1_2e_2epos2"><code>floating
</code> <em>STRING
</em> <code>at
</code> <em>POS1..POS2
</em></a></strong>
485 <dt><strong><a name=
"item_matching_floating_2fanchored"><code>matching floating/anchored
</code></a></strong>
488 <p>Which substring to check first.
</p>
491 <dt><strong><a name=
"item_minlen"><code>minlen
</code></a></strong>
494 <p>The minimal length of the match.
</p>
497 <dt><strong><a name=
"item_stclass"><code>stclass
</code> <em>TYPE
</em></a></strong>
500 <p>Type of first matching node.
</p>
503 <dt><strong><a name=
"item_noscan"><code>noscan
</code></a></strong>
506 <p>Don't scan for the found substrings.
</p>
509 <dt><strong><a name=
"item_isall"><code>isall
</code></a></strong>
<p>Means that the optimizer information is all that the regular
expression contains, and thus one does not need to enter the regex engine at
all.</p>
517 <dt><strong><a name=
"item_gpos"><code>GPOS
</code></a></strong>
520 <p>Set if the pattern contains
<code>\G
</code>.
</p>
523 <dt><strong><a name=
"item_plus"><code>plus
</code></a></strong>
526 <p>Set if the pattern starts with a repeated char (as in
<code>x+y
</code>).
</p>
529 <dt><strong><a name=
"item_implicit"><code>implicit
</code></a></strong>
532 <p>Set if the pattern starts with
<code>.*
</code>.
</p>
535 <dt><strong><a name=
"item_with_eval"><code>with eval
</code></a></strong>
<p>Set if the pattern contains eval-groups, such as <code>(?{ code })</code> and
<code>(??{ code })</code>.</p>
542 <dt><strong><a name=
"item_anchored"><code>anchored(TYPE)
</code></a></strong>
<p>Set if the pattern may match only at a handful of places, with <code>TYPE</code>
being <code>BOL</code>, <code>MBOL</code>, or <a href="#item_gpos"><code>GPOS</code></a>. See the table below.</p>
<p>If a substring is known to match at end-of-line only, it may be
followed by <code>$</code>, as in <code>floating `k'$</code>.</p>
<p>The optimizer-specific information is used to avoid entering the (slow) regex
engine on strings that definitely will not match. If the <a href="#item_isall"><code>isall</code></a> flag
is set, a call to the regex engine may be avoided even when the optimizer
found an appropriate place for the match.</p>
<p>Above the optimizer section is the list of <em>nodes</em> of the compiled
form of the regex. Each line has the format</p>
<p><code> </code><em>id</em>: <em>TYPE</em> <em>OPTIONAL-INFO</em> (<em>next-id</em>)</p>
<h2><a name="types_of_nodes">Types of nodes</a></h2>
<p>Here are the possible types, with short descriptions:</p>
<pre>
    # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
</pre>
<pre>
    END        no        End of program.
    SUCCEED    no        Return from a subroutine, basically.
</pre>
<pre>
    BOL        no        Match "" at beginning of line.
    MBOL       no        Same, assuming multiline.
    SBOL       no        Same, assuming singleline.
    EOS        no        Match "" at end of string.
    EOL        no        Match "" at end of line.
    MEOL       no        Same, assuming multiline.
    SEOL       no        Same, assuming singleline.
    BOUND      no        Match "" at any word boundary
    BOUNDL     no        Match "" at any word boundary
    NBOUND     no        Match "" at any word non-boundary
    NBOUNDL    no        Match "" at any word non-boundary
    GPOS       no        Matches where last m//g left off.
</pre>
<pre>
    # [Special] alternatives
    ANY        no        Match any one character (except newline).
    SANY       no        Match any one character.
    ANYOF      sv        Match character in (or not in) this class.
    ALNUM      no        Match any alphanumeric character
    ALNUML     no        Match any alphanumeric char in locale
    NALNUM     no        Match any non-alphanumeric character
    NALNUML    no        Match any non-alphanumeric char in locale
    SPACE      no        Match any whitespace character
    SPACEL     no        Match any whitespace char in locale
    NSPACE     no        Match any non-whitespace character
    NSPACEL    no        Match any non-whitespace char in locale
    DIGIT      no        Match any numeric character
    NDIGIT     no        Match any non-numeric character
</pre>
<pre>
    # BRANCH   The set of branches constituting a single choice are hooked
    #          together with their "next" pointers, since precedence prevents
    #          anything being concatenated to any individual branch. The
    #          "next" pointer of the last BRANCH in a choice points to the
    #          thing following the whole choice. This is also where the
    #          final "next" pointer of each individual branch points; each
    #          branch starts with the operand node of a BRANCH node.
    BRANCH     node      Match this alternative, or the next...
</pre>
<pre>
    # BACK     Normal "next" pointers all implicitly point forward; BACK
    #          exists to make loop structures possible.
    BACK       no        Match "", "next" ptr points backward.
</pre>
<pre>
    EXACT      sv        Match this string (preceded by length).
    EXACTF     sv        Match this string, folded (prec. by length).
    EXACTFL    sv        Match this string, folded in locale (w/len).
</pre>
<pre>
    NOTHING    no        Match empty string.
    # A variant of above which delimits a group, thus stops optimizations
    TAIL       no        Match empty string. Can jump here from outside.
</pre>
<pre>
    # STAR,PLUS '?', and complex '*' and '+', are implemented as circular
    #          BRANCH structures using BACK. Simple cases (one character
    #          per match) are implemented with STAR and PLUS for speed
    #          and to minimize recursive plunges.
    STAR       node      Match this (simple) thing 0 or more times.
    PLUS       node      Match this (simple) thing 1 or more times.
</pre>
<pre>
    CURLY      sv   2    Match this simple thing {n,m} times.
    CURLYN     no   2    Match next-after-this simple thing
    #                    {n,m} times, set parens.
    CURLYM     no   2    Match this medium-complex thing {n,m} times.
    CURLYX     sv   2    Match this complex thing {n,m} times.
</pre>
<pre>
    # This terminator creates a loop structure for CURLYX
    WHILEM     no        Do curly processing and see if rest matches.
</pre>
<pre>
    # OPEN,CLOSE,GROUPP ...are numbered at compile time.
    OPEN       num  1    Mark this point in input as start of #n.
    CLOSE      num  1    Analogous to OPEN.
</pre>
<pre>
    REF        num  1    Match some already matched string
    REFF       num  1    Match already matched string, folded
    REFFL      num  1    Match already matched string, folded in loc.
</pre>
<pre>
    # grouping assertions
    IFMATCH    off  1 2  Succeeds if the following matches.
    UNLESSM    off  1 2  Fails if the following matches.
    SUSPEND    off  1 1  "Independent" sub-regex.
    IFTHEN     off  1 1  Switch, should be preceded by switcher.
    GROUPP     num  1    Whether the group matched.
</pre>
<pre>
    # Support for long regex
    LONGJMP    off  1 1  Jump far away.
    BRANCHJ    off  1 1  BRANCH with long offset.
</pre>
<pre>
    EVAL       evl  1    Execute some Perl code.
</pre>
<pre>
    MINMOD     no        Next operator is not greedy.
    LOGICAL    no        Next opcode should set the flag only.
</pre>
<pre>
    # This is not used yet
    RENUM      off  1 1  Group with independently numbered parens.
</pre>
<pre>
    # This is not really a node, but an optimized away piece of a "long" node.
    # To simplify debugging output, we mark it as if it were a node
    OPTIMIZED  off       Placeholder for dump.
</pre>
<p>Following the optimizer information is a dump of the offset/length
table, here split across several lines:</p>
<pre>
  1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
  0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
  11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
  0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
</pre>
<p>The first line here indicates that the offset/length table contains 45
entries. Each entry is a pair of integers, denoted by <code>offset[length]</code>.
Entries are numbered starting with 1, so entry #1 here is <code>1[4]</code> and
entry #12 is <code>5[1]</code>. <code>1[4]</code> indicates that the node labeled <code>1:</code>
(the <code>1: ANYOF[bc]</code>) begins at character position 1 in the
pre-compiled form of the regex, and has a length of 4 characters.
<code>5[1]</code> in position 12
indicates that the node labeled <code>12:</code>
(the <code>12: EXACT &lt;d&gt;</code>) begins at character position 5 in the
pre-compiled form of the regex, and has a length of 1 character.
<code>12[1]</code> in position 14
indicates that the node labeled <code>14:</code>
(the <code>14: CURLYX[0] {1,32767}</code>) begins at character position 12 in the
pre-compiled form of the regex, and has a length of 1 character; that
is, it corresponds to the <code>+</code> symbol in the precompiled regex.</p>
<p><code>0[0]</code> items indicate that there is no corresponding node.</p>
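<p>The mapping from table entries back to the regex source can be sketched as follows. The sketch takes the first 14 entries of the table above and the example regex, and looks up the source text for a given node label. The helper name <code>node_source</code> is hypothetical; only the <code>offset[length]</code> notation and the 1-based numbering come from the text.</p>

```perl
use strict;
use warnings;

my $regex = '[bc]d(ef*g)+h[ij]k$';

# First 14 entries of the offset/length table shown above.
my $dump = '1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1] 0[0] 12[1]';

# Collect [offset, length] pairs; entry N describes the node labeled "N:".
my @pairs;
push @pairs, [ $1, $2 ] while $dump =~ /(\d+)\[(\d+)\]/g;

# Hypothetical helper: source text of a node, or nothing for 0[0] entries.
sub node_source {
    my ($node) = @_;
    my ($offset, $length) = @{ $pairs[ $node - 1 ] };
    return unless $offset;                         # 0[0]: no corresponding node
    return substr($regex, $offset - 1, $length);   # positions are 1-based
}

for my $node (1, 12, 14) {
    printf "node %2d -> %s\n", $node, node_source($node);
}
```

<p>With the data above this reports <code>[bc]</code> for node 1, <code>d</code> for node 12 and <code>+</code> for node 14, in line with the explanation in the text.</p>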
<h2><a name="runtime_output">Run-time output</a></h2>
<p>First of all, when doing a match, one may get no run-time output even
if debugging is enabled. This means that the regex engine was never
entered and that all of the job was therefore done by the optimizer.</p>
<p>If the regex engine was entered, the output may look like this:</p>
<pre>
  Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__'
  Setting an EVAL scope, savestack=3
   2 &lt;ab&gt; &lt;cdefg__gh_&gt; | 1: ANYOF
   3 &lt;abc&gt; &lt;defg__gh_&gt; | 11: EXACT &lt;d&gt;
   4 &lt;abcd&gt; &lt;efg__gh_&gt; | 13: CURLYX {1,32767}
   4 &lt;abcd&gt; &lt;efg__gh_&gt; | 26: WHILEM
   0 out of 1..32767 cc=effff31c
   4 &lt;abcd&gt; &lt;efg__gh_&gt; | 15: OPEN1
   4 &lt;abcd&gt; &lt;efg__gh_&gt; | 17: EXACT &lt;e&gt;
   5 &lt;abcde&gt; &lt;fg__gh_&gt; | 19: STAR
   EXACT &lt;f&gt; can match 1 times out of 32767...
  Setting an EVAL scope, savestack=3
   6 &lt;bcdef&gt; &lt;g__gh__&gt; | 22: EXACT &lt;g&gt;
   7 &lt;bcdefg&gt; &lt;__gh__&gt; | 24: CLOSE1
   7 &lt;bcdefg&gt; &lt;__gh__&gt; | 26: WHILEM
   1 out of 1..32767 cc=effff31c
  Setting an EVAL scope, savestack=12
   7 &lt;bcdefg&gt; &lt;__gh__&gt; | 15: OPEN1
   7 &lt;bcdefg&gt; &lt;__gh__&gt; | 17: EXACT &lt;e&gt;
   restoring \1 to 4(4)..7
   failed, try continuation...
   7 &lt;bcdefg&gt; &lt;__gh__&gt; | 27: NOTHING
   7 &lt;bcdefg&gt; &lt;__gh__&gt; | 28: EXACT &lt;h&gt;
</pre>
<p>The most significant information in the output is about the particular <em>node</em>
of the compiled regex that is currently being tested against the target string.
The format of these lines is:
</p>
<p><code>    </code><em>STRING-OFFSET</em> &lt;<em>PRE-STRING</em>&gt; &lt;<em>POST-STRING</em>&gt; |<em>ID</em>: <em>TYPE</em></p>
<p>The <em>TYPE</em> info is indented according to the backtracking level.
Other incidental information appears interspersed among these lines.
</p>
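<p>The line format above lends itself to mechanical parsing, for example when filtering a long trace. The following Python sketch is an illustration only: the names <code>TRACE_LINE</code> and <code>parse_trace_line</code> are hypothetical, and the pattern assumes that neither the pre-string nor the post-string contains a <code>&gt;</code> character. Lines carrying incidental information do not match and yield <code>None</code>.</p>

```python
import re

# STRING-OFFSET <PRE-STRING> <POST-STRING> |ID: TYPE
# Assumption: PRE-STRING and POST-STRING contain no '>' character.
TRACE_LINE = re.compile(
    r"^\s*(?P<offset>\d+)\s+<(?P<pre>[^>]*)>\s+<(?P<post>[^>]*)>"
    r"\s*\|\s*(?P<id>\d+):(?P<indent>\s*)(?P<type>.*)$"
)


def parse_trace_line(line):
    """Return the fields of one node line, or None for incidental lines."""
    m = TRACE_LINE.match(line)
    if m is None:
        return None
    return {
        "offset": int(m.group("offset")),
        "pre": m.group("pre"),
        "post": m.group("post"),
        "id": int(m.group("id")),
        # Width of the indentation before TYPE; it grows with the
        # backtracking level.
        "indent": len(m.group("indent")),
        "type": m.group("type").strip(),
    }


print(parse_trace_line("   3 <abc> <defg__gh_>    | 11:   EXACT <d>"))
print(parse_trace_line("  Setting an EVAL scope, savestack=3"))
```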
<h1><a name="debugging_perl_memory_usage">Debugging Perl memory usage</a></h1>
<p>Perl is a profligate wastrel when it comes to memory use. There
is a saying that to estimate memory usage of Perl, assume a reasonable
algorithm for memory allocation, multiply that estimate by 10, and
while you still may miss the mark, at least you won't be quite so
astonished. This is not strictly true, but it gives a good
sense of what happens.
</p>
<p>Assume that an integer cannot take less than 20 bytes of memory, a
float cannot take less than 24 bytes, a string cannot take less
than 32 bytes (all these examples assume 32-bit architectures; the
results are quite a bit worse on 64-bit architectures). If a variable
is accessed in two of three different ways (which require an integer,
a float, or a string), the memory footprint may increase by yet another
20 bytes. A sloppy <code>malloc(3)</code> implementation can inflate these
numbers dramatically.
</p>
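<p>To get a feel for what these minimums mean at scale, the arithmetic can be written out. The Python sketch below only multiplies the figures quoted above; the names <code>MIN_BYTES</code>, <code>SECOND_TYPE_BYTES</code> and <code>min_footprint</code> are hypothetical, and the result is a rough floor for 32-bit builds, not a measurement of any particular perl.</p>

```python
# Per-value minimums quoted in the text above (32-bit builds only).
MIN_BYTES = {"integer": 20, "float": 24, "string": 32}
# Possible extra cost when a value is also accessed as a second type.
SECOND_TYPE_BYTES = 20


def min_footprint(kind, count=1, second_type=False):
    """Rough lower bound, in bytes, for `count` scalar values of one kind."""
    per_value = MIN_BYTES[kind]
    if second_type:
        per_value += SECOND_TYPE_BYTES
    return count * per_value


print(min_footprint("integer", 1_000_000))                   # 20000000
print(min_footprint("string", 1_000_000, second_type=True))  # 52000000
```

<p>So a million integers need at least about 20 MB before any <code>malloc(3)</code> overhead is counted.</p>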
<p>On the opposite end of the scale, a declaration like
</p>
<p>may take up to 500 bytes of memory, depending on which release of Perl
<p>Anecdotal estimates of source-to-compiled code bloat suggest an
eightfold increase. This means that the compiled form of reasonable
(normally commented, properly indented etc.) code will take
about eight times more space in memory than the code took
<p>The <strong>-DL</strong> command-line switch has been obsolete since around Perl 5.6.0
(it was available only if Perl was built with <code>-DDEBUGGING</code>).
The switch was used to track Perl's memory allocations and possible
memory leaks. These days the use of malloc debugging tools like
<em>Purify</em> or <em>valgrind</em> is suggested instead.
</p>
<p>One way to find out how much memory is being used by Perl data
structures is to install the Devel::Size module from CPAN: it gives
you the minimum number of bytes required to store a particular data
structure. Please be mindful of the difference between <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_size"><code>size()</code></a>
and <code>total_size()</code>.
</p>
<p>If Perl has been compiled using Perl's malloc, you can analyze Perl
memory usage by setting <code>$ENV{PERL_DEBUG_MSTATS}</code>.
</p>
<h2><a name="using__env_perl_debug_mstats_">Using <code>$ENV{PERL_DEBUG_MSTATS}</code></a></h2>
<p>If your perl is using Perl's <code>malloc()</code> and was compiled with the
necessary switches (this is the default), then it will print memory
usage statistics after compiling your code when
<code>$ENV{PERL_DEBUG_MSTATS} &gt; 1</code>, and before termination of the program when
<code>$ENV{PERL_DEBUG_MSTATS} &gt;= 1</code>. The report format is similar to
the following example:
</p>
$ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
   14216 free:   130   117    28     7     9   0   2     2   1 0 0
   60924 used:   125   137   161    55     7   8   6    16   2 0 1
Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
Memory allocation statistics after execution:   (buckets 4(4)..8188(8192)
   30888 free:   245    78    85    13     6   2   1     3   2 0 1
  175816 used:   265   176  1112   111    26  22  11    27   2 1 1
Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
</pre>
<p>It is possible to ask for such statistics at arbitrary points in
your execution using the <code>mstat()</code> function from the standard
Devel::Peek module.
</p>
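<p>When many such reports have to be compared, it can help to pull the headline numbers out of the text. The Python sketch below is an illustration only: the names <code>SBRK_LINE</code>, <code>TOTAL_LINE</code> and <code>summarize</code> are hypothetical, and the patterns are written against the sample report shown above, so other perl builds may print something slightly different. It also adds up the free, used and "odd ends" figures; for the sample that sum equals the sbrk total, which is an observation about this example and not a documented guarantee.</p>

```python
import re

# "Total sbrk(): SBRKed/SBRKs:CONTINUOUS. Odd ends: pad+heads+chain+tail: a+b+c+d."
SBRK_LINE = re.compile(
    r"Total sbrk\(\): (?P<sbrked>\d+)/(?P<sbrks>\d+):(?P<continuous>-?\d+)\. "
    r"Odd ends: pad\+heads\+chain\+tail: "
    r"(?P<pad>\d+)\+(?P<heads>\d+)\+(?P<chain>\d+)\+(?P<tail>\d+)\."
)
# Leading byte totals of the "free:" and "used:" rows.
TOTAL_LINE = re.compile(r"^\s*(\d+)\s+(free|used):", re.MULTILINE)


def summarize(report):
    """Extract the byte totals from one block of an mstats-style report."""
    m = SBRK_LINE.search(report)
    if m is None:
        raise ValueError("no 'Total sbrk()' line found")
    fields = {name: int(value) for name, value in m.groupdict().items()}
    for amount, kind in TOTAL_LINE.findall(report):
        fields[kind] = int(amount)
    odd_ends = fields["pad"] + fields["heads"] + fields["chain"] + fields["tail"]
    fields["accounted"] = fields.get("free", 0) + fields.get("used", 0) + odd_ends
    return fields


report = """\
Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
   30888 free:   245    78    85    13     6   2   1     3   2 0 1
  175816 used:   265   176  1112   111    26  22  11    27   2 1 1
Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
"""
summary = summarize(report)
print(summary["sbrked"], summary["accounted"])  # 215040 215040
```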
<p>Here is some explanation of that format:
</p>
<dt><strong><a name="item_smallest"><code>buckets SMALLEST(APPROX)..GREATEST(APPROX)</code></a></strong>
<p>Perl's <code>malloc()</code> uses bucketed allocations. Every request is rounded
up to the closest bucket size available, and a bucket is taken from
the pool of buckets of that size.
</p>
<p>The line above describes the limits of buckets currently in use.
Each bucket has two sizes: memory footprint and the maximal size
of user data that can fit into this bucket. In the example above,
the smallest bucket has size 4 for both. The biggest bucket
has a usable size of 8188 and a memory footprint of 8192.
</p>
<p>In a Perl built for debugging, some buckets may have negative usable
size. This means that these buckets cannot (and will not) be used.
For larger buckets, the memory footprint may be one page greater
than a power of 2. If so, the corresponding power of two is
printed in the <code>APPROX</code> field above.
</p>
<dt><strong><a name="item_free_2fused">Free/Used</a></strong>
<p>The 1 or 2 rows of numbers following that correspond to the number
of buckets of each size between <a href="#item_smallest"><code>SMALLEST</code></a> and <code>GREATEST</code>. In
the first row, the sizes (memory footprints) of buckets are powers
of two, or possibly one page greater. In the second row, if present,
the memory footprints of the buckets lie between the memory footprints
of the two buckets above them.
</p>
<p>For example, suppose under the previous example, the memory footprints
free:    8 16 32 64 128 256 512 1024 2048 4096 8192
<p>With a non-<code>DEBUGGING</code> perl, the buckets starting from <code>128</code> have
a 4-byte overhead, so an 8192-byte bucket can hold allocations of up to
8188 bytes.
</p>
<dt><strong><a name="item_sbrk"><code>Total sbrk(): SBRKed/SBRKs:CONTINUOUS</code></a></strong>
<p>The first two fields give the total amount of memory perl sbrk(2)ed
(ess-broken? :-) and the number of sbrk(2)s used. The third number is
what perl thinks about the continuity of the returned chunks. So long as
this number is positive, <code>malloc()</code> will assume that it is probable
that <a href="#item_sbrk"><code>sbrk(2)</code></a> will provide continuous memory.
</p>
<p>Memory allocated by external libraries is not counted.
</p>
<dt><strong><a name="item_pad_3a_0"><code>pad: 0</code></a></strong>
<p>The amount of sbrk(2)ed memory needed to keep buckets aligned.
</p>
<dt><strong><a name="item_heads_3a_2192"><code>heads: 2192</code></a></strong>
<p>Although the memory overhead of bigger buckets is kept inside the bucket, for
smaller buckets it is kept in separate areas. This field gives the
total size of these areas.
</p>
<dt><strong><a name="item_chain_3a_0"><code>chain: 0</code></a></strong>
<p><code>malloc()</code> may want to subdivide a bigger bucket into smaller buckets.
If only a part of the deceased bucket is left unsubdivided, the rest
is kept as an element of a linked list. This field gives the total
size of these chunks.
</p>
<dt><strong><a name="item_tail_3a_6144"><code>tail: 6144</code></a></strong>
<p>To minimize the number of sbrk(2)s, <code>malloc()</code> asks for more memory. This
field gives the size of the yet unused part, which is sbrk(2)ed, but
<h1><a name="see_also">SEE ALSO</a></h1>
<p><a href="file://C|\msysgit\mingw\html/pod/perldebug.html">the perldebug manpage</a>,
<a href="file://C|\msysgit\mingw\html/pod/perlguts.html">the perlguts manpage</a>,
<a href="file://C|\msysgit\mingw\html/pod/perlrun.html">the perlrun manpage</a>,
<a href="file://C|\msysgit\mingw\html/lib/re.html">the re manpage</a>,
<a href="file://C|\msysgit\mingw\html/lib/Devel/DProf.html">the Devel::DProf manpage</a>.
</p>
<table border="0" width="100%" cellspacing="0" cellpadding="3">
<tr><td class="block" style="background-color: #cccccc" valign="middle">
<big><strong><span class="block"> perldebguts - Guts of Perl debugging</span></strong></big>