strip old $Id tags
[bioperl-live.git] / t / lib / Tree / DAG_Node.pm
blob3cb16eee90fa04e99d5093a39439eb1e513533cb
2 # -*-Perl-*- Time-stamp: "2004-12-29 18:17:58 AST"
4 require 5;
5 package Tree::DAG_Node;
6 use Carp ();
7 use strict;
8 use vars qw(@ISA $Debug $VERSION);
10 $Debug = 0;
11 $VERSION = "1.05";
13 =head1 NAME
15 Tree::DAG_Node - (super)class for representing nodes in a tree
17 =head1 SYNOPSIS
19 Using as a base class:
21 package Game::Tree::Node; # or whatever you're doing
22 use Tree::DAG_Node;
23 @ISA = qw(Tree::DAG_Node);
24 ...your own methods overriding/extending
25 the methods in Tree::DAG_Node...
27 Using as a class of its own:
29 use Tree::DAG_Node;
30 my $root = Tree::DAG_Node->new();
31 $root->name("I'm the tops");
32 my $new_daughter = $root->new_daughter;
33 $new_daughter->name("More");
34 ...
36 =head1 DESCRIPTION
38 This class encapsulates/makes/manipulates objects that represent nodes
39 in a tree structure. The tree structure is not an object itself, but
40 is emergent from the linkages you create between nodes. This class
41 provides the methods for making linkages that can be used to build up
42 a tree, while preventing you from ever making any kinds of linkages
43 which are not allowed in a tree (such as having a node be its own
44 mother or ancestor, or having a node have two mothers).
46 This is what I mean by a "tree structure", a bit redundantly stated:
48 * A tree is a special case of an acyclic directed graph.
50 * A tree is a network of nodes where there's exactly one root
51 node (i.e., 'the top'), and the only primary relationship between nodes
52 is the mother-daugher relationship.
54 * No node can be its own mother, or its mother's mother, etc.
56 * Each node in the tree has exactly one "parent" (node in the "up"
57 direction) -- except the root, which is parentless.
59 * Each node can have any number (0 to any finite number) of daughter
60 nodes. A given node's daughter nodes constitute an I<ordered> list.
61 (However, you are free to consider this ordering irrelevant.
62 Some applications do need daughters to be ordered, so I chose to
63 consider this the general case.)
65 * A node can appear in only one tree, and only once in that tree.
66 Notably (notable because it doesn't follow from the two above points),
67 a node cannot appear twice in its mother's daughter list.
69 * In other words, there's an idea of up (toward the root) versus
70 down (away from the root), and left (i.e., toward the start (index 0)
71 of a given node's daughter list) versus right (toward the end of a
72 given node's daughter list).
74 Trees as described above have various applications, among them:
75 representing syntactic constituency, in formal linguistics;
76 representing contingencies in a game tree; representing abstract
77 syntax in the parsing of any computer language -- whether in
78 expression trees for programming languages, or constituency in the
79 parse of a markup language document. (Some of these might not use the
80 fact that daughters are ordered.)
82 (Note: B-Trees are a very special case of the above kinds of trees,
83 and are best treated with their own class. Check CPAN for modules
84 encapsulating B-Trees; or if you actually want a database, and for
85 some reason ended up looking here, go look at L<AnyDBM_File>.)
87 Many base classes are not usable except as such -- but Tree::DAG_Node
88 can be used as a normal class. You can go ahead and say:
90 use Tree::DAG_Node;
91 my $root = Tree::DAG_Node->new();
92 $root->name("I'm the tops");
93 $new_daughter = Tree::DAG_Node->new();
94 $new_daughter->name("More");
95 $root->add_daughter($new_daughter);
97 and so on, constructing and linking objects from Tree::DAG_Node and
98 making useful tree structures out of them.
100 =head1 A NOTE TO THE READER
102 This class is big and provides lots of methods. If your problem is
103 simple (say, just representing a simple parse tree), this class might
104 seem like using an atomic sledgehammer to swat a fly. But the
105 complexity of this module's bells and whistles shouldn't detract from
106 the efficiency of using this class for a simple purpose. In fact, I'd
107 be very surprised if any one user ever had use for more that even a
108 third of the methods in this class. And remember: an atomic
109 sledgehammer B<will> kill that fly.
111 =head1 OBJECT CONTENTS
113 Implementationally, each node in a tree is an object, in the sense of
114 being an arbitrarily complex data structure that belongs to a class
115 (presumably Tree::DAG_Node, or ones derived from it) that provides
116 methods.
118 The attributes of a node-object are:
120 =over
122 =item mother -- this node's mother. undef if this is a root.
124 =item daughters -- the (possibly empty) list of daughters of this node.
126 =item name -- the name for this node.
128 Need not be unique, or even printable. This is printed in some of the
129 various dumper methods, but it's up to you if you don't put anything
130 meaningful or printable here.
132 =item attributes -- whatever the user wants to use it for.
134 Presumably a hashref to whatever other attributes the user wants to
135 store without risk of colliding with the object's real attributes.
136 (Example usage: attributes to an SGML tag -- you definitely wouldn't
137 want the existence of a "mother=foo" pair in such a tag to collide with
138 a node object's 'mother' attribute.)
140 Aside from (by default) initializing it to {}, and having the access
141 method called "attributes" (described a ways below), I don't do
142 anything with the "attributes" in this module. I basically intended
143 this so that users who don't want/need to bother deriving a class
144 from Tree::DAG_Node, could still attach whatever data they wanted in a
145 node.
147 =back
149 "mother" and "daughters" are attributes that relate to linkage -- they
150 are never written to directly, but are changed as appropriate by the
151 "linkage methods", discussed below.
153 The other two (and whatever others you may add in derived classes) are
154 simply accessed thru the same-named methods, discussed further below.
156 =head2 ABOUT THE DOCUMENTED INTERFACE
158 Stick to the documented interface (and comments in the source --
159 especially ones saying "undocumented!" and/or "disfavored!" -- do not
160 count as documentation!), and don't rely on any behavior that's not in
161 the documented interface.
163 Specifically, unless the documentation for a particular method says
164 "this method returns thus-and-such a value", then you should not rely on
165 it returning anything meaningful.
167 A I<passing> acquintance with at least the broader details of the source
168 code for this class is assumed for anyone using this class as a base
169 class -- especially if you're overriding existing methods, and
170 B<definitely> if you're overriding linkage methods.
172 =head1 MAIN CONSTRUCTOR, AND INITIALIZER
174 =over
176 =item the constructor CLASS->new() or CLASS->new({...options...})
178 This creates a new node object, calls $object->_init({...options...})
179 to provide it sane defaults (like: undef name, undef mother, no
180 daughters, 'attributes' setting of a new empty hashref), and returns
181 the object created. (If you just said "CLASS->new()" or "CLASS->new",
182 then it pretends you called "CLASS->new({})".)
184 Currently no options for putting in {...options...} are part
185 of the documented interface, but the options is here in case
186 you want to add such behavior in a derived class.
188 Read on if you plan on using Tree::DAG_New as a base class.
189 (Otherwise feel free to skip to the description of _init.)
191 There are, in my mind, two ways to do object construction:
193 Way 1: create an object, knowing that it'll have certain uninteresting
194 sane default values, and then call methods to change those values to
195 what you want. Example:
197 $node = Tree::DAG_Node->new;
198 $node->name('Supahnode!');
199 $root->add_daughter($node);
200 $node->add_daughters(@some_others)
202 Way 2: be able to specify some/most/all the object's attributes in
203 the call to the constructor. Something like:
205 $node = Tree::DAG_Node->new({
206 name => 'Supahnode!',
207 mother => $root,
208 daughters => \@some_others
211 After some deliberation, I've decided that the second way is a Bad
212 Thing. First off, it is B<not> markedly more concise than the first
213 way. Second off, it often requires subtly different syntax (e.g.,
214 \@some_others vs @some_others). It just complicates things for the
215 programmer and the user, without making either appreciably happier.
217 (This is not to say that options in general for a constructor are bad
218 -- C<random_network>, discussed far below, necessarily takes options.
219 But note that those are not options for the default values of
220 attributes.)
222 Anyway, if you use Tree::DAG_Node as a superclass, and you add
223 attributes that need to be initialized, what you need to do is provide
224 an _init method that calls $this->SUPER::_init($options) to use its
225 superclass's _init method, and then initializes the new attributes:
227 sub _init {
228 my($this, $options) = @_[0,1];
229 $this->SUPER::_init($options); # call my superclass's _init to
230 # init all the attributes I'm inheriting
232 # Now init /my/ new attributes:
233 $this->{'amigos'} = []; # for example
236 ...or, as I prefer when I'm being a neat freak:
238 sub _init {
239 my($this, $options) = @_[0,1];
240 $this->SUPER::_init($options);
242 $this->_init_amigos($options);
245 sub _init_amigos {
246 my $this = $_[0];
247 # Or my($this,$options) = @_[0,1]; if I'm using $options
248 $this->{'amigos'} = [];
252 In other words, I like to have each attribute initialized thru a
253 method named _init_[attribute], which should expect the object as
254 $_[0] and the the options hashref (or {} if none was given) as $_[1].
255 If you insist on having your _init recognize options for setting
256 attributes, you might as well have them dealt with by the appropriate
257 _init_[attribute] method, like this:
259 sub _init {
260 my($this, $options) = @_[0,1];
261 $this->SUPER::_init($options);
263 $this->_init_amigos($options);
266 sub _init_amigos {
267 my($this,$options) = @_[0,1]; # I need options this time
268 $this->{'amigos'} = [];
269 $this->amigos(@{$options->{'amigos'}}) if $options->{'amigos'};
272 All this bookkeeping looks silly with just one new attribute in a
273 class derived straight from Tree::DAG_Node, but if there's lots of new
274 attributes running around, and if you're deriving from a class derived
275 from a class derived from Tree::DAG_Node, then tidy
276 stratification/modularization like this can keep you sane.
278 =item the constructor $obj->new() or $obj->new({...options...})
280 Just another way to get at the C<new> method. This B<does not copy>
281 $obj, but merely constructs a new object of the same class as it.
282 Saves you the bother of going $class = ref $obj; $obj2 = $class->new;
284 =cut
286 sub new { # constructor
287 # Presumably you won't EVER need to override this -- _init is what
288 # you'd override in order to set an object's default attribute values.
289 my $class = shift;
290 $class = ref($class) if ref($class); # tchristic style. why not?
292 my $o = ref($_[0]) eq 'HASH' ? $_[0] : {}; # o for options hashref
293 my $it = bless( {}, $class );
294 print "Constructing $it in class $class\n" if $Debug;
295 $it->_init( $o );
296 return $it;
299 ###########################################################################
301 =item the method $node->_init({...options...})
303 Initialize the object's attribute values. See the discussion above.
304 Presumably this should be called only by the guts of the C<new>
305 constructor -- never by the end user.
307 Currently there are no documented options for putting in
308 {...options...}, but (in case you want to disregard the above rant)
309 the option exists for you to use {...options...} for something useful
310 in a derived class.
312 Please see the source for more information.
314 =item see also (below) the constructors "new_daughter" and "new_daughter_left"
316 =back
318 =cut
320 sub _init { # method
321 my $this = shift;
322 my $o = ref($_[0]) eq 'HASH' ? $_[0] : {};
324 # Sane initialization.
325 $this->_init_mother($o);
326 $this->_init_daughters($o);
327 $this->_init_name($o);
328 $this->_init_attributes($o);
330 return;
333 sub _init_mother { # to be called by an _init
334 my($this, $o) = @_[0,1];
336 $this->{'mother'} = undef;
338 # Undocumented and disfavored. Consider this just an example.
339 ( $o->{'mother'} )->add_daughter($this)
340 if defined($o->{'mother'}) && ref($o->{'mother'});
341 # DO NOT use this option (as implemented) with new_daughter or
342 # new_daughter_left!!!!!
343 # BAD THINGS MAY HAPPEN!!!
346 sub _init_daughters { # to be called by an _init
347 my($this, $o) = @_[0,1];
349 $this->{'daughters'} = [];
351 # Undocumented and disfavored. Consider this just an example.
352 $this->set_daughters( @{$o->{'daughters'}} )
353 if ref($o->{'daughters'}) && (@{$o->{'daughters'}});
354 # DO NOT use this option (as implemented) with new_daughter or
355 # new_daughter_left!!!!!
356 # BAD THINGS MAY HAPPEN!!!
359 sub _init_name { # to be called by an _init
360 my($this, $o) = @_[0,1];
362 $this->{'name'} = undef;
364 # Undocumented and disfavored. Consider this just an example.
365 $this->name( $o->{'name'} ) if exists $o->{'name'};
368 sub _init_attributes { # to be called by an _init
369 my($this, $o) = @_[0,1];
371 $this->{'attributes'} = {};
373 # Undocumented and disfavored. Consider this just an example.
374 $this->attributes( $o->{'attributes'} ) if exists $o->{'attributes'};
377 ###########################################################################
378 ###########################################################################
380 =head1 LINKAGE-RELATED METHODS
382 =over
384 =item $node->daughters
386 This returns the (possibly empty) list of daughters for $node.
388 =cut
390 sub daughters { # read-only attrib-method: returns a list.
391 my $this = shift;
393 if(@_) { # undoc'd and disfavored to use as a write-method
394 Carp::croak "Don't set daughters with doughters anymore\n";
395 Carp::carp "my parameter must be a listref" unless ref($_[0]);
396 $this->{'daughters'} = $_[0];
397 $this->_update_daughter_links;
399 #return $this->{'daughters'};
400 return @{$this->{'daughters'} || []};
403 ###########################################################################
405 =item $node->mother
407 This returns what node is $node's mother. This is undef if $node has
408 no mother -- i.e., if it is a root.
410 =cut
412 sub mother { # read-only attrib-method: returns an object (the mother node)
413 my $this = shift;
414 Carp::croak "I'm a read-only method!" if @_;
415 return $this->{'mother'};
418 ###########################################################################
419 ###########################################################################
421 =item $mother->add_daughters( LIST )
423 This method adds the node objects in LIST to the (right) end of
424 $mother's C<daughter> list. Making a node N1 the daughter of another
425 node N2 also means that N1's C<mother> attribute is "automatically" set
426 to N2; it also means that N1 stops being anything else's daughter as
427 it becomes N2's daughter.
429 If you try to make a node its own mother, a fatal error results. If
430 you try to take one of a a node N1's ancestors and make it also a
431 daughter of N1, a fatal error results. A fatal error results if
432 anything in LIST isn't a node object.
434 If you try to make N1 a daughter of N2, but it's B<already> a daughter
435 of N2, then this is a no-operation -- it won't move such nodes to the
436 end of the list or anything; it just skips doing anything with them.
438 =item $node->add_daughter( LIST )
440 An exact synonym for $node->add_daughters(LIST)
442 =cut
444 sub add_daughters { # write-only method
445 my($mother, @daughters) = @_;
446 return unless @daughters; # no-op
447 return
448 $mother->_add_daughters_wrapper(
449 sub { push @{$_[0]}, $_[1]; },
450 @daughters
454 sub add_daughter { # alias
455 my($it,@them) = @_; $it->add_daughters(@them);
458 =item $mother->add_daughters_left( LIST )
460 This method is just like C<add_daughters>, except that it adds the
461 node objects in LIST to the (left) beginning of $mother's daughter
462 list, instead of the (right) end of it.
464 =item $node->add_daughter_left( LIST )
466 An exact synonym for $node->add_daughters_left( LIST )
468 =cut
470 sub add_daughters_left { # write-only method
471 my($mother, @daughters) = @_;
472 return unless @daughters;
473 return
474 $mother->_add_daughters_wrapper(
475 sub { unshift @{$_[0]}, $_[1]; },
476 @daughters
480 sub add_daughter_left { # alias
481 my($it,@them) = @_; $it->add_daughters_left(@them);
484 =item Note:
486 The above link-making methods perform basically an C<unshift> or
487 C<push> on the mother node's daughter list. To get the full range of
488 list-handling functionality, copy the daughter list, and change it,
489 and then call C<set_daughters> on the result:
491 @them = $mother->daughters;
492 @removed = splice(@them, 0,2, @new_nodes);
493 $mother->set_daughters(@them);
495 Or consider a structure like:
497 $mother->set_daughters(
498 grep($_->name =~ /NP/ ,
499 $mother->daughters
503 =cut
507 ## Used by the adding methods
508 # (except maybe new_daughter, and new_daughter_left)
510 sub _add_daughters_wrapper {
511 my($mother, $callback, @daughters) = @_;
512 return unless @daughters;
514 my %ancestors;
515 @ancestors{ $mother->ancestors } = undef;
516 # This could be made more efficient by not bothering to compile
517 # the ancestor list for $mother if all the nodes to add are
518 # daughterless.
519 # But then you have to CHECK if they're daughterless.
520 # If $mother is [big number] generations down, then it's worth checking.
522 foreach my $daughter (@daughters) { # which may be ()
523 Carp::croak "daughter must be a node object!" unless UNIVERSAL::can($daughter, 'is_node');
525 printf "Mother : %s (%s)\n", $mother, ref $mother if $Debug;
526 printf "Daughter: %s (%s)\n", $daughter, ref $daughter if $Debug;
527 printf "Adding %s to %s\n",
528 ($daughter->name() || $daughter),
529 ($mother->name() || $mother) if $Debug > 1;
531 Carp::croak "mother can't be its own daughter!" if $mother eq $daughter;
533 $daughter->cyclicity_fault(
534 "$daughter (" . ($daughter->name || 'no_name') .
535 ") is an ancestor of $mother (" . ($mother->name || 'no_name') .
536 "), so can't became its daughter."
537 ) if exists $ancestors{$daughter};
539 my $old_mother = $daughter->{'mother'};
541 next if defined($old_mother) && ref($old_mother) && $old_mother eq $mother;
542 # noop if $daughter is already $mother's daughter
544 $old_mother->remove_daughters($daughter)
545 if defined($old_mother) && ref($old_mother);
547 &{$callback}($mother->{'daughters'}, $daughter);
549 $mother->_update_daughter_links; # need only do this at the end
551 return;
554 ###########################################################################
555 ###########################################################################
557 sub _update_daughter_links {
558 # Eliminate any duplicates in my daughters list, and update
559 # all my daughters' links to myself.
560 my $this = shift;
562 my $them = $this->{'daughters'};
564 # Eliminate duplicate daughters.
565 my %seen = ();
566 @$them = grep { ref($_) && not($seen{$_}++) } @$them;
567 # not that there should ever be duplicate daughters anyhoo.
569 foreach my $one (@$them) { # linkage bookkeeping
570 Carp::croak "daughter <$one> isn't an object!" unless ref $one;
571 $one->{'mother'} = $this;
573 return;
576 ###########################################################################
578 # Currently unused.
580 sub _update_links { # update all descendant links for ancestorship below
581 # this point
582 # note: it's "descendant", not "descendent"
583 # see <http://www.lenzo.com/~sburke/stuff/english_ant_and_ent.html>
584 my $this = shift;
585 # $this->no_cyclicity;
586 $this->walk_down({
587 'callback' => sub {
588 my $this = $_[0];
589 $this->_update_daughter_links;
590 return 1;
595 ###########################################################################
596 ###########################################################################
598 =item the constructor $daughter = $mother->new_daughter, or
600 =item the constructor $daughter = $mother->new_daughter({...options...})
602 This B<constructs> a B<new> node (of the same class as $mother), and
603 adds it to the (right) end of the daughter list of $mother. This is
604 essentially the same as going
606 $daughter = $mother->new;
607 $mother->add_daughter($daughter);
609 but is rather more efficient because (since $daughter is guaranteed new
610 and isn't linked to/from anything), it doesn't have to check that
611 $daughter isn't an ancestor of $mother, isn't already daughter to a
612 mother it needs to be unlinked from, isn't already in $mother's
613 daughter list, etc.
615 As you'd expect for a constructor, it returns the node-object created.
617 =cut
619 # Note that if you radically change 'mother'/'daughters' bookkeeping,
620 # you may have to change this routine, since it's one of the places
621 # that directly writes to 'daughters' and 'mother'.
623 sub new_daughter {
624 my($mother, @options) = @_;
625 my $daughter = $mother->new(@options);
627 push @{$mother->{'daughters'}}, $daughter;
628 $daughter->{'mother'} = $mother;
630 return $daughter;
633 =item the constructor $mother->new_daughter_left, or
635 =item $mother->new_daughter_left({...options...})
637 This is just like $mother->new_daughter, but adds the new daughter
638 to the left (start) of $mother's daughter list.
640 =cut
642 # Note that if you radically change 'mother'/'daughters' bookkeeping,
643 # you may have to change this routine, since it's one of the places
644 # that directly writes to 'daughters' and 'mother'.
646 sub new_daughter_left {
647 my($mother, @options) = @_;
648 my $daughter = $mother->new(@options);
650 unshift @{$mother->{'daughters'}}, $daughter;
651 $daughter->{'mother'} = $mother;
653 return $daughter;
656 ###########################################################################
658 =item $mother->remove_daughters( LIST )
660 This removes the nodes listed in LIST from $mother's daughter list.
661 This is a no-operation if LIST is empty. If there are things in LIST
662 that aren't a current daughter of $mother, they are ignored.
664 Not to be confused with $mother->clear_daughters.
666 =cut
668 sub remove_daughters { # write-only method
669 my($mother, @daughters) = @_;
670 Carp::croak "mother must be an object!" unless ref $mother;
671 return unless @daughters;
673 my %to_delete;
674 @daughters = grep {ref($_)
675 and defined($_->{'mother'})
676 and $mother eq $_->{'mother'}
677 } @daughters;
678 return unless @daughters;
679 @to_delete{ @daughters } = undef;
681 # This could be done better and more efficiently, I guess.
682 foreach my $daughter (@daughters) {
683 $daughter->{'mother'} = undef;
685 my $them = $mother->{'daughters'};
686 @$them = grep { !exists($to_delete{$_}) } @$them;
688 # $mother->_update_daughter_links; # unnecessary
689 return;
692 =item $node->remove_daughter( LIST )
694 An exact synonym for $node->remove_daughters( LIST )
696 =cut
698 sub remove_daughter { # alias
699 my($it,@them) = @_; $it->remove_daughters(@them);
702 =item $node->unlink_from_mother
704 This removes node from the daughter list of its mother. If it has no
705 mother, this is a no-operation.
707 Returns the mother unlinked from (if any).
709 =cut
711 sub unlink_from_mother {
712 my $node = $_[0];
713 my $mother = $node->{'mother'};
714 $mother->remove_daughters($node) if defined($mother) && ref($mother);
715 return $mother;
718 ###########################################################################
720 =item $mother->clear_daughters
722 This unlinks all $mother's daughters.
723 Returns the the list of what used to be $mother's daughters.
725 Not to be confused with $mother->remove_daughters( LIST ).
727 =cut
729 sub clear_daughters { # write-only method
730 my($mother) = $_[0];
731 my @daughters = @{$mother->{'daughters'}};
733 @{$mother->{'daughters'}} = ();
734 foreach my $one (@daughters) {
735 next unless UNIVERSAL::can($one, 'is_node'); # sanity check
736 $one->{'mother'} = undef;
738 # Another, simpler, way to do it:
739 # $mother->remove_daughters($mother->daughters);
741 return @daughters; # NEW
743 #--------------------------------------------------------------------------
745 =item $mother->set_daughters( LIST )
747 This unlinks all $mother's daughters, and replaces them with the
748 daughters in LIST.
750 Currently implemented as just $mother->clear_daughters followed by
751 $mother->add_daughters( LIST ).
753 =cut
755 sub set_daughters { # write-only method
756 my($mother, @them) = @_;
757 $mother->clear_daughters;
758 $mother->add_daughters(@them) if @them;
759 # yup, it's that simple
762 #--------------------------------------------------------------------------
764 =item $node->replace_with( LIST )
766 This replaces $node in its mother's daughter list, by unlinking $node
767 and replacing it with the items in LIST. This returns a list consisting
768 of $node followed by LIST, i.e., the nodes that replaced it.
770 LIST can include $node itself (presumably at most once). LIST can
771 also be empty-list. However, if any items in LIST are sisters to
772 $node, they are ignored, and are not in the copy of LIST passed as the
773 return value.
775 As you might expect for any linking operation, the items in LIST
776 cannot be $node's mother, or any ancestor to it; and items in LIST are,
777 of course, unlinked from their mothers (if they have any) as they're
778 linked to $node's mother.
780 (In the special (and bizarre) case where $node is root, this simply calls
781 $this->unlink_from_mother on all the items in LIST, making them roots of
782 their own trees.)
784 Note that the daughter-list of $node is not necessarily affected; nor
785 are the daughter-lists of the items in LIST. I mention this in case you
786 think replace_with switches one node for another, with respect to its
787 mother list B<and> its daughter list, leaving the rest of the tree
788 unchanged. If that's what you want, replacing $Old with $New, then you
789 want:
791 $New->set_daughters($Old->clear_daughters);
792 $Old->replace_with($New);
794 (I can't say $node's and LIST-items' daughter lists are B<never>
795 affected my replace_with -- they can be affected in this case:
797 $N1 = ($node->daughters)[0]; # first daughter of $node
798 $N2 = ($N1->daughters)[0]; # first daughter of $N1;
799 $N3 = Tree::DAG_Node->random_network; # or whatever
800 $node->replace_with($N1, $N2, $N3);
802 As a side affect of attaching $N1 and $N2 to $node's mother, they're
803 unlinked from their parents ($node, and $N1, replectively).
804 But N3's daughter list is unaffected.
806 In other words, this method does what it has to, as you'd expect it
809 =cut
811 sub replace_with { # write-only method
812 my($this, @replacements) = @_;
814 if(not( defined($this->{'mother'}) && ref($this->{'mother'}) )) { # if root
815 foreach my $replacement (@replacements) {
816 $replacement->{'mother'}->remove_daughters($replacement)
817 if $replacement->{'mother'};
819 # make 'em roots
820 } else { # I have a mother
821 my $mother = $this->{'mother'};
823 #@replacements = grep(($_ eq $this || $_->{'mother'} ne $mother),
824 # @replacements);
825 @replacements = grep { $_ eq $this
826 || not(defined($_->{'mother'}) &&
827 ref($_->{'mother'}) &&
828 $_->{'mother'} eq $mother
831 @replacements;
832 # Eliminate sisters (but not self)
833 # i.e., I want myself or things NOT with the same mother as myself.
835 $mother->set_daughters( # old switcheroo
836 map($_ eq $this ? (@replacements) : $_ ,
837 @{$mother->{'daughters'}}
840 # and set_daughters does all the checking and possible
841 # unlinking
843 return($this, @replacements);
846 =item $node->replace_with_daughters
848 This replaces $node in its mother's daughter list, by unlinking $node
849 and replacing it with its daughters. In other words, $node becomes
850 motherless and daughterless as its daughters move up and take its place.
851 This returns a list consisting of $node followed by the nodes that were
852 its daughters.
854 In the special (and bizarre) case where $node is root, this simply
855 unlinks its daughters from it, making them roots of their own trees.
857 Effectively the same as $node->replace_with($node->daughters), but more
858 efficient, since less checking has to be done. (And I also think
859 $node->replace_with_daughters is a more common operation in
860 tree-wrangling than $node->replace_with(LIST), so deserves a named
861 method of its own, but that's just me.)
863 =cut
865 # Note that if you radically change 'mother'/'daughters' bookkeeping,
866 # you may have to change this routine, since it's one of the places
867 # that directly writes to 'daughters' and 'mother'.
869 sub replace_with_daughters { # write-only method
870 my($this) = $_[0]; # takes no params other than the self
871 my $mother = $this->{'mother'};
872 return($this, $this->clear_daughters)
873 unless defined($mother) && ref($mother);
875 my @daughters = $this->clear_daughters;
876 my $sib_r = $mother->{'daughters'};
877 @$sib_r = map($_ eq $this ? (@daughters) : $_,
878 @$sib_r # old switcheroo
880 foreach my $daughter (@daughters) {
881 $daughter->{'mother'} = $mother;
883 return($this, @daughters);
886 #--------------------------------------------------------------------------
888 =item $node->add_left_sisters( LIST )
890 This adds the elements in LIST (in that order) as immediate left sisters of
891 $node. In other words, given that B's mother's daughter-list is (A,B,C,D),
892 calling B->add_left_sisters(X,Y) makes B's mother's daughter-list
893 (A,X,Y,B,C,D).
895 If LIST is empty, this is a no-op, and returns empty-list.
897 This is basically implemented as a call to $node->replace_with(LIST,
898 $node), and so all replace_with's limitations and caveats apply.
900 The return value of $node->add_left_sisters( LIST ) is the elements of
901 LIST that got added, as returned by replace_with -- minus the copies
902 of $node you'd get from a straight call to $node->replace_with(LIST,
903 $node).
905 =cut
907 sub add_left_sisters { # write-only method
908 my($this, @new) = @_;
909 return() unless @new;
911 @new = $this->replace_with(@new, $this);
912 shift @new; pop @new; # kill the copies of $this
913 return @new;
916 =item $node->add_left_sister( LIST )
918 An exact synonym for $node->add_left_sisters(LIST)
920 =cut
922 sub add_left_sister { # alias
923 my($it,@them) = @_; $it->add_left_sisters(@them);
926 =item $node->add_right_sisters( LIST )
928 Just like add_left_sisters (which see), except that the the elements
929 in LIST (in that order) as immediate B<right> sisters of $node;
931 In other words, given that B's mother's daughter-list is (A,B,C,D),
932 calling B->add_right_sisters(X,Y) makes B's mother's daughter-list
933 (A,B,X,Y,C,D).
935 =cut
937 sub add_right_sisters { # write-only method
938 my($this, @new) = @_;
939 return() unless @new;
940 @new = $this->replace_with($this, @new);
941 shift @new; shift @new; # kill the copies of $this
942 return @new;
945 =item $node->add_right_sister( LIST )
947 An exact synonym for $node->add_right_sisters(LIST)
949 =cut
951 sub add_right_sister { # alias
952 my($it,@them) = @_; $it->add_right_sisters(@them);
955 ###########################################################################
957 =back
959 =cut
961 ###########################################################################
962 ###########################################################################
964 =head1 OTHER ATTRIBUTE METHODS
966 =over
968 =item $node->name or $node->name(SCALAR)
970 In the first form, returns the value of the node object's "name"
971 attribute. In the second form, sets it to the value of SCALAR.
973 =cut
975 sub name { # read/write attribute-method. returns/expects a scalar
976 my $this = shift;
977 $this->{'name'} = $_[0] if @_;
978 return $this->{'name'};
982 ###########################################################################
984 =item $node->attributes or $node->attributes(SCALAR)
986 In the first form, returns the value of the node object's "attributes"
987 attribute. In the second form, sets it to the value of SCALAR. I
988 intend this to be used to store a reference to a (presumably
989 anonymous) hash the user can use to store whatever attributes he
990 doesn't want to have to store as object attributes. In this case, you
991 needn't ever set the value of this. (_init has already initialized it
992 to {}.) Instead you can just do...
994 $node->attributes->{'foo'} = 'bar';
996 ...to write foo => bar.
998 =cut
1000 sub attributes { # read/write attribute-method
1001 # expects a ref, presumably a hashref
1002 my $this = shift;
1003 if(@_) {
1004 Carp::carp "my parameter must be a reference" unless ref($_[0]);
1005 $this->{'attributes'} = $_[0];
1007 return $this->{'attributes'};
1010 =item $node->attribute or $node->attribute(SCALAR)
1012 An exact synonym for $node->attributes or $node->attributes(SCALAR)
1014 =cut
1016 sub attribute { # alias
1017 my($it,@them) = @_; $it->attributes(@them);
1020 ###########################################################################
1021 # Secret Stuff.
1023 sub no_cyclicity { # croak iff I'm in a CYCLIC class.
1024 my($it) = $_[0];
1025 # If, God forbid, I use this to make a cyclic class, then I'd
1026 # expand the functionality of this routine to actually look for
1027 # cyclicity. Or something like that. Maybe.
1029 $it->cyclicity_fault("You can't do that in a cyclic class!")
1030 if $it->cyclicity_allowed;
1031 return;
1034 sub cyclicity_fault {
1035 my($it, $bitch) = @_[0,1];
1036 Carp::croak "Cyclicity fault: $bitch"; # never return
1039 sub cyclicity_allowed {
1040 return 0;
1043 ###########################################################################
1044 # More secret stuff. Currently unused.
1046 sub inaugurate_root { # no-op
1047 my($it, $tree) = @_[0,1];
1048 # flag this node as being the root of the tree $tree.
1049 return;
1052 sub decommission_root { # no-op
1053 # flag this node as no longer being the root of the tree $tree.
1054 return;
1057 ###########################################################################
1058 ###########################################################################
1060 =back
1062 =head1 OTHER METHODS TO DO WITH RELATIONSHIPS
1064 =over
1066 =item $node->is_node
1068 This always returns true. More pertinently, $object->can('is_node')
1069 is true (regardless of what C<is_node> would do if called) for objects
1070 belonging to this class or for any class derived from it.
1072 =cut
1074 sub is_node { return 1; } # always true.
1075 # NEVER override this with anything that returns false in the belief
1076 # that this'd signal "not a node class". The existence of this method
1077 # is what I test for, with the various "can()" uses in this class.
1079 ###########################################################################
1081 =item $node->ancestors
1083 Returns the list of this node's ancestors, starting with its mother,
1084 then grandmother, and ending at the root. It does this by simply
1085 following the 'mother' attributes up as far as it can. So if $item IS
1086 the root, this returns an empty list.
1088 Consider that scalar($node->ancestors) returns the ply of this node
1089 within the tree -- 2 for a granddaughter of the root, etc., and 0 for
1090 root itself.
1092 =cut
1094 sub ancestors {
1095 my $this = shift;
1096 my $mama = $this->{'mother'}; # initial condition
1097 return () unless ref($mama); # I must be root!
1099 # $this->no_cyclicity; # avoid infinite loops
1101 # Could be defined recursively, as:
1102 # if(ref($mama = $this->{'mother'})){
1103 # return($mama, $mama->ancestors);
1104 # } else {
1105 # return ();
1107 # But I didn't think of that until I coded the stuff below, which is
1108 # faster.
1110 my @ancestors = ( $mama ); # start off with my mama
1111 while(defined( $mama = $mama->{'mother'} ) && ref($mama)) {
1112 # Walk up the tree
1113 push(@ancestors, $mama);
1114 # This turns into an infinite loop if someone gets stupid
1115 # and makes this tree cyclic! Don't do it!
1117 return @ancestors;
1120 ###########################################################################
1122 =item $node->root
1124 Returns the root of whatever tree $node is a member of. If $node is
1125 the root, then the result is $node itself.
1127 =cut
1129 sub root {
1130 my $it = $_[0];
1131 my @ancestors = ($it, $it->ancestors);
1132 return $ancestors[-1];
1135 ###########################################################################
1137 =item $node->is_daughter_of($node2)
1139 Returns true iff $node is a daughter of $node2.
1140 Currently implemented as just a test of ($it->mother eq $node2).
1142 =cut
1144 sub is_daughter_of {
1145 my($it,$mama) = @_[0,1];
1146 return $it->{'mother'} eq $mama;
1149 ###########################################################################
1151 =item $node->self_and_descendants
1153 Returns a list consisting of itself (as element 0) and all the
1154 descendants of $node. Returns just itself if $node is a
1155 terminal_node.
1157 (Note that it's spelled "descendants", not "descendents".)
1159 =cut
1161 sub self_and_descendants {
1162 # read-only method: return a list of myself and any/all descendants
1163 my $node = shift;
1164 my @List = ();
1165 # $node->no_cyclicity;
1166 $node->walk_down({ 'callback' => sub { push @List, $_[0]; return 1;}});
1167 Carp::croak "Spork Error 919: \@List has no contents!?!?" unless @List;
1168 # impossible
1169 return @List;
1172 ###########################################################################
1174 =item $node->descendants
1176 Returns a list consisting of all the descendants of $node. Returns
1177 empty-list if $node is a terminal_node.
1179 (Note that it's spelled "descendants", not "descendents".)
1181 =cut
1183 sub descendants {
1184 # read-only method: return a list of my descendants
1185 my $node = shift;
1186 my @list = $node->self_and_descendants;
1187 shift @list; # lose myself.
1188 return @list;
1191 ###########################################################################
1193 =item $node->leaves_under
1195 Returns a list (going left-to-right) of all the leaf nodes under
1196 $node. ("Leaf nodes" are also called "terminal nodes" -- i.e., nodes
1197 that have no daughters.) Returns $node in the degenerate case of
1198 $node being a leaf itself.
1200 =cut
1202 sub leaves_under {
1203 # read-only method: return a list of all leaves under myself.
1204 # Returns myself in the degenerate case of being a leaf myself.
1205 my $node = shift;
1206 my @List = ();
1207 # $node->no_cyclicity;
1208 $node->walk_down({ 'callback' =>
1209 sub {
1210 my $node = $_[0];
1211 my @daughters = @{$node->{'daughters'}};
1212 push(@List, $node) unless @daughters;
1213 return 1;
1216 Carp::croak "Spork Error 861: \@List has no contents!?!?" unless @List;
1217 # impossible
1218 return @List;
1221 ###########################################################################
1223 =item $node->depth_under
1225 Returns an integer representing the number of branches between this
1226 $node and the most distant leaf under it. (In other words, this
1227 returns the ply of subtree starting of $node. Consider
1228 scalar($it->ancestors) if you want the ply of a node within the whole
1229 tree.)
1231 =cut
1233 sub depth_under {
1234 my $node = shift;
1235 my $max_depth = 0;
1236 $node->walk_down({
1237 '_depth' => 0,
1238 'callback' => sub {
1239 my $depth = $_[1]->{'_depth'};
1240 $max_depth = $depth if $depth > $max_depth;
1241 return 1;
1244 return $max_depth;
1247 ###########################################################################
1249 =item $node->generation
1251 Returns a list of all nodes (going left-to-right) that are in $node's
1252 generation -- i.e., that are the some number of nodes down from
1253 the root. $root->generation is just $root.
1255 Of course, $node is always in its own generation.
1257 =item $node->generation_under(NODE2)
1259 Like $node->generation, but returns only the nodes in $node's generation
1260 that are also descendants of NODE2 -- in other words,
1262 @us = $node->generation_under( $node->mother->mother );
1264 is all $node's first cousins (to borrow yet more kinship terminology) --
1265 assuming $node does indeed have a grandmother. Actually "cousins" isn't
1266 quite an apt word, because C<@us> ends up including $node's siblings and
1267 $node.
1269 Actually, C<generation_under> is just an alias to C<generation>, but I
1270 figure that this:
1272 @us = $node->generation_under($way_upline);
1274 is a bit more readable than this:
1276 @us = $node->generation($way_upline);
1278 But it's up to you.
1280 $node->generation_under($node) returns just $node.
1282 If you call $node->generation_under($node) but NODE2 is not $node or an
1283 ancestor of $node, it behaves as if you called just $node->generation().
1285 =cut
1287 sub generation {
1288 my($node, $limit) = @_[0,1];
1289 # $node->no_cyclicity;
1290 return $node
1291 if $node eq $limit || not(
1292 defined($node->{'mother'}) &&
1293 ref($node->{'mother'})
1294 ); # bailout
1296 return map(@{$_->{'daughters'}}, $node->{'mother'}->generation($limit));
1297 # recurse!
1298 # Yup, my generation is just all the daughters of my mom's generation.
1301 sub generation_under {
1302 my($node, @rest) = @_;
1303 return $node->generation(@rest);
1306 ###########################################################################
1308 =item $node->self_and_sisters
1310 Returns a list of all nodes (going left-to-right) that have the same
1311 mother as $node -- including $node itself. This is just like
1312 $node->mother->daughters, except that that fails where $node is root,
1313 whereas $root->self_and_siblings, as a special case, returns $root.
1315 (Contrary to how you may interpret how this method is named, "self" is
1316 not (necessarily) the first element of what's returned.)
1318 =cut
1320 sub self_and_sisters {
1321 my $node = $_[0];
1322 my $mother = $node->{'mother'};
1323 return $node unless defined($mother) && ref($mother); # special case
1324 return @{$node->{'mother'}->{'daughters'}};
1327 ###########################################################################
1329 =item $node->sisters
1331 Returns a list of all nodes (going left-to-right) that have the same
1332 mother as $node -- B<not including> $node itself. If $node is root,
1333 this returns empty-list.
1335 =cut
1337 sub sisters {
1338 my $node = $_[0];
1339 my $mother = $node->{'mother'};
1340 return() unless $mother; # special case
1341 return grep($_ ne $node,
1342 @{$node->{'mother'}->{'daughters'}}
1346 ###########################################################################
1348 =item $node->left_sister
1350 Returns the node that's the immediate left sister of $node. If $node
1351 is the leftmost (or only) daughter of its mother (or has no mother),
1352 then this returns undef.
1354 (See also $node->add_left_sisters(LIST).)
1356 =cut
1358 sub left_sister {
1359 my $it = $_[0];
1360 my $mother = $it->{'mother'};
1361 return undef unless $mother;
1362 my @sisters = @{$mother->{'daughters'}};
1364 return undef if @sisters == 1; # I'm an only daughter
1366 my $left = undef;
1367 foreach my $one (@sisters) {
1368 return $left if $one eq $it;
1369 $left = $one;
1371 die "SPORK ERROR 9757: I'm not in my mother's daughter list!?!?";
1375 =item $node->left_sisters
1377 Returns a list of nodes that're sisters to the left of $node. If
1378 $node is the leftmost (or only) daughter of its mother (or has no
1379 mother), then this returns an empty list.
1381 (See also $node->add_left_sisters(LIST).)
1383 =cut
1385 sub left_sisters {
1386 my $it = $_[0];
1387 my $mother = $it->{'mother'};
1388 return() unless $mother;
1389 my @sisters = @{$mother->{'daughters'}};
1390 return() if @sisters == 1; # I'm an only daughter
1392 my @out = ();
1393 foreach my $one (@sisters) {
1394 return @out if $one eq $it;
1395 push @out, $one;
1397 die "SPORK ERROR 9767: I'm not in my mother's daughter list!?!?";
1400 =item $node->right_sister
1402 Returns the node that's the immediate right sister of $node. If $node
1403 is the rightmost (or only) daughter of its mother (or has no mother),
1404 then this returns undef.
1406 (See also $node->add_right_sisters(LIST).)
1408 =cut
1410 sub right_sister {
1411 my $it = $_[0];
1412 my $mother = $it->{'mother'};
1413 return undef unless $mother;
1414 my @sisters = @{$mother->{'daughters'}};
1415 return undef if @sisters == 1; # I'm an only daughter
1417 my $seen = 0;
1418 foreach my $one (@sisters) {
1419 return $one if $seen;
1420 $seen = 1 if $one eq $it;
1422 die "SPORK ERROR 9777: I'm not in my mother's daughter list!?!?"
1423 unless $seen;
1424 return undef;
1427 =item $node->right_sisters
1429 Returns a list of nodes that're sisters to the right of $node. If
1430 $node is the rightmost (or only) daughter of its mother (or has no
1431 mother), then this returns an empty list.
1433 (See also $node->add_right_sisters(LIST).)
1435 =cut
1437 sub right_sisters {
1438 my $it = $_[0];
1439 my $mother = $it->{'mother'};
1440 return() unless $mother;
1441 my @sisters = @{$mother->{'daughters'}};
1442 return() if @sisters == 1; # I'm an only daughter
1444 my @out;
1445 my $seen = 0;
1446 foreach my $one (@sisters) {
1447 push @out, $one if $seen;
1448 $seen = 1 if $one eq $it;
1450 die "SPORK ERROR 9787: I'm not in my mother's daughter list!?!?"
1451 unless $seen;
1452 return @out;
1455 ###########################################################################
1457 =item $node->my_daughter_index
1459 Returns what index this daughter is, in its mother's C<daughter> list.
1460 In other words, if $node is ($node->mother->daughters)[3], then
1461 $node->my_daughter_index returns 3.
1463 As a special case, returns 0 if $node has no mother.
1465 =cut
1467 sub my_daughter_index {
1468 # returns what number is my index in my mother's daughter list
1469 # special case: 0 for root.
1470 my $node = $_[0];
1471 my $ord = -1;
1472 my $mother = $node->{'mother'};
1474 return 0 unless $mother;
1475 my @sisters = @{$mother->{'daughters'}};
1477 die "SPORK ERROR 6512: My mother has no kids!!!" unless @sisters;
1479 Find_Self:
1480 for(my $i = 0; $i < @sisters; $i++) {
1481 if($sisters[$i] eq $node) {
1482 $ord = $i;
1483 last Find_Self;
1486 die "SPORK ERROR 2837: I'm not a daughter of my mother?!?!" if $ord == -1;
1487 return $ord;
1490 ###########################################################################
1492 =item $node->address or $anynode->address(ADDRESS)
1494 With the first syntax, returns the address of $node within its tree,
1495 based on its position within the tree. An address is formed by noting
1496 the path between the root and $node, and concatenating the
1497 daughter-indices of the nodes this passes thru (starting with 0 for
1498 the root, and ending with $node).
1500 For example, if to get from node ROOT to node $node, you pass thru
1501 ROOT, A, B, and $node, then the address is determined as:
1503 * ROOT's my_daughter_index is 0.
1505 * A's my_daughter_index is, suppose, 2. (A is index 2 in ROOT's
1506 daughter list.)
1508 * B's my_daughter_index is, suppose, 0. (B is index 0 in A's
1509 daughter list.)
1511 * $node's my_daughter_index is, suppose, 4. ($node is index 4 in
1512 B's daughter list.)
1514 The address of the above-described $node is, therefore, "0:2:0:4".
1516 (As a somewhat special case, the address of the root is always "0";
1517 and since addresses start from the root, all addresses start with a
1518 "0".)
1520 The second syntax, where you provide an address, starts from the root
1521 of the tree $anynode belongs to, and returns the node corresponding to
1522 that address. Returns undef if no node corresponds to that address.
1523 Note that this routine may be somewhat liberal in its interpretation
1524 of what can constitute an address; i.e., it accepts "0.2.0.4", besides
1525 "0:2:0:4".
1527 Also note that the address of a node in a tree is meaningful only in
1528 that tree as currently structured.
1530 (Consider how ($address1 cmp $address2) may be magically meaningful
1531 to you, if you mant to figure out what nodes are to the right of what
1532 other nodes.)
1534 =cut
1536 sub address {
1537 my($it, $address) = @_[0,1];
1538 if(defined($address) && length($address)) { # given the address, return the node.
1539 # invalid addresses return undef
1540 my $root = $it->root;
1541 my @parts = map {$_ + 0}
1542 $address =~ m/(\d+)/g; # generous!
1543 Carp::croak "Address \"$address\" is an ill-formed address" unless @parts;
1544 Carp::croak "Address \"$address\" must start with '0'" unless shift(@parts) == 0;
1546 my $current_node = $root;
1547 while(@parts) { # no-op for root
1548 my $ord = shift @parts;
1549 my @daughters = @{$current_node->{'daughters'}};
1551 if($#daughters < $ord) { # illegal address
1552 print "* $address has an out-of-range index ($ord)!" if $Debug;
1553 return undef;
1555 $current_node = $daughters[$ord];
1556 unless(ref($current_node)) {
1557 print "* $address points to or thru a non-node!" if $Debug;
1558 return undef;
1561 return $current_node;
1563 } else { # given the node, return the address
1564 my @parts = ();
1565 my $current_node = $it;
1566 my $mother;
1568 while(defined( $mother = $current_node->{'mother'} ) && ref($mother)) {
1569 unshift @parts, $current_node->my_daughter_index;
1570 $current_node = $mother;
1572 return join(':', 0, @parts);
1576 ###########################################################################
1578 =item $node->common(LIST)
1580 Returns the lowest node in the tree that is ancestor-or-self to the
1581 nodes $node and LIST.
1583 If the nodes are far enough apart in the tree, the answer is just the
1584 root.
1586 If the nodes aren't all in the same tree, the answer is undef.
1588 As a degenerate case, if LIST is empty, returns $node.
1590 =cut
1592 sub common { # Return the lowest node common to all these nodes...
1593 # Called as $it->common($other) or $it->common(@others)
1594 my @ones = @_; # all nodes I was given
1595 my($first, @others) = @_;
1597 return $first unless @others; # degenerate case
1599 my %ones;
1600 @ones{ @ones } = undef;
1602 foreach my $node (@others) {
1603 Carp::croak "TILT: node \"$node\" is not a node"
1604 unless UNIVERSAL::can($node, 'is_node');
1605 my %first_lineage;
1606 @first_lineage{$first, $first->ancestors} = undef;
1607 my $higher = undef; # the common of $first and $node
1608 my @my_lineage = $node->ancestors;
1610 Find_Common:
1611 while(@my_lineage) {
1612 if(exists $first_lineage{$my_lineage[0]}) {
1613 $higher = $my_lineage[0];
1614 last Find_Common;
1616 shift @my_lineage;
1618 return undef unless $higher;
1619 $first = $higher;
1621 return $first;
1625 ###########################################################################
1627 =item $node->common_ancestor(LIST)
1629 Returns the lowest node that is ancestor to all the nodes given (in
1630 nodes $node and LIST). In other words, it answers the question: "What
1631 node in the tree, as low as possible, is ancestor to the nodes given
1632 ($node and LIST)?"
1634 If the nodes are far enough apart, the answer is just the root --
1635 except if any of the nodes are the root itself, in which case the
1636 answer is undef (since the root has no ancestor).
1638 If the nodes aren't all in the same tree, the answer is undef.
1640 As a degenerate case, if LIST is empty, returns $node's mother;
1641 that'll be undef if $node is root.
1643 =cut
1645 sub common_ancestor {
1646 my @ones = @_; # all nodes I was given
1647 my($first, @others) = @_;
1649 return $first->{'mother'} unless @others;
1650 # which may be undef if $first is the root!
1652 my %ones;
1653 @ones{ @ones } = undef; # my arguments
1655 my $common = $first->common(@others);
1656 if(exists($ones{$common})) { # if the common is one of my nodes...
1657 return $common->{'mother'};
1658 # and this might be undef, if $common is root!
1659 } else {
1660 return $common;
1661 # which might be null if that's all common came up with
1665 ###########################################################################
1666 ###########################################################################
1668 =back
1670 =head1 YET MORE METHODS
1672 =over
1674 =item $node->walk_down({ callback => \&foo, callbackback => \&foo, ... })
1676 Performs a depth-first traversal of the structure at and under $node.
1677 What it does at each node depends on the value of the options hashref,
1678 which you must provide. There are three options, "callback" and
1679 "callbackback" (at least one of which must be defined, as a sub
1680 reference), and "_depth". This is what C<walk_down> does, in
1681 pseudocode form:
1683 * Start at the $node given.
1685 * If there's a C<callback>, call it with $node as the first argument,
1686 and the options hashref as the second argument (which contains the
1687 potentially useful C<_depth>, remember). This function must return
1688 true or false -- if false, it will block the next step:
1690 * If $node has any daughter nodes, increment C<_depth>, and call
1691 $daughter->walk_down(options_hashref) for each daughter (in order, of
1692 course), where options_hashref is the same hashref it was called with.
1693 When this returns, decrements C<_depth>.
1695 * If there's a C<callbackback>, call just it as with C<callback> (but
1696 tossing out the return value). Note that C<callback> returning false
1697 blocks traversal below $node, but doesn't block calling callbackback
1698 for $node. (Incidentally, in the unlikely case that $node has stopped
1699 being a node object, C<callbackback> won't get called.)
1701 * Return.
1703 $node->walk_down is the way to recursively do things to a tree (if you
1704 start at the root) or part of a tree; if what you're doing is best done
1705 via pre-pre order traversal, use C<callback>; if what you're doing is
1706 best done with post-order traversal, use C<callbackback>.
1707 C<walk_down> is even the basis for plenty of the methods in this
1708 class. See the source code for examples both simple and horrific.
1710 Note that if you don't specify C<_depth>, it effectively defaults to
1711 0. You should set it to scalar($node->ancestors) if you want
1712 C<_depth> to reflect the true depth-in-the-tree for the nodes called,
1713 instead of just the depth below $node. (If $node is the root, there's
1714 difference, of course.)
1716 And B<by the way>, it's a bad idea to modify the tree from the callback.
1717 Unpredictable things may happen. I instead suggest having your callback
1718 add to a stack of things that need changing, and then, once C<walk_down>
1719 is all finished, changing those nodes from that stack.
1721 Note that the existence of C<walk_down> doesn't mean you can't write
1722 you own special-use traversers.
1724 =cut
1726 sub walk_down {
1727 my($this, $o) = @_[0,1];
1729 # All the can()s are in case an object changes class while I'm
1730 # looking at it.
1732 Carp::croak "I need options!" unless ref($o);
1733 Carp::croak "I need a callback or a callbackback" unless
1734 ( ref($o->{'callback'}) || ref($o->{'callbackback'}) );
1736 # $this->no_cyclicity;
1737 my $callback = ref($o->{'callback'}) ? $o->{'callback'} : undef;
1738 my $callbackback = ref($o->{'callbackback'}) ? $o->{'callbackback'} : undef;
1739 my $callback_status = 1;
1741 print "Callback: $callback Callbackback: $callbackback\n" if $Debug;
1743 printf "* Entering %s\n", ($this->name || $this) if $Debug;
1744 $callback_status = &{ $callback }( $this, $o ) if $callback;
1746 if($callback_status) {
1747 # Keep recursing unless callback returned false... and if there's
1748 # anything to recurse into, of course.
1749 my @daughters = UNIVERSAL::can($this, 'is_node') ? @{$this->{'daughters'}} : ();
1750 if(@daughters) {
1751 $o->{'_depth'} += 1;
1752 #print "Depth " , $o->{'_depth'}, "\n";
1753 foreach my $one (@daughters) {
1754 $one->walk_down($o) if UNIVERSAL::can($one, 'is_node');
1755 # and if it can do "is_node", it should provide a walk_down!
1757 $o->{'_depth'} -= 1;
1759 } else {
1760 printf "* Recursing below %s pruned\n", ($this->name || $this) if $Debug;
1763 # Note that $callback_status doesn't block callbackback from being called
1764 if($callbackback){
1765 if(UNIVERSAL::can($this, 'is_node')) { # if it's still a node!
1766 print "* Calling callbackback\n" if $Debug;
1767 scalar( &{ $callbackback }( $this, $o ) );
1768 # scalar to give it the same context as callback
1769 } else {
1770 print "* Can't call callbackback -- $this isn't a node anymore\n"
1771 if $Debug;
1774 if($Debug) {
1775 if(UNIVERSAL::can($this, 'is_node')) { # if it's still a node!
1776 printf "* Leaving %s\n", ($this->name || $this)
1777 } else {
1778 print "* Leaving [no longer a node]\n";
1781 return;
1784 ###########################################################################
1786 =item @lines = $node->dump_names({ ...options... });
1788 Dumps, as an indented list, the names of the nodes starting at $node,
1789 and continuing under it. Options are:
1791 * _depth -- A nonnegative number. Indicating the depth to consider
1792 $node as being at (and so the generation under that is that plus one,
1793 etc.). Defaults to 0. You may choose to use set _depth =>
1794 scalar($node->ancestors).
1796 * tick -- a string to preface each entry with, between the
1797 indenting-spacing and the node's name. Defaults to empty-string. You
1798 may prefer "*" or "-> " or someting.
1800 * indent -- the string used to indent with. Defaults to " " (two
1801 spaces). Another sane value might be ". " (period, space). Setting it
1802 to empty-string suppresses indenting.
1804 The dump is not printed, but is returned as a list, where each
1805 item is a line, with a "\n" at the end.
1807 =cut
1809 sub dump_names {
1810 my($it, $o) = @_[0,1];
1811 $o = {} unless ref $o;
1812 my @out = ();
1813 $o->{'_depth'} ||= 0;
1814 $o->{'indent'} ||= ' ';
1815 $o->{'tick'} ||= '';
1817 $o->{'callback'} = sub {
1818 my($this, $o) = @_[0,1];
1819 push(@out,
1820 join('',
1821 $o->{'indent'} x $o->{'_depth'},
1822 $o->{'tick'},
1823 &Tree::DAG_Node::_dump_quote($this->name || $this),
1824 "\n"
1827 return 1;
1830 $it->walk_down($o);
1831 return @out;
1834 ###########################################################################
1835 ###########################################################################
1837 =item the constructor CLASS->random_network({...options...})
1839 =item the method $node->random_network({...options...})
1841 In the first case, constructs a randomly arranged network under a new
1842 node, and returns the root node of that tree. In the latter case,
1843 constructs the network under $node.
1845 Currently, this is implemented a bit half-heartedly, and
1846 half-wittedly. I basically needed to make up random-looking networks
1847 to stress-test the various tree-dumper methods, and so wrote this. If
1848 you actually want to rely on this for any application more
1849 serious than that, I suggest examining the source code and seeing if
1850 this does really what you need (say, in reliability of randomness);
1851 and feel totally free to suggest changes to me (especially in the form
1852 of "I rewrote C<random_network>, here's the code...")
1854 It takes four options:
1856 * max_node_count -- maximum number of nodes this tree will be allowed
1857 to have (counting the root). Defaults to 25.
1859 * min_depth -- minimum depth for the tree. Defaults to 2. Leaves can
1860 be generated only after this depth is reached, so the tree will be at
1861 least this deep -- unless max_node_count is hit first.
1863 * max_depth -- maximum depth for the tree. Defaults to 3 plus
1864 min_depth. The tree will not be deeper than this.
1866 * max_children -- maximum number of children any mother in the tree
1867 can have. Defaults to 4.
1869 =cut
1871 sub random_network { # constructor or method.
1872 my $class = $_[0];
1873 my $o = ref($_[1]) ? $_[1] : {};
1874 my $am_cons = 0;
1875 my $root;
1877 if(ref($class)){ # I'm a method.
1878 $root = $_[0]; # build under the given node, from same class.
1879 $class = ref $class;
1880 $am_cons = 0;
1881 } else { # I'm a constructor
1882 $root = $class->new; # build under a new node, with class named.
1883 $root->name("Root");
1884 $am_cons = 1;
1887 my $min_depth = $o->{'min_depth'} || 2;
1888 my $max_depth = $o->{'max_depth'} || ($min_depth + 3);
1889 my $max_children = $o->{'max_children'} || 4;
1890 my $max_node_count = $o->{'max_node_count'} || 25;
1892 Carp::croak "max_children has to be positive" if int($max_children) < 1;
1894 my @mothers = ( $root );
1895 my @children = ( );
1896 my $node_count = 1; # the root
1898 Gen:
1899 foreach my $depth (1 .. $max_depth) {
1900 last if $node_count > $max_node_count;
1901 Mother:
1902 foreach my $mother (@mothers) {
1903 last Gen if $node_count > $max_node_count;
1904 my $children_number;
1905 if($depth <= $min_depth) {
1906 until( $children_number = int(rand(1 + $max_children)) ) {}
1907 } else {
1908 $children_number = int(rand($max_children));
1910 Beget:
1911 foreach (1 .. $children_number) {
1912 last Gen if $node_count > $max_node_count;
1913 my $node = $mother->new_daughter;
1914 $node->name("Node$node_count");
1915 ++$node_count;
1916 push(@children, $node);
1919 @mothers = @children;
1920 @children = ();
1921 last unless @mothers;
1924 return $root;
1927 =item the constructor CLASS->lol_to_tree($lol);
1929 Converts something like bracket-notation for "Chomsky trees" (or
1930 rather, the closest you can come with Perl
1931 list-of-lists(-of-lists(-of-lists))) into a tree structure. Returns
1932 the root of the tree converted.
1934 The conversion rules are that: 1) if the last (possibly the only) item
1935 in a given list is a scalar, then that is used as the "name" attribute
1936 for the node based on this list. 2) All other items in the list
1937 represent daughter nodes of the current node -- recursively so, if
1938 they are list references; otherwise, (non-terminal) scalars are
1939 considered to denote nodes with that name. So ['Foo', 'Bar', 'N'] is
1940 an alternate way to represent [['Foo'], ['Bar'], 'N'].
1942 An example will illustrate:
1944 use Tree::DAG_Node;
1945 $lol =
1948 [ [ 'Det:The' ],
1949 [ [ 'dog' ], 'N'], 'NP'],
1950 [ '/with rabies\\', 'PP'],
1951 'NP'
1953 [ 'died', 'VP'],
1956 $tree = Tree::DAG_Node->lol_to_tree($lol);
1957 $diagram = $tree->draw_ascii_tree;
1958 print map "$_\n", @$diagram;
1960 ...returns this tree:
1963 <S>
1965 /------------------\
1966 | |
1967 <NP> <VP>
1968 | |
1969 /---------------\ <died>
1970 | |
1971 <NP> <PP>
1972 | |
1973 /-------\ </with rabies\>
1974 | |
1975 <Det:The> <N>
1977 <dog>
1979 By the way (and this rather follows from the above rules), when
1980 denoting a LoL tree consisting of just one node, this:
1982 $tree = Tree::DAG_Node->lol_to_tree( 'Lonely' );
1984 is okay, although it'd probably occur to you to denote it only as:
1986 $tree = Tree::DAG_Node->lol_to_tree( ['Lonely'] );
1988 which is of course fine, too.
1990 =cut
1992 sub lol_to_tree {
1993 my($class, $lol, $seen_r) = @_[0,1,2];
1994 $seen_r = {} unless ref($seen_r) eq 'HASH';
1995 return if ref($lol) && $seen_r->{$lol}++; # catch circularity
1997 $class = ref($class) || $class;
1998 my $node = $class->new();
2000 unless(ref($lol) eq 'ARRAY') { # It's a terminal node.
2001 $node->name($lol) if defined $lol;
2002 return $node;
2004 return $node unless @$lol; # It's a terminal node, oddly represented
2006 # It's a non-terminal node.
2008 my @options = @$lol;
2009 unless(ref($options[-1]) eq 'ARRAY') {
2010 # This is what separates this method from simple_lol_to_tree
2011 $node->name(pop(@options));
2014 foreach my $d (@options) { # Scan daughters (whether scalars or listrefs)
2015 $node->add_daughter( $class->lol_to_tree($d, $seen_r) ); # recurse!
2018 return $node;
2021 #--------------------------------------------------------------------------
2023 =item $node->tree_to_lol_notation({...options...})
2025 Dumps a tree (starting at $node) as the sort of LoL-like bracket
2026 notation you see in the above example code. Returns just one big
2027 block of text. The only option is "multiline" -- if true, it dumps
2028 the text as the sort of indented structure as seen above; if false
2029 (and it defaults to false), dumps it all on one line (with no
2030 indenting, of course).
2032 For example, starting with the tree from the above example,
2033 this:
2035 print $tree->tree_to_lol_notation, "\n";
2037 prints the following (which I've broken over two lines for sake of
2038 printablitity of documentation):
2040 [[[['Det:The'], [['dog'], 'N'], 'NP'], [["/with rabies\x5c"],
2041 'PP'], 'NP'], [['died'], 'VP'], 'S'],
2043 Doing this:
2045 print $tree->tree_to_lol_notation({ multiline => 1 });
2047 prints the same content, just spread over many lines, and prettily
2048 indented.
2050 =cut
2052 #--------------------------------------------------------------------------
2054 sub tree_to_lol_notation {
2055 my $root = $_[0];
2056 my($it, $o) = @_[0,1];
2057 $o = {} unless ref $o;
2058 my @out = ();
2059 $o->{'_depth'} ||= 0;
2060 $o->{'multiline'} = 0 unless exists($o->{'multiline'});
2062 my $line_end;
2063 if($o->{'multiline'}) {
2064 $o->{'indent'} ||= ' ';
2065 $line_end = "\n";
2066 } else {
2067 $o->{'indent'} ||= '';
2068 $line_end = '';
2071 $o->{'callback'} = sub {
2072 my($this, $o) = @_[0,1];
2073 push(@out,
2074 $o->{'indent'} x $o->{'_depth'},
2075 "[$line_end",
2077 return 1;
2080 $o->{'callbackback'} = sub {
2081 my($this, $o) = @_[0,1];
2082 my $name = $this->name;
2083 if(!defined($name)) {
2084 $name = 'undef';
2085 } else {
2086 $name = &Tree::DAG_Node::_dump_quote($name);
2088 push(@out,
2089 $o->{'indent'} x ($o->{'_depth'} + 1),
2090 "$name$line_end",
2091 $o->{'indent'} x $o->{'_depth'},
2092 "], $line_end",
2094 return 1;
2097 $it->walk_down($o);
2098 return join('', @out);
2101 #--------------------------------------------------------------------------
2103 =item $node->tree_to_lol
2105 Returns that tree (starting at $node) represented as a LoL, like what
2106 $lol, above, holds. (This is as opposed to C<tree_to_lol_notation>,
2107 which returns the viewable code like what gets evaluated and stored in
2108 $lol, above.)
2110 Lord only knows what you use this for -- maybe for feeding to
2111 Data::Dumper, in case C<tree_to_lol_notation> doesn't do just what you
2112 want?
2114 =cut
2116 sub tree_to_lol {
2117 # I haven't /rigorously/ tested this.
2118 my($it, $o) = @_[0,1]; # $o is currently unused anyway
2119 $o = {} unless ref $o;
2121 my $out = [];
2122 my @lol_stack = ($out);
2123 $o->{'callback'} = sub {
2124 my($this, $o) = @_[0,1];
2125 my $new = [];
2126 push @{$lol_stack[-1]}, $new;
2127 push(@lol_stack, $new);
2128 return 1;
2131 $o->{'callbackback'} = sub {
2132 my($this, $o) = @_[0,1];
2133 push @{$lol_stack[-1]}, $this->name;
2134 pop @lol_stack;
2135 return 1;
2138 $it->walk_down($o);
2139 die "totally bizarre error 12416" unless ref($out->[0]);
2140 $out = $out->[0]; # the real root
2141 return $out;
2144 ###########################################################################
2146 =item the constructor CLASS->simple_lol_to_tree($simple_lol);
2148 This is like lol_to_tree, except that rule 1 doesn't apply -- i.e.,
2149 all scalars (or really, anything not a listref) in the LoL-structure
2150 end up as named terminal nodes, and only terminal nodes get names
2151 (and, of course, that name comes from that scalar value). This method
2152 is useful for making things like expression trees, or at least
2153 starting them off. Consider that this:
2155 $tree = Tree::DAG_Node->simple_lol_to_tree(
2156 [ 'foo', ['bar', ['baz'], 'quux'], 'zaz', 'pati' ]
2159 converts from something like a Lispish or Iconish tree, if you pretend
2160 the brackets are parentheses.
2162 Note that there is a (possibly surprising) degenerate case of what I'm
2163 calling a "simple-LoL", and it's like this:
2165 $tree = Tree::DAG_Node->simple_lol_to_tree('Lonely');
2167 This is the (only) way you can specify a tree consisting of only a
2168 single node, which here gets the name 'Lonely'.
2170 =cut
2172 sub simple_lol_to_tree {
2173 my($class, $lol, $seen_r) = @_[0,1,2];
2174 $class = ref($class) || $class;
2175 $seen_r = {} unless ref($seen_r) eq 'HASH';
2176 return if ref($lol) && $seen_r->{$lol}++; # catch circularity
2178 my $node = $class->new();
2180 unless(ref($lol) eq 'ARRAY') { # It's a terminal node.
2181 $node->name($lol) if defined $lol;
2182 return $node;
2185 # It's a non-terminal node.
2186 foreach my $d (@$lol) { # scan daughters (whether scalars or listrefs)
2187 $node->add_daughter( $class->simple_lol_to_tree($d, $seen_r) ); # recurse!
2190 return $node;
2193 #--------------------------------------------------------------------------
2195 =item $node->tree_to_simple_lol
2197 Returns that tree (starting at $node) represented as a simple-LoL --
2198 i.e., one where non-terminal nodes are represented as listrefs, and
2199 terminal nodes are gotten from the contents of those nodes' "name'
2200 attributes.
2202 Note that in the case of $node being terminal, what you get back is
2203 the same as $node->name.
2205 Compare to tree_to_simple_lol_notation.
2207 =cut
2209 sub tree_to_simple_lol {
2210 # I haven't /rigorously/ tested this.
2211 my $root = $_[0];
2213 return $root->name unless scalar($root->daughters);
2214 # special case we have to nip in the bud
2216 my($it, $o) = @_[0,1]; # $o is currently unused anyway
2217 $o = {} unless ref $o;
2219 my $out = [];
2220 my @lol_stack = ($out);
2221 $o->{'callback'} = sub {
2222 my($this, $o) = @_[0,1];
2223 my $new;
2224 $new = scalar($this->daughters) ? [] : $this->name;
2225 # Terminal nodes are scalars, the rest are listrefs we'll fill in
2226 # as we recurse the tree below here.
2227 push @{$lol_stack[-1]}, $new;
2228 push(@lol_stack, $new);
2229 return 1;
2232 $o->{'callbackback'} = sub { pop @lol_stack; return 1; };
2233 $it->walk_down($o);
2234 die "totally bizarre error 12416" unless ref($out->[0]);
2235 $out = $out->[0]; # the real root
2236 return $out;
2239 #--------------------------------------------------------------------------
2241 =item $node->tree_to_simple_lol_notation({...options...})
2243 A simple-LoL version of tree_to_lol_notation (which see); takes the
2244 same options.
2246 =cut
2248 sub tree_to_simple_lol_notation {
2249 my($it, $o) = @_[0,1];
2250 $o = {} unless ref $o;
2251 my @out = ();
2252 $o->{'_depth'} ||= 0;
2253 $o->{'multiline'} = 0 unless exists($o->{'multiline'});
2255 my $line_end;
2256 if($o->{'multiline'}) {
2257 $o->{'indent'} ||= ' ';
2258 $line_end = "\n";
2259 } else {
2260 $o->{'indent'} ||= '';
2261 $line_end = '';
2264 $o->{'callback'} = sub {
2265 my($this, $o) = @_[0,1];
2266 if(scalar($this->daughters)) { # Nonterminal
2267 push(@out,
2268 $o->{'indent'} x $o->{'_depth'},
2269 "[$line_end",
2271 } else { # Terminal
2272 my $name = $this->name;
2273 push @out,
2274 $o->{'indent'} x $o->{'_depth'},
2275 defined($name) ? &Tree::DAG_Node::_dump_quote($name) : 'undef',
2276 ",$line_end";
2278 return 1;
2281 $o->{'callbackback'} = sub {
2282 my($this, $o) = @_[0,1];
2283 push(@out,
2284 $o->{'indent'} x $o->{'_depth'},
2285 "], $line_end",
2286 ) if scalar($this->daughters);
2287 return 1;
2291 $it->walk_down($o);
2292 return join('', @out);
2295 ###########################################################################
2296 # $list_r = $root_node->draw_ascii_tree({ h_compact => 1});
2297 # print map("$_\n", @$list_r);
2299 =item $list_r = $node->draw_ascii_tree({ ... options ... })
2301 Draws a nice ASCII-art representation of the tree structure
2302 at-and-under $node, with $node at the top. Returns a reference to the
2303 list of lines (with no "\n"s or anything at the end of them) that make
2304 up the picture.
2306 Example usage:
2308 print map("$_\n", @{$tree->draw_ascii_tree});
2310 draw_ascii_tree takes parameters you set in the options hashref:
2312 * "no_name" -- if true, C<draw_ascii_tree> doesn't print the name of
2313 the node; simply prints a "*". Defaults to 0 (i.e., print the node
2314 name.)
2316 * "h_spacing" -- number 0 or greater. Sets the number of spaces
2317 inserted horizontally between nodes (and groups of nodes) in a tree.
2318 Defaults to 1.
2320 * "h_compact" -- number 0 or 1. Sets the extent to which
2321 C<draw_ascii_tree> tries to save horizontal space. Defaults to 1. If
2322 I think of a better scrunching algorithm, there'll be a "2" setting
2323 for this.
2325 * "v_compact" -- number 0, 1, or 2. Sets the degree to which
2326 C<draw_ascii_tree> tries to save vertical space. Defaults to 1.
2328 This occasionally returns trees that are a bit cock-eyed in parts; if
2329 anyone can suggest a better drawing algorithm, I'd be appreciative.
2331 =cut
2333 sub draw_ascii_tree {
2334 # Make a "box" for this node and its possible daughters, recursively.
2336 # The guts of this routine are horrific AND recursive!
2338 # Feel free to send me better code. I worked on this until it
2339 # gave me a headache and it worked passably, and then I stopped.
2341 my $it = $_[0];
2342 my $o = ref($_[1]) ? $_[1] : {};
2343 my(@box, @daughter_boxes, $width, @daughters);
2344 @daughters = @{$it->{'daughters'}};
2346 # $it->no_cyclicity;
2348 $o->{'no_name'} = 0 unless exists $o->{'no_name'};
2349 $o->{'h_spacing'} = 1 unless exists $o->{'h_spacing'};
2350 $o->{'h_compact'} = 1 unless exists $o->{'h_compact'};
2351 $o->{'v_compact'} = 1 unless exists $o->{'v_compact'};
2353 my $printable_name;
2354 if($o->{'no_name'}) {
2355 $printable_name = '*';
2356 } else {
2357 $printable_name = $it->name || $it;
2358 $printable_name =~ tr<\cm\cj\t >< >s;
2359 $printable_name = "<$printable_name>";
2362 if(!scalar(@daughters)) { # I am a leaf!
2363 # Now add the top parts, and return.
2364 @box = ("|", $printable_name);
2365 } else {
2366 @daughter_boxes = map { &draw_ascii_tree($_, $o) } @daughters;
2368 my $max_height = 0;
2369 foreach my $box (@daughter_boxes) {
2370 my $h = @$box;
2371 $max_height = $h if $h > $max_height;
2374 @box = ('') x $max_height; # establish the list
2376 foreach my $one (@daughter_boxes) {
2377 my $length = length($one->[0]);
2378 my $height = @$one;
2380 #now make all the same height.
2381 my $deficit = $max_height - $height;
2382 if($deficit > 0) {
2383 push @$one, ( scalar( ' ' x $length ) ) x $deficit;
2384 $height = scalar(@$one);
2388 # Now tack 'em onto @box
2389 ##########################################################
2390 # This used to be a sub of its own. Ho-hum.
2392 my($b1, $b2) = (\@box, $one);
2393 my($h1, $h2) = (scalar(@$b1), scalar(@$b2));
2395 my(@diffs, $to_chop);
2396 if($o->{'h_compact'}) { # Try for h-scrunching.
2397 my @diffs;
2398 my $min_diff = length($b1->[0]); # just for starters
2399 foreach my $line (0 .. ($h1 - 1)) {
2400 my $size_l = 0; # length of terminal whitespace
2401 my $size_r = 0; # length of initial whitespace
2402 $size_l = length($1) if $b1->[$line] =~ /( +)$/s;
2403 $size_r = length($1) if $b2->[$line] =~ /^( +)/s;
2404 my $sum = $size_l + $size_r;
2406 $min_diff = $sum if $sum < $min_diff;
2407 push @diffs, [$sum, $size_l, $size_r];
2409 $to_chop = $min_diff - $o->{'h_spacing'};
2410 $to_chop = 0 if $to_chop < 0;
2413 if(not( $o->{'h_compact'} and $to_chop )) {
2414 # No H-scrunching needed/possible
2415 foreach my $line (0 .. ($h1 - 1)) {
2416 $b1->[ $line ] .= $b2->[ $line ] . (' ' x $o->{'h_spacing'});
2418 } else {
2419 # H-scrunching is called for.
2420 foreach my $line (0 .. ($h1 - 1)) {
2421 my $r = $b2->[$line]; # will be the new line
2422 my $remaining = $to_chop;
2423 if($remaining) {
2424 my($l_chop, $r_chop) = @{$diffs[$line]}[1,2];
2426 if($l_chop) {
2427 if($l_chop > $remaining) {
2428 $l_chop = $remaining;
2429 $remaining = 0;
2430 } elsif($l_chop == $remaining) {
2431 $remaining = 0;
2432 } else { # remaining > l_chop
2433 $remaining -= $l_chop;
2436 if($r_chop) {
2437 if($r_chop > $remaining) {
2438 $r_chop = $remaining;
2439 $remaining = 0;
2440 } elsif($r_chop == $remaining) {
2441 $remaining = 0;
2442 } else { # remaining > r_chop
2443 $remaining -= $r_chop; # should never happen!
2447 substr($b1->[$line], -$l_chop) = '' if $l_chop;
2448 substr($r, 0, $r_chop) = '' if $r_chop;
2449 } # else no-op
2450 $b1->[ $line ] .= $r . (' ' x $o->{'h_spacing'});
2452 # End of H-scrunching ickyness
2454 # End of ye big tack-on
2457 # End of the foreach daughter_box loop
2459 # remove any fencepost h_spacing
2460 if($o->{'h_spacing'}) {
2461 foreach my $line (@box) {
2462 substr($line, -$o->{'h_spacing'}) = '' if length($line);
2466 # end of catenation
2467 die "SPORK ERROR 958203: Freak!!!!!" unless @box;
2469 # Now tweak the pipes
2470 my $new_pipes = $box[0];
2471 my $pipe_count = $new_pipes =~ tr<|><+>;
2472 if($pipe_count < 2) {
2473 $new_pipes = "|";
2474 } else {
2475 my($init_space, $end_space);
2477 # Thanks to Gilles Lamiral for pointing out the need to set to '',
2478 # to avoid -w warnings about undeffiness.
2480 if( $new_pipes =~ s<^( +)><>s ) {
2481 $init_space = $1;
2482 } else {
2483 $init_space = '';
2486 if( $new_pipes =~ s<( +)$><>s ) {
2487 $end_space = $1
2488 } else {
2489 $end_space = '';
2492 $new_pipes =~ tr< ><->;
2493 substr($new_pipes,0,1) = "/";
2494 substr($new_pipes,-1,1) = "\\";
2496 $new_pipes = $init_space . $new_pipes . $end_space;
2497 # substr($new_pipes, int((length($new_pipes)), 1)) / 2) = "^"; # feh
2500 # Now tack on the formatting for this node.
2501 if($o->{'v_compact'} == 2) {
2502 if(@daughters == 1) {
2503 unshift @box, "|", $printable_name;
2504 } else {
2505 unshift @box, "|", $printable_name, $new_pipes;
2507 } elsif ($o->{'v_compact'} == 1 and @daughters == 1) {
2508 unshift @box, "|", $printable_name;
2509 } else { # general case
2510 unshift @box, "|", $printable_name, $new_pipes;
2514 # Flush the edges:
2515 my $max_width = 0;
2516 foreach my $line (@box) {
2517 my $w = length($line);
2518 $max_width = $w if $w > $max_width;
2520 foreach my $one (@box) {
2521 my $space_to_add = $max_width - length($one);
2522 next unless $space_to_add;
2523 my $add_left = int($space_to_add / 2);
2524 my $add_right = $space_to_add - $add_left;
2525 $one = (' ' x $add_left) . $one . (' ' x $add_right);
2528 return \@box; # must not return a null list!
2531 ###########################################################################
2533 =item $node->copy_tree or $node->copy_tree({...options...})
2535 This returns the root of a copy of the tree that $node is a member of.
2536 If you pass no options, copy_tree pretends you've passed {}.
2538 This method is currently implemented as just a call to
2539 $this->root->copy_at_and_under({...options...}), but magic may be
2540 added in the future.
2542 Options you specify are passed down to calls to $node->copy.
2544 =cut
2546 sub copy_tree {
2547 my($this, $o) = @_[0,1];
2548 my $root = $this->root;
2549 $o = {} unless ref $o;
2551 my $new_root = $root->copy_at_and_under($o);
2553 return $new_root;
2556 =item $node->copy_at_and_under or $node->copy_at_and_under({...options...})
2558 This returns a copy of the subtree consisting of $node and everything
2559 under it.
2561 If you pass no options, copy_at_and_under pretends you've passed {}.
2563 This works by recursively building up the new tree from the leaves,
2564 duplicating nodes using $orig_node->copy($options_ref) and then
2565 linking them up into a new tree of the same shape.
2567 Options you specify are passed down to calls to $node->copy.
2569 =cut
2571 sub copy_at_and_under {
2572 my($from, $o) = @_[0,1];
2573 $o = {} unless ref $o;
2574 my @daughters = map($_->copy_at_and_under($o), @{$from->{'daughters'}});
2575 my $to = $from->copy($o);
2576 $to->set_daughters(@daughters) if @daughters;
2577 return $to;
2580 =item the constructor $node->copy or $node->copy({...options...})
2582 Returns a copy of $node, B<minus> its daughter or mother attributes
2583 (which are set back to default values).
2585 If you pass no options, C<copy> pretends you've passed {}.
2587 Magic happens with the 'attributes' attribute: if it's a hashref (and
2588 it usually is), the new node doesn't end up with the same hashref, but
2589 with ref to a hash with the content duplicated from the original's
2590 hashref. If 'attributes' is not a hashref, but instead an object that
2591 belongs to a class that provides a method called "copy", then that
2592 method is called, and the result saved in the clone's 'attribute'
2593 attribute. Both of these kinds of magic are disabled if the options
2594 you pass to C<copy> (maybe via C<copy_tree>, or C<copy_at_and_under>)
2595 includes (C<no_attribute_copy> => 1).
2597 The options hashref you pass to C<copy> (derictly or indirectly) gets
2598 changed slightly after you call C<copy> -- it gets an entry called
2599 "from_to" added to it. Chances are you would never know nor care, but
2600 this is reserved for possible future use. See the source if you are
2601 wildly curious.
2603 Note that if you are using $node->copy (whether directly or via
2604 $node->copy_tree or $node->copy_at_or_under), and it's not properly
2605 copying object attributes containing references, you probably
2606 shouldn't fight it or try to fix it -- simply override copy_tree with:
2608 sub copy_tree {
2609 use Storable qw(dclone);
2610 my $this = $_[0];
2611 return dclone($this->root);
2612 # d for "deep"
2617 sub copy_tree {
2618 use Data::Dumper;
2619 my $this = $_[0];
2620 $Data::Dumper::Purity = 1;
2621 return eval(Dumper($this->root));
2624 Both of these avoid you having to reinvent the wheel.
2626 How to override copy_at_or_under with something that uses Storable
2627 or Data::Dumper is left as an exercise to the reader.
2629 Consider that if in a derived class, you add attributes with really
2630 bizarre contents (like a unique-for-all-time-ID), you may need to
2631 override C<copy>. Consider:
2633 sub copy {
2634 my($it, @etc) = @_;
2635 $it->SUPER::copy(@etc);
2636 $it->{'UID'} = &get_new_UID;
2639 ...or the like. See the source of Tree::DAG_Node::copy for
2640 inspiration.
2642 =cut
2644 sub copy {
2645 my($from,$o) = @_[0,1];
2646 $o = {} unless ref $o;
2648 # Straight dupe, and bless into same class:
2649 my $to = bless { %$from }, ref($from);
2651 # Null out linkages.
2652 $to->_init_mother;
2653 $to->_init_daughters;
2655 # dupe the 'attributes' attribute:
2656 unless($o->{'no_attribute_copy'}) {
2657 my $attrib_copy = ref($to->{'attributes'});
2658 if($attrib_copy) {
2659 if($attrib_copy eq 'HASH') {
2660 $to->{'attributes'} = { %{$to->{'attributes'}} };
2661 # dupe the hashref
2662 } elsif ($attrib_copy = UNIVERSAL::can($to->{'attributes'}, 'copy') ) {
2663 # $attrib_copy now points to the copier method
2664 $to->{'attributes'} = &{$attrib_copy}($from);
2665 } # otherwise I don't know how to copy it; leave as is
2668 $o->{'from_to'}->{$from} = $to; # SECRET VOODOO
2669 # ...autovivifies an anon hashref for 'from_to' if need be
2670 # This is here in case I later want/need a table corresponding
2671 # old nodes to new.
2672 return $to;
2676 ###########################################################################
2678 =item $node->delete_tree
2680 Destroys the entire tree that $node is a member of (starting at the
2681 root), by nulling out each node-object's attributes (including, most
2682 importantly, its linkage attributes -- hopefully this is more than
2683 sufficient to eliminate all circularity in the data structure), and
2684 then moving it into the class DEADNODE.
2686 Use this when you're finished with the tree in question, and want to
2687 free up its memory. (If you don't do this, it'll get freed up anyway
2688 when your program ends.)
2690 If you try calling any methods on any of the node objects in the tree
2691 you've destroyed, you'll get an error like:
2693 Can't locate object method "leaves_under"
2694 via package "DEADNODE".
2696 So if you see that, that's what you've done wrong. (Actually, the
2697 class DEADNODE does provide one method: a no-op method "delete_tree".
2698 So if you want to delete a tree, but think you may have deleted it
2699 already, it's safe to call $node->delete_tree on it (again).)
2701 The C<delete_tree> method is needed because Perl's garbage collector
2702 would never (as currently implemented) see that it was time to
2703 de-allocate the memory the tree uses -- until either you call
2704 $node->delete_tree, or until the program stops (at "global
2705 destruction" time, when B<everything> is unallocated).
2707 Incidentally, there are better ways to do garbage-collecting on a
2708 tree, ways which don't require the user to explicitly call a method
2709 like C<delete_tree> -- they involve dummy classes, as explained at
2710 C<http://mox.perl.com/misc/circle-destroy.pod>
2712 However, introducing a dummy class concept into Tree::DAG_Node would
2713 be rather a distraction. If you want to do this with your derived
2714 classes, via a DESTROY in a dummy class (or in a tree-metainformation
2715 class, maybe), then feel free to.
2717 The only case where I can imagine C<delete_tree> failing to totally
2718 void the tree, is if you use the hashref in the "attributes" attribute
2719 to store (presumably among other things) references to other nodes'
2720 "attributes" hashrefs -- which 1) is maybe a bit odd, and 2) is your
2721 problem, because it's your hash structure that's circular, not the
2722 tree's. Anyway, consider:
2724 # null out all my "attributes" hashes
2725 $anywhere->root->walk_down({
2726 'callback' => sub {
2727 $hr = $_[0]->attributes; %$hr = (); return 1;
2730 # And then:
2731 $anywhere->delete_tree;
2733 (I suppose C<delete_tree> is a "destructor", or as close as you can
2734 meaningfully come for a circularity-rich data structure in Perl.)
2736 =cut
2738 sub delete_tree {
2739 my $it = $_[0];
2740 $it->root->walk_down({ # has to be callbackback, not callback
2741 'callbackback' => sub {
2742 %{$_[0]} = ();
2743 bless($_[0], 'DEADNODE'); # cause become dead! cause become dead!
2744 return 1;
2747 return;
2748 # Why DEADNODE? Because of the nice error message:
2749 # "Can't locate object method "leaves_under" via package "DEADNODE"."
2750 # Moreover, DEADNODE doesn't provide is_node, so fails my can() tests.
2753 sub DEADNODE::delete_tree { return; }
2754 # in case you kill it AGAIN!!!!! AND AGAIN AND AGAIN!!!!!! OO-HAHAHAHA!
2756 ###########################################################################
2757 # stolen from MIDI.pm
2759 sub _dump_quote {
2760 my @stuff = @_;
2761 return
2762 join(", ",
2764 { # the cleaner-upper function
2765 if(!length($_)) { # empty string
2766 "''";
2767 } elsif( m/^-?\d+(?:\.\d+)?$/s ) { # a number
2769 } elsif( # text with junk in it
2770 s<([^\x20\x21\x23\x27-\x3F\x41-\x5B\x5D-\x7E])>
2771 <'\\x'.(unpack("H2",$1))>eg
2773 "\"$_\"";
2774 } else { # text with no junk in it
2775 s<'><\\'>g;
2776 "\'$_\'";
2779 @stuff
2783 ###########################################################################
2785 =back
2787 =head2 When and How to Destroy
2789 It should be clear to you that if you've built a big parse tree or
2790 something, and then you're finished with it, you should call
2791 $some_node->delete_tree on it if you want the memory back.
2793 But consider this case: you've got this tree:
2796 / | \
2797 B C D
2798 | | \
2799 E X Y
2801 Let's say you decide you don't want D or any of its descendants in the
2802 tree, so you call D->unlink_from_mother. This does NOT automagically
2803 destroy the tree D-X-Y. Instead it merely splits the tree into two:
2806 / \ / \
2807 B C X Y
2811 To destroy D and its little tree, you have to explicitly call
2812 delete_tree on it.
2814 Note, however, that if you call C->unlink_from_mother, and if you don't
2815 have a link to C anywhere, then it B<does> magically go away. This is
2816 because nothing links to C -- whereas with the D-X-Y tree, D links to
2817 X and Y, and X and Y each link back to D. Note that calling
2818 C->delete_tree is harmless -- after all, a tree of only one node is
2819 still a tree.
2821 So, this is a surefire way of getting rid of all $node's children and
2822 freeing up the memory associated with them and their descendants:
2824 foreach my $it ($node->clear_daughters) { $it->delete_tree }
2826 Just be sure not to do this:
2828 foreach my $it ($node->daughters) { $it->delete_tree }
2829 $node->clear_daughters;
2831 That's bad; the first call to $_->delete_tree will climb to the root
2832 of $node's tree, and nuke the whole tree, not just the bits under $node.
2833 You might as well have just called $node->delete_tree.
2834 (Moreavor, once $node is dead, you can't call clear_daughters on it,
2835 so you'll get an error there.)
2837 =head1 BUG REPORTS
2839 If you find a bug in this library, report it to me as soon as possible,
2840 at the address listed in the AUTHOR section, below. Please try to be
2841 as specific as possible about how you got the bug to occur.
2843 =head1 HELP!
2845 If you develop a given routine for dealing with trees in some way, and
2846 use it a lot, then if you think it'd be of use to anyone else, do email
2847 me about it; it might be helpful to others to include that routine, or
2848 something based on it, in a later version of this module.
2850 It's occurred to me that you might like to (and might yourself develop
2851 routines to) draw trees in something other than ASCII art. If you do so
2852 -- say, for PostScript output, or for output interpretable by some
2853 external plotting program -- I'd be most interested in the results.
2855 =head1 RAMBLINGS
2857 This module uses "strict", but I never wrote it with -w warnings in
2858 mind -- so if you use -w, do not be surprised if you see complaints
2859 from the guts of DAG_Node. As long as there is no way to turn off -w
2860 for a given module (instead of having to do it in every single
2861 subroutine with a "local $^W"), I'm not going to change this. However,
2862 I do, at points, get bursts of ambition, and I try to fix code in
2863 DAG_Node that generates warnings, I<as I come across them> -- which is
2864 only occasionally. Feel free to email me any patches for any such
2865 fixes you come up with, tho.
2867 Currently I don't assume (or enforce) anything about the class
2868 membership of nodes being manipulated, other than by testing whether
2869 each one provides a method C<is_node>, a la:
2871 die "Not a node!!!" unless UNIVERSAL::can($node, "is_node");
2873 So, as far as I'm concerned, a given tree's nodes are free to belong to
2874 different classes, just so long as they provide/inherit C<is_node>, the
2875 few methods that this class relies on to navigate the tree, and have the
2876 same internal object structure, or a superset of it. Presumably this
2877 would be the case for any object belonging to a class derived from
2878 C<Tree::DAG_Node>, or belonging to C<Tree::DAG_Node> itself.
2880 When routines in this class access a node's "mother" attribute, or its
2881 "daughters" attribute, they (generally) do so directly (via
2882 $node->{'mother'}, etc.), for sake of efficiency. But classes derived
2883 from this class should probably do this instead thru a method (via
2884 $node->mother, etc.), for sake of portability, abstraction, and general
2885 goodness.
2887 However, no routines in this class (aside from, necessarily, C<_init>,
2888 C<_init_name>, and C<name>) access the "name" attribute directly;
2889 routines (like the various tree draw/dump methods) get the "name" value
2890 thru a call to $obj->name(). So if you want the object's name to not be
2891 a real attribute, but instead have it derived dynamically from some feature
2892 of the object (say, based on some of its other attributes, or based on
2893 its address), you can to override the C<name> method, without causing
2894 problems. (Be sure to consider the case of $obj->name as a write
2895 method, as it's used in C<lol_to_tree> and C<random_network>.)
2897 =head1 SEE ALSO
2899 L<HTML::Element>
2901 Wirth, Niklaus. 1976. I<Algorithms + Data Structures = Programs>
2902 Prentice-Hall, Englewood Cliffs, NJ.
2904 Knuth, Donald Ervin. 1997. I<Art of Computer Programming, Volume 1,
2905 Third Edition: Fundamental Algorithms>. Addison-Wesley, Reading, MA.
2907 Wirth's classic, currently and lamentably out of print, has a good
2908 section on trees. I find it clearer than Knuth's (if not quite as
2909 encyclopedic), probably because Wirth's example code is in a
2910 block-structured high-level language (basically Pascal), instead
2911 of in assembler (MIX).
2913 Until some kind publisher brings out a new printing of Wirth's book,
2914 try poking around used bookstores (or C<www.abebooks.com>) for a copy.
2915 I think it was also republished in the 1980s under the title
2916 I<Algorithms and Data Structures>, and in a German edition called
2917 I<Algorithmen und Datenstrukturen>. (That is, I'm sure books by Knuth
2918 were published under those titles, but I'm I<assuming> that they're just
2919 later printings/editions of I<Algorithms + Data Structures =
2920 Programs>.)
2922 =head1 COPYRIGHT AND DISCLAIMER
2924 Copyright 1998,1999,2000,2001 by Sean M. Burke C<sburke@cpan.org>, all
2925 rights reserved. This program is free software; you can redistribute
2926 it and/or modify it under the same terms as Perl itself.
2928 This program is distributed in the hope that it will be useful, but
2929 without any warranty; without even the implied warranty of
2930 merchantability or fitness for a particular purpose.
2932 =head1 AUTHOR
2934 Sean M. Burke C<sburke@cpan.org>
2936 =cut
2940 __END__