# $Id$
#
# BioPerl module for Bio::Das::SegmentI
#
# Please direct questions and support issues to <bioperl-l@bioperl.org>
#
# Cared for by Lincoln Stein <lstein@cshl.org>
#
# Copyright Lincoln Stein
#
# You may distribute this module under the same terms as perl itself
#
# POD documentation - main docs before the code

=head1 NAME

Bio::Das::SegmentI - DAS-style access to a feature database

=head1 SYNOPSIS

  # Get a Bio::Das::SegmentI object from a Bio::DasI database...

  $segment = $das->segment(-name  => 'Landmark',
                           -start => $start,
                           -end   => $end);

  @features = $segment->overlapping_features(-type=>['type1','type2']);
  # each feature is a Bio::SeqFeatureI-compliant object

  @features = $segment->contained_features(-type=>['type1','type2']);

  @features = $segment->contained_in(-type=>['type1','type2']);

  $stream = $segment->get_feature_stream(-type=>['type1','type2','type3']);
  while (my $feature = $stream->next_seq) {
     # do something with feature
  }

  $count = $segment->features_callback(-type=>['type1','type2','type3'],
                                       -callback => sub { ... });

=head1 DESCRIPTION

Bio::Das::SegmentI is a simplified alternative interface to sequence
annotation databases used by the distributed annotation system. In
this scheme, the genome is represented as a series of landmarks. Each
Bio::Das::SegmentI object ("segment") corresponds to a genomic region
defined by a landmark and a start and end position relative to that
landmark. A segment is created using the Bio::DasI segment() method.

Features can be filtered by the following attributes:

  1) their location relative to the segment (whether overlapping,
     contained within, or completely containing)

  2) their type

  3) other attributes using tag/value semantics

Access to the feature list uses three distinct APIs:

  1) fetching the entire list of features at once

  2) fetching an iterator across features

  3) invoking a callback on each feature

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org

=head2 Support

Please direct usage questions or support issues to the mailing list:

L<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:

  http://bugzilla.open-bio.org/

=head1 AUTHOR - Lincoln Stein

Email lstein@cshl.org

=head1 APPENDIX

The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _

=cut

# Let the code begin...

package Bio::Das::SegmentI;
use strict;

# Object preamble - inherits from Bio::Root::RootI;
use base qw(Bio::Root::RootI);

=head2 seq_id

 Title   : seq_id
 Usage   : $ref = $s->seq_id
 Function: return the ID of the landmark
 Returns : a string
 Args    : none
 Status  : Public

=cut

sub seq_id { shift->throw_not_implemented }

=head2 display_name

 Title   : display_name
 Usage   : $name = $s->display_name
 Function: return the human-readable name for the landmark
 Returns : a string
 Args    : none
 Status  : Public

This defaults to the same as seq_id.

=cut

sub display_name { shift->seq_id }

=head2 start

 Title   : start
 Usage   : $s->start
 Function: start of segment
 Returns : integer
 Args    : none
 Status  : Public

This is a read-only accessor for the start of the segment. low() is
provided as an alias for Gadfly compatibility.

=cut

sub start { shift->throw_not_implemented }
sub low   { shift->start }

=head2 end

 Title   : end
 Usage   : $s->end
 Function: end of segment
 Returns : integer
 Args    : none
 Status  : Public

This is a read-only accessor for the end of the segment. stop() and
high() are provided as aliases, the latter for Gadfly compatibility.

=cut

sub end  { shift->throw_not_implemented }
sub stop { shift->end }
sub high { shift->end }

=head2 length

 Title   : length
 Usage   : $s->length
 Function: length of segment
 Returns : integer
 Args    : none
 Status  : Public

Returns the length of the segment. Always a positive number.

=cut

sub length { shift->throw_not_implemented; }

=head2 seq

 Title   : seq
 Usage   : $s->seq
 Function: get the sequence string for this segment
 Returns : a string
 Args    : none
 Status  : Public

Returns the sequence for this segment as a simple string.

=cut

sub seq {shift->throw_not_implemented}

=head2 ref

 Title   : ref
 Usage   : $ref = $s->ref([$newlandmark])
 Function: get/set the reference landmark for addressing
 Returns : a string
 Args    : new landmark (optional)
 Status  : Public

This method is used to examine/change the reference landmark used to
establish the coordinate system. By default, the landmark cannot be
changed, so any argument is ignored and this method has the same
effect as seq_id(). Where changing the landmark is supported, the new
landmark might be an ID or another Bio::Das::SegmentI object.

=cut

sub ref { shift->seq_id }
*refseq = \&ref;

=head2 absolute

 Title   : absolute
 Usage   : $s->absolute([$new_value])
 Function: get/set absolute addressing mode
 Returns : flag
 Args    : new flag (optional)
 Status  : Public

Turn on and off absolute-addressing mode. In absolute addressing
mode, coordinates are relative to some underlying "top level"
coordinate system (such as a chromosome). ref() returns the identity
of the top level landmark, and start() and end() return locations
relative to that landmark. In relative addressing mode, coordinates
are relative to the landmark sequence specified at the time of segment
creation or later modified by the ref() method.

The default is to return false and to do nothing in response to
attempts to set absolute addressing mode.
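
The relationship between the two addressing modes can be illustrated
with a little arithmetic. The following is only a sketch: it assumes
1-based inclusive coordinates and a landmark on the forward strand,
neither of which this interface specifies, and the helper names
rel_to_abs() and abs_to_rel() are hypothetical.

```perl
  use strict;
  use warnings;

  # Assumption: 1-based inclusive coordinates, landmark on the forward
  # strand. $landmark_abs_start is the position on the top-level
  # sequence at which the landmark begins.
  sub rel_to_abs {
      my ($landmark_abs_start, $rel) = @_;
      return $landmark_abs_start + $rel - 1;
  }

  sub abs_to_rel {
      my ($landmark_abs_start, $abs) = @_;
      return $abs - $landmark_abs_start + 1;
  }

  # Hypothetical gene landmark that starts at position 5001 of its
  # chromosome; convert the relative range 1..300 to absolute.
  my ($abs_start, $abs_end) = map { rel_to_abs(5001, $_) } (1, 300);
  print "$abs_start..$abs_end\n";
```

An implementation that supports absolute() would apply a conversion of
this kind inside start() and end(); reverse-strand landmarks need
extra handling that is not shown here.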

=cut

sub absolute { return }

=head2 features

 Title   : features
 Usage   : @features = $s->features(@args)
 Function: get features that overlap this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method will find all features that intersect the segment in a
variety of ways and return a list of Bio::SeqFeatureI objects. The
feature locations will use coordinates relative to the reference
sequence in effect at the time that features() was called.

The returned list can be limited to certain types, attributes or
range intersection modes. Types of range intersection are one of:

   "overlaps"      the default
   "contains"      return features completely contained within the segment
   "contained_in"  return features that completely contain the segment
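
The three range intersection modes amount to simple interval
comparisons. The sketch below assumes inclusive integer coordinates
with start <= end; range_match() is a hypothetical helper used only
for illustration and is not part of this interface.

```perl
  use strict;
  use warnings;

  # Hypothetical helper: test how a feature interval relates to a
  # segment interval. Coordinates are assumed to be inclusive.
  sub range_match {
      my ($rangetype, $seg_start, $seg_end, $f_start, $f_end) = @_;
      if ($rangetype eq 'overlaps') {
          return $f_start <= $seg_end && $f_end >= $seg_start;
      }
      if ($rangetype eq 'contains') {
          # the feature lies completely inside the segment
          return $f_start >= $seg_start && $f_end <= $seg_end;
      }
      if ($rangetype eq 'contained_in') {
          # the feature completely spans the segment
          return $f_start <= $seg_start && $f_end >= $seg_end;
      }
      die "unknown rangetype '$rangetype'";
  }

  # Compare a segment at 100..200 against three sample features.
  for my $f ([150, 250], [120, 180], [50, 300]) {
      my @modes = grep { range_match($_, 100, 200, @$f) }
                  qw(overlaps contains contained_in);
      print "@$f: @modes\n";
  }
```

Every feature matched by "contains" or "contained_in" is also matched
by "overlaps", which is why "overlaps" is the most permissive mode.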

Two types of argument lists are accepted. In the positional argument
form, the arguments are treated as a list of feature types. In the
named parameter form, the arguments are a series of -name=E<gt>value
pairs.

  Argument    Description
  --------    -----------

  -types      An array reference to type names in the format
              "method:source"

  -attributes A hashref containing a set of attributes to match

  -rangetype  One of "overlaps", "contains", or "contained_in".

  -iterator   Return an iterator across the features.

  -callback   A callback to invoke on each feature

The -attributes argument is a hashref containing one or more
attributes to match against:

  -attributes => { Gene => 'abc-1',
                   Note => 'confirmed' }

Attribute matching is simple string matching, and multiple attributes
are ANDed together. More complex filtering can be performed using the
-callback option (see below).

If -iterator is true, then the method returns an object reference that
implements the next_seq() method. Each call to next_seq() returns a
new Bio::SeqFeatureI object.

If -callback is passed a code reference, the code reference will be
invoked on each feature returned. The code will be passed two
arguments consisting of the current feature and the segment object
itself, and must return a true value. If the code returns a false
value, feature retrieval will be aborted.

-callback and -iterator are mutually exclusive options. If -iterator
is defined, then -callback is ignored.

NOTE: the following methods all build on top of features(), and do not
need to be explicitly implemented.

    overlapping_features()
    contained_features()
    contained_in()
    get_feature_stream()
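
To make the argument handling concrete, here is a minimal in-memory
sketch of a features() method. It is not a BioPerl implementation:
Toy::Segment and Toy::Iterator are hypothetical classes, features are
plain hashes rather than Bio::SeqFeatureI objects, only the default
"overlaps" range type is handled, and returning a count from the
-callback form is an assumption that this interface does not specify.

```perl
  use strict;
  use warnings;

  package Toy::Segment;    # hypothetical, for illustration only

  sub new {
      my ($class, %arg) = @_;
      return bless { start    => $arg{start},
                     end      => $arg{end},
                     features => $arg{features} || [] }, $class;
  }

  sub features {
      my $self = shift;
      # Positional form is a list of types; named form is -name => value.
      my %opt = (defined $_[0] && $_[0] =~ /^-/) ? @_ : (-types => [@_]);
      my %want_type = map { $_ => 1 } @{ $opt{-types} || [] };
      my $attr      = $opt{-attributes} || {};

      my @hits;
      for my $f (@{ $self->{features} }) {
          # this sketch only handles the default "overlaps" range type
          next unless $f->{start} <= $self->{end}
                   && $f->{end}   >= $self->{start};
          next if %want_type && !$want_type{ $f->{type} };
          # attributes are compared as strings and ANDed together
          next if grep { ($f->{attr}{$_} // '') ne $attr->{$_} } keys %$attr;
          push @hits, $f;
      }

      # -iterator takes precedence over -callback
      return Toy::Iterator->new(@hits) if $opt{-iterator};

      if (my $cb = $opt{-callback}) {
          my $count = 0;
          for my $f (@hits) {
              $count++;
              last unless $cb->($f, $self);   # false value aborts retrieval
          }
          return $count;                      # assumption, see lead-in
      }
      return @hits;
  }

  package Toy::Iterator;   # hypothetical, implements next_seq() only

  sub new      { my $class = shift; return bless [@_], $class }
  sub next_seq { my $self  = shift; return shift @$self }

  package main;

  my $seg = Toy::Segment->new(
      start    => 100,
      end      => 200,
      features => [
          { type => 'exon:curated', start => 110, end => 150,
            attr => { Gene => 'abc-1' } },
          { type => 'exon:curated', start => 160, end => 190,
            attr => { Gene => 'xyz-2' } },
          { type => 'EST:blat', start => 500, end => 600, attr => {} },
      ],
  );

  my @exons = $seg->features('exon:curated');                    # positional
  my @abc   = $seg->features(-attributes => { Gene => 'abc-1' }); # named
  my $first = $seg->features(-callback => sub { 0 });            # stops early
```

A real implementation would push the filtering down into the database
query instead of scanning a list in memory.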

=cut

sub features {shift->throw_not_implemented}

=head2 overlapping_features

 Title   : overlapping_features
 Usage   : @features = $s->overlapping_features(@args)
 Function: get features that overlap this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method is identical to features() except that it defaults to
finding overlapping features.
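
The convenience wrappers decide between the positional and the named
argument form by checking whether the first argument starts with a
dash. A stand-alone sketch of that dispatch follows; with_rangetype()
is a hypothetical name and is not a method of this interface.

```perl
  use strict;
  use warnings;

  # Hypothetical stand-alone version of the wrapper dispatch: a first
  # argument starting with "-" means named parameters, anything else is
  # taken to be a list of feature types.
  sub with_rangetype {
      my ($rangetype, @args) = @_;
      return (defined $args[0] && $args[0] =~ /^-/)
          ? (@args, -rangetype => $rangetype)
          : (-types => [@args], -rangetype => $rangetype);
  }

  my %positional = with_rangetype('overlaps', 'exon', 'intron');
  my %named      = with_rangetype('overlaps', -types => ['exon']);

  print join(',', @{ $positional{-types} }), "\n";
  print $named{-rangetype}, "\n";
```

Because the wrapper appends its own -rangetype after the caller's
arguments, an implementation that loads the arguments into a hash
will see the wrapper's value last.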

=cut

sub overlapping_features {
  my $self = shift;
  # A leading "-" marks named parameters; otherwise the arguments are types.
  my @args = defined $_[0] && $_[0] =~ /^-/ ? (@_, -rangetype=>'overlaps')
                                            : (-types=>\@_,-rangetype=>'overlaps');
  $self->features(@args);
}

=head2 contained_features

 Title   : contained_features
 Usage   : @features = $s->contained_features(@args)
 Function: get features that are contained in this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method is identical to features() except that it defaults to
a range type of 'contains'.

=cut

sub contained_features {
  my $self = shift;
  # 'contains' is the range type documented under features().
  my @args = defined $_[0] && $_[0] =~ /^-/ ? (@_, -rangetype=>'contains')
                                            : (-types=>\@_,-rangetype=>'contains');
  $self->features(@args);
}

=head2 contained_in

 Title   : contained_in
 Usage   : @features = $s->contained_in(@args)
 Function: get features that contain this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method is identical to features() except that it defaults to
a range type of 'contained_in'.

=cut

sub contained_in {
  my $self = shift;
  my @args = defined $_[0] && $_[0] =~ /^-/ ? (@_, -rangetype=>'contained_in')
                                            : (-types=>\@_,-rangetype=>'contained_in');
  $self->features(@args);
}

=head2 get_feature_stream

 Title   : get_feature_stream
 Usage   : $iterator = $s->get_feature_stream(@args)
 Function: get an iterator across the segment
 Returns : an object that implements next_seq()
 Args    : see below
 Status  : Public

This method is identical to features() except that it always generates
an iterator.

NOTE: This is defined in the interface in terms of features(). You do not
have to implement it.

=cut

sub get_feature_stream {
  my $self = shift;
  my @args = defined $_[0] && $_[0] =~ /^-/ ? (@_, -iterator=>1)
                                            : (-types=>\@_,-iterator=>1);
  $self->features(@args);
}

=head2 factory

 Title   : factory
 Usage   : $factory = $s->factory
 Function: return the segment factory
 Returns : a Bio::DasI object
 Args    : none
 Status  : Public

This method returns a Bio::DasI object that can be used to fetch
more segments. This is typically the Bio::DasI object from which
the segment was originally generated.

=cut

sub factory {shift->throw_not_implemented}

=head2 primary_tag

 Title   : primary_tag
 Usage   : $tag = $s->primary_tag
 Function: identifies the segment as type "DasSegment"
 Returns : the string "DasSegment"
 Args    : none
 Status  : Public, but see below

This method provides Bio::Das::Segment objects with a primary_tag()
method that identifies them as being of type "DasSegment". This allows
the Bio::Graphics engine to render segments just like features.

This does not need to be implemented. It is defined by the interface.

=cut

sub primary_tag {"DasSegment"}

=head2 strand

 Title   : strand
 Usage   : $strand = $s->strand
 Function: identifies the segment strand as 0
 Returns : the number 0
 Args    : none
 Status  : Public, but see below

This method provides Bio::Das::Segment objects with a strand() method
that identifies them as being strandless. This allows the Bio::Graphics
engine to render segments just like features.

This does not need to be implemented. It is defined by the interface.

=cut

sub strand { 0 }