#
# BioPerl module for Bio::Das::SegmentI
#
# Please direct questions and support issues to <bioperl-l@bioperl.org>
#
# Cared for by Lincoln Stein <lstein@cshl.org>
#
# Copyright Lincoln Stein
#
# You may distribute this module under the same terms as perl itself

# POD documentation - main docs before the code

=head1 NAME

Bio::Das::SegmentI - DAS-style access to a feature database

=head1 SYNOPSIS

  # Get a Bio::Das::SegmentI object from a Bio::DasI database...

  $segment = $das->segment(-name  => 'Landmark',
                           -start => $start,
                           -end   => $end);

  @features = $segment->overlapping_features(-type=>['type1','type2']);
  # each feature is a Bio::SeqFeatureI-compliant object

  @features = $segment->contained_features(-type=>['type1','type2']);

  @features = $segment->contained_in(-type=>['type1','type2']);

  $stream = $segment->get_feature_stream(-type=>['type1','type2','type3']);
  while (my $feature = $stream->next_seq) {
     # do something with feature
  }

  $count = $segment->features_callback(-type=>['type1','type2','type3'],
                                       -callback => sub { ... }
                                       );

=head1 DESCRIPTION

Bio::Das::SegmentI is a simplified alternative interface to sequence
annotation databases used by the distributed annotation system. In
this scheme, the genome is represented as a series of landmarks.  Each
Bio::Das::SegmentI object ("segment") corresponds to a genomic region
defined by a landmark and a start and end position relative to that
landmark.  A segment is created using the Bio::DasI segment() method.

Features can be filtered by the following attributes:

  1) their location relative to the segment (whether overlapping,
     contained within, or completely containing)

  2) their type

  3) other attributes using tag/value semantics

Access to the feature list uses three distinct APIs:

  1) fetching the entire list of features at once

  2) fetching an iterator across features

  3) a callback
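
The three access styles listed above can be sketched with a toy,
in-memory stand-in. This is only an illustration under stated
assumptions: ToySegment and ToyIterator are hypothetical classes that do
not load BioPerl, the "features" are plain strings, and only the
-iterator and -callback arguments are honoured.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for a segment; it does not load BioPerl.
package ToySegment;

sub new {
    my ($class, @features) = @_;
    return bless { features => [@features] }, $class;
}

# Simplified features(): honours only -iterator and -callback.
sub features {
    my ($self, %args) = @_;
    my @found = @{ $self->{features} };

    # 2) iterator: an object whose next_seq() yields one feature per call
    return ToyIterator->new(@found) if $args{-iterator};

    # 3) callback: invoke the code reference on each feature in turn
    if (my $cb = $args{-callback}) {
        for my $feature (@found) {
            last unless $cb->($feature, $self);
        }
        return;
    }

    # 1) plain list
    return @found;
}

# Hypothetical iterator: hands out queued items until none are left.
package ToyIterator;

sub new {
    my ($class, @items) = @_;
    return bless { queue => [@items] }, $class;
}

sub next_seq { return shift @{ $_[0]{queue} } }

package main;

my $segment = ToySegment->new(qw(exon1 exon2 exon3));

# 1) whole list at once
my @all = $segment->features;

# 2) iterator
my @streamed;
my $stream = $segment->features(-iterator => 1);
while (defined(my $feature = $stream->next_seq)) {
    push @streamed, $feature;
}

# 3) callback
my @seen;
$segment->features(-callback => sub { push @seen, $_[0]; return 1 });
```

All three routes visit the same features; they differ only in who drives
the loop and how many features are held in memory at once.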

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists.  Your participation is much appreciated.

  bioperl-l@bioperl.org

=head2 Support

Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution.  Bug reports can be submitted via the
web:

  https://github.com/bioperl/bioperl-live/issues

=head1 AUTHOR - Lincoln Stein

Email lstein@cshl.org

=head1 APPENDIX

The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _

=cut


# Let the code begin...

package Bio::Das::SegmentI;
use strict;


# Object preamble - inherits from Bio::Root::RootI;
use base qw(Bio::Root::RootI);

=head2 seq_id

 Title   : seq_id
 Usage   : $ref = $s->seq_id
 Function: return the ID of the landmark
 Returns : a string
 Args    : none
 Status  : Public

=cut

sub seq_id { shift->throw_not_implemented }

=head2 display_name

 Title   : display_name
 Usage   : $ref = $s->display_name
 Function: return the human-readable name for the landmark
 Returns : a string
 Args    : none
 Status  : Public

This defaults to the same as seq_id.

=cut

sub display_name { shift->seq_id }

=head2 start

 Title   : start
 Usage   : $s->start
 Function: start of segment
 Returns : integer
 Args    : none
 Status  : Public

This is a read-only accessor for the start of the segment.  Aliased
as low() for Gadfly compatibility.

=cut

sub start { shift->throw_not_implemented }
sub low   { shift->start }

=head2 end

 Title   : end
 Usage   : $s->end
 Function: end of segment
 Returns : integer
 Args    : none
 Status  : Public

This is a read-only accessor for the end of the segment.  Aliased as
stop(), and as high() for Gadfly compatibility.

=cut

sub end  { shift->throw_not_implemented }
sub stop { shift->end }
sub high { shift->end }

=head2 length

 Title   : length
 Usage   : $s->length
 Function: length of segment
 Returns : integer
 Args    : none
 Status  : Public

Returns the length of the segment.  Always a positive number.

=cut

sub length { shift->throw_not_implemented; }

=head2 seq

 Title   : seq
 Usage   : $s->seq
 Function: get the sequence string for this segment
 Returns : a string
 Args    : none
 Status  : Public

Returns the sequence for this segment as a simple string.

=cut

sub seq { shift->throw_not_implemented }

=head2 ref

 Title   : ref
 Usage   : $ref = $s->ref([$newlandmark])
 Function: get/set the reference landmark for addressing
 Returns : a string
 Args    : new landmark (optional)
 Status  : Public

This method is used to examine/change the reference landmark used to
establish the coordinate system.  By default, the landmark cannot be
changed and therefore this has the same effect as seq_id().  The new
landmark might be an ID, or another Das::SegmentI object.

=cut

sub ref { shift->seq_id }
*refseq = \&ref;

=head2 absolute

 Title   : absolute
 Usage   : $s->absolute([$new_value])
 Function: get/set absolute addressing mode
 Returns : flag
 Args    : new flag (optional)
 Status  : Public

Turn on and off absolute-addressing mode.  In absolute addressing
mode, coordinates are relative to some underlying "top level"
coordinate system (such as a chromosome). ref() returns the identity
of the top level landmark, and start() and end() return locations
relative to that landmark.  In relative addressing mode, coordinates
are relative to the landmark sequence specified at the time of segment
creation or later modified by the ref() method.

The default is to return false and to do nothing in response to
attempts to set absolute addressing mode.

=cut

sub absolute { return }

=head2 features

 Title   : features
 Usage   : @features = $s->features(@args)
 Function: get features that overlap this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method will find all features that intersect the segment in a
variety of ways and return a list of Bio::SeqFeatureI objects.  The
feature locations will use coordinates relative to the reference
sequence in effect at the time that features() was called.

The returned list can be limited to certain types, attributes or
range intersection modes.  Types of range intersection are one of:

   "overlaps"      the default
   "contains"      return features completely contained within the segment
   "contained_in"  return features that completely contain the segment
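
The three range intersection modes can be pictured as simple coordinate
comparisons. The sketch below is not taken from BioPerl: range_matches()
is a hypothetical helper, and it assumes inclusive start/end coordinates
on the same landmark.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: test a feature against a
# segment using inclusive start/end coordinates on the same landmark.
sub range_matches {
    my ($rangetype, $seg_start, $seg_end, $f_start, $f_end) = @_;

    if ($rangetype eq 'overlaps') {
        # any shared position counts
        return $f_start <= $seg_end && $f_end >= $seg_start;
    }
    if ($rangetype eq 'contains') {
        # the feature lies entirely inside the segment
        return $f_start >= $seg_start && $f_end <= $seg_end;
    }
    if ($rangetype eq 'contained_in') {
        # the feature spans the whole segment
        return $f_start <= $seg_start && $f_end >= $seg_end;
    }
    die "unknown range type '$rangetype'";
}

my %segment  = (start => 100, end => 200);
my %features = (
    partial  => [150, 250],   # runs past the segment end
    inside   => [120, 180],   # entirely within the segment
    spanning => [ 50, 300],   # covers the whole segment
);

# Collect, per range type, the names of the features that match.
my %hits;
for my $rangetype (qw(overlaps contains contained_in)) {
    $hits{$rangetype} = [
        grep { range_matches($rangetype, @segment{qw(start end)}, @{ $features{$_} }) }
        sort keys %features
    ];
}
```

With these toy coordinates all three features overlap the segment, only
"inside" is contained within it, and only "spanning" contains it.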

Two types of argument lists are accepted.  In the positional argument
form, the arguments are treated as a list of feature types.  In the
named parameter form, the arguments are a series of -name=E<gt>value
pairs.

  Argument    Description
  --------    ------------

  -types      An array reference to type names in the format
              "method:source"

  -attributes A hashref containing a set of attributes to match

  -rangetype  One of "overlaps", "contains", or "contained_in".

  -iterator   Return an iterator across the features.

  -callback   A callback to invoke on each feature

The -attributes argument is a hashref containing one or more
attributes to match against:

  -attributes => { Gene => 'abc-1',
                   Note => 'confirmed' }

Attribute matching is simple string matching, and multiple attributes
are ANDed together.  More complex filtering can be performed using the
-callback option (see below).
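
The ANDed string matching described above might look like the following
minimal sketch. attributes_match() is a hypothetical helper, and it
assumes each feature carries at most one value per tag, which real
features need not do.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when every requested attribute equals
# the feature's value for that tag (plain string comparison, ANDed).
sub attributes_match {
    my ($feature_attrs, $wanted) = @_;
    for my $tag (keys %$wanted) {
        return 0 unless defined $feature_attrs->{$tag};
        return 0 unless $feature_attrs->{$tag} eq $wanted->{$tag};
    }
    return 1;
}

my %wanted = (Gene => 'abc-1', Note => 'confirmed');

# Both tags agree, so the feature matches.
my $both     = attributes_match({ Gene => 'abc-1', Note => 'confirmed' }, \%wanted);
# One tag differs, so the AND fails.
my $one_only = attributes_match({ Gene => 'abc-1', Note => 'predicted' }, \%wanted);
# A requested tag is absent, so the AND fails.
my $missing  = attributes_match({ Gene => 'abc-1' }, \%wanted);
```

Only the first call succeeds, because a single mismatching or missing
tag is enough to reject a feature.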

If -iterator is true, then the method returns an object reference that
implements the next_seq() method.  Each call to next_seq() returns a
new Bio::SeqFeatureI object.

If -callback is passed a code reference, the code reference will be
invoked on each feature returned.  The code will be passed two
arguments consisting of the current feature and the segment object
itself, and must return a true value. If the code returns a false
value, feature retrieval will be aborted.
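
The abort-on-false contract of -callback can be sketched as below.
each_feature() is a hypothetical driver standing in for an
implementation's retrieval loop, the features are plain strings, and
returning the number of features visited is an assumption of this
sketch, not documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical driver standing in for an implementation's retrieval loop.
sub each_feature {
    my ($segment, $features, $callback) = @_;
    my $visited = 0;
    for my $feature (@$features) {
        $visited++;
        # stop as soon as the callback returns a false value
        last unless $callback->($feature, $segment);
    }
    return $visited;
}

my @kept;
my $visited = each_feature(
    'toy-segment',
    [qw(gene1 gene2 STOP gene3)],
    sub {
        my ($feature, $segment) = @_;
        return 0 if $feature eq 'STOP';   # false: abort retrieval
        push @kept, $feature;
        return 1;                         # true: keep going
    },
);
```

Here the callback sees three features and keeps two; gene3 is never
visited because retrieval stops at the first false return.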

-callback and -iterator are mutually exclusive options.  If -iterator
is defined, then -callback is ignored.

NOTE: the following methods all build on top of features(), and do not
need to be explicitly implemented.

    overlapping_features()
    contained_features()
    contained_in()
    get_feature_stream()

=cut

sub features { shift->throw_not_implemented }

=head2 overlapping_features

 Title   : overlapping_features
 Usage   : @features = $s->overlapping_features(@args)
 Function: get features that overlap this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method is identical to features() except that it defaults to
finding overlapping features.
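
The wrapper methods accept the same two calling conventions as
features(): a leading "-" on the first argument selects the named
parameter form, otherwise the arguments are taken as a list of types.
The stand-alone sketch below reproduces that normalization;
normalize_args() is a hypothetical name, and the defined() check mirrors
the one in get_feature_stream().

```perl
use strict;
use warnings;

# Stand-alone copy of the wrappers' argument handling (hypothetical name).
sub normalize_args {
    my ($rangetype, @args) = @_;
    return defined $args[0] && $args[0] =~ /^-/
        ? (@args, -rangetype => $rangetype)               # named form
        : (-types => \@args, -rangetype => $rangetype);   # positional form
}

# Named parameters pass through, with the range type appended.
my %named = normalize_args('overlaps',
                           -types      => ['exon'],
                           -attributes => { Gene => 'abc-1' });

# Positional arguments are wrapped up as the -types list.
my %positional = normalize_args('overlaps', 'exon', 'intron');
```

Either way, features() ends up receiving named parameters, so an
implementation only has to parse one form.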

=cut

sub overlapping_features {
  my $self = shift;
  # Named form if the first argument starts with "-", else a list of types
  my @args = defined $_[0] && $_[0] =~ /^-/
           ? (@_,         -rangetype=>'overlaps')
           : (-types=>\@_,-rangetype=>'overlaps');
  $self->features(@args);
}

=head2 contained_features

 Title   : contained_features
 Usage   : @features = $s->contained_features(@args)
 Function: get features that are contained in this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method is identical to features() except that it defaults to
a range type of 'contained'.

=cut

sub contained_features {
  my $self = shift;
  # Named form if the first argument starts with "-", else a list of types
  my @args = defined $_[0] && $_[0] =~ /^-/
           ? (@_,         -rangetype=>'contained')
           : (-types=>\@_,-rangetype=>'contained');
  $self->features(@args);
}

=head2 contained_in

 Title   : contained_in
 Usage   : @features = $s->contained_in(@args)
 Function: get features that contain this segment
 Returns : a list of Bio::SeqFeatureI objects
 Args    : see below
 Status  : Public

This method is identical to features() except that it defaults to
a range type of 'contained_in'.

=cut

sub contained_in {
  my $self = shift;
  # Named form if the first argument starts with "-", else a list of types
  my @args = defined $_[0] && $_[0] =~ /^-/
           ? (@_,         -rangetype=>'contained_in')
           : (-types=>\@_,-rangetype=>'contained_in');
  $self->features(@args);
}

=head2 get_feature_stream

 Title   : get_feature_stream
 Usage   : $iterator = $s->get_feature_stream(@args)
 Function: get an iterator across the segment
 Returns : an object that implements next_seq()
 Args    : see below
 Status  : Public

This method is identical to features() except that it always generates
an iterator.

NOTE: This is defined in the interface in terms of features().  You do not
have to implement it.

=cut

sub get_feature_stream {
  my $self = shift;
  my @args = defined $_[0] && $_[0] =~ /^-/
           ? (@_,         -iterator=>1)
           : (-types=>\@_,-iterator=>1);
  $self->features(@args);
}

=head2 factory

 Title   : factory
 Usage   : $factory = $s->factory
 Function: return the segment factory
 Returns : a Bio::DasI object
 Args    : none
 Status  : Public

This method returns a Bio::DasI object that can be used to fetch
more segments.  This is typically the Bio::DasI object from which
the segment was originally generated.

=cut

sub factory { shift->throw_not_implemented }

=head2 primary_tag

 Title   : primary_tag
 Usage   : $tag = $s->primary_tag
 Function: identifies the segment as type "DasSegment"
 Returns : the string "DasSegment"
 Args    : none
 Status  : Public, but see below

This method provides Bio::Das::Segment objects with a primary_tag()
field that identifies them as being of type "DasSegment".  This allows
the Bio::Graphics engine to render segments just like features.

This does not need to be implemented.  It is defined by the interface.

=cut

sub primary_tag { "DasSegment" }

=head2 strand

 Title   : strand
 Usage   : $strand = $s->strand
 Function: identifies the segment strand as 0
 Returns : the number 0
 Args    : none
 Status  : Public, but see below

This method provides Bio::Das::Segment objects with a strand() field
that identifies them as being strandless.  This allows the Bio::Graphics
engine to render segments just like features.

This does not need to be implemented.  It is defined by the interface.

=cut

sub strand { 0 }

1;