# $Id$

# BioPerl module for Bio::Tools::Seg

# Copyright Balamurugan Kumarasamy
# Totally re-written, added docs and tests -- Torsten Seemann, Sep 2006

# Copyright
# You may distribute this module under the same terms as perl itself

# POD documentation - main docs before the code
=head1 NAME

Bio::Tools::Seg - parse C<seg> output

=head1 SYNOPSIS

  use Bio::Tools::Seg;
  my $parser = Bio::Tools::Seg->new(-file => 'seg.fasta');
  while ( my $f = $parser->next_result ) {
    if ($f->score < 1.5) {
      print $f->location->to_FTstring, " is low complexity\n";
    }
  }
=head1 DESCRIPTION

C<seg> identifies low-complexity regions in a protein sequence.
It is usually part of the C<WU-BLAST> and C<InterProScan> packages.

The L<Bio::Tools::Seg> module parses only the "fasta" output
modes of C<seg>, i.e. C<seg -l> (low complexity regions only),
C<seg -h> (high complexity regions only), or C<seg -a> (both low
and high).

It creates a L<Bio::SeqFeature::Generic> for each FASTA-like entry
found in the input file. It is up to the user to filter these
appropriately using the feature's score.
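
Because every entry is returned as a feature regardless of its
complexity, the caller has to apply a cutoff. The following is a
minimal sketch of that filtering step, not part of this module: the
hash records are hypothetical stand-ins for parsed features, the sample
values are made up for illustration, C<low_complexity> is an assumed
helper name, and the cutoff of 1.5 is simply the value used in the
SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for parsed features. Each hash mimics the
# seq_id/start/end/score fields a feature would carry; the values
# are illustrative only.
my @features = (
    { seq_id => 'test_prot', start => 1,   end => 213, score => 3.91 },
    { seq_id => 'test_prot', start => 214, end => 226, score => 2.26 },
    { seq_id => 'test_prot', start => 227, end => 240, score => 1.32 },
);

# Keep only the segments whose complexity score is below the cutoff.
sub low_complexity {
    my ($cutoff, @feats) = @_;
    return grep { $_->{score} < $cutoff } @feats;
}

my @low = low_complexity(1.5, @features);
for my $f (@low) {
    printf "%s:%d..%d is low complexity\n",
        $f->{seq_id}, $f->{start}, $f->{end};
}
```

With real features the same test would read C<< $f->score < $cutoff >>,
as in the SYNOPSIS.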
=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.

  bioperl-l@bioperl.org                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

=head2 Support

Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:

  http://bugzilla.open-bio.org/
=head1 AUTHOR - Torsten Seemann

Email - torsten.seemann AT infotech.monash.edu.au

=head1 CONTRIBUTOR - Bala

Email - savikalpa@fugu-sg.org

=head1 APPENDIX

The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _

=cut
package Bio::Tools::Seg;
use strict;

use Bio::SeqFeature::Generic;
use base qw(Bio::Root::Root Bio::Root::IO);
=head2 new

 Title   : new
 Usage   : my $obj = Bio::Tools::Seg->new();
 Function: Builds a new Bio::Tools::Seg object
 Returns : Bio::Tools::Seg
 Args    : -fh/-file => $val, # for initing input, see Bio::Root::IO

=cut
sub new {
  my($class,@args) = @_;
  my $self = $class->SUPER::new(@args);
  $self->_initialize_io(@args);
  return $self;
}
=head2 next_result

 Title   : next_result
 Usage   : my $feat = $seg->next_result
 Function: Get the next result set from parser data
 Returns : Bio::SeqFeature::Generic
 Args    : none
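
Each feature comes from one FASTA-style header line of C<seg> output.
The sketch below applies the same pattern that C<next_result> uses to a
single header line, outside of any Bioperl object, to show which field
each capture group holds. The helper name C<parse_seg_header> and the
hash keys are assumptions made for this illustration only.

```perl
use strict;
use warnings;

# The header pattern used by next_result: an optional sequence id,
# a (start-end) range, then the complexity value.
my $header_re = qr/^\>\s*?(\S+)?\s*?\((\d+)\-(\d+)\)\s*complexity=(\S+)/;

# Hypothetical helper: returns a hash reference for a header line,
# or undef for any other line (such as sequence data).
sub parse_seg_header {
    my ($line) = @_;
    return undef unless $line =~ $header_re;
    return { seq_id => $1, start => $2, end => $3, score => $4 };
}

my $hit = parse_seg_header(
    '>test_prot(214-226) complexity=2.26 (12/2.20/2.50)');
printf "%s %d-%d score=%s\n", @{$hit}{qw(seq_id start end score)};
# prints "test_prot 214-226 score=2.26"
```

Lines that do not match the pattern are skipped by the parser, so the
sequence text that follows each header is ignored.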

=cut

sub next_result {
  my ($self) = @_;

  # For example in this line
  # >test_prot(214-226) complexity=2.26 (12/2.20/2.50)
  # $1 is test_prot
  # $2 is 214
  # $3 is 226
  # $4 is 2.26

  while (my $line = $self->_readline) {
    if ($line =~ /^\>\s*?(\S+)?\s*?\((\d+)\-(\d+)\)\s*complexity=(\S+)/) {
      return Bio::SeqFeature::Generic->new(
        -seq_id     => $1,
        -start      => $2,
        -end        => $3,
        -score      => $4,
        -source_tag => 'Seg',
        -primary    => 'low_complexity'