#
# BioPerl module for Bio::DB::Universal
#
# Please direct questions and support issues to <bioperl-l@bioperl.org>
#
# Cared for by Ewan Birney <birney@ebi.ac.uk>
#
# Copyright Ewan Birney
#
# You may distribute this module under the same terms as perl itself

# POD documentation - main docs before the code

=head1 NAME

Bio::DB::Universal - Artificial database that delegates to specific databases

=head1 SYNOPSIS

    $uni = Bio::DB::Universal->new();

    # by default connects to web databases. We can also
    # substitute local databases

    $embl = Bio::Index::EMBL->new( -filename => '/some/index/filename/locally/stored');
    $uni->use_database('embl',$embl);

    # treat it like a normal database. Recognises strings
    # like gb|XXXXXX and embl:YYYYYY

    $seq1 = $uni->get_Seq_by_id("embl:HSHNRNPA");
    $seq2 = $uni->get_Seq_by_acc("gb|A000012");

    # with no separator, tries to guess database. In this case the
    # _ is considered to be indicative of swissprot
    $seq3 = $uni->get_Seq_by_id('ROA1_HUMAN');

=head1 DESCRIPTION

Artificial database that delegates to specific databases, with a
"smart" (well, smartish) guessing routine for working out what the
ids are. No doubt the smart routine can be made smarter.

The hope is that you can make this database and just throw ids at it -
for most easy cases it will sort you out. Personally, I would be
making sure I knew where each id came from and putting it into its own
database first - but this is a quick and dirty solution.

By default this connects to web-oriented databases, with all the
reliability and network bandwidth costs this implies. However, you can
substitute your own local databases - they could be Bio::Index
databases (DBM file and flat file) or bioperl-db based (MySQL based)
or biocorba-based (whatever you like behind the corba interface).
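
As a rough sketch of that substitution idea, using core Perl only: any
object that answers get_Seq_by_id can be registered under a tag and will
then receive the lookups for that tag. The packages Tiny::LocalDB and
Tiny::Registry are hypothetical stand-ins invented for this illustration.
They are not BioPerl classes, and the registry takes the tag as an
explicit argument rather than guessing it from the id.

    use strict;
    use warnings;

    # Hypothetical in-memory "local database". It returns plain strings
    # instead of sequence objects to keep the sketch short.
    package Tiny::LocalDB;
    sub new {
        my ( $class, %seqs ) = @_;
        return bless {%seqs}, $class;
    }
    sub get_Seq_by_id {
        my ( $self, $id ) = @_;
        return $self->{$id};
    }

    # Hypothetical registry mirroring use_database and the delegation step.
    package Tiny::Registry;
    sub new {
        my ($class) = @_;
        return bless { db_hash => {} }, $class;
    }
    sub use_database {
        my ( $self, $name, $db ) = @_;
        $self->{db_hash}{$name} = $db;
    }
    sub get_Seq_by_id {
        my ( $self, $tag, $id ) = @_;
        my $db = $self->{db_hash}{$tag}
            or die "No database registered for tag '$tag'\n";
        return $db->get_Seq_by_id($id);
    }

    package main;
    my $uni = Tiny::Registry->new;
    $uni->use_database( 'embl', Tiny::LocalDB->new( HSHNRNPA => 'ACGT' ) );
    print $uni->get_Seq_by_id( 'embl', 'HSHNRNPA' ), "\n";

With the real module, the equivalent step is the use_database call shown
in the SYNOPSIS, with a Bio::Index::EMBL object in place of the stand-in.
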

Internally the tags for the databases are

    genbank - NCBI DNA database
    embl    - EBI's DNA database (these two share accession number space)
    swiss   - swissprot + sptrembl (EBI's protein database)

We should extend this for RefSeq and other sequence databases which
are out there... ;)

Inspired by Lincoln Stein, written by Ewan Birney.

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bio.perl.org

=head2 Support

Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:

  https://github.com/bioperl/bioperl-live/issues

=head1 AUTHOR - Ewan Birney

Email birney@ebi.ac.uk

=head1 APPENDIX

The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _

=cut


# Let the code begin...


package Bio::DB::Universal;
use strict;

# Object preamble - inherits from Bio::Root::Root

use Bio::DB::GenBank;
use Bio::DB::SwissProt;
use Bio::DB::EMBL;

use base qw(Bio::DB::RandomAccessI Bio::Root::Root);

# new() could be inherited from Bio::Root::Root, but it is overridden
# below so that the default databases get registered

sub new {
    my ($class) = @_;

    my $self = {};
    bless $self,$class;

    $self->{'db_hash'} = {};

    # default databases

    $self->use_database('embl',Bio::DB::EMBL->new);
    $self->use_database('genbank',Bio::DB::GenBank->new);
    # swiss = swissprot + sptrembl, served by Bio::DB::SwissProt
    $self->use_database('swiss',Bio::DB::SwissProt->new);

    return $self;
}

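
=head2 get_Seq_by_id

 Title   : get_Seq_by_id
 Usage   : $seq = $uni->get_Seq_by_id("embl:HSHNRNPA");
 Function: Guesses the database tag from the string (see guess_id)
           and delegates to that database's get_Seq_by_id
 Example : $seq = $uni->get_Seq_by_id('ROA1_HUMAN');
 Returns : the result of the underlying database's get_Seq_by_id call
 Args    : id string, optionally prefixed with a database name
           and a separator, e.g. embl:YYYYYY or gb|XXXXXX


=cut
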
sub get_Seq_by_id{
    my ($self,$str) = @_;

    my ($tag,$id) = $self->guess_id($str);

    return $self->{'db_hash'}->{$tag}->get_Seq_by_id($id);
}

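
=head2 get_Seq_by_acc

 Title   : get_Seq_by_acc
 Usage   : $seq = $uni->get_Seq_by_acc("gb|A000012");
 Function: Guesses the database tag from the string (see guess_id)
           and delegates to that database's get_Seq_by_acc
 Example : $seq = $uni->get_Seq_by_acc("gb|A000012");
 Returns : the result of the underlying database's get_Seq_by_acc call
 Args    : accession string, optionally prefixed with a database
           name and a separator, e.g. embl:YYYYYY or gb|XXXXXX


=cut
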
sub get_Seq_by_acc {
    my ($self,$str) = @_;

    my ($tag,$id) = $self->guess_id($str);

    return $self->{'db_hash'}->{$tag}->get_Seq_by_acc($id);
}

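
=head2 guess_id

 Title   : guess_id
 Usage   : ($tag,$id) = $uni->guess_id("embl:HSHNRNPA");
 Function: Splits a string into a database tag and an id. If the
           string has a database prefix followed by one of : | / ;
           the prefix selects the tag; otherwise the tag is guessed
           from the shape of the id. Throws if a prefix is present
           but not recognised.
 Example : ($tag,$id) = $uni->guess_id('ROA1_HUMAN');
 Returns : a two-element list: the tag (genbank, embl or swiss)
           and the id with any database prefix removed
 Args    : id or accession string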
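
The guessing rules can be tried in isolation with the following sketch.
The function guess_tag is a hypothetical standalone helper written for
this illustration; it is not part of Bio::DB::Universal, and it dies
where the method would call throw.

    use strict;
    use warnings;

    # Hypothetical helper mirroring the regular expressions in guess_id.
    sub guess_tag {
        my ($str) = @_;

        # explicit prefix, e.g. embl:YYYYYY or gb|XXXXXX
        if ( $str =~ /(\S+)[:|\/;](\w+)/ ) {
            my ( $db, $id ) = ( $1, $2 );
            return ( 'genbank', $id )
                if $db =~ /gb/i || $db =~ /genbank/i || $db =~ /ncbi/i;
            return ( 'embl', $id )
                if $db =~ /embl/i || $db =~ /^em/i;
            return ( 'swiss', $id )
                if $db =~ /swiss/i || $db =~ /^sw/i || $db =~ /sptr/;
            die "Could not guess database type $db from $str\n";
        }

        # no prefix: guess from the shape of the id alone
        return ( 'swiss', $str ) if $str =~ /_/;
        return ( 'swiss', $str ) if $str =~ /^[QPR]\w+\d$/;
        return ( 'genbank', $str );    # everything else defaults to genbank
    }

    for my $str ( 'embl:HSHNRNPA', 'gb|A000012', 'ROA1_HUMAN', 'P09651' ) {
        my ( $tag, $id ) = guess_tag($str);
        print "$str -> $tag, $id\n";
    }

Note that an id without a prefix is never assigned to embl; only an
explicit prefix such as embl: selects that database.
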


=cut

sub guess_id{
    my ($self,$str) = @_;

    if( $str =~ /(\S+)[:|\/;](\w+)/ ) {
        my $tag;
        my $db = $1;
        my $id = $2;
        if( $db =~ /gb/i || $db =~ /genbank/i || $db =~ /ncbi/i ) {
            $tag = 'genbank';
        } elsif ( $db =~ /embl/i || $db =~ /emblbank/ || $db =~ /^em/i ) {
            $tag = 'embl';
        } elsif ( $db =~ /swiss/i || $db =~ /^sw/i || $db =~ /sptr/ ) {
            $tag = 'swiss';
        } else {
            # throw for the moment
            $self->throw("Could not guess database type $db from $str");
        }
        return ($tag,$id);

    } else {
        my $tag;
        # auto-guess from just the id
        if( $str =~ /_/ ) {
            $tag = 'swiss';
        } elsif ( $str =~ /^[QPR]\w+\d$/ ) {
            $tag = 'swiss';
        } elsif ( $str =~ /[A-Z]\d+/ ) {
            $tag = 'genbank';
        } else {
            # default genbank...
            $tag = 'genbank';
        }
        return ($tag,$str);
    }
}

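
=head2 use_database

 Title   : use_database
 Usage   : $uni->use_database('embl',$embl);
 Function: Registers a database object under a tag, replacing any
           database already registered under that tag
 Example : $uni->use_database('embl',
               Bio::Index::EMBL->new( -filename => '/some/index/filename/locally/stored'));
 Returns : the database object that was registered
           (the value of the assignment)
 Args    : tag name (genbank, embl or swiss),
           database object to delegate to for that tag


=cut
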
sub use_database{
    my ($self,$name,$database) = @_;

    $self->{'db_hash'}->{$name} = $database;
}

# a module file must end with a true value
1;