# BioPerl module for Bio::OntologyIO
# Please direct questions and support issues to <bioperl-l@bioperl.org>
# Cared for by Hilmar Lapp <hlapp at gmx.net>
# Copyright Hilmar Lapp
# You may distribute this module under the same terms as perl itself
# (c) Hilmar Lapp, hlapp at gmx.net, 2003.
# (c) GNF, Genomics Institute of the Novartis Research Foundation, 2003.
# You may distribute this module under the same terms as perl itself.
# Refer to the Perl Artistic License (see the license accompanying this
# software package, or see http://www.perl.com/language/misc/Artistic.html)
# for the terms under which you may use, modify, and redistribute this module.
# THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
# WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
# POD documentation - main docs before the code
Bio::OntologyIO - Parser factory for Ontology formats

    my $parser = Bio::OntologyIO->new(-format => "go",
    while(my $ont = $parser->next_ontology()) {
        print "read ontology ",$ont->name()," with ",
              scalar($ont->get_root_terms)," root terms, and ",
              scalar($ont->get_leaf_terms)," leaf terms\n";

This is the parser factory for different ontology sources and
formats. Conceptually, it is very similar to L<Bio::SeqIO>, but the
difference is that the chunk of data returned as an object is an
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.

  bioperl-l@bioperl.org                  - General discussion
  http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

Please direct usage questions or support issues to the mailing list:

I<bioperl-l@bioperl.org>

rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via

  https://github.com/bioperl/bioperl-live/issues

=head1 AUTHOR - Hilmar Lapp

Email hlapp at gmx.net

The rest of the documentation details each of the object methods.
Internal methods are usually prefixed with an underscore (_).
# Let the code begin...

package Bio::OntologyIO;

# Object preamble - inherits from Bio::Root::Root

use base qw(Bio::Root::Root Bio::Root::IO);

# Maps from format name to driver suitable for the format.

my %format_driver_map = (
    "interpro"    => "InterProParser",
    "interprosax" => "Handlers::InterPro_BioSQL_Handler",
    "evoc"        => "simplehierarchy",
 Usage   : my $parser = Bio::OntologyIO->new(-format => 'go', @args);
 Function: Returns a stream of ontologies opened on the specified input
           for the specified format.
 Returns : An ontology parser (an instance of Bio::OntologyIO) initialized
           for the specified format.
 Args    : Named parameters. Common parameters are

              -format    - the format of the input; the following are
                  goflat: DAG-Edit Gene Ontology flat files
                  go    : synonymous to goflat
                  soflat: DAG-Edit Sequence Ontology flat files
                  so    : synonymous to soflat
                  simplehierarchy: text format with one term per line
                          and indentation giving the hierarchy
                  evoc  : synonymous to simplehierarchy
                  interpro: InterPro XML
                  interprosax: InterPro XML - this is actually not a
                          Bio::OntologyIO compliant parser; instead it
                          persists terms as they are encountered.
                          L<Bio::OntologyIO::Handlers::InterPro_BioSQL_Handler>
                  obo   : OBO format style from Gene Ontology Consortium
              -file      - the file holding the data
              -fh        - the stream providing the data (-file and -fh are
              -ontology_name - the name of the ontology
              -engine    - the L<Bio::Ontology::OntologyEngineI> object
                           to be reused (will be created otherwise); note
                           that every L<Bio::Ontology::OntologyI> will
                           qualify as well since that one inherits from the
              -term_factory - the ontology term factory to use. Provide a
                           value only if you know what you are doing.

           DAG-Edit flat file parsers will usually also accept the
           following parameters.

              -defs_file - the name of the file holding the term
              -files     - an array ref holding the file names (for GO,
                           there will usually be 3 files: component.ontology,
                           function.ontology, process.ontology)

           Other parameters are specific to the parsers.
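
A minimal sketch of how the -format synonyms above could resolve to a
driver class, and how a factory constructor could hand off to that driver.
It is self-contained and does not load BioPerl: the package
My::ParserFactory, its map_format method, and the on-the-fly driver
classes are hypothetical stand-ins, and the map holds only the synonyms
listed above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the factory; not part of BioPerl.
package My::ParserFactory;

# Synonyms point at the driver that implements the format
# (entries taken from the -format list above).
my %format_driver_map = (
    go   => 'goflat',
    so   => 'soflat',
    evoc => 'simplehierarchy',
);

# Resolve a user-supplied format name to a driver name.
sub map_format {
    my ($class, $format) = @_;
    die "unable to guess ontology format, specify -format\n"
        unless defined $format && length $format;
    # Known synonym, or else the lowercased name itself.
    return $format_driver_map{ lc $format } || lc $format;
}

sub new {
    my ($caller, %param) = @_;
    my $class = ref($caller) || $caller;

    # Called on a concrete driver class: build the object directly.
    if ($class =~ /^My::ParserFactory::(\S+)/) {
        return bless { format => $1, %param }, $class;
    }

    # Called on the factory: lowercase the keys, pick the driver, delegate.
    my %lc_param = map { (lc($_) => $param{$_}) } keys %param;
    my $driver       = $class->map_format($lc_param{'-format'});
    my $driver_class = "My::ParserFactory::$driver";
    {
        no strict 'refs';
        # Assumption for this sketch only: declare the driver class on the
        # fly instead of loading a module file at run time.
        @{"${driver_class}::ISA"} = ($class) unless @{"${driver_class}::ISA"};
    }
    return $driver_class->new(%param);
}

package main;

my $parser = My::ParserFactory->new(-format => 'GO',
                                    -file   => 'component.ontology');
print ref($parser), "\n";
```

Run as a script, this prints My::ParserFactory::goflat: the format name is
lowercased, the synonym "go" resolves to the "goflat" driver, and the
second call to new() takes the driver branch and blesses the object.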
    my ($caller,@args) = @_;
    my $class = ref($caller) || $caller;
    # or do we want to call SUPER on an object if $caller is an
    if( $class =~ /Bio::OntologyIO::(\S+)/ ) {
        my ($self) = $class->SUPER::new(@args);
        $self->_initialize(@args);
        @param{ map { lc $_ } keys %param } = values %param; # lowercase keys
        my $format = $class->_map_format($param{'-format'});
        # normalize capitalization
        return unless( $class->_load_format_module($format) );
        return "Bio::OntologyIO::$format"->new(@args);
 Usage   : $format = $parser->format()
 Function: Get the ontology format
 Returns : ontology format

# format() method inherited from Bio::Root::IO

    my($self, @args) = @_;

    # initialize factories etc
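    # TERM_FACTORY is the only key requested, so _rearrange returns a
    # single value; bind it to $fact, which is the variable used below.
    my ($fact) = $self->_rearrange([qw(TERM_FACTORY)], @args);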
    # term object factory
    $self->term_factory($fact) if $fact;

    # initialize the Bio::Root::IO part
    $self->_initialize_io(@args);

 Title   : next_ontology
 Usage   : $ont = $stream->next_ontology()
 Function: Reads the next ontology object from the stream and returns it.
 Returns : a L<Bio::Ontology::OntologyI> compliant object, or undef at the

    shift->throw_not_implemented();
 Usage   : $obj->term_factory($newval)
 Function: Get/set the ontology term factory to use.

           As a user of this module it is not necessary to call this
           method as there will be a default. In order to change the
           default, the easiest way is to instantiate
           L<Bio::Ontology::TermFactory> with the proper -type
           argument. Most if not all parsers will actually use this
           very implementation, so even easier than the aforementioned
           way is to simply call
           $ontio->term_factory->type("Bio::Ontology::MyTerm").

 Returns : value of term_factory (a Bio::Factory::ObjectFactoryI object)
 Args    : on set, new value (a Bio::Factory::ObjectFactoryI object, optional)

    return $self->{'term_factory'} = shift if @_;
    return $self->{'term_factory'};
=head1 Private Methods

 Some of these are actually 'protected' in OO speak, which means you
 may or will want to utilize them in a derived ontology parser, but
 you should not call them from outside.

=head2 _load_format_module

 Title   : _load_format_module
 Usage   : *INTERNAL OntologyIO stuff*
 Function: Loads up (like use) a module at run time on demand

sub _load_format_module {
    my ($self, $format) = @_;
    my $module = "Bio::OntologyIO::" . $format;
        $ok = $self->_load_module($module);
$self: $format cannot be found
For more information about the OntologyIO system please see the docs.
This includes ways of checking for formats at compile time, not run time
        $mod = $format_driver_map{lc($format)};
        $mod = lc($format) unless $mod;
        $self->throw("unable to guess ontology format, specify -format");
    my( $self, $ref ) = @_;
    $ref =~ s/&lt\\;/\</g;
    $ref =~ s/&gt\\;/\>/g;
    $ref =~ s/&pct\\;/\%/g;