buzgilla -> redmine
[bioperl-db.git] / lib / Bio / DB / BioSQL / Oracle / BiosequenceAdaptorDriver.pm
blobe2d49f618d8552e3979b7aff1addc0dde79a5c1f
1 # $Id$
3 # BioPerl module for Bio::DB::BioSQL::Oracle::BiosequenceAdaptorDriver
5 # Please direct questions and support issues to <bioperl-l@bioperl.org>
7 # Cared for by Hilmar Lapp <hlapp at gmx.net>
9 # Copyright Hilmar Lapp
11 # You may distribute this module under the same terms as perl itself
14 # (c) Hilmar Lapp, hlapp at gmx.net, 2002.
15 # (c) GNF, Genomics Institute of the Novartis Research Foundation, 2002.
17 # You may distribute this module under the same terms as perl itself.
18 # Refer to the Perl Artistic License (see the license accompanying this
19 # software package, or see http://www.perl.com/language/misc/Artistic.html)
20 # for the terms under which you may use, modify, and redistribute this module.
22 # THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
23 # WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
24 # MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
27 # POD documentation - main docs before the code
29 =head1 NAME
31 Bio::DB::BioSQL::Oracle::BiosequenceAdaptorDriver - DESCRIPTION of Object
33 =head1 SYNOPSIS
35 Give standard usage here
37 =head1 DESCRIPTION
39 Describe the object here
41 =head1 FEEDBACK
43 =head2 Mailing Lists
45 User feedback is an integral part of the evolution of this and other
46 Bioperl modules. Send your comments and suggestions preferably to
47 the Bioperl mailing list. Your participation is much appreciated.
49 bioperl-l@bioperl.org - General discussion
50 http://bioperl.org/wiki/Mailing_lists - About the mailing lists
52 =head2 Support
54 Please direct usage questions or support issues to the mailing list:
56 I<bioperl-l@bioperl.org>
58 rather than to the module maintainer directly. Many experienced and
59 reponsive experts will be able look at the problem and quickly
60 address it. Please include a thorough description of the problem
61 with code and data examples if at all possible.
63 =head2 Reporting Bugs
65 Report bugs to the Bioperl bug tracking system to help us keep track
66 of the bugs and their resolution. Bug reports can be submitted via
67 email or the web:
69 bioperl-bugs@bioperl.org
70 http://redmine.open-bio.org/projects/bioperl/
72 =head1 AUTHOR - Hilmar Lapp
74 Email hlapp at gmx.net
76 Describe contact details here
78 =head1 CONTRIBUTORS
80 Additional contributors names and emails here
82 =head1 APPENDIX
84 The rest of the documentation details each of the object methods.
85 Internal methods are usually preceded with a _
87 =cut
90 # Let the code begin...
93 package Bio::DB::BioSQL::Oracle::BiosequenceAdaptorDriver;
94 use vars qw(@ISA);
95 use strict;
97 # Object preamble - inherits from Bio::Root::Root
99 use Bio::DB::BioSQL::Oracle::BasePersistenceAdaptorDriver;
100 use DBD::Oracle qw(:ora_types);
102 @ISA = qw(Bio::DB::BioSQL::Oracle::BasePersistenceAdaptorDriver);
104 # new() is inherited
106 =head2 insert_object
108 Title : insert_object
109 Usage :
110 Function: We override this here in order to omit the insert if there are
111 no values. This is because this entity basically represents a
112 derived class, and we may simply be dealing with the base class.
114 Example :
115 Returns : The primary key of the newly inserted record.
116 Args : A Bio::DB::BioSQL::BasePersistenceAdaptor derived object
117 (basically, it needs to implement dbh(), sth($key, $sth),
118 dbcontext(), and get_persistent_slots()).
119 The object to be inserted.
120 A reference to an array of foreign key objects; if any of those
121 foreign key values is NULL (some foreign keys may be nullable),
122 then give the class name.
125 =cut
127 sub insert_object{
128 my $self = shift;
129 my ($adp,$obj,$fkobjs,$isdef) = @_;
131 # this may come precomputed from the update_object() method below
132 if(!defined($isdef)) {
133 # no, not precomputed
134 # obtain the object's slot values to be serialized
135 my $slotvals = $adp->get_persistent_slot_values($obj, $fkobjs);
136 # any value present?
137 foreach (@$slotvals) { $isdef ||= $_; last if $isdef; }
139 return $self->SUPER::insert_object(@_) if $isdef;
140 return -1;
143 =head2 update_object
145 Title : update_object
146 Usage :
147 Function: See parent class. We need to override this here because
148 there is no Biosequence object separate from PrimarySeq
149 that would hold a primary key. Hence, store()s cannot
150 recognize when the Biosequence for a Bioentry already
151 exists and needs to be updated, or when it needs to be
152 created. The way the code is currently wired, the presence
153 of the primary key (stemming from the bioentry) will always
154 trigger an update.
156 So, what we need to do here is check whether the entry already
157 exists and if not delegate to insert_object().
158 Example :
159 Returns : The number of updated rows
160 Args : A Bio::DB::BioSQL::BasePersistenceAdaptor derived object
161 (basically, it needs to implement dbh(), sth($key, $sth),
162 dbcontext(), and get_persistent_slots()).
163 The object to be updated.
164 A reference to an array of foreign key objects; if any of those
165 foreign key values is NULL (some foreign keys may be nullable),
166 then give the class name.
169 =cut
171 sub update_object{
172 my ($self,$adp,$obj,$fkobjs) = @_;
174 # see whether there are any values defined at all
175 my $slotvals = $adp->get_persistent_slot_values($obj, $fkobjs);
176 my $isdef = 0;
177 foreach (@$slotvals) { $isdef ||= $_; last if $isdef; }
178 # in the majority of cases this actually will be an update indeed - so
179 # let's just go ahead and try if there are any values to update
180 my $rv = -1;
181 if($isdef) {
182 $rv = $self->SUPER::update_object($adp,$obj,$fkobjs);
183 # if the number of affected rows was zero, then it needs to be
184 # an insert
185 if($rv && ($rv == 0)) {
186 # pass on the pre-computed $isdef (see the implementation above)
187 $rv = $self->insert_object($adp,$obj,$fkobjs,$isdef);
190 # done
191 return $rv;
194 =head2 get_biosequence
196 Title : get_biosequence
197 Usage :
198 Function: Returns the actual sequence for a bioentry, or a substring of it.
199 Example :
200 Returns : A string (the sequence or subsequence)
201 Args : The calling persistence adaptor.
202 The primary key of the bioentry for which to obtain the sequence.
203 Optionally, start and end position if only a subsequence is to be
204 returned (for long sequences, obtaining the subsequence from the
205 database may be much faster than obtaining it from the complete
206 in-memory string, because the latter has to be retrieved first).
209 =cut
211 sub get_biosequence{
212 my ($self,$adp,$bioentryid,$start,$end) = @_;
213 my ($sth, $cache_key, $row);
214 my $seqstr;
216 if(defined($start)) {
217 # statement cached?
218 $cache_key = "SELECT BIOSEQ SUBSTR".$adp.(defined($end) ?" 2POS":"");
219 $sth = $adp->sth($cache_key);
220 if(! $sth) {
221 # we need to create this
222 my $table = $self->table_name($adp);
223 my $seqcol = $self->slot_attribute_map($table)->{"seq"};
224 if(! $seqcol) {
225 $self->throw("no mapping for column seq in table $table");
227 my $ukname = $self->foreign_key_name("Bio::PrimarySeqI");
228 my $sql = "SELECT DBMS_LOB.SUBSTR($seqcol, ";
229 if(defined($end)) {
230 $sql .= "?, ?";
231 } else {
232 $sql .= "DBMS_LOB.GETLENGTH($seqcol) - ?, ?";
234 $sql .= ") FROM $table WHERE $ukname = ?";
235 $adp->debug("preparing SELECT statement: $sql\n");
236 $sth = $adp->dbh()->prepare($sql);
237 # and cache it
238 $adp->sth($cache_key, $sth);
240 # bind parameters
241 if(defined($end)) {
242 $sth->bind_param(1, $end-$start+1);
243 } else {
244 $sth->bind_param(1, $start-1);
246 $sth->bind_param(2, $start);
247 $sth->bind_param(3, $bioentryid);
248 } else {
249 # statement cached?
250 $cache_key = "SELECT BIOSEQ ".$adp;
251 $sth = $adp->sth($cache_key);
252 if(! $sth) {
253 # we need to create this
254 my $table = $self->table_name($adp);
255 my $seqcol = $self->slot_attribute_map($table)->{"seq"};
256 if(! $seqcol) {
257 $self->throw("no mapping for column seq in table $table");
259 my $ukname = $self->foreign_key_name("Bio::PrimarySeqI");
260 my $sql = "SELECT $seqcol FROM $table WHERE $ukname = ?";
261 $adp->debug("preparing SELECT statement: $sql\n");
262 $sth = $adp->dbh()->prepare($sql);
263 # and cache it
264 $adp->sth($cache_key, $sth);
266 # bind parameters
267 $sth->bind_param(1, $bioentryid);
269 # execute and fetch
270 if (! $sth->execute()) {
271 $self->throw("error while executing query $cache_key with values ("
272 .(defined($start) ? "$start;" : "")
273 .(defined($end) ? ($end-$start+1).";" : "")
274 .$bioentryid."):\n"
275 .$sth->errstr." (".$sth->state.")");
277 $row = $sth->fetchall_arrayref();
278 return (@$row ? $row->[0]->[0] : undef);
281 =head2 prepare
283 Title : prepare
284 Usage :
285 Function: Prepares a SQL statement and returns a statement handle.
287 We override this here in order to intercept the row update
288 statement. We'll edit the statement to replace the table
289 name with the fully qualified table the former points to if
290 it is in fact a synonym, not a real table. The reason is
291 that otherwise LOB support doesn't work properly if the LOB
292 parameter is wrapped in a call to NVL() (which it is) and
293 the table is only a synonym, not a physical table.
295 Example :
296 Returns : the return value of the DBI::prepare() call
297 Args : the DBI database handle for preparing the statement
298 the SQL statement to prepare (a scalar)
299 additional arguments to be passed to the dbh->prepare call
302 =cut
304 sub prepare{
305 my ($self,$dbh,$sql,@args) = @_;
307 # we need to intercept the 'UPDATE biosequence' or whatever the table
308 # is called here, so in order not to hardcode the table name let's
309 # ask for it
310 my $table = uc($self->table_name("Bio::DB::BioSQL::BiosequenceAdaptor"));
311 # now is it the UPDATE we're interested in messing with?
312 if($sql =~ /^update\s+$table/i) {
313 # yes it is.
315 # copy the sql and edit to remove the NVL() for the SEQ column
316 my $sql2 = $sql;
317 $sql2 =~ s/seq\s+=\s+nvl\(\s*\?\s*,\s*seq\s*\)/seq = \?/i;
318 # In the third version we edit away the NVL clause and replace
319 # it with a CONCAT clause. This is to be used if the parameter
320 # is NULL; concatenating NULL to an existing CLOB doesn't
321 # change it, and the return type of CONCAT preserves the type
322 # of the first argument.
323 my $sql3 = $sql;
324 $sql3 =~ s/seq\s+=\s+nvl\(\s*\?\s*,\s*seq\s*\)/seq = CONCAT(seq, \?)/i;
325 # prepare both and cache for later use
326 $self->debug("first alternative UPDATE biosequence: $sql2\n");
327 $self->debug("second alternative UPDATE biosequence: $sql3\n");
328 $self->_upd_sth2($dbh->prepare($sql2));
329 $self->_upd_sth3($dbh->prepare($sql3));
331 return $dbh->prepare($sql,@args);
334 =head2 get_sth
336 Title : get_sth
337 Usage :
338 Function: Retrieves the (prepared) statement handle to bind
339 parameters for and to execute for the given operation.
341 By default this will use the supplied key to retrieve the
342 statement from the cache.
344 This method is here to provide an opportunity for
345 inheriting drivers to intercept the cached statement
346 retrieval in order to on-the-fly redirect the statement
347 execution to use a different statement than it would have
348 used by default.
350 This method may return undef if for instance there is no
351 appropriate statement handle in the cache. Returning undef
352 will trigger the calling method to construct a statement
353 from scratch.
355 Example :
356 Returns : a prepared statement handle if one is exists for the query,
357 and undef otherwise
358 Args : - the calling adaptor (a Bio::DB::BioSQL::BasePersistenceAdaptor
359 derived object
360 - the object for the persistence operation
361 - a reference to an array of foreign key objects; if any of
362 those foreign key values is NULL then the class name
363 - the key to the cache of the adaptor
364 - the operation requesting a cache key (a scalar basically
365 representing the name of the method)
368 =cut
370 sub get_sth{
371 my ($self,$adp,$obj,$fkobjs,$key,$op) = @_;
372 my ($sth,$meth);
374 # check whether we have to return the statement here for the edited
375 # update statement
376 if ($op eq "update_object") {
377 my $vals = $adp->get_persistent_slot_values($obj, $fkobjs);
378 my @cols = $adp->get_persistent_slots($obj, $fkobjs);
379 # we need both (new) seq(uence) and (original) length in
380 # order to determine whether either old or new sequence is
381 # longer than 4000 chars
382 my @i = 0..(scalar(@cols)-1);
383 my ($i_l) = grep { $cols[$_] eq "length"; } @i;
384 my ($i_s) = grep { $cols[$_] eq "seq"; } @i;
385 if (defined($vals->[$i_s]) && (length($vals->[$i_s]) > 4000)) {
386 $key .= " clob-compat";
387 $meth = "_upd_sth2";
388 $adp->debug("UPDATE biosequence: using first alternative\n");
389 } elsif ((!defined($vals->[$i_s])) && ($vals->[$i_l] > 4000)) {
390 $key .= " clob-compat2";
391 $meth = "_upd_sth3";
392 $adp->debug("UPDATE biosequence: using second alternative\n");
395 $sth = $adp->sth($key);
396 $sth = $adp->sth($key, $self->$meth) unless $sth || !defined($meth);
397 return $sth;
400 =head2 _upd_sth2
402 Title : _upd_sth2
403 Usage : $obj->_upd_sth2($newval)
404 Function: Get/set the second version of the update row statement
405 as a prepared statement handle.
407 The 'second version' differs from the default in that the
408 set parameter for the SEQ column is not wrapped in a NVL()
409 call. This is needed to make it work for LOB values (values
410 longer than 4000 chars). However, this statement should
411 only be executed if the value is defined in order to
412 prevent unwanted un-sets of the value in the database.
414 This is a private method. Do not use from outside.
416 Example :
417 Returns : value of _upd_sth2 (a DBI statement handle)
418 Args : on set, new value (a DBI statement handle or undef, optional)
421 =cut
423 sub _upd_sth2{
424 my $self = shift;
426 return $self->{'_upd_sth2'} = shift if @_;
427 return $self->{'_upd_sth2'};
430 =head2 _upd_sth3
432 Title : _upd_sth3
433 Usage : $obj->_upd_sth3($newval)
434 Function: Get/set the third version of the update row statement
435 as a prepared statement handle.
437 The 'third version' differs from the default in that the
438 parameter for the SEQ column is not used for updating at
439 all, but instead is placed into the WHERE-section as a
440 dummy clause that always evaluates to true. This is needed
441 to protect existing LOB values longer than 4000 chars from
442 being updated to NULL, due to a bug in NVL().
444 This is a private method. Do not use from outside.
446 Example :
447 Returns : value of _upd_sth3 (a DBI statement handle)
448 Args : on set, new value (a DBI statement handle or undef, optional)
451 =cut
453 sub _upd_sth3{
454 my $self = shift;
456 return $self->{'_upd_sth3'} = shift if @_;
457 return $self->{'_upd_sth3'};