StandAloneBlast & CommandExts: Apply double-quotes on
[bioperl-live.git] / Bio / Tools / Run / WrapperBase / CommandExts.pm
blobbd17343d02ff70677451d68d8833e115eb88d6dd
2 # BioPerl module for Bio::Tools::Run::WrapperBase::CommandExts
4 # Please direct questions and support issues to <bioperl-l@bioperl.org>
6 # Cared for by Mark A. Jensen <maj -at- fortinbras -dot- us>
8 # Copyright Mark A. Jensen
10 # You may distribute this module under the same terms as perl itself
12 # POD documentation - main docs before the code
14 =head1 NAME
16 Bio::Tools::Run::WrapperBase::CommandExts - Extensions to WrapperBase for handling programs with commands *ALPHA*
18 =head1 SYNOPSIS
20 Devs, see L</DEVELOPER INTERFACE>.
21 Users, see L</USER INTERFACE>.
23 =head1 DESCRIPTION
25 This is a developer-focused experimental module. The main idea is to
26 extend L<Bio::Tools::Run::WrapperBase> to make it relatively easy to
27 create run wrappers around I<suites> of related programs, like
28 C<samtools> or C<blast+>.
30 Some definitions:
32 =over
34 =item * program
36 The program is the command-line frontend application. C<samtools>, for example, is run from the command line as follows:
38 $ samtools view -bS in.bam > out.sam
39 $ samtools faidx
41 =item * command
43 The command is the specific component of a suite run by executing the
44 program. In the example above, C<view> and C<faidx> are commands.
46 =item * command prefix
48 The command prefix is an abbreviation of the command name used
49 internally by C<CommandExts> method, and sometimes by the user of the
50 factory for specifying command line parameters to subcommands of
51 composite commands.
53 =item * composite command
55 A composite command is a pipeline or script representing a series of
56 separate executions of different commands. Composite commands can be
57 specified by configuring C<CommandExts> appropriately; the composite
58 command can be run by the user from a factory in the same way as
59 ordinary commands.
61 =item * options, parameters, switches and filespecs
63 An option is any command-line option; i.e., a specification set off by
64 a command-line by a specifier (like C<-v> or C<--outfile>). Parameters
65 are command-line options that accept a value (C<-title mydb>);
66 switches are boolean flags (C<--no-filter>). Filespecs are barewords
67 at the end of the command line that usually indicate input or output
68 files. In this module, this includes files that capture STDIN, STDOUT,
69 or STDERR via redirection.
71 =item * pseudo-program
73 A "pseudo-program" is a way to refer to a collection of related
74 applications that are run independently from the command line, rather
75 than via a frontend program. The C<blast+> suite of programs is an
76 example: C<blastn>, C<makeblastdb>, etc. C<CommandExts> can be
77 configured to create a single factory for a suite of related,
78 independent programs that treats each independent program as a
79 "pseudo-program" command.
81 =back
83 This module essentially adds the non-assembler-specific wrapper
84 machinery of fangly's L<Bio::Tools::Run::AssemblerBase> to the
85 L<Bio::Tools::Run::WrapperBase> namespace, adding the general
86 command-handling capability of L<Bio::Tools::Run::BWA>. It creates run
87 factories that are automatically Bio::ParameterBaseI compliant,
88 meaning that C<available_parameters()>, C<set_parameters()>,
89 C<get_parameters>, C<reset_parameters()>, and C<parameters_changed()>
90 are available.
92 =head1 DEVELOPER INTERFACE
94 C<CommandExts> is currently set up to read particular package globals
95 which define the program, the commands available, command-line options
96 for those commands, and human-readable aliases for those options.
98 The easiest way to use C<CommandExts> is probably to create two modules:
100 Bio::Tools::Run::YourRunPkg
101 Bio::Tools::Run::YourRunPkg::Config
103 The package globals should be defined in the C<Config> module, and the
104 run package itself should begin with the following mantra:
106 use YourRunPkg::Config;
107 use Bio::Tools::Run::WrapperBase;
108 use Bio::Tools::Run::WrapperBase::CommandExts;
109 sub new {
110 my $class = shift;
111 my @args = @_;
112 my $self = $class->SUPER::new(@args);
114 return $self;
117 The following globals can/should be defined in the C<Config> module:
119 $program_name
120 $program_dir
121 $use_dash
122 $join
123 @program_commands
124 %command_prefixes
125 @program_params
126 @program_switches
127 %param_translation
128 %composite_commands
129 %command_files
131 See L</Config Globals> for detailed descriptions.
133 The work of creating a run wrapper with C<CommandExts> lies mainly in
134 setting up the globals. The key methods for the developer interface are:
136 =over
138 =item * program_dir($path_to_programs)
140 Set this to point the factory to the executables.
142 =item * _run(@file_args)
144 Runs an instantiated factory with the given file args. Use in the
145 C<run()> method override.
147 =item * _create_factory_set()
149 Returns a hash of instantiated factories for each true command from a
150 composite command factory. The hash keys are the true command names, so
151 you could do
153 $cmds = $composite_fac->_create_factory_set;
154 for (@true_commands) {
155 $cmds->{$_}->_run(@file_args);
158 =item * executables($cmd,[$fullpath])
160 For pseudo-programs, this gets/sets the full path to the executable of
161 the true program corresponding to the command C<$cmd>.
163 =back
165 =head2 Implementing Composite Commands
167 =head2 Implementing Pseudo-programs
169 To indicate that a package wraps disparate programs under a single pseudo program, use an asterisk before the program name:
171 package Bio::Tools::Run::YourPkg::Config;
173 our $program_name = '*blast+';
175 and C<_run> will know what to do. Specify the rest of the globals as
176 if the desired programs were commands. Use the basename of the
177 programs for the command names.
179 If all the programs can be found in a single directory, just specify
180 that directory in C<program_dir()>. If not, use C<executables()> to set the paths to each program explicitly:
182 foreach (keys %cmdpaths) {
183 $self->executables($_, $cmdpaths{$_});
186 =head2 Config Globals
188 Here is an example config file. Further details in prose are below.
190 package Dummy::Config;
191 use strict;
192 use warnings;
193 no warnings qw(qw);
194 use Exporter;
195 our (@ISA, @EXPORT, @EXPORT_OK);
196 push @ISA, 'Exporter';
197 @EXPORT = qw(
198 $program_name
199 $program_dir
200 $use_dash
201 $join
202 @program_commands
203 %command_prefixes
204 @program_params
205 @program_switches
206 %param_translation
207 %command_files
208 %composite_commands
211 our $program_name = '*flurb';
212 our $program_dir = 'C:\cygwin\usr\local\bin';
213 our $use_dash = 'mixed';
214 our $join = ' ';
216 our @program_commands = qw(
217 rpsblast
218 find
219 goob
220 blorb
221 multiglob
224 our %command_prefixes = (
225 blastp => 'blp',
226 tblastn => 'tbn',
227 goob => 'g',
228 blorb => 'b',
229 multiglob => 'm'
232 our @program_params = qw(
233 command
234 g|narf
235 g|schlurb
236 b|scroob
237 b|frelb
238 m|trud
241 our @program_switches = qw(
242 g|freen
243 b|klep
246 our %param_translation = (
247 'g|narf' => 'n',
248 'g|schlurb' => 'schlurb',
249 'g|freen' => 'f',
250 'b|scroob' => 's',
251 'b|frelb' => 'frelb'
254 our %command_files = (
255 'goob' => [qw( fas faq )],
258 our %composite_commands = (
259 'multiglob' => [qw( blorb goob )]
263 C<$use_dash> can be one of C<single>, C<double>, or C<mixed>. See L<Bio::Tools::Run::WrapperBase>.
265 There is a syntax for the C<%command_files> specification. The token
266 matching C<[a-zA-Z0-9_]+> in each element of each arrayref becomes the
267 named filespec parameter for the C<_run()> method in the wrapper
268 class. Additional symbols surrounding this token indicate how this
269 argument should be handled. Some examples:
271 >out : stdout is redirected into the file
272 specified by (..., -out => $file,... )
273 <in : stdin is accepted from the file
274 specified by (..., -in => $file,... )
275 2>log : stderr is redirected into the file
276 specified by (..., -log => $file,... )
277 #opt : this filespec argument is optional
278 (no throw if -opt => $option is missing)
279 2>#log: if -log is not specified in the arguments, the stderr()
280 method will capture stderr
281 *lst : this filespec can take multiple arguments,
282 specify using an arrayref (..., -lst => [$file1, $file2], ...)
283 *#lst : an optional list
285 The tokens above are examples; they can be anything matching the above regexp.
287 =head1 USER INTERFACE
289 Using a wrapper created with C<Bio::Tools::Run::WrapperBase::CommandExts>:
291 =over
293 =item * Getting a list of available commands, parameters, and filespecs:
295 To get a list of commands, simply:
297 @commands = Bio::Tools::Run::ThePkg->available_commands;
299 The wrapper will generally have human-readable aliases for each of the
300 command-line options for the wrapped program and commands. To obtain a
301 list of the parameters and switches available for a particular
302 command, do
304 $factory = Bio::Tools::Run::ThePkg->new( -command => 'glurb' );
305 @params = $factory->available_parameters('params');
306 @switches = $factory->available_parameters('switches');
307 @filespec = $factory->available_parameters('filespec');
308 @filespec = $factory->filespec; # alias
310 =item * Create factories
312 The factory is a handle on the program and command you wish to
313 run. Create a factory using C<new> to set command-line parameters:
315 $factory = Bio::Tools::Run::ThePkg->new( -command => 'glurb',
316 -freen => 1,
317 -furschlugginer => 'vreeble' );
319 A shorthand for this is:
321 $factory = Bio::Tools::Run::ThePkg->new_glurb(
322 -freen => 1,
323 -furschlugginer => 'vreeble' );
325 =item * Running programs
327 To run the program, use the C<run> method, providing filespecs as arguments
329 $factory = Bio::Tools::Run::ThePkg->new_assemble( -min_qual => 63 );
330 $factory->run( -faq1 => 'read1.fq', -faq2 => 'read2.fq',
331 -ref => 'refseq.fas', -out => 'new.sam' );
332 # do another
333 $factory->run( -faq1 => 'read-old1.fq', -faq2 => 'read-old2.fq',
334 -ref => 'refseq.fas', -out => 'old.sam' );
336 Messages on STDOUT and STDERR are dumped into their respective attributes:
338 $stdout = $factory->stdout;
339 $stderr = $factory->stderr;
341 unless STDOUT and/or STDERR are part of the named files in the filespec.
343 =item * Setting/getting/resetting/polling parameters.
345 A C<CommandExts>-based factory is always L<Bio::ParameterBaseI>
346 compliant. That means that you may set, get, and reset parameters
347 using C<set_parameters()>, C<get_parameters()>, and
348 C<reset_parameters>. You can ask whether parameters have changed since
349 they were last accessed by using the predicate
350 C<parameters_changed>. See L<Bio::ParameterBaseI> for more details.
352 Once set, parameters become attributes of the factory. Thus, you can get their values as follows:
354 if ($factory->freen) {
355 $furs = $factory->furshlugginer;
356 #...
359 =back
361 =head1 FEEDBACK
363 =head2 Mailing Lists
365 User feedback is an integral part of the evolution of this and other
366 Bioperl modules. Send your comments and suggestions preferably to
367 the Bioperl mailing list. Your participation is much appreciated.
369 bioperl-l@bioperl.org - General discussion
370 http://bioperl.org/wiki/Mailing_lists - About the mailing lists
372 =head2 Support
374 Please direct usage questions or support issues to the mailing list:
376 L<bioperl-l@bioperl.org>
378 rather than to the module maintainer directly. Many experienced and
379 reponsive experts will be able look at the problem and quickly
380 address it. Please include a thorough description of the problem
381 with code and data examples if at all possible.
383 =head2 Reporting Bugs
385 Report bugs to the Bioperl bug tracking system to help us keep track
386 of the bugs and their resolution. Bug reports can be submitted via
387 the web:
389 https://redmine.open-bio.org/projects/bioperl/
391 =head1 AUTHOR - Mark A. Jensen
393 Email maj -at- fortinbras -dot- us
395 Describe contact details here
397 =head1 CONTRIBUTORS
399 Dan Kortschak ( dan -dot- kortschak -at- adelaide -dot- edu -dot- au )
401 =head1 APPENDIX
403 The rest of the documentation details each of the object methods.
404 Internal methods are usually preceded with a _
406 =cut
408 # Let the code begin...
410 package Bio::Tools::Run::WrapperBase; # need these methods in WrapperBase/maj
411 use strict;
412 use warnings;
413 no warnings qw(redefine);
415 use Bio::Root::Root;
416 use File::Spec;
417 use IPC::Run;
418 use base qw(Bio::Root::Root Bio::ParameterBaseI);
420 our $AUTOLOAD;
422 =head2 new()
424 Title : new
425 Usage :
426 Function: constructor for WrapperBase::CommandExts ;
427 correctly binds configuration variables
428 to the WrapperBase object
429 Returns : Bio::Tools::Run::WrapperBase object with command extensions
430 Args :
431 Note : this method subsumes the old _register_program_commands and
432 _set_program_options, leaving out the assembler-specific
433 parms ($qual_param and out_type())
435 =cut
437 sub new {
438 my ($class, @args) = @_;
439 my $self = bless ({}, $class);
440 # pull in *copies* of the Config variables from the caller namespace:
441 my ($pkg, @goob) = caller();
442 my ($commands,
443 $prefixes,
444 $params,
445 $switches,
446 $translation,
447 $use_dash,
448 $join,
449 $name,
450 $dir,
451 $composite_commands,
452 $files);
453 for (qw( @program_commands
454 %command_prefixes
455 @program_params
456 @program_switches
457 %param_translation
458 $use_dash
459 $join
460 $program_name
461 $program_dir
462 %composite_commands
463 %command_files ) ) {
464 my ($sigil, $var) = m/(.)(.*)/;
465 my $qualvar = "${sigil}${pkg}::${var}";
466 for ($sigil) {
467 /\@/ && do { $qualvar = "\[$qualvar\]" };
468 /\%/ && do { $qualvar = "\{$qualvar\}" };
470 my $locvar = "\$${var}";
471 $locvar =~ s/program_|command_|param_//g;
472 eval "$locvar = $qualvar";
474 # set up the info registry hash
475 my %registry;
476 if ($composite_commands) {
477 $self->_register_composite_commands($composite_commands,
478 $params,
479 $switches,
480 $prefixes);
482 @registry{qw( _commands _prefixes _files
483 _params _switches _translation
484 _composite_commands )} =
485 ($commands, $prefixes, $files,
486 $params, $switches, $translation,
487 $composite_commands);
488 $self->{_options} = \%registry;
489 if (not defined $use_dash) {
490 $self->{'_options'}->{'_dash'} = 1;
491 } else {
492 $self->{'_options'}->{'_dash'} = $use_dash;
494 if (not defined $join) {
495 $self->{'_options'}->{'_join'} = ' ';
496 } else {
497 $self->{'_options'}->{'_join'} = $join;
499 if ($name =~ /^\*/) {
500 $self->is_pseudo(1);
501 $name =~ s/^\*//;
503 $self->program_name($name) if not defined $self->program_name();
504 $self->program_dir($dir) if not defined $self->program_dir();
505 $self->set_parameters(@args);
506 $self->parameters_changed(1); # set on instantiation, per Bio::ParameterBaseI
507 return $self;
510 =head2 program_name
512 Title : program_name
513 Usage : $factory->program_name($name)
514 Function: get/set the executable name
515 Returns: string
516 Args : string
518 =cut
520 sub program_name {
521 my ($self, $val) = @_;
522 $self->{'_program_name'} = $val if $val;
523 return $self->{'_program_name'};
526 =head2 program_dir
528 Title : program_dir
529 Usage : $factory->program_dir($dir)
530 Function: get/set the program dir
531 Returns: string
532 Args : string
534 =cut
536 sub program_dir {
537 my ($self, $val) = @_;
538 $self->{'_program_dir'} = $val if $val;
539 return $self->{'_program_dir'};
542 =head2 _register_program_commands()
544 Title : _register_program_commands
545 Usage : $factory->_register_program_commands( \@commands, \%prefixes )
546 Function: Register the commands a program accepts (for programs that act
547 as frontends for a set of commands, each command having its own
548 set of params/switches)
549 Returns : true on success
550 Args : arrayref to a list of commands (scalar strings),
551 hashref to a translation table of the form
552 { $prefix1 => $command1, ... } [optional]
553 Note : To implement a program with this kind of calling structure,
554 include a parameter called 'command' in the
555 @program_params global
556 Note : The translation table is used to associate parameters and
557 switches specified in _set_program_options with the correct
558 program command. In the globals @program_params and
559 @program_switches, specify elements as 'prefix1|param' and
560 'prefix1|switch', etc.
562 =cut
564 =head2 _set_program_options
566 Title : _set_program_options
567 Usage : $factory->_set_program_options( \@ args );
568 Function: Register the parameters and flags that an assembler takes.
569 Returns : 1 for success
570 Args : - arguments passed by the user
571 - parameters that the program accepts, optional (default: none)
572 - switches that the program accepts, optional (default: none)
573 - parameter translation, optional (default: no translation occurs)
574 - dash option for the program parameters, [1|single|double|mixed],
575 optional (default: yes, use single dashes only)
576 - join, optional (default: ' ')
578 =cut
580 =head2 _translate_params
582 Title : _translate_params
583 Usage : @options = $assembler->_translate_params( );
584 Function: Translate the Bioperl arguments into the arguments to pass to the
585 program on the command line
586 Returns : Arrayref of arguments
587 Args : none
589 =cut
591 sub _translate_params {
592 my ($self) = @_;
593 # Get option string
594 my ($params, $switches, $join, $dash, $translat) =
595 @{$self->{_options}}{qw(_params _switches _join _dash _translation)};
597 # access the multiple dash choices of _setparams...
598 my @dash_args;
599 $dash ||= 1; # default as advertised
600 for ($dash) {
601 $_ eq '1' && do {
602 @dash_args = ( -dash => 1 );
603 last;
605 /^s/ && do { #single dash only
606 @dash_args = ( -dash => 1);
607 last;
609 /^d/ && do { # double dash only
610 @dash_args = ( -double_dash => 1);
611 last;
613 /^m/ && do { # mixed dash: one-letter opts get -,
614 # long opts get --
615 @dash_args = ( -mixed_dash => 1);
616 last;
618 do {
619 $self->warn( "Dash spec '$dash' not recognized; using 'single'" );
620 @dash_args = ( -dash => 1 );
623 my $options = $self->_setparams(
624 -params => $params,
625 -switches => $switches,
626 -join => $join,
627 @dash_args
630 # Translate options
631 my @options = split(/(\s|$join)/, $options);
632 for (my $i = 0; $i < scalar @options; $i++) {
633 my ($prefix, $name) = ( $options[$i] =~ m/^(-{0,2})(.+)$/ );
634 if (defined $name) {
635 if ($name =~ /command/i) {
636 $name = $options[$i+2]; # get the command
637 splice @options, $i, 4;
638 $i--;
639 # don't add the command if this is a pseudo-program
640 unshift @options, $name unless ($self->is_pseudo); # put command first
642 elsif (defined $$translat{$name}) {
643 $options[$i] = $prefix.$$translat{$name};
646 else {
647 splice @options, $i, 1;
648 $i--;
651 $options = join('', @options);
653 # this is a kludge for mixed options: the reason mixed doesn't
654 # work right on the pass through _setparams is that the
655 # *aliases* and not the actual params are passed to it.
656 # here we just rejigger the dashes
657 if ($dash =~ /^m/) {
658 $options =~ s/--([a-z0-9](?:\s|$))/-$1/gi;
661 # Now arrayify the options
662 @options = split(' ', $options);
664 return \@options;
667 =head2 executable()
669 Title : executable
670 Usage :
671 Function: find the full path to the main executable,
672 or to the command executable for pseudo-programs
673 Returns : full path, if found
674 Args : [optional] explicit path to the executable
675 (will set the appropriate command exec if
676 applicable)
677 [optional] boolean flag whether or not to warn when exe no found
678 Note : overrides WrapperBase.pm
680 =cut
682 sub executable {
683 my $self = shift;
684 my ($exe, $warn) = @_;
685 if ($self->is_pseudo) {
686 return $self->{_pathtoexe} = $self->executables($self->command,$exe);
689 # otherwise
690 # setter
691 if (defined $exe) {
692 $self->throw("binary '$exe' does not exist") unless -e $exe;
693 $self->throw("'$exe' is not executable") unless -x $exe;
694 return $self->{_pathtoexe} = $exe;
697 # getter
698 return $self->{_pathtoexe} if defined $self->{_pathstoexe};
700 # finder
701 return $self->{_pathtoexe} = $self->_find_executable($exe, $warn);
704 =head2 executables()
706 Title : executables
707 Usage :
708 Function: find the full path to a command's executable
709 Returns : full path (scalar string)
710 Args : command (scalar string),
711 [optional] explicit path to this command exe
712 [optional] boolean flag whether or not to warn when exe no found
714 =cut
716 sub executables {
717 my $self = shift;
718 my ($cmd, $exe, $warn) = @_;
719 # for now, barf if this is not a pseudo program
720 $self->throw("This wrapper represents a single program with commands, not multiple programs; can't use executables()") unless $self->is_pseudo;
721 $self->throw("Command name required at arg 1") unless defined $cmd;
722 $self->throw("The desired executable '$cmd' is not registered as a command") unless grep /^$cmd$/, @{$self->{_options}->{_commands}};
724 # setter
725 if (defined $exe) {
726 $self->throw("binary '$exe' does not exist") unless -e $exe;
727 $self->throw("'$exe' is not executable") unless -x $exe;
728 $self->{_pathstoexe} = {} unless defined $self->{_pathstoexe};
729 return $self->{_pathstoexe}->{$cmd} = $exe;
732 # getter
733 return $self->{_pathstoexe}->{$cmd} if defined $self->{_pathstoexe}->{$cmd};
735 $exe ||= $cmd;
736 # finder
737 return $self->{_pathstoexe}->{$cmd} = $self->_find_executable($exe, $warn);
740 =head2 _find_executable()
742 Title : _find_executable
743 Usage : my $exe_path = $fac->_find_executable($exe, $warn);
744 Function: find the full path to a named executable,
745 Returns : full path, if found
746 Args : name of executable to find
747 [optional] boolean flag whether or not to warn when exe no found
748 Note : differs from executable and executables in not
749 setting any object attributes
751 =cut
753 sub _find_executable {
754 my $self = shift;
755 my ($exe, $warn) = @_;
757 if ($self->is_pseudo && !$exe) {
758 if (!$self->command) {
759 # this throw probably appropriate
760 # the rest are now warns if $warn.../maj
761 $self->throw(
762 "The ".__PACKAGE__." wrapper represents several different programs;".
763 "arg1 to _find_executable must be specified explicitly,".
764 "or the command() attribute set");
766 else {
767 $exe = $self->command;
770 $exe ||= $self->program_path;
772 my $path;
773 if ($self->program_dir) {
774 $path = File::Spec->catfile($self->program_dir, $exe);
775 } else {
776 $path = $exe;
777 $self->warn('Program directory not specified; use program_dir($path).') if $warn;
780 # use provided info - we are allowed to follow symlinks, but refuse directories
781 map { return $path.$_ if ( -x $path.$_ && !(-d $path.$_) ) } ('', '.exe') if defined $path;
783 # couldn't get path to executable from provided info, so use system path
784 $path = $path ? " in $path" : undef;
785 $self->warn("Executable $exe not found$path, trying system path...") if $warn;
786 if ($path = $self->io->exists_exe($exe)) {
787 return $path;
788 } else {
789 $self->warn("Cannot find executable for program '".($self->is_pseudo ? $self->command : $self->program_name)."'") if $warn;
790 return;
794 =head2 _register_composite_commands()
796 Title : _register_composite_commands
797 Usage :
798 Function: adds subcomand params and switches for composite commands
799 Returns : true on success
800 Args : \%composite_commands,
801 \@program_params,
802 \@program_switches
804 =cut
806 sub _register_composite_commands {
807 my $self = shift;
808 my ($composite_commands, $program_params,
809 $program_switches, $command_prefixes) = @_;
810 my @sub_params;
811 my @sub_switches;
812 foreach my $cmd (keys %$composite_commands) {
813 my $pfx = $command_prefixes->{$cmd} || $cmd;
814 foreach my $subcmd ( @{$$composite_commands{$cmd}} ) {
815 my $spfx = $command_prefixes->{$subcmd} || $subcmd;
816 my @sub_program_params = grep /^$spfx\|/, @$program_params;
817 my @sub_program_switches = grep /^$spfx\|/, @$program_switches;
818 for (@sub_program_params) {
819 m/^$spfx\|(.*)/;
820 push @sub_params, "$pfx\|${spfx}_".$1;
822 for (@sub_program_switches) {
823 m/^$spfx\|(.*)/;
824 push @sub_switches, "$pfx\|${spfx}_".$1;
828 push @$program_params, @sub_params;
829 push @$program_switches, @sub_switches;
830 # translations for subcmd params/switches not necessary
831 return 1;
834 =head2 _create_factory_set()
836 Title : _create_factory_set
837 Usage : @facs = $self->_create_factory_set
838 Function: instantiate a set of individual command factories for
839 a given composite command
840 Factories will have the correct parameter fields set for
841 their own subcommand
842 Returns : hash of factories: ( $subcmd_prefix => $subcmd_factory, ... )
843 Args : none
845 =cut
847 sub _create_factory_set {
848 my $self = shift;
849 $self->throw('command not set') unless $self->command;
850 my $cmd = $self->command;
851 $self->throw('_create_factory_set only works on composite commands')
852 unless grep /^$cmd$/, keys %{$self->{_options}->{_composite_commands}};
853 my %ret;
854 my $class = ref $self;
855 my $subargs_hash = $self->_collate_subcmd_args($cmd);
856 for (keys %$subargs_hash) {
857 $ret{$_} = $class->new( -command => $_, @{$$subargs_hash{$_}} );
859 return %ret;
862 =head2 _collate_subcmd_args()
864 Title : _collate_subcmd_args
865 Usage : $args_hash = $self->_collate_subcmd_args
866 Function: collate parameters and switches into command-specific
867 arg lists for passing to new()
868 Returns : hash of named argument lists
869 Args : [optional] composite cmd prefix (scalar string)
870 [default is 'run']
872 =cut
874 sub _collate_subcmd_args {
875 my $self = shift;
876 my $cmd = shift;
877 my %ret;
878 # default command is 'run'
879 $cmd ||= 'run';
880 return unless $self->{'_options'}->{'_composite_commands'};
881 return unless $self->{'_options'}->{'_composite_commands'}->{$cmd};
882 my @subcmds = @{$self->{'_options'}->{'_composite_commands'}->{$cmd}};
884 my $cur_options = $self->{'_options'};
885 # collate
886 foreach my $subcmd (@subcmds) {
887 # find the composite cmd form of the argument in
888 # the current params and switches
889 # e.g., map_max_mismatches
890 my $pfx = $self->{_options}->{_prefixes}->{$subcmd} || $subcmd;
891 my @params = grep /^${pfx}_/, @{$$cur_options{'_params'}};
892 my @switches = grep /^${pfx}_/, @{$$cur_options{'_switches'}};
893 $ret{$subcmd} = [];
894 # create an argument list suitable for passing to new() of
895 # the subcommand factory...
896 foreach my $opt (@params, @switches) {
897 my $subopt = $opt;
898 $subopt =~ s/^${pfx}_//;
899 push(@{$ret{$subcmd}}, '-'.$subopt => $self->$opt) if defined $self->$opt;
902 return \%ret;
905 =head2 _run
907 Title : _run
908 Usage : $fac->_run( @file_args )
909 Function: Run a command as specified during object contruction
910 Returns : true on success
911 Args : a specification of the files to operate on according
912 to the filespec
914 =cut
916 sub _run {
917 my ($self, @args) = @_;
918 # _translate_params will provide an array of command/parameters/switches
919 # -- these are set at object construction
920 # to set up the run, need to add the files to the call
921 # -- provide these as arguments to this function
922 my $cmd = $self->command if $self->can('command');
923 my $opts = $self->{_options};
924 my %args;
925 $self->throw("No command specified for the object") unless $cmd;
926 # setup files necessary for this command
927 my $filespec = $opts->{'_files'}->{$cmd};
928 my @switches;
929 my ($in, $out, $err);
930 # some applications rely completely on switches
931 if (defined $filespec && @$filespec) {
932 # parse args based on filespec
933 # require named args
934 $self->throw("Named args are required") unless !(@args % 2);
935 s/^-// for @args;
936 %args = @args;
937 # validate
938 my @req = map {
939 my $s = $_;
940 $s =~ s/^-.*\|//;
941 $s =~ s/^[012]?[<>]//;
942 $s =~ s/[^a-zA-Z0-9_]//g;
944 } grep !/[#]/, @$filespec;
945 !defined($args{$_}) && $self->throw("Required filearg '$_' not specified") for @req;
946 # set up redirects and file switches
947 for (@$filespec) {
948 m/^1?>#?(.*)/ && do {
949 defined($args{$1}) && ( open($out,">", $args{$1}) or $self->throw("Open for write error : $!"));
950 next;
952 m/^2>#?(.*)/ && do {
953 defined($args{$1}) && (open($err, ">", $args{$1}) or $self->throw("Open for write error : $!"));
954 next;
956 m/^<#?(.*)/ && do {
957 defined($args{$1}) && (open($in, "<", $args{$1}) or $self->throw("Open for read error : $!"));
958 next;
960 if (m/^-(.*)\|/) {
961 push @switches, $self->_dash_switch($1);
962 } else {
963 push @switches, undef;
967 my $dum;
968 $in || ($in = \$dum);
969 $out || ($out = \$self->{'stdout'});
970 $err || ($err = \$self->{'stderr'});
972 # Get program executable
973 my $exe = $self->executable;
974 $self->throw("Can't find executable for '".($self->is_pseudo ? $self->command : $self->program_name)."'; can't continue") unless $exe;
975 # Use double quotes if executable path have empty spaces
976 if ($exe =~ m/ /) {
977 $exe = "\"$exe\"";
980 # Get command-line options
981 my $options = $self->_translate_params();
982 # Get file specs sans redirects in correct order
983 my @specs = map {
984 my $s = $_;
985 $s =~ s/^-.*\|//;
986 $s =~ s/[^a-zA-Z0-9_]//g;
988 } grep !/[<>]/, @$filespec;
989 my @files = @args{@specs};
990 # expand arrayrefs
991 my $l = $#files;
993 # Note: below code block may be brittle, see link on this:
994 # http://lists.open-bio.org/pipermail/bioperl-l/2010-June/033439.html
996 for (0..$l) {
997 if (ref($files[$_]) eq 'ARRAY') {
998 splice(@switches, $_, 1, ($switches[$_]) x @{$files[$_]});
999 splice(@files, $_, 1, @{$files[$_]});
1004 @files = map {
1005 my $s = shift @switches;
1006 defined $_ ? ($s, $_): ()
1007 } @files;
1008 @files = map { defined $_ ? $_ : () } @files; # squish undefs
1009 my @ipc_args = ( $exe, @$options, @files );
1010 $self->{_last_execution} = join( $self->{'_options'}->{'_join'}, @ipc_args );
1011 eval {
1012 IPC::Run::run(\@ipc_args, $in, $out, $err) or
1013 die ("There was a problem running $exe : ".$$err);
1016 if ($@) {
1017 $self->throw("$exe call crashed: $@") unless $self->no_throw_on_crash;
1018 return 0;
1021 return 1;
1026 =head2 no_throw_on_crash()
1028 Title : no_throw_on_crash
1029 Usage :
1030 Function: prevent throw on execution error
1031 Returns :
1032 Args : [optional] boolean
1034 =cut
1036 sub no_throw_on_crash {
1037 my $self = shift;
1038 return $self->{'_no_throw'} = shift if @_;
1039 return $self->{'_no_throw'};
1042 =head2 last_execution()
1044 Title : last_execution
1045 Usage :
1046 Function: return the last executed command with options
1047 Returns : string of command line sent to IPC::Run
1048 Args :
1050 =cut
1052 sub last_execution {
1053 my $self = shift;
1054 return $self->{'_last_execution'};
1057 =head2 _dash_switch()
1059 Title : _dash_switch
1060 Usage : $version = $fac->_dash_switch( $switch )
1061 Function: Returns an appropriately dashed switch for the executable
1062 Args : A string containing a switch without dashes
1063 Returns : string containing an appropriately dashed switch for the current executable
1065 =cut
1067 sub _dash_switch {
1068 my ($self, $switch) = @_;
1070 my $dash = $self->{'_options'}->{'_dash'};
1071 for ($dash) {
1072 $_ eq '1' && do {
1073 $switch = '-'.$switch;
1074 last;
1076 /^s/ && do { #single dash only
1077 $switch = '-'.$switch;
1078 last;
1080 /^d/ && do { # double dash only
1081 $switch = '--'.$switch;
1082 last;
1084 /^m/ && do { # mixed dash: one-letter opts get -,
1085 $switch = '-'.$switch;
1086 $switch =~ s/^(-[a-z0-9](?:\w+))$/-$1/i;
1087 last;
1089 do {
1090 $self->warn( "Dash spec '$dash' not recognized; using 'single'" );
1091 $switch = '-'.$switch;
1095 return $switch;
1098 =head2 stdout()
1100 Title : stdout
1101 Usage : $fac->stdout()
1102 Function: store the output from STDOUT for the run,
1103 if no file specified in _run arguments
1104 Example :
1105 Returns : scalar string
1106 Args : on set, new value (a scalar or undef, optional)
1108 =cut
1110 sub stdout {
1111 my $self = shift;
1112 return $self->{'stdout'} = shift if @_;
1113 return $self->{'stdout'};
1116 =head2 stderr()
1118 Title : stderr
1119 Usage : $fac->stderr()
1120 Function: store the output from STDERR for the run,
1121 if no file is specified in _run arguments
1122 Example :
1123 Returns : scalar string
1124 Args : on set, new value (a scalar or undef, optional)
1126 =cut
1128 sub stderr {
1129 my $self = shift;
1130 return $self->{'stderr'} = shift if @_;
1131 return $self->{'stderr'};
1134 =head2 is_pseudo()
1136 Title : is_pseudo
1137 Usage : $obj->is_pseudo($newval)
1138 Function: returns true if this factory represents
1139 a pseudo-program
1140 Example :
1141 Returns : value of is_pseudo (boolean)
1142 Args : on set, new value (a scalar or undef, optional)
1144 =cut
1146 sub is_pseudo {
1147 my $self = shift;
1149 return $self->{'is_pseudo'} = shift if @_;
1150 return $self->{'is_pseudo'};
1153 =head2 AUTOLOAD
1155 AUTOLOAD permits
1157 $class->new_yourcommand(@args);
1159 as an alias for
1161 $class->new( -command => 'yourcommand', @args );
1163 =cut
1165 sub AUTOLOAD {
1166 my $class = shift;
1167 my $tok = $AUTOLOAD;
1168 my @args = @_;
1169 $tok =~ s/.*:://;
1170 unless ($tok =~ /^new_/) {
1171 $class->throw("Can't locate object method '$tok' via package '".ref($class)?ref($class):$class);
1173 my ($cmd) = $tok =~ m/new_(.*)/;
1174 return $class->new( -command => $cmd, @args );
1177 =head1 Bio:ParameterBaseI compliance
1179 =head2 set_parameters()
1181 Title : set_parameters
1182 Usage : $pobj->set_parameters(%params);
1183 Function: sets the parameters listed in the hash or array
1184 Returns : true on success
1185 Args : [optional] hash or array of parameter/values.
1187 =cut
1189 sub set_parameters {
1190 my ($self, @args) = @_;
1192 # currently stored stuff
1193 my $opts = $self->{'_options'};
1194 my $params = $opts->{'_params'};
1195 my $switches = $opts->{'_switches'};
1196 my $translation = $opts->{'_translation'};
1197 my $use_dash = $opts->{'_dash'};
1198 my $join = $opts->{'_join'};
1199 unless (($self->can('command') && $self->command)
1200 || (grep /command/, @args)) {
1201 push @args, '-command', 'run';
1203 my %args = @args;
1204 my $cmd = $args{'-command'} || $args{'command'} || ($self->can('command') && $self->command);
1205 if ($cmd) {
1206 my (@p,@s, %x);
1207 $self->warn('Command present, but no commands registered') unless $self->{'_options'}->{'_commands'};
1208 $self->throw("Command '$cmd' not registered") unless grep /^$cmd$/, @{$self->{'_options'}->{'_commands'}};
1209 $cmd = $self->{_options}->{_prefixes}->{$cmd} || $cmd;
1211 @p = (grep(!/^.*?\|/, @$params), grep(/^${cmd}\|/, @$params));
1212 @s = (grep(!/^.*?\|/, @$switches), grep(/^${cmd}\|/, @$switches));
1213 s/.*?\|// for @p;
1214 s/.*?\|// for @s;
1215 @x{@p, @s} = @{$translation}{
1216 grep( !/^.*?\|/, @$params, @$switches),
1217 grep(/^${cmd}\|/, @$params, @$switches) };
1218 $opts->{_translation} = $translation = \%x;
1219 $opts->{_params} = $params = \@p;
1220 $opts->{_switches} = $switches = \@s;
1222 $self->_set_from_args(
1223 \@args,
1224 -methods => [ @$params, @$switches, 'program_name', 'program_dir', 'out_type' ],
1225 -create => 1,
1226 # when our parms are accessed, signal parameters are unchanged for
1227 # future reads (until set_parameters is called)
1228 -code =>
1229 ' my $self = shift;
1230 $self->parameters_changed(0);
1231 return $self->{\'_\'.$method} = shift if @_;
1232 return $self->{\'_\'.$method};'
1234 # the question is, are previously-set parameters left alone when
1235 # not specified in @args?
1236 $self->parameters_changed(1);
1237 return 1;
1240 =head2 reset_parameters()
1242 Title : reset_parameters
1243 Usage : resets values
1244 Function: resets parameters to either undef or value in passed hash
1245 Returns : none
1246 Args : [optional] hash of parameter-value pairs
1248 =cut
1250 sub reset_parameters {
1251 my ($self, @args) = @_;
1253 my @reset_args;
1254 # currently stored stuff
1255 my $opts = $self->{'_options'};
1256 my $params = $opts->{'_params'};
1257 my $switches = $opts->{'_switches'};
1258 my $translation = $opts->{'_translation'};
1259 my $qual_param = $opts->{'_qual_param'};
1260 my $use_dash = $opts->{'_dash'};
1261 my $join = $opts->{'_join'};
1263 # handle command name
1264 my %args = @args;
1265 my $cmd = $args{'-command'} || $args{'command'} || $self->command;
1266 $args{'command'} = $cmd;
1267 delete $args{'-command'};
1268 @args = %args;
1269 # don't like this, b/c _set_program_args will create a bunch of
1270 # accessors with undef values, but oh well for now /maj
1272 for my $p (@$params) {
1273 push(@reset_args, $p => undef) unless grep /^[-]?$p$/, @args;
1275 for my $s (@$switches) {
1276 push(@reset_args, $s => undef) unless grep /^[-]?$s$/, @args;
1278 push @args, @reset_args;
1279 $self->set_parameters(@args);
1280 $self->parameters_changed(1);
1283 =head2 parameters_changed()
1285 Title : parameters_changed
1286 Usage : if ($pobj->parameters_changed) {...}
1287 Function: Returns boolean true (1) if parameters have changed
1288 Returns : Boolean (0 or 1)
1289 Args : [optional] Boolean
1291 =cut
1293 sub parameters_changed {
1294 my $self = shift;
1295 return $self->{'_parameters_changed'} = shift if @_;
1296 return $self->{'_parameters_changed'};
1299 =head2 available_parameters()
1301 Title : available_parameters
1302 Usage : @params = $pobj->available_parameters()
1303 Function: Returns a list of the available parameters
1304 Returns : Array of parameters
1305 Args : 'params' for settable program parameters
1306 'switches' for boolean program switches
1307 default: all
1309 =cut
1311 sub available_parameters {
1312 my $self = shift;
1313 my $subset = shift;
1314 my $opts = $self->{'_options'};
1315 my @ret;
1316 for ($subset) {
1317 (!defined || /^a/) && do {
1318 @ret = (@{$opts->{'_params'}}, @{$opts->{'_switches'}});
1319 last;
1321 m/^p/i && do {
1322 @ret = @{$opts->{'_params'}};
1323 last;
1325 m/^s/i && do {
1326 @ret = @{$opts->{'_switches'}};
1327 last;
1329 m/^c/i && do {
1330 @ret = @{$opts->{'_commands'}};
1331 last;
1333 m/^f/i && do { # get file spec
1334 return @{$opts->{'_files'}->{$self->command}};
1336 do { #fail
1337 $self->throw("available_parameters: unrecognized subset");
1340 return @ret;
1343 sub available_commands { shift->available_parameters('commands') }
1344 sub filespec { shift->available_parameters('filespec') }
1346 =head2 get_parameters()
1348 Title : get_parameters
1349 Usage : %params = $pobj->get_parameters;
1350 Function: Returns list of key-value pairs of parameter => value
1351 Returns : List of key-value pairs
1352 Args : [optional] A string is allowed if subsets are wanted or (if a
1353 parameter subset is default) 'all' to return all parameters
1355 =cut
1357 sub get_parameters {
1358 my $self = shift;
1359 my $subset = shift;
1360 $subset ||= 'all';
1361 my @ret;
1362 my $opts = $self->{'_options'};
1363 for ($subset) {
1364 m/^p/i && do { #params only
1365 for (@{$opts->{'_params'}}) {
1366 push(@ret, $_, $self->$_) if $self->can($_) && defined $self->$_;
1368 last;
1370 m/^s/i && do { #switches only
1371 for (@{$opts->{'_switches'}}) {
1372 push(@ret, $_, $self->$_) if $self->can($_) && defined $self->$_;
1374 last;
1376 m/^a/i && do { # all
1377 for ((@{$opts->{'_params'}},@{$opts->{'_switches'}})) {
1378 push(@ret, $_, $self->$_) if $self->can($_) && defined $self->$_;
1380 last;
1382 do {
1383 $self->throw("get_parameters: unrecognized subset");
1386 return @ret;