fix spelling errors, fixes #3228
[bioperl-live.git] / Bio / Tools / Run / WrapperBase / CommandExts.pm
blob9f05189defb3d8854479feff2d77c7d52f384dc0
2 # BioPerl module for Bio::Tools::Run::WrapperBase::CommandExts
4 # Please direct questions and support issues to <bioperl-l@bioperl.org>
6 # Cared for by Mark A. Jensen <maj -at- fortinbras -dot- us>
8 # Copyright Mark A. Jensen
10 # You may distribute this module under the same terms as perl itself
12 # POD documentation - main docs before the code
14 =head1 NAME
16 Bio::Tools::Run::WrapperBase::CommandExts - Extensions to WrapperBase for handling programs with commands *ALPHA*
18 =head1 SYNOPSIS
20 Devs, see L</DEVELOPER INTERFACE>.
21 Users, see L</USER INTERFACE>.
23 =head1 DESCRIPTION
25 This is a developer-focused experimental module. The main idea is to
26 extend L<Bio::Tools::Run::WrapperBase> to make it relatively easy to
27 create run wrappers around I<suites> of related programs, like
28 C<samtools> or C<blast+>.
30 Some definitions:
32 =over
34 =item * program
36 The program is the command-line frontend application. C<samtools>, for example, is run from the command line as follows:
38 $ samtools view -bS in.bam > out.sam
39 $ samtools faidx
41 =item * command
43 The command is the specific component of a suite run by executing the
44 program. In the example above, C<view> and C<faidx> are commands.
46 =item * command prefix
48 The command prefix is an abbreviation of the command name used
49 internally by C<CommandExts> method, and sometimes by the user of the
50 factory for specifying command line parameters to subcommands of
51 composite commands.
53 =item * composite command
55 A composite command is a pipeline or script representing a series of
56 separate executions of different commands. Composite commands can be
57 specified by configuring C<CommandExts> appropriately; the composite
58 command can be run by the user from a factory in the same way as
59 ordinary commands.
61 =item * options, parameters, switches and filespecs
63 An option is any command-line option; i.e., a specification set off by
64 a command-line by a specifier (like C<-v> or C<--outfile>). Parameters
65 are command-line options that accept a value (C<-title mydb>);
66 switches are boolean flags (C<--no-filter>). Filespecs are barewords
67 at the end of the command line that usually indicate input or output
68 files. In this module, this includes files that capture STDIN, STDOUT,
69 or STDERR via redirection.
71 =item * pseudo-program
73 A "pseudo-program" is a way to refer to a collection of related
74 applications that are run independently from the command line, rather
75 than via a frontend program. The C<blast+> suite of programs is an
76 example: C<blastn>, C<makeblastdb>, etc. C<CommandExts> can be
77 configured to create a single factory for a suite of related,
78 independent programs that treats each independent program as a
79 "pseudo-program" command.
81 =back
83 This module essentially adds the non-assembler-specific wrapper
84 machinery of fangly's L<Bio::Tools::Run::AssemblerBase> to the
85 L<Bio::Tools::Run::WrapperBase> namespace, adding the general
86 command-handling capability of L<Bio::Tools::Run::BWA>. It creates run
87 factories that are automatically Bio::ParameterBaseI compliant,
88 meaning that C<available_parameters()>, C<set_parameters()>,
89 C<get_parameters>, C<reset_parameters()>, and C<parameters_changed()>
90 are available.
92 =head1 DEVELOPER INTERFACE
94 C<CommandExts> is currently set up to read particular package globals
95 which define the program, the commands available, command-line options
96 for those commands, and human-readable aliases for those options.
98 The easiest way to use C<CommandExts> is probably to create two modules:
100 Bio::Tools::Run::YourRunPkg
101 Bio::Tools::Run::YourRunPkg::Config
103 The package globals should be defined in the C<Config> module, and the
104 run package itself should begin with the following mantra:
106 use YourRunPkg::Config;
107 use Bio::Tools::Run::WrapperBase;
108 use Bio::Tools::Run::WrapperBase::CommandExts;
109 sub new {
110 my $class = shift;
111 my @args = @_;
112 my $self = $class->SUPER::new(@args);
114 return $self;
117 The following globals can/should be defined in the C<Config> module:
119 $program_name
120 $program_dir
121 $use_dash
122 $join
123 @program_commands
124 %command_prefixes
125 @program_params
126 @program_switches
127 %param_translation
128 %composite_commands
129 %command_files
131 See L</Config Globals> for detailed descriptions.
133 The work of creating a run wrapper with C<CommandExts> lies mainly in
134 setting up the globals. The key methods for the developer interface are:
136 =over
138 =item * program_dir($path_to_programs)
140 Set this to point the factory to the executables.
142 =item * _run(@file_args)
144 Runs an instantiated factory with the given file args. Use in the
145 C<run()> method override.
147 =item * _create_factory_set()
149 Returns a hash of instantiated factories for each true command from a
150 composite command factory. The hash keys are the true command names, so
151 you could do
153 $cmds = $composite_fac->_create_factory_set;
154 for (@true_commands) {
155 $cmds->{$_}->_run(@file_args);
158 =item * executables($cmd,[$fullpath])
160 For pseudo-programs, this gets/sets the full path to the executable of
161 the true program corresponding to the command C<$cmd>.
163 =back
165 =head2 Implementing Composite Commands
167 =head2 Implementing Pseudo-programs
169 To indicate that a package wraps disparate programs under a single pseudo program, use an asterisk before the program name:
171 package Bio::Tools::Run::YourPkg::Config;
173 our $program_name = '*blast+';
175 and C<_run> will know what to do. Specify the rest of the globals as
176 if the desired programs were commands. Use the basename of the
177 programs for the command names.
179 If all the programs can be found in a single directory, just specify
180 that directory in C<program_dir()>. If not, use C<executables()> to set the paths to each program explicitly:
182 foreach (keys %cmdpaths) {
183 $self->executables($_, $cmdpaths{$_});
186 =head2 Config Globals
188 Here is an example config file. Further details in prose are below.
190 package Dummy::Config;
191 use strict;
192 use warnings;
193 no warnings qw(qw);
194 use Exporter;
195 our (@ISA, @EXPORT, @EXPORT_OK);
196 push @ISA, 'Exporter';
197 @EXPORT = qw(
198 $program_name
199 $program_dir
200 $use_dash
201 $join
202 @program_commands
203 %command_prefixes
204 @program_params
205 @program_switches
206 %param_translation
207 %command_files
208 %composite_commands
211 our $program_name = '*flurb';
212 our $program_dir = 'C:\cygwin\usr\local\bin';
213 our $use_dash = 'mixed';
214 our $join = ' ';
216 our @program_commands = qw(
217 rpsblast
219 goob
220 blorb
221 multiglob
224 our %command_prefixes = (
225 blastp => 'blp',
226 tblastn => 'tbn',
227 goob => 'g',
228 blorb => 'b',
229 multiglob => 'm'
232 our @program_params = qw(
233 command
234 g|narf
235 g|schlurb
236 b|scroob
237 b|frelb
238 m|trud
241 our @program_switches = qw(
242 g|freen
243 b|klep
246 our %param_translation = (
247 'g|narf' => 'n',
248 'g|schlurb' => 'schlurb',
249 'g|freen' => 'f',
250 'b|scroob' => 's',
251 'b|frelb' => 'frelb'
254 our %command_files = (
255 'goob' => [qw( fas faq )],
258 our %composite_commands = (
259 'multiglob' => [qw( blorb goob )]
263 C<$use_dash> can be one of C<single>, C<double>, or C<mixed>. See L<Bio::Tools::Run::WrapperBase>.
265 There is a syntax for the C<%command_files> specification. The token
266 matching C<[a-zA-Z0-9_]+> in each element of each arrayref becomes the
267 named filespec parameter for the C<_run()> method in the wrapper
268 class. Additional symbols surrounding this token indicate how this
269 argument should be handled. Some examples:
271 >out : stdout is redirected into the file
272 specified by (..., -out => $file,... )
273 <in : stdin is accepted from the file
274 specified by (..., -in => $file,... )
275 2>log : stderr is redirected into the file
276 specified by (..., -log => $file,... )
277 #opt : this filespec argument is optional
278 (no throw if -opt => $option is missing)
279 2>#log: if -log is not specified in the arguments, the stderr()
280 method will capture stderr
281 *lst : this filespec can take multiple arguments,
282 specify using an arrayref (..., -lst => [$file1, $file2], ...)
283 *#lst : an optional list
285 The tokens above are examples; they can be anything matching the above regexp.
287 =head1 USER INTERFACE
289 Using a wrapper created with C<Bio::Tools::Run::WrapperBase::CommandExts>:
291 =over
293 =item * Getting a list of available commands, parameters, and filespecs:
295 To get a list of commands, simply:
297 @commands = Bio::Tools::Run::ThePkg->available_commands;
299 The wrapper will generally have human-readable aliases for each of the
300 command-line options for the wrapped program and commands. To obtain a
301 list of the parameters and switches available for a particular
302 command, do
304 $factory = Bio::Tools::Run::ThePkg->new( -command => 'glurb' );
305 @params = $factory->available_parameters('params');
306 @switches = $factory->available_parameters('switches');
307 @filespec = $factory->available_parameters('filespec');
308 @filespec = $factory->filespec; # alias
310 =item * Create factories
312 The factory is a handle on the program and command you wish to
313 run. Create a factory using C<new> to set command-line parameters:
315 $factory = Bio::Tools::Run::ThePkg->new( -command => 'glurb',
316 -freen => 1,
317 -furschlugginer => 'vreeble' );
319 A shorthand for this is:
321 $factory = Bio::Tools::Run::ThePkg->new_glurb(
322 -freen => 1,
323 -furschlugginer => 'vreeble' );
325 =item * Running programs
327 To run the program, use the C<run> method, providing filespecs as arguments
329 $factory = Bio::Tools::Run::ThePkg->new_assemble( -min_qual => 63 );
330 $factory->run( -faq1 => 'read1.fq', -faq2 => 'read2.fq',
331 -ref => 'refseq.fas', -out => 'new.sam' );
332 # do another
333 $factory->run( -faq1 => 'read-old1.fq', -faq2 => 'read-old2.fq',
334 -ref => 'refseq.fas', -out => 'old.sam' );
336 Messages on STDOUT and STDERR are dumped into their respective attributes:
338 $stdout = $factory->stdout;
339 $stderr = $factory->stderr;
341 unless STDOUT and/or STDERR are part of the named files in the filespec.
343 =item * Setting/getting/resetting/polling parameters.
345 A C<CommandExts>-based factory is always L<Bio::ParameterBaseI>
346 compliant. That means that you may set, get, and reset parameters
347 using C<set_parameters()>, C<get_parameters()>, and
348 C<reset_parameters>. You can ask whether parameters have changed since
349 they were last accessed by using the predicate
350 C<parameters_changed>. See L<Bio::ParameterBaseI> for more details.
352 Once set, parameters become attributes of the factory. Thus, you can get their values as follows:
354 if ($factory->freen) {
355 $furs = $factory->furshlugginer;
356 #...
359 =back
361 =head1 FEEDBACK
363 =head2 Mailing Lists
365 User feedback is an integral part of the evolution of this and other
366 Bioperl modules. Send your comments and suggestions preferably to
367 the Bioperl mailing list. Your participation is much appreciated.
369 bioperl-l@bioperl.org - General discussion
370 http://bioperl.org/wiki/Mailing_lists - About the mailing lists
372 =head2 Support
374 Please direct usage questions or support issues to the mailing list:
376 L<bioperl-l@bioperl.org>
378 rather than to the module maintainer directly. Many experienced and
379 reponsive experts will be able look at the problem and quickly
380 address it. Please include a thorough description of the problem
381 with code and data examples if at all possible.
383 =head2 Reporting Bugs
385 Report bugs to the Bioperl bug tracking system to help us keep track
386 of the bugs and their resolution. Bug reports can be submitted via
387 the web:
389 https://redmine.open-bio.org/projects/bioperl/
391 =head1 AUTHOR - Mark A. Jensen
393 Email maj -at- fortinbras -dot- us
395 Describe contact details here
397 =head1 CONTRIBUTORS
399 Dan Kortschak ( dan -dot- kortschak -at- adelaide -dot- edu -dot- au )
401 =head1 APPENDIX
403 The rest of the documentation details each of the object methods.
404 Internal methods are usually preceded with a _
406 =cut
408 # Let the code begin...
410 package Bio::Tools::Run::WrapperBase; # need these methods in WrapperBase/maj
411 use strict;
412 use warnings;
413 no warnings qw(redefine);
415 use Bio::Root::Root;
416 use File::Spec;
417 use IPC::Run;
418 use base qw(Bio::Root::Root Bio::ParameterBaseI);
420 our $AUTOLOAD;
422 =head2 new()
424 Title : new
425 Usage :
426 Function: constructor for WrapperBase::CommandExts ;
427 correctly binds configuration variables
428 to the WrapperBase object
429 Returns : Bio::Tools::Run::WrapperBase object with command extensions
430 Args :
431 Note : this method subsumes the old _register_program_commands and
432 _set_program_options, leaving out the assembler-specific
433 parms ($qual_param and out_type())
435 =cut
437 sub new {
438 my ($class, @args) = @_;
439 my $self = bless ({}, $class);
440 # pull in *copies* of the Config variables from the caller namespace:
441 my ($pkg, @goob) = caller();
442 my ($commands,
443 $prefixes,
444 $params,
445 $switches,
446 $translation,
447 $use_dash,
448 $join,
449 $name,
450 $dir,
451 $composite_commands,
452 $files);
453 for (qw( @program_commands
454 %command_prefixes
455 @program_params
456 @program_switches
457 %param_translation
458 $use_dash
459 $join
460 $program_name
461 $program_dir
462 %composite_commands
463 %command_files ) ) {
464 my ($sigil, $var) = m/(.)(.*)/;
465 my $qualvar = "${sigil}${pkg}::${var}";
466 for ($sigil) {
467 /\@/ && do { $qualvar = "\[$qualvar\]" };
468 /\%/ && do { $qualvar = "\{$qualvar\}" };
470 my $locvar = "\$${var}";
471 $locvar =~ s/program_|command_|param_//g;
472 eval "$locvar = $qualvar";
474 # set up the info registry hash
475 my %registry;
476 if ($composite_commands) {
477 $self->_register_composite_commands($composite_commands,
478 $params,
479 $switches,
480 $prefixes);
482 @registry{qw( _commands _prefixes _files
483 _params _switches _translation
484 _composite_commands )} =
485 ($commands, $prefixes, $files,
486 $params, $switches, $translation,
487 $composite_commands);
488 $self->{_options} = \%registry;
489 if (not defined $use_dash) {
490 $self->{'_options'}->{'_dash'} = 1;
491 } else {
492 $self->{'_options'}->{'_dash'} = $use_dash;
494 if (not defined $join) {
495 $self->{'_options'}->{'_join'} = ' ';
496 } else {
497 $self->{'_options'}->{'_join'} = $join;
499 if ($name =~ /^\*/) {
500 $self->is_pseudo(1);
501 $name =~ s/^\*//;
503 $self->program_name($name) if not defined $self->program_name();
504 $self->program_dir($dir) if not defined $self->program_dir();
505 $self->set_parameters(@args);
506 $self->parameters_changed(1); # set on instantiation, per Bio::ParameterBaseI
507 return $self;
510 =head2 program_name
512 Title : program_name
513 Usage : $factory->program_name($name)
514 Function: get/set the executable name
515 Returns: string
516 Args : string
518 =cut
520 sub program_name {
521 my ($self, $val) = @_;
522 $self->{'_program_name'} = $val if $val;
523 return $self->{'_program_name'};
526 =head2 program_dir
528 Title : program_dir
529 Usage : $factory->program_dir($dir)
530 Function: get/set the program dir
531 Returns: string
532 Args : string
534 =cut
536 sub program_dir {
537 my ($self, $val) = @_;
538 $self->{'_program_dir'} = $val if $val;
539 return $self->{'_program_dir'};
542 =head2 _register_program_commands()
544 Title : _register_program_commands
545 Usage : $factory->_register_program_commands( \@commands, \%prefixes )
546 Function: Register the commands a program accepts (for programs that act
547 as frontends for a set of commands, each command having its own
548 set of params/switches)
549 Returns : true on success
550 Args : arrayref to a list of commands (scalar strings),
551 hashref to a translation table of the form
552 { $prefix1 => $command1, ... } [optional]
553 Note : To implement a program with this kind of calling structure,
554 include a parameter called 'command' in the
555 @program_params global
556 Note : The translation table is used to associate parameters and
557 switches specified in _set_program_options with the correct
558 program command. In the globals @program_params and
559 @program_switches, specify elements as 'prefix1|param' and
560 'prefix1|switch', etc.
562 =cut
564 =head2 _set_program_options
566 Title : _set_program_options
567 Usage : $factory->_set_program_options( \@ args );
568 Function: Register the parameters and flags that an assembler takes.
569 Returns : 1 for success
570 Args : - arguments passed by the user
571 - parameters that the program accepts, optional (default: none)
572 - switches that the program accepts, optional (default: none)
573 - parameter translation, optional (default: no translation occurs)
574 - dash option for the program parameters, [1|single|double|mixed],
575 optional (default: yes, use single dashes only)
576 - join, optional (default: ' ')
578 =cut
580 =head2 _translate_params
582 Title : _translate_params
583 Usage : @options = $assembler->_translate_params( );
584 Function: Translate the Bioperl arguments into the arguments to pass to the
585 program on the command line
586 Returns : Arrayref of arguments
587 Args : none
589 =cut
591 sub _translate_params {
592 my ($self) = @_;
593 # Get option string
594 my ($params, $switches, $join, $dash, $translat) =
595 @{$self->{_options}}{qw(_params _switches _join _dash _translation)};
597 # access the multiple dash choices of _setparams...
598 my @dash_args;
599 $dash ||= 1; # default as advertised
600 for ($dash) {
601 $_ eq '1' && do {
602 @dash_args = ( -dash => 1 );
603 last;
605 /^s/ && do { #single dash only
606 @dash_args = ( -dash => 1);
607 last;
609 /^d/ && do { # double dash only
610 @dash_args = ( -double_dash => 1);
611 last;
613 /^m/ && do { # mixed dash: one-letter opts get -,
614 # long opts get --
615 @dash_args = ( -mixed_dash => 1);
616 last;
618 do {
619 $self->warn( "Dash spec '$dash' not recognized; using 'single'" );
620 @dash_args = ( -dash => 1 );
623 my $options = $self->_setparams(
624 -params => $params,
625 -switches => $switches,
626 -join => $join,
627 @dash_args
630 # Translate options
631 my @options = split(/(\s|$join)/, $options);
632 for (my $i = 0; $i < scalar @options; $i++) {
633 my ($prefix, $name) = ( $options[$i] =~ m/^(-{0,2})(.+)$/ );
634 if (defined $name) {
635 if ($name =~ /command/i) {
636 $name = $options[$i+2]; # get the command
637 splice @options, $i, 4;
638 $i--;
639 # don't add the command if this is a pseudo-program
640 unshift @options, $name unless ($self->is_pseudo); # put command first
642 elsif (defined $$translat{$name}) {
643 $options[$i] = $prefix.$$translat{$name};
646 else {
647 splice @options, $i, 1;
648 $i--;
651 $options = join('', @options);
653 # this is a kludge for mixed options: the reason mixed doesn't
654 # work right on the pass through _setparams is that the
655 # *aliases* and not the actual params are passed to it.
656 # here we just rejigger the dashes
657 if ($dash =~ /^m/) {
658 $options =~ s/--([a-z0-9](?:\s|$))/-$1/gi;
661 # Now arrayify the options
662 @options = split(' ', $options);
664 return \@options;
667 =head2 executable()
669 Title : executable
670 Usage :
671 Function: find the full path to the main executable,
672 or to the command executable for pseudo-programs
673 Returns : full path, if found
674 Args : [optional] explicit path to the executable
675 (will set the appropriate command exec if
676 applicable)
677 [optional] boolean flag whether or not to warn when exe no found
678 Note : overrides WrapperBase.pm
680 =cut
682 sub executable {
683 my $self = shift;
684 my ($exe, $warn) = @_;
685 if ($self->is_pseudo) {
686 return $self->{_pathtoexe} = $self->executables($self->command,$exe);
689 # otherwise
690 # setter
691 if (defined $exe) {
692 $self->throw("binary '$exe' does not exist") unless -e $exe;
693 $self->throw("'$exe' is not executable") unless -x $exe;
694 return $self->{_pathtoexe} = $exe;
697 # getter
698 return $self->{_pathtoexe} if defined $self->{_pathstoexe};
700 # finder
701 return $self->{_pathtoexe} = $self->_find_executable($exe, $warn);
704 =head2 executables()
706 Title : executables
707 Usage :
708 Function: find the full path to a command's executable
709 Returns : full path (scalar string)
710 Args : command (scalar string),
711 [optional] explicit path to this command exe
712 [optional] boolean flag whether or not to warn when exe no found
714 =cut
716 sub executables {
717 my $self = shift;
718 my ($cmd, $exe, $warn) = @_;
719 # for now, barf if this is not a pseudo program
720 $self->throw("This wrapper represents a single program with commands, not multiple programs; can't use executables()") unless $self->is_pseudo;
721 $self->throw("Command name required at arg 1") unless defined $cmd;
722 $self->throw("The desired executable '$cmd' is not registered as a command") unless grep /^$cmd$/, @{$self->{_options}->{_commands}};
724 # setter
725 if (defined $exe) {
726 $self->throw("binary '$exe' does not exist") unless -e $exe;
727 $self->throw("'$exe' is not executable") unless -x $exe;
728 $self->{_pathstoexe} = {} unless defined $self->{_pathstoexe};
729 return $self->{_pathstoexe}->{$cmd} = $exe;
732 # getter
733 return $self->{_pathstoexe}->{$cmd} if defined $self->{_pathstoexe}->{$cmd};
735 $exe ||= $cmd;
736 # finder
737 return $self->{_pathstoexe}->{$cmd} = $self->_find_executable($exe, $warn);
740 =head2 _find_executable()
742 Title : _find_executable
743 Usage : my $exe_path = $fac->_find_executable($exe, $warn);
744 Function: find the full path to a named executable,
745 Returns : full path, if found
746 Args : name of executable to find
747 [optional] boolean flag whether or not to warn when exe no found
748 Note : differs from executable and executables in not
749 setting any object attributes
751 =cut
753 sub _find_executable {
754 my $self = shift;
755 my ($exe, $warn) = @_;
757 if ($self->is_pseudo && !$exe) {
758 if (!$self->command) {
759 # this throw probably appropriate
760 # the rest are now warns if $warn.../maj
761 $self->throw(
762 "The ".__PACKAGE__." wrapper represents several different programs;".
763 "arg1 to _find_executable must be specified explicitly,".
764 "or the command() attribute set");
766 else {
767 $exe = $self->command;
770 $exe ||= $self->program_path;
772 my $path;
773 if ($self->program_dir) {
774 $path = File::Spec->catfile($self->program_dir, $exe);
775 } else {
776 $path = $exe;
777 $self->warn('Program directory not specified; use program_dir($path).') if $warn;
780 # use provided info - we are allowed to follow symlinks, but refuse directories
781 map { return $path.$_ if ( -x $path.$_ && !(-d $path.$_) ) } ('', '.exe') if defined $path;
783 # couldn't get path to executable from provided info, so use system path
784 $path = $path ? " in $path" : undef;
785 $self->warn("Executable $exe not found$path, trying system path...") if $warn;
786 if ($path = $self->io->exists_exe($exe)) {
787 return $path;
788 } else {
789 $self->warn("Cannot find executable for program '".($self->is_pseudo ? $self->command : $self->program_name)."'") if $warn;
790 return;
794 =head2 _register_composite_commands()
796 Title : _register_composite_commands
797 Usage :
798 Function: adds subcomand params and switches for composite commands
799 Returns : true on success
800 Args : \%composite_commands,
801 \@program_params,
802 \@program_switches
804 =cut
806 sub _register_composite_commands {
807 my $self = shift;
808 my ($composite_commands, $program_params,
809 $program_switches, $command_prefixes) = @_;
810 my @sub_params;
811 my @sub_switches;
812 foreach my $cmd (keys %$composite_commands) {
813 my $pfx = $command_prefixes->{$cmd} || $cmd;
814 foreach my $subcmd ( @{$$composite_commands{$cmd}} ) {
815 my $spfx = $command_prefixes->{$subcmd} || $subcmd;
816 my @sub_program_params = grep /^$spfx\|/, @$program_params;
817 my @sub_program_switches = grep /^$spfx\|/, @$program_switches;
818 for (@sub_program_params) {
819 m/^$spfx\|(.*)/;
820 push @sub_params, "$pfx\|${spfx}_".$1;
822 for (@sub_program_switches) {
823 m/^$spfx\|(.*)/;
824 push @sub_switches, "$pfx\|${spfx}_".$1;
828 push @$program_params, @sub_params;
829 push @$program_switches, @sub_switches;
830 # translations for subcmd params/switches not necessary
831 return 1;
834 =head2 _create_factory_set()
836 Title : _create_factory_set
837 Usage : @facs = $self->_create_factory_set
838 Function: instantiate a set of individual command factories for
839 a given composite command
840 Factories will have the correct parameter fields set for
841 their own subcommand
842 Returns : hash of factories: ( $subcmd_prefix => $subcmd_factory, ... )
843 Args : none
845 =cut
847 sub _create_factory_set {
848 my $self = shift;
849 $self->throw('command not set') unless $self->command;
850 my $cmd = $self->command;
851 $self->throw('_create_factory_set only works on composite commands')
852 unless grep /^$cmd$/, keys %{$self->{_options}->{_composite_commands}};
853 my %ret;
854 my $class = ref $self;
855 my $subargs_hash = $self->_collate_subcmd_args($cmd);
856 for (keys %$subargs_hash) {
857 $ret{$_} = $class->new( -command => $_, @{$$subargs_hash{$_}} );
859 return %ret;
862 =head2 _collate_subcmd_args()
864 Title : _collate_subcmd_args
865 Usage : $args_hash = $self->_collate_subcmd_args
866 Function: collate parameters and switches into command-specific
867 arg lists for passing to new()
868 Returns : hash of named argument lists
869 Args : [optional] composite cmd prefix (scalar string)
870 [default is 'run']
872 =cut
874 sub _collate_subcmd_args {
875 my $self = shift;
876 my $cmd = shift;
877 my %ret;
878 # default command is 'run'
879 $cmd ||= 'run';
880 return unless $self->{'_options'}->{'_composite_commands'};
881 return unless $self->{'_options'}->{'_composite_commands'}->{$cmd};
882 my @subcmds = @{$self->{'_options'}->{'_composite_commands'}->{$cmd}};
884 my $cur_options = $self->{'_options'};
885 # collate
886 foreach my $subcmd (@subcmds) {
887 # find the composite cmd form of the argument in
888 # the current params and switches
889 # e.g., map_max_mismatches
890 my $pfx = $self->{_options}->{_prefixes}->{$subcmd} || $subcmd;
891 my @params = grep /^${pfx}_/, @{$$cur_options{'_params'}};
892 my @switches = grep /^${pfx}_/, @{$$cur_options{'_switches'}};
893 $ret{$subcmd} = [];
894 # create an argument list suitable for passing to new() of
895 # the subcommand factory...
896 foreach my $opt (@params, @switches) {
897 my $subopt = $opt;
898 $subopt =~ s/^${pfx}_//;
899 push(@{$ret{$subcmd}}, '-'.$subopt => $self->$opt) if defined $self->$opt;
902 return \%ret;
905 =head2 _run
907 Title : _run
908 Usage : $fac->_run( @file_args )
909 Function: Run a command as specified during object contruction
910 Returns : true on success
911 Args : a specification of the files to operate on according
912 to the filespec
914 =cut
916 sub _run {
917 my ($self, @args) = @_;
918 # _translate_params will provide an array of command/parameters/switches
919 # -- these are set at object construction
920 # to set up the run, need to add the files to the call
921 # -- provide these as arguments to this function
922 my $cmd = $self->command if $self->can('command');
923 my $opts = $self->{_options};
924 my %args;
925 $self->throw("No command specified for the object") unless $cmd;
926 # setup files necessary for this command
927 my $filespec = $opts->{'_files'}->{$cmd};
928 my @switches;
929 my ($in, $out, $err);
930 # some applications rely completely on switches
931 if (defined $filespec && @$filespec) {
932 # parse args based on filespec
933 # require named args
934 $self->throw("Named args are required") unless !(@args % 2);
935 s/^-// for @args;
936 %args = @args;
937 # validate
938 my @req = map {
939 my $s = $_;
940 $s =~ s/^-.*\|//;
941 $s =~ s/^[012]?[<>]//;
942 $s =~ s/[^a-zA-Z0-9_]//g;
944 } grep !/[#]/, @$filespec;
945 !defined($args{$_}) && $self->throw("Required filearg '$_' not specified") for @req;
946 # set up redirects and file switches
947 for (@$filespec) {
948 m/^1?>#?(.*)/ && do {
949 defined($args{$1}) && ( open($out,">", $args{$1}) or $self->throw("Open for write error : $!"));
950 next;
952 m/^2>#?(.*)/ && do {
953 defined($args{$1}) && (open($err, ">", $args{$1}) or $self->throw("Open for write error : $!"));
954 next;
956 m/^<#?(.*)/ && do {
957 defined($args{$1}) && (open($in, "<", $args{$1}) or $self->throw("Open for read error : $!"));
958 next;
960 if (m/^-(.*)\|/) {
961 push @switches, $self->_dash_switch($1);
962 } else {
963 push @switches, undef;
967 my $dum;
968 $in || ($in = \$dum);
969 $out || ($out = \$self->{'stdout'});
970 $err || ($err = \$self->{'stderr'});
972 # Get program executable
973 my $exe = $self->executable;
974 $self->throw("Can't find executable for '".($self->is_pseudo ? $self->command : $self->program_name)."'; can't continue") unless $exe;
975 # Get command-line options
976 my $options = $self->_translate_params();
977 # Get file specs sans redirects in correct order
978 my @specs = map {
979 my $s = $_;
980 $s =~ s/^-.*\|//;
981 $s =~ s/[^a-zA-Z0-9_]//g;
983 } grep !/[<>]/, @$filespec;
984 my @files = @args{@specs};
985 # expand arrayrefs
986 my $l = $#files;
988 # Note: below code block may be brittle, see link on this:
989 # http://lists.open-bio.org/pipermail/bioperl-l/2010-June/033439.html
991 for (0..$l) {
992 if (ref($files[$_]) eq 'ARRAY') {
993 splice(@switches, $_, 1, ($switches[$_]) x @{$files[$_]});
994 splice(@files, $_, 1, @{$files[$_]});
999 @files = map {
1000 my $s = shift @switches;
1001 defined $_ ? ($s, $_): ()
1002 } @files;
1003 @files = map { defined $_ ? $_ : () } @files; # squish undefs
1004 my @ipc_args = ( $exe, @$options, @files );
1005 $self->{_last_execution} = join( $self->{'_options'}->{'_join'}, @ipc_args );
1006 eval {
1007 IPC::Run::run(\@ipc_args, $in, $out, $err) or
1008 die ("There was a problem running $exe : ".$$err);
1011 if ($@) {
1012 $self->throw("$exe call crashed: $@") unless $self->no_throw_on_crash;
1013 return 0;
1016 return 1;
1021 =head2 no_throw_on_crash()
1023 Title : no_throw_on_crash
1024 Usage :
1025 Function: prevent throw on execution error
1026 Returns :
1027 Args : [optional] boolean
1029 =cut
1031 sub no_throw_on_crash {
1032 my $self = shift;
1033 return $self->{'_no_throw'} = shift if @_;
1034 return $self->{'_no_throw'};
1037 =head2 last_execution()
1039 Title : last_execution
1040 Usage :
1041 Function: return the last executed command with options
1042 Returns : string of command line sent to IPC::Run
1043 Args :
1045 =cut
1047 sub last_execution {
1048 my $self = shift;
1049 return $self->{'_last_execution'};
1052 =head2 _dash_switch()
1054 Title : _dash_switch
1055 Usage : $version = $fac->_dash_switch( $switch )
1056 Function: Returns an appropriately dashed switch for the executable
1057 Args : A string containing a switch without dashes
1058 Returns : string containing an appropriately dashed switch for the current executable
1060 =cut
1062 sub _dash_switch {
1063 my ($self, $switch) = @_;
1065 my $dash = $self->{'_options'}->{'_dash'};
1066 for ($dash) {
1067 $_ eq '1' && do {
1068 $switch = '-'.$switch;
1069 last;
1071 /^s/ && do { #single dash only
1072 $switch = '-'.$switch;
1073 last;
1075 /^d/ && do { # double dash only
1076 $switch = '--'.$switch;
1077 last;
1079 /^m/ && do { # mixed dash: one-letter opts get -,
1080 $switch = '-'.$switch;
1081 $switch =~ s/^(-[a-z0-9](?:\w+))$/-$1/i;
1082 last;
1084 do {
1085 $self->warn( "Dash spec '$dash' not recognized; using 'single'" );
1086 $switch = '-'.$switch;
1090 return $switch;
1093 =head2 stdout()
1095 Title : stdout
1096 Usage : $fac->stdout()
1097 Function: store the output from STDOUT for the run,
1098 if no file specified in _run arguments
1099 Example :
1100 Returns : scalar string
1101 Args : on set, new value (a scalar or undef, optional)
1103 =cut
1105 sub stdout {
1106 my $self = shift;
1107 return $self->{'stdout'} = shift if @_;
1108 return $self->{'stdout'};
1111 =head2 stderr()
1113 Title : stderr
1114 Usage : $fac->stderr()
1115 Function: store the output from STDERR for the run,
1116 if no file is specified in _run arguments
1117 Example :
1118 Returns : scalar string
1119 Args : on set, new value (a scalar or undef, optional)
1121 =cut
1123 sub stderr {
1124 my $self = shift;
1125 return $self->{'stderr'} = shift if @_;
1126 return $self->{'stderr'};
1129 =head2 is_pseudo()
1131 Title : is_pseudo
1132 Usage : $obj->is_pseudo($newval)
1133 Function: returns true if this factory represents
1134 a pseudo-program
1135 Example :
1136 Returns : value of is_pseudo (boolean)
1137 Args : on set, new value (a scalar or undef, optional)
1139 =cut
1141 sub is_pseudo {
1142 my $self = shift;
1144 return $self->{'is_pseudo'} = shift if @_;
1145 return $self->{'is_pseudo'};
1148 =head2 AUTOLOAD
1150 AUTOLOAD permits
1152 $class->new_yourcommand(@args);
1154 as an alias for
1156 $class->new( -command => 'yourcommand', @args );
1158 =cut
1160 sub AUTOLOAD {
1161 my $class = shift;
1162 my $tok = $AUTOLOAD;
1163 my @args = @_;
1164 $tok =~ s/.*:://;
1165 unless ($tok =~ /^new_/) {
1166 $class->throw("Can't locate object method '$tok' via package '".ref($class)?ref($class):$class);
1168 my ($cmd) = $tok =~ m/new_(.*)/;
1169 return $class->new( -command => $cmd, @args );
1172 =head1 Bio:ParameterBaseI compliance
1174 =head2 set_parameters()
1176 Title : set_parameters
1177 Usage : $pobj->set_parameters(%params);
1178 Function: sets the parameters listed in the hash or array
1179 Returns : true on success
1180 Args : [optional] hash or array of parameter/values.
1182 =cut
1184 sub set_parameters {
1185 my ($self, @args) = @_;
1187 # currently stored stuff
1188 my $opts = $self->{'_options'};
1189 my $params = $opts->{'_params'};
1190 my $switches = $opts->{'_switches'};
1191 my $translation = $opts->{'_translation'};
1192 my $use_dash = $opts->{'_dash'};
1193 my $join = $opts->{'_join'};
1194 unless (($self->can('command') && $self->command)
1195 || (grep /command/, @args)) {
1196 push @args, '-command', 'run';
1198 my %args = @args;
1199 my $cmd = $args{'-command'} || $args{'command'} || ($self->can('command') && $self->command);
1200 if ($cmd) {
1201 my (@p,@s, %x);
1202 $self->warn('Command present, but no commands registered') unless $self->{'_options'}->{'_commands'};
1203 $self->throw("Command '$cmd' not registered") unless grep /^$cmd$/, @{$self->{'_options'}->{'_commands'}};
1204 $cmd = $self->{_options}->{_prefixes}->{$cmd} || $cmd;
1206 @p = (grep(!/^.*?\|/, @$params), grep(/^${cmd}\|/, @$params));
1207 @s = (grep(!/^.*?\|/, @$switches), grep(/^${cmd}\|/, @$switches));
1208 s/.*?\|// for @p;
1209 s/.*?\|// for @s;
1210 @x{@p, @s} = @{$translation}{
1211 grep( !/^.*?\|/, @$params, @$switches),
1212 grep(/^${cmd}\|/, @$params, @$switches) };
1213 $opts->{_translation} = $translation = \%x;
1214 $opts->{_params} = $params = \@p;
1215 $opts->{_switches} = $switches = \@s;
1217 $self->_set_from_args(
1218 \@args,
1219 -methods => [ @$params, @$switches, 'program_name', 'program_dir', 'out_type' ],
1220 -create => 1,
1221 # when our parms are accessed, signal parameters are unchanged for
1222 # future reads (until set_parameters is called)
1223 -code =>
1224 ' my $self = shift;
1225 $self->parameters_changed(0);
1226 return $self->{\'_\'.$method} = shift if @_;
1227 return $self->{\'_\'.$method};'
1229 # the question is, are previously-set parameters left alone when
1230 # not specified in @args?
1231 $self->parameters_changed(1);
1232 return 1;
1235 =head2 reset_parameters()
1237 Title : reset_parameters
1238 Usage : resets values
1239 Function: resets parameters to either undef or value in passed hash
1240 Returns : none
1241 Args : [optional] hash of parameter-value pairs
1243 =cut
1245 sub reset_parameters {
1246 my ($self, @args) = @_;
1248 my @reset_args;
1249 # currently stored stuff
1250 my $opts = $self->{'_options'};
1251 my $params = $opts->{'_params'};
1252 my $switches = $opts->{'_switches'};
1253 my $translation = $opts->{'_translation'};
1254 my $qual_param = $opts->{'_qual_param'};
1255 my $use_dash = $opts->{'_dash'};
1256 my $join = $opts->{'_join'};
1258 # handle command name
1259 my %args = @args;
1260 my $cmd = $args{'-command'} || $args{'command'} || $self->command;
1261 $args{'command'} = $cmd;
1262 delete $args{'-command'};
1263 @args = %args;
1264 # don't like this, b/c _set_program_args will create a bunch of
1265 # accessors with undef values, but oh well for now /maj
1267 for my $p (@$params) {
1268 push(@reset_args, $p => undef) unless grep /^[-]?$p$/, @args;
1270 for my $s (@$switches) {
1271 push(@reset_args, $s => undef) unless grep /^[-]?$s$/, @args;
1273 push @args, @reset_args;
1274 $self->set_parameters(@args);
1275 $self->parameters_changed(1);
1278 =head2 parameters_changed()
1280 Title : parameters_changed
1281 Usage : if ($pobj->parameters_changed) {...}
1282 Function: Returns boolean true (1) if parameters have changed
1283 Returns : Boolean (0 or 1)
1284 Args : [optional] Boolean
1286 =cut
1288 sub parameters_changed {
1289 my $self = shift;
1290 return $self->{'_parameters_changed'} = shift if @_;
1291 return $self->{'_parameters_changed'};
1294 =head2 available_parameters()
1296 Title : available_parameters
1297 Usage : @params = $pobj->available_parameters()
1298 Function: Returns a list of the available parameters
1299 Returns : Array of parameters
1300 Args : 'params' for settable program parameters
1301 'switches' for boolean program switches
1302 default: all
1304 =cut
1306 sub available_parameters {
1307 my $self = shift;
1308 my $subset = shift;
1309 my $opts = $self->{'_options'};
1310 my @ret;
1311 for ($subset) {
1312 (!defined || /^a/) && do {
1313 @ret = (@{$opts->{'_params'}}, @{$opts->{'_switches'}});
1314 last;
1316 m/^p/i && do {
1317 @ret = @{$opts->{'_params'}};
1318 last;
1320 m/^s/i && do {
1321 @ret = @{$opts->{'_switches'}};
1322 last;
1324 m/^c/i && do {
1325 @ret = @{$opts->{'_commands'}};
1326 last;
1328 m/^f/i && do { # get file spec
1329 return @{$opts->{'_files'}->{$self->command}};
1331 do { #fail
1332 $self->throw("available_parameters: unrecognized subset");
1335 return @ret;
1338 sub available_commands { shift->available_parameters('commands') }
1339 sub filespec { shift->available_parameters('filespec') }
1341 =head2 get_parameters()
1343 Title : get_parameters
1344 Usage : %params = $pobj->get_parameters;
1345 Function: Returns list of key-value pairs of parameter => value
1346 Returns : List of key-value pairs
1347 Args : [optional] A string is allowed if subsets are wanted or (if a
1348 parameter subset is default) 'all' to return all parameters
1350 =cut
1352 sub get_parameters {
1353 my $self = shift;
1354 my $subset = shift;
1355 $subset ||= 'all';
1356 my @ret;
1357 my $opts = $self->{'_options'};
1358 for ($subset) {
1359 m/^p/i && do { #params only
1360 for (@{$opts->{'_params'}}) {
1361 push(@ret, $_, $self->$_) if $self->can($_) && defined $self->$_;
1363 last;
1365 m/^s/i && do { #switches only
1366 for (@{$opts->{'_switches'}}) {
1367 push(@ret, $_, $self->$_) if $self->can($_) && defined $self->$_;
1369 last;
1371 m/^a/i && do { # all
1372 for ((@{$opts->{'_params'}},@{$opts->{'_switches'}})) {
1373 push(@ret, $_, $self->$_) if $self->can($_) && defined $self->$_;
1375 last;
1377 do {
1378 $self->throw("get_parameters: unrecognized subset");
1381 return @ret;