Installing Bioperl on Windows

     * 3 Installation using the Perl Package Manager
          * 3.1 GUI Installation
          * 3.2 Command-line Installation
     * 4 Installation using CPAN or manual installation
     * 7 Bioperl on Windows
          * 8.1 Setting environment variables
          * 8.2 Installing bioperl-db
     * 10 bioperl-db in Cygwin
     * 12 MySQL and DBD::mysql
     * 14 Directory for temporary files

This installation guide was written by Barry Moore, Nathan Haigh
and other Bioperl authors based on the original work of Paul Boutros. The
guide was updated for the BioPerl wiki by Chris Fields and Nathan

Please report problems and/or fixes to the BioPerl mailing list.

An up-to-date version of this document can be found on the BioPerl wiki:

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows

Only ActivePerl >= 5.8.8.819 is supported by the Bioperl team. Earlier
versions may work, but we do not support them.

One of the reasons for this requirement is that ActivePerl >= 5.8.8.819
uses Perl Package Manager 4 (PPM4). PPM4 is superior to earlier
versions and also includes a Graphical User Interface (GUI). In short,
it's easier for us to produce and maintain a package for installation via
PPM and also easier for you to do the install! Proceed with earlier
versions at your own risk.
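
If you are not sure which Perl you have, a short script can tell you. The
following is only a sketch: it assumes that Win32::BuildNumber() is
available (it is provided by ActivePerl builds, not by other Perl
distributions), and the helper name check_activeperl is ours, not part of
Bioperl.

```perl
use strict;
use warnings;

# Minimum versions taken from the requirement above: Perl 5.8.8, build 819.
my $MIN_PERL  = 5.008008;
my $MIN_BUILD = 819;

# Return a short verdict for a Perl version number (in the format of $])
# and an ActivePerl build number (undef if this is not ActivePerl).
sub check_activeperl {
    my ($perl_version, $build) = @_;
    return 'too old'        if $perl_version < $MIN_PERL;
    return 'not ActivePerl' if !defined $build;
    return 'too old'        if $perl_version == $MIN_PERL && $build < $MIN_BUILD;
    return 'supported';
}

# Assumption: Win32::BuildNumber() only exists on ActivePerl builds.
eval { require Win32 };
my $build = defined &Win32::BuildNumber ? Win32::BuildNumber() : undef;
print "Perl $]: ", check_activeperl($], $build), "\n";
```

On anything other than ActivePerl the script reports 'not ActivePerl'
rather than failing.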

To install ActivePerl:

    1) Download the ActivePerl MSI from ActiveState

    2) Run the ActivePerl Installer (accepting all defaults is fine).

Installation using the Perl Package Manager

GUI Installation

    1) Start the Perl Package Manager GUI from the Start menu.

    2) Go to Edit >> Preferences and click the Repositories tab. Add a
       new repository for each of the following:

       +----------------------------------------------------------------+
       |BioPerl-Release Candidates|http://bioperl.org/DIST/RC           |
       |--------------------------+-------------------------------------|
       |BioPerl-Regular Releases  |http://bioperl.org/DIST              |
       |--------------------------+-------------------------------------|
       |Kobes                     |http://theoryx5.uwinnipeg.ca/ppms    |
       |--------------------------+-------------------------------------|
       |Bribes                    |http://www.Bribes.org/perl/ppm       |
       +----------------------------------------------------------------+

    3) Select View >> All Packages.

    4) In the search box type bioperl.

    5) Right click the latest version of Bioperl available and choose
       Install.

       5a) From bioperl 1.5.2 onward, all 'optional' pre-requisites will
       be marked for installation. If you see that some of them complain
       about needing a command-line installation (e.g. XML::SAX::ExpatXS),
       and you want those particular pre-requisites, stop now (skip step
       6) and see the 'Command-line Installation' section.

    6) Click the green arrow (Run marked actions) to complete the
       installation.

Command-line Installation

Use the ActiveState ppm-shell:

    1) Open a cmd window by going to Start >> Run, typing 'cmd' and
       pressing return.

    3) Make sure you have the module PPM-Repositories. Try

       ppm> install PPM-Repositories

    4) For BioPerl 1.6.1, we require at least the following
       repositories. You may have some present already.

       ppm> repo add http://bioperl.org/DIST
       ppm> repo add uwinnipeg
       ppm> repo add trouchelle

       Because you have installed PPM-Repositories, PPM will know
       your Perl version, and select the correct repo from the

    5) Install BioPerl (not "bioperl").

       If you are running ActiveState Perl 5.10, you may have a
       glitch involving SOAP::Lite. Use the following workaround:

       1) Get the index numbers for your active repositories:

          | 1 | 11431 | ActiveState Package Repository |
          | 2 |    14 | bioperl.org                    |
          | 3 |   291 | uwinnipeg                      |
          | 4 | 11755 | trouchelle                     |

       2) Execute the following commands. (The session here is
          based on the above table. Substitute the correct index
          numbers for your situation.)

          rem -turn off ActiveState, trouchelle repos
          rem -to get SOAP-Lite-0.69 from uwinnipeg...
          ppm> install SOAP-Lite
          rem -turn ActiveState, trouchelle back on...

Installation using CPAN or manual installation

Installation using PPM is preferred since it is easier, but if you run
into problems, or a ppm isn't available for the version/package of bioperl
you want, or you want to choose which optional dependencies to install,
you can install manually by downloading the appropriate package or by
using CPAN. Both methods ultimately need nmake to be installed, CPAN to
be upgraded to >= v1.81, Module::Build to be installed (>= v0.2805) and
Test::Harness to be upgraded to >= v2.62:

    2) Double-click to run it, which extracts 3 files. Move both the
       NMAKE.EXE and NMAKE.ERR files to a place in your PATH; if set
       up properly, you can move these to your Perl bin directory,
       normally C:\Perl\bin.

    1) Open a cmd window by going to Start >> Run, typing 'cmd'
       into the box and pressing return.

    2) Type 'cpan' to enter the CPAN shell.

    3) At the cpan> prompt, type 'install CPAN' to upgrade to the
       latest version.

    4) Quit (by typing 'q') and reload cpan. You may be asked some
       configuration questions; accepting defaults is fine.

    5) At the cpan> prompt, type 'o conf prefer_installer MB' to tell
       CPAN to prefer to use Build.PL scripts for installation. Type 'o
       conf commit' to save that choice.

    6) At the cpan> prompt, type 'install Module::Build'.

    7) At the cpan> prompt, type 'install Test::Harness'.
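
To confirm that the upgrades took effect, you can compare the installed
versions with the minimums quoted above. This is a minimal sketch using
only core Perl; the helper name module_status is ours, and the version
numbers are simply the ones given in this guide.

```perl
use strict;
use warnings;

# Minimum versions quoted in this guide.
my %minimum = (
    'CPAN'          => '1.81',
    'Module::Build' => '0.2805',
    'Test::Harness' => '2.62',
);

# Return 'ok', 'too old' or 'missing' for one module.
sub module_status {
    my ($module, $min_version) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return 'missing' unless eval { require $file; 1 };
    # UNIVERSAL::VERSION dies if the installed version is lower than requested.
    return eval { $module->VERSION($min_version); 1 } ? 'ok' : 'too old';
}

for my $module (sort keys %minimum) {
    printf "%-15s >= %-7s %s\n", $module, $minimum{$module},
        module_status($module, $minimum{$module});
}
```

Any line that does not end in 'ok' points at a step above that needs to be
repeated.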

You can now follow the unix instructions for installing using CPAN, or
install manually:

    8) Download the .zip version of the package you want.

    9) Extract the archive in the normal way.

    10) In a cmd window 'cd' to the directory you extracted to. E.g. if
        you extracted to directory 'Temp', 'cd Temp\bioperl-1.5.2_100'

    11) Type 'perl Build.PL' and answer the questions appropriately.

    12) Type 'perl Build test'. All the tests should pass, but if they
        don't, let us know. Your usage of Bioperl may not be affected
        by the failure, so you can choose to continue anyway.

    13) Type 'perl Build install' to install Bioperl.
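
Whichever installation route you took, a quick way to see whether Bioperl
is usable is to load Bio::Seq and do something trivial with it. The script
below is a sketch of such a smoke test; the helper name bioperl_smoke_test
and the test sequence are ours.

```perl
use strict;
use warnings;

# Build a tiny sequence object and return its reverse complement,
# or undef if Bio::Seq cannot be loaded (i.e. Bioperl is not installed).
sub bioperl_smoke_test {
    my ($dna) = @_;
    return undef unless eval { require Bio::Seq; 1 };
    my $seq = Bio::Seq->new(-display_id => 'test', -seq => $dna);
    return $seq->revcom->seq;
}

my $result = bioperl_smoke_test('ATGGCC');
print defined $result
    ? "Bioperl loaded; reverse complement of ATGGCC is $result\n"
    : "Bio::Seq could not be loaded; check the installation\n";
```

If Bioperl is installed correctly the reverse complement reported should be
GGCCAT.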

Bioperl is a large collection of Perl modules (extensions to the
Perl language) that aid in the task of writing Perl code to deal
with sequence data in a myriad of ways. Bioperl provides objects for
various types of sequence data and their associated features and
annotations. It provides interfaces for analysis of these sequences with a
wide variety of external programs (BLAST, FASTA, clustalw and
EMBOSS to name just a few). It provides interfaces to various types of
databases, both remote (GenBank, EMBL etc.) and local (MySQL,
flat files, GFF etc.), for storage and retrieval of
sequences. And finally, with its associated documentation and
mailing lists, Bioperl represents a community of bioinformatics
professionals working in Perl who are committed to supporting both
development of Bioperl and the new users who are drawn to the project.

While most bioinformatics and computational biology applications are
developed in UNIX/Linux environments, more and more programs are
being ported to other operating systems like Windows, and many users
(often biologists with little background in programming) are looking for
ways to automate bioinformatics analyses in the Windows environment.

Perl and Bioperl can be installed natively on Windows NT/2000/XP.
Most of the functionality of Bioperl is available with this type of
install. Much of the heavy lifting in bioinformatics is done by programs
originally developed in lower level languages like C and Pascal
(e.g. BLAST, clustalw, Staden etc.). Bioperl simply acts as
a wrapper for running and parsing output from these external programs.

Some of those programs (BLAST for example) are ported to Windows.
These can be installed and work quite happily with Bioperl in the native
Windows environment. Some external programs such as Staden and the
EMBOSS suite of programs can only be installed on Windows by using
Cygwin and its gcc C compiler (see Bioperl in Cygwin, below).
Recent attempts to port EMBOSS to Windows, however, have been mostly
successful.

If you have a fairly simple project in mind, want to start using Bioperl
quickly, only have access to a computer running Windows, and/or don't mind
bumping up against some limitations, then Bioperl on Windows may be a
good place for you to start. For example, downloading a bunch of sequences
from GenBank and sorting out the ones that have a particular
annotation or feature works great. Running a bunch of your sequences
against remote or local BLAST, parsing the output and storing it
in a MySQL database would be fine also.
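
As an illustration of the first kind of task, here is a sketch that reads
sequences from a GenBank-format file you have already downloaded and
prints the IDs of those carrying a particular feature. It assumes Bioperl
is installed; the script name filter_features.pl, the file name
sequences.gb and the helper ids_with_feature are made up for the example.

```perl
use strict;
use warnings;

# Return the display IDs of all records in a GenBank-format file that
# carry at least one feature with the given primary tag (e.g. 'CDS').
sub ids_with_feature {
    my ($file, $tag) = @_;
    require Bio::SeqIO;    # loaded lazily so the script compiles without Bioperl
    my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
    my @ids;
    while (my $seq = $in->next_seq) {
        push @ids, $seq->display_id
            if grep { $_->primary_tag eq $tag } $seq->get_SeqFeatures;
    }
    return @ids;
}

# Usage: perl filter_features.pl sequences.gb CDS
if (@ARGV == 2) {
    print "$_\n" for ids_with_feature(@ARGV);
}
```

The script only looks at top-level features; a real project would probably
also want to inspect feature tags and annotations.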

Be aware that most Bioperl developers are working in some type of a
UNIX environment (Linux, OS X, Cygwin). If you have
problems with Bioperl that are specific to the Windows environment, you
may be blazing new ground and your pleas for help on the Bioperl mailing
list may get few responses (you can but try!) - simply because no one
knows the answer to your Windows specific problem. If this is or becomes a
problem for you then you are better off working in some type of UNIX-like
environment. One solution to this problem that will keep you working on a
Windows machine is to install Cygwin, a UNIX emulation environment for
Windows. A number of Bioperl users are using this approach successfully
and it is discussed in more detail below.

There are a couple of ways of installing Perl on a Windows machine. The
most common and easiest is to get the most recent build from
ActiveState, a software company that provides free builds of Perl for
Windows users. The current (October 2006) build is ActivePerl 5.8.8.819.
Bioperl also works on Perl 5.6.x, but due to installation problems etc.,
only ActivePerl 5.8.8.819 or later is supported for WinXP installation.
To install ActivePerl on Windows:

    1) Download the ActivePerl MSI from
       http://www.activestate.com/Products/ActivePerl/.

    2) Run the ActivePerl Installer (accepting all defaults is fine).

You can also build Perl yourself (which requires a C compiler) or download
one of the other binary distributions. The Perl source for building it
yourself is available from CPAN, as are a few other binary
distributions that are alternatives to ActiveState. This approach is not
recommended unless you have specific reasons for doing so and know what
you're doing. If that's the case you probably don't need to be reading
this guide.

Cygwin is a UNIX emulation environment for Windows and comes with
its own copy of Perl.

Information on Cygwin and Bioperl is found below.

Perl is a programming language that has been extended a lot by the
addition of external modules.

These modules work with the core language to extend the functionality of
Perl.

Bioperl is one such extension to Perl. These modular extensions to
Perl sometimes depend on the functionality of other Perl modules and this
creates a dependency: you can't install module X unless you have already
installed module Y. Some Perl modules are so fundamentally useful that the
Perl developers have included them in the core distribution of Perl - if
you've installed Perl then these modules are already installed. Other
modules are freely available from CPAN, but you'll have to install them
yourself if you want to use them. Bioperl has such dependencies.

Bioperl is actually a large collection of Perl modules (over 1000
currently) and these modules are split into seven packages. These seven
packages are:

+------------------------------------------------------------------------+
|    Bioperl Group     |                    Functions                    |
|----------------------+-------------------------------------------------|
|bioperl (the core)    |Most of the main functionality of Bioperl        |
|----------------------+-------------------------------------------------|
|bioperl-run           |Wrappers to a lot of external programs           |
|----------------------+-------------------------------------------------|
|bioperl-ext           |Interaction with some alignment functions and the|
|----------------------+-------------------------------------------------|
|bioperl-db            |Using Bioperl with BioSQL and local relational   |
|                      |databases                                        |
|----------------------+-------------------------------------------------|
|bioperl-microarray    |Microarray specific functions                    |
|----------------------+-------------------------------------------------|
|bioperl-pedigree      |Manipulating genotype, marker, and individual    |
|                      |data for linkage studies                         |
|----------------------+-------------------------------------------------|
|bioperl-gui           |Some preliminary work on a graphical user        |
|                      |interface to some Bioperl functions              |
+------------------------------------------------------------------------+

The Bioperl core is what most new users will want to start with. Bioperl
(the core) and the Perl modules that it depends on can be easily installed
with the Perl Package Manager (PPM). PPM is an ActivePerl utility for
installing Perl modules on systems using ActivePerl. PPM will look online
(you have to be connected to the internet of course) for files (these
files end with .ppd) that tell it how to install the modules you want and
what other modules your new modules depend on. It will then download and
install your modules and all dependent modules for you.

These .ppd files are stored online in PPM repositories. ActiveState
maintains the largest PPM repository, and when you installed ActivePerl,
PPM was installed with directions for using the ActiveState repositories.
Unfortunately the ActiveState repositories are far from complete and other
ActivePerl users maintain their own PPM repositories to fill in the gaps.
Installing will require you to direct PPM to look in three new
repositories as detailed in the Installation Guide.

Once PPM knows where to look for Bioperl and its dependencies you simply
tell PPM to search for packages with a particular name, select those of
interest and then tell PPM to install the selected packages.

You may find that you want some of the features of other Bioperl groups
like bioperl-run or bioperl-db. Currently, plans include setting up PPM
packages for installing these parts of Bioperl; check this by doing a
Bioperl search in PPM. If these are not available, though, you can use
the following instructions for installing the other distributions.

For this you will need a Windows version of the program make
called nmake:

http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe

You will also want to have a willingness to experiment. You'll have to
read the installation documents for each component that you want to
install, and use nmake where the instructions call for make, like so:

'nmake test' will likely produce lots of warnings; many of these can be
safely ignored (they stem from the excessively paranoid '-w' flag in
ActivePerl). You will have to determine from the installation documents
what dependencies are required, and you will have to get them, read their
documentation and install them first. It is recommended that you look
through the PPM repositories for any modules before resorting to using
nmake, as there isn't any guarantee modules built using nmake will work.
The details of this are beyond the scope of this guide. Read the
documentation. Search Google. Try your best, and if you get stuck consult
with others on the BioPerl mailing list.

Setting environment variables

Some modules and tools, such as Bio::Tools::Run::StandAloneBlast and
clustal_w, require that environment variables are set; a few examples
are listed in the INSTALL document. Different versions of Windows use
different methods for setting these variables. NOTE: The instructions that
come with the BLAST executables for setting up BLAST on Windows are
out-of-date. Go to the following web address for instructions on setting
up standalone BLAST for Windows:
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/pc_setup.html

     * For Windows XP, go to Control Panel >> System >> Advanced >>
       Environment Variables. This does not require a reboot, but shells
       that are already open will not reflect the changes.
     * For older versions (Windows 95 to ME), generally editing the
       C:\autoexec.bat file to add a variable works. This requires a reboot.

     set BLASTDB=C:\blast\data

For either case, you can check the variable this way:

     C:\Documents and Settings\Administrator>echo %BLASTDB%

Some versions of Windows may have problems differentiating forward and
back slashes used for directories. In general, always use backslashes (\).
If something isn't working properly, try reversing the slashes to see if it
helps.
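
Since Bioperl reads these variables from Perl, it can be worth checking
them from Perl as well as from the cmd prompt. The following sketch
reports whether a variable is set and whether it names an existing
directory. The helper name describe_env_dir is ours; BLASTDB and BLASTDIR
are simply the variables mentioned in this guide.

```perl
use strict;
use warnings;
use File::Spec;

# Describe the state of an environment variable that should name a directory.
sub describe_env_dir {
    my ($name) = @_;
    my $value = $ENV{$name};
    return "$name is not set" unless defined $value && length $value;
    # canonpath() tidies the path; on Windows it also converts / to \.
    my $path = File::Spec->canonpath($value);
    return -d $path ? "$name = $path (directory exists)"
                    : "$name = $path (directory NOT found)";
}

print describe_env_dir($_), "\n" for qw(BLASTDB BLASTDIR);
```

Remember to run it from a shell opened after the variable was changed.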

For quirks in setting up Cygwin environment variables, see the example
below.

Installing bioperl-db

bioperl-db now works for Windows without installing Cygwin. This has
primarily been tested on WinXP using MySQL5, but it is expected that other
bioperl-db supported databases (PostgreSQL, Oracle) should work.

You will need Bioperl rel. 1.5.2, a relational database (I use MySQL5 here
as an example), and the Perl modules DBI and DBD::mysql, which
can be installed from PPM as described above (make sure the additional
repositories for Kobes and Bribes are added; they will have the latest
releases). Do NOT try using nmake with these modules as they will not
build correctly under Windows! The PPM builds, by Randy Kobes, have been
modified and tested specifically for Windows and ActivePerl.

NOTE: we plan on having a PPM for bioperl-db available along with the
regular bioperl 1.5.2 release PPM. We will post instructions at that
time on using PPM to install bioperl-db.

To begin, follow instructions detailed in the Installation Guide for
adding the three new repositories (Bioperl, Kobes and Bribes). Then
install the following packages:

The next step involves creating a database. The following steps are for
MySQL5:

     >mysqladmin -u root -p create bioseqdb
     Enter password: **********

The database needs to be loaded with the BioSQL schema, which is
distributed as a tarball by the BioSQL project.

     >mysql -u root -p bioseqdb < biosqldb-mysql.sql
     Enter password: **********

Download bioperl-db from CVS. Use the following to install the

Now, for testing out bioperl-db, make a copy of the file
DBHarness.conf.example in the bioperl-db test subdirectory (bioperl-db\t).
Rename it to DBHarness.biosql.conf, and modify it for your database setup
(particularly the user, password, database name, and driver). Save the
file, change back to the main bioperl-db directory, and run 'nmake test'.
You may see lots of the following lines,

     Subroutine Bio::Annotation::Reference::(eq redefined at C:/Perl/lib/overload.pm line 25,
     Subroutine new redefined at C:\Perl\src\bioperl\bioperl-live/Bio\Annotation\Reference.pm line 80,

which can be safely ignored (again, these come from ActivePerl's paranoid
'-w' flag). All tests should pass. NOTE: tests should be run with
a clean database with the BioSQL schema loaded, but w/o taxonomy loaded

It is recommended that you load the taxonomy database using the script
load_ncbi_taxonomy.pl included in biosql-schema\scripts. You will need to
download the latest taxonomy files. This can be accomplished using the
-download flag in load_ncbi_taxonomy.pl, but it will not 'untar' the file
correctly unless you have GNU tar present in your PATH (which most Windows
users will not have), thus causing the following error:

     >load_ncbi_taxonomy.pl -download -driver mysql -dbname bioseqdb -dbuser root -dbpass **********
     The system cannot find the path specified.
     Loading NCBI taxon database in taxdata:
             ... retrieving all taxon nodes in the database
             ... reading in taxon nodes from nodes.dmp
     Couldn't open data file taxdata/nodes.dmp: No such file or directory
     rollback ineffective with
     AutoCommit enabled at C:\Perl\src\bioperl\biosql-schema\scripts\load_ncbi_taxonomy.pl line 818.
     Rollback ineffective while AutoCommit is on at
     C:\Perl\src\bioperl\biosql-schema\scripts\load_ncbi_taxonomy.pl line 818.
     rollback failed: Rollback ineffective while AutoCommit is on

Use a file decompression utility like 7-Zip to 'untar' the files in
the folder (if using 7-Zip, this can be accomplished by right-clicking on
the file and using the option 'Extract here'). Rerun the script without
the -download flag to load the taxonomic information. Be patient, as this
can take quite a while:

     >load_ncbi_taxonomy.pl -driver mysql -dbname bioseqdb -dbuser root -dbpass **********
     Loading NCBI taxon database in taxdata:
             ... retrieving all taxon nodes in the database
             ... reading in taxon nodes from nodes.dmp
             ... insert / update / delete taxon nodes
             ... (committing nodes)
             ... rebuilding nested set left/right values
             ... reading in taxon names from names.dmp
             ... deleting old taxon names
             ... inserting new taxon names
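
If you would rather not install a separate utility, the archive can also
be unpacked from Perl. The sketch below assumes that the Archive::Tar
module (and IO::Zlib, for gzip-compressed archives) is available in your
Perl, and that the downloaded archive is called taxdump.tar.gz and sits in
the taxdata folder; check the actual file name on your system. The helper
name untar_into and the script name are ours.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd qw(getcwd);

# Unpack a tar archive (gzip-compressed or not) into a target directory
# and return the names of the entries it contained.
sub untar_into {
    my ($archive, $dir) = @_;
    # new() reads the whole archive into memory before we change directory.
    my $tar = Archive::Tar->new($archive)
        or die "Cannot read $archive: $Archive::Tar::error\n";
    my $start = getcwd();
    chdir $dir or die "Cannot change to $dir: $!\n";
    my @done = $tar->extract;    # extracts relative to the current directory
    chdir $start or die "Cannot return to $start: $!\n";
    die "Nothing was extracted from $archive\n" unless @done;
    return $tar->list_files;
}

# Usage: perl untar_taxdump.pl taxdata/taxdump.tar.gz taxdata
if (@ARGV == 2) {
    print "extracted $_\n" for untar_into(@ARGV);
}
```

After this, nodes.dmp and names.dmp should be in the taxdata folder and the
script can be rerun without the -download flag as shown above.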

Now, load the database with your sequences using the script
load_seqdatabase.pl, in bioperl-db's bioperl-db\scripts\biosql directory:

     C:\Perl\src\bioperl\bioperl-db\scripts\biosql>load_seqdatabase.pl -driver mysql
     -dbname bioseqdb -dbuser root -dbpass **********
     Loading NP_249092.gpt ...

You may see occasional errors depending on the sequence format, which is a
non-platform-related issue. Many of these are due to not having an updated
taxonomic database and may be rectified by updating the taxonomic
information as detailed in load_ncbi_taxonomy.pl's POD.

Thanks to Baohua Wang, who found the initial Windows-specific problem in
Bio::Root::Root that led to this fix, to Sendu Bala for fixing
Bug #1938, and to Hilmar Lapp for his input.

Cygwin is a Unix emulator and shell environment available free at
http://www.cygwin.com. Bioperl v. 1.* supposedly runs well within Cygwin,
though the latest release has not been tested with Cygwin yet. Some
users claim that installation of Bioperl is easier within Cygwin than
within Windows, but these may be users with UNIX backgrounds. A note on
Cygwin: it doesn't write to your Registry, it doesn't alter your system or
your existing files in any way, it doesn't create partitions, it simply
creates a cygwin/ directory and writes all of its files to that directory.
To uninstall Cygwin just delete that directory.

One advantage of using Bioperl in Cygwin is that all the external modules
are available through CPAN - the same cannot be said of ActiveState's PPM.

To get Bioperl running, first install the basic Cygwin package as well as
the Cygwin perl, make, binutils, and gcc packages. Clicking the View
button in the upper right of the installer window enables you to see
details on the various packages. Then start up Cygwin and follow the
Bioperl installation instructions for UNIX in Bioperl's INSTALL file
(for example, THE BIOPERL BUNDLE and INSTALLING BIOPERL THE EASY WAY USING
CPAN).

bioperl-db in Cygwin

This package is installed using the instructions contained in the package,
without modification. Since postgres is a package within Cygwin this is
probably the easiest of the 3 platforms supported in bioperl-db to
install (postgres, MySQL, Oracle).

If you can, install Cygwin on a drive or partition that's
NTFS-formatted, not FAT32-formatted. When you install Cygwin on
a FAT32 partition you will not be able to set permissions and ownership
correctly. In most situations this probably won't make any difference but
there may be occasions where this is a problem.

If you're trying to use some application or resource outside of the Cygwin
directory and you're having a problem, remember that Cygwin's path syntax
may not be the correct one. Cygwin understands /home/jacky or
/cygdrive/e/cygwin/home/jacky (when referring to the E: drive) but the
external resource may want E:/cygwin/home/jacky. So your *rc files may end
up with paths written in these different syntaxes, depending on which
program reads them.
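
The relationship between the two syntaxes is mechanical for /cygdrive
paths, as the following sketch shows. The helper name cygdrive_to_windows
is ours. Paths such as /home/jacky depend on where Cygwin itself is
installed and are returned unchanged here; inside Cygwin, the cygpath
utility (for example 'cygpath -m /home/jacky') can translate those too.

```perl
use strict;
use warnings;

# Convert a Cygwin-style /cygdrive/<letter>/... path to the mixed
# Windows form <LETTER>:/... ; other paths are returned unchanged.
sub cygdrive_to_windows {
    my ($path) = @_;
    if ($path =~ m{^/cygdrive/([A-Za-z])(/.*)?$}) {
        return uc($1) . ':' . (defined $2 ? $2 : '/');
    }
    return $path;
}

print cygdrive_to_windows('/cygdrive/e/cygwin/home/jacky'), "\n";
# prints E:/cygwin/home/jacky
```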

MySQL and DBD::mysql

You may want to install a relational database in order to use
bioperl-db, BioSQL or OBDA. The easiest way to install MySQL is to use
the Windows binaries available at http://www.mysql.com. Note that
Windows does not have Unix sockets, so you need to force the MySQL
connections to use TCP/IP instead. Do this by using the -h, or host,
option from the command-line. Example:

     >mysql -h 127.0.0.1 -u <user> -p<password> <database>
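
The same applies when connecting from Perl: naming the host in the DBI
data source makes DBD::mysql connect over TCP/IP. The following is a
sketch only, assuming DBI and DBD::mysql are installed; the helper name
mysql_dsn and the script name check_mysql.pl are ours.

```perl
use strict;
use warnings;

# Build a DBD::mysql data source name that names the host explicitly,
# so the connection goes over TCP/IP.
sub mysql_dsn {
    my ($database, $host, $port) = @_;
    my $dsn = "DBI:mysql:database=$database;host=$host";
    $dsn .= ";port=$port" if defined $port;
    return $dsn;
}

# Usage: perl check_mysql.pl <database> <user> <password>
if (@ARGV == 3) {
    require DBI;
    my ($database, $user, $password) = @ARGV;
    my $dbh = DBI->connect(mysql_dsn($database, '127.0.0.1'),
                           $user, $password, { RaiseError => 1 });
    print "Connected to $database over TCP/IP\n";
    $dbh->disconnect;
}
```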

Alternatively you could install postgres instead of MySQL; postgres is
already a package in Cygwin.

One known issue is that DBD::mysql can be tricky to install in Cygwin
and this module is required for the bioperl-db, BioSQL, and
bioperl-pipeline external packages. Fortunately there are some good
instructions online:

     * Instructions included with DBD::mysql:

       http://search.cpan.org/src/JWIED/DBD-mysql-2.1025/INSTALL.html#windows/cygwin

     * Additional instructions if you run into any problems; this
       information is more up-to-date, covers post-2.9 DBD::mysql quirks in
       Cygwin:

       http://rage.against.org/installingdbdmysqlInCygwin

Note that expat comes with Cygwin (it's used by the modules
XML::Parser and XML::SAX::ExpatXS, which are used by certain
Bioperl modules).

Directory for temporary files

Set the environment variable TMPDIR; programs like BLAST and
clustalw need a place to create temporary files. E.g.:

     setenv TMPDIR e:/cygwin/tmp     # csh, tcsh
     export TMPDIR=e:/cygwin/tmp     # sh, bash

This is not the syntax that Cygwin understands, which would be something
like /cygdrive/e/cygwin/tmp or /tmp; it is the syntax that a Windows
program expects.

If this variable is not set correctly you'll see errors like this when you
run Bio::Tools::Run::StandAloneBlast:

     ------------- EXCEPTION: Bio::Root::Exception -------------
     MSG: Could not open /tmp/gXkwEbrL0a: No such file or directory
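
Before running a long BLAST job it may be worth checking that the
directory Perl will use for temporary files really is writable. This is a
sketch using only core modules; the helper name can_write_temp is ours,
and falling back to File::Spec->tmpdir when TMPDIR is unset is our choice
for the example.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# Return true if a temporary file can be created in the given directory.
sub can_write_temp {
    my ($dir) = @_;
    return 0 unless defined $dir && -d $dir;
    my $ok = eval {
        my ($fh, $name) = tempfile(DIR => $dir, UNLINK => 1);
        close $fh;
        1;
    };
    return $ok ? 1 : 0;
}

my $tmpdir = defined $ENV{TMPDIR} ? $ENV{TMPDIR} : File::Spec->tmpdir;
print "Temporary directory: $tmpdir\n";
print can_write_temp($tmpdir) ? "  writable\n" : "  NOT usable; set TMPDIR\n";
```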

If you want to use BLAST we recommend that the Windows binary be obtained
from NCBI (ftp://ftp.ncbi.nih.gov/blast/executables/LATEST/ - the
file will be named something like blast-2.2.13-ia32-win32.exe). Then
follow the Windows instructions in README.bls. You will also need to set
the BLASTDIR environment variable to reflect the directory which holds the
blast executable and data folder. You may also want to set other variables
to reflect the location of your databases and substitution matrices if
they differ from the location of your blast executables; see
Installing Bioperl for Unix for more details.

Although we've recommended using the BLAST and MySQL binaries,
you should be able to compile just about everything else from source code
using Cygwin's gcc. You'll notice when you're installing Cygwin that many
different libraries are also available (gd, jpeg, etc.).