# $Id: INSTALL.WIN,v 1.21 2006-10-02 22:35:55 sendu Exp $

Installing Bioperl on Windows

This installation guide was written by Barry Moore, Nathan Haigh and
other Bioperl authors based on the original work of Paul Boutros.
Please report problems and/or fixes to the bioperl mailing list.

1) Quick instructions for the impatient, lucky, or experienced user.
==========================================

  Download the ActivePerl MSI from
  http://www.activestate.com/Products/ActivePerl/

  Run the ActivePerl Installer (accepting all defaults is fine).

  Open a command prompt (Start->Run, then type cmd) and run the PPM
  shell by typing ppm.

  Add three new PPM repositories with the following commands:

    ppm> rep add Bioperl http://bioperl.org/DIST
    ppm> rep add Kobes http://theoryx5.uwinnipeg.ca/ppms
    ppm> rep add Bribes http://www.Bribes.org/perl/ppm

  Install Bioperl with the following commands:

    ppm> search Bioperl

  This returns a numbered list of packages with "Bioperl" in their
  name, along with their version numbers.

    ppm> install <number>

  Here <number> is the entry in that list for the package and version
  you want.

  Go to http://www.bioperl.org and start reading documentation.

Bioperl is a large collection of Perl modules (extensions to the Perl
language) that aid in the task of writing Perl code to deal with
sequence data in a myriad of ways. Bioperl provides objects for
various types of sequence data and their associated features and
annotations. It provides interfaces for analysis of these sequences
with a wide variety of external programs (BLAST, fasta, clustalw and
EMBOSS to name just a few). It provides interfaces to various types
of databases, both remote (GenBank, EMBL etc.) and local (MySQL, flat
files, GFF etc.), for storage and retrieval of sequences. And finally,
with its associated documentation and mailing list, Bioperl represents
a community of bioinformatics professionals working in Perl who are
committed to supporting both the development of Bioperl and the new
users who are drawn to the project.

While most bioinformatics and computational biology applications are
developed in Unix/Linux environments, more and more programs are being
ported to other operating systems like Windows, and many users (often
biologists with little background in programming) are looking for ways
to automate bioinformatics analyses in the Windows environment.

Perl and Bioperl can be installed natively on Windows NT/2000/XP. Most
of the functionality of Bioperl is available with this type of
install. Much of the heavy lifting in bioinformatics is done by
programs originally developed in lower-level languages like C and
Pascal (e.g. BLAST, clustalw, Staden etc.). Bioperl simply acts as a
wrapper for running and parsing output from these external programs.

Some of those programs (BLAST for example) have been ported to
Windows. These can be installed and work quite happily with Bioperl
in the native Windows environment. Other external programs, such as
Staden and the EMBOSS suite, can only be installed on Windows by
using Cygwin and its gcc C compiler (see the discussion of Bioperl in
Cygwin below).

If you have a fairly simple project in mind, want to start using Bioperl
quickly, only have access to a computer running Windows, and/or don't
mind bumping up against some limitations, then Bioperl on Windows may
be a good place for you to start. For example, downloading a bunch
of sequences from GenBank and sorting out the ones that have a
particular annotation or feature works great. Running a bunch of your
sequences against remote or local BLAST, parsing the output and
storing it in a MySQL database would be fine also.

Be aware that most Bioperl developers are working in some type of UNIX
environment (Linux, OSX, Cygwin). If you have problems with Bioperl
that are specific to the Windows environment, you may be blazing new
ground, and your pleas for help on the Bioperl mailing list may get few
responses simply because no one knows the answer to your
Windows-specific problem. If this is or becomes a problem for you, then
you are better off working in some type of UNIX-like environment. One
solution to this problem that will keep you working on a Windows
machine is to install Cygwin, a UNIX emulation environment for
Windows. A number of Bioperl users are using this approach
successfully and it is discussed in more detail below.

There are a couple of ways of installing Perl on a Windows machine.
The most common and easiest is to get the most recent build from
ActiveState. ActiveState is a software company
(http://www.activestate.com) that provides free builds of Perl for
Windows users. The current (December 2004) build is ActivePerl
5.8.4.810. To install ActivePerl on Windows:

  Download the ActivePerl MSI from
  http://www.activestate.com/Products/ActivePerl/

  Run the ActivePerl Installer (accepting all defaults is fine).

You can also build Perl yourself (which requires a C compiler) or
download one of the other binary distributions. The Perl source for
building it yourself is available from CPAN (http://www.cpan.org),
as are a few other binary distributions that are alternatives to
ActiveState. This approach is not recommended unless you have specific
reasons for doing so and know what you're doing. If that's the case
you probably don't need to be reading this guide.

Cygwin is a UNIX emulation environment for Windows and comes with its
own build of Perl. Information on Cygwin and Bioperl is found below.

4) Bioperl on Windows
=====================

Perl is a programming language that has been extended a lot by the
addition of external modules. These modules work with the core
language to extend the functionality of Perl.

Bioperl is one such extension to Perl. These modular extensions to Perl
sometimes depend on the functionality of other Perl modules and this
creates a dependency. You can't install module X unless you have
already installed module Y. Some Perl modules are so fundamentally
useful that the Perl developers have included them in the core
distribution of Perl - if you've installed Perl then these modules
are already installed. Other modules are freely available from CPAN,
but you'll have to install them yourself if you want to use them.
Bioperl has such dependencies.

Bioperl is actually a large collection of Perl modules (over 1000
currently) and these modules are split into six packages. These six
packages are:

  Bioperl Group        Functions
  -----------------------------------------------------------------
  bioperl (the core)   Most of the main functionality of Bioperl.
  bioperl-run          Wrappers to a lot of external programs.
  bioperl-ext          Interaction with some alignment functions
                       and the Staden package.
  bioperl-db           Using bioperl with BioSQL and local
                       relational databases.
  bioperl-microarray   Microarray-specific functions.
  bioperl-gui          Some preliminary work on a graphical user
                       interface to some Bioperl functions.

The Bioperl core is what most new users will want to start with.
Bioperl (the core) and the Perl modules that it depends on can be
easily installed with PPM. PPM (Programmer's Package Manager, formerly
known as the Perl Package Manager) is an ActivePerl utility for
installing Perl modules on systems using ActivePerl. The PPM commands
shown in this document are for PPM version 3; if you use PPM version 2
the commands you require will be different. PPM will look online (you
have to be connected to the internet of course) for files (these files
end with .ppd) that tell it how to install the modules you want and
what other modules your new modules depend on. It will then download
and install your modules and all of their dependencies for you.

These .ppd files are stored online in PPM repositories. ActiveState
maintains the largest PPM repository, and when you installed ActivePerl,
PPM was installed with directions for using the ActiveState
repositories. Unfortunately the ActiveState repositories are far
from complete and other ActivePerl users maintain their own PPM
repositories to fill in the gaps. Installing Bioperl will require you
to direct PPM to look in three new repositories.

You do this by opening a Windows command prompt, typing ppm to start the
PPM shell and then typing the following three commands:

  ppm> rep add Bioperl http://bioperl.org/DIST
  ppm> rep add Kobes http://theoryx5.uwinnipeg.ca/ppms
  ppm> rep add Bribes http://www.Bribes.org/perl/ppm

Once PPM knows where to look for Bioperl and its dependencies, you
simply tell PPM to search for packages with Bioperl in their name,
and then which of these to install. This is done with the following
commands:

  ppm> search Bioperl

This returns a numbered list of packages with "Bioperl" in their
name, along with their version numbers.

  ppm> install <number>

Here <number> is the entry in that list for the package and version
you want.

You may find that you want some of the features of other Bioperl groups
like bioperl-run or bioperl-db. There are currently no PPM packages
for installing these parts of Bioperl (but check this by doing a
Bioperl search at the PPM shell):

  ppm> search Bioperl

If they are not present, you will have to install these manually from
source. For this you will need a Windows version of the program make
called nmake
(http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe).
You will also need a willingness to experiment. You'll have to read
the installation documents for each component that you want to
install, and use nmake where the instructions call for make. You will
have to determine from the installation documents what dependencies are
required, and you will have to get them, read their documentation and
install them first. The details of this are beyond the scope of this
guide. Read the documentation. Search Google. Try your best, and if
you get stuck consult with others on the bioperl mailing list.

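For components that ship a Makefile.PL, a manual build usually follows
the conventional Perl module sequence, with nmake substituted for
make. The sketch below only prints those steps so they can be reviewed
before being typed at the command prompt. The function name
build_steps is made up for illustration, and the Makefile.PL
convention is an assumption: a component's own installation document
takes precedence if it describes a different procedure.

```shell
# Print the conventional build steps for a Perl module that ships a
# Makefile.PL. The first argument is the make program (default: nmake).
# Assumption: the component follows the usual Makefile.PL convention;
# check its own installation document first.
build_steps() {
    make_prog=${1:-nmake}
    printf '%s\n' \
        "perl Makefile.PL" \
        "$make_prog" \
        "$make_prog test" \
        "$make_prog install"
}

build_steps          # steps for a native Windows build with nmake
```

Calling build_steps make prints the same sequence for an environment
where the make program is called make.
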
Cygwin is a Unix emulator and shell environment available free at
www.cygwin.com. Bioperl v. 1.* runs well within Cygwin.
Some users claim that installation of Bioperl is easier within
Cygwin than within Windows, but these may be users with Unix
backgrounds. A note on Cygwin: it doesn't write to your Registry, it
doesn't alter your system or your existing files in any way, and it
doesn't create partitions. It simply creates a cygwin/ directory and
writes all of its files to that directory. To uninstall Cygwin just
delete that directory.

One advantage of using Bioperl in Cygwin is that all the external
modules are available through CPAN - the same cannot be said of
ActiveState's PPM utility.

To get Bioperl running, first install the basic Cygwin package as well
as the Cygwin Perl, make, binutils, and gcc packages. Clicking the
"View" button in the upper right of the installer window enables you
to see details on the various packages. Then start up Cygwin and
follow the Bioperl installation instructions for Unix in Bioperl's
INSTALL file (for example, THE BIOPERL BUNDLE and INSTALLING BIOPERL
THE EASY WAY USING CPAN).

If you can, install Cygwin on a drive or partition that's
NTFS-formatted, not FAT32-formatted. When you install Cygwin on a FAT32
partition you will not be able to set permissions and ownership
correctly. In most situations this probably won't make any difference
but there may be occasions where this is a problem.

If you're trying to use some application or resource "outside" of
Cygwin and you're having a problem, remember that Cygwin's path syntax
may not be the correct one. Cygwin understands '/home/jacky' or
'/cygdrive/e/cygwin/home/jacky' (when referring to the E: drive)
but the external resource may want 'E:/cygwin/home/jacky'. So your
*rc files may end up with paths written in these different syntaxes.

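The relationship between the two syntaxes can be sketched as a small
shell function. This is an illustration only: the function name
to_windows_path is made up, and it handles just the /cygdrive/<letter>/
form shown above, leaving every other path unchanged.

```shell
# Convert /cygdrive/<letter>/rest to <LETTER>:/rest.
# Paths that do not start with /cygdrive/<letter>/ are printed as-is.
to_windows_path() {
    case "$1" in
        /cygdrive/[a-zA-Z]/*)
            rest=${1#/cygdrive/}    # e.g. e/cygwin/home/jacky
            drive=${rest%%/*}       # e.g. e
            rest=${rest#*/}         # e.g. cygwin/home/jacky
            drive=$(printf '%s' "$drive" | tr '[:lower:]' '[:upper:]')
            printf '%s:/%s\n' "$drive" "$rest"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

to_windows_path /cygdrive/e/cygwin/home/jacky   # prints E:/cygwin/home/jacky
```

Inside Cygwin, the bundled cygpath utility (for example, cygpath -m)
handles the general case, including paths such as /home/jacky whose
Windows form depends on where Cygwin is installed.
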
You may want to install a relational database in order to use
bioperl-db, BioSQL or OBDA. The easiest way to install MySQL is to
use the Windows binaries available at www.mysql.com. Note that Windows
does not have Unix domain sockets, so you need to force the MySQL
connections to use TCP/IP instead. Do this by using the "-h", or host,
option on the command line. Example:

  >mysql -h 127.0.0.1 -u <user> -p<password> <database>

Alternatively, you could install Postgres instead of MySQL; Postgres
is already available as a package in Cygwin.

One known issue is that DBD::mysql can be tricky to install in
Cygwin, and this module is required for the bioperl-db, BioSQL, and
bioperl-pipeline external packages. Fortunately there is some good
documentation on this at
http://search.cpan.org/src/JWIED/DBD-mysql-2.1025/INSTALL.html#windows/cygwin.
It may be that these issues have been resolved in later versions of
DBD::mysql.

Note that expat comes with Cygwin (it's used by the module
XML::Parser, which is used by certain Bioperl modules).

Directory for temporary files
=============================

Set the environment variable TMPDIR; programs like BLAST and clustalw
need a place to create temporary files. E.g.:

  setenv TMPDIR e:/cygwin/tmp     # csh, tcsh
  export TMPDIR=e:/cygwin/tmp     # sh, bash

Note that this is the syntax that a Windows application expects, not
the syntax that Cygwin itself uses, which would be something like
"/cygdrive/e/cygwin/tmp" or "/tmp".

If this variable is not set correctly you'll see errors like this
when you run Bio::Tools::Run::StandAloneBlast:

  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: Could not open /tmp/gXkwEbrL0a: No such file or directory

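A quick way to catch this before running BLAST or clustalw is to check
TMPDIR from the shell. The following is a minimal sketch for sh or
bash; the function name check_tmpdir is made up for illustration. It
assumes the shell itself can resolve the value of TMPDIR as a path;
Cygwin generally accepts the e:/... form, but verify this on your own
installation.

```shell
# Report whether TMPDIR is set and names a writable directory.
# Returns 0 when it looks usable, 1 otherwise.
check_tmpdir() {
    if [ -z "${TMPDIR:-}" ]; then
        echo "TMPDIR is not set" >&2
        return 1
    fi
    if [ ! -d "$TMPDIR" ] || [ ! -w "$TMPDIR" ]; then
        echo "TMPDIR is not a writable directory: $TMPDIR" >&2
        return 1
    fi
    echo "TMPDIR looks usable: $TMPDIR"
}

check_tmpdir || echo "fix TMPDIR before running BLAST or clustalw"
```
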
If you want to install BLAST on your own computer we recommend that
the Windows binary be obtained from NCBI
(ftp://ftp.ncbi.nih.gov/blast/executables/LATEST - the file will be
named something like blast-2.2.6-ia32-win32.exe). Then follow the
Windows instructions at
http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/pc_setup.html

Although we've recommended using the BLAST and MySQL binaries, you
should be able to compile just about everything else from source
code using Cygwin's gcc. You'll notice when you're installing Cygwin
that many different libraries are also available (gd, jpeg, etc.).