From a21f7d96df440d6f6e69c17a6d04a610fe12c1d4 Mon Sep 17 00:00:00 2001 From: cjfields Date: Thu, 11 Dec 2008 19:16:19 +0000 Subject: [PATCH] updates svn path=/bioperl-live/trunk/; revision=15144 --- README | 291 ++++++++++++++++++++++++++++++++--------------------------------- 1 file changed, 144 insertions(+), 147 deletions(-) diff --git a/README b/README index fedfcd35f..1447ec0d1 100644 --- a/README +++ b/README @@ -10,16 +10,17 @@ o Getting Started Thanks for downloading this distribution! - Please see the the INSTALL or INSTALL.WIN documents for installation + Please see the the INSTALL or INSTALL.WIN documents for installation instructions. - For tutorials see the Bioperl Tutorial.pl (http://www.bioperl.org/wiki/Bptutorial.pl) - or the HOWTO documents and tutorials online at http://bioperl.org. - To look at example code browse the scripts/ and examples/ directories + For tutorials see the Bioperl Tutorial.pl + (http://www.bioperl.org/wiki/Bptutorial.pl) or the HOWTO documents and + tutorials online at http://bioperl.org. To look at example code browse the + scripts/ and examples/ directories - For people starting out with Perl and Bioperl, look at the Bio::Perl - module (go "perldoc Bio::Perl" from within this directory). This - module is designed to flatten the learning curve for newcomers. + For people starting out with Perl and Bioperl, look at the Bio::Perl module (go + "perldoc Bio::Perl" from within this directory). This module is designed to + flatten the learning curve for newcomers. For a list of OS's and versions that are known to support Bioperl see the PLATFORMS file. @@ -28,228 +29,224 @@ o Getting Started o About Bioperl - Bioperl is a package of public domain Perl tools for computational - molecular biology. + Bioperl is a package of public domain Perl tools for computational molecular + biology. - Our website, http://bioperl.org, provides an online resource of - modules, scripts, and web links for developers of Perl-based software - for life science research. + Our website, http://bioperl.org, provides an online resource of modules, + scripts, and web links for developers of Perl-based software for life science + research. o Contact info Bioperl developers: bioperl-l@bioperl.org - There's quite a variety of tools available in Bioperl, and more are - added all the time. If the tool you're looking for isn't described in - the documentation please write us, it could be undocumented or in process. + There's quite a variety of tools available in Bioperl, and more are added all + the time. If the tool you're looking for isn't described in the documentation + please write us, it could be undocumented or in process. - Project website : http://bioperl.org - Project FTP server : bioperl.org (anonymous FTP ok) + Project website : http://bioperl.org Project FTP server : bioperl.org + (anonymous FTP ok) - Bug reports : http://bugzilla.open-bio.org/ + Bug reports : http://bugzilla.open-bio.org/ - Please send us bugs, in particular about documentation which you - think is unclear or problems in installation. We are also very - interested in functions which don't work the way you think they do! + Please send us bugs, in particular about documentation which you think is + unclear or problems in installation. We are also very interested in functions + which don't work the way you think they do! - Please see the AUTHORS file for the complete list of bioperl - developers and contributors. + Please see the AUTHORS file for the complete list of bioperl developers and + contributors. o About the directory structure The bioperl directory structure is organized as follows: - Bio/ - Bioperl modules - models/ - DIA drawing program generated OO UML for bioperl classes - t/ - Perl built-in tests - t/data/ - Data files used for the tests - provides good data - examples for those new to bioinformatics data. - scripts/ - Useful production-quality scripts with POD documentation - examples/ - Scripts demonstrating the many uses of Bioperl - maintenance/ - Bioperl housekeeping scripts + Bio/ - Bioperl modules models/ - DIA drawing program generated OO UML for + bioperl classes + (These are quite out-of-date) + t/ - Perl built-in tests. Tests are divided into subdirectories generally + based on the specific classes being tested + t/data/ - Data files used for the tests - provides good data + examples for those new to bioinformatics data. + scripts/ - Useful production-quality scripts with POD documentation examples/ - + Scripts demonstrating the many uses of Bioperl maintenance/ - Bioperl + housekeeping scripts o Documentation - The Bioperl Tutorial (http://www.bioperl.org/wiki/Bptutorial.pl) - contains useful information for new and existing Bioperl users. - This file also contains a number of useful scripts that the - student of Bioperl may want to examine. + The Bioperl Tutorial (http://www.bioperl.org/wiki/Bptutorial.pl) contains + useful information for new and existing Bioperl users. This file also contains + a number of useful scripts that the student of Bioperl may want to examine. - Individual *.pm modules have their own embedded POD documentation - as well. A complete set of hyperlinked POD, or module, documentation - is available at http://www.bioperl.org. + Individual *.pm modules have their own embedded POD documentation as well. A + complete set of hyperlinked POD, or module, documentation is available at + http://www.bioperl.org. - Remember that 'perldoc' is your friend. You can use it to read any - file containing POD formatted documentation without needing any type - of translator. + Remember that 'perldoc' is your friend. You can use it to read any file + containing POD formatted documentation without needing any type of translator. - If you used the Build.PL installation, and depending on your platform, - you may have documentation installed as man pages, which can be - accessed in the usual way. + If you used the Build.PL installation, and depending on your platform, you may + have documentation installed as man pages, which can be accessed in the usual + way. There is also an online course written at the Pasteur Institute. See http://www.pasteur.fr/recherche/unites/sis/formation/bioperl. - Useful documentation in the form of example code can also be found - in the examples/ and scripts/ directories. The current collection - includes scripts that run BLAST, index flat files, parse PDB - structure files, make primers, retrieve ESTs based on tissue, align - protein to nucleotide sequence, run GENSCAN on multiple sequences, - and much more! See bioscripts.pod for a complete listing. + Useful documentation in the form of example code can also be found in the + examples/ and scripts/ directories. The current collection includes scripts + that run BLAST, index flat files, parse PDB structure files, make primers, + retrieve ESTs based on tissue, align protein to nucleotide sequence, run + GENSCAN on multiple sequences, and much more! See bioscripts.pod for a complete + listing. o Releases - Bioperl releases are always available from the website - http://www.bioperl.org or by FTP from ftp://bioperl.org (note that - we've had trouble with our new network setup which is not allowing - FTP to support passive mode properly, use http://www.bioperl.org/DIST - to get a listing of the distribution directory). Each release is - tested with the test suite and cross-tested on a number of different - platforms. See the PLATFORMS file for more information on a specific - platform. All efforts are made to release a bug-free package, - however most major bugs in a release will be documented in the BUGS - file. See the Changes file for a listing of what features have been - added or what APIs have changed between releases. - - Bioperl now has a consistent numbering scheme to indicate stable - release series vs. development release series. A release number - is a three digit number like 1.2.0 - the first digit - indicates the major release - the idea being that all the API calls in a - major release are reasonably consistent. The second number is the - release series. This is probably the most important number. Even - numbers here (1.0, 1.2 etc) indicate stable releases. Stable releases - are well tested and recommended for most uses. Odd numbers (1.1, 1.3 - etc) are development releases which one should only use if you are - interested in the latest and greatest features. The final number (e.g. - 1.2.0, 1.2.1) is the bug fix release. The higher the number the - more bug fixes has been incorporated. In theory you can upgrade from one - bug fix release to the next with no changes to your own code (for production - cases, obviously check things out carefully before you switch over). + Bioperl releases are always available from the website http://www.bioperl.org + or by FTP from ftp://bioperl.org (note that we've had trouble with our new + network setup which is not allowing FTP to support passive mode properly, use + http://www.bioperl.org/DIST to get a listing of the distribution directory). + Each release is tested with the test suite and cross-tested on a number of + different platforms. See the PLATFORMS file for more information on a specific + platform. All efforts are made to release a bug-free package, however most + major bugs in a release will be documented in the BUGS file. See the Changes + file for a listing of what features have been added or what APIs have changed + between releases. + + Bioperl formerly used a numbering scheme to indicate stable release series vs. + development release series. A release number is a three digit number like + 1.2.0. The first digit indicates the major release - the idea being that all + the API calls in a major release are reasonably consistent. The second number + is the release series. This is probably the most important number. + + From the 1.0 release until the 1.6 release, even numbers (1.0, 1.2 etc) + indicated stable releases. Stable releases were well tested and recommended for + most uses. Odd numbers (1.1, 1.3 etc) were development releases which one would + only use if one were interested in the latest and greatest features. The final + number (e.g. 1.2.0, 1.2.1) is the bug fix release. The higher the number the + more bug fixes has been incorporated. In theory you can upgrade from one bug + fix release to the next with no changes to your own code (for production cases, + obviously check things out carefully before you switch over). + + The 1.6 release will be the last release series to utilize the alternating + 'stable'/'developer' convention. Starting immediately after the 1.6 branch, we + will start splitting BioPerl into several smaller easier-to-manage + distributions, including a developer distribution for cutting-edge (in + development) code, untested modules, and alternative implementations. o Caveats, warnings, etc - When you run the tests ("make test") some tests may issue - warnings messages or even fail. Sometimes this is because we didn't - have anyone to test the test system on the combination of your operating - system, version of perl, and associated libraries and other modules. - Because Bioperl depends on several outside libraries we may not be - able to test every single combination so if there are warnings you - may find that the package is still perfectly useful. See the - PLATFORMS file for reports of specific issues. + When you run the tests ("./Build test") some tests may issue warnings messages + or even fail. Sometimes this is because we didn't have anyone to test the test + system on the combination of your operating system, version of perl, and + associated libraries and other modules. Because Bioperl depends on several + outside libraries we may not be able to test every single combination so if + there are warnings you may find that the package is still perfectly useful. See + the PLATFORMS file for reports of specific issues. - If you install the bioperl-run system and run tests when you don't - have the program installed you'll get messages like 'program XXX not - found, skipping tests'. That's okay, Bioperl is doing what it is - supposed to do. If you wanted to run the program you'd need to - install it first. + If you install the bioperl-run system and run tests when you don't have the + program installed you'll get messages like 'program XXX not found, skipping + tests'. That's okay, Bioperl is doing what it is supposed to do. If you wanted + to run the program you'd need to install it first. - Not all scripts in the examples/ directory are correct and up-to-date. - We need volunteers to help maintain these so if you find they do not - work, submit a bug report to http://bugzilla.open-bio.org and consider - helping out in their maintenance. + Not all scripts in the examples/ directory are correct and up-to-date. We need + volunteers to help maintain these so if you find they do not work, submit a bug + report to http://bugzilla.open-bio.org and consider helping out in their + maintenance. - If you are confused about what modules are appropriate when you try - and solve a particular issue in bioinformatics we urge you to look at - the Bioperl Tutorial or the HOWTO documents first. + If you are confused about what modules are appropriate when you try and solve a + particular issue in bioinformatics we urge you to look at the Bioperl Tutorial + or the HOWTO documents first. o A simple module summary - Here is a quick summary of many of the useful modules and how the - toolkit is laid out: + Here is a quick summary of many of the useful modules and how the toolkit is + laid out: - All modules are in the Bio/ namespace, - - Perl is for newbies and gives a functional interface to the main - parts of the package + All modules are in the Bio/ namespace, + - Perl is for newbies and gives a functional interface to the main parts of the + package - Seq is for Sequences (protein and DNA). o Bio::PrimarySeq is a plain sequence (sequence data + identifiers) - o Bio::Seq is a PrimarySeq plus it has a Bio::Annotation::Collection - and a Bio::SeqFeatureI objects attached. + o Bio::Seq is a PrimarySeq plus it has a Bio::Annotation::Collection + and Bio::SeqFeatureI objects attached (via Bio::FeatureHolderI). o Bio::Seq::RichSeq is all of the above plus it has slots for extra information specific to GenBank/EMBL/SwissProt files. o Bio::Seq::LargeSeq is for sequences which are too big for fitting into memory. - - SeqIO is for reading and writing Sequences, it is a front end - module for separate driver modules supporting the different - sequence formats + - SeqIO is for reading and writing Sequences, it is a front end module for + separate driver modules supporting the different sequence formats - SeqFeature - start/stop/strand annotations of sequences o Bio::SeqFeature::Generic is basic catchall o Bio::SeqFeature::Similarity a similarity sequence feature o Bio::SeqFeature::FeaturePair a sequence feature which is pairwise such as query/hit pairs - - SearchIO is for reading and writing pairwise alignment reports - like BLAST or FASTA + - SearchIO is for reading and writing pairwise alignment reports like BLAST or + FASTA - Search is where the alignment objects are defined o Bio::Search::Result::GenericResult is the result object (a blast query is a Result object) - o Bio::Search::Hit::GenericHit is the Hit object (a query will have - 0-> many hits in a database) + o Bio::Search::Hit::GenericHit is the Hit object (a query will have 0-> + many hits in a database) o Bio::Search::HSP::GenericHSP is the High-scoring Segment Pair - object defining the alignment(s) of the query and hit. + object defining the alignment(s) of the query and hit. - SimpleAlign is for multiple sequence alignments - - AlignIO is for reading and writing multiple sequence alignment - formats - - Assembly provides the start of an infrastructure for assemblies - and Assembly::IO IO converters for them + - AlignIO is for reading and writing multiple sequence alignment formats + - Assembly provides the start of an infrastructure for assemblies and + Assembly::IO IO converters for them - DB is the namespace for all the database query objects o Bio::DB::GenBank/GenPept are two modules which query NCBI entrez for sequences o Bio::DB::SwissProt/EMBL query various EMBL and SwissProt repositories for a sequences o Bio::DB::GFF is Lincoln Stein's fast, lightweight feature and - sequence database which is the backend to his GBrowse system - (see www.gmod.org) + sequence database which is the backend to his GBrowse system (see + www.gmod.org) o Bio::DB::Flat is a fast implementation of the OBDA flat-file - indexing system (cross-language and cross-platform supported by - O|B|F projects see http://obda.open-bio.org). + indexing system (cross-language and cross-platform supported by O|B|F + projects see http://obda.open-bio.org). o Bio::DB::BioFetch/DBFetch for OBDA, Web (HTTP) access to remote databases. o Bio::DB::InMemoryCache/FileCache (fast local caching of sequences - from remote dbs to speed up your access). - o Bio::DB::Registry interface to the OBDA specification for remote + from remote dbs to speed up your access). + o Bio::DB::Registry interface to the OBDA specification for remote data sources - o Bio::DB::XEMBL SOAP access to sequence databases o Bio::DB::Biblio for access to remote bibliographic databases. - - Annotation collection of annotation objects (comments, - DBlinks, References, and misc key/value pairs) - - Coordinate is a system for mapping between different coordinate - systems such as DNA to protein or between assemblies. + o Bio::DB::EUtilities is the initial set of modules used for generic + queried using NCBI's eUtils. + - Annotation collection of annotation objects (comments, DBlinks, References, + and misc key/value pairs) + - Coordinate is a system for mapping between different coordinate systems such + as DNA to protein or between assemblies. - Index is for locally indexed flatfiles with BerkeleyDB - Tools contains many miscellaneous parsers and function for different bioinformatics needs o Gene prediction parser (Genscan, MZEF, Grail, Genemark) o Annotation format (GFF) - o simulate restriction enzyme cutting with RestrictionEnzyme o Enumerate codon tables and valid sequences symbols (CodonTable, IUPAC) o Phylogenetic program parsing (PAML, Molphy, Phylip) - Map genetic and physical map representations - - Graphics render a sequence with its features or a sequence analysis - result. - Structure - parse and represent protein structure data - TreeIO is for reading and writing Tree formats - Tree is the namespace for all the associated Tree objects o Bio::Tree::Tree is the basic tree object o Bio::Tree::Node are the nodes which make up the tree o Bio::Tree::Statistics is for computing statistics for a tree - o Bio::Tree::TreeFunctionsI is where specific tree functions are - implemented (like is_monophyletic and lca) - - Bio::Biblio is where bibliographic data and database access objects - are kept - - Variation represent sequences with mutations and variations applied - so one can compare and represent wild-type and mutation versions of - a sequence. + o Bio::Tree::TreeFunctionsI is where specific tree functions are implemented + (like is_monophyletic and lca) + - Bio::Biblio is where bibliographic data and database access objects are kept + - Variation represent sequences with mutations and variations applied so one + can compare and represent wild-type and mutation versions of a sequence. - Root, basic objects for the internals of Bioperl o Upgrading from an older version - If you have a previously installed version of bioperl on your system - some of these notes may help you. - - Some modules have been removed because they have been superceded by - new development efforts. They are documented in the DEPRECATED file - that is included in the release. In addition some methods, or the - Application Programming Interface (API), have changed or been - removed. You may find that scripts which worked with bioperl 1.4 - may give you warnings or may not work at all (although we have tried - very hard to minimize this!). Send an email to the list and we'll be - happy to give you pointers. + If you have a previously installed version of bioperl on your system some of + these notes may help you. + + Some modules have been removed because they have been superceded by new + development efforts. They are documented in the DEPRECATED file that is + included in the release. In addition some methods, or the Application + Programming Interface (API), have changed or been removed. You may find that + scripts which worked with bioperl 1.4 may give you warnings or may not work at + all (although we have tried very hard to minimize this!). Send an email to the + list and we'll be happy to give you pointers. -- 2.11.4.GIT