Data analysis - documentation : Pipeline installation
This page last changed on Apr 25, 2008 by come.
This document describes the installation of the Genome Analyzer data analysis pipeline.

Platforms

The pipeline is developed and run primarily on Linux, which is the only platform we officially support. It should also work on any Unix variant on which the prerequisites below are available, although this has never been tested and will not be supported by Illumina. We may not be able to fix issues that you encounter on any platform other than Linux.

Prerequisites

The following prerequisites are required for the pipeline:
The Linux distribution we use and test against is RedHat; the listed dependencies are satisfied by the RedHat packages perl-, python-, make, autoconf, gnuplot, ImageMagick, ghostscript, zlib, zlib-devel, bzip2, bzip2-devel, libtiff-devel and gcc-* as well as their respective prerequisites. The Perl XML::Simple module and fftw3 may have to be downloaded separately and installed from source.

Setup directory

It is generally a good idea to keep separate production and development copies of the code. We also recommend keeping old versions of the pipeline when you install a newer version.

Setting up email reporting

The script Gerald/runReport.pl is called at the end of a pipeline run and sends you an email when (if) a run successfully completes. To use the (optional) email notification you need to set up an SMTP server (in the unlikely case that you don't have one already) and set the following parameters in the GERALD config file (see Gerald User Guide and FAQ).

EMAIL_LIST your.name@yourdomain.com clamouring.experimentalist@yourdomain.com

A space-separated list of email addresses you want the report to be sent to. You may get away with your.name instead of your.name@yourdomain.com, depending on your email server.

WEB_DIR_ROOT http://server/SHARE
The software assumes it can create a valid URL from the GERALD folder path by chopping off the first two path elements and prepending WEB_DIR_ROOT.

EMAIL_DOMAIN yourdomain.com

Your SMTP server may refuse to accept emails from, or send emails to, addresses that don't end in @yourdomain.com.

EMAIL_SERVER yourserver:2525

yourserver is the name or IP address of a mail server willing to accept SMTP email requests from you. 2525 is the port number of the SMTP service on that server. Generally this will be 25, which is the default value if no port number is specified. The utility nmap (if you have it installed) may help you find out which port, if any, on a server is hosting an SMTP service. If you don't get a friendly message when you run

    telnet yourserver yourPortNumber

from the machine you're running GERALD on, then email reporting will not work.

You can run runReport.pl directly in test mode:

    /runReport.pl --test yourserver:25 yourdomain.com anything your.name@yourdomain.com

This should send you a test email. If it doesn't, the transcript it prints out will hopefully tell you what went wrong.

Please note that email reporting is considered a "nice to have" rather than a "must have" feature of the pipeline. The code as it stands works on all of the SMTP servers at Illumina. However, whether it will work for you depends heavily on how your SMTP servers are set up locally. Your first port of call in the event of problems should be your local systems people. Failure of email reporting should not prevent the rest of a pipeline run from completing successfully.

Obtaining the source code

Installation from a source tar ball

This is the normal method to install the pipeline. Change to the directory in which you want to install the pipeline and type

    tar xvfz GAPipeline-version.tar.gz

where version is the version corresponding to the archive you have. You may have to adjust the path to the archive.
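The unpacking step, combined with the earlier recommendation to keep old versions alongside new ones, could be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name unpack_pipeline, the install root argument and the one-directory-per-archive layout are hypothetical and not part of the pipeline, and the archive may already contain its own top-level directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pipeline): unpack a tar ball into a
# directory named after the archive, so previously installed versions are
# left untouched.
unpack_pipeline() {
    archive=$1        # path to GAPipeline-version.tar.gz (placeholder name)
    install_root=$2   # e.g. a production or a development area (assumption)

    # Derive the target directory name from the archive file name.
    name=$(basename "$archive" .tar.gz)
    target="$install_root/$name"

    # Never unpack over an existing installation.
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing $target" >&2
        return 1
    fi

    mkdir -p "$target" || return 1
    # Same options as in the text minus v; -C unpacks inside the target.
    tar xzf "$archive" -C "$target"
}
```

Calling unpack_pipeline with the archive path and an install root of your choice creates a new directory per archive and fails rather than overwriting an existing one.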

Installation from a binary tar ball

Unpack the archive using the command

    tar xvfz GAPipeline-version-bin.tar.gz

Installation from cvs

Note: We do not currently provide external cvs access.

The pipeline can be found in the cvs module "Pipeline".

Compilation

Change into the "Pipeline" directory and type

    make
    make install

This will first build all C++ code and then copy the relevant executables into the directories "Goat" and "Gerald", which contain the scripts and Makefile generators. It may be useful to add these two directories to your default path.

This build system is non-standard, as we chose to keep executable files within the pipeline folder structure rather than copying them to /usr/local/bin or similar. The main reasons for this decision are:
We may change this system at some stage in the future, and of course you can simply install the contents of the "installed" Goat/ and Gerald/ folders wherever you like.

For a compilation on a 64-bit platform, see the section below.

If you want to use the Intel Math Kernel Library as an FFT backend, you have to compile the image analysis module "Firecrest" separately from the rest of the project and pass the additional variable "MKL" to make, as in "make MKL=true". You will most likely also have to set the MKL-specific paths in the Makefile to the appropriate locations on your system.

Compilation on 64-bit Linux and other platforms

Compilation with the current pipeline Makefiles works on all platforms on which we are aware of the pipeline running, including many 32- and 64-bit Linux versions and Solaris. However, if compilation does not succeed on a less commonly used platform (possibly 64-bit architectures or platforms other than Linux), you may have to make manual changes to the Makefiles, in particular to the platform-specific gcc compiler flags. Because of the optimised FFT libraries, the Firecrest/Makefile is particularly likely to be sensitive to platform-specific peculiarities. Please notify us of any changes that you needed to make.

Compilation with SRF support

If you want the pipeline to support the creation of SRF traces with io_lib, you have to enable it explicitly in the arguments of make:

    make WITH_IO_LIB=yes
    make WITH_IO_LIB=yes install

Note that this will require autoconf (≥ 2.59). The resulting SRF converters will be linked to libcurl and libidn if they are available.
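Because the SRF build depends on the installed autoconf version, it may be convenient to check it before starting a long compilation. The following is a minimal sketch, not part of the pipeline: the function names are hypothetical, the version requirement is read as "at least 2.59" (an assumption), and `sort -V` is assumed to be available, as it is in GNU coreutils.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A is at least version B.
# Uses version sort: if B sorts first (or equal), then A >= B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the version reported on the first line of `autoconf --version`,
# e.g. "autoconf (GNU Autoconf) 2.69" gives "2.69". Prints nothing if
# autoconf is not installed.
autoconf_version() {
    autoconf --version 2>/dev/null | head -n 1 | awk '{print $NF}'
}

# Hypothetical wrapper: run the SRF-enabled build only if autoconf is
# recent enough. Must be called from the "Pipeline" directory.
build_with_srf() {
    required=2.59   # assumption: requirement read as "at least 2.59"
    found=$(autoconf_version)
    if [ -z "$found" ] || ! version_ge "$found" "$required"; then
        echo "autoconf >= $required not found (got: ${found:-none})" >&2
        return 1
    fi
    make WITH_IO_LIB=yes && make WITH_IO_LIB=yes install
}
```

After sourcing these definitions, running build_with_srf from the "Pipeline" directory either starts the two make commands given above or stops early with a message about autoconf.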
Document generated by Confluence on Jul 25, 2008 16:42