Data analysis - documentation : Pipeline installation
This page last changed on Nov 15, 2006 by maising.
The following document describes the installation of the Solexa data analysis pipeline.

Platforms
The pipeline is developed and run mainly on Linux, which is the only platform we officially support. It has also been run successfully on Cygwin and MacOSX. Although this has never been tested, it should also work on any Unix variant on which the prerequisites below are available. However, we may not be able to fix issues that you encounter on any platform other than Linux.

Prerequisites
The pipeline has been successfully run on Linux, MacOSX and Windows (using the cygwin environment). The following prerequisites are required:
For a compilation from source, the following additional software is required:
If the source code is to be obtained directly from cvs, you will also need:
Setting up cygwin
For a Windows installation, you will need to install cygwin. All of the above packages need to be installed using the cygwin package manager. An X server is recommended, as are Unix-style line endings and an editor that is aware of the differences between Unix and DOS line endings.

Cygwin does not always read your .bashrc file automatically. If you use the .bashrc file to set up environment variables, for example for cvs, you may have to run "source .bashrc" by hand.

Setup directory
It is generally a good idea to keep separate production and development copies of the code. We also recommend keeping old versions of the pipeline when you install a newer version.

Setting up email reporting
The script runReport.pl is called at the end of a GERALD run and sends you an email when (if) the run completes successfully. To get this to work you need to set the following parameters in the GERALD config file.

EMAIL_LIST your.name@yourdomain.com clamouring.experimentalist@yourdomain.com
A space-separated list of the email addresses you want the report to be sent to. You may get away with your.name instead of your.name@yourdomain.com, depending on your email server.

WEB_DIR_ROOT http://server/SHARE
The software assumes it can create a valid URL from the GERALD folder path by chopping off the first two path elements and prepending WEB_DIR_ROOT.

EMAIL_DOMAIN yourdomain.com
Your SMTP server may refuse to accept emails from, or send emails to, addresses that don't end in @yourdomain.com.

EMAIL_SERVER yourserver:2525
yourserver is the name or IP address of a mail server willing to accept SMTP email requests from you. 2525 is the port number of the SMTP service on that server. Generally this will be 25, which is the default value if no port number is specified. The utility nmap (if you have it installed) may help you find out which port, if any, on a server is hosting an SMTP service. If you don't get a friendly message when you run
    telnet yourserver yourPortNumber
from the machine you're running GERALD on, then email reporting will not work.

You can run runReport.pl directly in test mode:
    /runReport.pl --test yourserver:25 yourdomain.com anything your.name@yourdomain.com
This should send you a test email. If it doesn't, the transcript it prints out will hopefully tell you what went wrong.

Please note that email reporting is considered a "nice to have" rather than a "must have" feature of the pipeline. The code as it stands works on all of the SMTP servers at Solexa. However, whether it will work for you depends heavily on how your SMTP servers are set up locally. Your first port of call in the event of problems should be your local systems people. Failure of email reporting should not prevent the rest of a pipeline run from coming to a successful completion.

Obtaining the source code

Installation from cvs
You need to set up the environment variables
    export CVSROOT=:ext:whoever@server.solexa.co.uk:/home/cvs
    export CVS_RSH=/usr/bin/ssh
where "whoever" is your login on the cvs server "server". You can set these variables permanently in your ".bashrc". The first option can also be specified on the cvs command line using the "-d" switch.
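As a sketch of the cvs environment setup just described, the fragment below could go in your ".bashrc". The login "whoever" and the host "server.solexa.co.uk" are the placeholders used in the text, not real values. The last line only prints the equivalent one-off "-d" form and does not contact any server.

```shell
# Placeholder login and host taken from the text; substitute your own.
export CVSROOT=":ext:whoever@server.solexa.co.uk:/home/cvs"
# Tell cvs to connect to the repository over ssh.
export CVS_RSH=/usr/bin/ssh

# The same repository can be named per command with -d instead of CVSROOT.
# This only prints the command; remove the echo to actually run it.
echo cvs -d "$CVSROOT" co Pipeline
```

With CVSROOT exported, the "-d" option is redundant; it is mainly useful when you work with more than one repository.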
Change to your installation directory and type
    cvs co Pipeline
and enter your password.

Some hints on cvs - if you are not familiar with it
You can set the environment variable "EDITOR" to point to your editor of choice; it is used for commit messages.
Use "cvs log myfile" to get the version number, all release tags and commit messages for file "myfile".
To see which version of a file you have in your local cvs copy, you can do "ident myfile". NB you can run ident on compiled executables too!
"cvs diff" will show you all differences between your copy and the repository revision it was checked out from; use "cvs diff -r HEAD" to compare against the latest version in the repository.
To update to the most recent version of the pipeline, change into your pipeline directory and type "cvs update -d".
To update to a specific version (for example Version0_1_1), type "cvs update -r Version0_1_1".

Installation from a source tar ball
Change to the directory in which you want to install the pipeline and type
    tar xvfz SolexaPipeline-version.tar.gz
where version is the version corresponding to the archive you've got. Of course, you may have to adjust the path to the archive suitably.

Installation from a binary tar ball
Unpack the archive using the command
    tar xvfz SolexaPipeline-version-bin.tar.gz

Compilation
Change into the "Pipeline" directory and type
    make
    make install
This will first build all C++ code, and then copy the relevant executables into the directories "Goat" and "Gerald", which contain the scripts and Makefile generators. It may be useful to add these two directories to your default path.

This build system is non-standard, as we chose to keep executable files within the pipeline folder structure rather than copying them to /usr/local/bin or similar. The main reasons for this decision are:
We may change this system at some stage in the future, and of course you can simply install the contents of the "installed" Goat/ and Gerald/ folders wherever you like. For a compilation on a 64-bit platform, see the section below.

If you want to use the Intel Math Kernel Library as an FFT backend, you have to compile the image analysis module "Firecrest" separately from the rest of the project and pass the additional variable "MKL" to make, as in "make MKL=true". You will most likely also have to set the MKL-specific paths in the Makefile to the appropriate locations on your system.

Compilation on 64-bit Linux, MacOSX and other platforms
Compilation on 64-bit architectures or on platforms other than Linux and cygwin may require manual changes to the Makefiles, even though we have successfully compiled the pipeline on various 32-bit and 64-bit platforms. If you run into compilation problems, you may have to adapt the platform-specific gcc compiler flags. Because of the optimised FFT libraries, the Firecrest/Makefile is particularly likely to be sensitive to platform-specific peculiarities.
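Before editing any Makefile, it can help to confirm which platform and word size you are actually building on. The following is a minimal sketch and not part of the pipeline: the helper name word_size and the list of machine names it recognises are assumptions, and the list is not exhaustive.

```shell
# Hypothetical helper: classify a machine name as printed by `uname -m`.
# The machine names listed here are an assumption and are not exhaustive.
word_size() {
    case "$1" in
        x86_64|amd64|ia64|ppc64|aarch64|arm64) echo 64 ;;
        i386|i486|i586|i686)                   echo 32 ;;
        *)                                     echo unknown ;;
    esac
}

machine=$(uname -m)   # hardware name, e.g. x86_64 or i686
system=$(uname -s)    # operating system name, e.g. Linux or Darwin
echo "system=$system machine=$machine word_size=$(word_size "$machine")"
```

If the reported word size is 64 or unknown, or the system is not Linux, the gcc flags in the Makefiles are the first thing to check, starting with Firecrest/Makefile.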
Document generated by Confluence on Mar 09, 2007 16:11