
One of the main considerations behind the current pipeline architecture is the ability to use the parallelisation facilities present on almost all SMP machines and on most Linux/Unix clusters. This page discusses several of the methods available to exploit this intrinsic scalability and to make use of all available CPU power. The pipeline was designed to be very flexible in this respect; the chosen method can be tailored to suit the specific computing infrastructure. At the most basic level, parallelisation is built around the ability of the standard make utility to parallelise its execution across multiple processes on the same computer. Popular clustering/batch systems like Sun Grid Engine or LSF extend this mechanism across multiple nodes of a cluster by providing their own adaptations of make. Finally, since version 0.2.2 the analysis pipeline also provides a series of checkpoints and hooks to facilitate the customisation of the parallelisation to an arbitrary computing setup. This requires a little bit of work on the user's part, but is the most flexible approach.

Standard make

The standard "make" utility has many limitations, but it is universally available and has an in-built parallelisation switch ("-j"). For example, on a dual-processor, dual-core system, running

make -j 4

instead of

make

will parallelise the pipeline run over 4 processor cores, giving an almost fourfold decrease in analysis run-time. On a 4-way SMP system "-j 8" or more may be advisable.
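
If you do not know how many cores are available in advance, you can derive the value at run time; the following is a minimal sketch for a Linux system, where the core count can be read from /proc/cpuinfo (this is a general shell idiom, not a pipeline feature):

# start one make job per processor core reported by the kernel
make -j $(grep -c ^processor /proc/cpuinfo)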

The pipeline is built around GNU make version 3.77 or later. Unfortunately, make 3.77 itself seems to have some bugs, so you may have to upgrade to version 3.78.
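
To check which version of GNU make is installed on your system, run:

make --version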

Distributed make

There are several distributed versions of make for use on cluster systems. Frequently used ones include qmake from Sun Grid Engine and lsmake from LSF.

Distributed cluster computing may require significant system administration expertise. The pipeline has been successfully run on SGE 6.1 (not 6.1u2), but we will not be able to support external installations. LSF make has known bugs that prevent the pipeline from running reliably.

Sun Grid Engine

To submit non-interactive batch jobs to the grid engine, you need a short wrapper script that can be submitted using qsub. See the grid engine documentation for details.

An example of such a script is:

#!/usr/bin/sh
# Run the pipeline's "recursive" target under qmake, exporting PATH and
# inheriting the resources allocated to the submitting job
qmake -cwd -v PATH -inherit -- recursive
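
Assuming the script above is saved as run_pipeline.sh (a hypothetical name) and that your administrator has set up a parallel environment for make jobs (called "make" here; the actual name is site-specific), it could then be submitted along the lines of:

qsub -cwd -pe make 4-10 run_pipeline.sh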

One problem to be aware of with qmake is that the rsh implementation used by Grid Engine tends to run out of available ports for large degrees of parallelisation. Several parts of the pipeline farm out short jobs, and ports may be used up before they expire. The work-around we used is to switch to ssh as a remote shell. This is described in http://gridengine.sunsource.net/howto/qrsh_qlogin_ssh.html. Common problems with SGE are described in
http://gridengine.sunsource.net/howto/commonproblems.html.

Another problem that we have observed in the past is grid engine throwing up a "shepherd error". In our own experience, this error could be prevented by keeping all log files that the grid engine daemons produce on a fast local hard drive.

LSF make

LSF make is not in use at Illumina and has known bugs that prevent the pipeline from running. We do not recommend running the pipeline on LSF.

It has been reported that parts of the pipeline (not the whole pipeline!) have been run successfully under LSF. The tested configuration uses Platform LSF HPC 6.1 for Linux, whose lsmake is based on GNU make 3.77. The command line used was

bsub -n 20 -o make-%J.out -e make-%J.err -R 'select[LSF_Make] rusage[mem=1000]' lsmake recursive

Custom parallelisation

Many parts of the analysis pipeline are intrinsically parallelisable by lane or tile; at least for Firecrest and Bustard, some users have reported success with their own scripted versions of a distributed parallelisation by lane. However, some parts of the pipeline cannot be parallelised completely. From pipeline 0.2.2 onwards, we have added a series of additional hooks and checkpoints to facilitate an efficient customisation of the pipeline to setups that do not have SGE qmake or lsmake available.

Essentially the pipeline workflow can be split up (beyond the image analysis, base-calling and alignment modularity) into a series of steps with different levels of scalability. The boundaries between these steps can be viewed as synchronisation "barriers": the pipeline has to wait for all of the tasks inside a step to finish before moving on. Different steps can be parallelised at the run level (essentially no parallelisation), the lane level (up to 8 jobs in parallel) or the tile level (currently up to 1600 jobs in parallel). Each step can be initiated by a make target. After completion of each of these steps, the pipeline generally produces a file (or a series of files at the lane/tile level) whose presence can be checked to determine whether all jobs belonging to the step have finished. Finally, hooks are provided to issue user-defined external commands upon completion of a step (or of a part thereof at the lane level). The mechanism is probably best explained by an example. The following section lists the steps, corresponding make targets, checkfiles and hooks for the image analysis (Firecrest).

Furthermore, the Firecrest Makefile creates two files,

lanes.txt
tiles.txt

containing a list of all lanes and all tiles used in the run, respectively. This information can be parsed and used to feed your own analysis scripts.
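
As a minimal sketch of how this could be used for a per-lane submission (assuming lanes.txt contains whitespace-separated lane identifiers such as s_1, and using bsub purely as a placeholder for whatever submission mechanism your site provides):

# submit one lane-level job per entry in lanes.txt
for lane in $(cat lanes.txt); do
    bsub -o "${lane}.log" "make ${lane}"   # or qsub/ssh, depending on your setup
done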

Warning

These features are considered experimental at this stage. We would appreciate feedback on how useful they are and any suggestions for improvement.

One specific issue that we have come across is that for complex, highly parallelised analysis tasks one of the "make" targets may attempt to re-run previous steps. This is most likely due to time stamps getting out of sync in a parallel environment, but is still under investigation.

Image analysis (Firecrest)

Step 1 (parallelisation level: run)
   Target:    default_offsets.txt
   Checkfile: default_offsets.txt
   Hook:      cmdf1

Step 2 (parallelisation level: lane / tile)
   Target:    s_1 ... (per lane) / s_1_0001 ... (per tile)
   Checkfile: s_1_finished.txt ... (per lane) / none currently (per tile)
   Hook:      cmdf2 (per lane) / none currently (per tile)

Step 3 (parallelisation level: run)
   Target:    all
   Checkfile: finished.txt
   Hook:      cmdf3

Examples

The listing above shows that typing "make" in the Firecrest folder is equivalent to the following series of commands:

make default_offsets.txt
make s_1; make s_2; make s_3; make s_4; make s_5; make s_6; make s_7; make s_8
make all

Of course, typing the commands like this is pointless; it only becomes interesting once you exploit the fact that the 8 commands on the second line can potentially be run in parallel, as long as you make sure that they have all finished before the final "make all" is issued. How you parallelise these jobs is at your discretion; for example, you could send them to the queue of a batch system, or just use "ssh" or "rsh" to send them to a predetermined analysis computer. The following example is a bit more complex and realistic:

make -j 2 default_offsets.txt cmdf1='make s_1; make s_2; make s_3; make s_4; \
   make s_5; make s_6; make s_7; make s_8;' \
  cmdf2='if [[ -e s_1_finished.txt && -e s_2_finished.txt && -e s_3_finished.txt \
         && -e s_4_finished.txt && -e s_5_finished.txt && -e s_6_finished.txt \
         && -e s_7_finished.txt && -e s_8_finished.txt ]]; then make all ; fi #'

In this example, the second step ("make s_1; ...") is started automatically by declaring it as the external command "cmdf1" to be issued after completion of the first step. Again, in reality this only makes sense if you actually parallelise the 8 make commands; for example,

nohup ssh mycomputenode1 make -j 4 s_1

or

bsub make s_1

instead of "make s_1". After completion of each of the 8 make commands of the second step, the shell command "cmdf2" is run:

if [[ -e s_1_finished.txt && -e s_2_finished.txt && -e s_3_finished.txt \
 && -e s_4_finished.txt  && -e s_5_finished.txt && -e s_6_finished.txt \
 && -e s_7_finished.txt && -e s_8_finished.txt ]]; then make all ; fi #

This checks for the existence of all 8 checkfiles and does nothing after completion of the first 7 lanes ("first" in order of completion, not position on the flowcell). Only when the final lane has completed is the next make command ("all") issued. A few more points to note:

  • Of course, there is no need to declare the full shell command on the command line as above. You could simply put all the shell commands into a shell script and call that script instead (see the sketch after this list).
  • Note the final comment symbol '#' at the end of the shell command above. The reason for this is that the pipeline automatically supplies an argument to all commands issued at the lane level: an identifier for the lane just analysed, e.g. "s_5". The example above does not make use of this argument, so it has to be commented out. A real shell script could simply ignore the extra argument.
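
As a sketch of such a script (the file name check_all_lanes.sh is hypothetical), the check performed by "cmdf2" above could be put into a file and declared with cmdf2='./check_all_lanes.sh':

#!/bin/sh
# check_all_lanes.sh - called as the lane-level hook; the pipeline passes
# the lane identifier (e.g. "s_5") as the first argument, which we ignore.
for lane in s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8; do
    [ -e "${lane}_finished.txt" ] || exit 0   # at least one lane still running
done
# all 8 lane checkfiles exist: start the final run-level step
make all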

Base-calling (Bustard)

Step 1 (parallelisation level: lane / tile)
   Target:    Phasing/s_1_phasing.xml ... (per lane) / Phasing/s_1_0001_phasing.txt ... (per tile)
   Checkfile: Phasing/s_1_phasing.xml ... (per lane) / Phasing/s_1_0001_phasing.txt ... (per tile)
   Hook:      cmdb1 (per lane) / none currently (per tile)

Step 2 (parallelisation level: run)
   Target:    Phasing/phasing.xml
   Checkfile: Phasing/phasing.xml
   Hook:      cmdb2

Step 3 (parallelisation level: lane / tile)
   Target:    s_1 ... (per lane) / s_1_0001 ... (per tile)
   Checkfile: s_1_finished.txt ... (per lane) / s_1_0001_qhg.txt ... (per tile)
   Hook:      cmdb3 (per lane) / none currently (per tile)

Step 4 (parallelisation level: run)
   Target:    all
   Checkfile: finished.txt
   Hook:      cmdb4
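
By analogy with the Firecrest example above, a plain "make" in the Bustard folder corresponds roughly to the following sequence of targets (a sketch derived from the table above, written with shell loops for brevity; in practice the individual make commands inside each loop are the ones you would run in parallel):

# per-lane phasing estimates (parallelisable by lane)
for lane in s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8; do make Phasing/${lane}_phasing.xml; done
# combine the phasing estimates (run level)
make Phasing/phasing.xml
# per-lane base calling (parallelisable by lane)
for lane in s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8; do make ${lane}; done
# finish off (run level)
make all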

Alignment (GERALD)

Step 1 (parallelisation level: run)
   Target:    tiles.txt
   Checkfile: tiles.txt
   Hook:      none currently

Step 2 (parallelisation level: lane)
   Target:    s_1 ...
   Checkfile: s_1_finished.txt ...
   Hook:      none currently

Step 3 (parallelisation level: run)
   Target:    all
   Checkfile: finished.txt
   Hook:      POST_RUN_COMMAND (accessible from the config file)
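
Note that the final run-level hook for GERALD is set through the configuration file rather than on the make command line. As a sketch (the script path is a hypothetical placeholder), a notification or downstream processing step could be triggered with a line such as:

POST_RUN_COMMAND ./notify_analysis_complete.sh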

Limits of parallelisation

The analysis works on a per-tile basis, so the maximum degree of parallelisation achievable is given by the total number of tiles scanned during the run. However, some parts of the pipeline operate on a per-lane basis, and a few parts on a per-run basis, which means that scaling will cease to be linear at some stage beyond 8-way parallelisation. There is an additional limit on Eland analyses, described in the next section.

Memory limits

Most parts of the pipeline are not overly memory intensive (i.e. they use less than 150 MB). Eland is different, as it can use up to 1 GB, which means that parallelisation of Eland is more likely to run into memory issues. Because many load-sharing systems (and presumably make itself) do not take memory usage into account, Eland is treated differently in the pipeline, and its parallelisation is artificially prevented by a non-essential make dependency. If you are certain that you cannot exhaust your available memory, you can use a special option in the GERALD config file ("ELAND_MULTIPLE_INSTANCES 8") to remove this dependency. However, you are responsible for making sure that you have up to 8 GB of RAM at your disposal. See also Whole genome alignments using ELAND.
