This page last changed on Mar 28, 2008 by rshaw.
The default base-call quality predictors produced by extract_quality_predictors are :-
- in_read_cycle : This is the 1-based cycle number. For multiple read analysis the cycle number is relative to the read in which the base occurs, e.g. it would be 2 rather than 38 for the second cycle of the second read of a pair of 36-cycle reads.
- unchastity : The `chastity' statistic for a base is defined as the ratio of the highest of the four (base type) intensities to the sum of the highest two; the `unchastity' predictor is simply the chastity value subtracted from 1.
- max_early_unchastity : This is the maximum unchastity value over the first PURE_BASES (12 by default) bases in the read in which the base of interest occurs.
- raw_unquality : This is the negative of the highest of the base type probabilities estimated by Bustard for the base.
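The definitions of the four default predictors can be sketched in Python as follows. This is only an illustration of the arithmetic described above, not the code used by extract_quality_predictors: the function signatures and argument layouts are assumptions made for the example, all reads are assumed to have the same length, and the two highest intensities are assumed to have a positive sum.

```python
# Illustrative sketch only; names mirror the predictors, but the signatures
# are hypothetical and are not taken from extract_quality_predictors.

def in_read_cycle(cycle, read_length):
    """1-based cycle number relative to the read containing the cycle.

    Assumes every read has the same length (e.g. a pair of 36-cycle reads).
    """
    return (cycle - 1) % read_length + 1


def unchastity(intensities):
    """1 minus the ratio of the highest intensity to the sum of the top two.

    Assumes the sum of the two highest intensities is positive.
    """
    top, second = sorted(intensities, reverse=True)[:2]
    return 1.0 - top / (top + second)


def max_early_unchastity(read_intensities, pure_bases=12):
    """Maximum unchastity over the first pure_bases bases of a read.

    read_intensities is a list of four-element intensity lists, one per cycle.
    """
    return max(unchastity(i) for i in read_intensities[:pure_bases])


def raw_unquality(base_probabilities):
    """Negative of the highest of the four base type probabilities."""
    return -max(base_probabilities)
```

For instance, in_read_cycle(38, 36) gives 2, matching the paired-read example above, and a base with intensities 100, 50, 10 and 5 has chastity 100/150 and therefore unchastity 1/3.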
Three additional predictors are currently supported for development purposes (further such predictors may be added in future releases) :-
- max_local_unchastity : This is the maximum unchastity over a window of bases centred on the base of interest; two neighbours on either side are considered.
- homopol_len : This is the length of the run of the same called base type in which the base of interest occurs; the `run' will be of length one if the base type called for the base of interest differs from both that called for the base before it and that called for the base after it in the read. (This initial predictor ignores the position of the base of interest within a homopolymer run.)
- signal_decay : This is the proportion by which the highest intensity associated with the current base is less than the highest intensity associated with the first-cycle base. (The value is thresholded to lie in the intuitively expected range 0 to 1; an increase in the highest intensity, or negative intensities, which can result from the scaling of the intensities by Bustard, might otherwise produce a value outside this range.)
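The three development predictors can be sketched in the same spirit. Again this is a hedged illustration rather than the pipeline's own code: the function signatures are hypothetical, the window for max_local_unchastity is assumed to be truncated at the ends of the read (the text above does not say how the ends are handled), and the first-cycle highest intensity is assumed to be positive.

```python
# Illustrative sketch only; signatures and edge handling are assumptions.

def max_local_unchastity(unchastities, index, half_window=2):
    """Maximum unchastity in a window centred on index.

    Two neighbours on either side are considered; the window is simply
    truncated at the read ends (an assumption made for this sketch).
    """
    lo = max(0, index - half_window)
    hi = min(len(unchastities), index + half_window + 1)
    return max(unchastities[lo:hi])


def homopol_len(calls, index):
    """Length of the run of identical base calls containing index."""
    base = calls[index]
    start = index
    while start > 0 and calls[start - 1] == base:
        start -= 1
    end = index
    while end + 1 < len(calls) and calls[end + 1] == base:
        end += 1
    return end - start + 1


def signal_decay(first_cycle_max, current_max):
    """Proportional drop in highest intensity since the first cycle.

    Assumes first_cycle_max is positive; the result is clamped to [0, 1].
    """
    decay = (first_cycle_max - current_max) / first_cycle_max
    return min(1.0, max(0.0, decay))
```

With the calls "ACGGGT", homopol_len returns 3 for each of the three G positions and 1 elsewhere, which shows that the position within the run is ignored. signal_decay(1000.0, 1200.0) is clamped to 0 and signal_decay(1000.0, -50.0) is clamped to 1, covering the intensity increase and negative intensity cases mentioned above.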