mavis.validate package
Module contents
Sub-package Documentation
The validation sub-package pulls supporting reads from the bam file and re-calls events from that evidence, reporting the calls in a standard notation.
Types of Output Files
The validation step writes a variety of intermediate output files. Users can use them to “drill down” further into events, and developers can use them for debugging when adding new features.
expected name/suffix | file type/format | content |
---|---|---|
*.raw_evidence.bam | bam | raw evidence |
*.contigs.bam | bam | aligned contigs |
*.evidence.bed | bed | evidence collection window regions |
*.validation-passed.bed | bed | validated event positions |
*.validation-failed.tab | text/tabbed | failed events |
*.validation-passed.tab | text/tabbed | validated events |
*.contigs.fa | fasta | assembled contigs |
*.contigs.blat_out.pslx | pslx | results from blatting contigs |
*.igv.batch | IGV batch file | igv batch file |
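When browsing an output directory, it can help to map each file back to the suffixes listed above. The following is a minimal sketch of such a lookup. The names `OUTPUT_SUFFIXES` and `describe_output` are hypothetical and are not part of mavis; only the suffixes and descriptions come from the table.

```python
# Suffix lookup copied from the table of output files.
# OUTPUT_SUFFIXES and describe_output are hypothetical names for illustration;
# they are not part of the mavis package.
OUTPUT_SUFFIXES = {
    '.raw_evidence.bam': ('bam', 'raw evidence'),
    '.contigs.bam': ('bam', 'aligned contigs'),
    '.evidence.bed': ('bed', 'evidence collection window regions'),
    '.validation-passed.bed': ('bed', 'validated event positions'),
    '.validation-failed.tab': ('text/tabbed', 'failed events'),
    '.validation-passed.tab': ('text/tabbed', 'validated events'),
    '.contigs.fa': ('fasta', 'assembled contigs'),
    '.contigs.blat_out.pslx': ('pslx', 'results from blatting contigs'),
    '.igv.batch': ('IGV batch file', 'igv batch file'),
}


def describe_output(filename):
    """Return (file type, content) for a validation output file, or None."""
    for suffix, description in OUTPUT_SUFFIXES.items():
        if filename.endswith(suffix):
            return description
    return None  # not one of the documented output files
```

For example, `describe_output('sample.validation-passed.tab')` gives the file type and content for the validated events file, while an unrecognised name gives `None`.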
Algorithm Overview
(For each breakpoint pair)
- Calculate the window/region to read from the bam and collect evidence
- Store evidence (flanking read pair, half-mapped read, spanning read, split read, compatible flanking pairs) that matches the expected event type and position
- Assemble a contig from the collected reads (see theory - assembling contigs)
Generate a fasta file containing all the contig sequences
Align contigs to the reference genome (currently blat is used to perform this step)
Make the final event calls
(For each breakpoint pair)
- Call by contig
- If that fails, call by spanning read
- If that fails, call by split read
- If that fails, call by mixed split read / flanking read pair
- If that fails, call by flanking read pair (see theory - calling breakpoints by flanking evidence)
- If that fails, the event is marked as failed
(For each breakpoint pair)
- Determine the amount of support for the more specific call (see theory - determining flanking support)
Output new calls, evidence, contigs, etc
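The order in which the final event calls are attempted (contig, then spanning read, then split read, then mixed, then flanking read pair) is a simple fallback cascade. The sketch below illustrates that control flow only. It assumes each strategy is a function that returns a call or `None`; the names `call_event`, `make_caller` and `CALL_ORDER`, and the dictionary used as evidence, are hypothetical placeholders rather than the mavis implementation.

```python
def call_event(evidence, callers):
    """Try each calling strategy in order; return the first successful call.

    `callers` is an ordered list of (name, function) pairs. Each function
    takes the evidence and returns a call, or None when it cannot make one.
    """
    for name, caller in callers:
        call = caller(evidence)
        if call is not None:
            return name, call
    return None  # every strategy failed, so the event is failed


def make_caller(key):
    """Build a placeholder strategy (assumption for illustration only).

    The returned function succeeds only when the evidence dictionary
    holds a value under `key`.
    """
    return lambda evidence: evidence.get(key)


# Order taken from the list of call methods in the overview.
CALL_ORDER = [
    (name, make_caller(name))
    for name in (
        'contig',
        'spanning read',
        'split read',
        'mixed split read / flanking read pair',
        'flanking read pair',
    )
]
```

With this layout, `call_event({'split read': (100, 200)}, CALL_ORDER)` falls through the contig and spanning read strategies and returns the split read call, and an empty evidence dictionary returns `None`, which corresponds to a failed event.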