stats module
class mavis.bam.stats.BamStats(median_fragment_size, stdev_fragment_size, read_length)
Bases: object
class mavis.bam.stats.Histogram
Bases: dict
mavis.bam.stats.compute_genome_bam_stats(bam_file_handle, sample_bin_size, sample_size, log=<function devnull>, min_mapping_quality=1, sample_cap=10000, distribution_fraction=0.99)
Computes various statistical measures relating to the input BAM file.
Parameters:
- bam_file_handle (pysam.AlignmentFile) – the input BAM file handle
- sample_bin_size (int) – how large to make the sample bin (in bp)
- sample_size (int) – the number of genes to compute stats over
- log (callable) – outputs logging information
- min_mapping_quality (int) – the minimum mapping quality for a read to be used
- sample_cap (int) – maximum number of reads to collect for any given sample region
- distribution_fraction (float) – the proportion of the distribution to use in computing stdev
Returns: the fragment size median, the fragment size standard deviation, and the read length, bundled in a single object
Return type:
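The distribution_fraction parameter is described above only as "the proportion of the distribution to use in computing stdev". One plausible reading, sketched below, is that the deviation is measured from the median and only the fraction of observations closest to the median is kept, so that a few extreme fragment sizes do not inflate the estimate. This is an illustrative sketch under that assumption, not the MAVIS implementation; the helper names expand and trimmed_stdev and the sample histogram are hypothetical.

```python
import math
import statistics


def expand(hist):
    """Turn a {fragment_size: count} mapping into a sorted list of sizes."""
    return sorted(size for size, count in hist.items() for _ in range(count))


def trimmed_stdev(hist, median, fraction):
    """Root mean squared deviation from the median (hypothetical helper).

    Only the `fraction` of observations closest to the median is used.
    """
    errors = sorted((size - median) ** 2 for size in expand(hist))
    if not errors:
        raise ValueError("histogram is empty")
    keep = max(1, int(round(len(errors) * fraction)))
    kept = errors[:keep]
    return math.sqrt(sum(kept) / len(kept))


# Made-up fragment sizes: nine typical values and one extreme outlier.
hist = {300: 4, 310: 3, 290: 2, 5000: 1}
median = statistics.median(expand(hist))

full = trimmed_stdev(hist, median, 1.0)     # outlier included
trimmed = trimmed_stdev(hist, median, 0.9)  # outlier excluded

print(median, round(full, 1), round(trimmed, 2))
```

With all observations included, the single outlier dominates the result. Dropping the farthest 10% brings the estimate back to the scale of the typical fragments, which is the likely motivation for a default below 1.0.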
mavis.bam.stats.compute_transcriptome_bam_stats(bam_file_handle, annotations, sample_size, log=<function devnull>, min_mapping_quality=1, stranded=True, sample_cap=10000, distribution_fraction=0.97)
Computes various statistical measures relating to the input BAM file.
Parameters:
- bam_file_handle (pysam.AlignmentFile) – the input BAM file handle
- annotations (object) – see load_reference_genes()
- sample_size (int) – the number of genes to compute stats over
- log (callable) – outputs logging information
- min_mapping_quality (int) – the minimum mapping quality for a read to be used
- stranded (bool) – if True then reads must match the gene strand
- sample_cap (int) – maximum number of reads to collect for any given sample region
- distribution_fraction (float) – the proportion of the distribution to use in computing stdev
Returns: the fragment size median, the fragment size standard deviation, and the read length, bundled in a single object
Return type:
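The min_mapping_quality, stranded, and sample_cap parameters above each restrict which reads contribute to the statistics for a sampled region. The sketch below shows one way the three filters might combine. It is a simplified illustration, not the MAVIS code: the Read record and the collect_reads helper are hypothetical stand-ins, and the strand check assumes a read matches the gene when both have the same orientation, which ignores library protocol and read pairing.

```python
from collections import namedtuple

# Hypothetical read record; real code would inspect pysam.AlignedSegment objects.
Read = namedtuple("Read", ["name", "mapping_quality", "is_reverse"])


def collect_reads(reads, gene_is_reverse, min_mapping_quality=1,
                  stranded=True, sample_cap=10000):
    """Collect reads for one region, applying the three filters in turn."""
    kept = []
    for read in reads:
        # Skip reads that map too ambiguously.
        if read.mapping_quality < min_mapping_quality:
            continue
        # Simplifying assumption: same orientation means same strand.
        if stranded and read.is_reverse != gene_is_reverse:
            continue
        kept.append(read)
        # Stop once the region has contributed its maximum number of reads.
        if len(kept) >= sample_cap:
            break
    return kept


reads = [
    Read("a", 0, False),   # fails the mapping quality filter
    Read("b", 30, False),
    Read("c", 30, True),   # opposite orientation to the gene
    Read("d", 60, False),
    Read("e", 60, False),  # dropped when sample_cap=2
]

capped = collect_reads(reads, gene_is_reverse=False, sample_cap=2)
unstranded = collect_reads(reads, gene_is_reverse=False, stranded=False)

print([r.name for r in capped])
print([r.name for r in unstranded])
```

Capping per region keeps a single highly expressed gene from dominating the sample, while the quality and strand filters remove reads whose placement is less trustworthy.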