base module

class mavis.validate.base.Evidence(break1, break2, bam_cache, REFERENCE_GENOME, read_length, stdev_fragment_size, median_fragment_size, stranded=False, opposing_strands=None, untemplated_seq=None, data={}, classification=None, **kwargs)[source]

Bases: mavis.breakpoint.BreakpointPair

Parameters:
  • breakpoint_pair (BreakpointPair) – the breakpoint pair to collect evidence for
  • bam_cache (BamCache) – the bam cache (and associated file) to collect evidence from
  • REFERENCE_GENOME (dict of Bio.SeqRecord by str) – dict of reference sequence by template/chr name
  • data (dict) – a dictionary of data to associate with the evidence object
  • classification (SVTYPE) – the event type
  • protocol (PROTOCOL) – genome or transcriptome
assemble_contig(log=<function Evidence.<lambda>>)[source]

uses the split reads and the partners of the half-mapped reads to create a contig representing the sequence across the breakpoints

if the library is not strand specific, the sequences are sorted alphanumerically and only the first of each pair is kept (paired by sequence)
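
The sort-and-keep-first step for unstranded data can be sketched as follows. This is a minimal illustration, not the library's implementation: it assumes that the "pair" is a sequence and its reverse complement, and the helper names `reverse_complement` and `collapse_unstranded` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}
    return ''.join(complement[base] for base in reversed(seq.upper()))


def collapse_unstranded(sequences):
    """Keep one representative per (sequence, reverse complement) pair.

    Assumption: the pair is the sequence and its reverse complement, so
    the same fragment read from either strand collapses to one entry.
    """
    kept = set()
    for seq in sequences:
        seq = seq.upper()
        # sort the two members of the pair and keep only the first
        kept.add(min(seq, reverse_complement(seq)))
    return sorted(kept)


unique = collapse_unstranded(['AACG', 'CGTT', 'GGGA'])
```

Here `AACG` and `CGTT` are reverse complements of each other, so only one of them survives the collapse.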

collect_compatible_flanking_pair(read, mate, compatible_type)[source]
collect_flanking_pair(read, mate)[source]

checks if a given read meets the minimum quality criteria to be counted as evidence and stored as support for this event

Parameters: read (pysam.AlignedSegment) – the read to add
Raises: UserWarning – the read does not support this event or does not pass quality filters

see theory - types of flanking evidence

collect_spanning_read(read)[source]

spanning read: a read covering BOTH breakpoints

This is only applicable to small events. There is no need to look for soft-clipped reads here, since they will already have been collected
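
The "covering BOTH breakpoints" condition can be illustrated with plain coordinates. This is a sketch under stated assumptions: both breakpoints lie on the same reference sequence, intervals are inclusive, and the function name `spans_both_breakpoints` is hypothetical. The actual method takes a pysam.AlignedSegment and may apply further filters.

```python
def spans_both_breakpoints(read_start, read_end, break1, break2):
    """Return True if the aligned interval covers both breakpoint intervals.

    read_start, read_end: inclusive reference coordinates of the alignment.
    break1, break2: (start, end) tuples, inclusive, on the same reference.
    """
    leftmost = min(break1[0], break2[0])
    rightmost = max(break1[1], break2[1])
    return read_start <= leftmost and read_end >= rightmost


covers = spans_both_breakpoints(100, 250, (120, 120), (200, 205))
partial = spans_both_breakpoints(100, 150, (120, 120), (200, 205))
```

A read only long enough to reach the first breakpoint (the second call) does not qualify, which is why spanning reads only arise for small events.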

collect_split_read(read, first_breakpoint)[source]

adds a split read if it passes the criteria filters and raises a warning if it does not

Parameters:
  • read (pysam.AlignedSegment) – the read to add
  • first_breakpoint (bool) – add to the first breakpoint (or second if false)
Raises:
  • UserWarning – the read does not support this breakpoint or does not pass quality filters
  • AttributeError – orientation wasn’t specified for the breakpoint
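
Because a rejected read is signalled by raising UserWarning, a caller will typically wrap each call in try/except. The sketch below shows that pattern with a stand-in collector. `collect_all` and `fake_collect` are hypothetical names, and the mapping-quality threshold is invented for the illustration. With a real Evidence object, something like `lambda r: evidence.collect_split_read(r, True)` could presumably be passed instead.

```python
def collect_all(collect, reads):
    """Call collect(read) for each read and count accepted vs. rejected reads.

    A read counts as rejected when collect() raises UserWarning.
    """
    accepted, rejected = 0, 0
    for read in reads:
        try:
            collect(read)
            accepted += 1
        except UserWarning:
            rejected += 1
    return accepted, rejected


def fake_collect(read):
    """Stand-in for a collector: rejects reads with low mapping quality."""
    if read['mapq'] < 20:  # threshold is an assumption for this example
        raise UserWarning('read does not pass quality filters')


counts = collect_all(fake_collect, [{'mapq': 60}, {'mapq': 5}, {'mapq': 30}])
```

An AttributeError (orientation not specified) is deliberately not caught here, since it indicates a problem with the breakpoint rather than with an individual read.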
compute_fragment_size(read, mate)[source]
copy()[source]
decide_sequenced_strand(reads)[source]
flatten()[source]
get_bed_repesentation()[source]
inner_window1

Interval – the window where evidence will be gathered for the first breakpoint

inner_window2

Interval – the window where evidence will be gathered for the second breakpoint

load_evidence(log=<function Evidence.<lambda>>)[source]

opens the associated bam file, reads and stores the evidence, and does some preliminary read-quality filtering

Todo

support gathering evidence for small structural variants

max_expected_fragment_size
min_expected_fragment_size
outer_window1

Interval – the window where evidence will be gathered for the first breakpoint

see theory - calculating the evidence window

outer_window2

Interval – the window where evidence will be gathered for the second breakpoint

see theory - calculating the evidence window

putative_event_types()[source]
standardize_read(read)[source]
supporting_reads()[source]

convenience method to return all flanking, split and spanning reads associated with an evidence object