cluster module

class mavis.cluster.cluster.BreakpointPairGroupKey

Bases: mavis.cluster.cluster.BreakpointPairGroupKey

mavis.cluster.cluster.all_pair_group_keys(pair, explicit_strand=False)
mavis.cluster.cluster.merge_breakpoint_pairs(input_pairs, cluster_radius=200, cluster_initial_size_limit=25, verbose=False)

Merges breakpoint pairs in a two-step process:

  1. All ‘small’ events (see cluster_initial_size_limit) are merged as the union of all events that fall within the cluster_radius.

  2. Each remaining event is given the ‘best’ merge with an existing node that lies within cluster_radius of it; if there is no such node, the event is added as an unmerged node. Events in this second phase are processed in order of total breakpoint interval size, smallest to largest.

Parameters:
  • input_pairs (list of BreakpointPair) – the pairs to be merged
  • cluster_radius (int) – maximum distance between events for them to be merged
  • cluster_initial_size_limit (int) – maximum size of breakpoint intervals allowed in the first merging phase
Returns: mapping of merged breakpoint pairs to the input pairs used in the merge

Return type: dict of list of BreakpointPair by BreakpointPair
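
The way inputs are divided between the two phases can be illustrated with a short sketch. It is not the MAVIS implementation: each pair is modelled as two inclusive (start, end) tuples rather than a BreakpointPair, the helper names are hypothetical, and it assumes an event counts as ‘small’ only when both of its breakpoint intervals are within cluster_initial_size_limit.

```python
def interval_size(interval):
    """Length of an inclusive (start, end) interval."""
    start, end = interval
    return end - start + 1


def partition_for_merging(pairs, cluster_initial_size_limit=25):
    """Split pairs into first-phase ('small') and second-phase inputs.

    Assumption: a pair is 'small' when both of its intervals are no
    larger than cluster_initial_size_limit.
    """
    small, remaining = [], []
    for pair in pairs:
        if all(interval_size(iv) <= cluster_initial_size_limit for iv in pair):
            small.append(pair)
        else:
            remaining.append(pair)
    # second-phase events are handled from smallest to largest total size
    remaining.sort(key=lambda pair: sum(interval_size(iv) for iv in pair))
    return small, remaining


example_pairs = [
    ((100, 110), (500, 505)),  # sizes 11 and 6: first phase
    ((100, 400), (500, 900)),  # sizes 301 and 401: second phase, total 702
    ((120, 125), (480, 560)),  # sizes 6 and 81: second phase, total 87
]
small, remaining = partition_for_merging(example_pairs)
```

Here `small` holds only the first pair, and `remaining` lists the third pair before the second because its total interval size is smaller.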

mavis.cluster.cluster.merge_by_union(input_pairs, group_key, weight_adjustment=10, cluster_radius=200)

For a given set of breakpoint pairs, merges the union of all pairs that lie within the given distance (cluster_radius) of each other.
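
One way to picture merging by union is as transitive grouping: if A is close to B and B is close to C, all three end up in one group even when A and C are far apart. The sketch below shows that idea with a small union-find. It is an illustration under assumptions, not the library's code: each pair is reduced to a hypothetical (x, y) point (one coordinate per breakpoint), and two points count as ‘within the radius’ when both coordinates differ by at most cluster_radius.

```python
def group_within_radius(points, cluster_radius=200):
    """Group point indices so that points linked by a chain of
    neighbours (each step within cluster_radius) share a group."""
    parent = list(range(len(points)))

    def find(i):
        # follow parent links to the group root, compressing the path
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            # assumed distance rule: both coordinates within the radius
            if abs(x1 - x2) <= cluster_radius and abs(y1 - y2) <= cluster_radius:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


example_points = [(100, 1000), (250, 1100), (400, 1250), (5000, 9000)]
groups = group_within_radius(example_points)
# groups == [[0, 1, 2], [3]]
```

Points 0 and 2 are more than 200 apart, yet they share a group because point 1 is within the radius of both.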

mavis.cluster.cluster.merge_integer_intervals(*intervals, weight_adjustment=0)

Merges a set of integer intervals into a single interval whose center is the weighted mean of the centers of the input intervals. The weight of each interval is inversely proportional to its length. The length of the final interval is the average of the lengths of the input intervals, capped so that the result never extends beyond the union of the input intervals.

Parameters:
  • weight_adjustment (int) – added to each interval's length to reduce the weighting differences between small intervals
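
The description above can be worked through in a minimal sketch. The function names, the inclusive (start, end) tuple representation, and the rounding choices are assumptions made for illustration; the actual MAVIS functions may differ in those details.

```python
def weighted_mean_sketch(values, weights=None):
    """Weighted mean; equal weights are assumed when none are given."""
    if weights is None:
        weights = [1] * len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def merge_intervals_sketch(*intervals, weight_adjustment=0):
    """Merge inclusive (start, end) intervals into one interval."""
    if not intervals:
        raise ValueError("at least one interval is required")
    lengths = [end - start + 1 for start, end in intervals]
    centers = [(start + end) / 2 for start, end in intervals]
    # shorter intervals weigh more; the adjustment flattens differences
    # between intervals that are all small
    weights = [1 / (length + weight_adjustment) for length in lengths]
    center = weighted_mean_sketch(centers, weights)
    # half the span of an interval whose length is the mean input length
    half_span = (sum(lengths) / len(lengths) - 1) / 2
    # cap the result so it stays inside the union of the inputs
    union_start = min(start for start, _ in intervals)
    union_end = max(end for _, end in intervals)
    start = max(union_start, round(center - half_span))
    end = min(union_end, round(center + half_span))
    return start, end


merged = merge_intervals_sketch((1, 10), (5, 14))
# merged == (3, 12): equal lengths give equal weights, so the center is 7.5
```

With inputs of unequal length, such as (10, 10) and (1, 21), the short interval dominates the weighting and the merged interval stays centered near position 10.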
mavis.cluster.cluster.pair_key(pair)
mavis.cluster.cluster.weighted_mean(values, weights=None)