assemble module
-
class mavis.assemble.DeBruijnGraph(data=None, **attr)
Bases: networkx.classes.digraph.DiGraph
Wrapper for a basic digraph that enforces edge weights.
Initialize a graph with edges, name, graph attributes.
Parameters: - data (input graph) – Data to initialize graph. If data=None (default) an empty graph is created. The data can be an edge list, or any NetworkX graph object. If the corresponding optional Python packages are installed the data can also be a NumPy matrix or 2d ndarray, a SciPy sparse matrix, or a PyGraphviz graph.
- name (string, optional (default='')) – An optional name for the graph.
- attr (keyword arguments, optional (default= no attributes)) – Attributes to add to graph as key=value pairs.
See also
convert
Examples
>>> G = nx.Graph()  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G = nx.Graph(name='my graph')
>>> e = [(1,2),(2,3),(3,4)]  # list of edges
>>> G = nx.Graph(e)
Arbitrary graph attribute pairs (key=value) may be assigned
>>> G=nx.Graph(e, day="Friday")
>>> G.graph
{'day': 'Friday'}
-
add_edge(n1, n2, freq=1)
Add the given edge to the graph. If the edge already exists, add freq to its existing frequency count.
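The accumulation behaviour described for add_edge can be illustrated with a minimal sketch. This is not the MAVIS implementation: the EdgeCounter class below is a hypothetical stand-in that uses a plain dictionary instead of a networkx DiGraph, so that it runs without dependencies.

```python
from collections import defaultdict


class EdgeCounter:
    """Hypothetical stand-in for a weighted digraph; not the MAVIS class."""

    def __init__(self):
        # maps (source, target) -> accumulated frequency
        self.freq = defaultdict(int)

    def add_edge(self, n1, n2, freq=1):
        # an unseen edge starts at 0, so the first call stores freq itself;
        # later calls add to the running total
        self.freq[(n1, n2)] += freq


g = EdgeCounter()
g.add_edge('ATC', 'TCG')          # new edge, frequency 1
g.add_edge('ATC', 'TCG', freq=2)  # existing edge, frequency becomes 3
g.add_edge('TCG', 'CGA')          # a different edge, frequency 1
print(dict(g.freq))
```

Because direction matters in a digraph, ('ATC', 'TCG') and ('TCG', 'ATC') would be counted as separate edges in this sketch.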
-
mavis.assemble.assemble(sequences, assembly_max_kmer_size=None, assembly_min_edge_weight=3, assembly_min_match_quality=0.95, assembly_min_read_mapping_overlap=None, assembly_min_contig_length=None, assembly_min_exact_match_to_remap=6, assembly_max_paths=20, assembly_max_kmer_strict=False, log=<function <lambda>>)
For a set of sequences, creates a DeBruijnGraph, simplifies trailing and leading paths where edges fall below a weight threshold, and then returns all possible unitigs/contigs.
Parameters: - sequences (list of str) – a list of strings/sequences to assemble
- assembly_max_kmer_size (int) – the size of the kmer to use
- assembly_min_edge_weight (int) – see assembly_min_edge_weight
- assembly_min_match_quality (float) – percent match for re-aligned reads to contigs
- assembly_min_read_mapping_overlap (int) – the minimum amount of overlap required when aligning reads to contigs
- assembly_max_paths (int) – see assembly_max_paths
Returns: a list of putative contigs
Return type: list
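As a rough illustration of the first step described above (turning sequences into weighted De Bruijn edges), here is a minimal sketch. It assumes the common construction in which each k-mer contributes one edge from its (k-1)-length prefix to its (k-1)-length suffix. The helper names kmers and build_edge_weights are hypothetical, and the sketch does not cover path simplification, contig extraction, or read remapping.

```python
from collections import Counter


def kmers(seq, k):
    """Return every length-k substring of seq, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_edge_weights(sequences, k):
    """Count De Bruijn edges: each k-mer links its prefix to its suffix."""
    weights = Counter()
    for seq in sequences:
        for kmer in kmers(seq, k):
            weights[(kmer[:-1], kmer[1:])] += 1
    return weights


reads = ['ATCGGA', 'TCGGAT', 'ATCGGA']
weights = build_edge_weights(reads, k=4)

# edges whose weight falls below the threshold are the candidates for
# simplification; how they are pruned is not shown in this sketch
min_edge_weight = 3
low = sorted(edge for edge, w in weights.items() if w < min_edge_weight)
print(weights)
print(low)
```

Here the two edges in the middle of the overlap are each seen three times, while the edges at either end of the reads fall below the threshold of 3.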
-
mavis.assemble.digraph_connected_components(graph, subgraph=None)
The networkx module does not support deriving connected components from digraphs (only simple graphs). This function assumes that connection != reachable, which means there is no difference between connected components in a simple graph and a digraph.
Parameters: graph (networkx.DiGraph) – the input graph to gather components from
Returns: a list of components, each of which is a list of node names
Return type: list of list
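The idea that connection does not require reachability can be sketched by ignoring edge direction and grouping nodes with a breadth-first search. This is an illustrative, dependency-free version, not the MAVIS code: the function name components_ignoring_direction and its edge-list input are assumptions made for the example. The grouping it produces is what is usually called the weakly connected components of a digraph.

```python
from collections import defaultdict, deque


def components_ignoring_direction(edges):
    """Group nodes into components, treating each directed edge as undirected."""
    neighbours = defaultdict(set)
    for src, tgt in edges:
        # record the link in both directions so direction no longer matters
        neighbours[src].add(tgt)
        neighbours[tgt].add(src)

    seen = set()
    components = []
    for start in list(neighbours):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components


# 'a' cannot reach 'c' by following edge directions, yet both touch 'b',
# so all three belong to one component
edges = [('a', 'b'), ('c', 'b'), ('d', 'e')]
result = components_ignoring_direction(edges)
print(result)
```

The return value mirrors the documented shape: a list of components, each a list of node names.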