Development¶
Install¶
Clone the repository and switch to the development branch
>>> git clone https://svn.bcgsc.ca/bitbucket/scm/svia/mavis.git
>>> cd mavis
>>> git checkout develop
Set up a python virtual environment. If you are developing in python, a virtual environment can be incredibly helpful: it keeps the project's dependencies separate from the system python and can be used to generate the requirements.txt file that pip uses for install. Instructions for setting up the environment are below
>>> pip install virtualenv
>>> virtualenv venv
>>> source venv/bin/activate
(venv) >>>
Install the MAVIS python package (pip is currently needed as well because some dependencies are stored in svn)
(venv) >>> python setup.py develop
Run the unit tests and compute code coverage
(venv) >>> python setup.py nosetests
Make the user manual
(venv) >>> cd docs
(venv) >>> make html
The contents of the user manual can then be viewed by opening build/html/index.html in any available web browser (e.g. google-chrome or firefox)
Non-python dependencies¶
Aligner (blat)¶
In addition to the python package dependencies, MAVIS also requires an aligner to be installed. Currently the only aligner supported is blat. For MAVIS to run successfully, blat must be installed and accessible on the path. If blat is installed in a non-standard location, you may find it useful to edit the PATH environment variable
>>> export PATH=/path/to/directory/containing/blat/binary:$PATH
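Before running MAVIS it can be useful to confirm that the blat binary really is visible on the path. The snippet below is a minimal sketch using only the python standard library. The helper name `find_missing_tools` is hypothetical and is not part of MAVIS.

```python
import shutil


def find_missing_tools(tools):
    """
    Args:
        tools (list of str): names of the executables to look for

    Returns:
        list of str: the names which could not be found on the current PATH
    """
    # shutil.which returns None when no matching executable is found on PATH
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools(['blat'])
if missing:
    print('not found on PATH:', ', '.join(missing))
else:
    print('all required tools were found')
```

If blat is reported as missing, re-run the check after exporting the PATH as shown above.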
Samtools¶
Samtools is only used for sorting and indexing the intermediate output bam files. Eventually this will hopefully be accomplished through pysam alone.
Guidelines for Contributors¶
- In general, follow the pep8 style guide using a maximum line width of 120 characters
- docstrings should follow the sphinx google code style
- any column name which may appear in any of the intermediate or final output files must be defined in COLUMNS
Formatting Types in docstrings¶
If you want to be more explicit with nested types, the following conventions are used throughout the code
- dictionary: d = {<key>: <value>} becomes dict of <value> by <key>
- list: l = [1, 2, 3] becomes list of int
- mixed: d = {'a': [1, 2, 3], 'b': [4, 5, 6]} becomes dict of list of int by str
- tuples: ('a', 1) becomes tuple of str and int
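As an illustration of these conventions in a google style docstring, the sketch below documents a small function that takes a list of tuples and returns a dict of lists. The function `group_positions_by_chromosome` is a made-up example and does not exist in MAVIS.

```python
def group_positions_by_chromosome(pairs):
    """
    Group positions by the chromosome they belong to.

    Args:
        pairs (list of tuple of str and int): (chromosome, position) pairs

    Returns:
        dict of list of int by str: the positions for each chromosome, in input order
    """
    grouped = {}
    for chrom, pos in pairs:
        # create the list on first sight of a chromosome, then append to it
        grouped.setdefault(chrom, []).append(pos)
    return grouped


print(group_positions_by_chromosome([('1', 10), ('2', 5), ('1', 3)]))
```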
Unit Tests¶
- all new code must have unit tests in the tests subdirectory
- in general for assertEqual statements, the expected value is given first
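A minimal sketch of a test module following these guidelines is shown below. The function under test, `reverse_complement`, is defined inline as a hypothetical stand-in so that the example is self-contained. In practice the test would import the code being tested from the package.

```python
import unittest


def reverse_complement(seq):
    """
    Hypothetical function under test (defined here only to keep the example self-contained).

    Args:
        seq (str): a DNA sequence

    Returns:
        str: the reverse complement of the sequence, in upper case
    """
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}
    return ''.join(complement[base] for base in reversed(seq.upper()))


class TestReverseComplement(unittest.TestCase):
    def test_simple_sequence(self):
        # the expected value is given first, the computed value second
        self.assertEqual('TTGCAT', reverse_complement('ATGCAA'))

    def test_lower_case_input(self):
        self.assertEqual('ACGT', reverse_complement('acgt'))

    def test_empty_sequence(self):
        self.assertEqual('', reverse_complement(''))
```

Saved in the tests subdirectory, a module like this can be run on its own with python -m unittest, in addition to the nosetests command shown in the install section.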
Major Assumptions¶
Some assumptions have been made when developing this project. The major ones have been listed here to facilitate debugging/development if any of these are violated in the future.
- The input bam reads store the sequence with respect to the positive/forward strand and do not store the reverse complement.
- The distribution of the fragment sizes in the bam file approximately follows a normal distribution.
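To make the second assumption concrete: if fragment sizes are approximately normal, a fragment can be called unexpected when it falls more than a few standard deviations from the centre of the distribution. The sketch below illustrates that idea with the python standard library. It is not the calculation MAVIS performs, and the function names and the default of 3 standard deviations are assumptions chosen for illustration.

```python
import statistics


def fragment_size_bounds(fragment_sizes, stdev_count=3):
    """
    Args:
        fragment_sizes (list of int): the observed fragment sizes
        stdev_count (float): number of standard deviations still considered normal (illustrative default)

    Returns:
        tuple of float and float: the lower and upper bounds of the expected fragment sizes
    """
    centre = statistics.median(fragment_sizes)
    spread = statistics.stdev(fragment_sizes)  # sample standard deviation
    return centre - stdev_count * spread, centre + stdev_count * spread


def is_abnormal(fragment_size, bounds):
    """
    Args:
        fragment_size (int): the fragment size to check
        bounds (tuple of float and float): the lower and upper bounds of the expected sizes

    Returns:
        bool: True if the fragment size falls outside the bounds
    """
    lower, upper = bounds
    return fragment_size < lower or fragment_size > upper


bounds = fragment_size_bounds([390, 400, 410, 395, 405, 400])
print(is_abnormal(400, bounds), is_abnormal(5000, bounds))
```

If the real fragment size distribution is strongly skewed or multimodal, bounds computed this way stop being meaningful, which is why the assumption is worth revisiting when debugging unexpected calls.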
Current Limitations¶
- Assembling contigs will always fail for repeat sequences because repeats are not resolved. Unlike traditional assemblies, we cannot assume even input coverage because only a select portion of the reads is assembled.
- Currently no attempt is made to group/pair single events into complex events.
- Transcriptome validation uses a collapsed model of all overlapping transcripts and is not isoform-specific. Allowing for isoform-specific validation would be computationally expensive but may be considered as an optional setting for future releases.
MAVIS Package Documentation¶
holds submodules related to structural variants
Development Goals¶
Features to be implemented
Todo
return multiple events not just the major event
(from the docstring of mavis.breakpoint.BreakpointPair.call_breakpoint_pair)
Todo
add markers for exons with abrogated splice sites
(from the docstring of mavis.illustrate.elements.draw_exon)