HLAminer

HLAminer is software for predicting HLA types from next-generation shotgun sequencing (NGS) read data. It supports two approaches: direct read alignment and targeted de novo assembly of sequence reads.

It has been used successfully to predict expressed HLA genes in the large TCGA (The Cancer Genome Atlas) cancer cohort. 

HLAminer uses TASR as its main assembly engine. TASR is in turn derived from SSAKE, the first published de novo assembler for short sequence reads. The methodology is described in a peer-reviewed manuscript published in Genome Medicine.

Targeted assembly

HLA prediction by targeted assembly of short sequence reads (HPTASR) performs targeted de novo assembly of HLA NGS reads and aligns the resulting contigs to reference HLA alleles from the IMGT/HLA sequence repository. It runs on commodity hardware with standard specifications (2GB RAM, 2GHz). This short clip shows the process: https://www.youtube.com/watch?v=j-g8Geh5ST8

Read alignment

HLA prediction from direct read alignment (HPRA) is conceptually simpler and faster to execute, since paired reads are aligned up front to reference HLA alleles. However, with 100 nt paired reads, the sensitivity and specificity of HLA allele detection are currently lower with this method than with targeted HLA assembly.

Alignments from either method are processed by HLAminer, which derives HLA class I and II predictions by scoring each candidate allele that bears alignments and evaluating its probability.
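To make the idea of scoring candidate alleles from alignments more concrete, here is a minimal Python sketch. It is not HLAminer's implementation: it uses a plain count of aligned reads as a stand-in for the real score, and it assumes that reference sequence names in the SAM records look like A*02:01 (gene, asterisk, allele fields). The function names tally_alleles and top_per_gene are hypothetical.

```python
from collections import Counter, defaultdict


def tally_alleles(sam_lines):
    """Count aligned records per reference allele from SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):        # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:            # SAM records have 11 mandatory fields
            continue
        flag, ref = int(fields[1]), fields[2]
        if flag & 0x4 or ref == "*":    # 0x4 flag bit marks an unmapped read
            continue
        counts[ref] += 1
    return counts


def top_per_gene(counts, keep=2):
    """Group alleles by gene (assumed: the text before '*') and keep the
    `keep` alleles with the most aligned reads; ties break by allele name."""
    by_gene = defaultdict(list)
    for allele, n in counts.items():
        gene = allele.split("*", 1)[0]
        by_gene[gene].append((allele, n))
    return {
        gene: sorted(hits, key=lambda h: (-h[1], h[0]))[:keep]
        for gene, hits in by_gene.items()
    }
```

Because tally_alleles accepts any iterable of lines, it can be handed an open .sam file or sys.stdin. Keeping two candidates per gene reflects the fact that an individual carries at most two alleles of each HLA gene.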

If you use HLAminer in your research, please cite:

Warren RL, Choe G, Freeman DJ, Castellarin M, Munro S, Moore R, Holt RA. 2012. Derivation of HLA types from shotgun sequence datasets. Genome Med. 4:95.

Current Release


All Releases

Version 1.4, released Oct 06, 2018 (license: BCCA, academic use; status: final)
  • Ability to stream the (.sam) output of modern read aligners directly into HLAminer
  • Initial support for predicting HLA types from long nanopore reads, such as those from Oxford Nanopore Technologies
  • Better information/sub-routine/date tracking in HLAminer

Version 1.3.1, released Oct 10, 2017 (license: BCCA, academic use; status: final)
  • Streamlined output
  • Minor bug fixes

Version 1.3, released Sep 29, 2015 (license: BCCA, academic use; status: final)
  • More concise HLA allele summary in HLAminer_HPTASR.csv and HLAminer_HPRA.csv (the associated .log is unchanged and lists all predictions)
  • Keeps the top two predictions per gene (highest-scoring by HLA group), and only the 'P' designated allele when the summary includes HLA sequences reported to have the same antigen binding domain
  • For the original output, refer to the HLAminer_v1-2.pl script included in the ./bin directory
  • A prediction example from MCF-7 PacBio RNA-seq reads is also provided

Version 1.2, released Feb 11, 2015 (license: BCCA, academic use; status: final)
  • All HLA sequence databases have been updated
  • Shell scripts that download HLA sequences corrected to reflect the change of location at EBI (i.e. the fasta sub folder)
  • Support added for predictions from direct alignment of single-end reads (-e option)

Version 1.1, released Sep 16, 2014 (license: BCCA, academic use; status: final)
  • Updated HLA databases
  • Streamlined code
  • Updated TASR dependency
  • Due to the clinical implications of HLAminer, the code is now released under the BC Cancer Agency Software License Agreement