Holt Lab

The Holt Lab uses cutting-edge tools and methodologies to investigate the biology of cancer from several different angles. Focusing on the immune system, the group has used deep sequencing to survey T cell repertoire diversity at the resolution of individual clonotypes and is now using these methodologies to explore the role of T cells in cancer. The group is also working to develop cancer immunotherapies that use engineered T cells to selectively deliver cytotoxic payloads, both to bolster the anti-cancer immune response and to enhance tumour cell killing. The group applies its expertise in DNA sequencing and computational analysis to investigate the role of infectious agents in cancer development, and was the first to demonstrate a strong link between the pathogen Fusobacterium nucleatum and colorectal cancer. Finally, the lab applies deep sequencing technologies to identify the spectrum of mutations in various cancer types, with a particular focus on tumour evolution and the identification of antigens for cancer vaccines.




We are located at Canada's Michael Smith Genome Sciences Centre, part of the British Columbia Cancer Research Institute.

675 West 10th Avenue 
Vancouver, British Columbia 
V5Z 1L3 



The Holt Lab is using deep sequencing methods to explore the role of T cells in cancer and how to enhance the anti-cancer immune response. The lab is particularly focused on developing new sequence-based approaches to T cell antigen discovery and characterization.

Cancer Genomes

Dr. Holt’s lab is using deep sequencing and novel computational methods to identify the spectrum of somatic mutations in various cancers, with a particular focus on tumour evolution and the identification of antigens for cancer vaccines.

Synthetic immunology

It has been recognized for nearly three decades that patients with tumours that are strongly infiltrated by T cells, in particular cytotoxic T cells, have better outcomes. We use computational approaches and targeted immunoassays in the lab to gain insights into the nature of the anti-cancer T cell response, and to determine how and why it varies among healthy individuals and among cancer patients. These studies inform our programs for pre-clinical and clinical development of genetically engineered T cell therapies, including chimeric antigen receptor (CAR-T) and recombinant T cell receptor (rTCR) therapies for cancer.

Selected Publications

A survey of Fusobacterium nucleatum genes modulated by host cell infection.

Microbial genomics, 2020
Cochrane, Kyla, Robinson, Avery V, Holt, Robert A, Allen-Vercoe, Emma
Here, we report comprehensive transcriptomic profiles from Fusobacterium nucleatum under conditions that mimic the first stages of bacterial infection in a highly differentiated adenocarcinoma epithelial cell line. Our transcriptomic adenocarcinoma approach allows us to measure the expression dynamics and regulation of bacterial virulence and response factors in real time, and is a novel strategy for clarifying the role of F. nucleatum infection in colorectal cancer (CRC) progression. Our data show that: (i) infection alters metabolic and functional pathways in F. nucleatum, allowing the bacterium to adapt to the host-imposed milieu; (ii) infection also stimulates the expression of genes required to help induce and promote a hypoxic and inflammatory microenvironment in the host; and (iii) F. nucleatum invasion occurs by a haematogenous route of infection. Our study identifies novel gene targets from F. nucleatum that are activated during invasion and which may aid in determining how this species invades and promotes disease within the human gastrointestinal tract. These invasion-specific genes may be useful as biomarkers for CRC progression in a host and could also assist in the development of new diagnostic tools and treatments (such as vaccines or small molecule drug targets), which will be able to combat infection and inflammation in the host while circumventing the potential problem of tolerization.

Rapid selection and identification of functional CD8+ T cell epitopes from large peptide-coding libraries.

Nature communications, 2019
Sharma, Govinda, Rive, Craig M, Holt, Robert A
Cytotoxic CD8+ T cells recognize and eliminate infected or malignant cells that present peptide epitopes derived from intracellularly processed antigens on their surface. However, comprehensive profiling of specific major histocompatibility complex (MHC)-bound peptide epitopes that are naturally processed and capable of eliciting a functional T cell response has been challenging. Here, we report a method for deep and unbiased T cell epitope profiling, using in vitro co-culture of CD8+ T cells together with target cells transduced with high-complexity, epitope-encoding minigene libraries. Target cells that are subject to cytotoxic attack from T cells in co-culture are isolated prior to apoptosis by fluorescence-activated cell sorting, and characterized by sequencing the encoded minigenes. We then validate this highly parallelized method using known murine T cell receptor/peptide-MHC pairs and diverse minigene-encoded epitope libraries. Our data thus suggest that this epitope profiling method allows unambiguous and sensitive identification of naturally processed and MHC-presented peptide epitopes.

Risks and Benefits of Chimeric Antigen Receptor T-Cell (CAR-T) Therapy in Cancer: A Systematic Review and Meta-Analysis.

Transfusion medicine reviews, 2019
Grigor, Emma J M, Fergusson, Dean, Kekre, Natasha, Montroy, Joshua, Atkins, Harold, Seftel, Matthew D, Daugaard, Mads, Presseau, Justin, Thavorn, Kednapa, Hutton, Brian, Holt, Robert A, Lalu, Manoj M
Promising efficacy results of chimeric antigen receptor (CAR) T-cell therapy have been tempered by safety considerations. Our objective was to comprehensively summarize the efficacy and safety of CAR-T cell therapy in patients with relapsed or refractory hematologic or solid malignancies. We searched MEDLINE, Embase, and the Cochrane Register of Controlled Trials from inception to November 21, 2017. Interventional studies investigating CAR-T cell therapy in patients with malignancies were included. Our primary outcome of interest was complete response (defined as the absence of detectable cancer). Two independent reviewers extracted relevant data, assessed risk of bias, and graded the quality of evidence using established methods. A total of 42 hematological malignancy studies and 18 solid tumor studies were included (913 participants). Of 486 evaluable hematologic patients, 54.4% [95% CI, 42.5%-65.9%] experienced complete response in 27 CD19 CAR-T cell therapy studies. Of 65 evaluable hematologic patients, 24.4% [95% CI, 9.4%-50.3%] experienced complete response in seven non-CD19 CAR-T cell therapy studies. Cytokine release syndrome was experienced by 55.3% [95% CI, 40.3%-69.4%] and neurotoxicity by 37.2% [95% CI, 28.6%-46.8%] of patients with hematologic malignancies. Of 86 evaluable solid tumor patients, 4.1% [95% CI, 1.6%-10.6%] experienced complete response in eight CAR-T cell therapy studies. Limitations include heterogeneity of study populations, as well as high risk of bias of included studies. There was a strong signal for efficacy of CAR-T cell therapy in patients with CD19+ hematologic malignancies and no overall signal in solid tumor trials published to date. These results will help inform patients, physicians, and other stakeholders of the benefits and risks associated with CAR-T cell therapy.

Twenty-Seven Tamoxifen-Inducible iCre-Driver Mouse Strains for Eye and Brain, Including Seventeen Carrying a New Inducible-First Constitutive-Ready Allele.

Genetics, 2019
Korecki, Andrea J, Hickmott, Jack W, Lam, Siu Ling, Dreolini, Lisa, Mathelier, Anthony, Baker, Oliver, Kuehne, Claudia, Bonaguro, Russell J, Smith, Jillian, Tan, Chin-Vern, Zhou, Michelle, Goldowitz, Daniel, Deussing, Jan M, Stewart, A Francis, Wasserman, Wyeth W, Holt, Robert A, Simpson, Elizabeth M
To understand gene function, the cre/loxP conditional system is the most powerful available for temporal and spatial control of expression in mouse. However, the research community requires more cre recombinase expressing transgenic mouse strains (cre-drivers) that restrict expression to specific cell types. To address these problems, a high-throughput method for large-scale production that produces high-quality results is necessary. Further, endogenous promoters need to be chosen that drive cell type specific expression, or we need to further focus the expression by manipulating the promoter. Here we test the suitability of using knock-ins at the docking site 5' of Hprt for rapid development of numerous cre-driver strains focused on expression in adulthood, using an improved cre tamoxifen inducible allele (icre/ERT2), and testing a novel inducible-first, constitutive-ready allele (icre/f3/ERT2/f3). In addition, we test two types of promoters either to capture an endogenous expression pattern (MaxiPromoters), or to restrict expression further using minimal promoter element(s) designed for expression in restricted cell types (MiniPromoters). We provide new cre-driver mouse strains with applicability for brain and eye research. In addition, we demonstrate the feasibility and applicability of using the locus 5' of Hprt for the rapid generation of substantial numbers of cre-driver strains. We also provide a new inducible-first constitutive-ready allele to further speed cre-driver generation. Finally, all these strains are available to the research community through The Jackson Laboratory.

Neoantigen characteristics in the context of the complete predicted MHC class I self-immunopeptidome.

Brown, Scott D, Holt, Robert A
The self-immunopeptidome is the repertoire of all self-peptides that can be presented by the combination of MHC variants carried by an individual, defined by their HLA genotype. Each MHC variant presents a distinct set of self-peptides, and the number of peptides in a set is variable. Subjects carrying MHC variants that present fewer self-peptides should also present fewer mutated peptides, resulting in decreased immune pressure on tumor cells. To explore this, we predicted peptide-MHC binding values using all unique 8-11mer human peptides in the human proteome and all available HLA class I allelic variants, for a total of 134 billion unique peptide-MHC binding predictions. From these predictions, we observe that most peptides are able to be presented by relatively few (< 250) MHC, while some can be presented by upwards of 1,500 different MHC. There is substantial overlap among the repertoires of peptides presented by different MHC and no relationship between the number of peptides presented and HLA population frequency. Nearly 30% of self-peptides are presentable by at least one MHC, leaving 70% of the human peptidome unsurveyed by T cells. We observed similar distributions of predicted self-immunopeptidome sizes in cancer subjects compared to controls, and within the pan-cancer population, predicted self-immunopeptidome size combined with mutational load to predict survival. Self-immunopeptidome analysis revealed evidence for tumor immunoediting and identified specific peptide positions that most influence immunogenicity. Because self-immunopeptidome size is defined by HLA genotypes and approximates neoantigen load, HLA genotyping could offer a rapid predictive biomarker for response to immunotherapy.

A library-based screening method identifies neoantigen-reactive T cells in peripheral blood prior to relapse of ovarian cancer.

Martin, Spencer D, Wick, Darin A, Nielsen, Julie S, Little, Nicole, Holt, Robert A, Nelson, Brad H
Mutated cancer antigens, or neoantigens, represent compelling immunological targets and appear to underlie the success of several forms of immunotherapy. While there are anecdotal reports of neoantigen-specific T cells being present in the peripheral blood and/or tumors of cancer patients, effective adoptive cell therapy (ACT) against neoantigens will require reliable methods to isolate and expand rare, neoantigen-specific T cells from clinically available biospecimens, ideally prior to clinical relapse. Here, we addressed this need using "mini-lines", large libraries of parallel T cell cultures, each originating from only 2,000 T cells. Using small quantities of peripheral blood from multiple time points in an ovarian cancer patient, we screened over 3.3 × 10^6 CD8+ T cells by ELISPOT for recognition of peptides corresponding to the full complement of somatic mutations (n = 37) from the patient's tumor. We identified ten T cell lines which collectively recognized peptides encoding five distinct mutations. Six of the ten T cell lines recognized a previously described neoantigen from this patient, HSDL1(L25V), whereas the remaining four lines recognized peptides corresponding to four other mutations. Only the HSDL1(L25V)-specific T cell lines recognized autologous tumor. HSDL1(L25V)-specific T cells comprised at least three distinct clonotypes and could be identified and expanded from peripheral blood 3-9 months prior to the first tumor recurrence. These T cells became undetectable at later time points, underscoring the dynamic nature of the response. Thus, neoantigen-specific T cells can be expanded from small volumes of blood during tumor remission, making pre-emptive ACT a plausible clinical strategy.

Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma.

Genome research, 2012
Castellarin, Mauro, Warren, René L, Freeman, J Douglas, Dreolini, Lisa, Krzywinski, Martin, Strauss, Jaclyn, Barnes, Rebecca, Watson, Peter, Allen-Vercoe, Emma, Moore, Richard A, Holt, Robert A
An estimated 15% or more of the cancer burden worldwide is attributable to known infectious agents. We screened colorectal carcinoma and matched normal tissue specimens using RNA-seq followed by host sequence subtraction and found marked over-representation of Fusobacterium nucleatum sequences in tumors relative to control specimens. F. nucleatum is an invasive anaerobe that has been linked previously to periodontitis and appendicitis, but not to cancer. Fusobacteria are rare constituents of the fecal microbiota, but have been cultured previously from biopsies of inflamed gut mucosa. We obtained a Fusobacterium isolate from a frozen tumor specimen; this showed highest sequence similarity to a known gut mucosa isolate and was confirmed to be invasive. We verified overabundance of Fusobacterium sequences in tumor versus matched normal control tissue by quantitative PCR analysis from a total of 99 subjects (p = 2.5 × 10^-6), and we observed a positive association with lymph node metastasis.

Profiling the T-cell receptor beta-chain repertoire by massively parallel sequencing.

Genome research, 2009
Freeman, J Douglas, Warren, René L, Webb, John R, Nelson, Brad H, Holt, Robert A
T-cell receptor (TCR) genomic loci undergo somatic V(D)J recombination, plus the addition/subtraction of nontemplated bases at recombination junctions, in order to generate the repertoire of structurally diverse T cells necessary for antigen recognition. TCR beta subunits can be unambiguously identified by their hypervariable CDR3 (Complementarity Determining Region 3) sequence. This is the site of V(D)J recombination encoding the principal site of antigen contact. The complexity and dynamics of the T-cell repertoire remain unknown because the potential repertoire size has made conventional sequence analysis intractable. Here, we use 5'-RACE, Illumina sequencing, and a novel short read assembly strategy to sample CDR3β diversity in human T lymphocytes from peripheral blood. Assembly of 40.5 million short reads identified 33,664 distinct TCRβ clonotypes and provides precise measurements of CDR3β length diversity, usage of nontemplated bases, sequence convergence, and preferences for TRBV (T-cell receptor beta variable gene) and TRBJ (T-cell receptor beta joining gene) usage and pairing. CDR3 length between conserved residues of TRBV and TRBJ ranged from 21 to 81 nucleotides (nt). TRBV gene usage ranged from 0.01% for TRBV17 to 24.6% for TRBV20-1. TRBJ gene usage ranged from 1.6% for TRBJ2-6 to 17.2% for TRBJ2-1. We identified 1573 examples of convergence where the same amino acid translation was specified by distinct CDR3β nucleotide sequences. Direct sequence-based immunoprofiling will likely prove to be a useful tool for understanding repertoire dynamics in response to immune challenge, without a priori knowledge of antigen.

The genome sequence of the malaria mosquito Anopheles gambiae.

Science (New York, N.Y.), 2002
Holt, Robert A, Subramanian, G Mani, Halpern, Aaron, Sutton, Granger G, Charlab, Rosane, Nusskern, Deborah R, Wincker, Patrick, Clark, Andrew G, Ribeiro, José M C, Wides, Ron, Salzberg, Steven L, Loftus, Brendan, Yandell, Mark, Majoros, William H, Rusch, Douglas B, Lai, Zhongwu, Kraft, Cheryl L, Abril, Josep F, Anthouard, Veronique, Arensburger, Peter, Atkinson, Peter W, Baden, Holly, de Berardinis, Veronique, Baldwin, Danita, Benes, Vladimir, Biedler, Jim, Blass, Claudia, Bolanos, Randall, Boscus, Didier, Barnstead, Mary, Cai, Shuang, Center, Angela, Chaturverdi, Kabir, Christophides, George K, Chrystal, Mathew A, Clamp, Michele, Cravchik, Anibal, Curwen, Val, Dana, Ali, Delcher, Art, Dew, Ian, Evans, Cheryl A, Flanigan, Michael, Grundschober-Freimoser, Anne, Friedli, Lisa, Gu, Zhiping, Guan, Ping, Guigo, Roderic, Hillenmeyer, Maureen E, Hladun, Susanne L, Hogan, James R, Hong, Young S, Hoover, Jeffrey, Jaillon, Olivier, Ke, Zhaoxi, Kodira, Chinnappa, Kokoza, Elena, Koutsos, Anastasios, Letunic, Ivica, Levitsky, Alex, Liang, Yong, Lin, Jhy-Jhu, Lobo, Neil F, Lopez, John R, Malek, Joel A, McIntosh, Tina C, Meister, Stephan, Miller, Jason, Mobarry, Clark, Mongin, Emmanuel, Murphy, Sean D, O'Brochta, David A, Pfannkoch, Cynthia, Qi, Rong, Regier, Megan A, Remington, Karin, Shao, Hongguang, Sharakhova, Maria V, Sitter, Cynthia D, Shetty, Jyoti, Smith, Thomas J, Strong, Renee, Sun, Jingtao, Thomasova, Dana, Ton, Lucas Q, Topalis, Pantelis, Tu, Zhijian, Unger, Maria F, Walenz, Brian, Wang, Aihui, Wang, Jian, Wang, Mei, Wang, Xuelan, Woodford, Kerry J, Wortman, Jennifer R, Wu, Martin, Yao, Alison, Zdobnov, Evgeny M, Zhang, Hongyu, Zhao, Qi, Zhao, Shaying, Zhu, Shiaoping C, Zhimulev, Igor, Coluzzi, Mario, della Torre, Alessandra, Roth, Charles W, Louis, Christos, Kalush, Francis, Mural, Richard J, Myers, Eugene W, Adams, Mark D, Smith, Hamilton O, Broder, Samuel, Gardner, Malcolm J, Fraser, Claire M, Birney, Ewan, Bork, Peer, Brey, Paul T, Venter, J Craig, Weissenbach, Jean, Kafatos, Fotis C, Collins, Frank H, Hoffman, Stephen L
Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303 scaffolds; the largest scaffold was 23.1 million base pairs. There was substantial genetic variation within this strain, and the apparent existence of two haplotypes of approximately equal frequency ("dual haplotypes") in a substantial fraction of the genome likely reflects the outbred nature of the PEST strain. The sequence produced a conservative inference of more than 400,000 single-nucleotide polymorphisms that showed a markedly bimodal density distribution. Analysis of the genome sequence revealed strong evidence for about 14,000 protein-encoding transcripts. Prominent expansions in specific families of proteins likely involved in cell adhesion and immunity were noted. An expressed sequence tag analysis of genes regulated by blood feeding provided insights into the physiological adaptations of a hematophagous insect.

The sequence of the human genome.

Science (New York, N.Y.), 2001
Venter, J C, Adams, M D, Myers, E W, Li, P W, Mural, R J, Sutton, G G, Smith, H O, Yandell, M, Evans, C A, Holt, R A, Gocayne, J D, Amanatides, P, Ballew, R M, Huson, D H, Wortman, J R, Zhang, Q, Kodira, C D, Zheng, X H, Chen, L, Skupski, M, Subramanian, G, Thomas, P D, Zhang, J, Gabor Miklos, G L, Nelson, C, Broder, S, Clark, A G, Nadeau, J, McKusick, V A, Zinder, N, Levine, A J, Roberts, R J, Simon, M, Slayman, C, Hunkapiller, M, Bolanos, R, Delcher, A, Dew, I, Fasulo, D, Flanigan, M, Florea, L, Halpern, A, Hannenhalli, S, Kravitz, S, Levy, S, Mobarry, C, Reinert, K, Remington, K, Abu-Threideh, J, Beasley, E, Biddick, K, Bonazzi, V, Brandon, R, Cargill, M, Chandramouliswaran, I, Charlab, R, Chaturvedi, K, Deng, Z, Di Francesco, V, Dunn, P, Eilbeck, K, Evangelista, C, Gabrielian, A E, Gan, W, Ge, W, Gong, F, Gu, Z, Guan, P, Heiman, T J, Higgins, M E, Ji, R R, Ke, Z, Ketchum, K A, Lai, Z, Lei, Y, Li, Z, Li, J, Liang, Y, Lin, X, Lu, F, Merkulov, G V, Milshina, N, Moore, H M, Naik, A K, Narayan, V A, Neelam, B, Nusskern, D, Rusch, D B, Salzberg, S, Shao, W, Shue, B, Sun, J, Wang, Z, Wang, A, Wang, X, Wang, J, Wei, M, Wides, R, Xiao, C, Yan, C, Yao, A, Ye, J, Zhan, M, Zhang, W, Zhang, H, Zhao, Q, Zheng, L, Zhong, F, Zhong, W, Zhu, S, Zhao, S, Gilbert, D, Baumhueter, S, Spier, G, Carter, C, Cravchik, A, Woodage, T, Ali, F, An, H, Awe, A, Baldwin, D, Baden, H, Barnstead, M, Barrow, I, Beeson, K, Busam, D, Carver, A, Center, A, Cheng, M L, Curry, L, Danaher, S, Davenport, L, Desilets, R, Dietz, S, Dodson, K, Doup, L, Ferriera, S, Garg, N, Gluecksmann, A, Hart, B, Haynes, J, Haynes, C, Heiner, C, Hladun, S, Hostin, D, Houck, J, Howland, T, Ibegwam, C, Johnson, J, Kalush, F, Kline, L, Koduru, S, Love, A, Mann, F, May, D, McCawley, S, McIntosh, T, McMullen, I, Moy, M, Moy, L, Murphy, B, Nelson, K, Pfannkoch, C, Pratts, E, Puri, V, Qureshi, H, Reardon, M, Rodriguez, R, Rogers, Y H, Romblad, D, Ruhfel, B, Scott, R, Sitter, C, Smallwood, M, Stewart, E, Strong, R, Suh, E, 
Thomas, R, Tint, N N, Tse, S, Vech, C, Wang, G, Wetter, J, Williams, S, Williams, M, Windsor, S, Winn-Deen, E, Wolfe, K, Zaveri, J, Zaveri, K, Abril, J F, Guigó, R, Campbell, M J, Sjolander, K V, Karlak, B, Kejariwal, A, Mi, H, Lazareva, B, Hatton, T, Narechania, A, Diemer, K, Muruganujan, A, Guo, N, Sato, S, Bafna, V, Istrail, S, Lippert, R, Schwartz, R, Walenz, B, Yooseph, S, Allen, D, Basu, A, Baxendale, J, Blick, L, Caminha, M, Carnes-Stine, J, Caulk, P, Chiang, Y H, Coyne, M, Dahlke, C, Mays, A, Dombroski, M, Donnelly, M, Ely, D, Esparham, S, Fosler, C, Gire, H, Glanowski, S, Glasser, K, Glodek, A, Gorokhov, M, Graham, K, Gropman, B, Harris, M, Heil, J, Henderson, S, Hoover, J, Jennings, D, Jordan, C, Jordan, J, Kasha, J, Kagan, L, Kraft, C, Levitsky, A, Lewis, M, Liu, X, Lopez, J, Ma, D, Majoros, W, McDaniel, J, Murphy, S, Newman, M, Nguyen, T, Nguyen, N, Nodell, M, Pan, S, Peck, J, Peterson, M, Rowe, W, Sanders, R, Scott, J, Simpson, M, Smith, T, Sprague, A, Stockwell, T, Turner, R, Venter, E, Wang, M, Wen, M, Wu, D, Wu, M, Xia, A, Zandieh, A, Zhu, X
A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history.
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.


Dr. Joan Shellard

Staff Scientist

Dr. Eric Yung

Staff Scientist
Lab Manager

Dr. Scott Brown

Research Associate
Clinical Informatics Analyst

Dr. Craig Rive

Research Associate

Lisa Dreolini

Research Assistant

Nasrin Mawji

Genome Sciences Technologist

Dr. Mhairi Sigrist

Projects Manager

Postdoctoral Fellows

Dr. James Round

Postdoctoral Fellow

Dr. Govinda Sharma

Postdoctoral Fellow

Dr. Sophie Sneddon

Postdoctoral Fellow


Cody Despins

Graduate Student

Nicole Knoetze

Graduate Student

Christopher May

Graduate Student

Michelle Sag

Student Researcher