With more than 40 peer-reviewed scientific publications, findings from the POG program are influencing precision oncology approaches around the world.
Randomized controlled trials (RCTs) are uncommon in precision oncology. We provide an introduction and illustrative example of matching methods for evaluating precision oncology in the absence of RCTs. We focus on British Columbia's Personalized OncoGenomics (POG) program, which applies whole-genome and transcriptome analysis (WGTA) to inform advanced cancer care.
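As a rough illustration of what a matching method does, the sketch below pairs each treated patient with the closest untreated comparator on a single precomputed score (for example, a propensity score). It is a minimal greedy 1:1 nearest-neighbour match without replacement. The function name, the caliper value, and the patient identifiers and scores are all hypothetical, and the abstract does not say which matching algorithm the study used.

```python
from typing import Dict, List, Tuple


def greedy_match(treated: Dict[str, float],
                 controls: Dict[str, float],
                 caliper: float = 0.1) -> List[Tuple[str, str]]:
    """Pair each treated unit with the closest unused control within the caliper.

    Both arguments map a patient identifier to a precomputed matching score.
    Treated units with no control inside the caliper are left unmatched.
    """
    available = dict(controls)  # copy so the caller's dict is not modified
    pairs: List[Tuple[str, str]] = []
    # Fixed processing order keeps the result reproducible.
    for t_id in sorted(treated):
        if not available:
            break
        score = treated[t_id]
        # Closest remaining control; ties are broken by identifier.
        c_id = min(available, key=lambda c: (abs(available[c] - score), c))
        if abs(available[c_id] - score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs


# Hypothetical scores, for illustration only.
wgta_patients = {"P1": 0.62, "P2": 0.35, "P3": 0.90}
comparators = {"C1": 0.60, "C2": 0.33, "C3": 0.10, "C4": 0.58}
print(greedy_match(wgta_patients, comparators, caliper=0.05))
```

Here P3 stays unmatched because no comparator lies within the caliper. Dropping such patients is the usual trade-off between match quality and sample size.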
Purpose: Structural variants (SVs) may be an underestimated cause of hereditary cancer syndromes given the current limitations of short-read next-generation sequencing. Here we investigated the utility of long-read sequencing in resolving germline SVs in cancer susceptibility genes detected through short-read genome sequencing.
Methods: Known or suspected deleterious germline SVs were identified using Illumina genome sequencing across a cohort of 669 advanced cancer patients with paired tumor genome and transcriptome sequencing. Candidate SVs were subsequently assessed by Oxford Nanopore long-read sequencing.
Results: Nanopore sequencing confirmed eight simple pathogenic or likely pathogenic SVs, resolving three additional variants whose impact could not be fully elucidated through short-read sequencing. A recurrent sequencing artifact on chromosome 16p13 and one complex rearrangement on chromosome 5q35 were subsequently classified as likely benign, obviating the need for further clinical assessment. Variant configuration was further resolved in one case with a complex pathogenic rearrangement affecting TSC2.
Conclusion: Our findings demonstrate that long-read sequencing can improve the validation, resolution, and classification of germline SVs. This has important implications for return of results, cascade carrier testing, cancer screening, and prophylactic interventions.
Keywords: genome sequencing; hereditary cancer; long-read sequencing; structural variants; variant interpretation.
Read our News Story for this publication.
Purpose: Gene fusions are important oncogenic drivers and many are actionable. Whole-genome sequencing (WGS) and transcriptome sequencing (RNA-seq) can discover novel clinically relevant fusions.
Experimental design: Using WGS and RNA-seq, we reviewed the prevalence of fusions in a cohort of 570 patients with cancer, and compared prevalence to that predicted with commercially available panels. Fusions were annotated using a consensus variant calling pipeline (MAVIS) and required that a contig of the breakpoint could be constructed and supported from ≥2 structural variant detection approaches.
Results: In 570 patients with advanced cancer, MAVIS identified 81 recurrent fusions by WGS and 111 by RNA-seq, of which 18 fusions by WGS and 19 by RNA-seq were noted in at least 3 separate patients. The most common fusions were EML4-ALK in thoracic malignancies (9/69, 13%), and CMTM8-CMTM7 in colorectal cancer (4/73, 5.5%). Combined genomic and transcriptomic analysis identified novel fusion partners for clinically relevant genes, such as NTRK2 (novel partners: SHC3, DAPK1), and NTRK3 (novel partners: POLG, PIBF1).
Conclusions: WGS and RNA-seq facilitated identification of novel fusions in clinically relevant genes and detected a greater proportion than commercially available panels are expected to find. A significant benefit of WGS and RNA-seq is the innate ability to retrospectively identify variants that become clinically relevant over time, without the need for additional testing, which is not possible with panel-based approaches.
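The requirement that a fusion be supported by at least two structural variant detection approaches can be sketched as a simple consensus filter. This is not the MAVIS implementation. The caller names and the gene-pair key are hypothetical simplifications (real pipelines compare breakpoint coordinates), and the breakpoint contig requirement is not modelled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Call = Tuple[str, str, str]  # (caller name, 5' gene, 3' gene)


def consensus_fusions(calls: Iterable[Call],
                      min_callers: int = 2) -> List[Tuple[str, str]]:
    """Return fusions reported by at least `min_callers` distinct callers."""
    support: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for caller, gene5, gene3 in calls:
        # A set ensures repeated calls from one caller count only once.
        support[(gene5, gene3)].add(caller)
    return sorted(fusion for fusion, callers in support.items()
                  if len(callers) >= min_callers)


# Hypothetical calls, for illustration only.
calls = [
    ("caller_a", "EML4", "ALK"),
    ("caller_b", "EML4", "ALK"),
    ("caller_a", "EML4", "ALK"),     # duplicate from the same caller
    ("caller_a", "CMTM8", "CMTM7"),  # single-caller support only
]
print(consensus_fusions(calls))
```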
Purpose: Immune checkpoint inhibitors (ICIs) have revolutionised the treatment of solid tumours with dramatic and durable responses seen across multiple tumour types. However, identifying patients who will respond to these drugs remains challenging, particularly in the context of advanced and previously treated cancers.
Experimental design: We characterised fresh tumour biopsies from a heterogeneous pan-cancer cohort of 98 patients with metastatic predominantly pre-treated disease through the Personalized OncoGenomics (POG) program at BC Cancer using whole genome and transcriptome analysis (WGTA). Baseline characteristics and follow up data were collected retrospectively.
Results: We found that tumour mutation burden (TMB), independent of mismatch repair status, was the most predictive marker of time to progression (TTP, p=0.007), but immune related CD8+ T cell and M1-M2 macrophage ratio scores were more predictive for overall survival (OS) (p=0.0014 and 0.0012 respectively). While CD274 (PD-L1) gene expression is comparable to protein levels detected by immunohistochemistry (IHC), we did not observe a clinical benefit for patients with this marker. We demonstrate that a combination of markers based on WGTA provides the best stratification of patients (p=0.00071, OS), and also present a case study of possible acquired resistance to pembrolizumab in a non-small cell lung cancer (NSCLC) patient.
Conclusions: Interpreting the tumour-immune interface to predict ICI efficacy remains challenging. WGTA allows for identification of multiple biomarkers simultaneously that in combination may help to identify responders, particularly in the context of a heterogeneous population of advanced and previously treated cancers, thus precluding tumour type-specific testing.
Introduction: Carcinogenesis is driven by an array of complex genomic patterns; these patterns can render an individual resistant or sensitive to certain chemotherapy agents. The Personalized OncoGenomics (POG) project at BC Cancer has performed integrative genomic analysis of whole tumour genomes and transcriptomes for over 700 patients with advanced cancers, with an aim to predict therapeutic sensitivities. The aim of this study was to utilize the POG genomic data to evaluate a discrete set of biomarkers associated with chemo-sensitivity or chemo-resistance in advanced stage breast and colorectal cancer POG patients.
Methods: This was a retrospective multi-centre analysis across all BC Cancer sites. All breast and colorectal cancer patients enrolled in the POG program between July 1, 2012 and November 30, 2016 were eligible for inclusion. Within the breast cancer population, those treated with capecitabine, paclitaxel, and everolimus were analyzed, and for the colorectal cancer patients, those treated with capecitabine, bevacizumab, irinotecan, and oxaliplatin were analyzed. The expression levels of the selected biomarkers of interest (EPHB4, FIGF, CD133, DICER1, DPYD, TYMP, TYMS, TAP1, TOP1, CDKN1A, ERCC1, GSTP1, BRCA1, PTEN, ABCB1, TLE3, and TXNDC17) were reported as mRNA percentiles.
Results: For the breast cancer population, there were 32 patients in the capecitabine cohort, 15 in the everolimus cohort, and 12 in the paclitaxel cohort. For the colorectal cancer population, there were 29 patients in the bevacizumab cohort, 12 in the oxaliplatin cohort, 29 in the irinotecan cohort, and 6 in the capecitabine cohort. Of the biomarkers evaluated, the strongest associations were found between bevacizumab-based therapy and DICER1 (P = 0.0445), and between capecitabine therapy and TYMP (P = 0.0553).
Conclusions: Among breast cancer patients, higher TYMP expression was associated with sensitivity to capecitabine. Among colorectal cancer patients, higher DICER1 expression was associated with sensitivity to bevacizumab-based therapy. This study supports further assessment of the potential predictive value of mRNA expression of these genomic biomarkers.
Inherited genetic variation has important implications for cancer screening, early diagnosis, and disease prognosis. A role for germline variation has also been described in shaping the molecular landscape, immune response, microenvironment, and treatment response of individual tumors. However, there is a lack of consensus on the handling and analysis of germline information that extends beyond known or suspected cancer susceptibility in large-scale cancer genomics initiatives. As part of the Personalized OncoGenomics program in British Columbia, we performed whole-genome and transcriptome sequencing in paired tumor and normal tissues from advanced cancer patients to characterize the molecular tumor landscape and identify putative targets for therapy. Overall, our experience supports a multidisciplinary and integrative approach to germline data management. This includes a need for broader definitions and standardized recommendations regarding primary and secondary germline findings in precision oncology. Here, we propose a framework for identifying, evaluating, and returning germline variants of potential clinical significance that may have indications for health management beyond cancer risk reduction or prevention in patients and their families.
Introduction: Given the high level of uncertainty surrounding the outcomes of early phase clinical trials, whole genome and transcriptome analysis (WGTA) can be used to optimize patient selection and study assignment. In this retrospective analysis, we reviewed the impact of this approach on one such program.
Methods: Patients with advanced malignancies underwent fresh tumor biopsies as part of our personalized medicine program (NCT02155621). Tumour molecular data were reviewed for potentially clinically actionable findings and patients were referred to the developmental therapeutics program. Outcomes were reviewed in all patients, including those where trial selection was driven by molecular data (matched) and those where there was no clear molecular rationale (unmatched).
Results: From January 2014 to January 2018, 28 patients underwent WGTA and enrolled in clinical trials, including 2 patients enrolled in two trials. Fifteen patients were matched to a treatment based on a molecular target. Five patients were matched to a trial based upon single-gene DNA changes, all supported by RNA data. Ten cases were matched on the basis of genome-wide data (n = 4) or RNA gene expression only (n = 6). With a median follow-up of 6.7 months, the median time on treatment was 8.2 weeks.
Discussion: When compared to single-gene DNA-based data alone, WGTA led to a 3-fold increase in treatment matching. In a setting where there is a high level of uncertainty around both the investigational agents and the biomarkers, more data are needed to fully evaluate the impact of routine use of WGTA.
Advanced and metastatic tumors with complex treatment histories drive cancer mortality. Here we describe the POG570 cohort, a comprehensive whole-genome, transcriptome and clinical dataset, amenable for exploration of the impacts of therapies on genomic landscapes. Previous exposure to DNA-damaging chemotherapies and mutations affecting DNA repair genes, including POLQ and genes encoding Polζ, were associated with genome-wide, therapy-induced mutagenesis. Exposure to platinum therapies coincided with signatures SBS31 and DBS5 and, when combined with DNA synthesis inhibitors, signature SBS17b. Alterations in ESR1, EGFR, CTNNB1, FGFR1, VEGFA and DPYD were consistent with drug resistance and sensitivity. Recurrent noncoding events were found in regulatory region hotspots of genes including TERT, PLEKHS1, AP2A1 and ADGRG6. Mutation burden and immune signatures corresponded with overall survival and response to immunotherapy. Our data offer a rich resource for investigation of advanced cancers and interpretation of whole-genome and transcriptome sequencing in the context of a cancer clinic.
Head and neck squamous cell carcinoma (HNSCC) is one of the most common cancers worldwide and represents a heterogeneous group of tumors, the majority of which are treated with a combination of surgery, radiation, and chemotherapy. The fluoropyrimidine 5-fluorouracil (5-FU) and its oral prodrug, capecitabine, are commonly prescribed treatments for several solid tumor types including HNSCC. 5-FU-associated toxicity is observed in ∼30% of treated patients and is largely caused by germline polymorphisms in DPYD, which encodes dihydropyrimidine dehydrogenase, a key enzyme of 5-FU catabolism and deactivation. Although the association of germline DPYD alterations with toxicity is well-described, the potential contribution of somatic DPYD alterations to 5-FU sensitivity has not been explored. In a patient with metastatic HNSCC, in-depth genomic and transcriptomic integrative analysis on a biopsy from a metastatic neck lesion revealed alterations in genes that are associated with 5-FU uptake and metabolism. These included a novel somatic structural variant resulting in a partial deletion affecting DPYD, a variant of unknown significance affecting SLC29A1, and homozygous deletion of MTAP. There was no evidence of deleterious germline polymorphisms that have been associated with 5-FU toxicity, indicating a potential vulnerability of the tumor to 5-FU therapy. The discovery of the novel DPYD variant led to the initiation of 5-FU treatment that resulted in a rapid response lasting 17 wk, with subsequent relapse due to unknown resistance mechanisms. This suggests that somatic alterations present in this tumor may serve as markers for tumor sensitivity to 5-FU, aiding in the selection of personalized treatment strategies.
Purpose: Identification of clinically actionable molecular subtypes of pancreatic ductal adenocarcinoma (PDAC) is key to improving patient outcome. Intertumoral metabolic heterogeneity contributes to cancer survival and the balance between distinct metabolic pathways may influence PDAC outcome. We hypothesized that PDAC can be stratified into prognostic metabolic subgroups based on alterations in the expression of genes involved in glycolysis and cholesterol synthesis.
Experimental design: We performed bioinformatics analysis of genomic, transcriptomic, and clinical data in an integrated cohort of 325 resectable and nonresectable PDAC. The resectable datasets included retrospective The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) cohorts. The nonresectable PDAC cohort studies included prospective COMPASS, PanGen, and BC Cancer Personalized OncoGenomics program (POG).
Results: On the basis of the median normalized expression of glycolytic and cholesterogenic genes, four subgroups were identified: quiescent, glycolytic, cholesterogenic, and mixed. Glycolytic tumors were associated with the shortest median survival in resectable (log-rank test P = 0.018) and metastatic settings (log-rank test P = 0.027). Patients with cholesterogenic tumors had the longest median survival. KRAS and MYC-amplified tumors had higher expression of glycolytic genes than tumors with normal or lost copies of the oncogenes (Wilcoxon rank sum test P = 0.015). Glycolytic tumors had the lowest expression of mitochondrial pyruvate carriers MPC1 and MPC2. Glycolytic and cholesterogenic gene expression correlated with the expression of prognostic PDAC subtype classifier genes.
Conclusions: Metabolic classification specific to glycolytic and cholesterogenic pathways provides novel biological insight into previously established PDAC subtypes and may help develop personalized therapies targeting unique tumor metabolic profiles.
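The four-way stratification described in the Results can be sketched as a two-axis rule on per-tumor pathway scores. The sketch assumes each gene's expression has already been normalized to z-scores across the cohort, and it assumes a cut-off of zero on each pathway median. Both the cut-off and the function name are illustrative assumptions, not details stated in the abstract.

```python
from statistics import median
from typing import Sequence


def metabolic_subgroup(glycolytic_z: Sequence[float],
                       cholesterogenic_z: Sequence[float]) -> str:
    """Assign one tumor to a subgroup from its two pathway medians.

    Inputs are z-scored expression values for the glycolytic and
    cholesterogenic gene sets in a single tumor. A threshold of 0 is
    an assumption made for this sketch.
    """
    glyc = median(glycolytic_z)
    chol = median(cholesterogenic_z)
    if glyc > 0 and chol > 0:
        return "mixed"
    if glyc > 0:
        return "glycolytic"
    if chol > 0:
        return "cholesterogenic"
    return "quiescent"


# Hypothetical z-scores, for illustration only.
print(metabolic_subgroup([1.2, 0.4, 0.9], [-0.5, -1.0, 0.1]))
print(metabolic_subgroup([-0.3, -0.8, 0.2], [0.7, 1.1, 0.4]))
```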
RNA sequencing (RNAseq) has been widely used to generate bulk gene expression measurements collected from pools of cells. Only relatively recently have single-cell RNAseq (scRNAseq) methods provided opportunities for gene expression analyses at the single-cell level, allowing researchers to study heterogeneous mixtures of cells at unprecedented resolution. Tumors tend to be composed of heterogeneous cellular mixtures and are frequently the subjects of such analyses. Extensive method developments have led to several protocols for scRNAseq but, owing to the small amounts of RNA in single cells, technical constraints have required compromises. For example, the majority of scRNAseq methods are limited to sequencing only the 3' or 5' termini of transcripts. Other protocols that facilitate full-length transcript profiling tend to capture only polyadenylated mRNAs and are generally limited to processing only 96 cells at a time. Here, we address these limitations and present a novel protocol that allows for the high-throughput sequencing of full-length, total RNA at single-cell resolution. We demonstrate that our method produced strand-specific sequencing data for both polyadenylated and non-polyadenylated transcripts, enabled the profiling of transcript regions beyond only transcript termini, and yielded data rich enough to allow identification of cell types from heterogeneous biological samples.
The practical application of genome-scale technologies to precision oncology research requires flexible tissue processing strategies that can be used to differentially select both tumour and normal cell populations from formalin-fixed paraffin-embedded tissues. As tumour sequencing scales towards clinical implementation, practical difficulties in scheduling and obtaining fresh tissue biopsies at scale, including blood samples as surrogates for matched "normal" DNA, have focused attention on the use of formalin-preserved clinical samples collected routinely for diagnostic purposes. In practice, such samples often contain both tumour and normal cells which, if correctly partitioned, could be used to profile both tumour and normal genomes, thus identifying somatic alterations. Here we report a semi-automated method for laser microdissecting entire slide-mounted tissue sections to enrich for cells of interest with sufficient yield for whole genome and transcriptome sequencing. Using this method, we demonstrated enrichment of tumour material from mixed tumour-normal samples by up to 67%. Leveraging new methods that allow for the extraction of high-quality nucleic acids from small amounts of formalin-fixed tissues, we further showed that the method was successful in yielding sequence data of sufficient quality for use in BC Cancer's Personalized OncoGenomics (POG) program.
Background: RNA-sequencing-based subtyping of pancreatic ductal adenocarcinoma (PDAC) has been reported by multiple research groups, each using different methodologies and patient cohorts. 'Classical' and 'basal-like' PDAC subtypes are associated with survival differences, with basal-like tumors associated with worse prognosis. We amalgamated various PDAC subtyping tools to evaluate the potential of such tools to be reliable in clinical practice.
Methods: Sequencing data for 574 PDAC tumors were obtained from prospective trials and retrospective public databases. Six published PDAC subtyping strategies (Moffitt regression tools, clustering-based Moffitt, Collisson, Bailey, and Karasinska subtypes) were employed on each sample, and results were tested for subtype call consistency and association with survival.
Results: Basal-like and classical subtype calls were concordant in 88% of patient samples, and survival outcomes were significantly different (p<0.05) between prognostic subtypes. Twelve percent of tumors had subtype-discordant calls across the different methods, showing intermediate survival in univariate and multivariate survival analyses. Transcriptional profiles compatible with that of a hybrid subtype signature were observed for subtype-discordant tumors, in which classical and basal-like genes were concomitantly expressed. Subtype-discordant tumors showed intermediate molecular characteristics, including subtyping gene expression (p<0.0001) and mutant KRAS allelic imbalance (p<0.001).
Conclusions: Nearly one in six patients with PDAC have tumors that fail to reliably fall into the classical or basal-like PDAC subtype categories, based on two regression tools aimed towards clinical practice. Rather, these patient tumors show intermediate prognostic and molecular traits. We propose close consideration of the non-binary nature of PDAC subtypes for future incorporation of subtyping into clinical practice.
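Testing subtype call consistency across methods can be sketched as follows: a sample is treated as concordant when every method returns the same label, and as subtype-discordant otherwise. The sketch assumes each method's output has already been reduced to either "classical" or "basal-like". That mapping, the function name, and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def summarize_calls(calls_by_sample: Dict[str, List[str]]
                    ) -> Tuple[Dict[str, str], float]:
    """Return a consensus label per sample and the overall concordance rate."""
    labels: Dict[str, str] = {}
    for sample, calls in calls_by_sample.items():
        # Concordant only when all methods agree on a single label.
        labels[sample] = calls[0] if len(set(calls)) == 1 else "discordant"
    concordant = sum(1 for label in labels.values() if label != "discordant")
    rate = concordant / len(labels) if labels else 0.0
    return labels, rate


# Hypothetical calls from three methods, for illustration only.
calls = {
    "T1": ["classical", "classical", "classical"],
    "T2": ["basal-like", "basal-like", "basal-like"],
    "T3": ["classical", "basal-like", "classical"],
    "T4": ["classical", "classical", "classical"],
}
labels, rate = summarize_calls(calls)
print(labels["T3"], rate)
```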
Purpose: Although the incidence of early-onset pancreatic cancer (EOPC) is rising, the molecular characteristics that distinguish early-onset pancreatic ductal adenocarcinoma (PDAC) tumors from those arising at a later age are not well understood.
Experimental design: We performed bioinformatic analysis of genomic and transcriptomic data generated from 269 advanced (metastatic or locally advanced) and 277 resectable PDAC tumor samples. Patient samples were stratified into EOPC (age of onset ≤55 years; n = 117), intermediate (age of onset 55-70 years; n = 264), and average (age of onset ≥70 years; n = 165) groups. Frequency of somatic mutations affecting genes commonly implicated in PDAC, as well as gene expression patterns, were compared between EOPC and all other groups.
Results: EOPC tumors showed significantly lower frequency of somatic single-nucleotide variant (SNV)/insertions/deletions (indel) in CDKN2A (P = 0.0017), and were more likely to achieve biallelic mutation of CDKN2A through homozygous copy loss as opposed to heterozygous copy loss coupled with a loss-of-function SNV/indel mutation, the latter of which was more common for tumors with later ages of onset (P = 1.5e-4). Transcription factor forkhead box protein C2 (FOXC2) was significantly upregulated in EOPC tumors (P = 0.032). Genes significantly correlated with FOXC2 in PDAC samples were enriched for gene sets related to epithelial-to-mesenchymal transition (EMT) and included VIM (P = 1.8e-8), CDH11 (P = 6.5e-5), and CDH2 (P = 2.4e-2).
Conclusions: Our comprehensive analysis of sequencing data generated from a large cohort of PDAC patient samples highlights a distinctive pattern of biallelic CDKN2A mutation in EOPC tumors. Increased expression of FOXC2 in EOPC, with the correlation between FOXC2 and EMT pathways, represents novel molecular characteristics of EOPC.
Next-generation sequencing of solid tumors has revealed variable signatures of immunogenicity across tumors, but underlying molecular characteristics driving such variation are not fully understood. While expression of endogenous retrovirus (ERV)-containing transcripts can provide a source of tumor-specific neoantigen in some cancer models, associations between ERV levels and immunogenicity across different types of metastatic cancer are not well established. We performed bioinformatics analysis of genomic, transcriptomic and clinical data across an integrated cohort of 199 metastatic breast, colorectal and pancreatic ductal adenocarcinoma (PDAC) patient tumors. Within each cancer type, we identified a subgroup of viral mimicry tumors in which increased ERV levels were coupled with transcriptional signatures of autonomous antiviral response and immunogenicity. In addition, viral mimicry colorectal and pancreatic tumors showed increased expression of DNA demethylation gene TET2. Taken together, these data demonstrate the existence of an ERV-associated viral mimicry phenotype across three distinct metastatic cancer types, while indicating links between ERV abundance, epigenetic dysregulation and immunogenicity.
Importance: Pediatric cancers are epigenetic diseases; therefore, considering tumor gene expression information is necessary for a complete understanding of the tumorigenic processes.
Objective: To evaluate the feasibility and utility of incorporating comparative gene expression information into the precision medicine framework for difficult-to-treat pediatric and young adult patients with cancer.
Design, setting, and participants: This cohort study was conducted as a consortium between the University of California, Santa Cruz (UCSC) Treehouse Childhood Cancer Initiative and clinical genomic trials. RNA sequencing (RNA-Seq) data were obtained from the following 4 clinical sites and analyzed at UCSC: British Columbia Children's Hospital (n = 31), Lucile Packard Children's Hospital at Stanford University (n = 80), CHOC Children's Hospital and Hyundai Cancer Institute (n = 46), and the Pacific Pediatric Neuro-Oncology Consortium (n = 24). The study dates were January 1, 2016, to March 22, 2017.
Exposures: Participants underwent tumor RNA-Seq profiling as part of 4 separate clinical trials at partner hospitals. The UCSC either downloaded RNA-Seq data from a partner institution for analysis in the cloud or provided a Docker pipeline that performed the same analysis at a partner institution. The UCSC then compared each participant's tumor RNA-Seq profile with more than 11 000 uniformly analyzed tumor profiles from pediatric and young adult patients with cancer, downloaded from public data repositories. These comparisons were used to identify genes and pathways that are significantly overexpressed in each patient's tumor. Results of the UCSC analysis were presented to clinical partners.
Main outcomes and measures: Feasibility of a third-party institution (UCSC Treehouse Childhood Cancer Initiative) to obtain tumor RNA-Seq data from patients, conduct comparative analysis, and present analysis results to clinicians; and proportion of patients for whom comparative tumor gene expression analysis provided useful clinical and biological information.
Results: Among 144 samples from children and young adults (median age at diagnosis, 9 years; range, 0-26 years; 72 of 118 [61.0%] male [26 patients sex unknown]) with a relapsed, refractory, or rare cancer treated on precision medicine protocols, RNA-Seq-derived gene expression was potentially useful for 99 of 144 samples (68.8%) compared with DNA mutation information that was potentially useful for only 34 of 74 samples (45.9%).
Conclusions and relevance: This study's findings suggest that tumor RNA-Seq comparisons may be feasible and highlight the potential clinical utility of incorporating such comparisons into the clinical genomic interpretation framework for difficult-to-treat pediatric and young adult patients with cancer. The study also highlights for the first time to date the potential clinical utility of harmonized publicly available genomic data sets.
The analysis of cell-free circulating tumor DNA (ctDNA) is potentially a less invasive, more dynamic assessment of cancer progression and treatment response than characterizing solid tumor biopsies. Standard isolation methods require separation of plasma by centrifugation, a time-consuming step that complicates automation. To address these limitations, we present an automatable magnetic bead-based ctDNA isolation method that eliminates centrifugation to purify ctDNA directly from peripheral blood (PB). To develop and test our method, ctDNA from cancer patients was purified from PB and plasma. We found that allelic fractions of somatic single-nucleotide variants from target gene capture libraries were comparable, indicating that the PB ctDNA purification method may be a suitable replacement for the plasma-based protocols currently in use.
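The comparison of allelic fractions between the two purification routes can be sketched as below, where the allelic fraction of a variant is taken as its alternate-allele read count divided by the total read depth at that position. The variant identifiers, read counts, and function names are hypothetical, and the abstract does not specify how comparability was assessed statistically.

```python
from typing import Dict, Tuple

Counts = Dict[str, Tuple[int, int]]  # variant id -> (alt reads, total reads)


def allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def max_abs_difference(pb_counts: Counts, plasma_counts: Counts) -> float:
    """Largest allelic-fraction gap over variants seen in both sample types."""
    shared = pb_counts.keys() & plasma_counts.keys()
    return max(
        (abs(allele_fraction(*pb_counts[v]) - allele_fraction(*plasma_counts[v]))
         for v in shared),
        default=0.0,  # no shared variants
    )


# Hypothetical read counts, for illustration only.
peripheral_blood = {"var1": (30, 200), "var2": (5, 100)}
plasma = {"var1": (32, 200), "var2": (4, 100), "var3": (9, 150)}
print(round(max_abs_difference(peripheral_blood, plasma), 3))
```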
Next generation RNA-sequencing (RNA-seq) is a flexible approach that can be applied to a range of applications including global quantification of transcript expression, the characterization of RNA structure such as splicing patterns and profiling of expressed mutations. Many RNA-seq protocols require up to microgram levels of total RNA input amounts to generate high quality data, and thus remain impractical for the limited starting material amounts typically obtained from rare cell populations, such as those from early developmental stages or from laser micro-dissected clinical samples. Here, we present an assessment of the contemporary ribosomal RNA depletion-based protocols, and identify those that are suitable for inputs as low as 1-10 ng of intact total RNA and 100-500 ng of partially degraded RNA from formalin-fixed paraffin-embedded tissues.
Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the 'Watson' and 'Crick' strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.
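One simplified way to express the idea of quantifying SSARs is sketched below. It treats a read as chimeric when it has more than one aligned segment, and as strand-split when those segments fall on both the forward and reverse strands. This is an assumption-laden illustration rather than the analytical approach from the study: the input structure, the function name, and the absence of any distance or overlap criteria between segments are all simplifications.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, int, str]  # (chromosome, position, strand "+" or "-")


def ssar_fraction(reads: Dict[str, List[Segment]]) -> float:
    """Fraction of chimeric reads whose segments map to both strands."""
    # Chimeric here means the read aligned as two or more separate segments.
    chimeric = [segs for segs in reads.values() if len(segs) > 1]
    if not chimeric:
        return 0.0
    strand_split = sum(
        1 for segs in chimeric
        if {"+", "-"} <= {strand for _, _, strand in segs}
    )
    return strand_split / len(chimeric)


# Hypothetical alignments, for illustration only.
reads = {
    "r1": [("chr1", 1000, "+")],                       # not chimeric
    "r2": [("chr1", 2000, "+"), ("chr1", 2050, "-")],  # strand-split
    "r3": [("chr2", 500, "+"), ("chr7", 900, "+")],    # chimeric, same strand
}
print(ssar_fraction(reads))
```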
Curation and storage of formalin-fixed, paraffin-embedded (FFPE) samples are standard procedures in hospital pathology laboratories around the world. Many thousands of such samples exist and could be used for next generation sequencing analysis. Retrospective analyses of such samples are important for identifying molecular correlates of carcinogenesis, treatment history and disease outcomes. Two major hurdles in using FFPE material for sequencing are the damaged nature of the nucleic acids and the labor-intensive nature of nucleic acid purification. These limitations and a number of other issues that span multiple steps from nucleic acid purification to library construction are addressed here. We optimized and automated a 96-well magnetic bead-based extraction protocol that can be scaled to large cohorts. Using sets of 32 and 91 individual FFPE samples respectively, we generated libraries from 100 ng of total RNA and DNA starting amounts with 95-100% success rate. The use of the resulting RNA in micro-RNA sequencing was also demonstrated. In addition to offering the potential of scalability and rapid throughput, the yield obtained with lower input requirements makes these methods applicable to clinical samples where tissue abundance is limiting.