Frequently Asked Questions | Genome Sciences Centre

Sample Submission

Learn more about the sequencing services offered by the GSC here.

We accept a variety of samples including nucleic acids, blood, fresh or frozen tissues, formalin-fixed/paraffin-embedded (FFPE) tissues, saliva and buccal (cheek) swabs. We also accept fully constructed libraries.

For more information about sample types and submission requirements, please refer to our User Guides.

For more information about recommended starting material and plate/tube requirements for sequencing services, please refer to our user guides.

For best library construction results please submit the recommended amount of starting material or more.
The recommended starting material in our user guides works well for human- or mouse-derived nucleic acids. Please inquire if you are working with other species.
The amounts listed in our user guides are the minimum amount of starting material which will result in adequate sequencing results. Please contact us if your material is limiting.

Instructions for sample submission are emailed to the collaborator once a signed copy of the Statement of Work (SOW) has been received by the GSC. For many sample types, an online submission system is available. For others, Excel forms will be sent.

For more information about sample submission, please refer to our User Guides.

< 24 DNA/RNA samples, submit in 1.5 mL Eppendorf tubes (screw top tubes are not accepted).
≥ 24 DNA/RNA samples, submit in an Axygen 96-FS-C plate (can be provided by GSC upon request).
Tissue/cells are accepted only in tube format, for example, Matrix tubes for formalin-fixed paraffin-embedded (FFPE) tissue scrolls/curls or cores.

For more information about sample submission requirements, please refer to our User Guides.

Samples should be arranged by column in a 96-well plate (e.g. A1 to H1, then A2 to H2).

Wells E12, F12, G12 and H12 must be left empty for internal controls.
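The column-wise layout and the four reserved wells can be sketched in Python. This is only an illustration, not a GSC tool: the function names are hypothetical, and it assumes each column is filled from row A to row H before moving on to the next column.

```python
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)
# Wells that must stay empty for internal controls.
RESERVED = {"E12", "F12", "G12", "H12"}


def sample_wells():
    """Return the usable wells in column order: A1..H1, A2..H2, ..., A12..D12."""
    wells = [f"{row}{col}" for col in COLUMNS for row in ROWS]
    return [well for well in wells if well not in RESERVED]


def assign_wells(sample_ids):
    """Map sample IDs to wells in submission order (hypothetical helper)."""
    wells = sample_wells()
    if len(sample_ids) > len(wells):
        raise ValueError(f"a plate holds at most {len(wells)} samples")
    return dict(zip(sample_ids, wells))


layout = assign_wells([f"sample_{i}" for i in range(1, 11)])
for sample_id, well in layout.items():
    print(sample_id, well)
```

With the four control wells excluded, one plate holds at most 92 samples.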

For more information about sample submission requirements, please refer to our User Guides.

Wells E12, F12, G12 and H12 are to be left empty as they are used for internal controls.

For more information about sample submission requirements, please refer to our User Guides.

Every sample submitted to the GSC goes through an initial quality check. The label on the physical tube is checked against the tube label provided in the sample submission form and entered into our LIMS.

Pre-barcoded Axygen 96-FS-C plate(s) can be provided to the collaborator once the Statement of Work (SOW) is signed.

If you require a plate, please contact us at GSC_Submissions@bcgsc.ca.

Shipping contact information and courier account number are provided by the collaborator to ship a plate.

Plate(s) can also be picked up at the GSC in person.

For more information about sample courier and drop-off, please refer to our getting started user guide.

The sample submission form must be reviewed and approved by GSC personnel prior to submitting samples to the GSC.

Regular hours for sample drop-off and plate pick-up:
Monday to Friday, 9:30-11:30 am and 1:30-3:30 pm

Location:
Suite 100-570 West 7th Avenue, Vancouver, BC V5Z 1B3

To enter the building, dial #100 on the intercom and the receptionist will let you in. The reception is on the ground floor (past the elevators and on the left). Go through to reception and ask the receptionist to call or page anyone from the Biospecimen Core group. We’ll come down to reception to meet you.

For more information about sample courier and drop-off, please refer to our getting started user guide.

Once the sample submission form is approved, samples must be shipped on dry ice and should be addressed to:

Dr. Andrew Mungall - Biospecimen Core,
Room 508
Genome Sciences Centre BC Cancer
Suite 100 - 570 West 7th Avenue
Vancouver, BC
V5Z 1B3

email: amungall@bcgsc.ca

Tel: 604-707-5900 ext 3251

When samples have been shipped, we ask that you please email sampleshipments@bcgsc.ca to notify us of your shipment and the associated tracking number, so we can monitor the progress during transit.

Please ensure that there is sufficient dry ice for a couple of days. We recommend shipping Monday to Wednesday, as we cannot accept packages on weekends.

For more information about sample courier and drop-off, please refer to our getting started user guide.

The GSC has extended its QC processes to enhance sample identity tracking. We add a small amount of a plasmid (1 ng/µg) to each tissue or genomic DNA sample upon receipt by the GSC. This plasmid contains a unique insert and allows us to track sample identity and cross-contamination throughout the pipeline. The resulting sequence data will contain reads from both the vector (pCR4-TOPO) and the insert at a level of 1,000-100,000 reads per lane (spike-in reads will not be aligned). Any returned material will also contain this plasmid. The spike-in has been extensively tested in both our clinical and research pipelines.
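To give a sense of scale, the short Python sketch below works out the spike-in mass for a given sample. It reads "1 ng/µg" as 1 ng of plasmid per µg of sample nucleic acid, which is an assumption, and the function names are hypothetical.

```python
# Assumed interpretation: 1 ng of plasmid is added per microgram of sample.
SPIKE_IN_NG_PER_UG = 1.0


def spike_in_mass_ng(sample_mass_ug):
    """Mass of plasmid (ng) added to a sample of the given mass (µg)."""
    if sample_mass_ug <= 0:
        raise ValueError("sample mass must be positive")
    return sample_mass_ug * SPIKE_IN_NG_PER_UG


def spike_in_mass_fraction(sample_mass_ug):
    """Fraction of the total nucleic acid mass contributed by the plasmid."""
    spike_ng = spike_in_mass_ng(sample_mass_ug)
    sample_ng = sample_mass_ug * 1000.0  # convert µg to ng
    return spike_ng / (spike_ng + sample_ng)


print(spike_in_mass_ng(2.5))
print(spike_in_mass_fraction(2.5))
```

Under that reading the plasmid makes up roughly 0.1% of the total mass, whatever the sample amount.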

Please advise GSC if you do not wish this tracking spike-in to be added to your sample.

Library Construction and Sequencing

The following library construction strategies are available:

FFPE-Genome
PCR-Free Genome
Low input DNA
Bisulphite
ChIP
miRNA
ssRNA, polyA+
mRNA using strand specific protocol
Ribodepleted strand specific RNA
Exome and custom capture

Please contact us if you have a strategy in mind but it is not listed here.

We can advise on extraction of RNA, isolation of DNA or immunoprecipitation, but cannot provide optimized protocols.

All isolation protocols will need to be optimized in your own lab. Some variability is unavoidable when working with biological samples.

For more information about sample types and submission requirements, please refer to our User Guides.

The costs associated with library production vary depending on starting material. Please contact us for a quote or Statement of Work (SOW).

The solution will depend on the source of the problem. Several quality checks are included in the process of constructing libraries and sequencing, in an effort to minimize the potential for failure.

DNA samples are quantified upon receipt of samples in advance of library construction. RNA samples are quality and quantity checked in advance of library construction. You will be contacted if your sample falls below the required total mass or is degraded and not suitable for library preparation.
If library construction fails, the collaborator will be consulted to find a solution. There is no standard policy, as failure can be attributed to many causes. If the cause is found to be sample related, a replacement sample may be submitted.
For samples requiring multiple lanes of sequencing, a single lane is initially run to assess library quality. If the lane fails any of several quality metrics, the Quality Control team reviews the data to identify the source of the problem. Concerns about library construction are reported to the collaborator to discuss possible solutions and options.
Sequencing run quality metrics are reviewed by the lab to ensure high-quality sequence is produced. Runs that fail due to instrumentation or technical issues will be repeated at no cost to the collaborator.
The Quality Control team reviews the content of the sequence data to ensure several quality metrics are met. Failures are investigated to determine the root cause and are reported to the collaborator to discuss possible solutions and options.

Please also check the QC Alerts document for any alerts you may see in your Illumina data file.

Sequencing requirements will vary between researchers and between samples. The number of sequencing lanes required depends on the experimental design and your sample.

Important variables include:

sample quality
sample quantity
genome size
availability of a reference genome for comparison
goal(s) of the project

Illumina also has some helpful resources to help determine this.
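As a rough starting point, the number of lanes can be estimated from genome size, target depth and the yield of one lane. The Python sketch below uses the standard approximation that mean coverage equals total sequenced bases divided by genome size. The function names and the per-lane yield in the example are placeholders, not GSC figures, and sample quality, duplicate reads and unaligned reads will all lower the usable depth.

```python
import math


def mean_coverage(total_bases, genome_size_bp):
    """Approximate mean depth of coverage: sequenced bases / genome size."""
    return total_bases / genome_size_bp


def lanes_needed(genome_size_bp, target_depth, yield_per_lane_bp):
    """Smallest whole number of lanes whose combined yield reaches the target depth."""
    required_bases = genome_size_bp * target_depth
    return math.ceil(required_bases / yield_per_lane_bp)


# Illustrative numbers only: a 3 Gb genome, 30X target depth,
# and a placeholder yield of 50 Gb per lane.
genome_size = 3_000_000_000
lane_yield = 50_000_000_000
lanes = lanes_needed(genome_size, 30, lane_yield)
print(lanes, mean_coverage(lanes * lane_yield, genome_size))
```

Replace the placeholder yield with the figure quoted for your platform and library type before relying on the estimate.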

HiSeq X sequencing is available for whole genome samples (human and other) to an average depth of coverage of 15X or greater.

Sequencing of bisulphite or phasing libraries to an average depth of coverage of 15X or greater is also permitted but unsupported.

Illumina does not provide any assurances or guarantees that the performance of the HiSeq X instrument will match published specifications when used for unsupported applications.

If our collaborators choose to submit cells, we require them to be snap-frozen pellets. A total of 1 million cells per sample is recommended if all six histone modification core marks are profiled for that sample. The 1 million cells can be submitted in one tube.

The NextSeq 500 has a minor restriction on index sequences that can be used when barcoding libraries. To detect a cluster during template generation, there must be at least one base other than G in the first five cycles.
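The rule above is easy to check before submitting libraries. The Python sketch below tests whether an index has at least one base other than G within its first five cycles; the function name is hypothetical, and it assumes the index is given in the orientation in which it is read on the instrument.

```python
def index_is_detectable(index_seq, cycles=5):
    """Return True if any of the first `cycles` bases of the index is not G."""
    first_bases = index_seq.upper()[:cycles]
    return any(base != "G" for base in first_bases)


for index in ["ATCACG", "GGGGAACT", "GGGGGACT"]:
    status = "ok" if index_is_detectable(index) else "fails the first-five-cycles rule"
    print(index, status)
```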

In order to gauge our scientific impact we attempt to track our contribution to the wider scientific community. This is done as part of our ongoing support for the activities of our collaborators, as well as to ensure we meet the requirements of both our funding partners and our charter as a non-profit agency. In order to achieve this, we require our collaborators to acknowledge the work performed by the GSC in any or all of the following ways:

The GSC does not request or require co-authorship on publications when data have been generated through its cost-recovery collaborative service alone, i.e. when no intellectual contribution has been made.
Where intellectual contributions have been made by the GSC, collaborators are required to discuss potential and pending publications based on these contributions with the relevant GSC scientists or staff to identify appropriate co-authorship.
At a minimum, acknowledgement of the work of the GSC should be included in peer-reviewed publications. The following sentence can be incorporated into the Acknowledgements section of the article: “The authors wish to acknowledge Canada's Michael Smith Genome Sciences Centre, Vancouver, Canada for [activity].” A full list of funders of infrastructure and research supporting the services accessed can be found on the About Us page.
In addition, acknowledgements should appear in the text of peer-reviewed publications, for example in the Materials and Methods sections. A suggested sentence for inclusion is: “[Activity] was performed by Canada's Michael Smith Genome Sciences Centre, Vancouver, Canada”.

We would be very pleased to receive notification when collaborators publish papers acknowledging the GSC.

At the GSC the histone modification core marks are the following:

H3K4me1
H3K4me3
H3K9me3
H3K27me3
H3K36me3
H3K27ac

Data Analysis

Please see the Bioinformatic Services page for a full list of the standard analyses we provide.

If you have a custom analysis in mind that is not listed, please contact us directly.

We do not have minimum data guarantees, as the data yield depends too much on the sample supplied. We use our internal QC standards to ensure that the best possible data is generated for each sample.

We employ a wide range of quality control metrics in our bioinformatics QC pipeline:
Assessment of technical contaminants such as adapters and sequencing reagents as well as biological contaminants such as bacteria or host species in xenograft samples.
We also look at library type specific metrics such as insert size and duplicate rates for whole genome libraries, ribosomal and mitochondrial content for RNAseq libraries, and capture efficiency for exome libraries.
If multiple samples are submitted from the same patient, we check for possible sample swaps by comparing the samples at positions of common single nucleotide polymorphisms.
Selected QC warnings are given if your libraries fall below our standard thresholds. A full list can be found here.
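The idea behind the sample-swap check can be illustrated with a simple genotype concordance calculation. This is a minimal Python sketch of the general technique, not the GSC pipeline: the function name, the SNP identifiers and the genotype encoding are all hypothetical.

```python
def genotype_concordance(sample_a, sample_b):
    """Fraction of shared SNP positions with identical genotype calls.

    Each argument maps a SNP identifier to a genotype string such as "AG".
    Returns None when the two samples share no genotyped positions.
    """
    shared = sample_a.keys() & sample_b.keys()
    if not shared:
        return None
    matches = sum(1 for snp in shared if sample_a[snp] == sample_b[snp])
    return matches / len(shared)


# Made-up genotypes for two samples from the same patient.
tumour = {"snp1": "AA", "snp2": "AG", "snp3": "GG", "snp4": "CT"}
normal = {"snp1": "AA", "snp2": "AG", "snp3": "GG", "snp4": "CC"}
print(genotype_concordance(tumour, normal))
```

Samples from the same individual are expected to agree at most common SNPs, so a low concordance suggests a possible swap. Any cut-off would have to be chosen empirically.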

Your data is provided in both fastq and BAM formats by default.

Alignment is included in the sequencing price for human samples.

For more information regarding the BAM file format, please see https://samtools.github.io/hts-specs/SAMv1.pdf

Alignment is performed using the Burrows-Wheeler Aligner (BWA) program. Novoalign is used for bisulphite sequence data. Additional alignment with client-specified parameters or with other aligners may be available upon request at an additional cost. Please contact us for more information.

Our current default human genome reference version is hg38, although we support hg19.

Please contact us directly for our default genome reference version for any other species.

You can specify any valid reference version for us to use in your alignment when you submit your samples. If we do not have the reference installed in-house, there will be a cost-recovery charge for installing it. If there is no existing public reference for your data, you can provide a custom fasta file, as long as the fasta file can be indexed and is compatible with our aligner. If no reference is provided or the custom reference is not correctly formatted, your BAM file will simply contain all reads as unaligned reads.

All of the raw data is included in the final BAM file, with reads failing the vendor quality checks flagged to allow the user to remove them if desired.

Data sequenced on the HiSeq X will not contain quality-failed reads, as the instrument does not output them. Unaligned reads are also included in the BAM file.
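In the SAM/BAM specification, reads that fail vendor quality checks carry FLAG bit 0x200 and unmapped reads carry bit 0x4. The Python sketch below shows how such reads could be dropped from SAM-formatted text, for example the output of samtools view -h. The helper name and the example records are made up for illustration.

```python
QC_FAIL = 0x200   # SAM FLAG bit: read did not pass platform/vendor quality checks
UNMAPPED = 0x4    # SAM FLAG bit: read is unmapped


def drop_qc_failed(sam_lines):
    """Yield header lines and every alignment record without the QC-fail bit."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & QC_FAIL:
            yield line


# Made-up records: r1 is mapped, r2 is unmapped and QC-failed, r3 is unmapped only.
records = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t516\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
for line in drop_qc_failed(records):
    print(line)
```

If samtools is installed, running samtools view with -F 0x200 applies the same filter directly to a BAM file.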

Yes, data from pooled libraries will be supplied to you after splitting by index. Indices are sequenced on a separate read so your data will not contain any indices.

We do not trim adapter sequences from our fastq or BAM files. Aligners are generally able to handle adapter sequence at the end of reads by soft-clipping. The exceptions are bisulphite sequencing reads, which are hard-clipped in the alignment stage, and miRNA sequencing data, for which we do trim adapters because of the short read length and the need for higher sequence specificity in our miRNA profiling pipeline. BAM files for both of these library types will not contain adapter sequence.
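Soft- and hard-clipping are recorded in the CIGAR string of each alignment: S marks clipped bases that remain in the read sequence, while H marks clipped bases that have been removed from it. The Python sketch below counts both for a single CIGAR string. The function name is hypothetical, and the example strings are made up.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_bases(cigar):
    """Return (soft_clipped, hard_clipped) base counts for one CIGAR string."""
    soft = hard = 0
    for length, op in CIGAR_OP.findall(cigar):
        if op == "S":
            soft += int(length)
        elif op == "H":
            hard += int(length)
    return soft, hard


for cigar in ["100M", "90M10S", "5H95M"]:
    print(cigar, clipped_bases(cigar))
```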

All collaborators will receive an email informing them that their data is available for download from our sFTP site. The email is a receipt, identifying which data has recently been made available in addition to the previously uploaded data sets from the same project. This allows the collaborator to track sequence data as it is generated.

Data will be automatically deleted from the download site after two weeks. If you are unable to download your data within two weeks, please contact us to re-upload your data. If your data are still available for upload at that time, there may be an additional cost for the re-posting.

By default the notification email will be sent to the principal investigator (PI) listed on the sample submission and to the submitter of the samples. Additional email recipients can be specified during submission. Once the notification email has been sent, a separate email with login and password details for the sFTP site will be sent to the PI. To protect the privacy of your data, subsequent amendments to the recipient list and creation of additional sFTP accounts will require approval from the PI.

If you do not have an sFTP client on your computer, you will need to download and install one before you are able to access your data. Please visit our webpage for a list of some recommended clients that can be downloaded for free, along with links to installation instructions.

With every data upload, we provide a gsc_library.summary file, which can be found in the sFTP folder containing your data. This file provides a mapping between our internal library names and the sample names you provide on your sample submission form. If you have any problems with your data, please contact data_support@bcgsc.ca.

Some popular toolsets for working with and viewing sequence data are:

For any questions related to the analysis of your data or about particular analysis software, the SEQanswers, Biostars and Canadian Bioinformatics Helpdesk forums are useful resources.

Sequence data is stored for a minimum of 45 days, and may be deleted after that time without notice.

As much as the GSC would like to help our collaborators get their data published, we do not have the ability to host data publicly here. However, we have experience submitting to most public repositories, including SRA, dbGaP, CGHub and EGA, and are happy to answer any questions you may have.

We are also able to provide submission support on a cost recovery basis.

Sample Submission

What types of sequencing services are offered by the GSC?

What kinds of samples do you accept?

What are the minimum sample requirements?

When and how do I get a sample submission form?

What type of containers can I use to submit my samples?

How should I arrange my samples in a 96 well plate?

Why is a plate limited to 92 samples when there are 96 wells in a plate?

Why do I have to provide a tube label when I am providing a sample ID already?

What if I don’t have the specified plate available?

When and where can I submit my samples?

How do I submit samples via a courier?

Why is spike-in added to genomic DNA samples? Will this affect my sequencing results?

Library Construction and Sequencing

What library construction strategies are provided?

How should I isolate my RNA/DNA or immunoprecipitate my chromatin?

What is the cost of library construction and sequencing?

What happens if the run or a library fails?

How many lanes should I run and how do I determine sequencing coverage?

Are there any specifications for genomic samples or libraries submitted for sequencing on the HiSeq X?

Are there specifications for submitting Chromatin Immunoprecipitation?

Are there any specifications for samples or libraries submitted for sequencing on the NextSeq 500?

Do I have to cite the GSC for the sequencing work performed when I publish my research? How should I cite work performed by the GSC?

What are the Histone modification core marks provided by the GSC?

Data Analysis

What kind of bioinformatics analysis do you provide?

How much pass filter data am I guaranteed?

What kind of bioinformatics QC do you do on my samples?

In what format do I receive my sequencing data?

What software is used for alignment?

What reference genome is used for the alignment?

What if I want my data aligned to a different reference? What if there is no reference for my data?

What reads are included in the BAM file?

Are pooled libraries automatically split by index?

Do you trim the adaptors from my sequence data?

How do I access my data?

How do I match my sample names with the sample names on my data files?

Do you have suggested tools for viewing my BAM files?

What is your data retention policy?

I need to make my data publicly available for a publication. Can the GSC host my sequencing data or submit my data for me?
