See https://github.com/bcgsc/utrtargets/blob/master/reference/wget.sh for how the reference files were generated.

The files in the `metadata` folder were collected from three sources:

1. The MANIFEST file from CGHub. As of this writing, CGHub is no longer available.
2. The Google BigQuery table at https://bigquery.cloud.google.com/table/isb-cgc:tcga_seq_metadata.GCS_listing_27apr2016, which listed the exact locations of the RNA-Seq data on Google Cloud Storage (GCS). As of this writing, this table has been replaced by https://bigquery.cloud.google.com/table/isb-cgc:tcga_seq_metadata.GCS_listing_24jun2016, and the 27apr2016 version is no longer available.
3. The Google BigQuery table at https://bigquery.cloud.google.com/table/isb-cgc:tcga_seq_metadata.RNAseq_FastQC, which provided the read length extracted by FastQC for each sample.