Tissue collection and sample preparation
Tissue was collected from patients enrolled in the PEACE study (ethics approval reference 11/LO/1996).
Samples were selected from organs where no metastasis was evident during autopsy. Patients were prioritized according to the number of different organ sites available for collection, and to enable balanced sex and mixed age representation. Samples were collected from anatomical regions, snap frozen in liquid nitrogen and stored long term at −80 °C. All samples (n = 168) used in the cohort lockdown were bioinformatically assessed for the presence of infiltrating cancer cells and large-scale allelic imbalance (see below) and deemed cancer-free. Additionally, for 118 cases, we performed pathology review to provide further evidence of cancer-free status. Pathology review involved analysis of tissue adjacent to the sequenced samples, which was fixed in formalin, embedded in paraffin and stained with haematoxylin and eosin before scanning with a NanoZoomer digital pathology system (Hamamatsu). Digital slides were then examined for the presence or absence of malignancy. This revealed 18 samples for which the tissue adjacent to the sequenced tissue either contained tumour cells or could not be confidently classified as cancer-free. Removing these samples, which were bioinformatically defined as cancer-free, did not qualitatively alter any of the results. Additionally, five metastatic samples from five different patients were collected. A 2 mm³ piece of tissue was processed for DNA extraction using the Qiagen AllPrep kit following the manufacturer’s instructions. DNA from blood was purified using the DNeasy Blood and Tissue kit (Qiagen). Purified nucleic acids were assessed for yield and purity using DNA Broad Range assay kits (Invitrogen).
Panel design
To investigate mutational processes representative of genome-wide trinucleotide content, we designed a targeted genomic panel spanning 82.5 kb and focusing on 30 cancer and normal tissue driver gene regions, selected on the basis of their mutation frequency in multiple cancer and normal tissue cohorts14. The regions included in our panel are detailed in Supplementary Table 2. In addition to the driver gene regions, our panel also encompasses several genomic regions that are under neutral selection and have a trinucleotide composition comparable to that of the whole genome, as defined by TwinStrand Biosciences. These regions are used to study mutational processes in a context that is not influenced by selective pressures.
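The notion of a panel having "comparable representation of the genome-wide trinucleotide context" can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used for the panel design: it assumes sequences are available as plain strings, folds trinucleotides onto the 32 pyrimidine-centred contexts, and uses cosine similarity as one possible comparison metric. All function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The 32 pyrimidine-centred trinucleotide contexts (middle base C or T).
CONTEXTS = [a + m + b for m in "CT" for a, b in product("ACGT", repeat=2)]


def trinucleotide_counts(seq):
    """Count trinucleotides, folding purine-centred ones onto their reverse complement."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if set(tri) - set("ACGT"):
            continue  # skip windows containing N or other ambiguity codes
        if tri[1] in "AG":
            tri = tri.translate(COMPLEMENT)[::-1]
        counts[tri] += 1
    return counts


def context_fractions(seq):
    """Return the fraction of each of the 32 contexts, in the order of CONTEXTS."""
    counts = trinucleotide_counts(seq)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CONTEXTS]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0
```

In use, `cosine_similarity(context_fractions(panel_seq), context_fractions(genome_seq))` would give a value near 1 when the panel composition closely matches the reference; `panel_seq` and `genome_seq` are placeholders for sequences the reader supplies.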
Library preparation and sequencing
Libraries were prepared using 1,000 ng of extracted gDNA as input into the TwinStrand DuplexSeq Library Preparation Kit as per the manufacturer’s guidelines. In brief, 1,000 ng of gDNA underwent enzymatic fragmentation, end repair and A-tailing before being ligated with unique DuplexSeq adapters, followed by 10 cycles of indexing PCR. Hybrid capture was then performed using a custom 82.5-kb capture panel from TwinStrand, followed by 16 cycles of PCR amplification. Libraries underwent a second round of hybridization using the same custom capture panel, followed by another five cycles of PCR amplification. The final libraries were then quantified and assessed using the Qubit fluorometer (Thermo Fisher Scientific) and TapeStation 4200 (Agilent) before being sequenced with 150 bp paired-end reads on the Illumina NovaSeq 6000 system.
NanoSeq libraries
Libraries were prepared using the NanoSeq protocol as previously described16. In brief, 2 ng of extracted gDNA was purified using a 1:1 mixture of nuclease-free water and SPRIselect beads (Beckman Coulter, B23319). Samples were then fragmented on-bead using the HpyCH4V restriction enzyme (New England Biolabs, R0620S) at 37 °C for 15 min and purified with 2.5× SPRIselect beads. The fragmented and cleaned-up DNA was A-tailed and ligated with xGen CS Duplex Adapters (Integrated DNA Technologies, 1080799) and purified again with 1× SPRIselect beads, resuspending in a final volume of 20 μl nuclease-free water.
The adapter-ligated libraries were quantified by quantitative PCR (qPCR) using the KAPA Library Quantification Kit (Roche, KK4828) with custom primers, as previously described16. Using the qPCR concentrations, libraries were normalized to 0.6 fmol in 20 μl nuclease-free water. The normalized libraries were added to the PCR master mix containing 25 μl NEBNext Ultra II Q5 Master Mix (New England Biolabs, M0544S) and 5 μl xGen UDI Primers (Integrated DNA Technologies, 10008052). Libraries were amplified for a total of 13 cycles and cleaned up twice using 0.7× SPRIselect beads. The final libraries were assessed using the Qubit fluorometer (Thermo Fisher, Q33231) and TapeStation 4200 D1000 Assay (Agilent, 5067–5582). Libraries were then pooled and sequenced using 150 bp paired-end reads on the Illumina NovaSeq 6000 platform, aiming for 30× coverage per sample.
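The normalization to 0.6 fmol in 20 μl reduces to a simple dilution calculation. The sketch below is illustrative only and assumes the qPCR concentration is reported in nM, which is numerically equal to fmol/μl; the function name and its defaults are hypothetical and not part of the NanoSeq protocol.

```python
def normalization_volumes(conc_nM, target_fmol=0.6, final_volume_ul=20.0):
    """Return (library_ul, water_ul) needed to place target_fmol in final_volume_ul.

    Assumes conc_nM is the qPCR-derived library concentration in nM,
    where 1 nM = 1 fmol/ul.
    """
    if conc_nM <= 0:
        raise ValueError("concentration must be positive")
    library_ul = target_fmol / conc_nM
    if library_ul > final_volume_ul:
        raise ValueError("library too dilute to reach the target in the final volume")
    return library_ul, final_volume_ul - library_ul
```

For example, a library measured at 0.3 nM would need 2 μl of library made up to 20 μl with 18 μl of nuclease-free water. Very concentrated libraries give sub-microlitre volumes, so an intermediate dilution would be needed in practice.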