Page 61 - Molecular features of low-grade developmental brain tumours
DISTINCT DNA METHYLATION PATTERNS IN SEGA IN TSC
Benelux, Venlo, The Netherlands) and total RNA was isolated using the miRNeasy Mini kit (Qiagen Benelux, Venlo, The Netherlands) according to the manufacturer’s instructions. RNA concentration was determined using a Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA) and RNA integrity was assessed using a Bioanalyser 2100 (Agilent). Library preparation and sequencing were completed at GenomeScan (Leiden, The Netherlands). Sequencing libraries were prepared with the NEBNext Ultra Directional RNA Library Prep Kit for Illumina in accordance with the manufacturer’s guidelines. Clustering and DNA sequencing were performed using the Illumina (San Diego, CA, USA) cBot and HiSeq 4000 according to the manufacturer’s protocols. Each library was subjected to paired-end sequencing, producing reads of 150 nucleotides in length at a read depth of 36 million reads.
Data were processed as previously described 16. Briefly, read quality was assessed using FastQC v0.11.5 (Babraham Institute, Babraham, Cambridgeshire, UK), and reads were trimmed and filtered using Trimmomatic v0.36 31. Paired-end reads were aligned to the human reference genome (GRCh38) using TopHat2 v2.0.13 with default settings 32. The number of reads mapping to each gene, based on the Gencode v25 annotation, was determined using featureCounts from the Subread package 33. The count matrix was normalized using the R package DESeq2 34.
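As an illustration of the count normalization step, the following is a minimal Python sketch of median-of-ratios size-factor scaling, the default approach DESeq2 uses to make counts comparable across libraries. It is not the DESeq2 implementation and not the code used in this study; the function names and the toy count matrix are hypothetical.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Only genes with a nonzero count in every sample have a finite log geometric mean.
    expressed = (counts > 0).all(axis=1)
    log_counts = np.log(counts[expressed])
    # The per-gene log geometric mean across samples acts as a pseudo-reference sample.
    log_reference = log_counts.mean(axis=1, keepdims=True)
    # Size factor of a sample = median ratio of its counts to the reference.
    return np.exp(np.median(log_counts - log_reference, axis=0))

def normalize(counts):
    """Divide each sample (column) by its size factor."""
    return np.asarray(counts, dtype=float) / size_factors(counts)

# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = np.array([[10, 20], [100, 200], [5, 10], [0, 3]])
toy_normalized = normalize(toy_counts)
```

In this toy example the second library has twice the depth of the first, so its size factor is twice as large and the normalized counts of the genes expressed in both samples coincide.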
Bioinformatic analysis
Raw IDAT files from the 450k array were loaded into the minfi package in R for quality control, and samples that failed quality control were excluded from the analysis. Normalization was performed with the minfi function preprocessFunnorm, which applies a Noob background and dye-bias correction followed by functional normalization based on the control probes to reduce between-array variation 35. Probes with a detection p-value above 0.01, probes located on the sex chromosomes or at SNPs, and cross-hybridizing probes were removed. After these steps, beta (β)-values, ranging from 0.0 to 1.0, from 421,352 probes were used for further analysis.
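The probe-filtering criteria can be sketched as a boolean mask over the probes. This Python example is illustrative only: the analysis itself was run in R with minfi, the function name, argument names and toy annotation are hypothetical, and the rule that a probe is dropped if it exceeds the detection threshold in any sample is an assumption, since the text does not state how failures were aggregated across samples.

```python
import numpy as np

def probes_to_keep(det_p, chrom, at_snp, cross_hybridizing, p_cutoff=0.01):
    """Return a boolean mask of probes (rows) that pass all filters.

    det_p: probes x samples matrix of detection p-values.
    chrom: chromosome name per probe, e.g. "chr1" or "chrX".
    at_snp, cross_hybridizing: boolean flags per probe.
    """
    det_p = np.asarray(det_p, dtype=float)
    # Assumption: a probe fails if its detection p-value exceeds the cutoff in any sample.
    failed_detection = (det_p > p_cutoff).any(axis=1)
    on_sex_chrom = np.isin(np.asarray(chrom), ["chrX", "chrY"])
    flagged = np.asarray(at_snp, dtype=bool) | np.asarray(cross_hybridizing, dtype=bool)
    return ~(failed_detection | on_sex_chrom | flagged)

# Hypothetical toy annotation for six probes measured in two samples.
toy_det_p = [[0.001, 0.002], [0.5, 0.001], [0.001, 0.001],
             [0.0, 0.0], [0.0, 0.0], [0.005, 0.009]]
toy_chrom = ["chr1", "chr2", "chrX", "chr3", "chr4", "chr5"]
toy_snp = [False, False, False, True, False, False]
toy_cross = [False, False, False, False, True, False]
toy_mask = probes_to_keep(toy_det_p, toy_chrom, toy_snp, toy_cross)
```

Applying such a mask to the rows of the β-value matrix leaves only the probes that pass every criterion; here that is the first and last toy probe.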
Using the ConsensusClusterPlus package 36, consensus clustering was performed with hierarchical clustering (hclust, average linkage) to detect robust clusters, using 1 minus Spearman’s correlation coefficient as the distance metric. The procedure was run for 1000 iterations with a sub-sampling ratio of 0.99. Additionally, silhouette analysis was applied to identify robust clusters. Principal component analysis (PCA) was performed on all CpG probes. Hierarchical clustering was performed on the top 5% most variable CpG probes using hclust with average linkage. Principal variance component analysis (PVCA), receiver operating characteristic (ROC) analysis and random forest were used to assess the contribution of histological or clinical data to the identified clusters. To determine the CpGs that were differentially methylated between groups (e.g. SEGA compared to control), a non-parametric Mann–Whitney U test was applied at each CpG probe. The distribution of CpGs across gene regions (TSS200, TSS1500, 5’UTR and Exon 1, intergenic region (IGR), 3’UTR or gene body) was evaluated by calculating the percentage of CpGs per gene region. CpGs that were located on multiple genes or transcript variants were counted once for each corresponding gene region. CpG probes located at the promoter region