Epigenetics analysis of whole genome Bisulfite (BS) and Oxidative bisulfite (OxBS) sequencing data
This code was developed for the data analysis included in the published paper “Epigenetic modulators link mitochondrial redox homeostasis to cardiac function in a sex-dependent manner”.
For data reproducibility, it is recommended to test a down-sampled dataset on a local UNIX-based system (such as macOS or Linux). Analyzing a large dataset at the single-base-pair resolution of BS and OxBS sequencing requires a high-performance computing (HPC) environment. Please refer to the specific software/package recommendations for the computational requirements at each step.
The analysis involves the following main steps:
The data (BS and OxBS reads) are first trimmed with Cutadapt (v1.11) and then mapped to the reference genome (here, the mouse genome GRCm38.p4) using Bismark (v0.19.0). SAMtools (v1.8) can then be used for sorting and indexing. The methylation counts can then be extracted using the “bismark_methylation_extractor” tool.
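As a rough illustration of the pre-processing order described above, the following Python sketch only assembles the command lines for one sample; it does not run them. Everything specific in it is an assumption rather than the pipeline used for the paper: single-end reads, the file-naming scheme, the adapter sequence, and the Bismark output name. Check the options against the installed tool versions before use.

```python
def build_preprocessing_commands(sample, genome_dir, adapter="AGATCGGAAGAGC"):
    """Return the pre-processing commands for one single-end sample.

    Each command is an argument list suitable for subprocess.run().
    All file names below are hypothetical and only illustrate the flow.
    """
    raw = f"{sample}.fastq.gz"
    trimmed = f"{sample}_trimmed.fastq.gz"
    # Assumed name of the BAM file that Bismark writes for the trimmed reads.
    bam = f"{sample}_trimmed_bismark_bt2.bam"
    sorted_bam = f"{sample}.sorted.bam"
    return [
        # 1. Adapter trimming (adapter sequence is an assumption).
        ["cutadapt", "-a", adapter, "-o", trimmed, raw],
        # 2. Bisulfite-aware alignment; genome folder given positionally.
        ["bismark", genome_dir, trimmed],
        # 3. Coordinate sorting and indexing.
        ["samtools", "sort", "-o", sorted_bam, bam],
        ["samtools", "index", sorted_bam],
        # 4. Methylation extraction with coverage and cytosine reports.
        ["bismark_methylation_extractor", "--bedGraph", "--cytosine_report",
         "--genome_folder", genome_dir, bam],
    ]
```

The same function can be called once for the BS library and once for the OxBS library of each sample, since both go through identical trimming and mapping steps.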
Post-processing steps involve running different R/bash scripts as follows:
This code calculates the number of CpGs at a given step size from the all-CpG BED files of a reference genome.
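One possible reading of this step is a count of CpGs in consecutive, non-overlapping windows of `step` base pairs. The Python sketch below works under that assumption; the function name and the choice of non-overlapping windows are illustrative and may differ from the actual script.

```python
from collections import Counter


def count_cpgs_per_window(bed_lines, step):
    """Count CpG positions per (chromosome, window start).

    bed_lines: iterable of BED lines (chrom, start, end, ...), one CpG each.
    step: window size in base pairs (assumed non-overlapping windows).
    """
    counts = Counter()
    for line in bed_lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start = line.split()[:2]
        window_start = (int(start) // step) * step
        counts[(chrom, window_start)] += 1
    return dict(counts)
```

For example, with `step=100` a CpG starting at position 95 falls into the window starting at 0, and one starting at position 100 falls into the window starting at 100.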
The final outputs of the previous steps can be used as inputs to this script. From the pre-processing, the cytosine reports ("...CpG_report.txt.gz"), the Bismark coverage files ("...bismark.cov.gz"), or, after conversion to methylKit format, the "..methcounts.gz" files can be used as inputs. The script can be run in R (v3.6) with the following main packages: methylKit (v1.12.0) and MLML2R (v0.3.3).
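To show why both libraries are needed, here is a minimal Python sketch of the simple subtraction estimate: BS reads report 5mC plus 5hmC, OxBS reads report 5mC only, so their difference approximates 5hmC. This is not the maximum-likelihood estimator provided by MLML2R, and the function name is hypothetical; it is only meant to illustrate the idea for a single CpG.

```python
def estimate_5mc_5hmc(bs_meth, bs_unmeth, oxbs_meth, oxbs_unmeth):
    """Naive per-site estimate of (5mC, 5hmC, unmodified C) proportions.

    Inputs are methylated/unmethylated read counts from the BS and OxBS
    libraries. Returns None when either library has no coverage.
    """
    bs_total = bs_meth + bs_unmeth
    ox_total = oxbs_meth + oxbs_unmeth
    if bs_total == 0 or ox_total == 0:
        return None
    bs_level = bs_meth / bs_total      # 5mC + 5hmC
    ox_level = oxbs_meth / ox_total    # 5mC only
    five_mc = ox_level
    # Sampling noise can make the difference negative, so clamp at zero.
    five_hmc = max(bs_level - ox_level, 0.0)
    unmodified = max(1.0 - five_mc - five_hmc, 0.0)
    return five_mc, five_hmc, unmodified
```

The clamping step is the main weakness of the subtraction approach, since it discards information whenever the OxBS level exceeds the BS level; a likelihood-based estimate constrains the three proportions jointly instead.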
The *cov.gz files output by the previous step are used as input in this step. The scripts can be run in R (v3.4.3), and the main package is “bsseq” (v1.14.0).
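For orientation, the Bismark coverage files are tab-separated with six columns: chromosome, start, end, methylation percentage, methylated count, and unmethylated count. The Python sketch below reads that layout; the record type and function names are hypothetical and are not part of bsseq or of the scripts in this repository.

```python
import gzip
from typing import NamedTuple


class CovRecord(NamedTuple):
    chrom: str
    pos: int
    meth: int
    unmeth: int

    @property
    def coverage(self):
        """Total number of reads covering the cytosine."""
        return self.meth + self.unmeth


def parse_cov_line(line):
    """Parse one line of a Bismark coverage file into a CovRecord."""
    chrom, start, _end, _pct, meth, unmeth = line.rstrip("\n").split("\t")
    return CovRecord(chrom, int(start), int(meth), int(unmeth))


def read_cov(path):
    """Read a plain or gzip-compressed coverage file into a list of records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [parse_cov_line(line) for line in handle if line.strip()]
```

The percentage column is ignored on purpose, because it can be recomputed from the two count columns and the counts are what downstream smoothing and testing operate on.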
Bash script to annotate the DMR files from the previous step.
The script extracts intron information from the mouse GTF annotation. No main packages are needed.
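GTF files list exons but not introns, so introns are usually derived as the gaps between consecutive exons of the same transcript. The Python sketch below illustrates that idea under the assumption that each exon line carries a `transcript_id` attribute; it is not the repository's script, and the function name is hypothetical.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def introns_from_gtf(gtf_lines):
    """Return introns as (chrom, start, end, strand, transcript_id) tuples.

    Coordinates are 1-based and inclusive, as in GTF.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        key = (fields[0], fields[6], match.group(1))
        exons[key].append((int(fields[3]), int(fields[4])))

    introns = []
    for (chrom, strand, transcript), spans in exons.items():
        spans.sort()
        # An intron is the gap between one exon's end and the next exon's start.
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            if next_start > prev_end + 1:
                introns.append((chrom, prev_end + 1, next_start - 1, strand, transcript))
    return introns
```

Grouping by transcript rather than by gene keeps alternative isoforms separate, at the cost of reporting the same intron once per transcript that contains it.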
Builds the 5mC and 5hmC matrices and performs the PCA analysis. No main packages are needed.
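The following Python sketch shows one way such a matrix and its first principal component could be computed without external packages. It assumes each sample is given as a mapping from site to methylation level, keeps only sites present in every sample, and uses power iteration for the leading component; these choices and all names are illustrative, not the method of the original script.

```python
def build_matrix(samples):
    """Build a samples-by-sites matrix from {sample: {site: level}}.

    Only sites observed in every sample are kept.
    Returns (sample_names, sites, rows), with one row per sample.
    """
    names = sorted(samples)
    shared = set.intersection(*(set(samples[name]) for name in names))
    sites = sorted(shared)
    rows = [[samples[name][site] for site in sites] for name in names]
    return names, sites, rows


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first principal component.

    Columns are mean-centred, then power iteration is applied to X^T X.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in rows]
    vec = [1.0 / (j + 1) for j in range(p)]  # deterministic starting vector
    for _ in range(iterations):
        scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
        new = [sum(centred[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = sum(x * x for x in new) ** 0.5
        if norm == 0:
            break  # no variance left in the data
        vec = [x / norm for x in new]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores
```

The sign of a principal component is arbitrary, so only the relative positions of the samples along the component are meaningful. The same two functions can be applied separately to the 5mC and the 5hmC levels.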
Intersection over union analysis of DMRs for functional elements
Building methylation levels for each group of functional elements
Performing differential methylation analysis
Computing heatmap combining gene expression and DNA methylation information
Computing heatmap combining RNAseq pathway analysis with DNA methylation
Making figures for the manuscript
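For the intersection-over-union analysis of DMRs listed above, a base-pair-level definition is one plausible choice: the number of bases shared by the DMRs and a set of functional elements, divided by the number of bases covered by either. The Python sketch below computes this for one chromosome with half-open (BED-style) intervals; the definition and function names are assumptions and may not match the original script.

```python
def merge(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_length(intervals):
    """Sum of interval lengths (intervals must not overlap)."""
    return sum(end - start for start, end in intervals)


def intersection_length(a, b):
    """Number of bases covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        low = max(a[i][0], b[j][0])
        high = min(a[i][1], b[j][1])
        if high > low:
            total += high - low
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def intersection_over_union(dmrs, elements):
    """Base-pair intersection over union of two interval sets."""
    inter = intersection_length(dmrs, elements)
    union = total_length(merge(dmrs)) + total_length(merge(elements)) - inter
    return inter / union if union else 0.0
```

Merging each set first prevents overlapping DMRs or overlapping annotations from being counted twice in either the numerator or the denominator.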