This web toolkit is based on the Bioconductor package EpiDISH. The original Bioconductor package contains functions to infer cell-type fractions from DNA methylation (DNAm) profiles of heterogeneous tissues using a DNAm reference matrix for common tissue types, together with the CellDMC algorithm to identify differentially methylated cell types in EWAS. In addition to all the functionality of the R package, the web toolkit provides interactive visualization tools, which are friendlier to users who are not familiar with R programming.
The webpage is largely self-explanatory. Just follow the steps in the navigator in order (or download the pdf example file):
Upload the beta value matrix, the POI vector (optional) and the covariates matrix (used in CellDMC; optional).
Choose the mode and reference(s) to infer cell-type fractions.
Check the results with interactive figures and save the results in pdf and txt files.
Run CellDMC with the previously inferred CT fractions to identify differentially methylated cell types.
Inference of cell-type (CT) fractions proceeds via one of 3 methods, as chosen by the user: Robust Partial Correlations (RPC; Teschendorff et al. 2017), Cibersort (CBS; Newman et al. 2015) or Constrained Projection (CP; Houseman et al. 2012).
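To give a feel for what reference-based inference does, here is a minimal sketch of the constrained-projection idea: find non-negative fractions that sum to one and best reproduce a sample's beta values from the reference matrix. It is not the code used by this server or by the EpiDISH package. It assumes the sum-to-one form of the constraint and swaps in a simple projected-gradient solver instead of quadratic programming. The function names and the toy reference are hypothetical.

```python
# Illustrative sketch only, NOT the EpiDISH implementation.
# Assumption: fractions are constrained to be >= 0 and to sum to 1.

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def constrained_projection(beta, ref, n_iter=2000):
    """Estimate fractions w minimising ||ref * w - beta||^2 on the simplex."""
    n_cpg, n_ct = len(ref), len(ref[0])
    # The sum of squared entries bounds the largest eigenvalue of ref^T ref,
    # so its inverse is a safe gradient step size.
    step = 1.0 / sum(x * x for row in ref for x in row)
    w = [1.0 / n_ct] * n_ct
    for _ in range(n_iter):
        resid = [sum(r * wk for r, wk in zip(row, w)) - b
                 for row, b in zip(ref, beta)]
        grad = [sum(ref[i][k] * resid[i] for i in range(n_cpg))
                for k in range(n_ct)]
        w = project_to_simplex([wk - step * g for wk, g in zip(w, grad)])
    return w


# Hypothetical toy reference: 6 CpGs (rows) x 3 cell types (columns).
toy_ref = [
    [0.9, 0.1, 0.1],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.1],
    [0.2, 0.8, 0.2],
    [0.1, 0.1, 0.9],
    [0.1, 0.2, 0.8],
]
true_frac = [0.5, 0.3, 0.2]
# Noise-free mixture of the three reference profiles.
toy_beta = [sum(r * f for r, f in zip(row, true_frac)) for row in toy_ref]
estimate = constrained_projection(toy_beta, toy_ref)
print([round(x, 3) for x in estimate])
```

On real data the sample is noisy and the reference has hundreds of CpGs, which is where the robust methods (RPC, CBS) differ from plain constrained projection.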
For now, we provide 18 references: 4 EpiDISH references (blood and generic cell types) and 14 EpiSCORE references (tissue-specific). The 4 EpiDISH references comprise two references for blood cell subtypes, one reference with epithelial cells, fibroblasts, and total immune cells, and one reference with epithelial cells, fibroblasts, adipose cells, and total immune cells, described in Teschendorff et al. 2017 and Zheng et al. 2018a. If you want to infer the fraction of each immune cell subtype, you may want to use HEpiDISH, an iterative hierarchical procedure. HEpiDISH uses two distinct DNAm references: a primary reference for estimating the fractions of several cell types, and a separate, non-overlapping secondary reference for estimating the underlying subtype fractions of one of the cell types in the primary reference.
The above figure describes how HEpiDISH works. You can find more info in Zheng et al. 2018a.
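The hierarchical idea can be sketched in a few lines. The sketch assumes that the secondary estimate gives subtype proportions within the parent cell type, so that each subtype fraction is the parent fraction multiplied by its proportion. That combination rule, the function name and the numbers are assumptions made for illustration; see Zheng et al. 2018a for the actual procedure.

```python
def combine_hierarchical(primary, secondary, parent):
    """Replace `parent` in `primary` by its subtypes, scaled by the parent fraction.

    primary:   cell type -> fraction from the primary reference
    secondary: subtype -> proportion within `parent` (assumed to sum to ~1)
    """
    if parent not in primary:
        raise KeyError(f"{parent!r} is not a cell type in the primary estimate")
    combined = {ct: frac for ct, frac in primary.items() if ct != parent}
    for subtype, proportion in secondary.items():
        combined[subtype] = primary[parent] * proportion
    return combined


# Hypothetical estimates for one sample.
primary_est = {"Epi": 0.6, "Fib": 0.1, "IC": 0.3}      # IC = total immune cells
secondary_est = {"B": 0.2, "T": 0.5, "Mono": 0.3}      # proportions within IC
combined = combine_hierarchical(primary_est, secondary_est, "IC")
print(combined)
```

With this rule the combined fractions still sum to the same total as the primary estimate, provided the secondary proportions sum to one.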
The 14 tissue-specific EpiSCORE references encompass 13 tissue-types (bladder, brain, breast, colon, esophagus, heart, kidney, liver, lung, olfactory epithelium, pancreas, prostate and skin) and 40 cell-types, described in Zhu et al. 2022. The reference matrices are free to download from the EpiSCORE website. In EpiSCORE references, each marker carries a weight that indicates how informative it is. We recommend using the weighted robust partial correlation method as implemented in this web server, described in Teschendorff et al. 2022.
An outstanding challenge of epigenome-wide association studies (EWASs) performed in complex tissues is the identification of the specific cell type(s) responsible for the observed differential DNA methylation. We developed a statistical algorithm called CellDMC, which can identify differentially methylated positions within the specific cell type(s) driving the differential methylation. The ability to detect the altered cell types associated with disease and disease risk will facilitate the identification and development of biomarker assays for epigenetic disease risk, in line with the aims of P4 Medicine. CellDMC was published in Zheng et al. 2018b.
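As we understand the approach in Zheng et al. 2018b, CellDMC fits, for each CpG, a linear model with one term per cell-type fraction plus one interaction term between each fraction and the phenotype; a significant interaction term points to differential methylation in that cell type. The sketch below only builds one row of such a design matrix. Treat the exact formulation as an assumption, and note that the function name and the numbers are hypothetical.

```python
def celldmc_design_row(fractions, phenotype):
    """One design-matrix row: [f_1..f_K, f_1*y..f_K*y] for one sample.

    fractions: estimated cell-type fractions f_k of the sample
    phenotype: phenotype value y of the sample (e.g. 0 = control, 1 = case)
    """
    return list(fractions) + [f * phenotype for f in fractions]


# Hypothetical fractions for two samples and three cell types.
frac = [[0.6, 0.1, 0.3],
        [0.5, 0.2, 0.3]]
pheno = [0, 1]
design = [celldmc_design_row(f, y) for f, y in zip(frac, pheno)]
for row in design:
    print(row)
```

Covariates, when supplied, would enter as extra columns of the same matrix.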
Here you upload your beta value matrix with rows labeling the CpGs (usually Illumina BeadArray probe IDs) and columns labeling samples. NA values are not allowed. If you only want to infer cell-type fractions (without running CellDMC later), you can upload a subset of the beta value matrix that contains only the cell-type specific CpGs in the reference matrix.
You can download the example beta value file here (Tip: the first column of your data should have a name, e.g. cpg. The values of the first column will be used as feature names later.). Both txt and csv formats are acceptable. You can choose the separator (tab, comma, or semicolon).
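If you want to check a file before uploading it, the rules above can be tested with a short script. This is a minimal sketch, not the server's own validation: the function name is hypothetical, the set of strings treated as missing is an assumption, and so is the check that beta values lie between 0 and 1.

```python
import csv
import io


def read_beta_matrix(handle, delimiter="\t"):
    """Parse a delimited beta matrix: named first column of CpG IDs, one column per sample."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    if len(header) < 2 or not header[0].strip():
        raise ValueError("first column needs a name (e.g. 'cpg') plus at least one sample column")
    samples = header[1:]
    matrix = {}
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"line {line_no}: expected {len(header)} fields, got {len(row)}")
        cpg = row[0]
        values = []
        for sample, cell in zip(samples, row[1:]):
            # Assumed spellings of a missing value; NA values are not allowed.
            if cell.strip() in {"", "NA", "NaN", "nan"}:
                raise ValueError(f"line {line_no}: missing value for {cpg} in {sample}")
            value = float(cell)  # raises ValueError on non-numeric text
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"line {line_no}: {value} for {cpg} in {sample} is outside [0, 1]")
            values.append(value)
        matrix[cpg] = values
    return samples, matrix


# Illustrative two-CpG, two-sample file held in memory.
example = "cpg\tS1\tS2\ncg00000029\t0.12\t0.80\ncg00000108\t0.95\t0.40\n"
samples, matrix = read_beta_matrix(io.StringIO(example))
print(samples, matrix)
```

For a file on disk, open it with `open(path, newline="")` and pass the matching `delimiter` (`","` or `";"`) for csv-style files.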
Here you upload your POI vector file. It is used in the CT fraction boxplot and in CellDMC. It is not required for CT fraction inference.
You can download the example POI vector file here (Tip: the first column of your data should have a name, e.g. SampleName. The values of the first column will be used as sample names later.). Both txt and csv formats are acceptable. You can choose the separator (tab, comma, or semicolon).
Here you upload the covariates matrix used in CellDMC, with rows labeling samples and columns labeling variables.
You can download the example covariates file here (Tip: the first column of your data should have a name, e.g. SampleName. The values of the first column will be used as sample names later.). Both txt and csv formats are acceptable. You can choose the separator (tab, comma, or semicolon).
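Because the first column of the POI vector and of the covariates matrix supplies sample names, it is worth confirming that those names match the sample columns of your beta value matrix before uploading. The sketch below assumes the names must match exactly; how the server itself handles mismatches is not described here, and the function name is hypothetical.

```python
def align_samples(beta_samples, other_samples, label):
    """Return indices that reorder `other_samples` to the order of `beta_samples`."""
    position = {}
    for idx, name in enumerate(other_samples):
        if name in position:
            raise ValueError(f"{label}: duplicated sample name {name!r}")
        position[name] = idx
    missing = [name for name in beta_samples if name not in position]
    if missing:
        raise ValueError(f"{label}: no entry for samples {missing}")
    return [position[name] for name in beta_samples]


# Hypothetical sample names: beta matrix columns vs. POI vector rows.
beta_cols = ["S1", "S2", "S3"]
poi_rows = ["S3", "S1", "S2"]
order = align_samples(beta_cols, poi_rows, "POI vector")
print(order)
```

The same call works for the covariates matrix by passing its row names and a different `label`.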
Thanks for trying our web server. Let us know if you have any issues or suggestions by contacting Shijie C. Zheng.
Please cite the following papers if you use our web server. Many thanks!
EpiSCORE DNAm-atlas: