---
output:
html_document:
self_contained: true
number_sections: no
theme: flatly
highlight: tango
mathjax: default
toc: true
toc_float: true
toc_depth: 2
css: style.css
bibliography: bibliography.bib
vignette: >
%\VignetteIndexEntry{"3.4 - Motif enrichment analysis on the selected probes"}
%\VignetteEngine{knitr::rmarkdown}
\usepackage[utf8]{inputenc}
---
# Motif enrichment analysis on the selected probes
This step is to identify enriched motif in a set of probes which is carried out by
function `get.enriched.motif`.
This function uses a pre-calculated data `Probes.motif` which was generated using HOMER with a $p-value \le 10^{–4}$
to scan a $\pm250bp$ region around each probe using HOmo sapiens
COmprehensive MOdel COllection [http://hocomoco.autosome.ru/](HOCOMOCO) v10 [@kulakovskiy2016hocomoco] position
weight matrices (PWMs).
For each probe set tested (i.e. the list of gene-linked hypomethylated probes in a given
group), a motif enrichment Odds Ratio and a 95% confidence interval were
calculated using following formulas:
$$ p= \frac{a}{a+b} $$
$$ P= \frac{c}{c+d} $$
$$ Odds\quad Ratio = \frac{\frac{p}{1-p}}{\frac{P}{1-P}} $$
$$ SD = \sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}} $$
$$ lower\quad boundary\quad of\quad 95\%\quad confidence\quad interval = \exp{(\ln{OR}-SD)} $$
where `a` is the number of probes within the selected probe set that contain one
or more motif occurrences; `b` is the number of probes within the selected probe
set that do not contain a motif occurrence; `c` and `d` are the same counts within
the entire enhancer probe set. A probe set was considered significantly enriched
for a particular motif if the 95% confidence interval of the Odds Ratio was
greater than 1.1 (specified by option `lower.OR`, 1.1 is default), and the motif
occurred at least 10 times (specified by option `min.incidence`. 10 is default) in
the probe set. As described in the text, Odds Ratios were also used for ranking
candidate motifs.
Main get.pair arguments
| Argument | Description |
|-------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| data | A multi Assay Experiment from createMAE function. If set and probes.motif/background probes are missing this will be used to get this other two arguments correctly. This argument is not require, you can set probes.motif and the backaground.probes manually. |
| probes | A vector lists the name of probes to define the set of probes in which motif enrichment OR and confidence interval will be calculated. |
| lower.OR | A number specifies the smallest lower boundary of 95% confidence interval for Odds Ratio. The motif with higher lower boudnary of 95% confidence interval for Odds Ratio than the number are the significantly enriched motifs (detail see reference). |
| min.incidence | A non-negative integer specifies the minimum incidence of motif in the given probes set. 10 is default. |
```{r,eval=TRUE, message=FALSE, warning = FALSE}
# Load results from previous sections
mae <- get(load("mae.rda"))
sig.diff <- read.csv("result/getMethdiff.hypo.probes.significant.csv")
pair <- read.csv("result/getPair.hypo.pairs.significant.csv")
head(pair) # significantly hypomethylated probes with putative target genes
# Identify enriched motif for significantly hypomethylated probes which
# have putative target genes.
enriched.motif <- get.enriched.motif(data = mae,
probes = pair$Probe,
dir.out = "result",
label = "hypo",
min.incidence = 10,
lower.OR = 1.1)
names(enriched.motif) # enriched motifs
head(enriched.motif[names(enriched.motif)[1]]) ## probes in the given set that have the first motif.
# get.enriched.motif automatically save output files.
# getMotif.hypo.enriched.motifs.rda contains enriched motifs and the probes with the motif.
# getMotif.hypo.motif.enrichment.csv contains summary of enriched motifs.
dir(path = "result", pattern = "getMotif")
# motif enrichment figure will be automatically generated.
dir(path = "result", pattern = "motif.enrichment.pdf")
```
# Bibliography