---
title: Integrative analysis menu
bibliography: bibliography.bib
date: "`r BiocStyle::doc_date()`"
vignette: >
  %\VignetteIndexEntry{"4. Integrative analysis menu"}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Detailed explanation

For a detailed manual for this section please access these links:

1. [Starburst menu manual](https://drive.google.com/open?id=0B0-8N2fjttG-RzU1T1VQU2dQcXM)
2. [ELMER menu manual](https://drive.google.com/open?id=0B0-8N2fjttG-TDg1czNPUGUwTG8)

# Menu: Starburst plot

![Starburst plot menu: Main window.](starburst.png)

## Data 
Expected input is a CSV file with the following pattern:

* DEA result: DEA_results_Group_subgruop1_subgroup2_pcut_0.01_logFC.cut_2.csv
* DMR result: DMR_results_Group_subgruop1_subgroup2_pcut_1e-30_meancut_0.55.csv

## Thresholds control

The possible thresholds controls are:

* Log FC threshold: gene expression Log2FC threshold
* Expression FDR cut-off: gene expression  FDR cut-off (y-axis)
* Mean DNA methylation difference threshold: Mean DNA methylation difference threshold
* Methylation FDR cut-off: DNA methylation FDR cut-off (x-axis)

The options Mean DNA methylation difference threshold and Log FC threshold are used to circle genes which pass the cut-offs previously defined (eg. mean methylation or FDR).

## Highlight control
The possible highlight controls are:

* Show gene names: show names of significant genes
* Boxed names: show names inside a box
* Circle genes: Circle candidate biologically significant genes

## Other option

* Save result: save results in a CSV file

## Tutorial video:

[![IMAGE ALT TEXT](http://img.youtube.com/vi/_ec6Sij4MBc/maxresdefault.jpg)](http://www.youtube.com/watch?v=_ec6Sij4MBc "Tutorial video: Integrating DMR analysis and DEA results to be visualized into a starburst plot - (http://www.youtube.com/watch?v=_ec6Sij4MBc) ")

<center>**Tutorial Video:** Integrating DMR analysis and DEA results to be visualized into a starburst plot - (http://www.youtube.com/watch?v=_ec6Sij4MBc)
</center>

# Menu: ELMER (Enhancer Linking by Methylation/Expression Relationship)

This sub-menu will help the user to perform an integrative analysis of DNA methylation and Gene expression using the R/Bioconductor ELMER package [@yao2015inferring].

## Introduction

Recently, many studies suggest that enhancers play a major role as
regulators of cell-specific phenotypes leading to alteration in
transcriptomes related to diseases
[@giorgio2015large; @groschel2014single; @sur2012mice; @yao2015demystifying].
In order to investigate regulatory enhancers that can be located at long
distances upstream or downstream of target genes Bioconductor offer the
[Enhancer Linking by Methylation/Expression Relationship
(ELMER)](http://bioconductor.org/packages/ELMER/) package. This package
is designed to combine DNA methylation and gene expression data from
human tissues to infer multi-level cis-regulatory networks. It uses DNA
methylation to identify enhancers and correlates their state with the
expression of nearby genes to identify one or more transcriptional
targets. Transcription factor (TF) binding site analysis of enhancers is
coupled with expression analysis of all TFs to infer upstream
regulators. This package can be easily applied to TCGA public available
cancer datasets and custom DNA methylation and gene expression datasets [@yao2015inferring].

[ELMER](http://bioconductor.org/packages/ELMER/) analysis have 5 main
steps:

1. Identify distal probes on HM450K or EPIC.

2. Identification of distal probes with significant differential DNA methylation (i.e. DMCs) in tumor vs. normal samples.

3. Identification of putative target gene(s) for differentially methylated distal probes.

4. Identification of enriched motifs within a set of probes in significant probe-gene pairs.

5. Identification of master regulator Transcription Factors (TF) for each enriched motif.


## Sub-menu ELMER: Create input data 

The [ELMER](http://bioconductor.org/packages/ELMER/) input is a
mee object that contains a DNA methylation matrix, a gene expression
matrix, a probe information GRanges, the gene information GRanges and a
data frame summarizing the data. It should be highlighted that samples
without both the gene expression and DNA methylation data will be
removed from the mee object.

To perform ELMER analyses, we need to populate a **MultiAssayExperiment** with a DNA methylation matrix or **SummarizedExperiment** object from HM450K or EPIC platform; a gene expression matrix or SummarizedExperiment object for the same samples; a matrix mapping DNA methylation samples to gene expression samples; and a matrix with sample metadata (i.e. clinical data, molecular subtype, etc.). If TCGA data are used, the last two matrices will be automatically generated.
If using non-TCGA data,  the matrix with sample metadata should be provided with at least a column with a patient identifier and another one identifying its group which will be used for analysis, if samples in the methylation and expression matrices are not ordered and with same names, a matrix mapping for each patient identifier their DNA methylation samples and their gene expression samples should be provided to the *createMAE* function.
Based on the genome of reference selected, metadata for the DNA methylation probes, such as genomic coordinates, will be added from http://zwdzwd.github.io/InfiniumAnnotation [@zhou2016comprehensive]; 
and metadata for gene expression and annotation is added from Ensembl database [@yates2015ensembl] using [biomaRt](http://bioconductor.org/packages/biomaRt/) [@durinck2009mapping]. 

The steps required to create this object and perform ELMER analysis is described in this [tutorial](https://bioinformaticsfmrp.github.io/Bioc2017.TCGAbiolinks.ELMER/analysis_gui.html).


###  Sub-menu ELMER: Analysis 

After preparing the data into a MAE object, we will execute the five
[ELMER](http://bioconductor.org/packages/ELMER/) steps for  the hypo
(distal probes hypomethylated in the group2) direction.

This box has all the available options for ELMER functions. Please see the [ELMER vignette](http://bioconductor.org/packages/3.7/bioc/vignettes/ELMER/inst/doc/index.html). 

Also, a description of the data used by ELMER (such as the distal enhancer probes) is found in the
[ELMER.data](https://bioconductor.org/packages/release/data/experiment/vignettes/ELMER.data/inst/doc/vignettes.html)
vignette.

ELMER Identifies the enriched motifs for the distal probes
which are significantly differentially methylated and linked to a putative
target gene, it will plot the Odds Ratio (x axis) for each motif
found.
These motifs by default have a minimum incidence of 10 probes 
(that means at least 10 probes were associated with the motif)
in the given probes set and the smallest lower boundary of 95%
confidence interval for Odds Ratio of 1.1.

After finding the enriched motifs,
[ELMER](http://bioconductor.org/packages/ELMER/) identifies regulatory
transcription factors (TFs) whose expression is associated with DNA
methylation at motifs. [ELMER](http://bioconductor.org/packages/ELMER/)
automatically creates a TF ranking plot for each enriched motif. This
plot shows the TF ranking plots based on the association score
$(-log(P value))$ between TF expression and DNA methylation of the
motif. We can see in Figure below that the top three TFs that are
associated with a motif found.

In case, it identifies regulatory transcription factors (TFs), an object with the prefix ELMER_results will be created
with the necessary data to visualize the results.

## Sub-menu ELMER: Visualize results

### Data

Select the R object (rda) file with ELMER results created in the analysis step (the one with prefix ELMER_results)

# References