---
title: 'Preparing an Alternative Splicing Annotation for psichomics'
author: "Nuno Saraiva Agostinho"
date: "`r Sys.Date()`"
bibliography: refs.bib
output: 
    rmarkdown::html_vignette:
        toc: true
vignette: >
  %\VignetteIndexEntry{Preparing an Alternative Splicing Annotation for psichomics}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

# Creating custom alternative splicing annotation

psichomics quantifies alternative splicing based on alternative splicing event
annotations from [MISO][miso], [SUPPA][suppa], [VAST-TOOLS][vasttools] and
[rMATS][rmats]. New alternative splicing annotation may be prepared and used in
psichomics by parsing alternative splicing events from those tools. Please
contact me if you would like to see support for other tools.

This tutorial will guide you on how to parse alternative splicing events from
different tools. To do so, start by loading the following packages:

```{r message=FALSE}
library(psichomics)
library(plyr)
```

## SUPPA annotation
[SUPPA][suppa] generates alternative splicing events based on a transcript 
annotation. Start by running SUPPA's `generateEvents` script with a transcript 
file (GTF format) for all event types, if desired. See [SUPPA's page][suppa] for
more information.

The resulting output will include a directory containing tab-delimited files
with alternative splicing events (one file for each event type). Hand the path
of this directory to the function `parseSuppaAnnotation()`, prepare the 
annotation using `prepareAnnotationFromEvents()` and save the output to a RDS 
file:

```{r, include=FALSE}
suppaOutput <- system.file("extdata/eventsAnnotSample/suppa_output/suppaEvents",
                           package="psichomics")
suppaFile <- tempfile(fileext=".RDS")
```

```{r, results="hide"}
# suppaOutput <- "path/to/SUPPA/output"

# Replace `genome` for the string with the identifier before the first
# underscore in the filenames of that directory (for instance, if one of your 
# filenames of interest is "hg19_A3.ioe", the string would be "hg19")
suppa <- parseSuppaAnnotation(suppaOutput, genome="hg19")
annot <- prepareAnnotationFromEvents(suppa)

# suppaFile <- "suppa_hg19_annotation.RDS"
saveRDS(annot, file=suppaFile)
```

## rMATS annotation
Just like [SUPPA][suppa], [rMATS][rmats] also allows to generate alternative 
splicing events based on a transcript annotation, although two BAM or FASTQ 
files are required to generate alternative splicing events. Read 
[rMATS' page][rmats] for more information.

The resulting output of [rMATS][rmats] is then handed out to the function
`parseMatsAnnotation()`:

```{r, include=FALSE}
matsOutput <- system.file("extdata/eventsAnnotSample/mats_output/ASEvents/",
                          package="psichomics")
matsFile <- tempfile("mats", fileext=".RDS")
```

```{r, results="hide"}
# matsOutput <- "path/to/rMATS/output"
mats <- parseMatsAnnotation(
    matsOutput,         # Output directory from rMATS
    genome = "fromGTF", # Identifier of the filenames
    novelEvents=TRUE)   # Parse novel events?
annot <- prepareAnnotationFromEvents(mats)

# matsFile <- "mats_hg19_annotation.RDS"
saveRDS(annot, file=matsFile)
```

## MISO annotation
Simply retrieve [MISO's alternative splicing annotation][misoAnnot] and give the
path to the downloaded folder as input.

```{r, include=FALSE}
misoAnnotation <- system.file("extdata/eventsAnnotSample/miso_annotation",
                              package="psichomics")
misoFile <- tempfile("miso", fileext=".RDS")
```

```{r, results="hide"}
# misoAnnotation <- "path/to/MISO/annotation"
miso <- parseMisoAnnotation(misoAnnotation)
annot <- prepareAnnotationFromEvents(miso)

# misoFile <- "miso_AS_annotation_hg19.RDS"
saveRDS(annot, file=misoFile)
```

## VAST-TOOLS annotation
Download and extract [VAST-TOOLS' alternative splicing annotation][vastAnnot]
and use the path to the `TEMPLATES` subfolder as the input of
`parseVastToolsAnnotation()`. Complex events (i.e. alternative coordinates for
the exon ends) are not currently supported.

```{r, include=FALSE}
vastAnnotation <- system.file("extdata/eventsAnnotSample/VASTDB/Hsa/TEMPLATES/",
                              package="psichomics")
vastFile <- tempfile("vast", fileext=".RDS")
```

```{r, results="hide"}
# vastAnnotation <- "path/to/VASTDB/libs/TEMPLATES"
vast <- parseVastToolsAnnotation(vastAnnotation, genome="Hsa")
annot <- prepareAnnotationFromEvents(vast)

# vastFile <- "vast_AS_annotation_hg19.RDS"
saveRDS(annot, file=vastFile)
```

## Combining annotation from different sources
To combine the annotation from different sources, provide the parsed annotations
of interest simultaneously to the function `prepareAnnotationFromEvents`:

```{r, include=FALSE}
annotFile <- tempfile(fileext=".RDS")
```

```{r, results="hide"}
# Combine the annotation from SUPPA, MISO, rMATS and VAST-TOOLS
annot <- prepareAnnotationFromEvents(suppa, vast, mats, miso)

# annotFile <- "AS_annotation_hg19.RDS"
saveRDS(annot, file=annotFile)
```

# Quantifying alternative splicing using the custom annotation
The created alternative splicing annotation can be used in psichomics for 
alternative splicing quantification. To do so, when using the GUI version of
psichomics, be sure to select the **Load annotation from file...** option, click
the button that appears below and select the recently created RDS file.

Otherwise, if you are using the CLI version, perform the following steps:

```{r, results="hide"}
annot <- readRDS(annotFile) # "annotFile" is the path to the annotation file
junctionQuant <- readFile("ex_junctionQuant.RDS") # example set

psi <- quantifySplicing(annot, junctionQuant)
```

```{r}
psi # may have 0 rows because of the small junction quantification set
```

# Feedback

All feedback on the program, documentation and associated material (including
this tutorial) is welcome. Please send any suggestions and comments to:

> Nuno Saraiva-Agostinho (nunoagostinho@medicina.ulisboa.pt)
>
> [Disease Transcriptomics Lab, Instituto de Medicina Molecular (Portugal)][distrans]

[suppa]: https://bitbucket.org/regulatorygenomicsupf/suppa
[rmats]: http://rnaseq-mats.sourceforge.net
[miso]: http://genes.mit.edu/burgelab/miso/
[vasttools]: https://github.com/vastgroup/vast-tools
[misoAnnot]: https://miso.readthedocs.io/en/fastmiso/annotation.html
[vastAnnot]: https://github.com/vastgroup/vast-tools#vastdb-libraries
[distrans]: http://imm.medicina.ulisboa.pt/group/distrans/