---
title: "Introduction to Rtpca"
author:
- name: Nils Kurzawa
  affiliation: 
  - European Molecular Biology Laboratory (EMBL), Genome Biology Unit
date: "`r format(Sys.time(), '%d %B, %Y')`"
package: Rtpca
output:
  BiocStyle::html_document 
vignette: >
    %\VignetteIndexEntry{Introduction to Rtpca}
    %\VignetteEngine{knitr::rmarkdown}
    %VignetteEncoding{UTF-8}
bibliography: refs.bib
csl: cell.csl
header-includes: 
- \usepackage{placeins}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Installation

Installation from *Bioconductor*
```{r, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("Rtpca")
```

2. Load the package into your R session.
```{r Load, message=FALSE}
library(Rtpca)
```

# Introduction

Thermal proteome profiling (TPP) [@Savitski2014; @Mateus2020] is a mass
spectrometry-based, proteome-wide implemention of the cellular thermal shift
assay [@Molina2013]. It was originally developed to study drug-(off-)target
engagement. However, it was realized that profiles of interacting protein 
pairs appeared more similar than by chance which was coined as 
'thermal proximity co-aggregation' (TPCA) [@Tan2018]. The R package `Rtpca`
enables analysis of TPP datasets using the TPCA concept for studying
protein-protein interactions and protein complexes and also allows to test for
differential protein-protein interactions across different conditions.         

This vignette only represents a minimal example. To have a look at a more
realistic example feel free to check out [this more realisticexample](https://github.com/nkurzaw/Rtpca_analysis/blob/master/Becher_et_al_reanalysis.pdf).

**Note:** if you use `Rtpca` in published research, please cite:

> Kurzawa, N., Mateus, A. & Savitski, M.M. (2020)
> Rtpca: an R package for differential thermal proximity coaggregation analysis.
> *Bioinformatics*, [10.1093/bioinformatics/btaa682](https://doi.org/10.1093/bioinformatics/btaa682)


# The `Rtpca` package workflow

We also load the `TPP` package to illustrate how to import TPP data with the
Bioconductor package and then input it into the `Rtpca` functions.
```{r, message=FALSE, warning=FALSE}
library(TPP)
```

## Import Thermal proteome profiling data using the `TPP` package

We load the data `hdacTR_smallExample` which is part of the `TPP` package
```{r}
data("hdacTR_smallExample")
```

Filter hdacTR_data to speed up computations
```{r}
set.seed(123)
random_proteins <- sample(hdacTR_data[[1]]$gene_name, 300)
```

```{r}
hdacTR_data_fil <- lapply(hdacTR_data, function(temp_df){
    filter(temp_df, gene_name %in% random_proteins)
})
```

We can now import our small example dataset using the import function from 
the `TPP` package:
```{r}
trData <- tpptrImport(configTable = hdacTR_config, data = hdacTR_data_fil)
```

## Performing thermal co-aggregation analysis with `Rtpca`

Then, we load `string_ppi_df` which is a data frame that annotates
protein-protein interactions as obtained from StringDB [@Szklarczyk2019] that
comes with the `Rtpca` package
```{r}
data("string_ppi_df")
string_ppi_df
```

This table has been created from the human *protein.links* table downloaded from the StringDB website. It can serve as a template for users to create equivalent tables for other organisms.

### Run TPCA on data from a single condition

We can run TPCA for protein-protein interactions like this by using the 
function `runTPCA` 
```{r}
string_ppi_cs_950_df <- string_ppi_df %>% 
    filter(combined_score >= 950 )

vehTPCA <- runTPCA(
    objList = trData,
    ppiAnno = string_ppi_cs_950_df
)
```
Note: it is not necessary that your data has the format of the TPP package
(ExpressionSet), you can also supply the function with a list of matrices of 
data frames (in the case of data frames you need to additionally indicate with
column contains the protein or gene names).

We can also run TPCA to test for coaggregation of protein complexes. For this
purpose with can load a data frame that annotates proteins to protein complexes
curated by @Ori2016
```{r}
data("ori_et_al_complexes_df")
ori_et_al_complexes_df
```

Then, we can invoke

```{r}
vehComplexTPCA <- runTPCA(
    objList = trData,
    complexAnno = ori_et_al_complexes_df,
    minCount = 2
)
```

We can plot a ROC curve for how well our data captures protein-protein
interactions:
```{r}
plotPPiRoc(vehTPCA, computeAUC = TRUE)
```

And we can also plot a ROC curve for how well our data captures protein
complexes:

```{r}
plotComplexRoc(vehComplexTPCA, computeAUC = TRUE)

```


### Run differential TPCA on two conditions 

In order to test for protein-protein interactions that change significantly
between both conditions, we can run the `runDiffTPCA` as illustrated below:

```{r}
diffTPCA <- 
    runDiffTPCA(
        objList = trData[1:2], 
        contrastList = trData[3:4],
        ctrlCondName = "DMSO",
        contrastCondName = "Panobinostat",
        ppiAnno = string_ppi_cs_950_df)
```

We can then plot a volcano plot to visualize the results:
```{r}
plotDiffTpcaVolcano(
    diffTPCA,
    setXLim = TRUE,
    xlimit = c(-0.5, 0.5))
```

The underlying result table can be inspected like this;
```{r}
head(diffTpcaResultTable(diffTPCA) %>% 
         arrange(p_value) %>% 
        dplyr::select(pair, rssC1_rssC2, f_stat, p_value, p_adj))
```

We can see that none of these interactions is significant consiering the 
multiple comparisons we have done. Yet, we can look at the melting curves of 
pairs like the "KPNA6:KPNB1" by evoking:

```{r}
plotPPiProfiles(diffTPCA, pair = c("KPNA6", "KPNB1"))
```
We can see that both protein do seem to coaggregate, but that the mild 
difference in the treatment condition compared to the control condition is 
likely due to technical rather than biological reasons.       
This way of inspecting hits obtained by the differential analysis is 
recommended in the case that significant pairs can be found to validate that 
they do coaggregate in one condition and that the less strong coaggregations 
in the other condition is based on reliable signal.      


# Additional remarks
As mentioned above, this vignette includes only a very minimal example, have a
look at a more extensive example [here](https://github.com/nkurzaw/Rtpca_analysis/blob/master/Becher_et_al_reanalysis.pdf).

```{r}
sessionInfo()
```

# Acknowledgements

A big thanks to Thomas Naake and Mike Smith for helping with speeding up the code.

# References