---
title: "Single-cell immune repertoire trajectory analysis with dandelionR and slingshot"
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 2
    number_sections: true
knitr:
  opts_chunk:
    dev: 'png'
    fig.align: 'left'
date: "`r Sys.Date()`"
vignette: >
  %\VignetteIndexEntry{Single-cell immune repertoire trajectory analysis with dandelionR and slingshot}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo=FALSE, results="hide", message=FALSE}
knitr::opts_chunk$set(error = FALSE, message = FALSE, warning = FALSE)
library(BiocStyle)
```

# Foreword

In this vignette, we will demonstrate how to perform TCR trajectory analysis
using the `dandelionR` package in conjunction with `slingshot`.

# Installing slinshot
```{r, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
if (!requireNamespace("slinghot", quietly = TRUE)) {
    BiocManager::install("slingshot")
}
```
## Load the required libraries and data

```{r, message=FALSE, warning=FALSE}
library(dandelionR)
library(scRepertoire)
library(scater)

data(sce_vdj)
```

We will also set the seed so that the plots and results are consistent.
```{r}
set.seed(123)
```

## Setup the data as per the other vignettes
We will use the full `SingleCellExperiment` object `sce_vdj` that contains enough cells to demonstrate the trajectory analysis with `slingshot`.
```{r}
sce_vdj <- setupVdjPseudobulk(sce_vdj,
    already.productive = FALSE,
    allowed_chain_status = c(
        "Single pair", "Extra pair",
        "Extra pair-exception", "Orphan VDJ",
        "Orphan VDJ-exception"
    )
)

plotUMAP(sce_vdj, color_by = "anno_lvl_2_final_clean")
```

## Milo object and neighbourhood graph construction
We will use miloR to create the pseudobulks based on the gene expression data. 
The goal is to construct a neighbourhood graph with many neighbors with which 
we can sample the representative neighbours to form the objects.

```{r, warning = FALSE}
library(miloR)
milo_object <- Milo(sce_vdj)
milo_object <- buildGraph(milo_object, k = 30, d = 20, reduced.dim = "X_scvi")
milo_object <- makeNhoods(milo_object,
    reduced_dims = "X_scvi", d = 20,
    prop = 0.3
)
```
## Construct UMAP on milo neighbor graph

We can visualise this milo object using UMAP.

```{r, warning = FALSE}
milo_object <- miloUmap(milo_object, n_neighbors = 30)
```

```{r}
plotUMAP(milo_object,
    color_by = "anno_lvl_2_final_clean",
    dimred = "UMAP_knngraph"
)
```

# Construct pseudobulked VDJ feature space

Next, we will construct the pseudobulked VDJ feature space using the 
neighbourhood graph constructed above. We will also run PCA on the 
pseudobulked VDJ feature space.
```{r}
pb.milo <- vdjPseudobulk(milo_object,
    mode_option = "abT",
    col_to_take = "anno_lvl_2_final_clean"
)
```

Inspect the newly created `pb.milo` object.
```{r}
pb.milo
```

We can compute and visualise the PCA of the pseudobulked VDJ feature space.
```{r}
pb.milo <- runPCA(pb.milo, assay.type = "Feature_space", ncomponents = 20)
plotPCA(pb.milo, color_by = "anno_lvl_2_final_clean")
```

## TCR trajectory inference using Slingshot
In the original dandelion Python package, trajectory inference is performed using the palantir package.
Here, we instead use the Slingshot package.
```{r}
library(slingshot)
```

### input
Slingshot requires a matrix representing cells in a reduced-dimensional space and a vector of cluster labels.
Here, we use PCA for dimensionality reduction and the column anno_lvl_2_final_clean from colData as the cluster labels.
```{r}
table(colData(pb.milo)[["anno_lvl_2_final_clean"]])
```

### run slingshot
```{r}
pb.milo <- slingshot(pb.milo, clusterLabels = colData(pb.milo)[["anno_lvl_2_final_clean"]], reducedDim = "PCA", start.clus = "faDP(P)_T", end.clus = c("faCD4+T", "faCD8+T"))
```

# Visualization
pseudotime of each lineage, if the pseudobulk is with value NA, it will not appear on the phage.

```{r}
library(fields)
# make a color palette from blue to red
colors <- colorRampPalette(rev(c("#d7191c", "#fdae61", "#ffffbf", "#abd9e9", "#2c7bb6")))(50)
plot(reducedDims(pb.milo)$PCA, col = colors[cut(pb.milo$slingPseudotime_1, breaks = 50)], pch = 16, asp = 1)
lines(SlingshotDataSet(pb.milo), lwd = 2, col = "black")
image.plot(
    legend.only = TRUE,
    zlim = range(pb.milo$slingPseudotime_1, na.rm = TRUE),
    col = colors,
    legend.lab = "Pseudotime"
)
```
```{r}
plot(reducedDims(pb.milo)$PCA, col = colors[cut(pb.milo$slingPseudotime_2, breaks = 50)], pch = 16, asp = 1)
lines(SlingshotDataSet(pb.milo), lwd = 2, col = "black")
image.plot(
    legend.only = TRUE,
    zlim = range(pb.milo$slingPseudotime_2, na.rm = TRUE),
    col = colors,
    legend.lab = "Pseudotime"
)
```

# Transfer
The next step is to project the pseudotime 
information from the pseudobulks back to each cell in the dataset. If the cell 
do not belong to any of the pseudobulk, it will be removed. If a cell belongs 
to multiple pseudobulk samples, its value should be calculated as a weighted 
average of the corresponding values from each pseudobulk, where each weight 
is inverse of the size of the pseudobulk.
Pseudobulks with NA values are excluded during the projection. If all pseudobulks associated with a cell have NA values, the projected value for that cell will also be NA.
## Project pseudobulk data to each cell
```{r}
cdata <- projectPseudotimeToCell(milo_object, pb.milo, value_key = c("slingPseudotime_1", "slingPseudotime_2"))
```

## Visualise the trajectory data on a per cell basis
The cell with NA value will in grey.
```{r, message=FALSE}
pal <- colorRampPalette(rev((RColorBrewer::brewer.pal(9, "RdYlBu"))))(255)
plotUMAP(cdata, color_by = "anno_lvl_2_final_clean", dimred = "UMAP_knngraph")
plotUMAP(cdata, color_by = "slingPseudotime_1", dimred = "UMAP_knngraph") +
    scale_color_gradientn(colors = pal)
plotUMAP(cdata, color_by = "slingPseudotime_2", dimred = "UMAP_knngraph") +
    scale_color_gradientn(colors = pal)
```

And that's it! We have successfully inferred the trajectory of the T-cells in 
this dataset with `slingshot`!

# Session info

```{r, warning = FALSE}
sessionInfo()
```