--- title: "Single-cell immune repertoire trajectory analysis with dandelionR and slingshot" output: BiocStyle::html_document: toc: true toc_depth: 2 number_sections: true knitr: opts_chunk: dev: 'png' fig.align: 'left' date: "`r Sys.Date()`" vignette: > %\VignetteIndexEntry{Single-cell immune repertoire trajectory analysis with dandelionR and slingshot} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo=FALSE, results="hide", message=FALSE} knitr::opts_chunk$set(error = FALSE, message = FALSE, warning = FALSE) library(BiocStyle) ``` # Foreword In this vignette, we will demonstrate how to perform TCR trajectory analysis using the `dandelionR` package in conjunction with `slingshot`. # Installing slinshot ```{r, eval = FALSE} if (!requireNamespace("BiocManager", quietly = TRUE)) { install.packages("BiocManager") } if (!requireNamespace("slinghot", quietly = TRUE)) { BiocManager::install("slingshot") } ``` ## Load the required libraries and data ```{r, message=FALSE, warning=FALSE} library(dandelionR) library(scRepertoire) library(scater) data(sce_vdj) ``` We will also set the seed so that the plots and results are consistent. ```{r} set.seed(123) ``` ## Setup the data as per the other vignettes We will use the full `SingleCellExperiment` object `sce_vdj` that contains enough cells to demonstrate the trajectory analysis with `slingshot`. ```{r} sce_vdj <- setupVdjPseudobulk(sce_vdj, already.productive = FALSE, allowed_chain_status = c( "Single pair", "Extra pair", "Extra pair-exception", "Orphan VDJ", "Orphan VDJ-exception" ) ) plotUMAP(sce_vdj, color_by = "anno_lvl_2_final_clean") ``` ## Milo object and neighbourhood graph construction We will use miloR to create the pseudobulks based on the gene expression data. The goal is to construct a neighbourhood graph with many neighbors with which we can sample the representative neighbours to form the objects. ```{r, warning = FALSE} library(miloR) milo_object <- Milo(sce_vdj) milo_object <- buildGraph(milo_object, k = 30, d = 20, reduced.dim = "X_scvi") milo_object <- makeNhoods(milo_object, reduced_dims = "X_scvi", d = 20, prop = 0.3 ) ``` ## Construct UMAP on milo neighbor graph We can visualise this milo object using UMAP. ```{r, warning = FALSE} milo_object <- miloUmap(milo_object, n_neighbors = 30) ``` ```{r} plotUMAP(milo_object, color_by = "anno_lvl_2_final_clean", dimred = "UMAP_knngraph" ) ``` # Construct pseudobulked VDJ feature space Next, we will construct the pseudobulked VDJ feature space using the neighbourhood graph constructed above. We will also run PCA on the pseudobulked VDJ feature space. ```{r} pb.milo <- vdjPseudobulk(milo_object, mode_option = "abT", col_to_take = "anno_lvl_2_final_clean" ) ``` Inspect the newly created `pb.milo` object. ```{r} pb.milo ``` We can compute and visualise the PCA of the pseudobulked VDJ feature space. ```{r} pb.milo <- runPCA(pb.milo, assay.type = "Feature_space", ncomponents = 20) plotPCA(pb.milo, color_by = "anno_lvl_2_final_clean") ``` ## TCR trajectory inference using Slingshot In the original dandelion Python package, trajectory inference is performed using the palantir package. Here, we instead use the Slingshot package. ```{r} library(slingshot) ``` ### input Slingshot requires a matrix representing cells in a reduced-dimensional space and a vector of cluster labels. Here, we use PCA for dimensionality reduction and the column anno_lvl_2_final_clean from colData as the cluster labels. ```{r} table(colData(pb.milo)[["anno_lvl_2_final_clean"]]) ``` ### run slingshot ```{r} pb.milo <- slingshot(pb.milo, clusterLabels = colData(pb.milo)[["anno_lvl_2_final_clean"]], reducedDim = "PCA", start.clus = "faDP(P)_T", end.clus = c("faCD4+T", "faCD8+T")) ``` # Visualization pseudotime of each lineage, if the pseudobulk is with value NA, it will not appear on the phage. ```{r} library(fields) # make a color palette from blue to red colors <- colorRampPalette(rev(c("#d7191c", "#fdae61", "#ffffbf", "#abd9e9", "#2c7bb6")))(50) plot(reducedDims(pb.milo)$PCA, col = colors[cut(pb.milo$slingPseudotime_1, breaks = 50)], pch = 16, asp = 1) lines(SlingshotDataSet(pb.milo), lwd = 2, col = "black") image.plot( legend.only = TRUE, zlim = range(pb.milo$slingPseudotime_1, na.rm = TRUE), col = colors, legend.lab = "Pseudotime" ) ``` ```{r} plot(reducedDims(pb.milo)$PCA, col = colors[cut(pb.milo$slingPseudotime_2, breaks = 50)], pch = 16, asp = 1) lines(SlingshotDataSet(pb.milo), lwd = 2, col = "black") image.plot( legend.only = TRUE, zlim = range(pb.milo$slingPseudotime_2, na.rm = TRUE), col = colors, legend.lab = "Pseudotime" ) ``` # Transfer The next step is to project the pseudotime information from the pseudobulks back to each cell in the dataset. If the cell do not belong to any of the pseudobulk, it will be removed. If a cell belongs to multiple pseudobulk samples, its value should be calculated as a weighted average of the corresponding values from each pseudobulk, where each weight is inverse of the size of the pseudobulk. Pseudobulks with NA values are excluded during the projection. If all pseudobulks associated with a cell have NA values, the projected value for that cell will also be NA. ## Project pseudobulk data to each cell ```{r} cdata <- projectPseudotimeToCell(milo_object, pb.milo, value_key = c("slingPseudotime_1", "slingPseudotime_2")) ``` ## Visualise the trajectory data on a per cell basis The cell with NA value will in grey. ```{r, message=FALSE} pal <- colorRampPalette(rev((RColorBrewer::brewer.pal(9, "RdYlBu"))))(255) plotUMAP(cdata, color_by = "anno_lvl_2_final_clean", dimred = "UMAP_knngraph") plotUMAP(cdata, color_by = "slingPseudotime_1", dimred = "UMAP_knngraph") + scale_color_gradientn(colors = pal) plotUMAP(cdata, color_by = "slingPseudotime_2", dimred = "UMAP_knngraph") + scale_color_gradientn(colors = pal) ``` And that's it! We have successfully inferred the trajectory of the T-cells in this dataset with `slingshot`! # Session info ```{r, warning = FALSE} sessionInfo() ```