In this vignette, we will demonstrate how to perform TCR trajectory analysis
using the dandelionR
package in conjunction with slingshot
.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
if (!requireNamespace("slinghot", quietly = TRUE)) {
BiocManager::install("slingshot")
}
library(dandelionR)
library(scRepertoire)
library(scater)
data(sce_vdj)
We will also set the seed so that the plots and results are consistent.
set.seed(123)
We will use the full SingleCellExperiment
object sce_vdj
that contains enough cells to demonstrate the trajectory analysis with slingshot
.
sce_vdj <- setupVdjPseudobulk(sce_vdj,
already.productive = FALSE,
allowed_chain_status = c(
"Single pair", "Extra pair",
"Extra pair-exception", "Orphan VDJ",
"Orphan VDJ-exception"
)
)
plotUMAP(sce_vdj, color_by = "anno_lvl_2_final_clean")
We will use miloR to create the pseudobulks based on the gene expression data. The goal is to construct a neighbourhood graph with many neighbors with which we can sample the representative neighbours to form the objects.
library(miloR)
milo_object <- Milo(sce_vdj)
milo_object <- buildGraph(milo_object, k = 30, d = 20, reduced.dim = "X_scvi")
milo_object <- makeNhoods(milo_object,
reduced_dims = "X_scvi", d = 20,
prop = 0.3
)
We can visualise this milo object using UMAP.
milo_object <- miloUmap(milo_object, n_neighbors = 30)
plotUMAP(milo_object,
color_by = "anno_lvl_2_final_clean",
dimred = "UMAP_knngraph"
)
Next, we will construct the pseudobulked VDJ feature space using the neighbourhood graph constructed above. We will also run PCA on the pseudobulked VDJ feature space.
pb.milo <- vdjPseudobulk(milo_object,
mode_option = "abT",
col_to_take = "anno_lvl_2_final_clean"
)
Inspect the newly created pb.milo
object.
pb.milo
## class: Milo
## dim: 151 499
## metadata(0):
## assays(1): Feature_space
## rownames(151): TRBV10-1 TRBV10-2 ... TRAJ8 TRAJ9
## rowData names(0):
## colnames(499): 2463 147 ... 977 533
## colData names(3): anno_lvl_2_final_clean
## anno_lvl_2_final_clean_fraction cell_count
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## nhoods dimensions(2): 499 2657
## nhoodCounts dimensions(2): 1 1
## nhoodDistances dimension(1): 0
## graph names(0):
## nhoodIndex names(1): 0
## nhoodExpression dimension(2): 1 1
## nhoodReducedDim names(0):
## nhoodGraph names(0):
## nhoodAdjacency dimension(2): 1 1
We can compute and visualise the PCA of the pseudobulked VDJ feature space.
pb.milo <- runPCA(pb.milo, assay.type = "Feature_space", ncomponents = 20)
plotPCA(pb.milo, color_by = "anno_lvl_2_final_clean")
In the original dandelion Python package, trajectory inference is performed using the palantir package. Here, we instead use the Slingshot package.
library(slingshot)
Slingshot requires a matrix representing cells in a reduced-dimensional space and a vector of cluster labels. Here, we use PCA for dimensionality reduction and the column anno_lvl_2_final_clean from colData as the cluster labels.
table(colData(pb.milo)[["anno_lvl_2_final_clean"]])
##
## faABT(ENTRY) faCD4+T faCD8+T faDP(P)_T faDP(Q)_T
## 98 185 106 49 61
pb.milo <- slingshot(pb.milo, clusterLabels = colData(pb.milo)[["anno_lvl_2_final_clean"]], reducedDim = "PCA", start.clus = "faDP(P)_T", end.clus = c("faCD4+T", "faCD8+T"))
pseudotime of each lineage, if the pseudobulk is with value NA, it will not appear on the phage.
library(fields)
# make a color palette from blue to red
colors <- colorRampPalette(rev(c("#d7191c", "#fdae61", "#ffffbf", "#abd9e9", "#2c7bb6")))(50)
plot(reducedDims(pb.milo)$PCA, col = colors[cut(pb.milo$slingPseudotime_1, breaks = 50)], pch = 16, asp = 1)
lines(SlingshotDataSet(pb.milo), lwd = 2, col = "black")
image.plot(
legend.only = TRUE,
zlim = range(pb.milo$slingPseudotime_1, na.rm = TRUE),
col = colors,
legend.lab = "Pseudotime"
)
plot(reducedDims(pb.milo)$PCA, col = colors[cut(pb.milo$slingPseudotime_2, breaks = 50)], pch = 16, asp = 1)
lines(SlingshotDataSet(pb.milo), lwd = 2, col = "black")
image.plot(
legend.only = TRUE,
zlim = range(pb.milo$slingPseudotime_2, na.rm = TRUE),
col = colors,
legend.lab = "Pseudotime"
)
The next step is to project the pseudotime information from the pseudobulks back to each cell in the dataset. If the cell do not belong to any of the pseudobulk, it will be removed. If a cell belongs to multiple pseudobulk samples, its value should be calculated as a weighted average of the corresponding values from each pseudobulk, where each weight is inverse of the size of the pseudobulk. Pseudobulks with NA values are excluded during the projection. If all pseudobulks associated with a cell have NA values, the projected value for that cell will also be NA. ## Project pseudobulk data to each cell
cdata <- projectPseudotimeToCell(milo_object, pb.milo, value_key = c("slingPseudotime_1", "slingPseudotime_2"))
The cell with NA value will in grey.
pal <- colorRampPalette(rev((RColorBrewer::brewer.pal(9, "RdYlBu"))))(255)
plotUMAP(cdata, color_by = "anno_lvl_2_final_clean", dimred = "UMAP_knngraph")
plotUMAP(cdata, color_by = "slingPseudotime_1", dimred = "UMAP_knngraph") +
scale_color_gradientn(colors = pal)
plotUMAP(cdata, color_by = "slingPseudotime_2", dimred = "UMAP_knngraph") +
scale_color_gradientn(colors = pal)
And that’s it! We have successfully inferred the trajectory of the T-cells in
this dataset with slingshot
!
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] fields_16.3.1 viridisLite_0.4.2
## [3] spam_2.11-1 slingshot_2.17.0
## [5] TrajectoryUtils_1.17.0 princurve_2.1.6
## [7] destiny_3.23.0 miloR_2.5.1
## [9] edgeR_4.7.3 limma_3.65.3
## [11] scater_1.37.0 scuttle_1.19.0
## [13] SingleCellExperiment_1.31.1 SummarizedExperiment_1.39.1
## [15] Biobase_2.69.0 GenomicRanges_1.61.1
## [17] Seqinfo_0.99.2 IRanges_2.43.0
## [19] S4Vectors_0.47.0 BiocGenerics_0.55.1
## [21] generics_0.1.4 MatrixGenerics_1.21.0
## [23] matrixStats_1.5.0 scRepertoire_2.5.3
## [25] ggplot2_3.5.2 dandelionR_1.1.4
## [27] BiocStyle_2.37.1
##
## loaded via a namespace (and not attached):
## [1] later_1.4.2 splines_4.5.1
## [3] tibble_3.3.0 polyclip_1.10-7
## [5] xts_0.14.1 lifecycle_1.0.4
## [7] processx_3.8.6 globals_0.18.0
## [9] lattice_0.22-7 MASS_7.3-65
## [11] magrittr_2.0.3 vcd_1.4-13
## [13] sass_0.4.10 rmarkdown_2.29
## [15] jquerylib_0.1.4 yaml_2.3.10
## [17] sp_2.2-0 chromote_0.5.1
## [19] cowplot_1.2.0 RColorBrewer_1.1-3
## [21] maps_3.4.3 abind_1.4-8
## [23] rvest_1.0.4 purrr_1.1.0
## [25] ggraph_2.2.1 hash_2.2.6.3
## [27] nnet_7.3-20 pracma_2.4.4
## [29] tweenr_2.0.3 evmix_2.12
## [31] ggrepel_0.9.6 irlba_2.3.5.1
## [33] listenv_0.9.1 iNEXT_3.0.2
## [35] MatrixModels_0.5-4 RSpectra_0.16-2
## [37] parallelly_1.45.1 DelayedMatrixStats_1.31.0
## [39] codetools_0.2-20 smoother_1.3
## [41] DelayedArray_0.35.2 xml2_1.3.8
## [43] ggforce_0.5.0 tidyselect_1.2.1
## [45] farver_2.1.2 ScaledMatrix_1.17.0
## [47] viridis_0.6.5 jsonlite_2.0.0
## [49] BiocNeighbors_2.3.1 e1071_1.7-16
## [51] tidygraph_1.3.1 progressr_0.15.1
## [53] Formula_1.2-5 survival_3.8-3
## [55] ggalluvial_0.12.5 tools_4.5.1
## [57] Rcpp_1.1.0 glue_1.8.0
## [59] gridExtra_2.3 SparseArray_1.9.1
## [61] laeken_0.5.3 xfun_0.52
## [63] ranger_0.17.0 TTR_0.24.4
## [65] websocket_1.4.4 ggthemes_5.1.0
## [67] dplyr_1.1.4 withr_3.0.2
## [69] numDeriv_2016.8-1.1 BiocManager_1.30.26
## [71] fastmap_1.2.0 boot_1.3-31
## [73] bluster_1.19.0 SparseM_1.84-2
## [75] VIM_6.2.2 digest_0.6.37
## [77] rsvd_1.0.5 R6_2.6.1
## [79] colorspace_2.1-1 gtools_3.9.5
## [81] dichromat_2.0-0.1 tidyr_1.3.1
## [83] hexbin_1.28.5 data.table_1.17.8
## [85] robustbase_0.99-4-1 class_7.3-23
## [87] graphlayouts_1.2.2 httr_1.4.7
## [89] S4Arrays_1.9.1 scatterplot3d_0.3-44
## [91] uwot_0.2.3 pkgconfig_2.0.3
## [93] gtable_0.3.6 lmtest_0.9-40
## [95] XVector_0.49.0 htmltools_0.5.8.1
## [97] carData_3.0-5 dotCall64_1.2
## [99] bookdown_0.43 SeuratObject_5.1.0
## [101] scales_1.4.0 knn.covertree_1.0
## [103] ggdendro_0.2.0 knitr_1.50
## [105] rjson_0.2.23 reshape2_1.4.4
## [107] curl_6.4.0 proxy_0.4-27
## [109] cachem_1.1.0 zoo_1.8-14
## [111] stringr_1.5.1 parallel_4.5.1
## [113] vipor_0.4.7 pillar_1.11.0
## [115] grid_4.5.1 vctrs_0.6.5
## [117] RANN_2.6.2 pcaMethods_2.1.0
## [119] promises_1.3.3 car_3.1-3
## [121] BiocSingular_1.25.0 beachmat_2.25.4
## [123] cluster_2.1.8.1 beeswarm_0.4.0
## [125] evaluate_1.0.4 magick_2.8.7
## [127] tinytex_0.57 cli_3.6.5
## [129] locfit_1.5-9.12 compiler_4.5.1
## [131] rlang_1.1.6 crayon_1.5.3
## [133] future.apply_1.20.0 labeling_0.4.3
## [135] ps_1.9.1 immApex_1.3.5
## [137] plyr_1.8.9 ggbeeswarm_0.7.2
## [139] stringi_1.8.7 BiocParallel_1.43.4
## [141] gsl_2.1-8 quantreg_6.1
## [143] Matrix_1.7-3 RcppHNSW_0.6.0
## [145] RcppEigen_0.3.4.0.2 patchwork_1.3.1
## [147] sparseMatrixStats_1.21.0 future_1.67.0
## [149] statmod_1.5.0 igraph_2.1.4
## [151] memoise_2.0.1 bslib_0.9.0
## [153] DEoptimR_1.1-4 ggplot.multistats_1.0.1