--- title: "Visualization of Multi Dimensional Scaling (MDS) objects" author: - name: Andrea Vicini - name: Philippe Hauchamps package: MDSvis abstract: > This vignette is the main one describing the functionality implemented in the `MDSvis` package. `MDSvis` conatins a shiny application enabling interactive visualization of Multi Dimensional Scaling (MDS) objects. This vignette is distributed under a CC BY-SA 4.0 license. output: BiocStyle::html_document: toc_float: true bibliography: MDSvis.bib vignette: > %\VignetteIndexEntry{Visualization of Multi Dimensional Scaling (MDS) objects} %\VignetteEngine{knitr::rmarkdown} %%\VignetteKeywords{multidimensionalScaling, Software, Visualization} %\VignetteEncoding{UTF-8} --- ```{r style, echo=FALSE, results='hide', message=FALSE} BiocStyle::markdown() ``` # Installation and loading dependencies To install this package, start R and enter: ```{r package_install, eval=FALSE, message=FALSE, warning=FALSE} if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager") if (!require("MDSvis", quietly = TRUE)) BiocManager::install("MDSvis") ``` We now load the packages needed in the current vignette: ```{r rlibs, results=FALSE, message=FALSE, warning=FALSE} library(HDCytoData) library(CytoMDS) library(MDSvis) ``` # Introduction The `MDSvis` package implements visualization of Multi Dimensional Scaling (MDS) objects representing a low dimensional projection of sample distances. Such objects can be obtained using features implemented in the `CytoMDS` package (@Hauchamps2025-zm). For more information see the [`CytoMDS` vignette](https://www.bioconductor.org/packages/release/bioc/vignettes/CytoMDS/inst/doc/CytoMDS.html). The visualization is done via a Shiny app that allows the user to interactively customise the plots depending on a series of input parameters (see the `CytoMDS::ggplotSampleMDS()` function for more details on the parameters). **IMPORTANT**: the example provided in the current vignette uses cytometry data. As a result, the distance matrix, which contains the pairwise sample distances, and serves as input to the MDS projection, is here calculated from cytometry data samples using the `CytoMDS` package. However, the input MDS object can also be calculated from any distance matrix, where the units of interest are not necessarily cytometry samples. This is demonstrated in this additional [vignette](MDSvis_Input_From_Distance.html). # Illustrative dataset We load an illustrative mass cytometry (*CyTOF*) dataset from [@Krieg2018-zs], accessible from the Bioconductor *HDCytoData* data package (@Weber2019-qp). *Krieg_Anti_PD_1* was used to characterize immune cell subsets in peripheral blood from melanoma skin cancer patients treated with anti-PD-1 immunotherapy. This study found that the frequency of CD14+CD16-HLA-DRhi monocytes in baseline samples (taken from patients prior to treatment) was a strong predictor of survival in response to immunotherapy treatment. Notably, this dataset contains a strong batch effect, due to sample acquisition on two different days (@Krieg2018-zs). # Generation of input files In the current section, we show how to build the input objects needed by the shiny app, step by step. However, if you are only interested in the description of the app functionalities, you might decide to launch the application demo mode - which automatically pre-loads the *Krieg_Anti_PD_1* dataset - and jump to [the next section](#visualization-of-the-mds-projection). First we load the *Krieg_Anti_PD_1* dataset, from the *HDCytoData* package. ```{r loadDataSet} Krieg_fs <- Krieg_Anti_PD_1_flowSet() Krieg_fs ``` Next we build a *phenodata* dataframe with experimental design information, which are here found in the cytometry data structures. ```{r convertPhenoData} # update phenoData structure chLabels <- keyword(Krieg_fs[[1]], "MARKER_INFO")$MARKER_INFO$channel_name chMarkers <- keyword(Krieg_fs[[1]], "MARKER_INFO")$MARKER_INFO$marker_name marker_class <- keyword(Krieg_fs[[1]], "MARKER_INFO")$MARKER_INFO$marker_class chLabels <- chLabels[marker_class != "none"] chMarkers <- chMarkers[marker_class != "none"] # marker_class all equal to "type", 24 markers are left phenoData <- flowCore::pData(Krieg_fs) additionalPhenoData <- keyword(Krieg_fs[[1]], "EXPERIMENT_INFO")$EXPERIMENT_INFO phenoData <- cbind(phenoData, additionalPhenoData) flowCore::pData(Krieg_fs) <- phenoData ``` Next, we scale-transform the mass cytometry data and calculate the pairwise distance matrix. ```{r distCalc} # transform flow set (arcsinh(cofactor = 5)) trans <- arcsinhTransform( transformationId="ArcsinhTransform", a = 0, b = 1/5, c = 0) # Applying arcsinh() transformation Krieg_fs_trans <- transform( Krieg_fs, transformList(chLabels, trans)) # Calculating Sample distances pwDist <- pairwiseEMDDist( Krieg_fs_trans, channels = chMarkers, verbose = FALSE ) ``` From the distance matrix, we can now calculate the MDS projection. This is also done using the `CytoMDS` package. ```{r MDSCalc} mdsObj <- computeMetricMDS( pwDist, seed = 0) show(mdsObj) ``` Optionally, for the use of biplots as an interpretation tool, sample specific statistics can be provided by the user. Here we calculate standard univariate statistics from each variable of the multidimensional distribution (again using the `CytoMDS` package). ```{r statsCalc} # Computing sample statistics statFUNs <- c( "median" = stats::median, "std-dev" = stats::sd, "mean" = base::mean, "Q25" = function(x, na.rm) stats::quantile(x, probs = 0.25, na.rm = na.rm), "Q75" = function(x, na.rm) stats::quantile(x, probs = 0.75, na.rm = na.rm)) chStats <- CytoMDS::channelSummaryStats( Krieg_fs_trans, channels = chMarkers, statFUNs = statFUNs) ``` The newly created objects can be saved as .rds files. These files can in turn be selected within the shiny app for visualization. ```{r saveRDS, eval=FALSE} saveRDS(mdsObj, file = "Krieg_mdsObj.rds") saveRDS(phenoData, file = "Krieg_phenoData.rds") saveRDS(chStats, file = "Krieg_chStats.rds") ``` # Visualization of the MDS projection The `MDSvis` function `run_app` launches the interactive Shiny app and the three tabs window in the figure below gets opened. ```{r launch, eval = FALSE} MDSvis::mdsvis_app() # alternatively launch the application in demo mode, with the Krieg_Anti_PD_1 # dataset already loaded. #MDSvis::mdsvis_app(preLoadDemoDataset = TRUE) ``` In the 'Input files' tab the objects can be loaded for visualization. The plots are then shown in the 'View' tab while the 'General settings' contains more technical plot settings controls. All the input files are expected to have .rds extension and at least a file containing the MDS object has to be loaded. Optionally a phenodata file containing a phenodata dataframe and/or a file containing a list of statistics for biplot visualization can be loaded. Alternatively, a 'demo mode' (see in commented code chunck above) allows the user to launch the app with the *Krieg_Anti_PD_1* dataset pre-loaded, without importing input .rds files. ![](images/InputFiles.png) We can select the previously created files and proceed with the visualization. Note that when a phenodata file is selected a new control appears allowing to select only a subset of variables (by default all are selected). These phenodata selected variables will be the only ones available in the drop-down list controls in the 'View' tab. ![](images/InputFiles2.png) We can now open the 'View' tab to see the plot of the projection results. The controls on the side allow to choose the projection axes; colour, label, define facet or shape the points according to phenodata variables; add biplot; show a `plotly` plot for interactive plot exploration or flip axes. ![](images/View.png) For example we can colour the points according to group_id, label the points according to sample_id, and use shapes according to batch_id, as shown in the figure below. ![](images/View2.png) We can also defne facets according to the two batches, as in the figure below: ![](images/View3.png) We can also add a biplot created based on sample statistics by clicking on the biplot checkbox. The idea is to calculate the correlation of the sample statistics w.r.t. the axes of projection, so that these correlations can be represented on a correlation circle overlaid on the projection plot. The desired statistic can be selected from the drop down menu and it is possible to show only the arrows that respect a selected length threshold. In the example below, the chosen statistic is the median while the arrow length threshold is 0.8. ![](images/ViewBiplot.png) When some data are too large to be displayed as labels, one possible solution is to display an interactive `plotly` plot below the regular one by selecting the corresponding checkbox. We can add new `plotly` tooltips and highlight the corresponding information for each point by hovering over them. ![](images/ViewPlotly.png) Finally, the obtained plots can be exported as a pdf file. The user can defined the corresponding plot title, pdf file name and plot size (with and height) to be used. ![](images/PDFExport.png) # General settings For completeness we show below the 'General settings' tab which contains some general controls regarding e.g. points features, the corresponding labels and biplot arrows. For more details see the `CytoMDS::ggplotSampleMDS()` function parameters. ![](images/GeneralSettings.png) # Session information {-} ```{r sessioninfo, echo=FALSE} sessionInfo() ``` # References {-}