---
title: "Visualization"
author:
  - name: Kevin Stachelek
    affiliation:
    - University of Southern California
    email: kevin.stachelek@gmail.com
  - name: Bhavana Bhat
    affiliation:
    - University of Southern California
    email: bbhat@usc.edu
output:
  BiocStyle::html_document:
    self_contained: yes
    toc: true
    toc_float: true
    toc_depth: 2
    code_folding: show
date: "`r doc_date()`"
package: "`r pkg_ver('chevreulPlot')`"
vignette: >
  %\VignetteIndexEntry{Visualization}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    crop = NULL, ## Related to https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016656.html
    dpi = 100,
    out.width = "100%",
    message = FALSE,
    warning = FALSE,
    fig.width = 5,
    fig.height = 3
)
```

This article demonstrates the data visualization tools in Chevreul. We'll introduce the included functions, their usage, and the resulting plots.

The first step is to load the chevreulPlot package and all other required packages.

```{r packages}
library(chevreulPlot)
library(scater)
library(scran)
library(clustree)
library(patchwork)

# Example dataset used throughout this vignette
data("small_example_dataset")
```

The plotting functions in chevreulPlot visualize single-cell data, and each plot can be customized for interactive or non-interactive display.

# Plot expression

Expression of a feature (gene or transcript) can be plotted on a given embedding, producing a feature plot that can optionally be made interactive. When only one feature is plotted, the output corresponds to a standard single-feature plot (the equivalent of `FeaturePlot` in Seurat).

```{r}
plot_feature_on_embedding(small_example_dataset,
    embedding = "UMAP",
    features = "Gene_0001", return_plotly = FALSE
)
```

An interactive plot can be generated by specifying `return_plotly = TRUE`, which uses `ggplotly` and allows individual cells to be identified for further investigation.
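To illustrate what the `ggplotly` conversion involves, the following is a minimal sketch on a toy data frame, not chevreulPlot's own implementation. The data frame `toy` and its columns (`UMAP_1`, `UMAP_2`, `expression`, `cell`) are hypothetical stand-ins for an embedding and a feature's expression values.

```r
library(ggplot2)
library(plotly)

# Hypothetical toy data: 50 "cells" with two embedding coordinates
# and one expression value each (assumed names, for illustration only)
set.seed(1)
toy <- data.frame(
    UMAP_1 = rnorm(50),
    UMAP_2 = rnorm(50),
    expression = rexp(50),
    cell = paste0("cell_", seq_len(50))
)

# Static feature-style plot: cells colored by expression.
# The `text` aesthetic is not drawn by ggplot2 (it may warn about an
# unknown aesthetic); it only carries the cell name through to plotly.
p <- ggplot(toy, aes(UMAP_1, UMAP_2, colour = expression, text = cell)) +
    geom_point()

# Convert the static ggplot into an interactive plotly widget;
# hovering over a point shows the cell name supplied via `text`
interactive_p <- ggplotly(p, tooltip = "text")
```

Printing `p` gives the static plot, while printing `interactive_p` in an HTML document renders a widget with hover tooltips, which is the kind of per-cell inspection that `return_plotly = TRUE` is meant to enable.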
## Plot read count or other QC measurements

The `plot_colData_histogram` function displays a histogram of a per-cell measurement, such as read count, with bars colored according to the categorical variable given by the `fill_by` argument. In this example the `sizeFactor` column is plotted and filled by `Treatment`, so differences in its distribution between treatment groups can be seen directly.

```{r}
plot_colData_histogram(small_example_dataset,
    group_by = "sizeFactor",
    fill_by = "Treatment"
)
```

## Plot metadata variable

Make a scatter plot of a metadata variable in which each point represents a cell, positioned according to the cell embedding from the chosen dimensional reduction technique (by default, "UMAP"). The `group` argument specifies the colData variable by which to group the cells (by default, "batch").

```{r}
plot_colData_on_embedding(small_example_dataset,
    group = "gene_snn_res.1",
    embedding = "UMAP"
)
```

This function builds on a dimensional reduction plot (the equivalent of `DimPlot()` in Seurat). When the `return_plotly` parameter of `plot_colData_on_embedding` is set to `TRUE`, the plot is converted into an interactive plot using the `ggplotly` function from the plotly package.

## Plot cluster marker genes

Marker genes of Louvain clusters or of additional experimental metadata can be plotted using `plot_marker_features`. This allows visualization of n marker features grouped by the metadata of interest. Marker genes are identified using the Wilcoxon rank-sum test as implemented in `presto`. In the resulting dot plot, the size of each dot corresponds to the percentage of cells expressing the feature in each cluster, and the color represents the average expression level of the feature.

```{r}
plot_marker_features(small_example_dataset,
    group_by = "gene_snn_res.1",
    marker_method = "wilcox"
)
```

## Plotting transcript composition

`plot_transcript_composition()` plots the proportion of reads of a given gene that map to each transcript.
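As a rough illustration of what "proportion of reads per transcript" means, here is a base R sketch on a toy count matrix. It is not the package's implementation; the transcript names (`tx_A`, `tx_B`, `tx_C`), cell names, and cluster labels are all hypothetical.

```r
# Hypothetical transcript-by-cell count matrix for a single gene
counts <- matrix(
    c(10,  0,  5,
      30, 20,  5,
      60, 80, 90),
    nrow = 3, byrow = TRUE,
    dimnames = list(
        c("tx_A", "tx_B", "tx_C"),
        c("cell_1", "cell_2", "cell_3")
    )
)

# Hypothetical cluster assignment for each cell
cluster <- c(cell_1 = "1", cell_2 = "1", cell_3 = "2")

# Sum reads per transcript within each cluster
# (rowsum() groups rows, so transpose to cells-by-transcripts first)
per_cluster <- t(rowsum(t(counts), group = cluster))

# Divide each cluster's column by its total so proportions sum to 1
composition <- prop.table(per_cluster, margin = 2)
composition
```

Each column of `composition` describes one cluster, and each row gives the share of that cluster's reads for the gene that come from one transcript.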
The gene of interest is specified by the argument `gene_symbol`.

```{r, results=FALSE, echo=FALSE, eval = FALSE}
plot_transcript_composition(small_example_dataset, "NRL",
    group.by = "gene_snn_res.1", standardize = TRUE
)
```

```{r}
sessionInfo()
```