---
title: Introduction to `poem`
package: "`r BiocStyle::pkg_ver('poem')`"
author:
- name: Siyuan Luo
  email: roseluosy@gmail.com
  affiliation:
  - &IMLS Institute for Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - &HEST Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- name: Pierre-Luc Germain
  email:
  affiliation:
  - *IMLS
  - *HEST
date: "`r Sys.Date()`"
output: BiocStyle::html_document
vignette: |
  %\VignetteIndexEntry{1_introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(error=FALSE, message=FALSE, warning=FALSE,
                      collapse=TRUE)
library(BiocStyle)
```

# Installation & loading {-}

```{r, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("poem")
```

```{r, message=FALSE, warning=FALSE}
library(poem)
library(ggplot2)
library(dplyr)
library(tidyr)
library(ggnetwork)
library(igraph)
library(cowplot)
```

# Introduction

## What is this package for?

This package provides multiple approaches for comparing two partitions[^1] of the same dataset, and for evaluating the alignment between a dataset's embedding or graph representation and a partition of it. In addition, the package offers methods for comparing two fuzzy partitions[^2], as well as for comparing a hard partition with a fuzzy one. This allows fuzzy partitioning results to be evaluated by assessing their agreement with either a fuzzy or a hard ground-truth partition. Finally, the package implements visualization and evaluation metrics tailored to domain detection in spatially-resolved -omics data. These are chiefly external evaluation metrics (i.e. based on a comparison to ground-truth labels), but internal metrics are also included. For a detailed description of how to work with `r Biocpkg("SpatialExperiment")` objects, we refer to the dedicated vignette of `r Biocpkg("poem", vignette = "PoemOnSpatialExperiment.html")`.

[^1]: A partition is a way of organizing the data points of a dataset into distinct, non-overlapping, and non-empty subsets. For example, a clustering is a partition.

[^2]: In 'hard' partitions, each data point belongs to one and only one subset. However, clustering can also generate fuzzy partitions, in which data points can belong to multiple subsets with varying degrees (or probabilities) of membership.

## Main functions

The package `r Rpackage("poem")` includes many metrics for performing different kinds of evaluations, and these metrics can be retrieved via six main wrapper functions (unless otherwise specified, "partition" means "hard" partition). They are:

- `getEmbeddingMetrics()`: Metrics to compare an embedding of data points to a partition of these data points.
- `getGraphMetrics()`: Metrics to compare a graph (e.g. kNN/sNN) to a partition, where the nodes of the graph are the data points of the partition.
- `getPartitionMetrics()`: Metrics to compare two partitions of the same dataset.
- `getFuzzyPartitionMetrics()`: Metrics to compare two fuzzy partitions, or to compare a fuzzy and a hard partition of the same dataset.
- `getSpatialExternalMetrics()`: External metrics for evaluating spatial clustering results in a spatially-aware fashion. For non-spatially-aware evaluation, one can directly use `getPartitionMetrics()`.
- `getSpatialInternalMetrics()`: Internal metrics for evaluating spatial clustering results in a spatially-aware fashion.
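All six wrappers share a similar calling pattern: the inputs to compare, the desired `metrics`, and the `level` of the evaluation (explained below). As a minimal sketch of this shared interface, using made-up labels purely for illustration (concrete examples follow throughout the vignette):

```{r, eval=FALSE}
# Not run -- hypothetical labels, only to illustrate the shared interface:
truth <- rep(c("A","B"), each=10)   # hypothetical ground-truth classes
pred  <- sample(truth)              # hypothetical predicted clusters
getPartitionMetrics(true=truth, pred=pred, level="dataset", metrics=c("ARI"))
```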
There are three different levels at which one can perform the above-mentioned evaluations: element-level, class-level, and dataset-level. Element-level evaluation reports metric values for each data point; class-level evaluation reports metrics for each class[^3] or cluster[^4]; and dataset-level evaluation returns a single metric value for the whole dataset.

[^3]: In this vignette, classes refer to groups in the **ground-truth** partition.

[^4]: In this vignette, clusters refer to groups in the **predicted** partition.

The following table lists the available metrics at the different evaluation levels, along with the main functions used to retrieve them.

```{r}
data(metric_info)
DT::datatable(metric_info)
```

# Getting started

## Example data

To showcase the main functions, we will use some simulated datasets as examples in this vignette. The two datasets, `g1` and `g2`, each contain 80 data points with `x` and `y` coordinates, belonging to 4 different classes.

```{r}
data(toyExamples)
g1 <- toyExamples[toyExamples$graph=="graph1",]
g2 <- toyExamples[toyExamples$graph=="graph2",]
head(g1)
```

If we plot them out:

```{r, fig.height = 3, fig.width = 7}
ggplot(rbind(g1,g2), aes(x, y, color=class, shape=class)) +
  geom_point() + facet_wrap(~graph) + theme_bw()
```

# Embedding evaluation

Let's assume `g1` and `g2` contain two different embeddings of the same set of objects. A "good" embedding should put objects of the same class together, and objects of different classes apart. Since we know the ground-truth class of each object, we can evaluate the "goodness" of an embedding by calculating embedding evaluation metrics. Such metrics can be calculated element-wise, for each class/cluster, or for the whole dataset.

## Element-level evaluation

For example, at the element level, one can calculate the Silhouette Width by specifying `level="element"` and `metrics=c("SW")`:

```{r}
sw <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
                          metrics=c("SW"), level="element")
head(sw)
```

The output is a `data.frame` containing the metric values at the specified level.

```{r, fig.height = 3, fig.width = 7}
g1$sw <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
                             metrics=c("SW"), level="element")$SW
g2$sw <- getEmbeddingMetrics(x=g2[,c("x","y")], labels=g2$class,
                             metrics=c("SW"), level="element")$SW
ggplot(rbind(g1,g2), aes(x, y, color=sw, shape=class)) +
  geom_point() + facet_wrap(~graph) + theme_bw()
```

## Class-level evaluation

One can also evaluate each class, by specifying `level="class"`. Check `?getEmbeddingMetrics` to see which metrics are available at the class level. For example:

```{r}
cl <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
                          metrics=c("dbcv", "meanSW"), level="class")
head(cl)
```

```{r, fig.height = 3, fig.width = 7}
res1 <- getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class,
                            metrics=c("dbcv", "meanSW"), level="class")
res2 <- getEmbeddingMetrics(x=g2[,c("x","y")], labels=g2$class,
                            metrics=c("dbcv", "meanSW"), level="class")

bind_rows(list(graph1=res1, graph2=res2), .id="graph") %>%
  pivot_longer(cols=c("meanSW","dbcv"), names_to="metric", values_to="value") %>%
  ggplot(aes(class, value, fill=graph, group=graph)) +
  geom_bar(position="dodge", stat="identity") +
  facet_wrap(~metric) + theme_bw()
```

## Dataset-level evaluation

Similarly, one can evaluate at the dataset level by specifying `level="dataset"`.
For example:

```{r}
getEmbeddingMetrics(x=g1[,c("x","y")], labels=g1$class, level="dataset",
                    metrics=c("meanSW", "meanClassSW", "pnSW", "minClassSW",
                              "cdbw", "cohesion", "compactness", "sep", "dbcv"))
```

# Graph evaluation

Instead of directly using the distances or densities in the embedding space, one may want to evaluate from a connectivity standpoint, by looking at a graph structure constructed from the datasets above. `getGraphMetrics()` can construct a k-nearest neighbor (kNN) or shared nearest neighbor (sNN) graph from an embedding, and then apply graph-based evaluation metrics to it.

```{r, fold=TRUE}
# Some functions for plotting
plotGraphs <- function(d, k=7){
  gn <- dplyr::bind_rows(lapply(split(d[,-1], d$graph), FUN=function(d1){
    nn <- emb2knn(as.matrix(d1[,c("x","y")]), k=k)
    g <- poem:::.nn2graph(nn, labels=d1$class)
    ggnetwork(g, layout=as.matrix(d1[,seq_len(2)]), scale=FALSE)
  }), .id="graph")
  ggplot(gn, aes(x = x, y = y, xend = xend, yend = yend)) +
    theme_blank() + theme(legend.position = "right") +
    geom_edges(alpha=0.5, colour="grey") +
    geom_nodes(aes(colour=class, shape=class), size=2) +
    facet_wrap(~graph, nrow=1)
}
```

For our examples `g1` and `g2`, the constructed graphs look as follows:

```{r, fig.height = 3, fig.width = 7}
plotGraphs(bind_rows(list(g1,g2), .id="graph"))
```

Use `?getGraphMetrics()` to check the optional arguments for kNN/sNN graph construction. As before, `level` can be `"element"`, `"class"` or `"dataset"`.

```{r}
getGraphMetrics(x=g1[,c("x","y")], labels=g1$class, metrics=c("PWC","ISI"),
                level="class", directed=FALSE, k=7, shared=FALSE)
```

```{r, fig.height = 3, fig.width = 7}
res1 <- getGraphMetrics(x=g1[,c("x","y")], labels=g1$class, metrics=c("PWC","ISI"),
                        level="class", directed=FALSE, k=7, shared=FALSE)
res2 <- getGraphMetrics(x=g2[,c("x","y")], labels=g2$class, metrics=c("PWC","ISI"),
                        level="class", directed=FALSE, k=7, shared=FALSE)

bind_rows(list(graph1=res1, graph2=res2), .id="graph") %>%
  pivot_longer(cols=c("PWC","ISI"), names_to="metric", values_to="value") %>%
  ggplot(aes(class, value, fill=graph, group=graph)) +
  geom_bar(position="dodge", stat="identity") +
  facet_wrap(~metric) + theme_bw()
```

Alternatively, `getGraphMetrics()` can take an `r Rpackage("igraph")` object as `x`, which enables the application of the evaluation metrics to a general graph, or a list of nearest neighbors as `x`, to accelerate the computation for large datasets.

# Partition evaluation

We construct sNN graphs from the `g1` and `g2` embeddings, and then apply the Louvain algorithm to obtain partitions of them.

```{r, fig.height = 3, fig.width = 7}
k <- 7
r <- 0.5
snn1 <- emb2snn(as.matrix(g1[,c("x","y")]), k=k)
snn2 <- emb2snn(as.matrix(g2[,c("x","y")]), k=k)
g1$cluster <- factor(igraph::cluster_louvain(snn1, resolution = r)$membership)
g2$cluster <- factor(igraph::cluster_louvain(snn2, resolution = r)$membership)

ggplot(rbind(g1,g2), aes(x, y, color=cluster, shape=class)) +
  geom_point() + facet_wrap(~graph) + theme_bw()
```

We then compare the predictions with the known labels using the partition metrics:

```{r}
# for g1
getPartitionMetrics(true=g1$class, pred=g1$cluster, level="dataset",
                    metrics = c("RI", "WC", "WH", "ARI", "AWC", "AWH",
                                "FM", "AMI"))
# for g2
getPartitionMetrics(true=g2$class, pred=g2$cluster, level="dataset",
                    metrics = c("RI", "WC", "WH", "ARI", "AWC", "AWH",
                                "FM", "AMI"))
```

Note that among the class-level metrics, some are reported per class, while others (specifically, "WH" and "AWH") are reported per cluster. For example:
```{r}
getPartitionMetrics(true=g1$class, pred=g1$cluster, level="class")
```

# Fuzzy partition evaluation

For comparing two fuzzy partitions, or for comparing a fuzzy partition to a hard partition, one can use `getFuzzyPartitionMetrics()`. The fuzzy representation of a partition should look like the following, where each row is a data point, each column a class, and each value the degree of membership of the data point in that class. Each row sums to 1.

```{r}
fuzzyTrue <- matrix(c(
  0.95, 0.025, 0.025,
  0.98, 0.01, 0.01,
  0.96, 0.02, 0.02,
  0.95, 0.04, 0.01,
  0.95, 0.01, 0.04,
  0.99, 0.005, 0.005,
  0.025, 0.95, 0.025,
  0.97, 0.02, 0.01,
  0.025, 0.025, 0.95),
  ncol = 3, byrow=TRUE)
```

```{r}
# a hard truth:
hardTrue <- apply(fuzzyTrue, 1, FUN=which.max)
# some predicted labels:
hardPred <- c(1,1,1,1,1,1,2,2,2)
getFuzzyPartitionMetrics(hardPred=hardPred, hardTrue=hardTrue,
                         fuzzyTrue=fuzzyTrue, nperms=3, level="class")
```

Through the inputs `hardPred`, `hardTrue`, `fuzzyPred` and `fuzzyTrue`, one can control whether the fuzzy or the hard version of each of the two partitions is used in the comparison. For example, when both `fuzzyTrue` and `fuzzyPred` are not `NULL`, metrics for comparing two fuzzy partitions will be used.

# Spatial clustering evaluation

## Example data

We use another toy dataset from the package, `sp_toys`, to illustrate spatial clustering evaluation.

```{r, fig.height = 3, fig.width = 8.5}
data(sp_toys)
s <- 3
st <- 1
p1 <- ggplot(sp_toys, aes(x, y, color=label)) + geom_point(size=s, alpha=0.5) +
  scale_y_reverse() + theme_bw() +
  geom_point(shape = 1, size = s, stroke = st, aes(color=p1)) +
  labs(x="", y="", title="P1")
p0 <- ggplot(sp_toys, aes(x, y, color=label)) + geom_point(size=s, alpha=0.5) +
  scale_y_reverse() + theme_bw() +
  geom_point(shape = 1, size = s, stroke = st, aes(color=label)) +
  labs(x="", y="", title="C")
p2 <- ggplot(sp_toys, aes(x, y, color=label)) + geom_point(size=s, alpha=0.5) +
  scale_y_reverse() + theme_bw() +
  geom_point(shape = 1, size = s, stroke = st, aes(color=p2)) +
  labs(x="", y="", title="P2")

plot_grid(p0 + theme(legend.position = "none",
                     plot.title = element_text(hjust = 0.5)),
          p1 + theme(legend.position = "none",
                     plot.title = element_text(hjust = 0.5)),
          p2 + theme(legend.position = "none",
                     plot.title = element_text(hjust = 0.5)),
          ncol = 3)
```

In C, the spots are colored by their ground-truth class. In P1 and P2, the fill of each spot corresponds to the ground-truth class, while the color of the border corresponds to the clustering prediction. P1 and P2 misclassify the same number of red spots into the blue cluster.

## External metrics

Let's quantify this by calculating external spatial metrics:

```{r}
getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p1,
                          location=sp_toys[,c("x","y")], level="dataset",
                          metrics=c("SpatialARI","SpatialAccuracy"),
                          fuzzy_true = TRUE, fuzzy_pred = FALSE)
```

By specifying `fuzzy_true` and `fuzzy_pred`, one can control whether the fuzzy or the hard version of `true` and `pred` is used in the comparison. If `fuzzy_true` or `fuzzy_pred` is `TRUE`, the spatial neighborhood information is used to construct a fuzzy representation of the class/cluster memberships.
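The predicted labels can be fuzzified in the same way. As a sketch (not evaluated here), setting both flags to `TRUE` compares fuzzy representations of both `true` and `pred`, each built from the spatial neighborhoods:

```{r, eval=FALSE}
# Not run -- sketch only: fuzzify both the ground truth and the prediction
# using their spatial neighborhoods before the comparison.
getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p1,
                          location=sp_toys[,c("x","y")], level="dataset",
                          metrics=c("SpatialARI","SpatialAccuracy"),
                          fuzzy_true = TRUE, fuzzy_pred = TRUE)
```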
The same evaluation can also be performed at the class level:

```{r}
getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p1,
                          location=sp_toys[,c("x","y")], level="class")
```

```{r}
res1.1 <- getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p1,
                                    location=sp_toys[,c("x","y")], level="dataset",
                                    metrics=c("SpatialARI","SpatialAccuracy"),
                                    fuzzy_true = TRUE, fuzzy_pred = FALSE)
res2.1 <- getSpatialExternalMetrics(true=sp_toys$label, pred=sp_toys$p2,
                                    location=sp_toys[,c("x","y")], level="dataset",
                                    metrics=c("SpatialARI","SpatialAccuracy"),
                                    fuzzy_true = TRUE, fuzzy_pred = FALSE)
res1.2 <- getPartitionMetrics(true=sp_toys$label, pred=sp_toys$p1,
                              level="dataset", metrics=c("ARI"))
res2.2 <- getPartitionMetrics(true=sp_toys$label, pred=sp_toys$p2,
                              level="dataset", metrics=c("ARI"))
```

```{r, fig.height = 2, fig.width = 5}
cbind(bind_rows(list(res1.1, res2.1), .id="P"),
      bind_rows(list(res1.2, res2.2), .id="P")) %>%
  pivot_longer(cols=c("SpatialARI", "SpatialAccuracy", "ARI"),
               names_to="metric", values_to="value") %>%
  ggplot(aes(x=P, y=value, group=metric)) +
  geom_point(size=3, aes(color=P)) +
  facet_wrap(~metric) + theme_bw() + labs(x="Prediction")
```

When the evaluation is not spatially aware, P1 and P2 get the same ARI score. With spatially-aware metrics like SpatialARI and SpatialAccuracy, however, P2 gets higher scores than P1.

## Internal metrics

Last but not least, there are also internal metrics for spatial clustering evaluation:

```{r}
sp_toys$c_elsa <- getSpatialInternalMetrics(label=sp_toys$label,
                                            location=sp_toys[,c("x","y")],
                                            level="element",
                                            metrics=c("ELSA"))$ELSA
sp_toys$p1_elsa <- getSpatialInternalMetrics(label=sp_toys$p1,
                                             location=sp_toys[,c("x","y")],
                                             level="element",
                                             metrics=c("ELSA"))$ELSA
sp_toys$p2_elsa <- getSpatialInternalMetrics(label=sp_toys$p2,
                                             location=sp_toys[,c("x","y")],
                                             level="element",
                                             metrics=c("ELSA"))$ELSA
```

```{r, fig.height = 3, fig.width = 14}
s <- 3
st <- 1
p1 <- ggplot(sp_toys, aes(x, y, color=p1_elsa)) + geom_point(size=s, alpha=0.5) +
  scale_y_reverse() + theme_bw() +
  labs(x="", y="", title="P1", color="ELSA") +
  scico::scale_color_scico(palette = "roma", limits = c(0, 1), direction=-1)
p0 <- ggplot(sp_toys, aes(x, y, color=c_elsa)) + geom_point(size=s, alpha=0.5) +
  scale_y_reverse() + theme_bw() +
  labs(x="", y="", title="C", color="ELSA") +
  scico::scale_color_scico(palette = "roma", limits = c(0, 1), direction=-1)
p2 <- ggplot(sp_toys, aes(x, y, color=p2_elsa)) + geom_point(size=s, alpha=0.5) +
  scale_y_reverse() + theme_bw() +
  labs(x="", y="", title="P2", color="ELSA") +
  scico::scale_color_scico(palette = "roma", limits = c(0, 1), direction=-1)

plot_grid(p0 + theme(plot.title = element_text(hjust = 0.5)),
          p1 + theme(plot.title = element_text(hjust = 0.5)),
          p2 + theme(plot.title = element_text(hjust = 0.5)),
          nrow=1, rel_widths=c(1,1,1))
```

# Session info

```{r}
sessionInfo()
```