| Title: | Cell Ranger Output Filtering and Metrics Visualization | 
| Version: | 0.3.2 | 
| Description: | Sample and cell filtering as well as visualisation of output metrics from 'Cell Ranger' by Grace X.Y. Zheng et al. (2017) <doi:10.1038/ncomms14049>. 'CRMetrics' allows for easy plotting of output metrics across multiple samples as well as comparative plots including statistical assessments of these. 'CRMetrics' allows for easy removal of ambient RNA using 'SoupX' by Matthew D Young and Sam Behjati (2020) <doi:10.1093/gigascience/giaa151> or 'CellBender' by Stephen J Fleming et al. (2022) <doi:10.1101/791699>. Furthermore, it is possible to preprocess data using 'Pagoda2' by Nikolas Barkas et al. (2021) https://github.com/kharchenkolab/pagoda2 or 'Seurat' by Yuhan Hao et al. (2021) <doi:10.1016/j.cell.2021.04.048> followed by embedding of cells using 'Conos' by Nikolas Barkas et al. (2019) <doi:10.1038/s41592-019-0466-z>. Finally, doublets can be detected using 'scrublet' by Samuel L. Wolock et al. (2019) <doi:10.1016/j.cels.2018.11.005> or 'DoubletDetection' by Gayoso et al. (2020) <doi:10.5281/zenodo.2678041>. In the end, cells are filtered based on user input for use in downstream applications. | 
| License: | GPL-3 | 
| Encoding: | UTF-8 | 
| Depends: | R (≥ 4.0.0) | 
| Imports: | cowplot, dplyr, ggbeeswarm, ggplot2, ggpmisc, ggpubr, ggrepel, magrittr, Matrix, methods, R6, scales, sccore, sparseMatrixStats, stats, tibble, tidyr, utils | 
| Suggests: | conos, data.table, markdown, pagoda2, reticulate, rhdf5, Seurat, SoupX, testthat (≥ 3.0.0) | 
| RoxygenNote: | 7.3.1 | 
| URL: | https://github.com/khodosevichlab/CRMetrics | 
| BugReports: | https://github.com/khodosevichlab/CRMetrics/issues | 
| Maintainer: | Rasmus Rydbirk <rrydbirk@outlook.dk> | 
| Config/testthat/edition: | 3 | 
| NeedsCompilation: | no | 
| Packaged: | 2024-11-07 19:24:09 UTC; ucloud | 
| Author: | Rasmus Rydbirk [aut, cre], Fabienne Kick [aut], Henrietta Holze [aut], Xian Xin [ctb] | 
| Repository: | CRAN | 
| Date/Publication: | 2024-11-08 00:20:06 UTC | 
CRMetrics class object
Description
Functions to analyze Cell Ranger count data. To initialize a new object, 'data.path' or 'cms' is needed. 'metadata' is also recommended, but not required.
Public fields
- metadata
- data.frame or character Path to metadata file or name of metadata data.frame object. Metadata must contain a column named 'sample' containing sample names that must match folder names in 'data.path' (default = NULL) 
- data.path
- character Path(s) to Cell Ranger count data, one directory per sample. For multiple paths, pass a character vector, e.g., c("path1","path2") (default = NULL) 
- cms
- list List with count matrices (default = NULL) 
- cms.preprocessed
- list List with preprocessed count matrices after $doPreprocessing() (default = NULL) 
- cms.raw
- list List with raw, unfiltered count matrices, i.e., including all cell barcodes (CBs) detected, also those from empty droplets (default = NULL) 
- summary.metrics
- data.frame Summary metrics from Cell Ranger (default = NULL) 
- detailed.metrics
- data.frame Detailed metrics, i.e., no. genes and UMIs per cell (default = NULL) 
- comp.group
- character A group present in the metadata to compare the metrics by, can be added with addComparison (default = NULL) 
- verbose
- logical Print messages or not (default = TRUE) 
- theme
- ggplot2 theme (default: theme_bw()) 
- pal
- Plotting palette (default = NULL) 
- n.cores
- numeric Number of cores for calculations (default = 1) 
Methods
Public methods
Method new()
Initialize a CRMetrics object. To initialize a new object, 'data.path' or 'cms' is needed. 'metadata' is also recommended, but not required.
Usage
CRMetrics$new( data.path = NULL, metadata = NULL, cms = NULL, samples = NULL, unique.names = TRUE, sep.cells = "!!", comp.group = NULL, verbose = TRUE, theme = theme_bw(), n.cores = 1, sep.meta = ",", raw.meta = FALSE, pal = NULL )
Arguments
- data.path
- character Path to directory with Cell Ranger count data, one directory per sample (default = NULL). 
- metadata
- data.frame or character Path to metadata file (comma-separated) or name of metadata dataframe object. Metadata must contain a column named 'sample' containing sample names that must match folder names in 'data.path' (default = NULL) 
- cms
- list List with count matrices (default = NULL) 
- samples
- character Sample names. Only relevant if cms is provided (default = NULL) 
- unique.names
- logical Create unique cell names. Only relevant if cms is provided (default = TRUE) 
- sep.cells
- character Sample-cell separator. Only relevant if cms is provided and unique.names = TRUE (default = "!!")
- comp.group
- character A group present in the metadata to compare the metrics by, can be added with addComparison (default = NULL) 
- verbose
- logical Print messages or not (default = TRUE) 
- theme
- ggplot2 theme (default: theme_bw()) 
- n.cores
- integer Number of cores for the calculations (default = 1) 
- sep.meta
- character Separator for metadata file (default = ",") 
- raw.meta
- logical Keep metadata in its raw format. If FALSE, classes will be converted using "type.convert" (default = FALSE) 
- pal
- character Plotting palette (default = NULL) 
Returns
CRMetrics object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
}
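As a rough illustration of what 'sep.meta', 'raw.meta' and 'sep.cells' control, the following base-R sketch parses a small comma-separated metadata table, converts column classes with type.convert, and builds sample-prefixed cell names. It does not call CRMetrics; the columns 'sex' and 'age' and all object names are hypothetical, and the package's internal handling may differ.

```r
# Hypothetical metadata file content (comma-separated, cf. sep.meta = ",")
meta.text <- "sample,sex,age\nsample1,male,34\nsample2,female,41"

# Read every column as character, i.e. the "raw" format (cf. raw.meta = TRUE)
meta.raw <- utils::read.table(text = meta.text, sep = ",", header = TRUE,
                              colClasses = "character")

# Convert column classes (cf. raw.meta = FALSE);
# as.is = TRUE keeps strings as character instead of factor
meta.conv <- utils::type.convert(meta.raw, as.is = TRUE)

# Build unique cell names by prefixing the sample name (cf. sep.cells = "!!")
cells <- c("cell1", "cell2")
unique.cells <- paste("sample1", cells, sep = "!!")
```

After conversion, 'age' is stored as an integer column while 'sample' and 'sex' stay character, which is the kind of difference 'raw.meta' is meant to toggle.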
Method addDetailedMetrics()
Function to read in detailed metrics. This is not done upon initialization for speed.
Usage
CRMetrics$addDetailedMetrics( cms = self$cms, min.transcripts.per.cell = 100, n.cores = self$n.cores, verbose = self$verbose )
Arguments
- cms
- list List of (sparse) count matrices (default = self$cms) 
- min.transcripts.per.cell
- numeric Minimal number of transcripts per cell (default = 100) 
- n.cores
- integer Number of cores for the calculations (default = self$n.cores). 
- verbose
- logical Print messages or not (default = self$verbose). 
Returns
Count matrices
Examples
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Run function
crm$addDetailedMetrics()
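The detailed metrics are described as the number of genes and UMIs per cell. A minimal sketch of how such values can be derived from a single sparse count matrix is given below. It uses only the Matrix package; the column names 'UMI_count' and 'gene_count' and the inclusive threshold are assumptions, not the package's confirmed behaviour.

```r
library(Matrix)

set.seed(1)
# Simulated genes x cells count matrix with integer-valued entries
cm <- Matrix::rsparsematrix(200, 50, density = 0.1,
                            rand.x = function(n) stats::rpois(n, 5) + 1)
dimnames(cm) <- list(paste0("gene", seq_len(200)),
                     paste0("cell", seq_len(50)))

umi.per.cell  <- Matrix::colSums(cm)      # total transcripts per cell
gene.per.cell <- Matrix::colSums(cm > 0)  # number of detected genes per cell

# Keep cells reaching the minimal number of transcripts (assumed inclusive)
min.transcripts.per.cell <- 100
keep <- umi.per.cell >= min.transcripts.per.cell

detailed <- data.frame(cell = colnames(cm)[keep],
                       UMI_count = umi.per.cell[keep],
                       gene_count = gene.per.cell[keep])
```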
Method addComparison()
Add comparison group for statistical testing.
Usage
CRMetrics$addComparison(comp.group, metadata = self$metadata)
Arguments
- comp.group
- character Comparison metric. The usage above defines no default, so it must be supplied. 
- metadata
- data.frame Metadata for samples (default = self$metadata). 
Returns
Vector
Examples
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add metadata
crm$metadata <- data.frame(sex = c("male","female"))
# Add comparison group
crm$addComparison(comp.group = "sex")
Method plotSamples()
Plot the number of samples.
Usage
CRMetrics$plotSamples( comp.group = self$comp.group, h.adj = 0.05, exact = FALSE, metadata = self$metadata, second.comp.group = NULL, pal = self$pal )
Arguments
- comp.group
- character Comparison metric, must match a column name of metadata (default = self$comp.group). 
- h.adj
- numeric Position of statistics test p value as % of max(y) (default = 0.05). 
- exact
- logical Whether to calculate exact p values (default = FALSE). 
- metadata
- data.frame Metadata for samples (default = self$metadata). 
- second.comp.group
- character Second comparison metric, must match a column name of metadata (default = NULL). 
- pal
- character Plotting palette (default = self$pal) 
Returns
ggplot2 object
Examples
samples <- c("sample1", "sample2")
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
names(testdata.cms) <- samples
# Create metadata
metadata <- data.frame(sample = samples,
sex = c("male","female"),
condition = c("a","b"))
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, metadata = metadata, n.cores = 1)
# Plot
crm$plotSamples(comp.group = "sex", second.comp.group = "condition")
Method plotSummaryMetrics()
Plot all summary stats or a selected list.
Usage
CRMetrics$plotSummaryMetrics(
  comp.group = self$comp.group,
  second.comp.group = NULL,
  metrics = NULL,
  h.adj = 0.05,
  plot.stat = TRUE,
  stat.test = c("non-parametric", "parametric"),
  exact = FALSE,
  metadata = self$metadata,
  summary.metrics = self$summary.metrics,
  plot.geom = "bar",
  se = FALSE,
  group.reg.lines = FALSE,
  secondary.testing = TRUE,
  pal = self$pal
)
Arguments
- comp.group
- character Comparison metric (default = self$comp.group). 
- second.comp.group
- character Second comparison metric, used for the metric "samples per group" or when "comp.group" is a numeric or an integer (default = NULL). 
- metrics
- character Metrics to plot (default = NULL). 
- h.adj
- numeric Position of statistics test p value as % of max(y) (default = 0.05) 
- plot.stat
- logical Show statistics in plot. Will be FALSE if "comp.group" = "sample" or if "comp.group" is a numeric or an integer (default = TRUE) 
- stat.test
- character Statistical test to perform to compare means. Can either be "non-parametric" or "parametric" (default = "non-parametric"). 
- exact
- logical Whether to calculate exact p values (default = FALSE). 
- metadata
- data.frame Metadata for samples (default = self$metadata). 
- summary.metrics
- data.frame Summary metrics (default = self$summary.metrics). 
- plot.geom
- character Which geom is used to plot the data (default = "bar"). 
- se
- logical For regression lines, show SE (default = FALSE) 
- group.reg.lines
- logical For regression lines, if FALSE show one line, if TRUE show line per group defined by second.comp.group (default = FALSE) 
- secondary.testing
- logical Whether to show post hoc testing (default = TRUE) 
- pal
- character Plotting palette (default = self$pal) 
Returns
ggplot2 object
Examples
\donttest{
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add summary metrics
crm$addSummaryFromCms()
crm$plotSummaryMetrics(plot.geom = "point")
}
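The 'stat.test' and 'exact' arguments choose between a non-parametric and a parametric comparison of groups. Which test functions the package calls internally is not stated here, so the sketch below simply contrasts two common base-R choices (Wilcoxon rank-sum test and Welch t-test) on a hypothetical summary metric; the data and names are made up for illustration.

```r
# Hypothetical summary metric for six samples in two groups
df <- data.frame(
  group = rep(c("ctrl", "case"), each = 3),
  median.genes = c(1510, 1620, 1580, 1890, 1950, 2010)
)

# Non-parametric comparison; 'exact' toggles exact p value computation
p.nonpar <- stats::wilcox.test(median.genes ~ group, data = df,
                               exact = FALSE)$p.value

# Parametric comparison (Welch two-sample t-test)
p.par <- stats::t.test(median.genes ~ group, data = df)$p.value
```

With only three samples per group, the non-parametric p value cannot become very small, which is worth keeping in mind when reading the statistics shown in the plots.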
Method plotDetailedMetrics()
Plot detailed metrics from the detailed.metrics object
Usage
CRMetrics$plotDetailedMetrics( comp.group = self$comp.group, detailed.metrics = self$detailed.metrics, metadata = self$metadata, metrics = NULL, plot.geom = "violin", hline = TRUE, pal = self$pal )
Arguments
- comp.group
- character Comparison metric (default = self$comp.group). 
- detailed.metrics
- data.frame Object containing the count matrices (default = self$detailed.metrics). 
- metadata
- data.frame Metadata for samples (default = self$metadata). 
- metrics
- character Metrics to plot. NULL plots both plots (default = NULL). 
- plot.geom
- character How to plot the data (default = "violin"). 
- hline
- logical Whether to show median as horizontal line (default = TRUE) 
- pal
- character Plotting palette (default = self$pal) 
- data.path
- character Path to Cell Ranger count data (default = self$data.path). 
Returns
ggplot2 object
Examples
\donttest{
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add detailed metrics
crm$addDetailedMetrics()
# Plot
crm$plotDetailedMetrics()
}
Method plotEmbedding()
Plot cells in embedding using Conos and color by depth and doublets.
Usage
CRMetrics$plotEmbedding(
  depth = FALSE,
  doublet.method = NULL,
  doublet.scores = FALSE,
  depth.cutoff = 1000,
  mito.frac = FALSE,
  mito.cutoff = 0.05,
  species = c("human", "mouse"),
  size = 0.3,
  sep = "!!",
  pal = NULL,
  ...
)
Arguments
- depth
- logical Plot depth or not (default = FALSE). 
- doublet.method
- character Doublet detection method (default = NULL). 
- doublet.scores
- logical Plot doublet scores or not (default = FALSE). 
- depth.cutoff
- numeric Depth cutoff (default = 1e3). 
- mito.frac
- logical Plot mitochondrial fraction or not (default = FALSE). 
- mito.cutoff
- numeric Mitochondrial fraction cutoff (default = 0.05). 
- species
- character Species to calculate the mitochondrial fraction for (default = c("human","mouse")). 
- size
- numeric Dot size (default = 0.3) 
- sep
- character Separator for creating unique cell names (default = "!!") 
- pal
- character Plotting palette (default = NULL) 
- ...
- Plotting parameters passed to sccore::embeddingPlot.
Returns
ggplot2 object
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding() 
crm$plotEmbedding()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
Method plotDepth()
Plot the sequencing depth as a histogram.
Usage
CRMetrics$plotDepth( cutoff = 1000, samples = self$metadata$sample, sep = "!!", keep.col = "#E7CDC2", filter.col = "#A65141" )
Arguments
- cutoff
- numeric The depth cutoff to color the cells in the embedding (default = 1e3). 
- samples
- character Sample names to include for plotting (default = self$metadata$sample). 
- sep
- character Separator for creating unique cell names (default = "!!") 
- keep.col
- character Color for density of cells that are kept (default = "#E7CDC2") 
- filter.col
- character Color for density of cells to be filtered (default = "#A65141") 
Returns
ggplot2 object
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Plot
crm$plotDepth()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
Method plotMitoFraction()
Plot the mitochondrial fraction as a histogram.
Usage
CRMetrics$plotMitoFraction(
  cutoff = 0.05,
  species = c("human", "mouse"),
  samples = self$metadata$sample,
  sep = "!!",
  keep.col = "#E7CDC2",
  filter.col = "#A65141"
)
Arguments
- cutoff
- numeric The mito. fraction cutoff to color the embedding (default = 0.05) 
- species
- character Species to calculate the mitochondrial fraction for (default = "human") 
- samples
- character Sample names to include for plotting (default = self$metadata$sample) 
- sep
- character Separator for creating unique cell names (default = "!!") 
- keep.col
- character Color for density of cells that are kept (default = "#E7CDC2") 
- filter.col
- character Color for density of cells to be filtered (default = "#A65141") 
Returns
ggplot2 object
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Plot
crm$plotMitoFraction()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
Method detectDoublets()
Detect doublet cells.
Usage
CRMetrics$detectDoublets(
  method = c("scrublet", "doubletdetection"),
  cms = self$cms,
  samples = self$metadata$sample,
  env = "r-reticulate",
  conda.path = system("whereis conda"),
  n.cores = self$n.cores,
  verbose = self$verbose,
  args = list(),
  export = FALSE,
  data.path = self$data.path
)
Arguments
- method
- character Which method to use, either scrublet or doubletdetection (default = "scrublet").
- cms
- list List containing the count matrices (default=self$cms). 
- samples
- character Vector of sample names. If NULL, samples are extracted from cms (default = self$metadata$sample) 
- env
- character Environment to run python in (default="r-reticulate"). 
- conda.path
- character Path to conda environment (default=system("whereis conda")). 
- n.cores
- integer Number of cores to use (default = self$n.cores) 
- verbose
- logical Print messages or not (default = self$verbose) 
- args
- list A list with additional arguments for either DoubletDetection or scrublet. Please check the respective manuals.
- export
- logical Export count matrices (CMs) in order to detect doublets outside R (default = FALSE) 
- data.path
- character Path to write data, only relevant if export = TRUE. Last character must be / (default = self$data.path)
Returns
data.frame
Examples
\dontrun{
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Detect doublets
crm$detectDoublets(method = "scrublet", 
conda.path = "/opt/software/miniconda/4.12.0/condabin/conda")
}
Method doPreprocessing()
Perform conos preprocessing.
Usage
CRMetrics$doPreprocessing(
  cms = self$cms,
  preprocess = c("pagoda2", "seurat"),
  min.transcripts.per.cell = 100,
  verbose = self$verbose,
  n.cores = self$n.cores,
  get.largevis = FALSE,
  tsne = FALSE,
  make.geneknn = FALSE,
  cluster = FALSE,
  ...
)
Arguments
- cms
- list List containing the count matrices (default = self$cms). 
- preprocess
- character Method to use for preprocessing (default = c("pagoda2","seurat")). 
- min.transcripts.per.cell
- numeric Minimal transcripts per cell (default = 100) 
- verbose
- logical Print messages or not (default = self$verbose). 
- n.cores
- integer Number of cores for the calculations (default = self$n.cores). 
- get.largevis
- logical For Pagoda2, create largeVis embedding (default = FALSE) 
- tsne
- logical Create tSNE embedding (default = FALSE) 
- make.geneknn
- logical For Pagoda2, estimate gene kNN (default = FALSE) 
- cluster
- logical For Seurat, estimate clusters (default = FALSE) 
- ...
- Additional arguments for pagoda2::basicP2proc or conos:::basicSeuratProc
Returns
Conos object
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Perform preprocessing
crm$doPreprocessing(preprocess = "pagoda2")
} else {
message("Package 'pagoda2' not available.")
}
}
Method createEmbedding()
Create Conos embedding.
Usage
CRMetrics$createEmbedding( cms = self$cms.preprocessed, verbose = self$verbose, n.cores = self$n.cores, arg.buildGraph = list(), arg.findCommunities = list(), arg.embedGraph = list(method = "UMAP") )
Arguments
- cms
- list List containing the preprocessed count matrices (default = self$cms.preprocessed). 
- verbose
- logical Print messages or not (default = self$verbose). 
- n.cores
- integer Number of cores for the calculations (default = self$n.cores). 
- arg.buildGraph
- list A list with additional arguments for the buildGraph function in Conos (default = list())
- arg.findCommunities
- list A list with additional arguments for the findCommunities function in Conos (default = list())
- arg.embedGraph
- list A list with additional arguments for the embedGraph function in Conos (default = list(method = "UMAP"))
Returns
Conos object
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
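The 'arg.buildGraph', 'arg.findCommunities' and 'arg.embedGraph' arguments are lists of extra arguments for the respective Conos functions. How they are forwarded internally is not documented here; a common R idiom for this is do.call, sketched below with a hypothetical stand-in function ('embed' and its parameters are not part of Conos or CRMetrics).

```r
# Hypothetical stand-in for a downstream function such as embedGraph
embed <- function(method = "largeVis", spread = 1) {
  paste0("method=", method, ", spread=", spread)
}

# A user-supplied argument list, analogous to arg.embedGraph
arg.embedGraph <- list(method = "UMAP", spread = 15)

# Forward the list entries as named arguments
res <- do.call(embed, arg.embedGraph)
```

Entries not present in the list fall back to the defaults of the called function, so an empty list() leaves the downstream defaults untouched.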
Method filterCms()
Filter cells based on depth, mitochondrial fraction and doublets from the count matrix.
Usage
CRMetrics$filterCms(
  depth.cutoff = NULL,
  mito.cutoff = NULL,
  doublets = NULL,
  species = c("human", "mouse"),
  samples.to.exclude = NULL,
  verbose = self$verbose,
  sep = "!!",
  raw = FALSE
)
Arguments
- depth.cutoff
- numeric Depth (transcripts per cell) cutoff (default = NULL). 
- mito.cutoff
- numeric Mitochondrial fraction cutoff (default = NULL). 
- doublets
- character Doublet detection method to use (default = NULL). 
- species
- character Species to calculate the mitochondrial fraction for (default = "human"). 
- samples.to.exclude
- character Sample names to exclude (default = NULL) 
- verbose
- logical Show progress (default = self$verbose) 
- sep
- character Separator for creating unique cell names (default = "!!") 
- raw
- logical Filter on raw, unfiltered count matrices. Usually not intended (default = FALSE) 
Returns
list of filtered count matrices
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Filter CMs
crm$filterCms(depth.cutoff = 1e3, mito.cutoff = 0.05)
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
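To make the effect of 'depth.cutoff' and 'samples.to.exclude' concrete, the sketch below filters a named list of simulated count matrices without calling CRMetrics. The helper 'make.cm', the sample names, and the inclusive comparison are assumptions for illustration only; mitochondrial and doublet filtering would follow the same pattern with their own per-cell vectors.

```r
library(Matrix)

set.seed(1)
# Hypothetical helper: simulate a genes x cells count matrix
make.cm <- function(n.cells) {
  cm <- Matrix::rsparsematrix(100, n.cells, density = 0.2,
                              rand.x = function(n) stats::rpois(n, 50) + 1)
  dimnames(cm) <- list(paste0("gene", seq_len(100)),
                       paste0("cell", seq_len(n.cells)))
  cm
}
cms <- list(sample1 = make.cm(20), sample2 = make.cm(30), sample3 = make.cm(10))

depth.cutoff <- 1000
samples.to.exclude <- "sample3"

# Drop excluded samples, then keep cells whose depth reaches the cutoff
cms.kept <- cms[setdiff(names(cms), samples.to.exclude)]
cms.filtered <- lapply(cms.kept, function(cm) {
  cm[, Matrix::colSums(cm) >= depth.cutoff, drop = FALSE]
})
```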
Method selectMetrics()
Select metrics from summary.metrics
Usage
CRMetrics$selectMetrics(ids = NULL)
Arguments
- ids
- character Metric id to select (default = NULL). 
Returns
vector
Examples
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Select metrics
crm$selectMetrics()
selection.metrics <- crm$selectMetrics(c(1:4))
Method plotFilteredCells()
Plot filtered cells in an embedding, as a bar plot, as a tile plot, or export the underlying data frame.
Usage
CRMetrics$plotFilteredCells(
  type = c("embedding", "bar", "tile", "export"),
  depth = TRUE,
  depth.cutoff = 1000,
  doublet.method = NULL,
  mito.frac = TRUE,
  mito.cutoff = 0.05,
  species = c("human", "mouse"),
  size = 0.3,
  sep = "!!",
  cols = c("grey80", "red", "blue", "green", "yellow", "black", "pink", "purple"),
  ...
)
Arguments
- type
- character The type of plot to use: embedding, bar, tile or export (default = c("embedding","bar","tile","export")). 
- depth
- logical Plot the depth or not (default = TRUE). 
- depth.cutoff
- numeric Depth cutoff, either a single number or a vector with cutoff per sample and with sampleIDs as names (default = 1e3). 
- doublet.method
- character Method to detect doublets (default = NULL). 
- mito.frac
- logical Plot the mitochondrial fraction or not (default = TRUE). 
- mito.cutoff
- numeric Mitochondrial fraction cutoff, either a single number or a vector with cutoff per sample and with sampleIDs as names (default = 0.05). 
- species
- character Species to calculate the mitochondrial fraction for (default = c("human","mouse")). 
- size
- numeric Dot size (default = 0.3) 
- sep
- character Separator for creating unique cell names (default = "!!") 
- cols
- character Colors used for plotting (default = c("grey80","red","blue","green","yellow","black","pink","purple")) 
- ...
- Plotting parameters passed to sccore::embeddingPlot.
Returns
ggplot2 object or data frame
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Plot and extract result
crm$plotFilteredCells(type = "embedding")
filtered.cells <- crm$plotFilteredCells(type = "export")
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
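Both 'depth.cutoff' and 'mito.cutoff' accept either a single number or a vector with one cutoff per sample, named by sample ID. The following base-R sketch shows one way such a named vector can be matched to cells whose names carry the sample prefix and the separator "!!". The depth values are hypothetical and the lookup is an illustration, not the package's confirmed implementation.

```r
sep <- "!!"

# Hypothetical per-cell depths with sample-prefixed cell names
depth <- c("sample1!!cell1" = 1500, "sample1!!cell2" = 700,
           "sample2!!cell1" = 700,  "sample2!!cell2" = 300)

# Named vector: one cutoff per sample ID
depth.cutoff <- c(sample1 = 1000, sample2 = 500)

# Recover the sample ID of each cell and look up its cutoff
sample.id <- vapply(strsplit(names(depth), sep, fixed = TRUE),
                    `[`, character(1), 1)
low.depth <- depth < depth.cutoff[sample.id]
```

Here a depth of 700 is flagged in sample1 but passes in sample2, which is the practical reason for allowing per-sample cutoffs.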
Method getDepth()
Extract sequencing depth from Conos object.
Usage
CRMetrics$getDepth(cms = self$cms)
Arguments
- cms
- list List of (sparse) count matrices (default = self$cms) 
Returns
data frame
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Get depth
crm$getDepth()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
Method getMitoFraction()
Calculate the fraction of mitochondrial genes.
Usage
CRMetrics$getMitoFraction(species = c("human", "mouse"), cms = self$cms)
Arguments
- species
- character Species to calculate the mitochondrial fraction for (default = "human"). 
- cms
- list List of (sparse) count matrices (default = self$cms) 
Returns
data frame
Examples
\donttest{
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Get mito. fraction
crm$getMitoFraction(species = c("human", "mouse"))
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
}
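The mitochondrial fraction of a cell is the share of its counts that comes from mitochondrial genes. A minimal sketch on a toy matrix follows. The gene-symbol prefixes "MT-" (human) and "mt-" (mouse) are a widespread convention and are assumed here; the pattern the package actually uses is not stated in this page.

```r
library(Matrix)

# Toy genes x cells count matrix with two mitochondrial genes
cm <- Matrix::Matrix(
  matrix(c(5, 0, 3, 2,
           1, 4, 0, 2,
           2, 2, 1, 0,
           2, 4, 0, 4), nrow = 4, byrow = TRUE,
         dimnames = list(c("ACTB", "GAPDH", "MT-CO1", "MT-ND1"),
                         paste0("cell", 1:4))),
  sparse = TRUE)

# Assumed gene-symbol prefixes per species
mito.pattern <- c(human = "^MT-", mouse = "^mt-")
species <- "human"
is.mito <- grepl(mito.pattern[[species]], rownames(cm))

# Fraction of counts per cell that stems from mitochondrial genes
mito.frac <- Matrix::colSums(cm[is.mito, , drop = FALSE]) / Matrix::colSums(cm)
```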
Method prepareCellbender()
Create plots and script call for CellBender
Usage
CRMetrics$prepareCellbender( shrinkage = 100, show.expected.cells = TRUE, show.total.droplets = TRUE, expected.cells = NULL, total.droplets = NULL, cms.raw = self$cms.raw, umi.counts = self$cellbender$umi.counts, data.path = self$data.path, samples = self$metadata$sample, verbose = self$verbose, n.cores = self$n.cores, unique.names = FALSE, sep = "!!" )
Arguments
- shrinkage
- integer Select every nth UMI count per cell for plotting. Improves plotting speed drastically. To plot all cells, set to 1 (default = 100) 
- show.expected.cells
- logical Plot line depicting expected number of cells (default = TRUE) 
- show.total.droplets
- logical Plot line depicting total droplets included for CellBender run (default = TRUE) 
- expected.cells
- named numeric If NULL, expected cells will be deduced from the number of cells per sample identified by Cell Ranger. Otherwise, a named vector of expected cells with sample IDs as names. Sample IDs must match those in summary.metrics (default = NULL) 
- total.droplets
- named numeric If NULL, total droplets included will be deduced from expected cells multiplied by 3. Otherwise, a named vector of total droplets included with sample IDs as names. Sample IDs must match those in summary.metrics (default = NULL) 
- cms.raw
- list Raw count matrices from HDF5 Cell Ranger outputs (default = self$cms.raw) 
- umi.counts
- list UMI counts calculated as column sums of raw count matrices from HDF5 Cell Ranger outputs (default = self$cellbender$umi.counts) 
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- verbose
- logical Show progress (default = self$verbose) 
- n.cores
- integer Number of cores (default = self$n.cores) 
- unique.names
- logical Create unique cell names (default = FALSE) 
- sep
- character Separator for creating unique cell names (default = "!!") 
Returns
ggplot2 object and bash script
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data")
crm$prepareCellbender()
}
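The 'shrinkage' argument selects every nth UMI count per cell for plotting. A small base-R sketch of that kind of thinning is shown below; the simulated counts and the data frame layout are hypothetical, and the package may order or index the counts differently.

```r
set.seed(1)
# Hypothetical UMI counts per barcode, ranked from highest to lowest
umi.counts <- sort(stats::rpois(1000, 50), decreasing = TRUE)

shrinkage <- 100
# Keep every 100th barcode of the ranked counts for plotting
idx <- seq(1, length(umi.counts), by = shrinkage)
plot.df <- data.frame(rank = idx, umi = umi.counts[idx])
```

With 1000 barcodes and shrinkage = 100, only 10 points remain to be drawn; setting shrinkage to 1 keeps all of them.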
Method saveCellbenderScript()
Usage
CRMetrics$saveCellbenderScript( file = "cellbender_script.sh", fpr = 0.01, epochs = 150, use.gpu = TRUE, expected.cells = NULL, total.droplets = NULL, data.path = self$data.path, samples = self$metadata$sample, args = NULL )
Arguments
- file
- character File name for CellBender script. Will be stored in data.path (default = "cellbender_script.sh")
- fpr
- numeric False positive rate for CellBender (default = 0.01) 
- epochs
- integer Number of epochs for CellBender (default = 150) 
- use.gpu
- logical Use CUDA capable GPU (default = TRUE) 
- expected.cells
- named numeric If NULL, expected cells will be deduced from the number of cells per sample identified by Cell Ranger. Otherwise, a named vector of expected cells with sample IDs as names. Sample IDs must match those in summary.metrics (default = NULL) 
- total.droplets
- named numeric If NULL, total droplets included will be deduced from expected cells multiplied by 3. Otherwise, a named vector of total droplets included with sample IDs as names. Sample IDs must match those in summary.metrics (default = NULL) 
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- args
- character (optional) Additional parameters for CellBender 
Returns
bash script
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$prepareCellbender()
crm$saveCellbenderScript()
}
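As a rough illustration of what such a script might contain, the sketch below assembles one `cellbender remove-background` call per sample in base R. The helper `build_cb_cmd()`, the input and output file names, and the sample values are assumptions made for illustration; this is not the package's implementation.

```r
# Hypothetical helper: build one CellBender command line per sample.
# The file layout below is an assumption, not taken from CRMetrics.
build_cb_cmd <- function(sample, data.path, expected.cells, total.droplets,
                         fpr = 0.01, epochs = 150, use.gpu = TRUE, args = NULL) {
  input  <- file.path(data.path, sample, "outs", "raw_feature_bc_matrix.h5")
  output <- file.path(data.path, sample, "outs", "cellbender.h5")
  paste(c(
    "cellbender remove-background",
    "--input", input,
    "--output", output,
    if (use.gpu) "--cuda",            # dropped from the command when use.gpu = FALSE
    "--expected-cells", expected.cells[[sample]],
    "--total-droplets-included", total.droplets[[sample]],
    "--fpr", fpr,
    "--epochs", epochs,
    args                              # optional extra parameters, NULL is ignored
  ), collapse = " ")
}

# Illustrative per-sample values, named by sample ID
expected.cells <- c(sample1 = 5000, sample2 = 8000)
total.droplets <- c(sample1 = 15000, sample2 = 24000)

cmds <- vapply(names(expected.cells), build_cb_cmd, character(1),
               data.path = "/path/to/count/data",
               expected.cells = expected.cells,
               total.droplets = total.droplets)

# One command per line, preceded by a shebang
script <- c("#!/bin/bash", unname(cmds))
cat(script, sep = "\n")
```

The named vectors mirror the `expected.cells` and `total.droplets` arguments, whose names must match the sample IDs.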
Method getExpectedCells()
Extract the expected number of cells per sample based on the Cell Ranger summary metrics
Usage
CRMetrics$getExpectedCells(samples = self$metadata$sample)
Arguments
- samples
- character Sample names to include (default = self$metadata$sample) 
Returns
A numeric vector
Examples
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Get summary
crm$addSummaryFromCms()
# Get no. cells
crm$getExpectedCells()
Method getTotalDroplets()
Get the total number of droplets included in the CellBender estimations. Based on the Cell Ranger summary metrics and multiplied by a preset multiplier.
Usage
CRMetrics$getTotalDroplets(samples = self$metadata$sample, multiplier = 3)
Arguments
- samples
- character Sample names to include (default = self$metadata$sample) 
- multiplier
- numeric Number to multiply expected number of cells with (default = 3) 
Returns
A numeric vector
Examples
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add summary
crm$addSummaryFromCms()
# Get no. droplets
crm$getTotalDroplets()
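The calculation described above amounts to a per-sample multiplication. A minimal sketch with made-up cell numbers (the values are hypothetical and not taken from any data set):

```r
# Hypothetical expected cell numbers per sample (named by sample ID)
expected.cells <- c(sample1 = 4200, sample2 = 6100)

# Default multiplier documented for getTotalDroplets()
multiplier <- 3

# Total droplets to include per sample; names are preserved
total.droplets <- expected.cells * multiplier
print(total.droplets)
```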
Method addCms()
Add a list of count matrices to the CRMetrics object.
Usage
CRMetrics$addCms( cms = NULL, data.path = self$data.path, samples = self$metadata$sample, cellbender = FALSE, raw = FALSE, symbol = TRUE, unique.names = TRUE, sep = "!!", add.metadata = TRUE, n.cores = self$n.cores, verbose = self$verbose )
Arguments
- cms
- list List of (sparse) count matrices (default = NULL) 
- data.path
- character Path to cellranger count data (default = self$data.path). 
- samples
- character Vector of sample names. If NULL, samples are extracted from cms (default = self$metadata$sample) 
- cellbender
- logical Add CellBender filtered count matrices in HDF5 format. Requires that "cellbender" is in the names of the files (default = FALSE) 
- raw
- logical Add raw count matrices from Cell Ranger output. Cannot be combined with cellbender = TRUE (default = FALSE)
- symbol
- logical The type of gene IDs to use, SYMBOL (TRUE) or ENSEMBL (FALSE) (default = TRUE) 
- unique.names
- logical Make cell names unique based on the sep parameter (default = TRUE)
- sep
- character Separator used to create unique cell names (default = "!!") 
- add.metadata
- boolean Add metadata from cms or not (default = TRUE) 
- n.cores
- integer Number of cores to use (default = self$n.cores) 
- verbose
- boolean Print progress (default = self$verbose) 
Returns
Add list of (sparse) count matrices to R6 class object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
crm$addCms(cms = testdata.cms)
}
Method plotCbTraining()
Plot the results from the CellBender estimations
Usage
CRMetrics$plotCbTraining( data.path = self$data.path, samples = self$metadata$sample, pal = self$pal )
Arguments
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- pal
- character Plotting palette (default = self$pal) 
Returns
A ggplot2 object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$prepareCellbender()
crm$saveCellbenderScript()
## Run CellBender script
crm$plotCbTraining()
}
Method plotCbCellProbs()
Plot the CellBender assigned cell probabilities
Usage
CRMetrics$plotCbCellProbs( data.path = self$data.path, samples = self$metadata$sample, low.col = "gray", high.col = "red" )
Arguments
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- low.col
- character Color for low probabilities (default = "gray") 
- high.col
- character Color for high probabilities (default = "red") 
Returns
A ggplot2 object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$prepareCellbender()
crm$saveCellbenderScript()
## Run the CellBender script
crm$plotCbCellProbs()
}
Method plotCbAmbExp()
Plot the estimated ambient gene expression per sample from CellBender calculations
Usage
CRMetrics$plotCbAmbExp( cutoff = 0.005, data.path = self$data.path, samples = self$metadata$sample )
Arguments
- cutoff
- numeric Horizontal line included in the plot to indicate highly expressed ambient genes (default = 0.005) 
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
Returns
A ggplot2 object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$prepareCellbender()
crm$saveCellbenderScript()
## Run CellBender script
crm$plotCbAmbExp()
}
Method plotCbAmbGenes()
Plot the most abundant estimated ambient genes from the CellBender calculations
Usage
CRMetrics$plotCbAmbGenes( cutoff = 0.005, data.path = self$data.path, samples = self$metadata$sample, pal = self$pal )
Arguments
- cutoff
- numeric Cutoff of ambient gene expression to use to extract ambient genes per sample 
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- pal
- character Plotting palette (default = self$pal) 
Returns
A ggplot2 object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$prepareCellbender()
crm$saveCellbenderScript()
## Run CellBender script
crm$plotCbAmbGenes()
}
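To illustrate how a cutoff of this kind selects ambient genes, the sketch below applies the default threshold to a made-up vector of ambient expression estimates. The gene names and values are hypothetical, and whether the package uses a strict or non-strict comparison is an assumption here.

```r
# Hypothetical ambient expression estimates for one sample
ambient <- c(gene1 = 0.020, gene2 = 0.008, gene3 = 0.004, gene4 = 0.001)

# Default cutoff documented for plotCbAmbExp() and plotCbAmbGenes()
cutoff <- 0.005

# Genes whose estimated ambient expression exceeds the cutoff
ambient.genes <- names(ambient)[ambient > cutoff]
print(ambient.genes)
```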
Method addSummaryFromCms()
Add summary metrics from a list of count matrices
Usage
CRMetrics$addSummaryFromCms( cms = self$cms, n.cores = self$n.cores, verbose = self$verbose )
Arguments
- cms
- list A list of filtered count matrices (default = self$cms) 
- n.cores
- integer Number of cores to use (default = self$n.cores) 
- verbose
- logical Show progress (default = self$verbose) 
Returns
data.frame
Examples
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add summary
crm$addSummaryFromCms()
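For orientation, summary metrics of this kind can be derived directly from a list of count matrices. The sketch below computes a few per-sample values with the Matrix package; the chosen metrics and column names are assumptions for illustration and may differ from what the method stores.

```r
library(Matrix)

# Simulate two small count matrices (genes x cells)
set.seed(1)
cms <- lapply(1:2, function(i) {
  out <- rsparsematrix(200, 100, 0.1)
  out[out < 0] <- 1
  out
})
names(cms) <- c("sample1", "sample2")

# One row of illustrative summary metrics per sample
summary.df <- do.call(rbind, lapply(names(cms), function(s) {
  cm <- cms[[s]]
  data.frame(
    sample = s,
    n.cells = ncol(cm),                                     # number of cells
    median.counts.per.cell = median(Matrix::colSums(cm)),   # total counts per cell
    median.genes.per.cell = median(Matrix::colSums(cm > 0)) # detected genes per cell
  )
}))
print(summary.df)
```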
Method runSoupX()
Run SoupX ambient RNA estimation and correction
Usage
CRMetrics$runSoupX( data.path = self$data.path, samples = self$metadata$sample, n.cores = self$n.cores, verbose = self$verbose, arg.load10X = list(), arg.autoEstCont = list(), arg.adjustCounts = list() )
Arguments
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- n.cores
- numeric Number of cores (default = self$n.cores) 
- verbose
- logical Show progress (default = self$verbose) 
- arg.load10X
- list A list with additional parameters for SoupX::load10X (default = list())
- arg.autoEstCont
- list A list with additional parameters for SoupX::autoEstCont (default = list())
- arg.adjustCounts
- list A list with additional parameters for SoupX::adjustCounts (default = list())
Returns
List containing a list with corrected counts, and a data.frame containing plotting estimations
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$runSoupX()
}
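The `arg.*` parameters take lists of extra arguments. A common way to forward such a list in R is `do.call()`, sketched below with a stand-in function so that the example runs without SoupX. The function `estimate()` is hypothetical; `tfidfMin` and `forceAccept` are used as example argument names in the style of `SoupX::autoEstCont`, and how CRMetrics forwards the lists internally is an assumption.

```r
# Stand-in for a downstream function (SoupX itself is not called here)
estimate <- function(x, tfidfMin = 1, forceAccept = FALSE) {
  list(n = length(x), tfidfMin = tfidfMin, forceAccept = forceAccept)
}

# Extra arguments supplied by the user as a named list
arg.autoEstCont <- list(tfidfMin = 0.5)

# Fixed arguments and user-supplied arguments are combined into one call
res <- do.call(estimate, c(list(x = 1:10), arg.autoEstCont))
str(res)
```

With the package, the corresponding call would look like `crm$runSoupX(arg.autoEstCont = list(tfidfMin = 0.5))`.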
Method plotSoupX()
Plot the results from the SoupX estimations
Usage
CRMetrics$plotSoupX(plot.df = self$soupx$plot.df)
Arguments
- plot.df
- data.frame SoupX estimations (default = self$soupx$plot.df) 
Returns
A ggplot2 object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$runSoupX()
crm$plotSoupX()
}
Method plotCbCells()
Plot CellBender cell estimations against the estimated cell numbers from Cell Ranger
Usage
CRMetrics$plotCbCells( data.path = self$data.path, samples = self$metadata$sample, pal = self$pal )
Arguments
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- pal
- character Plotting palette (default = self$pal) 
Returns
A ggplot2 object
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$prepareCellbender()
crm$saveCellbenderScript()
## Run CellBender script
crm$plotCbCells()
}
Method addDoublets()
Add doublet results created from exported Python script
Usage
CRMetrics$addDoublets(
  method = c("scrublet", "doubletdetection"),
  data.path = self$data.path,
  samples = self$metadata$sample,
  cms = self$cms,
  verbose = self$verbose
)
Arguments
- method
- character Which method to use, either scrublet or doubletdetection (default is both).
- data.path
- character Path to Cell Ranger outputs (default = self$data.path) 
- samples
- character Sample names to include (default = self$metadata$sample) 
- cms
- list List containing the count matrices (default = self$cms). 
- verbose
- boolean Print progress (default = self$verbose) 
Returns
List of doublet results
Examples
\dontrun{
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
crm$detectDoublets(export = TRUE)
## Run Python script
crm$addDoublets()
}
Method clone()
The objects of this class are cloneable with this method.
Usage
CRMetrics$clone(deep = FALSE)
Arguments
- deep
- Whether to make a deep clone. 
Examples
## ------------------------------------------------
## Method `CRMetrics$new`
## ------------------------------------------------
## Not run: 
crm <- CRMetrics$new(data.path = "/path/to/count/data/")
## End(Not run)
## ------------------------------------------------
## Method `CRMetrics$addDetailedMetrics`
## ------------------------------------------------
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Run function
crm$addDetailedMetrics()
## ------------------------------------------------
## Method `CRMetrics$addComparison`
## ------------------------------------------------
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add metadata
crm$metadata <- data.frame(sex = c("male","female"))
# Add comparison group
crm$addComparison(comp.group = "sex")
## ------------------------------------------------
## Method `CRMetrics$plotSamples`
## ------------------------------------------------
samples <- c("sample1", "sample2")
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
names(testdata.cms) <- samples
# Create metadata
metadata <- data.frame(sample = samples,
sex = c("male","female"),
condition = c("a","b"))
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, metadata = metadata, n.cores = 1)
# Plot
crm$plotSamples(comp.group = "sex", second.comp.group = "condition")
## ------------------------------------------------
## Method `CRMetrics$plotSummaryMetrics`
## ------------------------------------------------
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add summary metrics
crm$addSummaryFromCms()
crm$plotSummaryMetrics(plot.geom = "point")
## ------------------------------------------------
## Method `CRMetrics$plotDetailedMetrics`
## ------------------------------------------------
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Add detailed metrics
crm$addDetailedMetrics()
# Plot
crm$plotDetailedMetrics()
## ------------------------------------------------
## Method `CRMetrics$plotEmbedding`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding() 
crm$plotEmbedding()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$plotDepth`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Plot
crm$plotDepth()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$plotMitoFraction`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Plot
crm$plotMitoFraction()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$detectDoublets`
## ------------------------------------------------
## Not run: 
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Detect doublets
crm$detectDoublets(method = "scrublet", 
conda.path = "/opt/software/miniconda/4.12.0/condabin/conda")
## End(Not run)
## ------------------------------------------------
## Method `CRMetrics$doPreprocessing`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Perform preprocessing
crm$doPreprocessing(preprocess = "pagoda2")
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$createEmbedding`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$filterCms`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Filter CMs
crm$filterCms(depth.cutoff = 1e3, mito.cutoff = 0.05)
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$selectMetrics`
## ------------------------------------------------
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Select metrics
crm$selectMetrics()
selection.metrics <- crm$selectMetrics(c(1:4))
## ------------------------------------------------
## Method `CRMetrics$plotFilteredCells`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Plot and extract result
crm$plotFilteredCells(type = "embedding")
filtered.cells <- crm$plotFilteredCells(type = "export")
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$getDepth`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Get depth
crm$getDepth()
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
## ------------------------------------------------
## Method `CRMetrics$getMitoFraction`
## ------------------------------------------------
if (requireNamespace("pagoda2", quietly = TRUE)) {
if (requireNamespace("conos", quietly = TRUE)) {
# Simulate data
testdata.cms <- lapply(seq_len(2), \(x) {
out <- Matrix::rsparsematrix(2e3, 1e3, 0.1)
out[out < 0] <- 1
dimnames(out) <- list(sapply(seq_len(2e3), \(x) paste0("gene",x)),
sapply(seq_len(1e3), \(x) paste0("cell",x)))
return(out)
})
# Initialize
crm <- CRMetrics$new(cms = testdata.cms, samples = c("sample1", "sample2"), n.cores = 1)
# Create embedding
crm$doPreprocessing()
crm$createEmbedding()
# Get mito. fraction
crm$getMitoFraction(species = c("human", "mouse"))
} else {
message("Package 'conos' not available.")
}
} else {
message("Package 'pagoda2' not available.")
}
Add detailed metrics
Description
Add detailed metrics. Requires loading raw count matrices using pagoda2.
Usage
addDetailedMetricsInner(cms, verbose = TRUE, n.cores = 1)
Arguments
| cms | List containing the count matrices. | 
| verbose | Print messages (default = TRUE). | 
| n.cores | Number of cores for the calculations (default = 1). | 
Value
data frame
Add statistics to plot
Description
Use ggpubr to add statistics to plots.
Usage
addPlotStats(
  p,
  comp.group,
  metadata,
  h.adj = 0.05,
  primary.test,
  secondary.test,
  exact = FALSE
)
Arguments
| p | Plot to add statistics to. | 
| comp.group | Comparison metric. | 
| metadata | Metadata for samples. | 
| h.adj | Position of statistics test p value as % of max(y) (default = 0.05). | 
| primary.test | Primary statistical test, e.g. "anova", "kruskal.test". | 
| secondary.test | Secondary statistical test, e.g. "t-test", "wilcox.test" | 
| exact | Whether to calculate exact p values (default = FALSE). | 
Value
ggplot2 object
Add statistics to plot
Description
Use ggpubr to add statistics to samples or plot
Usage
addPlotStatsSamples(
  p,
  comp.group,
  metadata,
  h.adj = 0.05,
  exact = FALSE,
  second.comp.group
)
Arguments
| p | Plot to add statistics to. | 
| comp.group | Comparison metric. | 
| metadata | Metadata for samples. | 
| h.adj | Position of statistics test p value as % of max(y) (default = 0.05). | 
| exact | Whether to calculate exact p values (default = FALSE). | 
| second.comp.group | Second comparison metric. | 
Value
ggplot2 object
Add summary metrics
Description
Add summary metrics by reading Cell Ranger metrics summary files.
Usage
addSummaryMetrics(data.path, metadata, n.cores = 1, verbose = TRUE)
Arguments
| data.path | Path to cellranger count data. | 
| metadata | Metadata for samples. | 
| n.cores | Number of cores for the calculations (default = 1). | 
| verbose | Print messages (default = TRUE). | 
Value
data frame
Set correct 'comp.group' parameter
Description
Set comp.group to 'category' if null.
Usage
checkCompGroup(comp.group, category, verbose = TRUE)
Arguments
| comp.group | Comparison metric. | 
| category | Comparison metric to use if comp.group is not provided. | 
| verbose | Print messages (default = TRUE). | 
Value
vector
Check whether 'comp.group' is in metadata
Description
Checks whether 'comp.group' is any of the column names in metadata.
Usage
checkCompMeta(comp.group, metadata)
Arguments
| comp.group | Comparison metric. | 
| metadata | Metadata for samples. | 
Value
Nothing; stops with an error if 'comp.group' is not in metadata
Check data path
Description
Helper function to check that data.path is not NULL
Usage
checkDataPath(data.path)
Arguments
| data.path | character Path to be checked | 
Create unique cell names
Description
Create unique cell names from sample IDs and cell IDs
Usage
createUniqueCellNames(cms, samples, sep = "!!")
Arguments
| cms | list List of count matrices, should be named (optional) | 
| samples | character Optional, list of sample names | 
| sep | character Separator between sample IDs and cell IDs (default = "!!") | 
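A minimal sketch of the naming scheme, assuming the sample ID is placed before the cell ID with the separator in between. The helper `make_unique_names()` is a hypothetical re-implementation for illustration, not the package function.

```r
# Hypothetical re-implementation for illustration only
make_unique_names <- function(cms, samples = names(cms), sep = "!!") {
  stopifnot(length(cms) == length(samples))
  out <- Map(function(cm, s) {
    # Prefix every cell ID with its sample ID
    colnames(cm) <- paste0(s, sep, colnames(cm))
    cm
  }, cms, samples)
  names(out) <- samples
  out
}

# Two tiny matrices sharing the same cell IDs
cm <- matrix(0, 3, 2, dimnames = list(paste0("gene", 1:3), paste0("cell", 1:2)))
cms <- list(sample1 = cm, sample2 = cm)

cms.unique <- make_unique_names(cms)
print(colnames(cms.unique$sample1))
print(colnames(cms.unique$sample2))
```

After renaming, cell IDs no longer collide when the matrices are merged, and the sample of origin can be recovered by splitting on the separator.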
Create filtering vector
Description
Create logical filtering vector based on a numeric vector and a (sample-wise) cutoff
Usage
filterVector(num.vec, name, filter, samples, sep = "!!")
Arguments
| num.vec | numeric Numeric vector to create filter on | 
| name | character Name of filter | 
| filter | numeric Either a single numeric value or a numeric value with length of samples | 
| samples | character Sample IDs | 
| sep | character Separator to split cells by into sample-wise lists (default = "!!") | 
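The sketch below shows one way a sample-wise cutoff could be applied to a named numeric vector. The helper `filter_vector()` is hypothetical and omits the `name` argument; flagging values at or above the cutoff as TRUE is an assumption, since the direction of the comparison is not stated here.

```r
# Hypothetical re-implementation for illustration only
filter_vector <- function(num.vec, filter, samples, sep = "!!") {
  # The sample ID is the part of each cell name before the separator
  cell.sample <- vapply(strsplit(names(num.vec), sep, fixed = TRUE),
                        `[`, character(1), 1)
  if (length(filter) == 1) {
    # A single cutoff applies to all cells
    cutoff <- rep(filter, length(num.vec))
  } else {
    # One cutoff per sample, matched to cells by sample ID
    stopifnot(length(filter) == length(samples))
    if (is.null(names(filter))) names(filter) <- samples
    cutoff <- filter[cell.sample]
  }
  res <- num.vec >= cutoff
  names(res) <- names(num.vec)
  res
}

# Hypothetical sequencing depth per cell
depth <- c("sample1!!cell1" = 500, "sample1!!cell2" = 1500,
           "sample2!!cell1" = 1500, "sample2!!cell2" = 2500)
samples <- c("sample1", "sample2")

res.single <- filter_vector(depth, filter = 1000, samples = samples)
res.samplewise <- filter_vector(depth, filter = c(1000, 2000), samples = samples)
print(res.single)
print(res.samplewise)
```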
Get H5 file paths
Description
Get file paths for H5 files
Usage
getH5Paths(data.path, samples = NULL, type = NULL)
Arguments
| data.path | character Path for directory containing sample-wise directories with Cell Ranger count outputs | 
| samples | character Sample names to include (default = NULL) | 
| type | character Type of H5 files to get paths for, one of "raw", "filtered" (Cell Ranger count outputs), "cellbender" (raw CellBender outputs), "cellbender_filtered" (CellBender filtered outputs) (default = NULL) | 
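As an illustration of the directory layout described above, the sketch below builds paths for the two Cell Ranger file types. The helper `h5_paths()` is hypothetical, and the `outs/<type>_feature_bc_matrix.h5` layout is assumed from standard Cell Ranger count output; CellBender file names are not covered.

```r
# Hypothetical helper covering only the two Cell Ranger file types
h5_paths <- function(data.path, samples, type = c("raw", "filtered")) {
  type <- match.arg(type)
  paths <- file.path(data.path, samples, "outs",
                     paste0(type, "_feature_bc_matrix.h5"))
  names(paths) <- samples
  paths
}

paths <- h5_paths("/path/to/count/data", c("sample1", "sample2"), type = "filtered")
print(paths)

# Check which files are present before trying to read them
print(file.exists(paths))
```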
Get labels for percentage of filtered cells
Description
Labels the percentage of filtered cells based on mitochondrial fraction, sequencing depth and doublets as low, medium or high
Usage
labelsFilter(filter.data)
Arguments
| filter.data | Data frame containing the mitochondrial fraction, depth and doublets per sample. | 
Value
data frame
Calculate percentage of filtered cells
Description
Calculate percentage of filtered cells based on the filter
Usage
percFilter(filter.data, filter = "mito", no.vars = 1)
Arguments
| filter.data | Data frame containing the mitochondrial fraction, depth and doublets per sample. | 
| filter | The variable to filter (default = "mito") | 
| no.vars | numeric Number of variables (default = 1) | 
Value
vector
Plot the data as points, as bars, as a histogram, or as a violin
Description
Plot the data as points, barplot, histogram or violin
Usage
plotGeom(g, plot.geom, col, pal = NULL)
Arguments
| g | ggplot2 object | 
| plot.geom | The plot.geom to use, "point", "bar", "histogram", or "violin". | 
| pal | character Palette (default = NULL) | 
Value
geom
Load 10x count matrices
Description
Load gene expression count data
Usage
read10x(
  data.path,
  samples = NULL,
  raw = FALSE,
  symbol = TRUE,
  sep = "!!",
  unique.names = TRUE,
  n.cores = 1,
  verbose = TRUE
)
Arguments
| data.path | Path to cellranger count data. | 
| samples | Vector of sample names (default = NULL) | 
| raw | logical Add raw count matrices (default = FALSE) | 
| symbol | The type of gene IDs to use, SYMBOL (TRUE) or ENSEMBL (FALSE) (default = TRUE). | 
| sep | Separator for cell names (default = "!!"). | 
| n.cores | Number of cores for the calculations (default = 1). | 
| verbose | Print messages (default = TRUE). | 
Value
data frame
Examples
## Not run: 
cms <- read10x(data.path = "/path/to/count/data", 
samples = crm$metadata$sample, 
raw = FALSE, 
symbol = TRUE, 
n.cores = crm$n.cores)
## End(Not run)
Read 10x HDF5 files
Description
Read 10x HDF5 files
Usage
read10xH5(
  data.path,
  samples = NULL,
  type = c("raw", "filtered", "cellbender", "cellbender_filtered"),
  symbol = TRUE,
  sep = "!!",
  n.cores = 1,
  verbose = TRUE,
  unique.names = FALSE
)
Arguments
| data.path | character Path for directory containing sample-wise directories with Cell Ranger count outputs | 
| samples | character vector, select specific samples for processing (default = NULL) | 
| type | name of H5 file to search for, "raw" and "filtered" are Cell Ranger count outputs, "cellbender" is output from CellBender after running script from saveCellbenderScript | 
| symbol | logical Use gene SYMBOLs (TRUE) or ENSEMBL IDs (FALSE) (default = TRUE) | 
| sep | character Separator for creating unique cell names from sample IDs and cell IDs (default = "!!") | 
| n.cores | integer Number of cores (default = 1) | 
| verbose | logical Print progress (default = TRUE) | 
| unique.names | logical Create unique cell IDs (default = FALSE) | 
Value
list with sparse count matrices
Examples
## Not run: 
cms.h5 <- read10xH5(data.path = "/path/to/count/data")
## End(Not run)