Changes in version 1.8.0

  o Respect default options from the underlying C++ libraries in most functions. Specifically, almost all R defaults are now set to NULL, and we only replace the defaults in the C++ library if a non-NULL argument is passed in R. This eliminates the need for manual synchronization between R and C++ when updating libraries with new defaults. We also add *Defaults() utility functions to report the C++ defaults for user inspection. This has some additional specific consequences for individual functions:
      • Updated default max.bias= in choosePseudoCount() to reduce log-transformation bias.
      • Updated default extra.work= in runPca() and scoreGeneSet() for better performance.
      • Updated default compute.median= in aggregateAcrossCells() to avoid unnecessary work.
      • Updated default use.min.width= in fitVarianceTrend() and modelGeneVariances() to improve trend fits for typical scRNA-seq datasets.
  o Deprecated the num.neighbors= argument in runTsne(). Now, the number of neighbors is fully determined by perplexity=. For precomputed KNN results, an error is thrown if there are any mismatches between the number of supplied neighbors and tsnePerplexityToNeighbors(perplexity).
  o Optionally add column names to the dimensions for runTsne.se(), runPca.se() and runUmap.se(). This is occasionally helpful for downstream plotting frameworks.
  o Added fit.trend= option to skip trend fitting in modelGeneVariances().
  o Check for IRLBA convergence in scoreGeneSet() and raise a warning if the algorithm did not converge.
  o Bugfix to preserve assay names so that they match up to character subsets= in various compute*QcMetrics() functions.
  o Added subsampleByPartition() to subsample cells within each partition.
  o Improved support for experiments with spike-in transcripts:
      • Added centerSpikeInFactors() for convenient centering of size factors for endogenous genes and spike-in transcripts.
      • Added normalizeRnaCountsWithSpikeIns.se() for convenient normalization in the presence of spike-in transcripts.
      • Added chooseRnaHvgsWithSpikeIns.se() for convenient variance modelling against a spike-in trend.
  o Renamed components.from.residuals= to center.scores.by.block= in runPca(), for consistency with the C++ code.
  o Prevent an R session crash when an NA is supplied to block= or group= in functions like suggestRnaQcThresholds() and scoreMarkers(). Now, an error is thrown if an NA value is detected.
  o Consistently overwrite existing columns when saving results in the colData or rowData of a SummarizedExperiment. This affects chooseRnaHvgs.se(), quickAdtQc.se(), quickCrisprQc.se() and quickRnaQc.se(), where the results were previously appended to the existing DataFrame, introducing columns with duplicate names.
  o Removed the defunct analyze() and convertAnalyzeResults() functions.

Changes in version 1.6.0

  o Multiple changes to clusterGraph():
      • Updates to use the latest version of the igraph library.
      • method="leiden" now supports the ER objective function in leiden.objective=.
      • Non-zero statuses are now reported directly as errors; status is removed from the output.
  o Multiple additions to scoreMarkers():
      • Support use of quantiles to average statistics across blocks, via the new block.average.policy= and block.quantile= arguments.
      • Allow calculation of arbitrary quantiles as a summary statistic for the effect sizes of each group, via the new compute.summary.quantiles= argument.
      • Output now contains the number of rows, row names and group IDs for easier interpretation by downstream functions.
      • Added top.index.only= argument to report only the indices of the top genes when all.pairwise= is an integer.
  o Added arguments to summarizeEffects() to disable calculation of certain summary statistics, e.g., by setting compute.summary.median=FALSE. Also support calculation of arbitrary quantiles as described for scoreMarkers().
  o suggest*QcThresholds() functions now return a block.ids field that preserves the type of block=. This enables easier matching in the corresponding filter*QcMetrics() function.
  o Multiple changes to modelGeneVariances():
      • Allow the use of quantiles to average statistics across blocks via the new block.average.policy= and block.quantile= arguments.
      • Return a block.ids field that preserves the type of block= for easier matching.
  o Multiple changes to runPca():
      • Added a subset= argument that performs a PCA on a subset of features but still reports the full rotation matrix for all features.
      • Return a block.ids field that preserves the type of block= for easier matching.
  o clusterKmeans() now emits a warning upon convergence failure. This can be disabled by setting warn=FALSE.
  o Simplified the automatic naming performed by combineFactors() when factors is unnamed.
  o Added SummarizedExperiment-compatible wrappers for all functions. These can be identified by their *.se suffix, e.g., aggregateAcrossCells.se(). Some functions execute multiple steps for convenience, e.g., quickRnaQc.se() runs all of the RNA-related QC functions.
  o Added analyze.se() to replace the analyze() function. The former is more convenient as it stores most outputs in a SingleCellExperiment and formats the marker tables correctly. The latter is now deprecated.
  o compute*QcMetrics() functions now return a DataFrame for more convenient inspection and manipulation.
  o All functions that previously returned a base data.frame will now return a DataFrame from S4Vectors. This is a little prettier and we already load S4Vectors anyway.
  o Removed checks on num.neighbors= when pre-computed neighbor search results are supplied in runUmap(), buildSnnGraph() and subsampleByNeighbors(). Any differences between num.neighbors= and the number of neighbors in the pre-computed results are now ignored.
  o Implemented the LogNormalizedMatrix class for delayed log-transformation.
    Modified normalizeCounts() to return instances of this class when log=TRUE and delayed=TRUE. This allows downstream C++ code to use a more efficient normalization wrapper via initializeCpp(). Note that there are some very minor differences between the log-normalized values computed in R via extract_array() and those computed in C++ via tatami extraction.
  o Added the countGroupsByBlock() function to count the number of cells within each combination of group and block. This is primarily intended for diagnosing batch effects.
  o Added options to control which statistics are computed in aggregateAcrossCells(). Also added bindings to compute the per-group medians via the compute.median= option.

Changes in version 1.4.0

  o Multiple changes to scoreMarkers():
      • Setting all.pairwise= to an integer will now return the top markers by effect size for each pairwise comparison. This is more efficient than creating the 3D array of effects and then ranking the genes in each comparison.
      • The min-rank summary is now limited to the top min.rank.limit= genes, for efficiency. All other genes will have their reported min-rank set to a maximum placeholder.
      • Added compute.group.mean= and compute.group.detected= options to disable reporting of the group-wise means and detected proportions, respectively.
  o Multiple changes to testEnrichment():
      • Setting universe=NULL will automatically define the universe as the union of all genes.
      • Return value is now a data frame with the number of overlapping genes and the size of each gene set.
  o Updated correctMnn() to the new algorithm used by the mnncorrect C++ library. This aims to improve correction accuracy and reduce sensitivity to the choice of reference.
  o Bugfix to runAllNeighborSteps() to return an empty list when no step is requested.
  o Set bound=0 by default in chooseHighlyVariableGenes(), for more convenient use with residuals.
  o Modified scoreGeneSet() to return weights as a data frame containing the row indices of the genes in the set.
  o Support character and logical vectors in sets= for aggregateAcrossGenes().
  o Automatically remove empty clusters from the output of clusterKmeans().
  o Supported blocking and related arguments for weighted averages of distances in scaleByNeighbors().

Changes in version 1.2.0

  o Added the aggregateAcrossGenes() function, to compute an aggregate expression value for gene sets.
  o Added compute.cohens.d=, compute.delta.mean= and compute.delta.detected= options to scoreMarkers().
  o Support top=Inf in chooseHighlyVariableGenes(). Also added the bound= argument to set a hard upper/lower bound.
  o Bugfix for correct filtering with block= in the various filter*QcMetrics() functions when not all blocking levels are present.
  o Bugfix to clusterKmeans() to respect the user-supplied seed= in relevant initialization methods.
  o Added a return.graph= option to return the SNN graph from runAllNeighborSteps().
  o Added a testEnrichment() function for quick and dirty gene set enrichment testing.
  o Modified runPca() so that it caps number= to the maximum number of available PCs.
  o Added an analyze() function that provides a one-click approach for analyzing single-cell data.
  o Added a reportGroupMarkerStatistics() function to combine all marker statistics for a single group into one data frame.
  o Switch clusterGraph() to use C++ wrappers around the igraph community detection algorithms via Rigraphlib. This replaces the dependency on the igraph R package.
  o Added a delayed= option to avoid wrapping the matrix in a DelayedArray in normalizeCounts().
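
As a rough illustration of the NULL-default convention described under version 1.8.0, the base R sketch below shows how a wrapper might override library defaults only for arguments that the caller actually sets. This is not the package's implementation: exampleDefaults(), exampleWrapper() and the tolerance= and iterations= options are hypothetical names with made-up values, standing in for a *Defaults() utility and an R function that forwards its options to C++.

```r
# Hypothetical stand-in for a *Defaults() utility. In a real package these
# values would be reported by the compiled library; here they are made up.
exampleDefaults <- function() {
    list(tolerance = 0.1, iterations = 10L)
}

# Hypothetical wrapper in which every tuning argument defaults to NULL.
exampleWrapper <- function(tolerance = NULL, iterations = NULL) {
    opts <- exampleDefaults()
    user <- list(tolerance = tolerance, iterations = iterations)
    # Replace a library default only when the caller supplied a non-NULL value.
    for (nm in names(user)) {
        if (!is.null(user[[nm]])) {
            opts[[nm]] <- user[[nm]]
        }
    }
    opts
}

str(exampleWrapper())                 # every option keeps the library default
str(exampleWrapper(tolerance = 0.5))  # only tolerance= is overridden
```

Under this convention, a change to the values returned by exampleDefaults() propagates to every caller that left the argument as NULL, which is the synchronization benefit noted in the 1.8.0 entry.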