The SEtools package is a set of convenience functions for
the Bioconductor class SummarizedExperiment.
It facilitates merging, melting, and plotting
SummarizedExperiment objects.
NOTE that the heatmap-related and melting functions have been
moved to a standalone package, sechm.
The old sehm function of SEtools should be
considered deprecated, and most SEtools functions are
conserved for legacy/reproducibility reasons (or until they find a
better home).
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("SEtools")

Or, to install the latest development version:
To showcase the main functions, we will use an example object which contains (a subset of) whole-hippocampus RNAseq of mice after different stressors:
## class: SummarizedExperiment
## dim: 100 20
## metadata(0):
## assays(2): counts logcpm
## rownames(100): Egr1 Nr4a1 ... CH36-200G6.4 Bhlhe22
## rowData names(2): meanCPM meanTPM
## colnames(20): HC.Homecage.1 HC.Homecage.2 ... HC.Swim.4 HC.Swim.5
## colData names(2): Region Condition
This is taken from Floriou-Servou et al., Biol Psychiatry 2018.
## class: SummarizedExperiment
## dim: 100 20
## metadata(3): se1 se2 anno_colors
## assays(2): counts logcpm
## rownames(100): AC139063.2 Actr6 ... Zfp667 Zfp930
## rowData names(2): meanCPM meanTPM
## colnames(20): se1.HC.Homecage.1 se1.HC.Homecage.2 ... se2.HC.Swim.4
## se2.HC.Swim.5
## colData names(3): Dataset Region Condition
All assays were merged, along with rowData and colData slots.
By default, row z-scores are calculated for each object when merging. This can be prevented with:
If more than one assay is present, one can specify a different scaling behavior for each assay:
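As a rough illustration of what row z-scoring means and how a per-assay switch could behave, here is a minimal base-R sketch on toy matrices. It does not call SEtools; the helper `rowZ`, the flag vector `do_scale`, and the data are hypothetical names and values chosen for the example.

```r
# Toy stand-ins for the two assays of one object (hypothetical data)
counts <- matrix(c(10, 12, 8, 14,
                   100, 90, 110, 95,
                   3, 7, 5, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))
assays1 <- list(counts = counts, logcpm = log1p(counts))

# Row z-score: centre each feature on its mean and divide by its SD
rowZ <- function(m) {
  z <- t(scale(t(m)))
  attr(z, "scaled:center") <- NULL
  attr(z, "scaled:scale") <- NULL
  z
}

# Hypothetical per-assay flags: leave counts untouched, scale logcpm
do_scale <- c(counts = FALSE, logcpm = TRUE)
scaled <- Map(function(m, flag) if (flag) rowZ(m) else m,
              assays1, do_scale[names(assays1)])

round(rowMeans(scaled$logcpm), 10)  # each scaled row is centred on zero
```

Scaling each object separately before merging puts features on a comparable scale across datasets, at the cost of discarding absolute differences between them.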
Differences from the cbind method include prefixes added
to column names, optional scaling, and the handling of metadata (e.g. for
sechm).
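The column-name prefixing can be pictured with plain matrices. The sketch below is base R only and is not the SEtools implementation; the matrices are made up, and restricting the result to features shared by all inputs is an assumption of this example.

```r
# Two toy matrices with partly overlapping features (hypothetical data)
m1 <- matrix(1:6, nrow = 3, dimnames = list(c("a", "b", "c"), c("x1", "x2")))
m2 <- matrix(7:12, nrow = 3, dimnames = list(c("b", "c", "d"), c("x1", "x2")))
mats <- list(se1 = m1, se2 = m2)

# Assumption of this sketch: keep only features present in every matrix
common <- Reduce(intersect, lapply(mats, rownames))

# Prefix column names with the list name so samples stay distinguishable
prefixed <- Map(function(m, nm) {
  m <- m[common, , drop = FALSE]
  colnames(m) <- paste(nm, colnames(m), sep = ".")
  m
}, mats, names(mats))

merged <- do.call(cbind, prefixed)

# A per-sample dataset label, analogous to a "Dataset" colData column
dataset <- rep(names(mats), vapply(prefixed, ncol, integer(1)))
```

A bare `cbind(m1, m2)` would instead bind rows by position and produce duplicated column names, which is what the prefixes avoid.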
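It is also possible to merge by rowData columns, which are specified
through the mergeBy argument. In this case, one can have
one-to-many and many-to-many mappings, for which two behaviors are
possible. By default, all matching combinations are reported, so the
same feature of one object may appear several times in the output.
Alternatively, if a function is passed through aggFun, the features of
each object will be aggregated by mergeBy using this
function before merging.
# Assign a random metafeature label (A-Z) to each feature of both objects
rowData(se1)$metafeature <- sample(LETTERS, nrow(se1), replace = TRUE)
rowData(se2)$metafeature <- sample(LETTERS, nrow(se2), replace = TRUE)
# Aggregate each object by metafeature (median), then merge without scaling
se3 <- mergeSEs(list(se1 = se1, se2 = se2), do.scale = FALSE, mergeBy = "metafeature", aggFun = median)
## Aggregating the objects by metafeature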
## Merging...
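The aggregate-then-merge behavior can be sketched in base R on toy matrices. This is a conceptual illustration only, not the code run by mergeSEs; the helper `aggByGroup` and the data are hypothetical.

```r
# Aggregate the rows of a matrix by a grouping vector, column by column
aggByGroup <- function(m, groups, fun = median) {
  idx <- split(seq_len(nrow(m)), groups)
  out <- t(vapply(idx,
                  function(i) apply(m[i, , drop = FALSE], 2, fun),
                  numeric(ncol(m))))
  colnames(out) <- colnames(m)
  out
}

# Two toy objects whose features map many-to-one onto metafeatures A, B, C
m1 <- matrix(c(1, 3, 10, 7,
               2, 6, 20, 9), nrow = 4,
             dimnames = list(paste0("f", 1:4), c("s1", "s2")))
m2 <- matrix(c(4, 8, 5,
               6, 2, 1), nrow = 3,
             dimnames = list(paste0("h", 1:3), c("t1", "t2")))

agg1 <- aggByGroup(m1, c("A", "A", "B", "C"))
agg2 <- aggByGroup(m2, c("B", "B", "A"))

# After aggregation each metafeature occurs once per object,
# so the objects can be joined one-to-one on the shared metafeatures
common <- intersect(rownames(agg1), rownames(agg2))
merged <- cbind(agg1[common, , drop = FALSE], agg2[common, , drop = FALSE])
```

Aggregating first guarantees one row per metafeature in each object, which turns a many-to-many mapping into a simple one-to-one join.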
A single SE can also be aggregated by using the aggSE
function:
## Aggregation methods for each assay:
## counts: sum; logcpm: expsum
## class: SummarizedExperiment
## dim: 25 10
## metadata(0):
## assays(2): counts logcpm
## rownames(25): A B ... Y Z
## rowData names(0):
## colnames(10): HC.Homecage.1 HC.Homecage.2 ... HC.Handling.4
## HC.Handling.5
## colData names(2): Region Condition
If the aggregation function(s) are not specified, aggSE
will try to guess decent aggregation functions from the assay names.
This is similar to scuttle::sumCountsAcrossFeatures, but
preserves other SE slots.
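To make the idea of assay-specific aggregation concrete, here is a small base-R sketch on toy data. It does not use aggSE. Summing raw counts per group with `rowsum` is standard; the handling of the log-scale assay (back-transform, sum, re-transform, here with `expm1`/`log1p`) is an assumption made for illustration and is not necessarily what the `expsum` method does.

```r
# Toy count matrix and a feature-to-group mapping (hypothetical data)
counts <- matrix(c(5, 1, 4, 10,
                   2, 3, 6, 0), nrow = 4,
                 dimnames = list(paste0("f", 1:4), c("s1", "s2")))
groups <- c("A", "B", "A", "B")

# Raw counts: summing within each group is the natural aggregation
aggCounts <- rowsum(counts, groups)

# Log-scale values should not be summed directly.
# Assumption of this sketch: back-transform, sum, then re-transform.
logx <- log1p(counts)
aggLog <- log1p(rowsum(expm1(logx), groups))

aggCounts
```

With this choice the aggregated log-scale assay stays consistent with the aggregated counts, which is the reason a different function per assay can be useful.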
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] grid stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] SEtools_1.25.0 sechm_1.19.0
## [3] ComplexHeatmap_2.27.0 SummarizedExperiment_1.41.0
## [5] Biobase_2.71.0 GenomicRanges_1.63.0
## [7] Seqinfo_1.1.0 IRanges_2.45.0
## [9] S4Vectors_0.49.0 BiocGenerics_0.57.0
## [11] generics_0.1.4 MatrixGenerics_1.23.0
## [13] matrixStats_1.5.0 BiocStyle_2.39.0
##
## loaded via a namespace (and not attached):
## [1] DBI_1.2.3 rlang_1.1.6 magrittr_2.0.4
## [4] clue_0.3-66 GetoptLong_1.0.5 compiler_4.5.1
## [7] RSQLite_2.4.3 mgcv_1.9-3 png_0.1-8
## [10] vctrs_0.6.5 sva_3.59.0 stringr_1.5.2
## [13] shape_1.4.6.1 crayon_1.5.3 fastmap_1.2.0
## [16] XVector_0.51.0 ca_0.71.1 rmarkdown_2.30
## [19] bit_4.6.0 xfun_0.54 cachem_1.1.0
## [22] jsonlite_2.0.0 blob_1.2.4 DelayedArray_0.37.0
## [25] BiocParallel_1.45.0 parallel_4.5.1 cluster_2.1.8.1
## [28] R6_2.6.1 bslib_0.9.0 stringi_1.8.7
## [31] RColorBrewer_1.1-3 limma_3.67.0 genefilter_1.93.0
## [34] jquerylib_0.1.4 Rcpp_1.1.0 iterators_1.0.14
## [37] knitr_1.50 splines_4.5.1 Matrix_1.7-4
## [40] abind_1.4-8 yaml_2.3.10 TSP_1.2-5
## [43] doParallel_1.0.17 codetools_0.2-20 curl_7.0.0
## [46] lattice_0.22-7 KEGGREST_1.51.0 S7_0.2.0
## [49] evaluate_1.0.5 Rtsne_0.17 survival_3.8-3
## [52] zip_2.3.3 Biostrings_2.79.1 circlize_0.4.16
## [55] BiocManager_1.30.26 foreach_1.5.2 ggplot2_4.0.0
## [58] scales_1.4.0 xtable_1.8-4 glue_1.8.0
## [61] pheatmap_1.0.13 maketools_1.3.2 tools_4.5.1
## [64] sys_3.4.3 data.table_1.17.8 openxlsx_4.2.8.1
## [67] annotate_1.89.0 locfit_1.5-9.12 registry_0.5-1
## [70] buildtools_1.0.0 XML_3.99-0.19 seriation_1.5.8
## [73] AnnotationDbi_1.73.0 edgeR_4.9.0 colorspace_2.1-2
## [76] nlme_3.1-168 randomcoloR_1.1.0.1 cli_3.6.5
## [79] S4Arrays_1.11.0 V8_8.0.1 gtable_0.3.6
## [82] DESeq2_1.51.0 sass_0.4.10 digest_0.6.37
## [85] SparseArray_1.11.1 rjson_0.2.23 farver_2.1.2
## [88] memoise_2.0.1 htmltools_0.5.8.1 lifecycle_1.0.4
## [91] httr_1.4.7 GlobalOptions_0.1.2 statmod_1.5.1
## [94] bit64_4.6.0-1