STexampleData 1.0.8
The STexampleData package provides access to a collection of spatially resolved transcriptomics datasets, which have been formatted into the SpatialExperiment Bioconductor object class.
These datasets have been collected from various publicly available sources, and cover several technological platforms. We provide them in the form of SpatialExperiment objects to make them easier to access, so that we and others can use them for examples, demonstrations, tutorials, and other purposes.
The SpatialExperiment class is an extension of SingleCellExperiment, adapted for the properties of spatially resolved transcriptomics data. For more details, see the SpatialExperiment documentation.
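The class relationship described above can be checked directly. The following is a minimal sketch, assuming only that the SpatialExperiment package is installed; the variable names are illustrative.

```r
library(SpatialExperiment)

# TRUE if SpatialExperiment inherits from SingleCellExperiment
inherits_sce <- extends("SpatialExperiment", "SingleCellExperiment")
inherits_sce

# the class itself together with all of its superclasses
class_lineage <- extends("SpatialExperiment")
class_lineage
```

Because of this inheritance, accessors defined for SingleCellExperiment objects, such as counts() and colData(), also work on SpatialExperiment objects.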
Note that objects in Bioconductor version 3.13 have names ending with _3_13, and have been kept for backward compatibility. The latest versions of the objects are available in Bioconductor version 3.14 onwards.
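One way to see which objects carry the _3_13 suffix is to inspect the record titles in ExperimentHub. The sketch below assumes the ExperimentHub package is installed and a network connection (or local cache) is available; the variable names are illustrative, and the records returned depend on the Bioconductor version in use.

```r
library(ExperimentHub)

eh <- ExperimentHub()
records <- query(eh, "STexampleData")

# titles of all matching records
titles <- records$title

# objects kept for backward compatibility end in "_3_13"
is_legacy <- grepl("_3_13$", titles)

titles[is_legacy]    # Bioconductor 3.13 objects
titles[!is_legacy]   # later objects (may be empty on Bioconductor 3.13)
```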
To install the STexampleData package from Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("STexampleData")
The package contains the following datasets:
Visium_humanDLPFC (10x Genomics Visium): A single sample (sample 151673) from the dorsolateral prefrontal cortex (DLPFC) region of the human brain, measured using the 10x Genomics Visium platform. This is a subset of the full dataset containing 12 samples from 3 neurotypical donors, published by Maynard and Collado-Torres et al. (2021). The full dataset is available from the spatialLIBD Bioconductor package.
Visium_mouseCoronal (10x Genomics Visium): A single coronal section from the mouse brain, spanning one hemisphere. This dataset was previously released by 10x Genomics on their website.
seqFISH_mouseEmbryo (seqFISH): A subset of cells (embryo 1, z-slice 2) from a previously published dataset investigating mouse embryogenesis by Lohoff and Ghazanfar et al. (2020), generated using the seqFISH platform. The full dataset is available online.
The following examples show how to load the example datasets as SpatialExperiment objects in an R session.
library(ExperimentHub)
# create ExperimentHub instance
eh <- ExperimentHub()
## snapshotDate(): 2021-05-18
# query STexampleData datasets
myfiles <- query(eh, "STexampleData")
myfiles
## ExperimentHub with 3 records
## # snapshotDate(): 2021-05-18
## # $dataprovider: NA
## # $species: Mus musculus, Homo sapiens
## # $rdataclass: SpatialExperiment
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH5443"]]' 
## 
##            title                   
##   EH5443 | Visium_humanDLPFC_3_13  
##   EH5444 | Visium_mouseCoronal_3_13
##   EH5445 | seqFISH_mouseEmbryo_3_13
# metadata
md <- as.data.frame(mcols(myfiles))
library(STexampleData)
library(SpatialExperiment)
# load object
spe <- Visium_humanDLPFC_3_13()
## see ?STexampleData and browseVignettes('STexampleData') for documentation
## loading from cache
# alternatively: using ExperimentHub query
# spe <- myfiles[[1]]
# alternatively: using ExperimentHub ID
# spe <- myfiles[["EH5443"]]
# check object
spe
## class: SpatialExperiment
## dim: 33538 4992 
## metadata(0):
## assays(1): counts
## rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
##   ENSG00000268674
## rowData names(3): gene_id gene_name feature_type
## colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
##   TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
## colData names(3): cell_count ground_truth sample_id
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialData names(6) : barcode_id in_tissue ... pxl_col_in_fullres
##   pxl_row_in_fullres
## spatialCoords names(2) : x y
## imgData names(4): sample_id image_id data scaleFactor
dim(spe)
## [1] 33538  4992
assayNames(spe)
## [1] "counts"
rowData(spe)
## DataFrame with 33538 rows and 3 columns
##                         gene_id   gene_name    feature_type
##                     <character> <character>     <character>
## ENSG00000243485 ENSG00000243485 MIR1302-2HG Gene Expression
## ENSG00000237613 ENSG00000237613     FAM138A Gene Expression
## ENSG00000186092 ENSG00000186092       OR4F5 Gene Expression
## ENSG00000238009 ENSG00000238009  AL627309.1 Gene Expression
## ENSG00000239945 ENSG00000239945  AL627309.3 Gene Expression
## ...                         ...         ...             ...
## ENSG00000277856 ENSG00000277856  AC233755.2 Gene Expression
## ENSG00000275063 ENSG00000275063  AC233755.1 Gene Expression
## ENSG00000271254 ENSG00000271254  AC240274.1 Gene Expression
## ENSG00000277475 ENSG00000277475  AC213203.1 Gene Expression
## ENSG00000268674 ENSG00000268674     FAM231C Gene Expression
colData(spe)
## DataFrame with 4992 rows and 9 columns
##                    cell_count ground_truth     sample_id         barcode_id
##                     <integer>     <factor>   <character>        <character>
## AAACAACGAATAGTTC-1         NA       NA     sample_151673 AAACAACGAATAGTTC-1
## AAACAAGTATCTCCCA-1          6       Layer3 sample_151673 AAACAAGTATCTCCCA-1
## AAACAATCTACTAGCA-1         16       Layer1 sample_151673 AAACAATCTACTAGCA-1
## AAACACCAATAACTGC-1          5       WM     sample_151673 AAACACCAATAACTGC-1
## AAACAGAGCGACTCCT-1          2       Layer3 sample_151673 AAACAGAGCGACTCCT-1
## ...                       ...          ...           ...                ...
## TTGTTTCACATCCAGG-1          3       WM     sample_151673 TTGTTTCACATCCAGG-1
## TTGTTTCATTAGTCTA-1          4       WM     sample_151673 TTGTTTCATTAGTCTA-1
## TTGTTTCCATACAACT-1          3       Layer6 sample_151673 TTGTTTCCATACAACT-1
## TTGTTTGTATTACACG-1         16       WM     sample_151673 TTGTTTGTATTACACG-1
## TTGTTTGTGTAAATTC-1          5       Layer2 sample_151673 TTGTTTGTGTAAATTC-1
##                    in_tissue array_row array_col pxl_col_in_fullres
##                    <integer> <integer> <integer>          <integer>
## AAACAACGAATAGTTC-1         0         0        16               2435
## AAACAAGTATCTCCCA-1         1        50       102               8468
## AAACAATCTACTAGCA-1         1         3        43               2807
## AAACACCAATAACTGC-1         1        59        19               9505
## AAACAGAGCGACTCCT-1         1        14        94               4151
## ...                      ...       ...       ...                ...
## TTGTTTCACATCCAGG-1         1        58        42               9396
## TTGTTTCATTAGTCTA-1         1        60        30               9630
## TTGTTTCCATACAACT-1         1        45        27               7831
## TTGTTTGTATTACACG-1         1        73        41              11193
## TTGTTTGTGTAAATTC-1         1         7        51               3291
##                    pxl_row_in_fullres
##                             <integer>
## AAACAACGAATAGTTC-1               3913
## AAACAAGTATCTCCCA-1               9791
## AAACAATCTACTAGCA-1               5769
## AAACACCAATAACTGC-1               4068
## AAACAGAGCGACTCCT-1               9271
## ...                               ...
## TTGTTTCACATCCAGG-1               5653
## TTGTTTCATTAGTCTA-1               4825
## TTGTTTCCATACAACT-1               4631
## TTGTTTGTATTACACG-1               5571
## TTGTTTGTGTAAATTC-1               6317
spatialData(spe)
## DataFrame with 4992 rows and 6 columns
##                            barcode_id in_tissue array_row array_col
##                           <character> <integer> <integer> <integer>
## AAACAACGAATAGTTC-1 AAACAACGAATAGTTC-1         0         0        16
## AAACAAGTATCTCCCA-1 AAACAAGTATCTCCCA-1         1        50       102
## AAACAATCTACTAGCA-1 AAACAATCTACTAGCA-1         1         3        43
## AAACACCAATAACTGC-1 AAACACCAATAACTGC-1         1        59        19
## AAACAGAGCGACTCCT-1 AAACAGAGCGACTCCT-1         1        14        94
## ...                               ...       ...       ...       ...
## TTGTTTCACATCCAGG-1 TTGTTTCACATCCAGG-1         1        58        42
## TTGTTTCATTAGTCTA-1 TTGTTTCATTAGTCTA-1         1        60        30
## TTGTTTCCATACAACT-1 TTGTTTCCATACAACT-1         1        45        27
## TTGTTTGTATTACACG-1 TTGTTTGTATTACACG-1         1        73        41
## TTGTTTGTGTAAATTC-1 TTGTTTGTGTAAATTC-1         1         7        51
##                    pxl_col_in_fullres pxl_row_in_fullres
##                             <integer>          <integer>
## AAACAACGAATAGTTC-1               2435               3913
## AAACAAGTATCTCCCA-1               8468               9791
## AAACAATCTACTAGCA-1               2807               5769
## AAACACCAATAACTGC-1               9505               4068
## AAACAGAGCGACTCCT-1               4151               9271
## ...                               ...                ...
## TTGTTTCACATCCAGG-1               9396               5653
## TTGTTTCATTAGTCTA-1               9630               4825
## TTGTTTCCATACAACT-1               7831               4631
## TTGTTTGTATTACACG-1              11193               5571
## TTGTTTGTGTAAATTC-1               3291               6317
head(spatialCoords(spe))
##                       x    y
## AAACAACGAATAGTTC-1 3913 2435
## AAACAAGTATCTCCCA-1 9791 8468
## AAACAATCTACTAGCA-1 5769 2807
## AAACACCAATAACTGC-1 4068 9505
## AAACAGAGCGACTCCT-1 9271 4151
## AAACAGCTTTCAGAAG-1 3393 7583
imgData(spe)
## DataFrame with 2 rows and 4 columns
##       sample_id    image_id   data scaleFactor
##     <character> <character> <list>   <numeric>
## 1 sample_151673      lowres   ####   0.0450045
## 2 sample_151673       hires   ####   0.1500150
# load object
spe <- Visium_mouseCoronal_3_13()
# alternatively: using ExperimentHub query
# spe <- myfiles[[2]]
# alternatively: using ExperimentHub ID
# spe <- myfiles[["EH5444"]]
# check object
spe
## class: SpatialExperiment
## dim: 32285 4992 
## metadata(0):
## assays(1): counts
## rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ...
##   ENSMUSG00000095019 ENSMUSG00000095041
## rowData names(3): gene_id gene_name feature_type
## colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
##   TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
## colData names(1): sample_id
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialData names(6) : barcode_id in_tissue ... pxl_col_in_fullres
##   pxl_row_in_fullres
## spatialCoords names(2) : x y
## imgData names(4): sample_id image_id data scaleFactor
# load object
spe <- seqFISH_mouseEmbryo_3_13()
# alternatively: using ExperimentHub query
# spe <- myfiles[[3]]
# alternatively: using ExperimentHub ID
# spe <- myfiles[["EH5445"]]
# check object
spe
## class: SpatialExperiment
## dim: 351 11026 
## metadata(0):
## assays(2): counts molecules
## rownames(351): Abcc4 Acp5 ... Zfp57 Zic3
## rowData names(1): gene_name
## colnames(11026): embryo1_Pos0_cell10_z2 embryo1_Pos0_cell100_z2 ...
##   embryo1_Pos28_cell97_z2 embryo1_Pos28_cell98_z2
## colData names(14): uniqueID embryo ... x_global_affine y_global_affine
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialData names(3) : cell_id x_global_affine y_global_affine
## spatialCoords names(2) : x y
## imgData names(0):
For reference, we include code scripts to generate the SpatialExperiment objects from the raw data files.
These scripts are saved in /inst/scripts/ in the source code of the STexampleData package. The scripts include references and links to the raw data files from the original sources for each dataset.
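To find these scripts in an installed copy of the package, a sketch along the following lines may help. It relies on the standard R behavior that the contents of inst/ in the source tree are installed at the top level of the package directory; the variable names are illustrative.

```r
# inst/scripts/ in the source tree is installed as scripts/ in the package
scripts_dir <- system.file("scripts", package = "STexampleData")

if (nzchar(scripts_dir)) {
    # list the script files shipped with the installed package
    script_files <- list.files(scripts_dir, full.names = TRUE)
    print(basename(script_files))
} else {
    # system.file() returns "" when the package or directory is not found
    message("STexampleData is not installed, or has no scripts directory")
}
```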
sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] BumpyMatrix_1.0.1           STexampleData_1.0.8        
##  [3] SpatialExperiment_1.2.1     SingleCellExperiment_1.14.1
##  [5] SummarizedExperiment_1.22.0 Biobase_2.52.0             
##  [7] GenomicRanges_1.44.0        GenomeInfoDb_1.28.1        
##  [9] IRanges_2.26.0              S4Vectors_0.30.0           
## [11] MatrixGenerics_1.4.2        matrixStats_0.60.0         
## [13] ExperimentHub_2.0.0         AnnotationHub_3.0.1        
## [15] BiocFileCache_2.0.0         dbplyr_2.1.1               
## [17] BiocGenerics_0.38.0         BiocStyle_2.20.2           
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7                  bit64_4.0.5                  
##  [3] filelock_1.0.2                httr_1.4.2                   
##  [5] tools_4.1.0                   bslib_0.2.5.1                
##  [7] utf8_1.2.2                    R6_2.5.0                     
##  [9] HDF5Array_1.20.0              DBI_1.1.1                    
## [11] rhdf5filters_1.4.0            withr_2.4.2                  
## [13] tidyselect_1.1.1              bit_4.0.4                    
## [15] curl_4.3.2                    compiler_4.1.0               
## [17] DelayedArray_0.18.0           bookdown_0.22                
## [19] sass_0.4.0                    rappdirs_0.3.3               
## [21] stringr_1.4.0                 digest_0.6.27                
## [23] rmarkdown_2.10                R.utils_2.10.1               
## [25] XVector_0.32.0                pkgconfig_2.0.3              
## [27] htmltools_0.5.1.1             sparseMatrixStats_1.4.2      
## [29] limma_3.48.2                  fastmap_1.1.0                
## [31] rlang_0.4.11                  RSQLite_2.2.7                
## [33] shiny_1.6.0                   DelayedMatrixStats_1.14.2    
## [35] jquerylib_0.1.4               generics_0.1.0               
## [37] jsonlite_1.7.2                BiocParallel_1.26.1          
## [39] dplyr_1.0.7                   R.oo_1.24.0                  
## [41] RCurl_1.98-1.3                magrittr_2.0.1               
## [43] scuttle_1.2.1                 GenomeInfoDbData_1.2.6       
## [45] Matrix_1.3-4                  Rcpp_1.0.7                   
## [47] Rhdf5lib_1.14.2               fansi_0.5.0                  
## [49] lifecycle_1.0.0               R.methodsS3_1.8.1            
## [51] edgeR_3.34.0                  stringi_1.7.3                
## [53] yaml_2.2.1                    zlibbioc_1.38.0              
## [55] rhdf5_2.36.0                  grid_4.1.0                   
## [57] blob_1.2.2                    promises_1.2.0.1             
## [59] dqrng_0.3.0                   crayon_1.4.1                 
## [61] lattice_0.20-44               Biostrings_2.60.2            
## [63] beachmat_2.8.0                KEGGREST_1.32.0              
## [65] magick_2.7.2                  locfit_1.5-9.4               
## [67] knitr_1.33                    pillar_1.6.2                 
## [69] rjson_0.2.20                  glue_1.4.2                   
## [71] BiocVersion_3.13.1            evaluate_0.14                
## [73] BiocManager_1.30.16           png_0.1-7                    
## [75] vctrs_0.3.8                   httpuv_1.6.1                 
## [77] purrr_0.3.4                   assertthat_0.2.1             
## [79] cachem_1.0.5                  xfun_0.25                    
## [81] DropletUtils_1.12.2           mime_0.11                    
## [83] xtable_1.8-4                  later_1.2.0                  
## [85] tibble_3.1.3                  AnnotationDbi_1.54.1         
## [87] memoise_2.0.0                 ellipsis_0.3.2               
## [89] interactiveDisplayBase_1.30.0