scRNAseq 2.6.1
The scRNAseq package provides convenient access to several publicly available datasets in the form of SingleCellExperiment objects.
The focus of this package is to capture datasets that are not easily read into R with a one-liner from, e.g., read.csv().
Instead, we do the necessary data munging so that users only need to call a single function to obtain a well-formed SingleCellExperiment.
For example:
library(scRNAseq)
fluidigm <- ReprocessedFluidigmData()
fluidigm
## class: SingleCellExperiment
## dim: 26255 130
## metadata(3): sample_info clusters which_qc
## assays(4): tophat_counts cufflinks_fpkm rsem_counts rsem_tpm
## rownames(26255): A1BG A1BG-AS1 ... ZZEF1 ZZZ3
## rowData names(0):
## colnames(130): SRR1275356 SRR1274090 ... SRR1275366 SRR1275261
## colData names(28): NREADS NALIGNED ... Cluster1 Cluster2
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
Readers are referred to the SummarizedExperiment and SingleCellExperiment documentation for further information on how to work with SingleCellExperiment objects.
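As a minimal sketch of what can be done with the returned object, the standard accessors from SummarizedExperiment can pull out individual assays and per-cell annotation. The assay name (tophat_counts) and column name (Cluster1) are taken from the printed summary above; nothing else is assumed about the contents.

```r
library(scRNAseq)

fluidigm <- ReprocessedFluidigmData()

# List the available assays (four are reported in the summary above).
assayNames(fluidigm)

# Extract one assay as a gene-by-cell matrix and inspect a small corner.
counts <- assay(fluidigm, "tophat_counts")
dim(counts)
counts[1:3, 1:3]

# Per-cell annotation lives in colData(); tabulate one of its columns.
table(colData(fluidigm)$Cluster1)
```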
The listDatasets() function returns all available datasets in scRNAseq, along with some summary statistics and the necessary R command to load them.
out <- listDatasets()
Reference | Taxonomy | Part | Number | Call |
---|---|---|---|---|
Aztekin et al. (2019) | Xenopus | tail | 13199 | AztekinTailData() |
Bach et al. (2017) | Mouse | mammary gland | 25806 | BachMammaryData() |
Bacher et al. (2020) | Human | T cells | 104417 | BacherTCellData() |
Baron et al. (2016) | Human | pancreas | 8569 | BaronPancreasData('human') |
Baron et al. (2016) | Mouse | pancreas | 1886 | BaronPancreasData('mouse') |
Bhaduri et al. (2020) | Human | cortical organoids | 242349 | BhaduriOrganoidData() |
Buettner et al. (2015) | Mouse | embryonic stem cells | 288 | BuettnerESCData() |
(???) | Human | haematopoietic stem and progenitor | 5183 | BunisHSPCData() |
Campbell et al. (2017) | Mouse | brain | 21086 | CampbellBrainData() |
Chen et al. (2017) | Mouse | brain | 14437 | ChenBrainData() |
Darmanis et al. (2015) | Human | brain | 466 | DarmanisBrainData() |
Ernst et al. (2019) | Mouse | testis | 68937 | ErnstSpermatogenesisData() |
Fletcher et al. (2017) | Mouse | olfactory epithelium | 616 | FletcherOlfactoryData() |
Grun et al. (2016) | Mouse | haematopoietic stem cells | 1915 | GrunHSCData() |
Grun et al. (2016) | Human | pancreas | 1728 | GrunPancreasData() |
Giladi et al. (2018) | Mouse | haematopoietic stem cells | 81024 | GiladiHSCData(mode='rna') |
He et al. (2020) | Human | various organs | 84363 | HeOrganAtlasData() |
Hermann et al. (2018) | Mouse | spermatogenic cells | 2325 | HermannSpermatogenesisData() |
Hu et al. (2017) | Mouse | cortex | 48000 | HuCortexData() |
Kolodziejczyk et al. (2015) | Mouse | embryonic stem cells | 704 | KolodziejczykESCData() |
Jessa et al. (2019) | Mouse | brain | 61595 | JessaBrainData() |
La Manno et al. (2016) | Human | embryonic stem cells | 1715 | LaMannoBrainData('human-es') |
La Manno et al. (2016) | Human | embryonic midbrain | 1977 | LaMannoBrainData('human-embryo') |
La Manno et al. (2016) | Human | induced pluripotent stem cells | 337 | LaMannoBrainData('human-ips') |
La Manno et al. (2016) | Mouse | adult dopaminergic neurons | 243 | LaMannoBrainData('mouse-adult') |
La Manno et al. (2016) | Mouse | embryonic midbrain | 1907 | LaMannoBrainData('mouse-embryo') |
Lawlor et al. (2017) | Human | pancreas | 638 | LawlorPancreasData() |
Leng et al. (2015) | Human | embryonic stem cells | 460 | LengESCData() |
Lun et al. (2017) | Mouse | 416B cells | 192 | LunSpikeInData('416b') |
Lun et al. (2017) | Mouse | trophoblasts | 192 | LunSpikeInData('tropho') |
Macosko et al. (2015) | Mouse | retina | 49300 | MacoskoRetinaData() |
Mahata et al. (2014) | Mouse | T helper cells | 96 | ReprocessedTh2Data() |
Mair et al. (2020) | Human | peripheral blood mononuclear cells | 29033 | MairPBMCData() |
Kotliarov et al. (2020) | Human | peripheral blood mononuclear cells | 58654 | KotliarovPBMCData() |
Marques et al. (2016) | Mouse | brain | 5069 | MarquesBrainData() |
Messmer et al. (2019) | Human | embryonic stem cells | 1344 | MessmerESCData() |
Muraro et al. (2016) | Human | pancreas | 3072 | MuraroPancreasData() |
Nestorowa et al. (2016) | Mouse | haematopoietic stem cells | 1920 | NestorowaHSCData() |
Nowakowski et al. (2017) | Human | cortex | 4261 | NowakowskiCortexData() |
Paul et al. (2015) | Mouse | haematopoietic stem cells | 10368 | PaulHSCData() |
Pollen et al. (2014) | Human | cortex | 65 | ReprocessedFluidigmData() |
Pollen et al. (2015) | Human | outer radial glia | 367 | PollenGliaData() |
Richard et al. (2018) | Mouse | CD8+ T cells | 572 | RichardTCellData() |
Romanov et al. (2017) | Mouse | brain | 2881 | RomanovBrainData() |
Segerstolpe et al. (2016) | Human | pancreas | 3514 | SegerstolpePancreasData() |
Shekhar et al. (2016) | Mouse | retina | 44994 | ShekharRetinaData() |
Stoeckius et al. (2018) | Mouse | peripheral blood mononuclear cells | 50000 | StoeckiusHashingData(mode='mouse') |
Stoeckius et al. (2018) | Human | peripheral blood mononuclear cells | 50000 | StoeckiusHashingData(mode='human') |
Stoeckius et al. (2018) | Human | HEK, THP1, K562, KG1 cells | 30000 | StoeckiusHashingData(type='mixed') |
Usoskin et al. (2015) | Mouse | brain | 864 | UsoskinBrainData() |
Tasic et al. (2016) | Mouse | brain | 1809 | TasicBrainData() |
Tasic et al. (2016) | Mouse | visual cortex | 379 | ReprocessedAllenData() |
Wu et al. (2019) | Mouse | kidney | 17542 | WuKidneyData() |
Xin et al. (2016) | Human | pancreas | 1600 | XinPancreasData() |
Zeisel et al. (2015) | Mouse | brain | 3005 | ZeiselBrainData() |
Zeisel et al. (2018) | Mouse | nervous system | 160796 | ZeiselNervousData() |
Zhao et al. (2020) | Human | liver immune cells | 68100 | ZhaoImmuneLiverData() |
Zhong et al. (2018) | Human | prefrontal cortex | 2394 | ZhongPrefrontalData() |
Zilionis et al. (2019) | Human | lung | 173954 | ZilionisLungData() |
Zilionis et al. (2019) | Mouse | lung | 17549 | ZilionisLungData('mouse') |
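The object returned by listDatasets() can be subsetted like any other data frame, which is handy for narrowing the listing down to a tissue of interest. The sketch below assumes only the column names shown in the table header above (Part, Number, Call, Reference). It deliberately avoids the Taxonomy column, since its raw encoding in the returned object is not shown here and may differ from the species names in the rendered table.

```r
library(scRNAseq)

out <- listDatasets()

# Keep only the datasets whose 'Part' field mentions the pancreas.
pancreas <- out[grepl("pancreas", out$Part), ]

# Sort from the largest to the smallest dataset by number of cells.
pancreas <- pancreas[order(pancreas$Number, decreasing = TRUE), ]

# The 'Call' column holds the command that loads each dataset.
pancreas[, c("Reference", "Number", "Call")]
```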
If the original dataset was not provided with Ensembl annotation, we can map the identifiers with ensembl=TRUE.
Any genes without a corresponding Ensembl identifier are discarded from the dataset.
sce <- ZeiselBrainData(ensembl=TRUE)
head(rownames(sce))
## [1] "ENSMUSG00000029669" "ENSMUSG00000046982" "ENSMUSG00000039735"
## [4] "ENSMUSG00000033453" "ENSMUSG00000046798" "ENSMUSG00000034009"
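To get a feel for how many genes are lost in the conversion, one option is to load the same dataset with and without ensembl=TRUE and compare the number of rows. This is only a sketch; it downloads the dataset twice and makes no claim about what the difference will be.

```r
library(scRNAseq)

# Same dataset with the original identifiers and with Ensembl identifiers.
sce.symbol <- ZeiselBrainData()
sce.ensembl <- ZeiselBrainData(ensembl = TRUE)

# Number of rows dropped during the conversion.
nrow(sce.symbol) - nrow(sce.ensembl)
```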
Functions also have a location=TRUE argument that loads in the gene coordinates.
sce <- ZeiselBrainData(ensembl=TRUE, location=TRUE)
head(rowRanges(sce))
## GRanges object with 6 ranges and 2 metadata columns:
## seqnames ranges strand | featureType
## <Rle> <IRanges> <Rle> | <character>
## ENSMUSG00000029669 6 21771395-21852515 - | endogenous
## ENSMUSG00000046982 18 84011627-84087706 - | endogenous
## ENSMUSG00000039735 3 122538719-122619715 - | endogenous
## ENSMUSG00000033453 9 30899155-30922452 - | endogenous
## ENSMUSG00000046798 5 5489537-5514958 - | endogenous
## ENSMUSG00000034009 3 79641611-79737880 - | endogenous
## originalName
## <character>
## ENSMUSG00000029669 Tspan12
## ENSMUSG00000046982 Tshz1
## ENSMUSG00000039735 Fnbp1l
## ENSMUSG00000033453 Adamts15
## ENSMUSG00000046798 Cldn12
## ENSMUSG00000034009 Rxfp1
## -------
## seqinfo: 118 sequences from GRCm38 genome
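Once the coordinates are available, they can be used to subset the object by genomic location. The following sketch keeps the genes on one chromosome. It assumes that rowRanges() returns a GRanges as in the output above, and that chromosomes are named without a "chr" prefix (e.g., "6"), as shown there.

```r
library(scRNAseq)
library(GenomicRanges)

sce <- ZeiselBrainData(ensembl = TRUE, location = TRUE)

# Chromosome of each gene, taken from the row ranges.
chr <- as.character(seqnames(rowRanges(sce)))

# Keep the genes annotated to chromosome 6 (named "6" in the output above).
chr6 <- sce[which(chr == "6"), ]

# Metadata columns of the row ranges are also accessible via rowData().
head(rowData(chr6)$originalName)
```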
Please contact us if you have a data set that you would like to see added to this package. The only requirement is that your data set has publicly available expression values (ideally counts) and sample annotation. The more difficult/custom the format, the better, as its inclusion in this package will provide more value for other users in the R/Bioconductor community.
If you have already written code that processes your desired data set in a SingleCellExperiment-like form, we would welcome a pull request here.
The process can be expedited by ensuring that you have the following files:
- inst/scripts/make-X-Y-data.Rmd, an Rmarkdown report that creates all components of a SingleCellExperiment. X should be the last name of the first author of the relevant study while Y should be the name of the biological system.
- inst/scripts/make-X-Y-metadata.R, an R script that creates a metadata CSV file at inst/extdata/metadata-X-Y.csv. Metadata files should follow the format described in the ExperimentHub documentation.
- R/XYData.R, an R source file that defines a function XYData() to download the components from ExperimentHub and create a SingleCellExperiment object.

Potential contributors are recommended to examine some of the existing scripts in the package to pick up the coding conventions. Remember, we’re more likely to accept a contribution if it’s indistinguishable from something we might have written ourselves!
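For orientation, a metadata script might look roughly like the sketch below. Everything in it is hypothetical: the study ("Doe", "liver"), the version, the URL and the maintainer are placeholders, and the column names reflect our understanding of the ExperimentHub metadata format, so they should be checked against the ExperimentHub documentation and the existing scripts before use. The file is written to a temporary directory here so that the example runs anywhere; in the package it would go to inst/extdata.

```r
# Hypothetical study: first author "Doe" (X), biological system "liver" (Y).
# All values below are placeholders, not a real dataset.
components <- c("counts", "rowData", "colData")

meta <- data.frame(
    Title = sprintf("Doe liver %s", components),
    Description = sprintf("%s for the Doe liver single-cell RNA-seq dataset",
        components),
    RDataPath = file.path("scRNAseq", "doe-liver", "1.0.0",
        sprintf("%s.rds", components)),
    BiocVersion = "3.13",                        # placeholder release
    Genome = NA,
    SourceType = "TXT",
    SourceUrl = "https://example.org/doe-liver", # placeholder URL
    SourceVersion = "placeholder",
    Species = "Homo sapiens",
    TaxonomyId = "9606",
    Coordinate_1_based = NA,
    DataProvider = "placeholder",
    Maintainer = "Jane Doe <jane.doe@example.org>",
    RDataClass = c("dgCMatrix", "DFrame", "DFrame"), # placeholder classes
    DispatchClass = "Rds",
    stringsAsFactors = FALSE
)

# In the package, the target would be inst/extdata/metadata-doe-liver.csv.
out.dir <- tempdir()
out.file <- file.path(out.dir, "metadata-doe-liver.csv")
write.csv(meta, file = out.file, row.names = FALSE)
```

One row per stored component keeps the metadata aligned with the files that the corresponding loading function would later retrieve.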
Aztekin, C., T. W. Hiscock, J. C. Marioni, J. B. Gurdon, B. D. Simons, and J. Jullien. 2019. “Identification of a regeneration-organizing cell in the Xenopus tail.” Science 364 (6441): 653–58.
Bach, K., S. Pensa, M. Grzelak, J. Hadfield, D. J. Adams, J. C. Marioni, and W. T. Khaled. 2017. “Differentiation dynamics of mammary epithelial cells revealed by single-cell RNA sequencing.” Nat Commun. 8 (1): 2128.
Bacher, P., E. Rosati, D. Esser, G. R. Martini, C. Saggau, E. Schiminsky, J. Dargvainiene, et al. 2020. “Low-Avidity CD4+ T Cell Responses to SARS-CoV-2 in Unexposed Individuals and Humans with Severe COVID-19.” Immunity 53 (6): 1258–71.
Baron, M., A. Veres, S. L. Wolock, A. L. Faust, R. Gaujoux, A. Vetere, J. H. Ryu, et al. 2016. “A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure.” Cell Syst 3 (4): 346–60.
Bhaduri, A., M. G. Andrews, W. Mancia Leon, D. Jung, D. Shin, D. Allen, D. Jung, et al. 2020. “Cell stress in cortical organoids impairs molecular subtype specification.” Nature 578 (7793): 142–48.
Buettner, F., K. N. Natarajan, F. P. Casale, V. Proserpio, A. Scialdone, F. J. Theis, S. A. Teichmann, J. C. Marioni, and O. Stegle. 2015. “Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.” Nat. Biotechnol. 33 (2): 155–60.
Campbell, J. N., E. Z. Macosko, H. Fenselau, T. H. Pers, A. Lyubetskaya, D. Tenen, M. Goldman, et al. 2017. “A molecular census of arcuate hypothalamus and median eminence cell types.” Nat. Neurosci. 20 (3): 484–96.
Chen, R., X. Wu, L. Jiang, and Y. Zhang. 2017. “Single-Cell RNA-Seq Reveals Hypothalamic Cell Diversity.” Cell Rep 18 (13): 3227–41.
Darmanis, S., S. A. Sloan, Y. Zhang, M. Enge, C. Caneda, L. M. Shuer, M. G. Hayden Gephart, B. A. Barres, and S. R. Quake. 2015. “A survey of human brain transcriptome diversity at the single cell level.” Proc Natl Acad Sci U S A 112 (23): 7285–90.
Ernst, C., N. Eling, C. P. Martinez-Jimenez, J. C. Marioni, and D. T. Odom. 2019. “Staged developmental mapping and X chromosome transcriptional dynamics during mouse spermatogenesis.” Nat Commun 10 (1): 1251.
Fletcher, Russell B, Diya Das, Levi Gadye, Kelly N Street, Ariane Baudhuin, Allon Wagner, Michael B Cole, et al. 2017. “Deconstructing Olfactory Stem Cell Trajectories at Single-Cell Resolution.” Cell Stem Cell 20 (6): 817–30.
Giladi, A., F. Paul, Y. Herzog, Y. Lubling, A. Weiner, I. Yofe, D. Jaitin, et al. 2018. “Single-cell characterization of haematopoietic progenitors and their trajectories in homeostasis and perturbed haematopoiesis.” Nat Cell Biol 20 (7): 836–46.
Grun, D., M. J. Muraro, J. C. Boisset, K. Wiebrands, A. Lyubimova, G. Dharmadhikari, M. van den Born, et al. 2016. “De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data.” Cell Stem Cell 19 (2): 266–77.
He, S., L. H. Wang, Y. Liu, Y. Q. Li, H. T. Chen, J. H. Xu, W. Peng, et al. 2020. “Single-cell transcriptome profiling of an adult human cell atlas of 15 major organs.” Genome Biol 21 (1): 294.
Hermann, B.P., K. Cheng, A. Singh, L. Roa-De La Cruz, K.N. Mutoji, I-C. Chen, H. Gildersleeve, et al. 2018. “The Mammalian Spermatogenesis Single-Cell Transcriptome, from Spermatogonial Stem Cells to Spermatids.” Cell Rep. 25 (6): 1650–1667.e8.
Hu, P., E. Fabyanic, D. Y. Kwon, S. Tang, Z. Zhou, and H. Wu. 2017. “Dissecting Cell-Type Composition and Activity-Dependent Transcriptional State in Mammalian Brains by Massively Parallel Single-Nucleus RNA-Seq.” Mol. Cell 68 (5): 1006–15.
Jessa, S., A. Blanchet-Cohen, B. Krug, M. Vladoiu, M. Coutelier, D. Faury, B. Poreau, et al. 2019. “Stalled developmental programs at the root of pediatric brain tumors.” Nat Genet 51 (12): 1702–13.
Kolodziejczyk, A. A., J. K. Kim, J. C. Tsang, T. Ilicic, J. Henriksson, K. N. Natarajan, A. C. Tuck, et al. 2015. “Single cell RNA-Sequencing of pluripotent states unlocks modular transcriptional variation.” Cell Stem Cell 17 (4): 471–85.
Kotliarov, Y., R. Sparks, A. Martins, M. Mulè, Y. Lu, M. Goswami, L. Kardava, et al. 2020. “Broad Immune Activation Underlies Shared Set Point Signatures for Vaccine Responsiveness in Healthy Individuals and Disease Activity in Patients with Lupus.” Nat. Med. 26 (4): 618–29.
La Manno, G., D. Gyllborg, S. Codeluppi, K. Nishimura, C. Salto, A. Zeisel, L. E. Borm, et al. 2016. “Molecular Diversity of Midbrain Development in Mouse, Human, and Stem Cells.” Cell 167 (2): 566–80.
Lawlor, N., J. George, M. Bolisetty, R. Kursawe, L. Sun, V. Sivakamasundari, I. Kycia, P. Robson, and M. L. Stitzel. 2017. “Single-cell transcriptomes identify human islet cell signatures and reveal cell-type-specific expression changes in type 2 diabetes.” Genome Res. 27 (2): 208–22.
Leng, N., L. F. Chu, C. Barry, Y. Li, J. Choi, X. Li, P. Jiang, R. M. Stewart, J. A. Thomson, and C. Kendziorski. 2015. “Oscope identifies oscillatory genes in unsynchronized single-cell RNA-seq experiments.” Nat. Methods 12 (10): 947–50.
Lun, A. T. L., F. J. Calero-Nieto, L. Haim-Vilmovsky, B. Gottgens, and J. C. Marioni. 2017. “Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data.” Genome Res. 27 (11): 1795–1806.
Macosko, E. Z., A. Basu, R. Satija, J. Nemesh, K. Shekhar, M. Goldman, I. Tirosh, et al. 2015. “Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets.” Cell 161 (5): 1202–14.
Mahata, B., X. Zhang, A. A. Kolodziejczyk, V. Proserpio, L. Haim-Vilmovsky, A. E. Taylor, D. Hebenstreit, et al. 2014. “Single-cell RNA sequencing reveals T helper cells synthesizing steroids de novo to contribute to immune homeostasis.” Cell Rep 7 (4): 1130–42.
Mair, F., J. R. Erickson, V. Voillet, Y. Simoni, T. Bi, A. J. Tyznik, J. Martin, R. Gottardo, E. W. Newell, and M. Prlic. 2020. “A Targeted Multi-omic Analysis Approach Measures Protein Expression and Low-Abundance Transcripts on the Single-Cell Level.” Cell Rep 31 (1): 107499.
Marques, S., A. Zeisel, S. Codeluppi, D. van Bruggen, A. Mendanha Falcao, L. Xiao, H. Li, et al. 2016. “Oligodendrocyte heterogeneity in the mouse juvenile and adult central nervous system.” Science 352 (6291): 1326–9.
Messmer, T., F. von Meyenn, A. Savino, F. Santos, H. Mohammed, A. T. L. Lun, J. C. Marioni, and W. Reik. 2019. “Transcriptional heterogeneity in naive and primed human pluripotent stem cells at single-cell resolution.” Cell Rep. 26 (4): 815–24.
Muraro, M. J., G. Dharmadhikari, D. Grun, N. Groen, T. Dielen, E. Jansen, L. van Gurp, et al. 2016. “A Single-Cell Transcriptome Atlas of the Human Pancreas.” Cell Syst 3 (4): 385–94.
Nestorowa, S., F. K. Hamey, B. Pijuan Sala, E. Diamanti, M. Shepherd, E. Laurenti, N. K. Wilson, D. G. Kent, and B. Gottgens. 2016. “A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation.” Blood 128 (8): 20–31.
Nowakowski, T. J., A. Bhaduri, A. A. Pollen, B. Alvarado, M. A. Mostajo-Radji, E. Di Lullo, M. Haeussler, et al. 2017. “Spatiotemporal gene expression trajectories reveal developmental hierarchies of the human cortex.” Science 358 (6368): 1318–23.
Paul, F., Y. Arkin, A. Giladi, D. A. Jaitin, E. Kenigsberg, H. Keren-Shaul, D. Winter, et al. 2015. “Transcriptional Heterogeneity and Lineage Commitment in Myeloid Progenitors.” Cell 163 (7): 1663–77.
Pollen, A. A., T. J. Nowakowski, J. Chen, H. Retallack, C. Sandoval-Espinosa, C. R. Nicholas, J. Shuga, et al. 2015. “Molecular identity of human outer radial glia during cortical development.” Cell 163 (1): 55–67.
Pollen, A. A., T. J. Nowakowski, J. Shuga, X. Wang, A. A. Leyrat, J. H. Lui, N. Li, et al. 2014. “Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex.” Nat. Biotechnol. 32 (10): 1053–8.
Richard, A. C., A. T. L. Lun, W. W. Y. Lau, B. Gottgens, J. C. Marioni, and G. M. Griffiths. 2018. “T cell cytolytic capacity is independent of initial stimulation strength.” Nat. Immunol. 19 (8): 849–58.
Romanov, R. A., A. Zeisel, J. Bakker, F. Girach, A. Hellysaz, R. Tomer, A. Alpar, et al. 2017. “Molecular interrogation of hypothalamic organization reveals distinct dopamine neuronal subtypes.” Nat. Neurosci. 20 (2): 176–88.
Segerstolpe, A., A. Palasantza, P. Eliasson, E. M. Andersson, A. C. Andreasson, X. Sun, S. Picelli, et al. 2016. “Single-Cell Transcriptome Profiling of Human Pancreatic Islets in Health and Type 2 Diabetes.” Cell Metab. 24 (4): 593–607.
Shekhar, K., S. W. Lapan, I. E. Whitney, N. M. Tran, E. Z. Macosko, M. Kowalczyk, X. Adiconis, et al. 2016. “Comprehensive Classification of Retinal Bipolar Neurons by Single-Cell Transcriptomics.” Cell 166 (5): 1308–23.
Stoeckius, M., S. Zheng, B. Houck-Loomis, S. Hao, B. Z. Yeung, W. M. Mauck, P. Smibert, and R. Satija. 2018. “Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics.” Genome Biol. 19 (1): 224.
Tasic, B., V. Menon, T. N. Nguyen, T. K. Kim, T. Jarsky, Z. Yao, B. Levi, et al. 2016. “Adult mouse cortical cell taxonomy revealed by single cell transcriptomics.” Nat. Neurosci. 19 (2): 335–46.
Usoskin, D., A. Furlan, S. Islam, H. Abdo, P. Lonnerberg, D. Lou, J. Hjerling-Leffler, et al. 2015. “Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing.” Nat. Neurosci. 18 (1): 145–53.
Wu, H., Y. Kirita, E. L. Donnelly, and B. D. Humphreys. 2019. “Advantages of Single-Nucleus over Single-Cell RNA Sequencing of Adult Kidney: Rare Cell Types and Novel Cell States Revealed in Fibrosis.” J. Am. Soc. Nephrol. 30 (1): 23–32.
Xin, Y., J. Kim, H. Okamoto, M. Ni, Y. Wei, C. Adler, A. J. Murphy, G. D. Yancopoulos, C. Lin, and J. Gromada. 2016. “RNA Sequencing of Single Human Islet Cells Reveals Type 2 Diabetes Genes.” Cell Metab. 24 (4): 608–15.
Zeisel, A., H. Hochgerner, P. Lönnerberg, A. Johnsson, F. Memic, J. van der Zwan, M. Häring, et al. 2018. “Molecular Architecture of the Mouse Nervous System.” Cell 174 (4): 999–1014.
Zeisel, A., A. B. Munoz-Manchado, S. Codeluppi, P. Lonnerberg, G. La Manno, A. Jureus, S. Marques, et al. 2015. “Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq.” Science 347 (6226): 1138–42.
Zhao, J., S. Zhang, Y. Liu, X. He, M. Qu, G. Xu, H. Wang, et al. 2020. “Single-cell RNA sequencing reveals the heterogeneity of liver-resident immune cells in human.” Cell Discov 6: 22.
Zhong, S., S. Zhang, X. Fan, Q. Wu, L. Yan, J. Dong, H. Zhang, et al. 2018. “A single-cell RNA-seq survey of the developmental landscape of the human prefrontal cortex.” Nature 555 (7697): 524–28.
Zilionis, R., C. Engblom, C. Pfirschke, V. Savova, D. Zemmour, H. D. Saatcioglu, I. Krishnan, et al. 2019. “Single-Cell Transcriptomics of Human and Mouse Lung Cancers Reveals Conserved Myeloid Populations across Individuals and Species.” Immunity 50 (5): 1317–34.