Chapter 10 Chimeric mouse embryo (10X Genomics)
10.1 Introduction
This chapter performs an analysis of the Pijuan-Sala et al. (2019) dataset on mouse gastrulation. Here, we examine chimeric embryos at the E8.5 stage of development, in which td-Tomato-positive embryonic stem cells (ESCs) were injected into a wild-type blastocyst.
10.2 Data loading
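We obtain the processed data from the MouseGastrulationData package listed in the session information. The following is a minimal sketch of the loading step. It assumes that the six samples labelled 5 to 10 in the later tables are requested through WTChimeraData(), and that the data can be downloaded through ExperimentHub on first use.

```r
# Sketch only: the sample selection (5:10) is inferred from the sample
# labels in the tables later in this chapter, not stated in this section.
library(MouseGastrulationData)

# Downloads and caches the data via ExperimentHub on first use.
sce.chimera <- WTChimeraData(samples=5:10)
sce.chimera
```

Printing the object gives a summary of the following form.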
## class: SingleCellExperiment
## dim: 29453 20935
## metadata(0):
## assays(1): counts
## rownames(29453): ENSMUSG00000051951 ENSMUSG00000089699 ...
## ENSMUSG00000095742 tomato-td
## rowData names(2): ENSEMBL SYMBOL
## colnames(20935): cell_9769 cell_9770 ... cell_30702 cell_30703
## colData names(11): cell barcode ... doub.density sizeFactor
## reducedDimNames(2): pca.corrected.E7.5 pca.corrected.E8.5
## mainExpName: NULL
## altExpNames(0):
10.3 Quality control
Quality control on the cells has already been performed by the authors, so we will not repeat it here. We additionally remove cells that are labelled as stripped nuclei or doublets.
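The following is a minimal sketch of this filtering step. It assumes that the authors' labels are stored in a celltype.mapped column of the colData and take the values "stripped" and "Doublet"; both are assumptions, as is the helper name dropFlaggedCells(). The toy object from scuttle::mockSCE() only stands in for sce.chimera, and its labels are made up. The sketch ends with logNormCounts(), since modelGeneVar() in the next section expects log-expression values and the object printed above only holds counts.

```r
library(scuttle)

# Hypothetical helper: remove cells whose label in 'field' is in 'flagged'.
dropFlaggedCells <- function(sce, field="celltype.mapped",
        flagged=c("stripped", "Doublet")) {
    drop <- colData(sce)[[field]] %in% flagged
    sce[, !drop]
}

# Toy stand-in for sce.chimera with made-up labels.
set.seed(100)
toy <- mockSCE(ncells=100, ngenes=500)
toy$celltype.mapped <- sample(c("Mesenchyme", "stripped", "Doublet"),
    ncol(toy), replace=TRUE)

filtered <- dropFlaggedCells(toy)
table(filtered$celltype.mapped)

# Log-normalization; size factors already stored in the object are used
# when present, otherwise library size factors are computed.
filtered <- logNormCounts(filtered)
assayNames(filtered)
```

On the real data the same two steps would be applied to sce.chimera directly, in which case the pre-computed sizeFactor column listed in the colData would be picked up by logNormCounts().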
10.5 Variance modelling
We retain all genes with any positive biological component, to preserve as much signal as possible across a very heterogeneous dataset.
library(scran)
# Model the per-gene variance within each sample, then combine across samples.
dec.chimera <- modelGeneVar(sce.chimera, block=sce.chimera$sample)
# Keep every gene whose biological component is positive.
chosen.hvgs <- dec.chimera$bio > 0
# Plot the mean-variance trend fitted within each sample.
par(mfrow=c(1,2))
blocked.stats <- dec.chimera$per.block
for (i in colnames(blocked.stats)) {
    current <- blocked.stats[[i]]
    plot(current$mean, current$total, main=i, pch=16, cex=0.5,
        xlab="Mean of log-expression", ylab="Variance of log-expression")
    curfit <- metadata(current)
    curve(curfit$trend(x), col='dodgerblue', add=TRUE, lwd=2)
}
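To make the selection rule concrete: modelGeneVar() splits each gene's total variance into a technical component (the fitted trend) and a biological component (the residual), so bio > 0 keeps every gene that lies above the trend. The sketch below illustrates this on simulated data from scuttle::mockSCE(), with a made-up two-level blocking factor standing in for sce.chimera$sample. The numbers it produces say nothing about the chimera dataset.

```r
library(scran)
library(scuttle)

# Simulated data; 'sample' is a made-up blocking factor.
set.seed(42)
toy <- mockSCE()
toy <- logNormCounts(toy)
toy$sample <- rep(c("A", "B"), length.out=ncol(toy))

dec <- modelGeneVar(toy, block=toy$sample)

# The biological component is the total variance minus the trend.
all.equal(dec$bio, dec$total - dec$tech)

# Keep every gene above the trend and count how many survive.
keep <- dec$bio > 0
sum(keep)
```

Compared with taking a fixed number of top genes, this threshold is permissive: it trades some extra noise for a lower risk of discarding genes that mark rare populations.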
10.6 Merging
We use a hierarchical merge to first merge together replicates with the same genotype, and then merge samples across different genotypes.
library(batchelor)
set.seed(01001001)
merged <- correctExperiments(sce.chimera,
    batch=sce.chimera$sample,
    subset.row=chosen.hvgs,
    PARAM=FastMnnParam(
        merge.order=list(
            list(1,3,5), # WT (3 replicates)
            list(2,4,6)  # td-Tomato (3 replicates)
        )
    )
)
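The integer indices in merge.order refer to batches rather than to sample labels. As far as the batchelor documentation describes it, when a single object is split by batch the batches are ordered by the levels of that factor, so index 1 is the first sorted sample label. The base R sketch below spells out that mapping for the six sample labels used in this chapter (5 to 10). It assumes this ordering rule and does not call fastMNN() itself.

```r
# Sample labels as they appear in the tables of this chapter.
samples <- 5:10
batch.levels <- levels(factor(samples))

# Same nested structure as the merge.order argument above.
merge.order <- list(
    list(1, 3, 5),
    list(2, 4, 6)
)

# Translate each group of indices into the sample labels it refers to.
groups <- lapply(merge.order, function(grp) batch.levels[unlist(grp)])
groups
```

Under that assumption the first group contains samples 5, 7 and 9 and the second contains samples 6, 8 and 10. Which group is wild-type and which is td-Tomato-positive is best confirmed against the sample metadata before relying on the comments in the call.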
We use the proportion of variance lost at each merge step as a diagnostic. Each row of the matrix below is a merge step and each column is a sample:
## 5 6 7 8 9 10
## [1,] 0.000e+00 0.0204238 0.000e+00 0.0169321 0.000000 0.000000
## [2,] 0.000e+00 0.0007403 0.000e+00 0.0004431 0.000000 0.015455
## [3,] 3.089e-02 0.0000000 2.012e-02 0.0000000 0.000000 0.000000
## [4,] 9.042e-05 0.0000000 8.298e-05 0.0000000 0.018044 0.000000
## [5,] 4.318e-03 0.0072489 4.123e-03 0.0078254 0.003827 0.007779
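In a workflow of this kind the matrix is typically retrieved with metadata(merged)$merge.info$lost.var; that call is an assumption here. The base R sketch below re-enters the printed values and reports the largest loss for each sample. The 10% cut-off is a commonly used rule of thumb rather than something stated in this chapter.

```r
# Values copied from the printed matrix (rows = merge steps, columns = samples).
lost.var <- rbind(
    c(0,         0.0204238, 0,         0.0169321, 0,        0),
    c(0,         0.0007403, 0,         0.0004431, 0,        0.015455),
    c(3.089e-02, 0,         2.012e-02, 0,         0,        0),
    c(9.042e-05, 0,         8.298e-05, 0,         0.018044, 0),
    c(4.318e-03, 0.0072489, 4.123e-03, 0.0078254, 0.003827, 0.007779)
)
colnames(lost.var) <- 5:10

# Largest proportion of variance lost by each sample in any single step.
worst <- apply(lost.var, 2, max)
worst

# Rule-of-thumb check (assumed threshold): flag samples losing more than 10%.
names(worst)[worst > 0.1]
```

No sample loses more than about 3% of its variance in any step, so by this criterion the correction is not removing a large amount of within-sample structure.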
10.7 Clustering
g <- buildSNNGraph(merged, use.dimred="corrected")
clusters <- igraph::cluster_louvain(g)
colLabels(merged) <- factor(clusters$membership)
We examine the distribution of cells across clusters and samples.
## Sample
## Cluster 5 6 7 8 9 10
## 1 87 20 62 53 151 73
## 2 146 37 132 110 231 215
## 3 96 16 162 124 364 272
## 4 127 97 185 433 362 457
## 5 103 41 284 362 145 195
## 6 207 52 344 208 556 646
## 7 153 73 86 89 166 383
## 8 130 97 110 63 159 311
## 9 82 20 75 33 165 203
## 10 97 19 36 18 50 35
## 11 114 43 43 38 39 154
## 12 122 64 62 51 63 139
## 13 150 75 128 99 134 391
## 14 110 69 73 96 127 255
## 15 99 54 195 408 255 682
## 16 42 34 81 80 85 355
## 17 180 47 225 182 212 384
## 18 74 38 179 106 318 458
## 19 50 27 94 62 98 158
## 20 39 41 50 49 130 127
## 21 1 5 0 84 0 66
## 22 17 7 13 17 20 37
## 23 50 24 76 63 75 179
## 24 9 7 18 13 30 27
## 25 11 16 20 9 47 57
## 26 2 1 7 3 75 137
## 27 0 2 0 51 0 5
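A table of this kind can be produced with table(Cluster=colLabels(merged), Sample=merged$sample), assuming the sample labels were carried over into merged$sample by correctExperiments(). Converting the rows to proportions makes sample-specific clusters easier to spot. The sketch below does this in base R for three rows copied from the table above (clusters 6, 21 and 27).

```r
# Counts copied from the cluster-by-sample table (clusters 6, 21 and 27).
tab <- rbind(
    "6"  = c(207, 52, 344, 208, 556, 646),
    "21" = c(1, 5, 0, 84, 0, 66),
    "27" = c(0, 2, 0, 51, 0, 5)
)
colnames(tab) <- 5:10

# Fraction of each cluster contributed by each sample (rows sum to 1).
prop <- prop.table(tab, margin=1)
round(prop, 2)
```

Clusters 21 and 27 draw almost all of their cells from samples 8 and 10, whereas cluster 6 is spread across all six samples.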
10.8 Dimensionality reduction
We use an external algorithm to compute nearest neighbors for greater speed.
library(scater)
merged <- runTSNE(merged, dimred="corrected", external_neighbors=TRUE)
merged <- runUMAP(merged, dimred="corrected", external_neighbors=TRUE)
gridExtra::grid.arrange(
    plotTSNE(merged, colour_by="label", text_by="label", text_colour="red"),
    plotTSNE(merged, colour_by="batch")
)
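With external_neighbors=TRUE, scater performs the nearest-neighbor search through BiocNeighbors and passes the result to the embedding algorithm, which also allows the search algorithm to be swapped. The sketch below shows this on simulated data from scuttle::mockSCE(), using an approximate Annoy search through the BNPARAM argument. The choice of AnnoyParam(), the number of principal components and the toy data are illustrative assumptions and are not part of this workflow.

```r
library(scater)          # also attaches scuttle
library(BiocNeighbors)

# Simulated data standing in for the merged object.
set.seed(1000)
toy <- mockSCE(ncells=200, ngenes=1000)
toy <- logNormCounts(toy)
toy <- runPCA(toy, ncomponents=10)

# The neighbor search is done by BiocNeighbors (here: approximate,
# Annoy-based) and handed to the t-SNE and UMAP implementations.
toy <- runTSNE(toy, dimred="PCA", external_neighbors=TRUE,
    BNPARAM=AnnoyParam())
toy <- runUMAP(toy, dimred="PCA", external_neighbors=TRUE,
    BNPARAM=AnnoyParam())

reducedDimNames(toy)
```

An approximate search trades a small loss of neighbor accuracy for speed, which usually has little visible effect on the embedding but matters more as the number of cells grows.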
Session Info
R Under development (unstable) (2024-10-21 r87258)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.1 LTS
Matrix products: default
BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] batchelor_1.23.0 scran_1.35.0
[3] scater_1.35.0 ggplot2_3.5.1
[5] scuttle_1.17.0 MouseGastrulationData_1.21.0
[7] SpatialExperiment_1.17.0 SingleCellExperiment_1.29.1
[9] SummarizedExperiment_1.37.0 Biobase_2.67.0
[11] GenomicRanges_1.59.1 GenomeInfoDb_1.43.2
[13] IRanges_2.41.2 S4Vectors_0.45.2
[15] BiocGenerics_0.53.3 generics_0.1.3
[17] MatrixGenerics_1.19.1 matrixStats_1.5.0
[19] BiocStyle_2.35.0 rebook_1.17.0
loaded via a namespace (and not attached):
[1] jsonlite_1.8.9 CodeDepends_0.6.6
[3] magrittr_2.0.3 ggbeeswarm_0.7.2
[5] magick_2.8.5 farver_2.1.2
[7] rmarkdown_2.29 vctrs_0.6.5
[9] memoise_2.0.1 DelayedMatrixStats_1.29.1
[11] htmltools_0.5.8.1 S4Arrays_1.7.1
[13] AnnotationHub_3.15.0 curl_6.1.0
[15] BiocNeighbors_2.1.2 SparseArray_1.7.4
[17] sass_0.4.9 bslib_0.8.0
[19] cachem_1.1.0 ResidualMatrix_1.17.0
[21] igraph_2.1.3 mime_0.12
[23] lifecycle_1.0.4 pkgconfig_2.0.3
[25] rsvd_1.0.5 Matrix_1.7-1
[27] R6_2.5.1 fastmap_1.2.0
[29] GenomeInfoDbData_1.2.13 digest_0.6.37
[31] colorspace_2.1-1 AnnotationDbi_1.69.0
[33] dqrng_0.4.1 irlba_2.3.5.1
[35] ExperimentHub_2.15.0 RSQLite_2.3.9
[37] beachmat_2.23.6 filelock_1.0.3
[39] labeling_0.4.3 httr_1.4.7
[41] abind_1.4-8 compiler_4.5.0
[43] bit64_4.6.0-1 withr_3.0.2
[45] BiocParallel_1.41.0 viridis_0.6.5
[47] DBI_1.2.3 rappdirs_0.3.3
[49] DelayedArray_0.33.4 rjson_0.2.23
[51] bluster_1.17.0 tools_4.5.0
[53] vipor_0.4.7 beeswarm_0.4.0
[55] glue_1.8.0 grid_4.5.0
[57] Rtsne_0.17 cluster_2.1.8
[59] gtable_0.3.6 BiocSingular_1.23.0
[61] ScaledMatrix_1.15.0 metapod_1.15.0
[63] XVector_0.47.2 ggrepel_0.9.6
[65] BiocVersion_3.21.1 pillar_1.10.1
[67] limma_3.63.3 BumpyMatrix_1.15.0
[69] dplyr_1.1.4 BiocFileCache_2.15.1
[71] lattice_0.22-6 bit_4.5.0.1
[73] tidyselect_1.2.1 locfit_1.5-9.10
[75] Biostrings_2.75.3 knitr_1.49
[77] gridExtra_2.3 bookdown_0.42
[79] edgeR_4.5.1 xfun_0.50
[81] statmod_1.5.0 UCSC.utils_1.3.1
[83] yaml_2.3.10 evaluate_1.0.3
[85] codetools_0.2-20 tibble_3.2.1
[87] BiocManager_1.30.25 graph_1.85.1
[89] cli_3.6.3 uwot_0.2.2
[91] munsell_0.5.1 jquerylib_0.1.4
[93] Rcpp_1.0.14 dir.expiry_1.15.0
[95] dbplyr_2.5.0 png_0.1-8
[97] XML_3.99-0.18 parallel_4.5.0
[99] blob_1.2.4 sparseMatrixStats_1.19.0
[101] viridisLite_0.4.2 scales_1.3.0
[103] purrr_1.0.2 crayon_1.5.3
[105] rlang_1.1.5 cowplot_1.1.3
[107] KEGGREST_1.47.0