--- title: Converting single-cell data structures between Bioconductor and Python author: - name: Luke Zappia email: luke@lazappi.id.au - name: Aaron Lun email: infinite.monkeys.with.keyboards@gmail.com date: "Revised: 17 April 2022" output: BiocStyle::html_document: toc_float: true package: zellkonverter vignette: > %\VignetteIndexEntry{Converting to/from AnnData to SingleCellExperiments} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, echo = FALSE, results = "hide", message = FALSE} require(knitr) library(BiocStyle) opts_chunk$set(error = FALSE, message = FALSE, warning = FALSE) ``` Overview ======== This package provides a lightweight interface between the Bioconductor `SingleCellExperiment` data structure and the Python `AnnData`-based single-cell analysis environment. The idea is to enable users and developers to easily move data between these frameworks to construct a multi-language analysis pipeline across R/Bioconductor and Python. Reading and writing H5AD files ============================== The `readH5AD()` function can be used to read a `SingleCellExperiment` from a H5AD file. This can be manipulated in the usual way as described in the `r Biocpkg("SingleCellExperiment")` documentation. ```{r read} library(zellkonverter) # Obtaining an example H5AD file. example_h5ad <- system.file( "extdata", "krumsiek11.h5ad", package = "zellkonverter" ) readH5AD(example_h5ad) ``` We can also write a `SingleCellExperiment` to a H5AD file with the `writeH5AD()` function. This is demonstrated below on the classic Zeisel mouse brain dataset from the `r Biocpkg("scRNAseq")` package. The resulting file can then be directly used in compatible Python-based analysis frameworks. ```{r write} library(scRNAseq) sce_zeisel <- ZeiselBrainData() out_path <- tempfile(pattern = ".h5ad") writeH5AD(sce_zeisel, file = out_path) ``` Converting between `SingleCellExperiment` and `AnnData` objects =============================================================== Developers and power users who control their Python environments can directly convert between `SingleCellExperiment` and [`AnnData` objects](https://anndata.readthedocs.io/en/stable/) using the `SCE2AnnData()` and `AnnData2SCE()` utilities. These functions expect that `r CRANpkg("reticulate")` has already been loaded along with an appropriate version of the [_anndata_](https://pypi.org/project/anndata/) package. We suggest using the `r Biocpkg("basilisk")` package to set up the Python environment before using these functions. ```{r convert} library(basilisk) library(scRNAseq) seger <- SegerstolpePancreasData() roundtrip <- basiliskRun(fun = function(sce) { # Convert SCE to AnnData: adata <- SCE2AnnData(sce) # Maybe do some work in Python on 'adata': # BLAH BLAH BLAH # Convert back to an SCE: AnnData2SCE(adata) }, env = zellkonverterAnnDataEnv(), sce = seger) ``` Package developers can guarantee that they are using the same versions of Python packages as `r Biocpkg("zellkonverter")` by using the `AnnDataDependencies()` function to set up their Python environments. ```{r anndata-deps} AnnDataDependencies() ``` This function can also be used to return dependencies for environments using older versions of _anndata_. ```{r anndata-deps-old} AnnDataDependencies(version = "0.7.6") ``` Progress messages ================= By default the functions in `r Biocpkg("zellkonverter")` don't display any information about their progress but this can be turned on by setting the `verbose = TRUE` argument. ```{r verbose} readH5AD(example_h5ad, verbose = TRUE) ``` If you would like to see progress messages for all functions by default you can turn this on using the `setZellkonverterVerbose()` function. ```{r verbose-set, eval = FALSE} # This is not run here setZellkonverterVerbose(TRUE) ``` Session information =================== ```{r} sessionInfo() ```