---
title: Converting single-cell data structures between Bioconductor and Python
author: 
- name: Luke Zappia
  email: luke@lazappi.id.au
- name: Aaron Lun
  email: infinite.monkeys.with.keyboards@gmail.com
date: "Revised: 17 April 2022"
output:
  BiocStyle::html_document:
    toc_float: true
package: zellkonverter 
vignette: >
  %\VignetteIndexEntry{Converting between AnnData and SingleCellExperiment objects}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}    
---

```{r setup, echo = FALSE, results = "hide", message = FALSE}
library(knitr)
library(BiocStyle)
opts_chunk$set(error = FALSE, message = FALSE, warning = FALSE)
```

Overview
========

This package provides a lightweight interface between the Bioconductor
`SingleCellExperiment` data structure and the Python `AnnData`-based single-cell
analysis environment. The idea is to enable users and developers to easily move
data between these frameworks to construct a multi-language analysis pipeline
across R/Bioconductor and Python.

Reading and writing H5AD files
==============================

The `readH5AD()` function reads an H5AD file into a `SingleCellExperiment`
object. The resulting object can be manipulated in the usual way, as described
in the `r Biocpkg("SingleCellExperiment")` documentation.

```{r read}
library(zellkonverter)

# Obtaining an example H5AD file.
example_h5ad <- system.file("extdata", "krumsiek11.h5ad",
                            package = "zellkonverter")
readH5AD(example_h5ad)
```
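
As a brief illustration of "the usual way", the sketch below inspects the
object with standard `SingleCellExperiment` accessors. It assumes the example
file shipped with the package and relies on the usual mapping of the `AnnData`
`X`, `obs` and `var` slots to assays, `colData` and `rowData`. The object name
`sce` is only for illustration.

```{r inspect-read}
library(zellkonverter)
library(SingleCellExperiment)

example_h5ad <- system.file("extdata", "krumsiek11.h5ad",
                            package = "zellkonverter")
sce <- readH5AD(example_h5ad)

# Names of the assays; the AnnData 'X' matrix is stored as an assay.
assayNames(sce)

# Per-cell annotation (AnnData 'obs') and per-gene annotation (AnnData 'var').
colData(sce)
rowData(sce)

# Number of genes (rows) and cells (columns).
dim(sce)
```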

We can also write a `SingleCellExperiment` to an H5AD file with the
`writeH5AD()` function. This is demonstrated below on the classic Zeisel mouse
brain dataset from the `r Biocpkg("scRNAseq")` package. The resulting file can
then be used directly in compatible Python-based analysis frameworks.

```{r write}
library(scRNAseq)

sce_zeisel <- ZeiselBrainData()
out_path <- tempfile(fileext = ".h5ad")
writeH5AD(sce_zeisel, file = out_path)
```
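
One quick way to check a written file is to read it back in and compare it to
the original object. The sketch below does this with a small simulated dataset
rather than the Zeisel data, so it runs without any download. The objects
`counts`, `sce_small`, `path` and `sce_back` are made up for illustration.

```{r write-roundtrip}
library(zellkonverter)
library(SingleCellExperiment)

# Simulate a small count matrix: 20 genes (rows) by 10 cells (columns).
set.seed(1)
counts <- matrix(
    rpois(200, lambda = 5), nrow = 20, ncol = 10,
    dimnames = list(paste0("gene", 1:20), paste0("cell", 1:10))
)
sce_small <- SingleCellExperiment(assays = list(counts = counts))

# Write to a temporary H5AD file and read it back.
path <- tempfile(fileext = ".h5ad")
writeH5AD(sce_small, file = path)
sce_back <- readH5AD(path)

# The dimensions should survive the round trip.
dim(sce_small)
dim(sce_back)
```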

Converting between `SingleCellExperiment` and `AnnData` objects
===============================================================

Developers and power users who control their Python environments can directly
convert between `SingleCellExperiment` and
[`AnnData` objects](https://anndata.readthedocs.io/en/stable/) using the
`SCE2AnnData()` and `AnnData2SCE()` utilities. These functions expect that
`r CRANpkg("reticulate")` has already been loaded along with an appropriate
version of the [_anndata_](https://pypi.org/project/anndata/) package. We
suggest using the `r Biocpkg("basilisk")` package to set up the Python
environment before using these functions.

```{r convert}
library(basilisk)
library(scRNAseq)

seger <- SegerstolpePancreasData()
roundtrip <- basiliskRun(fun = function(sce) {
    # Convert SCE to AnnData:
    adata <- SCE2AnnData(sce)

    # Any Python-side processing of 'adata' would go here.

    # Convert back to an SCE:
    AnnData2SCE(adata)
}, env = zellkonverterAnnDataEnv(), sce = seger)
```
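
To give a flavour of what work on the Python side might look like, the sketch
below queries two attributes of the `AnnData` object through
`r CRANpkg("reticulate")` and returns them as a plain R vector. It uses a small
simulated dataset (`sce_small`, a made-up object) instead of the Segerstolpe
data. The `zellkonverter::` prefix is a precaution so that the function can
still be found if `r Biocpkg("basilisk")` runs it in a separate process.

```{r convert-inspect, eval = FALSE}
library(basilisk)
library(zellkonverter)
library(SingleCellExperiment)

# Simulated data: 20 genes (rows) by 10 cells (columns).
set.seed(1)
counts <- matrix(rpois(200, lambda = 5), nrow = 20, ncol = 10)
sce_small <- SingleCellExperiment(assays = list(counts = counts))

shape <- basiliskRun(fun = function(sce) {
    adata <- zellkonverter::SCE2AnnData(sce)

    # 'n_obs' and 'n_vars' are attributes of the Python AnnData object.
    # Cells are observations and genes are variables.
    c(n_obs = adata$n_obs, n_vars = adata$n_vars)
}, env = zellkonverterAnnDataEnv(), sce = sce_small)

shape
```

Only pure R objects should be returned from the function, which is why the
values are collected into a vector rather than returning `adata` itself.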

Package developers can guarantee that they are using the same versions of Python
packages as `r Biocpkg("zellkonverter")` by using the `AnnDataDependencies()`
function to set up their Python environments.

```{r anndata-deps}
AnnDataDependencies()
```

This function can also be used to return dependencies for environments using
older versions of _anndata_.

```{r anndata-deps-old}
AnnDataDependencies(version = "0.7.6")
```
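
As a rough sketch of how a package developer might use this, the output of
`AnnDataDependencies()` can be passed to the `BasiliskEnvironment()` constructor
from `r Biocpkg("basilisk")`. The environment name `my_anndata_env` and the
package name `myPackage` are hypothetical placeholders, and a real package may
need to list additional Python packages.

```{r anndata-deps-env, eval = FALSE}
library(basilisk)
library(zellkonverter)

# Define (but do not yet install) an environment that pins the same
# Python package versions as zellkonverter.
my_env <- BasiliskEnvironment(
    envname = "my_anndata_env",  # hypothetical environment name
    pkgname = "myPackage",       # hypothetical name of the client package
    packages = AnnDataDependencies()
)
```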

Progress messages
=================

By default, the functions in `r Biocpkg("zellkonverter")` do not print any
progress messages. These can be turned on for an individual call by setting
`verbose = TRUE`.

```{r verbose}
readH5AD(example_h5ad, verbose = TRUE)
```

If you would like to see progress messages for all functions by default, you
can turn them on using the `setZellkonverterVerbose()` function.

```{r verbose-set, eval = FALSE}
# This is not run here 
setZellkonverterVerbose(TRUE)
```
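
The two mechanisms can be combined. The sketch below turns messages on
globally, silences a single call, and then restores the quiet default. It
assumes that an explicit `verbose` argument takes precedence over the global
setting for that call. The object names are only for illustration.

```{r verbose-toggle, eval = FALSE}
library(zellkonverter)

example_h5ad <- system.file("extdata", "krumsiek11.h5ad",
                            package = "zellkonverter")

# Turn on progress messages for all subsequent zellkonverter calls.
setZellkonverterVerbose(TRUE)
sce_loud <- readH5AD(example_h5ad)

# Assumption: an explicit 'verbose' argument overrides the global setting.
sce_quiet <- readH5AD(example_h5ad, verbose = FALSE)

# Restore the default quiet behaviour.
setZellkonverterVerbose(FALSE)
```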

Session information
===================

```{r}
sessionInfo()
```