---
title: "Python Integration with anndataR"
package: anndataR
output: BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{Python integration with anndataR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
reticulate::py_require("mudata")
reticulate::py_require("scanpy")

# Check if required Python packages are available
python_available <- reticulate::py_module_available("scanpy") &&
  reticulate::py_module_available("mudata")

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  # Set eval based on Python package availability
  eval = python_available
)
```

# Introduction

`r Biocpkg("anndataR")` works with Python `AnnData` objects through `r CRANpkg("reticulate")`. You can load Python objects, apply Python functions to them, and convert them to `Seurat` or `SingleCellExperiment` objects.

```{r python_unavailable, eval=!python_available}
message(
  "Python packages scanpy and mudata are required to run this vignette. Code chunks will not be evaluated."
)
```

## Prerequisites

This vignette requires Python with the _scanpy_ and _mudata_ packages installed. If these are not available, the code chunks will not be evaluated but the examples remain visible.

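
To check these prerequisites in your own session, something along the following lines may help. This is a minimal sketch, not part of `anndataR` or `reticulate`: the helper name `check_python_modules` and its `is_available` argument are hypothetical, and it assumes `reticulate::py_module_available()` is the check you want, as used in this vignette's setup.

```{r check_python_modules, eval=FALSE}
# Hypothetical helper (not part of anndataR or reticulate): report which
# Python modules can be found in the Python environment used by reticulate.
check_python_modules <- function(modules,
                                 is_available = reticulate::py_module_available) {
  # One TRUE/FALSE per module, named by module
  status <- vapply(modules, is_available, logical(1))

  missing <- names(status)[!status]
  if (length(missing) > 0) {
    message("Missing Python modules: ", paste(missing, collapse = ", "))
  }

  # Return the named logical vector invisibly so it can be inspected or reused
  invisible(status)
}
```

Calling `check_python_modules(c("scanpy", "mudata"))` should return a named logical vector and print a message listing any module that could not be found. The `is_available` argument exists only so that the check can be swapped out, for example when testing without Python.
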
# Basic Integration with _scanpy_

Install required Python packages if needed:

```{r python_setup, eval=FALSE}
reticulate::py_require("scanpy")
```

```{r load_libraries}
library(anndataR)
library(reticulate)

sc <- import("scanpy")
```

Load a dataset directly from _scanpy_:

```{r load_python_data}
adata <- sc$datasets$pbmc3k_processed()
adata
```

Apply _scanpy_ functions directly:

```{r apply_python_functions}
sc$pp$filter_cells(adata, min_genes = 200L)
sc$pp$normalize_total(adata, target_sum = 1e4)
sc$pp$log1p(adata)
```

# Conversion to R objects

Convert to `SingleCellExperiment` (see `vignette("usage_singlecellexperiment")`):

```{r convert_to_sce}
sce_obj <- adata$as_SingleCellExperiment()
sce_obj
```

Convert to `Seurat` (see `vignette("usage_seurat")`):

```{r convert_to_seurat}
seurat_obj <- adata$as_Seurat()
seurat_obj
```

# Multi-modal data with _mudata_

Install required Python packages if needed:

```{r mudata_setup, eval=FALSE}
reticulate::py_require("mudata")
```

```{r load_mudata}
md <- import("mudata")
```

Load a `MuData` object from file:

```{r load_mudata_example, eval = python_available && requireNamespace("BiocFileCache", quietly = TRUE)}
cache <- BiocFileCache::BiocFileCache(ask = FALSE)
h5mu_file <- BiocFileCache::bfcrpath(
  cache,
  "https://github.com/gtca/h5xx-datasets/raw/b1177ac8877c89d8bb355b072164384b4e9cc81d/datasets/minipbcite.h5mu"
)

mdata <- md$read_h5mu(h5mu_file)
```

Access individual modalities and convert them:

```{r access_modalities}
rna_mod <- mdata$mod[["rna"]]

rna_seurat <- rna_mod$as_Seurat()
print(rna_seurat)

rna_sce <- rna_mod$as_SingleCellExperiment()
print(rna_sce)
```

# Session info

## R

```{r r-sessioninfo}
sessionInfo()
```

## Python

```{r python-sessioninfo}
reticulate::py_config()
reticulate::py_list_packages()
```