alabaster.files 1.0.0
The alabaster.files package implements methods to save common bioinformatics file formats within the alabaster framework.
It does not parse the files and performs at most cursory validation; it just provides very lightweight wrappers for processing via alabaster.base::stageObject().
Check out the alabaster.base package for more details on the motivation and concepts behind alabaster.
We’ll start with an indexed BAM file from the Rsamtools package:
bam.file <- system.file("extdata", "ex1.bam", package="Rsamtools", mustWork=TRUE)
bam.index <- paste0(bam.file, ".bai")
We can wrap this inside a BamWrapper class, which can be further decorated with arbitrary metadata:
library(alabaster.files)
library(S4Vectors)
wrapped.bam <- BamWrapper(bam.file, index=bam.index)
metadata(wrapped.bam) <- list(foo="bar")
Then we can save it into an alabaster staging directory.
dir <- tempfile()
dir.create(dir)
saveLocalObject(wrapped.bam, dir, path="my_bam")
… and load it back at some later time.
readLocalObject(dir, "my_bam")
## BamWrapper object
## path: /tmp/RtmpbmiPyy/filea245e184ba783/my_bam/file.bam 
## index: /tmp/RtmpbmiPyy/filea245e184ba783/my_bam/index/file.bam.bai 
## metadata(1): foo
The example above isn’t very exciting, but it demonstrates how these files can be easily added to an alabaster project.
This allows us to incorporate the Wrapper objects into other Bioconductor data structures, such as a column of a DataFrame:
df <- DataFrame(Sample=LETTERS[1:4])
# Adding a column of assorted wrapper files:
df$File <- list(
    wrapped.bam,
    BedWrapper(system.file("tests", "test.bed", package = "rtracklayer")),
    BigBedWrapper(system.file("tests", "test.bb", package = "rtracklayer")),
    GffWrapper(system.file("tests", "test.gtf", package = "rtracklayer"))
)
# Saving it all to the staging directory:
dir <- tempfile()
dir.create(dir)
saveLocalObject(df, dir, path="stuff")
# Now reading it back in:
roundtrip <- readLocalObject(dir, "stuff")
roundtrip$File
## [[1]]
## BamWrapper object
## path: /tmp/RtmpbmiPyy/filea245ebacc88e/stuff/column2/child-1/file.bam 
## index: /tmp/RtmpbmiPyy/filea245ebacc88e/stuff/column2/child-1/index/file.bam.bai 
## metadata(1): foo
## 
## [[2]]
## BedWrapper object
## path: /tmp/RtmpbmiPyy/filea245ebacc88e/stuff/column2/child-2/file.bed 
## compression: none 
## index: <none>
## metadata(0):
## 
## [[3]]
## BigBedWrapper object
## path: /tmp/RtmpbmiPyy/filea245ebacc88e/stuff/column2/child-3/file.bb 
## metadata(0):
## 
## [[4]]
## GffWrapper object in the GFF2 format
## path: /tmp/RtmpbmiPyy/filea245ebacc88e/stuff/column2/child-4/file.gff2 
## compression: none 
## index: <none>
## metadata(0):
Similarly, if the staging directory is uploaded to a remote store, the wrapped files will automatically be included in the upload. This avoids the need for a separate process to handle these files.
alabaster.files will try to perform some cursory validation of the wrapped file to catch errors in user inputs.
The level of validation is format-dependent but should be fast, e.g., BAM file validation is performed by scanning the header.
In all cases, users should not expect an exhaustive check of file validity, as that would take too long and involve more parsing than desired for the scope of alabaster.files.
If stricter validation is required, applications calling alabaster.files should override the stageObject() methods for the relevant Wrapper classes.
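As a rough illustration of what a fast, header-level check can look like, the sketch below tests only the leading magic bytes of a BAM file using base R. It is not the validation that alabaster.files itself performs: is_bam_magic() is a hypothetical helper introduced here for illustration, and the two temporary files are synthetic stand-ins rather than real alignments. An application wanting stricter checks could call a function along these lines, or a fuller parser, from its own stageObject() method.

```r
# Hypothetical helper, not part of alabaster.files:
# returns TRUE if the file starts with the BAM magic bytes.
is_bam_magic <- function(path) {
    # BAM files are BGZF-compressed, which gzfile() can read as ordinary gzip.
    con <- gzfile(path, open = "rb")
    on.exit(close(con))
    magic <- readBin(con, what = "raw", n = 4L)
    # The decompressed stream of a BAM file begins with "BAM" followed by 0x01.
    identical(magic, as.raw(c(0x42, 0x41, 0x4d, 0x01)))
}

# Synthetic demonstration files, so that no real BAM file is needed.
good <- tempfile(fileext = ".bam")
con <- gzfile(good, open = "wb")
writeBin(as.raw(c(0x42, 0x41, 0x4d, 0x01)), con)
close(con)

bad <- tempfile(fileext = ".bam")
con <- gzfile(bad, open = "wb")
writeBin(charToRaw("not a BAM"), con)
close(con)

is_bam_magic(good) # stream starts with the magic bytes
is_bam_magic(bad)  # stream starts with other content
```

Reading four bytes keeps the check cheap regardless of file size, which is the trade-off described above: it catches obviously wrong inputs but says nothing about whether the rest of the file is well-formed.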
sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.18-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] GenomicRanges_1.54.0  IRanges_2.36.0        S4Vectors_0.40.0     
## [4] BiocGenerics_0.48.0   alabaster.files_1.0.0 alabaster.base_1.2.0 
## [7] BiocStyle_2.30.0     
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.7                  SparseArray_1.2.0          
##  [3] bitops_1.0-7                lattice_0.22-5             
##  [5] jsonvalidate_1.3.2          digest_0.6.33              
##  [7] grid_4.3.1                  evaluate_0.22              
##  [9] bookdown_0.36               fastmap_1.1.1              
## [11] Matrix_1.6-1.1              jsonlite_1.8.7             
## [13] restfulr_0.0.15             GenomeInfoDb_1.38.0        
## [15] alabaster.schemas_1.2.0     BiocManager_1.30.22        
## [17] XML_3.99-0.14               Biostrings_2.70.0          
## [19] codetools_0.2-19            jquerylib_0.1.4            
## [21] abind_1.4-5                 cli_3.6.1                  
## [23] rlang_1.1.1                 crayon_1.5.2               
## [25] XVector_0.42.0              Biobase_2.62.0             
## [27] DelayedArray_0.28.0         cachem_1.0.8               
## [29] yaml_2.3.7                  S4Arrays_1.2.0             
## [31] tools_4.3.1                 parallel_4.3.1             
## [33] BiocParallel_1.36.0         Rhdf5lib_1.24.0            
## [35] GenomeInfoDbData_1.2.11     Rsamtools_2.18.0           
## [37] SummarizedExperiment_1.32.0 curl_5.1.0                 
## [39] R6_2.5.1                    BiocIO_1.12.0              
## [41] matrixStats_1.0.0           rhdf5_2.46.0               
## [43] rtracklayer_1.62.0          zlibbioc_1.48.0            
## [45] V8_4.4.0                    bslib_0.5.1                
## [47] Rcpp_1.0.11                 xfun_0.40                  
## [49] GenomicAlignments_1.38.0    MatrixGenerics_1.14.0      
## [51] knitr_1.44                  rhdf5filters_1.14.0        
## [53] rjson_0.2.21                htmltools_0.5.6.1          
## [55] rmarkdown_2.25              compiler_4.3.1             
## [57] RCurl_1.98-1.12