---
title: 'AWAggregatorData Vignette'
author:
  - name: Jiahua Tan
    affiliation:
    - &1 "Department of Chemistry, University of British Columbia, Vancouver, BC, Canada"
  - name: Gian L. Negri
    affiliation:
    - &2 "Canada's Michael Smith Genome Sciences Centre, BC Cancer Research Institute, University of British Columbia, Vancouver, BC, Canada"
  - name: Gregg B. Morin
    affiliation:
    - *2
    - &3 "Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada"
  - name: David D. Y. Chen
    affiliation:
    - *1
date: '`r format(Sys.Date(), "%B %e, %Y")`'
package: AWAggregatorData
output:
  BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{AWAggregatorData vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

The `AWAggregatorData` package contains the data associated with the `AWAggregator` R package. It includes two pre-trained random forest models: one incorporates the average coefficient of variation (CV) as a feature, and the other does not. It also contains the peptide-spectrum matches (PSMs) from Benchmark Sets 1\~3, derived from the `psm.tsv` output files generated by FragPipe, which are used to train the random forest models.

# Overview of Package Data

Data available in the `AWAggregatorData` package:

- `regr`: the pre-trained random forest model that incorporates the average CV as a feature.

- `regr.no.CV`: the pre-trained random forest model that does not include the average CV as a feature.

- `benchmark.set.1`, `benchmark.set.2`, `benchmark.set.3`: the PSMs from Benchmark Sets 1\~3, derived from the `psm.tsv` output files generated by FragPipe, which are used to train the random forest models. Columns that `AWAggregator` does not need have been removed from the sample data.

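
For readers unfamiliar with the coefficient of variation mentioned above, the following is a minimal sketch of the textbook definition (standard deviation divided by mean), averaged over groups. The intensities and group labels are made up for illustration, and the sketch does not claim to reproduce how `AWAggregator` derives its average CV feature.

```{r cv-sketch}
# Toy intensities for a single PSM: two hypothetical sample groups with
# three replicates each (values are invented for illustration)
intensities <- c(100, 110, 90, 200, 220, 180)
groups <- rep(c('A', 'B'), each = 3)

# CV of a numeric vector: sample standard deviation divided by the mean
cv <- function(x) sd(x) / mean(x)

# CV within each group, then the average across groups
cv.by.group <- tapply(intensities, groups, cv)
avg.cv <- mean(cv.by.group)
avg.cv
```

A low average CV indicates that replicate measurements within each group agree closely.
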
# Installation

```{r install, eval=FALSE}
if (!requireNamespace('BiocManager', quietly = TRUE))
    install.packages('BiocManager')

BiocManager::install('ExperimentHub')
BiocManager::install('AWAggregatorData')
```

# Load Data from `ExperimentHub`

The data are distributed through the `ExperimentHub` package. Metadata for the available datasets can be retrieved with the `query` function:

```{r query datasets}
library(ExperimentHub)
eh <- ExperimentHub()
# Requires Bioconductor version 3.21 or later
query(eh, 'AWAggregatorData')
```

The datasets and pre-trained models can be downloaded as follows:

```{r download datasets}
# Benchmark Set 1
benchmark.set.1 <- eh[['EH9637']]
# Benchmark Set 2
benchmark.set.2 <- eh[['EH9638']]
# Benchmark Set 3
benchmark.set.3 <- eh[['EH9639']]
# Pre-trained model incorporating the average coefficient of variation (CV) as
# a feature
regr <- eh[['EH9640']]
# Pre-trained model excluding CV as a feature
regr.no.CV <- eh[['EH9641']]
```

```{r session info}
sessionInfo()
```
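
The download step earlier in this vignette refers to records by hard-coded IDs. As an alternative, records can be looked up by title. The sketch below shows the idea on a toy metadata table. The helper `ids.by.title`, the table `meta`, and all titles and IDs in it are hypothetical and are not part of `AWAggregatorData` or `ExperimentHub`.

```{r lookup-sketch}
# Hypothetical metadata table: row names play the role of hub IDs and
# `title` the role of record titles (all values here are invented)
meta <- data.frame(
    title = c('toy benchmark set', 'toy model', 'toy model without CV'),
    row.names = c('ID1', 'ID2', 'ID3'),
    stringsAsFactors = FALSE
)

# Return the IDs whose title matches `pattern`, named by title
ids.by.title <- function(meta, pattern) {
    hits <- grepl(pattern, meta$title, ignore.case = TRUE)
    setNames(rownames(meta)[hits], meta$title[hits])
}

model.ids <- ids.by.title(meta, 'model')
model.ids
```

With a real hub, a table of this shape can likely be obtained from `as.data.frame(mcols(query(eh, 'AWAggregatorData')))`. The actual record titles should be checked in the query output before relying on a pattern.
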