Package: RNAseqCovarImpute
Title: Impute Covariate Data in RNA Sequencing Studies
Version: 1.7.0
Authors@R: c(
    person("Brennan", "Baker", email = "brennanhilton@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0001-5459-9141")),
    person("Sheela", "Sathyanarayana", role = "aut"),
    person("Adam", "Szpiro", role = "aut"),
    person("James", "MacDonald", role = "aut"),
    person("Alison", "Paquette", role = "aut"))
URL: https://github.com/brennanhilton/RNAseqCovarImpute
BugReports: https://github.com/brennanhilton/RNAseqCovarImpute/issues
Description: The RNAseqCovarImpute package makes linear model analysis for RNA sequencing read counts
    compatible with multiple imputation (MI) of missing covariates. A major problem with implementing
    MI in RNA sequencing studies is that the outcome data must be included in the imputation prediction
    models to avoid bias. This is difficult in omics studies with high-dimensional data. The first method
    we developed in the RNAseqCovarImpute package surmounts the problem of high-dimensional outcome data
    by binning genes into smaller groups to analyze pseudo-independently. This method implements covariate
    MI in gene expression studies by 1) randomly binning genes into smaller groups, 2) creating M imputed
    datasets separately within each bin, where the imputation predictor matrix includes all covariates and
    the log counts per million (CPM) for the genes within each bin, 3) estimating gene expression changes
    using `limma::voom` followed by `limma::lmFit` functions, separately on each of the M imputed datasets
    within each gene bin, 4) un-binning the gene sets and stacking the M sets of model results before
    applying the `limma::squeezeVar` function, which applies a variance-shrinking Bayesian procedure to each
    of the M sets of model results, 5) pooling the results with Rubin’s rules to produce combined
    coefficients, standard errors, and P-values, and 6) adjusting P-values for multiplicity to control the
    false discovery rate (FDR). A
    faster method uses principal component analysis (PCA) to avoid binning genes while still retaining
    outcome information in the MI models. Binning genes into smaller groups requires that the MI and
    limma-voom analyses be run many times (typically hundreds). The more computationally efficient MI PCA
    method implements covariate MI in gene expression studies by 1) performing PCA on the log CPM values for
    all genes using the Bioconductor `PCAtools` package, 2) creating M imputed datasets where the imputation
    predictor matrix includes all covariates and the retained PCs, with the number of PCs chosen by, e.g.,
    Horn’s parallel analysis or the number of PCs that together explain >80% of the variation, 3) running
    the standard limma-voom pipeline (`voom`, then `lmFit`, then `eBayes`) on each of the M imputed
    datasets, 4) pooling the results with Rubin’s rules to produce combined coefficients, standard
    errors, and P-values, and 5) adjusting P-values for multiplicity to control the false discovery rate (FDR).
License: GPL-3
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
VignetteBuilder: knitr
biocViews: RNASeq, GeneExpression, DifferentialExpression, Sequencing
Suggests: BiocStyle, knitr, PCAtools, rmarkdown, tidyr, stringr,
        testthat (>= 3.0.0)
Depends: R (>= 4.3.0)
Imports: Biobase, BiocGenerics, BiocParallel, stats, limma, dplyr,
        magrittr, rlang, edgeR, foreach, mice
Config/testthat/edition: 3
Collate: 'RNAseqCovarImpute-package.R' 'combine_rubins.R'
        'combine_rubins_pca.R' 'example_DGE.R' 'example_data.R'
        'get_gene_bin_intervals.R' 'impute_by_gene_bin_helper.R'
        'impute_by_gene_bin.R' 'voom_sx_sy.R' 'lowess_all_gene_bins.R'
        'voom_master_lowess.R' 'limmavoom_imputed_data_list_helper.R'
        'limmavoom_imputed_data_list.R'
        'limmavoom_imputed_data_pca_helper.R'
        'limmavoom_imputed_data_pca.R'
git_url: https://git.bioconductor.org/packages/RNAseqCovarImpute
git_branch: devel
git_last_commit: 11f93b3
git_last_commit_date: 2025-04-15
Repository: Bioconductor 3.22
Date/Publication: 2025-06-04
NeedsCompilation: no
Packaged: 2025-06-05 01:42:31 UTC; biocbuild
Author: Brennan Baker [aut, cre] (ORCID:
    <https://orcid.org/0000-0001-5459-9141>),
  Sheela Sathyanarayana [aut],
  Adam Szpiro [aut],
  James MacDonald [aut],
  Alison Paquette [aut]
Maintainer: Brennan Baker <brennanhilton@gmail.com>
Built: R 4.5.0; ; 2025-06-05 13:52:32 UTC; windows
