DIAlignR 2.2.0
In this document we are presenting a workflow of retention-time alignment across multiple Targeted-MS (e.g. DIA, SWATH-MS, PRM, SRM) runs using DIAlignR. This tool requires MS2 chromatograms and provides a hybrid approach of global and local alignment to establish correspondence between peaks.
if(!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("DIAlignR")library(DIAlignR)Mass-spectrometry files mostly contains spectra. Targeted proteomics workflow identifyies analytes from their chromatographic elution profile. DIAlignR extends the same concept for retention-time (RT) alignment and, therefore, relies on MS2 chromatograms. DIAlignR expects raw chromatogram file (.chrom.mzML) and FDR-scored features (.osw) file.
Example files are available with this package and can be located with this command:
dataPath <- system.file("extdata", package = "DIAlignR")bash command can be used:OpenSwathWorkflow -in Filename.mzML.gz -tr library.pqp -tr_irt
iRTassays.TraML -out_osw Filename.osw -out_chrom Filename.chrom.mzMLmzR. In such cases mzR would throw an error indicating Invalid cvParam accession "1002746". To avoid this issue, uncompress chromatograms using OpenMS.FileConverter -in Filename.chrom.mzML -in_type 'mzML' -out Filename.chrom.mzMLpyprophet merge --template=library.pqp --out=merged.osw *.osw
pyprophet score --in=merged.osw --classifier=XGBoost --level=ms2 --xeval_num_iter=3 \
--ss_initial_fdr=0.05 --ss_iteration_fdr=0.01
pyprophet peptide --in=merged.osw --context=experiment-widexics directory and merged.osw file in osw directory. The parent folder is given as dataPath to DIAlignR functions.alignTargetedRuns function aligns Proteomics or Metabolomics DIA runs. It expects two directories “osw” and “xics” at dataPath. It outputs an intensity table where rows specify each analyte and columns specify runs. Use parameter saveFiles = TRUE to have aligned retetion time and intensities saved in the current directory.
runs <- c("hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt",
          "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt")
params <- paramsDIAlignR()
params[["context"]] <- "experiment-wide"
# For specific runs provide their names.
alignTargetedRuns(dataPath = dataPath, outFile = "test.csv", runs = runs, oswMerged = TRUE, params = params)
# For all the analytes in all runs, keep them as NULL.
alignTargetedRuns(dataPath = dataPath, outFile = "test.csv", runs = NULL, oswMerged = TRUE, params = params)For getting alignment object which has aligned indices of XICs getAlignObjs function can be used. Like previous function, it expects two directories “osw” and “xics” at dataPath. It performs alignment for exactly two runs. In case of refRun is not provided, m-score from osw files is used to select reference run.
runs <- c("hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt",
          "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt")
AlignObjLight <- getAlignObjs(analytes = 4618L, runs = runs, dataPath = dataPath, objType   = "light", params = params)
#> [1] "hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt" 
#> [2] "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt"
#> [1] "Finding reference run using SCORE_PEPTIDE table"
# First element contains names of runs, spectra files, chromatogram files and feature files.
AlignObjLight[[1]][, c("runName", "spectraFile")]
#>                                                 runName
#> run1  hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt
#> run2 hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt
#>                                                              spectraFile
#> run1  data/raw/hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt.mzML.gz
#> run2 data/raw/hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt.mzML.gz
obj <- AlignObjLight[[2]][["4618"]][[1]][["AlignObj"]]
slotNames(obj)
#> [1] "indexA_aligned" "indexB_aligned" "score"
names(as.list(obj))
#> [1] "indexA_aligned" "indexB_aligned" "score"
AlignObjMedium <- getAlignObjs(analytes = 4618L, runs = runs, dataPath = dataPath, objType  = "medium", params = params)
#> [1] "hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt" 
#> [2] "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt"
#> [1] "Finding reference run using SCORE_PEPTIDE table"
obj <- AlignObjMedium[[2]][["4618"]][[1]][["AlignObj"]]
slotNames(obj)
#> [1] "s"              "path"           "indexA_aligned" "indexB_aligned"
#> [5] "score"Alignment object has slots * indexA_aligned aligned indices of reference chromatogram. * indexB_aligned aligned indices of experiment chromatogram * score cumulative score of the alignment till an index. * s similarity score matrix. * path path of the alignment through similarity score matrix.
We can visualize aligned chromatograms using plotAlignedAnalytes. The top figure is experiment unaligned-XICs, middle one is reference XICs, last figure is experiment run aligned to reference.
runs <- c("hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt",
 "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt")
AlignObj <- getAlignObjs(analytes = 4618L, runs = runs, dataPath = dataPath, params = params)
#> [1] "hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt" 
#> [2] "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt"
#> [1] "Finding reference run using SCORE_PEPTIDE table"
plotAlignedAnalytes(AlignObj, annotatePeak = TRUE)
#> Warning: Removed 30 row(s) containing missing values (geom_path).We can also visualize the alignment path using plotAlignemntPath function.
library(lattice)
runs <- c("hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt",
 "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt")
AlignObjOutput <- getAlignObjs(analytes = 4618L, runs = runs, params = params, dataPath = dataPath, objType = "medium")
#> [1] "hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt" 
#> [2] "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt"
#> [1] "Finding reference run using SCORE_PEPTIDE table"
plotAlignmentPath(AlignObjOutput)Gupta S, Ahadi S, Zhou W, Röst H. “DIAlignR Provides Precise Retention Time Alignment Across Distant Runs in DIA and Targeted Proteomics.” Mol Cell Proteomics. 2019 Apr;18(4):806-817. doi: https://doi.org/10.1074/mcp.TIR118.001132
sessionInfo()
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.3 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.14-bioc/R/lib/libRblas.so
#> LAPACK: /home/biocbuild/bbs-3.14-bioc/R/lib/libRlapack.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] lattice_0.20-45  DIAlignR_2.2.0   BiocStyle_2.22.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.7          ape_5.5             tidyr_1.1.4        
#>  [4] png_0.1-7           zoo_1.8-9           assertthat_0.2.1   
#>  [7] digest_0.6.28       utf8_1.2.2          R6_2.5.1           
#> [10] signal_0.7-7        RSQLite_2.2.8       evaluate_0.14      
#> [13] pracma_2.3.3        highr_0.9           ggplot2_3.3.5      
#> [16] pillar_1.6.4        rlang_0.4.12        data.table_1.14.2  
#> [19] jquerylib_0.1.4     blob_1.2.2          phangorn_2.7.1     
#> [22] magick_2.7.3        Matrix_1.3-4        reticulate_1.22    
#> [25] rmarkdown_2.11      labeling_0.4.2      stringr_1.4.0      
#> [28] igraph_1.2.7        bit_4.0.4           munsell_0.5.0      
#> [31] compiler_4.1.1      xfun_0.27           pkgconfig_2.0.3    
#> [34] RMSNumpress_1.0.1   htmltools_0.5.2     tidyselect_1.1.1   
#> [37] gridExtra_2.3       tibble_3.1.5        bookdown_0.24      
#> [40] quadprog_1.5-8      codetools_0.2-18    fansi_0.5.0        
#> [43] crayon_1.4.1        dplyr_1.0.7         MASS_7.3-54        
#> [46] grid_4.1.1          nlme_3.1-153        jsonlite_1.7.2     
#> [49] gtable_0.3.0        lifecycle_1.0.1     DBI_1.1.1          
#> [52] magrittr_2.0.1      scales_1.1.1        stringi_1.7.5      
#> [55] cachem_1.0.6        farver_2.1.0        latticeExtra_0.6-29
#> [58] bslib_0.3.1         ellipsis_0.3.2      generics_0.1.1     
#> [61] vctrs_0.3.8         fastmatch_1.1-3     RColorBrewer_1.1-2 
#> [64] tools_4.1.1         bit64_4.0.5         glue_1.4.2         
#> [67] purrr_0.3.4         jpeg_0.1-9          parallel_4.1.1     
#> [70] fastmap_1.1.0       yaml_2.2.1          colorspace_2.0-2   
#> [73] BiocManager_1.30.16 memoise_2.0.0       knitr_1.36         
#> [76] sass_0.4.0