The package immunogenViewer is meant to support researchers in comparing and choosing suitable antibodies provided that information on the immunogen used to raise the antibody is available. When the immunogen of an antibody is known, its binding site within the protein antigen is defined and can be examined in detail. As antibodies raised against peptide immunogens often do not function properly when used to detect natively folded proteins (Brown et al. 2011), examination of the position of the immunogen within the full-length protein can provide insights. Using immunogenViewer provides an easy approach to visualize, evaluate and compare immunogens within the full-length sequence of a protein. Information on structural and functional annotations of the immunogen and thus antibody binding site can tell the user if an antibody is potentially useful for native protein detection (Trier, Hansen, and Houen 2012; Waury et al. 2022).
Specifically, immunogenViewer can be used to retrieve protein features for a protein of interest using an API call to the UniProtKB (Bateman et al. 2022) and PredictProtein (Bernhofer et al. 2021) databases. The features are saved on a per-residue level in a dataframe. One or several immunogens can be associated with the protein. The immunogen(s) can then be visualized and evaluated regarding their structure and other annotations that can influence successful antibody recognition within the full-length protein. A summary report of the immunogen can be created to easily compare and select favorable immunogens and their respective antibodies. This package should be used as a pre-selection step to exclude unsuitable antibodies early on. It does not replace comprehensive antibody validation. For more information on validation, please refer to other excellent resources (Roncador et al. 2015; Voskuil et al. 2020).
The package can be installed directly from Bioconductor.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("immunogenViewer")
library(immunogenViewer)
To retrieve the features for the protein of interest, the correct UniProt ID (also known as accession number) is required. If the UniProt ID is not yet known, one can search UniProtKB using the gene or protein name. Be sure to select the UniProt ID of the correct organism and preferably search within reviewed Swiss-Prot entries instead of unreviewed TrEMBL entries. Our example protein is the human protein TREM2 (UniProt ID: Q9NZC2). Using getProteinFeatures(), relevant features from UniProt and PredictProtein are retrieved. Interaction with UniProt is done using the Bioconductor package UniProt.ws.
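Because a mistyped identifier is an easy mistake at this step, a quick format check before querying the databases can be helpful. The following is a minimal sketch in base R, not part of immunogenViewer: the helper name isUniprotAccession() is hypothetical, and the pattern is the accession format described in the UniProt help pages. It only checks the format and does not confirm that an entry exists, belongs to the intended organism, or is reviewed.

```r
# Accession number format as documented in the UniProt help pages
uniprotPattern <- "^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"

# Hypothetical helper (not part of immunogenViewer): TRUE if the string
# has the shape of a UniProt accession number
isUniprotAccession <- function(id) {
    grepl(uniprotPattern, id)
}

# A gene name such as "TREM2" is not an accession number, "Q9NZC2" is
isUniprotAccession(c("Q9NZC2", "TREM2"))
```

A gene or protein name that fails this check should first be looked up in UniProtKB to find the matching accession number.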
The following protein features are included:
Secondary Structure: Each residue is assigned as helix, sheet or other conformation based on predictions by RePROF. Source: PredictProtein
Solvent Accessibility: Each residue is assigned as exposed or buried based on predictions by RePROF. Source: PredictProtein
Membrane: Residues that are annotated as transmembrane or intramembrane. Source: UniProtKB
Protein Binding: Each residue that is predicted to bind proteins based on predictions by ProNA. Source: PredictProtein
Disorder: Each residue that is predicted to be disordered based on predictions by Meta-Disorder. Source: PredictProtein
PTM: Residues that are annotated as modified residues, lipidation or glycosylation. Source: UniProtKB
Disulfide Bridges: Residues that are annotated as disulfide bonds. Source: UniProtKB
To see how the dataframe is structured, we will look at the returned dataframe.
protein <- getProteinFeatures("Q9NZC2")
# check protein dataframe
DT::datatable(protein, width = "80%", options = list(scrollX = TRUE))
After creating the protein dataframe using getProteinFeatures(), immunogens to be visualized and evaluated can be added to the dataframe. For this purpose, we use addImmunogen(). With every call to the function, one immunogen can be added to the protein dataframe. Besides the protein dataframe, we need to define the immunogen to be added by supplying a name and either the start and end position of the immunogen or its peptide sequence.
Searching the antibody database Antibodypedia identifies three antibodies that were raised against known peptide immunogens. These immunogens are added to the dataframe by defining either their start and end position or the immunogen peptide sequence within the full protein sequence, and by naming them after their catalog identifiers. Each immunogen is added as an additional column to the protein dataframe, with the immunogen name used as the column name.
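When only the peptide sequence of an immunogen is known, it can be useful to check where it lies within the full-length sequence and whether it occurs more than once. The following base R sketch illustrates the idea. The helper locatePeptide() and the toy sequence are hypothetical and only serve as an illustration; this is not how addImmunogen() is implemented.

```r
# Hypothetical helper (not part of immunogenViewer): return the 1-based start
# and end position of a peptide within a full-length protein sequence
locatePeptide <- function(fullSeq, peptide) {
    # fixed = TRUE matches the peptide literally instead of as a regular expression
    hits <- gregexpr(peptide, fullSeq, fixed = TRUE)[[1]]
    if (hits[1] == -1) {
        stop("Peptide not found in the protein sequence.")
    }
    if (length(hits) > 1) {
        warning("Peptide occurs more than once; using the first occurrence.")
    }
    start <- as.integer(hits[1])
    c(start = start, end = start + nchar(peptide) - 1L)
}

# Toy sequence for illustration only (this is not the TREM2 sequence)
toySeq <- "MKTAYIAKQRHGQKPGTHPPSELDQISFVKSHF"
pos <- locatePeptide(toySeq, "HGQKPGTHPPSELD")
pos
```

The resulting positions could then be compared against the catalog information of the antibody before the immunogen is added to the protein dataframe.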
protein <- addImmunogen(protein, start = 142, end = 192, name = "ABIN2783734_")
protein <- addImmunogen(protein, start = 196, end = 230, name = "HPA010917")
protein <- addImmunogen(protein, seq = "HGQKPGTHPPSELD", name = "EB07921")
# check that immunogens were added as columns
colnames(protein)
 [1] "Uniprot"              "Position"             "Residue"             
 [4] "SecondaryStructure"   "SolventAccessibility" "Membrane"            
 [7] "ProteinBinding"       "Disorder"             "PTM"                 
[10] "DisulfideBridge"      "ABIN2783734_"         "HPA010917"           
[13] "EB07921"
Already added immunogens can be renamed using renameImmunogen() if the provided start and end position are correct but the name should be updated. This way a typo can be corrected or a more informative name added instead of re-adding the immunogen. The column name in the protein dataframe is then updated.
protein <- renameImmunogen(protein, oldName = "ABIN2783734_", newName = "ABIN2783734")
# check that immunogen name was updated
colnames(protein)
 [1] "Uniprot"              "Position"             "Residue"             
 [4] "SecondaryStructure"   "SolventAccessibility" "Membrane"            
 [7] "ProteinBinding"       "Disorder"             "PTM"                 
[10] "DisulfideBridge"      "ABIN2783734"          "HPA010917"           
[13] "EB07921"
A previously added immunogen can be removed from the protein dataframe using removeImmunogen(). The corresponding column is dropped from the protein dataframe.
protein <- removeImmunogen(protein, name = "HPA010917")
# check that immunogen was removed
colnames(protein)
 [1] "Uniprot"              "Position"             "Residue"             
 [4] "SecondaryStructure"   "SolventAccessibility" "Membrane"            
 [7] "ProteinBinding"       "Disorder"             "PTM"                 
[10] "DisulfideBridge"      "ABIN2783734"          "EB07921"
After the protein features have been retrieved and the relevant immunogens added, the full protein sequence can be plotted with the features and the immunogens annotated along the sequence. The plot shows the position of each immunogen peptide within the full-length sequence and helps to identify obstacles within the protein that might hinder or limit successful antibody binding.
plotProtein(protein)
If interested in one specific immunogen, one can visualize the relevant part of the protein sequence. In this plot, the amino acid sequence of the immunogen is shown along the x axis while the same features as in the protein plot are included.
plotImmunogen(protein, "ABIN2783734")
Apart from visualizing specific immunogens, it is also possible to summarize the protein features within a specific immunogen. This can be done either for one immunogen of interest or for all immunogens added to a protein dataframe at once. The output is a summary dataframe that can be sorted by the feature columns. By sorting, the most suitable immunogen, e.g., the one with the highest fraction of exposed residues, can be selected.
The following summary statistics are included:
SumPTM: Number of PTM residues within the immunogen.
SumDisulfideBridges: Number of disulfide bond residues within the immunogen.
ProportionMembrane: Proportion of immunogen residues that are annotated as membrane residues.
ProportionDisorder: Proportion of immunogen residues that are predicted as disordered.
ProportionBinding: Proportion of immunogen residues that are predicted to bind proteins.
ProportionsHelix: Proportion of immunogen residues that are predicted to form alpha helices.
ProportionsSheet: Proportion of immunogen residues that are predicted to form beta sheets.
ProportionsCoil: Proportion of immunogen residues that are predicted to form coils.
ProportionsBuried: Proportion of immunogen residues that are predicted to be buried.
ProportionsExposed: Proportion of immunogen residues that are predicted to be solvent exposed.
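To make these statistics concrete, the following base R sketch computes a few of them on a toy per-residue dataframe and sorts the result by the fraction of exposed residues. Several things here are assumptions made for illustration: the value encoding (0/1 flags, "Exposed"/"Buried", "Helix"/"Sheet"/"Other"), the immunogen names pepA and pepB, and the helper summarizeToyImmunogen(). It is not the implementation of evaluateImmunogen().

```r
# Toy per-residue dataframe; column names follow the vignette output,
# the value encoding is assumed for this illustration
toyProtein <- data.frame(
    Position = 1:10,
    Residue = strsplit("MKTAYIAKQR", "")[[1]],
    SecondaryStructure = c("Other", "Other", "Helix", "Helix", "Helix",
                           "Helix", "Other", "Sheet", "Sheet", "Other"),
    SolventAccessibility = c("Exposed", "Exposed", "Buried", "Buried", "Exposed",
                             "Exposed", "Exposed", "Buried", "Exposed", "Exposed"),
    PTM = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
    pepA = c(0, 0, 1, 1, 1, 1, 1, 0, 0, 0),  # hypothetical immunogen, residues 3-7
    pepB = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1)   # hypothetical immunogen, residues 7-10
)

# Hypothetical helper: summarize the features of one immunogen column
summarizeToyImmunogen <- function(df, name) {
    sub <- df[df[[name]] == 1, ]
    data.frame(
        Immunogen = name,
        SumPTM = sum(sub$PTM),
        ProportionsHelix = mean(sub$SecondaryStructure == "Helix"),
        ProportionsExposed = mean(sub$SolventAccessibility == "Exposed")
    )
}

toySummary <- do.call(
    rbind,
    lapply(c("pepA", "pepB"), summarizeToyImmunogen, df = toyProtein)
)

# Sort so that the immunogen with the highest exposed fraction comes first
toySummary <- toySummary[order(toySummary$ProportionsExposed, decreasing = TRUE), ]
toySummary
```

In this toy example, pepB ranks first because three of its four residues are exposed, while pepA contains a PTM site and two buried residues. A single statistic is rarely sufficient, so in practice several columns would be considered together.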
immunogens <- evaluateImmunogen(protein)
[1] "No immunogen specified, evaluating all immunogens."
[1] "Immunogen name: ABIN2783734"
[1] "Immunogen sequence: WFPGESESFEDAHVEHSISRSLLEGEIPFPPTSILLLLACIFLIKILAASA (Residues 142 - 192)"
[1] "Immunogen name: EB07921"
[1] "Immunogen sequence: HGQKPGTHPPSELD (Residues 199 - 212)"
# check summary dataframe
DT::datatable(immunogens, width = "80%", options = list(scrollX = TRUE))
When calling getProteinFeatures(), the taxonomy ID for the protein’s species has to be set. The default is human (ID: 9606). If the protein of interest is from a different species, the correct taxonomy ID must be set as a parameter.
Bateman, Alex, Maria-Jesus Martin, Sandra Orchard, Michele Magrane, Shadab Ahmad, Emanuele Alpi, Emily H Bowler-Barnett, et al. 2022. “UniProt: The Universal Protein Knowledgebase in 2023.” Nucleic Acids Research 51 (D1): D523–D531. https://doi.org/10.1093/nar/gkac1052.
Bernhofer, Michael, Christian Dallago, Tim Karl, Venkata Satagopam, Michael Heinzinger, Maria Littmann, Tobias Olenyi, et al. 2021. “PredictProtein - Predicting Protein Structure and Function for 29 Years.” Nucleic Acids Research 49 (W1): W535–W540. https://doi.org/10.1093/nar/gkab354.
Brown, Michael C., Tony R. Joaquim, Ross Chambers, Dale V. Onisk, Fenglin Yin, Janet M. Moriango, Yichun Xu, et al. 2011. “Impact of Immunization Technology and Assay Application on Antibody Performance – a Systematic Comparative Evaluation.” PLoS ONE 6 (12): e28718. https://doi.org/10.1371/journal.pone.0028718.
Roncador, Giovanna, Pablo Engel, Lorena Maestre, Amanda P. Anderson, Jacqueline L. Cordell, Mark S. Cragg, Vladka Č. Šerbec, et al. 2015. “The European Antibody Network’s Practical Guide to Finding and Validating Suitable Antibodies for Research.” mAbs 8 (1): 27–36. https://doi.org/10.1080/19420862.2015.1100787.
Trier, N. H., P. R. Hansen, and G. Houen. 2012. “Production and Characterization of Peptide Antibodies.” Methods 56 (2): 136–44. https://doi.org/10.1016/j.ymeth.2011.12.001.
Voskuil, Jan L. A., Anita Bandrowski, C. Glenn Begley, Andrew R. M. Bradbury, Andrew D. Chalmers, Aldrin V. Gomes, Travis Hardcastle, et al. 2020. “The Antibody Society’s Antibody Validation Webinar Series.” mAbs 12 (1). https://doi.org/10.1080/19420862.2020.1794421.
Waury, Katharina, Eline A. J. Willemse, Eugeen Vanmechelen, Henrik Zetterberg, Charlotte E. Teunissen, and Sanne Abeln. 2022. “Bioinformatics Tools and Data Resources for Assay Development of Fluid Protein Biomarkers.” Biomarker Research 10 (1). https://doi.org/10.1186/s40364-022-00425-w.
sessionInfo()
R Under development (unstable) (2025-01-20 r87609)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.1 LTS
Matrix products: default
BLAS:   /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=C              
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] immunogenViewer_1.1.1 BiocStyle_2.35.0     
loaded via a namespace (and not attached):
 [1] KEGGREST_1.47.0         gtable_0.3.6            xfun_0.50              
 [4] bslib_0.9.0             ggplot2_3.5.1           httr2_1.1.0            
 [7] htmlwidgets_1.6.4       AnVILBase_1.1.0         Biobase_2.67.0         
[10] crosstalk_1.2.1         vctrs_0.6.5             rjsoncons_1.3.1        
[13] tools_4.5.0             generics_0.1.3          stats4_4.5.0           
[16] curl_6.2.0              tibble_3.2.1            AnnotationDbi_1.69.0   
[19] RSQLite_2.3.9           blob_1.2.4              pkgconfig_2.0.3        
[22] BiocBaseUtils_1.9.0     dbplyr_2.5.0            S4Vectors_0.45.2       
[25] lifecycle_1.0.4         GenomeInfoDbData_1.2.13 farver_2.1.2           
[28] compiler_4.5.0          Biostrings_2.75.3       progress_1.2.3         
[31] tinytex_0.54            munsell_0.5.1           GenomeInfoDb_1.43.4    
[34] htmltools_0.5.8.1       sass_0.4.9              yaml_2.3.10            
[37] pillar_1.10.1           crayon_1.5.3            jquerylib_0.1.4        
[40] DT_0.33                 cachem_1.1.0            magick_2.8.5           
[43] tidyselect_1.2.1        digest_0.6.37           dplyr_1.1.4            
[46] bookdown_0.42           labeling_0.4.3          UniProt.ws_2.47.6      
[49] fastmap_1.2.0           grid_4.5.0              colorspace_2.1-1       
[52] cli_3.6.3               magrittr_2.0.3          patchwork_1.3.0        
[55] withr_3.0.2             prettyunits_1.2.0       filelock_1.0.3         
[58] UCSC.utils_1.3.1        scales_1.3.0            rappdirs_0.3.3         
[61] bit64_4.6.0-1           rmarkdown_2.29          XVector_0.47.2         
[64] httr_1.4.7              bit_4.5.0.1             png_0.1-8              
[67] hms_1.1.3               memoise_2.0.1           evaluate_1.0.3         
[70] knitr_1.49              IRanges_2.41.2          BiocFileCache_2.15.1   
[73] rlang_1.1.5             Rcpp_1.0.14             glue_1.8.0             
[76] DBI_1.2.3               BiocManager_1.30.25     BiocGenerics_0.53.6    
[79] jsonlite_1.8.9          R6_2.5.1