OmniPath is a database of molecular signaling knowledge, combining data from more than 100 resources. It contains protein-protein and gene regulatory interactions, enzyme-PTM relationships, protein complexes, annotations about protein function, structure, localization and intercellular communication. OmniPath focuses on networks with directed interactions and effect signs (activation or inhibition) which are suitable inputs for many modeling techniques. OmniPath also features a large collection of proteins’ intercellular communication roles and interactions. OmniPath is distributed by a web service at https://omnipathdb.org/. The Bioconductor package OmnipathR is an R client with full support for all features of the OmniPath web server. Apart from OmniPath, it provides direct access to more than 15 further signaling databases (such as BioPlex, InBioMap, EVEX, Harmonizome, etc) and contains a number of convenience methods, such as igraph integration, and a close integration with the NicheNet pipeline for ligand activity prediction from transcriptomics data. In this demo we show the diverse data in OmniPath and the versatile and convenient ways to access this data by OmnipathR.
OmnipathR 3.12.4
Database knowledge is essential for omics data analysis and modeling. Despite being an important factor, contributing to the outcome of studies, often subject to little attention. With OmniPath our aim is to raise awarness of the diversity of available resources and facilitate access to these resources in a uniform and transparent way. OmniPath has been developed in a close contact to mechanistic modeling applications and functional omics analysis, hence it is especially suitable for these fields. OmniPath has been used for the analysis of various omics data. In the Saez-Rodriguez group we often use it in a pipeline with our footprint based methods DoRothEA and PROGENy and our causal reasoning method CARNIVAL to infer signaling mechanisms from transcriptomics data.
One recent novelty of OmniPath is a collection of intercellular communication interactions. Apart from simply merging data from existing resources, OmniPath defines a number of intercellular communication roles, such as ligand, receptor, adhesion, enzyme, matrix, etc, and generalizes the terms ligand and receptor by introducing the terms transmitter, receiver and mediator. This unique knowledge base is especially adequate for the emerging field of cell-cell communication analysis, typically from single cell transcriptomics, but also from other kinds of data.
No special pre-requisites apart from basic knowledge of R. OmniPath, the database resource in the focus of this workshop has been published in [1,2], however you don’t need to know anything about OmniPath to benefit from the workshop. In the workshop we will demonstrate the R/Bioconductor package OmnipathR. If you would like to try the examples yourself we recommend to install the latest version of the package before the workshop:
library(devtools)
install_github('saezlab/OmnipathR')In the workshop we will present the design and some important features of the OmniPath database, so can be confident you get the most out of it. Then we will demonstrate further useful features of the OmnipathR package, such as accessing other resources, building graphs. Participants are encouraged to experiment with the examples and shape the contents of the workshop by asking questions. We are happy to recieve questions and topic suggestions by email also before the workshop. These could help us to adjust the contents to the interests of the participants.
Total: 45 minutes
| Activity | Time | 
|---|---|
| OmniPath database overview | 5m | 
| Network datasets | 10m | 
| Other OmniPath databases | 5m | 
| Intercellular communication | 10m | 
| Igraph integration | 5m | 
| Further resources | 10m | 
In this workshop you will get familiar with the design and features of the OmniPath databases. For example, to know some important details about the datasets and parameters which help you to query the database the most suitable way according to your purposes. You will also learn about functionalities of the OmnipathR package which might make your work easier.
library(OmnipathR)OmniPath consists of five major databases, each combining many original resources. The five databases are:
The parameters for each database (query type) are available in the web service, for example: https://omnipathdb.org/queries/interactions. The R package supports all features of the web service and the parameter names and values usually correspond to the web service parameters which you would use in a HTTP query string.
The network database contains protein-protein, gene regulatory and miRNA-mRNA interactions. Soon more interaction types will be added. Some of these categories can be further divided into datasets which are defined by the type of evidences. A full list of network datasets:
Not individual interactions but resource are classified into the datasets
above, so these can overlap. Each interaction type and dataset has its
dedicated function in OmnipathR, above we provide links to their help
pages. As an example, let’s see the gene regulatory interactions:
gri <- import_transcriptional_interactions()
gri## # A tibble: 128,218 × 16
##    source                target source_genesymbol target_genesymbol is_directed is_stimulation is_inhibition
##    <chr>                 <chr>  <chr>             <chr>                   <dbl>          <dbl>         <dbl>
##  1 P01106                O14746 MYC               TERT                        1              1             0
##  2 P17947                P02818 SPI1              BGLAP                       1              1             0
##  3 COMPLEX:P15407_P17275 P05412 FOSL1_JUNB        JUN                         1              1             0
##  4 COMPLEX:P01100_P05412 P05412 FOS_JUN           JUN                         1              1             0
##  5 COMPLEX:P01100_P17275 P05412 FOS_JUNB          JUN                         1              1             0
##  6 COMPLEX:P15408_P17535 P05412 FOSL2_JUND        JUN                         1              1             0
##  7 COMPLEX:P05412_P15408 P05412 FOSL2_JUN         JUN                         1              1             0
##  8 COMPLEX:P05412        P05412 JUN               JUN                         1              1             0
##  9 COMPLEX:P17275_P53539 P05412 FOSB_JUNB         JUN                         1              1             0
## 10 COMPLEX:P17275        P05412 JUNB              JUN                         1              1             0
## # ℹ 128,208 more rows
## # ℹ 9 more variables: consensus_direction <dbl>, consensus_stimulation <dbl>, consensus_inhibition <dbl>,
## #   sources <chr>, references <chr>, curation_effort <dbl>, dorothea_level <chr>, n_references <dbl>,
## #   n_resources <int>The interaction data frame contains the UniProt IDs and Gene Symbols of the interacting partners, the list of resources and references (PubMed IDs) for each interaction, and whether the interaction is directed, stimulatory or inhibitory.
The network data frames can be converted to igraph graph objects, so you can make use of the graph and visualization methods of igraph:
gr_graph <- interaction_graph(gri)
gr_graph## IGRAPH 3b14b62 DN-- 15679 128218 -- 
## + attr: name (v/c), up_ids (v/c), is_directed (e/n), is_stimulation (e/n), is_inhibition (e/n),
## | consensus_direction (e/n), consensus_stimulation (e/n), consensus_inhibition (e/n), sources
## | (e/x), references (e/x), curation_effort (e/n), dorothea_level (e/c), n_references (e/n),
## | n_resources (e/n)
## + edges from 3b14b62 (vertex names):
##  [1] MYC       ->TERT  SPI1      ->BGLAP FOSL1_JUNB->JUN   FOS_JUN   ->JUN   FOS_JUNB  ->JUN  
##  [6] FOSL2_JUND->JUN   FOSL2_JUN ->JUN   JUN       ->JUN   FOSB_JUNB ->JUN   JUNB      ->JUN  
## [11] FOSL2_JUNB->JUN   JUND      ->JUN   JUN_JUND  ->JUN   JUNB_JUND ->JUN   FOSL1_JUND->JUN  
## [16] FOSL1_JUN ->JUN   FOSB_JUND ->JUN   FOSB_JUN  ->JUN   JUN_JUNB  ->JUN   FOS_JUND  ->JUN  
## [21] SMAD3     ->JUN   SMAD4     ->JUN   STAT5A    ->IL2   STAT5B    ->IL2   RELA      ->FAS  
## + ... omitted several edgesOn this network we can use OmnipathR’s find_all_paths function, which
is able to look up all paths up to a certain length between two set of
nodes:
paths <- find_all_paths(
    graph = gr_graph,
    start = c('EGFR', 'STAT3'),
    end = c('AKT1', 'ULK1'),
    attr = 'name'
)As this is a gene regulatory network, the paths are TFs regulating the transcription of other TFs.
Enzyme-substrate interactions are also available also in the interactions query, but the enzyme-substrate query type provides additional information about the PTM types and residues.
enz_sub <- import_omnipath_enzsub()
enz_sub## # A tibble: 43,269 × 12
##    enzyme substrate enzyme_genesymbol substrate_genesymbol residue_type residue_offset modification    sources
##    <chr>  <chr>     <chr>             <chr>                <chr>                 <dbl> <chr>           <chr>  
##  1 P06239 O14543    LCK               SOCS3                Y                       204 phosphorylation KEA;MI…
##  2 P06239 O14543    LCK               SOCS3                Y                       221 phosphorylation KEA;MI…
##  3 P12931 O14746    SRC               TERT                 Y                       707 phosphorylation BEL-La…
##  4 P06241 O15117    FYN               FYB1                 Y                       651 phosphorylation HPRD;K…
##  5 P06241 O15117    FYN               FYB1                 Y                       595 phosphorylation HPRD;K…
##  6 P06241 O15117    FYN               FYB1                 Y                       697 phosphorylation HPRD;K…
##  7 P06241 O15117    FYN               FYB1                 Y                       625 phosphorylation Phosph…
##  8 P06241 O15117    FYN               FYB1                 Y                       571 phosphorylation Phosph…
##  9 P06241 O15117    FYN               FYB1                 Y                       771 phosphorylation Phosph…
## 10 P06241 O15117    FYN               FYB1                 Y                       559 phosphorylation Phosph…
## # ℹ 43,259 more rows
## # ℹ 4 more variables: references <chr>, curation_effort <dbl>, n_references <dbl>, n_resources <int>This data frame also can be converted to an igraph object:
es_graph <- enzsub_graph(enz_sub)
es_graph## IGRAPH bce7b8e DN-- 4793 43269 -- 
## + attr: name (v/c), up_ids (v/c), residue_type (e/c), residue_offset (e/n), modification (e/c),
## | sources (e/x), references (e/x), curation_effort (e/n), n_references (e/n), n_resources (e/n)
## + edges from bce7b8e (vertex names):
##  [1] LCK  ->SOCS3  LCK  ->SOCS3  SRC  ->TERT   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1  
##  [8] FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1  
## [15] FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1   FYN  ->FYB1  
## [22] ABL1 ->PLSCR1 ABL1 ->PLSCR1 SRC  ->PLSCR1 SRC  ->PLSCR1 ABL1 ->TP73   CDK2 ->TP73   CHEK1->TP73  
## [29] AURKB->BIRC5  AURKB->BIRC5  AURKB->BIRC5  CDK1 ->BIRC5  PDPK1->PDPK1  PDPK1->PDPK1  PDPK1->PDPK1 
## [36] PDPK1->PDPK1  PDPK1->PDPK1  PDPK1->PDPK1  PDPK1->PDPK1  PDPK1->PDPK1  PDPK1->PDPK1  PDPK1->PDPK1 
## [43] PDPK1->PDPK1  PDPK1->PDPK1  SRC  ->PDPK1  SRC  ->PDPK1  SRC  ->PDPK1  SRC  ->PDPK1  SRC  ->PDPK1 
## + ... omitted several edgesIt is also possible to add effect signs (stimulatory or inhibitory) to enzyme-PTM relationships:
es_signed <- get_signed_ptms(enz_sub)cplx <- import_omnipath_complexes()
cplx## # A tibble: 35,459 × 7
##    name         components                 components_genesymbols stoichiometry sources references identifiers
##    <chr>        <chr>                      <chr>                  <chr>         <chr>   <chr>      <chr>      
##  1 NFY          P23511_P25208_Q13952       NFYA_NFYB_NFYC         1:1:1         CORUM;… 9372932;1… Compleat:H…
##  2 mTORC2       P42345_P68104_P85299_Q6R3… DEPTOR_EEF1A1_MLST8_M… 0:0:0:0:0:0   SIGNOR  <NA>       SIGNOR:SIG…
##  3 mTORC1       P42345_Q8N122_Q8TB45_Q96B… AKT1S1_DEPTOR_MLST8_M… 0:0:0:0:0     SIGNOR  <NA>       SIGNOR:SIG…
##  4 SCF-betaTRCP P63208_Q13616_Q9Y297       BTRC_CUL1_SKP1         1:1:1         CORUM;… 9990852    Compleat:H…
##  5 CBP/p300     Q09472_Q92793              CREBBP_EP300           0:0           SIGNOR  <NA>       SIGNOR:SIG…
##  6 P300/PCAF    Q09472_Q92793_Q92831       CREBBP_EP300_KAT2B     0:0:0         SIGNOR  <NA>       SIGNOR:SIG…
##  7 SMAD2/SMAD4  Q13485_Q15796              SMAD2_SMAD4            1:2           Comple… 12923550;… PDB:1u7v;S…
##  8 SMAD3/SMAD4  P84022_Q13485              SMAD3_SMAD4            2:1           Comple… 12923550;… PDB:1U7F;P…
##  9 SMAD4/JUN    P05412_Q13485              JUN_SMAD4              0:0           SIGNOR  <NA>       SIGNOR:SIG…
## 10 SMAD2/SMURF2 Q15796_Q9HAU4              SMAD2_SMURF2           1:1           Comple… 11389444   Compleat:H…
## # ℹ 35,449 more rowsThe resulted data frame provides the constitution and stoichiometry of protein complexes, with references.
The annotations query type includes a diverse set of resources (about 60 of them), about protein function, localization, structure and expression. For most use cases it is better to convert the data into wide data frames, as these correspond to the original format of the resources. If you load more than one resources into wide data frames, the result will be a list of data frames, otherwise one data frame. See a few examples with localization data from UniProt, tissue expression from Human Protein Atlas and pathway information from SignaLink:
uniprot_loc <- import_omnipath_annotations(
    resources = 'UniProt_location',
    wide = TRUE
)
uniprot_loc## # A tibble: 67,934 × 5
##    uniprot genesymbol entity_type location                       features                                    
##    <chr>   <chr>      <chr>       <chr>                          <chr>                                       
##  1 Q96JT2  SLC45A3    protein     Membrane                       Multi-pass membrane protein                 
##  2 Q9UP95  SLC12A4    protein     Membrane                       Multi-pass membrane protein                 
##  3 Q08357  SLC20A2    protein     Cell membrane                  Multi-pass membrane protein                 
##  4 O94855  SEC24D     protein     Cytoplasm                      <NA>                                        
##  5 O94855  SEC24D     protein     Endoplasmic reticulum membrane Peripheral membrane protein;Cytoplasmic side
##  6 O94855  SEC24D     protein     COPII-coated vesicle membrane  Peripheral membrane protein;Cytoplasmic side
##  7 O94855  SEC24D     protein     Cytosol                        <NA>                                        
##  8 O94855  SEC24D     protein     Cytoplasmic vesicle            <NA>                                        
##  9 Q8N2U9  SLC66A2    protein     Membrane                       Multi-pass membrane protein                 
## 10 Q96CW6  SLC7A6OS   protein     Cytoplasm                      <NA>                                        
## # ℹ 67,924 more rowsThe entity_type field can be protein, mirna or complex. Protein complexes
mostly annotated based on the consensus of their members, we should be aware
that this is an in silico inference.
In case of spelling mistake either in parameter names or values OmnipathR
either corrects the mistake or gives a warning or error:
uniprot_loc <- import_omnipath_annotations(
    resources = 'Uniprot_location',
    wide = TRUE
)## Warning in omnipath_check_param(.): The following resources are not available: Uniprot_location. Check the
## resource names for spelling mistakes.Above the name of the resource is wrong. If the parameter name is wrong, it throws an error:
uniprot_loc <- import_omnipath_annotations(
    resuorces = 'UniProt_location',
    wide = TRUE
)## Error in import_omnipath_annotations(resuorces = "UniProt_location", wide = TRUE): Downloading the entire annotations database is not allowed by default because of its huge size (>1GB). If you really want to do that, you find static files at https://archive.omnipathdb.org/. However we recommend to query a set of proteins or a few resources, depending on your interest.Singular vs. plural forms and a few synonyms are automatically corrected:
uniprot_loc <- import_omnipath_annotations(
    resource = 'UniProt_location',
    wide = TRUE
)Another example with tissue expression from Human Protein Atlas:
hpa_tissue <- import_omnipath_annotations(
    resources = 'HPA_tissue',
    wide = TRUE,
    # Limiting to a handful of proteins for a faster vignette build:
    proteins = c('DLL1', 'MEIS2', 'PHOX2A', 'BACH1', 'KLF11', 'FOXO3', 'MEFV')
)
hpa_tissue## # A tibble: 529 × 15
##    uniprot genesymbol entity_type organ     tissue level status prognostic favourable pathology n_not_detected
##    <chr>   <chr>      <chr>       <chr>     <chr>  <chr> <chr>  <lgl>      <lgl>      <lgl>              <dbl>
##  1 O43524  FOXO3      protein     placenta  endot… Not … Suppo… FALSE      FALSE      FALSE                 NA
##  2 O43524  FOXO3      protein     cerebral… glial… Not … Suppo… FALSE      FALSE      FALSE                 NA
##  3 O43524  FOXO3      protein     lung      alveo… High  Suppo… FALSE      FALSE      FALSE                 NA
##  4 O43524  FOXO3      protein     urotheli… uroth… Medi… <NA>   FALSE      TRUE       TRUE                   3
##  5 O43524  FOXO3      protein     seminal … gland… Medi… Suppo… FALSE      FALSE      FALSE                 NA
##  6 O43524  FOXO3      protein     skin 1    fibro… Not … Suppo… FALSE      FALSE      FALSE                 NA
##  7 O43524  FOXO3      protein     colon     gland… Medi… Suppo… FALSE      FALSE      FALSE                 NA
##  8 O43524  FOXO3      protein     skeletal… myocy… Medi… Suppo… FALSE      FALSE      FALSE                 NA
##  9 O43524  FOXO3      protein     colon     endot… Medi… Suppo… FALSE      FALSE      FALSE                 NA
## 10 O43524  FOXO3      protein     testis    Leydi… High  Suppo… FALSE      FALSE      FALSE                 NA
## # ℹ 519 more rows
## # ℹ 4 more variables: n_low <dbl>, n_medium <dbl>, n_high <dbl>, score <dbl>And pathway annotations from SignaLink:
slk_pathw <- import_omnipath_annotations(
    resources = 'SignaLink_pathway',
    wide = TRUE
)
slk_pathw## # A tibble: 2,487 × 4
##    uniprot genesymbol entity_type pathway                 
##    <chr>   <chr>      <chr>       <chr>                   
##  1 P20963  CD247      protein     T-cell receptor         
##  2 P43403  ZAP70      protein     Receptor tyrosine kinase
##  3 P43403  ZAP70      protein     T-cell receptor         
##  4 Q9NYJ8  TAB2       protein     Toll-like receptor      
##  5 Q9NYJ8  TAB2       protein     Innate immune pathways  
##  6 Q9NYJ8  TAB2       protein     JAK/STAT                
##  7 Q9NYJ8  TAB2       protein     Receptor tyrosine kinase
##  8 O43318  MAP3K7     protein     TNF pathway             
##  9 O43318  MAP3K7     protein     T-cell receptor         
## 10 O43318  MAP3K7     protein     Receptor tyrosine kinase
## # ℹ 2,477 more rowsAnnotations can be easily added to network data frames, in this case both the source and target nodes will have their annotation data. This function accepts either the name of an annotation resource or an annotation data frame:
network <- import_omnipath_interactions()
network_slk_pw <- annotated_network(network, 'SignaLink_pathway')
network_slk_pw## # A tibble: 116,712 × 17
##    source target source_genesymbol target_genesymbol is_directed is_stimulation is_inhibition
##    <chr>  <chr>  <chr>             <chr>                   <dbl>          <dbl>         <dbl>
##  1 P0DP23 P48995 CALM1             TRPC1                       1              0             1
##  2 P0DP25 P48995 CALM3             TRPC1                       1              0             1
##  3 P0DP24 P48995 CALM2             TRPC1                       1              0             1
##  4 Q03135 P48995 CAV1              TRPC1                       1              1             0
##  5 P14416 P48995 DRD2              TRPC1                       1              1             0
##  6 Q99750 P48995 MDFI              TRPC1                       1              0             1
##  7 Q14571 P48995 ITPR2             TRPC1                       1              1             0
##  8 P29966 P48995 MARCKS            TRPC1                       1              0             1
##  9 Q13255 P48995 GRM1              TRPC1                       1              1             0
## 10 Q13586 P48995 STIM1             TRPC1                       1              1             0
## # ℹ 116,702 more rows
## # ℹ 10 more variables: consensus_direction <dbl>, consensus_stimulation <dbl>, consensus_inhibition <dbl>,
## #   sources <chr>, references <chr>, curation_effort <dbl>, n_references <int>, n_resources <int>,
## #   pathway_source <chr>, pathway_target <chr>The intercell database assigns roles to proteins such as ligand, receptor,
adhesion, transporter, ECM, etc. The design of this database is far from
being simple, best is to check the description in the recent OmniPath paper
[1].
ic <- import_omnipath_intercell()
ic## # A tibble: 332,447 × 15
##    category     parent database scope aspect source uniprot genesymbol entity_type consensus_score transmitter
##    <chr>        <chr>  <chr>    <chr> <chr>  <chr>  <chr>   <chr>      <chr>                 <dbl> <lgl>      
##  1 transmembra… trans… UniProt… gene… locat… resou… Q8N661  TMEM86B    protein                   4 FALSE      
##  2 transmembra… trans… UniProt… gene… locat… resou… Q8IWU2  LMTK2      protein                   7 FALSE      
##  3 transmembra… trans… UniProt… gene… locat… resou… P41273  TNFSF9     protein                   7 FALSE      
##  4 transmembra… trans… UniProt… gene… locat… resou… Q9Y661  HS3ST4     protein                   4 FALSE      
##  5 transmembra… trans… UniProt… gene… locat… resou… Q9UPX0  IGSF9B     protein                   5 FALSE      
##  6 transmembra… trans… UniProt… gene… locat… resou… Q9NYV7  TAS2R16    protein                   5 FALSE      
##  7 transmembra… trans… UniProt… gene… locat… resou… P01911  HLA-DRB1   protein                   8 FALSE      
##  8 transmembra… trans… UniProt… gene… locat… resou… Q6P9B9  INTS5      protein                   4 FALSE      
##  9 transmembra… trans… UniProt… gene… locat… resou… P05496  ATP5MC1    protein                   5 FALSE      
## 10 transmembra… trans… UniProt… gene… locat… resou… P55344  LIM2       protein                   4 FALSE      
## # ℹ 332,437 more rows
## # ℹ 4 more variables: receiver <lgl>, secreted <lgl>, plasma_membrane_transmembrane <lgl>,
## #   plasma_membrane_peripheral <lgl>This data frame is about individual proteins. To create a network of
intercellular communication, we provide the import_intercell_network
function:
icn <- import_intercell_network(high_confidence = TRUE)
icn## NULLThe result is similar to the annotated_network, each interacting partner
has its intercell annotations. In the intercell database, OmniPath aims to
ship all available information, which means it might contain quite some
false positives. The high_confidence option is a shortcut to stringent
filter settings based on the number and consensus of provenances. Using
instead the filter_intercell_network function, you can have a fine control
over the quality filters. It has many options which are described in the
manual.
icn <- import_intercell_network()
icn_hc <- filter_intercell_network(
    icn,
    ligand_receptor = TRUE,
    consensus_percentile = 30,
    loc_consensus_percentile = 50,
    simplify = TRUE
)The filter_intecell function does a similar procedure on an intercell
annotation data frame.
The list of available resources for each query type can be retrieved
by the get_..._resources function. For example, the annotation resources
are:
get_annotation_resources()##  [1] "Adhesome"             "Almen2009"            "Baccin2019"           "CORUM_Funcat"        
##  [5] "CORUM_GO"             "CSPA"                 "CSPA_celltype"        "CancerDrugsDB"       
##  [9] "CancerGeneCensus"     "CancerSEA"            "CellCall"             "CellCellInteractions"
## [13] "CellChatDB"           "CellChatDB_complex"   "CellPhoneDB"          "CellPhoneDB_complex" 
## [17] "CellTalkDB"           "CellTypist"           "Cellinker"            "Cellinker_complex"   
## [21] "ComPPI"               "CytoSig"              "DGIdb"                "DisGeNet"            
## [25] "EMBRACE"              "Exocarta"             "GO_Intercell"         "GPCRdb"              
## [29] "Guide2Pharma"         "HGNC"                 "HPA_secretome"        "HPA_subcellular"     
## [33] "HPA_tissue"           "HPMR"                 "HumanCellMap"         "ICELLNET"            
## [37] "ICELLNET_complex"     "IntOGen"              "Integrins"            "InterPro"            
## [41] "KEGG-PC"              "Kirouac2010"          "LOCATE"               "LRdb"                
## [45] "Lambert2018"          "MCAM"                 "MSigDB"               "Matrisome"           
## [49] "MatrixDB"             "Membranome"           "NetPath"              "OPM"                 
## [53] "PROGENy"              "PanglaoDB"            "Phobius"              "Phosphatome"         
## [57] "Ramilowski2015"       "Ramilowski_location"  "SIGNOR"               "SignaLink_function"  
## [61] "SignaLink_pathway"    "Surfaceome"           "TCDB"                 "TFcensus"            
## [65] "TopDB"                "UniProt_family"       "UniProt_keyword"      "UniProt_location"    
## [69] "UniProt_tissue"       "UniProt_topology"     "Vesiclepedia"         "Wang"                
## [73] "Zhong2015"            "connectomeDB2020"     "iTALK"                "kinase.com"          
## [77] "scConnect"            "scConnect_complex"    "talklr"Categories in the intercell query also can be listed:
get_intercell_generic_categories()##  [1] "transmembrane"                       "transmembrane_predicted"            
##  [3] "peripheral"                          "plasma_membrane"                    
##  [5] "plasma_membrane_transmembrane"       "plasma_membrane_regulator"          
##  [7] "plasma_membrane_peripheral"          "secreted"                           
##  [9] "cell_surface"                        "ecm"                                
## [11] "ligand"                              "receptor"                           
## [13] "secreted_enzyme"                     "secreted_peptidase"                 
## [15] "extracellular"                       "intracellular"                      
## [17] "receptor_regulator"                  "secreted_receptor"                  
## [19] "sparc_ecm_regulator"                 "ecm_regulator"                      
## [21] "ligand_regulator"                    "cell_surface_ligand"                
## [23] "cell_adhesion"                       "matrix_adhesion"                    
## [25] "adhesion"                            "matrix_adhesion_regulator"          
## [27] "cell_surface_enzyme"                 "cell_surface_peptidase"             
## [29] "secreted_enyzme"                     "extracellular_peptidase"            
## [31] "secreted_peptidase_inhibitor"        "transporter"                        
## [33] "ion_channel"                         "ion_channel_regulator"              
## [35] "gap_junction"                        "tight_junction"                     
## [37] "adherens_junction"                   "desmosome"                          
## [39] "intracellular_intercellular_related"# get_intercell_categories() # this would show also the specific categoriesAn increasing number of other resources (currently around 20) can be directly
accessed by OmnipathR (not from the omnipathdb.org domain, but from their
original providers). As an example,
OmnipathR uses UniProt data to translate identifiers. You may find a list
of the available identifiers in the manual page of translate_ids function.
The evaluation of the parameters is tidyverse style, and both UniProt’s
notation and a simple internal notation can be used. Furthermore, it can
handle vectors, data frames or list of vectors.
d <- data.frame(uniprot_id = c('P00533', 'Q9ULV1', 'P43897', 'Q9Y2P5'))
d <- translate_ids(
    d,
    uniprot_id = uniprot, # the source ID type and column name
    genesymbol # the target ID type using OmniPath's notation
)
d## NULLIt is possible to have one source ID type and column in one call, but multiple target ID types and columns: to translate a network, two calls are necessary. Note: certain functionality fails recently due to changes in other packages, will be fixed in a few days.
network <- import_omnipath_interactions()
network <- translate_ids(
    network,
    source = uniprot_id,
    source_entrez = entrez
)
network <- translate_ids(
    network,
    target = uniprot_id,
    target_entrez = entrez
)OmnipathR is able to look up ancestors and descendants in ontology trees,
and also exposes the ontology tree in three different formats: as a
data frame, as a list of lists or as an igraph graph object. All these
can have two directions: child-to-parent (c2p) or parent-to-child (p2c).
go <- go_ontology_download()
go$rel_tbl_c2p## NULLTo convert the relations to list or graph format, use the
relations_table_to_list or relations_table_to_graph functions. To
swap between c2p and p2c use the swap_relations function.
go_graph <- relations_table_to_graph(go$rel_tbl_c2p)
go_graph## NULLIt can also translate term IDs to term names:
ontology_ensure_name('GO:0000022')## NULLThe first call takes a few seconds as it loads the database, subsequent calls are faster.
OmnipathR features a logging facility, a YML configuration file and
a cache directory. By default the highest level messages are printed to
the console, and you can browse the full log from R by calling
omnipath_log(). The cache can be controlled by a number of functions,
for example you can search for cache files by omnipath_cache_search(),
and delete them by omnipath_cache_remove:
omnipath_cache_search('dorothea')## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$key
## [1] "8e1fed15bbe7704374f40d278e719e18b4a9d60f"
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$url
## [1] "https://omnipathdb.org/interactions?genesymbols=yes&datasets=dorothea,tf_target,collectri&organisms=9606&dorothea_levels=A,B&fields=sources,references,curation_effort,dorothea_level&license=academic"
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$post
## list()
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$payload
## list()
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$ext
## [1] "rds"
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$versions
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$versions$`1`
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$versions$`1`$number
## [1] "1"
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$versions$`1`$path
## [1] "/home/biocbuild/.cache/OmnipathR/8e1fed15bbe7704374f40d278e719e18b4a9d60f-1.rds"
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$versions$`1`$dl_started
## [1] "2024-10-02 18:51:15 EDT"
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$versions$`1`$status
## [1] "ready"
## 
## $`8e1fed15bbe7704374f40d278e719e18b4a9d60f`$versions$`1`$dl_finished
## [1] "2024-10-02 18:51:15 EDT"The configuration can be set by options, all options are prefixed with
omnipath., and can be saved by omnipath_save_config. For example, to
exclude all OmniPath resources which don’t allow for-profit use:
options(omnipath.license = 'commercial')The internal state is contained by the omnipath.env environment.
Find more examples in the other vignettes and the manual. For example, the
NicheNet vignette presents the integratation between OmnipathR and
nichenetr, a method for prediction of ligand-target gene connections.
Another Bioconductor package wppi is able to add context specific scores
to networks, based on genes of interest, functional annotations and network
proximity (random walks with restart). The new paths vignette presents
some approaches to construct pathways from networks. The design of the
OmniPath database is described in our recent paper [1], while an in depth
analysis of the pathway resources is available in the first OmniPath
paper [2].
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB             
##  [4] LC_COLLATE=C               LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
## [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] OmnipathR_3.12.4 BiocStyle_2.32.1
## 
## loaded via a namespace (and not attached):
##  [1] tidyr_1.3.1         rappdirs_0.3.3      sass_0.4.9          utf8_1.2.4          generics_0.1.3     
##  [6] xml2_1.3.6          stringi_1.8.4       hms_1.1.3           digest_0.6.37       magrittr_2.0.3     
## [11] evaluate_1.0.0      timechange_0.3.0    bookdown_0.40       fastmap_1.2.0       cellranger_1.1.0   
## [16] jsonlite_1.8.9      processx_3.8.4      progress_1.2.3      backports_1.5.0     chromote_0.3.1     
## [21] ps_1.8.0            promises_1.3.0      BiocManager_1.30.25 httr_1.4.7          rvest_1.0.4        
## [26] selectr_0.4-2       purrr_1.0.2         fansi_1.0.6         jquerylib_0.1.4     cli_3.6.3          
## [31] rlang_1.1.4         crayon_1.5.3        bit64_4.5.2         withr_3.0.1         cachem_1.1.0       
## [36] yaml_2.3.10         parallel_4.4.1      tools_4.4.1         tzdb_0.4.0          checkmate_2.3.2    
## [41] dplyr_1.1.4         curl_5.2.3          vctrs_0.6.5         logger_0.3.0        R6_2.5.1           
## [46] lifecycle_1.0.4     lubridate_1.9.3     stringr_1.5.1       bit_4.5.0           vroom_1.6.5        
## [51] pkgconfig_2.0.3     pillar_1.9.0        bslib_0.8.0         later_1.3.2         glue_1.8.0         
## [56] Rcpp_1.0.13         xfun_0.47           tibble_3.2.1        tidyselect_1.2.1    knitr_1.48         
## [61] websocket_1.4.2     htmltools_0.5.8.1   igraph_2.0.3        rmarkdown_2.28      readr_2.1.5        
## [66] compiler_4.4.1      prettyunits_1.2.0   readxl_1.4.3[1] D Turei, A Valdeolivas, L Gul, N Palacio-Escat, M Klein, O Ivanova, M Olbei, A Gabor, F Theis, D Modos, T Korcsmaros and J Saez-Rodriguez (2021) Integrated intra- and intercellular signaling knowledge for multicellular omics analysis. Molecular Systems Biology 17:e9923
[2] D Turei, T Korcsmaros and J Saez-Rodriguez (2016) OmniPath: guidelines and gateway for literature-curated signaling pathway resources. Nature Methods 13(12)