1 Introduction

GOaGO, also spelled GO-a-GO, is an R package for Gene Ontology enrichment analysis in a given set of gene pairs. The identification of overrepresented Gene Ontology (GO) terms in a set of genes is a standard approach to obtain functional associations, e.g. to characterize the set of differentially expressed genes between treatment and control samples. Given a set of gene pairs, GO-a-GO identifies overrepresented GO terms that are shared by both genes in a pair. This makes it possible to annotate which functions are associated with gene pairs defined by a selected group of chromatin contacts, such as differential contacts between cell types or chromatin loops.

GO-a-GO calculates enrichment from a permutation test for overrepresentation of gene pairs that are associated with a shared GO term. Such gene pairs are counted for the original set of gene pairs and compared against randomized sets in which the structure of the pairs is preserved, but the gene identities (including the associated GO terms) are permuted. Therefore, we do not focus on the fact that the term is enriched in a group of genes, but that genes with this term get paired more often than expected.
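The permutation scheme described above can be sketched in base R. This is only a toy illustration under stated assumptions: the gene names, the annotation list, the gene universe, and the helper countSharing are all hypothetical, and the code is not the implementation used by the package.

```r
# Toy illustration only: genes, terms and pairs are made up.
set.seed(1)

# Hypothetical annotation: each gene maps to a set of GO-like terms.
annotation <- list(
    g1 = "T1", g2 = "T1", g3 = c("T1", "T2"), g4 = "T2",
    g5 = character(0), g6 = "T2", g7 = character(0), g8 = "T1"
)

# Hypothetical gene pairs, one row per pair.
pairs <- data.frame(
    geneID1 = c("g1", "g3", "g4", "g5"),
    geneID2 = c("g2", "g8", "g6", "g7")
)

# Count the pairs in which both genes carry the given term.
countSharing <- function(pairs, annotation, term) {
    has1 <- vapply(annotation[pairs$geneID1], function(x) term %in% x, logical(1))
    has2 <- vapply(annotation[pairs$geneID2], function(x) term %in% x, logical(1))
    sum(has1 & has2)
}

observed <- countSharing(pairs, annotation, "T1")

# Permute gene identities: the pair structure stays fixed,
# but each gene is relabelled (and so inherits another gene's terms).
genes <- names(annotation)
nperm <- 1000
permuted <- replicate(nperm, {
    relabel <- setNames(sample(genes), genes)
    shuffled <- data.frame(
        geneID1 = unname(relabel[pairs$geneID1]),
        geneID2 = unname(relabel[pairs$geneID2])
    )
    countSharing(shuffled, annotation, "T1")
})

# Empirical p-value: fraction of permutations with at least
# as many sharing pairs as observed.
pvalue <- mean(permuted >= observed)
```

In this sketch the gene universe is just the eight toy genes; which genes take part in the permutation in a real analysis is a design choice that the sketch does not address.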

2 Installation

GOaGO is an R package distributed as part of the Bioconductor project. To install GO-a-GO along with any package dependencies that are missing, use BiocManager:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("GOaGO")

3 Quick start to using GO-a-GO

First, load the package:

library(GOaGO)

3.1 Example dataset of gene pairs

We will use an example dataset based on 9,448 chromatin loops (Rao et al. 2014) identified in the human lymphoblastoid cell line GM12878. Chromatin loops are a level of 3D genome organization, formed when genomic loci (anchors) distant along the DNA double strand are in spatial proximity. They have an important role in the regulation of gene expression, bringing regulatory elements into contact with their target genes. Several architectural proteins are involved in the formation of chromatin loops (e.g. CTCF in humans). Besides acting as architectural/regulatory regions, many loop anchors encompass Transcription Start Sites of activated or repressed genes. The existence of chromatin loops can be experimentally probed by chromatin conformation capture techniques, such as Hi-C, in which the loops can be identified as peaks in the contact maps.

We can inspect a few rows of the dataset:

data("genePairsGM12878")
head(genePairsGM12878)
##    interactionID chrom1    start1      end1 geneID1      tss1 strand1 chrom2
##            <int> <char>     <int>     <int>  <char>     <int>  <char> <char>
## 1:             3  chr10 101190001 101195000    2805 101190530       -  chr10
## 2:             7  chr10 102100001 102110000    6319 102106772       +  chr10
## 3:             7  chr10 102100001 102110000    6319 102106772       +  chr10
## 4:            10  chr10 102810001 102815000   81621 102820999       +  chr10
## 5:            19  chr10 103600001 103605000   30819 103603677       -  chr10
## 6:            23  chr10 103985001 103990000   83401 103986143       +  chr10
##       start2      end2 geneID2      tss2 strand2
##        <int>     <int>  <char>     <int>  <char>
## 1: 101370001 101375000   81894 101372701       -
## 2: 102270001 102280000   25956 102276754       -
## 3: 102270001 102280000   25956 102279595       -
## 4: 102900001 102905000    3195 102891061       +
## 5: 103830001 103835000   79803 103825124       +
## 6: 104160001 104165000    5662 104169041       -

The column names ending with 1 and 2 refer to the first and second loop anchors, respectively. The essential columns are: geneID1 and geneID2 (gene identifiers from the Entrez database) and interactionID (identifier of a chromatin loop). See the section Extracting gene pairs from paired genomic regions for how these gene pairs were obtained.

Of the 9,448 chromatin loops in the original study, 1,581 overlapped at least one gene Transcription Start Site (TSS) at both loop anchors. As some loop anchors overlapped multiple TSSes, possibly of different genes, the dataset contains all combinations for these loops, yielding a total of 2,339 gene pairs:

nrow(genePairsGM12878)
## [1] 2339

From the first few rows you might have noticed that some gene pairs in this dataset are duplicated. The approach used by GO-a-GO is to take each gene pair once only, i.e. to treat gene pairs as a set, similar to taking the set of genes in existing GO enrichment approaches. Note that gene pair (A, B) is a duplicate of (B, A). Also, gene pairs (A, A) containing the same gene twice are excluded by GO-a-GO as non-informative. In this dataset, these cases correspond to chromatin loops between different TSSes of the same gene. We will later see that the dataset contains 1,743 gene pairs that are unique and do not contain the same gene twice.
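The rules above (order-independent duplicates, no pairs with the same gene twice) can be illustrated on a toy table. This is a minimal base R sketch with hypothetical gene identifiers; it is not the package's own uniqueGenePairs code.

```r
# Hypothetical input with a duplicate, a reversed duplicate and a self-pair.
pairs <- data.frame(
    geneID1 = c("A", "B", "A", "C", "D"),
    geneID2 = c("B", "A", "B", "C", "E"),
    stringsAsFactors = FALSE
)

# Drop pairs containing the same gene twice, such as (C, C).
pairs <- pairs[pairs$geneID1 != pairs$geneID2, ]

# Build an order-independent key so that (A, B) and (B, A) coincide.
key <- paste(
    pmin(pairs$geneID1, pairs$geneID2),
    pmax(pairs$geneID1, pairs$geneID2),
    sep = "|"
)

# Keep the first occurrence of every unordered pair.
uniquePairs <- pairs[!duplicated(key), ]
nrow(uniquePairs)
```

Only (A, B) and (D, E) remain in this toy example.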

3.2 Enrichment analysis of the example dataset

We will identify GO terms enriched in gene pairs associated with chromatin loops identified in human cell line GM12878. The main function GOaGO takes as its first argument a data frame with columns geneID1 and geneID2 containing gene identifiers. By setting keyType, we specify that the gene identifiers are from the Entrez database. We will further use the Bioconductor org.Hs.eg.db package as the source of Gene Ontology annotations for human genes, taking all three subontologies (Biological Process, Molecular Function, and Cellular Component) by setting ont = "ALL". We will run the default 10,000 permutations, and also use the default p-value cutoff of 0.05 with Benjamini-Hochberg correction:

library(org.Hs.eg.db)

# set the number of CPU threads to use
library(BiocParallel)
options(MulticoreParam = MulticoreParam(workers = 2))

goago <- GOaGO(genePairsGM12878,
    keyType = "ENTREZID", OrgDb = org.Hs.eg.db, ont = "ALL"
)
## Warning in uniqueGenePairs(genePairs): removing 509 duplicated gene pair(s)
## Warning in uniqueGenePairs(genePairs): removing 87 gene pair(s) containing the
## same gene twice

Note that we got warned about duplicated gene pairs and gene pairs containing the same gene twice, as discussed above. We can now inspect the enriched GO terms, treating the resulting object of class GOaGO-result as a data frame:

head(as.data.frame(goago))
##   ONTOLOGY         ID
## 1       BP GO:0040029
## 2       BP GO:0071824
## 3       BP GO:0034728
## 4       BP GO:0002227
## 5       BP GO:0065004
## 6       BP GO:0002495
##                                                               Description Count
## 1                                epigenetic regulation of gene expression    18
## 2                                        protein-DNA complex organization    16
## 3                                                 nucleosome organization    16
## 4                                        innate immune response in mucosa     6
## 5                                            protein-DNA complex assembly    16
## 6 antigen processing and presentation of peptide antigen via MHC class II     3
##     PairRatio      BgRatio FoldEnrichment pvalue p.adjust qvalue
## 1 0.010327022 6.871486e-04       15.02881      0        0      0
## 2 0.009179575 3.494550e-04       26.26826      0        0      0
## 3 0.009179575 1.390132e-04       66.03384      0        0      0
## 4 0.003442341 2.099828e-05      163.93443      0        0      0
## 5 0.009179575 2.841079e-04       32.31018      0        0      0
## 6 0.001721170 2.048193e-05       84.03361      0        0      0

The enriched terms are internally ordered by p-value. While this allows for an easy cut-off, it is not a good criterion alone. We should further focus on the terms that have the highest FoldEnrichment, and also have sufficient Count (number of input gene pairs sharing the given term).

Note that some of the p-values have the value of zero, which means that for all the randomizations fewer gene pairs were associated with a given term than in the input dataset. These values should be considered as less than 0.0001.
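To make the resolution limit concrete, here is a small sketch with made-up permutation counts. The variable names are hypothetical, and the "plus one" estimate at the end is a commonly used alternative shown only for comparison; it is not claimed to be what GO-a-GO reports.

```r
nperm <- 10000
observed <- 7

# Hypothetical counts from randomized gene pairs, all below the observed count.
permuted <- rep(c(0, 1, 2), length.out = nperm)

# Empirical p-value: fraction of permutations reaching the observed count.
pvalue <- mean(permuted >= observed)

# With nperm permutations, a reported zero only says the p-value is below 1 / nperm.
upperBound <- 1 / nperm

# A commonly used alternative estimate that never returns exactly zero.
corrected <- (sum(permuted >= observed) + 1) / (nperm + 1)
```

Running more permutations lowers this bound, at the cost of computation time.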

3.3 Plotting the enriched Gene Ontology terms

We can plot the most enriched GO terms as a dotplot. The X-axis shows the fold enrichment of the fraction of gene pairs sharing the given term, calculated as a quotient of PairRatio (the fraction in the input gene pairs) and BgRatio (the fraction in permuted ones). Point size indicates the number of input gene pairs sharing the given term, and point color – the adjusted p-value:

DotPlot(goago)

Figure 1: Dotplot of 10 most enriched Gene Ontology terms
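The fold enrichment on the X-axis can be checked by hand for the first term in the results table (GO:0040029). The sketch below assumes that PairRatio is Count divided by the number of unique gene pairs considered (1,743), which is consistent with the printed values; BgRatio is copied from the table as printed, so the result agrees only up to rounding.

```r
count <- 18            # Count for GO:0040029, from the results table
nPairs <- 1743         # unique gene pairs without self-pairs
bgRatio <- 6.871486e-04  # BgRatio as printed in the results table

# Fraction of input gene pairs sharing the term.
pairRatio <- count / nPairs

# Fold enrichment as the quotient of the two fractions;
# close to the FoldEnrichment of 15.02881 printed in the table.
foldEnrichment <- pairRatio / bgRatio
```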

Note that by default only the terms associated with at least 5 gene pairs are shown; you can change this by setting minCount to any other value. Furthermore, only the 10 most enriched terms (with the highest FoldEnrichment) are plotted by default. This can be changed by setting showCategory accordingly, for example showCategory = Inf to plot all terms.
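The selection described above can be mimicked on a plain data frame. The sketch below uses the six rows printed earlier and a hypothetical showCategory of 3; it mirrors the described filtering and ordering, and is not the plotting function's actual code.

```r
# Values copied from the results table shown earlier.
terms <- data.frame(
    ID = c(
        "GO:0040029", "GO:0071824", "GO:0034728",
        "GO:0002227", "GO:0065004", "GO:0002495"
    ),
    Count = c(18, 16, 16, 6, 16, 3),
    FoldEnrichment = c(15.02881, 26.26826, 66.03384, 163.93443, 32.31018, 84.03361)
)

minCount <- 5      # keep terms shared by at least this many gene pairs
showCategory <- 3  # hypothetical value, smaller than the default of 10

# Filter by Count, then rank by fold enrichment.
kept <- terms[terms$Count >= minCount, ]
kept <- kept[order(kept$FoldEnrichment, decreasing = TRUE), ]
top <- head(kept, showCategory)
top$ID
```

GO:0002495 has a high fold enrichment but is dropped because only 3 gene pairs share it.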

We can also see the sampling distributions of numbers of gene pairs sharing each GO term, obtained for the randomized gene pairs. From these distributions, empirical p-values were calculated:

RidgePlot(goago)
## Picking joint bandwidth of 0.0478

Figure 2: Ridgeplot of 10 most enriched Gene Ontology terms

The low p-values follow from these distributions: a p-value of zero means that none of the randomized sets reached the number of sharing gene pairs observed in the input dataset.

3.4 Comparison to unpaired Gene Ontology enrichment

GO-a-GO identifies terms that get preferentially paired in a set of gene pairs, addressing a complementary problem to the identification of enriched terms in a set of genes. To confirm that the two approaches indeed yield different results, we will perform a comparison to GO term enrichment of the genes at loop anchors, without using the gene pair information. For this unpaired approach, we will use the enrichGO function from the widely used clusterProfiler package (Yu et al. 2012).

First, load the required packages:

library(clusterProfiler)
library(ggplot2)
library(ggrepel)

To extract the set of genes, we will go back to the example dataset of gene pairs. Note that for a fair comparison we will exclude gene pairs (A, A), as these pairs are ignored by GO-a-GO.

genes <- with(
    subset(genePairsGM12878, geneID1 != geneID2),
    unique(c(geneID1, geneID2))
)

ego <- enrichGO(genes,
    keyType = "ENTREZID", OrgDb = org.Hs.eg.db, ont = "ALL"
)

We can plot the most enriched GO terms identified by enrichGO as a dotplot:

DotPlot(ego)

Figure 3: Dotplot of 10 most enriched Gene Ontology terms, using unpaired approach by enrichGO

At first glance, the most enriched terms in this unpaired approach are very different from the ones identified by GO-a-GO. This impression could be misleading, since we are plotting only the top 10 terms. For a more detailed comparison, we will need more permissive (relaxed) versions of both enrichment analyses.

goago_relaxed <- GOaGO(genePairsGM12878,
    keyType = "ENTREZID", OrgDb = org.Hs.eg.db, ont = "ALL",
    pvalueCutoff = Inf, qvalueCutoff = Inf
)
## Warning in uniqueGenePairs(genePairs): removing 509 duplicated gene pair(s)
## Warning in uniqueGenePairs(genePairs): removing 87 gene pair(s) containing the
## same gene twice

ego_relaxed <- enrichGO(genes,
    keyType = "ENTREZID", OrgDb = org.Hs.eg.db, ont = "ALL",
    pvalueCutoff = Inf, qvalueCutoff = Inf
)

In these two objects we have calculated statistics for 738 Gene Ontology terms that are shared by at least one gene pair, and separately for 7,923 terms that are associated with at least one gene. We will merge them and flag the enriched terms according to the previous results.

df <- merge(as.data.frame(goago_relaxed), as.data.frame(ego_relaxed),
    by = c("ONTOLOGY", "ID", "Description"), suffixes = c(".GOaGO", ".enrichGO")
)

df$signif.GOaGO <- df$ID %in% as.data.frame(goago)$ID
df$signif.enrichGO <- df$ID %in% ego$ID

df$enrichment <- ifelse(df$signif.GOaGO,
    ifelse(df$signif.enrichGO, "goago_and_ego", "goago_only"),
    ifelse(df$signif.enrichGO, "ego_only", "insignificant")
)
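One detail of the merge step is worth spelling out: base R merge keeps, by default, only the rows whose key appears in both tables, and the suffixes argument disambiguates columns with the same name. The toy tables below are hypothetical and only illustrate this behaviour.

```r
# Hypothetical result tables: T3 was tested only in the unpaired analysis.
paired <- data.frame(ID = c("T1", "T2"), FoldEnrichment = c(20, 5))
unpaired <- data.frame(ID = c("T1", "T2", "T3"), FoldEnrichment = c(1.5, 0.8, 3))

# Default merge is an inner join: T3 is dropped.
merged <- merge(paired, unpaired,
    by = "ID", suffixes = c(".GOaGO", ".enrichGO")
)
names(merged)
nrow(merged)

# all = TRUE would keep T3, with NA in the GO-a-GO column.
mergedAll <- merge(paired, unpaired,
    by = "ID", suffixes = c(".GOaGO", ".enrichGO"), all = TRUE
)
```

As a consequence, a merged table of this kind contains only the terms reported by both analyses.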

We can now compare the fold enrichment calculated using the two approaches. In the plot below, point size indicates the number of input gene pairs sharing the given term, and point color – enrichment status. For better visibility, we use log scale for both axes.

col_labels <- c(
    goago_and_ego = "GO-a-GO and enrichGO",
    goago_only = "GO-a-GO (paired) only",
    ego_only = "enrichGO (unpaired) only",
    insignificant = "none of the above"
)

col_values <- c(
    goago_and_ego = "#6a3d9a",
    goago_only = "#e31a1c",
    ego_only = "#1f78b4",
    insignificant = "gray80"
)

ggplot(
    df,
    aes(
        x = log2(FoldEnrichment.enrichGO), y = log2(FoldEnrichment.GOaGO),
        size = Count.GOaGO, color = enrichment
    )
) +
    geom_vline(xintercept = 0, lty = 2) +
    geom_hline(yintercept = 0, lty = 2) +
    geom_point(alpha = 0.3) +
    scale_size_area() +
    scale_color_manual("Term enrichment", labels = col_labels, values = col_values) +
    DOSE::theme_dose(12)

Figure 4: Gene Ontology term enrichment using unpaired approach by enrichGO (X-axis) and paired approach by GO-a-GO (Y-axis)

Perhaps surprisingly, the vast majority of the plotted terms were called enriched by GO-a-GO. This is because goago_relaxed contains only the terms that are shared by at least one gene pair.

To get a better view of the most frequent annotations, we will plot only the terms associated with at least 5 gene pairs, and show term descriptions as point labels. To repel overlapping labels, we use geom_text_repel from the ggrepel package, which also draws segments between points and their labels located further away.

ggplot(
    subset(df, Count.GOaGO >= 5),
    aes(
        x = log2(FoldEnrichment.enrichGO), y = log2(FoldEnrichment.GOaGO),
        size = Count.GOaGO, color = enrichment
    )
) +
    geom_vline(xintercept = 0, lty = 2) +
    geom_point(alpha = 0.3) +
    geom_text_repel(aes(label = label_wrap_gen(50)(Description)), show.legend = FALSE, size = 3, lineheight = 0.9) +
    scale_size_area() +
    scale_color_manual("Term enrichment", labels = col_labels, values = col_values) +
    DOSE::theme_dose(12)

Figure 5: Gene Ontology term enrichment using unpaired approach by enrichGO (X-axis) and paired approach by GO-a-GO (Y-axis), showing only the terms shared by at least 5 gene pairs

Many of the terms enriched in the paired approach by GO-a-GO were not enriched in the unpaired approach by enrichGO. What is more, some of the terms reported as enriched by GO-a-GO are actually depleted when annotating the gene set without the information on gene pairs. Explaining this discrepancy requires further analysis and biological interpretation.

3.5 Extracting gene pairs associated with terms of interest

For follow-up analysis, it is often useful to extract the enriched Gene Ontology terms along with the gene pairs sharing them. The function termGenePairs returns a data frame with all this information, including gene symbols fetched from the OrgDb object. For example, we can list the gene pairs associated with olfactory receptor activity:

tgp <- termGenePairs(goago, org.Hs.eg.db)
subset(tgp, ID == "GO:0004984") # olfactory receptor activity
##    ONTOLOGY         ID                 Description Count   PairRatio
##      <char>     <char>                      <char> <int>       <num>
## 1:       MF GO:0004984 olfactory receptor activity     7 0.004016064
## 2:       MF GO:0004984 olfactory receptor activity     7 0.004016064
## 3:       MF GO:0004984 olfactory receptor activity     7 0.004016064
## 4:       MF GO:0004984 olfactory receptor activity     7 0.004016064
## 5:       MF GO:0004984 olfactory receptor activity     7 0.004016064
## 6:       MF GO:0004984 olfactory receptor activity     7 0.004016064
## 7:       MF GO:0004984 olfactory receptor activity     7 0.004016064
##         BgRatio FoldEnrichment pvalue p.adjust qvalue pairID interactionID
##           <num>          <num>  <num>    <num>  <num>  <int>         <int>
## 1: 6.052783e-05       66.35071      0        0      0    141           849
## 2: 6.052783e-05       66.35071      0        0      0    145           861
## 3: 6.052783e-05       66.35071      0        0      0    153           872
## 4: 6.052783e-05       66.35071      0        0      0    162           899
## 5: 6.052783e-05       66.35071      0        0      0    164           902
## 6: 6.052783e-05       66.35071      0        0      0    370          1834
## 7: 6.052783e-05       66.35071      0        0      0    740          3444
##    chrom1    start1      end1 geneID1      tss1 strand1 chrom2    start2
##    <char>     <int>     <int>  <char>     <int>  <char> <char>     <int>
## 1:  chr11   4655001   4660000  390038   4660945       +  chr11   5025001
## 2:  chr11  48380001  48390000  403257  48373999       -  chr11  50010001
## 3:  chr11   5140001   5150000  390054   5153872       -  chr11   5400001
## 4:  chr11   5495001   5500000  390066   5509915       +  chr11   5830001
## 5:  chr11  55360001  55365000  219429  55371874       -  chr11  55430001
## 6:   chr1 248400001 248410000   26245 248402231       +   chr1 248850001
## 7:  chr17   2970001   2980000    8386   2966901       -  chr17   3130001
##         end2 geneID2      tss2 strand2 geneSymbol1 geneSymbol2
##        <int>  <char>     <int>  <char>      <char>      <char>
## 1:   5030000  119682   5020213       +      OR51D1      OR51L1
## 2:  50020000  283093  50004071       -      OR4C45      OR4C12
## 3:   5410000  390059   5410607       +      OR52A5      OR51M1
## 4:   5835000  390077   5841566       +      OR52D1      OR52N2
## 5:  55435000  219432  55432643       +      OR4C11       OR4C6
## 6: 248860000  401994 248845605       -       OR2M4      OR14I1
## 7:   3140000  653166   3143970       +       OR1D5       OR1D4

As can be seen in the last two columns, these gene pairs involve several genes from the olfactory receptor gene family, which, incidentally, is the largest gene family in the human genome. Olfactory receptor genes are located in clusters at multiple loci distributed across 18 human chromosomes. In olfactory sensory neurons, these gene clusters are brought together to form specific inter-chromosomal contacts (Monahan, Horta, and Lomvardas 2019). The enrichment observed above suggests that some contacts within olfactory receptor gene clusters may have a role in non-olfactory tissues.

3.6 Obtaining additional statistics

We can calculate the number of chromatin loops from the original study that overlapped at least one gene TSS at both loop anchors (should be 1,581):

length(unique(genePairsGM12878$interactionID))
## [1] 1581

and the number of input gene pairs that are unique and do not contain the same gene twice (should be 1,743):

nrow(genePairs(goago))
## [1] 1743

Further information is also reported by the method show, implicitly used here:

goago
## #
## # Gene Ontology over-representation test in a set of gene pairs
## #
## #...@organism     Homo sapiens 
## #...@ontology     ALL 
## #...@keyType      ENTREZID 
## #
## #...1743 gene pairs considered, 10000 permutations
## #...identified GO terms shared by at least 1 gene pair
## #...p-values adjusted by 'BH' with cutoff < 0.05
## #
## #...293 enriched terms found
## 'data.frame':    293 obs. of  10 variables:
##  $ ONTOLOGY      : chr  "BP" "BP" "BP" "BP" ...
##  $ ID            : chr  "GO:0040029" "GO:0071824" "GO:0034728" "GO:0002227" ...
##  $ Description   : chr  "epigenetic regulation of gene expression" "protein-DNA complex organization" "nucleosome organization" "innate immune response in mucosa" ...
##  $ Count         : int  18 16 16 6 16 3 17 3 4 16 ...
##  $ PairRatio     : num  0.01033 0.00918 0.00918 0.00344 0.00918 ...
##  $ BgRatio       : num  0.000687 0.000349 0.000139 0.000021 0.000284 ...
##  $ FoldEnrichment: num  15 26.3 66 163.9 32.3 ...
##  $ pvalue        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ p.adjust      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ qvalue        : num  0 0 0 0 0 0 0 0 0 0 ...

4 Extracting gene pairs from paired genomic regions

We started this vignette with an example dataset of gene pairs. These pairs are often defined by a set of chromatin interactions, such as chromatin loops (as is the case here), differential contacts or spatial gene clusters. What these scenarios have in common is that the underlying genomic regions need to be annotated to obtain gene pairs.

We will use the Bioconductor package rtracklayer to read a file in BEDPE format containing genomic coordinates of the 9,448 chromatin loops discussed before:

fpath <- system.file("extdata", "GM12878_loops.bedpe.gz", package = "GOaGO")
pairs <- rtracklayer::import(fpath, genome = "hg19")
pairs
## Pairs object with 9448 pairs and 2 metadata columns:
##                              first                    second |        name
##                          <GRanges>                 <GRanges> | <character>
##      [1] chr10:100225001-100230000 chr10:100420001-100425000 |        <NA>
##      [2] chr10:100225001-100230000 chr10:101005001-101010000 |        <NA>
##      [3] chr10:101190001-101195000 chr10:101370001-101375000 |        <NA>
##      [4] chr10:101190001-101200000 chr10:101470001-101480000 |        <NA>
##      [5] chr10:101600001-101605000 chr10:101805001-101810000 |        <NA>
##      ...                       ...                       ... .         ...
##   [9444]      chrX:9845001-9850000    chrX:10085001-10090000 |        <NA>
##   [9445]      chrX:9845001-9850000      chrX:9960001-9965000 |        <NA>
##   [9446]      chrX:9960001-9965000    chrX:10085001-10090000 |        <NA>
##   [9447]    chrX:99760001-99770000  chrX:100020001-100030000 |        <NA>
##   [9448]    chrX:99940001-99945000  chrX:100020001-100025000 |        <NA>
##              score
##          <numeric>
##      [1]         0
##      [2]         0
##      [3]         0
##      [4]         0
##      [5]         0
##      ...       ...
##   [9444]         0
##   [9445]         0
##   [9446]         0
##   [9447]         0
##   [9448]         0

The first and second genomic regions shown above correspond to the first and second loop anchors. These paired regions will now be associated with the relevant genes. We will use the transcript database from the annotation package TxDb.Hsapiens.UCSC.hg19.knownGene. For more informative associations we will take only the transcripts of protein-coding genes. This can be done by applying a filter that requires the coding DNA sequence (CDS) strand to be "-" or "+", i.e. not NA:

library(TxDb.Hsapiens.UCSC.hg19.knownGene)
transcripts <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene,
    columns = "gene_id", filter = list(cds_strand = c("-", "+"))
)

From these transcripts, only the Transcription Start Site (TSS) position will be used for annotation. Each genomic region (loop anchor) will be associated with all the TSSes it overlaps. If there is no such overlap, the region will be associated with the closest TSS within a distance of 10 kb, if there is one. As some regions overlap multiple TSSes, possibly of different genes, the resulting annotation contains all the combinations:

genePairs <- annotateInteractions(pairs, transcripts, maxDistanceToTSS = 10e3)
genePairs
##        interactionID chrom1    start1      end1 geneID1      tss1 strand1
##                <int> <char>     <int>     <int>  <char>     <int>  <char>
##     1:             3  chr10 101190001 101195000    2805 101190381       -
##     2:             7  chr10 102100001 102110000    6319 102106990       +
##     3:             7  chr10 102100001 102110000    6319 102106990       +
##     4:             7  chr10 102100001 102110000    6319 102106990       +
##     5:             7  chr10 102100001 102110000    6319 102106990       +
##    ---                                                                   
## 12278:          9434   chrX  80060001  80070000  254065  80065376       -
## 12279:          9434   chrX  80060001  80070000  254065  80065376       -
## 12280:          9435   chrX  80065001  80070000  254065  80065376       -
## 12281:          9435   chrX  80065001  80070000  254065  80065376       -
## 12282:          9435   chrX  80065001  80070000  254065  80065376       -
##        chrom2    start2      end2 geneID2      tss2 strand2
##        <char>     <int>     <int>  <char>     <int>  <char>
##     1:  chr10 101370001 101375000   81894 101379740       -
##     2:  chr10 102270001 102280000   25956 102279616       -
##     3:  chr10 102270001 102280000   25956 102279618       -
##     4:  chr10 102270001 102280000   25956 102279595       -
##     5:  chr10 102270001 102280000   25956 102276752       -
##    ---                                                     
## 12278:   chrX  80450001  80460000   79366  80457412       -
## 12279:   chrX  80450001  80460000   79366  80457385       -
## 12280:   chrX  80375001  80380000   79366  80377187       -
## 12281:   chrX  80375001  80380000   79366  80377137       -
## 12282:   chrX  80375001  80380000   79366  80377182       -

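The resulting table has the same columns as the dataset genePairsGM12878. Note that the output shown above has 12,282 rows, whereas genePairsGM12878 has 2,339, so the two tables should not be assumed to be identical. Also note that the information on keyType is extracted from transcripts and propagated to gene pairs: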

attr(genePairs, "keyType")
## [1] "ENTREZID"

so that we do not need to specify keyType when running the GOaGO function on this object.
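The annotation rule described in this section (all overlapping TSSes first, otherwise the closest TSS within a maximum distance) can be sketched for a single region in base R. The helper annotateRegion is hypothetical and ignores chromosomes, strands and tie handling; it is not the code behind annotateInteractions.

```r
# Hypothetical helper: annotate one region given TSS positions on the same chromosome.
annotateRegion <- function(start, end, tss, maxDistance = 10e3) {
    # All TSSes located inside the region.
    overlapping <- tss[tss >= start & tss <= end]
    if (length(overlapping) > 0) {
        return(overlapping)
    }
    # Otherwise, distance from each TSS to the nearest edge of the region.
    distance <- pmax(start - tss, tss - end)
    nearest <- which.min(distance)
    if (length(nearest) == 1 && distance[nearest] <= maxDistance) {
        return(tss[nearest])
    }
    # No TSS close enough: return an empty vector.
    tss[0]
}

# Region overlapping two of three made-up TSS positions.
inside <- annotateRegion(100, 200, c(150, 180, 5000))

# Region without overlap; the closest made-up TSS is 5,999 bp away.
nearby <- annotateRegion(102810001, 102815000, c(102820999, 102900000))

# Region with no TSS within 10 kb.
none <- annotateRegion(100, 200, 50000)
```

Gene pairs then arise from combining every gene found at the first anchor with every gene found at the second anchor of the same interaction.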

For further manipulation of paired genomic regions, and specifically chromatin contacts, Bioconductor packages GenomicInteractions and mariner can be useful. In particular, the function annotateInteractions can take objects of class GenomicInteractions.

5 Citing GO-a-GO

We hope that GO-a-GO will be useful for your research. Please use the following information to cite the package and the overall approach.

citation("GOaGO")
## To cite package 'GOaGO' in publications use:
## 
##   Jankowski A (2026). _GOaGO: Gene Ontology enrichment analysis of gene
##   pairs_. doi:10.18129/B9.bioc.GOaGO
##   <https://doi.org/10.18129/B9.bioc.GOaGO>. R package version 1.1.0,
##   <https://bioconductor.org/packages/GOaGO>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {GOaGO: Gene Ontology enrichment analysis of gene pairs},
##     author = {Aleksander Jankowski},
##     year = {2026},
##     note = {R package version 1.1.0},
##     url = {https://bioconductor.org/packages/GOaGO},
##     doi = {10.18129/B9.bioc.GOaGO},
##   }

6 Session information

## R version 4.6.0 RC (2026-04-17 r89917)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.24-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.22.1
##  [2] GenomicFeatures_1.65.0                  
##  [3] BSgenome.Hsapiens.UCSC.hg19_1.4.3       
##  [4] BSgenome_1.81.0                         
##  [5] rtracklayer_1.73.0                      
##  [6] BiocIO_1.23.0                           
##  [7] Biostrings_2.81.0                       
##  [8] XVector_0.53.0                          
##  [9] GenomicRanges_1.65.0                    
## [10] Seqinfo_1.3.0                           
## [11] ggrepel_0.9.8                           
## [12] ggplot2_4.0.3                           
## [13] clusterProfiler_4.21.0                  
## [14] BiocParallel_1.47.0                     
## [15] org.Hs.eg.db_3.23.1                     
## [16] AnnotationDbi_1.75.0                    
## [17] IRanges_2.47.0                          
## [18] S4Vectors_0.51.0                        
## [19] Biobase_2.73.0                          
## [20] BiocGenerics_0.59.0                     
## [21] generics_0.1.4                          
## [22] GOaGO_1.1.0                             
## [23] BiocStyle_2.41.0                        
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3          jsonlite_2.0.0             
##   [3] tidydr_0.0.6                magrittr_2.0.5             
##   [5] ggtangle_0.1.2              magick_2.9.1               
##   [7] farver_2.1.2                rmarkdown_2.31             
##   [9] fs_2.1.0                    vctrs_0.7.3                
##  [11] Rsamtools_2.29.0            memoise_2.0.1              
##  [13] RCurl_1.98-1.18             ggtree_4.3.0               
##  [15] tinytex_0.59                S4Arrays_1.13.0            
##  [17] htmltools_0.5.9             curl_7.1.0                 
##  [19] SparseArray_1.13.0          gridGraphics_0.5-1         
##  [21] sass_0.4.10                 bslib_0.10.0               
##  [23] htmlwidgets_1.6.4           plyr_1.8.9                 
##  [25] httr2_1.2.2                 cachem_1.1.0               
##  [27] GenomicAlignments_1.49.0    igraph_2.3.0               
##  [29] lifecycle_1.0.5             pkgconfig_2.0.3            
##  [31] Matrix_1.7-5                R6_2.6.1                   
##  [33] fastmap_1.2.0               gson_0.1.0                 
##  [35] MatrixGenerics_1.25.0       digest_0.6.39              
##  [37] aplot_0.2.9                 enrichplot_1.33.0          
##  [39] ggnewscale_0.5.2            patchwork_1.3.2            
##  [41] aisdk_1.1.0                 ps_1.9.3                   
##  [43] RSQLite_2.4.6               labeling_0.4.3             
##  [45] abind_1.4-8                 httr_1.4.8                 
##  [47] polyclip_1.10-7             compiler_4.6.0             
##  [49] bit64_4.8.0                 fontquiver_0.2.1           
##  [51] withr_3.0.2                 S7_0.2.2                   
##  [53] DBI_1.3.0                   ggforce_0.5.0              
##  [55] MASS_7.3-65                 DelayedArray_0.39.0        
##  [57] rappdirs_0.3.4              rjson_0.2.23               
##  [59] tools_4.6.0                 otel_0.2.0                 
##  [61] ape_5.8-1                   scatterpie_0.2.6           
##  [63] glue_1.8.1                  restfulr_0.0.16            
##  [65] callr_3.7.6                 nlme_3.1-169               
##  [67] GOSemSim_2.39.0             grid_4.6.0                 
##  [69] cluster_2.1.8.2             reshape2_1.4.5             
##  [71] gtable_0.3.6                tidyr_1.3.2                
##  [73] data.table_1.18.2.1         pillar_1.11.1              
##  [75] stringr_1.6.0               yulab.utils_0.2.4          
##  [77] splines_4.6.0               dplyr_1.2.1                
##  [79] tweenr_2.0.3                treeio_1.37.0              
##  [81] lattice_0.22-9              bit_4.6.0                  
##  [83] tidyselect_1.2.1            fontLiberation_0.1.0       
##  [85] GO.db_3.23.1                knitr_1.51                 
##  [87] fontBitstreamVera_0.1.1     bookdown_0.46              
##  [89] SummarizedExperiment_1.43.0 xfun_0.57                  
##  [91] matrixStats_1.5.0           stringi_1.8.7              
##  [93] UCSC.utils_1.9.0            lazyeval_0.2.3             
##  [95] ggfun_0.2.0                 yaml_2.3.12                
##  [97] cigarillo_1.3.0             evaluate_1.0.5             
##  [99] codetools_0.2-20            gdtools_0.5.0              
## [101] tibble_3.3.1                qvalue_2.45.0              
## [103] BiocManager_1.30.27         ggplotify_0.1.3            
## [105] cli_3.6.6                   systemfonts_1.3.2          
## [107] processx_3.9.0              jquerylib_0.1.4            
## [109] dichromat_2.0-0.1           Rcpp_1.1.1-1.1             
## [111] GenomeInfoDb_1.49.0         png_0.1-9                  
## [113] XML_3.99-0.23               parallel_4.6.0             
## [115] blob_1.3.0                  DOSE_4.7.0                 
## [117] bitops_1.0-9                tidytree_0.4.7             
## [119] ggiraph_0.9.6               enrichit_0.1.4             
## [121] scales_1.4.0                ggridges_0.5.7             
## [123] purrr_1.2.2                 crayon_1.5.3               
## [125] rlang_1.2.0                 KEGGREST_1.53.0

Bibliography

Monahan, Kevin, Adan Horta, and Stavros Lomvardas. 2019. “LHX2- and LDB1-Mediated Trans Interactions Regulate Olfactory Receptor Choice.” Nature 565 (7740): 448–53. https://doi.org/10.1038/s41586-018-0845-0.

Rao, Suhas S. P., Miriam H. Huntley, Neva C. Durand, Elena K. Stamenova, Ivan D. Bochkov, James T. Robinson, Adrian L. Sanborn, et al. 2014. “A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping.” Cell 159 (7): 1665–80. https://doi.org/10.1016/j.cell.2014.11.021.

Yu, Guangchuang, Li-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters.” Omics: A Journal of Integrative Biology 16 (5): 284–87. https://doi.org/10.1089/omi.2011.0118.