MungeSumstats: Getting started
If you use the MungeSumstats package, please cite
The MungeSumstats package is designed to facilitate the standardisation of GWAS summary statistics, as utilised in our Nature Genetics paper [1].
The package is designed to handle the lack of standardisation of output files by the GWAS community. A group has now manually standardised many GWAS (see ieugwasr and gwasvcf), but because a lot of GWAS remain closed access, these repositories are not all-encompassing.
The GWAS-Download project has collated summary statistics from 200+ GWAS. This repository has been utilised to identify the most common formats, all of which can be standardised with MungeSumstats.
Moreover, there is an emerging standard of VCF format for summary statistics files with multiple, useful, associated R packages such as vcfR. However, there is currently no method to convert VCF formats to a standardised format that matches older approaches.
The MungeSumstats package standardises both VCF and the most common summary statistic file formats to enable downstream integration and analysis.
MungeSumstats also offers comprehensive Quality Control (QC) steps which are important prerequisites for downstream analysis like Linkage disequilibrium score regression (LDSC) and MAGMA.
Moreover, MungeSumstats is efficiently written, so all reformatting and quality control checks complete in minutes for GWAS summary statistics with 500k SNPs on a standard desktop machine. This speed can be increased further by increasing the number of threads (nThread) for data.table to use.
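As a rough illustration of the threading option, the sketch below wraps format_sumstats() in a helper that caps the thread count at the number of available cores. The helper names choose_threads and munge_multithreaded, and the default cap of 4, are assumptions made for this example; only the nThread argument comes from the package.

```r
# Pick a thread count between 1 and `max_threads`, bounded by the cores available.
choose_threads <- function(max_threads = 4L, cores = parallel::detectCores()) {
  if (is.na(cores)) cores <- 1L  # detectCores() can return NA on some systems
  max(1L, min(as.integer(max_threads), as.integer(cores)))
}

# Hypothetical wrapper: munge a file using the chosen number of data.table threads.
munge_multithreaded <- function(path, ref_genome = NULL, max_threads = 4L) {
  MungeSumstats::format_sumstats(
    path = path,
    ref_genome = ref_genome,  # NULL lets MungeSumstats infer the build
    nThread = choose_threads(max_threads)
  )
}

# Example usage (requires MungeSumstats and the reference genome packages):
# path <- system.file("extdata", "eduAttainOkbay.txt", package = "MungeSumstats")
# munge_multithreaded(path, ref_genome = "GRCh37")
```

Whether extra threads help depends on file size; for the small example files shipped with the package the difference is likely negligible.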
Currently MungeSumstats only works on data from humans, as it uses human-based genome references.
MungeSumstats will ensure that all essential columns for analysis are present and syntactically correct. Generally, summary statistic files include (but are not limited to) the columns:
MungeSumstats uses a mapping file to infer the inputted column names (run data("sumstatsColHeaders") to view these). This mapping file is far more comprehensive than that of any other publicly available munging tool, containing more than 200 unique mappings at the time of writing this vignette. However, if your column headers are missing or if you want to change the mapping, you can do so by passing your own mapping file (see the mapping_file argument of format_sumstats()).
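A custom mapping can be built by extending the default one. The sketch below assumes the mapping is a two-column data frame with columns named Uncorrected and Corrected (run data("sumstatsColHeaders") to confirm this for your installed version); the helper name add_header_mapping and the header "MY_PVALUE" are hypothetical.

```r
# Append one extra header mapping to an existing mapping table.
# Headers are upper-cased here on the assumption that matching is done
# against upper-case column names.
add_header_mapping <- function(mapping, uncorrected, corrected) {
  extra <- data.frame(
    Uncorrected = toupper(uncorrected),
    Corrected = corrected,
    stringsAsFactors = FALSE
  )
  rbind(mapping, extra)
}

# Example usage (requires MungeSumstats; "my_sumstats.tsv" is a placeholder):
# data("sumstatsColHeaders", package = "MungeSumstats")
# my_mapping <- add_header_mapping(sumstatsColHeaders, "my_pvalue", "P")
# MungeSumstats::format_sumstats(path = "my_sumstats.tsv",
#                                mapping_file = my_mapping)
```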
MungeSumstats offers unmatched levels of quality control to ensure, for example, consistency of allele assignment and direction of effects. Tests run by MungeSumstats include:
Users can specify which checks to run on their data. A note on the allele flipping check: MungeSumstats infers that the effect allele will always be the A2 allele. This is the approach taken for IEU GWAS VCF and as such has also been adopted here. The inference is first made from the inputted file’s column headers; the allele flipping check then verifies it by comparing A1, which should be the reference allele, to the reference genome. If a SNP’s A1 DNA base doesn’t match the reference genome but its A2 (which should be the alternative allele) does, the alleles will be flipped along with the effect information (e.g. Beta, Odds Ratio, signed summary statistics, FRQ, Z-score*).
*By default the Z-score is assumed to be calculated from the effect size, not the P-value, and so will be flipped. This can be changed by the user.
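The flipping rule described above can be sketched in a few lines of base R. This is only an illustration on made-up data, not the package's implementation: the REF column stands in for a lookup against the reference genome, and the function name flip_alleles is hypothetical.

```r
# Toy summary statistics. REF is a hypothetical reference-genome base per SNP.
ss <- data.frame(
  SNP  = c("rs1", "rs2"),
  A1   = c("A", "G"),
  A2   = c("G", "T"),
  BETA = c(0.02, -0.05),
  FRQ  = c(0.30, 0.80),
  REF  = c("A", "T"),
  stringsAsFactors = FALSE
)

flip_alleles <- function(ss) {
  # Flip only where A1 disagrees with the reference but A2 agrees with it.
  flip <- ss$A1 != ss$REF & ss$A2 == ss$REF
  old_a1 <- ss$A1
  ss$A1[flip] <- ss$A2[flip]
  ss$A2[flip] <- old_a1[flip]
  # Effect information follows the alleles: the sign of BETA reverses
  # and the frequency becomes that of the other allele.
  ss$BETA[flip] <- -ss$BETA[flip]
  ss$FRQ[flip] <- 1 - ss$FRQ[flip]
  ss
}

flipped <- flip_alleles(ss)
```

Here rs1 already has the reference base as A1 and is left alone, while rs2 has its alleles swapped, its BETA sign reversed and its FRQ replaced by 1 - FRQ.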
If a test is failed, the user will be notified and, if possible, the input will be corrected. The QC steps from the checks above can also be adjusted to suit the user’s analysis; see MungeSumstats::format_sumstats.
MungeSumstats can handle VCF, txt, tsv, csv file types or .gz/.bgz versions of these file types. The package also gives the user the flexibility to export the reformatted file as tab-delimited, VCF or R native objects such as data.table, GRanges or VRanges objects. The output can also be written in an LDSC-ready format, which means the file can be fed directly into LDSC without the need for additional munging.
The MungeSumstats package contains small subsets of GWAS summary statistics files. Firstly, on Educational Attainment by Okbay et al 2016: PMID: 27898078 PMCID: PMC5509058 DOI: 10.1038/ng1216-1587b.
Secondly, a VCF file (VCFv4.2) relating to the GWAS Amyotrophic lateral sclerosis from ieu open GWAS project. Dataset: ebi-a-GCST005647: https://gwas.mrcieu.ac.uk/datasets/ebi-a-GCST005647/
These datasets will be used to showcase MungeSumstats functionality.
MungeSumstats is available on Bioconductor. To install the package from Bioconductor, run the following lines of code:
if (!require("BiocManager")) install.packages("BiocManager")
BiocManager::install("MungeSumstats")
Once installed, load the package:
library(MungeSumstats)
## Warning: replacing previous import 'utils::findMatches' by
## 'S4Vectors::findMatches' when loading 'AnnotationDbi'
To standardise the summary statistics’ file format, simply call format_sumstats(), passing in the path to your summary statistics file, or directly pass the summary statistics as a dataframe or datatable. You can specify which genome build was used in the GWAS (GRCh37 or GRCh38) or, as default, infer the genome build from the data. The reference genome is used for multiple checks, like deriving missing data such as SNP/BP/CHR/A1/A2, and for QC steps like removing non-biallelic SNPs, strand-ambiguous SNPs or ensuring correct allele and direction of SNP effects. The function call can return the path to the reformatted summary statistics file, the user can specify a location to save the file, or the user can return an R native object for the data: a data.table, VRanges or GRanges object.
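As a minimal sketch of passing in-memory data rather than a file path, the wrapper below hands a data frame to format_sumstats() through its path argument and asks for a data.table back. The wrapper name munge_in_memory is hypothetical; return_data and return_format are the arguments described later in this vignette.

```r
munge_in_memory <- function(sumstats_df, ref_genome = NULL) {
  stopifnot(is.data.frame(sumstats_df))
  MungeSumstats::format_sumstats(
    path = sumstats_df,        # a data.frame/data.table instead of a file path
    ref_genome = ref_genome,   # NULL lets MungeSumstats infer the build
    return_data = TRUE,
    return_format = "data.table"
  )
}

# Example usage (requires MungeSumstats and the reference genome packages):
# path <- system.file("extdata", "eduAttainOkbay.txt", package = "MungeSumstats")
# munged <- munge_in_memory(data.table::fread(path), ref_genome = "GRCh37")
```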
Note that a reference genome is used for a number of the checks employed by MungeSumstats. If your GWAS summary statistics file of interest relates to GRCh38, you will need to install SNPlocs.Hsapiens.dbSNP144.GRCh38 and BSgenome.Hsapiens.NCBI.GRCh38 from Bioconductor as follows:
BiocManager::install("SNPlocs.Hsapiens.dbSNP144.GRCh38")
BiocManager::install("BSgenome.Hsapiens.NCBI.GRCh38")
If your GWAS summary statistics file of interest relates to GRCh37, you will need to install SNPlocs.Hsapiens.dbSNP144.GRCh37 and BSgenome.Hsapiens.1000genomes.hs37d5 from Bioconductor as follows:
BiocManager::install("SNPlocs.Hsapiens.dbSNP144.GRCh37")
BiocManager::install("BSgenome.Hsapiens.1000genomes.hs37d5")
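These may take some time to install and are not included in the package as some users may only need one of GRCh37/GRCh38. The SNPlocs package you install should match the dbSNP version you run with (the run below, for example, loads SNPlocs.Hsapiens.dbSNP155.GRCh37).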
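Because these reference packages are large, it can be worth checking which ones are already present before installing. The helper below is a small base R sketch; the function name missing_packages is hypothetical, and the package names are the ones listed above.

```r
# Return the subset of `pkgs` that cannot be loaded in this R installation.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

grch37_pkgs <- c("SNPlocs.Hsapiens.dbSNP144.GRCh37",
                 "BSgenome.Hsapiens.1000genomes.hs37d5")
to_install <- missing_packages(grch37_pkgs)

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # BiocManager::install(to_install)
}
```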
The Educational Attainment GWAS summary statistics file by Okbay et al. is saved as a text document in the package’s external data folder, so we can pass its file path straight to MungeSumstats.
NOTE - By default, formatted results will be saved to tempdir(). This means all formatted summary stats will be deleted upon ending the R session if not copied to a local file path. To keep formatted summary stats, change save_path (e.g. file.path('./formatted', basename(path))), or make sure to copy files elsewhere after processing (e.g. file.copy(save_path, './formatted/')).
eduAttainOkbayPth <- system.file("extdata","eduAttainOkbay.txt",
package="MungeSumstats")
reformatted <-
MungeSumstats::format_sumstats(path=eduAttainOkbayPth,
ref_genome="GRCh37")
##
##
## ******::NOTE::******
## - Formatted results will be saved to `tempdir()` by default.
## - This means all formatted summary stats will be deleted upon ending the R session.
## - To keep formatted summary stats, change `save_path` ( e.g. `save_path=file.path('./formatted',basename(path))` ), or make sure to copy files elsewhere after processing ( e.g. `file.copy(save_path, './formatted/' )`.
## ********************
## Formatted summary statistics will be saved to ==> /tmp/RtmpF89jhr/file1933a924bab13c.tsv.gz
## Warning: replacing previous import 'utils::findMatches' by
## 'S4Vectors::findMatches' when loading 'SNPlocs.Hsapiens.dbSNP155.GRCh37'
## Importing tabular file: /tmp/RtmpXxCZeS/Rinst1921314e69dfbd/MungeSumstats/extdata/eduAttainOkbay.txt
## Checking for empty columns.
## Standardising column headers.
## First line of summary statistics file:
## MarkerName CHR POS A1 A2 EAF Beta SE Pval
## Summary statistics report:
## - 93 rows
## - 93 unique variants
## - 70 genome-wide significant variants (P<5e-8)
## - 20 chromosomes
## Checking for multi-GWAS.
## Checking for multiple RSIDs on one row.
## Checking SNP RSIDs.
## Checking for merged allele column.
## Checking A1 is uppercase
## Checking A2 is uppercase
## Checking for incorrect base-pair positions
## Checking for missing data.
## Checking for duplicate columns.
## Checking for duplicated rows.
## INFO column not available. Skipping INFO score filtering step.
## Filtering SNPs, ensuring SE>0.
## Ensuring all SNPs have N<5 std dev above mean.
## Removing 'chr' prefix from CHR.
## Making X/Y/MT CHR uppercase.
## Warning: When method is an integer, must be >0.
## 47 SNPs (50.5%) have FRQ values > 0.5. Conventionally the FRQ column is intended to show the minor/effect allele frequency.
## The FRQ column was mapped from one of the following from the inputted summary statistics file:
## FRQ, EAF, FREQUENCY, FRQ_U, F_U, MAF, FREQ, FREQ_TESTED_ALLELE, FRQ_TESTED_ALLELE, FREQ_EFFECT_ALLELE, FRQ_EFFECT_ALLELE, EFFECT_ALLELE_FREQUENCY, EFFECT_ALLELE_FREQ, EFFECT_ALLELE_FRQ, A1FREQ, A1FRQ, A2FREQ, A2FRQ, ALLELE_FREQUENCY, ALLELE_FREQ, ALLELE_FRQ, AF, MINOR_AF, EFFECT_AF, A2_AF, EFF_AF, ALT_AF, ALTERNATIVE_AF, INC_AF, A_2_AF, TESTED_AF, AF1, ALLELEFREQ, ALT_FREQ, EAF_HRC, EFFECTALLELEFREQ, FREQ.A1.1000G.EUR, FREQ.A1.ESP.EUR, FREQ.ALLELE1.HAPMAPCEU, FREQ.B, FREQ1, FREQ1.HAPMAP, FREQ_EUROPEAN_1000GENOMES, FREQ_HAPMAP, FREQ_TESTED_ALLELE_IN_HRS, FRQ_A1, FRQ_U_113154, FRQ_U_31358, FRQ_U_344901, FRQ_U_43456, POOLED_ALT_AF, AF_ALT, AF.ALT, AF-ALT, ALT.AF, ALT-AF, A2.AF, A2-AF, AF.EFF, AF_EFF, AF_EFF
## As frq_is_maf=TRUE, the FRQ column will not be renamed. If the FRQ values were intended to represent major allele frequency,
## set frq_is_maf=FALSE to rename the column as MAJOR_ALLELE_FRQ and differentiate it from minor/effect allele frequency.
## Sorting coordinates with 'data.table'.
## Writing in tabular format ==> /tmp/RtmpF89jhr/file1933a924bab13c.tsv.gz
## Summary statistics report:
## - 93 rows (100% of original 93 rows)
## - 93 unique variants
## - 70 genome-wide significant variants (P<5e-8)
## - 20 chromosomes
## Done munging in 0.004 minutes.
## Successfully finished preparing sumstats file, preview:
## Reading header.
## SNP CHR BP A1 A2 FRQ BETA SE P
## 1: rs301800 1 8490603 T C 0.17910 0.019 0.003 1.794e-08
## 2: rs11210860 1 43982527 A G 0.36940 0.017 0.003 2.359e-10
## 3: rs34305371 1 72733610 A G 0.08769 0.035 0.005 3.762e-14
## 4: rs2568955 1 72762169 T C 0.23690 -0.017 0.003 1.797e-08
## Returning path to saved data.
Here we know the summary statistics are based on the reference genome GRCh37; GRCh38 can also be inputted. Moreover, if you are unsure of the genome build, leave it as NULL and MungeSumstats will infer it from the data.
Also note that the default dbSNP version used along with the reference genome is the latest version available on Bioconductor (currently dbSNP 155), but older versions are also available. Use the dbSNP input parameter to control this.
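Since the run above wrote its output to tempdir(), one way to keep the formatted file across sessions is to build a persistent save_path first. The sketch below is one possible arrangement: the helper names persistent_save_path and munge_and_keep and the ./formatted directory are assumptions for illustration, and the output name is given a .tsv.gz suffix to match the default output shown above.

```r
# Build a save path inside `out_dir` that keeps the input file's base name.
persistent_save_path <- function(path, out_dir = "./formatted") {
  stem <- tools::file_path_sans_ext(basename(path), compression = TRUE)
  file.path(out_dir, paste0(stem, ".tsv.gz"))
}

# Hypothetical wrapper: create the directory, then munge straight into it.
munge_and_keep <- function(path, out_dir = "./formatted", ref_genome = NULL) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  MungeSumstats::format_sumstats(
    path = path,
    ref_genome = ref_genome,
    save_path = persistent_save_path(path, out_dir)
  )
}

# Example usage (requires MungeSumstats and the reference genome packages):
# munge_and_keep(eduAttainOkbayPth, ref_genome = "GRCh37")
```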
The arguments in format_sumstats that control the level of QC conducted by MungeSumstats include:
- Whether p-values < 5e-324 should be converted to 0. Small p-values pass the R limit and can cause errors with LDSC/MAGMA and should be converted. Default is TRUE.
- return_data: whether to return a data.table, GRanges or VRanges directly to the user. Otherwise, the path to the saved data is returned. Default is FALSE.
- mapping_file: a custom column header mapping. See data(sumstatsColHeaders) for the default mapping and necessary format.
See ?MungeSumstats::format_sumstats() for the full list of parameters to control MungeSumstats QC and standardisation steps.
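To see why very small p-values need special handling, the base R sketch below shows that a value smaller than R's smallest representable double turns into 0 on conversion. The example values are made up; this only illustrates the numeric limit and is not the package's internal code.

```r
# P-values as they might appear in a text file.
p_raw <- c("1e-400", "5e-08", "0.03")

# Converting to numeric silently underflows the first value to 0.
p_num <- as.numeric(p_raw)

# Flag values that became 0 without having been written as a literal zero.
underflowed <- p_num == 0 & !grepl("^0(\\.0*)?$", p_raw)
```

Tools that take the logarithm of the p-value or otherwise cannot accept 0 would need such rows handled explicitly.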
VCF files can also be standardised to the same format as other summary statistic files. A subset of the Amyotrophic lateral sclerosis GWAS from the IEU Open GWAS project (a .vcf file) has been added to MungeSumstats to demonstrate this functionality. Simply pass the path to the file in the same manner you would for other summary statistic files:
# Path to the ALS GWAS VCF subset (from the IEU Open GWAS project) shipped with the package
ALSvcfPth <- system.file("extdata","ALSvcf.vcf", package="MungeSumstats")
reformatted_vcf <-
MungeSumstats::format_sumstats(path=ALSvcfPth,
ref_genome="GRCh37")
You can also get more information on the SNPs which have had data imputed or have been filtered out by MungeSumstats by using the imputation_ind and log_folder_ind parameters respectively. For example:
# Log filtered SNPs, flag imputed values and save the console messages
reformatted_vcf_2 <-
MungeSumstats::format_sumstats(path=ALSvcfPth,
ref_genome="GRCh37",
log_folder_ind=TRUE,
imputation_ind=TRUE,
log_mungesumstats_msgs=TRUE)
## Time difference of 0.1 secs
## Time difference of 0.7 secs
Check the file snp_bi_allelic.tsv.gz in the log_folder directory you supply (by default a temp directory) for a list of SNPs removed because they are non-bi-allelic. The text files containing the console output and messages are also stored in the same directory.
Note you can also control the dbSNP version used as a reference dataset by MungeSumstats using the dbSNP parameter. By default this will be set to the most recent dbSNP version available (155).
Note that using log_folder_ind returns a list from format_sumstats which includes the file locations of the differing classes of removed SNPs. Using log_mungesumstats_msgs saves the console messages to a file, whose location is returned in the same list. Note that not all messages will still print to screen when you set log_mungesumstats_msgs:
names(reformatted_vcf_2)
## [1] "sumstats" "log_files"
A user can load a file to view the excluded SNPs.
In this case, no SNPs were filtered based on the INFO criterion, so NULL is returned for this value (instead of a path to a table of filtered SNPs).
print(reformatted_vcf_2$log_files$info_filter)
## NULL
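Because some entries of log_files can be NULL, as above, it helps to skip them when loading the logs. The helper below is a base R sketch that reads every entry pointing to an existing .tsv or .tsv.gz file; the function name read_log_tables is hypothetical, and it assumes the tabular logs are tab-delimited with a header row.

```r
# Read all tabular log files from a `log_files` list, skipping NULL entries
# and anything that is not an existing .tsv/.tsv.gz file (e.g. message logs).
read_log_tables <- function(log_files) {
  is_table <- vapply(log_files, function(p) {
    is.character(p) && length(p) == 1 &&
      grepl("\\.tsv(\\.gz)?$", p) && file.exists(p)
  }, logical(1))
  lapply(log_files[is_table], function(p) {
    utils::read.delim(p, stringsAsFactors = FALSE)
  })
}

# Example usage with the object created above:
# removed <- read_log_tables(reformatted_vcf_2$log_files)
# names(removed)
```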
The different types of exclusion which lead to the names are explained below:
Note that to export to another type, such as R native objects including data.table, GRanges or VRanges, or to save as a VCF file, set return_data=TRUE and choose your return_format:
# Return the munged data directly as a GRanges object
reformatted_vcf_2 <-
MungeSumstats::format_sumstats(path=ALSvcfPth,
ref_genome="GRCh37",
log_folder_ind=TRUE,
imputation_ind=TRUE,
log_mungesumstats_msgs=TRUE,
return_data=TRUE,
return_format="GRanges")
You can also output a VCF compatible with the IEU OpenGWAS format (note that currently all IEU OpenGWAS sumstats are GRCh37; MungeSumstats will throw a warning if your data isn’t GRCh37 when saving):
# Write the output as an IEU OpenGWAS-compatible VCF
reformatted_vcf_2 <-
MungeSumstats::format_sumstats(path=ALSvcfPth,
ref_genome="GRCh37",
write_vcf=TRUE,
save_format = "openGWAS")
See our publication for further discussion of these checks and options:
MungeSumstats also contains a function to quickly infer the genome build of multiple summary statistic files. This can be called separately from format_sumstats(), which is useful if you just want to quickly check the genome build:
# Paths to the Educational Attainment (Okbay) and ALS example files shipped with the package
eduAttainOkbayPth <- system.file("extdata", "eduAttainOkbay.txt",
package = "MungeSumstats")
ALSvcfPth <- system.file("extdata","ALSvcf.vcf", package="MungeSumstats")
sumstats_list <- list(ss1 = eduAttainOkbayPth, ss2 = ALSvcfPth)
ref_genomes <- MungeSumstats::get_genome_builds(sumstats_list = sumstats_list)
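The inferred builds can then be fed back into the munging step, one file at a time. The wrapper below is a sketch that assumes get_genome_builds() returns a result that can be indexed by the names of sumstats_list; check the structure of ref_genomes in your session before relying on this. The function name munge_with_inferred_builds is hypothetical.

```r
munge_with_inferred_builds <- function(sumstats_list) {
  # Names are needed to pair each file with its inferred build.
  stopifnot(!is.null(names(sumstats_list)))
  builds <- MungeSumstats::get_genome_builds(sumstats_list = sumstats_list)
  out <- lapply(names(sumstats_list), function(nm) {
    MungeSumstats::format_sumstats(
      path = sumstats_list[[nm]],
      ref_genome = builds[[nm]]
    )
  })
  stats::setNames(out, names(sumstats_list))
}

# Example usage (requires MungeSumstats and the reference genome packages):
# formatted_paths <- munge_with_inferred_builds(sumstats_list)
```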
MungeSumstats exposes the liftover() function as a general utility for users.
Useful features include:
- Automatic standardisation of genome build names (i.e. “hg19”, “hg37”, and “GRCh37” will all be recognized as the same genome build).
- Ability to specify chrom_col as well as both start_col and end_col (for variants that span >1bp).
- Ability to return in data.table or GRanges format.
- Ability to specify which chromosome format (e.g. “chr1” vs. 1) to return GRanges as.
sumstats_dt <- MungeSumstats::formatted_example()
## Standardising column headers.
## First line of summary statistics file:
## MarkerName CHR POS A1 A2 EAF Beta SE Pval
## Sorting coordinates with 'data.table'.
sumstats_dt_hg38 <- MungeSumstats::liftover(sumstats_dt = sumstats_dt,
ref_genome = "hg19",
convert_ref_genome = "hg38")
## Performing data liftover from hg19 to hg38.
## Converting summary statistics to GenomicRanges.
## Downloading chain file from Ensembl.
## /tmp/RtmpF89jhr/GRCh37_to_GRCh38.chain.gz
## Reordering so first three column headers are SNP, CHR and BP in this order.
## Reordering so the fourth and fifth columns are A1 and A2.
knitr::kable(head(sumstats_dt_hg38))
SNP | CHR | BP | A1 | A2 | FRQ | BETA | SE | P | IMPUTATION_gen_build |
---|---|---|---|---|---|---|---|---|---|
rs301800 | 1 | 8430543 | T | C | 0.17910 | 0.019 | 0.003 | 0e+00 | TRUE |
rs11210860 | 1 | 43516856 | A | G | 0.36940 | 0.017 | 0.003 | 0e+00 | TRUE |
rs34305371 | 1 | 72267927 | A | G | 0.08769 | 0.035 | 0.005 | 0e+00 | TRUE |
rs2568955 | 1 | 72296486 | T | C | 0.23690 | -0.017 | 0.003 | 0e+00 | TRUE |
rs1008078 | 1 | 90724174 | T | C | 0.37310 | -0.016 | 0.003 | 0e+00 | TRUE |
rs61787263 | 1 | 98153158 | T | C | 0.76120 | 0.016 | 0.003 | 1e-07 | TRUE |
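The other liftover() features listed above can be combined in a single call. In the sketch below, chrom_col, start_col and end_col are the arguments named in the list; as_granges and style are argument names recalled from the function's help page and should be treated as assumptions to verify with ?MungeSumstats::liftover. The wrapper name liftover_to_granges is hypothetical.

```r
liftover_to_granges <- function(sumstats_dt) {
  stopifnot(is.data.frame(sumstats_dt))
  MungeSumstats::liftover(
    sumstats_dt = sumstats_dt,
    ref_genome = "hg19",
    convert_ref_genome = "hg38",
    chrom_col = "CHR",    # column holding the chromosome
    start_col = "BP",     # start position
    end_col = "BP",       # same as start for single-base variants
    as_granges = TRUE,    # assumed flag: return GRanges instead of data.table
    style = "UCSC"        # assumed option: "chr1"-style chromosome names
  )
}

# Example usage (requires MungeSumstats and network access for the chain file):
# gr_hg38 <- liftover_to_granges(MungeSumstats::formatted_example())
```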
In some cases, users may not want to run the full munging pipeline provided by MungeSumstats::format_sumstats, but would still like to take advantage of the file type conversion and column header standardisation features. This will not be nearly as robust as the full pipeline, but can still be helpful.
To do this, simply run the following:
eduAttainOkbayPth <- system.file("extdata", "eduAttainOkbay.txt",
package = "MungeSumstats")
formatted_path <- tempfile(fileext = "eduAttainOkbay_standardised.tsv.gz")
#### 1. Read in the data and standardise header names ####
dat <- MungeSumstats::read_sumstats(path = eduAttainOkbayPth,
standardise_headers = TRUE)
## Importing tabular file: /tmp/RtmpXxCZeS/Rinst1921314e69dfbd/MungeSumstats/extdata/eduAttainOkbay.txt
## Checking for empty columns.
## Standardising column headers.
## First line of summary statistics file:
## MarkerName CHR POS A1 A2 EAF Beta SE Pval
knitr::kable(head(dat))
SNP | CHR | BP | A1 | A2 | FRQ | BETA | SE | P |
---|---|---|---|---|---|---|---|---|
rs10061788 | 5 | 87934707 | A | G | 0.2164 | 0.021 | 0.004 | 0e+00 |
rs1007883 | 16 | 51163406 | T | C | 0.3713 | -0.015 | 0.003 | 1e-07 |
rs1008078 | 1 | 91189731 | T | C | 0.3731 | -0.016 | 0.003 | 0e+00 |
rs1043209 | 14 | 23373986 | A | G | 0.6026 | 0.018 | 0.003 | 0e+00 |
rs10496091 | 2 | 61482261 | A | G | 0.2705 | -0.018 | 0.003 | 0e+00 |
rs10930008 | 2 | 161854736 | A | G | 0.7183 | -0.016 | 0.003 | 1e-07 |
#### 2. Write to disk as a compressed, tab-delimited, tabix-indexed file ####
formatted_path <- MungeSumstats::write_sumstats(sumstats_dt = dat,
save_path = formatted_path,
tabix_index = TRUE,
write_vcf = FALSE,
return_path = TRUE)
## Sorting coordinates with 'data.table'.
## Writing in tabular format ==> /tmp/RtmpF89jhr/file1933a96f25efa9eduAttainOkbay_standardised.tsv
## Writing uncompressed instead of gzipped to enable tabix indexing.
## Converting full summary stats file to tabix format for fast querying...
## Reading header.
## Ensuring file is bgzipped.
## Tabix-indexing file.
## Removing temporary .tsv file.
data.table
If you already have your data imported as a data.table, you can also standardise its headers like so:
#### Mess up some column names ####
dat_raw <- data.table::copy(dat)
data.table::setnames(dat_raw, c("SNP","CHR"), c("rsID","Seqnames"))
#### Add a non-standard column that I want to keep the casing for ####
dat_raw$Support <- runif(nrow(dat_raw))
dat2 <- MungeSumstats::standardise_header(sumstats_dt = dat_raw,
uppercase_unmapped = FALSE,
return_list = FALSE )
## Standardising column headers.
## First line of summary statistics file:
## rsID Seqnames BP A1 A2 FRQ BETA SE P Support
## Returning unmapped column names without making them uppercase.
knitr::kable(head(dat2))
SNP | CHR | BP | A1 | A2 | FRQ | BETA | SE | P | Support |
---|---|---|---|---|---|---|---|---|---|
rs301800 | 1 | 8490603 | T | C | 0.17910 | 0.019 | 0.003 | 0e+00 | 0.3418940 |
rs11210860 | 1 | 43982527 | A | G | 0.36940 | 0.017 | 0.003 | 0e+00 | 0.2115915 |
rs34305371 | 1 | 72733610 | A | G | 0.08769 | 0.035 | 0.005 | 0e+00 | 0.0177653 |
rs2568955 | 1 | 72762169 | T | C | 0.23690 | -0.017 | 0.003 | 0e+00 | 0.5964848 |
rs1008078 | 1 | 91189731 | T | C | 0.37310 | -0.016 | 0.003 | 0e+00 | 0.7742931 |
rs61787263 | 1 | 98618714 | T | C | 0.76120 | 0.016 | 0.003 | 1e-07 | 0.5711812 |
The MungeSumstats package aims to be able to handle the most common summary statistic file formats including VCF. If your file cannot be formatted by MungeSumstats, feel free to report the bug on GitHub: https://github.com/neurogenomics/MungeSumstats along with your summary statistic file header.
We also encourage people to edit the code to resolve their particular issues and are happy to incorporate these through pull requests on GitHub. If your summary statistic file headers are not recognised by MungeSumstats but correspond to one of:
SNP, BP, CHR, A1, A2, P, Z, OR, BETA, LOG_ODDS, SIGNED_SUMSTAT, N, N_CAS, N_CON, NSTUDY, INFO or FRQ
feel free to update MungeSumstats::sumstatsColHeaders following the approach in the data.R file and add your mapping. Then open a pull request on GitHub and we will incorporate this change into the package.
A note on MungeSumstats::sumstatsColHeaders for summary statistic files with A0/A1: the mapping in MungeSumstats::sumstatsColHeaders converts A0 to A*. This is a special case so that the code knows to map A0/A1 to A1/A2 (ref/alt). The special case is needed since ordinarily A1 refers to the reference, not the alternative, allele.
A note on MungeSumstats::sumstatsColHeaders for summary statistic files with Effect Size (ES): by default, MungeSumstats takes BETA to be any BETA-like value (including ES). This is coded into the mapping file, MungeSumstats::sumstatsColHeaders. If this isn’t the case for your sumstats, you can set the es_is_beta parameter in MungeSumstats::format_sumstats() to FALSE to avoid this. Note this is done to try and capture most use cases of MungeSumstats.
See the Open GWAS vignette for how MungeSumstats can be used along with data from the MRC IEU Open GWAS Project, and for MungeSumstats’ functionality to handle lists of summary statistics files.
## R version 4.3.0 RC (2023-04-13 r84269)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.17-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] MungeSumstats_1.8.0 BiocStyle_2.28.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.0
## [2] dplyr_1.1.2
## [3] blob_1.2.4
## [4] filelock_1.0.2
## [5] R.utils_2.12.2
## [6] Biostrings_2.68.0
## [7] bitops_1.0-7
## [8] fastmap_1.1.1
## [9] RCurl_1.98-1.12
## [10] BiocFileCache_2.8.0
## [11] VariantAnnotation_1.46.0
## [12] GenomicAlignments_1.36.0
## [13] XML_3.99-0.14
## [14] digest_0.6.31
## [15] lifecycle_1.0.3
## [16] KEGGREST_1.40.0
## [17] RSQLite_2.3.1
## [18] googleAuthR_2.0.1
## [19] magrittr_2.0.3
## [20] compiler_4.3.0
## [21] rlang_1.1.0
## [22] sass_0.4.5
## [23] progress_1.2.2
## [24] tools_4.3.0
## [25] utf8_1.2.3
## [26] yaml_2.3.7
## [27] data.table_1.14.8
## [28] rtracklayer_1.60.0
## [29] knitr_1.42
## [30] prettyunits_1.1.1
## [31] bit_4.0.5
## [32] curl_5.0.0
## [33] DelayedArray_0.26.0
## [34] xml2_1.3.3
## [35] BiocParallel_1.34.0
## [36] BiocGenerics_0.46.0
## [37] R.oo_1.25.0
## [38] grid_4.3.0
## [39] stats4_4.3.0
## [40] fansi_1.0.4
## [41] biomaRt_2.56.0
## [42] SummarizedExperiment_1.30.0
## [43] cli_3.6.1
## [44] rmarkdown_2.21
## [45] crayon_1.5.2
## [46] generics_0.1.3
## [47] BSgenome.Hsapiens.1000genomes.hs37d5_0.99.1
## [48] httr_1.4.5
## [49] rjson_0.2.21
## [50] DBI_1.1.3
## [51] cachem_1.0.7
## [52] stringr_1.5.0
## [53] zlibbioc_1.46.0
## [54] assertthat_0.2.1
## [55] parallel_4.3.0
## [56] AnnotationDbi_1.62.0
## [57] BiocManager_1.30.20
## [58] XVector_0.40.0
## [59] restfulr_0.0.15
## [60] matrixStats_0.63.0
## [61] vctrs_0.6.2
## [62] Matrix_1.5-4
## [63] jsonlite_1.8.4
## [64] bookdown_0.33
## [65] IRanges_2.34.0
## [66] hms_1.1.3
## [67] S4Vectors_0.38.0
## [68] bit64_4.0.5
## [69] GenomicFiles_1.36.0
## [70] GenomicFeatures_1.52.0
## [71] jquerylib_0.1.4
## [72] glue_1.6.2
## [73] codetools_0.2-19
## [74] stringi_1.7.12
## [75] GenomeInfoDb_1.36.0
## [76] BiocIO_1.10.0
## [77] GenomicRanges_1.52.0
## [78] tibble_3.2.1
## [79] pillar_1.9.0
## [80] SNPlocs.Hsapiens.dbSNP155.GRCh37_0.99.24
## [81] rappdirs_0.3.3
## [82] htmltools_0.5.5
## [83] GenomeInfoDbData_1.2.10
## [84] BSgenome_1.68.0
## [85] R6_2.5.1
## [86] dbplyr_2.3.2
## [87] evaluate_0.20
## [88] lattice_0.21-8
## [89] Biobase_2.60.0
## [90] R.methodsS3_1.8.2
## [91] png_0.1-8
## [92] Rsamtools_2.16.0
## [93] gargle_1.4.0
## [94] memoise_2.0.1
## [95] bslib_0.4.2
## [96] xfun_0.39
## [97] fs_1.6.2
## [98] MatrixGenerics_1.12.0
## [99] pkgconfig_2.0.3
1. Nathan G. Skene, T. E. B., Julien Bryois. Genetic identification of brain cell types underlying schizophrenia. Nature Genetics (2018). doi:10.1038/s41588-018-0129-5