\documentclass[12pt]{article} %\VignetteIndexEntry{Using MyVariant.R} %\\SweaveOpts{concordance=TRUE} <>= BiocStyle::latex() @ \newcommand{\exitem}[3] {\item \texttt{\textbackslash#1\{#2\}} #3 \csname#1\endcsname{#2}.} \title{MyVariant.info R Client} \author{Adam Mark, Chunlei Wu} \begin{document} \SweaveOpts{concordance=TRUE} \maketitle \tableofcontents \section{Overview} MyVariant.Info is a simple-to-use REST web service to query/retrieve genetic variant annotation from an aggregation of variant annotation resources. \Rpackage{myvariant} is an easy-to-use R wrapper to access MyVariant.Info services and explore variant annotions. \section{Variant Annotation Service} \subsection{Obtaining HGVS IDs from a VCF file. } \begin{itemize} \item Use \Rfunction{readVcf} from the VariantAnnotation package to read a Vcf file in. The Vcf object can then be passed to \Rfunction{formatHgvs} to retrieve HGVS IDs. HGVS IDs are based on the GRCh38/hg19 reference genome. Support for hg38 is coming soon. \end{itemize} <>= library(myvariant) library(VariantAnnotation) @ <<>>= file.path <- system.file("extdata", "dbsnp_mini.vcf", package="myvariant") vcf <- readVcf(file.path, genome="hg19") rowRanges(vcf) @ \begin{itemize} \item You can then use \Rfunction{formatHgvs} to extract HGVS IDs from the Vcf object. \end{itemize} <<>>= hgvs <- formatHgvs(vcf, variant_type="snp") head(hgvs) @ \subsection{\Rfunction{getVariant}} \begin{itemize} \item Use \Rfunction{getVariant}, the wrapper for GET query of "/v1/variant/" service, to return the variant object for the given HGVS id. \end{itemize} <<>>= variant <- getVariant("chr1:g.35367G>A") variant[[1]]$dbnsfp$genename variant[[1]]$cadd$phred @ \subsection{\Rfunction{getVariants}} \begin{itemize} \item Use \Rfunction{getVariants}, the wrapper for POST query of "/v1/variant" service, to return the list of variant objects for the given character vector of HGVS ids. \end{itemize} <<>>= getVariants(c("chr1:g.35367G>A", "chr16:g.28883241A>G"), fields="cadd.consequence") @ \section{Variant Query Service} \subsection{\Rfunction{queryVariant}} \begin{itemize} \item \Rfunction{queryVariant} is a wrapper for GET query of "/v1/query?q=" service, to return the query result. This function accepts wild card input terms and allows you to query for variants that contain a specific annotation. For example, the following query searches for the CADD phred score and consequence for all variants whose genename (dbNSFP) is MLL2. \end{itemize} <<>>= queryVariant(q="dbnsfp.genename:MLL2", fields=c("cadd.phred", "cadd.consequence")) @ \begin{itemize} \item You can also use \Rfunction{queryVariant} to retrieve all annotations that map to a specific rsID. \end{itemize} <<>>= queryVariant(q="rs58991260", fields="dbsnp.flags")$hits @ \subsection{\Rfunction{queryVariants}} \begin{itemize} \item \Rfunction{queryVariants} is a wrapper for POST query of "/v1/query?q=" service, to return the query result. Query terms include any available field as long as scopes are defined. The following example reads the dbSNP rsIDs from a VCF and queries for all fields. The returned DataFrame can then be easily subsetted to include, for example, those that have not been documented in the Wellderly study. \end{itemize} <<>>= rsids <- paste("rs", info(vcf)$RS, sep="") res <- queryVariants(q=rsids, scopes="dbsnp.rsid", fields="all") subset(res, !is.na(wellderly.vartype))$query @ \section{References} MyVariant.info help@myvariant.info \end{document}