% \VignetteIndexEntry{Part 1: how to build tracks}
% \VignetteDepends{}
% \VignetteKeywords{visualizationutilities}
% \VignettePackage{ggbio}
\documentclass[11pt,a4paper]{article}
% \usepackage{times}
\usepackage{hyperref}
\usepackage{verbatim}
\usepackage{graphicx}
\usepackage{fancybox}
\usepackage{color}

<<>>=
opts_chunk$set(eval=FALSE)
@

% \setkeys{Gin}{width=0.95\textwidth}
\textwidth=6.5in
\textheight=8.5in
\parskip=.3cm
\parindent = 0cm
\oddsidemargin=-.1in
\evensidemargin=-.1in
\headheight=-.3in

\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
\newcommand{\Rfunarg}[1]{{\texttt{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}
\newcommand{\Rcode}[1]{{\texttt{#1}}}
\newcommand{\software}[1]{\textsf{#1}}
\newcommand{\R}{\software{R}}
\newcommand{\Bioc}{\software{Bioconductor}}
\newcommand{\IRanges}{\Rpackage{IRanges}}
\newcommand{\biovizBase}{\Rpackage{biovizBase}}
\newcommand{\ggbio}{\Rpackage{ggbio}}
\newcommand{\visnab}{\Rpackage{visnab}}
\newcommand{\ggplot}{\Rpackage{ggplot2}}
\newcommand{\grid}{\Rpackage{grid}}
\newcommand{\gridExtra}{\Rpackage{gridExtra}}
\newcommand{\qplot}{\Rfunction{qplot}}
\newcommand{\autoplot}{\Rfunction{autoplot}}
\newcommand{\knitr}{\Rpackage{knitr}}
\newcommand{\tracks}{\Rfunction{tracks}}
\newcommand{\chipseq}{\Rpackage{chipseq}}

% my own framebox
\newcommand{\sfbox}[2][Tips]{
  \begin{center}
    \shadowbox{
      \parbox{0.8\linewidth}{
        \textcolor{blue}{#1:} #2
      }
    }
  \end{center}
}

\title{Tracks: bind and align plots}
\author{Tengfei Yin}
\date{\today}

\begin{document}
% \setkeys{Gin}{width=0.6\textwidth}
\maketitle
\newpage
\tableofcontents
\newpage

<<>>=
library(knitr)
opts_chunk$set(fig.path='./figures/ggbio-',
               fig.align='center', fig.show='asis',
               eval = TRUE,
               fig.width = 5,
               fig.height = 5)
options(replace.assign=TRUE,width=90)
@

<<>>=
options(width=72)
@

\sfbox{To read this vignette and learn how to use tracks, you don't need any
background in biology. Basic knowledge of \ggplot{} is helpful.}

\section{Motivation}
The \Rfunction{tracks} function is a fundamental component used almost everywhere in the documentation. More importantly, it can be used with \textbf{any other ggplot2 graphics}, not just graphics produced by \ggbio{}. \ggbio{} depends on \ggplot{} and extends it to the genomic world, \textbf{so every graphic produced by \ggbio{} is essentially a \ggplot{} object or a combination of them, and any trick that works for \ggplot{} also works on \ggbio{} graphics.} On top of that, \ggbio{} brings features that do not exist in \ggplot{} yet.

\sfbox{If you want to manipulate graphics from \ggbio{} more freely, I strongly recommend reading the \ggplot{} documentation. Most of the time, the edit you want can be achieved with basic functionality already in \ggplot{}, so enjoy those handy tools and don't reinvent the wheel! What's more, if you want to become an expert, knowledge of the \Rpackage{grid} and \Rpackage{gtable} packages is necessary. Tracks relies heavily on the \Rpackage{gtable} package, which has several convenient ways to manipulate graphic objects.}

Track-based views are used in almost all genome viewers. Such a view usually stacks multiple plots row by row and aligns them on exactly the same coordinate, which in most cases is the genomic coordinate. In this way, we can align various annotation data against each other to make an efficient comparison. The UCSC genome browser\footnote{\url{http://genome.ucsc.edu/cgi-bin/hgGateway}} is one of the most widely used track-based genome browsers, as shown in Figure \ref{fig:ucsc}. Some other \R{} packages, such as \Rpackage{Gviz}, also support a track-based view like the UCSC genome browser.
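
As a rough illustration of the row-by-row stacking described above, the following sketch builds two ordinary \ggplot{} plots whose x ranges overlap but differ, and passes them to \Rfunction{tracks}. The data frames \Robject{a} and \Robject{b}, the plot objects, and the track names \Rcode{first} and \Rcode{second} are made up for this illustration only. The chunk is marked \Rcode{eval=FALSE} so that it does not affect the rest of the vignette.

<<eval=FALSE>>=
## hypothetical toy data: two series measured on the same kind of x-axis
library(ggplot2)
library(ggbio)
set.seed(1)
a <- data.frame(x = 1:50,  y = cumsum(rnorm(50)))
b <- data.frame(x = 30:90, y = cumsum(rnorm(61)))
## build each plot independently, without thinking about alignment
pa <- ggplot(a, aes(x = x, y = y)) + geom_line()
pb <- ggplot(b, aes(x = x, y = y)) + geom_point()
## stack the two plots as named tracks on one shared x-axis
tracks(first = pa, second = pb)
@

The argument names passed to \Rfunction{tracks} serve as the track labels, in the same way as the \Rcode{time1} and \Rcode{time2} names used in the zooming examples of this vignette.
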
General tracks for viewing genomic data should probably have the following features:
\begin{figure}[!htp]
  \centering
  \includegraphics[width = 0.8\textwidth]{figures/ucsc.png}
  \caption{A track-based view in the UCSC genome browser.}
  \label{fig:ucsc}
\end{figure}
\begin{itemize}
\item Align each plot on exactly the same x coordinate (the genomic coordinate); the common range is the union of the limits.
\item A name for each track. This is different from the y-axis label, which describes the variable mapped to y.
\item A shared ``scale'' track.
\item Multiple ways to visualize the data, such as points, lines, bar charts, or densities.
\end{itemize}

By comparison, \ggbio{} tries to be even more general in building tracks and offers more features.
\begin{itemize}
\item You can bind any graphics produced by \ggplot{}, not necessarily by \ggbio{}. \ggplot{} users will find it convenient that they can construct plots independently and let \Rfunction{tracks} align them. You can therefore use \Rfunction{tracks} to align your own data, e.g. time series data.
\item Utilities for zooming, backing up, and restoring a view. This is useful when you tweak a plot starting from your best snapshot, because you can always go back.
\item An extended \textbf{``+''} method. If you are familiar with \ggplot{}'s \textbf{``+''} method for editing an existing plot, this works the same way: anything added to a tracks object with \textbf{``+''} is applied to each track. This makes it easy to tweak the theme and update all the plots.
\item You can specify whether to label a plot by using \Rfunction{labeled, labeled<-}, and whether you want its x-axis synchronized with the other tracks by using \Rfunction{fixed, fixed<-}.
\item Create your own customized themes, not only for a single plot but also for tracks! We will show an example of how to create a theme called \Rfunction{theme\_tracks\_subset} in the following sections.
\item Support for not only vertical alignment but also horizontal alignment.
\end{itemize}

\sfbox{The \Rfunction{tracks} function only supports graphic objects produced by either \ggplot{} or \ggbio{}. To align plots produced by another grid-based system, such as lattice, you need to work at the grid level and insert the lattice grob into a layout yourself.}

\section{Usage}
The function \Rfunction{tracks} is a constructor for an object of class \Rclass{Tracks}. This object is a container for the plots you are going to align and for all the graphic attributes controlling the appearance of the tracks.

\subsection{A minimal example for \ggplot{} graphics}
We can construct any \ggplot{} graphics \textbf{independently} without worrying about aligning them. The only thing you need to make sure of is that \textbf{the x-axes you are going to align have the same meaning}; in this example, it is \textit{time}.

<<>>=
## loading ggbio automatically loads ggplot2
library(ggbio)
## make a simulated time series data set
df1 <- data.frame(time = 1:100, score = sin((1:100)/20)*10)
p1 <- qplot(data = df1, x = time, y = score, geom = "line")
df2 <- data.frame(time = 30:120, score = sin((30:120)/20)*10,
                  value = rnorm(120-30 + 1))
p2 <- ggplot(data = df2, aes(x = time, y = score)) +
  geom_line() + geom_point(size = 4, aes(color = value))
@ %def

Note that \Rfunction{qplot} (short for `quick plot') is a \ggplot{} function. Since \Bioc{} 2.10, \ggbio{} no longer defines its own, confusing generic \Rfunction{qplot}; instead, it uses the generic introduced in \ggplot{}, called \autoplot{}.

\sfbox{If you are not sure which components are available in \ggplot{} itself, please check Hadley's online documentation at \url{http://had.co.nz/ggplot2/}. For \ggbio{}-based components, please read the relevant parts of this vignette and visit \url{http://tengfei.github.com/ggbio/docs} for documentation.
There are plenty of examples with graphics there.}

<<>>=
print(p1)
@ %def

<<>>=
p2
@ %def

As shown in Figure \ref{fig:ts-plot}, these two plots have different scales on the x-axis. To compare them, we want to align them on exactly the same x-axis scale, which makes vertical comparison easy. This is what the \tracks{} function is for: we pass it the plots we want to align.

\subsection{Arith method \Rcode{+}}
Before we move on to further modification methods for the \Rclass{Tracks} object, we introduce the flexible \Rcode{+} method. The Arith \Rcode{+} method is powerful and heavily used in \ggplot{} and \ggbio{}. When constructing a plot layer by layer, \Rcode{+} connects and adds components one by one; it can also be used to update and edit an existing plot. In \ggbio{}, \Rcode{+} is extended to work not only on a single plot but also on a \Robject{Tracks} object.
\begin{itemize}
\item For a single plot, \Rcode{+} applies or adds the change on the right-hand side to the object on the left-hand side.
\item For a \Robject{Tracks} object, \Rcode{+} applies or adds the change on the right-hand side to every plot stored in the tracks. It works like a `batch' mode, so you don't have to edit plots before passing them into the tracks, unless you want to edit them individually.
\end{itemize}

\textsf{Remember, you can always get back the plots you passed to a \Robject{Tracks} object by accessing its \Rcode{grobs} slot, which is a list of plots, for example \Rcode{tks@grobs[[i]]}.}

\subsection{Modification}
We provide some attributes associated with a plot. They don't affect the plot on its own; they take effect only when the plot is embedded into tracks. These attributes include
\begin{itemize}
\item \Rcode{height}: default height for this plot.
\item \Rcode{bgColor}: background color for this plot.
\item \Rcode{labeled}: whether to show the label (and its background) for the plot, even though the plot is named.
\item \Rcode{fixed}: controls whether the scale of the plot is fixed.
\item \Rcode{mutable}: controls whether the plot is affected by the \Rcode{+} method on tracks.
\end{itemize}

\subsection{Zoom in/out}
Simply adding \Rcode{xlim} or \Rcode{ylim} applies the change to every track, but only if the \Rcode{fixed} attribute is set to \Rcode{TRUE}.

<<>>=
tracks(time1 = p1, time2 = p2) + xlim(1, 40) + ylim(0, 10)
@ %def

Other available zoom in/out methods:

<<>>=
library(GenomicRanges)
gr <- GRanges("chr", IRanges(1, 40))
# zoom with a GRanges object
tracks(time1 = p1, time2 = p2) + xlim(gr)
# zoom with an IRanges object
tracks(time1 = p1, time2 = p2) + xlim(ranges(gr))
# get and set the limits of an existing Tracks object
tks <- tracks(time1 = p1, time2 = p2)
xlim(tks)
xlim(tks) <- c(1, 35)
xlim(tks) <- gr
xlim(tks) <- ranges(gr)
@ %def

\subsection{Other utilities}
Please check the manual of \tracks{} for other utilities such as reset and backup.
\end{document}