%
% NOTE -- ONLY EDIT THE .Rnw FILE!!!  The .tex file is
% likely to be overwritten.
%
% \VignetteIndexEntry{goTools overview}
% \VignetteDepends{tools, GO.db}
% \VignetteSuggest{hgu133a.db}
% \VignetteKeywords{Gene Ontology}
% \VignettePackage{goTools}
\documentclass[11pt]{article}

\usepackage{amsmath,epsfig,fullpage}
%\usepackage[authoryear,round]{natbib}
\usepackage{hyperref}



\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}

\parindent 0in

%\bibliographystyle{abbrvnat}

\begin{document}

\title{\bf Getting started with the \Rpackage{goTools} package}

\author{Agnes Paquet$^1$ and (Jean) Yee Hwa Yang$^2$}

\maketitle


\begin{center}
1. {\tt paquetagnes@yahoo.com}\\
2. Department of Medicine, University of California, San Francisco, \url{http://www.biostat.ucsf.edu/jean}
\end{center}


\tableofcontents

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Getting started}

This document provides a tutorial for the \Rpackage{goTools} package, which
allows graphical comparisons of functional groups between two sets of
genes.

\paragraph{Installing the package:}
To install the \Rpackage{goTools} package, follow the detailed
instructions on the Bioconductor web site:
\url{http://www.bioconductor.org/help/faq}.


\paragraph{Help files:}
As with any R package, detailed information on
functions, classes and methods can be obtained in the help files. For
instance, to view the help file for the function {\tt ontoCompare} in a
browser, use {\tt help.start()} followed by {\tt ?ontoCompare}. \\
We demonstrate the functionality of the package with randomly
selected probe ids from the Affymetrix hgu133a chip
(\Robject{affylist}). To load the \Robject{probeID}
dataset, use {\tt data(probeID)}, and to view a
description of the experiments and data, type {\tt ?probeID}.


\paragraph{Sweave:}
This document was generated using the \Rfunction{Sweave}
function from the R \Rpackage{tools} package. The source file is in the
{\tt inst/doc} directory of the \Rpackage{goTools} package.

\paragraph{}
To begin, load the package and the \Robject{probeID} dataset into your R
session.
<<eval=TRUE,echo=TRUE>>=
library("goTools", verbose=FALSE)
data(probeID)  
@

As shown below, \Robject{affylist} is a list containing 3 vectors of
probe ids from the Affymetrix hgu133a chip.

<<eval=TRUE, echo=TRUE>>=
class(affylist)
length(affylist)
affylist[[1]][1:5]
@


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Graphical comparisons of two sets of genes}

The Gene Ontology (GO) is organized as a Directed Acyclic Graph (DAG). It
 provides three structured networks of defined terms to describe gene
 product attributes: Molecular Function (MF), Biological
 Process (BP) and Cellular Component (CC). A gene product has one or more
 molecular functions and is used in one or more biological processes; it
 might be associated with one or more cellular components. To learn more
 about GO and its DAG structure, please refer to the Gene Ontology web site
 \url{http://www.geneontology.org/}.\\

We have created a set of R functions that use the GO structure to describe
and compare the composition of sets of genes (or probes). We use the
following algorithm:

\begin{enumerate}
\item{Read in a {\bf list} of the sets of probe ids that you want to compare.}
\item{Map each probe id to the corresponding ontologies in the GO tree, if any.}
\item{Create the set of GO ids of interest used to compare your
datasets (\Robject{endnode}). The function \Rfunction{EndNodeList()}
creates the set of nodes of the DAG located one level below MF, BP or
CC, but you can use any set of GO ids.}
\item{For each GO id, go up the GO tree until
reaching a node in \Robject{endnode}. The search may be limited to MF, BP
or CC by setting \Robject{goType}.}
\item{Compute the percentage of direct children found under each node in
\Robject{endnode}.}
\item{Return the results. Plot them if \Robject{plot=TRUE}.}
\end{enumerate}
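
Steps 4 and 5 can be illustrated on a toy graph, without any annotation
package. The chunk below is only a sketch under stated assumptions: the node
names, the \Robject{toyParents} list and the helper
\Rfunction{toyFindEndNodes} are hypothetical and are not part of
\Rpackage{goTools}, whose internal implementation may differ.

<<eval=FALSE,echo=TRUE>>=
## Toy DAG: each node maps to its parent nodes (hypothetical ids, not real GO ids).
toyParents <- list(
  root = character(0),
  A    = "root",
  B    = "root",
  a1   = "A",
  a2   = "A",
  b1   = "B",
  ab   = c("A", "B")   # in a DAG, a node may have several parents
)

## Walk up from one node and return every end node that is reached.
toyFindEndNodes <- function(id, parents, endnode) {
  found   <- character(0)
  visited <- character(0)
  todo    <- id
  while (length(todo) > 0) {
    current <- todo[1]
    todo    <- todo[-1]
    if (current %in% visited) next
    visited <- c(visited, current)
    if (current %in% endnode) {
      found <- c(found, current)          # stop climbing at an end node
    } else {
      todo <- c(todo, parents[[current]]) # otherwise continue with the parents
    }
  }
  unique(found)
}

toyEndnode <- c("A", "B")
toyIds     <- c("a1", "a2", "ab", "b1")   # ids annotating one gene set

hits   <- unlist(lapply(toyIds, toyFindEndNodes,
                        parents = toyParents, endnode = toyEndnode))
counts <- table(factor(hits, levels = toyEndnode))
round(100 * counts / length(toyIds), 1)   # percentage of ids under each end node
@

In this toy example, 3 of the 4 ids fall under \texttt{A} and 2 fall under
\texttt{B}, giving 75 and 50. The values need not sum to 100, because a node
with several parents can be counted under more than one end node.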


\subsection{How to use goTools}
The main function that we provide is \Rfunction{ontoCompare}. It takes
as argument {\bf a list} of sets of probe ids, whose type must be specified
in the argument \Robject{probeType}. For more details, refer to the
corresponding help file by typing
 \Rfunction{?ontoCompare}.

%\subsubsection{Examples}
%The following examples demonstrate how to use \Rfunction{ontoCompare} on the
%\Robject{probeID} dataset.

<<eval=FALSE,echo=TRUE>>=
library(GO.db)
## Named list holding the first 5 probe ids of the first two sets
subset <- list(L1 = affylist[[1]][1:5], L2 = affylist[[2]][1:5])
res <- ontoCompare(subset, probeType = "hgu133a")
@

\subsubsection{Methods for computing percentages}

\Rfunction{ontoCompare} allows you to choose from 3 different methods to
estimate the percentage of probes under each element of
\Robject{endnode}. The default method is \Robject{TGenes}.


\begin{enumerate}

\item{\Robject{TGenes}: for each end node, return the number of direct
  children found / total number of probe ids.\\
  The total includes probe ids that do not have GO annotations.}

\item{\Robject{TIDS}: for each end node, return the number of direct
  children found / total number of GO ids describing the list.}

\item{\Robject{none}: for each end node, return the number of direct
  children found.}

\end{enumerate}
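
The three methods differ only in the denominator. The following minimal
sketch uses made-up numbers to show this; the helper
\Rfunction{toyPercent} and all counts are hypothetical and only mirror the
descriptions above, not the actual code of \Rfunction{ontoCompare}.

<<eval=FALSE,echo=TRUE>>=
## Hypothetical counts for one gene set.
toyCounts <- c(A = 3, B = 2)   # direct children found under each end node
nProbes   <- 10                # all probe ids in the set, annotated or not
nGoIds    <- 4                 # GO ids describing the set

## Apply the denominator that matches the chosen method.
toyPercent <- function(counts, method = c("TGenes", "TIDS", "none"),
                       nProbes, nGoIds) {
  method <- match.arg(method)
  switch(method,
         TGenes = counts / nProbes,  # relative to all probe ids
         TIDS   = counts / nGoIds,   # relative to all GO ids
         none   = counts)            # raw counts
}

toyPercent(toyCounts, "TGenes", nProbes, nGoIds)
toyPercent(toyCounts, "TIDS",   nProbes, nGoIds)
toyPercent(toyCounts, "none",   nProbes, nGoIds)
@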
 
\subsection{Plotting the results}

The plots are produced by the function \Rfunction{ontoPlot}. It is called by
\Rfunction{ontoCompare} when you set \Robject{plot=TRUE}. You can also call it
directly, passing the results of \Rfunction{ontoCompare} as argument.
If only one set of genes is passed to
\Rfunction{ontoCompare}, \Rfunction{ontoPlot} draws a pie
chart. Otherwise, it draws a bar plot. You can modify the layout of
\Rfunction{ontoPlot} using the usual R graphics
parameters. For more details, type \Rfunction{?par}.

<<ontoPlotEg1,fig=TRUE,prefix=FALSE,echo=TRUE,include=FALSE, eval=FALSE>>=
library(GO.db)
## Same two sets as before; plot=TRUE also draws the comparison
subset <- list(L1 = affylist[[1]][1:5], L2 = affylist[[2]][1:5])
res <- ontoCompare(subset, probeType = "hgu133a", plot = TRUE)
@

%See Figure \ref{fig:ontoPlotEg}

\section{How to set up the ``end nodes''}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Default list}
The default list of end nodes is defined by a call to the
function \Rfunction{EndNodeList}. It contains all direct children of
MF (GO:0003674), BP (GO:0008150) and CC (GO:0005575).

<<eval=TRUE,echo=TRUE>>=
EndNodeList()
@


\subsection{Customized end node list}
If you want to use more ontologies to describe your set of genes, you
can use the function\\
 \mbox{\Rfunction{CustomEndNodeList(id,rank)}} to create a larger
set of end nodes. It returns the GO ids of all nodes located below
\Robject{id}, down to \Robject{rank} levels below \Robject{id}.


<<eval=FALSE,echo=TRUE>>=
MFendnode <- CustomEndNodeList("GO:0003674", rank=2)
@
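
To make the role of \Robject{rank} concrete, the sketch below collects the
nodes located at most \Robject{rank} levels below a starting node in a toy
graph. The \Robject{toyChildren} list and the helper
\Rfunction{toyCustomEndNodes} are hypothetical; they only mimic the behaviour
described above and are not the implementation of
\Rfunction{CustomEndNodeList}.

<<eval=FALSE,echo=TRUE>>=
## Toy DAG stored as parent -> children (hypothetical ids, not real GO ids).
toyChildren <- list(
  root = c("A", "B"),
  A    = c("a1", "a2"),
  B    = "b1",
  a1   = "a1x"
)

## Collect every node located at most `rank` levels below `id`.
toyCustomEndNodes <- function(id, rank, children) {
  result  <- character(0)
  current <- id
  for (level in seq_len(rank)) {
    ## children of all nodes of the current level (leaves contribute nothing)
    current <- unique(unlist(children[current], use.names = FALSE))
    if (length(current) == 0) break
    result <- union(result, current)
  }
  result
}

toyCustomEndNodes("root", rank = 1, toyChildren)  # "A" "B"
toyCustomEndNodes("root", rank = 2, toyChildren)  # "A" "B" "a1" "a2" "b1"
@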

Finally, the code below shows you how to use a custom end node list, and also
how to modify the \Robject{goType} argument to select only Molecular
Function (MF) ontologies. 


<<eval=FALSE,echo=TRUE>>=
res <- ontoCompare(subset, probeType="hgu133a", endnode=MFendnode, goType="MF")
@ 



You can also create your own
list of GO ids of nodes of interest and pass it directly to the
\Robject{endnode} argument of \Rfunction{ontoCompare}. GO ids must be in
the following format:
\mbox{\texttt{"GO:XXXXXXX"}}, where each \texttt{X} is a digit.
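
Before passing hand-written ids to \Robject{endnode}, it can help to check
their format. The chunk below is a small sketch using only base R; the names
\Robject{myEndnode} and \Rfunction{isGoId} are hypothetical, and it assumes
that the ids are stored in a character vector.

<<eval=FALSE,echo=TRUE>>=
## Hypothetical hand-picked end nodes; these three ids are the MF, BP and CC
## roots quoted earlier in this document.
myEndnode <- c("GO:0003674", "GO:0008150", "GO:0005575")

## TRUE for strings made of "GO:" followed by exactly seven digits.
isGoId <- function(x) grepl("^GO:[0-9]{7}$", x)

isGoId(myEndnode)   # all TRUE
isGoId("GO:123")    # FALSE: too few digits
@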



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%




%\begin{figure} %[htpb]
%\begin{center}
%\begin{tabular}{cc}
%\includegraphics[width=3in,height=3in,angle=0]{ontoPlotEg1} %&
%\includegraphics[width=3in,height=3in,angle=0]{ontoPlotEg2} \\
%(a) & (b)\\
%\end{tabular}
%\end{center}
%\caption{Plots obtained using \Rfunction{ontoCompare} on \Robject{probeID} dataset}
%\protect\label{fig:ontoPlotEg}
%\end{figure}

%\bibliography{marrayPacks}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}