% NOTE -- ONLY EDIT THE .Rnw FILE!!!  The .tex file is
% likely to be overwritten.
%
% \VignetteIndexEntry{marray Overview}
% \VignetteDepends{marray}
% \VignetteKeywords{Expression Analysis, Preprocessing}
% \VignettePackage{marray}

\documentclass[11pt]{article}

\usepackage{amsmath,epsfig,fullpage,hyperref}
\parindent 0in

\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}


\begin{document}

\title{\bf Quick start guide for marray}

\author{Yee Hwa Yang}

\maketitle

\begin{center} 
1. Department of Medicine, University of California, San Francisco, \url{http://www.biostat.ucsf.edu/jean}\\ 
\end{center}

\tableofcontents

% library(tools) 
% setwd("C:/MyDoc/Projects/madman/Rpacks/marray/inst/doc")
% Rnwfile<-file.path("C:/MyDoc/Projects/madman/Rpacks/marray/inst/doc","marray.Rnw")
% options(width=65)
% Sweave(Rnwfile,pdf=TRUE,eps=TRUE,stylepath=TRUE,driver=RweaveLatex())

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Overview}

This document provides a brief guide to the \Rpackage{marray} package,
which provides functions for reading, diagnostic plotting, and
normalization of cDNA microarray data.  More detail on each of these
tasks can be found in the other vignettes.

There are three main components to this package:
\begin{itemize}
\item Reading in data.
\item Producing simple diagnostic plots to assess quality.
\item Normalization.
\end{itemize}

After the two main pre--processing tasks, image analysis and
normalization, the next steps in the statistical analysis depend on the
biological question for which the microarray experiment was
designed. Thus, different Bioconductor packages may be applicable. For
example, for identifying differentially expressed genes, functions in
the packages {\tt limma}, {\tt EBarrays}, {\tt siggenes}, and {\tt
multtest} may be used.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Getting started}

To load the {\tt marray} package in your R session, type {\tt
library(marray)}.  We demonstrate the functionality of this R package
using gene expression data from the Swirl zebrafish experiment, which is
included as part of the package. To load the swirl dataset, use {\tt
data(swirl)}, and to view a description of the experiments and data,
type {\tt ?swirl}.
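
As a quick check that the example data loaded correctly, the following
sketch inspects the \Robject{swirl} object with accessor functions from
\Rpackage{marray}.  It assumes only that the package is installed; the
output is not shown here.
\begin{verbatim}
> library(marray)
> data(swirl)
> # swirl is stored as a pre-normalization (raw) intensity object
> class(swirl)
> # log-ratios M: one row per spot, one column per array
> dim(maM(swirl))
> # target (sample) descriptions stored with the data
> maInfo(maTargets(swirl))
\end{verbatim}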


\begin{enumerate}

\item To begin, create a directory and move all the relevant
image processing output files (e.g. \texttt{.spot} files) and a file
containing target (or sample) descriptions
(e.g. the \texttt{SwirlSample.txt} file) to that directory. For this
illustration, the data have been gathered in the data directory {\tt
swirldata}.

\item Start R in the desired working directory and load the
  \Rpackage{marray} package:

%code 1
<<eval=TRUE, echo=TRUE>>=
library(marray)
dir(system.file("swirldata", package="marray")) 
@

\item {\bf Data input:} Read in the target file containing
information about the hybridization.

%code 2
<<eval=TRUE, echo=TRUE>>=
datadir <- system.file("swirldata", package="marray")
swirlTargets <- read.marrayInfo(file.path(datadir, "SwirlSample.txt"))
@

\item Read in the raw fluorescence intensity data.  By default, we assume
that the file names are provided in the {\bf first} column of the target
file.
%code 3
<<eval=TRUE, echo=TRUE>>=
mraw <- read.Spot(targets = swirlTargets, path=datadir)
@

If your working directory contains GenePix files (\texttt{.gpr}), run
  the following command instead.  By default, the function {\tt
  read.GenePix} will also set up printer layout and probe annotation
  information.
\begin{verbatim}
> data <- read.GenePix(targets=swirlTargets)
\end{verbatim}

\item Read in the probe annotation information.
%code 4a
<<eval=TRUE, echo=TRUE>>=
galinfo <- read.Galfile("fish.gal", path=datadir)
mraw@maLayout <- galinfo$layout
mraw@maGnames <- galinfo$gnames
@

\item {\bf Array quality assessment:} The following command generates
     diagnostic plots for a qualitative assessment of slide quality.
     The results are saved as png files in the working directory.  We
     use the wrapper functions provided in the package {\tt
     arrayQuality}.

%% Code 4
\begin{verbatim}
> library(arrayQuality)
> maQualityPlots(mraw)
\end{verbatim}

In addition, you can produce simple diagnostic plots with
\begin{verbatim}
> image(mraw)
> boxplot(mraw)
> plot(mraw)
\end{verbatim}

\item {\bf Normalization:} Perform print-tip normalization
for each array and take a look at the data summary.
%% Code 5
\begin{verbatim}
> normdata <- maNorm(mraw)
> summary(normdata)
\end{verbatim}

\item Write out the normalized log--ratios $M$.
%% Code 6
\begin{verbatim}
> write.marray(normdata)
\end{verbatim}

\item {\bf Identify DE genes:} Use the linear model package
\Rpackage{limma} to identify differentially expressed (DE) genes between
wildtype and mutant.  Perform fold-change estimation and apply
Bayesian smoothing to the standard errors.
%% Code 8
\begin{verbatim}
> library(limma)
> LMres <- lmFit(normdata, design = c(1, -1, -1, 1), weights=NULL)
> LMres <- eBayes(LMres)
\end{verbatim}

\item Show the top 50 genes and write them out to a clickable HTML file.
%% Code 9
\begin{verbatim}
> restable <- topTable(LMres, number=50, genelist=maGeneTable(normdata),
+                      resort.by="M")
> table2html(restable, disp="file")
\end{verbatim}


\item To use other Bioconductor packages for downstream analysis, it
  is also possible to convert objects of class {\tt marrayNorm} into
  objects of class {\tt ExpressionSet} (see the definition in the {\tt Biobase}
  package).  See the {\tt convert} package for more details.
\begin{verbatim}
> library(convert)
> as(normdata, "ExpressionSet")
\end{verbatim}

\end{enumerate}
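
One informal way to check the normalization step above is to compare
the per-array median log-ratios before and after normalization.  The
sketch below makes two assumptions for illustration: it uses the
packaged \Robject{swirl} object in place of \Robject{mraw} so that it can
be run without the raw files, and the names \Robject{med.raw} and
\Robject{med.norm} are hypothetical.
\begin{verbatim}
> library(marray)
> data(swirl)
> # print-tip loess location normalization, requested explicitly
> normdata <- maNorm(swirl, norm="printTipLoess")
> # per-array median log-ratio before normalization
> med.raw <- apply(maM(swirl), 2, median, na.rm=TRUE)
> # per-array median log-ratio after normalization
> med.norm <- apply(maM(normdata), 2, median, na.rm=TRUE)
> cbind(med.raw, med.norm)
\end{verbatim}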

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Other vignettes and packages}

Greater detail can be found in the other vignettes:

\begin{description}
\item {\tt marrayClasses}.  This vignette describes basic class
definitions and associated methods for pre-- and post--normalization
intensity data for batches of arrays.

\item {\tt marrayInput}. This vignette describes functionality for
reading microarray data into R, such as intensity data from image
processing output files (e.g. {\tt .spot} and {\tt .gpr} files for the
{\tt Spot} and {\tt GenePix} packages, respectively) and textual
information on probes and targets (e.g. from gal files and god
lists). {\tt tcltk} widgets are supplied to facilitate and automate data
input and the creation of microarray specific R objects for storing
these data.

\item {\tt marrayPlot}. This vignette describes
  functions for diagnostic plots of microarray spot statistics, such as
  boxplots, scatter--plots, and spatial color images. Examination of
  diagnostic plots of intensity data is important in order to identify
  printing, hybridization, and scanning artifacts which can lead to
  biased inference concerning gene expression.

\item {\tt marrayNorm}. This vignette describes various location and
scale normalization procedures, which correct for different types of dye
biases (e.g. intensity, spatial, plate biases) and allow the use of
control sequences spotted onto the array and possibly spiked into the
mRNA samples. Normalization is needed to ensure that observed
differences in intensities are indeed due to differential expression and
not experimental artifacts; fluorescence intensities should therefore be
normalized before any analysis which involves comparisons among genes
within or between arrays. 

\end{description}
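
To illustrate the difference between location and scale normalization
mentioned above, the following sketch applies three of the methods
offered by the \Rfunction{maNorm} wrapper to the \Robject{swirl} data and
compares the overall spread of the resulting log-ratios.  The object
names are hypothetical, and using the median absolute deviation as a
summary is only one of several reasonable choices.
\begin{verbatim}
> library(marray)
> data(swirl)
> # global median location normalization
> norm.med <- maNorm(swirl, norm="median")
> # global intensity-dependent (loess) location normalization
> norm.loess <- maNorm(swirl, norm="loess")
> # print-tip loess location normalization followed by
> # print-tip MAD scale normalization
> norm.scale <- maNorm(swirl, norm="scalePrintTipMAD")
> # overall spread (MAD) of the log-ratios under each method
> sapply(list(median=norm.med, loess=norm.loess, scale=norm.scale),
+        function(x) mad(maM(x), na.rm=TRUE))
\end{verbatim}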

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

{\bf Note: Sweave.} This document was generated using the \Rfunction{Sweave}
function from the R \Rpackage{tools} package. The source file is in the
\Rfunction{/inst/doc} directory of the package \Rpackage{marray}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}
 

The wrapper function \Rfunction{maQualityPlots} automatically produces
nine plots of  
\begin{itemize}
\item pre-- and post--normalization cDNA microarray data;
\item $MA$--plots of pre-- and post--normalization log--ratios $M$; 
\item color images of pre-- and post--normalization log--ratios $M$; 
\item color images of average log--intensities $A$; 
\item histogram and overlay density of the signal to noise log--ratio
  for Cy5 and Cy3 channels, where the signal to noise ratio is defined
  as the foreground intensity (without background adjustment) over the
  background intensity; and 
\item dot--plots of $M$ and $A$ values for replicate control probes.
\end{itemize}
In addition, this function automatically saves the figures to a file.
More detailed descriptions of all the arguments and options can be found
in the arrayQuality package.  Please contact us if you have other
image processing output formats and would like a similar wrapper
function.