\name{featureCounts}
\alias{featureCounts}
\title{Count the number of mapped reads for each feature}
\description{Summarize mapped reads into counts for genomic features such as genes and exons.}
\usage{
featureCounts(SAMfiles,type="gene",species="mm",annot=NULL)
}
\arguments{
  \item{SAMfiles}{ a character vector giving names of SAM format files.}
  \item{type}{ a character string giving the feature type, either \code{gene} or \code{exon}.}
  \item{species}{ a character string specifying the species, either \code{mm} or \code{hg}. The value of this argument determines which in-built annotation file is used when \code{annot} is \code{NULL}.}
  \item{annot}{ a character string giving the name of a user-provided annotation file containing feature information such as chromosomal coordinates. If supplied, this file overrides the in-built annotation file selected by the \code{species} argument.}
  }
\details{
This function takes as input a set of SAM format files and assigns reads to the features.
Currently, only the feature types \code{gene} and \code{exon} are supported.
A \code{gene} feature is the aggregation of all the exons belonging to that gene.

Two in-built annotation files, one for mouse and one for human, are available for summarizing reads to genes or exons.
These annotation files include the exon annotation information downloaded from NCBI Build 37.2, including Entrez gene identifier and chromosomal coordinates for each exon.
The \code{species} argument specifies which annotation file should be used.

Users can also provide their own annotation file for read summarization through the \code{annot} argument.
In this case, the user-provided annotation file overrides the in-built annotation file.
The user-provided annotation file should be tab-delimited, and its first four columns should give, for each exon, the gene identifier, chromosome name, chromosomal start location and chromosomal end location, in that order.
Below is an example:
\preformatted{
entrezid	chromosome	chr_start	chr_stop
497097	chr1	3204563	3207049
497097	chr1	3411783	3411982
497097	chr1	3660633	3661579
100503874	chr1	3637390	3640590
100503874	chr1	3648928	3648985
100038431	chr1	3670236	3671869
...
}
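
As a minimal sketch, a file in this format could be produced from R with \code{write.table}.
The data frame repeats the example rows above; the object names and the output file name \code{my_annotation.txt} are illustrative assumptions and not part of this package.
\preformatted{
# Exon records copied from the example table above (one row per exon).
exons <- data.frame(
  entrezid   = c(497097L, 497097L, 497097L, 100503874L, 100503874L, 100038431L),
  chromosome = "chr1",
  chr_start  = c(3204563L, 3411783L, 3660633L, 3637390L, 3648928L, 3670236L),
  chr_stop   = c(3207049L, 3411982L, 3661579L, 3640590L, 3648985L, 3671869L)
)

# Sanity check: every exon must start at or before its end position.
stopifnot(all(exons$chr_start <= exons$chr_stop))

# Write a tab-delimited file with a header line and no quoting.
# The file name is hypothetical; any writable path will do.
annot_file <- file.path(tempdir(), "my_annotation.txt")
write.table(exons, file = annot_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
}
The path stored in \code{annot_file} can then be supplied through the \code{annot} argument.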

Although this function is designed for summarizing reads from RNA-seq experiments, it can also be used to summarize reads from other next-generation sequencing experiments, for example ChIP-seq or other DNA sequencing experiments.
To do so, set \code{type} to \code{exon} and provide an annotation file through \code{annot}; the function will then return the number of mapped reads for each feature in that file.
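
For instance, an annotation of fixed-width genomic bins could be built as sketched below.
The chromosome, region, bin width, integer bin identifiers and file names are all illustrative assumptions; whether non-numeric identifiers are accepted is not stated here, so integers are used.
The commented call follows the usage shown on this page, with a hypothetical SAM file name.
\preformatted{
# Hypothetical region: ten consecutive 1000-bp bins on chr1.
bin_width <- 1000L
starts <- seq(from = 3000001L, to = 3009001L, by = bin_width)

# Four columns in the order this function expects:
# identifier, chromosome, start, end.
bins <- data.frame(
  entrezid   = seq_along(starts),
  chromosome = "chr1",
  chr_start  = starts,
  chr_stop   = starts + bin_width - 1L
)

# Write the bins as a tab-delimited annotation file (hypothetical name).
bins_file <- file.path(tempdir(), "bins_annotation.txt")
write.table(bins, file = bins_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Treat each bin as a feature ("chip.sam" is a placeholder file name):
# res <- featureCounts("chip.sam", type = "exon", annot = bins_file)
}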
}
\value{ 
A list with the following components:
\tabular{ll}{
  \code{counts}:\tab a data matrix containing read counts for each gene or exon.\cr
  \code{annotation}:\tab a data frame containing Entrez gene identifiers, gene or exon lengths and related annotation.\cr
  \code{targets}:\tab a character vector giving sample information.
}
}
%\references{
%}
\author{Wei Shi}
%\note{}
%\seealso{}
\examples{}