---
title: "Workshop: Bioconductor for High-Throughput Sequence Analysis"
output:
  BiocStyle::html_document:
    toc: false
vignette: >
  % \VignetteIndexEntry{Workshop: Bioconductor for High-Throughput Sequence Analysis}
  % \VignetteEngine{knitr::rmarkdown}
---
```{r style, echo = FALSE, results = 'asis'}
BiocStyle::markdown()
options(width=100, max.print=1000)
knitr::opts_chunk$set(
    eval=as.logical(Sys.getenv("KNITR_EVAL", "TRUE")),
    cache=as.logical(Sys.getenv("KNITR_CACHE", "TRUE")))
```
Author: Martin Morgan (mtmorgan@fredhutch.org)
Date: 7 October, 2015
Back to [Workshop Outline](Developer-Meeting-Workshop.html)
The material in this document requires _R_ version 3.2 and
_Bioconductor_ version 3.1.
```{r configure-test}
stopifnot(
    getRversion() >= '3.2' && getRversion() < '3.3',
    BiocInstaller::biocVersion() >= "3.1"
)
```
# Abstract
DNA sequence analysis generates large volumes of data that present
challenging bioinformatic and statistical problems. This tutorial
introduces established and new Bioconductor packages and workflows for
analyzing sequence data. The Bioconductor project
(http://bioconductor.org) is a widely used collection of nearly 1000 R
packages for high-throughput genomic analysis. The workshop covers
approaches for efficiently manipulating sequences and alignments,
other common workflows, and the statistical challenges particular to
RNA-seq, variant annotation, and other experiments. The emphasis is on
exploratory analysis and the analysis of designed experiments. The
workshop touches on the Biostrings, ShortRead, GenomicRanges, DESeq2,
VariantAnnotation, and other packages, with short exercises to
illustrate the functionality of each package.
# Goals
1. Gain overall familiarity with Bioconductor packages for
high-throughput sequence analysis, including Bioconductor vignettes
and classes.
2. Obtain experience running bioinformatic workflows for data quality
assessment, RNA-seq differential expression, and manipulating
variant call format files.
3. Appreciate the importance of ranges and range-based manipulation
for modern genomic analysis.
4. Learn 'best practices' for working with large data.
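As a small illustration of the range-based manipulation mentioned in goal 3, the following sketch uses the GenomicRanges package to count reads overlapping genes and to compute strand-aware flanking regions. The gene names, coordinates, and reads are invented for illustration and are not part of the workshop data.

```{r ranges-sketch}
## Illustrative sketch only: all coordinates and identifiers are made up
library(GenomicRanges)

## Two hypothetical genes on chromosome 1, one per strand
genes <- GRanges(
    seqnames = "chr1",
    ranges = IRanges(start = c(1000, 5000), end = c(3000, 8000)),
    strand = c("+", "-"),
    gene_id = c("geneA", "geneB"))

## Three hypothetical aligned reads, each 50 bp wide, strand unspecified
reads <- GRanges(
    seqnames = "chr1",
    ranges = IRanges(start = c(1500, 2500, 6000), width = 50))

## Count the reads overlapping each gene
counts <- countOverlaps(genes, reads)
names(counts) <- genes$gene_id
counts

## Range arithmetic: the 100 bp upstream of each gene, respecting strand
upstream <- flank(genes, width = 100)
upstream
```

Because the operations are expressed on ranges rather than on individual positions, the same calls apply unchanged to millions of reads or to genome-wide annotations.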
# Outline
- Introduction to Bioconductor -- packages and classes
- Short workflows
- Exploring sequences and alignments
- RNA-seq: a high-level tour
- Annotating variants
# Prerequisites
The workshop assumes an intermediate level of familiarity with R and a
basic understanding of the biological and technological aspects of
high-throughput sequence analysis. Participants should come prepared
with a modern wireless-enabled laptop with a web browser installed.
# Intended Audience
This workshop is for professional bioinformaticians and statisticians
intending to use R/Bioconductor for analysis and comprehension of
high-throughput sequence data.
# Reference
Huber et al. (2015). Orchestrating high-throughput genomic analysis
with Bioconductor. Nature Methods 12(2):115-121.