Zhu LJ, Holmes BR, Aronin N and Brodsky MH (2014). “CRISPRseek: A Bioconductor Package to Identify Target-Specific Guide RNAs for CRISPR-Cas9 Genome-Editing Systems.” PLoS ONE, 9(9). http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4172692/.
Zhu LJ (2015). “Overview of guide RNA design tools for CRISPR-Cas9 genome editing technology.” Front. Biol., 10(4).
We will use a human sequence as input; it is included as a FASTA file in the CRISPRseek package.
To perform off-target analysis, we need to load the human BSgenome package.
To annotate the targets and off-targets, we need to load the human transcript (TxDb) and gene identifier mapping (org.Hs.eg.db) packages.
In addition, you need to specify the output directory, which is where all the output files will be written.
In the current release, you no longer need to specify the file containing the restriction enzyme (RE) cut patterns. You can still supply your own RE pattern file in place of the default one shipped with the CRISPRseek package.
library(CRISPRseek)
## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
##
## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
## clusterExport, clusterMap, parApply, parCapply, parLapply,
## parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## Filter, Find, Map, Position, Reduce, anyDuplicated, append,
## as.data.frame, cbind, colMeans, colSums, colnames, do.call,
## duplicated, eval, evalq, get, grep, grepl, intersect,
## is.unsorted, lapply, lengths, mapply, match, mget, order,
## paste, pmax, pmax.int, pmin, pmin.int, rank, rbind, rowMeans,
## rowSums, rownames, sapply, setdiff, sort, table, tapply,
## union, unique, unsplit, which, which.max, which.min
## Loading required package: Biostrings
## Loading required package: S4Vectors
## Loading required package: stats4
##
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:base':
##
## expand.grid
## Loading required package: IRanges
## Loading required package: XVector
##
## Attaching package: 'Biostrings'
## The following object is masked from 'package:base':
##
## strsplit
library(BSgenome.Hsapiens.UCSC.hg19)
## Loading required package: BSgenome
## Loading required package: GenomeInfoDb
## Loading required package: GenomicRanges
## Loading required package: rtracklayer
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
## Loading required package: GenomicFeatures
## Loading required package: AnnotationDbi
## Loading required package: Biobase
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
library(org.Hs.eg.db)
##
outputDir <- file.path(getwd(),"CRISPRseekDemo")
inputFilePath <- system.file('extdata', 'inputseq.fa', package = 'CRISPRseek')
args(offTargetAnalysis)
args(compare2Sequences)
?offTargetAnalysis
?compare2Sequences
?CRISPRseek
browseVignettes('CRISPRseek')
Note that chromToSearch is set to chrX here only to speed up the search. Usually you do not need to set it; by default it is set to all.
offTargetAnalysis(inputFilePath,
                  BSgenomeName = Hsapiens,
                  chromToSearch = "chrX",
                  txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
                  orgAnn = org.Hs.egSYMBOL,
                  outputDir = outputDir,
                  overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add paired information...
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## $on.target
## names forViewInUCSC
## [1,] "Hsap_GATA1_ex2_gR7r" "chrX:48649564-48649586"
## [2,] "Hsap_GATA1_ex2_gR17f" "chrX:48649564-48649586"
## [3,] "Hsap_GATA1_ex2_gR20r" "chrX:48649577-48649599"
## [4,] "Hsap_GATA1_ex2_gR39f" "chrX:48649586-48649608"
## [5,] "Hsap_GATA1_ex2_gR40f" "chrX:48649587-48649609"
## extendedSequence gRNAefficacy
## [1,] "GACACCAGAGCAGGATCCACAAACTGGGGG" "0.101719829329578"
## [2,] "TCCCCCAGTTTGTGGATCCTGCTCTGGTGT" "0.106547942216103"
## [3,] "TTCTGGTGTGGAGGACACCAGAGCAGGATC" "0.142091080678942"
## [4,] "TCTGGTGTCCTCCACACCAGAATCAGGGGT" "0.0696295234374132"
## [5,] "CTGGTGTCCTCCACACCAGAATCAGGGGTT" "0.261881755628254"
##
## $summary
## names forViewInUCSC
## 1 Hsap_GATA1_ex2_gR17f chrX:48649564-48649586
## 2 Hsap_GATA1_ex2_gR20r chrX:48649577-48649599
## 3 Hsap_GATA1_ex2_gR39f chrX:48649586-48649608
## 4 Hsap_GATA1_ex2_gR40f chrX:48649587-48649609
## 5 Hsap_GATA1_ex2_gR7r chrX:48649564-48649586
## extendedSequence gRNAefficacy
## 1 TCCCCCAGTTTGTGGATCCTGCTCTGGTGT 0.106547942216103
## 2 TTCTGGTGTGGAGGACACCAGAGCAGGATC 0.142091080678942
## 3 TCTGGTGTCCTCCACACCAGAATCAGGGGT 0.0696295234374132
## 4 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.261881755628254
## 5 GACACCAGAGCAGGATCCACAAACTGGGGG 0.101719829329578
## gRNAsPlusPAM top5OfftargetTotalScore top10OfftargetTotalScore
## 1 CCAGTTTGTGGATCCTGCTCNGG 3.3 3.3
## 2 GGTGTGGAGGACACCAGAGCNGG 3.4 3.4
## 3 GTGTCCTCCACACCAGAATCNGG 1.1 1.1
## 4 TGTCCTCCACACCAGAATCANGG 2.9 2.9
## 5 CCAGAGCAGGATCCACAAACNGG 0.2 0.2
## top1Hit.onTarget.MMdistance2PAM topOfftarget1MMdistance2PAM
## 1 NMM 20,17,13
## 2 NMM 18,16,15
## 3 NMM 16,15,1
## 4 NMM 13,2
## 5 NMM 16,8,3
## topOfftarget2MMdistance2PAM topOfftarget3MMdistance2PAM
## 1 20,19,3
## 2 17,14,1 20,14,8
## 3 15,11,3
## 4
## 5
## topOfftarget4MMdistance2PAM topOfftarget5MMdistance2PAM
## 1
## 2 13,10,7
## 3
## 4
## 5
## topOfftarget6MMdistance2PAM topOfftarget7MMdistance2PAM
## 1
## 2
## 3
## 4
## 5
## topOfftarget8MMdistance2PAM topOfftarget9MMdistance2PAM
## 1
## 2
## 3
## 4
## 5
## topOfftarget10MMdistance2PAM REname uniqREin200 uniqREin100
## 1
## 2 Aco12261II
## 3 BslI HinfI TfiI HinfI TfiI HinfI TfiI
## 4 BslI HinfI TfiI HinfI TfiI HinfI TfiI
## 5 BslI PflMI PflMI PflMI
##
## $offtarget
## name gRNAPlusPAM OffTargetSequence
## 1 Hsap_GATA1_ex2_gR17f CCAGTTTGTGGATCCTGCTCNGG GTAGTTTGTGGATCCTGTTCTAG
## 2 Hsap_GATA1_ex2_gR20r GGTGTGGAGGACACCAGAGCNGG GGTATGTAGGACACCAGAGACAG
## 3 Hsap_GATA1_ex2_gR20r GGTGTGGAGGACACCAGAGCNGG GGAGCTGAGGACACCAGAGCGGG
## 4 Hsap_GATA1_ex2_gR39f GTGTCCTCCACACCAGAATCNGG GTGTCTTCCCCACCAGATTCTAG
## 5 Hsap_GATA1_ex2_gR7r CCAGAGCAGGATCCACAAACNGG CCAGAGCAGGATCCACAAACTGG
## 7 Hsap_GATA1_ex2_gR17f CCAGTTTGTGGATCCTGCTCNGG CCAGTTTGTGGATCCTGCTCTGG
## 9 Hsap_GATA1_ex2_gR20r GGTGTGGAGGACACCAGAGCNGG GGTGTGGAGGACACCAGAGCAGG
## 10 Hsap_GATA1_ex2_gR39f GTGTCCTCCACACCAGAATCNGG GTGTCCTCCACACCAGAATCAGG
## 11 Hsap_GATA1_ex2_gR40f TGTCCTCCACACCAGAATCANGG TGTCCTCCACACCAGAATCAGGG
## 12 Hsap_GATA1_ex2_gR20r GGTGTGGAGGACACCAGAGCNGG TGTGTGTAGGACTCCAGAGCAAG
## 13 Hsap_GATA1_ex2_gR20r GGTGTGGAGGACACCAGAGCNGG GGTGTGGGGGTCAGCAGAGCCAG
## 14 Hsap_GATA1_ex2_gR7r CCAGAGCAGGATCCACAAACNGG CCAGTGCAGGATTCACATACTAG
## 15 Hsap_GATA1_ex2_gR17f CCAGTTTGTGGATCCTGCTCNGG ACAATTTCTGGATCCTGCTCCAG
## 16 Hsap_GATA1_ex2_gR39f GTGTCCTCCACACCAGAATCNGG GTGTTTTCCACACCAGAATGCAG
## 17 Hsap_GATA1_ex2_gR40f TGTCCTCCACACCAGAATCANGG TGTCCTCAACACCAGAATGATAG
## inExon inIntron entrez_id gene score n.mismatch
## 1 TRUE 1183 CLCN4 0.7 3
## 2 1 3
## 3 1.4 3
## 4 TRUE 10149 ADGRG2 0.3 3
## 5 TRUE 2623 GATA1 100 0
## 7 TRUE 2623 GATA1 100 0
## 9 TRUE 2623 GATA1 100 0
## 10 TRUE 2623 GATA1 100 0
## 11 TRUE 2623 GATA1 100 0
## 12 TRUE 57477 SHROOM4 0.8 3
## 13 0.2 3
## 14 0.2 3
## 15 TRUE 618 BCYRN1 2.6 3
## 16 0.8 3
## 17 TRUE 139324 HDX 2.9 2
## mismatch.distance2PAM alignment isCanonicalPAM
## 1 20,19,3 GT...............T.. 0
## 2 17,14,1 ...A..T............A 0
## 3 18,16,15 ..A.CT.............. 1
## 4 15,11,3 .....T...C.......T.. 0
## 5 .................... 1
## 7 .................... 1
## 9 .................... 1
## 10 .................... 1
## 11 .................... 1
## 12 20,14,8 T.....T.....T....... 0
## 13 13,10,7 .......G..T..G...... 0
## 14 16,8,3 ....T.......T....T.. 0
## 15 20,17,13 A..A...C............ 0
## 16 16,15,1 ....TT.............G 0
## 17 13,2 .......A..........G. 0
## forViewInUCSC strand chrom chromStart chromEnd
## 1 chrX:10161855-10161877 + chrX 10161855 10161877
## 2 chrX:14446436-14446458 - chrX 14446436 14446458
## 3 chrX:152513913-152513935 - chrX 152513913 152513935
## 4 chrX:19043357-19043379 + chrX 19043357 19043379
## 5 chrX:48649564-48649586 - chrX 48649564 48649586
## 7 chrX:48649564-48649586 + chrX 48649564 48649586
## 9 chrX:48649577-48649599 - chrX 48649577 48649599
## 10 chrX:48649586-48649608 + chrX 48649586 48649608
## 11 chrX:48649587-48649609 + chrX 48649587 48649609
## 12 chrX:50383047-50383069 - chrX 50383047 50383069
## 13 chrX:50699578-50699600 + chrX 50699578 50699600
## 14 chrX:66676065-66676087 - chrX 66676065 66676087
## 15 chrX:70453482-70453504 - chrX 70453482 70453504
## 16 chrX:81194638-81194660 + chrX 81194638 81194660
## 17 chrX:83738197-83738219 - chrX 83738197 83738219
## extendedSequence gRNAefficacy
## 1 GGCTGTAGTTTGTGGATCCTGTTCTAGAGG 0.10516186
## 2 TTTTGGTATGTAGGACACCAGAGACAGTCC 0.48093111
## 3 GGTAGGAGCTGAGGACACCAGAGCGGGAAT 0.06199434
## 4 TCTTGTGTCTTCCCCACCAGATTCTAGAGC 0.07578946
## 5 GACACCAGAGCAGGATCCACAAACTGGGGG 0.10171983
## 7 TCCCCCAGTTTGTGGATCCTGCTCTGGTGT 0.10654794
## 9 TTCTGGTGTGGAGGACACCAGAGCAGGATC 0.14209108
## 10 TCTGGTGTCCTCCACACCAGAATCAGGGGT 0.06962952
## 11 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.26188176
## 12 GCTTTGTGTGTAGGACTCCAGAGCAAGCAT 0.12975874
## 13 GCAGGGTGTGGGGGTCAGCAGAGCCAGTCA 0.01514054
## 14 GTTGCCAGTGCAGGATTCACATACTAGTTA 0.22156635
## 15 ATGGACAATTTCTGGATCCTGCTCCAGGTA 0.09746120
## 16 GGCAGTGTTTTCCACACCAGAATGCAGCCC 0.35310464
## 17 CCACTGTCCTCAACACCAGAATGATAGATC 0.21729625
## flankSequence
## 1 GTCAGCAAACTTTTTCTGCAAGAGGCCAGAGAGTAAATATTTTCAGGTTTGTGGCCCATGTGGTTTCTATCTCAGCTCTTCAGTTCTGCCACTGTAGCATGAAGGCAGCCACAGAGTGCATGTAATTGAATGGTGTAGCTGTGCTCCAATAAAACTTTATTTACAAAAACAGGCTGTGGGCCAGATTTGGCCCATGGGCTGTAGTTTGTGGATCCTGTTCTAGAGGAATGAGCTAAGGCCATTGCCTTAGCTGCATGGGAATCTCCTGGTAGATTGAGGGGATGGACTGGGCCAAGGGAAGGAAAGAGATTGGATGGTGACTAGAGGGAAAGTGGAAACAGGGAGGGTTTTTTTTTTTTAATGTTTAAATGCTGATGGCAAGGAGAGGAAGAGGTTGAAGATACAGGAGAAGGAAAGGGGAAT
## 2 ACATTTTCCCCATGACATTCTTTTAATTGCAAATGCTCCTGAGAAGAGGAATAACACATGAAACGAATCCGCTCTGATCACAGATTTTTCTCCTTCTCAGATAAATCCAATTAACAGAAGATGTCTTAGAATTTAAGGGGGGGCAGAAAGAGAATCTCTGTGGGTAGAGGGTTGGGGAAAAAGCATGGGATGGTAGTTTTGGTATGTAGGACACCAGAGACAGTCCTCACCTAATTCAATGGCTTAAAATTATACCAAAGGAAAACAACTCATAACCCAAAAGGAATTAGGATGCACACACCAGGCTTACTCAGACATGTGGTGCTGCCTCCTATTTTACTTTATGGTAGATAAAGTTGAGCAGTTGTATAAATAGAAATGACTGGAGCCTTTACGTCTAATAGCAGAACTCATTTTTTATTT
## 3 GCCTTGACTGTACCATGGTCACACATTTCTTATCCAAAAGCAAAGGGATTGTTCATATAAAGAAATTATGAGTGTTGGTTGATCAGCTCAGTTTTGGAAATGTTAAAGTTGAGGTGCATATGAAGCACCTGAGGGCGGATGTCAAGTGGGGATAAGCAATATACCCTGGATATATGAGCCTCATGTTCAGGAGAGTGGTAGGAGCTGAGGACACCAGAGCGGGAATTAATATCACACAGATTGCGTTTAATCAAGGAGACTAGGAGAGATCACCTAGGGAATGACTGTAAACAAAAAAGGGAAGTCTAAAGACTGATTCCGCGTTCCTTCAATACTTTGAGCTACCGCAGAATTGGGGCATCAGTAATGAAGCCGAAGAATGAGTGAACAAATAAGTAAAAGAAAACCAGCTGAAACACAGTA
## 4 TGCTGGGATTATAGTCATGAGCCACCATGCCCCACCTACAGGTCACCTTCTTTCTGGGGGTTCCCCTATCCCTCCGCTGCAGTCCGGTTTAATTGCTCCTGCTGCTCTACATTCAGAGACTTAGAACATCCATCCTTTGGTACCCTATACAAGCCACCTTCAGTGCAGGGGTTGCACTGTATTGTTATTTGTTGCATCTTGTGTCTTCCCCACCAGATTCTAGAGCCCTGGTTCTCACAATTAGCCTCATGTTGGAAACATGCCAATGCTGGGTCTCAGCCCCTAAATACTGATTTGTTATGGAGCATGCCTGGGCACTGGGATTTTTGAAAGCTCCCAGGTGATTTTAATATGCAACCAGGGATGAGAAACATTGCTCTAGACTTGGTTTCCTATCTCAGAGAGTTGGGTTTATCCCTGTGG
## 5 AACCCCAACAGCACTCAGCCAATGCCAAGACAGCCACTCAATGGAGTTACCTGGGGAGTGTCTGTAGGCCTCAGCGTCCCTGTAGTAGGCCAGTGCCGCAGCTGCAGCGGTGGCTGTGCTCGGGGCAGTGGAGGAAGCTGCTGCATCCAAGCCCTCAGGCCCAGAGGGGAAGAAAACCCCTGATTCTGGTGTGGAGGACACCAGAGCAGGATCCACAAACTGGGGGAGGGGCTCTGAGGTCCCCAGGGACCCCAGGCCAGGGAACTCCATGGAGCCTCTGGGGATTAACCTGCGAGGACAGAAGGGGTCCTCAGACACAGAAATCCTTCCACCCCACATCCTTTCACCTGCTCCTCTTCCTCCTTTCCCCCTCCTCCCACTCCATCACCTCAGTCTCCATATTTCTCCTTCCCACCTCCCCAT
## 7 ATGGGGAGGTGGGAAGGAGAAATATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTT
## 9 GGATCTCCATGGCAACCCCAACAGCACTCAGCCAATGCCAAGACAGCCACTCAATGGAGTTACCTGGGGAGTGTCTGTAGGCCTCAGCGTCCCTGTAGTAGGCCAGTGCCGCAGCTGCAGCGGTGGCTGTGCTCGGGGCAGTGGAGGAAGCTGCTGCATCCAAGCCCTCAGGCCCAGAGGGGAAGAAAACCCCTGATTCTGGTGTGGAGGACACCAGAGCAGGATCCACAAACTGGGGGAGGGGCTCTGAGGTCCCCAGGGACCCCAGGCCAGGGAACTCCATGGAGCCTCTGGGGATTAACCTGCGAGGACAGAAGGGGTCCTCAGACACAGAAATCCTTCCACCCCACATCCTTTCACCTGCTCCTCTTCCTCCTTTCCCCCTCCTCCCACTCCATCACCTCAGTCTCCATATTTCTCCTT
## 10 TATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTTGCCATGGAGATCCTTGGCTAGG
## 11 ATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTTGCCATGGAGATCCTTGGCTAGGT
## 12 TCAAGAACCCTAAATATGGAAAAAGGCTTGAAATAGGAAGGATACCCTGTTCACCACTATATCCTCAGTGTCTAAAAGCATAGGTCATAGTAGATACCCATGGTACATAGTACTACTACTGGTAGATGCTTGACAAATAGTTGTTGACCGGATAACTACCTGGTAGCAGCAGCTGGAGAGATTTCCCGGCAGTGTGGCTTTGTGTGTAGGACTCCAGAGCAAGCATAGCCCCCAAGGTCTCCTTGTGCCTTCAGCCCCCACTCTGGGCACTCTGAAATCAATAATAGAACAATGTGATTACAAAGCTAGGGGCTTCACCTGGCTTTCCTCCGCTTAAATCTTAGCCACTGTTGCTGACCTGTGGCTATACTTCCTGGCCTTTGACAAACAGCTTCCCCTGCCCTTCCTCCTATACCTCCTGGT
## 13 AGACATAAGATGAAAATAAAGATTTTAAGAGAGAGACCATGAGTGGTGGGCCAATACCTTTATTGGTCCAGGGCATTATCCAAACAGGTTTCCCTCAGGGAGTTTTAACTGGTGGGTTTAAAGCAATCAGGCATGAATTCCAATAGGTAACGCTGTGTCTGAGAGGTGGTCACTGTGGCATATCTGCACAGTGCATGCAGGGTGTGGGGGTCAGCAGAGCCAGTCAAGAGGTTGTATCTAGCTATCTTATAGAGAAGTGATCACCAAGAGGAAGTTGTCTAAGACACATATCTGGATCAACCACATTATGAAACTGGGGGTGGGGAGGGTGTAGAACTGGAAACTGTGGCCAGGCGCGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGTGGACAGATCACGAGGTCAGGAGT
## 14 TTTGCACAGTTTGGTGGAAAAAAGAATGGTTTCCCAGCTGGGTAGCACACTCACTCACTGCCTACCTTGGCTAGAGGTTGGGGACTCCCCTGCCCTGTGTGGCTATCAGGTGGGCCACCGCAACACACTTCTCTTTTCTCTCCATGTACAACACCAGCCGCCTAGTCAATTCTGATAAGATAAACTAGATACCTTGGTTGCCAGTGCAGGATTCACATACTAGTTATTATTTATTTTTTATGGGAGCCTCTGATCCCCAGTGCTTTTGTCATCCATCTTGGCCCCTCCCCAATATTTTCTTCTAAGACTCTTATAGTTTTAAATCTAATGTTTAGGCTTTTGATCCATTTTAAGTTAATTTTGTATAAAGAAGGGGTCCAATTTCATACTTTTACATGTGGATATTCAGTTTTTCTAATACAA
## 15 GCTGGAGCAGAAACAGAAAGGGGTTAGGGGAGACTGTGGGAAAATAAAGTGTGTTTGGGATCCCTAAGGTGAAGGGAGAGAAAGGGTGACAGTTTCAACAGCTTATCTTGACTTAGGCCAAAGAAGGTTAAGGGGTTCTGAATAACTGTCTCTCCACATCTTCCCTGCACTCCCCAGGTTGCCTCCCTTCCCAGGGATGGACAATTTCTGGATCCTGCTCCAGGTAGACATCCTGGGAAGGGCCCTCTCCCCACAAAACTGCCCTTCCCTGCCTGAGTGCAGGGCAGCCCCTTCCTCAACTTTCACCATTTCTCTCTCTTAACCTTAGGGATCCTGTACAAACATGTTGTTTTCCAGCACATTAGTACTTAGGGATATTTTCATATCATACTGAGCTCCCCACCCCCACCCCACCCAACATAT
## 16 AAGGAAGTAATTAAGATTAAATGAGGCTTGGAGAATGGGGCTCTAATTCGATAGGATTAATATCATTATAAGAAGAGGTACCAGAGTGTTGGCATGTGCATGCTCTCTGTCTTTCTTCGTGTGTGTGTGTGTGTGTATGTGTGTGTTTCTCCATGTGTATGCACTGAGGAAAGGACATGTGAGGATATAGAAAGAAGGCAGTGTTTTCCACACCAGAATGCAGCCCTTATCAGAAGTCAAATAAACCAGAACCTTTATCTTAGACTTCTAGCCTCTAAAACACTGAAAAATAAATTTTCGTTTAAACTGCCAGGTCTATAGTACCTTGTTACAGCAGCCTAAGGTGACTAATATAATTGTCAAGGTAGTGTTACTGGGAATTTTGTATTGGGATGGATTTTTTAACTGAGGAGATCAACACAG
## 17 GCAGGTGCACATAAGTCAAGAATTGAGGTTTGGGAACCTCCACCTAGATTTCAGAGGATGTATGAAAATGTCTGGATGTCCCTGCAGACATTTGCTGCATGGGTGGAGCCCTCATGGAGAACCTCTGCTAGGGCAATGTGGAAGAGACATGTGGGGTTGGAGCCCCAACACAGATAGTGGAGCTGTGAGAAGAGGGCCACTGTCCTCAACACCAGAATGATAGATCCACCGACAGCTTGCACCATGAGCCTGGGAAAGCCACATACACTTAACATCAGTCTGTGAAATCAGCTGAGGGGGGCGCTGTACCCTGCAAAGCCACAAAGTTGGAGCTGCCCAAGCCCTTTGATGCCCACTCCTTGCATCAGCATAACCTGAATGTGGGACATGGAGTCAAAGGAGATCACTTTGGAACTTTAAGTT
##
## $gRNAs.bedFormat
## [,1] [,2] [,3] [,4]
## [1,] "chrX" "48649563" "48649586" "Hsap_GATA1_ex2_gR7r"
## [2,] "chrX" "48649563" "48649586" "Hsap_GATA1_ex2_gR17f"
## [3,] "chrX" "48649576" "48649599" "Hsap_GATA1_ex2_gR20r"
## [4,] "chrX" "48649585" "48649608" "Hsap_GATA1_ex2_gR39f"
## [5,] "chrX" "48649586" "48649609" "Hsap_GATA1_ex2_gR40f"
## [,5] [,6] [,7] [,8]
## [1,] "101.719829329578" "-" "48649568" "48649570"
## [2,] "106.547942216103" "+" "48649579" "48649581"
## [3,] "142.091080678942" "-" "48649581" "48649583"
## [4,] "69.6295234374132" "+" "48649601" "48649603"
## [5,] "261.881755628254" "+" "48649602" "48649604"
##
## $REcutDetails
## gRNAPlusPAM REcutgRNAName REname REpattern
## 2 GGTGTGGAGGACACCAGAGCAGG Hsap_GATA1_ex2_gR20r Aco12261II CCRGAG
## 3 GTGTCCTCCACACCAGAATCAGG Hsap_GATA1_ex2_gR39f BslI CCNNNNNNNGG
## 4 TGTCCTCCACACCAGAATCAGGG Hsap_GATA1_ex2_gR40f BslI CCNNNNNNNGG
## 5 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r BslI CCNNNNNNNGG
## 6 GTGTCCTCCACACCAGAATCAGG Hsap_GATA1_ex2_gR39f HinfI GANTC
## 7 TGTCCTCCACACCAGAATCAGGG Hsap_GATA1_ex2_gR40f HinfI GANTC
## 8 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r PflMI CCANNNNNTGG
## 9 GTGTCCTCCACACCAGAATCAGG Hsap_GATA1_ex2_gR39f TfiI GAWTC
## 10 TGTCCTCCACACCAGAATCAGGG Hsap_GATA1_ex2_gR40f TfiI GAWTC
## REcutStart REcutEnd
## 2 14 19
## 3 13 23
## 4 12 22
## 5 13 23
## 6 16 20
## 7 15 19
## 8 13 23
## 9 16 20
## 10 15 19
##
## $REs.isUnique100
## [1] "" "" "HinfI TfiI" "HinfI TfiI" "PflMI"
##
## $REs.isUnique50
## [1] "" "" "HinfI TfiI" "HinfI TfiI" "PflMI"
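The alignment and mismatch.distance2PAM columns of $offtarget can be read together: the alignment string shows a dot at each matching position and the off-target base at each mismatch, and the distances appear to count positions from the PAM-proximal end of the 20-nt gRNA. The base-R sketch below reproduces both values for the first off-target row above. The helper describe_mismatches is hypothetical and not part of CRISPRseek, and the counting convention is inferred from the printed output rather than taken from the package source.

```r
# Hypothetical helper (not a CRISPRseek function): compare a 20-nt gRNA with an
# off-target protospacer, both written 5' to 3' without the PAM.
describe_mismatches <- function(gRNA, offTarget) {
  g <- strsplit(gRNA, "", fixed = TRUE)[[1]]
  o <- strsplit(offTarget, "", fixed = TRUE)[[1]]
  stopifnot(length(g) == length(o))

  mm <- which(g != o)                 # 1-based mismatch positions from the 5' end
  alignment <- rep(".", length(g))    # "." marks a match
  alignment[mm] <- o[mm]              # show the off-target base at each mismatch

  list(
    alignment    = paste(alignment, collapse = ""),
    n.mismatch   = length(mm),
    # Assumed convention: position 20 (next to the PAM) has distance 1.
    distance2PAM = length(g) - mm + 1
  )
}

# First off-target row of the default run (Hsap_GATA1_ex2_gR17f, CLCN4 site).
res <- describe_mismatches("CCAGTTTGTGGATCCTGCTC", "GTAGTTTGTGGATCCTGTTC")
print(res$alignment)
print(paste(res$distance2PAM, collapse = ","))
```

For this row the sketch reports three mismatches at distances 20,19,3, which agrees with the mismatch.distance2PAM value printed in the table.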
Paired nickases decrease off-target cleavage by requiring the independent binding of two separate gRNAs around a genomic region. Here is how to find gRNAs in a paired configuration.
offTargetAnalysis(inputFilePath,
                  findPairedgRNAOnly = TRUE,
                  min.gap = 0,
                  max.gap = 20,
                  BSgenomeName = Hsapiens,
                  chromToSearch = "chrX",
                  txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
                  orgAnn = org.Hs.egSYMBOL,
                  max.mismatch = 0,
                  outputDir = outputDir,
                  overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add paired information...
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## $on.target
## names forViewInUCSC
## [1,] "Hsap_GATA1_ex2_gR7r" "chrX:48649564-48649586"
## [2,] "Hsap_GATA1_ex2_gR40f" "chrX:48649587-48649609"
## extendedSequence gRNAefficacy
## [1,] "GACACCAGAGCAGGATCCACAAACTGGGGG" "0.101719829329578"
## [2,] "CTGGTGTCCTCCACACCAGAATCAGGGGTT" "0.261881755628254"
##
## $summary
## names forViewInUCSC
## 1 Hsap_GATA1_ex2_gR40f chrX:48649587-48649609
## 2 Hsap_GATA1_ex2_gR7r chrX:48649564-48649586
## extendedSequence gRNAefficacy gRNAsPlusPAM
## 1 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.261881755628254 TGTCCTCCACACCAGAATCANGG
## 2 GACACCAGAGCAGGATCCACAAACTGGGGG 0.101719829329578 CCAGAGCAGGATCCACAAACNGG
## top5OfftargetTotalScore top10OfftargetTotalScore
## 1 NA NA
## 2 NA NA
## top1Hit.onTarget.MMdistance2PAM topOfftarget1MMdistance2PAM
## 1 NMM
## 2 NMM
## topOfftarget2MMdistance2PAM topOfftarget3MMdistance2PAM
## 1
## 2
## topOfftarget4MMdistance2PAM topOfftarget5MMdistance2PAM
## 1
## 2
## topOfftarget6MMdistance2PAM topOfftarget7MMdistance2PAM
## 1
## 2
## topOfftarget8MMdistance2PAM topOfftarget9MMdistance2PAM
## 1
## 2
## topOfftarget10MMdistance2PAM PairedgRNAName REname
## 1 Hsap_GATA1_ex2_gR7r BslI HinfI TfiI
## 2 Hsap_GATA1_ex2_gR40f BslI PflMI
## uniqREin200 uniqREin100
## 1
## 2
##
## $offtarget
## name gRNAPlusPAM OffTargetSequence
## 1 Hsap_GATA1_ex2_gR7r CCAGAGCAGGATCCACAAACNGG CCAGAGCAGGATCCACAAACTGG
## 2 Hsap_GATA1_ex2_gR40f TGTCCTCCACACCAGAATCANGG TGTCCTCCACACCAGAATCAGGG
## inExon inIntron entrez_id gene score n.mismatch mismatch.distance2PAM
## 1 TRUE 2623 GATA1 100 0 <NA>
## 2 TRUE 2623 GATA1 100 0 <NA>
## alignment isCanonicalPAM forViewInUCSC strand chrom
## 1 .................... 1 chrX:48649564-48649586 - chrX
## 2 .................... 1 chrX:48649587-48649609 + chrX
## chromStart chromEnd extendedSequence gRNAefficacy
## 1 48649564 48649586 GACACCAGAGCAGGATCCACAAACTGGGGG 0.1017198
## 2 48649587 48649609 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.2618818
## flankSequence
## 1 AACCCCAACAGCACTCAGCCAATGCCAAGACAGCCACTCAATGGAGTTACCTGGGGAGTGTCTGTAGGCCTCAGCGTCCCTGTAGTAGGCCAGTGCCGCAGCTGCAGCGGTGGCTGTGCTCGGGGCAGTGGAGGAAGCTGCTGCATCCAAGCCCTCAGGCCCAGAGGGGAAGAAAACCCCTGATTCTGGTGTGGAGGACACCAGAGCAGGATCCACAAACTGGGGGAGGGGCTCTGAGGTCCCCAGGGACCCCAGGCCAGGGAACTCCATGGAGCCTCTGGGGATTAACCTGCGAGGACAGAAGGGGTCCTCAGACACAGAAATCCTTCCACCCCACATCCTTTCACCTGCTCCTCTTCCTCCTTTCCCCCTCCTCCCACTCCATCACCTCAGTCTCCATATTTCTCCTTCCCACCTCCCCAT
## 2 ATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTTGCCATGGAGATCCTTGGCTAGGT
##
## $gRNAs.bedFormat
## [,1] [,2] [,3] [,4]
## [1,] "chrX" "48649563" "48649586" "Hsap_GATA1_ex2_gR7r"
## [2,] "chrX" "48649586" "48649609" "Hsap_GATA1_ex2_gR40f"
## [,5] [,6] [,7] [,8]
## [1,] "101.719829329578" "-" "48649568" "48649570"
## [2,] "261.881755628254" "+" "48649602" "48649604"
##
## $REcutDetails
## ReversegRNAPlusPAM ReversegRNAName ForwardgRNAPlusPAM
## 1 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 2 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 3 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 4 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 5 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 6 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## ForwardgRNAName gap ForwardREcutgRNAName ForwardREname
## 1 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f BslI
## 2 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f BslI
## 3 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f HinfI
## 4 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f HinfI
## 5 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f TfiI
## 6 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f TfiI
## ForwardREpattern ForwardREcutStart ForwardREcutEnd ReverseREcutgRNAName
## 1 CCNNNNNNNGG 12 22 Hsap_GATA1_ex2_gR7r
## 2 CCNNNNNNNGG 12 22 Hsap_GATA1_ex2_gR7r
## 3 GANTC 15 19 Hsap_GATA1_ex2_gR7r
## 4 GANTC 15 19 Hsap_GATA1_ex2_gR7r
## 5 GAWTC 15 19 Hsap_GATA1_ex2_gR7r
## 6 GAWTC 15 19 Hsap_GATA1_ex2_gR7r
## ReverseREname ReverseREpattern ReverseREcutStart ReverseREcutEnd
## 1 BslI CCNNNNNNNGG 13 23
## 2 PflMI CCANNNNNTGG 13 23
## 3 BslI CCNNNNNNNGG 13 23
## 4 PflMI CCANNNNNTGG 13 23
## 5 BslI CCNNNNNNNGG 13 23
## 6 PflMI CCANNNNNTGG 13 23
##
## $REs.isUnique100
## [1] "" ""
##
## $REs.isUnique50
## [1] "" ""
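The gap column of $REcutDetails can be checked by hand from the coordinates in $offtarget. The sketch below assumes that the gap is the number of bases lying strictly between the reverse gRNA footprint and the forward gRNA footprint that follows it, using the 1-based inclusive coordinates printed above. This definition reproduces the reported gap of 0 but is an inference from the output, and gap_between and is_valid_pair are hypothetical helpers rather than CRISPRseek functions.

```r
# Hypothetical helpers (not part of CRISPRseek).
# Coordinates are 1-based and inclusive, as in the chromStart/chromEnd columns.
gap_between <- function(reverseEnd, forwardStart) {
  forwardStart - reverseEnd - 1
}

# A pair qualifies when the reverse gRNA sits before the forward gRNA and the
# gap lies within [min.gap, max.gap]. A negative gap means the order is wrong
# or the footprints overlap, so it fails the min.gap = 0 check.
is_valid_pair <- function(reverseEnd, forwardStart, min.gap = 0, max.gap = 20) {
  gap <- gap_between(reverseEnd, forwardStart)
  gap >= min.gap & gap <= max.gap
}

# Hsap_GATA1_ex2_gR7r (reverse) ends at 48649586;
# Hsap_GATA1_ex2_gR40f (forward) starts at 48649587.
gap <- gap_between(reverseEnd = 48649586, forwardStart = 48649587)
print(gap)
print(is_valid_pair(48649586, 48649587))

# Hsap_GATA1_ex2_gR20r (reverse, ends at 48649599) lies after
# Hsap_GATA1_ex2_gR17f (forward, starts at 48649564), so it is not a pair.
print(is_valid_pair(48649599, 48649564))
```

Under these assumptions only the gR7r/gR40f combination passes, which matches the two gRNAs retained in the paired run above.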
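You can control the off-target search with max.mismatch, the maximum number of mismatches allowed for a site to be considered a potential off-target. The example below allows up to 2 mismatches. The default is 3, which is consistent with the default run above, where no reported off-target has more than 3 mismatches.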
offTargetAnalysis(inputFilePath,
                  findPairedgRNAOnly = TRUE, min.gap = 0, max.gap = 20,
                  BSgenomeName = Hsapiens,
                  chromToSearch = "chrX",
                  txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
                  orgAnn = org.Hs.egSYMBOL,
                  max.mismatch = 2,
                  outputDir = outputDir,
                  overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add paired information...
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## $on.target
## names forViewInUCSC
## [1,] "Hsap_GATA1_ex2_gR7r" "chrX:48649564-48649586"
## [2,] "Hsap_GATA1_ex2_gR40f" "chrX:48649587-48649609"
## extendedSequence gRNAefficacy
## [1,] "GACACCAGAGCAGGATCCACAAACTGGGGG" "0.101719829329578"
## [2,] "CTGGTGTCCTCCACACCAGAATCAGGGGTT" "0.261881755628254"
##
## $summary
## names forViewInUCSC
## 1 Hsap_GATA1_ex2_gR40f chrX:48649587-48649609
## 2 Hsap_GATA1_ex2_gR7r chrX:48649564-48649586
## extendedSequence gRNAefficacy gRNAsPlusPAM
## 1 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.261881755628254 TGTCCTCCACACCAGAATCANGG
## 2 GACACCAGAGCAGGATCCACAAACTGGGGG 0.101719829329578 CCAGAGCAGGATCCACAAACNGG
## top5OfftargetTotalScore top10OfftargetTotalScore
## 1 2.9 2.9
## 2 NA NA
## top1Hit.onTarget.MMdistance2PAM topOfftarget1MMdistance2PAM
## 1 NMM 13,2
## 2 NMM
## topOfftarget2MMdistance2PAM topOfftarget3MMdistance2PAM
## 1
## 2
## topOfftarget4MMdistance2PAM topOfftarget5MMdistance2PAM
## 1
## 2
## topOfftarget6MMdistance2PAM topOfftarget7MMdistance2PAM
## 1
## 2
## topOfftarget8MMdistance2PAM topOfftarget9MMdistance2PAM
## 1
## 2
## topOfftarget10MMdistance2PAM PairedgRNAName REname
## 1 Hsap_GATA1_ex2_gR7r BslI HinfI TfiI
## 2 Hsap_GATA1_ex2_gR40f BslI PflMI
## uniqREin200 uniqREin100
## 1
## 2
##
## $offtarget
## name gRNAPlusPAM OffTargetSequence
## 1 Hsap_GATA1_ex2_gR7r CCAGAGCAGGATCCACAAACNGG CCAGAGCAGGATCCACAAACTGG
## 2 Hsap_GATA1_ex2_gR40f TGTCCTCCACACCAGAATCANGG TGTCCTCCACACCAGAATCAGGG
## 3 Hsap_GATA1_ex2_gR40f TGTCCTCCACACCAGAATCANGG TGTCCTCAACACCAGAATGATAG
## inExon inIntron entrez_id gene score n.mismatch mismatch.distance2PAM
## 1 TRUE 2623 GATA1 100 0
## 2 TRUE 2623 GATA1 100 0
## 3 TRUE 139324 HDX 2.9 2 13,2
## alignment isCanonicalPAM forViewInUCSC strand chrom
## 1 .................... 1 chrX:48649564-48649586 - chrX
## 2 .................... 1 chrX:48649587-48649609 + chrX
## 3 .......A..........G. 0 chrX:83738197-83738219 - chrX
## chromStart chromEnd extendedSequence gRNAefficacy
## 1 48649564 48649586 GACACCAGAGCAGGATCCACAAACTGGGGG 0.1017198
## 2 48649587 48649609 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.2618818
## 3 83738197 83738219 CCACTGTCCTCAACACCAGAATGATAGATC 0.2172963
## flankSequence
## 1 AACCCCAACAGCACTCAGCCAATGCCAAGACAGCCACTCAATGGAGTTACCTGGGGAGTGTCTGTAGGCCTCAGCGTCCCTGTAGTAGGCCAGTGCCGCAGCTGCAGCGGTGGCTGTGCTCGGGGCAGTGGAGGAAGCTGCTGCATCCAAGCCCTCAGGCCCAGAGGGGAAGAAAACCCCTGATTCTGGTGTGGAGGACACCAGAGCAGGATCCACAAACTGGGGGAGGGGCTCTGAGGTCCCCAGGGACCCCAGGCCAGGGAACTCCATGGAGCCTCTGGGGATTAACCTGCGAGGACAGAAGGGGTCCTCAGACACAGAAATCCTTCCACCCCACATCCTTTCACCTGCTCCTCTTCCTCCTTTCCCCCTCCTCCCACTCCATCACCTCAGTCTCCATATTTCTCCTTCCCACCTCCCCAT
## 2 ATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTTGCCATGGAGATCCTTGGCTAGGT
## 3 GCAGGTGCACATAAGTCAAGAATTGAGGTTTGGGAACCTCCACCTAGATTTCAGAGGATGTATGAAAATGTCTGGATGTCCCTGCAGACATTTGCTGCATGGGTGGAGCCCTCATGGAGAACCTCTGCTAGGGCAATGTGGAAGAGACATGTGGGGTTGGAGCCCCAACACAGATAGTGGAGCTGTGAGAAGAGGGCCACTGTCCTCAACACCAGAATGATAGATCCACCGACAGCTTGCACCATGAGCCTGGGAAAGCCACATACACTTAACATCAGTCTGTGAAATCAGCTGAGGGGGGCGCTGTACCCTGCAAAGCCACAAAGTTGGAGCTGCCCAAGCCCTTTGATGCCCACTCCTTGCATCAGCATAACCTGAATGTGGGACATGGAGTCAAAGGAGATCACTTTGGAACTTTAAGTT
##
## $gRNAs.bedFormat
## [,1] [,2] [,3] [,4]
## [1,] "chrX" "48649563" "48649586" "Hsap_GATA1_ex2_gR7r"
## [2,] "chrX" "48649586" "48649609" "Hsap_GATA1_ex2_gR40f"
## [,5] [,6] [,7] [,8]
## [1,] "101.719829329578" "-" "48649568" "48649570"
## [2,] "261.881755628254" "+" "48649602" "48649604"
##
## $REcutDetails
## ReversegRNAPlusPAM ReversegRNAName ForwardgRNAPlusPAM
## 1 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 2 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 3 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 4 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 5 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 6 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## ForwardgRNAName gap ForwardREcutgRNAName ForwardREname
## 1 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f BslI
## 2 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f BslI
## 3 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f HinfI
## 4 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f HinfI
## 5 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f TfiI
## 6 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f TfiI
## ForwardREpattern ForwardREcutStart ForwardREcutEnd ReverseREcutgRNAName
## 1 CCNNNNNNNGG 12 22 Hsap_GATA1_ex2_gR7r
## 2 CCNNNNNNNGG 12 22 Hsap_GATA1_ex2_gR7r
## 3 GANTC 15 19 Hsap_GATA1_ex2_gR7r
## 4 GANTC 15 19 Hsap_GATA1_ex2_gR7r
## 5 GAWTC 15 19 Hsap_GATA1_ex2_gR7r
## 6 GAWTC 15 19 Hsap_GATA1_ex2_gR7r
## ReverseREname ReverseREpattern ReverseREcutStart ReverseREcutEnd
## 1 BslI CCNNNNNNNGG 13 23
## 2 PflMI CCANNNNNTGG 13 23
## 3 BslI CCNNNNNNNGG 13 23
## 4 PflMI CCANNNNNTGG 13 23
## 5 BslI CCNNNNNNNGG 13 23
## 6 PflMI CCANNNNNTGG 13 23
##
## $REs.isUnique100
## [1] "" ""
##
## $REs.isUnique50
## [1] "" ""
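To see how a mismatch limit changes the summary, the sketch below takes the off-target scores and mismatch counts printed by the first chrX run in this document (on-target rows with score 100 left out), keeps only sites with at most 2 mismatches, and sums the highest off-target scores per gRNA. It assumes that top5OfftargetTotalScore is the sum of the five highest off-target scores for a gRNA; that reading agrees with the values printed here but is an inference, not a statement about the package internals. top_n_total is a hypothetical helper.

```r
# Off-target rows copied from the $offtarget table of the first chrX run.
offtargets <- data.frame(
  name = c("Hsap_GATA1_ex2_gR17f", "Hsap_GATA1_ex2_gR20r", "Hsap_GATA1_ex2_gR20r",
           "Hsap_GATA1_ex2_gR39f", "Hsap_GATA1_ex2_gR20r", "Hsap_GATA1_ex2_gR20r",
           "Hsap_GATA1_ex2_gR7r",  "Hsap_GATA1_ex2_gR17f", "Hsap_GATA1_ex2_gR39f",
           "Hsap_GATA1_ex2_gR40f"),
  score      = c(0.7, 1, 1.4, 0.3, 0.8, 0.2, 0.2, 2.6, 0.8, 2.9),
  n.mismatch = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sum of the n highest off-target scores per gRNA.
top_n_total <- function(df, n = 5) {
  sapply(split(df$score, df$name),
         function(s) sum(head(sort(s, decreasing = TRUE), n)))
}

# Totals with every listed off-target (up to 3 mismatches).
totals_all <- top_n_total(offtargets)
print(round(totals_all, 1))

# Totals after keeping only sites with at most 2 mismatches.
kept <- offtargets[offtargets$n.mismatch <= 2, ]
totals_kept <- top_n_total(kept)
print(round(totals_kept, 1))
```

With the 2-mismatch limit, only the HDX site for Hsap_GATA1_ex2_gR40f remains, giving a total of 2.9, the same value shown in the $summary table of the max.mismatch = 2 run. Filtering a finished table is only an illustration; in offTargetAnalysis the limit is applied during the genome search itself.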
Paired gRNAs with proper spacing and orientation give more specificity, and gRNAs that overlap restriction enzyme cut sites facilitate cleavage monitoring. Calling offTargetAnalysis with findPairedgRNAOnly = TRUE and findgRNAsWithREcutOnly = TRUE searches, scores and annotates gRNAs that are in a paired configuration and for which at least one gRNA of the pair overlaps a restriction enzyme cut site. To be considered a pair, the gap between the forward gRNA and the corresponding reverse gRNA must lie between min.gap and max.gap, inclusive, and the reverse gRNA must sit before the forward gRNA. The default (min.gap, max.gap) is (0, 20). For a gRNA to be considered as overlapping a restriction enzyme cut site, the enzyme cut pattern must overlap at least one of the gRNA positions specified in overlap.gRNA.positions (default: positions 17 and 18).
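The two restriction enzyme filters can be illustrated on the $REcutDetails values reported earlier for the paired gRNAs. The sketch below assumes that minREpatternSize is a minimum pattern length in bases (IUPAC codes such as N counted as one base each) and that REcutStart and REcutEnd are 1-based positions within the 23-nt gRNA plus PAM. Both assumptions are inferred from the printed output, and overlaps_positions is a hypothetical helper, not a CRISPRseek function.

```r
# RE sites reported earlier for the two paired gRNAs.
re <- data.frame(
  gRNA       = c("Hsap_GATA1_ex2_gR40f", "Hsap_GATA1_ex2_gR7r",
                 "Hsap_GATA1_ex2_gR40f", "Hsap_GATA1_ex2_gR7r",
                 "Hsap_GATA1_ex2_gR40f"),
  REname     = c("BslI", "BslI", "HinfI", "PflMI", "TfiI"),
  REpattern  = c("CCNNNNNNNGG", "CCNNNNNNNGG", "GANTC", "CCANNNNNTGG", "GAWTC"),
  REcutStart = c(12, 13, 15, 13, 15),
  REcutEnd   = c(22, 23, 19, 23, 19),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when the pattern span [start, end] covers at least
# one of the requested gRNA positions.
overlaps_positions <- function(start, end, positions = c(17, 18)) {
  mapply(function(s, e) any(positions >= s & positions <= e), start, end)
}

minREpatternSize <- 6
keep <- nchar(re$REpattern) >= minREpatternSize &
  overlaps_positions(re$REcutStart, re$REcutEnd, positions = c(17, 18))
kept <- re[keep, ]

# Collapse the surviving enzyme names per gRNA.
per_gRNA <- tapply(kept$REname, kept$gRNA, paste, collapse = " ")
print(per_gRNA)
```

Here HinfI and TfiI cover positions 17 and 18 but have 5-base patterns, so a minimum pattern size of 6 removes them and leaves BslI for gR40f and BslI plus PflMI for gR7r.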
offTargetAnalysis(inputFilePath,
                  findgRNAsWithREcutOnly = TRUE,
                  minREpatternSize = 6,
                  overlap.gRNA.positions = c(17, 18),
                  findPairedgRNAOnly = TRUE, min.gap = 0, max.gap = 20,
                  BSgenomeName = Hsapiens,
                  chromToSearch = "chrX",
                  txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
                  orgAnn = org.Hs.egSYMBOL,
                  max.mismatch = 0,
                  outputDir = outputDir,
                  overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add paired information...
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## $on.target
## names forViewInUCSC
## [1,] "Hsap_GATA1_ex2_gR7r" "chrX:48649564-48649586"
## [2,] "Hsap_GATA1_ex2_gR40f" "chrX:48649587-48649609"
## extendedSequence gRNAefficacy
## [1,] "GACACCAGAGCAGGATCCACAAACTGGGGG" "0.101719829329578"
## [2,] "CTGGTGTCCTCCACACCAGAATCAGGGGTT" "0.261881755628254"
##
## $summary
## names forViewInUCSC
## 1 Hsap_GATA1_ex2_gR40f chrX:48649587-48649609
## 2 Hsap_GATA1_ex2_gR7r chrX:48649564-48649586
## extendedSequence gRNAefficacy gRNAsPlusPAM
## 1 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.261881755628254 TGTCCTCCACACCAGAATCANGG
## 2 GACACCAGAGCAGGATCCACAAACTGGGGG 0.101719829329578 CCAGAGCAGGATCCACAAACNGG
## top5OfftargetTotalScore top10OfftargetTotalScore
## 1 NA NA
## 2 NA NA
## top1Hit.onTarget.MMdistance2PAM topOfftarget1MMdistance2PAM
## 1 NMM
## 2 NMM
## topOfftarget2MMdistance2PAM topOfftarget3MMdistance2PAM
## 1
## 2
## topOfftarget4MMdistance2PAM topOfftarget5MMdistance2PAM
## 1
## 2
## topOfftarget6MMdistance2PAM topOfftarget7MMdistance2PAM
## 1
## 2
## topOfftarget8MMdistance2PAM topOfftarget9MMdistance2PAM
## 1
## 2
## topOfftarget10MMdistance2PAM PairedgRNAName REname
## 1 Hsap_GATA1_ex2_gR7r BslI
## 2 Hsap_GATA1_ex2_gR40f BslI PflMI
## uniqREin200 uniqREin100
## 1
## 2
##
## $offtarget
## name gRNAPlusPAM OffTargetSequence
## 1 Hsap_GATA1_ex2_gR7r CCAGAGCAGGATCCACAAACNGG CCAGAGCAGGATCCACAAACTGG
## 2 Hsap_GATA1_ex2_gR40f TGTCCTCCACACCAGAATCANGG TGTCCTCCACACCAGAATCAGGG
## inExon inIntron entrez_id gene score n.mismatch mismatch.distance2PAM
## 1 TRUE 2623 GATA1 100 0 <NA>
## 2 TRUE 2623 GATA1 100 0 <NA>
## alignment isCanonicalPAM forViewInUCSC strand chrom
## 1 .................... 1 chrX:48649564-48649586 - chrX
## 2 .................... 1 chrX:48649587-48649609 + chrX
## chromStart chromEnd extendedSequence gRNAefficacy
## 1 48649564 48649586 GACACCAGAGCAGGATCCACAAACTGGGGG 0.1017198
## 2 48649587 48649609 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.2618818
## flankSequence
## 1 AACCCCAACAGCACTCAGCCAATGCCAAGACAGCCACTCAATGGAGTTACCTGGGGAGTGTCTGTAGGCCTCAGCGTCCCTGTAGTAGGCCAGTGCCGCAGCTGCAGCGGTGGCTGTGCTCGGGGCAGTGGAGGAAGCTGCTGCATCCAAGCCCTCAGGCCCAGAGGGGAAGAAAACCCCTGATTCTGGTGTGGAGGACACCAGAGCAGGATCCACAAACTGGGGGAGGGGCTCTGAGGTCCCCAGGGACCCCAGGCCAGGGAACTCCATGGAGCCTCTGGGGATTAACCTGCGAGGACAGAAGGGGTCCTCAGACACAGAAATCCTTCCACCCCACATCCTTTCACCTGCTCCTCTTCCTCCTTTCCCCCTCCTCCCACTCCATCACCTCAGTCTCCATATTTCTCCTTCCCACCTCCCCAT
## 2 ATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTTGCCATGGAGATCCTTGGCTAGGT
##
## $gRNAs.bedFormat
## [,1] [,2] [,3] [,4]
## [1,] "chrX" "48649563" "48649586" "Hsap_GATA1_ex2_gR7r"
## [2,] "chrX" "48649586" "48649609" "Hsap_GATA1_ex2_gR40f"
## [,5] [,6] [,7] [,8]
## [1,] "101.719829329578" "-" "48649568" "48649570"
## [2,] "261.881755628254" "+" "48649602" "48649604"
##
## $REcutDetails
## ReversegRNAPlusPAM ReversegRNAName ForwardgRNAPlusPAM
## 1 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## 2 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r TGTCCTCCACACCAGAATCAGGG
## ForwardgRNAName gap ForwardREcutgRNAName ForwardREname
## 1 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f BslI
## 2 Hsap_GATA1_ex2_gR40f 0 Hsap_GATA1_ex2_gR40f BslI
## ForwardREpattern ForwardREcutStart ForwardREcutEnd ReverseREcutgRNAName
## 1 CCNNNNNNNGG 12 22 Hsap_GATA1_ex2_gR7r
## 2 CCNNNNNNNGG 12 22 Hsap_GATA1_ex2_gR7r
## ReverseREname ReverseREpattern ReverseREcutStart ReverseREcutEnd
## 1 BslI CCNNNNNNNGG 13 23
## 2 PflMI CCANNNNNTGG 13 23
##
## $REs.isUnique100
## [1] "" ""
##
## $REs.isUnique50
## [1] "" ""
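The gap of 0 reported in $REcutDetails can be checked by hand from the forViewInUCSC coordinates. The base R sketch below assumes the gap is the number of bases lying strictly between the reverse and the forward target site. The helpers pair_gap and is_valid_pair are hypothetical, written only for this illustration, and are not CRISPRseek functions.

```r
# Hypothetical helpers (not part of CRISPRseek) illustrating the pairing rule.
# Coordinates are 1-based and inclusive, as in the forViewInUCSC column.
pair_gap <- function(reverse_end, forward_start) {
  # Assumption: the gap counts the bases strictly between the two sites.
  forward_start - reverse_end - 1
}

is_valid_pair <- function(reverse_end, forward_start,
                          min.gap = 0, max.gap = 20) {
  gap <- pair_gap(reverse_end, forward_start)
  gap >= min.gap & gap <= max.gap   # both bounds are inclusive
}

# Hsap_GATA1_ex2_gR7r ends at 48649586; Hsap_GATA1_ex2_gR40f starts at 48649587
gap    <- pair_gap(reverse_end = 48649586, forward_start = 48649587)
paired <- is_valid_pair(reverse_end = 48649586, forward_start = 48649587)

# A made-up forward gRNA starting 22 bases downstream gives a gap of 21,
# which falls outside the default (0, 20) window.
too_far <- is_valid_pair(reverse_end = 48649586, forward_start = 48649608)

print(c(gap = gap, paired = paired, too_far = too_far))
```

With the coordinates of the reported pair, the computed gap agrees with the gap column in $REcutDetails.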
offTargetAnalysis(inputFilePath,
    findgRNAsWithREcutOnly = TRUE,
    minREpatternSize = 6,
    overlap.gRNA.positions = c(17, 18),
    findPairedgRNAOnly = FALSE,
    BSgenomeName = Hsapiens,
    chromToSearch = "chrX",
    max.mismatch = 0,
    txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
    orgAnn = org.Hs.egSYMBOL,
    outputDir = outputDir,
    overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add paired information...
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## $on.target
## names forViewInUCSC
## [1,] "Hsap_GATA1_ex2_gR7r" "chrX:48649564-48649586"
## [2,] "Hsap_GATA1_ex2_gR20r" "chrX:48649577-48649599"
## [3,] "Hsap_GATA1_ex2_gR39f" "chrX:48649586-48649608"
## [4,] "Hsap_GATA1_ex2_gR40f" "chrX:48649587-48649609"
## extendedSequence gRNAefficacy
## [1,] "GACACCAGAGCAGGATCCACAAACTGGGGG" "0.101719829329578"
## [2,] "TTCTGGTGTGGAGGACACCAGAGCAGGATC" "0.142091080678942"
## [3,] "TCTGGTGTCCTCCACACCAGAATCAGGGGT" "0.0696295234374132"
## [4,] "CTGGTGTCCTCCACACCAGAATCAGGGGTT" "0.261881755628254"
##
## $summary
## names forViewInUCSC
## 1 Hsap_GATA1_ex2_gR20r chrX:48649577-48649599
## 2 Hsap_GATA1_ex2_gR39f chrX:48649586-48649608
## 3 Hsap_GATA1_ex2_gR40f chrX:48649587-48649609
## 4 Hsap_GATA1_ex2_gR7r chrX:48649564-48649586
## extendedSequence gRNAefficacy
## 1 TTCTGGTGTGGAGGACACCAGAGCAGGATC 0.142091080678942
## 2 TCTGGTGTCCTCCACACCAGAATCAGGGGT 0.0696295234374132
## 3 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.261881755628254
## 4 GACACCAGAGCAGGATCCACAAACTGGGGG 0.101719829329578
## gRNAsPlusPAM top5OfftargetTotalScore top10OfftargetTotalScore
## 1 GGTGTGGAGGACACCAGAGCNGG NA NA
## 2 GTGTCCTCCACACCAGAATCNGG NA NA
## 3 TGTCCTCCACACCAGAATCANGG NA NA
## 4 CCAGAGCAGGATCCACAAACNGG NA NA
## top1Hit.onTarget.MMdistance2PAM topOfftarget1MMdistance2PAM
## 1 NMM
## 2 NMM
## 3 NMM
## 4 NMM
## topOfftarget2MMdistance2PAM topOfftarget3MMdistance2PAM
## 1
## 2
## 3
## 4
## topOfftarget4MMdistance2PAM topOfftarget5MMdistance2PAM
## 1
## 2
## 3
## 4
## topOfftarget6MMdistance2PAM topOfftarget7MMdistance2PAM
## 1
## 2
## 3
## 4
## topOfftarget8MMdistance2PAM topOfftarget9MMdistance2PAM
## 1
## 2
## 3
## 4
## topOfftarget10MMdistance2PAM REname uniqREin200 uniqREin100
## 1 Aco12261II
## 2 BslI
## 3 BslI
## 4 BslI PflMI PflMI PflMI
##
## $offtarget
## name gRNAPlusPAM OffTargetSequence
## 1 Hsap_GATA1_ex2_gR7r CCAGAGCAGGATCCACAAACNGG CCAGAGCAGGATCCACAAACTGG
## 2 Hsap_GATA1_ex2_gR20r GGTGTGGAGGACACCAGAGCNGG GGTGTGGAGGACACCAGAGCAGG
## 3 Hsap_GATA1_ex2_gR39f GTGTCCTCCACACCAGAATCNGG GTGTCCTCCACACCAGAATCAGG
## 4 Hsap_GATA1_ex2_gR40f TGTCCTCCACACCAGAATCANGG TGTCCTCCACACCAGAATCAGGG
## inExon inIntron entrez_id gene score n.mismatch mismatch.distance2PAM
## 1 TRUE 2623 GATA1 100 0 <NA>
## 2 TRUE 2623 GATA1 100 0 <NA>
## 3 TRUE 2623 GATA1 100 0 <NA>
## 4 TRUE 2623 GATA1 100 0 <NA>
## alignment isCanonicalPAM forViewInUCSC strand chrom
## 1 .................... 1 chrX:48649564-48649586 - chrX
## 2 .................... 1 chrX:48649577-48649599 - chrX
## 3 .................... 1 chrX:48649586-48649608 + chrX
## 4 .................... 1 chrX:48649587-48649609 + chrX
## chromStart chromEnd extendedSequence gRNAefficacy
## 1 48649564 48649586 GACACCAGAGCAGGATCCACAAACTGGGGG 0.10171983
## 2 48649577 48649599 TTCTGGTGTGGAGGACACCAGAGCAGGATC 0.14209108
## 3 48649586 48649608 TCTGGTGTCCTCCACACCAGAATCAGGGGT 0.06962952
## 4 48649587 48649609 CTGGTGTCCTCCACACCAGAATCAGGGGTT 0.26188176
## flankSequence
## 1 AACCCCAACAGCACTCAGCCAATGCCAAGACAGCCACTCAATGGAGTTACCTGGGGAGTGTCTGTAGGCCTCAGCGTCCCTGTAGTAGGCCAGTGCCGCAGCTGCAGCGGTGGCTGTGCTCGGGGCAGTGGAGGAAGCTGCTGCATCCAAGCCCTCAGGCCCAGAGGGGAAGAAAACCCCTGATTCTGGTGTGGAGGACACCAGAGCAGGATCCACAAACTGGGGGAGGGGCTCTGAGGTCCCCAGGGACCCCAGGCCAGGGAACTCCATGGAGCCTCTGGGGATTAACCTGCGAGGACAGAAGGGGTCCTCAGACACAGAAATCCTTCCACCCCACATCCTTTCACCTGCTCCTCTTCCTCCTTTCCCCCTCCTCCCACTCCATCACCTCAGTCTCCATATTTCTCCTTCCCACCTCCCCAT
## 2 GGATCTCCATGGCAACCCCAACAGCACTCAGCCAATGCCAAGACAGCCACTCAATGGAGTTACCTGGGGAGTGTCTGTAGGCCTCAGCGTCCCTGTAGTAGGCCAGTGCCGCAGCTGCAGCGGTGGCTGTGCTCGGGGCAGTGGAGGAAGCTGCTGCATCCAAGCCCTCAGGCCCAGAGGGGAAGAAAACCCCTGATTCTGGTGTGGAGGACACCAGAGCAGGATCCACAAACTGGGGGAGGGGCTCTGAGGTCCCCAGGGACCCCAGGCCAGGGAACTCCATGGAGCCTCTGGGGATTAACCTGCGAGGACAGAAGGGGTCCTCAGACACAGAAATCCTTCCACCCCACATCCTTTCACCTGCTCCTCTTCCTCCTTTCCCCCTCCTCCCACTCCATCACCTCAGTCTCCATATTTCTCCTT
## 3 TATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTTGCCATGGAGATCCTTGGCTAGG
## 4 ATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTTGCCATGGAGATCCTTGGCTAGGT
##
## $gRNAs.bedFormat
## [,1] [,2] [,3] [,4]
## [1,] "chrX" "48649563" "48649586" "Hsap_GATA1_ex2_gR7r"
## [2,] "chrX" "48649576" "48649599" "Hsap_GATA1_ex2_gR20r"
## [3,] "chrX" "48649585" "48649608" "Hsap_GATA1_ex2_gR39f"
## [4,] "chrX" "48649586" "48649609" "Hsap_GATA1_ex2_gR40f"
## [,5] [,6] [,7] [,8]
## [1,] "101.719829329578" "-" "48649568" "48649570"
## [2,] "142.091080678942" "-" "48649581" "48649583"
## [3,] "69.6295234374132" "+" "48649601" "48649603"
## [4,] "261.881755628254" "+" "48649602" "48649604"
##
## $REcutDetails
## gRNAPlusPAM REcutgRNAName REname REpattern
## 2 GGTGTGGAGGACACCAGAGCAGG Hsap_GATA1_ex2_gR20r Aco12261II CCRGAG
## 3 GTGTCCTCCACACCAGAATCAGG Hsap_GATA1_ex2_gR39f BslI CCNNNNNNNGG
## 4 TGTCCTCCACACCAGAATCAGGG Hsap_GATA1_ex2_gR40f BslI CCNNNNNNNGG
## 5 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r BslI CCNNNNNNNGG
## 6 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r PflMI CCANNNNNTGG
## REcutStart REcutEnd
## 2 14 19
## 3 13 23
## 4 12 22
## 5 13 23
## 6 13 23
##
## $REs.isUnique100
## [1] "" "" "" "PflMI"
##
## $REs.isUnique50
## [1] "" "" "" "PflMI"
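The REcutStart and REcutEnd columns in $REcutDetails show why each gRNA passed the RE filter. The following base R sketch assumes that "overlap" means at least one of the positions in overlap.gRNA.positions lies inside the interval REcutStart..REcutEnd. The helper overlaps_positions is hypothetical and not a CRISPRseek function.

```r
# Hypothetical helper (not part of CRISPRseek).
# Assumption: an RE pattern spanning gRNA positions start..end overlaps
# if at least one requested position falls inside that interval.
overlaps_positions <- function(start, end, positions = c(17, 18)) {
  mapply(function(s, e) any(positions >= s & positions <= e), start, end)
}

# Values copied from the $REcutDetails table above
re_cuts <- data.frame(
  REcutgRNAName = c("Hsap_GATA1_ex2_gR20r", "Hsap_GATA1_ex2_gR39f",
                    "Hsap_GATA1_ex2_gR40f", "Hsap_GATA1_ex2_gR7r",
                    "Hsap_GATA1_ex2_gR7r"),
  REname        = c("Aco12261II", "BslI", "BslI", "BslI", "PflMI"),
  REcutStart    = c(14, 13, 12, 13, 13),
  REcutEnd      = c(19, 23, 22, 23, 23),
  stringsAsFactors = FALSE
)

re_cuts$overlaps <- overlaps_positions(re_cuts$REcutStart, re_cuts$REcutEnd)
print(re_cuts)

# A made-up pattern covering positions 1..16 would not overlap 17 or 18
outside <- overlaps_positions(1, 16)
```

Every row in the table spans position 17 or 18, which is consistent with all five entries being reported.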
Calling the function offTargetAnalysis with findgRNAs = FALSE will skip the gRNA search step and go directly to off-target search, scoring and annotation for the input gRNAs. The input gRNAs will be annotated with restriction enzyme cut sites for users to review later. However, paired information will not be available.
gRNAFilePath <- system.file('extdata', 'testHsap_GATA1_ex2_gRNA1.fa',
    package = 'CRISPRseek')
offTargetAnalysis(inputFilePath = gRNAFilePath,
    findPairedgRNAOnly = FALSE,
    findgRNAs = FALSE,
    BSgenomeName = Hsapiens,
    chromToSearch = 'chrX',
    txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
    orgAnn = org.Hs.egSYMBOL,
    max.mismatch = 2,
    outputDir = outputDir,
    overwrite = TRUE)
## Validating input ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## $on.target
## names forViewInUCSC
## [1,] "Hsap_GATA1_ex2_gRNA1" "chrX:48649564-48649586"
## extendedSequence gRNAefficacy
## [1,] "TCCCCCAGTTTGTGGATCCTGCTCTGGTGT" "0.106547942216103"
##
## $summary
## names forViewInUCSC
## 1 Hsap_GATA1_ex2_gRNA1 chrX:48649564-48649586
## extendedSequence gRNAefficacy gRNAsPlusPAM
## 1 TCCCCCAGTTTGTGGATCCTGCTCTGGTGT 0.106547942216103 CCAGTTTGTGGATCCTGCTCNGG
## top5OfftargetTotalScore top10OfftargetTotalScore
## 1 NA NA
## top1Hit.onTarget.MMdistance2PAM topOfftarget1MMdistance2PAM
## 1 NMM
## topOfftarget2MMdistance2PAM topOfftarget3MMdistance2PAM
## 1
## topOfftarget4MMdistance2PAM topOfftarget5MMdistance2PAM
## 1
## topOfftarget6MMdistance2PAM topOfftarget7MMdistance2PAM
## 1
## topOfftarget8MMdistance2PAM topOfftarget9MMdistance2PAM
## 1
## topOfftarget10MMdistance2PAM REname uniqREin200 uniqREin100
## 1
##
## $offtarget
## name gRNAPlusPAM OffTargetSequence
## 1 Hsap_GATA1_ex2_gRNA1 CCAGTTTGTGGATCCTGCTCNGG CCAGTTTGTGGATCCTGCTCTGG
## inExon inIntron entrez_id gene score n.mismatch mismatch.distance2PAM
## 1 TRUE 2623 GATA1 100 0 <NA>
## alignment isCanonicalPAM forViewInUCSC strand chrom
## 1 .................... 1 chrX:48649564-48649586 + chrX
## chromStart chromEnd extendedSequence gRNAefficacy
## 1 48649564 48649586 TCCCCCAGTTTGTGGATCCTGCTCTGGTGT 0.1065479
## flankSequence
## 1 ATGGGGAGGTGGGAAGGAGAAATATGGAGACTGAGGTGATGGAGTGGGAGGAGGGGGAAAGGAGGAAGAGGAGCAGGTGAAAGGATGTGGGGTGGAAGGATTTCTGTGTCTGAGGACCCCTTCTGTCCTCGCAGGTTAATCCCCAGAGGCTCCATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGATCCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGCTTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTGGCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAGGTAACTCCATTGAGTGGCTGTCTTGGCATTGGCTGAGTGCTGTTGGGGTT
##
## $gRNAs.bedFormat
## [,1] [,2] [,3] [,4]
## [1,] "chrX" "48649563" "48649586" "Hsap_GATA1_ex2_gRNA1"
## [,5] [,6] [,7] [,8]
## [1,] "106.547942216103" "+" "48649579" "48649581"
##
## $REcutDetails
## [1] gRNAPlusPAM REcutgRNAName REname REpattern REcutStart
## [6] REcutEnd
## <0 rows> (or 0-length row.names)
##
## $REs.isUnique100
## [1] ""
##
## $REs.isUnique50
## [1] ""
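To run the same analysis on gRNAs of your own, the input file can be prepared in base R. This is a minimal sketch under the assumption that, with findgRNAs = FALSE, the input is a FASTA file whose records hold the gRNA sequence followed by its PAM. The record name my_gRNA1 and the temporary file are placeholders. The sequence is the 23-nt OffTargetSequence shown in the output above.

```r
# Assumption: one FASTA record per gRNA, sequence = gRNA followed by PAM.
gRNAs <- c(my_gRNA1 = "CCAGTTTGTGGATCCTGCTCTGG")  # placeholder record name

# Interleave header lines (">name") with sequence lines
fasta_lines <- as.vector(rbind(paste0(">", names(gRNAs)), unname(gRNAs)))

# Write to a temporary file; any writable path would do
my_gRNA_file <- tempfile(fileext = ".fa")
writeLines(fasta_lines, my_gRNA_file)

print(readLines(my_gRNA_file))
```

The path stored in my_gRNA_file could then be supplied as inputFilePath in place of gRNAFilePath.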
Calling offTargetAnalysis with chromToSearch = "" performs a quick gRNA search without off-target analysis. The parameters findgRNAsWithREcutOnly and findPairedgRNAOnly control whether the search is restricted to gRNAs that overlap restriction enzyme cut sites and to gRNAs in a paired configuration, respectively.
offTargetAnalysis(inputFilePath,
    chromToSearch = "",
    outputDir = outputDir,
    overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## A DNAStringSet instance of length 5
## width seq names
## [1] 23 CCAGTTTGTGGATCCTGCTCTGG Hsap_GATA1_ex2_gR17f
## [2] 23 GTGTCCTCCACACCAGAATCAGG Hsap_GATA1_ex2_gR39f
## [3] 23 TGTCCTCCACACCAGAATCAGGG Hsap_GATA1_ex2_gR40f
## [4] 23 GGTGTGGAGGACACCAGAGCAGG Hsap_GATA1_ex2_gR20r
## [5] 23 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r
Below is an example of searching for gRNAs that target at least one of two alleles. Two files are provided containing sequences that differ by a single nucleotide polymorphism (SNP). The results are saved in the file scoresFor2InputSequences.xls in the outputDir directory.
Huntington's disease is caused by mutations in the HTT gene. Expansion of CAG repeats in one copy of HTT can result in adult-onset neurodegeneration. Because HTT is an essential gene, nucleases that inactivate both alleles cannot be used. Therefore, to identify allele-specific nuclease target sites, we search for sites that overlap a single nucleotide polymorphism (SNP), rs362331, which is located in a coding exon of HTT. Two sequences that differ only at the polymorphic site are used as inputs for compare2Sequences.
inputFile1Path <- system.file("extdata", "rs362331C.fa", package = "CRISPRseek")
inputFile2Path <- system.file("extdata", "rs362331T.fa", package = "CRISPRseek")
seqs <- compare2Sequences(inputFile1Path, inputFile2Path,
    outputDir = outputDir,
    overwrite = TRUE)
## search for gRNAs for input file1...
## Validating input ...
## Searching for gRNAs ...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/rs362331C.fa-Jul-24-2017/
## search for gRNAs for input file2...
## Validating input ...
## Searching for gRNAs ...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/rs362331T.fa-Jul-24-2017/
## [1] "Scoring ..."
## >>> Finding all hits in sequence rs362331T ...
## >>> DONE searching
## finish off-target search in sequence 2
## >>> Finding all hits in sequence rs362331C ...
## >>> DONE searching
## finish off-target search in sequence 1
## finish feature vector building
## finish score calculation
## [1] "Done!"
Calling offTargetAnalysis with max.mismatch = 0 performs a quick gRNA search with gRNA efficacy prediction but without off-target analysis.
inputFilePath <- system.file('extdata', 'inputseq.fa', package = 'CRISPRseek')
results <- offTargetAnalysis(inputFilePath,
    annotateExon = FALSE, chromToSearch = "chrX",
    max.mismatch = 0, BSgenomeName = Hsapiens,
    outputDir = outputDir, overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add paired information...
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
Alternatively, you can set useEfficacyFromInputSeq = TRUE and chromToSearch = "" without specifying BSgenomeName, provided the input sequence is long enough to supply the flanking bases around each gRNA that efficacy prediction uses (the extendedSequence values shown earlier are 30 nt long, compared with 23 nt for gRNA plus PAM).
inputFilePath <- system.file('extdata', 'inputseq.fa', package = 'CRISPRseek')
offTargetAnalysis(inputFilePath,
    annotateExon = FALSE, chromToSearch = "",
    useEfficacyFromInputSeq = TRUE,
    max.mismatch = 0,
    outputDir = outputDir, overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
## A DNAStringSet instance of length 5
## width seq names
## [1] 23 CCAGTTTGTGGATCCTGCTCTGG Hsap_GATA1_ex2_gR17f
## [2] 23 GTGTCCTCCACACCAGAATCAGG Hsap_GATA1_ex2_gR39f
## [3] 23 TGTCCTCCACACCAGAATCAGGG Hsap_GATA1_ex2_gR40f
## [4] 23 GGTGTGGAGGACACCAGAGCAGG Hsap_GATA1_ex2_gR20r
## [5] 23 CCAGAGCAGGATCCACAAACTGG Hsap_GATA1_ex2_gR7r
Calling offTargetAnalysis with annotatePaired = FALSE and enable.multicore = TRUE, and setting n.cores.max, improves performance. We also suggest splitting very long sequences into smaller chunks and performing off-target analysis on each subsequence separately (thanks to Alex Williams for sharing this use case at https://support.bioconductor.org/p/72994/). In addition, please remember to use repeat-masked sequence as input.
results <- offTargetAnalysis(inputFilePath, annotatePaired = FALSE,
    chromToSearch = "chrX",
    enable.multicore = TRUE, n.cores.max = 6,
    annotateExon = FALSE,
    max.mismatch = 0, BSgenomeName = Hsapiens,
    outputDir = outputDir, overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
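The chunking suggestion can be sketched in base R. The helper split_into_chunks is hypothetical and not part of CRISPRseek. The chunk size of 1000 and the overlap of 22 bases are assumptions, with the overlap chosen so that every 23-nt target site (20-nt gRNA plus 3-nt PAM) falls entirely within a chunk even when it straddles a chunk boundary.

```r
# Hypothetical helper (not part of CRISPRseek): split one long sequence
# into overlapping chunks, given as 1-based inclusive start/end positions.
split_into_chunks <- function(sequence, chunk_size = 1000L, overlap = 22L) {
  stopifnot(chunk_size > overlap)
  n <- nchar(sequence)
  step <- chunk_size - overlap
  # The last chunk must start early enough to hold a full 23-nt site
  starts <- seq(1L, max(1L, n - overlap), by = step)
  ends <- pmin(starts + chunk_size - 1L, n)
  data.frame(start = starts, end = ends,
             seq = substring(sequence, starts, ends),
             stringsAsFactors = FALSE)
}

# Toy 2500-nt sequence standing in for a long, repeat-masked input
long_seq <- paste(rep("ACGT", 625), collapse = "")
chunks <- split_into_chunks(long_seq)
print(chunks[, c("start", "end")])
```

Each chunk could then be written to its own FASTA record and analysed separately. Positions reported for a chunk need to be shifted by start - 1 to map back to the original sequence.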
Calling the function offTargetAnalysis with scoring.method set to CFDscore will output CFD score using the algorithm by Doench et al., 2016, which models the effects of both mismatch position and mismatch type on cutting frequency. By default, scoring.method is set to Hsu-Zhang, which only models the effect of mismatch position.
results <- offTargetAnalysis(inputFilePath, annotatePaired = FALSE,
    scoring.method = "CFDscore",
    chromToSearch = "chrX",
    annotateExon = FALSE,
    max.mismatch = 2, BSgenomeName = Hsapiens,
    outputDir = outputDir, overwrite = TRUE)
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
results <- offTargetAnalysis(inputFilePath,
    annotatePaired = FALSE,
    BSgenomeName = Hsapiens, chromToSearch = "chrX",
    txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
    orgAnn = org.Hs.egSYMBOL, max.mismatch = 4,
    outputDir = outputDir, overwrite = TRUE,
    PAM.location = "5prime",
    PAM = "TGT",
    PAM.pattern = "^T[A|G]N", allowed.mismatch.PAM = 2,
    subPAM.position = c(1, 2))
## Validating input ...
## Searching for gRNAs ...
## >>> Finding all hits in sequence chrX ...
## >>> DONE searching
## Building feature vectors for scoring ...
## Calculating scores ...
## Annotating, filtering and generating reports ...
## Done annotating
## Add RE information...
## write gRNAs to bed file...
## Scan for REsites in flanking region...
## Done. Please check output files in directory /tmp/Rtmpe6VhUq/Rbuild2085e2e2f23/CRISPRseekGUIDEseqBioc2017Workshop/vignettes/CRISPRseekDemo/
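The call above sets PAM.location = "5prime" and PAM.pattern = "^T[A|G]N". A pattern like this can be tried on a few candidate PAMs in base R before a full run. The sketch below assumes that the IUPAC code N is expanded to any of A, C, G or T before matching. The expansion step and the candidate PAMs are illustrative and do not reproduce CRISPRseek internals.

```r
pam_pattern <- "^T[A|G]N"

# Assumption: N stands for any base and is expanded before matching
pam_regex <- gsub("N", "[ACGT]", pam_pattern, fixed = TRUE)

# Made-up 3-nt candidate PAMs read from the 5' end
candidates <- c("TGT", "TAC", "TCT", "GGT")
matches <- grepl(pam_regex, candidates)
print(data.frame(candidates, matches))

# Inside square brackets "|" is a literal character, not alternation,
# so on plain DNA input [A|G] accepts the same bases as [AG].
same_as_simple <- identical(matches, grepl("^T[AG][ACGT]", candidates))
```

Only candidates that start with T followed by A or G match; the third position is unconstrained.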
To preferentially target one allele, select gRNAs that have the lowest score for the other allele. Selected gRNAs can then be examined for potential off-target cleavage as described in Scenario 5.
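The selection rule can be sketched on a toy table. The column names scoreForSeq1 and scoreForSeq2 are assumed to match the columns in scoresFor2InputSequences.xls and should be checked against the actual output. The gRNA names and score values below are made up for illustration.

```r
# Made-up scores; real values come from the compare2Sequences output.
# Assumption: scoreForSeq1 / scoreForSeq2 hold the cleavage score of each
# gRNA against input sequence 1 and input sequence 2, respectively.
scores <- data.frame(
  name         = c("gRNA_a", "gRNA_b", "gRNA_c"),
  scoreForSeq1 = c(100, 100, 100),
  scoreForSeq2 = c(100, 58.2, 1.4),
  stringsAsFactors = FALSE
)

# To target allele 1 preferentially, rank by the lowest score on allele 2,
# breaking ties by the highest score on allele 1.
ranked <- scores[order(scores$scoreForSeq2, -scores$scoreForSeq1), ]
best <- ranked$name[1]
print(ranked)
```

Swapping the two score columns in the order() call ranks gRNAs for the other allele instead.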
Identify gRNAs that target the following two input sequences equally well with minimized off-target cleavage
MfSerpAEx2
GACGATGGCATCCTCCGTTCCCTGGGGCCTCCTGCTGCTGGCGGGGCTGTGCTGCCTGGCCCCCCGCTCCCTGGCCTCGAGTCCCCTGGGAGCCGCTGTCCAGGACACAGGTGCACCCCACCACGACCATGAGCACCATGAGGAGCCAGCCTGCCACAAGATTGCCCCGAACCTGGCCGACTTCGCCTTCAGCATGTACCGCCAGGTGGCGCATGGGTCCAACACCACCAACATCTTCTTCTCCCCCGTGAGCATCGCGACCGCCTTTGCGTTGCTTTCTCTGGGGGCCAAGGGTGACACTCACTCCGAGATCATGAAGGGCCTTAGGTTCAACCTCACTGAGAGAGCCGAGGGTGAGGTCCACCAAGGCTTCCAGCAACTTCTCCGCACCCTCAACCACCCAGACAACCAGCTGCAGCTGACCACTGGCAATGGTCTCTTCATCGCTGAGGGCATGAAGCTACTGGATAAGTTTTTGGAGGATGTCAAGAACCTGTACCACTCAGAAGCCTTCTCCACCAATTTCGGGGACACCGAAGCAGCCAAGAAACAGATCAACGATTATGTTGAGAAGGGAACCCAAGGGAAAATTGTGGATTTGGTCAAAGACCTTGACAAAGACACAGCTTTCGCTCTGGTGAATTACATTTTCTTTAAAG
HsSerpAEx2
GACAATGCCGTCTTCTGTCTCGTGGGGCATCCTCCTGCTGGCAGGCCTGTGCTGCCTGGTCCCTGTCTCCCTGGCTGAGGATCCCCAGGGAGATGCTGCCCAGAAGACAGATACATCCCACCATGATCAGGATCACCCAACCTTCAACAAGATCACCCCCAACCTGGCTGAGTTCGCCTTCAGCCTATACCGCCAGCTGGCACACCAGTCCAACAGCACCAATATCTTCTTCTCCCCAGTGAGCATCGCTACAGCCTTTGCAATGCTCTCCCTGGGGACCAAGGCTGACACTCACGATGAAATCCTGGAGGGCCTGAATTTCAACCTCACGGAGATTCCGGAGGCTCAGATCCATGAAGGCTTCCAGGAACTCCTCCGTACCCTCAACCAGCCAGACAGCCAGCTCCAGCTGACCACCGGCAATGGCCTGTTCCTCAGCGAGGGCCTGAAGCTAGTGGATAAGTTTTTGGAGGATGTTAAAAAGTTGTACCACTCAGAAGCCTTCACTGTCAACTTCGGGGACACCGAAGAGGCCAAGAAACAGATCAACGATTACGTGGAGAAGGGTACTCAAGGGAAAATTGTGGATTTGGTCAAGGAGCTTGACAGAGACACAGTTTTTGCTCTGGTGAATTACATCTTCTTTAAAG
Constrain the gRNA sequence by setting gRNA.pattern to require or exclude specific features within the target site.
3a. Synthesis of gRNAs in vivo from host U6 promoters is more efficient if the first base is guanine. To maximize efficiency, what should gRNA.pattern be set to?
3b. Synthesis of gRNAs in vitro using T7 promoters is most efficient when the first two bases are GG. To maximize efficiency, what should gRNA.pattern be set to?
In the examples we went through, we deliberately restricted the off-target search to chromosome X. If we are interested in a genome-wide search, what needs to be changed, and how?
Find gRNAs in a paired configuration with a gap between 5 and 15, without performing off-target analysis
Create a txdb object
Different CRISPR-Cas systems use different PAM sequences. Which parameter needs to be reset for PAM = 'NNNNGGGT'?
Different CRISPR-Cas systems also use different gRNA lengths. Which parameter needs to be reset?
Which parameter needs to be reset, and to what value, if we are interested in finding gRNAs with restriction enzyme patterns of size 8 or above?
A new penalty matrix has recently been derived. Which parameter needs to be reset accordingly?
Although the PAM sequence NGG is preferred, the variant NAG is also recognized, with lower efficiency. A researcher wants the off-target search to include both NGG and NAG variants. Which parameter(s) need to be set to carry out such a search?
Which parameter should be reset, and how, if you would like to skip off-target analysis but still want the summary output file?
How should the parameters be reset to perform off-target analysis with truncated gRNAs?
How can off-target analysis be performed for genomes whose BSgenome is not available?
Hint: use compare2Sequences and reset searchDirection and findgRNAs
How can off-target analysis be performed for Cpf1?
Hint: similar to scenario 11
You will need to alter parameters PAM.location, PAM, PAM.pattern and PAM.size
Which parameters need to be altered to perform off-target analysis for NmCas9?
Hint: you need to alter PAM.size, PAM, PAM.pattern and weights
Which parameter needs to be set to output gRNAs in fasta or genbank format?
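When working through exercises that involve regular-expression parameters such as gRNA.pattern, candidate patterns can be tried on known sequences in base R before running a full analysis. The sketch below applies two candidate patterns, "^G" and "^GG", to the five gRNAs reported earlier in this workshop, with the 3-nt PAM removed. Whether offTargetAnalysis applies the pattern in exactly this way is an assumption.

```r
# gRNA-plus-PAM sequences reported earlier for the GATA1 exon 2 input
gRNAsPlusPAM <- c(
  Hsap_GATA1_ex2_gR17f = "CCAGTTTGTGGATCCTGCTCTGG",
  Hsap_GATA1_ex2_gR39f = "GTGTCCTCCACACCAGAATCAGG",
  Hsap_GATA1_ex2_gR40f = "TGTCCTCCACACCAGAATCAGGG",
  Hsap_GATA1_ex2_gR20r = "GGTGTGGAGGACACCAGAGCAGG",
  Hsap_GATA1_ex2_gR7r  = "CCAGAGCAGGATCCACAAACTGG"
)

# Keep the 20-nt gRNA only (drop the 3-nt PAM at the 3' end)
gRNAs <- substr(gRNAsPlusPAM, 1, 20)

# Candidate patterns: first base G, and first two bases GG
starts_with_G  <- names(gRNAs)[grepl("^G", gRNAs)]
starts_with_GG <- names(gRNAs)[grepl("^GG", gRNAs)]

print(starts_with_G)
print(starts_with_GG)
```

Comparing the two result vectors shows how much each constraint narrows the candidate set for this input.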