library(NanoMethViz)
In order to use this package, your data must be converted from the output of methylation calling software to a special tabix format. Because the conversion relies on the Unix sort function, it can currently only be done on a Linux or macOS system.
We currently support output from
The conversion can be done using the create_tabix_file() function. We provide example nanopolish output within the package, so we can look inside to see what the data looks like coming out of nanopolish.
methy_calls <- system.file(package = "NanoMethViz",
c("sample1_nanopolish.tsv.gz", "sample2_nanopolish.tsv.gz"))
# have a look at the first 6 rows of the first nanopolish file
methy_calls_example <- read.table(
methy_calls[1], sep = "\t", header = TRUE, nrows = 6)
methy_calls_example
## chromosome strand start end read_name
## 1 chr1 - 127732476 127732476 e648c4e3-ca6a-4671-af17-86dab4c819eb
## 2 chr11 - 115423144 115423144 726dd8b5-1531-4279-9cf0-a7e4d5ea0478
## 3 chr11 + 69150806 69150814 34f9ee3e-4b27-4d2d-a203-4067f0662044
## 4 chr1 + 170484965 170484965 d8309c06-375f-4dfe-b22e-0c47af888cd9
## 5 chrY - 4082060 4082060 f68940f6-4236-4f0f-9af7-a81b5c2911b6
## 6 chr8 + 120733312 120733312 13ae181f-b88b-4d6c-a815-553ff2e25312
## log_lik_ratio log_lik_methylated log_lik_unmethylated num_calling_strands
## 1 -5.91 -100.38 -94.47 1
## 2 -8.07 -115.21 -107.13 1
## 3 -1.65 -183.12 -181.47 1
## 4 2.74 -112.14 -114.88 1
## 5 -1.78 -135.09 -133.32 1
## 6 5.02 -129.31 -134.33 1
## num_motifs sequence
## 1 1 CATTACGTTTC
## 2 1 AACTTCGTTGA
## 3 2 GGTCACGGGAATCCGGTTC
## 4 1 AGAAGCGCTAA
## 5 1 CTCACCGTATA
## 6 1 TCTGACGTTGA
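To make the log-likelihood columns above easier to read, the sketch below recomputes log_lik_ratio from the other two columns. It uses only values copied from the six printed rows and assumes the usual nanopolish convention that log_lik_ratio is the methylated log-likelihood minus the unmethylated one, with positive values favouring methylation. The column names recomputed and favours_methylation are introduced here for illustration only.

```r
# Values copied from the six example rows printed above.
calls <- data.frame(
    log_lik_ratio        = c(-5.91, -8.07, -1.65, 2.74, -1.78, 5.02),
    log_lik_methylated   = c(-100.38, -115.21, -183.12, -112.14, -135.09, -129.31),
    log_lik_unmethylated = c(-94.47, -107.13, -181.47, -114.88, -133.32, -134.33)
)

# Assumed relationship: ratio = methylated - unmethylated log-likelihood.
calls$recomputed <- calls$log_lik_methylated - calls$log_lik_unmethylated

# The printed values are rounded to two decimals, so allow a small tolerance.
ratio_matches <- all(abs(calls$recomputed - calls$log_lik_ratio) < 0.015)

# Assumed convention: a positive ratio favours the methylated model.
calls$favours_methylation <- calls$log_lik_ratio > 0

ratio_matches
calls
```

The small tolerance is needed because two of the six rows differ by 0.01 after rounding.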
We then create a temporary path to store the converted file; it will be deleted once you exit your R session. Once create_tabix_file() is run, it will create a tabix file along with its index. Because we have a small amount of data, we can read in a small portion of it to see how it looks. Do not do this with large datasets, as it decompresses all the data and will take a very long time to run.
methy_tabix <- file.path(tempdir(), "methy_data.bgz")
samples <- c("sample1", "sample2")
# you should see messages when running this yourself
create_tabix_file(methy_calls, methy_tabix, samples)
# don't do this with actual data
# we have to use gzfile to tell R that we have a gzip compressed file
methy_data <- read.table(
gzfile(methy_tabix), col.names = methy_col_names(), nrows = 6)
methy_data
## sample chr pos strand statistic read_name
## 1 sample2 chr1 5141050 - 6.93 3818f2e2-d520-4305-bbab-efad891f67f2
## 2 sample1 chr1 6283067 - 1.05 36e3c55f-c41f-4bd6-b371-54368d013008
## 3 sample1 chr1 7975278 - 1.39 6f6cbc59-af4c-4dfa-8e48-ef4ac4eeb13b
## 4 sample1 chr1 10230292 - 2.19 fbe53b38-e264-4c7a-824e-2651c22f8ea6
## 5 sample1 chr1 13127127 - 2.51 7660ba1f-9b44-4783-b901-ed79b2f0481b
## 6 sample1 chr1 13127134 - 2.51 7660ba1f-9b44-4783-b901-ed79b2f0481b
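For larger files, an alternative to decompressing everything is to fetch only the records in a region through the tabix index. The following is a minimal sketch using Rsamtools and GenomicRanges, which are assumed to be installed; it is not NanoMethViz's own interface. To stay self-contained it builds a toy tabix file from the six rows printed above instead of reusing methy_tabix. The names toy, toy_tsv, toy_bgz and region are hypothetical.

```r
library(Rsamtools)
library(GenomicRanges)

# Toy data copied from the six rows printed above (already sorted by chr, pos).
toy <- data.frame(
    sample = c("sample2", rep("sample1", 5)),
    chr = "chr1",
    pos = c(5141050, 6283067, 7975278, 10230292, 13127127, 13127134),
    strand = "-",
    statistic = c(6.93, 1.05, 1.39, 2.19, 2.51, 2.51),
    read_name = c(
        "3818f2e2-d520-4305-bbab-efad891f67f2",
        "36e3c55f-c41f-4bd6-b371-54368d013008",
        "6f6cbc59-af4c-4dfa-8e48-ef4ac4eeb13b",
        "fbe53b38-e264-4c7a-824e-2651c22f8ea6",
        "7660ba1f-9b44-4783-b901-ed79b2f0481b",
        "7660ba1f-9b44-4783-b901-ed79b2f0481b"
    )
)

# Write a headerless TSV, compress it with bgzip, and index it with tabix.
toy_tsv <- tempfile(fileext = ".tsv")
write.table(toy, toy_tsv, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
toy_bgz <- bgzip(toy_tsv, dest = paste0(toy_tsv, ".bgz"), overwrite = TRUE)
# Column 2 holds the chromosome; column 3 is both start and end of each record.
indexTabix(toy_bgz, seq = 2, start = 3, end = 3)

# Fetch only the records that fall inside the requested region.
region <- GRanges("chr1", IRanges(13127000, 13128000))
hits <- scanTabix(TabixFile(toy_bgz), param = region)[[1]]
region_data <- read.table(text = hits, sep = "\t", col.names = names(toy))
region_data
```

Only the two records at positions 13127127 and 13127134 fall inside the requested region, so the rest of the file is never parsed into R.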
Now methy_tabix will be the path to a tabix file that is ready for use with NanoMethViz. Please head over to the “Introduction” vignette to see how to use this data for visualisation!