--- title: '`voice` vignette' subtitle: 'version `r packageVersion("voice")`' author: 'Filipe J. Zabala' date: "`r format(Sys.time(), '%Y-%m-%d')`" # date: "`r format(Sys.time(), '%Y-%m-%d %H:%M:%S')`" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{`voice` vignette} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ini0 <- Sys.time() ``` ## 0. Basic installation ```{r, eval=FALSE} # Development version from GitHub install.packages('devtools') devtools::install_github('filipezabala/voice') # Stable version from CRAN install.packages('voice') ``` ### 0.1 For Windows Users If you're compiling R packages from source, you may need to install [RTools](https://cran.r-project.org/bin/windows/Rtools/), a collection of Windows-specific build tools for R. ### 0.2 For macOS Users If you're compiling packages, ensure you have [Xcode Command Line Tools](https://mac.install.guide/commandlinetools/) installed. You also may need [macOS tools](https://cran.r-project.org/bin/macosx/tools/). ```{bash, eval=FALSE} # Install Xcode on MacOS xcode-select --install ``` More details may be found at https://github.com/filipezabala/voice. ## 1. Extract features ### 1.1 Load packages and audio files ```{r, message=FALSE, warning=FALSE} # packs library(voice) library(tidyverse) # get path to audio file wavDir <- list.files(system.file('extdata', package = 'wrassp'), pattern = glob2rx('*.wav'), full.names = TRUE) ``` ### 1.2 Extract features ```{r, message=FALSE, warning=FALSE} # minimal usage M <- voice::extract_features(wavDir) glimpse(M) ``` ## 2. Tag ```{r, message=FALSE, warning=FALSE} # creating Extended synthetic data E <- dplyr::tibble(subject_id = c(1,1,1,2,2,2,3,3,3), wav_path = wavDir) E # minimal usage voice::tag(E) # canonical data voice::tag(E, groupBy = 'subject_id') ``` ## 3. Visualization ### 3.1 Get audio ```{r, message=FALSE, warning=FALSE, eval=FALSE} url0 <- 'https://github.com/filipezabala/voiceAudios/raw/refs/heads/main/wav/doremi.wav' download.file(url0, paste0(tempdir(), '/doremi.wav'), mode = 'wb') voice::embed_audio(url0) # See https://github.com/mccarthy-m-g/embedr for more details. ``` ### 3.2 Media data ```{r, message=FALSE, warning=FALSE, eval=FALSE} M <- voice::extract_features(tempdir()) summary(M) ``` ### 3.3 Plot ```{r, message=FALSE, warning=FALSE, fig.width=7.5, fig.height=4, eval=FALSE} voice::piano_plot(M, 0) # f0 voice::piano_plot(M, 0:1) # f0 + f1 ``` ![](./img/plot1.png){width=95%} ![](./img/plot2.png){width=95%} ### 3.4 Assign notes ```{r, message=FALSE, warning=FALSE, eval=FALSE} (f0_spn <- voice::assign_notes(M, fmt = 0, min_points = 22, min_percentile = .85)) # f0 (f1_spn <- voice::assign_notes(M, fmt = 1, min_points = 22, min_percentile = .85)) # f1 ``` ### 3.5 Sheet music Must have [MuseScore](https://musescore.org/en) and [gm](https://flujoo.github.io/gm/articles/gm.html). #### 3.5.1 Notes sequence of f0 ```{r, message=FALSE, warning=FALSE, eval=FALSE} library(gm) line_0 <- gm::Line(as.character(f0_spn)) m0 <- gm::Music() + gm::Meter(4, 4) + line_0 gm::show(m0, to = c('score', 'audio')) ``` ![](./img/sheet1.png){width=95%} #### 3.5.2 Notes sequences of f0 and f1 ```{r, message=FALSE, warning=FALSE, eval=FALSE} line_0 <- gm::Line(as.character(f0_spn)) line_1 <- gm::Line(as.character(f1_spn)) m1 <- gm::Music() + gm::Meter(4, 4) + line_0 + line_1 gm::show(m1, to = c('score', 'audio')) ``` ![](./img/sheet2.png){width=95%} ## 4. Advanced installation Python-based functions `diarize` and `extract_features` (when the latter is inferring `f0_praat` and `fmt_praat` features) require a configured Python environment. ### 4.1 Ubuntu The following steps are used to fully configure `voice` on [Ubuntu 24.04 LTS (Noble Numbat)](https://ubuntu.com/blog/tag/ubuntu-24-04-lts). Reports of inconsistencies are welcome. #### 4.1.1. [Curl](https://curl.se/) Command line tool and library for transferring data with URLs. ```bash # installing dependencies sudo apt-get update sudo apt-get install -y libssl-dev autoconf libtool make # installing curl sudo apt install curl # verify installation curl --version ``` #### 4.1.2. [ffmpeg](https://ffmpeg.org/) ffmpeg is a cross-platform solution to record, convert and stream audio and video. ```bash sudo apt-get update sudo apt-get install ffmpeg ``` #### 4.1.3. Audio drivers and extra packages ```bash sudo apt-get update sudo apt-get install portaudio19-dev libasound2-dev libfontconfig1-dev libmagick++-dev libxml2-dev libharfbuzz-dev libfribidi-dev libgdal-dev cmake cmake-doc ninja-build ``` #### 4.1.4. [MuseScore](https://musescore.org/) MuseScore is an open source notation software. ```bash sudo add-apt-repository ppa:mscore-ubuntu/mscore-stable sudo apt-get update sudo apt-get install musescore ``` #### 4.1.5. [R](https://www.r-project.org) R is a free software environment for statistical computing and graphics. To find out your Ubuntu distribution use `lsb_release -a` at terminal. ```bash sudo sh -c 'echo "deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/" >> /etc/apt/sources.list.d/cran.list' sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9 sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9 gpg -a --export E084DAB9 | sudo apt-key add - sudo add-apt-repository ppa:c2d4u.team/c2d4u4.0+ sudo apt-get update && sudo apt-get upgrade sudo apt-get install r-base r-base-dev ``` #### 4.1.6. [RStudio](https://posit.co/download/rstudio-desktop/) RStudio is an Integrated Development Environment (IDE) for R. Check for updates [here](https://posit.co/download/rstudio-desktop/). ```bash sudo apt-get update sudo apt-get install gdebi-core wget https://download1.rstudio.org/electron/jammy/amd64/rstudio-2025.05.0-496-amd64.deb sudo gdebi rstudio-2025.05.0-496-amd64.deb ``` #### 4.1.9. R packages "Packages are the fundamental units of reproducible R code." [Hadley Wickham and Jennifer Bryan](https://r-pkgs.org/). The installation may take several minutes. At terminal run: ```bash sudo R ``` Running R as super user paste the following, row by row: ```r packs <- c('audio','reticulate','R.utils','seewave','tidyverse','tuneR','wrassp') install.packages(packs, dep = TRUE) update.packages(ask = FALSE) devtools::install_github('egenn/music') devtools::install_github('flujoo/gm') ``` To configure the `gm` package. ```r usethis::edit_r_environ() ``` Add the line `MUSESCORE_PATH=/usr/bin/mscore` to `/root/.Renviron` file. To exit use `:wq` at VI. Save and restart the R/RStudio session. #### 4.1.10. [Miniconda](https://docs.conda.io/en/latest/miniconda.html) Miniconda is a free minimal installer for [conda](https://docs.conda.io/), an open source package, dependency and environment management system for any language—Python, R, Ruby, Lua, Scala, Java, JavaScript, C/ C++, FORTRAN and more, that runs on Windows, macOS and Linux. Follow the instructions at https://docs.conda.io/en/latest/miniconda.html. At terminal: ```bash cd ~/Downloads/ wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh cd repo.anaconda.com/miniconda/ bash Miniconda3-latest-Linux-x86_64.sh ``` Do you accept the license terms? [yes|no] `yes`. Miniconda3 will now be installed into this location: /home/user/miniconda3 [ENTER] You can undo this by running `conda init --reverse $SHELL`? `yes` Do you wish the installer to initialize Miniconda3 by running conda init? `yes`. Close and reopen terminal. ```bash conda update -n base -c defaults conda ``` The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED:... Proceed ([y]/n)? `y` ```bash conda create -n pyvoice python=3.12 ``` The following (NEW) packages will be downloaded/INSTALLED:... Proceed ([y]/n)? `y` ```bash conda activate pyvoice pip install -r https://raw.githubusercontent.com/filipezabala/voice/master/requirements.txt ``` ### 4.2 MacOS The following steps are used to fully configure `voice` on [MacOS Sonoma](https://www.apple.com/macos/macos-sequoia/) (Link to MacOS Sequoia). Reports of inconsistencies are welcome. #### 4.2.1. [Homebrew](https://brew.sh/) Install Homebrew, 'The Missing Package Manager for macOS (or Linux)' and remember to `brew doctor` eventually. At terminal (`command + space 'terminal'`) run: ```bash /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" ``` #### 4.2.2. [wget](https://www.gnu.org/software/wget/) GNU Wget is a free software package for retrieving files using HTTP, HTTPS, FTP and FTPS, the most widely used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc. ```bash brew install wget ``` #### 4.2.3. [Python](https://www.python.org/) Python is a programming language that integrate systems. According to [this](https://github.com/Homebrew/homebrew-core/issues/62911) post, it is recommended to install Python 3.8 and 3.9 and make it consistent. ```bash brew install python@3.12 python3 --version pip3 --version ``` #### 4.2.4. [ffmpeg](https://ffmpeg.org/) ffmpeg is a cross-platform solution to record, convert and stream audio and video. The installation may take several minutes. ```bash brew install ffmpeg ``` #### 4.2.5. [XQuartz](https://www.xquartz.org/) The XQuartz project is an open-source effort to develop a version of the [X.Org X Window System](https://www.x.org/wiki/) that runs on macOS. - Download and run https://github.com/XQuartz/XQuartz/releases/download/XQuartz-2.8.5/XQuartz-2.8.5.pkg - Will take around 320 MB of disk space - Send XQuartz-2.8.5.dmg to Trash #### 4.2.6. [MacPorts](https://guide.macports.org/chunked/installing.macports.html) Follow the instructions from https://guide.macports.org/chunked/installing.macports.html. #### 4.2.7. [tcllib](https://ports.macports.org/port/tcllib/) ```bash sudo port selfupdate && sudo port upgrade tcllib sudo port install tcllib ``` #### 4.2.8. [MuseScore](https://musescore.org/) MuseScore is an open source notation software. - Download and run https://musescore.org/en/download/musescore.dmg - Drag MuseScore 4 to Applications folder - Will take around 320 MB of disk space - Unmount MuseScore-4.2.1 virtual disk and send MuseScore-4.2.1.240230937.dmg to Trash #### 4.2.9. [R](https://www.r-project.org) R is a free software environment for statistical computing and graphics. - Download and run the pkg file according to you architecture from https://www.r-project.org - Will take around 180 MB of disk space #### 4.2.10. [RStudio](https://posit.co/download/rstudio-desktop/) RStudio is an Integrated Development Environment (IDE) for R. - Download and run https://download1.rstudio.org/electron/macos/RStudio-2025.05.0-496.dmg - Drag RStudio to Applications folder - Will take around 770 MB of disk space - Unmount RStudio virtual disk and send RStudio-2025.05.0-496.dmg to Trash - Type `command + space 'rstudio'` - Tools > Global Options... > Appearance > Merbivore (Restart required) #### 4.2.11. R packages "Packages are the fundamental units of reproducible R code." [Hadley Wickham and Jennifer Bryan](https://r-pkgs.org/). Type `command + space 'terminal'` ```bash sudo R ``` Running R as super user paste the following, one line at a time. ```r packs <- c('audio','reticulate','R.utils','seewave','tidyverse','tuneR','wrassp') install.packages(packs, dep = TRUE) update.packages(ask = FALSE) devtools::install_github('egenn/music') devtools::install_github('flujoo/gm') ``` #### 4.2.12. [Miniconda](https://docs.conda.io/en/latest/miniconda.html) Miniconda is a free minimal installer for [conda](https://docs.conda.io/), an open source package, dependency and environment management system for any language—Python, R, Ruby, Lua, Scala, Java, JavaScript, C/ C++, FORTRAN and more, that runs on Windows, macOS and Linux. For 64-bit version use ```bash cd ~/Downloads wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh cd repo.anaconda.com/miniconda/ bash Miniconda3-latest-MacOSX-x86_64.sh ``` For M1 version use ```bash cd ~/Downloads wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh cd repo.anaconda.com/miniconda/ bash Miniconda3-latest-MacOSX-arm64.sh ``` In order to continue the installation process, please review the license agreement. Please, press ENTER to continue `ENTER`. You can undo this by running `conda init --reverse $SHELL`? `yes` Close and reopen terminal. ```bash export PATH="~/miniconda3/bin:$PATH" conda update -n base -c defaults conda ``` The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED:... Proceed ([y]/n)? `y` ```bash conda create -n pyvoice python=3.12 ``` The following (NEW) packages will be downloaded/INSTALLED:... Proceed ([y]/n)? `y` Close and reopen terminal. ```bash conda activate base conda activate pyvoice pip install -r https://raw.githubusercontent.com/filipezabala/voice/master/requirements.txt ``` ## 5. Diarize ```{r, eval=FALSE} # download url0 <- 'https://github.com/filipezabala/voiceAudios/raw/main/wav/sherlock0.wav' wavDir <- normalizePath(tempdir()) download.file(url0, paste0(wavDir, '/sherlock0.wav'), mode = 'wb') ``` Diarization can be performed to detect speaker segments (i.e., 'who spoke when'). ```{r, eval=FALSE} # diarize voice::diarize(fromWav = wavDir, toRttm = wavDir, token = 'YOUR_TOKEN') ``` The `voice::diarize()` function creates Rich Transcription Time Marked (RTTM)[^rttm] files, space-delimited text files containing one turn per line defined by NIST - National Institute of Standards and Technology. The RTTM files can be read using `voice::read_rttm()`. [^rttm]: See Appendix C at https://www.nist.gov/system/files/documents/itl/iad/mig/KWS15-evalplan-v05.pdf. ```{r, eval=FALSE} # read_rttm (rttm <- voice::read_rttm(wavDir)) ``` Finally, the audio waves can be automatically segmented. ```{r, eval=FALSE} # split audio wave voice::splitw(fromWav = wavDir, fromRttm = wavDir, to = wavDir) dir(wavDir, pattern = '.[Ww][Aa][Vv]$') ```