voice
vignette
# Development version from GitHub
install.packages('devtools')
devtools::install_github('filipezabala/voice')
# Stable version from CRAN
install.packages('voice')
On Windows, if you’re compiling R packages from source, you may need to install RTools, a collection of Windows-specific build tools for R.
On macOS, if you’re compiling packages from source, ensure you have the Xcode Command Line Tools installed. You may also need the macOS tools.
More details may be found at https://github.com/filipezabala/voice.
# minimal usage
M <- voice::extract_features(wavDir)
dplyr::glimpse(M)
#> Rows: 1,196
#> Columns: 13
#> $ section_seq <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
#> $ section_seq_file <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
#> $ wav_path <chr> "/Library/Frameworks/R.framework/Versions/4.5-arm64/R…
#> $ f0 <dbl> NA, NA, NA, NA, NA, NA, NA, 115.8593, 108.9439, 107.4…
#> $ f1 <int> NA, NA, NA, NA, 185, 260, 254, 277, 261, 231, 177, 19…
#> $ f2 <int> 1854, 1886, 1749, 1888, 1962, 1973, 2026, 2037, 2130,…
#> $ f3 <int> NA, 2893, 2676, 2659, 2639, 2676, 2993, 2932, 3016, 2…
#> $ f4 <int> 3113, 3708, 3509, 3658, 3248, 3239, 3830, 3479, 3561,…
#> $ f5 <int> 4191, 4678, 4502, 4331, 3653, 3836, 4602, 4585, 4720,…
#> $ f6 <int> 5226, 5659, 5035, 5177, 5208, 5146, 5233, 5390, 5366,…
#> $ f7 <int> 6077, 6725, 6526, 6518, 6493, 6567, 6603, 6532, 6510,…
#> $ f8 <int> 6675, NA, NA, NA, 7681, 7751, 7803, NA, 7835, 7614, 7…
#> $ gain <dbl> 21.63347, 22.76034, 28.52825, 29.67069, 36.25124, 43.…
# creating Extended synthetic data
E <- dplyr::tibble(subject_id = c(1,1,1,2,2,2,3,3,3), wav_path = wavDir)
E
#> # A tibble: 9 × 2
#> subject_id wav_path
#> <dbl> <chr>
#> 1 1 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 2 1 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 3 1 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 4 2 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 5 2 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 6 2 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 7 3 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 8 3 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
#> 9 3 /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/libra…
# minimal usage
voice::tag(E)
#> # A tibble: 9 × 7
#> wav_path f0_tag_mean f0_tag_sd f0_tag_vc f0_tag_median f0_tag_iqr f0_tag_mad
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 /Library/… 85.4 17.6 0.206 76.1 29.4 7.53
#> 2 /Library/… 85.4 15.6 0.183 80.1 27.8 14.4
#> 3 /Library/… 84.6 13.0 0.154 78.8 23.9 14.0
#> 4 /Library/… 84.8 14.5 0.171 79.1 28.1 11.9
#> 5 /Library/… 86.0 14.7 0.170 78.7 30.0 11.0
#> 6 /Library/… 82.9 15.6 0.188 74.8 23.8 4.78
#> 7 /Library/… 78.2 16.2 0.207 73.5 13.4 6.82
#> 8 /Library/… 84.5 14.5 0.172 78.1 17.8 8.95
#> 9 /Library/… 81.0 12.2 0.151 75.9 23.1 9.14
# canonical data
voice::tag(E, groupBy = 'subject_id')
#> # A tibble: 3 × 7
#> subject_id f0_tag_mean f0_tag_sd f0_tag_vc f0_tag_median f0_tag_iqr f0_tag_mad
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 85.1 15.3 0.180 78.3 26.8 11.9
#> 2 2 84.6 14.9 0.176 76.4 28.3 7.97
#> 3 3 81.0 14.6 0.180 75.6 21.6 8.68
Python-based functions diarize and extract_features (when the latter is inferring f0_praat and fmt_praat features) require a configured Python environment.
The following steps are used to fully configure voice on Ubuntu 24.04 LTS (Noble Numbat). Reports of inconsistencies are welcome.
curl is a command line tool and library for transferring data with URLs.
ffmpeg is a cross-platform solution to record, convert and stream audio and video.
MuseScore is open source notation software.
R is a free software environment for statistical computing and graphics. To find out your Ubuntu distribution, run lsb_release -a at the terminal.
# $(lsb_release -cs) expands to the release codename (noble on Ubuntu 24.04)
sudo sh -c 'echo "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/" >> /etc/apt/sources.list.d/cran.list'
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo add-apt-repository ppa:c2d4u.team/c2d4u4.0+
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install r-base r-base-dev
RStudio is an Integrated Development Environment (IDE) for R. Check the RStudio website for updates.
“Packages are the fundamental units of reproducible R code.” Hadley Wickham and Jennifer Bryan. The installation may take several minutes. At the terminal, run R as super user and paste the following, one line at a time:
# devtools is needed for the install_github() calls below
packs <- c('audio','devtools','reticulate','R.utils','seewave','tidyverse','tuneR','wrassp')
install.packages(packs, dependencies = TRUE)
update.packages(ask = FALSE)
devtools::install_github('egenn/music')
devtools::install_github('flujoo/gm')
To configure the gm package, add the line MUSESCORE_PATH=/usr/bin/mscore to the /root/.Renviron file. If you are editing in vi, use :wq to save and exit. Then restart the R/RStudio session.
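The .Renviron edit above can also be scripted instead of done by hand in vi. What follows is a minimal sketch, assuming a POSIX shell; the helper name set_renviron_var is hypothetical and is not part of voice or gm. It rewrites the file so that the variable is defined exactly once, which makes it safe to run repeatedly.

```shell
# Hypothetical helper (not part of voice or gm): add or update KEY=VALUE
# in an .Renviron-style file, keeping a single definition of KEY.
set_renviron_var() {
  file=$1
  key=$2
  value=$3
  touch "$file"
  # Copy every line except existing definitions of KEY.
  # grep exits non-zero when no line is copied, hence "|| true".
  grep -v "^${key}=" "$file" > "${file}.tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "${file}.tmp"
  mv "${file}.tmp" "$file"
}

# Example call with the path and value from the step above (run as root):
# set_renviron_var /root/.Renviron MUSESCORE_PATH /usr/bin/mscore
```

As with the manual edit, R only reads .Renviron at startup, so the session still has to be restarted afterwards.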
Miniconda is a free minimal installer for conda, an open source package, dependency and environment management system for any language (Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN and more) that runs on Windows, macOS and Linux.
Follow the instructions at https://docs.conda.io/en/latest/miniconda.html.
At terminal:
cd ~/Downloads/
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-Linux-x86_64.sh
Do you accept the license terms? [yes|no] yes
Miniconda3 will now be installed into this location: /home/user/miniconda3 [ENTER]
You can undo this by running conda init --reverse $SHELL? yes
Do you wish the installer to initialize Miniconda3 by running conda init? yes
Close and reopen terminal.
The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED:… Proceed ([y]/n)? y
The following (NEW) packages will be downloaded/INSTALLED:… Proceed ([y]/n)? y
The following steps are used to fully configure voice on macOS Sonoma (Link to macOS Sequoia). Reports of inconsistencies are welcome.
Install Homebrew, ‘The Missing Package Manager for macOS (or Linux)’, and remember to run brew doctor from time to time. Open the terminal (command + space 'terminal') to run the installation commands.
GNU Wget is a free software package for retrieving files using HTTP, HTTPS, FTP and FTPS, the most widely used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.
Python is a programming language that lets you integrate systems. According to this post, it is recommended to install Python 3.8 and 3.9 and keep the setup consistent.
ffmpeg is a cross-platform solution to record, convert and stream audio and video. The installation may take several minutes.
The XQuartz project is an open-source effort to develop a version of the X.Org X Window System that runs on macOS.
Follow the instructions from https://guide.macports.org/chunked/installing.macports.html.
MuseScore is open source notation software.
R is a free software environment for statistical computing and graphics.
RStudio is an Integrated Development Environment (IDE) for R.
command + space 'rstudio'
“Packages are the fundamental units of reproducible R code.” Hadley Wickham and Jennifer Bryan. Open the terminal (command + space 'terminal'), run R as super user and paste the following, one line at a time.
Miniconda is a free minimal installer for conda, an open source package, dependency and environment management system for any language (Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN and more) that runs on Windows, macOS and Linux.
For the Intel (x86_64) version use
cd ~/Downloads
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-MacOSX-x86_64.sh
For the Apple Silicon (M1, arm64) version use
cd ~/Downloads
wget -r -np -k https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
cd repo.anaconda.com/miniconda/
bash Miniconda3-latest-MacOSX-arm64.sh
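Choosing between the installers above can be automated by inspecting the machine. The following is a small sketch, assuming a POSIX shell; the function name miniconda_installer is hypothetical, and only the three installer file names that appear in this document are covered. It relies on uname -s reporting Darwin on macOS and uname -m reporting arm64 on Apple Silicon or x86_64 on Intel.

```shell
# Hypothetical helper: print the Miniconda installer file name for a platform.
# With no arguments it inspects the current machine via uname.
miniconda_installer() {
  os=${1:-$(uname -s)}
  arch=${2:-$(uname -m)}
  case "$os/$arch" in
    Linux/x86_64)  echo "Miniconda3-latest-Linux-x86_64.sh" ;;
    Darwin/x86_64) echo "Miniconda3-latest-MacOSX-x86_64.sh" ;;
    Darwin/arm64)  echo "Miniconda3-latest-MacOSX-arm64.sh" ;;
    *) echo "unsupported platform: $os/$arch" >&2; return 1 ;;
  esac
}

# Example: print the download URL for this machine.
# echo "https://repo.anaconda.com/miniconda/$(miniconda_installer)"
```

Any platform outside these three makes the function return a non-zero status instead of guessing a file name.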
In order to continue the installation process, please review the license agreement. Please, press ENTER to continue [ENTER]
You can undo this by running conda init --reverse $SHELL? yes
Close and reopen terminal.
The following packages will be INSTALLED/REMOVED/UPDATED/DOWNGRADED:… Proceed ([y]/n)? y
The following (NEW) packages will be downloaded/INSTALLED:… Proceed ([y]/n)? y
Close and reopen terminal.
# download a sample WAV file into a temporary directory
url0 <- 'https://github.com/filipezabala/voiceAudios/raw/main/wav/sherlock0.wav'
wavDir <- normalizePath(tempdir())
download.file(url0, file.path(wavDir, 'sherlock0.wav'), mode = 'wb')
Diarization can be performed to detect speaker segments (i.e., ‘who spoke when’).
The voice::diarize() function creates Rich Transcription Time Marked (RTTM) files, space-delimited text files containing one turn per line, in a format defined by NIST (National Institute of Standards and Technology). The RTTM files can be read using voice::read_rttm().
Finally, the audio waves can be automatically segmented.
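To make the RTTM layout concrete: in the NIST format, each SPEAKER line carries ten space-separated fields, of which the fourth is the turn onset in seconds, the fifth is the turn duration and the eighth is the speaker name. The sketch below sums the speaking time per speaker with awk. It is an illustration only and is independent of voice::read_rttm(); the function name rttm_speaker_totals and the file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: total speaking time (seconds) per speaker in an RTTM file.
# Field 5 is the turn duration and field 8 is the speaker name.
rttm_speaker_totals() {
  awk '$1 == "SPEAKER" { total[$8] += $5 }
       END { for (s in total) printf "%s %.2f\n", s, total[s] }' "$1" | sort
}

# Example call on a file written by the diarization step:
# rttm_speaker_totals sherlock0.rttm
```

Each output line holds a speaker name followed by that speaker's summed turn durations, sorted by name.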