The idea is to make a package that has all the functionality of wrassp, and extend it with analyses made avaiable in Praat or MATLAB. The added functions should behave in a wrassp-like manner, and thereby be callable in a similar way in the emuR
framwork.
The praat_formant_burg
provides an illustration of how a Praat script that extracts formant values may be wrapped inside of an R function and produce a SSFF formant track file.
Details
By loading this package, you also get all the functions exported by the wrassp
package into your namespace. This is achieved by the superassp
package being Depending the wrassp
package (rather than Importing, which is usually the preferred way of creating depmendencies between R packages).
Installation
The package requires the Praat program to be installed in the user’s PATH (or in ‘/Applications’ on Mac OS).
Then simply install the package using
install.packages("devtools") # If not installed already
devtools::install_github("humlab-speech/superassp",dependencies = "Imports")
Indications of performance of Praat and wrassp functions
library(microbenchmark)
microbenchmark(
"wrassp::forest"=wrassp::forest(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE),
praat_formant_burg=praat_formant_burg(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE),
praat_formantpath_burg=praat_formantpath_burg(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE),
praat_sauce=praat_sauce(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE),
times=100
)
which results in
Unit: milliseconds
expr min lq mean median uq max neval
wrassp::forest 26.42033 28.25807 28.93884 28.74673 29.4339 34.7792 100
praat_formant_burg 520.85644 556.63421 596.37337 578.95567 628.1025 777.4130 100
praat_formantpath_burg 669.06082 708.40986 751.45798 733.72906 776.1413 1170.8230 100
praat_sauce 3247.77570 3400.72715 3668.55957 3577.48971 3895.4160 4753.4219 100
Getting an SSFF file from a wrassp
function rather than the praat_formant_burg
function, which is wrapped call of Praat call and which also involves the parsing of a csv file. Since the parsing of input and output in the praat_formant_burg
Praat calls already slows computation down considerably, the function also computes formant amplitudes (L) before returning the output to increase the usefulness of the function. The praat_formantpath_burg
function is of course an additional bit slower than method of computing formant frequencies as multiple formant tracks are computed and compared when this function is used.
Also, even it is adviced that even though the functions praat_sauce
does compute formant tracks (F and B properties) as well, it is not really efficiently implemented and is really mostly there for correction of harmonic amplitudes. And, an additional factor to consider is that the formant tracks will be stored by the praat_sauce
function in a field in the same file as all the other tracks computed by the function, which will likely result in a performance issue when working with the tracks.
So, if you need only formant frequency and bandwidth estimations, then you should really use one of the other functions instead.
Similarly, f0 computation using functions that call Praat or python are considerably slower than their wrassp
counterparts:
library(microbenchmark)
microbenchmark(
"wrassp::ksvF0"=wrassp::ksvF0(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"wrassp::mhsF0"=wrassp::mhsF0(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"praat_pitch ac & cc"=praat_pitch(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,corr.only=TRUE,windowShift=5),
"praat_pitch all methods"=praat_pitch(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,corr.only=FALSE,windowShift=5),
"rapt"=rapt(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"kaldi pitch tracker"=kaldi_pitch(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"swipe"=swipe(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"reaper"=reaper(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"yin"=yin(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"pyin"=pyin(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"dio"=dio(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"crepe"=crepe(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"harvest"=harvest(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
"yaapt"=yaapt(
file.path(getwd(),"tests/signalfiles/msajc003.wav"),toFile=FALSE,windowShift=5),
times=10
)
Unit: milliseconds
expr min lq mean median uq max neval
wrassp::ksvF0 2.233792 2.281709 2.335946 2.299771 2.428917 2.459084 10
wrassp::mhsF0 16.071417 16.318417 16.747580 16.629147 16.911459 18.199459 10
dio 96.165001 98.599834 113.813180 103.385646 109.233959 196.728834 10
rapt 117.702001 123.111418 130.628734 125.266938 132.193834 172.862917 10
swipe 132.854251 142.303001 154.309442 146.920792 149.362251 203.044126 10
yin 174.342459 176.832625 184.944901 179.869314 185.566626 227.427792 10
kaldi pitch tracker 195.539792 198.606459 217.634276 203.737271 211.397167 336.594000 10
reaper 232.866792 240.993793 817.140772 242.397271 244.391001 5989.057042 10
harvest 327.358626 334.249459 341.861072 337.557001 353.730001 368.915709 10
pyin 403.986376 410.861001 439.136880 440.595814 455.601042 500.009751 10
praat_pitch ac & cc 516.680042 536.404792 543.467217 546.130938 552.376292 557.215459 10
crepe 566.955126 605.125417 853.842680 612.232230 650.363376 3008.081876 10
yaapt 526.269001 575.632584 632.245809 626.762709 662.692500 849.112584 10
praat_pitch all methods 13428.942209 13825.150959 13894.481517 13884.433500 13964.615626 14224.678042 10
I have rearranged the output so that the algorithms are roughly ordered by (median) time used to compute output tracks.
Please note that these relative timings are not necessarily indicative of the relative efficiency of the algorithms themselves. The communication between R and Praat / python has a severe impact on performance, so the benchmarks above indicate only the relative performance in the current version of superassp
.
It should also be noted that as the computation is already slow due to the process of calling Praat the superassp
functions instead takes the opportunity to return more information once processing a file. For instance, praat_pitch
returns up to two or four tracks in which f_0 was estimated and may therefore be worth the wait. The swipe
estimates an additional “pitch” track, and reaper
and kaldi_pitch
computes and returns also normalized cross-correlation.
Steps to implement a new Praat function
- Indentify what the output of the function will be
- A signal track (or tracks) that follows the original sound wave
- A value (or a limited list of values) that summarises the acoustic properties of a wav file, and can therefore not sensibly be shown alongside the sound wave.
- Implement the core analysis in a Praat script file, and place it in
inst/praat
.- In the case where track(s) that follow the sound wave file are returned, the Praat function should write the output to a CSV table file and return the name of that table. The Praat script should also take the desired output table file name (including full path) as an argument. Please refer to
praat/formant_burg.praat
for some example code that computes formants and bandwidths for them for a (possibly windowed) sound file and writes them to a table.
- In the case where track(s) that follow the sound wave file are returned, the Praat function should write the output to a CSV table file and return the name of that table. The Praat script should also take the desired output table file name (including full path) as an argument. Please refer to
- Make a copy of the suitable template function, rename it (please keep the praat_ prefix for clarity) and make modifications to the code to suit the new track computed by Praat. You will need to think about what the tracks should be called in the SSFF file and document your choice.
- For a function that computes a sound wave following signal track (or tracks), use the code of
praat_formant_burg
as a template. Please refer to a suitable function in wrassp for inspiration on what to call sets of tracks. (Thepraat_formant_burg
outputs and “fm” and “bw” set, for formant frequencies and formant bandwidths respectivelly) - For single value (or list of values) output, there is currently no template function implemented, but please note that the
tjm.praat::wrap_praat_script()
, whichcs_wrap_praat_script
is a revised version of, has an option to return the “Info window” of Praat, which opens up lots of possibilities.
- For a function that computes a sound wave following signal track (or tracks), use the code of
- There are many moving parts to this whole package, so make sure to contruct a test file and a test suit for the new function to make sure that it works.