Formant estimation using the FormantPath functionality of Praat

This function exposes Praat's functionality for iteratively searching for the best-fitting formant track for a file by adjusting the maximum formant frequency (frequency ceiling). In each iteration, formants are estimated with Praat's built-in Burg algorithm. See (Escudero et al. 2009-09) for an example of how the procedure has been used and (Weenink and others 2015) for a description of how the optimal formant track is identified. If stepsUpDown is zero, this function and praat_formant_burg produce the same result when identical settings are used. The function also computes the intensity (L) of the best-fitting formant tracks from the power of the spectrum at the frequency of each formant. If the algorithm fails to find a formant in a given time frame, no formant frequency, bandwidth or intensity estimate is returned for that frame.

Usage

praat_formantpath_burg(
  listOfFiles,
  beginTime = 0,
  endTime = 0,
  windowShift = 5,
  numFormants = 5,
  maxFormantHz = 5500,
  windowSize = 30,
  preemphasis = 50,
  ceilingStepSize = 0.05,
  stepsUpDown = 4,
  trackFormants = TRUE,
  numberOfTracks = 3,
  nominalF1 = 550,
  nominalF2 = 1650,
  nominalF3 = 2750,
  frequencyCost = 1,
  bandwidthCost = 1,
  transitionCost = 1,
  windowShape = "Gaussian1",
  relativeWidth = 1,
  spectWindowShape = "Gaussian",
  spectResolution = 40,
  toFile = TRUE,
  explicitExt = "pfp",
  outputDirectory = NULL,
  verbose = FALSE,
  praat_path = NULL
)

Arguments

listOfFiles: a vector of paths to the wav files to be processed by the function.
beginTime: the time where processing should start (in s). The default is 0 (zero), which means that the computation of formants will start at the start of the sound file.
endTime: the time where processing should end (in s). The default is 0 (zero), which means that formants will be computed up to the end of the file.
windowShift: the analysis window shift length (in ms).
numFormants: the number of formants that the analysis should try to find.
maxFormantHz: the maximum frequency (in Hz) below which the formants should be found, i.e. the frequency ceiling around which the search is centred.
windowSize: the analysis window length (in ms).
preemphasis: the frequency (in Hz) from which a preemphasis will be applied.
ceilingStepSize: the step size used when varying the frequency ceiling. The function searches for formant tracks multiple times, with the frequency ceiling ranging from maxFormantHz*exp(-ceilingStepSize*stepsUpDown) to maxFormantHz*exp(ceilingStepSize*stepsUpDown).
stepsUpDown: the number of increases and decreases of the frequency ceiling to try when searching for the optimal formant track.
trackFormants: boolean; Should Praat attempt to gather short time formant frequency estimates into tracks?
numberOfTracks: the number of tracks to follow (if trackFormants is TRUE), and the number of tracks in the output. Information on the frequencies and bandwidths of formants numbered above numberOfTracks will be discarded.
nominalF1: Described by the Praat manual as the preferred value near which the first track wants to be. For average (i.e. adult female) speakers, this value will be around the average F1 for vowels of female speakers, i.e. 550 Hz.
nominalF2: Described by the Praat manual as the preferred value near which the second track wants to be. A good value will be around the average F2 for vowels of female speakers, i.e. 1650 Hz.
nominalF3: Described by the Praat manual as the preferred value near which the third track wants to be. A good value will be around the average F3 for vowels of female speakers, i.e. 2750 Hz. This argument will be ignored if you choose to have fewer than three tracks, i.e., if you are only interested in F1 and F2.
frequencyCost: Described by the Praat manual as the frequency cost (per kilohertz), i.e. the local cost of having a formant value in a track that deviates from that track's nominal (reference) value.
bandwidthCost: Described by the Praat manual as the local cost of having a bandwidth, relative to the formant frequency. For instance, if a candidate has a formant frequency of 400 Hz and a bandwidth of 80 Hz, and Bandwidth cost is 1.0, the cost of having this formant in any track is (80/400) · 1.0 = 0.200. So we see that the procedure locally favours the inclusion of candidates with low relative bandwidths.
transitionCost: Described by the Praat manual as the cost of having two different consecutive formant values in a track. For instance, if a proposed track through the candidates has two consecutive formant values of 300 Hz and 424 Hz, and Transition cost is 1.0/octave, the cost of having this large frequency jump is (0.5 octave) · (1.0/octave) = 0.500.
windowShape: the analysis window function used when extracting part of a sound file for analysis. Defaults to "Gaussian1".
relativeWidth: the relative width of the windowing function used.
spectWindowShape: The shape of the windowing function used for constructing the spectrogram.
spectResolution: The frequency resolution of the spectrogram from which formant intensities will be collected.
toFile: write the output to a file? The file will be written in outputDirectory, if defined, or in the same directory as the sound file.
explicitExt: the file extension that should be used.
outputDirectory: set an explicit directory for where the signal file will be written. If not defined, the file will be written to the same directory as the sound file.
verbose: Not implemented. Only included here for compatibility.
praat_path: give an explicit path for Praat. If the praat
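
The arithmetic behind some of the arguments above can be checked directly in R. The sketch below uses the defaults from the Usage section and the two worked cost examples from the bandwidthCost and transitionCost entries. It assumes one candidate ceiling per integer step between -stepsUpDown and +stepsUpDown; only the two end points are stated above, so the intermediate spacing is an assumption and not a description of Praat's internals.

```r
# Defaults taken from the Usage section.
maxFormantHz    <- 5500
ceilingStepSize <- 0.05
stepsUpDown     <- 4

# Assumption: one candidate ceiling per integer step from -stepsUpDown to
# +stepsUpDown. Only the lowest and highest ceilings are documented above.
steps    <- seq(-stepsUpDown, stepsUpDown)
ceilings <- maxFormantHz * exp(ceilingStepSize * steps)
print(round(ceilings))

# Worked example from the bandwidthCost entry:
# an 80 Hz bandwidth on a 400 Hz formant, with a cost factor of 1.
bandwidthCost     <- 1
bandwidth_penalty <- (80 / 400) * bandwidthCost

# Worked example from the transitionCost entry:
# a jump from 300 Hz to 424 Hz, measured in octaves, with a cost of 1 per octave.
transitionCost     <- 1
transition_penalty <- log2(424 / 300) * transitionCost

print(c(bandwidth = bandwidth_penalty, transition = transition_penalty))
```

With the defaults, the ceilings run from about 4503 Hz to about 6718 Hz around the central 5500 Hz. Setting stepsUpDown to 0 leaves only the 5500 Hz ceiling, which is consistent with the statement that the function then matches praat_formant_burg.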

Value

An SSFF track data object (if toFile=FALSE) with three fields ("F", "B" and "L") containing formant frequencies, bandwidths and intensities, respectively.
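
A minimal sketch of a call that returns the track object instead of writing it to disk might look as follows. The file name sample.wav is hypothetical, and the guard around the call is only there so the snippet does nothing when the function or the file is unavailable. The argument names come from the Usage section; how the returned object is laid out beyond the three field names is not specified here, so the sketch inspects it with str() and does not assume a particular structure.

```r
# Hypothetical input file; replace with the path to a real recording.
wav_file <- "sample.wav"

# Arguments named exactly as in the Usage section. toFile = FALSE asks for
# the track object to be returned instead of being written to a file.
fp_args <- list(
  listOfFiles    = wav_file,
  stepsUpDown    = 4,
  numberOfTracks = 3,
  toFile         = FALSE
)

# Only attempt the call when the function is loaded and the file exists.
if (exists("praat_formantpath_burg", mode = "function") && file.exists(wav_file)) {
  tracks <- do.call("praat_formantpath_burg", fp_args)
  # The fields "F", "B" and "L" are named in the Value description;
  # str() shows whatever structure the returned object actually has.
  str(tracks)
}
```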

Details

If the user only wants to estimate formant frequencies that will later be manually corrected, computing them with wrassp::forest or even praat_formant_burg is much quicker. The user should consider this function only if the use case specifically demands an iterative search for best-fitting formants.

References

Escudero P, Boersma P, Rauber AS, Bion RAH (2009-09). “A cross-dialect acoustic description of vowels: Brazilian and European Portuguese.” The Journal of the Acoustical Society of America, 126(3), 1379-1393. ISSN 0001-4966, doi:10.1121/1.3180321, http://www.ncbi.nlm.nih.gov/pubmed/19739752.

Weenink D, others (2015). “Improved formant frequency measurements of short segments.” In ICPhS.