Compute f0 tracks using Praat — praat

This function calls Praat to compute f0 tracks. Both the autocorrelation and cross-correlation methods are used, and the results are stored in separate fields in the returned SSFF track object. Unless corr.only is set to TRUE, SPINET and SHS estimates of f0 are computed as well. Most arguments to the function map directly to formal arguments of the underlying Praat procedure, and their descriptions are therefore replicated here. See the Praat manual for more information.

Usage

praat_pitch(
  listOfFiles,
  beginTime = 0,
  endTime = 0,
  windowShift = 5,
  minF = 75,
  maxF = 600,
  max.f0.candidates = 15,
  very.accurate = TRUE,
  silence.threshold = 0.03,
  voicing.threshold = 0.45,
  octave.cost = 0.01,
  octave.jump.cost = 0.35,
  voiced.voiceless.cost = 0.14,
  corr.only = FALSE,
  windowSize = 40,
  min.filter.freq = 70,
  max.filter.freq = 5000,
  filters = 250,
  max.freq.components = 1250,
  subharmonics = 15,
  compression = 0.84,
  points.per.octave = 48,
  windowShape = "Gaussian1",
  relativeWidth = 1,
  toFile = TRUE,
  explicitExt = "pf0",
  outputDirectory = NULL,
  verbose = FALSE,
  praat_path = NULL
)

Arguments

listOfFiles: A vector of file paths to wav files.
beginTime: The start time of the section of the sound file that should be processed.
endTime: The end time of the section of the sound file that should be processed.
windowShift: The measurement interval (frame duration). Praat itself expects this value in seconds; the default of 5 and the "10 ms / windowShift" correction described under octave.jump.cost suggest that this function takes milliseconds. If you supply 0, Praat will use a time step of 0.75 / (pitch floor), e.g. 0.01 seconds if the pitch floor is 75 Hz; in this example, Praat computes 100 pitch values per second.
minF: Candidates below this frequency will not be recruited. This parameter determines the effective length of the analysis window: it will be three times the longest period, i.e., if the pitch floor is 75 Hz, the window will be effectively 3/75 = 0.04 seconds long. Note that if you set the time step to zero, the analysis windows for consecutive measurements will overlap appreciably: Praat will always compute 4 pitch values within one window length, i.e., the degree of oversampling is 4.
maxF: Candidates above this frequency will be ignored.
max.f0.candidates: The maximum number of f0 candidates to consider.
very.accurate: If FALSE, the window is a Hanning window with a physical length of 3 / (pitch floor). If TRUE, the window is a Gaussian window with a physical length of 6 / (pitch floor), i.e. twice the effective length.
silence.threshold: Frames that do not contain amplitudes above this threshold (relative to the global maximum amplitude) are probably silent.
voicing.threshold: The strength of the unvoiced candidate, relative to the maximum possible autocorrelation. To increase the number of unvoiced decisions, increase this value.
octave.cost: The degree of favoring of high-frequency candidates, relative to the maximum possible autocorrelation. This is necessary because even (or: especially) in the case of a perfectly periodic signal, all undertones of f0 are equally strong candidates as f0 itself. To more strongly favor recruitment of high-frequency candidates, increase this value.
octave.jump.cost: Degree of disfavoring of pitch changes, relative to the maximum possible autocorrelation. To decrease the number of large frequency jumps, increase this value. In contrast with what is described by Boersma (1993), this value will be corrected for the time step: multiply by 10 ms / windowShift to get the value in the way it is used in the formulas in the article.
voiced.voiceless.cost: Degree of disfavoring of voiced/unvoiced transitions, relative to the maximum possible autocorrelation. To decrease the number of voiced/unvoiced transitions, increase this value. In contrast with what is described by Boersma (1993), this value will be corrected for the time step: multiply by 10 ms / windowShift to get the value in the way it is used in the formulas in the article.
corr.only: Boolean; compute autocorrelation (AC) and cross-correlation (CC) estimates of f0 only. If FALSE (the default), the function will additionally estimate f0 using a Spatial Pitch Network (SPINET) model (Cohen et al. 1995) as well as a spectral compression (SHS) model (Hermes 1988). These additional estimates increase the computational load considerably; if they are not explicitly needed, avoid them by setting this parameter to TRUE.
windowSize: The window size used for computing the SPINET model.
min.filter.freq: The minimum filter frequency used when computing the SPINET model.
max.filter.freq: The maximum filter frequency used when computing the SPINET model.
filters: The number of filters used when computing the SPINET model.
max.freq.components: Higher frequencies will not be considered when computing SHS.
subharmonics: The maximum number of harmonics that add up to the pitch in SHS.
compression: The factor by which successive compressed spectra are multiplied before the summation in SHS.
points.per.octave: Determines the sampling of the logarithmic frequency scale in SHS.
windowShape: The analysis window function used when extracting part of a sound file for analysis. Defaults to "Gaussian1".
relativeWidth: The relative width of the windowing function used.
toFile: Write the output to a file? The file will be written in outputDirectory, if defined, or in the same directory as the sound file.
explicitExt: The file extension that should be used.
outputDirectory: An explicit directory in which the signal file will be written. If not defined, the file will be written to the same directory as the sound file.
verbose: Not implemented. Only included here for compatibility.
praat_path: An explicit path to Praat.
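
The timing relations quoted in the argument descriptions above can be checked with a little arithmetic. The following R sketch is not part of the package: the helper names (default_time_step, physical_window_length, article_cost) are hypothetical, and it assumes that windowShift is expressed in milliseconds, as the "10 ms / windowShift" wording suggests.

```r
# Hypothetical helpers; the formulas are the ones quoted in the
# descriptions of windowShift, minF, very.accurate and the cost arguments.

# Time step (in seconds) that Praat picks when the time step is set to 0.
default_time_step <- function(pitch_floor) {
  0.75 / pitch_floor
}

# Physical analysis window length (in seconds):
# 3 / (pitch floor) for the Hanning window, 6 / (pitch floor) for the Gaussian.
physical_window_length <- function(pitch_floor, very_accurate = FALSE) {
  periods <- if (very_accurate) 6 else 3
  periods / pitch_floor
}

# Rescale octave.jump.cost or voiced.voiceless.cost to the scale used in
# the article's formulas. ASSUMPTION: window_shift_ms is in milliseconds.
article_cost <- function(cost, window_shift_ms) {
  cost * 10 / window_shift_ms
}

default_time_step(75)                               # 0.01 s, i.e. 100 values per second
physical_window_length(75)                          # 0.04 s
physical_window_length(75, very_accurate = TRUE)    # 0.08 s
article_cost(0.35, window_shift_ms = 5)             # 0.7
article_cost(0.14, window_shift_ms = 5)             # 0.28
```

With the defaults shown under Usage (minF = 75, windowShift = 5), both cost values are doubled when they are converted to the scale of the article's formulas.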

Value

An SSFF object containing the f0 tracks (if toFile==FALSE).
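
A call that returns the track object instead of writing a file might look like the sketch below. The file name is a hypothetical placeholder, only arguments listed under Usage are set, and the call itself is guarded so that the snippet does nothing when praat_pitch or the recording is not available. The structure of the returned object is not described here, so the sketch only inspects it with str().

```r
# Hypothetical recording; replace with the path to a real wav file.
wav_file <- "recording.wav"

# Arguments taken from the Usage section; everything else keeps its default.
pitch_args <- list(
  listOfFiles = wav_file,
  minF        = 75,
  maxF        = 600,
  corr.only   = TRUE,   # AC and CC estimates only; skips SPINET and SHS
  toFile      = FALSE   # return the SSFF track object instead of writing a file
)

# Only attempt the call when the function and the file actually exist.
if (exists("praat_pitch", mode = "function") && file.exists(wav_file)) {
  f0 <- do.call(praat_pitch, pitch_args)
  str(f0, max.level = 1)   # inspect the top-level fields of the result
}
```

Setting corr.only = TRUE keeps the example fast, since the SPINET and SHS estimates are the computationally expensive part.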

References

Boersma P (1993). “Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound.” In Proceedings of the Institute of Phonetic Sciences, volume 17, 97–110.

Cohen MA, Grossberg S, Wyse LL (1995). “A spectral network model of pitch perception.” The Journal of the Acoustical Society of America, 98(2), 862–879. ISSN 0001-4966, doi:10.1121/1.413512, http://www.ncbi.nlm.nih.gov/pubmed/7642825.

Hermes DJ (1988). “Measurement of pitch by subharmonic summation.” The Journal of the Acoustical Society of America, 83(1), 257–264. ISSN 0001-4966, doi:10.1121/1.396427, http://www.ncbi.nlm.nih.gov/pubmed/3343445.