Compute f0 using the DIO algorithm

The DIO algorithm (Morise et al. 2010) was developed for the WORLD vocoder (Morise et al. 2016) and aims to provide a fast estimate of the f0 contour.

Usage

dio(
  listOfFiles,
  beginTime = 0,
  endTime = 0,
  windowShift = 5,
  minF = 70,
  maxF = 200,
  voiced_voiceless_threshold = 0.01,
  explicitExt = "wd0",
  outputDirectory = NULL,
  toFile = TRUE
)

Arguments

listOfFiles: A vector of file paths to wav files.
beginTime: The start time of the section of the sound file that should be processed.
endTime: The end time of the section of the sound file that should be processed.
windowShift: The measurement interval (frame duration), in milliseconds.
minF: Candidate f0 values below this frequency (in Hz) will not be considered.
maxF: Candidate f0 values above this frequency (in Hz) will not be considered.
explicitExt: The file extension to use for the output file.
outputDirectory: An explicit directory to which the signal file will be written. If not defined, the file is written to the same directory as the sound file.
toFile: Whether to write the output to a file. The file is written to outputDirectory, if defined, or otherwise to the same directory as the sound file.
voiced_voiceless_threshold: Threshold for the voiced/unvoiced decision. Can be any value >= 0, but 0.02 to 0.2 is a reasonable range. Lower values cause more frames to be considered unvoiced (in the extreme case of a threshold of 0, almost all frames will be unvoiced).
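
To illustrate how beginTime, endTime and windowShift interact, the following base R sketch approximates the time points at which f0 would be measured. It is not part of the package: frame_times() is a hypothetical helper, and it assumes that beginTime and endTime are given in seconds and that windowShift is in milliseconds (as the default of 5 suggests). The frame positions actually produced by dio() may differ slightly.

```r
# Hypothetical helper, not part of the package: approximate frame times.
# Assumes beginTime/endTime are in seconds and windowShift is in milliseconds.
frame_times <- function(beginTime, endTime, windowShift = 5) {
  stopifnot(endTime > beginTime, windowShift > 0)
  # Number of frames that fit into the analysed section, including the first.
  n_frames <- floor((endTime - beginTime) * 1000 / windowShift) + 1
  # Time stamp (in seconds) of each frame.
  beginTime + (seq_len(n_frames) - 1) * windowShift / 1000
}

times <- frame_times(beginTime = 0, endTime = 1, windowShift = 5)
length(times)  # 201 frames for one second at a 5 ms shift
range(times)   # first and last frame time, in seconds
```

Under these assumptions, halving windowShift roughly doubles the number of f0 values per file, which is worth keeping in mind when processing many long recordings.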

Value

An SSFF track object containing two tracks (f0 and corr) that are either returned (toFile == FALSE) or stored on disk.
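
Examples

A minimal sketch of a call, using only the arguments listed under Usage. The file name "recording.wav" is a placeholder, and raising maxF to 400 is only an illustration for a higher-pitched voice, not a recommendation from the package authors. The call is guarded so that the code can be sourced even when dio() or the input file is unavailable.

```r
# Placeholder input path; replace with the path to a real wav file.
wav_files <- c("recording.wav")

# Arguments as named in the Usage section. maxF is raised from its default
# of 200 purely for illustration.
dio_args <- list(
  listOfFiles = wav_files,
  windowShift = 5,
  minF = 70,
  maxF = 400,
  voiced_voiceless_threshold = 0.01,
  toFile = FALSE  # return the track object instead of writing a file
)

# Run only if dio() is loaded and the input file exists.
if (exists("dio", mode = "function") && all(file.exists(wav_files))) {
  f0_track <- do.call(dio, dio_args)
  str(f0_track)  # inspect the returned object (f0 and corr tracks)
} else {
  message("dio() or the input file is not available; nothing was computed.")
}
```

With toFile = TRUE (the default), the result would instead be written to disk using the extension given by explicitExt.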

References

Morise M, Kawahara H, Nishiura T (2010). “Rapid F0 estimation for high-SNR speech based on fundamental component extraction.” Trans. IEICEJ, 93, 109--117.

Morise M, Yokomori F, Ozawa K (2016). “WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications.” IEICE Transactions on Information and Systems, E99.D(7), 1877--1884. ISSN 0916-8532, doi:10.1587/transinf.2015edp7457.