
The DIO algorithm (Morise et al. 2010) was developed for the WORLD vocoder (Morise et al. 2016) and aims to provide a fast estimate of the f0 contour.

Usage

dio(
  listOfFiles,
  beginTime = 0,
  endTime = 0,
  windowShift = 5,
  minF = 70,
  maxF = 200,
  voiced_voiceless_threshold = 0.01,
  explicitExt = "wd0",
  outputDirectory = NULL,
  toFile = TRUE
)

Arguments

listOfFiles

A vector of file paths to wav files.

beginTime

The start time of the section of the sound file that should be processed.

endTime

The end time of the section of the sound file that should be processed.

windowShift

The measurement interval (frame duration), in milliseconds.

minF

f0 candidates below this frequency (in Hz) will not be considered.

maxF

f0 candidates above this frequency (in Hz) will be ignored.

explicitExt

The file extension that should be used for the output file.

outputDirectory

An explicit directory to which the signal file will be written. If not defined, the file will be written to the same directory as the sound file.

toFile

Whether to write the output to a file. The file will be written to outputDirectory, if defined, or to the same directory as the sound file.

voiced_voiceless_threshold

Threshold for voiced/unvoiced decision. Can be any value >= 0, but 0.02 to 0.2 is a reasonable range. Lower values will cause more frames to be considered unvoiced (in the extreme case of threshold=0, almost all frames will be unvoiced).

Value

An SSFF track object containing two tracks (f0 and corr) that are either returned (toFile == FALSE) or stored on disk.
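
The two output modes described above can be sketched as follows. This is a minimal illustration rather than an official example: the file names and the `f0_tracks` directory are hypothetical, and the sketch assumes that the package providing `dio()` is already attached. The guard keeps the code from failing when that is not the case.

```r
# Hypothetical paths; point these at real recordings before running.
wav_files <- file.path(tempdir(), c("speaker01.wav", "speaker02.wav"))
out_dir <- file.path(tempdir(), "f0_tracks")

# Guard: only run when dio() is available and the input files exist.
can_run <- exists("dio", mode = "function") && all(file.exists(wav_files))

if (can_run) {
  # Return the track object for a single file instead of writing it to disk.
  f0_track <- dio(wav_files[1], minF = 70, maxF = 200, toFile = FALSE)
  str(f0_track)

  # Write one track file per recording into a separate directory,
  # using the default "wd0" extension.
  dir.create(out_dir, showWarnings = FALSE)
  dio(wav_files, outputDirectory = out_dir, explicitExt = "wd0", toFile = TRUE)
}
```

Only a single file is passed when `toFile = FALSE`, a cautious choice since the page does not say how several in-memory results would be returned.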

References

Morise M, Kawahara H, Nishiura T (2010). “Rapid F0 estimation for high-SNR speech based on fundamental component extraction.” Trans. IEICEJ, 93, 109–117.

Morise M, Yokomori F, Ozawa K (2016). “WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications.” IEICE Transactions on Information and Systems, E99.D(7), 1877–1884. ISSN 0916-8532, doi:10.1587/transinf.2015edp7457.