
This function estimates pitch using a normalized cross-correlation function (NCCF) followed by median smoothing, as implemented in the torchaudio library (Yang et al. 2021). The implementing library does not document the exact algorithm, but the approach likely builds on earlier NCCF-based pitch trackers (Talkin and Kleijn 1995; Kasi and Zahorian 2002), including the RAPT algorithm.
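The general idea of NCCF-based pitch estimation with median smoothing can be sketched in base R as follows. This is an illustrative sketch only, not the torchaudio implementation: the helper names (nccf_at_lag, estimate_f0_frame, estimate_f0), the 40 ms analysis frame, and the 3-frame smoothing window are all assumptions chosen for the example.

```r
# Normalized cross-correlation of one frame with itself at a given lag (in samples).
nccf_at_lag <- function(frame, lag) {
  n <- length(frame) - lag
  a <- frame[1:n]
  b <- frame[(1 + lag):(n + lag)]
  denom <- sqrt(sum(a^2) * sum(b^2))
  if (denom == 0) 0 else sum(a * b) / denom
}

# Pick the lag with the highest NCCF inside the [min_f, max_f] search range.
estimate_f0_frame <- function(frame, sample_rate, min_f = 70, max_f = 200) {
  lags <- ceiling(sample_rate / max_f):floor(sample_rate / min_f)
  scores <- vapply(lags, function(l) nccf_at_lag(frame, l), numeric(1))
  sample_rate / lags[which.max(scores)]
}

# Frame the signal, estimate f0 per frame, then median-smooth the raw track.
# frame_ms and smooth_frames are assumed values for this sketch.
estimate_f0 <- function(signal, sample_rate, shift_ms = 10, frame_ms = 40,
                        min_f = 70, max_f = 200, smooth_frames = 3) {
  shift <- round(sample_rate * shift_ms / 1000)
  frame_len <- round(sample_rate * frame_ms / 1000)
  starts <- seq(1, length(signal) - frame_len + 1, by = shift)
  raw <- vapply(starts, function(s) {
    estimate_f0_frame(signal[s:(s + frame_len - 1)], sample_rate, min_f, max_f)
  }, numeric(1))
  as.numeric(stats::runmed(raw, k = smooth_frames, endrule = "keep"))
}

# Synthetic check: a 100 Hz sine sampled at 8 kHz.
sr <- 8000
t <- seq(0, 0.5, by = 1 / sr)
f0 <- estimate_f0(sin(2 * pi * 100 * t), sr)
```

In this sketch the minF and maxF bounds only restrict which lags are searched, so a true f0 outside that range cannot be found and will typically be reported as some in-range lag instead.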

Usage

torch_pitch(
  listOfFiles,
  beginTime = 0,
  endTime = 0,
  windowShift = 10,
  windowSize = 30,
  minF = 70,
  maxF = 200,
  explicitExt = "tpi",
  outputDirectory = NULL,
  toFile = TRUE
)

Arguments

listOfFiles

A vector of file paths to wav files.

beginTime

The start time of the section of the sound file that should be processed.

endTime

The end time of the section of the sound file that should be processed.

windowShift

The measurement interval (frame duration). The unit is stated as seconds, although the default of 10 suggests milliseconds.

minF

Candidate f0 frequencies below this frequency will not be considered.

maxF

Candidate f0 frequencies above this frequency will not be considered.

explicitExt

The file extension to use for the output file.

outputDirectory

An explicit directory to which the signal file will be written. If not defined, the file will be written to the same directory as the sound file.

toFile

Whether to write the output to a file. The file will be written to outputDirectory, if defined, or to the same directory as the sound file.

Value

An SSFF track object containing two tracks (f0 and pitch). The object is returned when toFile is FALSE and is otherwise stored on disk.
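A call that returns the track object instead of writing a file might look like the sketch below. The file path is hypothetical, and the example assumes that the package providing torch_pitch is already attached; the guard simply skips the call when either the function or the file is unavailable.

```r
# Hypothetical input file; replace with the path to a real recording.
wav_files <- c("recordings/speaker01.wav")

# Argument names and defaults are taken from the Usage section above.
pitch_args <- list(
  listOfFiles = wav_files,
  minF = 70,
  maxF = 200,
  toFile = FALSE
)

# Run only when torch_pitch is available and the input file exists.
if (exists("torch_pitch") && all(file.exists(wav_files))) {
  f0_track <- do.call(torch_pitch, pitch_args)
  str(f0_track)
}
```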

References

Kasi K, Zahorian SA (2002). “Yet Another Algorithm for Pitch Tracking.” 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1, I-361-I-364. doi:10.1109/icassp.2002.5743729.

Talkin D, Kleijn WB (1995). “A robust algorithm for pitch tracking (RAPT).” Speech Coding and Synthesis, 495-518.

Yang Y, Hira M, Ni Z, Chourdia A, Astafurov A, Chen C, Yeh C, Puhrsch C, Pollack D, Genzel D, Greenberg D, Yang EZ, Lian J, Mahadeokar J, Hwang J, Chen J, Goldsborough P, Roy P, Narenthiran S, Watanabe S, Chintala S, Quenneville-Bélair V, Shi Y (2021). “TorchAudio: Building Blocks for Audio and Speech Processing.” arXiv preprint arXiv:2110.15018.

See also

rapt