Pitch tracking using the torch pitch tracker

This function estimates pitch using the normalized cross-correlation function (NCCF) followed by median smoothing, as implemented in the torchaudio library (Yang et al. 2021). The implementing library does not document the exact algorithm, but the approach likely builds on earlier NCCF-based implementations (Talkin and Kleijn 1995; Kasi and Zahorian 2002), including the RAPT algorithm.
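To make the general idea concrete, the following base R sketch estimates f0 for a single frame by picking the lag with the highest NCCF score, then applies a running median to a short f0 track. It is an illustration only, not torchaudio's code: the helper name nccf_f0, the synthetic 100 Hz test signal, the example track values, and the 3-frame median window are all assumptions made for this example.

```r
# Illustrative NCCF-based f0 estimate for one frame (hypothetical helper,
# not the algorithm used by torchaudio or torch_pitch()).
nccf_f0 <- function(frame, sampleRate, minF = 70, maxF = 200) {
  n <- length(frame)
  minLag <- floor(sampleRate / maxF)    # shortest candidate period, in samples
  maxLag <- ceiling(sampleRate / minF)  # longest candidate period, in samples
  stopifnot(minLag >= 1, maxLag < n)
  lags <- minLag:maxLag
  scores <- vapply(lags, function(lag) {
    a <- frame[1:(n - lag)]
    b <- frame[(lag + 1):n]
    # Cross-correlation normalized by the energy of both segments
    sum(a * b) / sqrt(sum(a^2) * sum(b^2))
  }, numeric(1))
  # The best-scoring lag is taken as the period; convert it to Hz
  sampleRate / lags[which.max(scores)]
}

# Synthetic check: a 100 Hz sine sampled at 8 kHz, roughly 50 ms long
sampleRate <- 8000
timeAxis <- seq(0, 0.05, by = 1 / sampleRate)
frame <- sin(2 * pi * 100 * timeAxis)
f0 <- nccf_f0(frame, sampleRate)

# Median smoothing of a made-up f0 track containing one outlier (240 Hz)
track <- c(100, 101, 240, 99, 100)
smoothed <- stats::runmed(track, k = 3)
```

Restricting the lag search to the periods implied by minF and maxF is what keeps candidates outside that frequency range from being considered; the running median then suppresses isolated outliers such as octave jumps.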

Usage

torch_pitch(
  listOfFiles,
  beginTime = 0,
  endTime = 0,
  windowShift = 10,
  windowSize = 30,
  minF = 70,
  maxF = 200,
  explicitExt = "tpi",
  outputDirectory = NULL,
  toFile = TRUE
)

Arguments

listOfFiles: A vector of file paths to wav files.
beginTime: The start time of the section of the sound file that should be processed.
endTime: The end time of the section of the sound file that should be processed.
windowShift: The measurement interval (frame duration), in milliseconds (the default of 10 corresponds to 10 ms).
minF: Candidate f0 frequencies below this frequency will not be considered.
maxF: Candidates above this frequency will be ignored.
explicitExt: The file extension to use for the output file.
outputDirectory: An explicit directory in which the signal file will be written. If not defined, the file is written to the same directory as the sound file.
toFile: Should the output be written to a file? The file is written to outputDirectory, if defined, or otherwise to the same directory as the sound file.
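The interaction between explicitExt and outputDirectory can be sketched in base R as follows. The helper track_path and the file names are hypothetical, and the assumption that the output file keeps the sound file's base name with only the extension swapped is not stated in this page.

```r
# Hypothetical helper: where a track file would be written for a given
# sound file, assuming the base name is kept and only the extension changes.
track_path <- function(soundFile, explicitExt = "tpi", outputDirectory = NULL) {
  stem <- tools::file_path_sans_ext(basename(soundFile))
  # Fall back to the sound file's own directory when none is given
  targetDir <- if (is.null(outputDirectory)) dirname(soundFile) else outputDirectory
  file.path(targetDir, paste0(stem, ".", explicitExt))
}

sameDir <- track_path("recordings/speaker1.wav")
otherDir <- track_path("recordings/speaker1.wav", outputDirectory = "tracks")
```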

Value

An SSFF track object containing two tracks (f0 and pitch). The object is returned if toFile is FALSE; otherwise it is written to disk.

References

Kasi K, Zahorian SA (2002). “Yet Another Algorithm for Pitch Tracking.” 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1, I-361–I-364. doi:10.1109/icassp.2002.5743729.

Talkin D, Kleijn WB (1995). “A Robust Algorithm for Pitch Tracking (RAPT).” Speech Coding and Synthesis, 495–518.

Yang Y, Hira M, Ni Z, Chourdia A, Astafurov A, Chen C, Yeh C, Puhrsch C, Pollack D, Genzel D, Greenberg D, Yang EZ, Lian J, Mahadeokar J, Hwang J, Chen J, Goldsborough P, Roy P, Narenthiran S, Watanabe S, Chintala S, Quenneville-Bélair V, Shi Y (2021). “TorchAudio: Building Blocks for Audio and Speech Processing.” arXiv preprint arXiv:2110.15018.
