Compute the components of a Praat Voice report — praat_voice

The Praat program defines a voice report containing a range of fundamental properties of a voice sample. The most common application of the voice report is to a sustained vowel. This function uses Praat to compute the report from a section of a recording and returns the voice measures as a list.

The function also enables the user to mark just a part of the sustained vowel for analysis using an offset and a subsample length. The user specifies the start and end times of the sustained vowel (beginTime and endTime, respectively), then a selectionOffset, which is the number of seconds into the vowel at which extraction for analysis starts, and finally a selectionLength, which is the (maximum) length of the extracted part. For example, if a sustained vowel starts 1 s into the signal and lasts for 2 s (very short), and the user asks for a 2 s extraction starting 0.5 s into the vowel, what is actually analysed is the portion from 1.5 s to 3 s (a 1.5 s signal, not the 2 s the user asked for). The selection is truncated in this way so that the user does not inadvertently include material that is not part of the sustained vowel production. The user may of course always choose to disregard measurements that were based on too short a sample.
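
The truncation rule described above can be illustrated with a few lines of R. This is only a sketch of the documented behaviour, not the code the function itself runs: the helper name selection_window is hypothetical, and it assumes that beginTime and endTime are given as numbers (the NULL defaults, which refer to the start and end of the file, are not handled here).

```r
# Hypothetical helper: computes the portion of the vowel that would be analysed,
# following the description above. Not part of the package.
selection_window <- function(beginTime, endTime,
                             selectionOffset = NULL, selectionLength = NULL) {
  stopifnot(is.numeric(beginTime), is.numeric(endTime), endTime > beginTime)
  offset <- if (is.null(selectionOffset)) 0 else selectionOffset
  start <- beginTime + offset
  # The selection never extends past the end of the sustained vowel
  end <- if (is.null(selectionLength)) endTime else min(start + selectionLength, endTime)
  c(start = start, end = end, duration = end - start)
}

# Vowel from 1 s to 3 s, 2 s requested starting 0.5 s into the vowel
w <- selection_window(1, 3, selectionOffset = 0.5, selectionLength = 2)
print(w)  # start = 1.5, end = 3, duration = 1.5
```

Comparing the returned duration with the requested selectionLength is one way to spot samples that ended up shorter than intended.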

Usage

praat_voice_report(
  listOfFiles,
  beginTime = NULL,
  endTime = NULL,
  selectionOffset = NULL,
  selectionLength = NULL,
  windowShape = "Gaussian1",
  relativeWidth = 1,
  minF = 75,
  maxF = 600,
  max_period_factor = 1.3,
  max_ampl_factor = 1.6,
  silence_threshold = 0.03,
  voicing_threshold = 0.45,
  octave_cost = 0.01,
  octave_jump_cost = 0.35,
  voiced_unvoiced_cost = 0.14,
  praat_path = NULL
)

Arguments

listOfFiles: The full path of the sound file.
beginTime: The time point (in s) in the sound file where the sustained vowel starts. If NULL, the start of the sound file will also be viewed as the start of the sustained vowel production.
endTime: The time point (in s) in the sound file where the sustained vowel ends. If NULL, everything up until the end of the sound file will be considered part of the sustained vowel.
selectionOffset: An optional offset (in s) that is added to the start time of the sustained vowel production to determine where the extracted portion of the vowel starts.
selectionLength: An optional (maximal) length of the selection.
windowShape: The window shape used for extracting the vowel. May be one of "rectangular", "triangular", "parabolic", "Hanning", "Hamming", "Gaussian1", "Gaussian2", "Gaussian3", "Gaussian4", "Gaussian5", "Kaiser1", and "Kaiser2".
relativeWidth: The relative width of the window used for extracting the vowel portion.
minF: The minimum pitch (f~0~) to be considered.
maxF: The maximum pitch (f~0~) to be considered.
max_period_factor: The largest possible difference between consecutive intervals that will be used in computing jitter. Please consult the Praat manual for further information.
max_ampl_factor: The largest possible difference between consecutive intervals that will be used in computing shimmer. Please consult the Praat manual for further information.
silence_threshold: The silence threshold. Please consult the Praat manual for further information.
voicing_threshold: The voicing threshold. Please consult the Praat manual for further information.
octave_cost: The octave cost. Please consult the Praat manual for further information.
octave_jump_cost: The octave jump cost. Please consult the Praat manual for further information.
voiced_unvoiced_cost: The cost for voiced to unvoiced change detection. Please consult the Praat manual for further information.
praat_path: An optional explicit path to the Praat binary. Not usually required.
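
A call using the arguments above might look like the following sketch. The file path is a placeholder, the time values are made up, and the element name used to index the result is assumed from the Value section. The guard is there only so that the snippet can be sourced when the function or the file is not available.

```r
# Hypothetical path; replace with a real recording of a sustained vowel.
wav_file <- "/path/to/sustained_vowel.wav"

call_args <- list(
  listOfFiles     = wav_file,
  beginTime       = 1,    # vowel starts 1 s into the file
  endTime         = 3,    # vowel ends 3 s into the file
  selectionOffset = 0.5,  # skip the first 0.5 s of the vowel
  selectionLength = 2,    # analyse at most 2 s
  minF            = 75,   # documented default, repeated for clarity
  maxF            = 600   # documented default, repeated for clarity
)

# Only attempt the analysis when both the function and the file exist
if (exists("praat_voice_report") && file.exists(wav_file)) {
  report <- do.call(praat_voice_report, call_args)
  # Element name assumed from the Value section
  print(report[["Jitter (local)"]])
}
```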

Value

A list of voice parameters:

Median pitch: The median pitch (f~0~) of the sample (in Hz)
Mean pitch: The mean pitch (f~0~) of the sample (in Hz)
Standard deviation: The standard deviation of pitch (f~0~, in Hz) of the sample.
Minimum pitch: The lowest pitch (f~0~) detected (in Hz)
Maximum pitch: The highest pitch (f~0~) detected (in Hz)
Number of pulses: The number of pulses detected
Number of periods: The number of periods detected
Mean period: The average period length
Standard deviation of period: The standard deviation of period length
Fraction of locally unvoiced frames: The fraction of frames detected as unvoiced in the sample.
Number of voice breaks: Number of voice breaks
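Degree of voice breaks: The total duration of the breaks between the voiced parts of the signal, divided by the total duration of the analysed part of the signal. See the Praat manual for more information.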
Jitter (local): The average absolute difference between consecutive periods, divided by the average period (in %). See the Praat manual for more information.
Jitter (local, absolute): The average absolute difference between consecutive periods, in seconds. See the Praat manual for more information.
Jitter (rap): The three-point Relative Average Perturbation: the average absolute difference between a period and the three-point local average, divided by the average period (in %).
Jitter (ppq5): The five-point Period Perturbation Quotient: the average absolute difference between a period and the five-point local average, divided by the average period (in %).
Jitter (ddp): The average absolute difference between consecutive differences between periods, divided by the average period (in %).
Shimmer (local): The average absolute difference between amplitudes of consecutive periods, divided by the average amplitude (in %).
Shimmer (local, dB): The average absolute difference between amplitudes of consecutive periods (in dB).
Shimmer (apq3): The three-point Amplitude Perturbation Quotient: the average absolute difference between the amplitude of a period and the three-point local average, divided by the average amplitude (in %).
Shimmer (apq5): The five-point Amplitude Perturbation Quotient: the average absolute difference between the amplitude of a period and the five-point local average, divided by the average amplitude (in %).
Shimmer (apq11): The 11-point Amplitude Perturbation Quotient: the average absolute difference between the amplitude of a period and the 11-point local average, divided by the average amplitude (in %).
Shimmer (dda): The average absolute difference between consecutive differences between amplitudes of consecutive periods, divided by the average amplitude (in %).
Mean autocorrelation: The average autocorrelation of the signal.
Mean noise-to-harmonics ratio: The average NHR of the voice sample.
Mean harmonics-to-noise ratio: The average HNR of the voice sample.
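
To make the jitter and shimmer definitions in this list concrete, the sketch below computes them in plain R from a vector of period lengths and a vector of peak amplitudes. It is an illustration of the formulas only, not what Praat or this function runs: the function names are hypothetical, the data are made up, and the exclusion of intervals governed by max_period_factor, max_ampl_factor and the pitch range is left out. The dB shimmer is computed as the mean absolute value of 20 * log10 of the ratio of consecutive amplitudes, and dda is normalised by the mean amplitude.

```r
# Mean absolute deviation of each value from the mean of the k values centred
# on it, divided by the overall mean (k must be odd). Hypothetical helper.
perturbation_quotient <- function(x, k) {
  stopifnot(k %% 2 == 1, length(x) >= k)
  half <- (k - 1) / 2
  centres <- (half + 1):(length(x) - half)
  local_means <- vapply(
    centres,
    function(i) mean(x[(i - half):(i + half)]),
    numeric(1)
  )
  mean(abs(x[centres] - local_means)) / mean(x)
}

# Jitter measures from period lengths (in s); relative measures in %
jitter_sketch <- function(periods) {
  stopifnot(is.numeric(periods), length(periods) >= 5)
  abs_diff <- mean(abs(diff(periods)))
  list(
    local          = 100 * abs_diff / mean(periods),
    local_absolute = abs_diff,
    rap            = 100 * perturbation_quotient(periods, 3),
    ppq5           = 100 * perturbation_quotient(periods, 5),
    ddp            = 100 * mean(abs(diff(periods, differences = 2))) / mean(periods)
  )
}

# Shimmer measures from per-period peak amplitudes; relative measures in %
shimmer_sketch <- function(amplitudes) {
  stopifnot(is.numeric(amplitudes), length(amplitudes) >= 11, all(amplitudes > 0))
  n <- length(amplitudes)
  list(
    local    = 100 * mean(abs(diff(amplitudes))) / mean(amplitudes),
    local_dB = mean(abs(20 * log10(amplitudes[-1] / amplitudes[-n]))),
    apq3     = 100 * perturbation_quotient(amplitudes, 3),
    apq5     = 100 * perturbation_quotient(amplitudes, 5),
    apq11    = 100 * perturbation_quotient(amplitudes, 11),
    dda      = 100 * mean(abs(diff(amplitudes, differences = 2))) / mean(amplitudes)
  )
}

# Made-up example data
periods <- c(0.0100, 0.0102, 0.0099, 0.0101, 0.0100, 0.0098)
amplitudes <- c(0.50, 0.52, 0.49, 0.51, 0.50, 0.48,
                0.51, 0.50, 0.49, 0.52, 0.50, 0.51)
jitter <- jitter_sketch(periods)
shimmer <- shimmer_sketch(amplitudes)
```

With these definitions ddp is always three times rap, and dda three times apq3, because the deviation of a value from its three-point average is one third of the corresponding second difference.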

Details

which may be advantageous in cases where it may be suspected that