Audio samples produced by TubeTalker

The audio files below accompany this paper:

The speech samples were produced by a speech simulation system called TubeTalker. TubeTalker operates at the level of the vocal tract area function, on the theoretical view that speech is produced by multiple levels of airway structure and modulation. A "neutral" vocal tract shape is the base structure on which all other modulation is superimposed. The first level of modulation consists of time-dependent shaping of the neutral tract shape over most of its length; this produces transitions from one vowel to another. The second level of modulation imposes spatially localized perturbations that momentarily disturb the underlying vowel substrate. The examples below demonstrate the use of TubeTalker to generate speech at the word and phrase levels.
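
As a rough illustration of the layering described above, the following Python sketch builds a toy area function from a neutral shape, a slowly varying tract-wide modulation, and a brief localized perturbation. Every name, number, and functional form here (the uniform 3 cm^2 tube, the 44 sections, the Gaussian constriction, combining the levels by multiplication) is an assumption made for illustration; none of it is taken from TubeTalker itself.

```python
import math

N_SECTIONS = 44  # number of tube sections along the tract (assumed value)

def neutral_area(i):
    """Base structure: a toy neutral shape, here a uniform 3 cm^2 tube (assumed)."""
    return 3.0

def vowel_modulation(i, t):
    """First level: smooth, tract-wide shaping that drifts with time t in [0, 1]."""
    x = i / (N_SECTIONS - 1)  # 0 at the glottis, 1 at the lips
    # Narrow at the back when t = 0, narrow at the front when t = 1.
    tilt = (1.0 - 2.0 * t) * (2.0 * x - 1.0)
    return 1.0 + 0.5 * tilt  # multiplicative factor in [0.5, 1.5]

def consonant_perturbation(i, t, place=0.8, width=0.05, t_center=0.5, t_width=0.1):
    """Second level: a brief, spatially localized constriction (factor in [0, 1])."""
    x = i / (N_SECTIONS - 1)
    spatial = math.exp(-((x - place) / width) ** 2)
    temporal = math.exp(-((t - t_center) / t_width) ** 2)
    return 1.0 - spatial * temporal  # 1 means no effect, 0 means full occlusion

def area_function(t):
    """Cross-sectional areas (cm^2) of all sections at normalized time t."""
    return [
        neutral_area(i) * vowel_modulation(i, t) * consonant_perturbation(i, t)
        for i in range(N_SECTIONS)
    ]

# At t = 0.5 the constriction near the lips is at its tightest;
# at t = 0.0 it has essentially no effect and only the vowel shaping remains.
tightest = min(area_function(0.5))
```

Dropping the call to `consonant_perturbation` in this sketch leaves only the vowel-like shaping, which mirrors the idea behind the "Vowels only" samples below.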


Neutral vocal tract

This sample uses the neutral vocal tract shape only, with no modulation. The voice source still follows a fundamental frequency (F0) contour, which gives the samples a more natural quality. The F0 contour is identical for all samples.



Word: "Ohio"

With regard to the vocal tract, "Ohio" is an all-vowel utterance. It was generated by modulating the neutral vocal tract shape such that it produced the acoustic characteristics of the vowels. The glottal aspiration for the "h" sound was created by an abductory maneuver of the vocal folds.



Word: "Abracadabra" (Vowels only)

This word requires modulation at the level of vowel transitions and consonantal perturbations. The sample below, however, is of only the vowel transitions that underlie production of the word.



Word: "Abracadabra"

Now the consonantal perturbations are imposed on the vowel substrate to produce the full word "Abracadabra".



Phrase: "He had a rabbit" (Vowels only)

This sample demonstrates the increased complexity of a phrase relative to a single word. This audio file, however, contains only the vowel substrate on which the phrase is built.



Phrase: "He had a rabbit"

The consonantal perturbations are now imposed. Note that an "r" is present in this phrase, which requires that the consonantal perturbation not occlude the vocal tract.



Phrase: "The brown cow" (Vowels only)

This audio file demonstrates the vowel substrate for the phrase.



Phrase: "The brown cow"

The unique component of this example is that it includes a nasal consonant. This requires that the opening of the nasal port, which couples the main vocal tract to the nasal passages and sinuses, be precisely timed to allow nasalization, and that it close quickly enough for adequate production of the "k" sound in the following word ("cow").




---------------------------------------------------

Modifications to the neutral vocal tract shape


By changing only the neutral vocal tract shape while keeping all other modulations the same, a new sound quality is produced. Here are two examples using the "He had a rabbit" phrase.

Phrase: "He had a rabbit" - Modified neutral vocal tract #1


Phrase: "He had a rabbit" - Modified neutral vocal tract #2



---------------------------------------------------

Modifications to the timing of the control parameters


In the following two phrases, the timing of all control parameters was altered such that the first half of each phrase was increased in duration by 25 percent and the latter half decreased by 25 percent. The total duration of each phrase is the same as the original.

Phrase: "He had a rabbit" - Modified timing #1


Phrase: "The brown cow" - Modified timing #1


In the following two phrases, the timing of all control parameters was altered such that the first half of each phrase was decreased in duration by 25 percent and the latter half increased by 25 percent. The total duration of each phrase is again the same as the original.

Phrase: "He had a rabbit" - Modified timing #2


Phrase: "The brown cow" - Modified timing #2
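
The two timing modifications above amount to a remapping of the time axis. The sketch below shows one way to express that remapping, assuming a simple piecewise-linear warp; the function name `warp_time` and the way the warp would be applied to the control parameters are assumptions, since the text does not say how TubeTalker implements the change.

```python
def warp_time(t, total, first_half_scale=1.25):
    """Map an original time t (0..total) to its warped time.

    The first half of the utterance is stretched or compressed by
    first_half_scale, and the second half by the complementary amount,
    so the total duration is unchanged.
    """
    half = total / 2.0
    second_half_scale = 2.0 - first_half_scale  # keeps the two halves summing to total
    if t <= half:
        return t * first_half_scale
    return half * first_half_scale + (t - half) * second_half_scale

# "Modified timing #1": first half 25 percent longer, second half 25 percent shorter.
midpoint_1 = warp_time(0.5, 1.0, first_half_scale=1.25)

# "Modified timing #2": first half 25 percent shorter, second half 25 percent longer.
midpoint_2 = warp_time(0.5, 1.0, first_half_scale=0.75)
```

In both cases the end of the utterance maps to itself, so only the position of the original midpoint moves.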



---------------------------------------------------

Modifications to the voice source


In the following two phrases, the baseline separation of the vocal processes was increased from 0.1 cm to 0.15 cm. This change has the effect of allowing a greater non-oscillatory component of the glottal flow during voicing, and results in increased glottal turbulence. The perceptual effect is a breathier voice quality.

Phrase: "He had a rabbit" - Modified voice source


Phrase: "The brown cow" - Modified voice source
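
To see why a larger baseline separation leaves more of the glottis permanently open, consider the toy model below, in which the glottal width is the baseline separation plus a sinusoidal oscillation, clipped at zero. This is a deliberately simplified sketch and not the voice source model used in TubeTalker; the oscillation amplitude and the function names are hypothetical, and only the 0.1 cm and 0.15 cm separations come from the text.

```python
import math

OSCILLATION_AMPLITUDE_CM = 0.08  # hypothetical vibration amplitude, chosen for illustration

def glottal_width(phase, separation_cm):
    """Toy glottal width (cm) at a phase in [0, 1) of one vibratory cycle."""
    width = separation_cm + OSCILLATION_AMPLITUDE_CM * math.sin(2.0 * math.pi * phase)
    return max(0.0, width)  # the folds cannot overlap, so the width is clipped at zero

def non_oscillatory_width(separation_cm, samples=1000):
    """Smallest width over one cycle: the part of the opening that never closes."""
    return min(glottal_width(k / samples, separation_cm) for k in range(samples))

baseline = non_oscillatory_width(0.10)  # separation used for the original samples
breathy = non_oscillatory_width(0.15)   # separation used for the modified samples
```

In this toy model the never-closing part of the opening grows by the same 0.05 cm that was added to the separation, which is the kind of constant leak that would carry extra non-oscillatory flow.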



---------------------------------------------------

Modifications to the nasal coupling parameters (hypernasal)


In the following two phrases, the nasal coupling area was maintained at a minimum value of 0.2 cm² throughout the duration of each phrase. The effect is to nasalize all portions of the phrases, resulting in a hypernasal quality.

Phrase: "He had a rabbit" - Modified nasal coupling


Phrase: "The brown cow" - Modified nasal coupling
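
Holding the nasal coupling area at a minimum value can be pictured as putting a floor under the original nasal port trajectory. The sketch below assumes the trajectory is available as a list of areas sampled over time; the trajectory values and the function name are made up for illustration, and only the 0.2 cm² floor comes from the text.

```python
MIN_NASAL_AREA_CM2 = 0.2  # minimum coupling area stated in the text

def apply_hypernasal_floor(port_area_cm2):
    """Return a copy of the trajectory in which the port never closes below the floor."""
    return [max(area, MIN_NASAL_AREA_CM2) for area in port_area_cm2]

# Hypothetical trajectory: port closed, opened briefly for a nasal consonant, closed again.
original = [0.0, 0.0, 0.1, 0.5, 0.1, 0.0]
hypernasal = apply_hypernasal_floor(original)
```

Samples where the port was already open wider than the floor are left unchanged, so the nasal consonant itself is preserved while the rest of the phrase becomes nasalized.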



---------------------------------------------------

Modifications to the epilaryngeal tube


In the following two phrases, the entry area to the vocal tract was increased to effectively widen the epilaryngeal tube. This modification alters the voice quality in two ways: the first three formants are shifted slightly downward in frequency, and the glottal flow waveform is altered. The perceptual effect is a darker voice quality.

Phrase: "He had a rabbit" - Modified epilarynx


Phrase: "The brown cow" - Modified epilarynx



---------------------------------------------------

Modifications to the epilaryngeal tube and increase in vocal tract length


In the following two phrases, the entry area to the vocal tract was increased as in the previous example. In addition, the vocal tract length was increased to 18.5 cm.

Phrase: "He had a rabbit" - Modified epilarynx + increased VT length


Phrase: "The brown cow" - Modified epilarynx + increased VT length
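
The effect of vocal tract length alone can be estimated with the textbook quarter-wavelength formula for a uniform tube closed at the glottis and open at the lips, F_n = (2n - 1)c / (4L). The sketch below uses that idealization only to show the direction of the change; the real tract is not a uniform tube, and the 17.5 cm comparison length and the speed of sound are assumed values, not figures from the text.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # approximate value for warm, humid air (assumed)

def uniform_tube_resonances(length_cm, count=3):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]

shorter = uniform_tube_resonances(17.5)  # 17.5 cm is an assumed reference length
longer = uniform_tube_resonances(18.5)   # 18.5 cm is the length stated in the text
```

Because every resonance is inversely proportional to the length, lengthening the tube lowers all of them by the same percentage.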



---------------------------------------------------

Extra modifications not in the published paper


This version of "abracadabra" has an increased duration, a decreased fundamental frequency, a widened epilarynx, and a vocal tract length increased to 18.5 cm.

Word: "Abracadabra"


This is the same as the sample above, but has an added vocal tremor.

Word: "Abracadabra"
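
A vocal tremor is commonly described as a slow, roughly periodic fluctuation of the voice. One simple way to sketch it is as a sinusoidal modulation of F0, shown below. The text does not say how the tremor in this sample was generated, so the modulation form, the 5 Hz rate, the 5 percent depth, and the 100 Hz base F0 are all assumptions chosen for illustration.

```python
import math

def f0_with_tremor(t, base_f0_hz=100.0, rate_hz=5.0, depth=0.05):
    """F0 (Hz) at time t (s) with a sinusoidal tremor of the given rate and depth."""
    return base_f0_hz * (1.0 + depth * math.sin(2.0 * math.pi * rate_hz * t))

# Sample the contour every 10 ms over one second.
contour = [f0_with_tremor(k * 0.01) for k in range(100)]
```

With these assumed settings the F0 swings between 95 and 105 Hz five times per second.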