Anatomy of Speech Production
To build a model for machine speech, we first have to understand
how human speech production works, and the anatomy of speech production
is the natural place to start.
In general, a speech signal
is an air pressure wave that travels from the speaker's mouth to the
listener's ears. Figure 1 is a schematic of the anatomy of speech production.
The lungs produce the initial air pressure that is essential for the
speech signal; the pharyngeal cavity, oral cavity, and nasal cavity
shape the final output sound that is perceived as speech.
The pharyngeal cavity and
oral cavity (collectively known as the vocal tract) contract and relax
dynamically to create all sorts of sounds through resonance. The nasal
cavity opens an additional air passage to create what linguists call nasal sounds
(e.g., /m/, /n/). Together, these cavities characterize the sounds we