RECOGNITION OF STOP CONSONANTS BY COMPUTER ANALYSIS (ACOUSTICS, PHONETICS, TRANSFORMS, SPEECH)

SARAH KATHRYN SMITH YODER, Purdue University

Abstract

Most speech recognition to date has been done with template matching of word or syllable characteristics. In such systems, a parametric representation of an unknown word is compared to a library of known words. This is not a feasible long-term approach to speech recognition, since the English language has approximately 300,000 words and approximately 60,000 syllables. A more reasonable approach is to recognize phonemes, the smallest contrasting units of speech sound, since English has approximately 40 phonemes and no language studied so far has more than 100. Phonemes combine to yield the words of a language. There are, however, still smaller units called allophones, which represent the distinct sounds the human mouth can make. These allophones are combined by rules, e.g., position of occurrence in syllables, to form phonemes. Some research has been done on the recognition of speech at the phoneme level. Many of the problems associated with recognition at the phoneme level seem to arise from confusion among the different allophones of a phoneme. For that reason, this research looks into the possibility of doing recognition at the phonetic level rather than at the phoneme, syllable, or word level.

Much work has been done on acoustic phonetic analysis of speech. This involves relating acoustic features (parameters that can be measured directly from a speech waveform) to phonetic labels (information pertaining to the phonetic composition of the speech). In the phonetic theories that have been proposed, the features due to the vocal tract configuration, e.g., manner and place of articulation, are important categories of speech sounds for the perception of speech, and in particular for the perception of phones. Of these phonetic features, place of articulation for one manner of articulation, namely stops, is especially difficult to identify from measurements on the speech signal.
This work pertains to the accurate recognition of the stop consonants /p,t,k,b,d,g/, independent of speaker and phonetic context. A technique using a Mellin-Fourier homomorphism has yielded good results on the word-initial stops in isolated words. The techniques found to work for word-initial stops in isolated speech are then examined for their applicability in identifying stops in word-initial, word-medial, and word-final positions in continuous speech, as well as in stop clusters in all three positions.
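
The abstract does not describe how the Mellin-Fourier homomorphism is computed, so the following is only a minimal sketch of the general idea behind Mellin-type transforms, not the dissertation's method. It assumes that a feature curve (here a made-up spectral envelope) differs between two utterances only by a scaling of its axis. Resampling the curve onto a logarithmic axis turns that scaling into a shift, and the magnitude of a Fourier transform is insensitive to shifts. The function names, the envelope shape, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def mellin_magnitude(values, axis, n_log=512):
    """Approximate the Mellin transform magnitude on the imaginary axis.

    `values` are samples of a curve at the positive, increasing points `axis`.
    Resampling onto a log-spaced grid turns a scaling of the axis into a
    shift, and the FFT magnitude is (approximately) unchanged by a shift.
    """
    log_axis = np.linspace(np.log(axis[0]), np.log(axis[-1]), n_log)
    resampled = np.interp(np.exp(log_axis), axis, values)
    return np.abs(np.fft.rfft(resampled))

def envelope(freqs, centre):
    """Hypothetical spectral envelope: one bump centred at `centre` Hz."""
    return np.exp(-0.5 * (np.log(freqs / centre) / 0.15) ** 2)

freqs = np.linspace(50.0, 8000.0, 4000)   # illustrative frequency axis in Hz
env_a = envelope(freqs, 1000.0)
env_b = envelope(freqs, 1200.0)           # same shape, axis scaled by 1.2

# The raw envelopes differ substantially ...
raw_difference = np.linalg.norm(env_a - env_b) / np.linalg.norm(env_a)

# ... while their Mellin-type magnitudes nearly coincide.
mel_a = mellin_magnitude(env_a, freqs)
mel_b = mellin_magnitude(env_b, freqs)
relative_difference = np.linalg.norm(mel_a - mel_b) / np.linalg.norm(mel_a)
```

In this toy setting the two magnitude vectors agree closely even though the raw envelopes do not, which is the kind of invariance that makes such transforms attractive for speaker-independent features. Real speech spectra are not exact scaled copies of one another, so the invariance would hold only approximately.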

Degree

Ph.D.

Subject Area

Electrical engineering; Artificial intelligence
