Speech recognition using statistical models and recurrent neural networks

Ruxin Chen, Purdue University

Abstract

This thesis addresses the problem of speech phone recognition. Phones are the acoustic sounds of speech. Statistical models of speech have been widely used for phone recognition. In this thesis, we propose a new model in which speech coarticulation, the effect of phonetic context on speech sounds, is modeled explicitly within a statistical framework. Unlike hidden Markov models, in which the current state depends only on the previous state and the current observation, the proposed model supports dependence on both the previous and next states and on both the previous and current observations. The degree of coarticulation between adjacent phones is modeled parametrically and can be adjusted according to a parameter representing the speaking rate. The model also incorporates a parameter that represents a frame-by-frame measure of confidence in the speech. We present a new efficient forward algorithm and novel system parameter estimation methods that aim directly at improving speech recognition performance. Experiments using the model clearly reveal the factors that affect the performance of speech recognition systems.

We also study speech phone recognition using recurrent neural networks. A general framework for recurrent neural networks and considerations for network training are discussed in detail. Many alternative energy measures and training methods are proposed and implemented. A speaker-independent phone recognition rate of 80%, with a frame error rate of 25%, has been achieved on the TIMIT database. Telephone-speech recognition experiments on the NTIMIT database yield a correct phone recognition rate of 68%. The research results in this thesis are competitive with the best results reported in the literature.
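For context, the sketch below shows the classical forward recursion for a standard hidden Markov model, the baseline that the proposed model generalizes. It is not the thesis's new forward algorithm (which adds coarticulation, speaking-rate, and confidence parameters); all names here are illustrative only.

```python
import numpy as np
from scipy.special import logsumexp

def hmm_forward(log_pi, log_A, log_B):
    """Log-likelihood of an observation sequence under a standard HMM.

    log_pi : (S,)   log initial-state probabilities
    log_A  : (S, S) log transitions, A[i, j] = P(s_t = j | s_{t-1} = i)
    log_B  : (T, S) per-frame log emission probabilities
    """
    T, _ = log_B.shape
    alpha = log_pi + log_B[0]  # alpha_1(j) = pi_j * b_j(o_1), in log space
    for t in range(1, T):
        # alpha_t(j) = logsum_i [alpha_{t-1}(i) + log A[i, j]] + log B[t, j]
        alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_B[t]
    return logsumexp(alpha)  # log P(o_1, ..., o_T)

# Toy usage: 3 states, 5 frames of randomly generated probabilities.
rng = np.random.default_rng(0)
pi = rng.dirichlet(np.ones(3))
A = rng.dirichlet(np.ones(3), size=3)
B = rng.dirichlet(np.ones(3), size=5)
print(hmm_forward(np.log(pi), np.log(A), np.log(B)))
```

Note that in this standard recursion the state at time t conditions only on the state at time t-1; the abstract's proposed model relaxes exactly this restriction.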

Degree

Ph.D.

Advisors

Jamieson, Purdue University.

Subject Area

Electrical engineering | Computer science
