Neurophysiological Mechanisms of Speech Intelligibility Under Masking and Distortion

Vibha Viswanathan, Purdue University


Difficulty understanding speech in background noise is the most common hearing complaint. Elucidating the neurophysiological mechanisms underlying speech intelligibility in everyday environments with multiple sound sources and distortions is hence important for any technology that aims to improve real-world listening. Using a combination of behavioral, electroencephalography (EEG), and computational modeling experiments, this dissertation provides insight into how the brain analyzes such complex scenes, and what roles different acoustic cues play in facilitating this process and in conveying phonetic content. Experiment #1 showed that brain oscillations selectively track the temporal envelopes (i.e., modulations) of attended speech in a mixture of competing talkers, and that the strength and pattern of this attention effect differ between individuals. Experiment #2 showed that the fidelity of neural tracking of attended-speech envelopes is strongly shaped by the modulations in interfering sounds as well as the temporal fine structure (TFS) conveyed by the cochlea, and predicts speech intelligibility in diverse listening environments. Results from Experiments #1 and #2 support the theory that temporal coherence of sound elements across envelopes and/or TFS shapes scene analysis and speech intelligibility. Experiment #3 tested this theory further by measuring and computationally modeling consonant categorization behavior in a range of background noises and distortions. We found that a physiologically plausible model that incorporated temporal-coherence effects predicted consonant confusions better than conventional speech-intelligibility models, providing independent evidence that temporal coherence influences scene analysis. Finally, results from Experiment #3 showed that TFS is used to extract speech content (voicing) for consonant categorization even when intact envelope cues are available.
Together, the novel insights provided by our results can guide future models of speech intelligibility and scene analysis, clinical diagnostics, improved assistive listening devices, and other audio technologies.




Shinn-Cunningham, Purdue University.

Subject Area

Biomedical engineering|Neurosciences
