Performance evaluation of an open source speaker recognition system under the effect of speech compression in a real world VoIP environment

Asawaree Ajit Kulkarni, Purdue University

Abstract

Voice is a biometric modality well suited to telephonic applications. Voice over Internet Protocol (VoIP) is a form of telephony that transmits voice over packet-switched networks and has several advantages over traditional telephony. VoIP is characterized by its use of various codecs. A speech codec allows voice to be transmitted at lower bit rates. However, a codec can alter a speaker’s speech waveform, which in turn can affect the recognition performance of an Automatic Speaker Recognition (ASR) system. This research investigated the effect of two codecs, GSM (a lossy codec) and G.711 (a lossless codec), on the performance of an open source ASR system built on a pattern recognition framework known as MARF. The research addressed two questions: whether poor quality codecs affect speaker recognition performance, and whether a codec mismatch between the training and testing phases of the ASR system adversely affects its performance. Performance was measured in terms of False Rejection Rates (FRR). Three scenarios were considered, based on the number of enrolled subjects and the mode of operation of the ASR system (identification or verification). The MARF system worked with decoded audio streams, which retained the voice quality produced by the applied codec. The results showed that the available evidence was not sufficient to conclude that a poor quality codec adversely affected the performance of the ASR system; however, a codec mismatch decreased the performance of the system considerably. To date, investigators have primarily worked with existing voice databases. This research differed in that it collected voice samples in a real world VoIP environment.

Degree

M.S.

Advisors

Goldman, Purdue University.

Subject Area

Computer science
