Query translation-based cross-language diagnosis for nonnative English users

Pilsung Choe, Purdue University

Abstract

Many companies have developed websites that their customers can use to diagnose product problems. However, nonnative English users often find it difficult to read, understand, and follow information written in English. In this study, a query translation-based cross-language diagnosis (Q-CLD) tool that assists nonnative English users in diagnosing print defects was developed and evaluated. The first step in developing the Q-CLD tool involved collecting print defect descriptions in Korean and English from two groups of subjects: 40 from five universities in Korea and the remaining 40 from Purdue University. In the next step, three fuzzy Bayesian models were developed: the first was based on English descriptions provided by English-speaking subjects (referred to as the native English model); the second used English descriptions provided by Korean subjects (referred to as the nonnative English model); and the third used Korean descriptions provided by Korean subjects (referred to as the Korean model). Performance of the models was then evaluated using five types of input descriptions: English descriptions given by English subjects (referred to as the native English descriptions); English descriptions given by Korean subjects (referred to as the nonnative English descriptions); Korean descriptions given by Korean subjects (referred to as the Korean descriptions); descriptions translated from Korean into English using the Google translator (referred to as the Google translations); and descriptions translated from Korean into English using a keywords matching method developed in this dissertation (referred to as the keywords matching translations). Native English descriptions, Google translations, and keywords matching translations were used as inputs to evaluate the native English model. Korean descriptions were used as inputs to evaluate the Korean model. Nonnative English descriptions were used as inputs to evaluate both the native English model and the nonnative English model.
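The abstract does not specify how the keywords matching method works. As a rough illustration only, one simple way to translate a Korean query by keyword lookup is sketched below. The dictionary entries, the name `KEYWORD_MAP`, and the function `translate_by_keywords` are all hypothetical and are not taken from the dissertation.

```python
# Hypothetical Korean-to-English keyword dictionary (illustrative entries only,
# not the vocabulary used in the dissertation).
KEYWORD_MAP = {
    "줄": "line",
    "얼룩": "smear",
    "흐림": "faded",
    "세로": "vertical",
    "가로": "horizontal",
    "검은": "black",
}


def translate_by_keywords(query, keyword_map=KEYWORD_MAP):
    """Return the English keywords for every dictionary entry found in the query.

    Matching is plain substring search, and the result is ordered by where each
    Korean keyword first appears in the query.
    """
    found = []
    for korean, english in keyword_map.items():
        position = query.find(korean)
        if position != -1:
            found.append((position, english))
    found.sort()
    return [english for _, english in found]


# A Korean query describing "vertical black line".
english_keywords = translate_by_keywords("세로 검은 줄")
```

In a pipeline like the one described, the resulting English keywords would then be passed as the input description to the native English model. Plain substring matching is a simplification; a real system would likely need tokenization and handling of inflected forms.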
The native English model using the native English descriptions gave the most accurate predictions of the tested models. In this case, the native English model correctly predicted 45% of the print defects with its top prediction, and in 87% of the cases the actual defect was one of the top five predictions. The keywords matching translations were nearly as accurate as the native English descriptions. Using the keywords matching translations, the native English model correctly predicted 37% of the print defects with its top prediction, and in 80% of the cases the actual defect was one of the top five predictions. Both were better than the predictions of the other tested models. The query translation-based Korean-English cross-language diagnosis (Q-KE-CLD) tool for print quality troubleshooting was then implemented and evaluated through a human factors experiment conducted in four universities in South Korea. The experimental results showed that Korean subjects both generated queries faster (p = 0.008) and identified print defects faster (p = 0.051) when they entered Korean queries. In addition, the subjects rated Korean queries as being easier to generate (p = 0.004). Untrained subjects reported that use of the Korean language made it easier to generate queries and identify print defects. Overall, the Q-KE-CLD tool resulted in quicker identification of print defects at all user levels.
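The accuracy figures above are of the top-1 and top-5 kind: the share of cases in which the actual defect is the model's first prediction, or appears among its first five. A minimal sketch of that calculation follows, assuming each model output is a ranked list of defect labels. The function name `top_k_accuracy` and the toy labels are illustrative and are not data from the study.

```python
def top_k_accuracy(ranked_predictions, actual_labels, k):
    """Fraction of cases whose actual label is among the first k predictions."""
    if len(ranked_predictions) != len(actual_labels):
        raise ValueError("predictions and labels must have the same length")
    if not actual_labels:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for ranked, actual in zip(ranked_predictions, actual_labels)
        if actual in ranked[:k]
    )
    return hits / len(actual_labels)


# Toy example with made-up defect labels: four cases, three candidates each.
ranked = [
    ["banding", "ghosting", "streak"],
    ["streak", "banding", "ghosting"],
    ["ghosting", "streak", "banding"],
    ["banding", "streak", "ghosting"],
]
actual = ["banding", "banding", "banding", "ghosting"]

top1 = top_k_accuracy(ranked, actual, k=1)  # only the first case is a hit
top2 = top_k_accuracy(ranked, actual, k=2)  # the first two cases are hits
```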

Degree

Ph.D.

Advisors

Lehto, Purdue University.

Subject Area

Industrial engineering
