Development of a strategy to improve the accuracy and efficiency of computerized E-codes classification: Narrative coding assignments between human and computer

Chiew-zhi Chin, Purdue University


External cause of injury codes (E-codes) and Occupational Injury and Illness Classification System (OIICS) codes are useful for accident prevention analysis. However, the coding task has become burdensome because trained coders must code a huge number of text narratives. This study presents the use of a Naïve Bayes machine learning tool to classify large numbers of narrative texts, including the strategic assignment of tasks between human and computer in order to reduce erroneous decisions. Receiver Operating Characteristic (ROC) curves were used to identify the optimal operating region for each category, so that the resources required for manual coding are minimized. The results showed that by using the ROC-based reject rule to assign difficult narratives to manual coding, it was possible to obtain a final classification accuracy of 82 percent, 89 percent, and 94 percent, respectively, for filtering strategies in which 20 percent, 35 percent, and 49 percent of the narratives were manually coded.
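The division of labor described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the training narratives, the category labels, the function names, and the single fixed confidence threshold of 0.8 are all assumptions made for illustration, whereas the study derives its reject regions per category from ROC curves. The sketch trains a small multinomial Naïve Bayes model with add-one smoothing and sends any narrative whose top predicted probability falls below the threshold to a human coder.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model (counts only; smoothing is applied at prediction)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_proba(model, text):
    """Return a dict of posterior probabilities per category for one narrative."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Normalize in a numerically stable way
    peak = max(log_scores.values())
    exp_scores = {k: math.exp(v - peak) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


def route(model, text, threshold):
    """Reject rule: keep confident predictions, send the rest to manual coding."""
    probs = predict_proba(model, text)
    label = max(probs, key=probs.get)
    if probs[label] >= threshold:
        return ("computer", label)
    return ("human", None)


# Hypothetical training narratives and categories (not from the study's data)
narratives = [
    "worker fell from ladder",
    "employee slipped and fell on wet floor",
    "hand caught in machine",
    "finger caught in conveyor belt",
]
categories = ["fall", "fall", "caught", "caught"]
model = train_nb(narratives, categories)

THRESHOLD = 0.8  # assumed value; the study selects operating points from ROC curves
print(route(model, "fell from ladder", THRESHOLD))
print(route(model, "injured at work", THRESHOLD))
```

Raising the threshold sends more narratives to manual coding and should raise the accuracy of what the computer keeps, which is the trade-off the reported filtering strategies reflect.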




Mark R. Lehto, Purdue University.

Subject Area

Health Sciences, Occupational Health and Safety|Information Technology|Engineering, Industrial
