A Machine Learning Approach for Identifying the Effectiveness of Simulation Tools for Conceptual Understanding

Sindhura Elluri, Purdue University


Interactive learning environments have been identified as promising technologies for improving teaching and learning in science and engineering. Simulation tools in particular have become a vital part of coursework in both K-12 and higher education. It is therefore essential to identify better ways to integrate simulation tools into the classroom while providing teachers and students with feedback capabilities that support existing assessment methods and create opportunities for just-in-time teaching. One effective way to identify how students benefit from computer simulations for conceptual learning is to have them explain the phenomena being explored. However, this type of qualitative data is difficult to evaluate in a timely manner. As the volume of qualitative data in the form of open-ended responses grows, analysis by human experts becomes expensive and requires considerable manual effort and time. In this study, we used a machine learning technique to analyze students' responses to a set of open-ended questions on pretest and posttest assessments, in order to determine whether students' understanding of the concepts improved after using a computer-aided design (CAD) simulation tool called Energy 3D. Basic statistical analysis did not show any significant differences between pretest and posttest. Other clustering techniques, such as K-means and random clustering algorithms, did not reveal any significant patterns in the data. This study used random projection clustering to identify patterns in the annotated open-ended responses and to determine the characteristics of different student groups. The random projection clustering algorithm made it possible to cluster the data into diverse groups and to identify which clusters were statistically significant, making it easier to identify the most distinct groups. Many clusters were identified, and one of the significant clusters was analyzed to describe the characteristics of its groups. Two major groups were identified in this study: a stable group, which was high performing but showed no significant improvement on the posttest after the instructional intervention, and an improving group, which was low performing but showed significant improvement on the posttest after the instructional intervention.
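The general idea behind clustering with random projections can be illustrated with a minimal sketch. It is not the algorithm used in the study, whose details are not given in this abstract. It assumes each student is reduced to a hypothetical (pretest, posttest) score pair. The points are projected onto random one-dimensional directions, and the projection whose widest gap is largest relative to the overall spread is kept. The data, the function names, and the gap-based scoring rule are all assumptions made for illustration, and no significance test is performed.

```python
import random


def random_direction(dim, rng):
    """Draw a random unit vector by normalizing Gaussian components."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]


def project(points, direction):
    """Project each point onto the direction (dot product)."""
    return [sum(p * d for p, d in zip(point, direction)) for point in points]


def split_at_largest_gap(values):
    """Label values 0/1 by cutting the sorted values at the widest gap."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    gaps = [values[order[k + 1]] - values[order[k]] for k in range(len(order) - 1)]
    cut = max(range(len(gaps)), key=lambda k: gaps[k])
    labels = [0] * len(values)
    for i in order[cut + 1:]:
        labels[i] = 1
    return labels, gaps[cut]


def random_projection_split(points, n_trials=50, seed=0):
    """Try several random 1-D projections and keep the split whose gap
    is largest relative to the total spread of the projected values."""
    rng = random.Random(seed)
    dim = len(points[0])
    best_labels, best_score = None, -1.0
    for _ in range(n_trials):
        projected = project(points, random_direction(dim, rng))
        labels, gap = split_at_largest_gap(projected)
        spread = max(projected) - min(projected)
        score = gap / spread if spread > 0 else 0.0
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels, best_score


# Hypothetical (pretest, posttest) scores, invented for illustration only.
students = [
    (8, 8), (9, 9), (8, 9), (9, 8),  # high on both tests ("stable")
    (2, 6), (3, 7), (2, 7), (3, 6),  # low pretest, higher posttest ("improving")
]

labels, score = random_projection_split(students)
print(labels, round(score, 3))
```

On this toy data the first four students receive one label and the last four the other, mirroring the stable and improving groups described above. Real annotated responses would be higher dimensional and noisier, and would need a proper test of whether a split is statistically significant.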




Magana, Purdue University.

Subject Area

Information Technology|Computer science
