Statistical Analysis When the Data is an Image: Eliciting Student Thinking About Sampling and Variability
Within statistics education, there is growing interest in how students apply their understanding of variability and sampling, given the relative lack of research in either area (Shaughnessy, 2007). The task examined in this paper elicited students' knowledge of these concepts within a small-group problem-solving task completed by teams of first-year engineering students. In the Nanoroughness task, teams of students designed a procedure for quantifying the roughness of a material surface using digital images generated by atomic force microscopy. The procedure required students to apply statistical methods in order to aggregate the data. The focus of this article is the subsequent analysis of the responses to the task and the questions raised by that analysis.
The Nanoroughness task is unique and critical as a statistical modeling task for two reasons. First, the students needed to use statistical measures to develop a measure that would describe a qualitative characteristic (roughness) without any prompting as to what statistical procedures were relevant. There are different ways to conceptualize the roughness of a surface. Sandpaper’s roughness depends on the grain size of the sand. A road may be rough if it has randomly occurring large holes but smoother if the bumps are evenly distributed. The challenge in developing quantitative measures to define qualitative characteristics is that different quantitative analyses emphasize different variables, so the students needed to both select and apply statistical procedures relevant to the context. For instance, determining which member of a set is the "most rough" or the "least rough" will depend on which measurements were selected and how those measurements were analyzed. The second unique characteristic of the task is that the students also needed to define a sampling procedure for an image that would facilitate quantifying the variability in the surface portrayed in the digital image. Usually when students need to take measurements of a population, the population is a discrete set of objects. In this case, the data set was a continuous surface. From the data set, the students needed to determine the relevant population (e.g., every point on the surface, every peak on the surface, or peaks and valleys together). Such continuous populations are not unique within engineering and the sciences; they occur in a variety of contexts where characteristics need to be measured and operationally defined.
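As an illustration only, and not drawn from the student responses or the task materials, the following minimal Python sketch shows how the choice of measure can change which surface counts as "most rough." The two height grids (`bumpy` and `holes`), the function names, and the two measures (standard deviation of heights and peak-to-valley range) are all hypothetical choices made for this example; real atomic force microscopy images would be far larger and noisier.

```python
import random
import statistics


def flatten(image):
    """Treat every pixel height in a 2-D grid as the population."""
    return [h for row in image for h in row]


def sample_heights(image, n_points, seed=0):
    """Draw a simple random sample (with replacement) of pixel heights."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    return [image[rng.randrange(rows)][rng.randrange(cols)]
            for _ in range(n_points)]


def sd_roughness(heights):
    """Standard deviation of heights about their mean."""
    return statistics.pstdev(heights)


def peak_to_valley(heights):
    """Distance between the highest and lowest measured points."""
    return max(heights) - min(heights)


SIZE = 20

# Hypothetical surface 1: evenly distributed bumps (checkerboard of 0 and 4).
bumpy = [[4.0 if (r + c) % 2 == 0 else 0.0 for c in range(SIZE)]
         for r in range(SIZE)]

# Hypothetical surface 2: flat, with three isolated deep holes.
holes = [[0.0] * SIZE for _ in range(SIZE)]
for r, c in [(3, 3), (10, 15), (16, 7)]:
    holes[r][c] = -10.0

for name, surface in [("bumpy", bumpy), ("holes", holes)]:
    population = flatten(surface)
    sample = sample_heights(surface, n_points=50)
    print(name,
          "| population SD:", round(sd_roughness(population), 3),
          "| population peak-to-valley:", peak_to_valley(population),
          "| sample peak-to-valley:", peak_to_valley(sample))
```

Using every pixel, the evenly bumpy surface has the larger standard deviation, while the surface with isolated holes has the larger peak-to-valley range, so the two measures rank the surfaces in opposite orders. A small random sample of points may also miss the rare holes entirely, which is one reason the sampling procedure matters as much as the statistic.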
The task was implemented in a first-year engineering course that served as an introduction to basic tools of engineering, with an emphasis on MATLAB® and Excel® as technological tools. The Nanoroughness task was used in the course to introduce students to the real work of engineers, who must not only calculate statistics but also analyze and interpret the results. Our research asked a two-part question. First, what is the quality of student responses to the Nanoroughness task? To answer this, we looked at the viability of the models the teams had created and how well they had explained their procedures for comparing the roughness of images. Second, what statistical models were elicited by the task? We specifically looked at the sampling methods students used and then at how the students analyzed the data sets they had created. In this paper, we describe the quantitative and qualitative analyses we completed of a sample of student responses.
Hjalmarson, M.A., Moore, T.J., & delMas, R. (2011). Statistical analysis when the data is an image: Eliciting student thinking about sampling and variability. Statistics Education Research Journal, 10(1), 15-34.