Keywords
shape perception, ideal observer analysis, image-based modelling
Abstract
The visual representation of shape reduces a high-dimensional input to a smaller set of more informative features. These features can span a range of abstractions, from shallow features based on statistical summaries of images to deep features related to the generative causes of the shapes. Here we examined the depth of the visual system’s representation of shape by comparing human judgments of whether novel shapes appeared to belong to a common class against a range of models with different shape representations. Each shape class was based on a unique 2D base shape, formed by attaching parts of contours from different naturalistic shapes. We generated novel samples by transforming the base shape’s skeletal representation (Feldman and Singh, 2006) to produce new shapes with limbs that varied in length, width, spatial position, and orientation relative to the base shape. Multiple related classes were derived from each base shape using different distributions of parameter values. On each trial, observers judged whether a target shape was in the same class as a set of context shapes (either one or sixteen samples drawn from a particular shape class). Target shapes were samples taken either from the same shape class as the context or from one of the five related classes.
Participants performed remarkably well given the ill-posed nature of the task. Models based on shallow features (Euclidean distance and shape area) and on deep features (an ideal observer model with knowledge of the distribution of skeletal parts) were evaluated in terms of trial-by-trial consistency with the human data. In general, human responses indicated generalization beyond the context class and were best described by ideal and sub-optimal observer models, suggesting that shape features for novel object classes are an abstract version of the underlying deep features.
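As a rough illustration of the model comparison described above, the sketch below shows how a shallow Euclidean-distance model could produce same/different responses and how trial-by-trial consistency with human responses could be scored. It is a minimal sketch under stated assumptions, not the analysis used in the study: the abstract does not specify how shapes were encoded, how distances were turned into decisions, or how consistency was computed. The feature vectors, the `threshold` parameter, and all function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shallow_model_response(target, context, threshold):
    """Hypothetical shallow model: answer 'same class' (True) when the mean
    distance from the target to the context shapes is below a threshold.
    The decision rule and threshold are assumptions for illustration."""
    mean_dist = sum(euclidean_distance(target, c) for c in context) / len(context)
    return mean_dist < threshold


def trial_consistency(model_responses, human_responses):
    """Proportion of trials on which the model and the observer gave the
    same binary response (one possible definition of consistency)."""
    if len(model_responses) != len(human_responses):
        raise ValueError("response lists must have the same length")
    matches = sum(m == h for m, h in zip(model_responses, human_responses))
    return matches / len(model_responses)


# Toy data: one target compared against a two-shape context.
response = shallow_model_response([0, 0], [[3, 4], [0, 0]], threshold=3.0)
agreement = trial_consistency([True, False, True, True],
                              [True, True, True, False])
```

Under this definition, a model that matches the observer on every trial scores 1.0, so competing shallow and deep models can be ranked by the same measure.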
Start Date
19-5-2017 9:44 AM
End Date
19-5-2017 10:06 AM
Determining visual shape features for novel object classes