Keywords

visual search; object recognition; dissimilarity

Abstract

Despite advances in computation and machine learning, computers are still far behind humans in vision. This is most likely because humans use a sophisticated object representation that is very different from those used in computers today. Another challenge is that object representations in computer vision and human vision have not been systematically compared on the same objects. To address this issue, we measured perceptual dissimilarity between objects in humans using visual search (taking search difficulty as an index of target-distractor similarity). We then compared these observed dissimilarities against the dissimilarities predicted by a large number of state-of-the-art computational models of shape (e.g., Fourier descriptors, HMAX, Gabor filters, spatial pyramids). In general, computational models were able to explain perceptual dissimilarity to a reasonable degree (r = 0.7-0.8 depending on the shape set). More interestingly, there were systematic deviations between all models and perceptual dissimilarity: for some pairs of objects, perceptual dissimilarity was greater than predicted by every model, whereas for other pairs, it was smaller. These systematic deviations are indicative of what is lacking in nearly all computational models of shape. Specifically, we propose that computational models of shape must incorporate some form of parts-based representation to account for the unexplained variation. We will also preview related work (to be presented at the main VSS meeting) in which we show how object dissimilarity can be understood in terms of dissimilarities between parts.
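The model-versus-perception comparison described above can be sketched in a few lines: compute pairwise dissimilarities from a model's feature vectors and correlate them with perceptual dissimilarities measured for the same object pairs. This is a minimal illustration with synthetic data; the feature matrix and perceptual values are placeholders, not the study's actual stimuli or measurements (in the study, model features would come from, e.g., HMAX or Fourier descriptors, and perceptual dissimilarity from visual-search difficulty).

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical model features: one row of shape-descriptor values per object.
model_features = rng.random((10, 64))  # 10 objects, 64-dimensional features

# Model-predicted dissimilarity: Euclidean distance between feature vectors,
# one value per object pair (10 choose 2 = 45 pairs).
model_dissim = pdist(model_features, metric="euclidean")

# Hypothetical perceptual dissimilarities for the same 45 pairs
# (simulated here as the model prediction plus noise).
perceptual_dissim = model_dissim + rng.normal(0, 0.5, size=model_dissim.shape)

# Agreement between model and perception, analogous to the reported r = 0.7-0.8.
r, p = pearsonr(model_dissim, perceptual_dissim)
print(f"correlation r = {r:.2f}")
```

Pairs where the residual (perceptual minus predicted dissimilarity) is consistently positive or negative across many models would correspond to the systematic deviations the abstract describes.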

Start Date

13-5-2015 2:00 PM

End Date

13-5-2015 2:25 PM

Session Number

02

Session Title

Shape and Form


Can Computational Models of Shape Explain Object Perception?
