Abstract

Deep neural networks (DNNs) have become influential computational models of human vision, particularly in explaining neural responses in the ventral stream. However, they frequently diverge from well-established findings in psychophysics, especially with regard to human perceptual biases, robustness, and generalization behavior. Much of the progress in aligning DNNs with human perception has focused on the task of core object recognition—the rapid identification of objects in static images (DiCarlo et al., 2012). In contrast, other critical dimensions of visual perception, such as motion processing and multi-object scene understanding, remain comparatively underexplored. In this talk, I present our recent work on modeling how motion supports perceptual organization and discuss both the prospects and limitations of using DNNs as scientific models of dynamic scene perception. I argue that geometric aspects of visual experience—particularly motion and depth—offer a promising path forward for bridging human and machine vision.

Keywords

DNNs, Motion, Perceptual Organization, Scene Perception

Start Date

15-5-2025 11:30 AM

End Date

15-5-2025 12:00 PM


Beyond Core Object Recognition: DNNs as Models of Dynamic Scene Perception
