Keywords

tilt, slant, surface orientation, Bayesian estimation, natural scene statistics

Abstract

The ability of the human visual system to estimate 3D surface orientation from 2D retinal images is critical, but the computations underlying 3D orientation estimation in real-world scenes are not fully understood. A Bayes-optimal model grounded in natural scene statistics has explained 3D surface tilt estimation by human observers in natural scenes (Kim and Burge, 2018). However, that model is limited because it estimates only unsigned tilt (tilt modulo 180°). We extend the model to predict signed tilt estimates and compare them with human signed estimates. The model takes image pixels as input and produces optimal tilt estimates as output, using the joint statistics of tilt and image cues in natural scenes. The image cues to tilt are the directions of the luminance, texture, and disparity gradients in a local image region. To estimate signed tilt, the disparity cue is used as a signed tilt cue, and the luminance and texture cues are used as unsigned tilt cues. Given a particular set of local image cues, the model computes the minimum mean squared error (MMSE) estimate, which equals the mean of the posterior distribution over signed tilt. We found that the signed MMSE estimates aligned well with human signed tilt estimates on an identical set of stimuli. Next, we pooled the local MMSE estimates across space to obtain a global tilt estimate. Because the local MMSE estimates are unbiased predictors of ground-truth tilt with nearly equal reliability, the pooled global estimates are also near-optimal. The global estimates explained human tilt estimation even better. We conclude that this computational model provides a tool for understanding how the human visual system makes the best use of 2D image information to compute local estimates of 3D surface tilt in complex natural scenes and integrates them into a global estimate.

Start Date

17-5-2018 9:25 AM

End Date

17-5-2018 9:50 AM


Global Estimation of Signed 3D Surface Tilt from Natural Images
