What can casual walkers tell us about a 3D scene?

This paper presents an approach for incrementally learning a 3D scene from a single static video camera. In particular, we exploit the presence of people walking casually through the scene to infer relative depth, learn shadows, and segment the critical ground structure. Because this type of video data is ubiquitous, the work is an important step towards 3D scene analysis from single cameras in readily available ordinary videos and movies. On-line 3D scene learning, as presented here, is important for applications such as scene analysis, foreground refinement, tracking, biometrics, automated camera collaboration, activity analysis, identification, and real-time computer graphics. The main contributions of this work are two-fold. First, we use the people in the scene to continuously learn and update the 3D scene parameters using an incremental robust (L1) error minimization. Second, models of shadows in the scene are learned using a statistical framework. A symbiotic relationship between the shadow model and the estimated scene geometry is exploited for incremental mutual improvement. We illustrate the effectiveness of the proposed framework with applications in foreground refinement, automatic segmentation and relative depth mapping of the floor/ground, and estimation of 3D trajectories of people in the scene. ©2007 IEEE.

  • Authors: Rother, D; Patwardhan, KA; Sapiro, G

  • Year: 2007

  • Published in: Proceedings of the IEEE International Conference on Computer Vision

  • DOI: 10.1109/ICCV.2007.4409083