Invited Talk,
Wednesday, 4.9., 13:45-14:45, Günter Hotz Lecture Hall

The Three R's of Computer Vision: Recognition, Reconstruction and Reorganization

Jitendra Malik
UC Berkeley


Over the last two decades, we have seen remarkable progress in computer vision, with demonstrations of capabilities such as face detection, handwritten digit recognition, reconstructing three-dimensional models of cities, automated monitoring of activities, segmenting out organs or tissues in biological images, and sensing for control of robots and cars. Yet there are many problems where computers still perform significantly worse than human perception. For example, in the recent PASCAL benchmark challenge on visual object detection, the average precision for most 3D object categories was under 50%.

I will argue that further progress on the classic problems of computational vision (recognition, reconstruction and reorganization) requires us to study the interaction among these processes. For example, recognition of 3D objects benefits from a preliminary reconstruction of 3D structure, rather than being treated purely as a 2D pattern classification problem. Recognition is also reciprocally linked to reorganization, with bottom-up grouping processes generating candidates, which interact with top-down activations of object and part detectors. In this talk, I will show some of the progress we have made towards the goal of a unified framework for the 3R's of computer vision. I will also point towards some of the exciting applications we may expect over the next decade as computer vision starts to deliver on even more of its grand promise.