Goethe University VSI DAGM GfKl CVL


Donald Geman

Johns Hopkins University, Baltimore


Image Interpretation by Entropy Pursuit


Image interpretation, which is effortless and instantaneous for people, is one of the grand challenges of artificial intelligence. The dream is to build a "description machine" which produces a rich semantic annotation of the underlying scene, including the names and poses of the objects that are present, as well as the actions taking place and the context. Mathematical frameworks are advanced from time to time, but none is yet widely accepted, and none clearly points the way to closing the gap with natural vision. After reviewing the general situation, I will outline an approach inspired by the efficiency of the divide-and-conquer strategy in games like "twenty questions" and by selective attention in natural vision. This leads to an information-theoretic, model-based framework for determining what evidence to acquire from multiple scales, locations and semantic resolutions, and for coherently integrating the evidence by updating likelihoods.
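
To make the "twenty questions" analogy concrete, here is a minimal sketch of greedy, entropy-driven question selection. It is not the speaker's implementation: the hypothesis set, the question functions and all names below are hypothetical, and the answers are assumed to be noiseless, whereas the framework described above integrates noisy evidence through a model. At each step the sketch asks the question whose answer is expected to leave the least uncertainty about the hypothesis, then updates the posterior.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping hypothesis -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_posterior_entropy(posterior, question):
    """Expected entropy remaining after observing the answer to `question`.

    Assumption: `question` is a deterministic function of the hypothesis.
    """
    groups = {}
    for h, p in posterior.items():
        groups.setdefault(question(h), {})[h] = p
    total = 0.0
    for group in groups.values():
        mass = sum(group.values())
        if mass > 0:
            conditional = {h: p / mass for h, p in group.items()}
            total += mass * entropy(conditional)
    return total

def update(posterior, question, answer):
    """Bayes update for a noiseless answer: keep consistent hypotheses, renormalize."""
    kept = {h: p for h, p in posterior.items() if question(h) == answer}
    mass = sum(kept.values())
    return {h: p / mass for h, p in kept.items()}

def identify(truth, hypotheses, questions):
    """Greedily ask questions until the posterior is concentrated on one hypothesis."""
    posterior = {h: 1.0 / len(hypotheses) for h in hypotheses}
    asked = []
    while entropy(posterior) > 1e-12:
        name = min(
            questions,
            key=lambda n: expected_posterior_entropy(posterior, questions[n]),
        )
        posterior = update(posterior, questions[name], questions[name](truth))
        asked.append(name)
    return asked, posterior

# Hypothetical toy problem: eight candidate interpretations labelled 0..7.
HYPOTHESES = list(range(8))
QUESTIONS = {
    "is_zero": lambda h: h == 0,      # unbalanced split, rarely informative
    "bit0": lambda h: (h >> 0) & 1,   # balanced splits, one bit each
    "bit1": lambda h: (h >> 1) & 1,
    "bit2": lambda h: (h >> 2) & 1,
}

if __name__ == "__main__":
    print(identify(5, HYPOTHESES, QUESTIONS))
```

In this toy setting the greedy rule prefers the balanced questions over the unbalanced one, which is the divide-and-conquer behaviour the analogy refers to.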
