
Large-scale knowledge transfer for object localization in ImageNet

Matthieu Guillaumin, Vittorio Ferrari
IEEE Conference on Computer Vision & Pattern Recognition (CVPR)
Providence, RI, June 2012


ImageNet is a large-scale database of object classes with millions of images. Unfortunately only a small fraction of them is manually annotated with bounding-boxes. This prevents useful developments, such as learning reliable object detectors for thousands of classes. In this paper we propose to automatically populate ImageNet with many more bounding-boxes, by leveraging existing manual annotations. The key idea is to localize objects of a target class for which annotations are not available, by transferring knowledge from related source classes with available annotations. We distinguish two kinds of source classes: ancestors and siblings. Each source provides knowledge about the plausible location, appearance and context of the target objects, which induces a probability distribution over windows in images of the target class. We learn to combine these distributions so as to maximize the location accuracy of the most probable window. Finally, we employ the combined distribution in a procedure to jointly localize objects in all images of the target class. Through experiments on 0.5 million images from 219 classes we show that our technique (i) annotates a wide range of classes with bounding-boxes; (ii) effectively exploits the hierarchical structure of ImageNet, since all sources and types of knowledge we propose contribute to the results; (iii) scales efficiently.
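The combination step described in the abstract can be illustrated with a small sketch. The abstract does not state the functional form of the combination, so the code below assumes a weighted log-linear (product-of-experts) combination purely for illustration; the function names, the source labels `ancestor` and `sibling`, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def combine_window_scores(source_probs, weights):
    """Combine per-source window distributions and renormalize.

    Assumption (not from the paper): a weighted log-linear combination.
    source_probs: dict of source name -> list of probabilities, one per window.
    weights: dict of source name -> non-negative weight (hypothetical values).
    """
    names = list(source_probs)
    n_windows = len(source_probs[names[0]])
    eps = 1e-12  # guards against log(0)
    log_scores = [
        sum(weights[name] * math.log(source_probs[name][i] + eps) for name in names)
        for i in range(n_windows)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalized = [math.exp(score - top) for score in log_scores]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


def most_probable_window(combined):
    """Return the index of the window with the highest combined probability."""
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    # Toy distributions over three candidate windows (made-up numbers).
    probs = {
        "ancestor": [0.2, 0.5, 0.3],
        "sibling": [0.1, 0.3, 0.6],
    }
    combined = combine_window_scores(probs, {"ancestor": 1.0, "sibling": 1.0})
    print(combined, most_probable_window(combined))
```

In this toy setting, changing the weights changes which window wins, which is why the weights would be learned against a localization-accuracy objective rather than fixed by hand.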

  author = {Matthieu Guillaumin and Vittorio Ferrari},
  title = {Large-scale knowledge transfer for object localization in ImageNet},
  booktitle = {IEEE Conference on Computer Vision \& Pattern Recognition (CVPR)},
  year = {2012},
  month = {June},
  keywords = {}