The visual recognition problem is central to artificial intelligence. From robotics to information retrieval, many applications demand the ability to identify and localize categories, places, and objects. This tutorial gives an overview of computer vision algorithms for visual object recognition and image classification. We will introduce the primary representations and learning approaches, with an emphasis on recent advances in the field. The target audience consists of researchers and students working in AI and robotics who would like to understand what methods and representations are available for these problems. Our intent is for attendees to walk away understanding what can and cannot be done reliably today, and with key concepts that they can apply in their own systems or research.
Topics will include recognizing specific objects versus recognizing visual categories; global representations and sliding-window classifiers; local invariant features: detection and description; pose clustering, voting, Hough transform; indexing features, visual vocabularies; bag-of-words representations; constellation models, part-based models; implicit shape models; distance measures and kernels; supervised and weakly supervised model learning; and current challenges and research directions.
We will assume some familiarity with probability, basic machine learning ideas, and linear algebra.
Please refer to the conference page in case of last-minute schedule changes.
| Time | Topic | Slides | Additional Material |
|---|---|---|---|
| 9:00h | Introduction | ppt, pdf (1/page), pdf (6/page) | |
| | Sliding-window Approaches for Object Detection | ppt, pdf (1/page), pdf (6/page) | Viola-Jones detector code (OpenCV); Dalal-Triggs pedestrian detector; Felzenszwalb-Ramanan pedestrian detector; "Hello! My name is... Buffy" video; Simple boosting classifier (by Fergus/FeiFei/Torralba) |
| | Local Invariant Features: Detection & Description | ppt, pdf (1/page), pdf (6/page) | Oxford Interest Point Webpage; David Lowe's SIFT; UNC GPU-SIFT implementation; Herbert Bay's SURF; Nico Cornelis's GPU-SURF |
| | Recognition of Specific Objects with Local Features | ppt, pdf (1/page), pdf (6/page) | kooaba movie poster recognition demo; Oxford web-scale object search demo |
| 10:30h | -- Coffee Break -- | | |
| 11:00h | Indexing Features, Bag-of-Words Categorization | ppt, pdf (1/page), pdf (6/page) | Simple BoW classifiers (by Fergus/FeiFei/Torralba); Oxford Video Google demo; pLSA code |
| | Matching Local Feature Sets | ppt, pdf (1/page), pdf (6/page) | Pyramid match kernel code; Shape context matching code |
| | Part-based Methods for Object Categorization | ppt, pdf (1/page), pdf (6/page) | Simple parts-and-structure detector (by Fergus/FeiFei/Torralba); ISM detector code; Cow detection/segmentation video; Pedestrian detection video |
| | Current Challenges and Research Directions | ppt, pdf (1/page), pdf (6/page) | Datasets: CalTech101, CalTech256, Pascal VOC, LabelMe; Car detection & 3D localization videos; Mobile pedestrian detection & tracking videos; Combined recognition & reconstruction video |
| 13:00h | -- End of Tutorial -- | References: doc, pdf | |