AAAI'08 Tutorial on Visual Recognition

Kristen Grauman (University of Texas at Austin) and Bastian Leibe (ETH Zurich)

The visual recognition problem is central to artificial intelligence. From robotics to information retrieval, many desired applications demand the ability to identify and localize categories, places, and objects. This tutorial will give an overview of computer vision algorithms for visual object recognition and image classification. We will introduce primary representations and learning approaches, with an emphasis on recent advances in the field. The target audience consists of researchers and students working in AI and robotics who would like to understand what methods and representations are available for these problems. Our intent is for attendees to walk away understanding what can and cannot be done reliably today, and to gain key concepts that could be employed in their own systems or research.

Topics will include recognizing specific objects versus recognizing visual categories; global representations and sliding-window classifiers; local invariant features: detection and description; pose clustering, voting, Hough transform; indexing features, visual vocabularies; bag-of-words representations; constellation models, part-based models; implicit shape models; distance measures and kernels; supervised and weakly supervised model learning; and current challenges and research directions.

We will assume some familiarity with probability, basic machine learning ideas, and linear algebra.

Syllabus

Please refer to the conference page in case of last-minute schedule changes.

9:00h Introduction
  • Recognizing specific objects vs. recognizing object categories
  • Challenges of recognition
  Slides: ppt, pdf (1/page), pdf (6/page)

Sliding-window Approaches for Object Detection
  • Detection via classification
  • Global appearance description
  • Face detection with Boosting (Viola & Jones)
  • Pedestrian detection
  • Limitations
  Slides: ppt, pdf (1/page), pdf (6/page)
  Additional material: Viola-Jones detector code (OpenCV); Dalal-Triggs pedestrian detector; Felzenszwalb-Ramanan pedestrian detector; "Hello! My name is... Buffy" video; Simple boosting classifier (by Fergus/FeiFei/Torralba)

Local Invariant Features: Detection & Description
  • Local feature detectors
  • Scale-invariant feature detection
  • Local descriptors
  Slides: ppt, pdf (1/page), pdf (6/page)
  Additional material: Oxford Interest Point Webpage; David Lowe's SIFT; UNC GPU-SIFT implementation; Herbert Bay's SURF; Nico Cornelis's GPU-SURF

Recognition of Specific Objects with Local Features
  • Generalized Hough transform
  • RANSAC
  • Applications
  Slides: ppt, pdf (1/page), pdf (6/page)
  Additional material: kooaba movie poster recognition demo; Oxford web-scale object search demo

10:30h -- Coffee Break --

11:00h Indexing Features, Bag-of-Words Categorization
  • Indexing local features
  • Feature clustering, visual words
  • Recognition with a vocabulary tree
  • Recognition with bag-of-words
  Slides: ppt, pdf (1/page), pdf (6/page)
  Additional material: Simple BoW classifiers (by Fergus/FeiFei/Torralba); Oxford Video Google demo; pLSA code

Matching Local Feature Sets
  • Pyramid match kernel
  • Spatial pyramid match kernel
  • Matching smoothness & local geometry
  • Distance function learning
  Slides: ppt, pdf (1/page), pdf (6/page)
  Additional material: Pyramid match kernel code; Shape context matching code

Part-based Methods for Object Categorization
  • Constellation Model (Fergus, Zisserman, Perona)
  • Implicit Shape Model (Leibe & Schiele)
  • Connection with segmentation
  Slides: ppt, pdf (1/page), pdf (6/page)
  Additional material: Simple parts-and-structure detector (by Fergus/FeiFei/Torralba); ISM detector code; Cow detection/segmentation video; Pedestrian detection video

Current Challenges and Research Directions
  Slides: ppt, pdf (1/page), pdf (6/page)
  Additional material: Datasets: CalTech101, CalTech256, Pascal VOC, LabelMe; Car detection & 3D localization videos; Mobile pedestrian detection & tracking videos; Combined recognition & reconstruction video

13:00h -- End of Tutorial --

References: doc, pdf

Bastian Leibe
Last modified: Mon Jun 9 20:35:26 CEST 2008