beyond semi-supervised tracking

Beyond Semi-Supervised Tracking: Tracking Should Be as Simple as Detection, but not Simpler than Recognition
S. Stalder, H. Grabner, and L. Van Gool
In Proceedings ICCV’09 WS on On-line Learning for Computer Vision, 2009


 

 

We present a multiple classifier system for model-free tracking. The tasks of detection (finding the object of interest), recognition (distinguishing similar objects in a scene), and tracking (retrieving the object to be tracked) are split into separate classifiers in the spirit of simplifying each classification task. The supervised and semi-supervised classifiers are carefully trained on-line in order to increase adaptivity while limiting the accumulation of errors, i.e., drifting. In the experiments, we demonstrate real-time tracking on several challenging sequences, including multi-object tracking of faces, humans, and other objects. We outperform other on-line tracking methods, especially in the case of occlusions and in the presence of similar objects.

videos

texture patch in front of similar texture

Tracking a texture patch in front of very similar texture.

 

Comparison between:

  • Yellow "New": Beyond Semi-Supervised Tracker

  • Red "Semi": Semi-Supervised Tracker

  • Blue "On": On-line Boosting Tracker

texture patch partially occluded 

Tracking a texture patch which gets occluded.

Comparison between:

  • Yellow "New": Beyond Semi-Supervised Tracker

  • Red "Semi": Semi-Supervised Tracker

  • Blue "On": On-line Boosting Tracker
     

static occlusion

 

This experiment shows that static occluders are implicitly ignored by the classifiers. A one-shot detector is trained in the first frame to track the toy, and local detectors are trained during tracking. The locations of the three most important Haar-like features of the local detector at the tracked position are shown. The static occluder (green cotter) appears both in the current image and in the background image, which the local detector has to distinguish. Thus, no discriminant features are selected on the occluder.
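The reasoning behind the static-occlusion result can be illustrated with a toy example. The sketch below is not the boosting-based feature selection used by the tracker; it only assumes that each feature produces one scalar response on the object patch and one on the background patch, and it ranks features by the absolute difference between the two. The function name, the response values, and the ranking criterion are all hypothetical.

```python
from typing import List, Sequence


def top_discriminative(object_responses: Sequence[float],
                       background_responses: Sequence[float],
                       k: int = 3) -> List[int]:
    """Return the indices of the k features whose responses differ most
    between the object patch and the background patch.

    Simplified stand-in for feature selection (an assumption, not the
    paper's method): the score is the absolute response difference.
    """
    scores = [abs(o - b) for o, b in zip(object_responses, background_responses)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical responses. Features 0 and 1 lie on the static occluder, so they
# respond identically in the current image and in the background image.
obj = [0.8, 0.5, 0.9, 0.1, 0.7, 0.4]
bg = [0.8, 0.5, 0.2, 0.6, 0.1, 0.3]

selected = top_discriminative(obj, bg, k=3)
```

Because the occluder looks the same in both images, the features placed on it score zero and are never among the selected ones.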

long-term tracking

 

This experiment shows long-term tracking of a static object whose appearance changes significantly over 24 hours. The on-line tracker fails because of drifting, whereas the semi-supervised tracker fails because its fixed prior cannot properly handle the large appearance changes. The proposed tracker is more adaptive to appearance changes than the semi-supervised one, yet it does not drift.
 

multiple object tracking with re-identification

 

In this experiment, we take a face detector and track two persons (Dev Patel and Freida Pinto in a talk about their movie Slumdog Millionaire). Temporal gaps are bridged by re-identification.
The emphasis is on matching the identities when they reappear in a similar pose. The longer the track, the more likely a successful re-identification becomes, as more information about the identities and their poses can be accumulated. The identity matching is done very conservatively, i.e., only if one identifier has a significantly higher response than all others. Sometimes a re-identification fails because the appearance is too different (as with id2 and id3). However, both actors can be successfully matched to their initial tracks without confusing the identities.
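The conservative matching rule can be sketched as follows. This is a minimal illustration, assuming each stored identifier returns one scalar response for the reappearing detection. The function name and the `margin` and `min_response` parameters are hypothetical and their values are not taken from the paper.

```python
from typing import Optional, Sequence


def conservative_match(responses: Sequence[float],
                       margin: float = 0.2,
                       min_response: float = 0.0) -> Optional[int]:
    """Return the index of the matching identifier, or None if ambiguous.

    A match is accepted only if the best response reaches min_response and
    exceeds every other response by at least margin (both thresholds are
    assumptions made for this sketch).
    """
    if not responses:
        return None
    best = max(range(len(responses)), key=lambda i: responses[i])
    if responses[best] < min_response:
        return None
    # Highest response among all other identifiers (-inf if there are none).
    runner_up = max((r for i, r in enumerate(responses) if i != best),
                    default=float("-inf"))
    if responses[best] - runner_up >= margin:
        return best
    return None


clear_case = conservative_match([0.9, 0.3, 0.1])   # one identifier dominates
ambiguous_case = conservative_match([0.6, 0.55])   # too close, no match
```

Returning no match in the ambiguous case means a new identity is started rather than risking a swap, which is consistent with the behavior described for id2 and id3.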

implicit recognition through a background model

This experiment shows that the recognizer is able to distinguish very similar objects. In fact, the on-line and the semi-boosting trackers prefer to jump to a similar object rather than follow the initial object as its appearance changes. The proposed tracker, however, is trained with the background image as negative examples and does not confuse the initial object with the similar ones. In case of a large appearance change, it loses the track and re-detects the object afterward.

face tracking