Exploiting Privileged Information from Web Data for Action and Event Recognition

Li Niu, Wen Li, and Dong Xu
International Journal of Computer Vision (IJCV)
Vol. 118, No. 2, pp. 130-150, June 2016


Conventional approaches to action and event recognition generally require sufficient labelled training videos to learn robust classifiers that generalize well to new test videos. However, collecting labelled training videos is often time-consuming and expensive. In this work, we propose new learning frameworks to train robust classifiers for action and event recognition by using freely available web videos as training data. We aim to address three challenging issues: (1) the training web videos are generally associated with rich textual descriptions, which are not available for test videos; (2) the labels of training web videos are noisy and may be inaccurate; (3) the data distributions of training and test videos often differ considerably. To address the first two issues, we propose a new framework called multi-instance learning with privileged information (MIL-PI) together with three new MIL methods, in which we not only take advantage of the additional textual descriptions of training web videos as privileged information, but also explicitly cope with noise in the loose labels of training web videos. When the training and test videos come from different data distributions, we further extend MIL-PI into a new framework called domain adaptive MIL-PI. We also propose three further domain adaptation methods, which additionally reduce the data distribution mismatch between training and test videos. Comprehensive experiments on action and event recognition demonstrate the effectiveness of our proposed approaches.

  author = {Li Niu and Wen Li and Dong Xu},
  title = {Exploiting Privileged Information from Web Data for Action and Event Recognition},
  journal = {International Journal of Computer Vision (IJCV)},
  year = {2016},
  month = {June},
  pages = {130-150},
  volume = {118},
  number = {2},
  keywords = {}