This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Coupled Action Recognition and Pose Estimation from Multiple Views

Angela Yao, Juergen Gall, and Luc Van Gool
International Journal of Computer Vision (IJCV)
Vol. 100, No. 1, pp. 16-37, 2012


Action recognition and pose estimation are two closely related topics in understanding human body movements; information from one task can be leveraged to assist the other, yet the two are often treated separately. We present here a framework for coupled action recognition and pose estimation by formulating pose estimation as an optimization over a set of action-specific manifolds. The framework allows for integration of a 2D appearance-based action recognition system as a prior for 3D pose estimation and for refinement of the action labels using relational pose features based on the extracted 3D poses. Our experiments show that our pose estimation system is able to estimate body poses with high degrees of freedom using very few particles and can achieve state-of-the-art results on the HumanEva-II benchmark. We also thoroughly investigate the impact of pose estimation and action recognition accuracy on each other on the challenging TUM kitchen dataset. We demonstrate not only the feasibility of using extracted 3D poses for action recognition, but also improved performance in comparison to action recognition using low-level appearance features.

  author = {Angela Yao and Juergen Gall and Luc Van Gool},
  title = {Coupled Action Recognition and Pose Estimation from Multiple Views},
  journal = {International Journal of Computer Vision (IJCV)},
  year = {2012},
  month = {},
  pages = {16-37},
  volume = {100},
  number = {1},
  keywords = {Human pose estimation, Human action recognition, Tracking, Stochastic optimization, Hough transform}