Prof. Jürgen Gall

Computer Vision Group,
University of Bonn, Bonn, Germany

Forecasting Activities, Object Interactions, and Semantic Scene Geometry

In this talk, I will present Multi-Stage Temporal Convolutional Networks. In contrast to previous temporal convolutional networks, the proposed model operates on the full temporal resolution of the videos and outperforms recurrent networks and previous temporal convolutional networks for temporal action segmentation by a large margin. As part of the DFG research unit “Anticipating Human Behavior” at the University of Bonn, however, we are interested not only in analyzing observed video sequences but also in forecasting the activities that will happen in the future. I will therefore also present approaches that forecast activities from video data or detect the objects in an image that humans will use to solve a specific task. Finally, I will introduce a dataset for semantic segmentation of point cloud sequences, in which one of the tasks requires anticipating semantic geometry that has not yet been observed because of occlusions or the distance to the sensor. This task is particularly relevant for autonomous driving, since it would allow cars to be steered more like humans, who have limited sensing capacities but very strong anticipation capabilities.

BIO: Juergen Gall obtained his B.Sc. and his Master's degree in mathematics from the University of Wales Swansea (2004) and from the University of Mannheim (2005). In 2009, he obtained a Ph.D. in computer science from Saarland University and the Max Planck Institut für Informatik. He was a postdoctoral researcher at the Computer Vision Laboratory, ETH Zurich, from 2009 until 2012 and a senior research scientist at the Max Planck Institute for Intelligent Systems in Tübingen from 2012 until 2013. Since 2013, he has been a professor at the University of Bonn and head of the Computer Vision Group.