Demos

Full Body Pose Recognition

Authors: Michael Van den Bergh, Esther Koller-Meier, and L. Van Gool

Based on a 3D hull reconstruction, the current pose of the user is detected from a database of predefined poses. This is done in real-time using 3D Haarlets. The system works for any orientation of the user.
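The matching step can be pictured as a nearest-neighbour lookup over Haar-like box features computed on the binary voxel hull. The sketch below is illustrative only; the box coordinates, feature set, and pose database are made up and do not reflect the authors' trained Haarlets:

```python
import numpy as np

def haarlet_response(volume, box_a, box_b):
    """Difference of two box sums over a binary 3D hull (a 3D Haar-like feature)."""
    za, ya, xa, dz, dy, dx = box_a
    zb, yb, xb, ez, ey, ex = box_b
    sum_a = volume[za:za + dz, ya:ya + dy, xa:xa + dx].sum()
    sum_b = volume[zb:zb + ez, yb:yb + ey, xb:xb + ex].sum()
    return float(sum_a - sum_b)

def classify_pose(hull, haarlets, pose_db):
    """Nearest-neighbour lookup of the current hull in a database of
    precomputed feature vectors (one row per predefined pose)."""
    features = np.array([haarlet_response(hull, a, b) for (a, b) in haarlets])
    distances = np.linalg.norm(pose_db - features, axis=1)
    return int(np.argmin(distances))

# Toy example: a hull occupying the lower half of a 4x4x4 grid.
hull = np.zeros((4, 4, 4))
hull[0:2] = 1.0
haarlets = [((0, 0, 0, 2, 4, 4), (2, 0, 0, 2, 4, 4))]
pose_db = np.array([[0.0], [32.0]])   # invented feature vectors of two stored poses
pose_id = classify_pose(hull, haarlets, pose_db)
```

Because feature extraction reduces to a handful of box sums (constant time with an integral volume), the lookup stays cheap enough for real-time use.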


References:

M. Van den Bergh, E. Koller-Meier, and L. Van Gool
"Real-Time Body Pose Recognition Using 2D or 3D Haarlets",
International Journal of Computer Vision, vol. 83, pp. 72-84, June 2009.

M. Van den Bergh, E. Koller-Meier, and L. Van Gool
"Real-Time 3D Body Pose Estimation",
Multi-Camera Networks: Concepts and Applications, pp. 335-360, 2009.

MPEG-4 movie (13 MB) Created: October 2010

Multi-Person Tracking from a Moving Platform

Authors: Andreas Ess, Bastian Leibe, Konrad Schindler and L. Van Gool

We address the problem of vision-based multi-person tracking in busy pedestrian zones using a pair of forward-looking cameras mounted on a mobile platform. Specifically, we are interested in the application of such a system for supporting path planning algorithms in the avoidance of dynamic obstacles. The complexity of the problem calls for an integrated solution, which extracts as much visual information as possible and combines it through cognitive feedback. We propose such an approach, which jointly estimates camera position, stereo depth, object detections, and trajectories based on visual information only. We represent the interplay between these components using a graphical model. For each frame, we first estimate the ground surface together with a set of object detections. Conditioned on these results, we then address object interactions and estimate trajectories. Finally, we employ the tracking results to predict future motion for dynamic objects and fuse this information with a static occupancy map estimated from stereo.
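The final fusion step for path planning can be illustrated with a toy occupancy-grid sketch. The grid layout, constant-velocity prediction, and cell inflation below are assumptions for illustration, not the system's actual representation:

```python
import numpy as np

def fuse_occupancy(static_map, tracks, horizon=5, inflate=1):
    """Fuse a static stereo occupancy map with predicted positions of tracked
    pedestrians (constant-velocity extrapolation) into one planning map."""
    fused = static_map.astype(float).copy()
    h, w = fused.shape
    for (y, x), (vy, vx) in tracks:
        for t in range(1, horizon + 1):
            # Predicted cell at time step t, plus a small safety margin.
            py, px = int(round(y + vy * t)), int(round(x + vx * t))
            ys = slice(max(py - inflate, 0), min(py + inflate + 1, h))
            xs = slice(max(px - inflate, 0), min(px + inflate + 1, w))
            fused[ys, xs] = 1.0  # mark predicted cells as occupied
    return fused

# One pedestrian at cell (5, 5) walking one cell per step in +y.
static_map = np.zeros((20, 20))
fused = fuse_occupancy(static_map, [((5, 5), (1, 0))])
```

Marking predicted cells as occupied is the simplest possible fusion rule; a probabilistic blend of static and dynamic evidence would be the natural refinement.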

References:

A. Ess, B. Leibe, K. Schindler, and L. Van Gool
"Moving Obstacle Detection in Highly Dynamic Scenes",
IEEE International Conference on Robotics and Automation (ICRA'09), 2009. Best Vision Paper Award.

A. Ess, B. Leibe, K. Schindler, and L. Van Gool
"Robust Multi-Person Tracking from a Mobile Platform",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, No. 10, pp. 1831-1846, 2009.

MPEG-4 movie (25 MB) Created: October 2010

Hand Gesture Interaction

Authors: Michael Van den Bergh, Frédéric Bosché, Esther Koller-Meier, and L. Van Gool

A hand gesture interaction system set up at the Value Lab. A camera mounted on top of the screen detects hand gestures. Using these gestures, a user can manipulate a 3D model.

References:

M. Van den Bergh, F. Bosche, E. Koller-Meier, and L. Van Gool
"Haarlet-Based Hand Gesture Recognition for 3D Interaction",
IEEE Workshop on Motion and Video Computing, December 2009.

M. Van den Bergh, J. Halatsch, A. Kunze, F. Bosche, L. Van Gool, and G. Schmitt
"Towards Collaborative Interaction with Large nD Models for Effective Project Management", 9th International Conference on Construction Applications of Virtual Reality (ConVR), November 2009.

MPEG-4 movie (3.9 MB) Created: October 2010

MPEG-4 movie (5.2 MB) Created: October 2010

MPEG-4 movie (4.0 MB) Created: October 2010

Robust Tracking-by-Detection from a Single Camera

Authors: Michael D. Breitenstein, Fabian Reichlin, Bastian Leibe, Esther Koller-Meier, and L. Van Gool

Completely automatic multi-person detection and tracking: no background modeling, so the method is robust to a moderate amount of camera motion; it relies only on 2D information from a single, uncalibrated camera and uses no scene-specific information such as a ground plane. The approach is causal/Markovian (no "looking into the future"), making it suitable for time-critical online applications.
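A rough sketch of the tracking-by-detection idea, for a single target and with a made-up detector confidence map standing in for real detector output (this is a generic bootstrap particle filter, not the paper's full detector confidence particle filter):

```python
import numpy as np

rng = np.random.default_rng(0)

def step_particle_filter(particles, weights, confidence_map, motion_std=2.0):
    """One predict-weight-resample cycle of a bootstrap particle filter.
    `particles` is an (N, 2) array of (x, y) image positions; `confidence_map`
    plays the role of the detector confidence density."""
    # Predict: constant-position model with Gaussian noise.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Weight: look up detector confidence at each particle (clipped to the map).
    h, w = confidence_map.shape
    ys = np.clip(particles[:, 1].astype(int), 0, h - 1)
    xs = np.clip(particles[:, 0].astype(int), 0, w - 1)
    weights = weights * (confidence_map[ys, xs] + 1e-9)
    weights /= weights.sum()
    # Resample: multinomial resampling to avoid weight degeneracy.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Hypothetical confidence blob around (10, 10); particles start there too.
confidence = np.zeros((20, 20))
confidence[8:13, 8:13] = 1.0
particles = np.full((300, 2), 10.0)
weights = np.full(300, 1.0 / 300)
for _ in range(5):
    particles, weights = step_particle_filter(particles, weights, confidence)
```

The real system additionally handles data association across multiple persons and mixes detections into the proposal distribution; the sketch only shows the filtering backbone.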

Additional information and videos

References:

M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool
"Online Multi-Person Tracking-by-Detection from a Single, Uncalibrated Camera",
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010

M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool
"Robust Tracking-by-Detection using a Detector Confidence Particle Filter",
IEEE International Conference on Computer Vision, October 2009

M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool
"Markovian Tracking-by-Detection from a Single, Uncalibrated Camera",
IEEE CVPR Workshop on Performance Evaluation of Tracking and Surveillance (PETS'09), June 2009

MPEG-4 movie (3.0 MB) Created: October 2010

MPEG-4 movie (3.3 MB) Created: October 2010

MPEG-4 movie (23 MB) Created: October 2010

Urban Traffic Scene Understanding

Authors: Andreas Ess, Tomas Mueller, Helmut Grabner and L. Van Gool

In this work, we propose a method to recognize the traffic scene in front of a moving vehicle with respect to the road topology and the existence of objects. To this end, we use a two-stage system, where the first stage abstracts from the underlying image by means of a rough super-pixel segmentation of the scene. In a second stage, this meta representation is then used to construct a feature set for a classifier that is able to distinguish between different road types as well as detect the existence of commonly encountered objects, such as cars or pedestrian crossings. We show that by relying on an intermediate stage, we can effectively abstract from any peculiarities of the underlying image data due to, e.g., color aberrations. The method is tested on two long, challenging urban data sets, covering both daylight and dusk conditions. Compared to a state-of-the-art descriptor, we show improved classification performance, especially for object classes.
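The two-stage idea (abstract to super-pixels first, then pool a feature set for the classifier) can be sketched as follows; a toy mean-intensity histogram stands in for the actual feature set used in the paper:

```python
import numpy as np

def scene_descriptor(labels, image, n_segments, n_bins=8):
    """Stage 1 -> stage 2: summarize each super-pixel by its mean intensity,
    then pool those means into a fixed-length histogram describing the scene."""
    means = np.array([image[labels == s].mean() for s in range(n_segments)])
    hist, _ = np.histogram(means, bins=n_bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

# Four toy super-pixels, each a constant-intensity row of a 4x4 image.
labels = np.repeat(np.arange(4), 4).reshape(4, 4)
image = labels / 3.0
descriptor = scene_descriptor(labels, image, 4)
```

The point of the intermediate representation is that the classifier never sees raw pixels, so per-image peculiarities such as color aberrations are abstracted away before classification.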

References:

A. Ess, T. Mueller, H. Grabner, L. Van Gool,
"Segmentation-Based Urban Traffic Scene Understanding",
British Machine Vision Conference (BMVC '09), 2009.

MPEG-4 movie (9.6 MB) Created: October 2010

Hough Transform-Based Mouth Detection for Audio-Visual Speech Recognition

Authors: Gabriele Fanelli, Juergen Gall and L. Van Gool

We present a novel method for mouth localization in the context of multimodal speech recognition where audio and visual cues are fused to improve the speech recognition accuracy. While facial feature points like mouth corners or lip contours are commonly used to estimate at least scale, position, and orientation of the mouth, we propose a Hough transform-based method. Instead of relying on a predefined sparse subset of mouth features, it casts probabilistic votes for the mouth center from several patches in the neighborhood and accumulates the votes in a Hough image. This makes the localization more robust as it does not rely on the detection of a single feature. In addition, we exploit the different shape properties of eyes and mouth in order to localize the mouth more efficiently. Using the rotation invariant representation of the iris, scale and orientation can be efficiently inferred from the localized eye positions. The accompanying video shows some example sequences of tracked faces, with the recognition of the uttered words, both using only audio cues (AO) and fusing audio and visual features (AV), with and without white noise added to the audio channel.
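The voting idea can be sketched in a few lines: patches cast weighted votes for the mouth centre, the votes accumulate in a Hough image, and the strongest peak wins. All numbers below are invented for illustration; in the actual method the offsets and weights come from learned patch statistics:

```python
import numpy as np

def localize_by_voting(patch_centers, vote_offsets, vote_weights, shape):
    """Accumulate probabilistic votes for the mouth centre in a Hough image
    and return the (row, col) position with maximal support."""
    hough = np.zeros(shape)
    for (py, px), offsets, weights in zip(patch_centers, vote_offsets, vote_weights):
        for (dy, dx), w in zip(offsets, weights):
            y, x = py + dy, px + dx
            if 0 <= y < shape[0] and 0 <= x < shape[1]:
                hough[y, x] += w
    return np.unravel_index(np.argmax(hough), shape)

# Two patches whose single votes agree on the centre (10, 10).
centers = [(5, 5), (5, 15)]
offsets = [[(5, 5)], [(5, -5)]]
weights = [[1.0], [1.0]]
peak = localize_by_voting(centers, offsets, weights, (20, 20))
```

Because many patches contribute, a few wrong or occluded patches only weaken the peak instead of destroying it, which is exactly the robustness argument made above.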

References:

G. Fanelli, J. Gall and L. Van Gool,
"Hough Transform-based Mouth Localization for Audio-Visual Speech Recognition",
British Machine Vision Conference (BMVC '09), 2009.

J. Gall and V. Lempitsky,
"Class-Specific Hough Forests for Object Detection",
IEEE Conference on Computer Vision and Pattern Recognition, 2009.

MPEG-4 movie (22 MB) Created: October 2010

Procedural Modeling of Buildings

CGA shape, a novel shape grammar for the procedural modeling of CG architecture, produces building shells with high visual quality and geometric detail. It produces extensive architectural models for computer games and movies, at low cost. Context sensitive shape rules allow the user to specify interactions between the entities of the hierarchical shape descriptions. Selected examples demonstrate solutions to previously unsolved modeling problems, especially to consistent mass modeling with volumetric shapes of arbitrary orientation. CGA shape is shown to efficiently generate massive urban models with unprecedented level of detail, with the virtual rebuilding of the archaeological site of Pompeii as a case in point.
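To give a flavour of how such shape rules operate, here is a toy split rule in Python. It is an illustrative sketch only, with invented rule names and sizes, and has none of the scope of the real CGA shape grammar:

```python
def split(extent, axis, parts):
    """CGA-style 'split' rule: subdivide a shape's extent along one axis
    into named sub-shapes with relative sizes."""
    total = sum(size for _, size in parts)
    out, offset = [], 0.0
    for name, size in parts:
        length = extent[axis] * size / total
        out.append((name, offset, length))  # (rule name, start, length)
        offset += length
    return out

# Facade (12 m wide, 9 m high) -> a taller ground floor plus three upper floors.
facade = (12.0, 9.0)
floors = split(facade, 1, [("ground", 1.5)] + [("floor", 1.0)] * 3)
```

In a full grammar each resulting sub-shape ("ground", "floor") would itself trigger further rules, e.g. splitting a floor horizontally into window tiles, which is what produces the hierarchical shape descriptions mentioned above.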

Created by Pascal Müller and Simon Haegler

More Info here

SIGGRAPH 2006 Video

DivX movie
Quicktime Movie
Created: July 2006

3D City Modeling Using Cognitive Loops

Authors: Nico Cornelis, Bastian Leibe, Kurt Cornelis, Luc Van Gool

CVPR'06 Video Proceedings Best Video Award

CVPR 2006 Video

AVI movie
Created: June 2006

In this video [1] we show the combined results from two recent publications [2], [3]. In [2], we introduce a real-time 3D city modeling algorithm which is able to build compact 3D representations of cities using the assumption that building facades and roads can be modeled by simple ruled surfaces. The main advantage of this algorithm is its exceptional speed. It can process the full Structure-from-Motion and dense reconstruction pipeline at 25-30 fps -- thus, the reconstructed model can be created online, while the survey vehicle is driving through the streets. However, due to the simple geometry assumptions, this original algorithm is unable to model cars, which are ever-present in cities and visually degrade the resulting 3D city model.

In [3], we therefore propose to combine the 3D reconstruction with an object detection algorithm based on Implicit Shape Models. The two components are integrated in a cognitive feedback loop. The 3D reconstruction modules inform object detection about the scene geometry, which greatly helps to improve detection precision. Using the knowledge of camera parameters and scene geometry from [2], the 2D car detections are temporally integrated in a world coordinate frame, which allows us to obtain precise 3D location and orientation estimates. These can then be used to instantiate virtual 3D car models which improve the visual realism of our final 3D city model.
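As an illustration of the kind of geometric feedback involved (not code from the paper), a detection's image footpoint can be back-projected onto the ground plane once the camera calibration and pose are known; the matrices below are toy values:

```python
import numpy as np

def footpoint_to_ground(K, R, t, u, v):
    """Back-project an image footpoint (u, v) onto the ground plane z = 0
    in world coordinates, given intrinsics K and camera pose (R, t) with
    x_cam = R @ x_world + t. This is how a 2D detection can be placed in
    a 3D world frame once scene geometry is known."""
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
    ray_world = R.T @ ray_cam                           # rotate into world frame
    origin = -R.T @ t                                   # camera centre, world frame
    s = -origin[2] / ray_world[2]                       # intersect with z = 0
    return origin + s * ray_world

# Toy calibration: identity intrinsics and rotation, camera offset along z.
K, R = np.eye(3), np.eye(3)
t = np.array([0.0, 0.0, 5.0])
point = footpoint_to_ground(K, R, t, 1.0, 0.0)
```

Intersecting the viewing ray with the ground plane removes the depth ambiguity of a single 2D detection, which is what makes the temporal integration in a world coordinate frame possible.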

Our final system is able to create an automatic 3D city model from the input video streams of a survey vehicle, identify the locations of cars in the recorded real-world scene, and replace them by virtual 3D models in the reconstruction. Besides improving the visual realism of the final 3D model, this has the additional benefit of addressing privacy issues by removing personalized information from the resulting city model. Object recognition can thus aid 3D reconstruction in achieving more realistic results. Conversely, the object recognition algorithm itself benefits from the higher-level scene knowledge made available through 3D reconstruction. It is exactly this bidirectional interaction between the reconstruction and recognition algorithms that earns the system the name "cognitive loop".

References:

[1] N. Cornelis, B. Leibe, K. Cornelis, L. Van Gool,
"3D City Modeling Using Cognitive Loops",
3rd International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06), Chapel Hill, USA, June 2006.
and
Video Proceedings for CVPR 2006 (VPCVPR'06), New York, June 2006.

[2] N. Cornelis, K. Cornelis, L. Van Gool,
"Fast Compact City Modeling for Navigation Pre-Visualization",
In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'06), New York, 2006.

[3] B. Leibe, N. Cornelis, K. Cornelis, L. Van Gool,
"Integrating Recognition and Reconstruction for Cognitive Traffic Scene Analysis from a Moving Vehicle",
In DAGM Annual Pattern Recognition Symposium, Berlin, Germany,
LNCS Vol. 4174, pp. 192-201, Springer, September 2006.


Hysteroscopy Simulator

The prototype has been created using several modules developed within a number of Co-Me projects. These modules provide simulation of soft tissue deformation, collision detection and response, and cutting, as well as a hysteroscopy tool serving as the input device to the simulator. In addition, a CFD module has been integrated for blood flow simulation. Moreover, we replicated an OR in our lab and provide standard hysteroscopic tools for interaction. In this setting, the training starts as soon as the trainee enters the OR and ends when she leaves the room.

More info: http://www.hystsim.ethz.ch/

Overview of the Hysteroscopy Simulation Project

AVI-DivX movie (33 MB) Created: 2006

Haptic Augmented Reality System

In our current research we examine the integration of haptic interfaces into augmented reality setups. The ultimate target of these endeavours is the application of the framework to training of manipulative skills in surgical environments. To this end, highly accurate calibration, system stability, and low latency are indispensable prerequisites. Therefore, we developed a new calibration method to exactly align the haptic and world coordinate systems. Moreover, a distributed framework was created, which ensures low latency and component synchronization. Finally, to demonstrate our results, we integrated all elements into an augmented reality haptics ping-pong game. (Video 1)

Publication: G. Bianchi, B. Knörlein, G. Székely and M. Harders, "High Precision Augmented Reality Haptics", Eurohaptics 2006, July 2006

The driving force of our research is the precise combination of real and - possibly indistinguishable - virtual interactive objects in an augmented reality environment. This requires an interactive, multimodal simulation, as well as stable and accurate overlay of the computer-generated objects. This paper describes several methods to improve accuracy and stability of our hybrid augmented reality system. In a comparison of two approaches to hybrid head pose refinement, we show the superior performance of Quasi-Newton optimization for image space error minimization. Moreover, a 3D landmark refinement step is proposed, which significantly improves robustness of the overlay process. The enhanced system is demonstrated in an interactive AR environment, which provides accurate haptic feedback from real and virtual deformable objects. Finally, the effect of landmark occlusion on tracking stability during user interaction is also analyzed.

Publication: G. Bianchi, C. Jung, B. Knörlein, M. Harders and G. Székely, "High-fidelity visuo-haptic interaction with virtual objects in multi-modal AR systems", ISMAR 2006, October 2006.

AR Ping Pong

600x450, DivX, 11.5 MB
February 2006

Haptic Feedback

600x450, DivX, 17 MB
February 2006

4D MRI

In contrast to CT, MRI provides excellent soft tissue contrast and volunteers and patients are not exposed to ionising radiation.

Sequences of 3D volumes (4D data sets) were reconstructed from dynamic sagittal 2D images acquired during free breathing. Other gating methods assume regular respiratory motion and reduce the respiratory organ deformation to a single parameter such as amplitude or phase. This neglects all residual variability and is too coarse an approximation in some cases, leading to artefacts in the reconstructed images.

The proposed approach derives a multi-dimensional gating measure from dedicated so-called navigator frames in order to determine the state of the liver retrospectively and find corresponding 2D slices that can be combined into 3D volumes. The method does not assume a constant breathing depth or even strict periodicity and does not depend on an external gating signal. The technique is applicable to any organ that undergoes respiratory motion, such as the lung, liver, pancreas or kidneys, and can be implemented on a standard MR scanner without additional equipment.
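The retrospective matching step can be sketched as follows, with the respiratory state reduced to a small made-up feature vector per navigator frame; the actual gating measure is derived from the dedicated navigator images themselves:

```python
import numpy as np

def matching_slices(navigators, reference, tolerance):
    """Retrospective gating sketch: a data slice is usable for the volume at
    respiratory state `reference` if its adjacent navigator frame lies within
    `tolerance` of it. The distance is multi-dimensional, not a single
    amplitude or phase value, so residual breathing variability is respected."""
    distances = np.linalg.norm(navigators - reference, axis=1)
    return np.flatnonzero(distances < tolerance)

# Toy navigator states for four acquired slices; match against exhale (0, 0).
navigators = np.array([[0.0, 0.0], [1.0, 0.0], [0.1, 0.1], [3.0, 3.0]])
slices = matching_slices(navigators, np.array([0.0, 0.0]), tolerance=0.5)
```

Grouping the matched slices then yields one 3D volume per respiratory state, without assuming periodic breathing or requiring an external gating signal.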

Created by: Martin von Siebenthal.

More info: /4dmri/.

4D MRI of the liver

GIF Animation
Created: 2006

4D MRI of the lung

GIF Animation
Created: 2006

Blue-C

Blue-C is an interdisciplinary research project of ETH Zurich. It combines the qualities of total immersion experienced in CAVE-like environments with simultaneous, real-time 3D video acquisition and rendering from multiple cameras.

Overview

MPEG-1 movie (5.3 MB) Created: August 2003

Real-Time Pointing Gesture Recognition

MPEG-1 movie (3.1 MB)
AVI-DivX movie (2.3 MB) Created: September 2004

Background Segmentation

MPEG-1 movie (1.6 MB)
AVI-DivX movie (1.2 MB) Created: September 2004

Photo-Realistic and Detailed 3D Modeling: The Antonine Nymphaeum at Sagalassos (Turkey)

An accurate archaeological high-resolution reconstruction of an ancient Roman fountain.
Created by Pascal Müller

Final Reconstruction

MPEG-1 movie (21.1 MB)
AVI-DivX movie (10.8 MB)
Created: August 2004

Cleaning Medusa

MPEG-1 movie (1.1 MB)
AVI-DivX movie (2.2 MB)
Created: August 2004

Small-Scale Remodeling

MPEG-1 movie (578 kB)
AVI-DivX movie (495 kB)
Created: August 2004

Mid-Scale Remodeling

MPEG-1 movie (580 kB)
AVI-DivX movie (517 kB)
Created: August 2004

Large-Scale Remodeling

MPEG-1 movie (706 kB)
AVI-DivX movie (727 kB)
Created: August 2004

Traditional Modeling

MPEG-1 movie (2.2 MB)
AVI-DivX movie (2.0 MB)
Created: August 2004

Talking faces

Realistic Face Animation for Speech created by Gregor Kalberer and Pascal Müller

Female Faces created from Face-Space

MPEG-1 movie (6.1 MB)
AVI-DivX movie (7.1 MB) Created: August 2004

Female Face from 3D-scan

MPEG-1 movie (3.6 MB)
AVI-DivX movie (3.5 MB) Created: August 2004

Male Face from 3D-scan

MPEG-1 movie (4.8 MB)
AVI-DivX movie (3.1 MB) Created: August 2004

Plugin for Maya

MPEG-1 movie (35.3 MB)
AVI-DivX movie (40.2 MB) Created: August 2004

PCA vs. ICA

MPEG-1 movie (2.0 MB)
AVI-DivX movie (1.5 MB) Created: August 2004

Tracked vs. Synthetic

MPEG-1 movie (1.1 MB)
AVI-DivX movie (744 kB) Created: August 2004

Texture Synthesis

Viewpoint Consistent Synthesis of 3D Textures created by Alexander Neubeck

Torus

MPEG-1 movie (2128 kB) Created: September 2004

Plane

MPEG-1 movie (3080 kB) Created: September 2004

Hand Tracking

3D-tracking of human hands created by Matthieu Bray and Pascal Müller

The Hand Model

MPEG-1 movie (4.7 MB)
AVI-DivX movie (7.2 MB) Created: August 2004

Hand Tracking with Stochastic Meta Descent

MPEG-1 movie (7.8 MB)
AVI-DivX movie (4.8 MB) Created: August 2004

Virtual Tumor

Tumor and Polyp models in Uterus created by Raimundo Sierra

Skeleton based Pathology design

MPEG-1 movie (2.7 MB)
AVI-DivX movie (1.2 MB) Created: August 2004

Particle based Tumor Growth Model

MPEG-1 movie (4.8 MB)
AVI-DivX movie (1.7 MB) Created: August 2004

Single Myoma

MPEG-1 movie (1.9 MB)
AVI-DivX movie (1.6 MB) Created: August 2004

Multiple Myomas

MPEG-1 movie (1.9 MB)
AVI-DivX movie (1.7 MB) Created: August 2004

Arteries

Macroscopic modelling of vascular systems
Computer-generated structures of vascular systems created by Dominik Szczerba
PNG-image (738x895, 9kB) Created: March 2003
PNG-image (1209x1011, 192kB) Created: March 2003
PNG-image (706x944, 64kB) Created: March 2003
PNG-image (765x881, 62kB) Created: March 2003
PNG-image (760x709, 91kB) Created: March 2003
PNG-image (745x583, 21kB) Created: March 2003
PNG-image (986x966, 79kB) Created: March 2003
PNG-image (532x735, 9kB) Created: March 2003
PNG-image (897x907, 13kB) Created: March 2003
PNG-image (826x887, 26kB) Created: March 2003
PNG-image (892x887, 19kB) Created: March 2003
PNG-image (553x505, 9kB) Created: March 2003
PNG-image (756x966, 36kB) Created: March 2003
PNG-image (641x855, 22kB) Created: March 2003
PNG-image (1065x891, 103kB) Created: March 2003
PNG-image (1028x946, 25kB) Created: March 2003

Markerless 2D and 3D Augmented Reality with a Real-time Affine Region Tracker

Here are some movies demonstrating the capabilities of the affine region tracker and the augmented reality system developed by Vittorio Ferrari at the Computer Vision Lab of ETH Zuerich. Some of these results are discussed in the following papers:

Vittorio Ferrari, Tinne Tuytelaars and Luc Van Gool
"Real-time Affine Region Tracking and Coplanar Grouping",
in Proc. of the IEEE Computer Vision and Pattern Recognition (CVPR), Kauai, Hawaii, December 2001.

Vittorio Ferrari, Tinne Tuytelaars and Luc Van Gool
"Markerless Augmented Reality with a Real-time Affine Region Tracker",
in Proc. of the IEEE and ACM International Symposium on Augmented Reality (ISAR), New York, New York, October 2001, pp. 87-96.

2D "AR" sign mapped to building (The Cinema Sequence)
AVI movie
Regions of the Cinema Sequence
AVI movie
Regions on Calendar Scene
AVI movie
Calendar scene 2D augmented
AVI movie
Graffiti 2D augmented
AVI movie
Jennifer Sequence
AVI movie
Augmented Panel
AVI movie
Regions of Jennifer Sequence
AVI movie
Spaceship scene with tracked regions.
AVI movie
3D Augmentation with a Buddha model. Created in cooperation with Lukas Hohl and Till Quack.
Details of the 3D augmentation can be found in this report.
AVI movie


Flight through Swiss Mountains

3D flight through the Swiss mountains: the Olympic region as proposed for Sion 2006.
Sion (in the southwest of Switzerland) was a candidate for the XX Olympic Winter Games.
MPEG-1 movie (848 kB) Created: October 2001