This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Visual Speech Generator

G. A. Kalberer, P. Mueller and L. Van Gool
Videometrics VII 2003 IS&T/SPIE
Santa Clara, California, USA, January 2003


Efficient, realistic face animation is still a challenge. A system is proposed that yields realistic animations for speech. It starts from real 3D face dynamics, observed at a frame rate of 25 fps for thousands of points on the faces of speaking actors. When asked to animate a face, it replicates the visemes that it has learned and adds the necessary coarticulation effects. The speech animation could be based on as few as 16 modes, extracted through Independent Component Analysis from the observed face dynamics. Faces for which only a static, neutral 3D model is available can also be animated. Rather than copying other faces' deformation fields verbatim, the system adapts the visemes to the shape of the new face. By localising this face in a 'Face Space', in which the locations of the example faces are also known, visemes are adapted automatically according to the relative distance to these examples. The animation tool proposes a good speech-based face animation as a point of departure for animators, whom the system then supports in making further changes as desired.
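
The abstract says only that visemes are adapted "according to the relative distance" to the example faces in Face Space; it does not give the weighting rule. As a rough illustration of the idea, the sketch below assumes inverse-distance weighting, which is one plausible choice and not necessarily what the paper uses. The function name `blend_visemes`, the coordinate layout, and the toy numbers are all hypothetical.

```python
import math

def blend_visemes(new_face, example_faces, example_visemes, eps=1e-9):
    """Blend example viseme displacement vectors for a new face.

    new_face:        coordinates of the new face in a (hypothetical) face space
    example_faces:   face-space coordinates of the example faces
    example_visemes: one displacement vector per example face, all for the same viseme

    ASSUMPTION: weights are inverse distances, normalised to sum to 1.
    The abstract does not state the actual weighting scheme.
    """
    dists = [math.dist(new_face, f) for f in example_faces]

    # If the new face coincides with an example, reuse that example's viseme.
    for d, v in zip(dists, example_visemes):
        if d < eps:
            return list(v)

    weights = [1.0 / d for d in dists]
    total = sum(weights)
    weights = [w / total for w in weights]

    n = len(example_visemes[0])
    return [sum(w * v[i] for w, v in zip(weights, example_visemes))
            for i in range(n)]


# Toy usage: two example faces at equal distance contribute equally.
blended = blend_visemes((0.0, 0.0),
                        [(1.0, 0.0), (-1.0, 0.0)],
                        [[2.0, 0.0], [0.0, 4.0]])
```

Under this assumption, closer example faces dominate the adapted viseme, and a new face that sits exactly on an example simply inherits that example's deformation.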

  author = {G. A. Kalberer and P. Mueller and L. Van Gool},
  title = {Visual Speech Generator},
  booktitle = {Videometrics VII 2003 IS&T/SPIE},
  year = {2003},
  month = {January},
  pages = {46-53},
  volume = {5013},
  editor = {S. F. El-Hakim and A. Gruen and J. S. Walton},
  publisher = {IS&T, SPIE},
  keywords = {facial animation, speech, visemes space, ica, realism, face space}