This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

The DIRAC AWEAR audio-visual platform for detection of unexpected and incongruent events

J. Anemüller, J-H. Bach, B. Caputo, M. Havlena, J. Luo, H. Kayser, B. Leibe, P. Motlícek, T. Pajdla, M. Pavel, A. Torii, L. Van Gool, A. Zweig, H. Hermansky:
International Conference on Multimodal Interfaces
Chania, Crete, Greece, October 2008


It is of prime importance in everyday human life to cope with and respond appropriately to events that are not foreseen by prior experience. Machines to a large extent lack the ability to respond appropriately to such inputs. An important class of unexpected events is defined by incongruent combinations of inputs from different modalities, so multimodal information provides a crucial cue for identifying such events; for example, a voice is heard while the person in the field of view does not move her lips. In the project DIRAC ("Detection and Identification of Rare Audio-visual Cues") we have been developing algorithmic approaches to the detection of such events, as well as an experimental hardware platform to test them. An audio-visual platform ("AWEAR", audio-visual wearable device) has been constructed with the goal of helping users with disabilities or a high cognitive load to deal with unexpected events. Key hardware components include stereo panoramic vision sensors and 6-channel worn-behind-the-ear (hearing aid) microphone arrays. Data have been recorded to study audio-visual tracking, a/v scene/object classification and a/v detection of incongruencies.

  author = {J. Anem\"uller and J-H. Bach and B. Caputo and M. Havlena and J. Luo and H. Kayser and B. Leibe and P. Motlícek and T. Pajdla and M. Pavel and A. Torii and L. Van Gool and A. Zweig and H. Hermansky},
  title = {The DIRAC AWEAR audio-visual platform for detection of unexpected and incongruent events},
  booktitle = {International Conference on Multimodal Interfaces},
  year = {2008},
  month = {October},
  keywords = {}