Publications

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.


Equivalence of Generative and Log-Linear Models

G. Heigold, H. Ney, P. Lehnen, T. Gass and R. Schlueter
IEEE Transactions on Audio, Speech, and Language Processing
Vol. PP, No. 99, pp. 1, September 2010

Abstract

Conventional speech recognition systems are based on hidden Markov models (HMMs) with Gaussian mixture models (GHMMs). Discriminative log-linear models are an alternative modeling approach and have been investigated recently in speech recognition. GHMMs are directed models with constraints, e.g., positivity of variances and normalization of conditional probabilities, while log-linear models do not use such constraints. This paper compares the posterior form of typical generative models related to speech recognition with their log-linear model counterparts. The key result will be the derivation of the equivalence of these two different approaches under weak assumptions. In particular, we study Gaussian mixture models, part-of-speech bigram tagging models and eventually, the GHMMs. This result unifies two important but fundamentally different modeling paradigms in speech recognition on the functional level. Furthermore, this paper will present comparative experimental results for various speech tasks of different complexity, including a digit string and large vocabulary continuous speech recognition tasks.
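As a rough illustration of the kind of equivalence the abstract describes, the sketch below covers only the simplest case: single one-dimensional Gaussians per class, and only the direction from the generative model to the log-linear one. It is not taken from the paper. The function names and the parameter layout (quadratic weight, linear weight, bias) are assumptions chosen for this example. The idea is that expanding the Gaussian log-density gives a score that is linear in the features x^2, x and 1, so the class posterior obtained by Bayes' rule matches a softmax over those scores.

```python
import math


def gaussian_posterior(x, priors, means, variances):
    """Class posterior p(c|x) via Bayes' rule from 1-D Gaussian class-conditionals."""
    joint = [
        p * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for p, m, v in zip(priors, means, variances)
    ]
    total = sum(joint)
    return [j / total for j in joint]


def to_log_linear(priors, means, variances):
    """Map Gaussian parameters to (quadratic, linear, bias) weights per class.

    Uses log p(c) + log N(x | m, v)
        = (-1/(2v)) * x^2 + (m/v) * x
          + [log p(c) - m^2/(2v) - 0.5 * log(2*pi*v)].
    """
    params = []
    for p, m, v in zip(priors, means, variances):
        quad = -0.5 / v
        lin = m / v
        bias = math.log(p) - 0.5 * m * m / v - 0.5 * math.log(2.0 * math.pi * v)
        params.append((quad, lin, bias))
    return params


def log_linear_posterior(x, params):
    """Softmax over scores that are linear in the features (x^2, x, 1)."""
    scores = [q * x * x + l * x + b for q, l, b in params]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical two-class example; the numbers are arbitrary.
priors, means, variances = [0.3, 0.7], [-1.0, 2.0], [0.5, 1.5]
weights = to_log_linear(priors, means, variances)
generative = gaussian_posterior(0.4, priors, means, variances)
discriminative = log_linear_posterior(0.4, weights)
```

The two posteriors agree up to floating-point rounding. The reverse direction, mapping unconstrained log-linear weights back to valid Gaussian parameters, is where constraints such as positive variances matter, and it is not attempted in this sketch.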


@Article{eth_biwi_00816,
  author = {G. Heigold and H. Ney and P. Lehnen and T. Gass and R. Schlueter},
  title = {Equivalence of Generative and Log-Linear Models},
  journal = {IEEE Transactions on Audio, Speech, and Language Processing},
  year = {2010},
  month = {September},
  pages = {1},
  volume = {PP},
  number = {99},
  keywords = {}
}