
Learning Bregman Distance Functions and Its Application for Semi-Supervised Clustering

L. Wu, R. Jin, S. C. Hoi, J. Zhu and N. Yu
Advances in Neural Information Processing Systems 2009
2009, in press


Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two respects: (i) they are computationally expensive (even infeasible) for high-dimensional data because the number of parameters in the metric grows quadratically with the dimensionality; (ii) they assume a fixed metric for the entire input space and are therefore unable to handle heterogeneous data. In this paper, we propose a novel scheme that learns nonlinear Bregman distance functions from side information using a non-parametric approach similar to support vector machines. The proposed scheme avoids the assumption of a fixed metric by implicitly deriving a local distance from the Hessian matrix of the convex function that generates the Bregman distance function. We also present an efficient learning algorithm for the proposed scheme. Extensive experiments with semi-supervised clustering show that the proposed technique (i) outperforms state-of-the-art approaches for distance function learning, and (ii) is computationally efficient for high-dimensional data.
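
For readers unfamiliar with the term, the sketch below illustrates the standard textbook definition of a Bregman distance generated by a convex function phi: d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>. It is not the learning algorithm of the paper. The function names (`bregman`, `sq_norm`, `neg_entropy`) and the sample points are illustrative assumptions. The two generators are chosen to show the contrast the abstract draws: half the squared Euclidean norm has a constant Hessian (a fixed metric), while negative entropy has a Hessian that varies with position (a local metric).

```python
import math

def bregman(phi, grad_phi, x, y):
    """Bregman distance d_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>."""
    g = grad_phi(y)
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))
    return phi(x) - phi(y) - inner

# Generator 1 (illustrative): phi(x) = 0.5 * ||x||^2.
# Its Hessian is the identity everywhere, so the induced distance
# is 0.5 * ||x - y||^2, a fixed metric over the whole space.
def sq_norm(x):
    return 0.5 * sum(v * v for v in x)

def sq_norm_grad(x):
    return list(x)

# Generator 2 (illustrative): negative entropy, phi(x) = sum_i x_i log x_i.
# Its Hessian is diag(1 / x_i), which changes with position. On probability
# vectors the induced distance is the Kullback-Leibler divergence.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)

def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]

# Hypothetical sample points (both sum to 1, entries strictly positive).
x = [0.2, 0.8]
y = [0.5, 0.5]

d_euc = bregman(sq_norm, sq_norm_grad, x, y)
d_kl_xy = bregman(neg_entropy, neg_entropy_grad, x, y)
d_kl_yx = bregman(neg_entropy, neg_entropy_grad, y, x)

print("half squared Euclidean:", d_euc)
print("KL(x || y):", d_kl_xy)
print("KL(y || x):", d_kl_yx)
```

Note that a Bregman distance is in general not symmetric, as the two KL values show, so it is a distance function in a looser sense than a metric.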

  author = {L. Wu and R. Jin and S. C. Hoi and J. Zhu and N. Yu},
  title = {Learning Bregman Distance Functions and Its Application for Semi-Supervised Clustering},
  booktitle = {Advances in Neural Information Processing Systems 2009},
  year = {2009},
  keywords = {distance metric learning},
  note = {in press}