Learning distance functions with side information plays a key role in many machine learning and data mining applications. Conventional approaches often assume a Mahalanobis distance function. These approaches are limited in two aspects: (i) they are computationally expensive (even infeasible) for high dimensional data because the size of the metric is in the square of dimensionality; (ii) they assume a ï¬xed metric for the entire input space and therefore are unable to handle heterogeneous data. In this paper, we propose a novel scheme that learns nonlinear Bregman distance functions from side information using a non-parametric approach that is similar to support vector machines. The proposed scheme avoids the assumption of ï¬xed metric by implicitly deriving a local distance from the Hessian matrix of a convex function that is used to generate the Bregman distance function. We also present an efï¬cient learning algorithm for the proposed scheme for distance function learning. The extensive experiments with semi-supervised clustering show the proposed technique (i) outperforms the state-of-the-art approaches for distance function learning, and (ii) is computationally efï¬cient for high dimensional data.