Unsupervised High-level Feature Learning by Ensemble Projection for Semi-supervised Image Classification and Image Clustering

Dengxin Dai , and Luc Van Gool


Abstract

This paper investigates the problems of semi-supervised image classification and image clustering. Unlike previous methods, which develop sophisticated classifier models, ours learns a new image representation from all available data (labeled and unlabeled) by exploiting the patterns of the data distribution in a novel manner. In particular, a rich set of visual prototypes is sampled from all available data and taken as surrogate classes to train discriminative classifiers; images are projected (classified) via these classifiers, and the projected values (similarities to the prototypes) are stacked to build a feature vector. Since such a training set is noisy, we create, in the spirit of ensemble learning, an ensemble of diverse training sets, leading to diverse classifiers. The method is dubbed Ensemble Projection (EP). EP captures not only the characteristics of individual images, but also the relationships among images. It is conceptually simple and computationally efficient, yet effective and flexible. Experiments on nine standard datasets show that: (1) EP outperforms previous methods for semi-supervised image classification; (2) EP produces promising results for self-taught image classification, where unlabeled samples are a random collection of images rather than being from the same distribution as the labeled ones; and (3) EP improves over the original features for image clustering.


Pipeline

Ensemble Projection

Figure 1: The pipeline of Ensemble Projection (EP). EP consists of unsupervised feature learning (left panel) followed by plain classification or clustering (right panel). For feature learning, EP samples an ensemble of T diverse prototype sets from all known images and learns discriminative classifiers on them, which serve as projection functions. Images are then projected via these functions to obtain their new representation. These features are fed into standard classifiers and clustering methods for image classification and clustering, respectively.
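The projection step described above can be sketched in a few lines of NumPy. The sketch below is a minimal illustration, not the released code: the paper trains discriminative classifiers (e.g., logistic regression) on each sampled prototype set, whereas here a softmax over negative Euclidean distances to the prototypes stands in as the projection function; the function name, the parameter defaults, and the random-data demo are all illustrative assumptions.

```python
import numpy as np

def ensemble_projection(X, T=5, n_prototypes=3, seed=0):
    """Minimal sketch of Ensemble Projection (EP).

    X: (N, d) array of image descriptors (e.g., CNN features).
    Returns an (N, T * n_prototypes) array: for each of T trials, the
    similarity of every image to that trial's sampled prototypes.

    NOTE: the paper learns discriminative classifiers on each surrogate
    class set; a distance-based softmax is used here as a stand-in.
    """
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    projections = []
    for _ in range(T):
        # Sample a prototype set from all known images (surrogate classes).
        idx = rng.choice(N, size=n_prototypes, replace=False)
        prototypes = X[idx]
        # Project every image: squared distance to each prototype ...
        d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
        # ... turned into a probability-like similarity via softmax.
        logits = -d2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        projections.append(p)
    # Stack the T projections to form the new EP feature vector.
    return np.hstack(projections)

# Illustrative usage on random descriptors.
X = np.random.default_rng(1).normal(size=(10, 4))
F = ensemble_projection(X, T=4, n_prototypes=3)
print(F.shape)  # (10, 12): T * n_prototypes dimensions per image
```

Each trial uses a different random prototype set, so the T sub-vectors are diverse in the ensemble-learning sense; concatenating them yields the representation that is then fed to a standard classifier or clustering method.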


Results

classification results

Figure 2: Classification results of Ensemble Projection (EP) on the nine datasets, where three classifiers are used: k-NN, Logistic Regression, and SVMs with RBF kernels. All methods were tested with two feature inputs: the original CNN feature and the learned feature by EP on top of it (indicated by “+ EP”).

classification results

Figure 3: Classification results of Ensemble Projection (EP) on eight of the nine datasets, when compared to other semi-supervised methods.

classification results

Table 1: Precision (%) of image classification on the nine datasets, with 5 labeled training examples per class. “+ EP” indicates that the classifier uses our learned feature as input rather than the original CNN feature. The best performance is indicated in bold, and the second best is underlined.

classification results

Table 2: Purity (%) of image clustering on the nine datasets, where the CNN feature (Chatfield et al., 2014) and our feature learned from it (indicated by “+ EP”) are used. The best results are indicated in bold, and the second best is underlined.


Downloads

  • The code of Ensemble Projection for semi-supervised image classification and image clustering is available.

  • The CNN features (Data) of the nine datasets considered are available for download.

  • Dengxin Dai and Luc Van Gool. "Unsupervised High-level Feature Learning by Ensemble Projection for Semi-supervised Image Classification and Image Clustering". Tech. Report, Feb. 2016.

    This page has been edited by Dengxin Dai. All rights reserved.