Please see the recent results of Ensemble Projection with CNN features here if you are interested.
This paper investigates the problem of semi-supervised image classification. Unlike previous methods that regularize the classification boundary with unlabeled data, our method learns a new image representation from all available data (labeled and unlabeled) and performs plain supervised learning with the new features. In particular, an ensemble of image prototype sets is sampled automatically from the available data to represent a rich set of visual categories/attributes. Discriminative functions are then learned on these prototype sets, and images are represented by the concatenation of their projected values onto the prototypes (their similarities to them) for further classification. Experiments on four standard datasets show three interesting phenomena: (1) our method consistently outperforms previous methods for semi-supervised image classification; (2) our method combines well with these methods; and (3) our method works well for self-taught image classification, where the unlabeled data do not come from the same distribution as the labeled ones, but rather from a random collection of images.
Fig1. The pipeline of Ensemble Projection (EP).
EP consists of unsupervised feature learning (left panel) and plain supervised classification (right panel).
For feature learning, we sample an ensemble of T diverse prototype sets from all known images and learn discriminative
classifiers on them to serve as projection functions. Images are then projected by these functions to obtain their new representation. For
classification, we train plain classifiers on the labeled images with the learned features to classify the unlabeled ones.
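The feature-learning stage above can be sketched in a few lines of NumPy. This is a simplified illustration, not the paper's exact procedure: prototype sets are formed here by random seeds plus their nearest neighbors (the paper uses a more careful max-min sampling), and the discriminative classifiers are plain softmax regressors trained by gradient descent. The function names and parameters (T prototype sets, r pseudo-classes per set, k neighbors) are our own for this sketch.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes, lr=0.1, steps=200):
    """Multinomial logistic regression by batch gradient descent."""
    W = np.zeros((X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]
    for _ in range(steps):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)
    return W

def ensemble_projection(X, T=10, r=5, k=3, seed=0):
    """Learn EP features from all images X (labeled and unlabeled).

    For each of the T prototype sets: pick r seed images, treat each
    seed plus its k nearest neighbors as one pseudo-class, train a
    discriminative classifier on them, and project all images onto
    its r outputs. The final feature is the concatenation over sets.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    # pairwise squared distances for the nearest-neighbor lookup
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    feats = []
    for _ in range(T):
        seeds = rng.choice(n, size=r, replace=False)
        idx, lab = [], []
        for c, s in enumerate(seeds):
            nn = np.argsort(D[s])[:k + 1]  # seed + its k nearest neighbors
            idx.extend(nn)
            lab.extend([c] * len(nn))
        W = train_softmax(X[idx], np.array(lab), r)
        feats.append(softmax(X @ W))  # similarities to this prototype set
    return np.hstack(feats)  # new representation: T * r dimensions
```

A plain supervised classifier (e.g. logistic regression or an SVM) is then trained on the labeled rows of the returned feature matrix, exactly as the right panel of Fig1 describes. On a toy input of 40 images with 16-dimensional descriptors, `ensemble_projection(X)` returns a 40 x 50 matrix whose entries are per-set class probabilities.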
Fig2. Semi-supervised classification results (average AP) on the four datasets. The top panel evaluates the performance of our learned features when
fed into Logistic Regression and SVMs. The bottom panel shows their performance when fed into HF [34] and LapSVM [1]. All methods were tested with two
feature inputs: the concatenation of GIST, PHOG, and LBP, and our feature learned from them (indicated by “+ EP”).
Fig3. Comparison of our learned features (indicated by EP(.)) to the corresponding original features GIST, PHOG, and LBP.
Logistic Regression was used as the classifier with 5 labeled training images per class.
This page has been edited by Dengxin Dai