Active Hashing with Joint Data Example and Tag Selection


Similarity search is an important problem in many large scale applications such as image and text retrieval. Hashing method has become popular for similarity search due to its fast search speed and low storage cost. Recent research has shown that hashing quality can be dramatically improved by incorporating supervised information, e.g. semantic tags/labels, into hashing function learning. However, most existing supervised hashing methods can be regarded as passive methods, which assume that the labeled data are provided in advance. But in many real world applications, such supervised information may not be available.

This paper proposes a novel active hashing approach, Active Hashing with Joint Data Example and Tag Selection (AH-JDETS), which actively selects the most informative data examples and tags in a joint manner for hashing function learning. In particular, it first identifies a set of informative data examples and tags for users to label based on the selection criteria that both the data examples and tags should be most uncertain and dissimilar with each other. Then this labeled information is combined with the unlabeled data to generate an effective hashing function.

An iterative procedure is proposed for learning the optimal hashing function and selecting the most informative data examples and tags. Extensive experiments on four different datasets demonstrate that AH-JDETS achieves good performance compared with state-of-the-art supervised hashing methods but requires much less labeling cost, which overcomes the limitation of passive hashing methods. Furthermore, experimental results also indicate that the joint active selection approach outperforms a random (non-active) selection method and active selection methods only focusing on either data examples or tags.


Hashing, Active Learning, Similarity Search, Data Selection

Date of this Version