[cs-talks] Upcoming Seminars: PhD Proposal (Thurs)

Greenwald, Faith fgreen1 at bu.edu
Wed Oct 7 15:39:42 EDT 2015

Ph.D. Proposal

Online Supervised Hashing for Ever-Growing Datasets

Fatih Cakir, BU

Thursday, October 8, 2015 at 3:30pm in MCS 144

Abstract: In this thesis, we study supervised hashing methods, which are widely used for nearest neighbor search in computer vision applications. Most state-of-the-art supervised hashing approaches employ batch learners. Unfortunately, batch-learning strategies can be inefficient when confronted with large training datasets. Moreover, with batch learners, it is unclear how to adapt the hash functions as a dataset continues to grow or as new variations appear in it over time. Yet in many practical scenarios the dataset grows and diversifies; thus, both the hash functions and the indexing must swiftly accommodate these changes. To handle these issues, we propose an online hashing method that is amenable to changes and expansions of the datasets. In the feasibility study, we developed such a technique based on Error Correcting Output Codes (ECOCs). Our solution is supervised, in that we incorporate available label information to preserve the semantic neighborhood. Specifically, we assign ECOCs as target hash codes for the label(s) in the dataset and minimize an upper bound on the Hamming distances between these target codes and the outputs of the hash mappings.
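
As a rough illustration of the idea in the abstract (not the speaker's actual method), the Python sketch below assigns a codeword to each label and nudges linear hash functions toward that codeword one example at a time. The names make_codebook, hash_code, hinge_bound and online_update are hypothetical, the random +/-1 codewords are a simple stand-in for a designed ECOC matrix, and the per-bit hinge loss is an assumed choice of upper bound; the abstract does not say which bound is used.

```python
import random


def make_codebook(labels, n_bits, rng):
    """Assign a random +/-1 codeword to each label (stand-in for an ECOC matrix)."""
    return {lab: [rng.choice((-1, 1)) for _ in range(n_bits)] for lab in labels}


def dot(w, x):
    return sum(w_i * x_i for w_i, x_i in zip(w, x))


def hash_code(W, x):
    """One bit per hash function: the sign of a linear projection (0 maps to +1)."""
    return [1 if dot(w, x) >= 0 else -1 for w in W]


def hamming(a, b):
    """Number of positions where two codes disagree."""
    return sum(1 for u, v in zip(a, b) if u != v)


def hinge_bound(W, x, target):
    """Sum of per-bit hinge losses; never smaller than the Hamming distance
    between hash_code(W, x) and target (assumed surrogate, see lead-in)."""
    return sum(max(0.0, 1.0 - t * dot(w, x)) for w, t in zip(W, target))


def online_update(W, x, target, lr=0.1):
    """One stochastic subgradient step on the hinge bound for a single example."""
    for w, t in zip(W, target):
        if t * dot(w, x) < 1.0:  # bit is wrong or inside the margin
            for i in range(len(w)):
                w[i] += lr * t * x[i]


if __name__ == "__main__":
    rng = random.Random(0)
    codebook = make_codebook(["cat", "dog"], n_bits=8, rng=rng)
    W = [[0.0, 0.0, 0.0] for _ in range(8)]
    stream = [([1.0, 0.2, 0.0], "cat"), ([0.0, 0.1, 1.0], "dog")] * 20
    for x, label in stream:  # examples arrive one at a time
        online_update(W, x, codebook[label])
    print(hamming(hash_code(W, [1.0, 0.2, 0.0]), codebook["cat"]))
```

Because each example is processed once and then discarded, a sketch like this needs no pass over the full dataset when new data arrives, which is the property the abstract contrasts with batch learners.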

Such an adaptive hashing method is attractive, but it requires recomputing the hash table whenever the hash functions are updated. If updates are frequent, recomputing the hash table entries may make the system inefficient, especially for large indexes. Thus, we also propose a framework to reduce hash table updates. In the feasibility study, the proposed approach achieved significant improvements over the state of the art on multiple image retrieval benchmarks. However, future work is needed to further reduce the number of hash table entry updates when the hash mapping changes. Also, the label space encountered in practice often has a hierarchical nature, with correlations between the labels. Hence, a robust scheme is needed when assigning ECOCs as target hash codes. In the remaining work, we will explore these aspects of the problem.
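
To make the re-indexing cost concrete, here is a small Python sketch of a hash table that is patched after the hash functions change. It is only a naive baseline under assumed names (build_index, reindex_changed) and is not the framework proposed in the thesis: it still recomputes every item's code and merely avoids moving entries whose code stayed the same, returning the number of entries that had to move.

```python
from collections import defaultdict


def hash_code(W, x):
    """Binary code as a tuple: one bit per linear hash function (0 maps to 1)."""
    return tuple(
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0 for w in W
    )


def build_index(W, items):
    """Return (table, codes): buckets keyed by code, plus each item's current code."""
    table = defaultdict(set)
    codes = {}
    for item_id, x in items.items():
        code = hash_code(W, x)
        table[code].add(item_id)
        codes[item_id] = code
    return table, codes


def reindex_changed(table, codes, W_new, items):
    """Move only the items whose code differs under W_new; return how many moved."""
    moved = 0
    for item_id, x in items.items():
        new_code = hash_code(W_new, x)
        old_code = codes[item_id]
        if new_code == old_code:
            continue  # entry is already in the right bucket
        table[old_code].discard(item_id)
        if not table[old_code]:
            del table[old_code]  # drop empty buckets
        table[new_code].add(item_id)
        codes[item_id] = new_code
        moved += 1
    return moved


if __name__ == "__main__":
    items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, -1.0]}
    table, codes = build_index([[1.0, 0.0]], items)
    print(reindex_changed(table, codes, [[1.0, -1.0]], items))
```

The returned count is the quantity the abstract is concerned with: the more entries move per update of the hash mapping, the more expensive frequent updates become for a large index.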
