[RBD12] Similarity Search in a Very Large Scale Using Hadoop and HBase

Laboratory report: Submission date: 2012/01/10, Number of pages: 13

Keywords: Web scale - search by content - multimedia databases

Abstract: We experimented with a solution for large-scale indexing of multimedia datasets based on an existing platform, namely Hadoop/HBase. Since this platform does not natively support indexing of multimedia descriptors in a high-dimensional space, we chose a solution that transforms the problem into a more traditional Information Retrieval approach, inspired by research carried out with our partners. We implemented the solution and conducted extensive experiments to assess its robustness in a Big Data context. The design of this solution and the results obtained so far are described in what follows.

Team: vertigo


@techreport {
title="{Similarity Search in a Very Large Scale Using Hadoop and HBase}",
author="P. Rigaux and S. Barton and V. Dohnal",
institution="{CEDRIC laboratory, CNAM-Paris, France}",