Large Scale Image Search Benchmark
Description
Quickly and accurately indexing a large collection of images has
become an important problem with many applications. Given a query image,
the goal is to retrieve matching images in the collection. We compare the
structure and properties of seven methods, some of them new, that follow the
two leading approaches: voting based on matched local descriptors, and
matching histograms of visual words. In particular, we compare:
Kd-Trees, Hierarchical K-Means, Locality Sensitive Hashing (LSH) with three
different hash functions (L2, Spherical Simplex, Spherical Orthoplex),
Inverted File, and Min-Hash. We derive theoretical estimates
of how the memory and computational cost scale with the number of images in
the database. We evaluate these properties empirically on four real-world
datasets with different statistics. We discuss the pros and cons of the
different methods and suggest promising directions for future research.
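To make the visual-word approach concrete, the following is a minimal sketch of an inverted file over bag-of-visual-words histograms. It is not taken from the toolbox: the names `InvertedFile` and `l2_normalize`, the integer visual-word IDs, and the choice of cosine similarity as the score are all assumptions made for illustration, and the sketch assumes local descriptors have already been quantized into visual words.

```python
from collections import Counter, defaultdict
import math


def l2_normalize(counts):
    """Turn raw visual-word counts into an L2-normalized sparse histogram."""
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()}


class InvertedFile:
    """Hypothetical inverted file: visual word -> list of (image_id, weight)."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add(self, image_id, visual_words):
        # Store each image only under the words it actually contains.
        hist = l2_normalize(Counter(visual_words))
        for word, weight in hist.items():
            self.postings[word].append((image_id, weight))

    def query(self, visual_words, top_k=3):
        # Accumulate the dot product of normalized histograms (cosine
        # similarity), visiting only images that share a word with the query.
        hist = l2_normalize(Counter(visual_words))
        scores = defaultdict(float)
        for word, q_weight in hist.items():
            for image_id, weight in self.postings.get(word, ()):
                scores[image_id] += q_weight * weight
        return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]


# Toy database: each image is a list of made-up visual-word IDs.
index = InvertedFile()
index.add("img_a", [3, 3, 7, 12])
index.add("img_b", [7, 8, 9])
index.add("img_c", [40, 41])

results = index.query([3, 7, 12, 12])
for image_id, score in results:
    print(image_id, round(score, 3))
```

In this toy run, `img_c` shares no visual word with the query and is never touched, which is the property that keeps query cost tied to the length of the visited posting lists rather than to the total number of images.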
Software
Caltech Large Scale Image Search Toolbox: contains our implementations of the
algorithms compared in this work.
References
- Mohamed Aly, Mario Munich, and Pietro Perona.
  Indexing in Large Scale Image Collections: Scaling Properties and Benchmark,
  IEEE Workshop on Applications of Computer Vision (WACV), Hawaii, January 2011.
  [pdf]
- Mohamed Aly, Mario Munich, and Pietro Perona.
  Indexing in Large Scale Image Collections: Scaling Properties, Parameter Tuning, and Benchmark,
  Technical Report, Caltech, USA, October 2010.
  [pdf]