Please note that my updated homepage is at mohamedaly.info

Caltech Large Scale Image Search Toolbox

Description

This Matlab package implements several algorithms used for large scale image search. The algorithms are implemented in C++ and designed for large scale databases: the package can handle millions of images and hundreds of millions of local features. It has MEX interfaces for Matlab, but can also be used (with possible future modifications) from Python and directly from C++. It can also be used for general approximate nearest neighbor search, especially with the Kd-Trees or LSH implementations. The algorithms fall into two broad categories, depending on the approach taken for image search:
  1. Bag of Words (BoW)
    The images are represented by histograms of visual words.

    It includes algorithms for computing dictionaries:
    • K-Means.
    • Approximate K-Means (AKM).
    • Hierarchical K-Means (HKM).

    It also includes algorithms for fast search:
    • Inverted File Index.
    • Inverted File Index with Extra Information (for example for implementing Hamming Embedding).
    • Min-Hash.
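
As a rough illustration of the bag-of-words pipeline listed above (not the toolbox's own API), the following pure-Python sketch quantizes local features against a given dictionary, builds an inverted file, and scores a query. All function names, the toy dictionary and features, and the use of histogram intersection as the score are assumptions made for this example; the toolbox itself is implemented in C++ with MEX interfaces.

```python
from collections import Counter, defaultdict
import math


def quantize(feature, dictionary):
    """Return the index of the visual word (dictionary centre) nearest to feature."""
    return min(range(len(dictionary)),
               key=lambda k: math.dist(feature, dictionary[k]))


def bow_histogram(features, dictionary):
    """Count how many features of one image fall on each visual word."""
    return Counter(quantize(f, dictionary) for f in features)


def build_inverted_file(histograms):
    """Map each visual word to the (image_id, count) pairs of images containing it."""
    index = defaultdict(list)
    for image_id, hist in enumerate(histograms):
        for word, count in hist.items():
            index[word].append((image_id, count))
    return index


def search(query_hist, index, num_images):
    """Score images by histogram intersection, visiting only the query's words."""
    scores = [0] * num_images
    for word, q_count in query_hist.items():
        for image_id, count in index.get(word, []):
            scores[image_id] += min(q_count, count)
    ranking = sorted(range(num_images), key=lambda i: -scores[i])
    return ranking, scores


# Toy data (hypothetical): a 3-word dictionary and two images with 2-D features.
dictionary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
images = [
    [(0.1, 0.2), (9.8, 0.1), (9.9, -0.3)],
    [(0.2, 9.7), (0.1, 10.2), (-0.2, 0.1)],
]
histograms = [bow_histogram(f, dictionary) for f in images]
index = build_inverted_file(histograms)

query_hist = bow_histogram([(10.1, 0.2), (9.7, 0.0)], dictionary)
ranking, scores = search(query_hist, index, len(images))
print(ranking, scores)
```

The point of the inverted file is that a query only visits the lists of the visual words it actually contains, so images sharing no word with the query are never touched. In practice the dictionary would come from one of the clustering algorithms listed above rather than being written by hand.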

  2. Full Representation (FR)
    The images are represented by the individual features.

    It includes algorithms for fast approximate nearest neighbor search:
    • Kd-Trees (Kdtree).
    • Hierarchical K-Means (Hkm).
    • Locality Sensitive Hashing (LSH), with several hash functions:
      • Hamming hash function (bit sampling, approximates Hamming distance) i.e. h = x[i]
      • Cosine hash function (random hyperplanes through the origin, approximates dot product) i.e. h = sign(<x,r>)
      • L1 hash function (approximates the L1 distance) i.e. h = floor((x[i]-b) / w)
      • L2 hash function (random hyperplanes with bias, approximates Euclidean distance, similar to E2LSH) i.e. h = floor((<x,r> - b) / w)
      • Spherical Simplex (approximates distances on the unit hypersphere)
      • Spherical Orthoplex (approximates distances on the unit hypersphere)
      • Spherical Hypercube (approximates distances on the unit hypersphere)
      • Binary Gaussian Kernels (approximates Gaussian kernel)
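
The four hash functions given by formula above can be written down directly. The sketch below is a minimal pure-Python rendering of those formulas, plus a toy single-table index that uses the L2 hash. It is not the toolbox's API: the function names, the Gaussian draw for r, the uniform draw for b in [0, w), and the bucket-key construction are all assumptions made for illustration.

```python
import math
import random


def dot(x, r):
    """Inner product <x, r>."""
    return sum(a * b for a, b in zip(x, r))


def hamming_hash(x, i):
    """Bit sampling: h = x[i]."""
    return x[i]


def cosine_hash(x, r):
    """Random hyperplane through the origin: h = sign(<x, r>).
    A zero projection is mapped to +1 here (an arbitrary choice)."""
    return 1 if dot(x, r) >= 0 else -1


def l1_hash(x, i, b, w):
    """Quantize one coordinate: h = floor((x[i] - b) / w)."""
    return math.floor((x[i] - b) / w)


def l2_hash(x, r, b, w):
    """Quantize a random projection: h = floor((<x, r> - b) / w)."""
    return math.floor((dot(x, r) - b) / w)


def make_l2_funcs(dim, num_funcs, w, rng):
    """Draw (r, b) pairs: r with Gaussian entries, b uniform in [0, w) (assumed)."""
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(num_funcs)]


def bucket_key(x, funcs, w):
    """Concatenate several L2 hash values into one hash-table key."""
    return tuple(l2_hash(x, r, b, w) for r, b in funcs)


# Toy index (hypothetical data): one hash table over three 2-D points.
rng = random.Random(0)
w = 4.0
funcs = make_l2_funcs(dim=2, num_funcs=3, w=w, rng=rng)
points = [(0.0, 0.0), (0.1, -0.1), (50.0, 50.0)]

table = {}
for pid, p in enumerate(points):
    table.setdefault(bucket_key(p, funcs, w), []).append(pid)

# Candidates for a query are the points stored in the query's bucket.
candidates = table.get(bucket_key(points[0], funcs, w), [])
print(candidates)
```

Nearby points tend to fall in the same bucket and distant points in different ones, but a single table gives no guarantee either way, which is why LSH schemes typically use several tables and then rank the retrieved candidates by exact distance.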

News

  • November 5, 2010: Version 1.0.

Download

Caltech Large Scale Image Search [From Google Code (zip) or Local (zip) 144 KB]

Source Code

More Information

The Large Scale Image Search Benchmark Project page has more information.

References

  1. Mohamed Aly, Mario Munich, and Pietro Perona. Indexing in Large Scale Image Collections: Scaling Properties and Benchmark, IEEE Workshop on Applications of Computer Vision (WACV), Hawaii, January 2011. [pdf]
  2. Mohamed Aly, Mario Munich, and Pietro Perona. Indexing in Large Scale Image Collections: Scaling Properties, Parameter Tuning, and Benchmark, Technical Report, Caltech, USA, October 2010. [pdf]