LSH

java.lang.Object
- smile.neighbor.LSH<E>

Type Parameters:

E - the type of data objects in the hash table.

All Implemented Interfaces:

KNNSearch<double[],E>, NearestNeighborSearch<double[],E>, RNNSearch<double[],E>
```
public class LSH<E>
extends java.lang.Object
implements NearestNeighborSearch<double[],E>, KNNSearch<double[],E>, RNNSearch<double[],E>
```
Locality-Sensitive Hashing. LSH is an efficient algorithm for approximate nearest neighbor search in high-dimensional spaces that works by performing probabilistic dimension reduction of the data. The basic idea is to hash the input items so that similar items are mapped to the same buckets with high probability (the number of buckets being much smaller than the universe of possible input items).
By default, the query object itself (as determined by reference equality) is excluded from the neighborhood. You may change this behavior with setIdenticalExcluded. Note that this can produce surprising results with String objects, because the JVM pools string literals. The variables `String a = "ABC"; String b = "ABC"; String c = "AB" + "C";` all refer to the same object, so both `a == b` and `b == c` are true. With toy data typed explicitly in the code, separate items with equal values may therefore be treated as the query object and excluded. In production this is rarely a problem, because the data is typically read from secondary storage.
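The string pooling behavior described above can be checked with plain Java, independent of this class. The following is a minimal sketch; the class name `StringPoolDemo` and its method names are hypothetical and chosen only for illustration.

```java
final class StringPoolDemo {
    /** Returns true when all three references point to the same pooled String. */
    static boolean literalsShareReference() {
        String a = "ABC";
        String b = "ABC";
        String c = "AB" + "C"; // compile-time constant, folded to "ABC" and pooled
        return a == b && b == c;
    }

    /** Returns true when an explicitly constructed String is a distinct object with equal content. */
    static boolean constructedIsDistinct() {
        String a = "ABC";
        String d = new String("ABC"); // a new object, not the pooled literal
        return a != d && a.equals(d);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareReference()); // prints true
        System.out.println(constructedIsDistinct());  // prints true
    }
}
```

Strings created at run time, such as those parsed from a file, are generally distinct objects, which is why the issue mostly affects hand-typed test data.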

References
1. Alexandr Andoni and Piotr Indyk. Near-Optimal Hashing Algorithms for Near Neighbor Problem in High Dimensions. FOCS, 2006.
2. Alexandr Andoni, Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab Mirrokni. Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. 2004.
See Also:

MPLSH

Constructor Summary

Constructors
Constructor and Description
`LSH(double[][] keys, E[] data)` Constructor.
`LSH(double[][] keys, E[] data, double w)` Constructor.
`LSH(double[][] keys, E[] data, double w, int H)` Constructor.
`LSH(int d, int L, int k)` Constructor.
`LSH(int d, int L, int k, double w)` Constructor.
`LSH(int d, int L, int k, double w, int H)` Constructor.

Method Summary

Modifier and Type	Method and Description
`boolean`	`isIdenticalExcluded()` Returns whether the query object itself is excluded from the neighborhood.
`Neighbor<double[],E>[]`	`knn(double[] q, int k)` Search the k nearest neighbors to the query.
`Neighbor<double[],E>`	`nearest(double[] q)` Search the nearest neighbor to the given sample.
`void`	`put(double[] key, E value)` Insert an item into the hash table.
`void`	`range(double[] q, double radius, java.util.List<Neighbor<double[],E>> neighbors)` Search the neighbors in the given radius of query object, i.e. d(q, v) ≤ radius.
`LSH`	`setIdenticalExcluded(boolean excluded)` Sets whether the query object itself is excluded from the neighborhood.
`java.lang.String`	`toString()`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - LSH
```
public LSH(double[][] keys,
           E[] data)
```
    Constructor.
    
    Parameters:
    
    keys - the keys of data objects.
    
    data - the data objects.
  - LSH
```
public LSH(double[][] keys,
           E[] data,
           double w)
```
    Constructor.
    
    Parameters:
    
    keys - the keys of data objects.
    
    data - the data objects.
    
    w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
  - LSH
```
public LSH(double[][] keys,
           E[] data,
           double w,
           int H)
```
    Constructor.
    
    Parameters:
    
    keys - the keys of data objects.
    
    data - the data objects.
    
    w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
    
    H - the size of universal hash tables.
  - LSH
```
public LSH(int d,
           int L,
           int k)
```
    Constructor.
    
    Parameters:
    
    d - the dimensionality of data.
    
    L - the number of hash tables.
    
    k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
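As a rough illustration of the log(N) rule of thumb for k, the sketch below computes a suggested value from the dataset size. The helper `suggestK` is hypothetical and not part of this class, and the use of the natural logarithm (rounded up) is an assumption, since the base is not stated here.

```java
final class HashCountHeuristic {
    /** Suggests k = ceil(ln N), never less than 1. The natural log is an assumption. */
    static int suggestK(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("dataset size must be positive");
        }
        return Math.max(1, (int) Math.ceil(Math.log(n)));
    }

    public static void main(String[] args) {
        System.out.println(suggestK(1_000));     // prints 7
        System.out.println(suggestK(1_000_000)); // prints 14
    }
}
```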
  - LSH
```
public LSH(int d,
           int L,
           int k,
           double w)
```
    Constructor.
    
    Parameters:
    
    d - the dimensionality of data.
    
    L - the number of hash tables.
    
    k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
    
    w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
  - LSH
```
public LSH(int d,
           int L,
           int k,
           double w,
           int H)
```
    Constructor.
    
    Parameters:
    
    d - the dimensionality of data.
    
    L - the number of hash tables.
    
    k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
    
    w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
    
    H - the size of universal hash tables.
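To make the roles of d, k, and w more concrete, here is a minimal sketch of one group of k random projection hash functions in the style of the p-stable scheme from the references, h(x) = floor((a · x + b) / w). It is an illustration under stated assumptions (Gaussian projection vectors, offsets drawn uniformly from [0, w)), not the implementation used by this class; the class name `ProjectionHashSketch` and the `seed` parameter are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative group of k random projection hashes; not the Smile implementation. */
final class ProjectionHashSketch {
    private final double[][] a; // k Gaussian projection directions, each of dimension d
    private final double[] b;   // k offsets drawn uniformly from [0, w)
    private final double w;     // bucket width

    ProjectionHashSketch(int d, int k, double w, long seed) {
        Random rng = new Random(seed);
        this.w = w;
        this.a = new double[k][d];
        this.b = new double[k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < d; j++) {
                a[i][j] = rng.nextGaussian();
            }
            b[i] = rng.nextDouble() * w;
        }
    }

    /** Returns the k bucket indices floor((a_i . x + b_i) / w). */
    int[] hash(double[] x) {
        int[] h = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            double projection = b[i];
            for (int j = 0; j < x.length; j++) {
                projection += a[i][j] * x[j];
            }
            h[i] = (int) Math.floor(projection / w);
        }
        return h;
    }

    public static void main(String[] args) {
        ProjectionHashSketch hashes = new ProjectionHashSketch(3, 4, 4.0, 42L);
        double[] p = {1.0, 2.0, 3.0};
        double[] near = {1.01, 2.0, 3.0};
        // Nearby points tend to agree on most or all of the k indices.
        System.out.println(Arrays.toString(hashes.hash(p)));
        System.out.println(Arrays.toString(hashes.hash(near)));
    }
}
```

In this sketch, a wider w places more points in each bucket, so a query has more candidates to examine, which matches the note that a large w increases query time. A very small w separates even close points, which hurts recall.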
- Method Detail
  - toString
```
public java.lang.String toString()
```
    Overrides:
    
    toString in class java.lang.Object
  - isIdenticalExcluded
```
public boolean isIdenticalExcluded()
```
    Returns whether the query object itself is excluded from the neighborhood.
  - setIdenticalExcluded
```
public LSH setIdenticalExcluded(boolean excluded)
```
    Sets whether the query object itself is excluded from the neighborhood.
  - put
```
public void put(double[] key,
                E value)
```
    Insert an item into the hash table.
  - nearest
```
public Neighbor<double[],E> nearest(double[] q)
```
    Description copied from interface: NearestNeighborSearch
    
    Search the nearest neighbor to the given sample.
    
    Specified by:
    
    nearest in interface NearestNeighborSearch<double[],E>
    
    Parameters:
    
    q - the query key.
    
    Returns:
    
    the nearest neighbor
  - knn
```
public Neighbor<double[],E>[] knn(double[] q,
                                  int k)
```
    Description copied from interface: KNNSearch
    
    Search the k nearest neighbors to the query.
    
    Specified by:
    
    knn in interface KNNSearch<double[],E>
    
    Parameters:
    
    q - the query key.
    
    k - the number of nearest neighbors to search for.
  - range
```
public void range(double[] q,
                  double radius,
                  java.util.List<Neighbor<double[],E>> neighbors)
```
    Description copied from interface: RNNSearch
    
    Search the neighbors in the given radius of query object, i.e. d(q, v) ≤ radius.
    
    Specified by:
    
    range in interface RNNSearch<double[],E>
    
    Parameters:
    
    q - the query key.
    
    radius - the radius of search range from target.
    
    neighbors - the list to store found neighbors in the given range on output.
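Because LSH is approximate, it can be useful to compare its range results with an exact linear scan on small data. The sketch below shows the predicate d(q, v) ≤ radius together with exclusion of the query object by reference equality. It assumes Euclidean distance, and the class `RangeBaselineSketch` and its methods are hypothetical helpers, not part of this API.

```java
import java.util.ArrayList;
import java.util.List;

/** Exact linear-scan baseline for range search; for comparison only. */
final class RangeBaselineSketch {
    /** Euclidean distance between two vectors of equal length (an assumed metric). */
    static double distance(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Returns indices of keys with d(q, key) <= radius, optionally skipping q itself. */
    static List<Integer> range(double[][] keys, double[] q, double radius, boolean excludeIdentical) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            if (excludeIdentical && keys[i] == q) {
                continue; // same object as the query (reference equality)
            }
            if (distance(q, keys[i]) <= radius) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        double[][] keys = {{0, 0}, {3, 4}, {6, 8}};
        System.out.println(range(keys, keys[0], 5.0, true));             // prints [1]
        System.out.println(range(keys, keys[0], 5.0, false));            // prints [0, 1]
        System.out.println(range(keys, new double[] {0, 0}, 5.0, true)); // prints [0, 1]
    }
}
```

The last call shows that a copy of the query with equal values is not excluded, since only the identical object is skipped.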
