E - the type of data objects in the hash table.
public class LSH<E> extends java.lang.Object implements NearestNeighborSearch<double[],E>, KNNSearch<double[],E>, RNNSearch<double[],E>
By default, the query object (compared by reference equality) is excluded from the neighborhood.
You may change this behavior with setIdenticalExcluded. Note that this can produce
surprising results with String objects, because the JVM pools string literals.
The variables below
String a = "ABC";
String b = "ABC";
String c = "AB" + "C";
all refer to the same object, so the reference tests a == b and b == c are both true.
With toy data typed explicitly in the code, this can exclude more items than intended.
In production this is rarely an issue, because the data is usually read from secondary storage.
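The pooling behavior described above can be checked with a few lines of plain Java. This is a minimal sketch that does not use the LSH class at all; the class name `StringPoolDemo` and the helper `sameReference` are hypothetical names introduced only for illustration.

```java
// Hypothetical demo class; it only illustrates JVM string literal pooling.
final class StringPoolDemo {

    // Reference comparison, the same kind of test the description refers to.
    static boolean sameReference(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "ABC";
        String b = "ABC";
        String c = "AB" + "C";          // compile-time constant, folded to "ABC"
        String d = new String("ABC");   // explicitly allocates a distinct object

        System.out.println(sameReference(a, b)); // pooled literals
        System.out.println(sameReference(b, c)); // pooled constant expression
        System.out.println(sameReference(a, d)); // distinct object, equal content
        System.out.println(a.equals(d));         // content comparison
    }
}
```

Strings built at run time (for example with `new String(...)` or read from a file) are distinct objects unless they are interned, which is why the issue mostly shows up with hand-typed test data.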
See Also: MPLSH
Constructor and Description |
---|
LSH(double[][] keys, E[] data) Constructor. |
LSH(double[][] keys, E[] data, double w) Constructor. |
LSH(double[][] keys, E[] data, double w, int H) Constructor. |
LSH(int d, int L, int k) Constructor. |
LSH(int d, int L, int k, double w) Constructor. |
LSH(int d, int L, int k, double w, int H) Constructor. |
Modifier and Type | Method and Description |
---|---|
boolean | isIdenticalExcluded() Returns whether the query object itself is excluded from the neighborhood. |
Neighbor<double[],E>[] | knn(double[] q, int k) Search the k nearest neighbors to the query. |
Neighbor<double[],E> | nearest(double[] q) Search the nearest neighbor to the given sample. |
void | put(double[] key, E value) Insert an item into the hash table. |
void | range(double[] q, double radius, java.util.List<Neighbor<double[],E>> neighbors) Search the neighbors within the given radius of the query object. |
LSH | setIdenticalExcluded(boolean excluded) Sets whether the query object itself is excluded from the neighborhood. |
java.lang.String | toString() |
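To make the effect of the identical-exclusion flag concrete, here is a minimal sketch of a nearest-neighbor query that skips the query object by reference. It is a plain linear scan, not locality-sensitive hashing, and it does not call this class. The class `IdenticalExclusionSketch` and its methods are hypothetical, and the assumption that the exclusion test compares the key array by reference is taken from the class description rather than from the implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: brute-force nearest neighbor with reference-based exclusion.
final class IdenticalExclusionSketch {

    // Returns the index of the key nearest to q, or -1 if no key qualifies.
    static int nearest(List<double[]> keys, double[] q, boolean identicalExcluded) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < keys.size(); i++) {
            double[] key = keys.get(i);
            // Assumption: exclusion uses reference equality, not content equality.
            if (identicalExcluded && key == q) continue;
            double dist = squaredDistance(key, q);
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    static double squaredDistance(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<double[]> keys = new ArrayList<>();
        double[] a = {0.0, 0.0};
        keys.add(a);
        keys.add(new double[] {1.0, 0.0});
        keys.add(new double[] {5.0, 5.0});

        System.out.println(nearest(keys, a, true));          // a itself is skipped
        System.out.println(nearest(keys, a, false));         // a is its own nearest neighbor
        System.out.println(nearest(keys, a.clone(), true));  // a copy is a different object
    }
}
```

Querying with a stored key returns the next closest item when exclusion is on, while a copy of the same key still matches the original, since only the reference is compared.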
public LSH(double[][] keys, E[] data)
keys - the keys of data objects.
data - the data objects.
public LSH(double[][] keys, E[] data, double w)
keys - the keys of data objects.
data - the data objects.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
public LSH(double[][] keys, E[] data, double w, int H)
keys - the keys of data objects.
data - the data objects.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
H - the size of universal hash tables.
public LSH(int d, int L, int k)
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
public LSH(int d, int L, int k, double w)
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
public LSH(int d, int L, int k, double w, int H)
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
H - the size of universal hash tables.
public java.lang.String toString()
toString in class java.lang.Object
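The constructor parameters w and k documented above are easier to reason about with a concrete hash function in hand. The sketch below assumes the common random projection form h(x) = floor((a·x + b) / w), with a Gaussian direction a and an offset b drawn from [0, w); this page does not state that LSH uses exactly this formula, so treat it as an illustration only. The class `RandomProjectionHashSketch`, its seed parameter, and the helper `suggestedK` (which assumes the natural logarithm for log(N)) are hypothetical.

```java
import java.util.Random;

// Hypothetical illustration of one random projection hash function with bucket width w.
final class RandomProjectionHashSketch {
    private final double[] a; // random Gaussian projection direction
    private final double b;   // random offset in [0, w)
    private final double w;   // bucket width

    RandomProjectionHashSketch(int d, double w, long seed) {
        if (w <= 0.0) throw new IllegalArgumentException("w must be positive");
        Random rng = new Random(seed);
        this.a = new double[d];
        for (int i = 0; i < d; i++) {
            a[i] = rng.nextGaussian();
        }
        this.w = w;
        this.b = rng.nextDouble() * w;
    }

    // Bucket index of x: project onto a, shift by b, divide into cells of width w.
    int hash(double[] x) {
        double dot = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * x[i];
        }
        return (int) Math.floor((dot + b) / w);
    }

    // Rule of thumb from the parameter description: k is about log(N).
    // Assumption: natural logarithm, rounded, and at least 1.
    static int suggestedK(int n) {
        return Math.max(1, (int) Math.round(Math.log(n)));
    }

    public static void main(String[] args) {
        double[] x = {0.2, 0.4, 0.6};
        double[] y = {0.25, 0.35, 0.65};
        for (double w : new double[] {0.01, 1.0, 100.0}) {
            RandomProjectionHashSketch h = new RandomProjectionHashSketch(3, w, 42L);
            System.out.println("w=" + w + " same bucket: " + (h.hash(x) == h.hash(y)));
        }
        System.out.println("suggested k for N=1000: " + suggestedK(1000));
    }
}
```

With a very small w almost every point gets its own bucket, so true neighbors are missed. With a very large w most points share a bucket, so each query has to scan many candidates, which is the increase in query time the parameter description warns about.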
public boolean isIdenticalExcluded()
public LSH setIdenticalExcluded(boolean excluded)
public void put(double[] key, E value)
public Neighbor<double[],E> nearest(double[] q)
NearestNeighborSearch
nearest in interface NearestNeighborSearch<double[],E>
q - the query key.
public Neighbor<double[],E>[] knn(double[] q, int k)
KNNSearch