E - the type of data objects in the hash table.

public class MPLSH<E> extends java.lang.Object implements NearestNeighborSearch<double[],E>, KNNSearch<double[],E>, RNNSearch<double[],E>
By default, the query object (identified by reference equality) is excluded from the neighborhood. You may change this behavior with setIdenticalExcluded. Note that you may observe unexpected behavior with String objects, because the JVM pools string literals. The variables below
String a = "ABC";
String b = "ABC";
String c = "AB" + "C";
all refer to the same object, so a == b and b == c are both true. With toy data typed explicitly in the code, a pooled data object may therefore be mistaken for the query object and excluded from the results. In production, data is typically read from secondary storage, so the issue rarely arises there.
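The pooling behavior described above can be checked with a small standalone program. This is only an illustrative sketch: the class name StringPoolDemo and the helper sameReference are hypothetical and are not part of MPLSH.

```java
/** Hypothetical demo class; it only illustrates reference equality on pooled strings. */
class StringPoolDemo {
    /** Reference-equality test, the kind of check the text says identifies the query object. */
    static boolean sameReference(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "ABC";
        String b = "ABC";
        String c = "AB" + "C";          // compile-time constant, folded to "ABC" and pooled
        String d = new String("ABC");   // explicitly allocates a distinct object

        System.out.println(sameReference(a, b)); // true: both literals come from the pool
        System.out.println(sameReference(a, c)); // true: the constant expression is pooled too
        System.out.println(sameReference(a, d)); // false: d is a separate instance
        System.out.println(a.equals(d));         // true: same characters, different objects
    }
}
```

Objects built at run time, such as d above or values parsed from a file, are distinct instances, which is why the effect is mostly seen with literals typed into test code.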
See Also: LSH
| Constructor | Description |
|---|---|
| MPLSH(int d, int L, int k, double r) | Constructor. |
| MPLSH(int d, int L, int k, double r, int H) | Constructor. |
| Modifier and Type | Method | Description |
|---|---|---|
| boolean | isIdenticalExcluded() | Returns whether the query object itself is excluded from the neighborhood. |
| Neighbor<double[],E>[] | knn(double[] q, int k) | Search the k nearest neighbors to the query. |
| Neighbor<double[],E>[] | knn(double[] q, int k, double recall, int T) | Returns the approximate k-nearest neighbors. |
| void | learn(RNNSearch<double[],double[]> range, double[][] samples, double radius) | Train the posteriori multiple probe algorithm. |
| void | learn(RNNSearch<double[],double[]> range, double[][] samples, double radius, int Nz) | Train the posteriori multiple probe algorithm. |
| void | learn(RNNSearch<double[],double[]> range, double[][] samples, double radius, int Nz, double sigma) | Train the posteriori multiple probe algorithm. |
| Neighbor<double[],E> | nearest(double[] q) | Search the nearest neighbor to the given sample. |
| Neighbor<double[],E> | nearest(double[] q, double recall, int T) | Returns the approximate nearest neighbor. |
| void | put(double[] key, E value) | Insert an item into the hash table. |
| void | range(double[] q, double radius, java.util.List<Neighbor<double[],E>> neighbors) | Search the neighbors within the given radius of the query object. |
| void | range(double[] q, double radius, java.util.List<Neighbor<double[],E>> neighbors, double recall, int T) | Search the neighbors within the given radius of the query object. |
| MPLSH | setIdenticalExcluded(boolean excluded) | Sets whether the query object itself is excluded from the neighborhood. |
| java.lang.String | toString() | |
public MPLSH(int d, int L, int k, double r)
Parameters:
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
r - the width of random projections. It should be sufficiently far from 0, but not too large, because a large r increases the query time.

public MPLSH(int d, int L, int k, double r, int H)
Parameters:
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
r - the width of random projections. It should be sufficiently far from 0, but not too large, because a large r increases the query time.
H - the number of buckets of hash tables.

public java.lang.String toString()
Overrides: toString in class java.lang.Object

public boolean isIdenticalExcluded()
public MPLSH setIdenticalExcluded(boolean excluded)
public void put(double[] key, E value)
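To make the roles of the constructor parameters d, k and r more concrete, the following sketch hashes a key with k random projections of width r and stores the value in a single bucket map. It assumes the common p-stable scheme h(x) = floor((a . x + b) / r); this page does not state which hash family MPLSH uses. The class ProjectionTableSketch, the methods suggestK and bucketOf, the seed argument and the choice of the natural logarithm are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical single-table sketch; not the MPLSH implementation. */
class ProjectionTableSketch<E> {
    private final double[][] a;   // k random Gaussian projection directions of dimension d
    private final double[] b;     // k random offsets drawn uniformly from [0, r)
    private final double r;       // projection width
    private final Map<String, List<E>> buckets = new HashMap<>();

    ProjectionTableSketch(int d, int k, double r, long seed) {
        Random rng = new Random(seed);
        this.r = r;
        this.a = new double[k][d];
        this.b = new double[k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < d; j++) {
                a[i][j] = rng.nextGaussian();
            }
            b[i] = rng.nextDouble() * r;
        }
    }

    /** Rule of thumb from the text: k is about log(N). The natural log is an assumption. */
    static int suggestK(int n) {
        return Math.max(1, (int) Math.round(Math.log(n)));
    }

    /** Computes the k-component bucket signature floor((a_i . x + b_i) / r). */
    int[] hash(double[] x) {
        int[] h = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            double dot = b[i];
            for (int j = 0; j < x.length; j++) {
                dot += a[i][j] * x[j];
            }
            h[i] = (int) Math.floor(dot / r);
        }
        return h;
    }

    /** Appends the value to the bucket addressed by the key's signature. */
    void put(double[] key, E value) {
        buckets.computeIfAbsent(Arrays.toString(hash(key)), s -> new ArrayList<>()).add(value);
    }

    /** Returns the values stored in the bucket that q hashes to (possibly empty). */
    List<E> bucketOf(double[] q) {
        return buckets.getOrDefault(Arrays.toString(hash(q)), new ArrayList<>());
    }

    public static void main(String[] args) {
        ProjectionTableSketch<String> table =
                new ProjectionTableSketch<>(2, suggestK(1000), 4.0, 42L);
        double[] key = {1.0, 2.0};
        table.put(key, "first");
        System.out.println(table.bucketOf(key)); // a key always lands in its own bucket
    }
}
```

A larger r makes each bucket cover more of the space, so more candidates share a bucket and have to be compared at query time, which is consistent with the warning about choosing r too large.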
public void learn(RNNSearch<double[],double[]> range, double[][] samples, double radius)
Parameters:
range - the neighborhood search data structure.
samples - the training samples.
radius - the radius for range search.

public void learn(RNNSearch<double[],double[]> range, double[][] samples, double radius, int Nz)
Parameters:
range - the neighborhood search data structure.
samples - the training samples.
radius - the radius for range search.
Nz - the number of quantized values.

public void learn(RNNSearch<double[],double[]> range, double[][] samples, double radius, int Nz, double sigma)
Parameters:
range - the neighborhood search data structure.
samples - the training samples.
radius - the radius for range search.
Nz - the number of quantized values.
sigma - the Parzen window width.

public Neighbor<double[],E> nearest(double[] q)
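Description copied from interface: NearestNeighborSearch
Search the nearest neighbor to the given sample.
Specified by: nearest in interface NearestNeighborSearch<double[],E>
Parameters:
q - the query key.

public Neighbor<double[],E> nearest(double[] q, double recall, int T)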
Parameters:
q - the query object.
recall - the expected recall rate.
T - the maximum number of probes.

public Neighbor<double[],E>[] knn(double[] q, int k)
Description copied from interface: KNNSearch
Search the k nearest neighbors to the query.

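For comparison with the hashed search, an exact k-nearest-neighbor query can be written as a linear scan. The sketch below is such a baseline, not the MPLSH algorithm, and it also shows how excluding the query object by reference equality changes the result. The class LinearKnnSketch, its methods and the identicalExcluded flag are hypothetical names; Euclidean distance is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical linear-scan baseline; not the MPLSH implementation. */
class LinearKnnSketch {
    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Returns the indices of the k keys closest to q, nearest first. */
    static List<Integer> knn(double[][] keys, double[] q, int k, boolean identicalExcluded) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            // Reference comparison on purpose: an equal-valued copy of q stays a candidate.
            if (identicalExcluded && keys[i] == q) {
                continue;
            }
            candidates.add(i);
        }
        candidates.sort(Comparator.comparingDouble(i -> distance(keys[i], q)));
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }

    public static void main(String[] args) {
        double[][] keys = {{0, 0}, {1, 0}, {0, 3}, {5, 5}};
        System.out.println(knn(keys, keys[0], 2, true));  // [1, 2]: the query itself is skipped
        System.out.println(knn(keys, keys[0], 2, false)); // [0, 1]: the query is its own nearest neighbor
    }
}
```

A scan like this costs time proportional to the dataset size per query, which is the cost the hash-based search is meant to avoid at the price of approximate results.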
public Neighbor<double[],E>[] knn(double[] q, int k, double recall, int T)
Parameters:
q - the query object.
k - the number of nearest neighbors to search for.
recall - the expected recall rate.
T - the maximum number of probes.

public void range(double[] q, double radius, java.util.List<Neighbor<double[],E>> neighbors)
Description copied from interface: RNNSearch
Search the neighbors within the given radius of the query object.

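The range method reports its results through a caller-supplied list instead of a return value. The brute-force sketch below illustrates that calling pattern only; it is not the MPLSH algorithm. The class LinearRangeSketch is hypothetical, the list holds key indices instead of Neighbor objects to keep the example self-contained, and treating the radius as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical linear-scan range search; not the MPLSH implementation. */
class LinearRangeSketch {
    /** Appends to neighbors the index of every key within radius of q, skipping q itself. */
    static void range(double[][] keys, double[] q, double radius, List<Integer> neighbors) {
        for (int i = 0; i < keys.length; i++) {
            if (keys[i] == q) {
                continue; // default behavior in the text: the query object is excluded
            }
            double sum = 0.0;
            for (int j = 0; j < q.length; j++) {
                double diff = keys[i][j] - q[j];
                sum += diff * diff;
            }
            if (Math.sqrt(sum) <= radius) { // inclusive boundary is an assumption
                neighbors.add(i);
            }
        }
    }

    public static void main(String[] args) {
        double[][] keys = {{0, 0}, {1, 0}, {0, 3}, {5, 5}};
        List<Integer> neighbors = new ArrayList<>();
        range(keys, keys[0], 3.5, neighbors);
        System.out.println(neighbors); // [1, 2]
    }
}
```

Because the method appends to the list, a caller that reuses the list across queries needs to clear it between calls.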
public void range(double[] q, double radius, java.util.List<Neighbor<double[],E>> neighbors, double recall, int T)
Parameters:
q - the query object.
radius - the radius of search range.
neighbors - the list in which the neighbors found within the given range are stored on output.
recall - the expected recall rate.
T - the maximum number of probes.
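The parameter T caps how many buckets a query inspects. As a simplified illustration of that cap, the sketch below lists the home bucket first and then buckets whose signature differs from it by one step in a single component, stopping after T entries. This fixed ordering is an assumption made for the example; it does not model the recall parameter or the probe ordering trained by learn. The class ProbeSequenceSketch and the method probes are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical probe enumeration; not the MPLSH probing strategy. */
class ProbeSequenceSketch {
    /**
     * Lists at most T bucket signatures to inspect: the home bucket first, then
     * signatures that differ from it by -1 or +1 in exactly one component.
     */
    static List<int[]> probes(int[] home, int T) {
        List<int[]> out = new ArrayList<>();
        if (T <= 0) {
            return out;
        }
        out.add(home.clone());
        for (int i = 0; i < home.length && out.size() < T; i++) {
            for (int delta = -1; delta <= 1 && out.size() < T; delta += 2) {
                int[] p = home.clone();
                p[i] += delta; // step to the adjacent bucket along component i
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (int[] p : probes(new int[]{3, 7}, 4)) {
            System.out.println(Arrays.toString(p)); // [3, 7], [2, 7], [4, 7], [3, 6]
        }
    }
}
```

Raising T lets a query look at more buckets per table, which trades query time for a better chance of finding true neighbors.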