Class

com.soundcloud.lsh

Lsh

class Lsh extends Joiner with Serializable

LSH implementation as described in 'Randomized Algorithms and NLP: Using Locality Sensitive Hash Function for High Speed Noun Clustering' by Ravichandran et al. See the original publication for a detailed description of the parameters.

See also

http://dl.acm.org/citation.cfm?id=1219917

Linear Supertypes
Serializable, Serializable, Joiner, AnyRef, Any

Instance Constructors

  1. new Lsh(minCosineSimilarity: Double, dimensions: Int, numNeighbours: Int, numPermutations: Int, partitions: Int = 200, storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK)

    minCosineSimilarity

    minimum cosine similarity two items must have; pairs below this value are discarded from the result set

    dimensions

    number of random vectors (hyperplanes) used to generate the bit vectors; each bit vector has length d, one bit per hyperplane

    numNeighbours

    beam factor, i.e. how many neighbours are considered in the sliding window

    numPermutations

    number of times bitsets are permuted
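
As a rough illustration of what the dimensions parameter controls, the following plain-Scala sketch builds a bit signature from random hyperplanes: one bit per hyperplane, set when the vector falls on the non-negative side. The names SignatureSketch, randomHyperplanes, signature and vectorLength are hypothetical, and drawing the hyperplanes from a Gaussian distribution is an assumption based on the cited paper, not a statement about this class's internals.

```scala
import scala.collection.immutable.BitSet
import scala.util.Random

object SignatureSketch {
  // One random hyperplane per signature bit; `dimensions` plays the role of d.
  // Assumption: hyperplane components are drawn from a standard Gaussian.
  def randomHyperplanes(dimensions: Int, vectorLength: Int, rng: Random): Vector[Array[Double]] =
    Vector.fill(dimensions)(Array.fill(vectorLength)(rng.nextGaussian()))

  // Bit i is set when the vector lies on the non-negative side of hyperplane i.
  def signature(vector: Array[Double], hyperplanes: Vector[Array[Double]]): BitSet = {
    val setBits = hyperplanes.indices.filter { i =>
      val dot = hyperplanes(i).zip(vector).map { case (h, x) => h * x }.sum
      dot >= 0.0
    }
    BitSet(setBits: _*)
  }
}
```

Two vectors separated by a small angle tend to fall on the same side of most hyperplanes, so their signatures differ in few bits. A larger value of dimensions gives a finer estimate at the cost of longer signatures.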

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. def createSlidingWindow(signatures: RDD[Signature], b: Int): RDD[Array[Signature]]

    Creates a sliding window of size b over the signatures
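
The idea of a sliding window can be shown on a local collection. This is a minimal sketch, assuming the window groups every run of b consecutive elements of an already ordered sequence; SlidingWindowSketch and slidingWindows are hypothetical names, and the method documented here operates on an RDD rather than a Seq.

```scala
object SlidingWindowSketch {
  // Groups every run of b consecutive elements of an already ordered sequence.
  def slidingWindows[A](ordered: Seq[A], b: Int): List[Seq[A]] =
    ordered.sliding(b).toList
}
```

For four ordered items and b = 2 this yields three overlapping windows, so each item is only compared with items close to it in the ordering.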

  7. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. def findNeighbours(signatures: RDD[Array[Signature]], minCosineSimilarity: Double): RDD[MatrixEntry]

  11. def generatePermutation(size: Int): Iterable[Int]

    Generates a random permutation of the given size
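
One common way to produce such a permutation is to shuffle the index range. The sketch below assumes the permutation ranges over the indices 0 until size; PermutationSketch and randomPermutation are hypothetical names, and the class may generate its permutations differently.

```scala
import scala.util.Random

object PermutationSketch {
  // Returns the indices 0 until size in random order.
  def randomPermutation(size: Int, rng: Random): IndexedSeq[Int] =
    rng.shuffle((0 until size).toVector)
}
```

Passing an explicitly seeded Random keeps the permutation reproducible between runs.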

  12. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  13. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  14. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  15. def join(inputMatrix: IndexedRowMatrix): CoordinateMatrix

    Finds, for every object in a data set, its k nearest neighbors among the other objects in the same data set. Implementations may be either exact or approximate.

    returns

    a similarity matrix with MatrixEntry(itemA, itemB, similarity).

    Definition Classes
    Lsh → Joiner
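
To make the shape of the result concrete, here is a sketch of an exact (brute-force) join over a small local data set. It is not the approximate algorithm this class implements, and it keeps every pair above a threshold rather than selecting the top k. ExactJoinSketch, Entry, cosine and join are hypothetical names; Entry merely stands in for MatrixEntry(itemA, itemB, similarity).

```scala
object ExactJoinSketch {
  // Hypothetical stand-in for MatrixEntry(itemA, itemB, similarity).
  final case class Entry(itemA: Long, itemB: Long, similarity: Double)

  // Exact cosine similarity of two vectors of equal length.
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    dot / norms
  }

  // Exact all-pairs join: quadratic in the number of rows.
  def join(rows: Seq[(Long, Array[Double])], minCosineSimilarity: Double): Seq[Entry] =
    rows.combinations(2).toSeq.flatMap { pair =>
      val (idA, vecA) = pair(0)
      val (idB, vecB) = pair(1)
      val similarity = cosine(vecA, vecB)
      if (similarity > minCosineSimilarity) Some(Entry(idA, idB, similarity)) else None
    }
}
```

The quadratic cost of comparing all pairs is what an approximate implementation tries to avoid.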
  16. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  17. def neighbours(signatures: Array[Signature], minCosineSimilarity: Double): Iterator[MatrixEntry]

    Generates all pairs and emits those whose cosine similarity is greater than minCosineSimilarity
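
The pair generation and filtering can be sketched in plain Scala as follows. The cosine is estimated from the Hamming distance h of two d-bit signatures as cos(π · h / d), which is the estimate used in the cited paper; whether this method uses that estimate or an exact cosine is an assumption. NeighboursSketch, Sig, Entry and estimatedCosine are hypothetical names standing in for the library's own types.

```scala
import scala.collection.immutable.BitSet

object NeighboursSketch {
  // Hypothetical stand-ins for the library's Signature and MatrixEntry types.
  final case class Sig(index: Long, bits: BitSet)
  final case class Entry(i: Long, j: Long, value: Double)

  // Estimated cosine from the Hamming distance of two d-bit signatures.
  def estimatedCosine(a: BitSet, b: BitSet, d: Int): Double =
    math.cos(math.Pi * (a ^ b).size / d)

  // Emits every pair in the window whose estimated cosine exceeds the threshold.
  def neighbours(window: Array[Sig], d: Int, minCosineSimilarity: Double): Iterator[Entry] =
    window.combinations(2).flatMap { pair =>
      val cosine = estimatedCosine(pair(0).bits, pair(1).bits, d)
      if (cosine > minCosineSimilarity) Some(Entry(pair(0).index, pair(1).index, cosine)) else None
    }
}
```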

  18. final def notify(): Unit

    Definition Classes
    AnyRef
  19. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  20. def orderByBitSet(signatures: RDD[Signature]): RDD[Signature]

    Orders an RDD of signatures by their bit set representation
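
One possible ordering is lexicographic on the bits, sketched here for a local sequence. This is an assumption: the exact ordering used by the method is not stated on this page. OrderingSketch, bitString and orderByBits are hypothetical names.

```scala
import scala.collection.immutable.BitSet

object OrderingSketch {
  // Renders the first d bits as a string of '0' and '1', lowest index first.
  def bitString(bits: BitSet, d: Int): String =
    (0 until d).map(i => if (bits(i)) '1' else '0').mkString

  // Lexicographic order on the bit strings puts signatures with a long
  // shared prefix next to each other.
  def orderByBits(signatures: Seq[BitSet], d: Int): Seq[BitSet] =
    signatures.sortBy(bitString(_, d))
}
```

Sorting this way places signatures that agree on their leading bits side by side, which is what makes a subsequent sliding window over the sorted data useful.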

  21. def permuteBitSet(bitSet: BitSet, permutation: Iterable[Int], d: Int): BitSet

    Permutes a bit set representation of a vector by a given permutation
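
Permuting a bit set can be sketched as follows, assuming the convention that bit i of the result is taken from bit permutation(i) of the input; the method may use the inverse convention. PermuteSketch and permute are hypothetical names, and the sketch uses scala.collection.immutable.BitSet, which may differ from the BitSet type in the signature.

```scala
import scala.collection.immutable.BitSet

object PermuteSketch {
  // Bit i of the result is taken from bit permutation(i) of the input.
  def permute(bits: BitSet, permutation: Iterable[Int], d: Int): BitSet = {
    val source = permutation.toIndexedSeq
    BitSet((0 until d).filter(i => bits(source(i))): _*)
  }
}
```

Because a permutation only moves bits around, the number of set bits and the Hamming distance between any two signatures permuted the same way are unchanged.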

  22. def permuteBitSet(signatures: RDD[Signature], permutation: Iterable[Int], d: Int): RDD[Signature]

    Permutes the bit sets of an RDD of signatures by a given permutation

  23. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  24. def toString(): String

    Definition Classes
    AnyRef → Any
  25. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
