public class KShingling extends Object
Modifier and Type | Field and Description |
---|---|
protected int | k |
Constructor and Description |
---|
KShingling(): k-shingling is the operation of transforming a string (or text document) into a set of n-grams, which can be used to measure the similarity between two strings or documents. |
KShingling(int k) |
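
The transformation described for the constructor can be sketched as follows. This is a minimal illustration that assumes character-level shingles of length k taken with a sliding window; the class `ShingleSketch` and its method `shingles` are hypothetical names introduced here and are not part of `KShingling`, whose actual implementation may differ (for example in how it normalizes whitespace).

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the documented API.
class ShingleSketch {

    // Slide a window of length k over s and collect each distinct substring.
    static Set<String> shingles(String s, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        Set<String> result = new LinkedHashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    public static void main(String[] args) {
        // "ABCAB" contains AB, BC, CA, AB; the set keeps each shingle once.
        System.out.println(shingles("ABCAB", 2)); // prints [AB, BC, CA]
    }
}
```

A string shorter than k yields an empty set in this sketch, since no full window fits.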
Modifier and Type | Method and Description |
---|---|
protected int[] | getArrayProfile(String s): Compute and return the profile of s, as defined by Ukkonen "Approximate string-matching with q-grams and maximal matches". |
int | getDimension(): Return the number of different n-grams (k-shingles) found by this k-shingling instance. |
int | getK() |
StringProfile | getProfile(String s): Compute and return the profile of string s. The profiles of different strings can be used to compute cosine similarity or q-gram distance. |
StringSet | getSet(String s) |
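
The relationship between profiles, the dimension, and the two similarity measures mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it assumes a profile is a vector of shingle counts indexed by a dictionary shared across calls on the same instance, that the dimension is the size of that dictionary, and that q-gram distance is the sum of absolute count differences. The names `ProfileSketch`, `arrayProfile`, `dimension`, `cosine`, and `qgramDistance` are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch for illustration only; not part of the documented API.
class ProfileSketch {
    private final int k;
    // Shared dictionary: each distinct shingle seen so far gets the next free index.
    private final Map<String, Integer> index = new HashMap<>();

    ProfileSketch(int k) {
        this.k = k;
    }

    // Count how often each shingle occurs in s, using the shared dictionary.
    int[] arrayProfile(String s) {
        for (int i = 0; i + k <= s.length(); i++) {
            index.putIfAbsent(s.substring(i, i + k), index.size());
        }
        int[] profile = new int[index.size()];
        for (int i = 0; i + k <= s.length(); i++) {
            profile[index.get(s.substring(i, i + k))]++;
        }
        return profile;
    }

    // Number of distinct shingles seen by this instance so far.
    int dimension() {
        return index.size();
    }

    // Cosine similarity; a shorter (older) profile is treated as zero-padded.
    static double cosine(int[] a, int[] b) {
        long dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            dot += (long) a[i] * b[i];
        }
        for (int x : a) normA += (long) x * x;
        for (int x : b) normB += (long) x * x;
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Sum of absolute differences between shingle counts.
    static int qgramDistance(int[] a, int[] b) {
        int distance = 0;
        for (int i = 0; i < Math.max(a.length, b.length); i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            distance += Math.abs(x - y);
        }
        return distance;
    }

    public static void main(String[] args) {
        ProfileSketch ks = new ProfileSketch(2);
        int[] a = ks.arrayProfile("ABCAB");
        int[] b = ks.arrayProfile("ABCD");
        System.out.println(Arrays.toString(a));   // prints [2, 1, 1]
        System.out.println(Arrays.toString(b));   // prints [1, 1, 0, 1]
        System.out.println(ks.dimension());       // prints 4
        System.out.println(qgramDistance(a, b));  // prints 3
        System.out.println(cosine(a, b));
    }
}
```

Because the dictionary grows as new strings are processed, profiles are only comparable when they come from the same instance, which is why both strings in the usage example go through the same `ProfileSketch` object.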
public KShingling()
public KShingling(int k)
public int getK()
protected int[] getArrayProfile(String s)
public StringProfile getProfile(String s)
public int getDimension()
Copyright © 2016. All rights reserved.