@Immutable public class Jaccard extends Object implements MetricStringDistance, NormalizedStringDistance, NormalizedStringSimilarity
Constructor and Description |
---|
Jaccard() The strings are first transformed into sets of k-shingles (sequences of k characters) using the default value of k, then the Jaccard index is computed as the size of A ∩ B divided by the size of A ∪ B. |
Jaccard(int k) The strings are first transformed into sets of k-shingles (sequences of k characters) using the given k, then the Jaccard index is computed as the size of A ∩ B divided by the size of A ∪ B. |
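To make the constructor descriptions concrete, here is a minimal sketch of the computation they describe: build the set of k-shingles of each string, then divide the size of the intersection by the size of the union. It is an illustration written from the description above, not the library's source; the class name `JaccardSketch`, its method names, and the handling of inputs shorter than k are assumptions.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; names and edge-case handling are assumptions,
// not the library's implementation.
final class JaccardSketch {

    // Collect the distinct substrings of length k (the k-shingles) of s.
    static Set<String> shingles(String s, int k) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    // Jaccard index: |A ∩ B| / |A ∪ B| over the two shingle sets.
    static double similarity(String s1, String s2, int k) {
        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);

        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            // Assumption: two inputs without any shingle count as identical.
            return 1.0;
        }

        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Shingles of "ABCDE": AB, BC, CD, DE; of "ABCDF": AB, BC, CD, DF.
        // The intersection has 3 elements and the union has 5, so this prints 0.6.
        System.out.println(similarity("ABCDE", "ABCDF", 2));
    }
}
```

A larger k makes the measure stricter, because longer shingles are less likely to be shared between two strings.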
Modifier and Type | Method and Description |
---|---|
double | distance(String s1, String s2) Distance is computed as 1 - similarity. |
int | getK() Returns k, the length of the k-shingles (also known as n-grams). |
Map<String,Integer> | getProfile(String string) Computes and returns the profile of string, as defined by Ukkonen in "Approximate string-matching with q-grams and maximal matches". |
double | similarity(String s1, String s2) Computes the Jaccard index: the size of A ∩ B divided by the size of A ∪ B. |
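The relationship between the profile and the distance in the table above can be sketched as follows. The sketch assumes that a profile maps each k-shingle to its number of occurrences, and that only the distinct shingles (the map keys) matter for the set-based Jaccard index. `ProfileSketch` and its methods are hypothetical names used for illustration, not part of the documented API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; not the library's implementation.
final class ProfileSketch {

    // Count how many times each k-shingle occurs in s.
    static Map<String, Integer> profile(String s, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    // Distance = 1 - similarity, with similarity taken over the profile key sets.
    static double distance(String s1, String s2, int k) {
        Set<String> a = profile(s1, k).keySet();
        Set<String> b = profile(s2, k).keySet();

        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            // Assumption: two inputs without any shingle are at distance 0.
            return 0.0;
        }

        // |A ∩ B| = |A| + |B| - |A ∪ B|
        int intersectionSize = a.size() + b.size() - union.size();
        return 1.0 - (double) intersectionSize / union.size();
    }
}
```

For example, `ProfileSketch.profile("banana", 2)` holds three entries (`ba` once, `an` twice, `na` twice), and `ProfileSketch.distance("hello", "hello", 3)` is 0.0 because both strings yield the same shingle set.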
public Jaccard(int k)
Parameters:
k - The length of the k-shingles.
public Jaccard()
public final double similarity(String s1, String s2)
Specified by:
similarity in interface StringSimilarity
Parameters:
s1 - The first string to compare.
s2 - The second string to compare.
Throws:
NullPointerException - if s1 or s2 is null.
public final double distance(String s1, String s2)
Specified by:
distance in interface MetricStringDistance
Specified by:
distance in interface StringDistance
Parameters:
s1 - The first string to compare.
s2 - The second string to compare.
Throws:
NullPointerException - if s1 or s2 is null.
public int getK()
public final Map<String,Integer> getProfile(String string)
Parameters:
string - The string whose profile is computed.
Copyright © 2017. All rights reserved.