Class Cosine
- java.lang.Object
  - info.debatty.java.stringsimilarity.ShingleBased
    - info.debatty.java.stringsimilarity.Cosine
- All Implemented Interfaces:
NormalizedStringDistance, NormalizedStringSimilarity, StringDistance, StringSimilarity, Serializable
@Immutable public class Cosine extends ShingleBased implements NormalizedStringDistance, NormalizedStringSimilarity
The similarity between two strings is the cosine of the angle between their vector representations (vectors of k-shingle occurrence counts). It is computed as V1 . V2 / (|V1| * |V2|). The cosine distance is computed as 1 - cosine similarity.
- Author:
- Thibault Debatty
- See Also:
- Serialized Form
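The formula above can be sketched on two shingle profiles represented as Map<String,Integer>, the same shape that similarity(Map, Map) accepts. This is a minimal illustration, not the library's implementation: the class name CosineSketch and its helper methods are hypothetical, and returning 0 when either profile is empty is an assumption made here to avoid dividing by zero.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; it is not part of the library.
class CosineSketch {

    // V1 . V2: sum of count products over the shingles present in both profiles.
    static double dotProduct(Map<String, Integer> p1, Map<String, Integer> p2) {
        // Iterate over the smaller profile and look each shingle up in the larger one.
        Map<String, Integer> small = p1.size() <= p2.size() ? p1 : p2;
        Map<String, Integer> large = (small == p1) ? p2 : p1;
        double sum = 0.0;
        for (Map.Entry<String, Integer> e : small.entrySet()) {
            Integer other = large.get(e.getKey());
            if (other != null) {
                sum += 1.0 * e.getValue() * other;
            }
        }
        return sum;
    }

    // |V|: Euclidean norm of the occurrence counts.
    static double norm(Map<String, Integer> profile) {
        double sum = 0.0;
        for (int count : profile.values()) {
            sum += 1.0 * count * count;
        }
        return Math.sqrt(sum);
    }

    // V1 . V2 / (|V1| * |V2|)
    static double similarity(Map<String, Integer> p1, Map<String, Integer> p2) {
        double denominator = norm(p1) * norm(p2);
        // Assumption of this sketch: an empty profile yields 0 instead of NaN.
        return denominator == 0.0 ? 0.0 : dotProduct(p1, p2) / denominator;
    }

    // Cosine distance, defined as 1 - cosine similarity.
    static double distance(Map<String, Integer> p1, Map<String, Integer> p2) {
        return 1.0 - similarity(p1, p2);
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("ab", 1);
        a.put("bc", 1);
        Map<String, Integer> b = new HashMap<>();
        b.put("ab", 1);
        b.put("cd", 1);
        System.out.println(similarity(a, b));
        System.out.println(distance(a, b));
    }
}
```

Because the counts are never negative, the dot product is never negative either, which is why the similarity stays within [0, 1] up to floating-point rounding.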
Method Summary
- double distance(String s1, String s2)
  Return 1.0 - similarity.
- double similarity(String s1, String s2)
  Compute the cosine similarity between strings.
- double similarity(Map<String,Integer> profile1, Map<String,Integer> profile2)
  Compute similarity between precomputed profiles.
Methods inherited from class info.debatty.java.stringsimilarity.ShingleBased
getK, getProfile
Constructor Detail
-
Cosine
public Cosine(int k)
Implements Cosine Similarity between strings. The strings are first transformed into vectors of occurrence counts of k-shingles (sequences of k characters). In this n-dimensional space, the similarity between the two strings is the cosine of the angle between their respective vectors.
- Parameters:
k - The shingle length, i.e. the number of characters in each shingle.
-
Cosine
public Cosine()
Implements Cosine Similarity between strings. The strings are first transformed into vectors of occurrence counts of k-shingles (sequences of k characters). In this n-dimensional space, the similarity between the two strings is the cosine of the angle between their respective vectors. Default k is 3.
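The transformation of a string into a vector of k-shingle occurrences can be sketched as a sliding window that counts every overlapping substring of length k. The class ShingleProfileSketch and its profile method are hypothetical names used only for illustration. The inherited getProfile method may preprocess its input (for example by normalizing whitespace); this sketch applies no preprocessing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; it is not part of the library.
class ShingleProfileSketch {

    // Counts every overlapping substring of length k in s.
    static Map<String, Integer> profile(String s, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        Map<String, Integer> counts = new HashMap<>();
        // A string shorter than k contains no k-shingle, so the loop body never runs.
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // "abab" with k = 2 contains the shingles "ab", "ba", "ab".
        System.out.println(profile("abab", 2));
    }
}
```

Each distinct shingle is one dimension of the vector space, so the sparse map stores only the non-zero coordinates of the vector.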
Method Detail
-
similarity
public final double similarity(String s1, String s2)
Compute the cosine similarity between strings.
- Specified by:
similarity in interface StringSimilarity
- Parameters:
s1 - The first string to compare.
s2 - The second string to compare.
- Returns:
- The cosine similarity in the range [0, 1].
- Throws:
NullPointerException - if s1 or s2 is null.
-
distance
public final double distance(String s1, String s2)
Return 1.0 - similarity.
- Specified by:
distance in interface StringDistance
- Parameters:
s1 - The first string to compare.
s2 - The second string to compare.
- Returns:
- 1.0 minus the cosine similarity; the result is in the range [0, 1].
- Throws:
NullPointerException - if s1 or s2 is null.
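As a worked example of how similarity and distance relate, take the hypothetical inputs s1 = "abcd" and s2 = "abce" with k = 2, and assume each profile simply counts every overlapping 2-shingle. The profiles are then {ab, bc, cd} and {ab, bc, ce}, each shingle occurring once:

```latex
\begin{align*}
V_1 &= (1, 1, 1, 0), \quad V_2 = (1, 1, 0, 1)
      \quad \text{over the shingles } (ab,\ bc,\ cd,\ ce) \\
V_1 \cdot V_2 &= 1 \cdot 1 + 1 \cdot 1 + 1 \cdot 0 + 0 \cdot 1 = 2 \\
|V_1| &= |V_2| = \sqrt{1^2 + 1^2 + 1^2} = \sqrt{3} \\
\text{similarity} &= \frac{V_1 \cdot V_2}{|V_1| \, |V_2|}
                   = \frac{2}{\sqrt{3} \cdot \sqrt{3}} = \frac{2}{3} \approx 0.667 \\
\text{distance} &= 1 - \frac{2}{3} = \frac{1}{3} \approx 0.333
\end{align*}
```

The two strings share two of their three shingles, so the similarity is high but below 1, and the distance is the complement.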