C D E G I J L M N O P Q R S W
All Classes All Packages
C
- CharacterInsDelInterface - Interface in info.debatty.java.stringsimilarity
-
As an adjunct to CharacterSubstitutionInterface, this interface allows you to specify the cost of deletion or insertion of a character.
- CharacterSubstitutionInterface - Interface in info.debatty.java.stringsimilarity
-
Used to indicate the cost of character substitution.
- Cosine - Class in info.debatty.java.stringsimilarity
-
The similarity between the two strings is the cosine of the angle between their vector representations.
- Cosine() - Constructor for class info.debatty.java.stringsimilarity.Cosine
-
Implements Cosine Similarity between strings.
- Cosine(int) - Constructor for class info.debatty.java.stringsimilarity.Cosine
-
Implements Cosine Similarity between strings.
- cost(char, char) - Method in interface info.debatty.java.stringsimilarity.CharacterSubstitutionInterface
-
Indicate the cost of substitution between c1 and c2.
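The cosine formula described under Cosine above can be sketched in a few lines. This is a minimal illustration, assuming each profile maps a k-shingle to its number of occurrences; `CosineSketch` and its method names are hypothetical and are not the library's own code.

```java
import java.util.Map;

final class CosineSketch {
    /** Cosine of the angle between two profiles viewed as sparse count vectors. */
    static double similarity(Map<String, Integer> p1, Map<String, Integer> p2) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : p1.entrySet()) {
            // Shingles missing from p2 contribute nothing to the dot product.
            dot += e.getValue() * (double) p2.getOrDefault(e.getKey(), 0);
        }
        double norms = norm(p1) * norm(p2);
        // Assumption: an empty profile yields similarity 0 rather than NaN.
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    /** Euclidean (L2) norm of a profile. */
    static double norm(Map<String, Integer> p) {
        double sum = 0.0;
        for (int v : p.values()) {
            sum += (double) v * v;
        }
        return Math.sqrt(sum);
    }

    /** Distance defined as 1.0 - similarity. */
    static double distance(Map<String, Integer> p1, Map<String, Integer> p2) {
        return 1.0 - similarity(p1, p2);
    }
}
```

Working on precomputed profiles keeps the sketch independent of how shingles are extracted, which is also why a profile can be computed once and compared against many others.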
D
- Damerau - Class in info.debatty.java.stringsimilarity
-
Implementation of Damerau-Levenshtein distance with transposition (also sometimes called unrestricted Damerau-Levenshtein distance).
- Damerau() - Constructor for class info.debatty.java.stringsimilarity.Damerau
- deletionCost(char) - Method in interface info.debatty.java.stringsimilarity.CharacterInsDelInterface
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.Cosine
-
Return 1.0 - similarity.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.Damerau
-
Compute the distance between strings: the minimum number of operations needed to transform one string into the other (insertion, deletion, substitution of a single character, or a transposition of two adjacent characters).
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.experimental.Sift4
-
Sift4 - a general purpose string distance algorithm inspired by JaroWinkler and Longest Common Subsequence.
- distance(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.MetricStringDistance
-
Compute and return the metric distance.
- distance(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.StringDistance
-
Compute and return a measure of distance.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.Jaccard
-
Distance is computed as 1 - similarity.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.JaroWinkler
-
Return 1 - similarity.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.Levenshtein
-
Equivalent to distance(s1, s2, Integer.MAX_VALUE).
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.LongestCommonSubsequence
-
Return the LCS distance between strings s1 and s2, computed as |s1| + |s2| - 2 * |LCS(s1, s2)|.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.MetricLCS
-
Distance metric based on Longest Common Subsequence, computed as 1 - |LCS(s1, s2)| / max(|s1|, |s2|).
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.NGram
-
Compute n-gram distance.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.NormalizedLevenshtein
-
Compute distance as Levenshtein(s1, s2) / max(|s1|, |s2|).
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.OptimalStringAlignment
-
Compute the distance between strings: the minimum number of operations needed to transform one string into the other (insertion, deletion, substitution of a single character, or a transposition of two adjacent characters) while no substring is edited more than once.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.QGram
-
The distance between two strings is defined as the L1 norm of the difference of their profiles (the number of occurrences of each k-shingle).
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.RatcliffObershelp
-
Return 1 - similarity.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.SorensenDice
-
Returns 1 - similarity.
- distance(String, String) - Method in class info.debatty.java.stringsimilarity.WeightedLevenshtein
-
Equivalent to distance(s1, s2, Double.MAX_VALUE).
- distance(String, String, double) - Method in class info.debatty.java.stringsimilarity.WeightedLevenshtein
-
Compute Levenshtein distance using provided weights for substitution.
- distance(String, String, int) - Method in class info.debatty.java.stringsimilarity.Levenshtein
-
The Levenshtein distance, or edit distance, between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
- distance(Map<String, Integer>, Map<String, Integer>) - Method in class info.debatty.java.stringsimilarity.QGram
-
Compute QGram distance using precomputed profiles.
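The unrestricted Damerau-Levenshtein distance listed above (insertion, deletion, substitution, and transposition of two adjacent characters) can be sketched with the textbook dynamic program below. `DamerauSketch` is a hypothetical name; this is an illustration of the technique, not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;

final class DamerauSketch {
    static int distance(String s1, String s2) {
        int n = s1.length();
        int m = s2.length();
        int inf = n + m; // larger than any real distance
        // The table is shifted by one row and one column so index 0 can hold the sentinel.
        int[][] d = new int[n + 2][m + 2];
        d[0][0] = inf;
        for (int i = 0; i <= n; i++) { d[i + 1][0] = inf; d[i + 1][1] = i; }
        for (int j = 0; j <= m; j++) { d[0][j + 1] = inf; d[1][j + 1] = j; }

        // For each character, the last (1-based) position in s1 where it has been seen.
        Map<Character, Integer> lastRow = new HashMap<>();
        for (int i = 1; i <= n; i++) {
            int lastMatchCol = 0; // last column in this row where the characters matched
            for (int j = 1; j <= m; j++) {
                int i1 = lastRow.getOrDefault(s2.charAt(j - 1), 0);
                int j1 = lastMatchCol;
                int cost = 1;
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    cost = 0;
                    lastMatchCol = j;
                }
                int substitution = d[i][j] + cost;
                int insertion = d[i + 1][j] + 1;
                int deletion = d[i][j + 1] + 1;
                // Transpose the two matching characters, paying for everything between them.
                int transposition = d[i1][j1] + (i - i1 - 1) + 1 + (j - j1 - 1);
                d[i + 1][j + 1] = Math.min(Math.min(substitution, insertion),
                                           Math.min(deletion, transposition));
            }
            lastRow.put(s1.charAt(i - 1), i);
        }
        return d[n + 1][m + 1];
    }
}
```

With this sketch, `DamerauSketch.distance("CA", "ABC")` evaluates to 2: transpose to "AC", then insert "B" between the two characters, which is allowed because a substring may be edited more than once.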
E
- Examples - Class in info.debatty.java.stringsimilarity.examples
- Examples() - Constructor for class info.debatty.java.stringsimilarity.examples.Examples
G
- getK() - Method in class info.debatty.java.stringsimilarity.ShingleBased
-
Return k, the length of k-shingles (aka n-grams).
- getProfile(String) - Method in class info.debatty.java.stringsimilarity.ShingleBased
-
Compute and return the profile of s, as defined by Ukkonen in "Approximate string-matching with q-grams and maximal matches".
- getThreshold() - Method in class info.debatty.java.stringsimilarity.JaroWinkler
-
Returns the current value of the threshold used for adding the Winkler bonus.
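To make the notion of a profile concrete, here is a minimal sketch that counts every k-shingle of a string. `ShingleProfileSketch` is a hypothetical stand-in; the assumption that no whitespace normalization or other preprocessing is applied is mine, and the library may behave differently.

```java
import java.util.HashMap;
import java.util.Map;

final class ShingleProfileSketch {
    private final int k;

    ShingleProfileSketch(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        this.k = k;
    }

    /** Length of the shingles (n-grams) counted by this instance. */
    int getK() {
        return k;
    }

    /** Map each substring of length k to its number of occurrences in s. */
    Map<String, Integer> getProfile(String s) {
        Map<String, Integer> profile = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            profile.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return profile;
    }
}
```

For "banana" with k = 2 the shingles are "ba", "an", "na", "an", "na", so the profile is {ba=1, an=2, na=2}. A string shorter than k has an empty profile.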
I
- info.debatty.java.stringsimilarity - package info.debatty.java.stringsimilarity
- info.debatty.java.stringsimilarity.examples - package info.debatty.java.stringsimilarity.examples
- info.debatty.java.stringsimilarity.experimental - package info.debatty.java.stringsimilarity.experimental
- info.debatty.java.stringsimilarity.interfaces - package info.debatty.java.stringsimilarity.interfaces
- insertionCost(char) - Method in interface info.debatty.java.stringsimilarity.CharacterInsDelInterface
J
- Jaccard - Class in info.debatty.java.stringsimilarity
-
Each input string is converted into a set of n-grams; the Jaccard index is then computed as |V1 inter V2| / |V1 union V2|.
- Jaccard() - Constructor for class info.debatty.java.stringsimilarity.Jaccard
-
The strings are first transformed into sets of k-shingles (sequences of k characters), then Jaccard index is computed as |A inter B| / |A union B|.
- Jaccard(int) - Constructor for class info.debatty.java.stringsimilarity.Jaccard
-
The strings are first transformed into sets of k-shingles (sequences of k characters), then Jaccard index is computed as |A inter B| / |A union B|.
- JaroWinkler - Class in info.debatty.java.stringsimilarity
-
The Jaro–Winkler distance metric is designed for, and best suited to, short strings such as person names, and for detecting typos; it is (roughly) a variation of Damerau-Levenshtein, where the substitution of 2 close characters is considered less important than the substitution of 2 characters that are far from each other.
- JaroWinkler() - Constructor for class info.debatty.java.stringsimilarity.JaroWinkler
-
Instantiate with default threshold (0.7).
- JaroWinkler(double) - Constructor for class info.debatty.java.stringsimilarity.JaroWinkler
-
Instantiate with given threshold to determine when Winkler bonus should be used.
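The Jaccard formula |A inter B| / |A union B| from the entries above can be sketched as follows. `JaccardSketch` is a hypothetical name, and returning 1.0 when both shingle sets are empty is an assumption of this sketch rather than documented behavior.

```java
import java.util.HashSet;
import java.util.Set;

final class JaccardSketch {
    /** Set of distinct substrings of length k found in s. */
    static Set<String> shingles(String s, int k) {
        Set<String> set = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            set.add(s.substring(i, i + k));
        }
        return set;
    }

    /** Jaccard index: size of the intersection divided by size of the union. */
    static double similarity(String s1, String s2, int k) {
        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // assumption: two strings without shingles count as identical
        }
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        int union = a.size() + b.size() - inter.size();
        return (double) inter.size() / union;
    }

    /** Distance defined as 1 - similarity. */
    static double distance(String s1, String s2, int k) {
        return 1.0 - similarity(s1, s2, k);
    }
}
```

For "ABCDE" and "ABCDF" with k = 2, the sets share three shingles (AB, BC, CD) out of five distinct ones, giving a similarity of 3/5.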
L
- length(String, String) - Method in class info.debatty.java.stringsimilarity.LongestCommonSubsequence
-
Return the length of Longest Common Subsequence (LCS) between strings s1 and s2.
- Levenshtein - Class in info.debatty.java.stringsimilarity
-
The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other.
- Levenshtein() - Constructor for class info.debatty.java.stringsimilarity.Levenshtein
- LongestCommonSubsequence - Class in info.debatty.java.stringsimilarity
-
The longest common subsequence (LCS) problem consists of finding the longest subsequence common to two (or more) sequences.
- LongestCommonSubsequence() - Constructor for class info.debatty.java.stringsimilarity.LongestCommonSubsequence
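The Levenshtein definition above, together with the normalized variant listed elsewhere in this index (distance divided by the length of the longer string), can be sketched with the classic two-row dynamic program. `LevenshteinSketch` is a hypothetical name, the sketch ignores the optional limit parameter, and treating two empty strings as distance 0 in the normalized form is an assumption.

```java
final class LevenshteinSketch {
    /** Minimum number of single-character insertions, deletions or substitutions. */
    static int distance(String s1, String s2) {
        int[] prev = new int[s2.length() + 1];
        int[] curr = new int[s2.length() + 1];
        for (int j = 0; j <= s2.length(); j++) {
            prev[j] = j; // turning "" into a prefix of s2 takes j insertions
        }
        for (int i = 1; i <= s1.length(); i++) {
            curr[0] = i; // turning a prefix of s1 into "" takes i deletions
            for (int j = 1; j <= s2.length(); j++) {
                int cost = s1.charAt(i - 1) == s2.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[s2.length()];
    }

    /** Levenshtein(s1, s2) / max(|s1|, |s2|). */
    static double normalizedDistance(String s1, String s2) {
        int max = Math.max(s1.length(), s2.length());
        return max == 0 ? 0.0 : (double) distance(s1, s2) / max;
    }
}
```

Keeping only two rows brings memory down to O(|s2|) while the running time stays O(|s1| * |s2|).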
M
- main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.Examples
- main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.MetricLCS
- main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.nischay21
- main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.PrecomputedCosine
- MetricLCS - Class in info.debatty.java.stringsimilarity.examples
- MetricLCS - Class in info.debatty.java.stringsimilarity
-
Distance metric based on Longest Common Subsequence, from the notes "An LCS-based string metric" by Daniel Bakkelund.
- MetricLCS() - Constructor for class info.debatty.java.stringsimilarity.examples.MetricLCS
- MetricLCS() - Constructor for class info.debatty.java.stringsimilarity.MetricLCS
- MetricStringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
-
String distances that implement this interface are metrics.
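The two LCS-based formulas quoted in this index, |s1| + |s2| - 2 * |LCS(s1, s2)| and 1 - |LCS(s1, s2)| / max(|s1|, |s2|), can be sketched on top of one LCS-length routine. `LcsSketch` and its method names are hypothetical, and returning 0 for two empty strings in the metric form is an assumption.

```java
final class LcsSketch {
    /** Length of the longest common subsequence, by dynamic programming. */
    static int length(String s1, String s2) {
        int[][] dp = new int[s1.length() + 1][s2.length() + 1];
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[s1.length()][s2.length()];
    }

    /** LCS distance: |s1| + |s2| - 2 * |LCS(s1, s2)|. */
    static int distance(String s1, String s2) {
        return s1.length() + s2.length() - 2 * length(s1, s2);
    }

    /** Metric LCS: 1 - |LCS(s1, s2)| / max(|s1|, |s2|). */
    static double metricDistance(String s1, String s2) {
        int max = Math.max(s1.length(), s2.length());
        return max == 0 ? 0.0 : 1.0 - (double) length(s1, s2) / max;
    }
}
```

For "AGCAT" and "GAC" the longest common subsequence has length 2, so the LCS distance is 5 + 3 - 4 = 4 and the metric form is 1 - 2/5.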
N
- NGram - Class in info.debatty.java.stringsimilarity
-
N-Gram Similarity as defined by Kondrak, "N-Gram Similarity and Distance", String Processing and Information Retrieval, Lecture Notes in Computer Science Volume 3772, 2005, pp 115-126.
- NGram() - Constructor for class info.debatty.java.stringsimilarity.NGram
-
Instantiate with default value for n-gram length (2).
- NGram(int) - Constructor for class info.debatty.java.stringsimilarity.NGram
-
Instantiate with given value for n-gram length.
- nischay21 - Class in info.debatty.java.stringsimilarity.examples
- nischay21() - Constructor for class info.debatty.java.stringsimilarity.examples.nischay21
- NormalizedLevenshtein - Class in info.debatty.java.stringsimilarity
-
This distance is computed as Levenshtein distance divided by the length of the longest string.
- NormalizedLevenshtein() - Constructor for class info.debatty.java.stringsimilarity.NormalizedLevenshtein
- NormalizedStringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
-
Normalized string distances return a distance between 0.0 and 1.0.
- NormalizedStringSimilarity - Interface in info.debatty.java.stringsimilarity.interfaces
O
- OptimalStringAlignment - Class in info.debatty.java.stringsimilarity
-
Implementation of the Optimal String Alignment (sometimes called the restricted edit distance) variant of the Damerau-Levenshtein distance.
- OptimalStringAlignment() - Constructor for class info.debatty.java.stringsimilarity.OptimalStringAlignment
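The restriction that no substring is edited more than once shows up as a single extra case in the Levenshtein recurrence. The sketch below is a textbook formulation under the hypothetical name `OsaSketch`, not the library's code.

```java
final class OsaSketch {
    static int distance(String s1, String s2) {
        int n = s1.length();
        int m = s2.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i;
        for (int j = 0; j <= m; j++) d[0][j] = j;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = s1.charAt(i - 1) == s2.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                // Adjacent transposition: only the two characters just read may be swapped.
                if (i > 1 && j > 1
                        && s1.charAt(i - 1) == s2.charAt(j - 2)
                        && s1.charAt(i - 2) == s2.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[n][m];
    }
}
```

The pair "CA" and "ABC" separates the two variants: this sketch returns 3, whereas unrestricted Damerau-Levenshtein reaches 2 by transposing to "AC" and then inserting "B" inside the transposed pair.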
P
- PrecomputedCosine - Class in info.debatty.java.stringsimilarity.examples
-
Example of computing cosine similarity with pre-computed profiles.
- PrecomputedCosine() - Constructor for class info.debatty.java.stringsimilarity.examples.PrecomputedCosine
Q
- QGram - Class in info.debatty.java.stringsimilarity
-
Q-gram distance, as defined by Ukkonen in "Approximate string-matching with q-grams and maximal matches".
- QGram() - Constructor for class info.debatty.java.stringsimilarity.QGram
-
Q-gram similarity and distance.
- QGram(int) - Constructor for class info.debatty.java.stringsimilarity.QGram
-
Q-gram similarity and distance.
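The L1 norm of the difference between two profiles can be sketched directly on precomputed profiles. `QGramSketch` is a hypothetical name, and returning an `int` is a choice of this sketch; the index does not state the library's return type here.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class QGramSketch {
    /** Sum over all shingles of the absolute difference in occurrence counts. */
    static int distance(Map<String, Integer> p1, Map<String, Integer> p2) {
        Set<String> shingles = new HashSet<>(p1.keySet());
        shingles.addAll(p2.keySet());
        int sum = 0;
        for (String shingle : shingles) {
            // A shingle absent from one profile counts as zero occurrences there.
            sum += Math.abs(p1.getOrDefault(shingle, 0) - p2.getOrDefault(shingle, 0));
        }
        return sum;
    }
}
```

With k = 2, "ABCD" has the profile {AB=1, BC=1, CD=1} and "ABCE" has {AB=1, BC=1, CE=1}; they differ by one occurrence of CD and one of CE, so the distance is 2.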
R
- RatcliffObershelp - Class in info.debatty.java.stringsimilarity
-
Ratcliff/Obershelp pattern recognition. The Ratcliff/Obershelp algorithm computes the similarity of two strings as the doubled number of matching characters divided by the total number of characters in the two strings.
- RatcliffObershelp() - Constructor for class info.debatty.java.stringsimilarity.RatcliffObershelp
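The "matching characters" in the Ratcliff/Obershelp formula are conventionally counted by taking the longest common substring and recursing on the pieces to its left and right. The sketch below follows that convention; `RatcliffObershelpSketch` is a hypothetical name, and returning 1.0 for two empty strings is an assumption.

```java
final class RatcliffObershelpSketch {
    /** 2 * matching characters / (|s1| + |s2|). */
    static double similarity(String s1, String s2) {
        int total = s1.length() + s2.length();
        if (total == 0) {
            return 1.0; // assumption: two empty strings are identical
        }
        return 2.0 * matches(s1, s2) / total;
    }

    /** Longest common substring length plus, recursively, matches on each side of it. */
    static int matches(String s1, String s2) {
        int bestLen = 0;
        int bestEnd1 = 0;
        int bestEnd2 = 0;
        // run[i][j]: length of the common substring ending at s1[i-1] and s2[j-1].
        int[][] run = new int[s1.length() + 1][s2.length() + 1];
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    run[i][j] = run[i - 1][j - 1] + 1;
                    if (run[i][j] > bestLen) {
                        bestLen = run[i][j];
                        bestEnd1 = i;
                        bestEnd2 = j;
                    }
                }
            }
        }
        if (bestLen == 0) {
            return 0;
        }
        return bestLen
                + matches(s1.substring(0, bestEnd1 - bestLen), s2.substring(0, bestEnd2 - bestLen))
                + matches(s1.substring(bestEnd1), s2.substring(bestEnd2));
    }
}
```

For "WIKIMEDIA" and "WIKIMANIA" the first anchor is "WIKIM" (5 characters), and the remainders "EDIA" and "ANIA" contribute "IA" (2 more), so the similarity is 2 * 7 / 18.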
S
- setMaxOffset(int) - Method in class info.debatty.java.stringsimilarity.experimental.Sift4
-
Set the maximum distance to search for character transposition.
- ShingleBased - Class in info.debatty.java.stringsimilarity
-
Abstract class for string similarities that rely on set operations (like cosine similarity or Jaccard index).
- ShingleBased(int) - Constructor for class info.debatty.java.stringsimilarity.ShingleBased
- Sift4 - Class in info.debatty.java.stringsimilarity.experimental
-
Sift4 - a general purpose string distance algorithm inspired by JaroWinkler and Longest Common Subsequence.
- Sift4() - Constructor for class info.debatty.java.stringsimilarity.experimental.Sift4
- similarity(String, String) - Method in class info.debatty.java.stringsimilarity.Cosine
-
Compute the cosine similarity between strings.
- similarity(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.StringSimilarity
-
Compute and return a measure of similarity between 2 strings.
- similarity(String, String) - Method in class info.debatty.java.stringsimilarity.Jaccard
-
Compute Jaccard index: |A inter B| / |A union B|.
- similarity(String, String) - Method in class info.debatty.java.stringsimilarity.JaroWinkler
-
Compute Jaro-Winkler similarity.
- similarity(String, String) - Method in class info.debatty.java.stringsimilarity.NormalizedLevenshtein
-
Return 1 - distance.
- similarity(String, String) - Method in class info.debatty.java.stringsimilarity.RatcliffObershelp
-
Compute the Ratcliff-Obershelp similarity between strings.
- similarity(String, String) - Method in class info.debatty.java.stringsimilarity.SorensenDice
-
Similarity is computed as 2 * |A inter B| / (|A| + |B|).
- similarity(Map<String, Integer>, Map<String, Integer>) - Method in class info.debatty.java.stringsimilarity.Cosine
-
Compute similarity between precomputed profiles.
- SorensenDice - Class in info.debatty.java.stringsimilarity
-
Similar to Jaccard index, but this time the similarity is computed as 2 * |V1 inter V2| / (|V1| + |V2|).
- SorensenDice() - Constructor for class info.debatty.java.stringsimilarity.SorensenDice
-
Sorensen-Dice coefficient, aka Sørensen index, Dice's coefficient or Czekanowski's binary (non-quantitative) index.
- SorensenDice(int) - Constructor for class info.debatty.java.stringsimilarity.SorensenDice
-
Sorensen-Dice coefficient, aka Sørensen index, Dice's coefficient or Czekanowski's binary (non-quantitative) index.
- StringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
- StringSimilarity - Interface in info.debatty.java.stringsimilarity.interfaces
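The Sorensen-Dice formula 2 * |A inter B| / (|A| + |B|) from the entries above can be sketched over sets of k-shingles. `SorensenDiceSketch` is a hypothetical name, and returning 1.0 when neither string yields a shingle is an assumption of this sketch.

```java
import java.util.HashSet;
import java.util.Set;

final class SorensenDiceSketch {
    /** Set of distinct substrings of length k found in s. */
    static Set<String> shingles(String s, int k) {
        Set<String> set = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            set.add(s.substring(i, i + k));
        }
        return set;
    }

    /** Twice the intersection size divided by the sum of the two set sizes. */
    static double similarity(String s1, String s2, int k) {
        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // assumption: two strings without shingles count as identical
        }
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        return 2.0 * inter.size() / (a.size() + b.size());
    }

    /** Distance defined as 1 - similarity. */
    static double distance(String s1, String s2, int k) {
        return 1.0 - similarity(s1, s2, k);
    }
}
```

For "ABCDE" and "ABCDF" with k = 2 the result is 2 * 3 / 8 = 0.75, higher than the Jaccard index of 0.6 for the same pair because the shared shingles are counted twice in the numerator.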
W
- WeightedLevenshtein - Class in info.debatty.java.stringsimilarity
-
Implementation of Levenshtein that allows you to define different weights for different character substitutions.
- WeightedLevenshtein(CharacterSubstitutionInterface) - Constructor for class info.debatty.java.stringsimilarity.WeightedLevenshtein
-
Instantiate with provided character substitution.
- WeightedLevenshtein(CharacterSubstitutionInterface, CharacterInsDelInterface) - Constructor for class info.debatty.java.stringsimilarity.WeightedLevenshtein
-
Instantiate with provided character substitution, insertion, and deletion weights.
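The idea of weighting substitutions can be sketched by replacing the constant substitution cost in the Levenshtein recurrence with a callback. Everything named here is hypothetical: `WeightedLevenshteinSketch` and its nested `SubstitutionCost` interface merely stand in for the library's types, and insertions and deletions are assumed to cost 1.0.

```java
final class WeightedLevenshteinSketch {
    /** Hypothetical callback giving the cost of replacing c1 by c2. */
    interface SubstitutionCost {
        double cost(char c1, char c2);
    }

    static double distance(String s1, String s2, SubstitutionCost substitution) {
        double[] prev = new double[s2.length() + 1];
        double[] curr = new double[s2.length() + 1];
        for (int j = 0; j <= s2.length(); j++) {
            prev[j] = j; // j insertions at cost 1.0 each (assumption)
        }
        for (int i = 1; i <= s1.length(); i++) {
            curr[0] = i; // i deletions at cost 1.0 each (assumption)
            for (int j = 1; j <= s2.length(); j++) {
                char a = s1.charAt(i - 1);
                char b = s2.charAt(j - 1);
                // Identical characters are free; otherwise ask the callback.
                double sub = a == b ? 0.0 : substitution.cost(a, b);
                curr[j] = Math.min(Math.min(curr[j - 1] + 1.0, prev[j] + 1.0),
                                   prev[j - 1] + sub);
            }
            double[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[s2.length()];
    }
}
```

If replacing 't' by 'r' is priced at 0.5 and every other substitution at 1.0, then "String1" and "Srring2" are 1.5 apart: 0.5 for the cheap substitution plus 1.0 for '1' to '2'. Lower weights suit character pairs that are easily confused, such as neighbors on a keyboard or similar shapes in OCR output.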