C D E G I J L M N O P Q S W 

C

CharacterInsDelInterface - Interface in info.debatty.java.stringsimilarity
As an adjunct to CharacterSubstitutionInterface, this interface allows you to specify the cost of deletion or insertion of a character.
CharacterSubstitutionInterface - Interface in info.debatty.java.stringsimilarity
Used to indicate the cost of character substitution.
Cosine - Class in info.debatty.java.stringsimilarity
The similarity between the two strings is the cosine of the angle between their vector representations.
Cosine(int) - Constructor for class info.debatty.java.stringsimilarity.Cosine
Implements Cosine Similarity between strings.
Cosine() - Constructor for class info.debatty.java.stringsimilarity.Cosine
Implements Cosine Similarity between strings.
cost(char, char) - Method in interface info.debatty.java.stringsimilarity.CharacterSubstitutionInterface
Indicate the cost of a substitution between c1 and c2.
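To make the Cosine entries above concrete, here is a minimal standalone sketch of cosine similarity over k-shingle count vectors. It does not use the library; the class name `CosineSketch`, its helper methods, and the choice to return 0 when a string is shorter than k are all assumptions made for illustration, and the library's own preprocessing may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standalone sketch; not the library's Cosine class.
class CosineSketch {
    // Count every substring of length k (k-shingle) in s.
    static Map<String, Integer> profile(String s, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    // Euclidean (L2) norm of a count vector.
    static double norm(Map<String, Integer> p) {
        double sum = 0;
        for (int v : p.values()) {
            sum += (double) v * v;
        }
        return Math.sqrt(sum);
    }

    // Cosine of the angle between the two shingle count vectors.
    static double similarity(String s1, String s2, int k) {
        Map<String, Integer> p1 = profile(s1, k);
        Map<String, Integer> p2 = profile(s2, k);
        double dot = 0;
        for (Map.Entry<String, Integer> e : p1.entrySet()) {
            dot += e.getValue() * p2.getOrDefault(e.getKey(), 0);
        }
        double denom = norm(p1) * norm(p2);
        // Assumption: an empty profile yields similarity 0.
        return denom == 0 ? 0 : dot / denom;
    }

    public static void main(String[] args) {
        // "ABCD" and "ABCE" share 2 of their 3 bigrams, so the result is about 0.667.
        System.out.println(similarity("ABCD", "ABCE", 2));
    }
}
```

The corresponding distance is then simply 1.0 minus this value.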

D

Damerau - Class in info.debatty.java.stringsimilarity
Implementation of Damerau-Levenshtein distance with transposition (sometimes also called unrestricted Damerau-Levenshtein distance).
Damerau() - Constructor for class info.debatty.java.stringsimilarity.Damerau
 
deletionCost(char) - Method in interface info.debatty.java.stringsimilarity.CharacterInsDelInterface
 
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Cosine
Return 1.0 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Damerau
Compute the distance between strings: the minimum number of operations needed to transform one string into the other (insertion, deletion, substitution of a single character, or a transposition of two adjacent characters).
distance(String, String) - Method in class info.debatty.java.stringsimilarity.experimental.Sift4
Sift4 - a general purpose string distance algorithm inspired by JaroWinkler and Longest Common Subsequence.
distance(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.MetricStringDistance
Compute and return the metric distance.
distance(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.StringDistance
Compute and return a measure of distance.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Jaccard
Distance is computed as 1 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.JaroWinkler
Return 1 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Levenshtein
The Levenshtein distance, or edit distance, between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.LongestCommonSubsequence
Return the LCS distance between strings s1 and s2, computed as |s1| + |s2| - 2 * |LCS(s1, s2)|.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.MetricLCS
Distance metric based on Longest Common Subsequence, computed as 1 - |LCS(s1, s2)| / max(|s1|, |s2|).
distance(String, String) - Method in class info.debatty.java.stringsimilarity.NGram
Compute n-gram distance.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.NormalizedLevenshtein
Compute distance as Levenshtein(s1, s2) / max(|s1|, |s2|).
distance(String, String) - Method in class info.debatty.java.stringsimilarity.OptimalStringAlignment
Compute the distance between strings: the minimum number of operations needed to transform one string into the other (insertion, deletion, substitution of a single character, or a transposition of two adjacent characters) while no substring is edited more than once.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.QGram
The distance between two strings is defined as the L1 norm of the difference of their profiles (the number of occurrences of each k-shingle).
distance(Map<String, Integer>, Map<String, Integer>) - Method in class info.debatty.java.stringsimilarity.QGram
Compute QGram distance using precomputed profiles.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.SorensenDice
Returns 1 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.WeightedLevenshtein
Compute Levenshtein distance using provided weights for substitution.
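The edit distances listed above differ mainly in which operations they allow. The following is a minimal standalone sketch, not the library's code: the class `EditDistanceSketch` and its method names are hypothetical. It uses the textbook dynamic-programming recurrence for Levenshtein, adds the adjacent-transposition case for Optimal String Alignment, and divides by the longer length for the normalized variant (returning 0 for two empty strings is an assumption).

```java
// Hypothetical standalone sketch; not the library's implementation.
class EditDistanceSketch {
    // Insertions, deletions and substitutions, each of cost 1.
    static int levenshtein(String a, String b) {
        return editDistance(a, b, false);
    }

    // Optimal String Alignment: Levenshtein plus transposition of two
    // adjacent characters, where no substring is edited more than once.
    static int osa(String a, String b) {
        return editDistance(a, b, true);
    }

    // Levenshtein(s1, s2) / max(|s1|, |s2|).
    static double normalizedLevenshtein(String a, String b) {
        int max = Math.max(a.length(), b.length());
        // Assumption: two empty strings are at distance 0.
        return max == 0 ? 0.0 : (double) levenshtein(a, b) / max;
    }

    private static int editDistance(String a, String b, boolean allowTransposition) {
        // d[i][j] = distance between the first i chars of a and the first j chars of b.
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + sub);
                if (allowTransposition && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // 3
        System.out.println(levenshtein("ab", "ba"));          // 2
        System.out.println(osa("ab", "ba"));                  // 1
        System.out.println(osa("CA", "ABC"));                 // 3
    }
}
```

The last call shows the "no substring is edited more than once" restriction: OSA gives 3 for "CA" and "ABC", whereas the unrestricted Damerau-Levenshtein distance is 2 (transpose to "AC", then insert "B").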

E

Examples - Class in info.debatty.java.stringsimilarity.examples
 
Examples() - Constructor for class info.debatty.java.stringsimilarity.examples.Examples
 

G

getK() - Method in class info.debatty.java.stringsimilarity.ShingleBased
Return k, the length of k-shingles (aka n-grams).
getProfile(String) - Method in class info.debatty.java.stringsimilarity.ShingleBased
Compute and return the profile of s, as defined by Ukkonen "Approximate string-matching with q-grams and maximal matches".
getThreshold() - Method in class info.debatty.java.stringsimilarity.JaroWinkler
Returns the current value of the threshold used for adding the Winkler bonus.

I

info.debatty.java.stringsimilarity - package info.debatty.java.stringsimilarity
 
info.debatty.java.stringsimilarity.examples - package info.debatty.java.stringsimilarity.examples
 
info.debatty.java.stringsimilarity.experimental - package info.debatty.java.stringsimilarity.experimental
 
info.debatty.java.stringsimilarity.interfaces - package info.debatty.java.stringsimilarity.interfaces
 
insertionCost(char) - Method in interface info.debatty.java.stringsimilarity.CharacterInsDelInterface
 

J

Jaccard - Class in info.debatty.java.stringsimilarity
Each input string is converted into a set of n-grams; the Jaccard index is then computed as |V1 inter V2| / |V1 union V2|.
Jaccard(int) - Constructor for class info.debatty.java.stringsimilarity.Jaccard
The strings are first transformed into sets of k-shingles (sequences of k characters), then Jaccard index is computed as |A inter B| / |A union B|.
Jaccard() - Constructor for class info.debatty.java.stringsimilarity.Jaccard
The strings are first transformed into sets of k-shingles (sequences of k characters), then Jaccard index is computed as |A inter B| / |A union B|.
JaroWinkler - Class in info.debatty.java.stringsimilarity
The Jaro–Winkler distance metric is designed for and best suited to short strings such as person names, and to detecting typos; it is (roughly) a variation of Damerau-Levenshtein, where the substitution of 2 close characters is considered less important than the substitution of 2 characters that are far from each other.
JaroWinkler() - Constructor for class info.debatty.java.stringsimilarity.JaroWinkler
Instantiate with default threshold (0.7).
JaroWinkler(double) - Constructor for class info.debatty.java.stringsimilarity.JaroWinkler
Instantiate with given threshold to determine when Winkler bonus should be used.
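The Jaccard formula quoted above, |A inter B| / |A union B| over sets of k-shingles, can be sketched in a few lines. This is a standalone illustration under stated assumptions, not the library's class: `JaccardSketch` and its methods are hypothetical names, and treating two empty shingle sets as identical (similarity 1.0) is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical standalone sketch; not the library's Jaccard class.
class JaccardSketch {
    // The set of distinct substrings of length k in s.
    static Set<String> shingles(String s, int k) {
        Set<String> set = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            set.add(s.substring(i, i + k));
        }
        return set;
    }

    // |A inter B| / |A union B|
    static double similarity(String s1, String s2, int k) {
        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // Assumption: two empty sets count as identical.
        }
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        int union = a.size() + b.size() - inter.size();
        return (double) inter.size() / union;
    }

    // Distance is 1 - similarity.
    static double distance(String s1, String s2, int k) {
        return 1.0 - similarity(s1, s2, k);
    }

    public static void main(String[] args) {
        // {AB, BC, CD, DE} vs {AB, BC, CD, DF}: 3 shared out of 5 distinct.
        System.out.println(similarity("ABCDE", "ABCDF", 2)); // 0.6
    }
}
```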

L

length(String, String) - Method in class info.debatty.java.stringsimilarity.LongestCommonSubsequence
Return the length of Longest Common Subsequence (LCS) between strings s1 and s2.
Levenshtein - Class in info.debatty.java.stringsimilarity
The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other.
Levenshtein() - Constructor for class info.debatty.java.stringsimilarity.Levenshtein
 
LongestCommonSubsequence - Class in info.debatty.java.stringsimilarity
The longest common subsequence (LCS) problem consists in finding the longest subsequence common to two (or more) sequences.
LongestCommonSubsequence() - Constructor for class info.debatty.java.stringsimilarity.LongestCommonSubsequence
 
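As a rough illustration of the LCS entries, the sketch below computes the LCS length with the standard dynamic-programming table and derives the two distances whose formulas appear in this index: |s1| + |s2| - 2 * |LCS(s1, s2)| and 1 - |LCS(s1, s2)| / max(|s1|, |s2|). The class `LcsSketch` and its method names are hypothetical, and returning 0 for two empty strings in the metric variant is an assumption.

```java
// Hypothetical standalone sketch; not the library's implementation.
class LcsSketch {
    // Length of the longest common subsequence of s1 and s2.
    static int length(String s1, String s2) {
        // c[i][j] = LCS length of the first i chars of s1 and the first j chars of s2.
        int[][] c = new int[s1.length() + 1][s2.length() + 1];
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    c[i][j] = c[i - 1][j - 1] + 1;
                } else {
                    c[i][j] = Math.max(c[i - 1][j], c[i][j - 1]);
                }
            }
        }
        return c[s1.length()][s2.length()];
    }

    // |s1| + |s2| - 2 * |LCS(s1, s2)|
    static int distance(String s1, String s2) {
        return s1.length() + s2.length() - 2 * length(s1, s2);
    }

    // 1 - |LCS(s1, s2)| / max(|s1|, |s2|)
    static double metricDistance(String s1, String s2) {
        int max = Math.max(s1.length(), s2.length());
        // Assumption: two empty strings are at distance 0.
        return max == 0 ? 0.0 : 1.0 - (double) length(s1, s2) / max;
    }

    public static void main(String[] args) {
        System.out.println(length("AGCAT", "GAC"));         // 2
        System.out.println(distance("AGCAT", "GAC"));       // 4
        System.out.println(metricDistance("AGCAT", "GAC")); // 0.6
    }
}
```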

M

main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.Examples
 
main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.MetricLCS
 
main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.nischay21
 
main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.PrecomputedCosine
 
MetricLCS - Class in info.debatty.java.stringsimilarity.examples
 
MetricLCS() - Constructor for class info.debatty.java.stringsimilarity.examples.MetricLCS
 
MetricLCS - Class in info.debatty.java.stringsimilarity
Distance metric based on Longest Common Subsequence, from the notes "An LCS-based string metric" by Daniel Bakkelund.
MetricLCS() - Constructor for class info.debatty.java.stringsimilarity.MetricLCS
 
MetricStringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
String distances that implement this interface are metrics.

N

NGram - Class in info.debatty.java.stringsimilarity
N-Gram Similarity as defined by Kondrak, "N-Gram Similarity and Distance", String Processing and Information Retrieval, Lecture Notes in Computer Science Volume 3772, 2005, pp 115-126.
NGram(int) - Constructor for class info.debatty.java.stringsimilarity.NGram
Instantiate with given value for n-gram length.
NGram() - Constructor for class info.debatty.java.stringsimilarity.NGram
Instantiate with default value for n-gram length (2).
nischay21 - Class in info.debatty.java.stringsimilarity.examples
 
nischay21() - Constructor for class info.debatty.java.stringsimilarity.examples.nischay21
 
NormalizedLevenshtein - Class in info.debatty.java.stringsimilarity
This distance is computed as the Levenshtein distance divided by the length of the longest string.
NormalizedLevenshtein() - Constructor for class info.debatty.java.stringsimilarity.NormalizedLevenshtein
 
NormalizedStringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
Normalized string distances return a distance between 0.0 and 1.0.
NormalizedStringSimilarity - Interface in info.debatty.java.stringsimilarity.interfaces
 

O

OptimalStringAlignment - Class in info.debatty.java.stringsimilarity
Implementation of the Optimal String Alignment (sometimes called the restricted edit distance) variant of the Damerau-Levenshtein distance.
OptimalStringAlignment() - Constructor for class info.debatty.java.stringsimilarity.OptimalStringAlignment
 

P

PrecomputedCosine - Class in info.debatty.java.stringsimilarity.examples
Example of computing cosine similarity with pre-computed profiles.
PrecomputedCosine() - Constructor for class info.debatty.java.stringsimilarity.examples.PrecomputedCosine
 

Q

QGram - Class in info.debatty.java.stringsimilarity
Q-gram distance, as defined by Ukkonen in "Approximate string-matching with q-grams and maximal matches".
QGram(int) - Constructor for class info.debatty.java.stringsimilarity.QGram
Q-gram similarity and distance.
QGram() - Constructor for class info.debatty.java.stringsimilarity.QGram
Q-gram similarity and distance.
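The Q-gram distance is described in this index as the L1 norm of the difference of two profiles, where a profile counts the occurrences of each k-shingle. A minimal standalone sketch of that definition follows; `QGramSketch` and its methods are hypothetical names and the code is not taken from the library.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical standalone sketch; not the library's QGram class.
class QGramSketch {
    // Profile of s: number of occurrences of each substring of length k.
    static Map<String, Integer> profile(String s, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    // L1 norm of the difference of two precomputed profiles.
    static int distance(Map<String, Integer> p1, Map<String, Integer> p2) {
        Set<String> keys = new HashSet<>(p1.keySet());
        keys.addAll(p2.keySet());
        int sum = 0;
        for (String key : keys) {
            sum += Math.abs(p1.getOrDefault(key, 0) - p2.getOrDefault(key, 0));
        }
        return sum;
    }

    static int distance(String s1, String s2, int k) {
        return distance(profile(s1, k), profile(s2, k));
    }

    public static void main(String[] args) {
        // Profiles differ only in "CD" (count 1 vs 0) and "CE" (count 0 vs 1).
        System.out.println(distance("ABCD", "ABCE", 2)); // 2
    }
}
```

Precomputing the profiles once, as in the two-argument `Map` overload listed in this index, avoids re-shingling a string that is compared against many others.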

S

setMaxOffset(int) - Method in class info.debatty.java.stringsimilarity.experimental.Sift4
Set the maximum distance to search for character transposition.
ShingleBased - Class in info.debatty.java.stringsimilarity
Abstract class for string similarities that rely on set operations (like cosine similarity or the Jaccard index).
ShingleBased(int) - Constructor for class info.debatty.java.stringsimilarity.ShingleBased
 
Sift4 - Class in info.debatty.java.stringsimilarity.experimental
Sift4 - a general purpose string distance algorithm inspired by JaroWinkler and Longest Common Subsequence.
Sift4() - Constructor for class info.debatty.java.stringsimilarity.experimental.Sift4
 
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.Cosine
Compute the cosine similarity between strings.
similarity(Map<String, Integer>, Map<String, Integer>) - Method in class info.debatty.java.stringsimilarity.Cosine
similarity(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.StringSimilarity
Compute and return a measure of similarity between 2 strings.
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.Jaccard
Compute Jaccard index: |A inter B| / |A union B|.
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.JaroWinkler
Compute Jaro-Winkler similarity.
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.NormalizedLevenshtein
Return 1 - distance.
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.SorensenDice
Similarity is computed as 2 * |A inter B| / (|A| + |B|).
SorensenDice - Class in info.debatty.java.stringsimilarity
Similar to Jaccard index, but this time the similarity is computed as 2 * |V1 inter V2| / (|V1| + |V2|).
SorensenDice(int) - Constructor for class info.debatty.java.stringsimilarity.SorensenDice
Sorensen-Dice coefficient, aka Sørensen index, Dice's coefficient or Czekanowski's binary (non-quantitative) index.
SorensenDice() - Constructor for class info.debatty.java.stringsimilarity.SorensenDice
Sorensen-Dice coefficient, aka Sørensen index, Dice's coefficient or Czekanowski's binary (non-quantitative) index.
StringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
 
StringSimilarity - Interface in info.debatty.java.stringsimilarity.interfaces
 
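For the Sorensen-Dice formula given above, 2 * |A inter B| / (|A| + |B|), a minimal standalone sketch over sets of k-shingles might look like this. The class `DiceSketch` and its methods are hypothetical, the code is not the library's, and returning 1.0 when both shingle sets are empty is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical standalone sketch; not the library's SorensenDice class.
class DiceSketch {
    // The set of distinct substrings of length k in s.
    static Set<String> shingles(String s, int k) {
        Set<String> set = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            set.add(s.substring(i, i + k));
        }
        return set;
    }

    // 2 * |A inter B| / (|A| + |B|)
    static double similarity(String s1, String s2, int k) {
        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);
        int total = a.size() + b.size();
        if (total == 0) {
            return 1.0; // Assumption: two empty sets count as identical.
        }
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        return 2.0 * inter.size() / total;
    }

    public static void main(String[] args) {
        // {AB, BC, CD, DE} vs {AB, BC, CD, DF}: 2 * 3 / (4 + 4).
        System.out.println(similarity("ABCDE", "ABCDF", 2)); // 0.75
    }
}
```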

W

WeightedLevenshtein - Class in info.debatty.java.stringsimilarity
Implementation of Levenshtein that allows defining different weights for different character substitutions.
WeightedLevenshtein(CharacterSubstitutionInterface) - Constructor for class info.debatty.java.stringsimilarity.WeightedLevenshtein
Instantiate with provided character substitution.
WeightedLevenshtein(CharacterSubstitutionInterface, CharacterInsDelInterface) - Constructor for class info.debatty.java.stringsimilarity.WeightedLevenshtein
Instantiate with provided character substitution, insertion, and deletion weights.
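To illustrate the idea of per-character substitution weights, here is a minimal standalone sketch. It deliberately does not use the library's types: the nested interface `SubstitutionCost` is a hypothetical stand-in for a substitution-cost callback, the class `WeightedLevenshteinSketch` is likewise invented, and insertions and deletions are assumed to cost a fixed 1.0.

```java
// Hypothetical standalone sketch; not the library's WeightedLevenshtein class.
class WeightedLevenshteinSketch {
    // Hypothetical callback giving the cost of replacing c1 with c2.
    interface SubstitutionCost {
        double cost(char c1, char c2);
    }

    static double distance(String a, String b, SubstitutionCost sub) {
        // d[i][j] = weighted distance between the first i chars of a
        // and the first j chars of b.
        double[][] d = new double[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i; // Assumption: each deletion costs 1.0.
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j; // Assumption: each insertion costs 1.0.
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                char ca = a.charAt(i - 1);
                char cb = b.charAt(j - 1);
                double s = ca == cb ? 0.0 : sub.cost(ca, cb);
                d[i][j] = Math.min(
                        Math.min(d[i - 1][j] + 1.0, d[i][j - 1] + 1.0),
                        d[i - 1][j - 1] + s);
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // Treat 't' -> 'r' as a cheap substitution (for example, a likely OCR confusion).
        SubstitutionCost ocr = (c1, c2) -> (c1 == 't' && c2 == 'r') ? 0.5 : 1.0;
        // One cheap substitution (0.5) plus one ordinary substitution (1.0).
        System.out.println(distance("String1", "Srring2", ocr)); // 1.5
    }
}
```

With a callback that always returns 1.0, this reduces to the ordinary Levenshtein distance.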

Copyright © 2018. All rights reserved.