C D E G I J K L M N O P Q S T U V W 

C

CharacterSubstitutionInterface - Interface in info.debatty.java.stringsimilarity
Used to indicate the cost of character substitution.
Cosine - Class in info.debatty.java.stringsimilarity
The similarity between two strings is the cosine of the angle between their vector representations.
Cosine(int) - Constructor for class info.debatty.java.stringsimilarity.Cosine
Implements Cosine Similarity between strings.
Cosine() - Constructor for class info.debatty.java.stringsimilarity.Cosine
Implements Cosine Similarity between strings.
cosineSimilarity(SparseDoubleVector) - Method in class info.debatty.java.utils.SparseDoubleVector
Return the cosine similarity between the vectors.
cosineSimilarity(SparseIntegerVector) - Method in class info.debatty.java.utils.SparseIntegerVector
Compute and return the cosine similarity (cosine of angle between both vectors).
cost(char, char) - Method in interface info.debatty.java.stringsimilarity.CharacterSubstitutionInterface
Indicate the cost of substituting c1 with c2.

D

Damerau - Class in info.debatty.java.stringsimilarity
Implementation of the Damerau-Levenshtein distance with transposition (sometimes also called the unrestricted Damerau-Levenshtein distance).
Damerau() - Constructor for class info.debatty.java.stringsimilarity.Damerau
 
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Cosine
Return 1.0 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Damerau
Compute the distance between strings: the minimum number of operations needed to transform one string into the other (insertion, deletion, substitution of a single character, or a transposition of two adjacent characters).
distance(String, String) - Method in class info.debatty.java.stringsimilarity.experimental.Sift4
Sift4 - a general purpose string distance algorithm inspired by JaroWinkler and Longest Common Subsequence.
distance(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.MetricStringDistance
 
distance(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.StringDistance
 
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Jaccard
Distance is computed as 1 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.JaroWinkler
Return 1 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.Levenshtein
The Levenshtein distance, or edit distance, between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
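
The definition above can be illustrated with a minimal two-row dynamic-programming sketch. The class name LevenshteinSketch is hypothetical and this is not the library's implementation; it only shows one common way to compute the edit distance under the stated definition.

```java
// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class LevenshteinSketch {

    /** Minimum number of single-character insertions, deletions or substitutions. */
    static int distance(String s1, String s2) {
        // previous[j] = distance between the first i-1 chars of s1 and the first j chars of s2
        int[] previous = new int[s2.length() + 1];
        int[] current = new int[s2.length() + 1];
        for (int j = 0; j <= s2.length(); j++) {
            previous[j] = j; // building a prefix of s2 from "" takes j insertions
        }
        for (int i = 1; i <= s1.length(); i++) {
            current[0] = i; // reducing a prefix of s1 to "" takes i deletions
            for (int j = 1; j <= s2.length(); j++) {
                int substitution = previous[j - 1]
                        + (s1.charAt(i - 1) == s2.charAt(j - 1) ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            // The row just computed becomes the "previous" row of the next iteration.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[s2.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

Keeping only two rows reduces memory to O(|s2|) while the running time stays O(|s1| * |s2|).
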
distance(String, String) - Method in class info.debatty.java.stringsimilarity.LongestCommonSubsequence
Return the LCS distance between strings s1 and s2, computed as |s1| + |s2| - 2 * |LCS(s1, s2)|.
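
As a rough illustration of the formula |s1| + |s2| - 2 * |LCS(s1, s2)|, the sketch below computes the LCS length with a full dynamic-programming table and derives the distance from it. The class LcsSketch is hypothetical and is not the library's code.

```java
// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class LcsSketch {

    /** Length of the longest common subsequence of s1 and s2. */
    static int length(String s1, String s2) {
        // table[i][j] = LCS length of the first i chars of s1 and the first j chars of s2
        int[][] table = new int[s1.length() + 1][s2.length() + 1];
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    table[i][j] = table[i - 1][j - 1] + 1;
                } else {
                    table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
                }
            }
        }
        return table[s1.length()][s2.length()];
    }

    /** LCS distance: |s1| + |s2| - 2 * |LCS(s1, s2)|. */
    static int distance(String s1, String s2) {
        return s1.length() + s2.length() - 2 * length(s1, s2);
    }

    public static void main(String[] args) {
        System.out.println(distance("AGCAT", "GAC")); // prints 4
    }
}
```

For "AGCAT" and "GAC" the longest common subsequence has length 2, so the distance is 5 + 3 - 2 * 2 = 4.
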
distance(String, String) - Method in class info.debatty.java.stringsimilarity.MetricLCS
Distance metric based on Longest Common Subsequence, computed as 1 - |LCS(s1, s2)| / max(|s1|, |s2|).
distance(String, String) - Method in class info.debatty.java.stringsimilarity.NGram
Compute n-gram distance.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.NormalizedLevenshtein
Compute distance as Levenshtein(s1, s2) / max(|s1|, |s2|).
distance(String, String) - Method in class info.debatty.java.stringsimilarity.OptimalStringAlignment
Compute the distance between strings: the minimum number of operations needed to transform one string into the other (insertion, deletion, substitution of a single character, or a transposition of two adjacent characters) while no substring is edited more than once.
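
The restriction "no substring is edited more than once" can be seen in a minimal sketch of the usual Optimal String Alignment recurrence. OsaSketch is a hypothetical name and this is only an illustration under that textbook recurrence, not the library's implementation.

```java
// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class OsaSketch {

    static int distance(String s1, String s2) {
        int n = s1.length();
        int m = s2.length();
        // d[i][j] = OSA distance between the first i chars of s1 and the first j chars of s2
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= m; j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = s1.charAt(i - 1) == s2.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,          // substitution or match
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1)); // deletion, insertion
                // Transposition of two adjacent characters, only between aligned pairs.
                if (i > 1 && j > 1
                        && s1.charAt(i - 1) == s2.charAt(j - 2)
                        && s1.charAt(i - 2) == s2.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        System.out.println(distance("CA", "AC"));  // prints 1
        System.out.println(distance("CA", "ABC")); // prints 3
    }
}
```

The second call shows the difference from the unrestricted variant: transforming "CA" into "ABC" in two steps would require transposing "CA" to "AC" and then inserting "B" inside the transposed pair, which this recurrence does not allow.
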
distance(String, String) - Method in class info.debatty.java.stringsimilarity.QGram
The distance between two strings is defined as the L1 norm of the difference of their profiles (the number of occurrences of each k-shingle).
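
A minimal sketch of this definition follows: each string is turned into a profile that counts its k-shingles, and the distance is the sum of absolute count differences. QGramSketch and its method names are hypothetical, and the sketch applies no preprocessing to the input strings.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class QGramSketch {

    /** Count the occurrences of every substring of length k. */
    static Map<String, Integer> profile(String s, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    /** L1 norm of the difference between the two profiles. */
    static int distance(String s1, String s2, int k) {
        Map<String, Integer> p1 = profile(s1, k);
        Map<String, Integer> p2 = profile(s2, k);
        Set<String> shingles = new HashSet<>(p1.keySet());
        shingles.addAll(p2.keySet());
        int sum = 0;
        for (String shingle : shingles) {
            sum += Math.abs(p1.getOrDefault(shingle, 0) - p2.getOrDefault(shingle, 0));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Profiles {AB, BC, CD} and {AB, BC, CE} differ in two shingles.
        System.out.println(distance("ABCD", "ABCE", 2)); // prints 2
    }
}
```
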
distance(String, String) - Method in class info.debatty.java.stringsimilarity.SorensenDice
Returns 1 - similarity.
distance(String, String) - Method in class info.debatty.java.stringsimilarity.WeightedLevenshtein
Compute Levenshtein distance using provided weights for substitution.
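
To show how a per-character substitution weight changes the result, here is a minimal sketch. The nested interface SubstitutionCost is a hypothetical stand-in modelled on cost(char, char); the class WeightedLevenshteinSketch and the fixed insertion and deletion cost of 1.0 are assumptions of this example, not the library's code.

```java
// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class WeightedLevenshteinSketch {

    /** Hypothetical callback returning the cost of replacing c1 with c2. */
    interface SubstitutionCost {
        double cost(char c1, char c2);
    }

    static double distance(String s1, String s2, SubstitutionCost costs) {
        double[] previous = new double[s2.length() + 1];
        double[] current = new double[s2.length() + 1];
        for (int j = 0; j <= s2.length(); j++) {
            previous[j] = j; // j insertions at cost 1.0 each (assumption)
        }
        for (int i = 1; i <= s1.length(); i++) {
            current[0] = i; // i deletions at cost 1.0 each (assumption)
            for (int j = 1; j <= s2.length(); j++) {
                char a = s1.charAt(i - 1);
                char b = s2.charAt(j - 1);
                // Equal characters cost nothing; otherwise ask the callback.
                double substitution = previous[j - 1] + (a == b ? 0.0 : costs.cost(a, b));
                double deletion = previous[j] + 1.0;
                double insertion = current[j - 1] + 1.0;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            double[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[s2.length()];
    }

    public static void main(String[] args) {
        // Treat replacing 't' with 'r' as a cheap, likely typo.
        SubstitutionCost cheapTtoR = (c1, c2) -> (c1 == 't' && c2 == 'r') ? 0.5 : 1.0;
        System.out.println(distance("String1", "Srring2", cheapTtoR)); // prints 1.5
    }
}
```
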
dotProduct(SparseDoubleVector) - Method in class info.debatty.java.utils.SparseDoubleVector
 
dotProduct(double[]) - Method in class info.debatty.java.utils.SparseDoubleVector
 
dotProduct(SparseIntegerVector) - Method in class info.debatty.java.utils.SparseIntegerVector
Compute and return the dot product.
dotProduct(double[]) - Method in class info.debatty.java.utils.SparseIntegerVector
Compute and return the dot product.

E

Examples - Class in info.debatty.java.stringsimilarity.examples
 
Examples() - Constructor for class info.debatty.java.stringsimilarity.examples.Examples
 

G

getKey(int) - Method in class info.debatty.java.utils.SparseIntegerVector
Get the key at position i.
getThreshold() - Method in class info.debatty.java.stringsimilarity.JaroWinkler
Returns the current value of the threshold used for adding the Winkler bonus.
getValue(int) - Method in class info.debatty.java.utils.SparseIntegerVector
Get the value at position i.

I

info.debatty.java.stringsimilarity - package info.debatty.java.stringsimilarity
 
info.debatty.java.stringsimilarity.examples - package info.debatty.java.stringsimilarity.examples
 
info.debatty.java.stringsimilarity.experimental - package info.debatty.java.stringsimilarity.experimental
 
info.debatty.java.stringsimilarity.interfaces - package info.debatty.java.stringsimilarity.interfaces
 
info.debatty.java.utils - package info.debatty.java.utils
 
intersection(SparseBooleanVector) - Method in class info.debatty.java.utils.SparseBooleanVector
 
intersection(SparseDoubleVector) - Method in class info.debatty.java.utils.SparseDoubleVector
Return the number of non-zero values these two vectors have in common, |A inter B|.
intersection(SparseIntegerVector) - Method in class info.debatty.java.utils.SparseIntegerVector
Compute the number of values that are present in both vectors (used to compute the Jaccard index).

J

Jaccard - Class in info.debatty.java.stringsimilarity
Each input string is converted into a set of n-grams; the Jaccard index is then computed as |V1 inter V2| / |V1 union V2|.
Jaccard(int) - Constructor for class info.debatty.java.stringsimilarity.Jaccard
The strings are first transformed into sets of k-shingles (sequences of k characters), then the Jaccard index is computed as |A inter B| / |A union B|.
Jaccard() - Constructor for class info.debatty.java.stringsimilarity.Jaccard
The strings are first transformed into sets of k-shingles (sequences of k characters), then the Jaccard index is computed as |A inter B| / |A union B|.
jaccard(SparseBooleanVector) - Method in class info.debatty.java.utils.SparseBooleanVector
Compute and return the Jaccard index with the other SparseVector.
jaccard(SparseDoubleVector) - Method in class info.debatty.java.utils.SparseDoubleVector
Compute and return the Jaccard index with the other SparseVector.
jaccard(SparseIntegerVector) - Method in class info.debatty.java.utils.SparseIntegerVector
Compute and return the Jaccard index with the other SparseVector.
JaroWinkler - Class in info.debatty.java.stringsimilarity
The Jaro–Winkler distance metric is designed for, and best suited to, short strings such as person names, and for detecting typos; it is (roughly) a variation of Damerau-Levenshtein in which the substitution of 2 close characters is considered less important than the substitution of 2 characters that are far from each other.
JaroWinkler() - Constructor for class info.debatty.java.stringsimilarity.JaroWinkler
Instantiate with default threshold (0.7).
JaroWinkler(double) - Constructor for class info.debatty.java.stringsimilarity.JaroWinkler
Instantiate with given threshold to determine when Winkler bonus should be used.

K

keys - Variable in class info.debatty.java.utils.SparseBooleanVector
Indicates the positions that hold the value "true".
keys - Variable in class info.debatty.java.utils.SparseDoubleVector
 

L

length(String, String) - Method in class info.debatty.java.stringsimilarity.LongestCommonSubsequence
Return the length of Longest Common Subsequence (LCS) between strings s1 and s2.
Levenshtein - Class in info.debatty.java.stringsimilarity
The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one string into the other.
Levenshtein() - Constructor for class info.debatty.java.stringsimilarity.Levenshtein
 
LongestCommonSubsequence - Class in info.debatty.java.stringsimilarity
The longest common subsequence (LCS) problem consists in finding the longest subsequence common to two (or more) sequences.
LongestCommonSubsequence() - Constructor for class info.debatty.java.stringsimilarity.LongestCommonSubsequence
 

M

main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.Examples
 
main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.MetricLCS
 
main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.PrecomputedCosine
 
main(String[]) - Static method in class info.debatty.java.stringsimilarity.examples.SparseDoubleVectorExample
 
MetricLCS - Class in info.debatty.java.stringsimilarity.examples
 
MetricLCS() - Constructor for class info.debatty.java.stringsimilarity.examples.MetricLCS
 
MetricLCS - Class in info.debatty.java.stringsimilarity
Distance metric based on Longest Common Subsequence, from the notes "An LCS-based string metric" by Daniel Bakkelund.
MetricLCS() - Constructor for class info.debatty.java.stringsimilarity.MetricLCS
 
MetricStringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
String distances that implement this interface are metrics, which means: d(x, y) ≥ 0 (non-negativity, or separation axiom); d(x, y) = 0 if and only if x = y (identity, or coincidence axiom); d(x, y) = d(y, x) (symmetry); d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
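
The four axioms can be spot-checked on a handful of sample strings, as in the minimal sketch below. MetricCheckSketch is a hypothetical helper, not part of the library; passing the check on a sample does not prove that a function is a metric, but a failure does show that it is not one.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class MetricCheckSketch {

    /** Return true if d satisfies the four metric axioms on every combination of samples. */
    static boolean satisfiesAxioms(ToDoubleBiFunction<String, String> d, String[] samples) {
        for (String x : samples) {
            for (String y : samples) {
                double dxy = d.applyAsDouble(x, y);
                if (dxy < 0) {
                    return false; // non-negativity violated
                }
                if ((dxy == 0) != x.equals(y)) {
                    return false; // identity violated
                }
                if (dxy != d.applyAsDouble(y, x)) {
                    return false; // symmetry violated
                }
                for (String z : samples) {
                    // Small tolerance guards against floating-point rounding.
                    if (d.applyAsDouble(x, z) > dxy + d.applyAsDouble(y, z) + 1e-12) {
                        return false; // triangle inequality violated
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {"a", "ab", "abc", "b"};
        // Discrete metric: 0 for equal strings, 1 otherwise.
        System.out.println(satisfiesAxioms((x, y) -> x.equals(y) ? 0.0 : 1.0, samples)); // prints true
        // Squared length difference: "a" and "b" differ but get distance 0.
        System.out.println(satisfiesAxioms(
                (x, y) -> Math.pow(x.length() - y.length(), 2), samples)); // prints false
    }
}
```
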

N

NGram - Class in info.debatty.java.stringsimilarity
N-Gram Similarity as defined by Kondrak, "N-Gram Similarity and Distance", String Processing and Information Retrieval, Lecture Notes in Computer Science Volume 3772, 2005, pp 115-126.
NGram(int) - Constructor for class info.debatty.java.stringsimilarity.NGram
Instantiate with given value for n-gram length.
NGram() - Constructor for class info.debatty.java.stringsimilarity.NGram
Instantiate with default value for n-gram length (2).
norm() - Method in class info.debatty.java.utils.SparseDoubleVector
Compute and return the L2 norm of the vector.
norm() - Method in class info.debatty.java.utils.SparseIntegerVector
Compute and return the L2 norm of the vector.
NormalizedLevenshtein - Class in info.debatty.java.stringsimilarity
This distance is computed as the Levenshtein distance divided by the length of the longest string.
NormalizedLevenshtein() - Constructor for class info.debatty.java.stringsimilarity.NormalizedLevenshtein
 
NormalizedStringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
Normalized string distances return a distance between 0.0 and 1.0.
NormalizedStringSimilarity - Interface in info.debatty.java.stringsimilarity.interfaces
 

O

OptimalStringAlignment - Class in info.debatty.java.stringsimilarity
Implementation of the Optimal String Alignment (sometimes called the restricted edit distance) variant of the Damerau-Levenshtein distance.
OptimalStringAlignment() - Constructor for class info.debatty.java.stringsimilarity.OptimalStringAlignment
 

P

PrecomputedCosine - Class in info.debatty.java.stringsimilarity.examples
Example of computing cosine similarity with pre-computed profiles.
PrecomputedCosine() - Constructor for class info.debatty.java.stringsimilarity.examples.PrecomputedCosine
 

Q

QGram - Class in info.debatty.java.stringsimilarity
Q-gram distance, as defined by Ukkonen in "Approximate string-matching with q-grams and maximal matches".
QGram(int) - Constructor for class info.debatty.java.stringsimilarity.QGram
Q-gram similarity and distance.
QGram() - Constructor for class info.debatty.java.stringsimilarity.QGram
Q-gram similarity and distance.
qgram(SparseDoubleVector) - Method in class info.debatty.java.utils.SparseDoubleVector
Compute and return the qgram similarity with other vector.
qgram(SparseIntegerVector) - Method in class info.debatty.java.utils.SparseIntegerVector
Compute and return the qgram similarity with other vector.

S

sampleDIMSUM(double, int, int) - Method in class info.debatty.java.utils.SparseDoubleVector
 
setMaxOffset(int) - Method in class info.debatty.java.stringsimilarity.experimental.Sift4
Set the maximum distance to search for character transposition.
Sift4 - Class in info.debatty.java.stringsimilarity.experimental
Sift4 - a general purpose string distance algorithm inspired by JaroWinkler and Longest Common Subsequence.
Sift4() - Constructor for class info.debatty.java.stringsimilarity.experimental.Sift4
 
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.Cosine
Compute the cosine similarity between strings.
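
As a rough sketch of how a cosine similarity between strings can be computed, the example below builds k-shingle count profiles and divides their dot product by the product of their L2 norms. CosineSketch is a hypothetical class, and returning 0.0 when a profile is empty is an assumption of this example rather than documented library behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class CosineSketch {

    /** Count the occurrences of every substring of length k. */
    static Map<String, Integer> profile(String s, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    /** L2 norm of a profile seen as a vector of counts. */
    static double norm(Map<String, Integer> profile) {
        double sumOfSquares = 0.0;
        for (int count : profile.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    static double similarity(String s1, String s2, int k) {
        Map<String, Integer> p1 = profile(s1, k);
        Map<String, Integer> p2 = profile(s2, k);
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : p1.entrySet()) {
            int count = entry.getValue();
            int other = p2.getOrDefault(entry.getKey(), 0);
            dot += (double) count * other;
        }
        double norms = norm(p1) * norm(p2);
        // Assumption: an empty profile (string shorter than k) yields similarity 0.0.
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    static double distance(String s1, String s2, int k) {
        return 1.0 - similarity(s1, s2, k);
    }

    public static void main(String[] args) {
        // Profiles share 2 of 3 shingles each, so the similarity is about 2/3.
        System.out.println(similarity("ABCD", "ABCE", 2));
    }
}
```
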
similarity(Map<String, Integer>, Map<String, Integer>) - Method in class info.debatty.java.stringsimilarity.Cosine
 
similarity(String, String) - Method in interface info.debatty.java.stringsimilarity.interfaces.StringSimilarity
Compute and return a measure of similarity between 2 strings.
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.Jaccard
Compute Jaccard index: |A inter B| / |A union B|.
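
The formula |A inter B| / |A union B| can be sketched directly with sets of k-shingles. JaccardSketch is a hypothetical class, and treating two empty shingle sets as fully similar (1.0) is an assumption of this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of info.debatty.java.stringsimilarity.
final class JaccardSketch {

    /** Set of all distinct substrings of length k. */
    static Set<String> shingles(String s, int k) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    static double similarity(String s1, String s2, int k) {
        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1.0; // assumption: two empty sets are considered identical
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // {AB, BC, CD} and {AB, BC, CE}: 2 shared shingles out of 4 distinct ones.
        System.out.println(similarity("ABCD", "ABCE", 2)); // prints 0.5
    }
}
```
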
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.JaroWinkler
Compute Jaro-Winkler similarity.
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.NormalizedLevenshtein
Return 1 - distance.
similarity(String, String) - Method in class info.debatty.java.stringsimilarity.SorensenDice
Similarity is computed as 2 * |A inter B| / (|A| + |B|).
size() - Method in class info.debatty.java.utils.SparseBooleanVector
Return the number of (non-zero) elements in this vector.
size - Variable in class info.debatty.java.utils.SparseDoubleVector
 
size() - Method in class info.debatty.java.utils.SparseDoubleVector
Return the number of non-zero elements in this vector.
size() - Method in class info.debatty.java.utils.SparseIntegerVector
Return the number of (non-zero) elements in this vector.
SorensenDice - Class in info.debatty.java.stringsimilarity
Similar to Jaccard index, but this time the similarity is computed as 2 * |V1 inter V2| / (|V1| + |V2|).
SorensenDice(int) - Constructor for class info.debatty.java.stringsimilarity.SorensenDice
Sorensen-Dice coefficient, aka Sørensen index, Dice's coefficient or Czekanowski's binary (non-quantitative) index.
SorensenDice() - Constructor for class info.debatty.java.stringsimilarity.SorensenDice
Sorensen-Dice coefficient, aka Sørensen index, Dice's coefficient or Czekanowski's binary (non-quantitative) index.
SparseBooleanVector - Class in info.debatty.java.utils
 
SparseBooleanVector(int) - Constructor for class info.debatty.java.utils.SparseBooleanVector
 
SparseBooleanVector() - Constructor for class info.debatty.java.utils.SparseBooleanVector
 
SparseBooleanVector(HashMap<Integer, Integer>) - Constructor for class info.debatty.java.utils.SparseBooleanVector
 
SparseBooleanVector(boolean[]) - Constructor for class info.debatty.java.utils.SparseBooleanVector
 
SparseDoubleVector - Class in info.debatty.java.utils
Sparse vector of double, implemented using two arrays.
SparseDoubleVector(int) - Constructor for class info.debatty.java.utils.SparseDoubleVector
 
SparseDoubleVector() - Constructor for class info.debatty.java.utils.SparseDoubleVector
 
SparseDoubleVector(HashMap<Integer, Double>) - Constructor for class info.debatty.java.utils.SparseDoubleVector
 
SparseDoubleVector(double[]) - Constructor for class info.debatty.java.utils.SparseDoubleVector
 
SparseDoubleVectorExample - Class in info.debatty.java.stringsimilarity.examples
 
SparseDoubleVectorExample() - Constructor for class info.debatty.java.stringsimilarity.examples.SparseDoubleVectorExample
 
SparseIntegerVector - Class in info.debatty.java.utils
Sparse vector of int, implemented using two arrays.
SparseIntegerVector(int) - Constructor for class info.debatty.java.utils.SparseIntegerVector
Sparse vector of int, implemented using two arrays.
SparseIntegerVector() - Constructor for class info.debatty.java.utils.SparseIntegerVector
Sparse vector of int, implemented using two arrays.
SparseIntegerVector(HashMap<Integer, Integer>) - Constructor for class info.debatty.java.utils.SparseIntegerVector
Sparse vector of int, implemented using two arrays.
SparseIntegerVector(int[]) - Constructor for class info.debatty.java.utils.SparseIntegerVector
Sparse vector of int, implemented using two arrays.
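
A two-array sparse vector can be pictured as in the minimal sketch below: one array stores the positions of the non-zero entries, the other stores their values, and a dot product walks both vectors in a single merge pass. SparseIntVectorSketch is a hypothetical class, and keeping the keys in ascending order is an assumption of this example, not a statement about the library's internals.

```java
// Hypothetical illustration class; not part of info.debatty.java.utils.
final class SparseIntVectorSketch {

    private final int[] keys;   // positions of the non-zero entries, ascending (assumption)
    private final int[] values; // values[i] is the value stored at position keys[i]

    /** Build the sparse form of a dense array by keeping only non-zero entries. */
    SparseIntVectorSketch(int[] dense) {
        int nonZero = 0;
        for (int v : dense) {
            if (v != 0) {
                nonZero++;
            }
        }
        keys = new int[nonZero];
        values = new int[nonZero];
        int next = 0;
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0) {
                keys[next] = i;
                values[next] = dense[i];
                next++;
            }
        }
    }

    /** Number of non-zero elements. */
    int size() {
        return keys.length;
    }

    /** Dot product computed by advancing through both sorted key arrays. */
    long dotProduct(SparseIntVectorSketch other) {
        long sum = 0;
        int i = 0;
        int j = 0;
        while (i < keys.length && j < other.keys.length) {
            if (keys[i] == other.keys[j]) {
                sum += (long) values[i] * other.values[j];
                i++;
                j++;
            } else if (keys[i] < other.keys[j]) {
                i++;
            } else {
                j++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        SparseIntVectorSketch a = new SparseIntVectorSketch(new int[] {1, 0, 0, 2, 3});
        SparseIntVectorSketch b = new SparseIntVectorSketch(new int[] {0, 4, 0, 5, 0});
        System.out.println(a.dotProduct(b)); // prints 10
    }
}
```

Only position 3 is non-zero in both vectors, so the product is 2 * 5 = 10.
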
StringDistance - Interface in info.debatty.java.stringsimilarity.interfaces
 
StringSimilarity - Interface in info.debatty.java.stringsimilarity.interfaces
 

T

toArray(int) - Method in class info.debatty.java.utils.SparseDoubleVector
Return the array corresponding to this sparse vector.
toString() - Method in class info.debatty.java.utils.SparseBooleanVector
 
toString() - Method in class info.debatty.java.utils.SparseDoubleVector
 
toString() - Method in class info.debatty.java.utils.SparseIntegerVector
 

U

union(SparseBooleanVector) - Method in class info.debatty.java.utils.SparseBooleanVector
 
union(SparseDoubleVector) - Method in class info.debatty.java.utils.SparseDoubleVector
 
union(SparseIntegerVector) - Method in class info.debatty.java.utils.SparseIntegerVector
Compute the size of the union of these two vectors.

V

values - Variable in class info.debatty.java.utils.SparseDoubleVector
 

W

WeightedLevenshtein - Class in info.debatty.java.stringsimilarity
Implementation of Levenshtein that allows defining different weights for different character substitutions.
WeightedLevenshtein(CharacterSubstitutionInterface) - Constructor for class info.debatty.java.stringsimilarity.WeightedLevenshtein
Instantiate with the provided character substitution.

Copyright © 2017. All rights reserved.