Interface | Description |
---|---|
StringSimilarityInterface | |

Class | Description |
---|---|
JaroWinkler | |
KShingling | A k-shingling is a set of unique k-grams, used to measure the similarity of two documents. |
Levenshtein | The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. |
LongestCommonSubsequence | The longest common subsequence (LCS) problem consists in finding the longest subsequence common to two (or more) sequences. |
NGram | N-Gram Similarity as defined by Kondrak, "N-Gram Similarity and Distance", String Processing and Information Retrieval, Lecture Notes in Computer Science Volume 3772, 2005, pp 115-126. |
QGram | QGram similarity is the relative number of n-grams both strings have in common. |
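
The definitions in the class table can be illustrated with a minimal, self-contained sketch. This is not the code of the classes listed above: the class name `SimilaritySketch` and its methods `levenshtein`, `lcsLength` and `shingles` are hypothetical names chosen for this example, and the textbook dynamic-programming formulations used here are an assumption about how such measures are typically computed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of the documented package.
public final class SimilaritySketch {

    private SimilaritySketch() {}

    // Minimum number of single-character insertions, deletions or
    // substitutions needed to turn a into b (two-row dynamic programming).
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Length of the longest subsequence common to a and b.
    public static int lcsLength(String a, String b) {
        int[][] len = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    len[i][j] = len[i - 1][j - 1] + 1;
                } else {
                    len[i][j] = Math.max(len[i - 1][j], len[i][j - 1]);
                }
            }
        }
        return len[a.length()][b.length()];
    }

    // Set of unique k-grams (a k-shingling) of s; empty if s is shorter than k.
    public static Set<String> shingles(String s, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
        System.out.println(lcsLength("AGCAT", "GAC"));        // prints 2
        System.out.println(shingles("abab", 2).size());       // prints 2
    }
}
```

The two-row table in `levenshtein` keeps memory proportional to the length of the second string, while `lcsLength` uses the full table for readability; either choice could be swapped without changing the results.
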
Copyright © 2015. All rights reserved.