Package

com.github.vickumar1981

stringdistance

package stringdistance

Provides classes for calculating distances and fuzzy match similarities between two strings. Also provides implicits for using distance and fuzzy match scores as an operator, like:

val result = "abc" levenshtein "abc"

Includes functionality for phonetic comparisons between strings.

Overview

The main class to use is com.github.vickumar1981.stringdistance.StringDistance.

If you import com.github.vickumar1981.stringdistance.StringConverter._, you can use the string distance and score functions as operators between two strings.
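To illustrate how an import can turn a distance function into an operator on strings, here is a minimal, self-contained sketch. It does not use this library: `OperatorSketch`, `DistanceOps`, and the Levenshtein implementation below are hypothetical stand-ins, and the implicits that StringConverter actually provides may be written differently.

```scala
object OperatorSketch {
  // Two-row dynamic-programming Levenshtein distance (illustrative only).
  def levenshteinDist(s1: String, s2: String): Int = {
    // prev(j) = distance between the processed prefix of s1 and s2.take(j)
    var prev: Array[Int] = Array.range(0, s2.length + 1)
    for (i <- 1 to s1.length) {
      val curr = new Array[Int](s2.length + 1)
      curr(0) = i
      for (j <- 1 to s2.length) {
        val cost = if (s1(i - 1) == s2(j - 1)) 0 else 1
        curr(j) = math.min(
          math.min(curr(j - 1) + 1, prev(j) + 1), // insertion, deletion
          prev(j - 1) + cost                      // substitution or match
        )
      }
      prev = curr
    }
    prev(s2.length)
  }

  // Hypothetical stand-in for the kind of implicit that StringConverter exposes:
  // it adds a levenshteinDist method to every String in scope of the import.
  implicit class DistanceOps(private val s1: String) extends AnyVal {
    def levenshteinDist(s2: String): Int = OperatorSketch.levenshteinDist(s1, s2)
  }
}

object OperatorSketchDemo {
  import OperatorSketch._

  // Operator (infix) form and ordinary method-call form are equivalent.
  val viaInfix: Int = "kitten" levenshteinDist "sitting"   // 3
  val viaDot: Int = "kitten".levenshteinDist("sitting")    // 3
}
```

The wildcard import is what brings the implicit class into scope; without it, neither call form compiles.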

To compare two strings phonetically, i.e. to check whether they sound alike, use the com.github.vickumar1981.stringdistance.StringSound object.

To use in Java, please use the corresponding classes in the com.github.vickumar1981.stringdistance.util package.

| Class | Description |
| :--- | :--- |
| com.github.vickumar1981.stringdistance.StringDistance | Singleton class with fuzzy match scores and distances |
| com.github.vickumar1981.stringdistance.StringConverter | Implicit conversions between strings s1 and s2 |
| com.github.vickumar1981.stringdistance.StringSound | Phonetic comparison between strings s1 and s2 |
| com.github.vickumar1981.stringdistance.util.StringDistance | Java class for fuzzy match scores and distances |
| com.github.vickumar1981.stringdistance.util.StringSound | Java class for phonetic comparison between strings s1 and s2 |

Linear Supertypes
AnyRef, Any

Type Members

  1. trait CosineAlgorithm extends StringMetricAlgorithm

    A marker interface for the cosine similarity algorithm.

  2. class CosineSimilarityImplWrapper extends CosSimilarityImpl

    Java wrapper for cosine similarity.

  3. trait DamerauLevenshteinAlgorithm extends StringMetricAlgorithm

    A marker interface for the Damerau-Levenshtein distance algorithm.

  4. trait DiceCoefficientAlgorithm extends StringMetricAlgorithm

    A marker interface for the Dice coefficient algorithm.

  5. class DiceCoefficientImplWrapper extends DiceCoefficientImpl

    Java wrapper for Dice coefficient similarity.

  6. trait DistanceAlgorithm[+T <: StringMetricAlgorithm] extends AnyRef

    A type class to extend a distance method to StringMetricAlgorithm.

  7. trait HammingAlgorithm extends StringMetricAlgorithm

    A marker interface for the Hamming distance algorithm.

  8. class HammingImplWrapper extends HammingImpl

    Java wrapper for Hamming distance.

  9. trait JaccardAlgorithm extends StringMetricAlgorithm

    A marker interface for the Jaccard similarity algorithm.

  10. class JaccardImplWrapper extends JaccardImpl

    Java wrapper for Jaccard similarity.

  11. trait JaroAlgorithm extends StringMetricAlgorithm

    A marker interface for the Jaro similarity algorithm.

  12. class JaroImplWrapper extends JaroImpl

    Java wrapper for Jaro and Jaro-Winkler similarity.

  13. trait JaroWinklerAlgorithm extends StringMetricAlgorithm

    A marker interface for the Jaro-Winkler algorithm.

  14. trait LevenshteinAlgorithm extends StringMetricAlgorithm

    A marker interface for the Levenshtein distance algorithm.

  15. class LevenshteinDistanceImplWrapper extends LevenshteinDistanceImpl

    Java wrapper for Levenshtein distance.

  16. trait LongestCommonSeqAlorithm extends StringMetricAlgorithm

    A marker interface for the longest common subsequence algorithm.

  17. class LongestCommonSeqWrapper extends LongestCommonSeqImpl

    Java wrapper for longest common subsequence.

  18. trait MetaphoneAlgorithm extends StringMetricAlgorithm

    A marker interface for the Metaphone algorithm.

  19. class MetaphoneImplWrapper extends MetaphoneImpl

    Java wrapper for Metaphone similarity.

  20. trait NGramAlgorithm extends StringMetricAlgorithm

    A marker interface for the n-gram similarity algorithm.

  21. class NGramImplWrapper extends NGramImpl

    Java wrapper for n-gram similarity.

  22. trait NeedlemanWunschAlgorithm extends StringMetricAlgorithm

    A marker interface for the Needleman-Wunsch similarity algorithm.

  23. class NeedlemanWunschImplWrapper extends NeedlemanWunschImpl

    Java wrapper for Needleman-Wunsch similarity.

  24. trait OverlapAlgorithm extends StringMetricAlgorithm

    A marker interface for the overlap similarity algorithm.

  25. class OverlapImplWrapper extends OverlapImpl

    Java wrapper for overlap similarity.

  26. trait ScorableFromDistance[+T <: StringMetricAlgorithm] extends ScoringAlgorithm[T]

    A mix-in trait to extend a score method using the distance method to StringMetricAlgorithm.

  27. trait ScoringAlgorithm[+T <: StringMetricAlgorithm] extends AnyRef

    A type class to extend a score method to StringMetricAlgorithm.

  28. trait SmithWatermanAlgorithm extends StringMetricAlgorithm

    A marker interface for the Smith-Waterman similarity algorithm.

  29. trait SmithWatermanGotohAlgorithm extends StringMetricAlgorithm

    A marker interface for the Smith-Waterman-Gotoh similarity algorithm.

  30. class SmithWatermanImplWrapper extends SmithWatermanImpl

    Java wrapper for Smith-Waterman similarity.

  31. trait SoundScoringAlgorithm[+T <: StringMetricAlgorithm] extends AnyRef

    A type class to extend a sound score method to StringMetricAlgorithm.

  32. trait SoundexAlgorithm extends StringMetricAlgorithm

    A marker interface for the Soundex similarity algorithm.

  33. class SoundexImplWrapper extends SoundexImpl

    Java wrapper for Soundex similarity.

  34. trait StringMetric[A <: StringMetricAlgorithm] extends AnyRef

    Defines the implementation for a StringMetricAlgorithm by adding implicit definitions from DistanceAlgorithm, ScoringAlgorithm, WeightedDistanceAlgorithm, or WeightedScoringAlgorithm.

  35. trait StringMetricAlgorithm extends AnyRef

    A marker interface for the string metric algorithm.

  36. trait TverskyAlgorithm extends StringMetricAlgorithm

    A marker interface for the Tversky similarity algorithm.

  37. trait WeightedDistanceAlgorithm[+A <: StringMetricAlgorithm, B] extends AnyRef

    A type class to extend a distance method with a 2nd typed parameter to StringMetricAlgorithm.

  38. trait WeightedScoringAlgorithm[+A <: StringMetricAlgorithm, B] extends AnyRef

    A type class to extend a score method with a 2nd typed parameter to StringMetricAlgorithm.
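The marker-trait and type-class arrangement listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the library's source: `MetricAlgorithm`, `HammingLike`, `Distance`, `ScoreFromDistance`, `HammingInstance`, and `scoreWith` are hypothetical names, and the normalization used for the score (one minus the distance divided by the longer length) is an assumption about how a score might be derived from a distance.

```scala
object TypeClassSketch {
  // Marker traits carry no behaviour; they only name an algorithm at the type level.
  trait MetricAlgorithm
  trait HammingLike extends MetricAlgorithm

  // Type class that attaches a distance method to a marker type.
  trait Distance[+T <: MetricAlgorithm] {
    def distance(s1: String, s2: String): Int
  }

  // Mix-in that derives a score in [0, 1] from the distance.
  // Assumed normalization: 1 - distance / max(length of s1, length of s2).
  trait ScoreFromDistance[+T <: MetricAlgorithm] extends Distance[T] {
    def score(s1: String, s2: String): Double = {
      val maxLen = math.max(s1.length, s2.length)
      if (maxLen == 0) 1.0 else 1.0 - distance(s1, s2).toDouble / maxLen
    }
  }

  // One instance: count the positions at which two equal-length strings differ.
  implicit object HammingInstance extends ScoreFromDistance[HammingLike] {
    def distance(s1: String, s2: String): Int = {
      require(s1.length == s2.length, "Hamming distance needs equal-length strings")
      s1.zip(s2).count { case (a, b) => a != b }
    }
  }

  // The algorithm is chosen by the type parameter; the instance is found implicitly.
  def scoreWith[T <: MetricAlgorithm](s1: String, s2: String)(
      implicit alg: ScoreFromDistance[T]): Double = alg.score(s1, s2)
}

object TypeClassSketchDemo {
  import TypeClassSketch._

  val dist: Int = HammingInstance.distance("martha", "marhta")    // 2
  val score: Double = scoreWith[HammingLike]("martha", "marhta")  // 1 - 2/6, about 0.667
}
```

Keeping the marker trait separate from the instance means a new algorithm only needs a new marker and one implicit instance; callers select it by type.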

Value Members

  1. object Strategy

    Provides two regular-expression strategies for splitting input. Strategy.splitWord splits a word into a sequence of characters. Strategy.splitSentence splits a sentence into a sequence of words.

  2. object StringConverter

    Object to extend operations to the String class.

    import com.github.vickumar1981.stringdistance.StringConverter._
    
    // Scores between two strings
    val cosSimilarity: Double = "hello".cosine("chello")
    val damerau: Double = "martha".damerau("marhta")
    val diceCoefficient: Double = "martha".diceCoefficient("marhta")
    val hamming: Double = "martha".hamming("marhta")
    val jaccard: Double = "karolin".jaccard("kathrin")
    val jaro: Double = "martha".jaro("marhta")
    val jaroWinkler: Double = "martha".jaroWinkler("marhta")
    val levenshtein: Double = "martha".levenshtein("marhta")
    val needlemanWunsch: Double = "martha".needlemanWunsch("marhta")
    val ngramSimilarity: Double = "karolin".nGram("kathrin")
    val bigramSimilarity: Double = "karolin".nGram("kathrin", 2)
    val overlap: Double = "karolin".overlap("kathrin")
    val smithWaterman: Double = "martha".smithWaterman("marhta")
    val smithWatermanGotoh: Double = "martha".smithWatermanGotoh("marhta")
    val tversky: Double = "karolin".tversky("kathrin", 0.5)
    
    // Distances between two strings
    val damerauDist: Int = "martha".damerauDist("marhta")
    val hammingDist: Int = "martha".hammingDist("marhta")
    val levenshteinDist: Int = "martha".levenshteinDist("marhta")
    val longestCommonSeq: Int = "martha".longestCommonSeq("marhta")
    val ngramDist: Int = "karolin".nGramDist("kathrin")
    val bigramDist: Int = "karolin".nGramDist("kathrin", 2)
    
    // Phonetic similarity of two strings
    val metaphone: Boolean = "merci".metaphone("mercy")
    val soundex: Boolean = "merci".soundex("mercy")
  3. object StringDistance

    Main class to organize functionality of different string distance algorithms

    import com.github.vickumar1981.stringdistance.StringDistance._
    
    // Scores between strings
    val cosSimilarity: Double = Cosine.score("hello", "chello")
    val damerau: Double = Damerau.score("martha", "marhta")
    val diceCoefficient: Double = DiceCoefficient.score("martha", "marhta")
    val hamming: Double = Hamming.score("martha", "marhta")
    val jaccard: Double = Jaccard.score("karolin", "kathrin")
    val jaro: Double = Jaro.score("martha", "marhta")
    val jaroWinkler: Double = JaroWinkler.score("martha", "marhta")
    val levenshtein: Double = Levenshtein.score("martha", "marhta")
    val needlemanWunsch: Double = NeedlemanWunsch.score("martha", "marhta")
    val ngramSimilarity: Double = NGram.score("karolin", "kathrin")
    val bigramSimilarity: Double = NGram.score("karolin", "kathrin", 2)
    val overlap: Double = Overlap.score("karolin", "kathrin")
    val smithWaterman: Double = SmithWaterman.score("martha", "marhta")
    val smithWatermanGotoh: Double = SmithWatermanGotoh.score("martha", "marhta")
    val tversky: Double = Tversky.score("karolin", "kathrin", 0.5)
    
    // Distances between strings
    val damerauDist: Int = Damerau.distance("martha", "marhta")
    val hammingDist: Int = Hamming.distance("martha", "marhta")
    val levenshteinDist: Int = Levenshtein.distance("martha", "marhta")
    val longestCommonSubSeq: Int = LongestCommonSeq.distance("martha", "marhta")
    val ngramDist: Int = NGram.distance("karolin", "kathrin")
    val bigramDist: Int = NGram.distance("karolin", "kathrin", 2)
  4. object StringSound

    Main class to organize functionality of different phonetic/sound string algorithms

    import com.github.vickumar1981.stringdistance.StringSound._
    
    // Phonetic similarity between strings
    val metaphone: Boolean = Metaphone.score("merci", "mercy")
    val soundex: Boolean = Soundex.score("merci", "mercy")
  5. package impl
  6. package implicits
  7. package interfaces
  8. package util
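The two splitting strategies described for the Strategy object above determine what counts as a token when strings are compared. The sketch below shows the idea only: `SplitSketch`, its method bodies, and `sharedTokens` are hypothetical, and the regular expressions the library actually uses are not shown in this page, so whitespace splitting here is an assumption.

```scala
object SplitSketch {
  // Assumed behaviour: a word becomes a sequence of one-character tokens.
  def splitWord(s: String): Seq[String] = s.toList.map(_.toString)

  // Assumed behaviour: a sentence becomes a sequence of whitespace-separated words.
  def splitSentence(s: String): Seq[String] =
    s.trim.split("\\s+").filter(_.nonEmpty).toSeq

  // Number of distinct tokens the two token sequences have in common.
  def sharedTokens(a: Seq[String], b: Seq[String]): Int =
    a.toSet.intersect(b.toSet).size
}

object SplitSketchDemo {
  import SplitSketch._

  // Character-level comparison: k, a, r, i, n are shared.
  val sharedChars: Int =
    sharedTokens(splitWord("karolin"), splitWord("kathrin"))                     // 5

  // Word-level comparison: "the" and "fox" are shared.
  val sharedWords: Int =
    sharedTokens(splitSentence("the quick fox"), splitSentence("the lazy fox"))  // 2
}
```

Set-based measures such as Jaccard or overlap give different results on the same input depending on which tokenization is applied.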
