Class Jaccard

    • Constructor Detail

      • Jaccard

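        public Jaccard(int k)
        Creates a Jaccard similarity with shingle length k. The strings are first transformed into sets of k-shingles (sequences of k consecutive characters), then the Jaccard index is computed as |A ∩ B| / |A ∪ B|, where A and B are the shingle sets of the two strings. The default value of k is 3.
        Parameters:
        k - The length of each shingle, in characters.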
      • Jaccard

        public Jaccard()
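        Creates a Jaccard similarity with the default shingle length k = 3. The strings are first transformed into sets of k-shingles (sequences of k consecutive characters), then the Jaccard index is computed as |A ∩ B| / |A ∪ B|, where A and B are the shingle sets of the two strings.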
    • Method Detail

      • similarity

        public final double similarity(String s1,
                                       String s2)
        Computes the Jaccard index |A ∩ B| / |A ∪ B|, where A and B are the sets of k-shingles of s1 and s2.
        Specified by:
        similarity in interface StringSimilarity
        Parameters:
        s1 - The first string to compare.
        s2 - The second string to compare.
        Returns:
        The Jaccard index, in the range [0, 1]. Higher values indicate more similar strings.
        Throws:
        NullPointerException - if s1 or s2 is null.
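
The computation described above can be illustrated with a small standalone sketch. This is not the library's source: the class name JaccardSketch and the helpers shingles and jaccard are hypothetical, and the handling of identical strings and of strings shorter than k is an assumption made only so the example never divides by zero.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of Jaccard similarity over k-shingles.
// Not the library's implementation.
public class JaccardSketch {

    // Collect the distinct substrings of length k (k-shingles) of s.
    static Set<String> shingles(String s, int k) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    // Jaccard index |A inter B| / |A union B| of the two shingle sets.
    static double jaccard(String s1, String s2, int k) {
        if (s1 == null || s2 == null) {
            throw new NullPointerException("strings must not be null");
        }
        // Assumption: identical strings score 1, which also covers
        // the case where both shingle sets are empty.
        if (s1.equals(s2)) {
            return 1.0;
        }
        Set<String> a = shingles(s1, k);
        Set<String> b = shingles(s2, k);

        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        // Assumption: different strings that are both shorter than k score 0.
        if (union.isEmpty()) {
            return 0.0;
        }

        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);

        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // "ABCDE" -> {ABC, BCD, CDE}, "ABCDF" -> {ABC, BCD, CDF}
        // intersection = 2, union = 4
        System.out.println(jaccard("ABCDE", "ABCDF", 3)); // prints 0.5
    }
}
```

With k = 3, the strings "ABCDE" and "ABCDF" share two of four distinct shingles, so the sketch returns 0.5. A smaller k makes the measure more tolerant of local edits, while a larger k requires longer common runs of characters.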