Class StringUtilities

java.lang.Object
com.cedarsoftware.util.StringUtilities

public final class StringUtilities extends Object
Comprehensive utility class for string operations providing enhanced manipulation, comparison, and conversion capabilities with null-safe implementations.

Key Features

  • String Comparison:
    • Case-sensitive and case-insensitive equality
    • Comparison with automatic trimming
    • Null-safe operations
    • CharSequence support
  • Content Analysis:
    • Empty and whitespace checking
    • String length calculations
    • Character/substring counting
    • Pattern matching with wildcards
  • String Manipulation:
    • Advanced trimming operations
    • Quote handling
    • Encoding conversions
    • Random string generation
  • Distance Metrics:
    • Levenshtein distance calculation
    • Damerau-Levenshtein distance calculation

Usage Examples

String Comparison:


 // Case-sensitive and case-insensitive comparison
 boolean same = StringUtilities.equals("text", "text");                       // true
 boolean sameIgnoringCase = StringUtilities.equalsIgnoreCase("Text", "text"); // true

 // Comparison with trimming
 boolean sameAfterTrim = StringUtilities.equalsWithTrim(" text ", "text");    // true
 

Content Checking:


 // Empty and whitespace checking
 boolean blankIsEmpty = StringUtilities.isEmpty("   ");     // true
 boolean nullIsEmpty = StringUtilities.isEmpty(null);       // true
 boolean hasContent = StringUtilities.hasContent(" text "); // true

 // Length calculations
 int nullLength = StringUtilities.length(null);            // 0
 int trimmedLength = StringUtilities.trimLength(" text "); // 4
 

String Manipulation:


 // Trimming operations
 String empty = StringUtilities.trimToEmpty(null);    // ""
 String nothing = StringUtilities.trimToNull("  ");   // null
 String fallback = StringUtilities.trimEmptyToDefault("  ", "default");  // "default"

 // Quote handling
 String unquoted = StringUtilities.removeLeadingAndTrailingQuotes("\"text\"");  // text

 // Set conversion
 Set<String> set = StringUtilities.commaSeparatedStringToSet("a,b,c");  // [a, b, c]
 

Distance Calculations:


 // Edit distance metrics
 int levenshtein = StringUtilities.levenshteinDistance("kitten", "sitting");          // 3
 int damerauLevenshtein = StringUtilities.damerauLevenshteinDistance("book", "back"); // 2
 

Thread Safety

All methods in this class are stateless and thread-safe.

Author:
Ken Partlow, John DeRegnaucourt ([email protected])
Copyright (c) Cedar Software LLC

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

License

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
  • Field Details

    • FOLDER_SEPARATOR

      public static String FOLDER_SEPARATOR
    • EMPTY

      public static String EMPTY
  • Method Details

    • equals

      public static boolean equals(CharSequence cs1, CharSequence cs2)
      Compares two CharSequences, returning true if they represent equal sequences of characters.

      nulls are handled without exceptions. Two null references are considered to be equal. The comparison is case-sensitive.

      Parameters:
      cs1 - the first CharSequence, may be null
      cs2 - the second CharSequence, may be null
      Returns:
      true if the CharSequences are equal (case-sensitive), or both null
      See Also:
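      The null handling described above can be illustrated with a minimal sketch. The class name EqualsSketch is hypothetical, and this is not the library's implementation; it only mirrors the documented contract (two nulls are equal, the comparison is case-sensitive).

```java
final class EqualsSketch {
    private EqualsSketch() {}

    // Null-safe, case-sensitive comparison of two CharSequences.
    static boolean equals(CharSequence cs1, CharSequence cs2) {
        if (cs1 == cs2) {
            return true;                 // same reference, or both null
        }
        if (cs1 == null || cs2 == null) {
            return false;                // exactly one side is null
        }
        if (cs1.length() != cs2.length()) {
            return false;
        }
        for (int i = 0; i < cs1.length(); i++) {
            if (cs1.charAt(i) != cs2.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

      Because the parameters are CharSequences, a StringBuilder can be compared directly against a String without first calling toString().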
    • equals

      public static boolean equals(String s1, String s2)
      See Also:
    • equalsIgnoreCase

      public static boolean equalsIgnoreCase(CharSequence cs1, CharSequence cs2)
      Compares two CharSequences, returning true if they represent equal sequences of characters, ignoring case.

      nulls are handled without exceptions. Two null references are considered equal. The comparison is case-insensitive.

      Parameters:
      cs1 - the first CharSequence, may be null
      cs2 - the second CharSequence, may be null
      Returns:
      true if the CharSequences are equal (case-insensitive), or both null
      See Also:
    • equalsIgnoreCase

      public static boolean equalsIgnoreCase(String s1, String s2)
      See Also:
    • equalsWithTrim

      public static boolean equalsWithTrim(String s1, String s2)
    • equalsIgnoreCaseWithTrim

      public static boolean equalsIgnoreCaseWithTrim(String s1, String s2)
    • isEmpty

      public static boolean isEmpty(CharSequence cs)
      Checks if a CharSequence is empty (""), null, or only whitespace.
      Parameters:
      cs - the CharSequence to check, may be null
      Returns:
      true if the CharSequence is null, empty, or whitespace only
    • isEmpty

      public static boolean isEmpty(String s)
      See Also:
    • isWhitespace

      public static boolean isWhitespace(CharSequence cs)
      Checks if a CharSequence is empty (""), null or whitespace only.
      Parameters:
      cs - the CharSequence to check, may be null
      Returns:
      true if the CharSequence is null, empty or whitespace only
    • hasContent

      public static boolean hasContent(String s)
      Checks if a String is not empty (""), not null and not whitespace only.
      Parameters:
      s - the String to check, may be null
      Returns:
      true if the String is not null, not empty, and not whitespace only
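      The relationship between isWhitespace and hasContent can be sketched as follows. WhitespaceSketch is a hypothetical class, and the use of Character.isWhitespace to decide what counts as whitespace is an assumption; the library may use a different definition.

```java
final class WhitespaceSketch {
    private WhitespaceSketch() {}

    // True for null, "", or a sequence made up only of whitespace.
    // Assumption: whitespace is whatever Character.isWhitespace accepts.
    static boolean isWhitespace(CharSequence cs) {
        if (cs == null) {
            return true;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (!Character.isWhitespace(cs.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // hasContent is simply the negation of the check above.
    static boolean hasContent(CharSequence cs) {
        return !isWhitespace(cs);
    }
}
```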
    • length

      public static int length(CharSequence cs)
      Gets a CharSequence length or 0 if the CharSequence is null.
      Parameters:
      cs - a CharSequence or null
      Returns:
      CharSequence length or 0 if the CharSequence is null.
    • length

      public static int length(String s)
      See Also:
    • trimLength

      public static int trimLength(String s)
      Returns the length of the trimmed string. If the string is null, returns 0.
      Parameters:
      s - the string to get the trimmed length of
      Returns:
      the length of the trimmed string, or 0 if the input is null
    • lastIndexOf

      public static int lastIndexOf(String path, char ch)
    • decode

      public static byte[] decode(String s)
    • encode

      public static String encode(byte[] bytes)
      Convert a byte array into a printable format containing a String of hex digit characters (two per byte).
      Parameters:
      bytes - the byte array to convert
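      A minimal sketch of the two-hex-digits-per-byte format described above. HexSketch is a hypothetical class, and the choice of uppercase digits is an assumption; the documentation does not state which case the library emits.

```java
final class HexSketch {
    private HexSketch() {}

    // Assumption: uppercase hex digits.
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Writes two hex characters for every input byte.
    static String encode(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0x0F]); // high nibble
            sb.append(HEX[b & 0x0F]);        // low nibble
        }
        return sb.toString();
    }
}
```

      Masking with 0x0F after the shift matters because Java bytes are signed: without it, a byte such as (byte) 0xA0 would produce a negative array index.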
    • count

      public static int count(String s, char c)
    • count

      public static int count(CharSequence content, CharSequence token)
      Count the number of times that 'token' occurs within 'content'.
      Returns:
      the number of occurrences (0 if the token never occurs, or if either the content or the token is null).
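      One way to count token occurrences is sketched below. CountSketch is a hypothetical class; counting non-overlapping matches and returning 0 for an empty token are assumptions, since the documentation does not specify either case.

```java
final class CountSketch {
    private CountSketch() {}

    // Counts non-overlapping occurrences of token within content.
    static int count(CharSequence content, CharSequence token) {
        if (content == null || token == null) {
            return 0;                       // documented: null yields 0
        }
        String source = content.toString();
        String sub = token.toString();
        if (source.isEmpty() || sub.isEmpty()) {
            return 0;                       // assumption: empty token never matches
        }
        int count = 0;
        int idx = 0;
        while ((idx = source.indexOf(sub, idx)) != -1) {
            count++;
            idx += sub.length();            // skip past the match (non-overlapping)
        }
        return count;
    }
}
```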
    • wildcardToRegexString

      public static String wildcardToRegexString(String wildcard)
      Convert strings containing DOS-style '*' or '?' to a regex String.
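      The conversion can be sketched as a single pass over the wildcard string. WildcardSketch is a hypothetical class; anchoring the result with ^ and $ and the exact set of escaped metacharacters are assumptions, not the library's documented output.

```java
final class WildcardSketch {
    private WildcardSketch() {}

    // Regex metacharacters that must be escaped to match literally.
    private static final String META = "\\.[]{}()+-^$|";

    // '*' becomes ".*", '?' becomes ".", everything else is matched literally.
    static String wildcardToRegex(String wildcard) {
        StringBuilder sb = new StringBuilder("^");
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            switch (c) {
                case '*':
                    sb.append(".*");
                    break;
                case '?':
                    sb.append('.');
                    break;
                default:
                    if (META.indexOf(c) >= 0) {
                        sb.append('\\');
                    }
                    sb.append(c);
            }
        }
        return sb.append('$').toString();
    }
}
```

      The resulting string can be passed to String.matches or Pattern.compile; escaping the dot ensures that "*.txt" does not match "notesXtxt".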
    • levenshteinDistance

      public static int levenshteinDistance(CharSequence s, CharSequence t)
      The Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other. The phrase 'edit distance' is often used to refer specifically to Levenshtein distance.
      Parameters:
      s - String one
      t - String two
      Returns:
      the 'edit distance' (Levenshtein distance) between the two strings.
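      The definition above corresponds to the classic dynamic-programming recurrence, sketched here with two rolling rows. LevenshteinSketch is a hypothetical class, not the library's code, and it assumes both inputs are non-null.

```java
final class LevenshteinSketch {
    private LevenshteinSketch() {}

    // Minimum number of insertions, deletions, and substitutions
    // needed to turn s into t. Assumes s and t are non-null.
    static int distance(CharSequence s, CharSequence t) {
        int n = s.length();
        int m = t.length();
        if (n == 0) {
            return m;
        }
        if (m == 0) {
            return n;
        }
        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j;                       // "" -> t[0..j) takes j insertions
        }
        for (int i = 1; i <= n; i++) {
            curr[0] = i;                       // s[0..i) -> "" takes i deletions
            for (int j = 1; j <= m; j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;                  // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[m];
    }
}
```

      Keeping only two rows reduces memory from O(n*m) to O(m) while producing the same result as the full table.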
    • damerauLevenshteinDistance

      public static int damerauLevenshteinDistance(CharSequence source, CharSequence target)
      Calculates the Damerau-Levenshtein distance between two strings. The main difference from the plain Levenshtein algorithm is that Damerau-Levenshtein counts a transposition of two adjacent characters as 1 edit instead of 2. This breaks the triangle inequality, which makes it unsuitable for metric trees. See the Wikipedia pages on both Levenshtein and Damerau-Levenshtein distance to decide which algorithm is appropriate for your situation.
      Parameters:
      source - Source input string
      target - Target input string
      Returns:
      The number of edit operations it would take to make the source string identical to the target string
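      The adjacent-swap rule can be sketched with the restricted variant usually called optimal string alignment (OSA). This is a named substitute chosen for brevity; whether the library implements this variant or the unrestricted algorithm is not stated here. OsaDistanceSketch is a hypothetical class and assumes non-null inputs.

```java
final class OsaDistanceSketch {
    private OsaDistanceSketch() {}

    // Levenshtein distance extended with a cost-1 swap of adjacent characters.
    static int distance(CharSequence source, CharSequence target) {
        int n = source.length();
        int m = target.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= m; j++) {
            d[0][j] = j;
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = source.charAt(i - 1) == target.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                // Transposition: the last two characters are swapped.
                if (i > 1 && j > 1
                        && source.charAt(i - 1) == target.charAt(j - 2)
                        && source.charAt(i - 2) == target.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }
}
```

      In this sketch "ca" to "ac" costs 1 and "ac" to "abc" costs 1, yet "ca" to "abc" costs 3, which is a concrete instance of the triangle inequality failing.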
    • getRandomString

      public static String getRandomString(Random random, int minLen, int maxLen)
      Parameters:
      random - Random instance
      minLen - minimum number of characters
      maxLen - maximum number of characters
      Returns:
      String of alphabetical characters, with the first character uppercase (Proper case strings).
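      A proper-case random string of this kind could be generated as sketched below. RandomStringSketch is a hypothetical class; treating both minLen and maxLen as inclusive bounds and restricting output to A-Z and a-z are assumptions.

```java
import java.util.Random;

final class RandomStringSketch {
    private RandomStringSketch() {}

    // Returns a string of letters whose first character is uppercase.
    // Assumption: the length is uniform in [minLen, maxLen], both inclusive.
    static String randomProperCase(Random random, int minLen, int maxLen) {
        int len = minLen + random.nextInt(maxLen - minLen + 1);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            char base = (i == 0) ? 'A' : 'a';   // uppercase only at index 0
            sb.append((char) (base + random.nextInt(26)));
        }
        return sb.toString();
    }
}
```

      Passing a seeded Random makes the output reproducible, which is convenient in tests.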
    • getRandomChar

      public static String getRandomChar(Random random, boolean upper)
    • getBytes

      public static byte[] getBytes(String s, String encoding)
      Convert a String into a byte[] with a particular encoding. Preferably used when the encoding is one that every Java platform is guaranteed to support and you don't want to have to catch the UnsupportedEncodingException required by Java.
      Parameters:
      s - string to encode into bytes
      encoding - encoding to use
    • createUtf8String

      public static String createUtf8String(byte[] bytes)
      Convert a byte[] into a UTF-8 String. Preferably used when you don't want to have to catch the UnsupportedEncodingException required by Java.
      Parameters:
      bytes - bytes to encode into a string
    • getUTF8Bytes

      public static byte[] getUTF8Bytes(String s)
      Convert a String into a byte[] encoded by UTF-8.
      Parameters:
      s - string to encode into bytes
    • createString

      public static String createString(byte[] bytes, String encoding)
      Convert a byte[] into a String with a particular encoding. Preferably used when the encoding is one that every Java platform is guaranteed to support and you don't want to have to catch the UnsupportedEncodingException required by Java.
      Parameters:
      bytes - bytes to encode into a string
      encoding - encoding to use
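      The pattern behind getBytes and createString, hiding the checked UnsupportedEncodingException, can be sketched as follows. EncodingSketch is a hypothetical class; rethrowing as IllegalArgumentException and passing null through unchanged are assumptions about behavior the documentation does not specify.

```java
import java.io.UnsupportedEncodingException;

final class EncodingSketch {
    private EncodingSketch() {}

    // String -> byte[] without a checked exception at the call site.
    static byte[] getBytes(String s, String encoding) {
        try {
            return s == null ? null : s.getBytes(encoding);
        } catch (UnsupportedEncodingException e) {
            // Assumption: surface the failure as an unchecked exception.
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    // byte[] -> String, mirroring getBytes.
    static String createString(byte[] bytes, String encoding) {
        try {
            return bytes == null ? null : new String(bytes, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }
}
```

      In new code, the String.getBytes(Charset) overload with java.nio.charset.StandardCharsets.UTF_8 avoids the checked exception entirely.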
    • createUTF8String

      public static String createUTF8String(byte[] bytes)
      Convert a byte[] into a UTF-8 encoded String.
      Parameters:
      bytes - bytes to encode into a string
    • hashCodeIgnoreCase

      public static int hashCodeIgnoreCase(String s)
      Get the hashCode of a String, insensitive to case, without any new Strings being created on the heap.
      Parameters:
      s - String input
      Returns:
      int hashCode of input String insensitive to case
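      A case-insensitive hash that allocates no intermediate String can be sketched by folding each character to lower case inside the hashing loop. CaseInsensitiveHashSketch is a hypothetical class; the 31-multiplier polynomial and the value 0 for null are assumptions rather than the library's documented formula.

```java
final class CaseInsensitiveHashSketch {
    private CaseInsensitiveHashSketch() {}

    // Hashes the lower-cased characters one at a time, so no
    // lower-cased copy of the string is ever created.
    static int hashCodeIgnoreCase(String s) {
        if (s == null) {
            return 0;                     // assumption: null hashes to 0
        }
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = 31 * hash + Character.toLowerCase(s.charAt(i));
        }
        return hash;
    }
}
```

      A hash like this pairs naturally with equalsIgnoreCase when strings are used as case-insensitive map keys.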
    • trim

      public static String trim(String str)
      Removes control characters (char <= 32) from both ends of this String, handling null by returning null.

      The String is trimmed using String.trim(). Trim removes start and end characters <= 32.

      Parameters:
      str - the String to be trimmed, may be null
      Returns:
      the trimmed string, null if null String input
    • trimToEmpty

      public static String trimToEmpty(String value)
      Trims a string. This method is null-safe: a null input returns an empty string.
      Parameters:
      value - string input
      Returns:
      the trimmed string, or an empty string if value was null
    • trimToNull

      public static String trimToNull(String value)
      Trims a string. If the string trims to empty, null is returned.
      Parameters:
      value - string input
      Returns:
      the trimmed string, or null if the trimmed result is empty
    • trimEmptyToDefault

      public static String trimEmptyToDefault(String value, String defaultValue)
      Trims a string. If the string trims to empty, the default value is returned.
      Parameters:
      value - string input
      defaultValue - value to return on empty or null
      Returns:
      trimmed string, or defaultValue when null or empty
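      The three trimming variants differ only in what they return for null or blank input, as this minimal sketch shows. TrimSketch is a hypothetical class built on String.trim(), not the library's source.

```java
final class TrimSketch {
    private TrimSketch() {}

    // null -> "", otherwise the trimmed value.
    static String trimToEmpty(String value) {
        return value == null ? "" : value.trim();
    }

    // null or blank -> null, otherwise the trimmed value.
    static String trimToNull(String value) {
        if (value == null) {
            return null;
        }
        String trimmed = value.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    // null or blank -> defaultValue, otherwise the trimmed value.
    static String trimEmptyToDefault(String value, String defaultValue) {
        String trimmed = trimToNull(value);
        return trimmed == null ? defaultValue : trimmed;
    }
}
```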
    • removeLeadingAndTrailingQuotes

      public static String removeLeadingAndTrailingQuotes(String input)
      Removes all leading and trailing double quotes from a String. Multiple consecutive quotes at the beginning or end of the string will all be removed.

      Examples:

      • "text" → text
      • ""text"" → text
      • """text""" → text
      • "text with "quotes" inside" → text with "quotes" inside
      Parameters:
      input - the String from which to remove quotes (may be null)
      Returns:
      the String with all leading and trailing quotes removed, or null if input was null
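      The behavior shown in the examples above can be sketched with two index scans, one from each end. QuoteSketch is a hypothetical class written to match the documented examples; it is not the library's implementation.

```java
final class QuoteSketch {
    private QuoteSketch() {}

    // Drops every double quote at the start and end; inner quotes are kept.
    static String strip(String input) {
        if (input == null) {
            return null;
        }
        int start = 0;
        int end = input.length();
        while (start < end && input.charAt(start) == '"') {
            start++;
        }
        while (end > start && input.charAt(end - 1) == '"') {
            end--;
        }
        return input.substring(start, end);
    }
}
```

      Under this sketch, a string consisting only of quotes reduces to the empty string.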
    • commaSeparatedStringToSet

      public static Set<String> commaSeparatedStringToSet(String commaSeparatedString)
      Converts a comma-separated string into a Set of trimmed, non-empty strings.

      This method splits the provided string by commas, trims whitespace from each resulting substring, filters out any empty strings, and collects the unique strings into a Set. If the input string is null or empty after trimming, the method returns an empty set.

      Usage Example:

      
       String csv = "apple, banana, cherry, apple,  ";
       Set<String> fruitSet = commaSeparatedStringToSet(csv);
       // fruitSet contains ["apple", "banana", "cherry"]
       

      Note: The resulting Set does not maintain the insertion order. If order preservation is required, consider using a LinkedHashSet.

      Parameters:
      commaSeparatedString - the comma-separated string to convert
      Returns:
      a Set containing the trimmed, unique, non-empty substrings from the input string. Returns an empty set if the input is null, empty, or contains only whitespace.
      Throws:
      IllegalArgumentException - if the method is modified to disallow null inputs in the future
      See Also:
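      For callers who need insertion order, as the note above suggests, an order-preserving variant can be sketched with a LinkedHashSet. CsvSetSketch and its toSet method are hypothetical names; the split, trim, and filter steps follow the description of this method.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class CsvSetSketch {
    private CsvSetSketch() {}

    // Splits on commas, trims each piece, drops empties, keeps first-seen order.
    static Set<String> toSet(String commaSeparatedString) {
        Set<String> result = new LinkedHashSet<>();
        if (commaSeparatedString == null) {
            return result;                  // null yields an empty set
        }
        for (String part : commaSeparatedString.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);        // duplicates are ignored by the set
            }
        }
        return result;
    }
}
```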