All Superinterfaces: Column<String>, Comparator<String>, Iterable<String>

All Known Implementing Classes: AbstractStringColumn, StringColumn, TextColumn

public interface StringMapFunctions extends Column<String>

String utility functions. Each function takes one or more String columns as input and produces another Column as output. The resulting column need not be a string column.

This code was developed as part of Apache Commons Text.

Method Summary

Modifier and Type

Method

Description

default StringColumn

abbreviate(int maxWidth)

Abbreviates a String using ellipses.

default StringColumn

capitalize()

Capitalizes each String, changing the first character of each to title case as per Character.toTitleCase(int), as if in a sentence.

default StringColumn

commonPrefix(Column<String> column2)

default StringColumn

commonSuffix(Column<String> column2)

default StringColumn

concatenate(Object... stringsToAppend)

Returns a copy of this column with the given strings appended to each element.

default StringColumn

concatenate(Column<?>... stringColumns)

Returns a copy of this column with the corresponding value of each column argument appended to each element. getString is used to ensure that the values taken from the arguments are strings.

default DoubleColumn

countTokens(String separator)

default DoubleColumn

distance(Column<String> column2)

Returns a column containing the Levenshtein distance between the two given string columns.

default StringColumn

format(String formatString)

default StringColumn

join(String separator, Column<?>... columns)

Returns a copy of this column in which each element is joined, using the given separator, with the corresponding values of the given columns.

default DoubleColumn

length()

Returns a column containing the character length of each string in this column. The returned column is the same size as the original.

default StringColumn

lowerCase()

default StringColumn

padEnd(int minLength, char padChar)

default StringColumn

padStart(int minLength, char padChar)

default DoubleColumn

parseDouble()

Returns a DoubleColumn containing all the values of this string column as doubles, assuming all the values are stringified doubles in the first place.

default FloatColumn

parseFloat()

Returns a FloatColumn containing all the values of this string column as floats, assuming all the values are stringified floats in the first place.

default IntColumn

parseInt()

Returns an IntColumn containing all the values of this string column as integers, assuming all the values are stringified ints in the first place.

default StringColumn

repeat(int times)

Repeats each of the column's values elementwise, concatenating the results into a new StringColumn.

default StringColumn

replaceAll(String[] regexArray, String replacement)

Creates a new column, replacing each string in this column with a new string formed by replacing any substring that matches one of the given regexes.

default StringColumn

replaceAll(String regex, String replacement)

default StringColumn

replaceFirst(String regex, String replacement)

default StringColumn

substring(int start)

Returns a column containing the substrings from start to the end of the input

default StringColumn

substring(int start, int end)

default StringColumn

tokenizeAndRemoveDuplicates(String separator)

default StringColumn

tokenizeAndSort()

Splits on whitespace and returns the lexicographically sorted result.

default StringColumn

tokenizeAndSort(String separator)

default StringColumn

tokens(String separator)

Returns a column of arbitrary size containing each token in this column, where a token is defined using the given separator.

default StringColumn

trim()

default StringColumn

uniqueTokens(String separator)

Returns a column of arbitrary size containing each unique token in this column, where a token is defined using the given separator, and uniqueness is calculated across the entire column

default StringColumn

upperCase()

Methods inherited from interface tech.tablesaw.columns.Column
allMatch, anyMatch, append, append, append, appendCell, appendCell, appendMissing, appendObj, asBytes, asList, asObjectArray, asSet, asStringColumn, byteSize, clear, columnWidth, contains, copy, count, count, countMissing, countUnique, emptyCopy, emptyCopy, equals, filter, first, get, getString, getUnformattedString, indexOf, inRange, interpolate, isEmpty, isMissing, isMissing, isNotMissing, lag, last, lead, map, map, mapInto, max, max, min, min, name, noneMatch, parser, print, reduce, reduce, removeMissing, rolling, rowComparator, sampleN, sampleX, set, set, set, set, set, set, setMissing, setMissingTo, setName, setParser, size, sortAscending, sortDescending, sorted, subset, summary, title, type, unique, valueHash, where

Methods inherited from interface java.util.Comparator
compare, equals, reversed, thenComparing, thenComparing, thenComparing, thenComparingDouble, thenComparingInt, thenComparingLong

Methods inherited from interface java.lang.Iterable
forEach, iterator, spliterator

Method Details
- upperCase
  
  default StringColumn upperCase()
- lowerCase
  
  default StringColumn lowerCase()
- capitalize
  
  default StringColumn capitalize()
  Capitalizes each String, changing the first character of each to title case as per Character.toTitleCase(int), as if in a sentence. No other characters are changed.
  capitalize(null) = null
  capitalize("") = ""
  capitalize("cat") = "Cat"
  capitalize("cAt") = "CAt"
  capitalize("'cat'") = "'cat'"
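  
  The rule above can be sketched in plain Java, one string at a time. This is only an illustration of the documented behavior, not the library's implementation; the class name `CapitalizeSketch` and the use of a `List<String>` in place of a column are assumptions made for the example.

  ```java
  import java.util.Arrays;
  import java.util.List;
  import java.util.stream.Collectors;

  // Hypothetical helper: illustrates the documented rule on plain strings.
  final class CapitalizeSketch {

    // Title-cases the first code point and leaves every other character untouched.
    static String capitalize(String s) {
      if (s == null || s.isEmpty()) {
        return s;
      }
      int first = s.codePointAt(0);
      int title = Character.toTitleCase(first);
      if (first == title) {
        return s; // already title case, or not a letter (for example a quote)
      }
      return new StringBuilder(s.length())
          .appendCodePoint(title)
          .append(s, Character.charCount(first), s.length())
          .toString();
    }

    public static void main(String[] args) {
      // A List<String> stands in for the column's values.
      List<String> column = Arrays.asList("cat", "cAt", "'cat'");
      List<String> result =
          column.stream().map(CapitalizeSketch::capitalize).collect(Collectors.toList());
      System.out.println(result);
    }
  }
  ```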
- repeat
  
  default StringColumn repeat(int times)
  
  Repeats each of the column's values elementwise, concatenating the results into a new StringColumn.
  repeat("", 2) = ""
  repeat("cat", 3) = "catcatcat"
  
  Parameters:
  
  times - the number of repetitions desired
  
  Returns:
  
  the new StringColumn
- trim
  
  default StringColumn trim()
- replaceAll
  
  default StringColumn replaceAll(String regex, String replacement)
- replaceFirst
  
  default StringColumn replaceFirst(String regex, String replacement)
- substring
  
  default StringColumn substring(int start, int end)
- substring
  
  default StringColumn substring(int start)
  
  Returns a column containing the substrings from start to the end of the input
  
  Throws:
  
  StringIndexOutOfBoundsException - if any string in the column is shorter than start
- abbreviate
  
  default StringColumn abbreviate(int maxWidth)
  
  Abbreviates a String using ellipses. This will turn "Now is the time for all good men" into "Now is the time for..."
  
  Parameters:
  
  maxWidth - the maximum width of the resulting strings, including the ellipsis.
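  
  A minimal sketch of the abbreviation rule, assuming the ellipsis is three ASCII dots and that widths below 4 are rejected (both assumptions are modeled on the familiar Commons-style behavior and are not stated on this page). `AbbreviateSketch` is a hypothetical name.

  ```java
  // Hypothetical helper: per-string version of the abbreviation described above.
  final class AbbreviateSketch {

    static String abbreviate(String s, int maxWidth) {
      if (maxWidth < 4) {
        // Assumption: one character plus "..." is the smallest useful result.
        throw new IllegalArgumentException("Minimum abbreviation width is 4");
      }
      if (s == null || s.length() <= maxWidth) {
        return s; // short enough, nothing to cut
      }
      // Reserve three characters of the budget for the ellipsis.
      return s.substring(0, maxWidth - 3) + "...";
    }

    public static void main(String[] args) {
      System.out.println(abbreviate("Now is the time for all good men", 22));
    }
  }
  ```

  With a maxWidth of 22, the sample sentence is cut to 19 characters plus the ellipsis, which matches the example in the description.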
- format
  
  default StringColumn format(String formatString)
- parseInt
  
  default IntColumn parseInt()
  
  Returns an IntColumn containing all the values of this string column as integers, assuming all the values are stringified ints in the first place. Otherwise an exception is thrown.
  
  Returns:
  
  An IntColumn containing ints parsed from the strings in this column
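  
  The all-or-nothing behavior can be illustrated with plain Java parsing. The sketch below assumes the exception is the `NumberFormatException` raised by `Integer.parseInt`; this page does not name the exception type, and handling of missing values is left out. `ParseIntSketch` is a hypothetical name.

  ```java
  import java.util.Arrays;
  import java.util.List;

  // Hypothetical helper: parses every value or fails on the first bad one.
  final class ParseIntSketch {

    static int[] parseAll(List<String> values) {
      int[] out = new int[values.size()];
      for (int i = 0; i < out.length; i++) {
        // Integer.parseInt throws NumberFormatException for non-integer text.
        out[i] = Integer.parseInt(values.get(i));
      }
      return out;
    }

    public static void main(String[] args) {
      System.out.println(Arrays.toString(parseAll(Arrays.asList("1", "-20", "300"))));
      try {
        parseAll(Arrays.asList("7", "4.5"));
      } catch (NumberFormatException e) {
        System.out.println("not an int: " + e.getMessage());
      }
    }
  }
  ```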
- parseDouble
  
  default DoubleColumn parseDouble()
  
  Returns a DoubleColumn containing all the values of this string column as doubles, assuming all the values are stringified doubles in the first place. Otherwise an exception is thrown.
  
  Returns:
  
  A DoubleColumn containing doubles parsed from the strings in this column
- parseFloat
  
  default FloatColumn parseFloat()
  
  Returns a FloatColumn containing all the values of this string column as floats, assuming all the values are stringified floats in the first place. Otherwise an exception is thrown.
  
  Returns:
  
  A FloatColumn containing floats parsed from the strings in this column
- padEnd
  
  default StringColumn padEnd(int minLength, char padChar)
- padStart
  
  default StringColumn padStart(int minLength, char padChar)
- commonPrefix
  
  default StringColumn commonPrefix(Column<String> column2)
- commonSuffix
  
  default StringColumn commonSuffix(Column<String> column2)
- distance
  
  default DoubleColumn distance(Column<String> column2)
  
  Returns a column containing the Levenshtein distance between the two given string columns.
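  
  For readers unfamiliar with the metric, here is a textbook two-row dynamic-programming sketch of Levenshtein distance for a single pair of strings. It is not the library's code; `LevenshteinSketch` is a hypothetical name, and a column-level version would apply it row by row to the two columns.

  ```java
  // Hypothetical helper: classic edit distance (insert, delete, substitute each cost 1).
  final class LevenshteinSketch {

    static int distance(String a, String b) {
      int[] prev = new int[b.length() + 1];
      int[] curr = new int[b.length() + 1];
      for (int j = 0; j <= b.length(); j++) {
        prev[j] = j; // turning "" into the first j chars of b takes j insertions
      }
      for (int i = 1; i <= a.length(); i++) {
        curr[0] = i; // turning the first i chars of a into "" takes i deletions
        for (int j = 1; j <= b.length(); j++) {
          int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
          curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
        }
        int[] tmp = prev;
        prev = curr;
        curr = tmp;
      }
      return prev[b.length()];
    }

    public static void main(String[] args) {
      System.out.println(distance("kitten", "sitting"));
    }
  }
  ```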
- join
  
  default StringColumn join(String separator, Column<?>... columns)
  
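  Returns a copy of this column in which each element is joined, using the given separator, with the corresponding values of the given columns.
  
  Parameters:
  
  separator - the string placed between the joined values
  
  columns - the columns whose values are appended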
  
  Returns:
  
  the new column
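  
  Reading the signature as a row-wise join, the operation can be sketched with lists standing in for columns. The row-wise interpretation, the class name `JoinSketch`, and the extra `base` parameter (which plays the role of this column) are assumptions of the example.

  ```java
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;

  // Hypothetical helper: joins row r of the base list with row r of each other list.
  final class JoinSketch {

    static List<String> join(String separator, List<String> base, List<?>... others) {
      List<String> out = new ArrayList<>(base.size());
      for (int r = 0; r < base.size(); r++) {
        StringBuilder sb = new StringBuilder(base.get(r));
        for (List<?> other : others) {
          sb.append(separator).append(other.get(r)); // non-string values are stringified
        }
        out.add(sb.toString());
      }
      return out;
    }

    public static void main(String[] args) {
      List<String> first = Arrays.asList("a", "b");
      System.out.println(join("-", first, Arrays.asList(1, 2), Arrays.asList("x", "y")));
    }
  }
  ```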
- concatenate
  
  default StringColumn concatenate(Object... stringsToAppend)
  
  Returns a copy of this column with the given strings appended to each element.
  
  Parameters:
  
  stringsToAppend - the stringified objects to append
  
  Returns:
  
  the new column
- concatenate
  
  default StringColumn concatenate(Column<?>... stringColumns)
  
  Returns a copy of this column with the corresponding value of each column argument appended to each element. getString is used to ensure that the values taken from the arguments are strings.
  
  Parameters:
  
  stringColumns - the string columns to append
  
  Returns:
  
  the new column
- replaceAll
  
  default StringColumn replaceAll(String[] regexArray, String replacement)
  
  Creates a new column, replacing each string in this column with a new string formed by replacing any substring that matches one of the given regexes.
  
  Parameters:
  
  regexArray - the array of regexes to match
  
  replacement - the replacement string
  
  Returns:
  
  the new column
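  
  A per-string sketch of the multi-pattern replacement, assuming the patterns are applied in array order and each one operates on the result of the previous one (the page does not spell out the order). `ReplaceAllSketch` is a hypothetical name.

  ```java
  // Hypothetical helper: applies every regex in turn with the same replacement.
  final class ReplaceAllSketch {

    static String replaceAll(String value, String[] regexArray, String replacement) {
      String result = value;
      for (String regex : regexArray) {
        // Assumption: later patterns see the output of earlier ones.
        result = result.replaceAll(regex, replacement);
      }
      return result;
    }

    public static void main(String[] args) {
      String[] patterns = {"\\d+", "c"};
      System.out.println(replaceAll("a1b22c", patterns, "_"));
    }
  }
  ```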
- tokenizeAndSort
  
  default StringColumn tokenizeAndSort(String separator)
- countTokens
  
  default DoubleColumn countTokens(String separator)
- uniqueTokens
  
  default StringColumn uniqueTokens(String separator)
  
  Returns a column of arbitrary size containing each unique token in this column, where a token is defined using the given separator, and uniqueness is calculated across the entire column.
  NOTE: Unlike other map functions, this method produces a column whose size may differ from that of the source, so the two columns cannot safely be combined in a table.
  
  Parameters:
  
  separator - the delimiter used in the tokenizing operation
  
  Returns:
  
  a new column
- tokens
  
  default StringColumn tokens(String separator)
  
  Returns a column of arbitrary size containing each token in this column, where a token is defined using the given separator.
  NOTE: Unlike other map functions, this method produces a column whose size may differ from that of the source, so the two columns cannot safely be combined in a table.
  
  Parameters:
  
  separator - the delimiter used in the tokenizing operation
  
  Returns:
  
  a new column
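  
  The size caveat for tokens and uniqueTokens is easiest to see on a small input. The sketch below uses lists in place of columns; trimming tokens, dropping empty tokens, and keeping first-seen order for unique tokens are assumptions of the example, as is the name `TokensSketch`.

  ```java
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.LinkedHashSet;
  import java.util.List;
  import java.util.regex.Pattern;

  // Hypothetical helper: flattens all tokens of all rows into one list.
  final class TokensSketch {

    static List<String> tokens(List<String> column, String separator) {
      List<String> out = new ArrayList<>();
      for (String value : column) {
        // Pattern.quote treats the separator literally rather than as a regex.
        for (String token : value.split(Pattern.quote(separator))) {
          String trimmed = token.trim();
          if (!trimmed.isEmpty()) {
            out.add(trimmed);
          }
        }
      }
      return out;
    }

    // Uniqueness is computed across the whole column, not per row.
    static List<String> uniqueTokens(List<String> column, String separator) {
      return new ArrayList<>(new LinkedHashSet<>(tokens(column, separator)));
    }

    public static void main(String[] args) {
      List<String> column = Arrays.asList("a,b", "b, c,d");
      System.out.println(column.size() + " rows in");
      System.out.println(tokens(column, ",").size() + " tokens out");
      System.out.println(uniqueTokens(column, ",").size() + " unique tokens out");
    }
  }
  ```

  Two input rows yield five tokens and four unique tokens here, which is why the result cannot be placed alongside the source column in the same table.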
- length
  
  default DoubleColumn length()
  
  Returns a column containing the character length of each string in this column. The returned column is the same size as the original.
- tokenizeAndSort
  
  default StringColumn tokenizeAndSort()
  
  Splits on whitespace and returns the lexicographically sorted result.
  
  Returns:
  
  a StringColumn
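  
  A per-string sketch of the split-and-sort step. It assumes the sorted tokens are rejoined with a single space, which this page does not state; `TokenizeAndSortSketch` is a hypothetical name.

  ```java
  import java.util.ArrayList;
  import java.util.Collections;
  import java.util.List;

  // Hypothetical helper: splits on runs of whitespace, sorts, and rejoins.
  final class TokenizeAndSortSketch {

    static String tokenizeAndSort(String value) {
      List<String> tokens = new ArrayList<>();
      for (String token : value.split("\\s+")) {
        if (!token.isEmpty()) { // leading whitespace yields one empty token
          tokens.add(token);
        }
      }
      Collections.sort(tokens); // natural String order, so uppercase sorts before lowercase
      return String.join(" ", tokens); // assumption: single-space rejoin
    }

    public static void main(String[] args) {
      System.out.println(tokenizeAndSort("the quick  brown fox"));
    }
  }
  ```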
- tokenizeAndRemoveDuplicates
  
  default StringColumn tokenizeAndRemoveDuplicates(String separator)
