Class BinaryStringData
- java.lang.Object
-
- org.apache.flink.table.data.binary.LazyBinaryFormat<String>
-
- org.apache.flink.table.data.binary.BinaryStringData
-
- All Implemented Interfaces:
Comparable<StringData>, BinaryFormat, StringData
@Internal public final class BinaryStringData extends LazyBinaryFormat<String> implements StringData
A lazy binary implementation of StringData which is backed by MemorySegments and String.
Either MemorySegments or String must be provided when constructing BinaryStringData. The other representation will be materialized when needed.
It provides many useful methods for comparison, search, and so on.
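The "either representation, materialize the other on demand" idea can be illustrated with a small stand-alone sketch. The class below is hypothetical: LazyUtf8Text, hasBinary and hasJavaObject are names invented for this illustration, and a plain byte[] stands in for MemorySegments. It is not the actual LazyBinaryFormat logic.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of lazy dual representation; not Flink code. */
final class LazyUtf8Text {
    private byte[] utf8;       // binary form, null until provided or materialized
    private String javaObject; // object form, null until provided or materialized

    private LazyUtf8Text(byte[] utf8, String javaObject) {
        if (utf8 == null && javaObject == null) {
            throw new IllegalArgumentException("either bytes or a String must be provided");
        }
        this.utf8 = utf8;
        this.javaObject = javaObject;
    }

    static LazyUtf8Text fromString(String s) {
        return new LazyUtf8Text(null, s);
    }

    static LazyUtf8Text fromBytes(byte[] bytes) {
        return new LazyUtf8Text(bytes.clone(), null);
    }

    boolean hasBinary() {
        return utf8 != null;
    }

    boolean hasJavaObject() {
        return javaObject != null;
    }

    /** Encodes the String to UTF-8 the first time the binary form is requested. */
    byte[] toBytes() {
        if (utf8 == null) {
            utf8 = javaObject.getBytes(StandardCharsets.UTF_8);
        }
        return utf8.clone();
    }

    /** Decodes the UTF-8 bytes the first time the String form is requested. */
    @Override
    public String toString() {
        if (javaObject == null) {
            javaObject = new String(utf8, StandardCharsets.UTF_8);
        }
        return javaObject;
    }
}
```

In this sketch, an instance created from a String holds no bytes until toBytes() is first called, and an instance created from bytes holds no String until toString() is first called.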
-
-
Field Summary
Fields
Modifier and Type, Field, Description
- static BinaryStringData EMPTY_UTF8
Fields inherited from interface org.apache.flink.table.data.binary.BinaryFormat
HIGHEST_FIRST_BIT, HIGHEST_SECOND_TO_EIGHTH_BIT, MAX_FIX_PART_DATA_SIZE
-
-
Constructor Summary
Constructors
- BinaryStringData()
- BinaryStringData(String javaObject)
- BinaryStringData(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes)
- BinaryStringData(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes, String javaObject)
-
Method Summary
Modifier and Type, Method, Description
- static BinaryStringData blankString(int length): Creates a BinaryStringData instance that contains `length` spaces.
- byte byteAt(int index): Returns the byte value at the specified index.
- int compareTo(StringData o): Compares two strings lexicographically.
- boolean contains(BinaryStringData s): Returns true if and only if this BinaryStringData contains the specified sequence of byte values.
- BinaryStringData copy(): Creates a copy of this BinaryStringData.
- boolean endsWith(BinaryStringData suffix): Tests if this BinaryStringData ends with the specified suffix.
- void ensureMaterialized()
- boolean equals(Object o)
- static BinaryStringData fromAddress(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int numBytes): Creates a BinaryStringData instance from the given address (base and offset) and length.
- static BinaryStringData fromBytes(byte[] bytes): Creates a BinaryStringData instance from the given UTF-8 bytes.
- static BinaryStringData fromBytes(byte[] bytes, int offset, int numBytes): Creates a BinaryStringData instance from the given UTF-8 bytes with offset and number of bytes.
- static BinaryStringData fromString(String str): Creates a BinaryStringData instance from the given Java string.
- int getOffset(): Gets the start offset of this binary data in the MemorySegments.
- org.apache.flink.core.memory.MemorySegment[] getSegments(): Gets the underlying MemorySegments this binary format spans.
- int getSizeInBytes(): Gets the size in bytes of this binary data.
- int hashCode()
- int indexOf(BinaryStringData str, int fromIndex): Returns the index within this string of the first occurrence of the specified substring, starting at the specified index.
- protected BinarySection materialize(org.apache.flink.api.common.typeutils.TypeSerializer<String> serializer): Materialize java object to binary format.
- int numChars(): Returns the number of UTF-8 code points in the string.
- boolean startsWith(BinaryStringData prefix): Tests if this BinaryStringData starts with the specified prefix.
- BinaryStringData substring(int beginIndex, int endIndex): Returns a binary string that is a substring of this binary string.
- byte[] toBytes(): Converts this StringData object to a UTF-8 byte array.
- BinaryStringData toLowerCase(): Converts all of the characters in this BinaryStringData to lower case.
- String toString(): Converts this StringData object to a String.
- BinaryStringData toUpperCase(): Converts all of the characters in this BinaryStringData to upper case.
- BinaryStringData trim(): Returns a string whose value is this string, with any leading and trailing whitespace removed.
Methods inherited from class org.apache.flink.table.data.binary.LazyBinaryFormat
ensureMaterialized, getBinarySection, getJavaObject, setJavaObject
-
-
-
-
Field Detail
-
EMPTY_UTF8
public static final BinaryStringData EMPTY_UTF8
-
-
Constructor Detail
-
BinaryStringData
public BinaryStringData()
-
BinaryStringData
public BinaryStringData(String javaObject)
-
BinaryStringData
public BinaryStringData(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes)
-
BinaryStringData
public BinaryStringData(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int sizeInBytes, String javaObject)
-
-
Method Detail
-
fromAddress
public static BinaryStringData fromAddress(org.apache.flink.core.memory.MemorySegment[] segments, int offset, int numBytes)
Creates a BinaryStringData instance from the given address (base and offset) and length.
-
fromString
public static BinaryStringData fromString(String str)
Creates a BinaryStringData instance from the given Java string.
-
fromBytes
public static BinaryStringData fromBytes(byte[] bytes)
Creates a BinaryStringData instance from the given UTF-8 bytes.
-
fromBytes
public static BinaryStringData fromBytes(byte[] bytes, int offset, int numBytes)
Creates a BinaryStringData instance from the given UTF-8 bytes with offset and number of bytes.
-
blankString
public static BinaryStringData blankString(int length)
Creates a BinaryStringData instance that contains `length` spaces.
-
toBytes
public byte[] toBytes()
Description copied from interface: StringData
Converts this StringData object to a UTF-8 byte array.
Note: The returned byte array may be reused.
- Specified by:
toBytes in interface StringData
-
toString
public String toString()
Description copied from interface: StringData
Converts this StringData object to a String.
- Specified by:
toString in interface StringData
- Overrides:
toString in class Object
-
compareTo
public int compareTo(@Nonnull StringData o)
Compares two strings lexicographically. Since UTF-8 uses groups of six bits, it is sometimes useful to use octal notation, which uses 3-bit groups. With a calculator that can convert between hexadecimal and octal, it can be easier to create or interpret UTF-8 manually than with binary. This method therefore simply compares the binary representation.
- Specified by:
compareTo in interface Comparable<StringData>
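As a rough illustration of what comparing the binary representation can look like, the sketch below orders two UTF-8 byte arrays byte by byte, treating each byte as unsigned. Utf8ByteOrder is a hypothetical helper written for this note and operates on plain byte[] values rather than MemorySegments; it is not the method's actual implementation.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: lexicographic order over UTF-8 bytes; not Flink code. */
final class Utf8ByteOrder {
    /** Compares two byte arrays lexicographically, treating each byte as unsigned. */
    static int compare(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            // Java bytes are signed, so mask to 0..255 before subtracting.
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    /** Convenience overload that encodes both strings as UTF-8 first. */
    static int compare(String a, String b) {
        return compare(a.getBytes(StandardCharsets.UTF_8), b.getBytes(StandardCharsets.UTF_8));
    }
}
```

The unsigned mask matters for non-ASCII text: the lead byte of "é" is 0xC3, which is negative as a signed Java byte and would otherwise sort before every ASCII character. With unsigned comparison, byte order over valid UTF-8 matches Unicode code point order.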
-
numChars
public int numChars()
Returns the number of UTF-8 code points in the string.
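One common way to count code points directly on UTF-8 bytes is to count every byte that is not a continuation byte (bit pattern 10xxxxxx). The sketch below assumes well-formed UTF-8 input; Utf8CodePoints is a hypothetical helper for illustration and is not claimed to be how numChars() is implemented.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: counts code points in well-formed UTF-8; not Flink code. */
final class Utf8CodePoints {
    /** Counts bytes that start a character, skipping 10xxxxxx continuation bytes. */
    static int count(byte[] utf8) {
        int n = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) {
                n++;
            }
        }
        return n;
    }

    /** Convenience overload that encodes the string as UTF-8 first. */
    static int count(String s) {
        return count(s.getBytes(StandardCharsets.UTF_8));
    }
}
```

Note that a code point count can differ from both the byte length and String.length(): a character outside the Basic Multilingual Plane takes four UTF-8 bytes and two Java chars, but counts as one code point.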
-
byteAt
public byte byteAt(int index)
Returns the byte value at the specified index. An index ranges from 0 to binarySection.sizeInBytes - 1.
- Parameters:
index - the index of the byte value.
- Returns:
the byte value at the specified index of the UTF-8 bytes.
- Throws:
IndexOutOfBoundsException - if the index argument is negative or not less than the length of the UTF-8 bytes.
-
getSegments
public org.apache.flink.core.memory.MemorySegment[] getSegments()
Description copied from interface: BinaryFormat
Gets the underlying MemorySegments this binary format spans.
- Specified by:
getSegments in interface BinaryFormat
- Overrides:
getSegments in class LazyBinaryFormat<String>
-
getOffset
public int getOffset()
Description copied from interface: BinaryFormat
Gets the start offset of this binary data in the MemorySegments.
- Specified by:
getOffset in interface BinaryFormat
- Overrides:
getOffset in class LazyBinaryFormat<String>
-
getSizeInBytes
public int getSizeInBytes()
Description copied from interface: BinaryFormat
Gets the size in bytes of this binary data.
- Specified by:
getSizeInBytes in interface BinaryFormat
- Overrides:
getSizeInBytes in class LazyBinaryFormat<String>
-
ensureMaterialized
public void ensureMaterialized()
-
materialize
protected BinarySection materialize(org.apache.flink.api.common.typeutils.TypeSerializer<String> serializer)
Description copied from class: LazyBinaryFormat
Materialize java object to binary format. Inherited classes need to hold the information they need. (For example, RawValueData needs javaObjectSerializer).
- Specified by:
materialize in class LazyBinaryFormat<String>
-
copy
public BinaryStringData copy()
Creates a copy of this BinaryStringData.
-
substring
public BinaryStringData substring(int beginIndex, int endIndex)
Returns a binary string that is a substring of this binary string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1.
Examples:
fromString("hamburger").substring(4, 8) returns binary string "urge"
fromString("smiles").substring(1, 5) returns binary string "mile"
- Parameters:
beginIndex - the beginning index, inclusive.
endIndex - the ending index, exclusive.
- Returns:
the specified substring; returns EMPTY_UTF8 when an index is out of bounds instead of throwing StringIndexOutOfBoundsException.
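To make the character-indexed semantics concrete, here is a sketch of a substring over UTF-8 bytes that walks character boundaries and returns an empty result instead of throwing. Utf8Substring is a hypothetical helper, and its edge-case choices (empty result for a negative or too-large beginIndex or an empty range, endIndex clamped to the end) are assumptions of this sketch; the real method's handling of each out-of-range case may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: character-indexed substring over UTF-8 bytes; not Flink code. */
final class Utf8Substring {
    static byte[] substring(byte[] utf8, int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex <= beginIndex) {
            return new byte[0]; // assumption: invalid or empty range yields empty
        }
        int charIndex = 0;
        int startByte = -1;
        int endByte = utf8.length; // assumption: endIndex past the end is clamped
        for (int i = 0; i < utf8.length; i++) {
            if ((utf8[i] & 0xC0) == 0x80) {
                continue; // continuation byte, still inside the current character
            }
            if (charIndex == beginIndex) {
                startByte = i;
            }
            if (charIndex == endIndex) {
                endByte = i;
                break;
            }
            charIndex++;
        }
        if (startByte < 0) {
            return new byte[0]; // beginIndex is past the last character
        }
        return Arrays.copyOfRange(utf8, startByte, endByte);
    }

    /** Convenience overload working on Java strings via UTF-8. */
    static String substring(String s, int beginIndex, int endIndex) {
        byte[] out = substring(s.getBytes(StandardCharsets.UTF_8), beginIndex, endIndex);
        return new String(out, StandardCharsets.UTF_8);
    }
}
```

With this sketch, Utf8Substring.substring("hamburger", 4, 8) yields "urge" and Utf8Substring.substring("smiles", 1, 5) yields "mile", matching the examples in the description, while an out-of-range beginIndex yields an empty string.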
-
contains
public boolean contains(BinaryStringData s)
Returns true if and only if this BinaryStringData contains the specified sequence of byte values.
- Parameters:
s - the sequence to search for
- Returns:
true if this BinaryStringData contains s, false otherwise
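A containment test over byte values can be sketched as a naive scan for the needle bytes inside the haystack bytes. Utf8Contains is a hypothetical helper on plain byte[] values, shown only to illustrate searching on the encoded bytes; the actual method may use a different algorithm.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: byte-sequence containment check; not Flink code. */
final class Utf8Contains {
    /** Returns true if needle occurs as a contiguous byte run inside haystack. */
    static boolean contains(byte[] haystack, byte[] needle) {
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return true; // every needle byte matched at offset i
            }
        }
        return false;
    }

    /** Convenience overload that encodes both strings as UTF-8 first. */
    static boolean contains(String haystack, String needle) {
        return contains(
                haystack.getBytes(StandardCharsets.UTF_8),
                needle.getBytes(StandardCharsets.UTF_8));
    }
}
```

Searching on raw bytes is safe for well-formed UTF-8 because lead bytes and continuation bytes use disjoint bit patterns, so a complete encoded needle cannot match starting in the middle of a character.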
-
startsWith
public boolean startsWith(BinaryStringData prefix)
Tests if this BinaryStringData starts with the specified prefix.
- Parameters:
prefix - the prefix.
- Returns:
true if the bytes represented by the argument are a prefix of the bytes represented by this string; false otherwise. Note also that true will be returned if the argument is an empty BinaryStringData or is equal to this BinaryStringData object as determined by the equals(Object) method.
-
endsWith
public boolean endsWith(BinaryStringData suffix)
Tests if this BinaryStringData ends with the specified suffix.
- Parameters:
suffix - the suffix.
- Returns:
true if the bytes represented by the argument are a suffix of the bytes represented by this object; false otherwise. Note that the result will be true if the argument is the empty string or is equal to this BinaryStringData object as determined by the equals(Object) method.
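The prefix and suffix rules described for startsWith and endsWith, including the empty-argument and equal-argument cases, can be sketched with a single region comparison over bytes. Utf8Affix and regionMatches are hypothetical names for this illustration and work on plain byte[] values, not on MemorySegments.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: byte-wise prefix and suffix tests; not Flink code. */
final class Utf8Affix {
    /** Returns true if part equals the bytes of data starting at offset. */
    private static boolean regionMatches(byte[] data, int offset, byte[] part) {
        if (offset < 0 || part.length > data.length - offset) {
            return false; // part does not fit at this position
        }
        for (int i = 0; i < part.length; i++) {
            if (data[offset + i] != part[i]) {
                return false;
            }
        }
        return true; // also reached for an empty part
    }

    static boolean startsWith(byte[] data, byte[] prefix) {
        return regionMatches(data, 0, prefix);
    }

    static boolean endsWith(byte[] data, byte[] suffix) {
        return regionMatches(data, data.length - suffix.length, suffix);
    }

    static boolean startsWith(String data, String prefix) {
        return startsWith(
                data.getBytes(StandardCharsets.UTF_8), prefix.getBytes(StandardCharsets.UTF_8));
    }

    static boolean endsWith(String data, String suffix) {
        return endsWith(
                data.getBytes(StandardCharsets.UTF_8), suffix.getBytes(StandardCharsets.UTF_8));
    }
}
```

An empty argument matches trivially because the comparison loop never runs, and an argument equal to the whole string matches at offset 0 for both tests.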
-
trim
public BinaryStringData trim()
Returns a string whose value is this string, with any leading and trailing whitespace removed.
- Returns:
A string whose value is this string, with any leading and trailing whitespace removed, or this string if it has no leading or trailing whitespace.
-
indexOf
public int indexOf(BinaryStringData str, int fromIndex)
Returns the index within this string of the first occurrence of the specified substring, starting at the specified index.
- Parameters:
str - the substring to search for.
fromIndex - the index from which to start the search.
- Returns:
the index of the first occurrence of the specified substring, starting at the specified index, or -1 if there is no such occurrence.
-
toUpperCase
public BinaryStringData toUpperCase()
Converts all of the characters in this BinaryStringData to upper case.
- Returns:
the BinaryStringData, converted to uppercase.
-
toLowerCase
public BinaryStringData toLowerCase()
Converts all of the characters in this BinaryStringData to lower case.
- Returns:
the BinaryStringData, converted to lowercase.
-
-