public class Text extends Object
Modifier and Type | Class and Description |
---|---|
static class | Text.TextSerializer: JSON serializer for Text. |
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_LEN |
Constructor and Description |
---|
Text() |
Text(byte[] utf8): Construct from a byte array. |
Text(String string): Construct from a string. |
Text(Text utf8): Construct from another text. |
Modifier and Type | Method and Description |
---|---|
void | append(byte[] utf8, int start, int len): Append a range of bytes to the end of the given text. |
static int | bytesToCodePoint(ByteBuffer bytes): Returns the next code point at the current position in the buffer. |
int | charAt(int position): Returns the Unicode Scalar Value (32-bit integer value) for the character at position. |
void | clear(): Clear the string to empty. |
byte[] | copyBytes(): Get a copy of the bytes that is exactly the length of the data. |
static String | decode(byte[] utf8): Converts the provided byte array to a String using the UTF-8 encoding. |
static String | decode(byte[] utf8, int start, int length) |
static String | decode(byte[] utf8, int start, int length, boolean replace): Converts the provided byte array to a String using the UTF-8 encoding. |
static ByteBuffer | encode(String string): Converts the provided String to bytes using the UTF-8 encoding. |
static ByteBuffer | encode(String string, boolean replace): Converts the provided String to bytes using the UTF-8 encoding. |
boolean | equals(Object o) |
int | find(String what) |
int | find(String what, int start): Finds any occurrence of what in the backing buffer, starting at position start. |
byte[] | getBytes(): Returns the raw bytes; however, only data up to getLength() is valid. |
int | getLength(): Get the number of bytes in the byte array. |
int | hashCode(): Copied from Arrays.hashCode so we don't have to copy the byte array. |
void | readWithKnownLength(DataInput in, int len): Read a Text object whose length is already known. |
void | set(byte[] utf8): Set to a UTF-8 byte array. |
void | set(byte[] utf8, int start, int len): Set the Text to a range of bytes. |
void | set(String string): Set to contain the contents of a string. |
void | set(Text other): Copy a text. |
String | toString() |
static int | utf8Length(String string): For the given string, returns the number of UTF-8 bytes required to encode the string. |
static void | validateUTF8(byte[] utf8): Check if a byte array contains valid UTF-8. |
static void | validateUTF8(byte[] utf8, int start, int len): Check to see if a byte array is valid UTF-8. |
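As a rough illustration of what utf8Length reports, the sketch below counts UTF-8 bytes per code point using only the Java standard library. The class name Utf8LengthSketch is hypothetical and this is not the Text implementation; it also assumes the input string has no unpaired surrogates.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; a stdlib-only sketch, not the Text implementation. */
public class Utf8LengthSketch {

    /** Counts the UTF-8 bytes needed for s without building an encoded copy. */
    public static int utf8Length(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp < 0x80) {
                bytes += 1;          // ASCII
            } else if (cp < 0x800) {
                bytes += 2;          // e.g. U+00E9
            } else if (cp < 0x10000) {
                bytes += 3;          // rest of the Basic Multilingual Plane
            } else {
                bytes += 4;          // supplementary planes (surrogate pair in Java)
            }
            i += Character.charCount(cp);
        }
        return bytes;
    }

    public static void main(String[] args) {
        // 'a' (1 byte) + U+00E9 (2) + U+20AC (3) + U+1F600 (4)
        String s = "a\u00E9\u20AC\uD83D\uDE00";
        System.out.println(utf8Length(s));                               // 10
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);   // 10
        System.out.println(s.length());                                  // 5 (UTF-16 chars)
    }
}
```

The byte count differs from String.length(), which is why buffer sizes for encoded text should come from a byte-oriented count rather than the number of chars.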
public static final int DEFAULT_MAX_LEN

public Text()

public Text(String string)
Construct from a string.
Parameters:
string - initialize from that string

public Text(Text utf8)
Construct from another text.
Parameters:
utf8 - initialize from that Text

public Text(byte[] utf8)
Construct from a byte array.
Parameters:
utf8 - initialize from that byte array

public byte[] copyBytes()
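Get a copy of the bytes that is exactly the length of the data. See getBytes() for faster access to the underlying array.

public byte[] getBytes()
Returns the raw bytes; however, only data up to getLength() is valid. Please use copyBytes() if you need the returned array to be precisely the length of the data.

public int getLength()
Get the number of bytes in the byte array.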
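The relationship between getBytes(), getLength() and copyBytes() can be sketched with a small stand-in class. BackingBufferSketch below is hypothetical and uses only the standard library; it assumes, as the descriptions above suggest, that the backing array may be larger than the valid data because it is reused across calls.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a buffer-backed text value; not the real Text class. */
public class BackingBufferSketch {
    private byte[] bytes = new byte[0];
    private int length = 0;

    /** Replaces the contents, reusing the backing array when it is large enough. */
    public void set(byte[] utf8) {
        if (bytes.length < utf8.length) {
            bytes = new byte[utf8.length];
        }
        System.arraycopy(utf8, 0, bytes, 0, utf8.length);
        length = utf8.length;
    }

    /** Raw backing array; only the first getLength() bytes are meaningful. */
    public byte[] getBytes() {
        return bytes;
    }

    /** Number of valid bytes. */
    public int getLength() {
        return length;
    }

    /** Fresh array trimmed to exactly getLength() bytes. */
    public byte[] copyBytes() {
        return Arrays.copyOf(bytes, length);
    }

    public static void main(String[] args) {
        BackingBufferSketch t = new BackingBufferSketch();
        t.set("hello world".getBytes(StandardCharsets.UTF_8));
        t.set("hi".getBytes(StandardCharsets.UTF_8));

        // The backing array kept its earlier capacity, so it is longer than the data.
        System.out.println(t.getBytes().length);    // 11
        System.out.println(t.getLength());          // 2
        System.out.println(t.copyBytes().length);   // 2

        // Reads of the raw array should always be bounded by getLength().
        String s = new String(t.getBytes(), 0, t.getLength(), StandardCharsets.UTF_8);
        System.out.println(s);                      // hi
    }
}
```

The trade-off is the usual one: the raw array avoids a copy but needs an explicit length bound, while the trimmed copy is safe to pass around at the cost of an allocation.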

public int charAt(int position)
Returns the Unicode Scalar Value (32-bit integer value) for the character at position. Note that this method avoids using the converter or doing String instantiation.
Parameters:
position - the index of the char we want to retrieve

public int find(String what)
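Both charAt and find operate on the encoded buffer, so positions are byte offsets rather than String char indices. The sketch below shows the difference with plain byte arrays. The class BytePositionSketch and its methods are hypothetical, the -1 "not found" result is this sketch's own choice, and codePointAt assumes valid UTF-8 with the offset on a character boundary.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helpers; a stdlib-only sketch, not the Text implementation. */
public class BytePositionSketch {

    /** Byte offset of the first occurrence of needle at or after start, or -1 if absent. */
    public static int findBytes(byte[] haystack, byte[] needle, int start) {
        for (int i = Math.max(start, 0); i + needle.length <= haystack.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    /** Decodes the UTF-8 sequence that starts at byte offset pos (input assumed valid). */
    public static int codePointAt(byte[] utf8, int pos) {
        int b0 = utf8[pos] & 0xFF;
        if (b0 < 0x80) {
            return b0;                       // single-byte (ASCII)
        }
        int extra;
        int cp;
        if (b0 < 0xE0) {
            extra = 1; cp = b0 & 0x1F;       // two-byte sequence
        } else if (b0 < 0xF0) {
            extra = 2; cp = b0 & 0x0F;       // three-byte sequence
        } else {
            extra = 3; cp = b0 & 0x07;       // four-byte sequence
        }
        for (int i = 1; i <= extra; i++) {
            cp = (cp << 6) | (utf8[pos + i] & 0x3F);
        }
        return cp;
    }

    public static void main(String[] args) {
        String text = "na\u00EFve caf\u00E9";           // U+00EF and U+00E9 take 2 bytes each
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] what = "caf\u00E9".getBytes(StandardCharsets.UTF_8);

        System.out.println(text.indexOf("caf\u00E9"));                 // 6 (char index)
        System.out.println(findBytes(utf8, what, 0));                  // 7 (byte offset)
        System.out.println(Integer.toHexString(codePointAt(utf8, 2))); // ef
    }
}
```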

public int find(String what, int start)
Finds any occurrence of what in the backing buffer, starting at position start. The starting position is measured in bytes and the return value is in terms of byte position in the buffer. The backing buffer is not converted to a string for this operation.
Parameters:
what - the string to search for
start - where to start from

public void set(String string)
Set to contain the contents of a string.
Parameters:
string - the string to initialize from

public void set(byte[] utf8)
Set to a UTF-8 byte array.
Parameters:
utf8 - the byte array to initialize from

public void set(Text other)
Copy a text.
Parameters:
other - the text to initialize from

public void set(byte[] utf8, int start, int len)
Set the Text to a range of bytes.
Parameters:
utf8 - the data to copy from
start - the first position of the new string
len - the number of bytes of the new string

public void append(byte[] utf8, int start, int len)
Append a range of bytes to the end of the given text.
Parameters:
utf8 - the data to copy from
start - the first position to append from utf8
len - the number of bytes to append

public void clear()
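Clear the string to empty. Note: for performance reasons, this call does not clear the underlying byte array that is retrievable via getBytes(). In order to free the byte-array memory, call set(byte[]) with an empty byte array (for example, new byte[0]).

public void readWithKnownLength(DataInput in, int len) throws IOException
Read a Text object whose length is already known.
Parameters:
in - the input to initialize from
len - how many bytes to read from in
Throws:
IOException - if something bad happens

public int hashCode()
Copied from Arrays.hashCode so we don't have to copy the byte array.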

public static String decode(byte[] utf8) throws CharacterCodingException
Converts the provided byte array to a String using the UTF-8 encoding.
Parameters:
utf8 - bytes to decode
Throws:
CharacterCodingException - if this is not valid UTF-8

public static String decode(byte[] utf8, int start, int length) throws CharacterCodingException
Throws:
CharacterCodingException
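The decode and encode overloads that take a replace flag choose between substituting U+FFFD for malformed input and throwing. A minimal sketch of that choice with the standard java.nio.charset API follows. DecodeReplaceSketch and decodeOrNull are hypothetical names, and mapping the flag onto CodingErrorAction is an assumption about how such a flag could be realized, not a statement about how Text does it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; a stdlib-only sketch, not the Text implementation. */
public class DecodeReplaceSketch {

    /** Decodes a byte range as UTF-8, replacing or reporting malformed input. */
    public static String decode(byte[] utf8, int start, int length, boolean replace)
            throws CharacterCodingException {
        CodingErrorAction action =
                replace ? CodingErrorAction.REPLACE : CodingErrorAction.REPORT;
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        // With REPORT, malformed bytes raise MalformedInputException,
        // a subclass of CharacterCodingException.
        return decoder.decode(ByteBuffer.wrap(utf8, start, length)).toString();
    }

    /** Convenience wrapper for callers that prefer null over a checked exception. */
    public static String decodeOrNull(byte[] utf8, boolean replace) {
        try {
            return decode(utf8, 0, utf8.length, replace);
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // 'o', 'k', then 0xFF, which can never appear in valid UTF-8.
        byte[] bad = {0x6F, 0x6B, (byte) 0xFF};

        System.out.println("ok\uFFFD".equals(decodeOrNull(bad, true)));  // true
        System.out.println(decodeOrNull(bad, false) == null);            // true
    }
}
```

Replacement suits display or logging paths where lossy output is acceptable; strict mode suits storage or comparison paths where silently altered data would be worse than a failure.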

public static String decode(byte[] utf8, int start, int length, boolean replace) throws CharacterCodingException
Converts the provided byte array to a String using the UTF-8 encoding. If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Parameters:
utf8 - the bytes to decode
start - where to start from
length - length of the bytes to decode
replace - whether to replace malformed characters with U+FFFD
Throws:
CharacterCodingException - if the input could not be decoded

public static ByteBuffer encode(String string) throws CharacterCodingException
Converts the provided String to bytes using the UTF-8 encoding.
Parameters:
string - the string to encode
Throws:
CharacterCodingException - if the string could not be encoded

public static ByteBuffer encode(String string, boolean replace) throws CharacterCodingException
Converts the provided String to bytes using the UTF-8 encoding. If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Parameters:
string - the string to encode
replace - whether to replace malformed characters with U+FFFD
Throws:
CharacterCodingException - if the string could not be encoded

public static void validateUTF8(byte[] utf8) throws MalformedInputException
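Check if a byte array contains valid UTF-8.
Parameters:
utf8 - byte array
Throws:
MalformedInputException - if the byte array contains invalid UTF-8

public static void validateUTF8(byte[] utf8, int start, int len) throws MalformedInputException
Check to see if a byte array is valid UTF-8.
Parameters:
utf8 - the array of bytes
start - the offset of the first byte in the array
len - the length of the byte sequence
Throws:
MalformedInputException - if the byte array contains invalid bytes

public static int bytesToCodePoint(ByteBuffer bytes)
Returns the next code point at the current position in the buffer.
Parameters:
bytes - the incoming bytes

public static int utf8Length(String string)
For the given string, returns the number of UTF-8 bytes required to encode the string.
Parameters:
string - text to encode

Copyright © 2023 The Apache Software Foundation. All rights reserved.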