Package convex.core.data.prim
Class CVMChar
java.lang.Object
convex.core.data.AObject
convex.core.data.ACell
convex.core.data.prim.APrimitive
convex.core.data.prim.CVMChar
- All Implemented Interfaces:
IValidated, IWriteable, Comparable<CVMChar>
Class for CVM Character values.
Characters are Unicode code points, and can be used to construct Strings on the CVM.
Code points are limited to the range 0 .. 0x10ffff, as per the Unicode standard.
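The range rule above can be illustrated with a minimal standalone sketch. It does not use the Convex classes: `CodePointRangeSketch`, `isInRange` and `codePointOrNull` are hypothetical names, and the value of `MAX_CODEPOINT` is assumed to be 0x10ffff because that is the upper bound given in the class description.

```java
// Standalone sketch, not Convex code: illustrates the 0 .. 0x10ffff range rule.
final class CodePointRangeSketch {
    // Assumed upper bound, taken from the range quoted in the class description.
    static final int MAX_CODEPOINT = 0x10FFFF;

    // True if the value lies within the Unicode code space.
    static boolean isInRange(long value) {
        return value >= 0 && value <= MAX_CODEPOINT;
    }

    // Hypothetical "value or null" helper: returns the code point boxed as an
    // Integer, or null when the value is out of range.
    static Integer codePointOrNull(long value) {
        return isInRange(value) ? Integer.valueOf((int) value) : null;
    }

    public static void main(String[] args) {
        // Code points can be appended to a Java string builder directly.
        StringBuilder sb = new StringBuilder();
        sb.appendCodePoint(codePointOrNull(0x48)).appendCodePoint(codePointOrNull(0x69));
        System.out.println(sb);
        System.out.println(codePointOrNull(0x110000));
    }
}
```

Returning null for an out-of-range value mirrors the "or null if not valid" wording used for create in the method summary, but the real method may apply further checks that this sketch does not model.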
-
Field Summary
Fields (Modifier and Type, Field, Description):
static int MAX_CODEPOINT
static final int MAX_UTF_BYTES: Maximum number of UTF-8 bytes required to represent a CVMChar
static CVMChar MAX_VALUE
static final CVMChar ZERO: Singleton instance representing the NULL character (code point zero)
Fields inherited from class convex.core.data.ACell
cachedRef, memorySize
-
Method Summary
Methods (Modifier and Type, Method, Description):
static int byteCountFromTag(byte tag): Gets the number of UTF-8 bytes as encoded within the encoding tag.
char charValue(): Gets the Java char value of this CVM Character.
static int codepointFromUTFInt(int utf): Gets a code point value from bytes encoded in a Java integer (starting from the high byte).
int compareTo
static CVMChar create(long value): Gets a CVMChar for the given Unicode code point, or null if not valid.
double doubleValue()
int encode(byte[] bs, int pos): Writes this Cell's encoding to a byte array, including a tag byte which will be written first.
int encodeRaw(byte[] bs, int pos): Writes this Cell's encoding to a byte array, excluding the tag byte.
boolean equals: Checks for equality with another Cell.
boolean equals
int estimatedEncodingSize(): Estimate the encoded data size for this Cell.
static CVMChar fromUTF8: Gets a CVMChar from a UTF-8 representation.
int getCodePoint(): Gets the Unicode code point for this Character.
byte getTag(): Gets the tag byte for this cell.
getType(): Gets the most specific known runtime Type for this Cell.
int hashCode(): Gets the Java hashCode for this cell.
long longValue(): Gets the Long value of this char, equal to the Unicode code point.
static CVMChar parse: Parses a Character from a Java String, as interpreted by the Reader.
boolean print(BlobBuilder bb, long limit): Prints this Object to a readable String Representation.
static CVMChar read: Reads char data from Blob.
toCVMString(long limit): Returns the CVM String representation of this Cell.
String toString(): Returns the Java String representation of this CVMChar.
toUTFBlob: Gets the Blob representation of this Character in UTF-8.
byte[] toUTFBytes(): Converts this Character to its UTF-8 byte representation.
static int utfLength(long c): Gets the UTF-8 length in bytes for the given code point.
void validateCell: Validates the local structure and invariants of this cell.
Methods inherited from class convex.core.data.prim.APrimitive
calcMemorySize, getRef, getRefCount, isCanonical, isCVMValue, isDataValue, isEmbedded, toCanonical, updateRefs
Methods inherited from class convex.core.data.ACell
attachMemorySize, attachRef, cachedEncoding, cachedHash, createEncoding, createRef, equals, genericEquals, getCanonical, getChildRefs, getEncoding, getEncodingLength, getHash, getMemorySize, getMemorySize, getRef, validate
Methods inherited from class convex.core.data.AObject
attachEncoding, print, print
-
Field Details
-
MAX_CODEPOINT
public static int MAX_CODEPOINT
-
MAX_VALUE
-
ZERO
Singleton instance representing the NULL character (code point zero)
-
MAX_UTF_BYTES
public static final int MAX_UTF_BYTES
Maximum number of UTF-8 bytes required to represent a CVMChar
-
Method Details
-
getType
-
create
-
fromUTF8
-
longValue
public long longValue()
Gets the Long value of this char, equal to the Unicode code point.
- Specified by: longValue in class APrimitive
- Returns: Java long value representing this primitive CVM value
-
estimatedEncodingSize
public int estimatedEncodingSize()
Description copied from interface: IWriteable
Estimate the encoded data size for this Cell. Used for quickly sizing buffers. Implementations should try to return a size that is highly likely to contain the entire object when encoded, including the tag byte. Should not traverse soft Refs, i.e. must be usable on arbitrary partial data structures.
- Specified by: estimatedEncodingSize in interface IWriteable
- Returns: The estimated size for the binary representation of this object.
-
byteCountFromTag
public static int byteCountFromTag(byte tag)
Gets the number of UTF-8 bytes as encoded within the encoding tag.
- Parameters:
tag - Tag byte
- Returns: Number of bytes, in the range 1-4
-
validateCell
Description copied from class: ACell
Validates the local structure and invariants of this cell. Called by validate() super implementation. Should validate directly contained data, but should not validate all other structure of this cell. In particular, should not traverse potentially missing child Refs.
- Specified by: validateCell in class ACell
- Throws:
InvalidDataException - If the Cell is invalid
-
codepointFromUTFInt
public static int codepointFromUTFInt(int utf)
Gets a code point value from bytes encoded in a Java integer (starting from the high byte).
- Parameters:
utf - UTF-8 encoded value in an integer, first byte in high byte.
- Returns: Unicode code point, or -1 if not valid UTF-8
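As a rough illustration of decoding UTF-8 bytes packed into an int with the first byte in the high byte, here is a standalone sketch. It is not the Convex implementation. It assumes the bytes are left-aligned and that any unused low bytes are ignored, and it applies strict UTF-8 rules (rejecting overlong forms and encoded surrogates), which the documentation above does not spell out. The class name `UtfIntDecodeSketch` is hypothetical.

```java
// Standalone sketch, not Convex code: decodes UTF-8 bytes packed into an int,
// first byte in the high byte. Unused low bytes are ignored (an assumption).
final class UtfIntDecodeSketch {
    static int codepointFromUTFInt(int utf) {
        int b0 = (utf >>> 24) & 0xFF;
        if (b0 < 0x80) return b0; // single byte: 0xxxxxxx

        int len;
        int cp;
        if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }      // 110xxxxx
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; } // 1110xxxx
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; } // 11110xxx
        else return -1; // stray continuation byte or invalid lead byte

        for (int i = 1; i < len; i++) {
            int b = (utf >>> (24 - 8 * i)) & 0xFF;
            if ((b & 0xC0) != 0x80) return -1; // continuation must be 10xxxxxx
            cp = (cp << 6) | (b & 0x3F);
        }

        // Strict UTF-8 checks: no overlong forms, no surrogates, no values past 0x10ffff.
        int min = (len == 2) ? 0x80 : (len == 3) ? 0x800 : 0x10000;
        if (cp < min || cp > 0x10FFFF) return -1;
        if (cp >= 0xD800 && cp <= 0xDFFF) return -1;
        return cp;
    }

    public static void main(String[] args) {
        // Bytes E2 82 AC are the UTF-8 encoding of the euro sign.
        System.out.println(Integer.toHexString(codepointFromUTFInt(0xE282AC00)));
    }
}
```

Returning -1 rather than throwing keeps the check cheap for callers that only need to know whether the packed bytes are usable.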
-
utfLength
public static int utfLength(long c)
Gets the UTF-8 length in bytes for the given code point.
- Parameters:
c - Code point value
- Returns: UTF-8 length, or -1 if not a valid Unicode value
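The UTF-8 length of a code point follows from the standard encoding ranges. The sketch below is a standalone illustration, not the Convex implementation. It only checks the 0 .. 0x10ffff range, so surrogate code points are counted as three bytes here, which may differ from the library's behaviour. The class name `Utf8LengthSketch` is hypothetical.

```java
// Standalone sketch, not Convex code: UTF-8 byte length for a code point.
final class Utf8LengthSketch {
    static int utfLength(long c) {
        if (c < 0) return -1;         // negative values are never code points
        if (c <= 0x7F) return 1;      // ASCII
        if (c <= 0x7FF) return 2;     // two-byte range
        if (c <= 0xFFFF) return 3;    // rest of the Basic Multilingual Plane
        if (c <= 0x10FFFF) return 4;  // supplementary planes
        return -1;                    // beyond the Unicode code space
    }

    public static void main(String[] args) {
        System.out.println(utfLength('A'));
        System.out.println(utfLength(0x1F600));
    }
}
```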
-
read
Reads char data from Blob.
- Parameters:
len - Length in UTF-8 bytes
blob - Blob to read from
pos - Position of tag
- Returns: CVMChar instance
- Throws:
BadFormatException - if any format error
-
encode
public int encode(byte[] bs, int pos)
Description copied from class: ACell
Writes this Cell's encoding to a byte array, including a tag byte which will be written first. Cell must be canonical, or else an error may occur.
- Specified by: encode in interface IWriteable
- Specified by: encode in class ACell
- Parameters:
bs - A byte array to which to write the encoding
pos - The offset into the byte array
- Returns: New position after writing
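The "write at an offset, return the new position" contract shared by encode and encodeRaw can be sketched as follows. This is a standalone illustration only: the tag base value, the idea that the low two bits of the tag hold the byte count minus one, and the use of UTF-8 bytes as the payload are all assumptions made for this example and are not confirmed by the documentation above. `TaggedCharEncodingSketch` and `HYPOTHETICAL_TAG_BASE` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch, not Convex code. The tag value and layout are hypothetical.
final class TaggedCharEncodingSketch {
    // Hypothetical tag base: the low two bits are assumed to hold (byte count - 1).
    static final int HYPOTHETICAL_TAG_BASE = 0x3C;

    // Recovers the byte count (1-4) from the assumed tag layout.
    static int byteCountFromTag(byte tag) {
        return (tag & 0x03) + 1;
    }

    // Writes only the payload (assumed here to be UTF-8 bytes); returns the new position.
    static int encodeRaw(int codePoint, byte[] bs, int pos) {
        byte[] utf = new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
        System.arraycopy(utf, 0, bs, pos, utf.length);
        return pos + utf.length;
    }

    // Writes the tag byte at pos, then the payload; returns the new position.
    static int encode(int codePoint, byte[] bs, int pos) {
        int end = encodeRaw(codePoint, bs, pos + 1);
        int byteCount = end - (pos + 1);
        bs[pos] = (byte) (HYPOTHETICAL_TAG_BASE | (byteCount - 1));
        return end;
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[8];
        int end = encode(0x20AC, buffer, 0);
        System.out.println(end + " bytes written, payload length " + byteCountFromTag(buffer[0]));
    }
}
```

Returning the new position lets a caller chain several writes into one buffer without tracking lengths separately.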
-
encodeRaw
public int encodeRaw(byte[] bs, int pos)
Description copied from class: ACell
Writes this Cell's encoding to a byte array, excluding the tag byte.
-
print
Description copied from class: AObject
Prints this Object to a readable String Representation. SECURITY: Must halt and return false in O(1) time when the limit of printing is exceeded, otherwise DoS attacks may be possible.
-
toString
Returns the Java String representation of this CVMChar. Returns a bad character representation if the code point of this Character is invalid. Different from print(), which returns a readable representation. For instance, on CVMChar \a, this method returns "a" while print() returns "\a".
-
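The difference between the toString() form and the printed form of a character, as described for toString, can be sketched with plain Java strings. This is a standalone illustration and not the Convex implementation: `CharTextSketch`, `plainString` and `readableString` are hypothetical names, and the readable form here simply prefixes a backslash, ignoring any special character names the Reader may support.

```java
// Standalone sketch, not Convex code: plain text versus readable (printed) text.
final class CharTextSketch {
    // The character itself as a Java String, for example "a".
    static String plainString(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // A readable form with a leading backslash, for example "\a".
    static String readableString(int codePoint) {
        return "\\" + plainString(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(plainString('a'));
        System.out.println(readableString('a'));
    }
}
```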
doubleValue
public double doubleValue()
- Specified by: doubleValue in class APrimitive
- Returns: Java double value representing this primitive CVM value
-
parse
-
getTag
public byte getTag()
Description copied from class: ACell
Gets the tag byte for this cell. The tag byte is always equal to the first byte of the Cell's canonical Encoding, and is sufficient to distinguish how to read the rest of the encoding.
-
charValue
public char charValue()
Gets the Java char value of this CVM Character. Not all Unicode code points fit in a JVM char; a "bad character" value is used as a replacement when this is not possible.
- Returns: Java char, or a special bad character if not valid.
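The fallback described for charValue can be sketched with the JDK's own helpers. This is a standalone illustration, not the Convex implementation: the choice of U+FFFD (the Unicode replacement character) as the "bad character" is an assumption, and `CharValueSketch` and `BAD_CHARACTER` are hypothetical names.

```java
// Standalone sketch, not Convex code: narrowing a code point to a Java char.
final class CharValueSketch {
    // Assumed replacement value; the documentation does not say which one is used.
    static final char BAD_CHARACTER = '\uFFFD';

    static char charValue(int codePoint) {
        // Only code points in the Basic Multilingual Plane fit in a 16-bit char.
        return Character.isBmpCodePoint(codePoint) ? (char) codePoint : BAD_CHARACTER;
    }

    public static void main(String[] args) {
        System.out.println(charValue('a'));
        System.out.println((int) charValue(0x1F600));
    }
}
```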
-
toUTFBytes
public byte[] toUTFBytes()
Converts this Character to its UTF-8 byte representation.
- Returns: byte[] array containing UTF-8 bytes
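For comparison, the UTF-8 bytes of a code point can be obtained with standard JDK calls. The sketch below is standalone and is not the Convex implementation; `Utf8BytesSketch` is a hypothetical name. Note that `Character.toChars` throws an IllegalArgumentException for values outside the Unicode range, and that the JDK encoder substitutes a replacement byte for lone surrogate code points.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch, not Convex code: UTF-8 bytes for a code point via the JDK.
final class Utf8BytesSketch {
    static byte[] toUTFBytes(int codePoint) {
        // Character.toChars yields one char (BMP) or a surrogate pair (supplementary).
        return new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        for (byte b : toUTFBytes(0x20AC)) {
            System.out.printf("%02x ", b & 0xFF);
        }
        System.out.println();
    }
}
```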
-
toUTFBlob
Gets the Blob representation of this Character in UTF-8.
- Returns: Blob of 1-4 bytes containing the UTF-8 representation of this Character
-
toCVMString
Description copied from class: ACell
Returns the CVM String representation of this Cell. Normally, this is as printed, but may be different for some types. SHOULD return null in O(1) time if the length of the CVM String can be proved to exceed the limit. MUST complete in O(limit) time and space otherwise. The String representation is intended to be an easy-to-read textual representation of the Cell's data content.
- Overrides: toCVMString in class ACell
- Parameters:
limit - Limit of CVM String length in UTF-8 bytes
- Returns: CVM String, or null if limit exceeded
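The "null if the limit is exceeded" contract can be sketched with plain Java strings. This is a standalone illustration, not the Convex implementation: it assumes the string form of a character is a backslash followed by the character, and it measures the limit in UTF-8 bytes as stated for the limit parameter. `LimitedStringSketch` and `toLimitedString` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch, not Convex code: a length-limited string representation.
final class LimitedStringSketch {
    // Returns the readable form, or null when its UTF-8 length exceeds the limit.
    static String toLimitedString(int codePoint, long limit) {
        String s = "\\" + new String(Character.toChars(codePoint));
        int utf8Length = s.getBytes(StandardCharsets.UTF_8).length;
        return (utf8Length <= limit) ? s : null;
    }

    public static void main(String[] args) {
        System.out.println(toLimitedString('a', 2));
        System.out.println(toLimitedString(0x1F600, 4));
    }
}
```

Because a single character prints to at most a handful of bytes, the check costs constant time, which fits the O(1) requirement quoted above.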
-
getCodePoint
public int getCodePoint()
Gets the Unicode code point for this Character.
- Returns: Code point as an int value
-
compareTo
- Specified by: compareTo in interface Comparable<CVMChar>
-
equals
Description copied from class: ACell
Checks for equality with another Cell. In general, Cells are considered equal if they have the same canonical representation, i.e. an identical encoding with the same hash value. Subclasses SHOULD override this if they have a more efficient equals implementation. MUST NOT require reads from Store.
-
equals
-
hashCode
public int hashCode()
Description copied from class: ACell
Gets the Java hashCode for this cell. Must be consistent with equals. Default is the hashCode of the Encoding blob, since this is consistent with encoding-based equality. However, different Types may provide more efficient hashcodes provided that the usual invariants are preserved.
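The invariant that hashCode must be consistent with equals can be shown on a small value class that wraps a code point. This is a standalone sketch, not the Convex implementation: `CodePointValue` is a hypothetical class, and hashing and ordering by the code point is just one way to satisfy the invariant.

```java
// Standalone sketch, not Convex code: equals, hashCode and compareTo kept consistent.
final class CodePointValue implements Comparable<CodePointValue> {
    private final int codePoint;

    CodePointValue(int codePoint) {
        this.codePoint = codePoint;
    }

    @Override
    public boolean equals(Object o) {
        // Equal exactly when both wrap the same code point.
        return (o instanceof CodePointValue) && ((CodePointValue) o).codePoint == codePoint;
    }

    @Override
    public int hashCode() {
        // Derived only from the field used in equals, so equal values hash alike.
        return Integer.hashCode(codePoint);
    }

    @Override
    public int compareTo(CodePointValue other) {
        // Orders by code point, so compareTo returns 0 exactly when equals is true.
        return Integer.compare(codePoint, other.codePoint);
    }

    public static void main(String[] args) {
        CodePointValue a = new CodePointValue('a');
        CodePointValue b = new CodePointValue('b');
        System.out.println(a.equals(new CodePointValue('a')) + " " + a.compareTo(b));
    }
}
```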
-