Class Internal


  • public final class Internal
    extends Object
    The classes contained within are used internally by the Protocol Buffer library and generated message implementations. They are public only because those generated messages do not reside in the protobuf package. Others should not use this class directly.
    Author:
    Kenton Varda
    • Field Detail

      • EMPTY_BYTE_ARRAY

        public static final byte[] EMPTY_BYTE_ARRAY
        An empty byte array constant used in generated code.
      • EMPTY_BYTE_BUFFER

        public static final ByteBuffer EMPTY_BYTE_BUFFER
        An empty ByteBuffer constant used in generated code.
      • EMPTY_CODED_INPUT_STREAM

        public static final CodedInputStream EMPTY_CODED_INPUT_STREAM
        An empty coded input stream constant used in generated code.
    • Method Detail

      • stringDefaultValue

        public static String stringDefaultValue​(String bytes)
        Helper called by generated code to construct default values for string fields.

        The protocol compiler does not actually contain a UTF-8 decoder -- it just pushes UTF-8-encoded text around without touching it. The one place where this presents a problem is when generating Java string literals. Unicode characters in the string literal would normally need to be encoded using a Unicode escape sequence, which would require decoding them. To get around this, protoc instead embeds the UTF-8 bytes into the generated code and leaves it to the runtime library to decode them.

        It gets worse, though. If protoc just generated a byte array, like:

         new byte[] {0x12, 0x34, 0x56, 0x78}

        the Java compiler actually generates *code* which allocates an array and then fills in each value. This is much less efficient than just embedding the bytes directly into the bytecode. To get around this, we need another work-around. String literals are embedded directly, so protoc actually generates a string literal corresponding to the bytes. The easiest way to do this is to use the ISO-8859-1 character set, which corresponds to the first 256 characters of the Unicode range. Protoc can then use good old CEscape to generate the string.

        So we have a string literal which represents a set of bytes which represents another string. This function -- stringDefaultValue -- converts from the generated string to the string we actually want. The generated code calls this automatically.
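        The two-step conversion described above can be sketched with plain JDK calls. This is only an illustration of the idea, not the library's source; the class name StringDefaultSketch and the method name decodeDefault are made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; StringDefaultSketch and decodeDefault are hypothetical names.
final class StringDefaultSketch {
    // Step 1: reinterpret each char of the literal as one byte (ISO-8859-1).
    // Step 2: decode those bytes as UTF-8 to obtain the intended string.
    static String decodeDefault(String literal) {
        byte[] utf8Bytes = literal.getBytes(StandardCharsets.ISO_8859_1);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 is encoded in UTF-8 as the bytes 0xC3 0xA9, which a
        // C-style escaper would write as the octal escapes below.
        String literal = "\303\251";
        System.out.println(decodeDefault(literal).equals("\u00E9")); // prints "true"
    }
}
```

        The literal contains only characters in the range 0 to 255, so the ISO-8859-1 step maps each character to exactly one byte.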

      • bytesDefaultValue

        public static ByteString bytesDefaultValue​(String bytes)
        Helper called by generated code to construct default values for bytes fields.

        This is a lot like stringDefaultValue(java.lang.String), but for bytes fields. In this case we only need the second of the two hacks -- allowing us to embed raw bytes as a string literal with ISO-8859-1 encoding.

      • byteArrayDefaultValue

        public static byte[] byteArrayDefaultValue​(String bytes)
        Helper called by generated code to construct default values for bytes fields.

        This is like bytesDefaultValue(java.lang.String), but returns a byte array.
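        For bytes fields only the ISO-8859-1 step applies. A minimal sketch, assuming the literal holds one character per byte; the names BytesDefaultSketch and toBytes are hypothetical, and wrapping the result in a ByteString is left out so the example needs only the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; BytesDefaultSketch and toBytes are hypothetical names.
final class BytesDefaultSketch {
    // Each char in the literal (range 0..255) becomes exactly one byte.
    static byte[] toBytes(String literal) {
        return literal.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Octal escapes for the bytes 0x12, 0x34, 0x56, 0x78.
        byte[] bytes = toBytes("\022\064\126\170");
        System.out.println(Arrays.toString(bytes)); // prints "[18, 52, 86, 120]"
    }
}
```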

      • copyByteBuffer

        public static ByteBuffer copyByteBuffer​(ByteBuffer source)
        Creates a new ByteBuffer and copies the entire content of the source ByteBuffer into it. The new ByteBuffer's limit and capacity will be source.capacity(), and its position will be 0. The state of the source ByteBuffer is not changed.
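        One way to obtain the behavior described here with the standard java.nio API is to copy through a duplicate view, so that the source's position and limit are never touched. This is a sketch under that assumption, not necessarily how the library does it; CopyByteBufferSketch and copy are hypothetical names.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; CopyByteBufferSketch and copy are hypothetical names.
final class CopyByteBufferSketch {
    static ByteBuffer copy(ByteBuffer source) {
        // A duplicate shares content but has its own position, limit, and mark.
        ByteBuffer view = source.duplicate();
        view.clear(); // position = 0, limit = capacity (duplicate only)

        ByteBuffer result = ByteBuffer.allocate(view.capacity());
        result.put(view); // copies everything from index 0 to capacity
        result.clear();   // position = 0, limit = capacity
        return result;
    }

    public static void main(String[] args) {
        ByteBuffer source = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        source.position(2);
        ByteBuffer copied = copy(source);
        System.out.println(copied.position() + " " + copied.limit()); // prints "0 4"
        System.out.println(source.position());                        // prints "2"
    }
}
```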
      • isValidUtf8

        public static boolean isValidUtf8​(ByteString byteString)
        Helper called by generated code to determine whether a byte sequence is a valid UTF-8 encoded string, such that the original bytes can be converted to a String object and then back to a byte array without loss. More precisely, returns true whenever:
        
         Arrays.equals(byteString.toByteArray(),
             new String(byteString.toByteArray(), "UTF-8").getBytes("UTF-8"))
         

        This method rejects "overlong" byte sequences, as well as 3-byte sequences that would map to a surrogate character, in accordance with the restricted definition of UTF-8 introduced in Unicode 3.1. Note that the UTF-8 decoder included in Oracle's JDK has been modified to also reject "overlong" byte sequences, but currently (2011) still accepts 3-byte surrogate character byte sequences.

        See the Unicode Standard,
        Table 3-6. UTF-8 Bit Distribution,
        Table 3-7. Well Formed UTF-8 Byte Sequences.

        As of 2011-02, this method simply returns the result of ByteString.isValidUtf8(). Calling that method directly is preferred.

        Parameters:
        byteString - the string to check
        Returns:
        true if the bytes can be round-tripped without loss
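        The round-trip condition quoted above can be tried directly with the JDK decoder. A minimal sketch, with the hypothetical names Utf8RoundTripSketch and roundTrips; note that it inherits whatever the JDK's decoder accepts, so its answer for 3-byte surrogate sequences may differ from this method's on older JDKs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; Utf8RoundTripSketch and roundTrips are hypothetical names.
final class Utf8RoundTripSketch {
    // Decodes as UTF-8, re-encodes, and compares with the input.
    // Malformed input is decoded to U+FFFD, so the comparison fails.
    static boolean roundTrips(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return Arrays.equals(bytes, decoded.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] valid = {(byte) 0xC3, (byte) 0xA9};    // U+00E9
        byte[] overlong = {(byte) 0xC0, (byte) 0x80}; // overlong encoding of U+0000
        System.out.println(roundTrips(valid));    // prints "true"
        System.out.println(roundTrips(overlong)); // prints "false"
    }
}
```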
      • isValidUtf8

        public static boolean isValidUtf8​(byte[] byteArray)
        Like isValidUtf8(ByteString) but for byte arrays.
      • toByteArray

        public static byte[] toByteArray​(String value)
        Helper method to get the UTF-8 bytes of a string.
      • toStringUtf8

        public static String toStringUtf8​(byte[] bytes)
        Helper method to convert a byte array to a string using UTF-8 encoding.
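        The two UTF-8 helpers above are presumably thin wrappers over the JDK's charset support. The equivalent plain-JDK calls look roughly like this; Utf8HelpersSketch and its method names are hypothetical and only mirror the signatures listed here.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; Utf8HelpersSketch is a hypothetical name.
final class Utf8HelpersSketch {
    static byte[] toByteArray(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    static String toStringUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "h\u00E9llo"; // U+00E9 takes two bytes in UTF-8
        byte[] bytes = toByteArray(text);
        System.out.println(bytes.length);                      // prints "6"
        System.out.println(toStringUtf8(bytes).equals(text));  // prints "true"
    }
}
```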
      • hashEnum

        public static int hashEnum​(Internal.EnumLite e)
        Helper method for implementing Message.hashCode() for enums.

        This is needed because Enum.hashCode() is final, but we need to use the enum value's number (as returned by getNumber()) as the hash code to ensure compatibility between statically and dynamically generated enum objects.
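        The reasoning can be illustrated with a small stand-in. In this sketch NumberedEnum is a hypothetical interface playing the role of an enum type that exposes getNumber(); hashing by number gives a compiled enum constant and a dynamically created value the same hash, which Enum.hashCode() cannot be overridden to do.

```java
// Illustrative sketch only; NumberedEnum, Color, and HashEnumSketch are hypothetical names.
final class HashEnumSketch {
    // Hypothetical stand-in for an enum-like type that exposes a number.
    interface NumberedEnum {
        int getNumber();
    }

    enum Color implements NumberedEnum {
        RED(1), GREEN(2);

        private final int number;

        Color(int number) {
            this.number = number;
        }

        @Override
        public int getNumber() {
            return number;
        }
        // Enum.hashCode() is final, so it cannot be overridden here.
    }

    // Hash by number rather than by object identity.
    static int hashEnum(NumberedEnum e) {
        return e.getNumber();
    }

    public static void main(String[] args) {
        NumberedEnum dynamicGreen = () -> 2; // a value created at run time
        System.out.println(hashEnum(Color.GREEN) == hashEnum(dynamicGreen)); // prints "true"
    }
}
```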

      • hashCode

        public static int hashCode​(List<byte[]> list)
        Helper method for implementing Message.hashCode() for a bytes field whose values are held as a list of byte arrays.
      • hashCode

        public static int hashCode​(byte[] bytes)
        Helper method for implementing Message.hashCode() for a bytes field held as a byte array.
      • hashCodeByteBuffer

        public static int hashCodeByteBuffer​(ByteBuffer bytes)
        Helper method for implementing Message.hashCode() for a bytes field held as a ByteBuffer.
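        Helpers like these are useful because a Java byte[] inherits Object.hashCode(), which is based on identity rather than content. The sketch below shows a content-based alternative using java.util.Arrays; it is an assumption for illustration only, and the library's actual hash function may well differ. BytesHashSketch, hashBytes, and hashList are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; BytesHashSketch, hashBytes, and hashList are hypothetical names.
final class BytesHashSketch {
    // Content-based hash of a single byte array.
    static int hashBytes(byte[] bytes) {
        return Arrays.hashCode(bytes);
    }

    // Combines the content-based hashes of each element in order.
    static int hashList(List<byte[]> list) {
        int hash = 1;
        for (byte[] bytes : list) {
            hash = 31 * hash + hashBytes(bytes);
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3}; // same content, different object
        System.out.println(hashBytes(a) == hashBytes(b)); // prints "true"
        System.out.println(
            hashList(Arrays.asList(a, b)) == hashList(Arrays.asList(b, a))); // prints "true"
    }
}
```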
      • getDefaultInstance

        public static <T extends MessageLite> T getDefaultInstance​(Class<T> clazz)