com.google.protobuf
Class Internal

java.lang.Object
  extended by com.google.protobuf.Internal

public class Internal
extends java.lang.Object

The classes contained within are used internally by the Protocol Buffer library and generated message implementations. They are public only because those generated messages do not reside in the protobuf package. Others should not use this class directly.

Author:
Kenton Varda

Nested Class Summary
static interface Internal.EnumLite
          Interface for an enum value or value descriptor, to be used in FieldSet.
static interface Internal.EnumLiteMap<T extends Internal.EnumLite>
          Interface for an object which maps integers to Internal.EnumLites.
 
Constructor Summary
Internal()
           
 
Method Summary
static ByteString bytesDefaultValue(java.lang.String bytes)
          Helper called by generated code to construct default values for bytes fields.
static boolean isValidUtf8(ByteString byteString)
          Helper called by generated code to determine whether a byte array is a valid UTF-8 encoded string, that is, whether the original bytes can be converted to a String object and then back to a byte array without loss.
static java.lang.String stringDefaultValue(java.lang.String bytes)
          Helper called by generated code to construct default values for string fields.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Internal

public Internal()
Method Detail

stringDefaultValue

public static java.lang.String stringDefaultValue(java.lang.String bytes)
Helper called by generated code to construct default values for string fields.

The protocol compiler does not actually contain a UTF-8 decoder -- it just pushes UTF-8-encoded text around without touching it. The one place where this presents a problem is when generating Java string literals. Unicode characters in the string literal would normally need to be encoded using a Unicode escape sequence, which would require decoding them. To get around this, protoc instead embeds the UTF-8 bytes into the generated code and leaves it to the runtime library to decode them.

It gets worse, though. If protoc just generated a byte array, like:

    new byte[] {0x12, 0x34, 0x56, 0x78}

Java actually generates *code* which allocates an array and then fills in each value. This is much less efficient than just embedding the bytes directly into the bytecode. To get around this, we need another work-around. String literals are embedded directly, so protoc actually generates a string literal corresponding to the bytes. The easiest way to do this is to use the ISO-8859-1 character set, which corresponds to the first 256 characters of the Unicode range. Protoc can then use good old CEscape to generate the string.

So we have a string literal which represents a set of bytes which represents another string. This function -- stringDefaultValue -- converts from the generated string to the string we actually want. The generated code calls this automatically.
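The two-step conversion described above can be sketched roughly as follows. This is an illustration only, not the library's source: the class name StringDefaultSketch and the method decodeDefault are hypothetical, and the sketch assumes the generated literal holds exactly one char per UTF-8 byte.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the protobuf library.
final class StringDefaultSketch {
    // Step 1: map each char of the literal back to one byte (ISO-8859-1).
    // Step 2: decode those bytes as the UTF-8 text they really represent.
    static String decodeDefault(String literal) {
        byte[] utf8 = literal.getBytes(StandardCharsets.ISO_8859_1);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The octal escapes 303 and 251 are the bytes 0xC3 0xA9,
        // which form the UTF-8 encoding of U+00E9 (e with acute accent).
        String literal = "\303\251";
        String decoded = decodeDefault(literal);
        System.out.println(decoded.length());          // one character
        System.out.println((int) decoded.charAt(0));   // its code point
    }
}
```

Octal escapes are used in the literal because that is the kind of output a C-style escaper produces; plain ASCII text passes through both steps unchanged.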


bytesDefaultValue

public static ByteString bytesDefaultValue(java.lang.String bytes)
Helper called by generated code to construct default values for bytes fields.

This is a lot like stringDefaultValue(java.lang.String), but for bytes fields. In this case we only need the second of the two hacks -- allowing us to embed raw bytes as a string literal with ISO-8859-1 encoding.
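A minimal sketch of that single step, assuming a plain byte[] stands in for ByteString so the example has no protobuf dependency. The class name BytesDefaultSketch and the method rawBytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the protobuf library.
final class BytesDefaultSketch {
    // ISO-8859-1 maps chars 0..255 one-to-one onto byte values, so the
    // literal's chars can be read back as the raw bytes they stand for.
    static byte[] rawBytes(String literal) {
        return literal.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Octal escapes for the bytes 0x12, 0x34, 0x56, 0x78.
        byte[] bytes = rawBytes("\022\064\126\170");
        for (byte b : bytes) {
            System.out.printf("0x%02x%n", b & 0xff);
        }
    }
}
```

No UTF-8 decoding happens here, since a bytes field has no text interpretation to recover.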


isValidUtf8

public static boolean isValidUtf8(ByteString byteString)
Helper called by generated code to determine whether a byte array is a valid UTF-8 encoded string, that is, whether the original bytes can be converted to a String object and then back to a byte array without loss.

This is inspired by UTF_8.java in sun.nio.cs.

Parameters:
byteString - the string to check
Returns:
whether the byte array is round trippable
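The round-trip property can be illustrated with a naive check. This sketch is not how the library implements the method: it simply decodes and re-encodes with the JDK's UTF-8 charset and compares the result, using a plain byte[] in place of ByteString. The class name Utf8RoundTripSketch and the method isRoundTrippable are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of the protobuf library.
final class Utf8RoundTripSketch {
    // Decodes the bytes as UTF-8 and encodes the result again. The JDK
    // decoder substitutes U+FFFD for malformed input, so invalid bytes
    // come back changed and the comparison fails.
    static boolean isRoundTrippable(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        byte[] reencoded = decoded.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(bytes, reencoded);
    }

    public static void main(String[] args) {
        byte[] valid = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] truncated = {(byte) 0xC3};            // lead byte with no continuation
        byte[] overlong = {(byte) 0xC0, (byte) 0x80}; // overlong encoding of NUL
        System.out.println(isRoundTrippable(valid));
        System.out.println(isRoundTrippable(truncated));
        System.out.println(isRoundTrippable(overlong));
    }
}
```

A scanning validator that walks the bytes directly avoids the two intermediate allocations this approach makes, which matters for code called on every parsed string field.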


Copyright © 2008-2011 Google. All Rights Reserved.