Add a new string to the map for dictionaries. The key field holds the index of the value, i.e. (n - 1) for the nth distinct string added to the map, together with the offset into the value data. The string itself is stored back-to-back in the value portion, preceded by its size in a variable-length encoding. This exactly matches the final format of the dictionary encoding, which stores the dictionary strings back-to-back in index order as expected by DictionaryDecoders. The encoder can therefore use the final serialized value array as is when writing the encoded column batch (followed by the dictionary indexes of the actual values themselves).
DictionaryDecoder reads the encoded values into an array during its initialization, and its readUTF8String method then looks them up from that array.
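The value layout described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class `DictionaryLayoutSketch` and its methods are hypothetical, a plain `java.util.HashMap` stands in for the ByteBuffer-backed map, the per-key offset is omitted, and a 7-bit-per-byte varint stands in for whatever variable-length size encoding the real encoder uses.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual encoder or DictionaryDecoder.
final class DictionaryLayoutSketch {
    // Maps each distinct string to its dictionary index.
    private final Map<String, Integer> indexes = new HashMap<>();
    // Strings stored back-to-back, each preceded by its varint-encoded size.
    private final ByteArrayOutputStream values = new ByteArrayOutputStream();

    /** Returns (n - 1) for the nth distinct string; repeats return the old index. */
    int add(String s) {
        Integer existing = indexes.get(s);
        if (existing != null) {
            return existing;
        }
        int index = indexes.size();
        indexes.put(s, index);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeVarInt(utf8.length);
        values.write(utf8, 0, utf8.length);
        return index;
    }

    /** The value bytes, already in the layout a decoder would read. */
    byte[] serializedValues() {
        return values.toByteArray();
    }

    // Assumed encoding: 7 data bits per byte, high bit set on all but the last byte.
    private void writeVarInt(int v) {
        while ((v & ~0x7F) != 0) {
            values.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        values.write(v);
    }

    /** Reads `count` strings into an array so they can be looked up by index. */
    static String[] decode(byte[] data, int count) {
        String[] out = new String[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int size = 0;
            int shift = 0;
            int b;
            do {
                b = data[pos++] & 0xFF;
                size |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            out[i] = new String(data, pos, size, StandardCharsets.UTF_8);
            pos += size;
        }
        return out;
    }
}
```

Because strings are appended in index order, the array produced by `decode` is addressable directly by the dictionary index that `add` returned, which is why no reordering step is needed before the value bytes are written out.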
A HashMap implementation using a serialized ByteBuffer for key data and another one for value data. Key data is required to hold fixed-width values while the value data will be written back-to-back as new data is inserted into the map. Key data is stored in the following format:
If the key has variable-length data, it should be appended to the value data. The offset and hash code are read together as a single long, with the least significant bits (LSB) holding the hash code and the most significant bits (MSB) holding the offset, so the two can be reversed in the actual memory layout on big-endian machines. Since this map is never stored on disk, no attempt is made to keep the endianness or memory layout of the data consistent.
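A small sketch of the combined offset and hash code word may help. It assumes a 32-bit offset and a 32-bit hash code, which the text above does not state explicitly, and the class `PackedKeyHeader` with its methods is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch; the 32/32 bit split is an assumption.
final class PackedKeyHeader {
    /** Packs the offset into the upper bits and the hash code into the lower bits. */
    static long pack(int offset, int hash) {
        return ((long) offset << 32) | (hash & 0xFFFFFFFFL);
    }

    /** The hash code is the low 32 bits of the packed word. */
    static int hash(long packed) {
        return (int) packed;
    }

    /** The offset is the high 32 bits of the packed word. */
    static int offset(long packed) {
        return (int) (packed >>> 32);
    }

    /** Writes the long in the given byte order and reads back the first 4 bytes. */
    static int firstInt(long packed, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(order);
        buf.putLong(0, packed);
        return buf.getInt(0);
    }
}
```

With `ByteOrder.LITTLE_ENDIAN`, `firstInt` returns the hash code because the low-order bytes come first in memory; with `ByteOrder.BIG_ENDIAN` it returns the offset. The logical value obtained through `hash` and `offset` is the same either way, which is all an in-memory map needs.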
When the load factor is exceeded, the map is rehashed: the key fields described above are moved into a new array at their new hash locations. The value fields are left untouched, and the key headers keep the same offsets into the value array as before the rehash.
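The rehash can be sketched as follows. This is only an illustration under several assumptions: each key is reduced to a single packed long (offset in the upper 32 bits, hash code in the lower 32 bits), the capacity is a power of two, collisions are resolved by linear probing, and `-1L` marks an empty slot. The class `RehashSketch` is hypothetical and none of these choices is confirmed by the text above.

```java
import java.util.Arrays;

// Hypothetical sketch of moving key headers only; value data is never touched.
final class RehashSketch {
    // Assumed empty-slot marker; never a valid header since offsets are non-negative.
    static final long EMPTY = -1L;

    /** Re-inserts every key header into a new array; newCapacity must be a power of two. */
    static long[] rehash(long[] oldKeys, int newCapacity) {
        long[] newKeys = new long[newCapacity];
        Arrays.fill(newKeys, EMPTY);
        int mask = newCapacity - 1;
        for (long header : oldKeys) {
            if (header == EMPTY) {
                continue;
            }
            // The low 32 bits are the hash code, so no key bytes need to be re-read.
            int pos = ((int) header) & mask;
            while (newKeys[pos] != EMPTY) {
                pos = (pos + 1) & mask; // linear probing (assumed strategy)
            }
            // The header is copied unchanged, so its offset into the value data stays valid.
            newKeys[pos] = header;
        }
        return newKeys;
    }
}
```

Keeping the hash code inside the header means the rehash never has to dereference the value data, which is one plausible reason for storing the two together.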