public class DictionaryEncoder extends Object
Encoder/decoder for dictionary-encoded ValueVector. Dictionary encoding produces an integer ValueVector. Each entry in the vector is an index into the dictionary, which can hold values of any type.

Constructor and Description |
---|
DictionaryEncoder(Dictionary dictionary, BufferAllocator allocator) Construct an instance. |
DictionaryEncoder(Dictionary dictionary, BufferAllocator allocator, ArrowBufHasher hasher) Construct an instance. |
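
The class description states that each entry of the encoded vector is an index into the dictionary. The following is a minimal plain-Java sketch of that idea, not Arrow's implementation: the class `DictionarySketch` and its methods are hypothetical, it uses `List<String>` and `int[]` in place of ValueVector and Dictionary, and it ignores null handling. It also shows why a value-to-index hash table is needed only for encoding, while decoding is a plain positional lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Conceptual sketch only; this class is hypothetical and is not part of Arrow. */
public class DictionarySketch {

    /** Replaces each value with its position in the dictionary. */
    public static int[] encode(List<String> values, List<String> dictionary) {
        // Encoding needs a value -> index hash table built from the dictionary.
        Map<String, Integer> indexOf = new HashMap<>();
        for (int i = 0; i < dictionary.size(); i++) {
            indexOf.putIfAbsent(dictionary.get(i), i);
        }
        int[] indices = new int[values.size()];
        for (int i = 0; i < values.size(); i++) {
            Integer index = indexOf.get(values.get(i));
            if (index == null) {
                throw new IllegalArgumentException("Value not in dictionary: " + values.get(i));
            }
            indices[i] = index;
        }
        return indices;
    }

    /** Looks each index up in the dictionary; no hash table is required. */
    public static List<String> decode(int[] indices, List<String> dictionary) {
        List<String> values = new ArrayList<>(indices.length);
        for (int index : indices) {
            values.add(dictionary.get(index));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> dictionary = List.of("foo", "bar", "baz");
        List<String> values = List.of("bar", "bar", "foo", "baz");
        int[] indices = encode(values, dictionary);
        System.out.println(Arrays.toString(indices));    // [1, 1, 0, 2]
        System.out.println(decode(indices, dictionary)); // [bar, bar, foo, baz]
    }
}
```

The encoded form stores one small integer per entry plus a single copy of each distinct value, which is where the space saving for repetitive data comes from.
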
Modifier and Type | Method and Description |
---|---|
ValueVector | decode(ValueVector indices) Decodes a vector with the dictionary in this encoder. |
static ValueVector | decode(ValueVector indices, Dictionary dictionary) Decodes a dictionary-encoded array using the provided dictionary. |
static ValueVector | decode(ValueVector indices, Dictionary dictionary, BufferAllocator allocator) Decodes a dictionary-encoded array using the provided dictionary. |
ValueVector | encode(ValueVector vector) Encodes a vector with the built hash table in this encoder. |
static ValueVector | encode(ValueVector vector, Dictionary dictionary) Dictionary encodes a vector with a provided dictionary. |
static ArrowType.Int | getIndexType(int valueCount) Get the indexType according to the dictionary vector valueCount. |
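
To illustrate how an index type can depend on the dictionary's value count, here is a minimal plain-Java sketch. It is not Arrow's implementation: the class `IndexWidthSketch`, the method `indexBitWidth`, and the cutoffs (the maximum values of signed 8-bit and 16-bit integers) are assumptions made for illustration, and the thresholds used by `getIndexType` may differ.

```java
/** Conceptual sketch only; hypothetical class, not Arrow's implementation. */
public class IndexWidthSketch {

    /**
     * Returns a signed integer bit width (8, 16 or 32) large enough to index
     * a dictionary with the given number of values.
     * The cutoffs below are assumed for illustration.
     */
    public static int indexBitWidth(int valueCount) {
        if (valueCount < 0) {
            throw new IllegalArgumentException("valueCount must be non-negative");
        }
        if (valueCount <= Byte.MAX_VALUE) {   // up to 127 values
            return 8;
        }
        if (valueCount <= Short.MAX_VALUE) {  // up to 32767 values
            return 16;
        }
        return 32;
    }

    public static void main(String[] args) {
        System.out.println(indexBitWidth(100));    // 8
        System.out.println(indexBitWidth(1000));   // 16
        System.out.println(indexBitWidth(100000)); // 32
    }
}
```

Choosing the narrowest width that covers every index keeps the encoded vector small when the dictionary has few distinct values.
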
public DictionaryEncoder(Dictionary dictionary, BufferAllocator allocator)
public DictionaryEncoder(Dictionary dictionary, BufferAllocator allocator, ArrowBufHasher hasher)
public static ValueVector encode(ValueVector vector, Dictionary dictionary)
vector - vector to encode
dictionary - dictionary used for encoding

public static ValueVector decode(ValueVector indices, Dictionary dictionary)
indices - dictionary encoded values, must be int type
dictionary - dictionary used to decode the values

public static ValueVector decode(ValueVector indices, Dictionary dictionary, BufferAllocator allocator)
indices - dictionary encoded values, must be int type
dictionary - dictionary used to decode the values
allocator - allocator the decoded values use

public static ArrowType.Int getIndexType(int valueCount)
valueCount - dictionary vector valueCount.

public ValueVector encode(ValueVector vector)
public ValueVector decode(ValueVector indices)
decode(ValueVector, Dictionary, BufferAllocator) should be used instead if only decoding is required, as it can avoid building the DictionaryHashTable, which only makes sense when encoding.

Copyright © 2023 The Apache Software Foundation. All rights reserved.