org.apache.hadoop.mapreduce.lib.partition
Class KeyFieldBasedPartitioner<K2,V2>
java.lang.Object
org.apache.hadoop.mapreduce.Partitioner<K2,V2>
org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedPartitioner<K2,V2>
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable
- Direct Known Subclasses:
- KeyFieldBasedPartitioner
@InterfaceAudience.Public
@InterfaceStability.Stable
public class KeyFieldBasedPartitioner<K2,V2>
- extends Partitioner<K2,V2>
- implements org.apache.hadoop.conf.Configurable
Defines a way to partition keys based on certain key fields (also see
KeyFieldBasedComparator).
The key specification supported is of the form -k pos1[,pos2], where
pos is of the form f[.c][opts], f is the number
of the key field to use, and c is the number of the first character from
the beginning of the field. Fields and character positions are numbered
starting with 1; a character position of zero in pos2 indicates the
field's last character. If '.c' is omitted from pos1, it defaults to 1
(the beginning of the field); if omitted from pos2, it defaults to 0
(the end of the field).
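As an illustration of the syntax above, the following sketch parses a key specification and extracts the selected portion of a key. KeySpecSketch, parse, and extract are hypothetical names and are not part of Hadoop. The sketch ignores [opts], counts Java chars rather than bytes, takes the field separator as an explicit argument, and assumes that an omitted pos2 means the key runs through the last field (the Unix sort convention).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Hadoop class) illustrating -k pos1[,pos2]. */
final class KeySpecSketch {
    // Accepts "-k f[.c][opts][,f[.c][opts]]"; option letters are matched but ignored.
    private static final Pattern SPEC = Pattern.compile(
        "-k\\s*(\\d+)(?:\\.(\\d+))?[a-zA-Z]*(?:,(\\d+)(?:\\.(\\d+))?[a-zA-Z]*)?");

    final int startField, startChar, endField, endChar;

    private KeySpecSketch(int sf, int sc, int ef, int ec) {
        startField = sf; startChar = sc; endField = ef; endChar = ec;
    }

    static KeySpecSketch parse(String spec) {
        Matcher m = SPEC.matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad key spec: " + spec);
        }
        int sf = Integer.parseInt(m.group(1));
        // '.c' omitted from pos1 defaults to 1 (the beginning of the field).
        int sc = m.group(2) == null ? 1 : Integer.parseInt(m.group(2));
        // Assumption: pos2 omitted entirely (stored as 0) means "through the last field".
        int ef = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        // '.c' omitted from pos2 defaults to 0 (the end of the field).
        int ec = m.group(4) == null ? 0 : Integer.parseInt(m.group(4));
        return new KeySpecSketch(sf, sc, ef, ec);
    }

    /** Returns the selected part of the key, or "" if the start field is missing. */
    String extract(String key, String separator) {
        String[] fields = key.split(Pattern.quote(separator), -1);
        if (startField < 1 || startField > fields.length) {
            return "";
        }
        int lastField = (endField == 0) ? fields.length : Math.min(endField, fields.length);
        StringBuilder sb = new StringBuilder();
        for (int f = startField; f <= lastField; f++) {
            String field = fields[f - 1];
            // The start character applies only to the first selected field.
            int from = (f == startField)
                ? Math.max(0, Math.min(startChar - 1, field.length())) : 0;
            // A non-zero end character applies only to the field named by pos2.
            int to = (f == endField && endChar > 0)
                ? Math.min(endChar, field.length()) : field.length();
            if (f > startField) {
                sb.append(separator);
            }
            if (from < to) {
                sb.append(field, from, to);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("-k2,2").extract("a\tbcd\te", "\t"));
        System.out.println(parse("-k1.2,1.3").extract("hello\tworld", "\t"));
    }
}
```

With a tab separator, "-k2,2" selects the whole second field, and "-k1.2,1.3" selects characters 2 through 3 of the first field.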
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PARTITIONER_OPTIONS
public static String PARTITIONER_OPTIONS
KeyFieldBasedPartitioner
public KeyFieldBasedPartitioner()
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
getConf
public org.apache.hadoop.conf.Configuration getConf()
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
getPartition
public int getPartition(K2 key,
V2 value,
int numReduceTasks)
- Description copied from class: Partitioner
- Get the partition number for a given key (hence record) given the total
number of partitions, i.e. the number of reduce tasks for the job.
Typically a hash function on all or a subset of the key.
- Specified by:
getPartition in class Partitioner<K2,V2>
- Parameters:
key - the key to be partitioned.
value - the entry value.
numReduceTasks - the total number of partitions.
- Returns:
- the partition number for the key.
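To make "a hash function on all or a subset of the key" concrete, here is a minimal sketch that partitions on the first tab-separated field only. FirstFieldPartitionSketch and partitionByFirstField are hypothetical names, and using String.hashCode with a tab separator is an assumption made for illustration, not a statement about how this class computes its hash.

```java
/** Hypothetical illustration (not a Hadoop class): partition on a subset of the key. */
final class FirstFieldPartitionSketch {
    static int partitionByFirstField(String key, int numReduceTasks) {
        // Use only the text before the first tab; the rest of the key is ignored.
        int tab = key.indexOf('\t');
        String field = (tab < 0) ? key : key.substring(0, tab);
        // Clear the sign bit so the result is always in [0, numReduceTasks).
        return (field.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Both keys share the first field, so they map to the same partition.
        System.out.println(partitionByFirstField("user1\tclick", 8));
        System.out.println(partitionByFirstField("user1\tview", 8));
    }
}
```

Because only the first field feeds the hash, all records for the same first field reach the same reduce task regardless of the remaining fields.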
hashCode
protected int hashCode(byte[] b,
int start,
int end,
int currentHash)
getPartition
protected int getPartition(int hash,
int numReduceTasks)
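The two protected helpers above carry no description on this page. One plausible reading, offered as an assumption rather than as the library's confirmed behavior, is that hashCode folds a byte range into a running hash and getPartition maps that hash to a non-negative partition index. PartitionHashSketch is a hypothetical stand-in; the multiplier 31, the inclusive end index, and the sign-bit mask are all assumptions.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in (not a Hadoop class) for the two protected helpers. */
final class PartitionHashSketch {
    // Assumption: folds bytes b[start..end] (end inclusive) into currentHash.
    static int hashCode(byte[] b, int start, int end, int currentHash) {
        for (int i = start; i <= end; i++) {
            currentHash = 31 * currentHash + b[i];
        }
        return currentHash;
    }

    // Assumption: masks the sign bit so negative hashes still yield a valid index.
    static int getPartition(int hash, int numReduceTasks) {
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        byte[] key = "ab".getBytes(StandardCharsets.UTF_8);
        int hash = hashCode(key, 0, key.length - 1, 0);
        System.out.println(getPartition(hash, 4));
    }
}
```

Passing the previous result back in as currentHash lets several key ranges contribute to a single hash before the final modulo.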
setKeyFieldPartitionerOptions
public void setKeyFieldPartitionerOptions(Job job,
String keySpec)
- Set the KeyFieldBasedPartitioner options used for the Partitioner.
- Parameters:
keySpec - the key specification of the form -k pos1[,pos2], where
pos is of the form f[.c][opts], f is the number
of the key field to use, and c is the number of the first character from
the beginning of the field. Fields and character positions are numbered
starting with 1; a character position of zero in pos2 indicates the
field's last character. If '.c' is omitted from pos1, it defaults to 1
(the beginning of the field); if omitted from pos2, it defaults to 0
(the end of the field).
getKeyFieldPartitionerOption
public String getKeyFieldPartitionerOption(JobContext job)
- Get the KeyFieldBasedPartitioner options.
Copyright © 2012 Apache Software Foundation. All Rights Reserved.