org.apache.hadoop.mapreduce.lib.partition
Class KeyFieldBasedPartitioner<K2,V2>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Partitioner<K2,V2>
      extended by org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedPartitioner<K2,V2>
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable
Direct Known Subclasses:
org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner

@InterfaceAudience.Public
@InterfaceStability.Stable
public class KeyFieldBasedPartitioner<K2,V2>
extends Partitioner<K2,V2>
implements org.apache.hadoop.conf.Configurable

Defines a way to partition keys based on certain key fields (also see KeyFieldBasedComparator). The key specification supported is of the form -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).
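
The defaults described above can be illustrated with a small, self-contained parser. This is only a sketch under stated assumptions, not the library's parsing code: the class KeySpecSketch and its methods are hypothetical names, trailing option letters are simply dropped, and a missing pos2 is represented as end field 0 (taken here to mean "to the end of the key", an assumption modeled on sort(1)).

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop: parses "-k pos1[,pos2]".
public class KeySpecSketch {

    /** Returns {startField, startChar, endField, endChar}. */
    public static int[] parse(String keySpec) {
        String s = keySpec.trim();
        if (!s.startsWith("-k")) {
            throw new IllegalArgumentException("expected -k pos1[,pos2]: " + keySpec);
        }
        String[] parts = s.substring(2).trim().split(",", 2);
        // '.c' omitted from pos1 defaults to 1 (beginning of the field).
        int[] start = parsePos(parts[0], 1);
        // '.c' omitted from pos2 defaults to 0 (end of the field).
        // Assumption: an absent pos2 is encoded as field 0, char 0.
        int[] end = parts.length == 2 ? parsePos(parts[1], 0) : new int[] {0, 0};
        return new int[] {start[0], start[1], end[0], end[1]};
    }

    private static int[] parsePos(String pos, int defaultChar) {
        // Drop any trailing option letters ([opts]); this sketch ignores them.
        String digits = pos.trim().replaceAll("[A-Za-z]+$", "");
        String[] fc = digits.split("\\.", 2);
        int field = Integer.parseInt(fc[0]);
        int ch = fc.length == 2 ? Integer.parseInt(fc[1]) : defaultChar;
        return new int[] {field, ch};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("-k2.3,4"))); // [2, 3, 4, 0]
        System.out.println(Arrays.toString(parse("-k1,1")));   // [1, 1, 1, 0]
    }
}
```

Under these assumptions, -k2.3,4 reads as "from the third character of field 2 through the last character of field 4".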


Field Summary
static String PARTITIONER_OPTIONS
           
 
Constructor Summary
KeyFieldBasedPartitioner()
           
 
Method Summary
 org.apache.hadoop.conf.Configuration getConf()
           
 String getKeyFieldPartitionerOption(JobContext job)
          Get the KeyFieldBasedPartitioner options
protected  int getPartition(int hash, int numReduceTasks)
           
 int getPartition(K2 key, V2 value, int numReduceTasks)
          Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce-tasks for the job.
protected  int hashCode(byte[] b, int start, int end, int currentHash)
           
 void setConf(org.apache.hadoop.conf.Configuration conf)
           
 void setKeyFieldPartitionerOptions(Job job, String keySpec)
          Set the KeyFieldBasedPartitioner options used for the Partitioner.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARTITIONER_OPTIONS

public static String PARTITIONER_OPTIONS
Constructor Detail

KeyFieldBasedPartitioner

public KeyFieldBasedPartitioner()
Method Detail

setConf

public void setConf(org.apache.hadoop.conf.Configuration conf)
Specified by:
setConf in interface org.apache.hadoop.conf.Configurable

getConf

public org.apache.hadoop.conf.Configuration getConf()
Specified by:
getConf in interface org.apache.hadoop.conf.Configurable

getPartition

public int getPartition(K2 key,
                        V2 value,
                        int numReduceTasks)
Description copied from class: Partitioner
Get the partition number for a given key (hence record) given the total number of partitions i.e. number of reduce-tasks for the job.

Typically a hash function on all or a subset of the key.

Specified by:
getPartition in class Partitioner<K2,V2>
Parameters:
key - the key to be partitioned.
value - the entry value.
numReduceTasks - the total number of partitions.
Returns:
the partition number for the key.
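
As an illustration of "a hash function on a subset of the key", the following self-contained sketch hashes one field of a delimited key and maps the result to a partition. It does not use Hadoop and is not presented as the library's implementation: the class FieldPartitionSketch, its method names, the 31-multiplier byte hash, the UTF-8 encoding, and the fallback to partition 0 for a missing field are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of Hadoop.
public class FieldPartitionSketch {

    /** Folds bytes b[start..end] (inclusive) into a running hash. */
    public static int hashBytes(byte[] b, int start, int end, int currentHash) {
        for (int i = start; i <= end; i++) {
            currentHash = 31 * currentHash + b[i];
        }
        return currentHash;
    }

    /** Maps a hash to a partition in [0, numReduceTasks). */
    public static int toPartition(int hash, int numReduceTasks) {
        // Masking the sign bit keeps the result non-negative.
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    /** Partitions on a single 1-based field of a delimited key. */
    public static int partitionByField(String key, int field,
                                       String separator, int numReduceTasks) {
        String[] fields = key.split(Pattern.quote(separator), -1);
        if (field < 1 || field > fields.length) {
            return 0; // assumption: keys lacking the field all go to partition 0
        }
        byte[] bytes = fields[field - 1].getBytes(StandardCharsets.UTF_8);
        int hash = hashBytes(bytes, 0, bytes.length - 1, 0);
        return toPartition(hash, numReduceTasks);
    }

    public static void main(String[] args) {
        // Keys sharing field 1 land in the same partition, whatever follows.
        System.out.println(partitionByField("a\tx", 1, "\t", 4)); // 1
        System.out.println(partitionByField("a\ty", 1, "\t", 4)); // 1
    }
}
```

The point of hashing only the selected field is that all records agreeing on that field reach the same reduce task, even when the rest of the key differs.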

hashCode

protected int hashCode(byte[] b,
                       int start,
                       int end,
                       int currentHash)

getPartition

protected int getPartition(int hash,
                           int numReduceTasks)

setKeyFieldPartitionerOptions

public void setKeyFieldPartitionerOptions(Job job,
                                          String keySpec)
Set the KeyFieldBasedPartitioner options used for the Partitioner.

Parameters:
keySpec - the key specification of the form -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).
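
To make the field and character numbering concrete, the sketch below selects the part of a key that a given specification would cover. It is a hedged illustration only: the class KeySpanSketch and its select method are hypothetical, it works on Java chars rather than on the serialized key bytes, the separator is passed in explicitly, and an end field of 0 is assumed to mean "to the end of the key".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of Hadoop.
public class KeySpanSketch {

    /**
     * Returns the span from character startChar of field startField through
     * character endChar of field endField (endChar 0 = last character).
     * Fields and character positions are 1-based.
     */
    public static String select(String key, char sep, int startField,
                                int startChar, int endField, int endChar) {
        // Record the offset at which each field begins.
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        for (int i = 0; i < key.length(); i++) {
            if (key.charAt(i) == sep) {
                starts.add(i + 1);
            }
        }
        int n = starts.size();
        if (startField < 1 || startField > n) {
            return "";
        }
        int from = Math.min(starts.get(startField - 1) + (startChar - 1), key.length());
        int to; // exclusive end offset
        if (endField == 0 || endField > n) {
            to = key.length(); // assumption: no end field means end of key
        } else {
            int fieldStart = starts.get(endField - 1);
            int fieldEnd = endField < n ? starts.get(endField) - 1 : key.length();
            to = endChar == 0 ? fieldEnd : Math.min(fieldStart + endChar, fieldEnd);
        }
        return from >= to ? "" : key.substring(from, to);
    }

    public static void main(String[] args) {
        String key = "ab\tcdef\tgh\tij";
        System.out.println(select(key, '\t', 1, 1, 1, 0)); // -k1,1   -> ab
        System.out.println(select(key, '\t', 2, 1, 2, 2)); // -k2,2.2 -> cd
    }
}
```

For the same key, a specification such as -k2.3,3 would cover "ef", the separator, and "gh" in this sketch, since the span runs from the third character of field 2 to the last character of field 3.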

getKeyFieldPartitionerOption

public String getKeyFieldPartitionerOption(JobContext job)
Get the KeyFieldBasedPartitioner options



Copyright © 2012 Apache Software Foundation. All Rights Reserved.