org.apache.hadoop.mapreduce.lib.partition
Class InputSampler.IntervalSampler<K,V>
java.lang.Object
org.apache.hadoop.mapreduce.lib.partition.InputSampler.IntervalSampler<K,V>
- All Implemented Interfaces:
- InputSampler.Sampler<K,V>
- Direct Known Subclasses:
- InputSampler.IntervalSampler
- Enclosing class:
- InputSampler<K,V>
public static class InputSampler.IntervalSampler<K,V>
- extends Object
- implements InputSampler.Sampler<K,V>
Samples records at regular intervals from up to maxSplitsSampled splits.
Useful for sorted data.
Method Summary
K[] getSample(InputFormat<K,V> inf, Job job)
- For each split sampled, emit when the ratio of the number of records
retained to the total record count is less than the specified
frequency.
Methods inherited from class java.lang.Object
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
freq
protected final double freq
maxSplitsSampled
protected final int maxSplitsSampled
InputSampler.IntervalSampler
public InputSampler.IntervalSampler(double freq)
- Create a new IntervalSampler sampling all splits.
- Parameters:
freq
- The frequency with which records will be emitted.
InputSampler.IntervalSampler
public InputSampler.IntervalSampler(double freq,
int maxSplitsSampled)
- Create a new IntervalSampler.
- Parameters:
freq
- The frequency with which records will be emitted.
maxSplitsSampled
- The maximum number of splits to examine.
- See Also:
getSample(org.apache.hadoop.mapreduce.InputFormat, org.apache.hadoop.mapreduce.Job)
getSample
public K[] getSample(InputFormat<K,V> inf,
Job job)
throws IOException,
InterruptedException
- For each split sampled, emit a record's key when the ratio of the number
of records retained so far to the number of records read so far is less
than the specified frequency.
- Specified by:
getSample
in interface InputSampler.Sampler<K,V>
- Throws:
IOException
InterruptedException
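The emission rule described for getSample can be illustrated with a minimal, Hadoop-free sketch. It assumes the ratio is evaluated after each record is read, comparing records kept so far against records read so far. The class name IntervalSampleSketch and the helper sample are hypothetical and are not part of the Hadoop API; an in-memory list stands in for the records of a split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IntervalSampleSketch {

    // Hypothetical helper (not a Hadoop method): applies the
    // "kept / read < freq" rule to an in-memory list of keys.
    public static <K> List<K> sample(List<K> keys, double freq) {
        List<K> samples = new ArrayList<>();
        long records = 0; // records read so far
        long kept = 0;    // records retained so far
        for (K key : keys) {
            ++records;
            // Keep the key only while the retained fraction is below freq.
            if ((double) kept / records < freq) {
                samples.add(key);
                ++kept;
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        List<Integer> sortedKeys =
            Arrays.asList(10, 20, 30, 40, 50, 60, 70, 80, 90, 100);
        // With freq = 0.25, roughly every fourth key is retained.
        System.out.println(sample(sortedKeys, 0.25)); // prints [10, 50, 90]
    }
}
```

Because the retained keys are spread evenly through the input, the samples approximate evenly spaced quantiles when the input is already sorted, which is consistent with the class note that it is useful for sorted data.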
Copyright © 2013 Apache Software Foundation. All Rights Reserved.