TypedPipeInst

Instance Constructors

new TypedPipeInst(inpipe: Pipe, fields: Fields, flatMapFn: FlatMapFn[T])

Value Members

final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def !=(arg0: Any): Boolean

Definition Classes
Any
final def ##(): Int

Definition Classes
AnyRef → Any
def ++[U >: T](other: TypedPipe[U]): TypedPipe[U]

Definition Classes
TypedPipe
final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def ==(arg0: Any): Boolean

Definition Classes
Any
def aggregate[B, C](agg: Aggregator[T, B, C]): ValuePipe[C]

Same as groupAll.aggregate.values

Definition Classes
TypedPipe
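As a rough illustration of what aggregating the whole pipe into a single value means, the sketch below models the idea on an in-memory Seq. It does not use the Scalding or Algebird API: SimpleAggregator, its three fields, and the Option result (standing in for a possibly empty ValuePipe) are hypothetical names chosen for this example.

```scala
object AggregateSketch {
  // Hypothetical stand-in for an Aggregator[T, B, C]:
  // prepare each item, combine the prepared values, present the result.
  final case class SimpleAggregator[T, B, C](
      prepare: T => B,
      combine: (B, B) => B,
      present: B => C)

  // Treat all items as one group and reduce them to at most one value.
  def aggregate[T, B, C](items: Seq[T], agg: SimpleAggregator[T, B, C]): Option[C] =
    items.map(agg.prepare).reduceOption(agg.combine).map(agg.present)

  // Example aggregator: arithmetic mean via a (sum, count) pair.
  val mean: SimpleAggregator[Int, (Int, Int), Double] =
    SimpleAggregator(
      x => (x, 1),
      (a, b) => (a._1 + b._1, a._2 + b._2),
      s => s._1.toDouble / s._2)

  def main(args: Array[String]): Unit =
    println(aggregate(Seq(1, 2, 3), mean))
}
```

An empty input yields None in this model, since there is nothing to combine.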
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def collect[U](fn: PartialFunction[T, U]): TypedPipe[U]

Definition Classes
TypedPipe
def cross[U](tiny: TypedPipe[U]): TypedPipe[(T, U)]

Definition Classes
TypedPipeInst → TypedPipe
def cross[V](p: ValuePipe[V]): TypedPipe[(T, V)]

Definition Classes
TypedPipe
def debug: TypedPipe[T]

Definition Classes
TypedPipeInst → TypedPipe
def distinct(implicit ord: Ordering[_ >: T]): TypedPipe[T]

Returns the set of distinct elements in the TypedPipe

Definition Classes
TypedPipe
Annotations
@implicitNotFound( ... )
def either[R](that: TypedPipe[R]): TypedPipe[Either[T, R]]

Definition Classes
TypedPipe
def eitherValues[K, V, R](that: TypedPipe[(K, R)])(implicit ev: <:<[T, (K, V)]): TypedPipe[(K, Either[V, R])]

Sometimes useful for implementing custom joins with groupBy + mapValueStream when you know that the value/key can fit in memory. Beware.

Definition Classes
TypedPipe
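The return type of eitherValues suggests that values from this pipe are tagged as Left and values from the other pipe as Right, sharing one key space. The sketch below models that reading on plain Seq values; it is an assumption drawn from the signature, not a copy of Scalding's implementation, and the function here is a local stand-in rather than the library method.

```scala
object EitherValuesSketch {
  // Tag left-hand values with Left and right-hand values with Right,
  // then concatenate so that both sides can be grouped by the same key.
  def eitherValues[K, V, R](left: Seq[(K, V)], right: Seq[(K, R)]): Seq[(K, Either[V, R])] =
    left.map { case (k, v) => (k, Left(v): Either[V, R]) } ++
      right.map { case (k, r) => (k, Right(r): Either[V, R]) }

  def main(args: Array[String]): Unit = {
    val clicks = Seq(("u1", 3), ("u2", 5))
    val names = Seq(("u1", "alice"))
    eitherValues(clicks, names).foreach(println)
  }
}
```

After grouping such a collection by key, a custom join can pattern match on Left and Right within each key's values, which is where the in-memory caveat applies.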
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
val fields: Fields
def filter(f: (T) ⇒ Boolean): TypedPipe[T]

Keep only items that satisfy this predicate

Definition Classes
TypedPipeInst → TypedPipe
def filterKeys[K](fn: (K) ⇒ Boolean)(implicit ev: <:<[T, (K, Any)]): TypedPipe[T]

If T is a (K, V) for some V, then we can use this function to filter on the key. This is here to match the function in KeyedListLike, where it is optimized.

Definition Classes
TypedPipe
def filterNot(f: (T) ⇒ Boolean): TypedPipe[T]

Keep only items that don't satisfy the predicate. filterNot is the same as filter with a negated predicate.

Definition Classes
TypedPipe
def filterWithValue[U](value: ValuePipe[U])(f: (T, Option[U]) ⇒ Boolean): TypedPipe[T]

Definition Classes
TypedPipe
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def flatMap[U](f: (T) ⇒ TraversableOnce[U]): TypedPipe[U]

Definition Classes
TypedPipeInst → TypedPipe
val flatMapFn: FlatMapFn[T]
def flatMapWithValue[U, V](value: ValuePipe[U])(f: (T, Option[U]) ⇒ TraversableOnce[V]): TypedPipe[V]

Definition Classes
TypedPipe
def flatten[U](implicit ev: <:<[T, TraversableOnce[U]]): TypedPipe[U]

flatten an Iterable

Definition Classes
TypedPipe
lazy val forceToDisk: TypedPipe[T]

Force a materialization of this pipe prior to the next operation. This is useful if you filter almost everything before a hashJoin, for instance.

Definition Classes
TypedPipeInst → TypedPipe
def fork: TypedPipe[T]

If you are going to create two branches or forks, it may be more efficient to call this method first, which will create a node in the Cascading graph. Without this, both full branches of the fork will be put into separate Cascading pipes.
Ideally the planner would see this

Definition Classes
TypedPipeInst → TypedPipe
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def group[K, V](implicit ev: <:<[T, (K, V)], ord: Ordering[K]): Grouped[K, V]

Definition Classes
TypedPipe
def groupAll: Grouped[Unit, T]

Definition Classes
TypedPipe
def groupBy[K](g: (T) ⇒ K)(implicit ord: Ordering[K]): Grouped[K, T]

Definition Classes
TypedPipe
def groupRandomly(partitions: Int): Grouped[Int, T]

Forces a shuffle by randomly assigning each item into one of the partitions.
This is for the case where your mappers take a long time, and it is faster to shuffle the items to more reducers and then operate.
You probably want shard if you are just forcing a shuffle.

Definition Classes
TypedPipe
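The random assignment described for groupRandomly can be pictured with a small in-memory model: each item receives a random key in the range [0, partitions) and items are grouped on that key. This is a sketch only; the seed parameter and the Map result are assumptions made so the example is deterministic and self-contained, and they are not part of the method shown above.

```scala
import scala.util.Random

object GroupRandomlySketch {
  // Assign every item a random partition id in [0, partitions) and group on it.
  // `seed` is a hypothetical parameter used only to make the sketch repeatable.
  def groupRandomly[T](items: Seq[T], partitions: Int, seed: Long): Map[Int, Seq[T]] = {
    require(partitions > 0, "partitions must be positive")
    val rng = new Random(seed)
    items
      .map(item => (rng.nextInt(partitions), item))
      .groupBy(_._1)
      .map { case (part, tagged) => (part, tagged.map(_._2)) }
  }

  def main(args: Array[String]): Unit =
    groupRandomly((1 to 10).toSeq, 3, seed = 42L).foreach(println)
}
```

Every item lands in exactly one partition, so the total item count is preserved while the work is spread over up to `partitions` groups.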
def hashCogroup[K, V, W, R](smaller: HashJoinable[K, W])(joiner: (K, V, Iterable[W]) ⇒ Iterator[R])(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]]): TypedPipe[(K, R)]

These operations look like joins, but they do not force any communication of the current TypedPipe. They are mapping operations where this pipe is streamed through one item at a time.
WARNING: These behave semantically very differently from cogroup. This is because we handle (K, V) tuples on the left as we see them. The iterable on the right is over all elements with a matching key K, and it may be empty if there are no values for this key K.

Definition Classes
TypedPipe
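To make the difference from cogroup concrete, the following sketch models the hashCogroup description with standard collections: the smaller side is indexed by key once, and the left side is streamed item by item, so the joiner runs once per left tuple rather than once per key. The in-memory Map index and the function below are assumptions for illustration, not Scalding's implementation.

```scala
object HashCogroupSketch {
  // Stream the left side one (K, V) at a time against an index of the smaller side.
  def hashCogroup[K, V, W, R](left: Iterator[(K, V)], smaller: Seq[(K, W)])(
      joiner: (K, V, Iterable[W]) => Iterator[R]): Iterator[(K, R)] = {
    // Index the smaller side by key; this is the part assumed to fit in memory.
    val index: Map[K, Seq[W]] =
      smaller.groupBy(_._1).map { case (k, kws) => (k, kws.map(_._2)) }
    left.flatMap { case (k, v) =>
      // A key missing on the right yields an empty Iterable, not a skipped item.
      joiner(k, v, index.getOrElse(k, Seq.empty[W])).map(r => (k, r))
    }
  }

  def main(args: Array[String]): Unit = {
    val left = Iterator(("a", 1), ("b", 2), ("a", 3))
    val right = Seq(("a", "x"), ("a", "y"))
    hashCogroup(left, right)((_, v, ws) => Iterator(v * ws.size)).foreach(println)
  }
}
```

Note that key "a" appears twice on the left and the joiner is invoked for each occurrence separately, whereas a cogroup would see all left values for "a" together.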
def hashJoin[K, V, W](smaller: HashJoinable[K, W])(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]]): TypedPipe[(K, (V, W))]

Definition Classes
TypedPipe
def hashLeftJoin[K, V, W](smaller: HashJoinable[K, W])(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]]): TypedPipe[(K, (V, Option[W]))]

Definition Classes
TypedPipe
def hashLookup[K >: T, V](grouped: HashJoinable[K, V]): TypedPipe[(K, Option[V])]

For each element, do a map-side (hash) left join to look up a value

Definition Classes
TypedPipe
val inpipe: Pipe
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def keys[K](implicit ev: <:<[T, Tuple2[K, _]]): TypedPipe[K]

Definition Classes
TypedPipe
def leftCross[V](thatPipe: TypedPipe[V]): TypedPipe[(T, Option[V])]

Definition Classes
TypedPipe
def leftCross[V](p: ValuePipe[V]): TypedPipe[(T, Option[V])]

Definition Classes
TypedPipe
def limit(count: Int): TypedPipe[T]

Limit the output to at most count items. Useful for debugging, but probably that's about it. The number of items may be less than count, and the items are not sampled by any particular method.

Definition Classes
TypedPipeInst → TypedPipe
def map[U](f: (T) ⇒ U): TypedPipe[U]

Definition Classes
TypedPipeInst → TypedPipe
def mapValues[K, V, U](f: (V) ⇒ U)(implicit ev: <:<[T, (K, V)]): TypedPipe[(K, U)]

Definition Classes
TypedPipe
def mapWithValue[U, V](value: ValuePipe[U])(f: (T, Option[U]) ⇒ V): TypedPipe[V]

Definition Classes
TypedPipe
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
lazy val pipe: Pipe

Attributes
protected
def sample(percent: Double, seed: Long): TypedPipe[T]

Definition Classes
TypedPipeInst → TypedPipe
def sample(percent: Double): TypedPipe[T]

Definition Classes
TypedPipeInst → TypedPipe
def shard(partitions: Int): TypedPipe[T]

Used to force a shuffle into a given number of partitions. Only use this if your mappers are taking far longer than the time to shuffle.

Definition Classes
TypedPipe
def sketch[K, V](reducers: Int, eps: Double = 1.0E-5, delta: Double = 0.01, seed: Int = 12345)(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]], serialization: (K) ⇒ Array[Byte], ordering: Ordering[K]): Sketched[K, V]

Definition Classes
TypedPipe
def sum[U >: T](implicit plus: Semigroup[U]): ValuePipe[U]

Reasonably common shortcut for cases of associative/commutative reduction. Returns a typed pipe with only one element.

Definition Classes
TypedPipe
def sumByKey[K, V](implicit ev: <:<[T, (K, V)], ord: Ordering[K], plus: Semigroup[V]): TypedPipe[(K, V)]

Reasonably common shortcut for cases of associative/commutative reduction by Key

Definition Classes
TypedPipe
def sumByLocalKeys[K, V](implicit ev: <:<[T, (K, V)], sg: Semigroup[V]): TypedPipe[(K, V)]

This does a sum of values WITHOUT triggering a shuffle. The contract is: if followed by a group.sum, the result is the same with or without this present, and it never increases the number of items. BUT due to the cost of caching, it might not be faster if there is poor key locality.
It is only useful for expert tuning, and best avoided unless you are struggling with performance problems. If you are not sure you need this, you probably don't.
The main use case is to reduce the values down before a key expansion such as is often done in a data cube.

Definition Classes
TypedPipeInst → TypedPipe
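The sumByLocalKeys contract can be checked on a toy model. The sketch below uses a bounded cache of partial sums over an in-memory Seq; the cacheSize parameter, the oldest-first eviction policy, and the helper sumByKey (standing in for a later group.sum) are all assumptions for illustration and say nothing about how Scalding actually caches.

```scala
import scala.collection.mutable

object SumByLocalKeysSketch {
  // Partial sums with a bounded cache; evicted partial sums are emitted as-is.
  def sumByLocalKeys[K, V](items: Seq[(K, V)], cacheSize: Int)(plus: (V, V) => V): Seq[(K, V)] = {
    require(cacheSize > 0, "cacheSize must be positive")
    val cache = mutable.LinkedHashMap.empty[K, V]
    val out = Seq.newBuilder[(K, V)]
    items.foreach { case (k, v) =>
      cache.get(k) match {
        case Some(acc) => cache.update(k, plus(acc, v))
        case None =>
          if (cache.size >= cacheSize) {
            // Cache is full: emit the oldest partial sum to make room.
            val evicted = cache.head
            cache.remove(evicted._1)
            out += evicted
          }
          cache.update(k, v)
      }
    }
    out ++= cache // flush whatever is still cached
    out.result()
  }

  // Full reduction by key, standing in for a later group.sum.
  def sumByKey[K, V](items: Seq[(K, V)])(plus: (V, V) => V): Map[K, V] =
    items.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2).reduce(plus)) }

  def main(args: Array[String]): Unit = {
    val items = Seq(("a", 1), ("a", 2), ("b", 3))
    println(sumByLocalKeys(items, cacheSize = 1)(_ + _))
  }
}
```

With good key locality the local pass shrinks the data; with poor locality (keys interleaved and a small cache) it emits as many items as it received, which mirrors the warning about caching cost.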
def swap[K, V](implicit ev: <:<[T, (K, V)]): TypedPipe[(V, K)]

Definition Classes
TypedPipe
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toPipe[U >: T](fieldNames: Fields)(implicit setter: TupleSetter[U]): Pipe

This actually runs all the pure map functions in one Cascading Each. This approach is more efficient than untyped Scalding because we don't use TupleConverters/Setters after each map.

Definition Classes
TypedPipeInst → TypedPipe
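One way to picture running all the pure map functions in a single step is function fusion: map, filter and flatMap calls are composed into one function from an input record to zero or more outputs, and only that composed function is applied per record. The Fused class below is a hypothetical model of this idea written for this example; it is not Scalding's FlatMapFn and makes no claim about its internals.

```scala
object FusedFlatMapSketch {
  // Hypothetical fused stage: one function from an input record to its outputs.
  final case class Fused[A, B](run: A => Iterator[B]) {
    // Each operation composes onto the existing function instead of adding a stage.
    def map[C](f: B => C): Fused[A, C] = Fused[A, C](a => run(a).map(f))
    def filter(p: B => Boolean): Fused[A, B] = Fused[A, B](a => run(a).filter(p))
    def flatMap[C](f: B => Iterator[C]): Fused[A, C] = Fused[A, C](a => run(a).flatMap(f))
  }

  // Starting point: pass each record through unchanged.
  def start[A]: Fused[A, A] = Fused[A, A](a => Iterator.single(a))

  def main(args: Array[String]): Unit = {
    val stage = start[Int].map(_ + 1).filter(_ % 2 == 0).flatMap(x => Iterator(x, x))
    // The three operations run as one function per input record.
    Seq(1, 2, 3).iterator.flatMap(stage.run).foreach(println)
  }
}
```

In this model no intermediate record format is needed between the composed operations, which is the kind of per-step conversion overhead the description above says is avoided.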
def unpackToPipe[U >: T](fieldNames: Fields)(implicit up: TupleUnpacker[U]): Pipe

Definition Classes
TypedPipe
def values[V](implicit ev: <:<[T, Tuple2[_, V]]): TypedPipe[V]

Definition Classes
TypedPipe
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
def write(dest: TypedSink[T])(implicit flowDef: FlowDef, mode: Mode): TypedPipe[T]

Safely write to a TypedSink[T]. If you want to write to a Source (not a Sink) you need to do something like: toPipe(fieldNames).write(dest)
returns
a pipe equivalent to the current pipe.

Definition Classes
TypedPipe

final case class TypedPipeInst[T](inpipe: Pipe, fields: Fields, flatMapFn: FlatMapFn[T]) extends TypedPipe[T] with Product with Serializable

Inherited from Serializable

Inherited from Product

Inherited from Equals

Inherited from TypedPipe[T]

Inherited from Serializable

Inherited from AnyRef

Inherited from Any
