com.twitter.scalding.typed

IterablePipe

final case class IterablePipe[T](iterable: Iterable[T]) extends TypedPipe[T] with Product with Serializable

You should use a view here. If you avoid toPipe, this class is more efficient than IterableSource.
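A minimal sketch of wrapping an in-memory collection and transforming it without calling toPipe. It assumes the Scalding library for this version is on the classpath, that `Config.default` and `Local(true)` are acceptable settings for a local run, and that `Execution.waitFor` is available as in Scalding releases that provide `Execution`. The object and value names are hypothetical.

```scala
import com.twitter.scalding.{Config, Local}
import com.twitter.scalding.typed.{IterablePipe, TypedPipe}
import scala.util.Try

object IterablePipeSketch {
  // Wrap an in-memory Iterable directly (hypothetical sample data).
  val pipe: TypedPipe[Int] = IterablePipe(List(1, 2, 3, 4))

  // filter and map stay in the typed API, so no raw cascading Pipe is built here.
  val evensDoubled: TypedPipe[Int] = pipe.filter(_ % 2 == 0).map(_ * 2)

  // Nothing runs until the Execution is run; waitFor blocks for the result.
  // Assumption: local mode with the default Config is sufficient for this data.
  def run(): Try[List[Int]] =
    evensDoubled.toIteratorExecution
      .map(_.toList)
      .waitFor(Config.default, Local(true))
}
```

Calling `IterablePipeSketch.run()` returns a `Try` holding the transformed items, or the failure if the local run did not succeed.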

Linear Supertypes
Serializable, Product, Equals, TypedPipe[T], Serializable, AnyRef, Any

Instance Constructors

  1. new IterablePipe(iterable: Iterable[T])

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. def ++[U >: T](other: TypedPipe[U]): TypedPipe[U]

    Merge two TypedPipes (no order is guaranteed). This is only realized when a group (or join) is performed.

    Definition Classes
    IterablePipe → TypedPipe
  5. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  6. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  7. def addTrap[U >: T](trapSink: Source with TypedSink[T])(implicit conv: TupleConverter[U]): TypedPipe[U]

    Definition Classes
    TypedPipe
  8. def aggregate[B, C](agg: Aggregator[T, B, C]): ValuePipe[C]

    Same as groupAll.aggregate.values

    Definition Classes
    IterablePipe → TypedPipe
  9. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  10. def asKeys[U >: T](implicit ord: Ordering[U]): Grouped[U, Unit]

    Put the items in this pipe into the keys, with Unit as the value in a Grouped. In some sense, this is the dual of groupAll.

    Definition Classes
    TypedPipe
    Annotations
    @implicitNotFound( ... )
  11. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. def collect[U](fn: PartialFunction[T, U]): TypedPipe[U]

    Filter and map. See scala.collection.List.collect. For example: collect { case Some(x) => fn(x) }

    Definition Classes
    TypedPipe
  13. def cross[U](tiny: TypedPipe[U]): TypedPipe[(T, U)]

    Definition Classes
    IterablePipe → TypedPipe
  14. def cross[V](p: ValuePipe[V]): TypedPipe[(T, V)]

    Attach a ValuePipe to each element of this TypedPipe.

    Definition Classes
    TypedPipe
  15. def debug: TypedPipe[T]

    Prints the current pipe to stdout.

    Definition Classes
    TypedPipe
  16. def distinct(implicit ord: Ordering[_ >: T]): TypedPipe[T]

    Returns the set of distinct elements in the TypedPipe.

    Definition Classes
    TypedPipe
    Annotations
    @implicitNotFound( ... )
  17. def distinctBy[U](fn: (T) ⇒ U, numReducers: Option[Int] = None)(implicit ord: Ordering[_ >: U]): TypedPipe[T]

    Returns the set of distinct elements identified by a given lambda extractor in the TypedPipe.

    Definition Classes
    TypedPipe
    Annotations
    @implicitNotFound( ... )
  18. def either[R](that: TypedPipe[R]): TypedPipe[Either[T, R]]

    Merge two TypedPipes of different types by using Either.

    Definition Classes
    TypedPipe
  19. def eitherValues[K, V, R](that: TypedPipe[(K, R)])(implicit ev: <:<[T, (K, V)]): TypedPipe[(K, Either[V, R])]

    Sometimes useful for implementing custom joins with groupBy + mapValueStream when you know that the value/key can fit in memory. Beware.

    Definition Classes
    TypedPipe
  20. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  21. def filter(f: (T) ⇒ Boolean): TypedPipe[T]

    Keep only items that satisfy this predicate.

    Definition Classes
    IterablePipe → TypedPipe
  22. def filterKeys[K](fn: (K) ⇒ Boolean)(implicit ev: <:<[T, (K, Any)]): TypedPipe[T]

    If T is a (K, V) for some V, then we can use this function to filter. This is here to match the function in KeyedListLike, where it is optimized.

    Definition Classes
    TypedPipe
  23. def filterNot(f: (T) ⇒ Boolean): TypedPipe[T]

    Keep only items that don't satisfy the predicate. filterNot is the same as filter with a negated predicate.

    Definition Classes
    TypedPipe
  24. def filterWithValue[U](value: ValuePipe[U])(f: (T, Option[U]) ⇒ Boolean): TypedPipe[T]

    Common pattern of attaching a value and then filtering.

    Definition Classes
    TypedPipe
  25. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  26. def flatMap[U](f: (T) ⇒ TraversableOnce[U]): TypedPipe[U]

    When flatMap is called on an IterablePipe, we defer to make sure that f is applied lazily, which avoids OOM issues when the returned value from the map is larger than the input.

    Definition Classes
    IterablePipe → TypedPipe
  27. def flatMapWithValue[U, V](value: ValuePipe[U])(f: (T, Option[U]) ⇒ TraversableOnce[V]): TypedPipe[V]

    Common pattern of attaching a value and then calling flatMap.

    Definition Classes
    TypedPipe
  28. def flatten[U](implicit ev: <:<[T, TraversableOnce[U]]): TypedPipe[U]

    Flatten an Iterable.

    Definition Classes
    TypedPipe
  29. def flattenValues[K, U](implicit ev: <:<[T, (K, TraversableOnce[U])]): TypedPipe[(K, U)]

    Flatten just the values. This is more useful on KeyedListLike, but is added here to reduce asymmetry in the APIs.

    Definition Classes
    TypedPipe
  30. def forceToDisk: IterablePipe[T]

    Force a materialization of this pipe prior to the next operation. This is useful if you filter almost everything before a hashJoin, for instance.

    Definition Classes
    IterablePipe → TypedPipe
  31. def forceToDiskExecution: Execution[TypedPipe[T]]

    Definition Classes
    IterablePipe → TypedPipe
  32. def fork: TypedPipe[T]

    If you are going to create two branches or forks, it may be more efficient to call this method first, which will create a node in the cascading graph. Without this, both full branches of the fork will be put into separate cascading pipes, which can, in some cases, be slower.

    Ideally the planner would see this

    Definition Classes
    IterablePipe → TypedPipe
  33. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  34. def group[K, V](implicit ev: <:<[T, (K, V)], ord: Ordering[K]): Grouped[K, V]

    This is the default means of grouping all pairs with the same key. Generally this triggers 1 Map/Reduce transition.

    Definition Classes
    TypedPipe
  35. def groupAll: Grouped[Unit, T]

    Send all items to a single reducer.

    Definition Classes
    TypedPipe
  36. def groupBy[K](g: (T) ⇒ K)(implicit ord: Ordering[K]): Grouped[K, T]

    Given a key function, add the key, then call .group

    Definition Classes
    TypedPipe
  37. def groupRandomly(partitions: Int): Grouped[Int, T]

    Forces a shuffle by randomly assigning each item into one of the partitions.

    This is for the case where your mappers take a long time, and it is faster to shuffle them to more reducers and then operate.

    You probably want shard if you are just forcing a shuffle.

    Definition Classes
    TypedPipe
  38. def hashCogroup[K, V, W, R](smaller: HashJoinable[K, W])(joiner: (K, V, Iterable[W]) ⇒ Iterator[R])(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]]): TypedPipe[(K, R)]

    These operations look like joins, but they do not force any communication of the current TypedPipe. They are mapping operations where this pipe is streamed through one item at a time.

    WARNING These behave semantically very differently than cogroup. This is because we handle (K,V) tuples on the left as we see them. The iterable on the right is over all elements with a matching key K, and it may be empty if there are no values for this key K.

    Definition Classes
    TypedPipe
  39. def hashJoin[K, V, W](smaller: HashJoinable[K, W])(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]]): TypedPipe[(K, (V, W))]

    Do an inner join without shuffling this TypedPipe, but replicating the argument to all tasks.

    Definition Classes
    TypedPipe
  40. def hashLeftJoin[K, V, W](smaller: HashJoinable[K, W])(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]]): TypedPipe[(K, (V, Option[W]))]

    Do a left join without shuffling this TypedPipe, but replicating the argument to all tasks.

    Definition Classes
    TypedPipe
  41. def hashLookup[K >: T, V](grouped: HashJoinable[K, V]): TypedPipe[(K, Option[V])]

    For each element, do a map-side (hash) left join to look up a value.

    Definition Classes
    TypedPipe
  42. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  43. val iterable: Iterable[T]

  44. def keys[K](implicit ev: <:<[T, (K, Any)]): TypedPipe[K]

    Just keep the keys, or ._1 (if this type is a Tuple2)

    Definition Classes
    TypedPipe
  45. def leftCross[V](thatPipe: TypedPipe[V]): TypedPipe[(T, Option[V])]

    Uses hashJoin but attaches None if thatPipe is empty.

    Definition Classes
    TypedPipe
  46. def leftCross[V](p: ValuePipe[V]): TypedPipe[(T, Option[V])]

    ValuePipe may be empty, so this attaches it as an Option. cross is the same as leftCross(p).collect { case (t, Some(v)) => (t, v) }

    Definition Classes
    TypedPipe
  47. def limit(count: Int): TypedPipe[T]

    Limit the output to at most count items. Useful for debugging, but probably that's about it. The number of items may be less than count, and the items are not sampled by any particular method.

    Definition Classes
    IterablePipe → TypedPipe
  48. def map[U](f: (T) ⇒ U): TypedPipe[U]

    When map is called on an IterablePipe, we defer to make sure that f is applied lazily, which avoids OOM issues when the returned value from the map is larger than the input.

    Definition Classes
    IterablePipe → TypedPipe
  49. def mapValues[K, V, U](f: (V) ⇒ U)(implicit ev: <:<[T, (K, V)]): TypedPipe[(K, U)]

    Transform only the values (sometimes requires giving the types explicitly due to Scala type inference).

    Definition Classes
    TypedPipe
  50. def mapWithValue[U, V](value: ValuePipe[U])(f: (T, Option[U]) ⇒ V): TypedPipe[V]

    Common pattern of attaching a value and then mapping.

    Definition Classes
    TypedPipe
  51. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  52. final def notify(): Unit

    Definition Classes
    AnyRef
  53. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  54. def onRawSingle(onPipe: (Pipe) ⇒ Pipe): TypedPipe[T]

    Attributes
    protected
    Definition Classes
    TypedPipe
  55. def partition(p: (T) ⇒ Boolean): (TypedPipe[T], TypedPipe[T])

    Partitions this into two pipes according to a predicate.

    Sometimes what you really want is a groupBy in these cases.

    Definition Classes
    TypedPipe
  56. def raiseTo[U](implicit ev: <:<[T, U]): TypedPipe[U]

    If T <:< U, then this is safe to treat as TypedPipe[U] due to covariance.

    Attributes
    protected
    Definition Classes
    TypedPipe
  57. def sample(percent: Double, seed: Long): TypedPipe[T]

    Definition Classes
    TypedPipe
  58. def sample(percent: Double): TypedPipe[T]

    Definition Classes
    TypedPipe
  59. def shard(partitions: Int): TypedPipe[T]

    Used to force a shuffle into a given size of nodes. Only use this if your mappers are taking far longer than the time to shuffle.

    Definition Classes
    TypedPipe
  60. def sketch[K, V](reducers: Int, eps: Double = 1.0E-5, delta: Double = 0.01, seed: Int = 12345)(implicit ev: <:<[TypedPipe[T], TypedPipe[(K, V)]], serialization: (K) ⇒ Array[Byte], ordering: Ordering[K]): Sketched[K, V]

    Build a sketch of this TypedPipe so that you can do a skew-join with another Grouped.

    Definition Classes
    TypedPipe
  61. def sum[U >: T](implicit plus: Semigroup[U]): ValuePipe[U]

    Reasonably common shortcut for cases of associative/commutative reduction. Returns a typed pipe with only one element.

    Definition Classes
    IterablePipe → TypedPipe
  62. def sumByKey[K, V](implicit ev: <:<[T, (K, V)], ord: Ordering[K], plus: Semigroup[V]): UnsortedGrouped[K, V]

    Reasonably common shortcut for cases of associative/commutative reduction by key.

    Definition Classes
    TypedPipe
  63. def sumByLocalKeys[K, V](implicit ev: <:<[T, (K, V)], sg: Semigroup[V]): IterablePipe[(K, V)]

    This does a sum of values WITHOUT triggering a shuffle. The contract is: if followed by a group.sum, the result is the same with or without this present, and it never increases the number of items. BUT due to the cost of caching, it might not be faster if there is poor key locality.

    It is only useful for expert tuning, and best avoided unless you are struggling with performance problems. If you are not sure you need this, you probably don't.

    The main use case is to reduce the values down before a key expansion such as is often done in a data cube.

    Definition Classes
    IterablePipe → TypedPipe
  64. def swap[K, V](implicit ev: <:<[T, (K, V)]): TypedPipe[(V, K)]

    Swap the keys with the values.

    Definition Classes
    TypedPipe
  65. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  66. def toIteratorExecution: Execution[Iterator[T]]

    Definition Classes
    IterablePipe → TypedPipe
  67. def toPipe[U >: T](fieldNames: Fields)(implicit flowDef: FlowDef, mode: Mode, setter: TupleSetter[U]): Pipe

    Export back to a raw cascading Pipe. Useful for interop with the scalding Fields API or with Cascading code.

    Definition Classes
    IterablePipe → TypedPipe
  68. def unpackToPipe[U >: T](fieldNames: Fields)(implicit fd: FlowDef, mode: Mode, up: TupleUnpacker[U]): Pipe

    Use a TupleUnpacker to flatten U out into a cascading Tuple.

    Definition Classes
    TypedPipe
  69. def values[V](implicit ev: <:<[T, (Any, V)]): TypedPipe[V]

    Just keep the values, or ._2 (if this type is a Tuple2)

    Definition Classes
    TypedPipe
  70. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  71. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  72. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  73. def write(dest: TypedSink[T])(implicit flowDef: FlowDef, mode: Mode): TypedPipe[T]

    Safely write to a TypedSink[T]. If you want to write to a Source (not a Sink) you need to do something like: toPipe(fieldNames).write(dest)

    returns

    a pipe equivalent to the current pipe.

    Definition Classes
    TypedPipe
  74. def writeExecution(dest: TypedSink[T]): Execution[Unit]

    This is the functionally pure approach to building jobs. Note that you have to call run on the result for anything to happen here.

    Definition Classes
    TypedPipe
  75. def writeThrough[U >: T](dest: TypedSink[T] with TypedSource[U]): Execution[TypedPipe[U]]

    If you want to write to a specific location, and then read from that location going forward, use this.

    Definition Classes
    TypedPipe
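
Several member descriptions above state equivalences: collect as "filter and map", filterNot as filter with a negated predicate, cross in terms of leftCross, and the sumByLocalKeys contract. The sketch below models those statements with standard Scala collections only; it is not Scalding code and says nothing about how the library implements them. A ValuePipe is modeled as an Option, each chunk of two pairs stands in for the items one mapper might see, and all names are hypothetical.

```scala
object TypedPipeEquivalences {
  // collect: filter and map in one step.
  val items: List[Option[Int]] = List(Some(1), None, Some(3))
  val viaCollect: List[Int] = items.collect { case Some(x) => x * 10 }
  val viaFilterMap: List[Int] = items.filter(_.isDefined).map(_.get * 10)

  // filterNot(p) keeps the same items as filter with the negated predicate.
  val isEven: Int => Boolean = _ % 2 == 0
  val nums: List[Int] = List(1, 2, 3, 4)
  val viaFilterNot: List[Int] = nums.filterNot(isEven)
  val viaNegated: List[Int] = nums.filter(n => !isEven(n))

  // leftCross attaches the (possibly empty) value as an Option;
  // cross is leftCross followed by collect on the Some case.
  def leftCross[T, V](ts: List[T], p: Option[V]): List[(T, Option[V])] =
    ts.map(t => (t, p))
  def cross[T, V](ts: List[T], p: Option[V]): List[(T, V)] =
    leftCross(ts, p).collect { case (t, Some(v)) => (t, v) }

  // Full group-and-sum, standing in for group.sum.
  def groupSum(kvs: Seq[(String, Int)]): Map[String, Int] =
    kvs.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }

  // sumByLocalKeys contract: a partial sum per chunk, followed by a full
  // group-and-sum, matches the full group-and-sum alone and never adds items.
  val pairs: List[(String, Int)] =
    List("a" -> 1, "a" -> 3, "b" -> 2, "b" -> 4, "a" -> 5)
  val locallySummed: List[(String, Int)] =
    pairs.grouped(2).flatMap(chunk => groupSum(chunk)).toList
}
```

With good key locality, as in the sample data, the local sum shrinks the input before the final group; with poor locality it would leave the item count unchanged, which is the case where the description warns it might not be faster.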
