com.twitter.scalding.typed

UnsortedGrouped

trait UnsortedGrouped[K, +V] extends KeyedListLike[K, V, UnsortedGrouped] with HashJoinable[K, V] with WithReducers[UnsortedGrouped[K, V]]

This is the state after we have done some reducing. It is not possible to sort at this phase, but it is possible to do a CoGrouping or a HashJoin.

Linear Supertypes
WithReducers[UnsortedGrouped[K, V]], HashJoinable[K, V], KeyedPipe[K], CoGroupable[K, V], HasReducers, KeyedListLike[K, V, UnsortedGrouped], Serializable, AnyRef, Any

Abstract Value Members

  1. abstract def filterKeys(fn: (K) ⇒ Boolean): UnsortedGrouped[K, V]

    Filter keys on a predicate. More efficient than filter if you are only looking at keys.

    Definition Classes
    KeyedListLike
  2. abstract def joinFunction: (K, Iterator[Tuple], Seq[Iterable[Tuple]]) ⇒ Iterator[V]

    This function is not type-safe for others to call, but it should never have an error. By construction, we never call it with incorrect types. It would be preferable to have stronger type safety here, but it is unclear how to achieve that, and since this is an internal function, it is not clear that making it type-safe would actually help anyone.

    Attributes
    protected
    Definition Classes
    CoGroupable
  3. abstract def keyOrdering: Ordering[K]

    Definition Classes
    KeyedPipe
  4. abstract def mapGroup[V](smfn: (K, Iterator[V]) ⇒ Iterator[V]): UnsortedGrouped[K, V]

    Operate on an Iterator[T] of all the values for each key at one time. Avoid accumulating the whole list in memory if you can. Prefer sum, which is partially executed map-side by default.

    Definition Classes
    KeyedListLike
  5. abstract def mapped: TypedPipe[(K, Any)]

    Definition Classes
    KeyedPipe
  6. abstract def reducers: Option[Int]

    Definition Classes
    HasReducers
  7. abstract def toTypedPipe: TypedPipe[(K, V)]

    End of the operations on values. From this point on the keyed structure is lost, and another shuffle is generally required to reconstruct it.

    Definition Classes
    KeyedListLike
  8. abstract def withReducers(reds: Int): UnsortedGrouped[K, V]

    Never mutates this; instead returns a new item.

    Definition Classes
    WithReducers
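    The "returns a new item" contract can be illustrated with a small stand-in. FakeGrouped below is a hypothetical class invented for this sketch, not a Scalding type; it only shows the non-mutating style, assuming the reducer count is carried as an Option[Int] as in HasReducers.

```scala
object WithReducersSketch {
  // Hypothetical stand-in for a grouped pipe that carries a reducer setting.
  final case class FakeGrouped(reducers: Option[Int]) {
    // Returns a modified copy; the receiver is left untouched.
    def withReducers(reds: Int): FakeGrouped = copy(reducers = Some(reds))
  }
}
```

    Because the original value is unchanged, a partially configured pipe can be shared and tuned differently in separate branches of a job.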

Concrete Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def aggregate[B, C](agg: Aggregator[V, B, C]): UnsortedGrouped[K, C]

    Use an Algebird Aggregator to do the reduction.

    Definition Classes
    KeyedListLike
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. def cogroup[R1, R2](smaller: CoGroupable[K, R1])(fn: (K, Iterator[V], Iterable[R1]) ⇒ Iterator[R2]): CoGrouped[K, R2]

    "Smaller" refers to the average number of values per key, not to total size (total size does not matter directly, though the two are clearly related).

    Note that from the type signature we see that the right side is iterated (or may be) over and over, but the left side is not. That means that you want the side with fewer values per key on the right. If both sides are similar, no need to worry. If one side is a one-to-one mapping, that should be the "smaller" side.

    Definition Classes
    CoGroupable
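    The asymmetry between the two sides can be seen in a plain-collections model of the signature. This is only a sketch: CogroupSketch and its in-memory Seq inputs are hypothetical and stand in for pipes, and nothing here reflects how Scalding actually plans the job.

```scala
object CogroupSketch {
  // Hypothetical in-memory model of cogroup. For each key the left values are
  // exposed as a one-pass Iterator and the right ("smaller") values as a
  // re-iterable Iterable, mirroring (K, Iterator[V], Iterable[R1]) => Iterator[R2].
  def cogroup[K, V, R1, R2](left: Seq[(K, V)], smaller: Seq[(K, R1)])(
      fn: (K, Iterator[V], Iterable[R1]) => Iterator[R2]): Seq[(K, R2)] = {
    val l = left.groupBy(_._1)
    val r = smaller.groupBy(_._1)
    (l.keySet ++ r.keySet).toSeq.flatMap { k =>
      val lv = l.getOrElse(k, Seq.empty[(K, V)]).map(_._2)
      val rv = r.getOrElse(k, Seq.empty[(K, R1)]).map(_._2)
      fn(k, lv.iterator, rv).map(out => (k, out))
    }
  }

  // An inner join written against that signature: the left side is consumed
  // once, while the right side is traversed once per left value.
  def innerJoin[K, V, W](left: Seq[(K, V)], smaller: Seq[(K, W)]): Seq[(K, (V, W))] =
    cogroup(left, smaller) { (_, vs, ws) => vs.flatMap(v => ws.iterator.map(w => (v, w))) }
}
```

    In this model the right-hand Iterable is walked once for every left value, which is why the side with fewer values per key belongs on the right.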
  10. def count(fn: (V) ⇒ Boolean): UnsortedGrouped[K, Long]

    Definition Classes
    KeyedListLike
  11. def drop(n: Int): UnsortedGrouped[K, V]

    Selects all elements except the first n.

    Definition Classes
    KeyedListLike
  12. def dropWhile(p: (V) ⇒ Boolean): UnsortedGrouped[K, V]

    Drops the longest prefix of elements that satisfy the given predicate.

    Definition Classes
    KeyedListLike
  13. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  15. def filter(fn: ((K, V)) ⇒ Boolean): UnsortedGrouped[K, V]

    .filter(fn).toTypedPipe == .toTypedPipe.filter(fn). It is generally better to avoid going back to a TypedPipe for as long as possible: this minimizes the number of times we go in and out of cascading/hadoop types.

    Definition Classes
    KeyedListLike
  16. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. def foldLeft[B](z: B)(fn: (B, V) ⇒ B): UnsortedGrouped[K, B]

    Definition Classes
    KeyedListLike
  18. def forall(fn: (V) ⇒ Boolean): UnsortedGrouped[K, Boolean]

    Definition Classes
    KeyedListLike
  19. def forceToReducers: UnsortedGrouped[K, V]

    This is just shorthand for mapValueStream(identity); it makes sure the planner sees that you want to force a shuffle. For expert tuning.

    Definition Classes
    KeyedListLike
  20. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  21. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  22. def hashCogroupOn[V1, R](mapside: TypedPipe[(K, V1)])(joiner: (K, V1, Iterable[V]) ⇒ Iterator[R]): TypedPipe[(K, R)]

    This fully replicates this entire Grouped to the argument: mapside. This means that we never see the case where the key is absent in the pipe, so implementing a right-join (from the pipe) is impossible. Note that there is no reduce phase in this operation. Also, unlike a cogroup, for a fixed key each joiner will NOT see all the tuples with that key, because the keys on the left are distributed across many machines. See hashjoin: http://docs.cascading.org/cascading/2.0/javadoc/cascading/pipe/HashJoin.html

    Definition Classes
    HashJoinable
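    The consequences listed above (no right-join, joiner invoked per tuple rather than per key) follow from the shape of the operation. The sketch below models it with plain Scala collections; HashCogroupSketch and its Seq inputs are hypothetical stand-ins for the grouped side and the mapside pipe, not Scalding code.

```scala
object HashCogroupSketch {
  def hashCogroupOn[K, V, V1, R](grouped: Seq[(K, V)], mapside: Seq[(K, V1)])(
      joiner: (K, V1, Iterable[V]) => Iterator[R]): Seq[(K, R)] = {
    // The grouped side is replicated in full, here as an in-memory lookup table.
    val replicated: Map[K, Seq[V]] =
      grouped.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
    // The joiner runs once per mapside tuple, never once per key, and keys
    // that appear only on the grouped side are never visited.
    mapside.flatMap { case (k, v1) =>
      joiner(k, v1, replicated.getOrElse(k, Seq.empty[V])).map(r => (k, r))
    }
  }

  // A left join from the mapside pipe's point of view.
  def leftJoin[K, V, V1](grouped: Seq[(K, V)], mapside: Seq[(K, V1)]): Seq[(K, (V1, Option[V]))] =
    hashCogroupOn(grouped, mapside) { (_, v1, vs) =>
      if (vs.isEmpty) Iterator((v1, Option.empty[V]))
      else vs.iterator.map(v => (v1, Option(v)))
    }
}
```

    Note that a key present only on the grouped side produces no output at all, and that a key repeated in mapside triggers the joiner once per occurrence.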
  23. def head: UnsortedGrouped[K, V]

    Use this to get the first value encountered. Prefer this to take(1).

    Definition Classes
    KeyedListLike
  24. def inputs: List[TypedPipe[(K, Any)]]

    A HashJoinable has a single input into the cogroup.

    Definition Classes
    HashJoinable → CoGroupable
  25. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  26. def join[W](smaller: CoGroupable[K, W]): CoGrouped[K, (V, W)]

    Definition Classes
    CoGroupable
  27. def keys: TypedPipe[K]

    Definition Classes
    KeyedListLike
  28. def leftJoin[W](smaller: CoGroupable[K, W]): CoGrouped[K, (V, Option[W])]

    Definition Classes
    CoGroupable
  29. def mapValueStream[V](smfn: (Iterator[V]) ⇒ Iterator[V]): UnsortedGrouped[K, V]

    Use this when you don't care about the key for the group; otherwise use mapGroup.

    Definition Classes
    KeyedListLike
  30. def mapValues[V](fn: (V) ⇒ V): UnsortedGrouped[K, V]

    This is a special case of mapValueStream, but can be optimized because it doesn't need all the values for a given key at once. An unoptimized implementation is: mapValueStream { _.map { fn } } but for Grouped we can avoid resorting to mapValueStream.

    Definition Classes
    KeyedListLike
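    The relationship between the two methods can be written out against a toy model. Here a grouped pipe is approximated by a Map from key to value list; the Grouped alias and the MapValuesSketch object are hypothetical and exist only for this illustration.

```scala
object MapValuesSketch {
  // Hypothetical in-memory stand-in for a grouped pipe.
  type Grouped[K, V] = Map[K, List[V]]

  // Transforms the whole value stream of each key at once.
  def mapValueStream[K, V, U](g: Grouped[K, V])(smfn: Iterator[V] => Iterator[U]): Grouped[K, U] =
    g.map { case (k, vs) => k -> smfn(vs.iterator).toList }

  // The unoptimized definition quoted above: map each value independently
  // by going through mapValueStream.
  def mapValues[K, V, U](g: Grouped[K, V])(fn: V => U): Grouped[K, U] =
    mapValueStream(g)(_.map(fn))
}
```

    Since fn looks at one value at a time, an implementation is free to apply it without first collecting all values for a key, which is the optimization the description refers to.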
  31. def max[B >: V](implicit cmp: Ordering[B]): UnsortedGrouped[K, V]

    Definition Classes
    KeyedListLike
  32. def maxBy[B](fn: (V) ⇒ B)(implicit cmp: Ordering[B]): UnsortedGrouped[K, V]

    Definition Classes
    KeyedListLike
  33. def min[B >: V](implicit cmp: Ordering[B]): UnsortedGrouped[K, V]

    Definition Classes
    KeyedListLike
  34. def minBy[B](fn: (V) ⇒ B)(implicit cmp: Ordering[B]): UnsortedGrouped[K, V]

    Definition Classes
    KeyedListLike
  35. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  36. final def notify(): Unit

    Definition Classes
    AnyRef
  37. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  38. def outerJoin[W](smaller: CoGroupable[K, W]): CoGrouped[K, (Option[V], Option[W])]

    Definition Classes
    CoGroupable
  39. def product[U >: V](implicit ring: Ring[U]): UnsortedGrouped[K, U]

    Definition Classes
    KeyedListLike
  40. def reduce[U >: V](fn: (U, U) ⇒ U): UnsortedGrouped[K, U]

    Reduce with fn, which must be associative and commutative. Like the above, this can be optimized in some Grouped cases. If you don't have a commutative operator, use reduceLeft.

    Definition Classes
    KeyedListLike
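    Why the operator's algebraic properties matter can be shown with a rough model of partial reduction. In the sketch below, chunkedReduce is a hypothetical helper that reduces values in chunks and then combines the partial results; the chunk size is an assumed stand-in for how values might be split before the final reduce. It demonstrates only the regrouping of operands, not reordering.

```scala
object ReduceSketch {
  // Hypothetical model of partial reduction for one key: values are split
  // into chunks, each chunk is reduced, then the partial results are combined.
  def chunkedReduce[U](values: List[U], chunkSize: Int)(fn: (U, U) => U): U =
    values.grouped(chunkSize).map(_.reduce(fn)).reduce(fn)

  val values = List(1, 2, 3, 4, 5)

  // Addition is associative, so regrouping does not change the answer.
  val plusChunked = chunkedReduce(values, 2)(_ + _)
  val plusLeft = values.reduceLeft(_ + _)

  // Subtraction is not associative, so the chunked result differs from a
  // strict left-to-right reduction.
  val minusChunked = chunkedReduce(values, 2)(_ - _)
  val minusLeft = values.reduceLeft(_ - _)
}
```

    With addition both strategies agree, while with subtraction they diverge, which is the situation reduceLeft is meant for.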
  41. def reduceLeft[U >: V](fn: (U, U) ⇒ U): UnsortedGrouped[K, U]

    Definition Classes
    KeyedListLike
  42. def rightJoin[W](smaller: CoGroupable[K, W]): CoGrouped[K, (Option[V], W)]

    Definition Classes
    CoGroupable
  43. def scanLeft[B](z: B)(fn: (B, V) ⇒ B): UnsortedGrouped[K, B]

    Definition Classes
    KeyedListLike
  44. def size: UnsortedGrouped[K, Long]

    Definition Classes
    KeyedListLike
  45. def sortWithTake[U >: V](k: Int)(lessThan: (U, U) ⇒ Boolean): UnsortedGrouped[K, Seq[V]]

    Like the above, but with a less-than operation for the ordering.

    Definition Classes
    KeyedListLike
  46. def sortedReverseTake(k: Int)(implicit ord: Ordering[_ >: V]): UnsortedGrouped[K, Seq[V]]

    Take the largest k things according to the implicit ordering. Useful for top-k without having to call ord.reverse.

    Definition Classes
    KeyedListLike
  47. def sortedTake(k: Int)(implicit ord: Ordering[_ >: V]): UnsortedGrouped[K, Seq[V]]

    This implements bottom-k (smallest k items) on each mapper for each key, then sends those to reducers to get the result. This is faster than using .take if k * (number of Keys) is small enough to fit in memory.

    Definition Classes
    KeyedListLike
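    The two-stage plan described above can be sketched for a single key with plain collections. The partitions argument is an assumption standing in for the values each mapper happens to see; SortedTakeSketch is a hypothetical name and the code makes no claim about Scalding's actual data structures.

```scala
object SortedTakeSketch {
  // Bottom-k of one collection: the k smallest values under ord.
  def bottomK[V](values: Seq[V], k: Int)(implicit ord: Ordering[V]): Seq[V] =
    values.sorted.take(k)

  // Hypothetical two-stage plan for one key: each "mapper" partition keeps
  // only its own k smallest values, and the "reducer" merges the survivors.
  def sortedTake[V](partitions: Seq[Seq[V]], k: Int)(implicit ord: Ordering[V]): Seq[V] =
    bottomK(partitions.flatMap(p => bottomK(p, k)), k)
}
```

    Each partition forwards at most k values per key, so the data crossing the shuffle is bounded by k per key per mapper, and the merged result matches the bottom-k of the full value list.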
  48. def sum[U >: V](implicit sg: Semigroup[U]): UnsortedGrouped[K, U]

    If there is no ordering, we default to assuming the Semigroup is commutative. If you don't want that, define an ordering on the Values, or use .forceToReducers.

    Semigroups MAY have a faster implementation of sum for iterators, so prefer using sum/sumLeft to reduce/reduceLeft.

    Definition Classes
    KeyedListLike
  49. def sumLeft[U >: V](implicit sg: Semigroup[U]): UnsortedGrouped[K, U]

    Semigroups MAY have a faster implementation of sum for iterators, so prefer using sum/sumLeft to reduce/reduceLeft.

    Definition Classes
    KeyedListLike
  50. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  51. def take(n: Int): UnsortedGrouped[K, V]

    Selects the first n elements. Don't use this if n == 1; head is faster in that case.

    Definition Classes
    KeyedListLike
  52. def takeWhile(p: (V) ⇒ Boolean): UnsortedGrouped[K, V]

    Takes the longest prefix of elements that satisfy the given predicate.

    Definition Classes
    KeyedListLike
  53. def toList: UnsortedGrouped[K, List[V]]

    Definition Classes
    KeyedListLike
  54. def toSet[U >: V]: UnsortedGrouped[K, Set[U]]

    Definition Classes
    KeyedListLike
  55. def toString(): String

    Definition Classes
    AnyRef → Any
  56. def values: TypedPipe[V]

    Definition Classes
    KeyedListLike
  57. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  58. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  59. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
