Encapsulate an accumulator, similar to Hadoop counters.
Encapsulate context of one or more Accumulators in an SCollectionWithAccumulator.
Type class for T that can be used in an Accumulator.
Encapsulate arbitrary data that can be distributed to all workers.
Extra functions available on SCollections of Doubles through an implicit conversion.
Extra functions available on SCollections of (key, value) pairs through an implicit conversion.
A Scala wrapper for PCollection.
An enhanced SCollection that provides access to one or more Accumulators for some transforms. Accumulators are accessed via the additional AccumulatorContext argument.
An enhanced SCollection that uses an intermediate node to combine parts of the data to reduce load on the final global combine step.
An enhanced SCollection that uses an intermediate node to combine "hot" keys partially before performing the full combine.
An enhanced SCollection that provides access to one or more SideInputs for some transforms. SideInputs are accessed via the additional SideInputContext argument.
An enhanced SCollection that provides access to one or more SideOutputs for some transforms. SideOutputs are accessed via the additional SideOutputContext argument. SCollections of the SideOutputs are accessed via the additional SideOutputCollections return value.
Encapsulate an SCollection when it is being used as a side input.
Encapsulate context of one or more SideInputs in an SCollectionWithSideInput.
Encapsulate a side output for a transform.
Encapsulate output of one or more SideOutputs in an SCollectionWithSideOutput.
Encapsulate context of one or more SideOutputs in an SCollectionWithSideOutput.
Trait for setting custom names on transforms.
Convenience functions for creating SCollections.
Companion object for SideOutput.
A Scala wrapper for PCollection. Represents an immutable, partitioned collection of elements that can be operated on in parallel. This class contains the basic operations available on all SCollections, such as map, filter, and persist. In addition, PairSCollectionFunctions contains operations available only on SCollections of key-value pairs, such as groupByKey and join; DoubleSCollectionFunctions contains operations available only on SCollections of Doubles.
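
As a rough illustration of what these operations mean, the following sketch models map, filter, groupByKey and join on ordinary in-memory Scala collections. It does not use the library's API: the object name SCollectionSemanticsSketch and its helper functions are hypothetical, and a real SCollection is distributed and evaluated by a pipeline runner rather than held in a local Seq.

```scala
// Hypothetical local model of the operation semantics; not the library's API.
object SCollectionSemanticsSketch {

  // Element-wise operations, analogous to filter followed by map.
  def squaresOfEvens(xs: Seq[Int]): Seq[Int] =
    xs.filter(_ % 2 == 0).map(x => x * x)

  // Analogous to groupByKey: gather every value that shares a key.
  def groupByKey[K, V](kvs: Seq[(K, V)]): Map[K, Seq[V]] =
    kvs.groupBy(_._1).map { case (k, pairs) => k -> pairs.map(_._2) }

  // Analogous to an inner join: pair up values whose keys match on both sides.
  def join[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    left.flatMap { case (k, v) =>
      right.collect { case (k2, w) if k2 == k => (k, (v, w)) }
    }
}
```

The key-based helpers only make sense for collections of pairs, which mirrors why the corresponding operations are provided through PairSCollectionFunctions rather than on every SCollection.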