com.datawizards.sparklocal.session
Broadcast a read-only variable to the cluster, returning a broadcast object for reading it in distributed functions. The variable will be sent to each node only once.
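A minimal sketch of broadcasting a variable, assuming the `SparkSessionAPI` builder and `ExecutionEngine.ScalaEager` engine from the spark-local project README (these names are assumptions about that API, not verified signatures):

```scala
// Sketch: ScalaEager runs on plain Scala collections, no cluster required.
import com.datawizards.sparklocal.session.{ExecutionEngine, SparkSessionAPI}

val session = SparkSessionAPI.builder(ExecutionEngine.ScalaEager).getOrCreate()

// Ship a lookup table once; distributed functions read it through .value.
val lookup = session.broadcast(Map("a" -> 1, "b" -> 2))
val resolved = Seq("a", "b").map(k => lookup.value(k))
```

Reading the broadcast content only through `.value` mirrors Spark's `Broadcast` contract, so code written against the local session should behave the same on a real cluster.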
Create and register a CollectionAccumulator, which starts with an empty list and accumulates inputs by adding them into the list.
Create and register a CollectionAccumulator, which starts with an empty list and accumulates inputs by adding them into the list.
Create a new DataSet based on a Scala collection.
Create a new RDD based on a Scala collection.
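Both factory methods take an ordinary Scala collection; a short sketch, again assuming the builder API from the project README:

```scala
import com.datawizards.sparklocal.session.{ExecutionEngine, SparkSessionAPI}

val session = SparkSessionAPI.builder(ExecutionEngine.ScalaEager).getOrCreate()

// Wrap a Scala Seq as the library's DataSet and RDD abstractions.
val ds  = session.createDataset(Seq(1, 2, 3)) // DataSetAPI[Int]
val rdd = session.createRDD(Seq(1, 2, 3))     // RDDAPI[Int]

val total = rdd.collect().sum
```

With the Scala engines these wrappers stay in-process, so the same `createDataset`/`createRDD` calls can be swapped to the Spark engine without code changes.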
Create and register a double accumulator, which starts with 0 and accumulates inputs by add.
Create and register a double accumulator, which starts with 0 and accumulates inputs by add.
Create and register a long accumulator, which starts with 0 and accumulates inputs by add.
Create and register a long accumulator, which starts with 0 and accumulates inputs by add.
Returns a ReaderExecutor that can be used to read non-streaming data in as a DataSet.
Register the given accumulator.
Register the given accumulator with the given name.
Read a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI, and return it as an RDD of Strings.
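A sketch of `textFile`; the file path here is purely illustrative, and the builder names are assumptions from the project README:

```scala
import com.datawizards.sparklocal.session.{ExecutionEngine, SparkSessionAPI}

val session = SparkSessionAPI.builder(ExecutionEngine.ScalaEager).getOrCreate()

// Hypothetical local path; any Hadoop-supported URI works on the Spark engine.
val lines = session.textFile("data/sample.txt") // RDDAPI[String], one element per line
val nonEmpty = lines.filter(_.nonEmpty)
```

Because the result is an RDD of Strings, the usual transformations (`map`, `filter`, `flatMap`) apply directly to the lines.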
Create a new DataSet based on an RDD.
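A sketch of converting an RDD to a DataSet via this overload; whether it accepts the library's `RDDAPI` wrapper or a raw Spark `RDD` is an assumption to check against the scaladoc:

```scala
import com.datawizards.sparklocal.session.{ExecutionEngine, SparkSessionAPI}

val session = SparkSessionAPI.builder(ExecutionEngine.ScalaEager).getOrCreate()

val rdd = session.createRDD(Seq("x", "y"))
// Assumed overload: wrap the existing RDD's data as a DataSetAPI[String].
val ds = session.createDataset(rdd)
```

This mirrors `SparkSession.createDataset(rdd)` in Spark itself, letting RDD-based pipelines hand off to the typed DataSet API.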