Registers a new listener.
Specifies whether the step should throw an error when one of its inputs is not connected.
Returns the output schema of the step at a given index by calling output() in preview mode and retrieving the schema from the DataFrame. Since the default implementation of this method calls output(), invoking it from output(), directly or indirectly, will result in an infinite cycle.
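The cycle described above can be illustrated with a minimal sketch. It uses plain Scala stand-ins rather than the library's real types: `Frame`, `Step`, `outSchema`, `Safe`, and `Cyclic` are hypothetical names introduced only for illustration, and the assumption that output() delegates to a compute() method is taken from the neighbouring member descriptions.

```scala
object SchemaSketch {
  // Hypothetical stand-in for a data frame: a schema plus some rows.
  final case class Frame(schema: Seq[String], rows: Seq[Seq[Any]])

  abstract class Step {
    // Assumed shape: output() delegates to compute().
    def output(index: Int, preview: Boolean): Frame = compute(index, preview)

    // Default schema lookup: evaluate in preview mode, then read the schema.
    def outSchema(index: Int): Seq[String] = output(index, preview = true).schema

    protected def compute(index: Int, preview: Boolean): Frame
  }

  // Safe: compute() never asks for the step's own schema.
  object Safe extends Step {
    protected def compute(index: Int, preview: Boolean): Frame =
      Frame(Seq("id", "name"), Seq(Seq(1, "a")))
  }

  // Unsafe: compute() calls outSchema, which calls output(), which calls compute() again.
  object Cyclic extends Step {
    protected def compute(index: Int, preview: Boolean): Frame =
      Frame(outSchema(index), Seq.empty)
  }
}
```

In this sketch `SchemaSketch.Safe.outSchema(0)` returns the schema normally, while `SchemaSketch.Cyclic.outSchema(0)` would recurse without terminating. A step that needs its own schema inside the computation would have to override the schema method so that it no longer goes through output().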
Computes a step output value at the specified index. This method is invoked from output() and can safely throw any exception, which will be wrapped into ExecutionException.
the output value index.
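A rough sketch of how output() might wrap failures coming from this method. Everything here is a plain Scala stand-in: the `ExecutionException` case class, the `Step` class, the `Failing` object, and the error message are hypothetical and only mirror the behaviour described above, not the library's actual implementation.

```scala
import scala.util.control.NonFatal

object WrapSketch {
  // Hypothetical stand-in for the library's ExecutionException.
  final case class ExecutionException(message: String, cause: Throwable)
    extends RuntimeException(message, cause)

  abstract class Step {
    // May throw any exception; callers go through output().
    protected def compute(index: Int): String

    // Invokes compute() and wraps any non-fatal failure.
    final def output(index: Int): String =
      try compute(index)
      catch {
        case e: ExecutionException => throw e // already wrapped, rethrow as-is
        case NonFatal(e) =>
          throw ExecutionException(s"Step failed at output $index", e)
      }
  }

  // A step whose computation always fails with an unrelated exception type.
  object Failing extends Step {
    protected def compute(index: Int): String =
      throw new IllegalStateException("input not connected")
  }
}
```

With this arrangement callers only need to handle one exception type, and the original failure stays available through `getCause`.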
Returns the implicit SQLContext.
Evaluates all of the step's outputs and returns a list of results.
The maximum number of input ports.
Takes one row from the data frame if preview is true.
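The behaviour can be sketched on an ordinary Scala collection. The `limit` helper below is a hypothetical name, and a `Seq` stands in for the data frame. On a Spark DataFrame the analogous call would presumably be `limit(1)`.

```scala
object PreviewSketch {
  // Keeps only the first row when preview is true; otherwise returns all rows.
  def limit[A](rows: Seq[A], preview: Boolean): Seq[A] =
    if (preview) rows.take(1) else rows
}
```

Truncating to a single row keeps preview runs cheap while still producing a frame with the full schema.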
Returns the output schema of the step at index 0.
Returns the output schema of the step at a given index.
Shortcut for output(0). Computes the step output at index 0.
in case of an error, or if the step is not connected.
Computes a step output value at the specified index.
the output value index.
in case of an error, or if the step is not connected.
The number of output ports.
Converts the data frame into an RDD[(key, value)] where key is defined by the group field indices, and value is computed for each row by the supplied function.
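The key/value construction can be sketched without Spark, with a `Seq` of rows standing in for the data frame and the resulting RDD. The names `Row`, `toPairs`, `groupIndices`, and `f` are hypothetical and chosen only for this illustration.

```scala
object KeyValueSketch {
  // A row is modelled as a plain sequence of field values.
  type Row = Seq[Any]

  // Builds (key, value) pairs: the key collects the values found at the
  // grouping indices, and the value is computed per row by the supplied function.
  def toPairs[V](rows: Seq[Row], groupIndices: Seq[Int])(f: Row => V): Seq[(Seq[Any], V)] =
    rows.map(row => (groupIndices.map(i => row(i)), f(row)))
}
```

For example, `KeyValueSketch.toPairs(rows, Seq(0))(r => r(2))` keys each row by its first field and uses the third field as the value. With a real pair RDD, the keyed form is what enables per-key operations such as grouping or reducing.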
Unregisters a listener.
Clears the cache of this step and optionally that of its predecessors and descendants.
Returns the implicit SparkContext.
Converts a data frame into a pair RDD[(key, data)], where key is the row key as defined by the set of grouping fields, and data is a LabeledPoint, as defined by label and fields.
Converts a data frame into a pair RDD[(key, data)], where key is the row key as defined by the set of grouping fields, and data is defined by the set of data fields from the original row.
Converts a data frame into a pair RDD[(key, data)], where key is the row key as defined by the set of grouping fields, and data is a data Vector, as defined by fields.
Wraps exceptions into ExecutionException instances.
Calculates column-based statistics using the MLlib library.