Package

org.apache.spark.sql.execution

python

package python

Type Members

  1. case class BatchEvalPythonExec(udfs: Seq[PythonUDF], output: Seq[Attribute], child: SparkPlan) extends SparkPlan with Product with Serializable

    A physical plan that evaluates a PythonUDF, one partition of tuples at a time.

    Python evaluation works by sending the necessary (projected) input data via a socket to an external Python process and combining the result from the Python process with the original row.

    For each row we send to Python, we also put it in a queue. For each output row from Python, we drain the queue to find the original input row. Note that if the Python process is too slow, the queue can grow without bound and the operator can eventually run out of memory.

  2. case class PythonUDF(name: String, func: PythonFunction, dataType: DataType, children: Seq[Expression]) extends Expression with Unevaluable with NonSQLExpression with Product with Serializable

    A serialized version of a Python lambda function.

  3. case class UserDefinedPythonFunction(name: String, func: PythonFunction, dataType: DataType) extends Product with Serializable

    A user-defined Python function. This is used by the Python API.
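The queue-based pairing described for BatchEvalPythonExec can be illustrated with a small plain-Python sketch. This is not Spark's implementation: the names `batch_eval`, `project`, and `udf` are hypothetical, rows are modeled as tuples, and the external Python process is replaced by an in-process generator that returns results in input order.

```python
from collections import deque


def batch_eval(rows, project, udf):
    """Yield each input row with the UDF result appended.

    Hypothetical helper: `project` picks the columns the UDF needs,
    `udf` stands in for the function run by the external Python process.
    """
    pending = deque()  # original rows awaiting a result, in send order

    def send_inputs():
        for row in rows:
            pending.append(row)   # remember the full original row
            yield project(row)    # only the projected columns are "sent"

    # Stand-in for the external process: one result per input, in order.
    results = (udf(*args) for args in send_inputs())

    for result in results:
        original = pending.popleft()  # oldest queued row matches this result
        yield original + (result,)


rows = [("a", 1, 10), ("b", 2, 20), ("c", 3, 30)]
joined = list(
    batch_eval(rows, project=lambda r: (r[1], r[2]), udf=lambda x, y: x + y)
)
print(joined)
```

Because this sketch is single-threaded and lazy, the queue never holds more than one row. The unbounded growth noted above can only arise when rows are sent faster than results come back, as with a separate, slow Python process.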

Value Members

  1. object EvaluatePython
