python

case class BatchEvalPythonExec(udfs: Seq[PythonUDF], output: Seq[Attribute], child: SparkPlan) extends SparkPlan with Product with Serializable

A physical plan that evaluates a PythonUDF, one partition of tuples at a time.
Python evaluation works by sending the necessary (projected) input data via a socket to an external Python process and combining the result from the Python process with the original row.
For each row we send to Python, we also put it in a queue. For each output row from Python, we drain the queue to find the original input row. Note that if the Python process is too slow, the queue could grow without bound and eventually exhaust memory.
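The queue-based pairing described above can be illustrated with a minimal, self-contained sketch. It is not Spark's implementation: `InputRow`, `JoinedRow`, `QueuePairingSketch`, and `fakePythonWorker` are hypothetical names introduced here, and the socket and external Python process are replaced by an in-process function that is assumed to return exactly one result per input, in the same order.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration only; these are not Spark classes.
final case class InputRow(id: Int, text: String)
final case class JoinedRow(input: InputRow, udfResult: Int)

object QueuePairingSketch {
  // Stand-in for the external Python worker (assumption): it receives only
  // the projected column and returns one result per input, in order.
  def fakePythonWorker(projected: Iterator[String]): Iterator[Int] =
    projected.map(_.length)

  def evaluate(rows: Iterator[InputRow]): Iterator[JoinedRow] = {
    // Holds every full row that has been sent but not yet matched to a result.
    val pending = mutable.Queue.empty[InputRow]

    // Remember the full row, then send only the projection the UDF needs.
    val projected = rows.map { row =>
      pending.enqueue(row)
      row.text
    }

    // Results arrive in input order, so the head of the queue is the
    // original row that belongs to the current result.
    fakePythonWorker(projected).map { result =>
      JoinedRow(pending.dequeue(), result)
    }
  }
}
```

In this single-threaded sketch the iterators run in lockstep, so the queue never holds more than one row. The unbounded growth mentioned above would only arise when rows are sent faster than results come back.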
case class PythonUDF(name: String, func: PythonFunction, dataType: DataType, children: Seq[Expression]) extends Expression with Unevaluable with NonSQLExpression with Product with Serializable

A serialized version of a Python lambda function.
case class UserDefinedPythonFunction(name: String, func: PythonFunction, dataType: DataType) extends Product with Serializable

A user-defined Python function. This is used by the Python API.