A physical plan that evaluates a PythonUDF, one partition of tuples at a time.
Python evaluation works by sending the necessary (projected) input data via a socket to an
external Python process and combining the result from the Python process with the original row.
For each row we send to Python, we also put it in a queue. For each output row from Python,
we drain the queue to find the original input row. Note that if the Python process is too
slow, the queue could grow without bound and eventually exhaust memory.
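The queue-based pairing described above can be illustrated with a minimal sketch. This is not the actual Spark implementation: `QueuePairingSketch`, `Row`, `fakePythonWorker`, and `evaluatePartition` are hypothetical names, and the external Python process is replaced by a local function that returns one result per input, in order. Because this sketch is single-threaded, the queue never holds more than one row; in the design described above the sender can run ahead of a slow Python process, which is where the unbounded growth comes from.

```scala
import scala.collection.mutable

// Hypothetical sketch: none of these names come from Spark itself.
object QueuePairingSketch {
  // Stand-in for an input row: (id, text).
  type Row = (Int, String)

  // Stand-in for the external Python worker. It sees only the projected
  // UDF argument and yields one result per input, preserving order.
  def fakePythonWorker(projected: Iterator[String]): Iterator[Int] =
    projected.map(_.length)

  def evaluatePartition(rows: Iterator[Row]): Iterator[(Int, String, Int)] = {
    // Original rows that have been sent but not yet matched with a result.
    val pending = mutable.Queue.empty[Row]

    // "Send" only the projected column, remembering the full row in the queue.
    val projected = rows.map { row =>
      pending += row
      row._2
    }

    // For each result, dequeue the matching original row and combine them.
    fakePythonWorker(projected).map { result =>
      val original = pending.dequeue()
      (original._1, original._2, result)
    }
  }
}
```

For example, `QueuePairingSketch.evaluatePartition(Iterator((1, "a"), (2, "bcd"))).toList` yields `List((1, "a", 1), (2, "bcd", 3))`. The pairing relies on the worker returning results in the same order as the inputs it received.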
case class PythonUDF(name: String, func: PythonFunction, dataType: DataType, children: Seq[Expression]) extends Expression with Unevaluable with NonSQLExpression with Product with Serializable
A serialized version of a Python lambda function.
case class UserDefinedPythonFunction(name: String, func: PythonFunction, dataType: DataType) extends Product with Serializable
A user-defined Python function. This is used by the Python API.