Looks at the CPU plan associated with the DataFrame and outputs information about which parts of the query the RAPIDS Accelerator for Apache Spark could place on the GPU. This applies only to the initial plan, so if adaptive query execution is enabled, the output will not show any plan changes that result from it.
This is very similar to the output you would get by running the query with
the RAPIDS Accelerator enabled and with the config spark.rapids.sql.enabled
set to true.
Requires the RAPIDS Accelerator for Apache Spark jar and the RAPIDS cudf jar to be included in the classpath, but the RAPIDS Accelerator for Apache Spark itself should be disabled.
val output = com.nvidia.spark.rapids.ExplainPlan.explainPotentialGpuPlan(df)
Calling from PySpark:
output = sc._jvm.com.nvidia.spark.rapids.ExplainPlan.explainPotentialGpuPlan(df._jdf, "ALL")
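The PySpark call above can be wrapped in a small helper so that callers do not have to repeat the JVM path. This is a minimal sketch, not part of the RAPIDS API: the function name `explain_potential_gpu_plan` and its parameter names are hypothetical, and it assumes a SparkContext whose JVM already has the required jars on the classpath.

```python
def explain_potential_gpu_plan(spark_context, df, explain="ALL"):
    """Return the explain output for df's initial CPU plan as a string.

    Hypothetical convenience wrapper; the name and signature are
    illustrative and are not provided by the RAPIDS Accelerator.
    """
    # Reach the Scala object through the py4j gateway, as in the call above.
    explain_plan = spark_context._jvm.com.nvidia.spark.rapids.ExplainPlan
    # df._jdf is the underlying Java DataFrame that the JVM method expects.
    return explain_plan.explainPotentialGpuPlan(df._jdf, explain)


# Hypothetical usage in a live PySpark session (sc is the SparkContext,
# spark is the SparkSession):
#   df = spark.range(10).selectExpr("id * 2 AS doubled")
#   print(explain_potential_gpu_plan(sc, df))
```

Defaulting the second argument to "ALL" mirrors the documented default. Passing any other string requests only the parts of the plan that would not run on the GPU.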
The Spark DataFrame to get the query plan from
If ALL, returns all the explain data; otherwise, returns only what does not work on the GPU. Default is ALL.
String containing the explained plan.
java.lang.IllegalArgumentException
if an argument is invalid or it is unable to
determine the Spark version
java.lang.IllegalStateException
if the plugin gets into an invalid state while trying
to process the plan or there is an unexpected exception.