pyspark.sql.DataFrame.explain
- DataFrame.explain(extended=None, mode=None)
- Prints the (logical and physical) plans to the console for debugging purposes.
- New in version 1.3.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- extended : bool, optional
- Default: False. If False, prints only the physical plan. If a string is passed here and mode is not specified, the string is used as the mode.
- mode : str, optional
- Specifies the expected output format of plans.
- simple: Print only a physical plan.
- extended: Print both logical and physical plans.
- codegen: Print a physical plan and the generated code, if available.
- cost: Print a logical plan and statistics if they are available.
- formatted: Split explain output into two sections: a physical plan outline and node details.
 - Changed in version 3.0.0: Added optional argument mode to specify the expected output format of plans. 
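The way extended and mode combine, as described above, can be sketched in plain Python. The helper resolve_explain_mode below is hypothetical and not part of PySpark; it only mirrors the documented rules. Raising ValueError when both arguments are given, and accepting only the exact lowercase mode names, are assumptions of this sketch.

```python
from typing import Optional, Union

# Mode names listed in the parameter description above.
VALID_MODES = ("simple", "extended", "codegen", "cost", "formatted")


def resolve_explain_mode(
    extended: Optional[Union[bool, str]] = None,
    mode: Optional[str] = None,
) -> str:
    """Hypothetical helper: return the mode name implied by the arguments."""
    if extended is not None and mode is not None:
        # Assumption: passing both is treated as an error in this sketch.
        raise ValueError("extended and mode should not be set together")
    if isinstance(extended, str):
        # A string in the extended position is used as the mode.
        mode = extended
    elif extended is not None:
        # Boolean case: True -> "extended", False -> "simple".
        mode = "extended" if extended else "simple"
    elif mode is None:
        # No arguments at all: physical plan only.
        mode = "simple"
    if mode not in VALID_MODES:
        # Assumption: only the exact lowercase names are accepted here.
        raise ValueError(f"unknown explain mode: {mode!r}")
    return mode
```

Under these rules, df.explain("formatted") and df.explain(mode="formatted") would request the same output, and df.explain(True) would correspond to the extended mode.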
 
- Examples
- Example 1: Print out the physical plan only (default).
  >>> df = spark.createDataFrame(
  ...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
  >>> df.explain()
  == Physical Plan ==
  *(1) Scan ExistingRDD[age...,name...]
- Example 2: Print out all parsed, analyzed, optimized, and physical plans.
  >>> df.explain(extended=True)
  == Parsed Logical Plan ==
  ...
  == Analyzed Logical Plan ==
  ...
  == Optimized Logical Plan ==
  ...
  == Physical Plan ==
  ...
- Example 3: Print out the plans with two sections: a physical plan outline and node details.
  >>> df.explain(mode="formatted")
  == Physical Plan ==
  * Scan ExistingRDD (...)
  (1) Scan ExistingRDD [codegen id : ...]
  Output [2]: [age..., name...]
  ...
- Example 4: Print a logical plan and statistics if they are available.
  >>> df.explain(mode="cost")
  == Optimized Logical Plan ==
  ...Statistics...
  ...
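Because explain() prints the plans rather than returning them, keeping the text for logging or comparison means capturing standard output. The following is a minimal sketch, assuming the plan text is written through Python's sys.stdout; that may not hold in every execution environment. The names capture_printed and fake_explain are hypothetical, and fake_explain is only a stand-in so the sketch runs without a Spark session.

```python
import contextlib
import io
from typing import Any, Callable


def capture_printed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
    """Hypothetical helper: run fn and return whatever it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        fn(*args, **kwargs)
    return buffer.getvalue()


# Stand-in for df.explain; it prints placeholder text, not a real plan.
def fake_explain(mode: str = "simple") -> None:
    print("== Physical Plan ==")
    print(f"(mode={mode})")


plan_text = capture_printed(fake_explain, mode="formatted")
```

With a real DataFrame, the equivalent call would be capture_printed(df.explain, mode="formatted").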