pyspark.sql.functions.cume_dist

pyspark.sql.functions.cume_dist()
Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are at or below the current row.

New in version 1.6.0.

Changed in version 3.4.0: Supports Spark Connect.

Returns

Column
    the column for calculating the cumulative distribution.
 
Examples

>>> from pyspark.sql import Window, types
>>> df = spark.createDataFrame([1, 2, 3, 3, 4], types.IntegerType())
>>> w = Window.orderBy("value")
>>> df.withColumn("cd", cume_dist().over(w)).show()
+-----+---+
|value| cd|
+-----+---+
|    1|0.2|
|    2|0.4|
|    3|0.8|
|    3|0.8|
|    4|1.0|
+-----+---+
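To make the semantics concrete without a running Spark session, here is a minimal plain-Python sketch of what `cume_dist` computes for a single ordered partition: each row's value is the count of rows with a value at or below it, divided by the partition size. This is an illustration of the formula, not Spark's implementation.

```python
def cume_dist_sketch(values):
    # Fraction of rows in the partition whose value is <= the current
    # row's value; ties (like the two 3s) therefore share one result.
    n = len(values)
    return [sum(v <= x for v in values) / n for x in values]

print(cume_dist_sketch([1, 2, 3, 3, 4]))  # [0.2, 0.4, 0.8, 0.8, 1.0]
```

The output matches the `cd` column in the doctest above: the duplicate value 3 gets 0.8 on both rows, because four of the five rows are at or below it.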