pyspark.sql.functions.bucket
- 
pyspark.sql.functions.bucket(numBuckets: Union[pyspark.sql.column.Column, int], col: ColumnOrName) → pyspark.sql.column.Column
- Partition transform function: A transform for any type that partitions by a hash of the input column.
- New in version 3.1.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- numBuckets : Column or int
- the number of buckets.
- col : Column or str
- target column to work on; any type is accepted.
- Returns
- Column
- data partitioned by given columns. 
 
- Notes
- This function can be used only in combination with the partitionedBy() method of DataFrameWriterV2.
- Examples
>>> from pyspark.sql.functions import bucket
>>> df.writeTo("catalog.db.table").partitionedBy(
...     bucket(42, "ts")
... ).createOrReplace()
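To illustrate what "partitions by a hash of the input column" means, here is a minimal pure-Python sketch of hash bucketing. It is not Spark code: `toy_bucket` is a hypothetical helper, and CRC32 is only a stand-in hash chosen for the illustration. This page does not state which hash function is applied when the table is actually written, so the bucket numbers below should not be expected to match a real table.

```python
import zlib


def toy_bucket(num_buckets: int, value) -> int:
    """Hypothetical helper: map a value to a bucket index in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # Stand-in hash (CRC32 of the value's string form).
    # Assumption for illustration only; not the hash used by Spark or a catalog.
    digest = zlib.crc32(str(value).encode("utf-8"))
    # Reduce the hash to one of num_buckets partitions.
    return digest % num_buckets


# Values of any type can be bucketed; timestamps are shown here as strings.
rows = ["2024-01-01 00:00:00", "2024-01-01 00:00:01", "2024-06-30 12:00:00"]
buckets = {ts: toy_bucket(42, ts) for ts in rows}

for ts, b in buckets.items():
    print(ts, "-> bucket", b)
```

The properties this sketch shares with the transform as described above are that equal input values always map to the same bucket and that the number of distinct buckets is bounded by `numBuckets`.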