pyspark.sql.GroupedData.count

GroupedData.count()
Counts the number of records for each group.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Examples

```python
>>> df = spark.createDataFrame(
...     [(2, "Alice"), (3, "Alice"), (5, "Bob"), (10, "Bob")], ["age", "name"])
>>> df.show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
|  3|Alice|
|  5|  Bob|
| 10|  Bob|
+---+-----+
```

Group by name, and count each group.

```python
>>> df.groupBy(df.name).count().sort("name").show()
+-----+-----+
| name|count|
+-----+-----+
|Alice|    2|
|  Bob|    2|
+-----+-----+
```
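Semantically, the grouped count above is just a per-key record count. A minimal pure-Python sketch of the same semantics, using the example rows above (no Spark required; `rows` is hypothetical local data standing in for the DataFrame):

```python
from collections import Counter

# Rows mirroring the example DataFrame: (age, name)
rows = [(2, "Alice"), (3, "Alice"), (5, "Bob"), (10, "Bob")]

# Count records per "name" group, analogous to df.groupBy("name").count()
counts = Counter(name for _age, name in rows)

# Sort by group key, analogous to .sort("name")
print(sorted(counts.items()))  # → [('Alice', 2), ('Bob', 2)]
```

Note that, unlike aggregations built with `agg`, `count()` always names its result column `count`, and it counts rows per group rather than non-null values of a particular column.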