pyspark.sql.functions.count_if
- pyspark.sql.functions.count_if(col)
- Aggregate function: Returns the number of TRUE values for the col.
- New in version 3.5.0.
- Parameters
- col: Column or column name
- target column to work on.
- Returns
- Column
- the number of TRUE values for the col. 
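The counting rule described above can be modelled in plain Python. This is only a sketch of the semantics, not Spark's implementation: the helper name `count_if_model` is hypothetical, and `None` stands in for SQL NULL, which is neither TRUE nor FALSE and is therefore not counted.

```python
from typing import Iterable, Optional


def count_if_model(values: Iterable[Optional[bool]]) -> int:
    """Plain-Python model of count_if: count entries that are exactly True.

    False and None (standing in for SQL NULL) are skipped. This illustrates
    the semantics only; it is not how Spark evaluates the aggregate.
    """
    return sum(1 for v in values if v is True)


# Hypothetical sample values: three True, one False, one NULL.
sample = [True, False, None, True, True]
print(count_if_model(sample))  # 3
print(count_if_model([]))      # 0
```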
 
- Examples
- Example 1: Counting the number of even numbers in a numeric column
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([("a", 1), ("a", 2), ("a", 3), ("b", 8), ("b", 2)], ["c1", "c2"])
>>> df.select(sf.count_if(sf.col('c2') % 2 == 0)).show()
+------------------------+
|count_if(((c2 % 2) = 0))|
+------------------------+
|                       3|
+------------------------+

- Example 2: Counting the number of rows where a string column starts with a certain letter
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame(
...     [("apple",), ("banana",), ("cherry",), ("apple",), ("banana",)], ["fruit"])
>>> df.select(sf.count_if(sf.col('fruit').startswith('a'))).show()
+------------------------------+
|count_if(startswith(fruit, a))|
+------------------------------+
|                             2|
+------------------------------+

- Example 3: Counting the number of rows where a numeric column is greater than a certain value
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1,), (2,), (3,), (4,), (5,)], ["num"])
>>> df.select(sf.count_if(sf.col('num') > 3)).show()
+-------------------+
|count_if((num > 3))|
+-------------------+
|                  2|
+-------------------+

- Example 4: Counting the number of rows where a boolean column is True
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(True,), (False,), (True,), (False,), (True,)], ["b"])
>>> df.select(sf.count('b'), sf.count_if('b')).show()
+--------+-----------+
|count(b)|count_if(b)|
+--------+-----------+
|       5|          3|
+--------+-----------+