pyspark.pandas.DataFrame.mode
- DataFrame.mode(axis=0, numeric_only=False, dropna=True)
- Get the mode(s) of each element along the selected axis.
- The mode of a set of values is the value that appears most often. It can be multiple values.
- New in version 3.4.0.
- Parameters
- axis : {0 or ‘index’}, default 0
- Axis for the function to be applied on. 
- numeric_only : bool, default False
- If True, only apply to numeric columns. 
- dropna : bool, default True
- Don’t consider counts of NaN/NaT. 
 
- Returns
- DataFrame
- The modes of each column or row. 
 
- See also
- Series.mode
- Return the highest frequency value in a Series. 
- Series.value_counts
- Return the counts of values in a Series. 
- Examples
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame([('bird', 2, 2),
...                    ('mammal', 4, np.nan),
...                    ('arthropod', 8, 0),
...                    ('bird', 2, np.nan)],
...                   index=('falcon', 'horse', 'spider', 'ostrich'),
...                   columns=('species', 'legs', 'wings'))
>>> df
           species  legs  wings
falcon        bird     2    2.0
horse       mammal     4    NaN
spider   arthropod     8    0.0
ostrich       bird     2    NaN

- By default, missing values are not considered, and wings has two modes, 0 and 2. Because the resulting DataFrame therefore has two rows, the second row of species and legs is filled with a missing value (None and NaN, respectively).
>>> df.mode()
  species  legs  wings
0    bird   2.0    0.0
1    None   NaN    2.0

- With dropna=False, NaN values are counted and can themselves be the mode (as they are for wings).
>>> df.mode(dropna=False)
  species  legs  wings
0    bird     2    NaN

- With numeric_only=True, only the modes of numeric columns are computed, and columns of other types are ignored.
>>> df.mode(numeric_only=True)
   legs  wings
0   2.0    0.0
1   NaN    2.0
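The padding and dropna behavior shown in the examples can be sketched in plain Python. This is only an illustration of the semantics, not how pandas-on-Spark computes the result: the helpers `is_missing`, `column_modes`, and `frame_modes` are hypothetical names, columns are modeled as plain lists in a dict, and every missing value is represented as None in the output.

```python
import math
from collections import Counter


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def column_modes(values, dropna=True):
    """Return the most frequent values of one column, smallest first."""
    counts = Counter()
    for v in values:
        if is_missing(v):
            if dropna:
                continue  # missing values are not counted at all
            v = None  # collapse every missing marker into a single key
        counts[v] += 1
    if not counts:
        return []
    top = max(counts.values())
    modes = [v for v, c in counts.items() if c == top]
    # Sort the real values; a missing-value mode (None) goes last.
    result = sorted(v for v in modes if v is not None)
    if None in modes:
        result.append(None)
    return result


def frame_modes(columns, dropna=True):
    """Compute modes per column and pad shorter columns with None."""
    per_column = {name: column_modes(vals, dropna) for name, vals in columns.items()}
    height = max((len(m) for m in per_column.values()), default=0)
    return {name: m + [None] * (height - len(m)) for name, m in per_column.items()}


# Same data as the documentation example, as a dict of columns.
data = {
    "species": ["bird", "mammal", "arthropod", "bird"],
    "legs": [2, 4, 8, 2],
    "wings": [2.0, float("nan"), 0.0, float("nan")],
}
default_modes = frame_modes(data)
with_missing = frame_modes(data, dropna=False)
print(default_modes)
print(with_missing)
```

In this sketch, wings contributes two modes by default, so species and legs are padded to two entries. With dropna=False, the two missing wings values outnumber every other value and the result has a single row.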