pyspark.pandas.window.Rolling.quantile
- Rolling.quantile(quantile, accuracy=10000)
- Calculate the rolling quantile of the values.
- New in version 3.4.0.
- Parameters
- quantile : float
- Value between 0 and 1 providing the quantile to compute.
- Deprecated since version 4.0.0: This will be renamed to ‘q’ in a future version.
- accuracy : int, optional
- Default accuracy of the approximation. A larger value means better accuracy; the relative error can be deduced as 1.0 / accuracy (see the example below). This is a pandas-on-Spark specific parameter.
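For illustration, here is a minimal sketch of passing the accuracy knob explicitly, assuming pyspark.pandas is imported as ps. On a series this small the approximation is effectively exact, so the result matches the default; on large data the two calls may diverge within the 1.0 / accuracy relative-error bound.
>>> s = ps.Series([4, 3, 5, 2, 6])
>>> # accuracy=100 loosens the relative-error bound to 1.0 / 100
>>> s.rolling(3).quantile(0.5, accuracy=100)
0    NaN
1    NaN
2    4.0
3    3.0
4    5.0
dtype: float64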
 
- Returns
- Series or DataFrame
- Returned object type is determined by the caller of the rolling calculation. 
 
- See also
- pyspark.pandas.Series.rolling
- Calling rolling with Series data. 
- pyspark.pandas.DataFrame.rolling
- Calling rolling with DataFrames. 
- pyspark.pandas.Series.quantile
- Aggregating quantile for Series. 
- pyspark.pandas.DataFrame.quantile
- Aggregating quantile for DataFrame. 
- Notes
- quantile in pandas-on-Spark uses a distributed percentile approximation algorithm, unlike pandas, so the result may differ from pandas. The interpolation parameter is not supported yet.
- The current implementation of this API uses Spark’s Window without specifying a partition specification. This moves all data into a single partition on a single machine and could cause serious performance degradation. Avoid this method on very large datasets.
- Examples
>>> s = ps.Series([4, 3, 5, 2, 6])
>>> s
0    4
1    3
2    5
3    2
4    6
dtype: int64

>>> s.rolling(2).quantile(0.5)
0    NaN
1    3.0
2    3.0
3    2.0
4    2.0
dtype: float64

>>> s.rolling(3).quantile(0.5)
0    NaN
1    NaN
2    4.0
3    3.0
4    5.0
dtype: float64

For DataFrame, each rolling quantile is computed column-wise.

>>> df = ps.DataFrame({"A": s.to_numpy(), "B": s.to_numpy() ** 2})
>>> df
   A   B
0  4  16
1  3   9
2  5  25
3  2   4
4  6  36

>>> df.rolling(2).quantile(0.5)
     A    B
0  NaN  NaN
1  3.0  9.0
2  3.0  9.0
3  2.0  4.0
4  2.0  4.0

>>> df.rolling(3).quantile(0.5)
     A     B
0  NaN   NaN
1  NaN   NaN
2  4.0  16.0
3  3.0   9.0
4  5.0  25.0
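To make the note above concrete, here is the same rolling median computed with plain pandas for contrast. By default pandas interpolates linearly between the two middle values of an even-length window, while the pandas-on-Spark approximation above returns an existing value from each window, hence 3.5 here versus 3.0 above.

>>> import pandas as pd
>>> pd.Series([4, 3, 5, 2, 6]).rolling(2).quantile(0.5)
0    NaN
1    3.5
2    4.0
3    3.5
4    4.0
dtype: float64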