pyspark.pandas.Series.cumprod
Series.cumprod(skipna: bool = True) → FrameLike
- Return cumulative product over a DataFrame or Series axis.
- Returns a DataFrame or Series of the same size containing the cumulative product.
- Note
- The current implementation of cumprod uses Spark's Window without specifying a partition specification. This moves all data into a single partition on a single machine and could cause serious performance degradation. Avoid this method on very large datasets.
- Note
- Unlike pandas, pandas-on-Spark emulates the cumulative product with the exp(sum(log(...))) trick. Therefore, it only works for positive numbers.
- Parameters
- skipna: boolean, default True
- Exclude NA/null values. If an entire row/column is NA, the result will be NA. 
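- A minimal pure-Python sketch of what excluding NA values means for a cumulative product, modelled on the pandas semantics. It assumes pandas-on-Spark mirrors that behaviour; the helper name cumprod_skipna is hypothetical and is not part of the library.

```python
import math

def cumprod_skipna(values, skipna=True):
    """Cumulative product over a list, with None/NaN treated as missing.

    Hypothetical helper that models pandas-style skipna semantics.
    """
    running = 1.0
    poisoned = False  # becomes True once a missing value is seen with skipna=False
    out = []
    for v in values:
        is_na = v is None or (isinstance(v, float) and math.isnan(v))
        if is_na:
            out.append(None)  # a missing input always gives a missing output
            if not skipna:
                poisoned = True
        elif poisoned:
            out.append(None)  # with skipna=False, missing values propagate forward
        else:
            running *= v
            out.append(running)
    return out

skipped = cumprod_skipna([1.0, None, 10.0])
propagated = cumprod_skipna([1.0, None, 10.0], skipna=False)
```

- With skipna=True the missing position stays missing but the running product continues past it; with skipna=False every position after the first missing value is also missing.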
 
- Returns
- DataFrame or Series
 
- Raises
- Exception: If any value is equal to or lower than 0.
 
- See also
- DataFrame.cummax
- Return cumulative maximum over DataFrame axis. 
- DataFrame.cummin
- Return cumulative minimum over DataFrame axis. 
- DataFrame.cumsum
- Return cumulative sum over DataFrame axis. 
- DataFrame.cumprod
- Return cumulative product over DataFrame axis. 
- Series.cummax
- Return cumulative maximum over Series axis. 
- Series.cummin
- Return cumulative minimum over Series axis. 
- Series.cumsum
- Return cumulative sum over Series axis. 
- Series.cumprod
- Return cumulative product over Series axis. 
- Examples
>>> df = ps.DataFrame([[2.0, 1.0], [3.0, None], [4.0, 10.0]], columns=list('AB'))
>>> df
     A     B
0  2.0   1.0
1  3.0   NaN
2  4.0  10.0

- By default, iterates over rows and finds the cumulative product in each column.

>>> df.cumprod()
      A     B
0   2.0   1.0
1   6.0   NaN
2  24.0  10.0

- It works identically in Series.

>>> df.A.cumprod()
0     2.0
1     6.0
2    24.0
Name: A, dtype: float64
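
- The exp(sum(log(...))) trick mentioned in the notes can be sketched in plain Python to show why only positive inputs are supported. This is an illustration under stated assumptions, not the library's actual implementation: the helper name cumprod_via_logs is hypothetical, and raising ValueError is a choice made for this sketch.

```python
import math
from itertools import accumulate

def cumprod_via_logs(values):
    """Cumulative product computed as exp(running sum of log(x)).

    Hypothetical helper; valid only for strictly positive inputs,
    because log(x) is undefined for x <= 0.
    """
    if any(v <= 0 for v in values):
        raise ValueError("log-based cumulative product needs strictly positive values")
    # A product of values equals exp of the sum of their logarithms,
    # so a running product becomes a running sum in log space.
    running_log_sums = accumulate(math.log(v) for v in values)
    return [math.exp(s) for s in running_log_sums]

approx = cumprod_via_logs([2.0, 3.0, 4.0])  # close to 2, 6, 24 up to rounding
```

- Because the result goes through log and exp, it can differ from a direct product by floating-point rounding, so comparisons should use a tolerance rather than exact equality.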