pyspark.pandas.DataFrame.explode

DataFrame.explode(column, ignore_index=False)

Transform each element of a list-like to a row, replicating index values.

Parameters

column (str or tuple): Column to explode.
ignore_index (bool, default False): If True, the resulting index will be labeled 0, 1, …, n - 1.

Returns

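DataFrame: A DataFrame in which each element of the list-like column occupies its own row. The other columns are repeated for each new row, and the original index label is duplicated across them unless ignore_index is True.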

See also

DataFrame.unstack: Pivot a level of the (necessarily hierarchical) index labels.
DataFrame.melt: Unpivot a DataFrame from wide format to long format.

Examples

>>> df = ps.DataFrame({'A': [[1, 2, 3], [], [3, 4]], 'B': 1})
>>> df
           A  B
0  [1, 2, 3]  1
1         []  1
2     [3, 4]  1

>>> df.explode('A')
     A  B
0  1.0  1
0  2.0  1
0  3.0  1
1  NaN  1
2  3.0  1
2  4.0  1

>>> df.explode('A', ignore_index=True)
     A  B
0  1.0  1
1  2.0  1
2  3.0  1
3  NaN  1
4  3.0  1
5  4.0  1
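
The index behavior described above can be sketched in plain Python. This is only an illustration of the row-expansion rule, not how pandas-on-Spark implements it; the helper name explode_rows is hypothetical, and it assumes every cell in the exploded column is a list.

```python
import math


def explode_rows(index, values, ignore_index=False):
    """Illustrative sketch: expand list cells into (index_label, element) pairs.

    `explode_rows` is a hypothetical helper, not part of pyspark.pandas.
    Assumes every entry of `values` is a list.
    """
    out = []
    for label, cell in zip(index, values):
        # An empty list still produces one row, holding a missing value (NaN).
        elements = list(cell) if len(cell) > 0 else [math.nan]
        for element in elements:
            # The original index label is repeated for every element.
            out.append((label, element))
    if ignore_index:
        # Relabel the rows 0, 1, ..., n - 1.
        out = [(i, element) for i, (_, element) in enumerate(out)]
    return out


rows = explode_rows([0, 1, 2], [[1, 2, 3], [], [3, 4]])
labels = [label for label, _ in rows]
relabeled = [label for label, _ in explode_rows(
    [0, 1, 2], [[1, 2, 3], [], [3, 4]], ignore_index=True)]
print(labels)     # [0, 0, 0, 1, 2, 2]
print(relabeled)  # [0, 1, 2, 3, 4, 5]
```

The first list shows index labels duplicated per exploded element, with the empty list at label 1 contributing a single NaN row; the second shows the effect of ignore_index=True.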