pyspark.pandas.DataFrame.T

property DataFrame.T
Transpose index and columns.

Reflect the DataFrame over its main diagonal by writing rows as columns and vice versa. The property T is an accessor to the method transpose().

Note

This method is expensive on big data. Internally it needs to generate a row for each value and then group twice, which is a huge operation. To prevent misuse, the input length is limited by the 'compute.max_rows' option, and a ValueError is raised when the DataFrame exceeds that limit.

>>> import pyspark.pandas as ps
>>> from pyspark.pandas.config import option_context
>>> with option_context('compute.max_rows', 1000):
...     ps.DataFrame({'a': range(1001)}).transpose()
Traceback (most recent call last):
  ...
ValueError: Current DataFrame's length exceeds the given limit of 1000 rows. Please set 'compute.max_rows' by using 'pyspark.pandas.config.set_option' to retrieve more than 1000 rows. Note that, before changing the 'compute.max_rows', this operation is considerably expensive.

Returns
DataFrame
    The transposed DataFrame.
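
The note above describes the internal strategy as generating a row for each value and then grouping. The pure-Python sketch below illustrates that idea conceptually only. It is not the pandas-on-Spark implementation, and the helper names `explode_to_long` and `transpose_via_long` are hypothetical, introduced just for this illustration.

```python
from collections import defaultdict


def explode_to_long(columns, rows):
    """Emit one (row_label, column_label, value) triple per cell."""
    for row_label, row in rows:
        for column_label, value in zip(columns, row):
            yield row_label, column_label, value


def transpose_via_long(columns, rows):
    """Regroup the exploded cells by their original column label.

    Returns (new_columns, new_rows): the old row labels become the new
    column labels, and each old column becomes one new row.
    """
    new_columns = [row_label for row_label, _ in rows]
    grouped = defaultdict(dict)
    for row_label, column_label, value in explode_to_long(columns, rows):
        # Duplicate row labels would overwrite each other here, which is
        # one way to see why index values should be unique.
        grouped[column_label][row_label] = value
    new_rows = [
        (column_label, [grouped[column_label][r] for r in new_columns])
        for column_label in columns
    ]
    return new_columns, new_rows


# Same data as a two-row frame with columns col1=[1, 2] and col2=[3, 4].
columns = ["col1", "col2"]
rows = [(0, [1, 3]), (1, [2, 4])]
new_columns, new_rows = transpose_via_long(columns, rows)
print(new_columns)  # [0, 1]
print(new_rows)     # [('col1', [1, 2]), ('col2', [3, 4])]
```

In this sketch every cell becomes its own intermediate record, so the amount of work grows with the number of rows times the number of columns. That gives some intuition for why a row limit guards the operation.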
 
Notes

Transposing a DataFrame with mixed dtypes results in a homogeneous DataFrame with the coerced dtype. For instance, if int and float values have to be placed in the same column, the column becomes float. If type coercion is not possible, the operation fails.

Also note that the values in the index should be unique, because they become the column names of the result.

In addition, if Spark 2.3 is used, the types should always be exactly the same.

Examples

Square DataFrame with homogeneous dtype

>>> d1 = {'col1': [1, 2], 'col2': [3, 4]}
>>> df1 = ps.DataFrame(data=d1, columns=['col1', 'col2'])
>>> df1
   col1  col2
0     1     3
1     2     4

>>> df1_transposed = df1.T.sort_index()
>>> df1_transposed
      0  1
col1  1  2
col2  3  4

When the dtype is homogeneous in the original DataFrame, we get a transposed DataFrame with the same dtype:

>>> df1.dtypes
col1    int64
col2    int64
dtype: object
>>> df1_transposed.dtypes
0    int64
1    int64
dtype: object

Non-square DataFrame with mixed dtypes

>>> d2 = {'score': [9.5, 8],
...       'kids': [0, 0],
...       'age': [12, 22]}
>>> df2 = ps.DataFrame(data=d2, columns=['score', 'kids', 'age'])
>>> df2
   score  kids  age
0    9.5     0   12
1    8.0     0   22

>>> df2_transposed = df2.T.sort_index()
>>> df2_transposed
          0     1
age    12.0  22.0
kids    0.0   0.0
score   9.5   8.0

When the DataFrame has mixed dtypes, we get a transposed DataFrame with the coerced dtype:

>>> df2.dtypes
score    float64
kids       int64
age        int64
dtype: object

>>> df2_transposed.dtypes
0    float64
1    float64
dtype: object
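
The coercion rule described in the Notes can be illustrated with a small pure-Python sketch. It works under simplifying assumptions: only int and float values are considered, and any other mix is treated as not coercible. The helper name `coerce_column` is hypothetical, and the real dtype handling is performed by Spark rather than by code like this.

```python
def coerce_column(values):
    """Return the values converted to one common numeric type.

    All-int input stays int, a mix of int and float becomes float, and
    any other mix raises TypeError (the "coercion is not possible" case).
    """
    kinds = {type(v) for v in values}
    if kinds <= {int}:
        return list(values)
    if kinds <= {int, float}:
        return [float(v) for v in values]
    names = sorted(k.__name__ for k in kinds)
    raise TypeError(f"cannot coerce mixed types: {names}")


# Row 0 of df2 (score=9.5, kids=0, age=12) becomes column 0 after the
# transpose, so its int and float values must share one type.
row0 = coerce_column([9.5, 0, 12])
print(row0)  # [9.5, 0.0, 12.0]
```

Under these assumptions, a value such as a string mixed with integers raises TypeError. This corresponds to the failure case mentioned in the Notes.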