pyspark.sql.functions.arrays_zip
pyspark.sql.functions.arrays_zip(*cols)
Array function: Returns a merged array of structs in which the N-th struct contains the N-th values of all input arrays. If one of the arrays is shorter than the others, the missing elements are filled with null in the resulting structs.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Examples

Example 1: Zipping two arrays of the same length

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2, 3], ['a', 'b', 'c'])], ['nums', 'letters'])
>>> df.select(sf.arrays_zip(df.nums, df.letters)).show(truncate=False)
+-------------------------+
|arrays_zip(nums, letters)|
+-------------------------+
|[{1, a}, {2, b}, {3, c}] |
+-------------------------+

Example 2: Zipping arrays of different lengths

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2], ['a', 'b', 'c'])], ['nums', 'letters'])
>>> df.select(sf.arrays_zip(df.nums, df.letters)).show(truncate=False)
+---------------------------+
|arrays_zip(nums, letters)  |
+---------------------------+
|[{1, a}, {2, b}, {NULL, c}]|
+---------------------------+

Example 3: Zipping more than two arrays

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame(
...     [([1, 2], ['a', 'b'], [True, False])], ['nums', 'letters', 'bools'])
>>> df.select(sf.arrays_zip(df.nums, df.letters, df.bools)).show(truncate=False)
+--------------------------------+
|arrays_zip(nums, letters, bools)|
+--------------------------------+
|[{1, a, true}, {2, b, false}]   |
+--------------------------------+

Example 4: Zipping arrays with null values

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2, None], ['a', None, 'c'])], ['nums', 'letters'])
>>> df.select(sf.arrays_zip(df.nums, df.letters)).show(truncate=False)
+------------------------------+
|arrays_zip(nums, letters)     |
+------------------------------+
|[{1, a}, {2, NULL}, {NULL, c}]|
+------------------------------+
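
The padding behavior described above can be mimicked in plain Python, which may help when reasoning about expected results without a Spark session. The sketch below is only a model of the documented semantics, not how Spark implements the function. The helper name `arrays_zip_model` is hypothetical, tuples stand in for structs, and `None` stands in for SQL null.

```python
from itertools import zip_longest


def arrays_zip_model(*arrays):
    """Hypothetical plain-Python model of the documented arrays_zip semantics.

    Each output tuple stands in for one struct. Shorter inputs are padded
    with None, mirroring the null padding described in the documentation.
    """
    # zip_longest walks all inputs in parallel and fills gaps with fillvalue.
    return list(zip_longest(*arrays, fillvalue=None))


# Inputs mirror Examples 1, 2 and 4 from the documentation.
same_length = arrays_zip_model([1, 2, 3], ["a", "b", "c"])
different_length = arrays_zip_model([1, 2], ["a", "b", "c"])
with_nulls = arrays_zip_model([1, 2, None], ["a", None, "c"])

print(same_length)
print(different_length)
print(with_nulls)
```

Note that in this model a padded element and an element that was already null in the input are indistinguishable in the result, which matches what Examples 2 and 4 show for the Spark output.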