pyspark.sql.functions.array_join
Concatenates the elements of the given array column using the delimiter. Null values are replaced with null_replacement if it is set; otherwise they are ignored.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
col : Column or str
    target column to work on.
delimiter : str
    delimiter used to concatenate elements.
null_replacement : str, optional
    if set, null values will be replaced by this value.
Returns
Column
    a column of string type. Concatenated values.
Examples
>>> df = spark.createDataFrame([(["a", "b", "c"],), (["a", None],)], ['data'])
>>> df.select(array_join(df.data, ",").alias("joined")).collect()
[Row(joined='a,b,c'), Row(joined='a')]
>>> df.select(array_join(df.data, ",", "NULL").alias("joined")).collect()
[Row(joined='a,b,c'), Row(joined='a,NULL')]
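As a further sketch (the df2 data below is illustrative, and spark is assumed to be an existing SparkSession as in the examples above): array_join is imported from pyspark.sql.functions, and the target column may be passed by name (str) as well as by Column object.

>>> from pyspark.sql.functions import array_join
>>> df2 = spark.createDataFrame([(["x", None, "y"],)], ["data"])
>>> df2.select(array_join("data", "-", "?").alias("joined")).collect()
[Row(joined='x-?-y')]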