pyspark.sql.functions.shuffle
pyspark.sql.functions.shuffle(col: ColumnOrName) → pyspark.sql.column.Column
Collection function: generates a random permutation of the given array.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
col : Column or str
    name of column or expression
Returns
Column
    an array of elements in random order.
Notes
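The function is non-deterministic: each evaluation may return a different permutation of the same array, so the order shown in the example output is only one possible result.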
Examples
>>> df = spark.createDataFrame([([1, 20, 3, 5],), ([1, 20, None, 3],)], ['data'])
>>> df.select(shuffle(df.data).alias('s')).collect()
[Row(s=[3, 1, 5, 20]), Row(s=[20, None, 3, 1])]