pyspark.sql.DataFrame.sortWithinPartitions

DataFrame.sortWithinPartitions(*cols: Union[str, pyspark.sql.column.Column, List[Union[str, pyspark.sql.column.Column]]], **kwargs: Any) → pyspark.sql.dataframe.DataFrame

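Returns a new DataFrame with each partition sorted by the specified column(s). The sort is applied independently inside each partition, so rows are not globally ordered across partitions.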

New in version 1.6.0.

Parameters
cols : str, list or Column, optional

list of Column or column names to sort by.

Other Parameters
ascending : bool or list, optional

boolean or list of boolean (default True). Sort ascending vs. descending. Specify a list to give each sort column its own order. If a list is specified, its length must equal the number of columns in cols.

Examples

>>> df.sortWithinPartitions("age", ascending=False).show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
|  5|  Bob|
+---+-----+