pyspark.sql.DataFrame.sort¶
-
DataFrame.
sort
(*cols: Union[str, pyspark.sql.column.Column, List[Union[str, pyspark.sql.column.Column]]], **kwargs: Any) → pyspark.sql.dataframe.DataFrame[source]¶ Returns a new
DataFrame
sorted by the specified column(s).New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- Returns
DataFrame
Sorted DataFrame.
- Other Parameters
- ascendingbool or list, optional, default True
boolean or list of boolean. Sort ascending vs. descending. Specify list for multiple sort orders. If a list is specified, the length of the list must equal the length of the cols.
Examples
>>> from pyspark.sql.functions import desc, asc >>> df = spark.createDataFrame([ ... (2, "Alice"), (5, "Bob")], schema=["age", "name"])
Sort the DataFrame in ascending order.
>>> df.sort(asc("age")).show() +---+-----+ |age| name| +---+-----+ | 2|Alice| | 5| Bob| +---+-----+
Sort the DataFrame in descending order.
>>> df.sort(df.age.desc()).show() +---+-----+ |age| name| +---+-----+ | 5| Bob| | 2|Alice| +---+-----+ >>> df.orderBy(df.age.desc()).show() +---+-----+ |age| name| +---+-----+ | 5| Bob| | 2|Alice| +---+-----+ >>> df.sort("age", ascending=False).show() +---+-----+ |age| name| +---+-----+ | 5| Bob| | 2|Alice| +---+-----+
Specify multiple columns
>>> df = spark.createDataFrame([ ... (2, "Alice"), (2, "Bob"), (5, "Bob")], schema=["age", "name"]) >>> df.orderBy(desc("age"), "name").show() +---+-----+ |age| name| +---+-----+ | 5| Bob| | 2|Alice| | 2| Bob| +---+-----+
Specify multiple columns for sorting order at ascending.
>>> df.orderBy(["age", "name"], ascending=[False, False]).show() +---+-----+ |age| name| +---+-----+ | 5| Bob| | 2| Bob| | 2|Alice| +---+-----+