A list of column names can be provided to sort by multiple columns. Your DataFrame will be sorted by the first column first and, when multiple rows have the same value for the first column, those rows are then sorted by their second column.
import pandas as pd

df = pd.DataFrame([
    {"city": "Los Angeles", "state": "CA", "population": 3898747},
    {"city": "Urbana", "state": "IL", "population": 42461},
    {"city": "New York", "state": "NY", "population": 8804190},
    {"city": "Washington D.C.", "population": 689545},
    {"city": "Chicago", "state": "IL", "population": 2746388},
])
df.sort_values(["state", "population"])
# => Sorts by "state" first,
# then "population" for rows with identical "state" values.
Output:
              city state  population
0      Los Angeles    CA     3898747
1           Urbana    IL       42461
4          Chicago    IL     2746388
2         New York    NY     8804190
3  Washington D.C.   NaN      689545
Sorting in Reverse Order (Descending Order)
By default, the ascending parameter is True, which sorts in ascending order (smallest value first, largest value last). Set ascending to False to sort in descending order (largest value first).
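As a minimal sketch of descending order, the example below reuses the same city dataset. The variable names by_population_desc and mixed are illustrative only. The second call goes slightly beyond the text above: it assumes you want a different direction per column, which sort_values accepts as a list of booleans matching the list of column names.

```python
import pandas as pd

df = pd.DataFrame([
    {"city": "Los Angeles", "state": "CA", "population": 3898747},
    {"city": "Urbana", "state": "IL", "population": 42461},
    {"city": "New York", "state": "NY", "population": 8804190},
    {"city": "Washington D.C.", "population": 689545},
    {"city": "Chicago", "state": "IL", "population": 2746388},
])

# Descending order: the largest population comes first.
by_population_desc = df.sort_values("population", ascending=False)
print(by_population_desc)

# One flag per column: "state" ascending, then "population" descending
# among rows that share the same "state".
mixed = df.sort_values(["state", "population"], ascending=[True, False])
print(mixed)
```

In the second result, "Chicago" is listed before "Urbana" because both are in "IL" and the population column is sorted largest first.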
When a sort is stable, it guarantees that rows with identical values keep their original relative order. For example, our initial dataset has "Urbana" listed before "Chicago". A stable sort on state alone will keep "Urbana" listed before "Chicago", since that is the order in which they appear in the original list.
By default, the sorting will be done as fast as possible using a quicksort algorithm. The quicksort algorithm is NOT stable. The kind parameter can be set to "stable" to guarantee that a stable sort is used.
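The following sketch shows a stable sort on the same city dataset, sorting on state only. The variable name stable_sorted is illustrative. Note that the pandas documentation describes kind as applying only when sorting on a single column or label, so the example sorts by one column.

```python
import pandas as pd

df = pd.DataFrame([
    {"city": "Los Angeles", "state": "CA", "population": 3898747},
    {"city": "Urbana", "state": "IL", "population": 42461},
    {"city": "New York", "state": "NY", "population": 8804190},
    {"city": "Washington D.C.", "population": 689545},
    {"city": "Chicago", "state": "IL", "population": 2746388},
])

# kind="stable" preserves the original order of rows with equal "state".
stable_sorted = df.sort_values("state", kind="stable")
print(stable_sorted)

# "Urbana" (row 1) and "Chicago" (row 4) both have state "IL";
# the stable sort keeps row 1 ahead of row 4.
il_rows = stable_sorted[stable_sorted["state"] == "IL"]
print(il_rows["city"].tolist())
```

With the default quicksort, the same relative order may happen to appear on a dataset this small, but only a stable kind guarantees it.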
When sorting a DataFrame, missing values (NaN) can be placed at the beginning or end of your DataFrame. By default, the na_position parameter is set to "last". The na_position can be set to "first" to place missing values at the beginning of your DataFrame.
import pandas as pd

df = pd.DataFrame([
    {"city": "Los Angeles", "state": "CA", "population": 3898747},
    {"city": "Urbana", "state": "IL", "population": 42461},
    {"city": "New York", "state": "NY", "population": 8804190},
    {"city": "Washington D.C.", "population": 689545},
    {"city": "Chicago", "state": "IL", "population": 2746388},
])
df.sort_values("state", na_position="first")
# => Sorts the missing values (NaN) to the first part of the DataFrame