Sorting a Python list by two fields [duplicate]

Question

I have the following list, created from a sorted CSV:

list1 = sorted(csv1, key=operator.itemgetter(1))

I would actually like to sort the list by two criteria: first by the value in field 1 and then by the value in field 2. How do I do this?

Do we let this question stand and just restrict its scope to "list-of-lists-of-length-two-builtin-types (e.g. string/int/float)". Or do we also allow "list-of-user-defined-object", as the title suggests is also allowed, in which case the answer is "Define __lt__() method on your class or inherit from some class that does"? That would make it a far better canonical. — smci, Commented Apr 24, 2018 at 22:31

dheerajpv · Accepted Answer · 2021-08-05 09:01:42Z

466

No need to import anything when using lambda functions.
The following sorts the list by the first element in ascending order, then by the second element in descending order. Negating a field in the key is how you sort one field ascending and another descending, as long as that field is numeric:

sorted_list = sorted(list, key=lambda x: (x[0], -x[1]))
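As a minimal sketch of how the tuple key behaves, assuming rows of (name, score) where the second field is numeric (the `rows` data is made up for illustration and is not from the question):

```python
# Hypothetical sample rows: (name, score). Not from the original question.
rows = [("bob", 2), ("alice", 1), ("bob", 5), ("alice", 3)]

# Tuples compare element by element, so the first field decides the order
# and the negated second field only breaks ties (largest score first).
sorted_rows = sorted(rows, key=lambda x: (x[0], -x[1]))
print(sorted_rows)
```

The negation trick only applies to values that support unary minus, so it does not carry over to string fields.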

edited Aug 5, 2021 at 9:01

dheerajpv


answered Jun 14, 2013 at 13:01

jaap


16

Nice. As you noted in comment to the main answer above, this is the best (only?) way to do multiple sorts with different sort orders. Perhaps highlight that. Also, your text does not indicate that you sorted descending on second element.
– PeterVermont
Commented Jun 12, 2015 at 14:25
2

@user1700890 I was assuming the field was already string. It should sort strings in alphabetical order by default. You should post your own question separately on SO if it is not specifically related to the answer here or the OP's original question.
– pbible
Commented Nov 23, 2015 at 16:59
10

what does the - in -x[1] stand for?
– jan
Commented Feb 26, 2016 at 15:55
10

@jan it's reverse sort
– jaap
Commented Feb 28, 2016 at 23:39
5

Won't work in one specific case. The accepted solution will not work either. For example, the columns to be used as keys are all strings that cannot be converted to numbers. Secondly, one wants to sort in ascending order by one column and descending order by another column.
– coder.in.me
Commented Sep 25, 2016 at 9:00


BrechtDeMan · Accepted Answer · 2015-09-04 08:54:41Z

183

like this:

import operator
list1 = sorted(csv1, key=operator.itemgetter(1, 2))
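A small sketch of what this call does, assuming `csv1` is a list of rows with at least three fields (the rows below are invented for illustration):

```python
import operator

# Hypothetical rows standing in for csv1: (id, city, name).
csv1 = [
    ["3", "Oslo", "Berg"],
    ["1", "Bergen", "Dahl"],
    ["2", "Oslo", "Aas"],
    ["4", "Bergen", "Aas"],
]

# itemgetter(1, 2) returns the tuple (row[1], row[2]) for each row,
# so rows are ordered by field 1 and ties are broken by field 2.
list1 = sorted(csv1, key=operator.itemgetter(1, 2))
for row in list1:
    print(row)
```

Values read with the csv module are strings, so they compare lexicographically. If a field should sort numerically, it would need converting (for example with int) before or inside the key.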

edited Sep 4, 2015 at 8:54

BrechtDeMan


answered Mar 6, 2011 at 19:38

mouad


2

+1: More elegant than mine. I forgot that itemgetter can take multiple indices.
– dappawit
Commented Mar 6, 2011 at 19:43
9

operator is a module that needs to be imported.
– trapicki
Commented Aug 28, 2013 at 14:45
6

how will i proceed if i want to sort ascending on one element and descending on other, using itemgetter??.
– ashish
Commented Oct 12, 2013 at 10:13
6

@ashish, see my answer below with lambda functions this is clear, sort by "-x[1]" or even "x[0]+x[1]" if you wish
– jaap
Commented Feb 27, 2014 at 15:15
what about if one criteria in reversed mode?
– Yaser Khahani
Commented Jan 30, 2018 at 11:15


Duncan · Accepted Answer · 2015-01-22 09:44:54Z

Python has a stable sort, so provided that performance isn't an issue the simplest way is to sort it by field 2 and then sort it again by field 1.

That will give you the result you want. The only catch is that if it is a big list (or you want to sort it often), calling sort twice might be an unacceptable overhead.

import operator
list1 = sorted(csv1, key=operator.itemgetter(2))   # secondary key first
list1 = sorted(list1, key=operator.itemgetter(1))  # primary key last

Doing it this way also makes it easy to handle the situation where you want some of the columns reverse sorted: just include the reverse=True parameter when necessary.

Otherwise you can pass multiple parameters to itemgetter or manually build a tuple. That is probably going to be faster, but has the problem that it doesn't generalise well if some of the columns want to be reverse sorted (numeric columns can still be reversed by negating them but that stops the sort being stable).

So if you don't need any columns reverse sorted, go for multiple arguments to itemgetter. If you might, and the columns aren't numeric or you want to keep the sort stable, go for multiple consecutive sorts.

Edit: For the commenters who have problems understanding how this answers the original question, here is an example that shows exactly how the stable nature of the sorting ensures we can do separate sorts on each key and end up with data sorted on multiple criteria:

DATA = [
    ('Jones', 'Jane', 58),
    ('Smith', 'Anne', 30),
    ('Jones', 'Fred', 30),
    ('Smith', 'John', 60),
    ('Smith', 'Fred', 30),
    ('Jones', 'Anne', 30),
    ('Smith', 'Jane', 58),
    ('Smith', 'Twin2', 3),
    ('Jones', 'John', 60),
    ('Smith', 'Twin1', 3),
    ('Jones', 'Twin1', 3),
    ('Jones', 'Twin2', 3)
]

# Sort by Surname, Age DESCENDING, Firstname
print("Initial data in random order")
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred''')
DATA.sort(key=lambda row: row[1])

for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.''')
DATA.sort(key=lambda row: row[2], reverse=True)
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.
''')
DATA.sort(key=lambda row: row[0])
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

This is a runnable example, but to save people running it the output is:

Initial data in random order
Jones      Jane       58
Smith      Anne       30
Jones      Fred       30
Smith      John       60
Smith      Fred       30
Jones      Anne       30
Smith      Jane       58
Smith      Twin2      3
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Jones      Twin2      3

First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Jones      Jane       58
Smith      Jane       58
Smith      John       60
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.
Smith      John       60
Jones      John       60
Jones      Jane       58
Smith      Jane       58
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.

Jones      John       60
Jones      Jane       58
Jones      Anne       30
Jones      Fred       30
Jones      Twin1      3
Jones      Twin2      3
Smith      John       60
Smith      Jane       58
Smith      Anne       30
Smith      Fred       30
Smith      Twin1      3
Smith      Twin2      3

Note in particular how in the second step the reverse=True parameter keeps the firstnames in order whereas simply sorting then reversing the list would lose the desired order for the third sort key.
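The three passes can be wrapped in a small helper. This is only a sketch: `multisort` and `specs` are hypothetical names (similar in spirit to the helper shown in Python's Sorting HOWTO), and the rows are a subset of the data above.

```python
def multisort(rows, specs):
    """Sort rows in place by several (column, reverse) pairs.

    specs lists the most important key first; the passes run in the
    opposite order so that the most important key is sorted last.
    """
    for column, reverse in reversed(specs):
        rows.sort(key=lambda row: row[column], reverse=reverse)
    return rows


people = [
    ("Smith", "Anne", 30),
    ("Jones", "Fred", 30),
    ("Smith", "John", 60),
    ("Jones", "Anne", 30),
]

# Surname ascending, age descending, first name ascending.
multisort(people, [(0, False), (2, True), (1, False)])
for person in people:
    print(person)
```

Because each pass is a separate stable sort, this works for string columns too, where negating the key is not an option.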

Stable sorting doesn't mean that it won't forget what your previous sorting was. This answer is wrong. — Mike Axiak, Commented Mar 6, 2011 at 21:10
Stable sorting means that you can sort by columns a, b, c simply by sorting by column c then b then a. Unless you care to expand on your comment I think it is you that is mistaken. — Duncan, Commented Mar 6, 2011 at 21:23
This answer is definitely correct, although for larger lists it's unideal: if the list was already partially sorted, then you'll lose most of the optimization of Python's sorting by shuffling the list around a lot more. @Mike, you're incorrect; I suggest actually testing answers before declaring them wrong. — Glenn Maynard, Commented Mar 6, 2011 at 21:39
@MikeAxiak: docs.python.org/2/library/stdtypes.html#index-29 states in comment 9: Starting with Python 2.3, the sort() method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade). — trapicki, Commented Aug 28, 2013 at 14:40
This is not correct because this does not answer the question he asked. he wants a list sorted by the first index and in the case of where there are ties in the first index, he wants to use the second index as the sorting criteria. A stable sort only guarantees that all things being equal, the original order passed will be the order the items appear. — Jon, Commented Jan 20, 2015 at 2:33

rioV8 · Accepted Answer · 2020-02-15 22:52:53Z

22

list1 = sorted(csv1, key=lambda x: (x[1], x[2]) )

edited Feb 15, 2020 at 22:52

rioV8


answered Mar 6, 2011 at 19:41

dappawit


4

I don't think tuple() can receive two arguments (or rather, three, if you count with self)
– Filipe Correia
Commented Dec 12, 2012 at 23:15
3

tuple can take only one argument
– therealprashant
Commented Jun 6, 2015 at 11:02
1

return statement should be return tuple((x[1], x[2])) or simply return x[1], x[2]. Refer @jaap answer below if you're looking for sorting in different directions
– Jo Kachikaran
Commented Feb 11, 2017 at 1:40
… or tuple(x[1:3]), if you want to use the tuple constructor for some reason instead of just a tuple display list x[1], x[2]. Or keyfunc = operator.itemgetter(1, 2) and don't even write a function yourself.
– abarnert
Commented Apr 4, 2018 at 17:53
Can I do this, list1 = sorted(csv1, key=lambda x: x[1] and x[2] )? If not what would be the behaviour in this case?
– ahmetbulut
Commented Sep 23, 2021 at 22:45
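On the question in the comment above: `x[1] and x[2]` is a boolean expression that yields a single value, not a pair, so it does not sort by two fields. A short sketch with made-up rows shows the difference:

```python
# Hypothetical rows; only fields 1 and 2 matter here.
csv1 = [("a", 2, 1), ("b", 1, 3), ("c", 0, 2)]

# `x[1] and x[2]` evaluates to x[2] when x[1] is truthy, otherwise to x[1],
# so each row gets a single value as its key rather than a pair.
by_and = sorted(csv1, key=lambda x: x[1] and x[2])

# The tuple key compares field 1 first and uses field 2 only for ties.
by_tuple = sorted(csv1, key=lambda x: (x[1], x[2]))

print(by_and)
print(by_tuple)
```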


cottontail · Accepted Answer · 2022-06-01 00:42:21Z

4

employees.sort(key = lambda x:x[1])
employees.sort(key = lambda x:x[0])

We can also call .sort with a lambda twice, because Python's sort is in place and stable. This first sorts the list by the second element, x[1]. Then it sorts by the first element, x[0] (highest priority).

Here x[0] is the employee's name and x[1] is the employee's salary.

This is equivalent to doing the following:

employees.sort(key = lambda x:(x[0], x[1]))
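The equivalence can be checked on a small list. The (name, salary) rows here are hypothetical:

```python
# Hypothetical (name, salary) rows.
employees = [("Cara", 50), ("Adam", 70), ("Cara", 40), ("Adam", 60)]

# Two stable passes: least important key first, most important key last.
two_pass = list(employees)
two_pass.sort(key=lambda x: x[1])
two_pass.sort(key=lambda x: x[0])

# One pass with a tuple key.
one_pass = sorted(employees, key=lambda x: (x[0], x[1]))

print(two_pass == one_pass)
```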

edited Jun 1, 2022 at 0:42

cottontail


answered Jun 20, 2018 at 11:37

Deepak Yadav


2

no, this sorting rule need to take precedence then second.
– CodeFarmer
Commented Mar 8, 2019 at 23:43


Saurabh · Accepted Answer · 2019-02-20 02:40:58Z

3

Sorting a list of dicts as shown below sorts it in descending order, with salary as the first key and age as the second key:

d=[{'salary':123,'age':23},{'salary':123,'age':25}]
d=sorted(d, key=lambda i: (i['salary'], i['age']),reverse=True)

Output: [{'salary': 123, 'age': 25}, {'salary': 123, 'age': 23}]
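The same key can also be built with operator.itemgetter, which accepts dict keys as well as list indices. A minimal sketch, with one extra made-up record added so that both keys take part in the ordering:

```python
from operator import itemgetter

# Hypothetical records; the 200/40 entry is not in the original example.
d = [
    {"salary": 123, "age": 23},
    {"salary": 200, "age": 40},
    {"salary": 123, "age": 25},
]

# itemgetter("salary", "age") returns (record["salary"], record["age"]).
# reverse=True applies to the whole tuple, so both fields sort descending.
result = sorted(d, key=itemgetter("salary", "age"), reverse=True)
print(result)
```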

answered Feb 20, 2019 at 2:40

Saurabh



Chathura Wanniarachchi · Accepted Answer · 2022-02-05 02:49:45Z

To sort on two fields in opposite directions (here the first field descending and the second ascending), you can use a lambda function as the key.
Consider the example below:
input: [[1,2],[3,3],[2,1],[1,1],[4,1],[3,1]]
expected output: [[4, 1], [3, 1], [3, 3], [2, 1], [1, 1], [1, 2]]
code used:

    arr = [[1,2],[3,3],[2,1],[1,1],[4,1],[3,1]]
    arr.sort(key=lambda ele: (ele[0], -ele[1]), reverse=True)
    # output [[4, 1], [3, 1], [3, 3], [2, 1], [1, 1], [1, 2]]

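The negative sign is what produces the expected result: with reverse=True the first field is sorted in descending order, and negating the second field keeps it in ascending order.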
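A quick way to see what the combination of reverse=True and negation does is to compare it with the same ordering written directly. This is a sketch using the data from the example above:

```python
arr = [[1, 2], [3, 3], [2, 1], [1, 1], [4, 1], [3, 1]]

# reverse=True flips the whole key, so (ele[0], -ele[1]) reversed means
# first field descending and second field ascending.
with_reverse = sorted(arr, key=lambda ele: (ele[0], -ele[1]), reverse=True)

# The same ordering written directly, without reverse=True.
direct = sorted(arr, key=lambda ele: (-ele[0], ele[1]))

print(with_reverse == direct)
```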

taras · Accepted Answer · 2019-05-22 07:47:09Z

1

In ascending order you can use:

sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]))

or in descending order you can use:

sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]),reverse=True)

edited May 22, 2019 at 7:47

taras


answered May 22, 2019 at 7:29

Majid Arasteh



SurpriseDog · Accepted Answer · 2021-04-12 02:38:33Z

1

After reading the answers in this thread, I wrote a general solution that will work for an arbitrary number of columns:

def sort_array(array, *columns):
    for col in columns:
        array.sort(key = lambda x:x[col])

The OP would call it like this:

sort_array(list1, 2, 1)

Which sorts first by column 2, then by column 1.
(Most important column goes last)

edited Apr 12, 2021 at 2:38

answered Apr 12, 2021 at 2:33

SurpriseDog



doodzio · Accepted Answer · 2021-01-29 16:16:29Z

0

In Python 3, see https://docs.python.org/3.5/howto/sorting.html#the-old-way-using-the-cmp-parameter

from functools import cmp_to_key

def custom_compare(x, y):
    # custom comparison of x[0], x[1] with y[0], y[1]
    return 0

sorted(entries, key=lambda e: (cmp_to_key(custom_compare)(e[0]), e[1]))
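For a concrete comparison function, here is a sketch that applies cmp_to_key to whole rows rather than to a single field (an assumption about the intended use, not the answer's exact code). `compare_rows` and the sample entries are hypothetical. It sorts the first field ascending and the second descending, which also works for strings, where negating the key is not possible:

```python
from functools import cmp_to_key


def compare_rows(x, y):
    """Order by field 0 ascending, then by field 1 descending."""
    if x[0] != y[0]:
        return -1 if x[0] < y[0] else 1
    if x[1] != y[1]:
        # Swapped signs give descending order for the second field.
        return 1 if x[1] < y[1] else -1
    return 0


# Hypothetical rows with two string fields.
entries = [("b", "x"), ("a", "y"), ("b", "z"), ("a", "z")]
result = sorted(entries, key=cmp_to_key(compare_rows))
print(result)
```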

answered Jan 29, 2021 at 16:16

doodzio


