75

I was wondering what would be a Pythonic way of sorting a list of tuples by two keys, whereby sorting with one (and only one) key would be in reverse order and sorting with the other would be case insensitive. More specifically, I have a list containing tuples like:

myList = [(ele1A, ele2A),(ele1B, ele2B),(ele1C, ele2C)]

I can use the following code to sort it with two keys:

sortedList = sorted(myList, key = lambda y: (y[0].lower(), y[1]))

To sort in reverse order I can use

sortedList = sorted(myList, key = lambda y: (y[0].lower(), y[1]), reverse = True)

But this would sort in a reverse order with two keys.

1

9 Answers

74

Two keys will be used when we need to sort a list with two constraints: one in ascending order and the other in descending, in the same list or any

In your example,

sortedList = sorted(myList, key = lambda y: (y[0].lower(), y[1]))

you can sort the entire list in only one order.

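You can try these and check what's happening (negation only works when the negated element is numeric):

# first key ascending (case-insensitive), second key descending
sortedList = sorted(myList, key = lambda y: (y[0].lower(), -y[1]))
# first key descending (y[0] must be numeric), second key ascending
sortedList = sorted(myList, key = lambda y: (-y[0], y[1]))
# both keys descending (both must be numeric)
sortedList = sorted(myList, key = lambda y: (-y[0], -y[1]))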
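To make the effect of negation concrete, here is a minimal sketch. The sample data is made up for illustration, and it assumes the second element of each tuple is numeric:

```python
# Hypothetical sample data: (name, score) pairs with a numeric score.
my_list = [("bob", 2), ("Alice", 1), ("alice", 3), ("Bob", 5)]

# First key: case-insensitive, ascending.
# Second key: descending, obtained by negating the number.
sorted_list = sorted(my_list, key=lambda y: (y[0].lower(), -y[1]))

print(sorted_list)
# [('alice', 3), ('Alice', 1), ('Bob', 5), ('bob', 2)]
```

Negating a string raises TypeError, so this only helps when the key to be reversed is a number.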
2
  • 14
    This only works for lists of numbers? The OP didn't say what type the elements are. Commented Jul 19, 2021 at 18:10
  • 4
    The answer by @black-panda works for all data types -- any object type that is comparable -- and is a much much better answer. Commented Oct 8, 2021 at 7:19
70

You could create a reversor class and use it to decorate the key in question. This class could be used to reverse any field that is comparable.

class reversor:
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return other.obj == self.obj

    def __lt__(self, other):
        return other.obj < self.obj

Use it like so:

sortedList = sorted(myList, key=lambda y: (y[0].lower(), reversor(y[1])))
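As a quick check that this also works for non-numeric keys, here is a small sketch with string values in both positions. The sample data is invented for illustration; the class is repeated so the snippet runs on its own:

```python
class reversor:
    """Wrap a value so that it compares in the opposite order."""
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return other.obj == self.obj

    def __lt__(self, other):
        # Swapped operands: "less than" becomes "greater than".
        return other.obj < self.obj

# Hypothetical sample data: both elements are strings.
pairs = [("b", "x"), ("A", "y"), ("a", "z"), ("B", "w")]

# First key ascending and case-insensitive, second key descending.
result = sorted(pairs, key=lambda y: (y[0].lower(), reversor(y[1])))

print(result)
# [('a', 'z'), ('A', 'y'), ('b', 'x'), ('B', 'w')]
```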
4
  • 11
    This solution can be used with strings or other objects. It's a cleaner solution than the one that's tagged as best solution.
    – Tim Givois
    Commented Sep 26, 2019 at 0:28
  • Better decorate this with functools.total_ordering: docs.python.org/3/library/…
    – kaya3
    Commented Dec 19, 2019 at 12:19
  • 4
    Total ordering is unnecessary for use as a key parameter in the sorted function. You only need == and < Commented Dec 19, 2019 at 15:57
  • Definitely the best answer all around. For n00bs, if you are dealing with list of dicts, use y[<key>]. For list of objects use y.<property>. Commented May 2, 2022 at 1:31
6

When using Python 3, @KellyBundy made an excellent observation that the multisort method listed in the current Python docs is incredibly fast and can be used to accomplish a multi-column sort with discrete ordering. Here is a NoneType-safe version:

students = [
     {'idx': 0, 'name': 'john', 'grade': 'A', 'attend': 100}
    ,{'idx': 1, 'name': 'jane', 'grade': 'B', 'attend': 80}
    ,{'idx': 2, 'name': 'dave', 'grade': 'B', 'attend': 85}
    ,{'idx': 3, 'name': 'stu' , 'grade': None, 'attend': 85}
]

def key_grade(student):
    grade = student['grade']
    return grade is None, grade
def key_attend(student):
    attend = student['attend']
    return attend is None, attend
students_sorted = sorted(students, key=key_attend)
students_sorted.sort(key=key_grade, reverse=True)

Notes:

  • The <variable> is None check is a defensive check so that the sort does not fail on None values
  • Although this makes multiple sort calls, it is hands down the fastest multi-sort method!
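The None-safe key from the notes above can be seen in isolation in the following sketch. The list of values is made up; the point is only that pairing each value with an `is None` flag avoids comparing None with a number:

```python
# Hypothetical values containing None.
values = [3, None, 1, None, 2]

# The flag is compared first: False (a real value) sorts before True (None).
# Two None entries produce equal keys, so None is never compared with "<".
ordered = sorted(values, key=lambda v: (v is None, v))

print(ordered)
# [1, 2, 3, None, None]
```

Sorting `values` without the key would raise TypeError in Python 3, because None and int are not orderable.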

I have created a new Python Project called multisort which exposes three methodologies:

Method | Descr | Notes | speed
multisort | Simple one-liner designed after multisort example from python docs | Second fastest of the bunch but most configurable and easy to read. | 0.0035
cmp_func | Multi column sorting in the model java.util.Comparator | Reasonable speed | 0.0138
reversor | Implementation of reversor - See Black Panda's answer | Pretty slow methodology | 0.0370

For reference:

Method | speed
KellyBundy's Multisort | 0.0005
pandas | 0.0079

Note: Speed is average of 10 runs for 1000 rows with 4 columns.

Example of multisort from the multisort library :

from multisort import multisort
rows_sorted = multisort(rows_dict, [
        ('grade', True, lambda s:None if s is None else s.upper()),
        'attend',
], reverse=True)

However, for developers who come in from Java, here is an example that is similar to java.util.Comparator for use in Python 3:

from multisort import cmp_func

def cmp_student(a,b):
    k='grade'; va=a[k]; vb=b[k]
    if va != vb:
        if va is None: return -1
        if vb is None: return 1
        return -1 if va > vb else 1
    k='attend'; va=a[k]; vb=b[k]; 
    if va != vb:
        return -1 if va < vb else 1
    return 0

students_sorted = sorted(students, key=cmp_func(cmp_student))
11
  • How much slower was the method from Python's Sorting HOW TO? Commented May 2, 2022 at 5:04
  • Please provide the example from Python docs you are mentioning and I'll add it to my tests. Commented May 2, 2022 at 5:44
  • docs.python.org/3/howto/… Commented May 2, 2022 at 5:51
  • None of the examples provided there can handle None in values so they cannot be used for real world data unless you cleanse the data first so it has no None values. However, I did write up a test for the multisort. I'll add it to my gits tests. Commented May 2, 2022 at 6:19
  • Am I mistaken, or does msorted only do one sorted instead of multiple like Python's HOW TO does? Commented May 4, 2022 at 23:08
5

Method 1

A simple solution, though it might not be the most efficient, is to sort twice: the first time using the second element, the second time using the first element:

sortedList = sorted(sorted(myList, key=lambda t: t[1], reverse=True), key=lambda t: t[0])

Or, broken down:

# Sort by the second element, descending.
tempList = sorted(myList, key=lambda t: t[1], reverse=True)
# Then sort by the first element; the stable sort keeps the first pass's order for ties.
sortedList = sorted(tempList, key=lambda t: t[0])

Method 2

If your elements are numbers, you can cheat a little:

# Negating the second element reverses its order (numbers only).
sorted(myList, key=lambda t: (t[0], -t[1]))

Method 3

I recommend against this approach as it is messy, and the cmp keyword is not available in Python 3, where the comparison function has to be wrapped with functools.cmp_to_key instead.

Another approach is to swap the second elements when comparing:

from functools import cmp_to_key

def compare_func(x, y):
    # Swapping the second elements reverses their order.
    tup1 = (x[0], y[1])
    tup2 = (y[0], x[1])
    if tup1 == tup2:
        return 0
    elif tup1 > tup2:
        return 1
    else:
        return -1

sortedList = sorted(myList, key=cmp_to_key(compare_func))

Or, using a lambda to avoid writing a function:

sortedList = sorted(
    myList,
    key=cmp_to_key(lambda x, y: 0 if (x[0], y[1]) == (y[0], x[1]) else 1 if (x[0], y[1]) > (y[0], x[1]) else -1)
    )
1
  • 2
    Method 2 doesn't work for zero; it raises ZeroDivisionError. I think you meant -b.
    – wjandrea
    Commented May 2, 2022 at 0:46
4

Sometimes there is little alternative but to use a comparator function. There was a cmp argument to sorted from its introduction in 2.4, but it was removed in Python 3 in favour of the more efficient key function. In 3.2, cmp_to_key was added to functools; it creates keys from the original objects by wrapping them in an object whose comparison function is based on the cmp function. (You can see a simple definition of cmp_to_key at the end of the Sorting How-To.)
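As a rough sketch of how cmp_to_key could be applied to the question, the comparator below orders the first element case-insensitively ascending and the second element descending. The function name `compare` and the sample data are illustrative only:

```python
from functools import cmp_to_key

def compare(x, y):
    # First element: case-insensitive, ascending.
    a, b = x[0].lower(), y[0].lower()
    if a != b:
        return -1 if a < b else 1
    # Tie on the first element: second element, descending.
    if x[1] != y[1]:
        return -1 if x[1] > y[1] else 1
    return 0

# Hypothetical sample data.
items = [("b", "x"), ("A", "y"), ("a", "z")]
result = sorted(items, key=cmp_to_key(compare))

print(result)
# [('a', 'z'), ('A', 'y'), ('b', 'x')]
```

Note that the comparator lower-cases the first element on every comparison, which is the cost the next paragraph refers to.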

In your case, since lower-casing is relatively expensive, you might want to do a combination:

class case_insensitive_and_2nd_reversed:
    def __init__(self, obj, *args):
        self.first = obj[0].lower()
        self.second = obj[1]
    def __lt__(self, other):
        return self.first < other.first or self.first == other.first and other.second < self.second
    def __gt__(self, other):
        return self.first > other.first or self.first == other.first and other.second > self.second
    def __le__(self, other):
        return self.first < other.first or self.first == other.first and other.second <= self.second
    def __ge__(self, other):
        return self.first > other.first or self.first == other.first and other.second >= self.second
    def __eq__(self, other):
        return self.first == other.first and self.second == other.second
    def __ne__(self, other):
        return self.first != other.first or self.second != other.second

sortedList = sorted(myList, key = case_insensitive_and_2nd_reversed)
1

Maybe an elegant but not efficient way:

import functools

# Comparison with the sign flipped, so wrapped keys sort in descending order.
reverse_key = functools.cmp_to_key(lambda a, b: (a < b) - (a > b))
sortedList = sorted(myList, key = lambda y: (reverse_key(y[0].lower()), y[1]))
1

Underlying theory

All of the following applies to both the built-in sorted function, and the .sort method of lists.

In general, a key function for sorting can simply produce a tuple, where each element corresponds to one of the "keys" we want to use for sorting. These tuples will sort lexicographically, so this produces the desired result - elements are sorted according to the first key result, with ties broken by the second, etc.

Meanwhile, the reverse keyword argument for sorting can specify that sorting should be done in reverse order. The result is like sorting normally and then reversing, except that the sort remains stable: elements that compare equal keep their original relative order.

However, this reverse setting applies to the entire sort. It does not allow for sorting ascending by one key and then descending by a different key, or vice versa.
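A small sketch of these points, using made-up data: a tuple key sorts lexicographically, reverse=True flips the whole ordering at once, and reverse=True keeps equal elements in their original relative order, unlike reversing a sorted result.

```python
# Hypothetical records: (letter, number) pairs.
records = [("b", 1), ("a", 2), ("a", 1), ("b", 2)]

# Tuple keys compare element by element: letter first, then number.
ascending = sorted(records, key=lambda r: (r[0], r[1]))
# reverse=True reverses the ordering of both keys together.
descending = sorted(records, key=lambda r: (r[0], r[1]), reverse=True)

print(ascending)   # [('a', 1), ('a', 2), ('b', 1), ('b', 2)]
print(descending)  # [('b', 2), ('b', 1), ('a', 2), ('a', 1)]

# Stability: "bb" and "aa" have the same length and keep their input order
# with reverse=True, but not when a normally sorted list is reversed.
words = ["bb", "aa", "c"]
stable_reverse = sorted(words, key=len, reverse=True)
sorted_then_reversed = sorted(words, key=len)[::-1]

print(stable_reverse)        # ['bb', 'aa', 'c']
print(sorted_then_reversed)  # ['aa', 'bb', 'c']
```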

Example setup

It is possible to sort a list containing any sort of objects, not just nested lists/tuples; and it is possible to write key functions that process those objects in whatever manner - for example, to sort instances of a class according to the value of a specific attribute. For clarity (i.e., in order to use attribute names), I will set up a simple namedtuple and demonstrate techniques to sort a list of instances.

from collections import namedtuple
datum = namedtuple('datum', 'id age first last')
data = [
    datum(1, 23, 'Foo', 'Bar'),
    datum(2, 42, 'Baz', 'Quux'),
    # etc.
]

Special case: sorting by two numeric keys

To emulate sorting in reverse, it is sufficient to take the negative of a numeric value. Thus:

# sort ascending by id, then descending by age
data.sort(key=lambda d: (d.id, -d.age))
# equivalent, but more complex:
data.sort(key=lambda d: (-d.id, d.age), reverse=True)

Special case: sorting by at most one non-numeric key

If there is only one non-numeric key, choosing whether or not to use reverse allows us to avoid the issue that only numeric keys can be negated in this way:

# sort ascending by first name, then descending by id
data.sort(key=lambda d: (d.first, -d.id))
# sort ascending by age, then descending by last name
# since the name can't be negated, `reverse` is needed;
# this implies in turn that the age values should be negated.
data.sort(key=lambda d: (-d.age, d.last), reverse=True)

Using a wrapper to negate values

A more general approach is to create a wrapper class negated, with the semantics that negated(x) < negated(y) if and only if x > y. This is the approach taken in black panda's answer. Thus:

class negated: # name changed; otherwise the same
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return other.obj == self.obj

    def __lt__(self, other):
        return other.obj < self.obj

# Sort descending by last name, then ascending by first name.
data.sort(key=lambda d: (negated(d.last), d.first))

More sophisticated: adapting a function rather than a value

Suppose that there is some existing key function my_key, and we want to sort descending by its results, then ascending by some other key. Rather than rewriting my_key, we can adapt it like so:

def negated_result(func):
    return lambda x: negated(func(x))

# Which now allows:
data.sort(key=lambda d: (negated_result(my_key)(d), d.id))

Since negated_result accepts a function and returns a function, it can also be used as a decorator.
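A minimal sketch of the decorator form follows. The key function `by_last` and the sample rows are hypothetical; `negated` and `negated_result` are repeated so the snippet runs on its own:

```python
from collections import namedtuple

class negated:
    """Wrap a value so that it compares in the opposite order."""
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return other.obj == self.obj

    def __lt__(self, other):
        return other.obj < self.obj

def negated_result(func):
    # Return a key function whose results sort in reverse order.
    return lambda x: negated(func(x))

datum = namedtuple('datum', 'id age first last')

@negated_result
def by_last(d):  # hypothetical key function, decorated to sort descending
    return d.last

data = [
    datum(1, 23, 'Foo', 'Bar'),
    datum(2, 42, 'Baz', 'Quux'),
    datum(3, 30, 'Abe', 'Quux'),
]

# Descending by last name, then ascending by first name.
data.sort(key=lambda d: (by_last(d), d.first))

print([d.id for d in data])
# [3, 2, 1]
```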

If all else fails: repeated sorting per-key

Since Python's built-in sort is guaranteed stable, we can simply sort on the second key, and then on the first:

# Sort "by my_key descending, then id ascending", by doing the steps
# the other way around.
data.sort(key=lambda d: d.id)
data.sort(key=my_key, reverse=True)

The idea is that the sub-ordering will be preserved while applying the main ordering. It's a bit tricky to remember to do this in reverse order, so a wrapper function might be desired. For example:

# Use the `operator` module to avoid writing lambdas for simple accesses.
# This is not much simpler, but arguably more explicit.
from operator import attrgetter

# Give the sort orderings nicer names.
# See: https://stackoverflow.com/questions/31509401
from enum import Flag

class SortOrder(Flag):
    DESCENDING = True
    ASCENDING = False

def multi_sort(a_list, *specs):
    '''Sort by multiple, optionally reversed keys.
    specs -> a sequence of (func, bool) tuples.
             Each tuple specifies a key func to use for sorting,
             and whether or not to reverse the sort.'''
    for key, reverse in reversed(specs):
        # The enum value must be converted explicitly to work.
        a_list.sort(key=key, reverse=bool(reverse))

# Now the same sort looks like:
multi_sort(
    data, 
    (my_key, SortOrder.DESCENDING),
    (attrgetter('id'), SortOrder.ASCENDING)
)
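To see the repeated-sort wrapper on concrete data, here is a self-contained sketch. It passes plain booleans instead of the enum, and uses the age attribute as a stand-in for my_key; both choices, and the sample rows, are assumptions made for illustration:

```python
from collections import namedtuple
from operator import attrgetter

datum = namedtuple('datum', 'id age first last')

def multi_sort(a_list, *specs):
    """Sort in place by (key_func, reverse) pairs, highest priority first."""
    # The lowest-priority key is applied first; because the sort is stable,
    # later passes preserve that order among equal elements.
    for key, reverse in reversed(specs):
        a_list.sort(key=key, reverse=bool(reverse))

# Hypothetical rows: two of them share the same age.
data = [
    datum(1, 23, 'Foo', 'Bar'),
    datum(2, 42, 'Baz', 'Quux'),
    datum(3, 23, 'Abe', 'Zed'),
]

# Age descending, then id ascending.
multi_sort(data, (attrgetter('age'), True), (attrgetter('id'), False))

print([d.id for d in data])
# [2, 1, 3]
```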
4
  • I don't understand the point of negated_result. Why not just do data.sort(lambda d: (negated(my_key(d)), d.id))?
    – wjandrea
    Commented Apr 27, 2023 at 21:03
  • "It's equivalent to sorting normally, and then reversing the result" -- That's incorrect; that wouldn't be stable, it would actually reverse it. E.g. L = [1, 0, 1.0], sorted(L, reverse=True)[1, 1.0, 0] vs sorted(L)[::-1][1.0, 1, 0].
    – wjandrea
    Commented Apr 27, 2023 at 21:04
  • To reverse the result and preserve stability, you'd have to reverse the input too, before sorting.
    – wjandrea
    Commented Apr 27, 2023 at 21:20
  • @wjandrea Good points throughout. I see some other typos as well. I'll add fixing this to my todo list. Commented Apr 28, 2023 at 1:55
0

At least in my case, it was possible to simply call X.sort() twice, with differing parameters, and one time in reverse, the other not. All I had to do was pay attention to the priority of the sorts - do the higher-priority sort last.

So for example, I had a list of strings, and I wanted to sort by length from longest to shortest and then alphabetically if strings were the same length.
That translates to:

lst = ["Bbbb", "Aaaa", "Ddd", "Cc"]
lst.sort()  # no extra arguments necessary for alphabetical sorting
# lst = ["Aaaa", "Bbbb", "Cc", "Ddd"]
lst.sort(key=len, reverse=True) # sort by length, which is higher priority, so last
# lst = ["Aaaa", "Bbbb", "Ddd", "Cc"]
1
0
data = [
    {"name": "Alok", "status": "offline"},
    {"name": "Bablu", "status": "online"},
    {"name": "ravi", "status": "offline"},
    {"name": "Rani", "status": "online"},
    {"name": "John", "status": "offline"},
    {"name": "Alice", "status": "online"},
    {"name": "smith", "status": "offline"},
    {"name": "Emma", "status": "online"},
    {"name": "David", "status": "offline"},
    {"name": "Olivia", "status": "online"}
]

sorted_data = sorted(data, key=lambda x: (-1 if x['status'] == 'online' else 1, x['name'].lower()))

print(sorted_data)

Output

[{'name': 'Alice', 'status': 'online'}, {'name': 'Bablu', 'status': 'online'}, {'name': 'Emma', 'status': 'online'}, {'name': 'Olivia', 'status': 'online'}, {'name': 'Rani', 'status': 'online'}, {'name': 'Alok', 'status': 'offline'}, {'name': 'David', 'status': 'offline'}, {'name': 'John', 'status': 'offline'}, {'name': 'ravi', 'status': 'offline'}, {'name': 'smith', 'status': 'offline'}]

This code sorts the list of dictionaries by two criteria:

  1. First, it sorts by the status of each dictionary item. Items with the status "online" are sorted before items with the status "offline".
  2. If the status is the same, it then sorts alphabetically by the name field, ignoring case.

So, the output of this code will be a list where items with the status "online" come before items with the status "offline", and within each status group, names are sorted alphabetically.
