
With SQLite, a SELECT ... FROM query returns its results in output, which prints as:

>>> print output
[(12.2817, 12.2817), (0, 0), (8.52, 8.52)]

It seems to be a list of tuples. I would like to convert output either to a simple list:

[12.2817, 12.2817, 0, 0, 8.52, 8.52]

or a 3x2 matrix:

12.2817  12.2817
0        0
8.52     8.52

to be read via output[i][j].

The flatten command does not do the job for the first option, and I have no idea how to approach the second one.

A fast solution would be appreciated, as the real data is much bigger.


11 Answers

161

By far the fastest (and shortest) solution posted:

list(sum(output, ()))

About 50% faster than the itertools solution, and about 70% faster than the map solution.
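
To illustrate the start argument that the comments on this answer discuss, here is a minimal sketch (not part of the original answer). It assumes the sample output from the question and spells out what sum(output, ()) does: it begins with the empty tuple and concatenates each row onto it.

```python
output = [(12.2817, 12.2817), (0, 0), (8.52, 8.52)]

# sum(iterable, start) computes start + item0 + item1 + ...
# With start=() every "+" is tuple concatenation instead of numeric addition.
step_by_step = ()
for row in output:
    step_by_step = step_by_step + row  # builds a brand-new tuple on every pass

flat = list(sum(output, ()))
print(flat)
```

Because every concatenation copies the accumulated tuple, the work grows with the square of the number of rows, which is the concern raised in the comments.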

6
  • 12
    @Joel nice, but I wonder how it works? list(output[0]+output[1]+output[2]) gives the desired result, but list(sum(output)) does not. Why? What "magic" does the () do?
    – Kyss Tao
    Commented May 5, 2013 at 18:39
  • 10
    OK, I should have read the manual. It seems that with sum(sequence[, start]), sum adds the items to start, which defaults to 0, rather than starting from sequence[0] (if it exists) and then adding the rest of the elements. Sorry for bothering you.
    – Kyss Tao
    Commented May 5, 2013 at 18:44
  • 4
    This is a well-known anti-pattern: don't use sum to concatenate sequences, it results in a quadratic time algorithm. Indeed, the sum function will complain if you try to do this with strings!
    Commented Apr 11, 2018 at 22:42
  • 20
    Yes, fast but completely obtuse. You'd have to leave a comment as to what it is actually doing :(
    – CpILL
    Commented May 22, 2018 at 21:30
  • 1
    See my answer for a comparison of this to other techniques which are faster and more Pythonic IMO.
    – Gman
    Commented Feb 11, 2019 at 21:52
79

A list comprehension works with any iterable type and, in the timings below, is faster than the other methods shown here.

flattened = [item for sublist in l for item in sublist]

l is the list to flatten (called output in the OP's case)
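
For readers puzzled by the order of the for clauses, the following sketch (an illustration added here, using made-up sample data) shows the comprehension next to the nested loops it corresponds to. The clauses are read left to right, outermost loop first, exactly as they would be written as statements.

```python
l = [(1, 2), (3, 4), (5, 6)]  # sample data, assumed for illustration

flattened = [item for sublist in l for item in sublist]

# The same logic as explicit loops: the first "for" is the outer loop,
# the second "for" is the inner loop, and "item" is what gets collected.
expanded = []
for sublist in l:
    for item in sublist:
        expanded.append(item)

print(flattened)
```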


timeit tests:

l = list(zip(range(99), range(99)))  # list of tuples to flatten

List comprehension

[item for sublist in l for item in sublist]

timeit result = 7.67 µs ± 129 ns per loop

List extend() method

flattened = []
list(flattened.extend(item) for item in l)

timeit result = 11 µs ± 433 ns per loop

sum()

list(sum(l, ()))

timeit result = 24.2 µs ± 269 ns per loop
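
A rough way to rerun this comparison with the standard timeit module is sketched below. The function names and the choice of number=1000 are assumptions made for illustration, and absolute timings will differ from machine to machine.

```python
import timeit

l = list(zip(range(99), range(99)))  # same test data as above

def with_comprehension():
    return [item for sublist in l for item in sublist]

def with_extend():
    flattened = []
    # list(...) only drives the generator; the result is collected in flattened
    list(flattened.extend(item) for item in l)
    return flattened

def with_sum():
    return list(sum(l, ()))

# number=1000 is an arbitrary repeat count chosen for this sketch
for func in (with_comprehension, with_extend, with_sum):
    seconds = timeit.timeit(func, number=1000)
    print(f"{func.__name__}: {seconds / 1000 * 1e6:.2f} us per loop")
```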

3
  • 2
    I had to use this on a large dataset, the list comprehension method was by far the fastest!
    – nbeuchat
    Commented Aug 16, 2018 at 6:36
  • I made a small change to the .extend solution and it now performs a bit better. Check it against your timeit results to compare.
    – Totoro
    Commented Nov 20, 2018 at 17:00
  • This is very confusing and I don't understand the syntax here at all. The general syntax for a list comprehension is expression for item in list, like x*2 for x in listONumbers. So for flattening you would expect an expression like num for num in sublist for sublist in list, not num for sublist in list for num in sublist. How is the comprehension broken down here?
    – Cheruvim
    Commented Mar 10, 2023 at 1:06
31

In Python 2.7 and all versions of Python 3, you can use itertools.chain to flatten a list of iterables, either with the * unpacking syntax or with the chain.from_iterable class method.

>>> t = [ (1,2), (3,4), (5,6) ]
>>> t
[(1, 2), (3, 4), (5, 6)]
>>> import itertools
>>> list(itertools.chain(*t))
[1, 2, 3, 4, 5, 6]
>>> list(itertools.chain.from_iterable(t))
[1, 2, 3, 4, 5, 6]
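
One practical difference between the two forms, sketched here with a hypothetical rows() generator: chain(*t) has to unpack every row into arguments before it starts, while chain.from_iterable pulls rows one at a time, so it also suits lazily produced data.

```python
import itertools

def rows():
    # Hypothetical generator standing in for rows that arrive lazily
    yield (1, 2)
    yield (3, 4)
    yield (5, 6)

# from_iterable consumes the generator row by row, only as items are requested
lazy = itertools.chain.from_iterable(rows())
first_three = list(itertools.islice(lazy, 3))
rest = list(lazy)

print(first_three, rest)
```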
16

Update: flattening using extend, but without a comprehension and without using list as the iterator (fastest)

After seeing the answer that provided a faster solution via a list comprehension with a double for, I made a small tweak and this approach now performs better. First, the call to list(...) was taking a large share of the time; then replacing the list comprehension with a plain loop shaved off a bit more.

The new solution is:

l = []
for row in output: l.extend(row)

The old one, replacing list(...) with [...] (a bit slower, but not by much):

[l.extend(row) for row in output]

Older (slower):

Flattening with a generator expression passed to list()

l = []
list(l.extend(row) for row in output)

Some timeits for the new extend loop, and the improvement gained by just replacing list(...) with [...]:

import timeit
t = timeit.timeit
o = "output=list(zip(range(10000000), range(10000000))); l=[]"
steps_ext = "for row in output: l.extend(row)"
steps_ext_old = "list(l.extend(row) for row in output)"
steps_ext_remove_list = "[l.extend(row) for row in output]"
steps_com = "[item for sublist in output for item in sublist]"

print(f"{steps_ext}\n>>>{t(steps_ext, setup=o, number=10)}")
print(f"{steps_ext_remove_list}\n>>>{t(steps_ext_remove_list, setup=o, number=10)}")
print(f"{steps_com}\n>>>{t(steps_com, setup=o, number=10)}")
print(f"{steps_ext_old}\n>>>{t(steps_ext_old, setup=o, number=10)}")

timeit results:

for row in output: l.extend(row)
>>> 7.022608777000187

[l.extend(row) for row in output]
>>> 9.155910597999991

[item for sublist in output for item in sublist]
>>> 9.920002304000036

list(l.extend(row) for row in output)
>>> 10.703829122000116
9
>>> flat_list = []
>>> nested_list = [(1, 2, 4), (0, 9)]
>>> for a_tuple in nested_list:
...     flat_list.extend(list(a_tuple))
... 
>>> flat_list
[1, 2, 4, 0, 9]
>>> 

You can easily convert a list of tuples to a single list as shown above.

9

Use itertools.chain:

>>> import itertools
>>> list(itertools.chain.from_iterable([(12.2817, 12.2817), (0, 0), (8.52, 8.52)]))
[12.2817, 12.2817, 0, 0, 8.52, 8.52]
7

Or you can flatten the list like this:

from functools import reduce  # a builtin in Python 2; must be imported in Python 3
reduce(lambda x, y: x + y, map(list, output))
1
  • reduce(lambda x,y:x+y, output) seems to work directly, producing one long tuple (which can then be converted to a list). Why use map(list, output) inside the reduce() call? Maybe it's more in line with the fact that tuples are immutable and lists are mutable.
    Commented Mar 20, 2019 at 14:47
5

This is what NumPy was made for, from both a data-structure and a speed perspective.

import numpy as np

output = [(12.2817, 12.2817), (0, 0), (8.52, 8.52)]
output_ary = np.array(output)   # this is your matrix 
output_vec = output_ary.ravel() # this is your 1d-array
3

For arbitrarily nested lists (just in case):

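def flatten(lst):
    result = []
    for element in lst:
        # Recurse into nested iterables, but treat strings as single values:
        # in Python 3, str has __iter__ and would otherwise recurse forever.
        if hasattr(element, '__iter__') and not isinstance(element, (str, bytes)):
            result.extend(flatten(element))
        else:
            result.append(element)
    return result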

>>> flatten(output)
[12.2817, 12.2817, 0, 0, 8.52, 8.52]
3
def flatten_tuple_list(tuples):
    return list(sum(tuples, ()))


tuples = [(5, 6), (6, 7, 8, 9), (3,)]
print(flatten_tuple_list(tuples))
2
  • 5
    Thank you for contributing an answer. Would you kindly edit your answer to include an explanation of your code? That will help future readers better understand what is going on, especially those members of the community who are new to the language and struggling to understand the concepts. That's especially useful here, where your answer is competing for attention with nine other answers. What distinguishes yours? When might this be preferred over the well-established answers above?
    Commented Feb 6, 2021 at 1:02
  • 1
    OK, sure, I will do that.
    Commented Feb 7, 2021 at 4:54
1

The question mentions that the list of tuples (output) is returned by an SQLite SELECT ... FROM query.

Instead of flattening the returned output, you could change how the sqlite3 connection returns rows by setting row_factory, so that you get a matrix (a list of lists) of numeric values instead of a list of tuples:

import sqlite3 as db

conn = db.connect('...')
conn.row_factory = lambda cursor, row: list(row) # This will convert the tuple to list.
c = conn.cursor()
output = c.execute('SELECT ... FROM ...').fetchall()
print(output)
# Should print [[12.2817, 12.2817], [0, 0], [8.52, 8.52]]
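
A self-contained version of the same idea, using an in-memory database, might look like the sketch below. The readings table and its columns a and b are hypothetical names invented for this example. It also shows a second, assumed variation: for a single-column query, a row_factory that returns row[0] yields a flat list directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and columns, used only for this illustration
conn.execute("CREATE TABLE readings (a REAL, b REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(12.2817, 12.2817), (0, 0), (8.52, 8.52)],
)

# Each row comes back as a list, so fetchall() gives a list of lists
conn.row_factory = lambda cursor, row: list(row)
matrix = conn.execute("SELECT a, b FROM readings ORDER BY rowid").fetchall()

# For a one-column query, unwrapping the row gives a flat list
conn.row_factory = lambda cursor, row: row[0]
column = conn.execute("SELECT a FROM readings ORDER BY rowid").fetchall()

print(matrix)
print(column)
conn.close()
```

The row_factory applies to cursors created after it is set, which is why each query above runs through a fresh conn.execute call.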
