5

Here is a piece of code from Allen Downey's Think Bayes book on GitHub:

import csv

def ReadData(filename='showcases.2011.csv'):
    """Reads a CSV file of data.
    Args:
      filename: string filename
    Returns: sequence of (price1 price2 bid1 bid2 diff1 diff2) tuples
    """
    fp = open(filename)
    reader = csv.reader(fp)
    res = []

    for t in reader:
        _heading = t[0]
        data = t[1:]
        try:
            data = [int(x) for x in data]
            # print heading, data[0], len(data)
            res.append(data)
        except ValueError:
            pass

    fp.close()
    return zip(*res)

The entire file can be seen on GitHub.

I am trying to figure out what zip(*res) means on the last line of the code. Specifically:

  1. What does a '*' do when used as a prefix?
  2. What does the zip function do to (*anything)?

I am new to Python, so I may be asking something obvious. I see the author's note in the function's docstring, that it returns a sequence of (price1 price2 ...), but it is less than clear to me.

UPDATE: Following up on James Rettie's answer, here is what I get when I run the code he provided in Python 3.6:

In [51]: zip(['a', 'b', 'c'], [1, 2, 3])
Out[51]: <zip at 0x1118af848>

Whereas running the same code in Python 2.7 yields the results that he provided as shown below:

In [2]: zip(['a', 'b', 'c'], [1, 2, 3])
Out[2]: [('a', 1), ('b', 2), ('c', 3)]

Can you explain why? The difference between Python 2.7 and Python 3.6 is important to me since I still have to support Python 2.7, but I would like to move to 3.6.

3
  • Thanks for pointing me to this link. It does answer my questions above.
    – Bharat
    Commented May 29, 2017 at 0:15
  • For Python 3, use list(zip(*x)). This also works in Python 2, because applying list to a list simply returns an equal copy of it.
    Commented May 30, 2017 at 0:21
  • Nice! That will work for me. Thanks.
    – Bharat
    Commented May 30, 2017 at 0:30

2 Answers

11

In Python, a * prefix on an argument in a function call is often called the 'splat' operator. It unpacks an iterable into separate positional arguments. For example: foo(*[1, 2, 3]) is the same as foo(1, 2, 3).
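
A minimal sketch of that unpacking. The three-argument function foo here is hypothetical and exists only to make the equivalence visible:

```python
def foo(a, b, c):
    """Hypothetical function used only to illustrate argument unpacking."""
    return (a, b, c)

args = [1, 2, 3]

# The * prefix spreads the list into three positional arguments.
unpacked = foo(*args)
explicit = foo(1, 2, 3)
print(unpacked == explicit)  # True
```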

The zip() function takes n iterables and produces y tuples, where y is the length of the shortest iterable provided. The i-th tuple contains the i-th element of each of the iterables.
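
To make the "shortest iterable" rule concrete, here is a small sketch with inputs of unequal length (the variable names are illustrative only):

```python
letters = ['a', 'b', 'c']
numbers = [1, 2]  # deliberately shorter than letters

# list() forces the result, so it prints the same way in Python 2 and Python 3.
pairs = list(zip(letters, numbers))
print(pairs)  # [('a', 1), ('b', 2)]
```

The unmatched 'c' is silently dropped, which is worth keeping in mind if the rows of your data can differ in length.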

For example:

zip(['a', 'b', 'c'], [1, 2, 3])

Will yield

('a', 1) ('b', 2) ('c', 3)

For a nested list like res in the example you provided, calling zip(*res) will do something like this:

res = [['a', 'b', 'c'], [1, 2, 3]]
zip(*res)
# this is the same as calling zip(['a', 'b', 'c'], [1, 2, 3])
('a', 1)
('b', 2)
('c', 3)
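
Regarding the update in the question: in Python 3, zip returns a lazy iterator rather than a list, so the tuples only appear once the result is consumed. A small sketch, using only built-ins, that should behave the same under Python 2.7 and Python 3.x:

```python
res = [['a', 'b', 'c'], [1, 2, 3]]

# Python 2: zip returns a list. Python 3: zip returns an iterator.
# Wrapping the call in list() gives a list of tuples in both versions.
columns = list(zip(*res))
print(columns)  # [('a', 1), ('b', 2), ('c', 3)]
```

In Python 2 the extra list() call just makes a copy, so the only cost of writing it this way is one additional pass over the data.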
1

zip(*res) transposes a matrix (a 2-d array or list of lists). The * operator unpacks the rows of the matrix into separate arguments, and zip then groups the elements of those rows column-wise:

> x = [('a', 'b', 'c'), (1, 2, 3)]
> zip(*x)
[('a', 1), ('b', 2), ('c', 3)]

Imagine mirroring the matrix on the diagonal.
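
As a rough illustration of the transpose in the context of ReadData, the sketch below assumes each parsed CSV row holds one quantity (such as price1) across several shows. The numbers are made up and are not taken from the real file:

```python
# Hypothetical rows shaped like the parsed CSV lines in ReadData:
# each row holds one quantity across several shows.
rows = [
    [50969, 21901, 32815],  # made-up values standing in for price1
    [45429, 34061, 53186],  # made-up values standing in for price2
]

# Transpose: one tuple per show, holding (price1, price2).
per_show = list(zip(*rows))
print(per_show)

# Transposing a second time recovers the original rows.
back = [list(col) for col in zip(*per_show)]
print(back == rows)  # True
```

Under that assumption, this is why the docstring describes the return value as a sequence of (price1 price2 ...) tuples: each tuple is one column of the file.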
