25
>>> rows = [['']*5]*5
>>> rows
[['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', '']]
>>> rows[0][0] = 'x'

Naturally, I expect rows to become:

[['x', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', ''], ['', '', '', '', '']]

Instead, I get:

[['x', '', '', '', ''], ['x', '', '', '', ''], ['x', '', '', '', ''], ['x', '', '', '', ''], ['x', '', '', '', '']]

It seems that the elements of the rows list are pointers to the same ['']*5 list. Why does it work this way, and is this a Python feature?

2
  • As a side note, if I create list through list comprehension syntax, I get the "properly working one": rows = [['' for x in range(5)] for y in range(5)]
    – xyzman
    Commented Jan 11, 2012 at 16:19
  • This also "works": rows = [['']*5 for y in range(5)]
    – xyzman
    Commented Jan 11, 2012 at 16:21

3 Answers

27

The behaviour is not specific to the repetition operator (*). For example, if you concatenate two lists using +, the behaviour is the same:

In [1]: a = [[1]]

In [2]: b = a + a

In [3]: b
Out[3]: [[1], [1]]

In [4]: b[0][0] = 10

In [5]: b
Out[5]: [[10], [10]]

This happens because lists are objects, and a list holds references to its elements rather than copies of them. When you use * or +, it is the references that get repeated, not the objects they refer to, hence the behaviour that you're seeing.
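As a rough sketch of why the inner ['']*5 is harmless while the outer *5 is not (the names flat, inner and shared are only for illustration): assigning to a slot of a flat list of strings rebinds that one slot, whereas assigning through rows[0][0] mutates the single inner list that every slot of rows refers to.

```python
# A flat list of immutable strings: item assignment rebinds one slot only.
flat = [''] * 5
flat[0] = 'x'
# flat is now ['x', '', '', '', '']

# A list built by repeating a list: every slot refers to the same inner list.
inner = [''] * 5
rows = [inner] * 5
rows[0][0] = 'x'  # mutates the one shared inner list

# True when every element of rows is the very same object as inner.
shared = all(row is inner for row in rows)
```

Here shared comes out True, which is why the change made through rows[0] shows up in every row.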

The following demonstrates that all elements of rows have the same identity (i.e. memory address in CPython):

In [6]: rows = [['']*5]*5

In [7]: for row in rows:
   ...:     print(id(row))
   ...:
15975992
15975992
15975992
15975992
15975992

The following is equivalent to your example except it creates five distinct lists for the rows:

rows = [['']*5 for i in range(5)]
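
If you want to check this yourself, something along these lines should do (distinct_ids is just an illustrative name): the comprehension evaluates ['']*5 once per iteration, so each row is a separate list object.

```python
rows = [[''] * 5 for i in range(5)]

# Count how many different row objects there are; all five rows are alive
# at the same time, so their ids cannot collide.
distinct_ids = len({id(row) for row in rows})

rows[0][0] = 'x'  # only the first row changes
```

After this, distinct_ids is 5 and only rows[0] contains 'x'.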
8

The fact that names, function parameters, and containers have reference semantics is a very basic design decision in Python. It affects the way Python works in many aspects, and you picked just one of these aspects. In many cases, reference semantics are more convenient, while in other cases copies would be more convenient. In Python, you can always explicitly create a copy if needed, or, in this case, use a list comprehension instead:

rows = [[''] * 5 for i in range(5)]
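
For the "explicitly create a copy" route, a minimal sketch could look like this, assuming you start from an existing row that you want to duplicate (template is a made-up name; list.copy() needs Python 3.3 or later, and list(template) or template[:] do the same on older versions):

```python
template = [''] * 5

# Each call to copy() returns a new list with the same elements,
# so the rows do not share storage with each other or with template.
rows = [template.copy() for i in range(5)]

rows[0][0] = 'x'  # affects rows[0] only
```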

You could design a programming language with different semantics, and there are many languages that do have different semantics, as well as languages with similar semantics. Why this decision was made is a bit hard to answer -- a language just has to have some semantics, and you can always ask why. You could just as well ask why Python is dynamically typed, and in the end the answer is that this is just what Guido decided way back in 1989.

6
  • I know this is very old but.. how do i do this without getting a warning that i is unused?
    – T_01
    Commented Apr 13, 2018 at 21:57
  • @T_01 Python doesn't give you such a warning – it must come from your IDE or linter or whatever. And in general, a warning that a variable is unused disappears if you, well, use it for something. Commented Apr 14, 2018 at 16:25
  • i found a way, just use for _ in range(5)
    – T_01
    Commented Apr 20, 2018 at 0:58
  • @T_01 Yeah, many linters accept _ as an unused variable. I really dislike that pattern, since it tends to make people think that _ is special in some way, which it isn't. I recommend using unused or dummy or just i instead, and cajole your linter into accepting it. You shouldn't write worse code just because your linter tells you so. Commented Apr 20, 2018 at 10:15
  • what is a linter
    – T_01
    Commented Apr 26, 2018 at 20:01
5

You are correct that Python is using pointers "under the hood", and yes, this is a feature. I don't know for sure why they did it this way; I assume it was for speed and to reduce memory usage.

This issue is, by the way, why it is critical to understand the distinction between shallow copies and deep copies.
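
To illustrate the distinction with the standard copy module (the grid, shallow and deep names are only for this example): a shallow copy makes a new outer list that still refers to the same inner lists, while a deep copy duplicates the inner lists as well.

```python
import copy

grid = [['a', 'b'], ['c', 'd']]

shallow = copy.copy(grid)    # new outer list, same inner lists
deep = copy.deepcopy(grid)   # new outer list and new inner lists

grid[0][0] = 'x'  # mutate an inner list through the original

# shallow[0] is the same object as grid[0], so it sees the change:
#   shallow[0][0] == 'x'
# deep[0] is an independent list, so it does not:
#   deep[0][0] == 'a'
```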
