131

I found something strange while debugging some code. Apparently,

>>> (0-6) is -6
False

but,

>>> (0-5) is -5
True

Why does this happen?

2
  • 27
    What led you to use is in the first place? It's not something that should be often used in Python, apart from the is/is not None case. Commented Jul 13, 2012 at 18:35
  • 4
    @Russel's comment hits the nail on the head -- the problem is that someone was apparently using "is" to compare numbers and expected it to behave like ==, which it does not.
    – LarsH
    Commented Jul 13, 2012 at 20:11

4 Answers

155

In CPython, all integers from -5 to 256 inclusive are cached as global objects, so every occurrence of one of these values refers to the same object at the same address. That is why the is test passes for -5 but not for -6.

This artifact is explained in detail in http://www.laurentluce.com/posts/python-integer-objects-implementation/, and the current source code can be checked at http://hg.python.org/cpython/file/tip/Objects/longobject.c.

A specific structure is used to refer small integers and share them so access is fast. It is an array of 262 pointers to integer objects. Those integer objects are allocated during initialization in a block of integer objects we saw above. The small integers range is from -5 to 256. Many Python programs spend a lot of time using integers in that range so this is a smart decision.

This is only an implementation detail of CPython, and you shouldn't rely on it. For instance, PyPy derives the id of an integer from its value, so (0-6) is -6 is always True there even if the two are "different objects" internally; PyPy also lets you configure whether to enable this integer caching, and even set the lower and upper bounds. In general, though, objects obtained from different sources are not guaranteed to be identical. If you want to compare for equality, just use ==.
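A minimal sketch of the difference between the two operators. The helper name compare is made up for illustration, and the integers are built with int(str(n)) at run time so that the compiler cannot merge two literals into one shared constant. The "equal" result is the same on any Python implementation; the "identical" result is the implementation detail discussed above, so no particular output is promised for it.

```python
def compare(n):
    """Build the value n twice at run time and compare the two results."""
    a = int(str(n))
    b = int(str(n))
    return {"equal": a == b, "identical": a is b}


# "equal" is always True; "identical" depends on the interpreter's caching.
for n in (-6, -5, 0, 256, 257):
    print(n, compare(n))
```

On a CPython build with the cache described above, one would expect "identical" to flip exactly at the -5 and 256 boundaries, but only the == column is something code should depend on.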

4
  • 1
    Interesting skew to the positive side. The article says many Python programs spend a lot of time using integers in that range, so the devs probably have measured it somehow. I guess negative number literals are only used for error codes these days... Commented Jul 13, 2012 at 18:34
  • Note that PyPy has different promises on is though (despite not doing caching) - pypy.readthedocs.org/en/latest/…
    – fijal
    Commented Jul 16, 2012 at 9:26
  • Just a side note: there is a little bug in the quoted post and hence in the quotation (while the first sentence of the answer is correct)—the range is -5 to 256, not 257 as zero is counted as a positive integer.
    – kirelagin
    Commented May 26, 2013 at 22:07
  • 3
    @kirelagin Maybe that's intended. In Python, range(m, n) means the integer interval [m, n), i.e. m, m + 1, m + 2, ..., n - 1. It doesn't include n, so range(-5, 257) doesn't contain 257 and the described behaviour is true for this range.
    – 0xc0de
    Commented Oct 31, 2017 at 5:46
30

CPython keeps the integers from -5 to 256 in a pool inside the interpreter, and any operation that produces one of these values returns the pooled object. That's why (0-5) and -5 are the same object, while (0-6) and -6 are not: values outside the pool are created on the spot.

Here's the relevant definition in the CPython source code:

#define NSMALLPOSINTS           257
#define NSMALLNEGINTS           5
static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

(view CPython source code: /trunk/Objects/intobject.c). The source code includes the following comment:

/* References to small integers are saved in this array so that they
   can be shared.
   The integers that are saved are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/

The is operator therefore reports the two -5 values as identical, because they are the same object (same memory location), whereas the two -6 values are freshly created objects at different memory locations, so is returns False. Note that the 257 in the source code above is the count of cached non-negative integers, i.e. 0 to 256 inclusive.
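The bookkeeping implied by those two constants can be modelled in a few lines of Python. This is only a sketch: the function names are hypothetical, and the index formula value + NSMALLNEGINTS is an assumption about how the array is addressed, since the quoted snippet shows the declaration but not the lookup.

```python
NSMALLPOSINTS = 257  # number of cached non-negative values: 0 .. 256
NSMALLNEGINTS = 5    # number of cached negative values: -5 .. -1


def is_small_int(value):
    """True if value falls in the cached range [-NSMALLNEGINTS, NSMALLPOSINTS)."""
    return -NSMALLNEGINTS <= value < NSMALLPOSINTS


def small_int_index(value):
    """Slot that value would occupy in the small_ints array (assumed layout)."""
    if not is_small_int(value):
        raise ValueError("%d is outside the cached range" % value)
    return value + NSMALLNEGINTS


print(NSMALLNEGINTS + NSMALLPOSINTS)  # 262 slots in total
print(small_int_index(-5))            # 0, the first slot
print(small_int_index(256))           # 261, the last slot
print(is_small_int(-6), is_small_int(257))  # False False
```

The half-open upper bound is what makes 256 the largest cached value even though the constant is 257.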


27

It's not a bug. is is not an equality test. == will give the expected results.

The technical reason for this behavior is that a Python implementation is free to treat different instances of the same constant value as either the same object or as different objects. The Python implementation you're using chooses to make certain small constants share one object to save memory. You can't rely on this behavior staying the same from version to version or across different Python implementations.

1
  • 2
    > is is not an equality test. This. is is an identity test, to see if two objects are the exact same ones. It just so happens that in the CPython implementation, some int objects are cached. Commented Jul 13, 2012 at 18:41
17

This happens because CPython caches some small integers and some short strings, and hands out the same object (and therefore the same id()) every time one of those values is needed.

(0-5) and -5 have the same id(), which is not true for (0-6) and -6:

>>> id((0-6))
12064324
>>> id((-6))
12064276
>>> id((0-5))
10022392
>>> id((-5))
10022392
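The link between id() and is can be checked directly. This is a small sketch, assuming only that both objects are kept alive in variables while they are compared; ids of temporaries printed in separate statements, as in the session above, can in principle be reused once the temporary is gone. The values are built with int("-6") at run time so that no literals are shared by the compiler.

```python
a = int("-6")
b = int("-6")

print(a == b)                        # True: the values are equal
print((a is b) == (id(a) == id(b)))  # True: `is` is an identity comparison
```

Whether a is b itself prints True or False depends on the interpreter; the second line holds either way.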

Similarly for strings:

>>> x = 'abc'
>>> y = 'abc'
>>> x is y
True
>>> x = 'a little big string'
>>> y = 'a little big string'
>>> x is y
False

For more details on string caching, read: is operator behaves differently when comparing strings with spaces
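If a shared string object is actually wanted, it can be requested explicitly instead of relying on what the interpreter happens to cache. A minimal sketch, assuming Python 3, where the function is sys.intern (in Python 2 it is the builtin intern); the helper build is hypothetical and only exists to create the string at run time.

```python
import sys


def build():
    # Joined at run time, so the result is not a compile-time constant.
    return " ".join(["a", "little", "big", "string"])


x = build()
y = build()

print(x == y)                          # True: same characters
print(x is y)                          # implementation-dependent
print(sys.intern(x) is sys.intern(y))  # True: both map to one interned object
```

Even with interning available, == remains the right way to compare string contents.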

3
  • 2
    So why is -6 considered "big" and -5 not? What are the qualifying criteria for something to be considered "big"? Commented Jul 13, 2012 at 18:36
  • 1
    For CPython, -5 to 256 are "interned" (cached). It's a somewhat arbitrary implementation choice. If a given interned object is used a lot, there's a potentially large memory savings, but there's a cost to interning it (either in runtime or memory) so you don't want to do it for everything. Commented Jul 13, 2012 at 18:40
  • +1 for showing the IDs; I was just about to add that to my answer. Commented Jul 13, 2012 at 18:43
