I just started using NumPy and noticed something strange. (I read the official Quickstart Tutorial, but it didn't help.) This is the code:

>>> import numpy as np
>>> jok = np.int16(33)
>>> jok.dtype
dtype('int16')
>>> jok += 1
>>> jok
34
>>> jok.dtype
dtype('int64')

When I apply an arithmetic operation to a variable (jok), its 'dtype' changes from 'int16' to 'int64'. But when I apply the same operation to an array, the 'dtype' stays the same:

>>> ar = np.arange(6,dtype='int8')
>>> ar
array([0, 1, 2, 3, 4, 5], dtype=int8)
>>> ar += 10
>>> ar
array([10, 11, 12, 13, 14, 15], dtype=int8)

Why does this happen?

Is it possible to apply arithmetic operations to a variable like 'jok' while keeping its specified 'dtype' (in my case 'int16')?

And why does it always change to 'int64'? I know 'int64' is NumPy's default integer type, but I want to save some memory by making the type of my variables smaller.

Are there any reasons for me to stay with 'int64', knowing that my maximum value will not even reach 1,000? Most of my variables will be below 200 ('jok' will always be < 400).

2 Answers

3

__array_priority__ may explain the pattern you see.

First the scalar created by np.int16:

In [303]: jok = np.int16(33) 
In [304]: jok.__array_priority__                                                                     
Out[304]: -1000000.0

and the priority of an array created from a python int:

In [305]: np.array(1).__array_priority__                                                             
Out[305]: 0.0

In this addition the int is first converted to an np.array; its priority is higher than jok's, so the dtype is changed:

In [306]: jok += 1                                                                                   
In [307]: jok.dtype                                                                                  
Out[307]: dtype('int64')
In [308]: type(jok)                                                                                  
Out[308]: numpy.int64
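
The promotion itself can be checked directly. This is a minimal sketch, assuming only that NumPy is installed: it asks np.result_type which dtype comes out of combining int16 with the default integer dtype that a plain Python int gets when it becomes an array. Note that NumPy 2.0 changed how Python ints are promoted against NumPy scalars (NEP 50), so on newer versions jok += 1 may keep int16; the last line only prints what your installed version does.

```python
import numpy as np

# dtype NumPy assigns to a plain Python int when it becomes an array
default_int = np.array(1).dtype

# promoting int16 with that default integer dtype yields the wider of the two
promoted = np.result_type(np.int16, default_int)
print(default_int, promoted)

# what the scalar addition returns depends on the NumPy version in use
jok = np.int16(33)
print(np.__version__, type(jok + 1))
```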

Adding a float changes dtype to float - again based on priority:

In [309]: jok += 3.2                                                                                 
In [310]: jok                                                                                        
Out[310]: 37.2

But if we make an array, 0d, with int16 dtype:

In [311]: jok = np.array(33, 'int16')                                                                
In [312]: jok.__array_priority__                                                                     
Out[312]: 0.0
In [313]: jok += 1                                                                                   
In [314]: jok.dtype                                                                                  
Out[314]: dtype('int16')
In [315]: jok += 3.2                                                                                 
---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
<ipython-input-315-28d0135066df> in <module>
----> 1 jok += 3.2

UFuncTypeError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('int16') with casting rule 'same_kind'

Adding the int preserves the dtype; but trying to add a float results in a casting error. jok+3.2 produces a float, but that can't be put into the int16 array.
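
If the float really should be folded into the int16 value, the cast has to be requested explicitly. The sketch below shows two ways that should work under the assumption that truncation toward zero is acceptable; the names raised and as_int are only for illustration.

```python
import numpy as np

jok = np.array(33, dtype=np.int16)   # 0-d array
jok += 1                             # in-place add keeps int16

try:
    jok += 3.2                       # float64 result cannot go back into int16
    raised = False
except TypeError:                    # UFuncTypeError is a TypeError subclass
    raised = True

# Option 1: compute out of place, then cast back deliberately (truncates)
as_int = (jok + 3.2).astype(np.int16)

# Option 2: stay in place, but opt in to the unsafe cast explicitly
np.add(jok, 3.2, out=jok, casting="unsafe")
print(jok, jok.dtype, as_int)
```

Either way the fractional part is dropped, so this is only appropriate when that loss is intended.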

As a general rule, I don't recommend creating variables with np.int16(...) (or other such functions). Use the np.array(.., dtype) function instead.

The two classes have many of the same methods, but aren't identical. I don't think there's a good reason to make the np.int16 object directly:

In [317]: type(np.int16(33)).__mro__                                                                 
Out[317]: 
(numpy.int16,
 numpy.signedinteger,
 numpy.integer,
 numpy.number,
 numpy.generic,
 object)
In [318]: type(np.array(33, 'int16'))                                                                
Out[318]: numpy.ndarray
In [319]: type(np.array(33, 'int16')).__mro__                                                        
Out[319]: (numpy.ndarray, object)

np.int16 objects are created indirectly by indexing an array:

In [320]: type(np.array(33, 'int16')[()])                                                            
Out[320]: numpy.int16

But we seldom try to do things like += on such a variable.
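
For a quick illustration of the more common pattern, here is a small sketch (the array ar mirrors the one in the question, with int16 instead of int8). Indexing hands back a NumPy scalar, while an in-place update through the index writes the result back into the array, which keeps its dtype.

```python
import numpy as np

ar = np.arange(6, dtype=np.int16)

elem = ar[2]          # indexing returns a NumPy scalar, not an ndarray
print(type(elem))

ar[2] += 10           # the result is cast back into the int16 buffer
print(ar, ar.dtype)
```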

1

The main question here is probably: do you need to cast your ints to 'int16' or similar at all? My thought is that 99% of the time, you don't need to worry about it.
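
To make the memory side concrete, here is a rough sketch (the array length n is an arbitrary choice for illustration). A smaller dtype pays off for large arrays, where each element takes 2 bytes instead of 8; a lone scalar is a full Python object either way, so the exact sizes printed on the last line vary by platform and version.

```python
import sys
import numpy as np

n = 1_000_000
small = np.zeros(n, dtype=np.int16)
big = np.zeros(n, dtype=np.int64)
print(small.nbytes, big.nbytes)   # 2 bytes vs 8 bytes per element

# object sizes of single values; compare them on your own machine
print(sys.getsizeof(np.int16(33)), sys.getsizeof(33))
```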

But to answer your question, if you want to ensure that your data types remain the same, it seems that you'll need to wrap your plain ints in np.int16(). For example:

jok = np.int16(33)
jok += np.int16(1)
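
A quick way to confirm this, sketched with np.iinfo to also check that int16 comfortably covers the values mentioned in the question (below 400):

```python
import numpy as np

jok = np.int16(33)
jok += np.int16(1)        # both operands are int16, so the result stays int16
print(type(jok), jok)

info = np.iinfo(np.int16) # representable range of the dtype
print(info.min, info.max)
```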

As for why it happens, I unfortunately can't answer that. You could dig around in the C code under the hood of NumPy if you really want to find out: https://github.com/numpy/numpy

1
  • Thank you so much. This really answered my question.
    – camagu4
    Commented Aug 20, 2020 at 13:47
