Weird behavior of numpy array type setting

Question

The code

np.array([100,200,300],dtype=str)

returns:

array(['1', '2', '3'], 
      dtype='|S1')

The documentation says:

dtype : data-type, optional

The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.

Is this a bug?

Can you try using dtype='|S3' and see if that gives what you expect? — SethMMorton, Commented Aug 29, 2013 at 18:39
This question has been asked fairly recently, although I cannot find it right off. A detailed search of the numpy tag should lead you to it. — Daniel, Commented Aug 29, 2013 at 18:39
What happens if you have np.array([101,201,301],dtype=str) instead? — SethMMorton, Commented Aug 29, 2013 at 18:42

Daniel · Accepted Answer · 2013-08-29 19:37:39Z

I still cannot find the question, but to get around it:

>>> a=[100,200,300]

>>> np.char.mod('%d', a)
array(['100', '200', '300'],
      dtype='|S3')

This circumvents your problem:

>>> a=[100,200,3005]
>>> np.char.mod('%d', a)
array(['100', '200', '3005'],
      dtype='|S4')

Obscure documentation aside, it should be noted that this is roughly 4 times slower than choosing dtype="S..", but non-linearly faster than using np.array(map(str,a)) methods.
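As a rough sketch of the three approaches compared above, the snippet below checks that they produce the same strings. It assumes Python 3, where NumPy returns unicode ('<U') arrays rather than the '|S' byte strings shown in this answer, so 'U3' stands in for 'S3'. The variable names are illustrative only, and no timings are claimed.

```python
import numpy as np

a = [100, 200, 300]

# 1. np.char.mod applies the %-format to every element.
via_mod = np.char.mod('%d', a)

# 2. Explicit fixed-width string dtype (the width, 3, is known in advance here).
via_dtype = np.array(a, dtype='U3')

# 3. Convert each element with str() first, then build the array.
via_map = np.array(list(map(str, a)))

print(via_mod.tolist(), via_dtype.tolist(), via_map.tolist())
```

To compare speed on your own data, each of the three lines can be wrapped in timeit.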

You can also do some neat things:

>>> a
[1234.5, 123.4, 12345]

>>> np.char.mod('%s',a)
array(['1234.5', '123.4', '12345.0'],
      dtype='|S7')

>>> np.char.mod('%f',a)
array(['1234.500000', '123.400000', '12345.000000'],
      dtype='|S12')

>>> np.char.mod('%d',a) #Note the truncation of decimals here.
array(['1234', '123', '12345'],
      dtype='|S5')

>>> np.char.mod('%s.stuff',a)
array(['1234.5.stuff', '123.4.stuff', '12345.0.stuff'],
      dtype='|S13')

Additional information can be found here.

Thanks! Can you add a link to the documentation for this function? — Bitwise, Commented Aug 29, 2013 at 19:14
I also updated this with a few extra examples, depending on what you are doing %d might not be optimal for you. — Daniel, Commented Aug 29, 2013 at 19:38

Community · Accepted Answer · 2017-05-23 11:49:51Z

The reason you see this behavior is that you have to specify the size of each string element e.g. using:

>>> np.array([100,200,300],dtype='S3')
array(['100', '200', '300'],
      dtype='|S3')

Otherwise the size of each element string will default to 1.

More info here: Numpy converting array from float to strings
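If the width is not known in advance, one option is to measure the longest element first and build the dtype string from it. This is only a minimal sketch: to_fixed_width_strings is a hypothetical helper, not a NumPy function, and it assumes a non-empty list of integers.

```python
import numpy as np

def to_fixed_width_strings(values):
    # Hypothetical helper: find the longest decimal representation.
    width = max(len(str(v)) for v in values)
    # Build a byte-string dtype such as 'S4' and let NumPy do the conversion.
    return np.array(values, dtype='S%d' % width)

arr = to_fixed_width_strings([100, 200, 3005])
print(arr.dtype, arr.tolist())
```

On Python 3 the elements come back as bytes objects; using 'U%d' instead of 'S%d' gives ordinary str elements.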

answered Aug 29, 2013 at 18:46 by crs17; edited May 23, 2017 at 11:49 by Community Bot

The problem is that to do this for an arbitrary list of numbers I need to check the lengths of all the numbers first. — Bitwise, Commented Aug 29, 2013 at 19:08
