Making a histogram from a CSV file

Question

I am trying to read a column of data from a csv file and create a histogram for it. I could read the data into an array but was not able to make the histogram. Here is what I did:

thimar=csv.reader(open('thimar.csv', 'rb'))
thimar_list=[]
thimar_list.extend(thimar)
z=[]
for data in thimar_list:
    z.append(data[7])
zz=np.array(z)
n, bins, patches = plt.hist(zz, 50, normed=1)

which gives me the error:

TypeError: cannot perform reduce with flexible type

Any idea what is going on?

You may need to convert from string to number. I think csv.reader just creates a list of strings, and numpy then makes an array of strings. — yosukesabai, Commented Jan 5, 2012 at 15:32
Do you need to use csv? I think np.loadtxt would do a better job here (simpler code, automatic conversion, etc). — Ricardo Cárdenes, Commented Jan 5, 2012 at 15:57
I try to use csv over loadtxt because it deals with non-number fields better, for example column labels. But if the csv only has numbers, loadtxt is a good option. — Bi Rico, Commented Jan 5, 2012 at 16:17
@Bago - Just FYI, you can specify skiprows=1 to loadtxt to have it skip the column headers. However, the csv module will handle csv files with quoted strings containing commas, etc. loadtxt is (deliberately) not set up to deal with non-simple delimiters. — Joe Kington, Commented Jan 5, 2012 at 17:57
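
The point raised in the comments can be checked directly. The following is a minimal sketch, assuming Python 3 and a made-up two-row sample in place of thimar.csv (the `sample` data and variable names are hypothetical): every field from csv.reader is a string, so the resulting array has a string dtype until it is converted.

```python
import csv
import io

import numpy as np

# Hypothetical two-row sample standing in for thimar.csv.
sample = io.StringIO("1.5,2.0\n3.5,4.0\n")
rows = list(csv.reader(sample))

# csv.reader returns every field as a string, so this is a string array.
raw = np.array([row[1] for row in rows])
print(raw.dtype.kind)  # 'U' means a fixed-width unicode string dtype

# Parse each string as a float to get an array a histogram can use.
numeric = raw.astype(float)
print(numeric.dtype.kind)  # 'f' means floating point
```

A string array like `raw` is what plt.hist received in the question, which is consistent with the "flexible type" error.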

yosukesabai · Accepted Answer · 2012-01-05 15:55:17Z

1

Modify the sixth line so that it converts each string to a number:

    z.append(float(data[7]))

With this change I got a plot from my made-up data.
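
If the file also has a row of column labels, float() would fail on that row. The sketch below is one possible way to handle it, not part of the original answer: `read_float_column` is a hypothetical helper, and the sample data is invented for illustration.

```python
import csv
import io


def read_float_column(lines, index, skip_header=False):
    """Return column `index` of the CSV rows in `lines` as a list of floats."""
    reader = csv.reader(lines)
    if skip_header:
        next(reader, None)  # discard the label row, if there is one
    # csv.reader yields an empty list for a blank line; skip those rows.
    return [float(row[index]) for row in reader if row]


# Hypothetical sample with a header row and a trailing blank line.
sample = io.StringIO("name,value\na,1.5\nb,2.5\n\n")
z = read_float_column(sample, 1, skip_header=True)
print(z)
```

For the file in the question the call would presumably be `read_float_column(f, 7)` on the opened file object `f`.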


Bi Rico · Accepted Answer · 2012-01-05 21:12:16Z

0

Here are two options. This one works if all of your columns contain only numbers:

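import numpy as np
import matplotlib.pyplot as plt
array = np.loadtxt('thimar.csv', dtype=float, delimiter=',')
n, bins, patches = plt.hist(array[:, 7], 50, density=True)  # density=True replaces the removed normed=1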

This one is better if your file has non-numeric columns (e.g. Name, Gender, ...):

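import csv
import numpy as np
import matplotlib.pyplot as plt
# Python 3: open in text mode with newline='' rather than 'rb'
with open('thimar.csv', newline='') as f:
    zz = np.array([float(row[7]) for row in csv.reader(f)])
n, bins, patches = plt.hist(zz, 50, density=True)  # density=True replaces the removed normed=1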
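
As a middle ground between the two options, loadtxt can skip a header row with skiprows=1 (as noted in the comments) and load a single column with usecols. This is a rough sketch under assumptions: the sample data is invented, and np.histogram stands in for plt.hist so that it runs without a display. plt.hist accepts the same bins argument.

```python
import io

import numpy as np

# Hypothetical sample with one header row; the real file would be thimar.csv.
sample = io.StringIO("label,value\n0,1.0\n1,2.0\n2,2.5\n3,4.0\n")

# skiprows=1 drops the header line; usecols=1 keeps only the second column.
values = np.loadtxt(sample, delimiter=",", skiprows=1, usecols=1)

# Compute bin counts and bin edges without drawing anything.
counts, edges = np.histogram(values, bins=3)
print(counts, edges)
```

For the file in the question this would presumably become usecols=7, followed by plt.hist(values, 50).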

edited Jan 5, 2012 at 21:12

answered Jan 5, 2012 at 16:08

