How to check frequency in a CSV file in Python?

Question

I have a few .csv documents with 20,000 records or more.

The structure is simple, something like this:

numer,produkt,date
202,produkt A its sad,20.04.2019
203,produkt A its sad,21.04.2019
204,produkt A its sad,22.04.2019
etc

I want to print this info:

A "produkt A its sad" appears 6 times
A "produkt B" appears 3 times
A "produkt C" appears 2 times

Based on another answer on Stack Overflow, I wrote:

import csv
from collections import Counter

with open ('base2.csv', encoding="utf8") as csv_file:

    csv_reader = csv.reader(csv_file)

    produkt = [row[0] for row in csv_file]

    for (k,v) in Counter(produkt).items():
        print ("A %s appears %d times" % (k, v))

I'm a newbie at Python, so it's probably something stupid :)

The output is:

A n appears 1 times
A 2 appears 11 times

Would you be able to provide a larger sample CSV to work with? — Sri, Commented Apr 21, 2020 at 12:50
You are reading from the csv_file instead of the reader. So produkt = [row[0] for row in csv_file] essentially says: read each line from the file, store it as row, then take the first char of that line. You probably want to replace csv_file with csv_reader — Chris Doyle, Commented Apr 21, 2020 at 12:57
@ChrisDoyle Yes! It's that easy! Thanks very much! I changed it and it works! — Michał Barczak, Commented Apr 21, 2020 at 13:01

Chris Doyle · Accepted Answer · 2020-04-21 13:06:17Z

Your issue is that when you use a list comprehension to build the list of products, you are reading from the file, not from the CSV reader object.

produkt = [row[0] for row in csv_file]

This says: read each line of the file, store it one line at a time in the variable row, and take the first char (index 0) of the string that row holds.
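
To make the difference concrete, here is a minimal sketch that runs both loops over the same data. The in-memory `sample` string is a hypothetical stand-in for base2.csv, built from the rows shown in the question.

```python
import csv
import io

# Hypothetical in-memory stand-in for base2.csv, using rows from the question.
sample = (
    "numer,produkt,date\n"
    "202,produkt A its sad,20.04.2019\n"
    "203,produkt A its sad,21.04.2019\n"
)

# Iterating over the file object yields raw lines (strings),
# so row[0] is only the first character of each line.
first_chars = [row[0] for row in io.StringIO(sample)]

# Iterating over csv.reader yields lists of fields,
# so row[0] is the whole first column.
first_fields = [row[0] for row in csv.reader(io.StringIO(sample))]

print(first_chars)   # ['n', '2', '2']
print(first_fields)  # ['numer', '202', '203']
```

The first list is why the output in the question only contains "n" and "2".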

Instead, assuming you want the produkt column, which is the second field (index 1), you should change this line to:

produkt = [row[1] for row in csv_reader]

That would also read the header line, though. Since you have headers, I would use csv.DictReader and select the column you're interested in by name, like this:

csv_reader = csv.DictReader(csv_file)
produkts = [row['produkt'] for row in csv_reader]
for (k, v) in Counter(produkts).items():
    print("A %s appears %d times" % (k, v))

That way it's clear which column you're counting, without having to rely on a numeric index.
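
For completeness, a self-contained sketch of the DictReader approach might look like the following. The helper name `count_column` and the in-memory sample rows are assumptions made for illustration; they are not part of the original code.

```python
import csv
import io
from collections import Counter


def count_column(lines, column):
    """Count how often each value occurs in the named column.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    (Hypothetical helper, not from the original question.)
    """
    reader = csv.DictReader(lines)  # the first row is used as the header
    return Counter(row[column] for row in reader)


# Hypothetical sample data in the shape shown in the question.
sample = io.StringIO(
    "numer,produkt,date\n"
    "202,produkt A its sad,20.04.2019\n"
    "203,produkt A its sad,21.04.2019\n"
    "204,produkt B,22.04.2019\n"
)

counts = count_column(sample, "produkt")

# most_common() orders the values from most to least frequent.
for produkt, n in counts.most_common():
    print("A %s appears %d times" % (produkt, n))
```

With the real file, the same helper could be called as count_column(csv_file, "produkt") inside the with open(...) block. The csv module documentation also recommends passing newline="" to open() when reading CSV files.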

rpoleski · Accepted Answer · 2020-04-21 12:57:00Z

In your produkt = [row[0] for row in csv_file], the variable row is a string, so row[0] is just its first character. I replaced it with row.split(",")[1] and got the intended answer.
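
One caveat with splitting by hand: it works for rows like the ones in the question, but it would cut a field in two if a product name ever contained a quoted comma. The row below is a hypothetical example, not data from the question.

```python
import csv
import io

# Hypothetical row whose product name contains a comma (quoted, as CSV requires).
line = '205,"produkt D, large",23.04.2019\n'

# Plain string splitting ignores the quotes and cuts the field in two.
split_field = line.split(",")[1]

# csv.reader understands the quoting and keeps the field whole.
csv_field = next(csv.reader(io.StringIO(line)))[1]

print(split_field)  # "produkt D
print(csv_field)    # produkt D, large
```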

Michał Barczak · Accepted Answer · 2020-04-21 13:03:28Z

I was reading from csv_file instead of csv_reader.

So produkt = [row[0] for row in csv_file] essentially says: read each line from the file, store it as row, then take the first char of that line.

I replaced csv_file with csv_reader and it works.

Thanks to @chrisdoyle

Sankalp1999 · Accepted Answer · 2020-04-21 13:19:49Z

You need to use the csv_reader object and not the csv_file.

import csv
from collections import Counter

with open("base2.csv", encoding="utf8") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    # Iterate over csv_reader (not csv_file, as in the question) so that
    # each row is a list of fields; index 1 is the produkt column.
    # Note: the header row is counted as well.
    frequency = Counter(row[1] for row in csv_reader)

for k, v in frequency.items():
    print("{} appears {} times".format(k, v))
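
If you stay with csv.reader and a numeric index, one way to keep the header out of the counts is to consume the first row before counting. This is a minimal sketch; the in-memory sample is a hypothetical stand-in for base2.csv.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the open base2.csv file.
sample = io.StringIO(
    "numer,produkt,date\n"
    "202,produkt A its sad,20.04.2019\n"
    "203,produkt A its sad,21.04.2019\n"
    "204,produkt B,22.04.2019\n"
)

csv_reader = csv.reader(sample)

# Discard the header row; the None default avoids StopIteration on empty input.
next(csv_reader, None)

frequency = Counter(row[1] for row in csv_reader)

for k, v in frequency.items():
    print("{} appears {} times".format(k, v))
```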
