
I have a few .csv files, each with 20,000 records or more.

The structure is simple, something like this:

numer,produkt,date
202,produkt A its sad,20.04.2019
203,produkt A its sad,21.04.2019
204,produkt A its sad,22.04.2019
etc

I want to print info like this:

A "produkt A its sad" appears 6 times
A "produkt B" appears 3 times
A "produkt C" appears 2 times

Based on another answer on Stack Overflow, I wrote:

import csv
from collections import Counter

with open ('base2.csv', encoding="utf8") as csv_file:

    csv_reader = csv.reader(csv_file)

    produkt = [row[0] for row in csv_file]

    for (k,v) in Counter(produkt).items():
        print ("A %s appears %d times" % (k, v))

I'm a newbie in Python, so it's probably something stupid :)

The output is:

A n appears 1 times
A 2 appears 11 times
3
  • Would you be able to provide a larger sample CSV to work with?
    – Sri
    Commented Apr 21, 2020 at 12:50
  • You are reading from csv_file instead of the reader. So produkt = [row[0] for row in csv_file] essentially says: read each line from the file, store it as row, then take the first character of that line. You probably want to replace csv_file with csv_reader. Commented Apr 21, 2020 at 12:57
  • @ChrisDoyle Yes! It's that easy! Thanks very much! I changed it and it works! Commented Apr 21, 2020 at 13:01

4 Answers

1

Your issue is that when you use a list comprehension to build the list of products, you are reading from the file, not from the CSV reader object.

produkt = [row[0] for row in csv_file]

This says: read each line of the file, store one line at a time in the variable row, and take the first character (index 0) of the string that row holds.
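To make the difference concrete, here is a minimal sketch. The `sample` string is a made-up in-memory stand-in for the file, mirroring the question's data, so the exact counts differ from the real base2.csv:

```python
import csv
import io
from collections import Counter

# Hypothetical in-memory stand-in for base2.csv (not the real file)
sample = (
    "numer,produkt,date\n"
    "202,produkt A its sad,20.04.2019\n"
    "203,produkt A its sad,21.04.2019\n"
)

# Iterating over the file object yields whole lines (strings),
# so row[0] is the first character of each line: 'n', '2', '2'.
first_chars = Counter(row[0] for row in io.StringIO(sample))

# Iterating over csv.reader yields lists of fields,
# so row[0] is the whole first column: 'numer', '202', '203'.
first_fields = Counter(row[0] for row in csv.reader(io.StringIO(sample)))

print(first_chars)
print(first_fields)
```

This is the same pattern as the "A n appears 1 times" output in the question: the 'n' comes from the header line and the '2' from the data lines.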

Instead, assuming you want the produkt, which is the second field (index 1), you should update this line to:

produkt = [row[1] for row in csv_reader]

However, that would also read the header line. Since you have headers, I would use csv.DictReader and select the column you're interested in by name, like this:

csv_reader = csv.DictReader(csv_file)
produkts = [row['produkt'] for row in csv_reader]
for (k, v) in Counter(produkts).items():
    print("A %s appears %d times" % (k, v))

That way it's clear which column you're counting, without having to rely on a numeric index.
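As a self-contained sketch of the DictReader approach, the snippet below runs against a made-up in-memory sample rather than the real file (the `sample` rows, including "produkt B", are assumptions for illustration). It also uses `Counter.most_common()` so the most frequent products are printed first, which may or may not be what you want:

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; replace io.StringIO(sample) with the opened file
sample = (
    "numer,produkt,date\n"
    "202,produkt A its sad,20.04.2019\n"
    "203,produkt A its sad,21.04.2019\n"
    "204,produkt B,22.04.2019\n"
)

# DictReader consumes the header row and uses it as the dict keys,
# so the header is never counted as a product.
reader = csv.DictReader(io.StringIO(sample))
counts = Counter(row["produkt"] for row in reader)

# most_common() returns (value, count) pairs sorted by descending count
for name, n in counts.most_common():
    print('A "%s" appears %d times' % (name, n))
```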

0

In your produkt = [row[0] for row in csv_file], the variable row is a string, and row[0] is just its first character. I replaced it with row.split(",")[1] and got the intended answer.
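One caveat worth keeping in mind: splitting on commas works for the sample shown, but it may break if a field is quoted and contains a comma itself. A small sketch with a made-up line (the "produkt D, large" row is hypothetical, not from the question's data):

```python
import csv
import io

# Hypothetical line with a quoted field that contains a comma
line = '205,"produkt D, large",23.04.2019\n'

# Plain split cuts the quoted field in two and keeps the quote character
naive = line.split(",")[1]

# csv.reader understands the quoting and returns the whole field
parsed = next(csv.reader(io.StringIO(line)))[1]

print(naive)
print(parsed)
```

If the files never contain quoted fields, split is fine; otherwise the csv module is the safer choice.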

0

I was reading from the csv_file instead of the csv_reader.

So produkt = [row[0] for row in csv_file] essentially says: read each line from the file, store it as row, then take the first character of that line.

I replaced csv_file with csv_reader and it works.

Thanks to @chrisdoyle

0

You need to use the csv_reader object and not the csv_file.

import csv
from collections import Counter

with open("base2.csv", encoding="utf8") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    next(csv_reader, None)  # skip the header row (numer,produkt,date)

    # Iterate over csv_reader here; the question's code iterated over
    # csv_file, which yields raw lines instead of parsed rows.
    frequency = Counter(row[1] for row in csv_reader)

for k, v in frequency.items():
    print("{} appears {} times".format(k, v))
