Consider a .csv file that contains video names and their view counts, like so:
"There are happy days","1204923"
"Beware of ignorance","589636"
"Bloody Halls MV","258933"
"Dream Theater - As I Am - Live in...","89526"
The intent of the code I built is to filter the rows of the CSV against a list of excluded words: if the name of a video contains any word in the list, that row is rejected and not saved. The following is the code:
import csv
import re

exclude_list = ["mv", "live", "cover", "remix", "bootleg"]
data_set = []
# 'rb' is the Python 2 idiom; on Python 3, open with newline='' instead
with open('video_2013-2016.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        # Only record videos with at least 100 views
        if int(row[1]) > 99:
            # A test list that holds whether each excluded word is absent
            test_list = []
            for ex in exclude_list:
                # Case-insensitive so that "MV" and "Live" match "mv" and "live"
                regex = re.compile(".*(" + ex + ").*", re.IGNORECASE)
                if regex.search(row[0]):
                    test_list.append(False)
                else:
                    test_list.append(True)
            # Save the row only if no excluded word matched
            if all(test_list):
                data_set.append(row)
I know the code above is quite inefficient. I've seen examples of list comprehensions that can do a better job, but I don't quite understand how a list comprehension would work in this case. I also dislike having to compile the regex again for every excluded word on every row; it feels like a waste of resources.
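
One possible shape for this, as a minimal sketch rather than a definitive answer: join the excluded words into a single alternation pattern, compile it once, and let a list comprehension do the filtering. The names `EXCLUDE_RE` and `filter_rows` are hypothetical, the sample rows are read from an in-memory string instead of `video_2013-2016.csv` so the snippet runs on its own (Python 3), and the match is assumed to be case-insensitive so that "MV" and "Live" are caught.

```python
import csv
import io
import re

EXCLUDE_LIST = ["mv", "live", "cover", "remix", "bootleg"]

# Assumption: matching should ignore case. The pattern is compiled once,
# and re.escape guards against regex metacharacters in the words.
EXCLUDE_RE = re.compile(
    "|".join(re.escape(word) for word in EXCLUDE_LIST),
    re.IGNORECASE,
)


def filter_rows(rows, min_views=100):
    """Keep rows with at least min_views views and no excluded word in the name."""
    return [
        row for row in rows
        if int(row[1]) >= min_views and not EXCLUDE_RE.search(row[0])
    ]


# Sample rows from the question, read from an in-memory file for illustration.
sample = io.StringIO(
    '"There are happy days","1204923"\n'
    '"Beware of ignorance","589636"\n'
    '"Bloody Halls MV","258933"\n'
    '"Dream Theater - As I Am - Live in...","89526"\n'
)
data_set = filter_rows(csv.reader(sample))
print(data_set)
```

Like the original loop, this is a substring match, so "live" would also reject a name containing "Oliver"; wrapping the alternation in `\b(?:...)\b` would restrict it to whole words if that is what is wanted.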