Comparing columns in different pandas data frames and filling in a new column

Question

I have two dataframes. The first contains companies and their corresponding texts. The texts are stored as lists:

supplier_company_name   Main_Text
JDA SOFTWARE            ['Supply chains', 'The answer is simple -RunJDA!']
PTC                     ['Hello', 'Solution']

The second dataframe contains texts extracted from each company's website.

      Company            Text   
0   JDA SOFTWARE    About | JDA Software    
1   JDA SOFTWARE    833.JDA.4ROI
2   JDA SOFTWARE    Contact Us
3   JDA SOFTWARE    Customer Support    
4   PTC             Training    
5   PTC             Partner Advantage

I want to create a new column in the second dataframe: if the text extracted from the web matches any item in the list in the Main_Text column of the first dataframe, fill in True, otherwise fill in False.

Code:

from tqdm import tqdm

target = []
for x in tqdm(range(len(df['supplier_company_name']))):  # company names in df1
    for y in range(len(samp['Company'])):  # company names in df2
        if samp['Company'][y] == df['supplier_company_name'][x]:  # company names match
            # check whether the scraped text is one of the items in Main_Text
            if samp['Text'][y] in df['Main_Text'][x]:
                target.append(True)
            else:
                target.append(False)

How can I change my code to run efficiently?

301_Moved_Permanently · Accepted Answer · 2019-05-03 12:49:08Z

I'll assume that your first dataframe (df) has unique company names. If so, you can index it by company name and extract the single remaining Main_Text Series, which then behaves much like a good old dict:

main_text = df.set_index('supplier_company_name')['Main_Text']
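
As a minimal sketch of why the indexed Series works like a dict, the snippet below rebuilds df from the sample rows in the question (the frame construction is my reconstruction, not the asker's actual data) and looks up one company by label:

```python
import pandas as pd

# Sample data reconstructed from the question; the real df is assumed to look like this.
df = pd.DataFrame({
    'supplier_company_name': ['JDA SOFTWARE', 'PTC'],
    'Main_Text': [
        ['Supply chains', 'The answer is simple -RunJDA!'],
        ['Hello', 'Solution'],
    ],
})

# Company names become the index, so each name maps to its list of texts.
main_text = df.set_index('supplier_company_name')['Main_Text']

# Label-based lookup returns the list stored for that company.
ptc_texts = main_text.loc['PTC']
print(ptc_texts)  # ['Hello', 'Solution']
```

If the company names were not unique, main_text.loc[name] would return a Series of lists rather than a single list, which is why the uniqueness assumption matters here.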

Now we just need to iterate over each row in samp, look up the main text for the company in the first column, and produce a boolean from that list and the second column. This is a job for apply:

target = samp.apply(lambda row: row['Text'] in main_text.loc[row['Company']], axis=1)
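
Putting the pieces together, here is a runnable sketch on the question's sample data. The frames are reconstructed by hand, the last samp row ('PTC', 'Hello') is a hypothetical addition so that at least one match exists, and the column names target and target_fast are my own. The dict-of-sets variant at the end is an alternative I am suggesting, not part of the approach above; it may be faster on large frames because it avoids building a row Series for every row.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    'supplier_company_name': ['JDA SOFTWARE', 'PTC'],
    'Main_Text': [
        ['Supply chains', 'The answer is simple -RunJDA!'],
        ['Hello', 'Solution'],
    ],
})
samp = pd.DataFrame({
    'Company': ['JDA SOFTWARE'] * 4 + ['PTC'] * 3,
    'Text': [
        'About | JDA Software', '833.JDA.4ROI', 'Contact Us',
        'Customer Support', 'Training', 'Partner Advantage',
        'Hello',  # hypothetical row, not in the question, to show a True result
    ],
})

# Assumes company names in df are unique and every samp company exists in df.
main_text = df.set_index('supplier_company_name')['Main_Text']

# Row-wise membership test, stored directly as the new column.
samp['target'] = samp.apply(
    lambda row: row['Text'] in main_text.loc[row['Company']], axis=1
)

# Alternative sketch: plain dict of sets plus a list comprehension.
# Companies missing from df fall back to an empty set and yield False.
lookup = {
    name: set(texts)
    for name, texts in zip(df['supplier_company_name'], df['Main_Text'])
}
samp['target_fast'] = [
    text in lookup.get(company, set())
    for company, text in zip(samp['Company'], samp['Text'])
]

print(samp)
```

Both columns should agree; only the hypothetical 'Hello' row is True, since none of the scraped texts in the question appear in their company's Main_Text list.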
