Skip to main content
removed the incorrect property names
Source Link
jp207
  • 173
  • 1
  • 7

Here's an example of the data: Sample datasample data

Hotel Portland 76
Hotel Suites Los Angeles 49
Hotel Suites Colorado Springs 80
Hotel Premium Denver 82
Hotel Premium Miami Extended Stay 70
Extended Stay Phoenix 65
Hotel Suites Redwood Extended Stay 81
Nexus Hotel Santa Barbara 9292
Plaza Hotel by the Sea 77 77
Majestic By Hilltop 90
Majestic San Francisco 85
Hotel Seattle 56

Here's an example of the data: Sample data

Hotel Portland 76
Hotel Suites Los Angeles 49
Hotel Suites Colorado Springs 80
Hotel Premium Denver 82
Hotel Premium Miami Extended Stay 70
Extended Stay Phoenix 65
Hotel Suites Redwood Extended Stay 81
Nexus Hotel Santa Barbara 92
Plaza Hotel by the Sea 77
Majestic By Hilltop 90
Majestic San Francisco 85
Hotel Seattle 56

Here's an example of the data: sample data

Hotel Portland 76
Hotel Suites Los Angeles 49
Hotel Suites Colorado Springs 80
Hotel Premium Denver 82
Hotel Premium Miami Extended Stay 70
Extended Stay Phoenix 65
Hotel Suites Redwood Extended Stay 81
Nexus Santa Barbara 92
Plaza by the Sea 77
Majestic By Hilltop 90
Majestic San Francisco 85
Hotel Seattle 56

added a sample of what data looks like
Source Link
jp207
  • 173
  • 1
  • 7

Here's an example of the data: Sample data

Raw text if that's more helpful:

Property| Overall Experience| Meeting /Not Meeting

Hotel Portland 76
Hotel Suites Los Angeles 49
Hotel Suites Colorado Springs 80
Hotel Premium Denver 82
Hotel Premium Miami Extended Stay 70
Extended Stay Phoenix 65
Hotel Suites Redwood Extended Stay 81
Nexus Hotel Santa Barbara 92
Plaza Hotel by the Sea 77
Majestic By Hilltop 90
Majestic San Francisco 85
Hotel Seattle 56

Here's an example of the data: Sample data

Raw text if that's more helpful:

Property| Overall Experience| Meeting /Not Meeting

Hotel Portland 76
Hotel Suites Los Angeles 49
Hotel Suites Colorado Springs 80
Hotel Premium Denver 82
Hotel Premium Miami Extended Stay 70
Extended Stay Phoenix 65
Hotel Suites Redwood Extended Stay 81
Nexus Hotel Santa Barbara 92
Plaza Hotel by the Sea 77
Majestic By Hilltop 90
Majestic San Francisco 85
Hotel Seattle 56

Source Link
jp207
  • 173
  • 1
  • 7

Using Pandas to group data based on name and see if column value is greater than or equal to values based on group names

As you'll see from the below code, I'm creating separate data frames of a much larger data frame, then updating a column for each one. What I'm doing is looking at the second column and checking to see if it meets a certain threshold value that is associated with the brand name found within the first column that has each individual hotel's name (part of that whole hotel name contains the specific brand name so I use .contains to group each of these hotels by brand name into specific data frame). Finally, I merge all of them together to get the original data frame with the updated column "Meeting/Not Meeting" column. When I've tried making a change only to the "splice" of the data frame associated with the rows that I'm interested in applying a certain threshold comparison to, it overwrites the previous values of the column with blank values, so this was the best solution I could come up with my limited knowledge.

Another thing I want to improve is the repetitiveness of the below code. I'm rewriting so much it seems and I'm sure there is a much better way of doing the following.

import pandas as pd
from google.colab import files

# file uploaded here and name extracted
uploaded = files.upload()
filename = next(iter(uploaded))

df = pd.read_excel(filename)

# separate dfs for each brand name since they have their own threshold value for Overall Experience guest rating
hotel_core_df = df.loc[(df['Property'].str.contains('Hotel')) & ~(df['Property'].str.contains('Hotel Suites')) & ~(df['Property'].str.contains('Hotel Premium'))].copy()
hotel_core_df['Meeting /Not Meeting'] = hotel_core_df['Overall Experience'] >= 72.10

# if Extended Stay in name - treat as Extended Stay brand even if Suites or Premium property
hotel_suites_df = df.loc[df['Property'].str.contains('Hotel Suites') & ~(df['Property'].str.contains('Extended Stay'))].copy()
hotel_suites_df['Meeting /Not Meeting'] = hotel_suites_df['Overall Experience'] >= 74.65

hotel_prem_df = df.loc[df['Property'].str.contains('Hotel Premium') & ~(df['Property'].str.contains('Extended Stay'))].copy()
hotel_prem_df['Meeting /Not Meeting'] = hotel_prem_df['Overall Experience'] >= 75.71

nexus_df = df.loc[df['Property'].str.contains('Nexus')].copy()
nexus_df['Meeting /Not Meeting'] = nexus_df['Overall Experience'] >= 74.82

plaza_df = df.loc[df['Property'].str.contains('Plaza')].copy()
plaza_df['Meeting /Not Meeting'] = plaza_df['Overall Experience'] >= 73.82

extend_df = df.loc[df['Property'].str.contains('Extended Stay')].copy()
extend_df['Meeting /Not Meeting'] = extend_df['Overall Experience'] >= 74.82

majestic_df = df.loc[df['Property'].str.contains('Majestic')].copy()
majestic_df['Meeting /Not Meeting'] = majestic_df['Overall Experience'] >= 75.78

merged_df = pd.concat([hotel_core_df, hotel_suites_df, hotel_prem_df, nexus_df, plaza_df, extend_df, majestic_df])
merged_df.sort_values(by=['Overall Experience'], ascending=False, inplace=True)

merged_df['Meeting /Not Meeting'] = merged_df.apply(lambda x: 'Meeting' if x['Meeting /Not Meeting'] == True else 'Not Meeting',axis=1)
    
merged_df.to_excel('OE threshold.xlsx', index=False)