I'm trying to automate a program that periodically scrapes prices from Amazon and other pages (I'm starting with Amazon).
The problem is that when I call soup.find from PyCharm, it finds its target and returns it correctly, but when I run the same script from the Windows terminal it returns None.
The code runs fine from PyCharm, but I need it to run from the Windows terminal so I can automate it through a .bat file.
I find this a very strange issue and I couldn't find any documentation about it, so any help would be much appreciated!
Here are some things I've already tried or checked, so they can be ruled out:
- Uninstalling and reinstalling bs4
- Verifying that all the required modules are installed
- Verifying that Windows runs the program in the same folder as PyCharm
- Note that this issue doesn't always happen: I built a log report, and it shows that out of 12 pages it fails on 10, 11 or 12 (the number isn't fixed either). That only happens when running from the terminal.
- The response is <Response [200]> in both cases
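PyCharm and the terminal can run different Python interpreters, so an environment difference may be worth ruling out alongside the checks above. The sketch below is an assumption about a possible cause, not a confirmed diagnosis. `describe_environment` is a hypothetical helper that reports which interpreter is running and whether a few modules are importable. `brotli` and `brotlicffi` are included because the headers in the code ask for `br` encoding, and `requests` can only decode Brotli-compressed bodies when one of those packages is installed.

```python
import importlib.util
import sys


def describe_environment(modules=("requests", "bs4", "lxml", "brotli", "brotlicffi")):
    """Return the interpreter path and whether each module can be imported.

    Hypothetical diagnostic helper: run it from PyCharm and from the
    terminal, then compare the two reports.
    """
    report = {"executable": sys.executable}
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the two reports differ, for example in the `executable` path or in the Brotli entries, the .bat file is probably not using the same interpreter or packages as PyCharm.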
I've compared the soup I get from PyCharm with the one I get from the Windows terminal and they are different; in the Windows one I couldn't find the expected text even when searching for it manually.
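To make that comparison more systematic, the raw response body could be summarized in both environments. This is a minimal sketch using only the standard library. `summarize_body` is a hypothetical helper, and the marker strings are assumptions: `a-offscreen` comes from the code in this question, while `captcha` is a guess at what a bot-check page might contain.

```python
def summarize_body(content: bytes) -> dict:
    """Summarize a raw response body so two runs can be compared quickly."""
    lowered = content.lower()
    return {
        "length": len(content),
        # False suggests the body is not plain HTML (e.g. still compressed)
        "looks_like_html": b"<html" in lowered,
        # Assumed marker for a bot-check page
        "mentions_captcha": b"captcha" in lowered,
        # Class name used for the price in the scraping code
        "has_price_span": b"a-offscreen" in lowered,
    }


# Usage inside the scraper, right after requests.get(...):
#     print(summarize_body(response.content))
```

A body that does not look like HTML at all would point towards a decoding problem, while an HTML body that mentions a captcha would point towards the request being treated as automated traffic.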
Finally, here is the code I'm using so you can see what I'm seeing.
To reproduce the error, this is one of the links that always fails: https://www.amazon.es/dp/B0BRYY69YD
import time
import requests
from bs4 import BeautifulSoup
import pandas as pd
import os
from csv import writer
from datetime import date, datetime
from tqdm import tqdm
def r_Amazon(URL):
    headers = {
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'es-ES,es;q=0.8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36'
    }
    response = requests.get(URL, headers=headers)
    soup = BeautifulSoup(response.content, 'lxml')
    # Check whether the item is in stock or only available second-hand
    product_price = 'ND'  # default, so the return below never hits an unbound name
    get_error = 0
    try:
        product_price_stat = soup.find('span', {'class': 'a-text-bold'}).text.strip()  # <- HERE IT FAILS
        if product_price_stat == 'Comprar de segunda mano' or product_price_stat == 'Ofertas destacadas no disponibles':
            # The item only has a second-hand price: use the corresponding script
            try:
                product_price = 'ND'
                get_error = 1
            except Exception:
                print('ERROR 2nd TRY')
                get_error = 1
        else:
            # The item has a normal price: use the corresponding script
            try:
                product_price = soup.find('span', {'class': 'a-offscreen'}).text.strip()
                # Convert the Spanish number format (1.234,56€) to 1234.56
                product_price = product_price.replace('.', '')
                product_price = product_price.replace(',', '.')
                product_price = product_price.replace('€', '')
            except Exception:
                print('ERROR 1st TRY')
                get_error = 1
    except Exception:
        # soup.find returned None (AttributeError on .text) or another error occurred
        product_price = 'ND'
        print('ERROR')
        get_error = 1
    return product_price, get_error
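The price formatting step in the function above can be sketched as a standalone helper, which makes it testable without hitting the network. `parse_price_text` is a hypothetical name, and the handling of non-breaking spaces is an assumption about how the price text may be formatted; this illustrates the same replace chain rather than a definitive implementation.

```python
from typing import Optional


def parse_price_text(text: str) -> Optional[float]:
    """Convert a Spanish-formatted price such as '1.234,56 €' to a float.

    Returns None when the text cannot be parsed as a number.
    """
    cleaned = (
        text.replace('€', '')
        .replace('\xa0', '')   # assumed: non-breaking space before the symbol
        .strip()
        .replace('.', '')      # drop thousands separators
        .replace(',', '.')     # decimal comma -> decimal point
    )
    try:
        return float(cleaned)
    except ValueError:
        return None
```

Returning None instead of raising lets the caller decide whether to log the failure or store 'ND', as the original function does.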