How to Download a CSV File from a Blob URL Using Selenium in Python?

Question

I'm trying to automate the download of a CSV file from a Blob URL on a dynamic website using Selenium with Python. The CSV download is triggered by clicking a button, but the button click generates a Blob URL, which isn't directly accessible via traditional HTTP requests. I'm having trouble capturing and downloading the file from this Blob URL.

Here is an example URL: https://snapshot.org/#/aave.eth/proposal/0x70dfd865b78c4c391e2b0729b907d152e6e8a0da683416d617d8f84782036349

Here is what the link looks like in my download history:

blob:https://snapshot.org/4b2f45e9-8ca3-4105-b142-e1877e420c84

I've also checked the post "Python: How to download a blob url video?", which says the file cannot be downloaded. That doesn't make sense to me, because when I open the page manually I can click the download button and get the file.

This is the code I've tried so far:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import time

# Setup ChromeDriver
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# URL of the proposal page
url = 'https://snapshot.org/#/aave.eth/proposal/0x70dfd865b78c4c391e2b0729b907d152e6e8a0da683416d617d8f84782036349'

# Navigate to the page
driver.get(url)

try:
    # Wait up to 20 seconds until the expected button is found using its attributes
    wait = WebDriverWait(driver, 20)
    download_button = wait.until(EC.element_to_be_clickable((By.XPATH, "//button[contains(.,'svg')]")))
    download_button.click()
    print("Download initiated.")
except Exception as e:
    print(f"Error: {e}")

# Wait for the download to complete
time.sleep(5)

# Close the browser
driver.quit()

When I inspect the "Download CSV" button, this is what I get:

<svg viewBox="0 0 24 24" width="1.2em" height="1.2em"><path fill="none" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M4 16v1a3 3 0 0 0 3 3h10a3 3 0 0 0 3-3v-1m-4-4l-4 4m0 0l-4-4m4 4V4"></path></svg>

What did you try? Where is your code? Don't expect us to write all the code from scratch to test it. – furas, Commented Jun 26 at 11:12
I don't know what you tried to do (because you didn't show your code), but this works for me: driver.find_elements(By.XPATH, '//button')[10].click(). It may need a better method to find the correct button, though. – furas, Commented Jun 26 at 11:22
You can't access it with a traditional HTTP request because the page runs some JavaScript code to deliver the file, and a traditional HTTP request can't run JavaScript. That's all. – furas, Commented Jun 26 at 11:24
@furas Thank you, I've updated the question with the code I tried. I forgot to attach it earlier, sorry. – rischan, Commented Jun 26 at 23:42
It looks like you selected the SVG image that sits inside the <button>. The page has many buttons (~20), so maybe you got the wrong one. Finding the correct button may be the hardest part of this program :) – furas, Commented Jun 26 at 23:44

furas · Accepted Answer · 2024-06-27 00:12:57Z

I don't know what you tried to do (because you didn't show your code), but this works for me:

driver.find_elements(By.XPATH, '//button')[10].click()

It may need a better method to find the correct button, though (instead of [10]).

Meanwhile, I found this XPath: "//div[h4//span[contains(text(),'Votes')]]//button"
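
As a rough sketch of how that XPath could be combined with an explicit wait (as in the question's code) instead of a fixed sleep: the helper name click_votes_download and the VOTES_DOWNLOAD_XPATH constant are made up for illustration, and the 20-second timeout is only an assumption carried over from the question.

```python
# XPath from the answer: the button inside the <div> whose <h4> contains a "Votes" span.
VOTES_DOWNLOAD_XPATH = "//div[h4//span[contains(text(),'Votes')]]//button"


def click_votes_download(driver, timeout=20):
    """Wait until the button in the "Votes" section is clickable, then click it.

    Hypothetical helper: `driver` is any already-started Selenium WebDriver
    that has loaded the proposal page.
    """
    # Imported inside the function so this snippet can be loaded
    # even on a machine where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    button = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.XPATH, VOTES_DOWNLOAD_XPATH))
    )
    button.click()
    return button
```

If the page layout changes, only the XPath constant should need updating; the wait raises a TimeoutException when no matching clickable button appears within the timeout.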

Full working code:

from selenium import webdriver
from selenium.webdriver.common.by import By
#import undetected_chromedriver as uc

import time

# ---

import selenium
print('Selenium:', selenium.__version__)  # Selenium: 4.19.0

# ---

url = 'https://snapshot.org/#/aave.eth/proposal/0x70dfd865b78c4c391e2b0729b907d152e6e8a0da683416d617d8f84782036349'

#driver = webdriver.Chrome()  # the newest Selenium will automatically download driver - so it doesn't need `service=`
driver = webdriver.Firefox()  # the newest Selenium will automatically download driver - so it doesn't need `service=`
#driver = uc.Chrome()

driver.get(url)

# ---

time.sleep(3)

#all_buttons = driver.find_elements(By.XPATH, '//button')
#all_buttons[10].click()

driver.find_element(By.XPATH, "//div[h4//span[contains(text(),'Votes')]]//button").click()

input('Press ENTER to exit')
driver.quit()  # quit() ends the whole session; close() only closes the current window
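
The code above clicks the button but leaves the file wherever the browser saves it by default. A minimal sketch of one way to control that, assuming Firefox as in the answer: point the browser at a known download directory through preferences, then poll that directory until the CSV has finished downloading. The helper names make_firefox_driver and wait_for_download are hypothetical, and the "text/csv" MIME type is an assumption about what the site produces.

```python
import time
from pathlib import Path

# Suffixes browsers use for unfinished downloads (Firefox: .part, Chrome: .crdownload).
TEMP_SUFFIXES = {".part", ".crdownload"}


def wait_for_download(directory, pattern="*.csv", timeout=30.0, poll=0.5):
    """Return the newest file matching `pattern` once no partial downloads remain."""
    directory = Path(directory)
    deadline = time.monotonic() + timeout
    while True:
        in_progress = any(p.suffix in TEMP_SUFFIXES for p in directory.iterdir())
        finished = [p for p in directory.glob(pattern) if p.is_file()]
        if finished and not in_progress:
            return max(finished, key=lambda p: p.stat().st_mtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no file matching {pattern!r} appeared in {directory}")
        time.sleep(poll)


def make_firefox_driver(download_dir):
    """Start Firefox configured to save downloads into `download_dir` without a dialog."""
    # Imported here so wait_for_download() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    download_dir = Path(download_dir).resolve()
    download_dir.mkdir(parents=True, exist_ok=True)

    options = Options()
    options.set_preference("browser.download.folderList", 2)  # 2 = use a custom directory
    options.set_preference("browser.download.dir", str(download_dir))
    # Assumption: the generated file is served as text/csv.
    options.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")
    return webdriver.Firefox(options=options)
```

Usage would then be roughly: driver = make_firefox_driver('downloads'), load the page and click the button as above, then csv_path = wait_for_download('downloads') before calling driver.quit(). Polling the directory avoids guessing how long the download takes.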

edited Jun 27 at 0:12

answered Jun 26 at 11:27

furas

Hi furas, thank you, it's working. However, do you think using the button index like that is a sustainable solution? What if a page has more buttons, which would change the button index? I tried inspecting the button to find some unique key, but I could not find one.
– rischan
Commented Jun 26 at 23:46
The index can be a problem if the page has a different structure, but finding a better method is the hardest part of this task :) I didn't find a better one, but I spent only a few seconds on this. Maybe you should anchor on some other element: there is an <h4> with the text Votes, so maybe //div[h4[contains(text(), "Votes")]]//button
– furas
Commented Jun 27 at 0:00
At the moment "//div[h4]//button" works for me, but I'm trying to use contains(text(), "Votes").
– furas
Commented Jun 27 at 0:09
Finally, this works for me: "//div[h4//span[contains(text(),'Votes')]]//button". I added it to the answer.
– furas
Commented Jun 27 at 0:11
