
I wanted to improve Selenium performance (I'm using the Python bindings), so I thought of switching to a headless browser, since a GUI is not necessary. I added PhantomJS to my PATH variable and ran:

driver = webdriver.PhantomJS()

After getting an error, I set the executable path and service arguments when initializing the driver (after going through several dozen Stack Overflow and Google Groups posts):

phantomjs_path = r"C:\Users\sachin.nandakumar\AppData\Local\Continuum\anaconda3\phantomjs\bin\phantomjs.exe"
service_args = ['--proxy=10.118.132.29:80', '--proxy-type=http']
driver = webdriver.PhantomJS(executable_path=phantomjs_path, service_args=service_args)

But I still get the same error (shown in detail below).

Later I tried HtmlUnitDriver as well, but the same error occurs.

Are there any known issues with headless browsers working behind a proxy (corporate firewall)? Or, if this is an authentication issue, I haven't found a way to tackle it.

E
======================================================================
ERROR: test_start (__main__.TestWeb)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "...Crawler\crawl_core\src_main\run.py", line 26, in setUp
    driver =     webdriver.PhantomJS(executable_path=phantomjs_path,service_args=service_args)
  File "...\anaconda3\lib\site-packages\selenium\webdriver\phantomjs\webdriver.py", line 58, in __init__
desired_capabilities=desired_capabilities)
  File "...\anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 151, in __init__
self.start_session(desired_capabilities, browser_profile)
  File "...\anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 240, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
  File "...\anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 308, in execute
self.error_handler.check_response(response)
  File "...\anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 165, in check_response
raise exception_class(value)
selenium.common.exceptions.WebDriverException: Message: <HTML><HEAD>

<TITLE>Access Denied</TITLE>

</HEAD>

<BODY>

<FONT face="Helvetica">

<big><strong></strong></big><BR>

</FONT>

<blockquote>

<TABLE border=0 cellPadding=1 width="80%">

<TR><TD>

<FONT face="Helvetica">

<big>Access Denied (authentication_failed)</big>

<BR>

<BR>

</FONT>

</TD></TR>

<TR><TD>

<FONT face="Helvetica">

Your credentials could not be authenticated: "General authentication failure due to bad user ID or authentication token.". You will not be permitted access until your credentials can be verified.

</FONT>

</TD></TR>

<TR><TD>

<FONT face="Helvetica">

This is typically caused by an incorrect username and/or password, but could also be caused by network problems.

</FONT>

</TD></TR>

<TR><TD>

<FONT face="Helvetica" SIZE=2>

<BR>

For assistance, contact your network support team.

</FONT>

</TD></TR>

</TABLE>

</blockquote>

</FONT>

</BODY></HTML>

1 Answer


PhantomJS is no longer in active development, so you should not be running it. Switch to Chrome and you should be fine. Also check whether the proxy requires authentication.

Get chromedriver from here: https://sites.google.com/a/chromium.org/chromedriver/downloads
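
As a rough illustration of the switch to Chrome, here is a minimal sketch of starting headless Chrome through the same proxy used in the question. The helper names build_chrome_args and make_driver are made up for this example, and it assumes a recent Selenium release (where webdriver.Chrome accepts options=) and a chromedriver that Selenium can find, for example on PATH. It does not deal with a proxy that demands credentials.

```python
def build_chrome_args(proxy=None, headless=True):
    """Return Chrome command-line switches (hypothetical helper for this sketch)."""
    args = []
    if headless:
        args.append("--headless")
    if proxy:
        # Chrome takes the proxy as a command-line switch; host:port only,
        # no credentials are passed here.
        args.append("--proxy-server=http://" + proxy)
    return args


def make_driver(proxy=None, headless=True):
    """Start Chrome with the switches above (hypothetical helper for this sketch)."""
    # Imported lazily so build_chrome_args works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(proxy, headless):
        options.add_argument(arg)
    # Assumes chromedriver is discoverable, e.g. on PATH.
    return webdriver.Chrome(options=options)


# Example usage (needs Chrome and chromedriver installed):
# driver = make_driver(proxy="10.118.132.29:80")
# driver.get("https://example.com")
# driver.quit()
```

If the proxy still answers with "Access Denied (authentication_failed)", the switches themselves are probably fine and the credentials question from the answer above is the next thing to check.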

  • I never found that info anywhere before this, but it came in quite handy! I switched to the headless Chrome browser. My first task was to fetch all the links on the web page, but it does not fetch all of them. I thought it was a page-loading problem, so I even set the timeout for each page to 45 s with driver.set_page_load_timeout(45). Still not working! The same code had worked with the Firefox and Chrome browsers, but not with the headless version of Chrome! Commented Nov 14, 2017 at 9:56
  • The admin must have made some changes. You need to load the page on a normal desktop browser, both with the proxy and without it.
    – Mo. Atairu
    Commented Nov 15, 2017 at 0:57
  • Yes, the Chrome browser did the job for me for the time being! Commented Nov 16, 2017 at 5:36
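
Regarding the missing links in headless Chrome mentioned in the comments: set_page_load_timeout only caps how long a page load may take; it does not wait for content that is added after the load. A minimal sketch of an explicit wait follows, assuming the links are plain <a> elements. The helper names unique_hrefs and collect_links and the min_links parameter are made up for this example, and this is only one possible cause, not a confirmed diagnosis.

```python
def unique_hrefs(hrefs):
    """Drop empty values and duplicates while keeping the original order."""
    seen = set()
    result = []
    for href in hrefs:
        if href and href not in seen:
            seen.add(href)
            result.append(href)
    return result


def collect_links(driver, timeout=45, min_links=1):
    """Wait until at least min_links anchors exist, then return their hrefs."""
    # Imported lazily so unique_hrefs works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    # Poll the page until enough <a> elements are present or the timeout expires.
    WebDriverWait(driver, timeout).until(
        lambda d: len(d.find_elements(By.TAG_NAME, "a")) >= min_links
    )
    anchors = driver.find_elements(By.TAG_NAME, "a")
    return unique_hrefs(a.get_attribute("href") for a in anchors)
```

Headless Chrome also starts with a small default window, so a responsive page may render a different layout than in the desktop browser; passing a switch such as --window-size=1920,1080 when starting Chrome may be worth trying as well.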
