
We have a server host with WHM.

This host runs a number of virtual hosts for a bunch of websites.

We have recently found that Google Analytics and Search Console can't seem to access the website pages because Google can't read the robots.txt file.

The robots.txt file exists and is reachable from the browser.

My conclusion is that the WHM firewall (or something similar) is somehow blocking Google's access to www.website.com/robots.txt, but I can't see how this is happening. Google gives no useful specific information, just that the request is met by a (5xx) error. Yet the same request loads perfectly in the browser.

I have cleared our extensive list of blocked IPs on the firewall (CSF) and have checked that the port-flooding firewall options are turned off (they are off). I have also checked Apache to see if there's anything in the virtual host httpd.conf includes that might cause an issue, and nothing there seems relevant.

I'm not certain what I'm looking for, but it's something that's causing Google (specifically and only Google) to be denied by the server.

What am I missing? Where can I look? I'm out of ideas. I think there's something automated that's denying Google's bots from reaching the server, but I can't make out what it is. Maybe some sort of rule is denying access to non-HTML files, although those files work fine in the browser.
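
For illustration, this is the kind of user-agent rule I'd expect to find buried in an httpd.conf include or a .htaccess file if the server were deliberately rejecting Google's crawlers. It's a hypothetical sketch (the directory path is made up), and a rule like this would normally return a 403 rather than a 5xx, but it shows the pattern I'm searching the includes for:

# Hypothetical example of a directive that blocks crawlers by user agent
# (mod_setenvif plus Apache 2.4 authorization syntax).
SetEnvIfNoCase User-Agent "Googlebot" deny_bot

<Directory "/home/example/public_html">
    <RequireAll>
        # Allow everyone except requests flagged by the SetEnvIf line above
        Require all granted
        Require not env deny_bot
    </RequireAll>
</Directory>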

  • You will need to contact your hosting provider for support.
    – Keltari
    Commented Jun 26 at 15:54
  • 1
    @Keltari Please read the question. We are the hosting provider.
    – Martin
    Commented Jun 26 at 15:58
  • Are you sure it's the server? Turn off the CSF or APF firewall, delete any .htaccess files. And what is the exact error you are getting from the console? Heck, disable Apache and see if the error changes or not.
    – Keltari
    Commented Jun 26 at 23:33
  • @Keltari I'm not sure what it is. Hence my question! The feedback from Google is that it can't read the robots.txt and returns a 5xx error (but it won't say which one exactly). The .htaccess is clear, the CSF is clear. The issue might be on Google's side, but I wanted to find out if there was anything else that I'd missed in checking.
    – Martin
    Commented Jun 27 at 8:59

1 Answer


While I was unable to find exact information telling me what the cause was, I found the issue by a process of deduction:

Google's bots are unable to operate when certain restrictive "Permissions-Policy" HTTP headers are in place. Specifically, the following directives:

Permissions-Policy: execution-while-not-rendered=*, execution-while-out-of-viewport=*, geolocation=*, sync-script=*
should all be left at their default/enabled value (*) in the HTTP header supplied to Google's bots.

(I'm unsure whether geolocation is required to make it work, but the others definitely are.)
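
One way to serve these values, if you are setting headers at the Apache level, is with mod_headers. A minimal sketch (assuming mod_headers is enabled; place it in the virtual host's httpd.conf include or the site's .htaccess and adjust the feature list to whatever your site actually needs):

# Requires mod_headers. Serves the permissive Permissions-Policy values
# that Googlebot appears to need in order to fetch robots.txt and pages.
Header always set Permissions-Policy "execution-while-not-rendered=*, execution-while-out-of-viewport=*, geolocation=*, sync-script=*"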
