65

Some web servers, when accessed using their IP address, return an error that direct IP address access is not allowed.

I've been wondering for some time how this works. I mean, doesn't the browser always resolve the IP address and connect to it? Isn't "Direct IP address access" just skipping DNS? How does the remote server even know you skipped DNS?

4
  • 2
    As I recall, what he really asked for was added to the HTTP protocol very early, in order to provide for virtual servers on the same real host.
    – JDługosz
    Commented Mar 13, 2016 at 22:55
  • 3
    It’s basically the same process that allows a single server to differentiate between different virtual hosts. The real server maps a URL to one of its virtual hosts. Many servers do not have a fallback for an unmapped URL, either by design or default.
    – Manngo
    Commented Mar 14, 2016 at 10:55
  • You can skip DNS but avoid this error if you create an entry in your hosts file for the domain name in question. Your browser will be looking for the domain name, and will include it in the Host: header, but no DNS query will be made due to the hosts file entry. Commented Mar 15, 2016 at 14:31
  • The answer to these kinds of questions usually is, because you told them.
    – Thomas
    Commented Mar 16, 2016 at 6:42

4 Answers

92

To answer your question of how it knows, it has to do with what your browser sends the server.

You're right that the system always resolves the name to an IP address, but the browser also sends the host name you typed in the address bar, in the Host header of the HTTP request.

Here is a sample header that I found online, modified to look as though you used Firefox on Windows and typed apple.com into the address bar:

GET / HTTP/1.1
Host: apple.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

Here's what the header would look like if you used its IP address:

GET / HTTP/1.1
Host: 17.142.160.59
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache

Both of these would be sent to the same IP address over a socket, but the browser tells the server what it accessed.

Why? Because a web server with a single IP address may host multiple sites and serve different pages for each. It cannot tell which site a visitor wants from the IP address, because all the sites share it, but it can tell from the Host header.
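To make the difference concrete, here is a minimal Python sketch that builds the two requests above (trimmed to the essential lines) and reads back the Host value the way a server would. The function names build_request and host_seen_by_server are made up for illustration; nothing here touches the network.

```python
from typing import Optional


def build_request(host_header: str, path: str = "/") -> bytes:
    """Return a minimal HTTP/1.1 GET request carrying the given Host value."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host_header}",
        "Connection: close",
        "",  # blank line that ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def host_seen_by_server(raw_request: bytes) -> Optional[str]:
    """Extract the Host header value from a raw request (hypothetical helper)."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    return None


# Same destination IP address in both cases; only the Host line differs.
by_name = build_request("apple.com")
by_ip = build_request("17.142.160.59")
print(host_seen_by_server(by_name))
print(host_seen_by_server(by_ip))
```

Both byte strings would travel to the same address, so the Host line is the only thing that tells the server whether you typed the name or the number.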

6
  • 7
    Ahh, makes much more sense now! So basically, the browser sends TO the IP the header with either the IP or the domain, and the site makes its assumption on that. So really, these restrictions are easy to bypass?
    – Joseph
    Commented Mar 13, 2016 at 15:24
  • 7
    It's not that it's a restriction that you're bypassing, it's just that you're not playing ball and you're going to get some strange results.
    – iAdjunct
    Commented Mar 13, 2016 at 15:29
  • These HTTP requests are what you'd get if you are using a proxy. Without a proxy, the information comes in the host header. See this example.
    – 0xFE
    Commented Mar 13, 2016 at 15:55
  • 2
    bytec0de: The other piece of this is that web server configurations will often be set up based on host name. The IP packet specifies the IP address, the TCP segment specifies the port number, and the HTTP header specifies the hostname. So commonly servers are configured to say "if client/browser asks for example.com, then give them this." They can be set up to also respond to IP addresses or wildcards (respond to anything), but many people just copy examples, and many pre-existing examples are based on the domain name supplied by the browser.
    – TOOGAM
    Commented Mar 13, 2016 at 17:58
  • 14
    @bytec0de It's not a restriction. It's more like using the correct phone number, but the wrong extension - you called the right building, but not the right person. And the reason for its introduction is also pretty much the same as with phones - it allows you to host multiple separate sites on the same IP address (and TCP port). For example, our development server hosted hundreds of separate web sites at the same time, and plenty of web hosting solutions use the same approach ("register a domain, point it at our IP address, we'll take care of the rest").
    – Luaan
    Commented Mar 14, 2016 at 9:14
21

With the HTTP 1.1 protocol (the prior HTTP 1.0 version has been obsolete for quite some time, so is unlikely to be used by any recent version of a browser), the Host header became a required header line that a browser must send. The browser includes the domain name in that line, e.g. Host: example.com, so the web server knows which web site the browser wants to access. Since a web server may be supporting dozens of websites, it needs that line to determine which web site the requested page resides on. Supposing the browser wants to access the home page for a site on example.com, it issues the following line to the server when it connects to the server:

GET / HTTP/1.1

That line specifies the browser wishes to get the root document, i.e., "/" for the website. If you wanted to access /somedir/testpage.html, GET /somedir/testpage.html would be in the "get" line. The line will be followed by the line below:

Host: example.com

So if the web server is supporting the websites example.com, someothersite.com, yetanothersite.org, etc., it knows that it should return the main page for example.com. If it doesn't get that line, or doesn't have a domain name listed in the Host line, it doesn't know which website's home page should be returned. So it may return an error message, instead, or return the home page for a "default" site for the server.
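The selection step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the SITES table, the select_site function and the 400 status for an unmatched host are invented here, and real servers differ in what they return when nothing matches.

```python
from typing import Optional, Tuple

# Hypothetical site table: host name -> document served for "/".
SITES = {
    "example.com": "home page of example.com",
    "someothersite.com": "home page of someothersite.com",
    "yetanothersite.org": "home page of yetanothersite.org",
}


def select_site(host_header: Optional[str],
                default: Optional[str] = None) -> Tuple[int, str]:
    """Pick a site from the Host header, falling back to a default if given."""
    if host_header is not None:
        name = host_header.strip().lower()
        if not name.startswith("["):       # leave bracketed IPv6 literals alone
            name = name.split(":", 1)[0]   # drop an optional ":port" suffix
        if name in SITES:
            return 200, SITES[name]
    if default is not None:
        return 200, SITES[default]         # the "default site" behaviour
    return 400, "no site configured for this Host"  # assumed error choice


print(select_site("example.com"))
print(select_site("93.184.216.34"))
print(select_site("93.184.216.34", default="example.com"))
```

With no default configured, a request that names an IP address (or nothing at all) matches no entry and gets the error; with a default, it silently gets that site's home page instead.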

You can issue the same commands a browser issues by using telnet to connect to the default HTTP port, port 80, e.g., telnet example.com 80 from a Linux shell prompt or an Apple OS X Terminal window. See Testing access to a website using PuTTY for steps to do so with PuTTY on a Windows system.
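If you would rather script it than type into telnet, the same hand-written request can be sent over a plain socket. The sketch below is an assumption-laden demo, not a recommended client: raw_get, EchoHostHandler and demo_on_loopback are names invented here, and the demo talks to a throwaway server on 127.0.0.1 so that it runs without Internet access.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def raw_get(address: str, port: int, host_header: str, path: str = "/") -> str:
    """Send a hand-written GET request and return the whole raw response."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    with socket.create_connection((address, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


class EchoHostHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server: replies with the Host value it received."""

    def do_GET(self):
        body = f"You asked for host: {self.headers.get('Host')}\n".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def demo_on_loopback(host_header: str) -> str:
    """Start the echo server on 127.0.0.1, query it once, and shut it down."""
    server = HTTPServer(("127.0.0.1", 0), EchoHostHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return raw_get("127.0.0.1", server.server_port, host_header)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo_on_loopback("example.com") and then demo_on_loopback("127.0.0.1") connects to the same address both times, yet the server reports a different Host each time. To try it against a real site, call raw_get with the site's address, port 80 and whatever Host value you want to test.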

1
  • 3
    Just a note: the host header was also used in HTTP 1.0, it just wasn't required. HTTP 1.1 made the field mandatory. In practice, many HTTP 1.0 servers simply didn't work if the browser didn't send the host header (for all the reasons outlined above), so most browsers sent it anyway.
    – Luaan
    Commented Mar 14, 2016 at 9:16
6

This is due to the Host: HTTP header. This is quite useful for hosting multiple sites on the same IP address. For example, http://www.k7dxs.net/ and http://www.philipgrimes.com/ are both on the same IP address. However, because of the Host: header, they can show two different sites.

For HTTPS, as @Toothbrush pointed out, they use TLS Server Name Indication because the Host header is part of the encrypted request, and the server doesn't know which cert to offer without this.

Fun experiment: Get Tamper Data for Firefox (I haven't been able to find an equivalent for Chrome) and start tampering. Open http://slipstation.com/ and edit the Host: header in the request to be www.zombo.com (just the host name, not a full URL). You'll see a possibly familiar website where anything is possible.

8
  • Actually, those sites use Server Name Indication. There is no way to tell what site to display if both sites are hosted on the same server over HTTPS without SNI since the server does not know which certificate to use.
    – Toothbrush
    Commented Mar 15, 2016 at 16:17
  • Oh, interesting. Will my experiment still work? Commented Mar 15, 2016 at 20:24
  • Yes, if you find two sites that are hosted on the same IP address over HTTP.
    – Toothbrush
    Commented Mar 15, 2016 at 20:29
  • But not HTTPS is what I was asking. Commented Mar 15, 2016 at 20:31
  • No, it shouldn't work over HTTPS. If it does, there is a security vulnerability in the web server.
    – Toothbrush
    Commented Mar 15, 2016 at 20:32
5

The web server can be configured to only accept connections to a particular domain or subdomain. It could be hosting multiple domains.

What the web server does when a direct IP address is used is configurable. In the case of Apache, it will by default go to the first named vhost out of the enabled sites, which are sorted alpha-numerically.

This is the most relevant part of the Apache documentation that I have found, after a quick search:

https://httpd.apache.org/docs/current/vhosts/name-based.html
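For the "direct IP address access is not allowed" behaviour from the question, a server only has to check whether the Host value parses as an IP address. Here is a rough Python sketch of such a policy. The function names and the 403 status are assumptions for illustration, and this is not what Apache does by default.

```python
import ipaddress
from typing import Optional, Tuple


def host_is_ip_literal(host_header: str) -> bool:
    """Return True if the Host value is an IPv4/IPv6 address, with or without a port."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal such as [::1] or [::1]:8080
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # name:port or IPv4:port, so drop the port
        host = host.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def respond(host_header: Optional[str]) -> Tuple[int, str]:
    """Hypothetical policy: refuse requests that did not name a site."""
    if host_header is None or host_is_ip_literal(host_header):
        return 403, "Direct IP address access is not allowed"
    return 200, f"Welcome to {host_header}"


print(respond("apple.com"))
print(respond("17.142.160.59"))
```

A client that edits its Host header, or adds a hosts file entry, passes this check without ever using DNS, which is why this is a configuration choice rather than a real access restriction.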
