4

I understand that TCP and UDP are multiplexed protocols, and the multiplexing key in them is the (sender.ip, sender.port, receiver.ip, receiver.port) tuple. I understand the reasoning behind sender.ip, sender.port and receiver.port but can't really see how receiver.ip helps with identifying the corresponding socket on the receiver side.

I am very new to the whole idea, but it appears to me that removing receiver.ip from the tuple could open up a whole new range of techniques for bypassing Internet censorship. Fragmenting IP datagrams into datagrams addressed to various IP addresses of the same receiver could, at least in theory, help Internet freedom.

So my question is: what is the reasoning behind having receiver.ip in the TCP and UDP multiplexing key tuple? What would break if we just went with (sender.ip, sender.port, receiver.port)? I apologize in advance if this is a stupid question.

7
  • 1
    Not quite enough to be an answer, but how do you think the packet will get to where it’s supposed to be without a destination IP address? Because that’s essentially what you’re suggesting doing, the info for the multiplexing key is all inherently part of the packet headers (which is why that’s what’s used), and the destination address needs to be there so that the packet gets to the right place. Commented Jan 25 at 12:39
  • 3
    What if the same sender contacts different IP addresses on the receiver, without knowing it's the same receiver? Commented Jan 25 at 18:52
  • 1
    This could possibly be one part of an anti-censorship system, but the censor would just learn all of the host's IP addresses and block them. Commented Jan 25 at 18:54
  • 1
    @SoroushSherafat I’ve added an answer that should be sufficient to explain that, given that a proper explanation is too long for a comment. Commented Jan 28 at 13:44
  • 1
    "most of IP diagram header fields are part of the IP diagram header and not part of the TCP multiplexing key at the same time. my question was why isn't receiver.ip one of them." TCP is independent of the underlying protocols. It works on both IPs and there is even an RFC for it to work on IPX. If you start including the network addressing in the TCP header, then you lock it into a single network protocol. The point of the network layers is encapsulation and abstraction that lead to the ability to, for example, replace ethernet with Wi-Fi, or IPv4 with IPv6, etc., and still use TCP.
    – Ron Maupin
    Commented Jan 28 at 14:16

5 Answers

8

From a host stack's perspective, different local IP addresses mean different logical interfaces on the network layer. Whether they are bound to a single or multiple physical interfaces is of little interest here.

Likewise, the local TCP ports 192.0.2.1:80 and 192.0.2.2:80 can be bound and listened on by entirely different server processes, by the same process handling each one differently, or by the same process handling both identically.

From the remote perspective, the two addresses represent different hosts anyway.

More generally, an IP:port combination defines a unique socket (for port-based transport protocols). Each unique pairing of sockets (source to destination, if you wish) represents an unambiguous connection (for connection-based protocols) or dialogue. From the perspective of a participant, that connection is defined by the SourceIP:SourcePort:DestinationIP:DestinationPort tuple.
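
To make the point concrete, here is a minimal sketch of a connection table, assuming a hypothetical stack that stores connections in a plain dictionary. The names `FourTuple` and `without_dst_ip` are made up for illustration and do not come from any real TCP implementation. The same client socket talks to two local addresses on the same host; keyed by the full 4-tuple the two connections stay distinct, but once the destination address is dropped they collapse into a single entry.

```python
from typing import NamedTuple


class FourTuple(NamedTuple):
    """Connection key as seen by the receiving host (hypothetical model)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def without_dst_ip(key):
    """The reduced key proposed in the question: (src ip, src port, dst port)."""
    return (key.src_ip, key.src_port, key.dst_port)


# One client socket, two different local addresses on the same server host.
conn_a = FourTuple("198.51.100.7", 50000, "192.0.2.1", 80)
conn_b = FourTuple("198.51.100.7", 50000, "192.0.2.2", 80)

# Keyed by the full 4-tuple, the two connections are distinct entries.
full_table = {conn_a: "server process A", conn_b: "server process B"}

# Keyed without the destination address, the second entry overwrites the first.
reduced_table = {}
for key, owner in full_table.items():
    reduced_table[without_dst_ip(key)] = owner

print(len(full_table), len(reduced_table))
```

With the reduced key, a segment arriving from 198.51.100.7:50000 for port 80 could no longer be attributed to one server process or the other.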

6

You assume a host has one, and only one, IP address. This is rarely true. If you count loopback, which every machine has, there will always be at least two. In the modern world, it's very likely you'll have many, on many interfaces. Considering IPv6, a host will ABSOLUTELY have multiple addresses.

4
  • 1
    not that a loopback address like 127.0.0.1 or IPv6 link-local addresses are exactly relevant as far as serving remote clients is concerned.
    – ilkkachu
    Commented Jan 25 at 8:01
  • 1
    Once the traffic has reached you, it absolutely does. Applications rarely go to the effort to know what interface gave them the packet, and I'll say it AGAIN: INTERFACES CAN HAVE MORE THAN ONE ADDRESS.
    – Ricky
    Commented Jan 25 at 18:15
  • 1
    @Ricky Shouting doesn't make a point better. Things are designed this way now, but ilkkachu seems to be pointing out that they could have been designed differently in a parallel timeline. Commented Jan 25 at 18:53
  • 1
    They're free to go live in that universe. In this one, IPv4 was designed to support multiple addresses and multiple interfaces. That was very forward thinking of them, as "multi-homing" wasn't common for nearly two decades. (networks were smaller and simpler in the 80's, also much more expensive.)
    – Ricky
    Commented Jan 25 at 23:31
3

A server can have multiple IP addresses, even if it only has 1 NIC.

In fact, every host has at least 2 IP addresses: The host's Layer 3 identifier, and the host's loopback address.

A service can choose to bind to the same port for all addresses (usually denoted by 0.0.0.0:<port> in case of IPv4), or just a certain address:port tuple.
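
As a rough illustration of binding to one specific address, the sketch below uses Python's standard `socket` module. It assumes loopback (127.0.0.1) is available and lets the OS pick a free port (port 0) so that it does not need privileges; neither choice comes from the setup described here. After a client connects, the accepted socket reports both its local and its remote endpoint, which together form the 4-tuple.

```python
import socket

# Bind to one specific local address instead of the wildcard 0.0.0.0.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
listener.listen(1)
listen_ip, listen_port = listener.getsockname()

# Connect a client to that exact address:port and accept the connection.
client = socket.create_connection((listen_ip, listen_port))
server_side, peer = listener.accept()

# The accepted socket knows both ends of the connection.
local_ip, local_port = server_side.getsockname()
remote_ip, remote_port = server_side.getpeername()
four_tuple = (remote_ip, remote_port, local_ip, local_port)
print(four_tuple)

client.close()
server_side.close()
listener.close()
```

Binding to ("0.0.0.0", port) instead would accept connections on every local IPv4 address, and `getsockname()` on each accepted socket would still report which local address that particular client actually used.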

For a made-up example:

Let's say I have a (dev) web server + Varnish. The server has 3 addresses (but 1 NIC):

  • 192.168.123.5
  • 192.168.123.19
  • 127.x.x.x (loopback)

I have explicitly configured the web server to bind to 192.168.123.19:80 and 127.0.0.1:80. The former to allow access from other endpoints in the LAN, the latter to receive redirected requests from Varnish.

The Varnish service is explicitly configured to bind only to 192.168.123.5:80, so all requests coming to that IP address go through Varnish first before being redirected to 127.0.0.1:80 to be handled by the actual web server.

This way, if I receive errors accessing http://192.168.123.5, I can switch over to http://192.168.123.19 and determine if the error is due to Varnish or due to the web server itself.

Because the server has multiple addresses, every TCP connection needs to be recorded using a 4-tuple of (source.address, source.port, dest.address, dest.port) to ensure that the right response goes through the right connection.
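
The listener side of this example can be sketched as a small lookup table. This is a simplified model, not how any particular kernel implements it: the `listeners` dictionary and `find_listener` function are hypothetical, and the rule that an exact address match is tried before a wildcard 0.0.0.0 entry is an assumption of the sketch.

```python
# Hypothetical listener table for the example host: (local address, port) -> service.
listeners = {
    ("192.168.123.5", 80): "varnish",
    ("192.168.123.19", 80): "web server",
    ("127.0.0.1", 80): "web server",
}


def find_listener(dst_ip, dst_port):
    """Pick the service for an incoming connection request.

    Assumption for this sketch: an exact (address, port) match is tried
    first, then a wildcard 0.0.0.0 listener on the same port, if any.
    """
    exact = listeners.get((dst_ip, dst_port))
    if exact is not None:
        return exact
    return listeners.get(("0.0.0.0", dst_port))


print(find_listener("192.168.123.5", 80))
print(find_listener("192.168.123.19", 80))
```

Without the destination address in the key, the table could hold only one entry for port 80, and the Varnish/web server split on a single NIC would not be possible.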

4
  • The web server binds to 192.168.123.19:80 and 127.0.0.1:80 only when you configure it so specifically or when you use the unspecified address 0.0.0.0:80 for binding.
    – Zac67
    Commented Jan 25 at 7:28
  • @Zac67 good point. I'll explicitly mention that.
    – pepoluan
    Commented Jan 25 at 8:10
  • But the host still has multiple addresses. The application can choose to ignore the dest addr, but nothing else in the system, or greater network, can.
    – Ricky
    Commented Jan 25 at 18:17
    @Ricky Yeah, once an application has bound to an address:port, it no longer cares about the destination address:port. However, the TCP/IP stack still needs to record the connection as a 4-tuple of (s.a, s.p, d.a, d.p)
    – pepoluan
    Commented Jan 26 at 2:18
1

Because the receiver address is required to uniquely identify a given flow.

Consider the case where two different clients connect to the same web server, and they happen to pick the same ephemeral port for their outbound connection. The flow from those two clients to the web server is uniquely identified in this case only by the sender address. But web servers send data back, and when that happens, the sender/receiver roles are reversed. If you don’t include the receiver address as part of the multiplexing key, you cannot uniquely identify the return traffic in such a case.

That sounds like a possibly rare situation, but the IANA-recommended ephemeral range (49152 to 65535) holds only 16384 ports. That means you’re obviously guaranteed to get overlap if the node is serving more than that many clients simultaneously (which is absolutely reasonable for a large site), but it’s actually worse than that. Assuming that each host picks an ephemeral port randomly with a uniform probability distribution, you only need roughly 150 clients to get a 50% chance of two picking the same port, and roughly 390 for a 99% chance (this is a generalized case of the Birthday Problem).
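
The birthday-problem estimate can be checked with a few lines of Python. This sketch assumes the IANA-recommended ephemeral range of 49152 to 65535 and that every client picks its port independently and uniformly at random; real operating systems may use other ranges and selection strategies. The function names are made up for illustration.

```python
def collision_probability(clients, ports):
    """Probability that at least two of `clients` pick the same port."""
    p_all_distinct = 1.0
    for already_taken in range(clients):
        p_all_distinct *= (ports - already_taken) / ports
    return 1.0 - p_all_distinct


def clients_needed(target, ports):
    """Smallest number of clients whose collision probability reaches `target`."""
    p_all_distinct = 1.0
    clients = 0
    while 1.0 - p_all_distinct < target:
        p_all_distinct *= (ports - clients) / ports
        clients += 1
    return clients


# Assumed range: 49152..65535 inclusive (IANA dynamic/private ports).
EPHEMERAL_PORTS = 65535 - 49152 + 1

print(clients_needed(0.50, EPHEMERAL_PORTS))
print(clients_needed(0.99, EPHEMERAL_PORTS))
```

Under these assumptions the thresholds come out close to the figures quoted above: on the order of 150 clients for a 50% chance and just under 400 for a 99% chance.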


Also, because the addresses involved are always inherently part of the individual packets anyway, not using them in the multiplexing key doesn’t actually hide anything from nodes that those packets are traversing, which means it’s not really any use in trying to defeat censorship or information harvesting.

0

First of all, the addresses are carried in the IP layer, not in the TCP/UDP (transport) layer itself. That said, here is an answer to the rest of the question.

I would list all purposes that come to mind:

  • you need the receiver address in the packet so that routers know where to forward it; otherwise they wouldn't know whether the packet is for them as a device or should be forwarded
  • the receiver itself needs to know whether the packet is for it. In some networks, nodes can see packets meant for other nodes, for example when hubs are used or when an improperly specced switch has too many connected nodes
  • the receiver should know which of its local addresses the packet was sent to. Locally you may run different services on different IPs
  • there are also multicast and broadcast addresses, which usually use UDP for transport. Obviously routers and receivers need to know what kind of address the packet is destined for
  • the destination address is useful for firewalls

"Fragmenting IP datagrams into datagrams with various IP addresses of the same receiver could at least in theory help Internet freedom."

Where would you put all these "various IP addresses"?
