I am trying to learn about the IPv4, TCP and UDP protocols. I have read that the purpose of the port number in a TCP segment or UDP datagram is to ensure that the sending/receiving machines deliver the data to the correct application (process).

I have two similar questions regarding this.

First, what happens if a system is running 2 instances of a process? For argument's sake, let us assume a machine is running 2 instances of a web browser. Perhaps there are 2 users logged into a terminal, or perhaps a single user has 2 instances of a process running. Let us also assume that these two processes, or users, are connected to the same IP with an HTTP connection (port 80).

How does IPv4 work in this case? Surely the requests look identical? How does our system know which process to send the response datagrams to?

Second, as an extension of the first question, what happens if two different processes use the same port number? We could have two different applications that read mail, for example. Let us assume they use the same protocol and port number. In this case, how does our system know which mail process instance to send received data to? Let us assume they both use the same IP and port, as before.

2 Answers

As a rule, multiple processes don't share a listening port.

Say you're setting up a web server such as Apache on a machine that has a single IPv4 address, 192.0.2.2. Apache asks the OS's TCP/IP stack to open a listening socket on TCP port 80, and if nothing else is already using that port, the stack grants it. Now suppose that while Apache is still running, you (or another user on another terminal of the same machine) try to set up nginx on the same machine. When nginx tries to open a listening socket on TCP port 80, it gets an error from the stack saying the port is in use. So you'd have to reconfigure nginx to use some other TCP port for its listener, say 8080.

Any client that tries to connect to TCP port 80 on your machine will be talking to Apache, not nginx. If you want to connect to nginx, you'd have to specify the 8080 port number: "http://192.0.2.2:8080/"
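The "port is in use" error can be reproduced in a few lines. This is a minimal sketch, assuming Python and the loopback address; the function name `second_bind_error` is made up for illustration, and port 0 (meaning "any free port") stands in for port 80, which usually needs elevated privileges.

```python
import errno
import socket


def second_bind_error():
    """Bind two TCP sockets to the same address and port.

    Returns the errno raised by the second bind, or None if it succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port (a stand-in for port 80).
        first.bind(("127.0.0.1", 0))
        first.listen()
        port = first.getsockname()[1]
        try:
            # Plays the role of nginx asking for the port Apache already holds.
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    err = second_bind_error()
    print(errno.errorcode.get(err, err))
```

Neither socket sets any address-reuse option here, so the second bind is refused with the "address already in use" error.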

In general, apps only specify port numbers on the listener/server side. So when a web browser connects to a web server, the browser asks the TCP/IP stack to make a connection to port 80 on the remote IP address, and the stack assigns that connection an arbitrary source port: any available port number (not already in use by another process) from the "ephemeral" port range. IANA suggests 49152 to 65535 for that range, though the exact range depends on the OS (Linux, for example, defaults to 32768 to 60999).
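To watch the stack hand out source ports, here is a small sketch, again assuming Python and a throwaway listener on loopback (the name `two_client_ports` is hypothetical). The two clients never ask for a port; they just connect, and the OS picks one for each.

```python
import socket


def two_client_ports():
    """Open two connections to one listener; return both client-side ports."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port, stands in for port 80
    listener.listen()
    server_addr = listener.getsockname()

    # Neither client binds explicitly, so the OS assigns each a source port.
    a = socket.create_connection(server_addr)
    b = socket.create_connection(server_addr)
    try:
        return a.getsockname()[1], b.getsockname()[1]
    finally:
        a.close()
        b.close()
        listener.close()


if __name__ == "__main__":
    print(two_client_ports())
```

The two numbers printed will differ, because both connections go to the same destination address and port. Which numbers you get depends on your OS's ephemeral range.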

  • What if I am running two browsers simultaneously? For example I have Safari and Google Chrome open at the same time, will one be assigned a different port? What about multiple tabs within one browser - is that something the browser manages or does each tab get its own port?
    – timtam
    Commented Apr 14, 2021 at 12:46
  • @timtam Client processes such as web browsers don't use a single well-known port like server (listener) processes do. Every outgoing connection attempt a web browser makes is assigned a different free port number from the "ephemeral" port range (49152-65535). Once the browser finishes any HTTP transactions it needed to do over that TCP connection, it closes the connection and the TCP/IP stack marks the port number as free and returns it to the pool to be reused.
    – Spiff
    Commented Apr 14, 2021 at 16:45

Each IP packet carries information about its source address and its destination address. If it encapsulates TCP, the TCP part carries information about its source TCP port and its destination TCP port. It's similar for UDP. Let's denote this as:

src-ip:src-port → dst-ip:dst-port

Note: in this answer everything immediately before : should be treated as a numeric IP address, even if it sometimes looks like a host name; everything immediately after : should be treated as a port number.

Packets that go the opposite way are considered to belong to the same connection. This means packets that "say" ipA:portA → ipB:portB travel from A to B, packets that "say" ipB:portB → ipA:portA travel from B to A; and all these belong to a connection we may denote

ipA:portA ↔ ipB:portB

or equivalently

ipB:portB ↔ ipA:portA

The equivalence is valid because after the connection is established the situation is symmetrical and it doesn't matter which end listened and which initiated the connection.

The most important thing to understand: any single connection is (locally) identified by two addresses and two ports. If a packet reports a different address and/or port, it belongs to another connection.

Where packets are created, the OS makes sure new connections don't "impersonate" old ones.

I said "locally identified" because sometimes IP addresses and ports are translated along the way. In this case on one part of the route the connection is identified by one tuple of addresses and ports, on the other part by another tuple; so there is no global tuple that identifies the connection everywhere. But at any given node (including both ends) such a local tuple exists, so separate connections can always be told apart.

At each end the respective OS keeps track of which process serves which established connection.

I hope this will become clearer after I answer your specific questions.


2 instances of a web browser[…] are connected to the same IP, with a HTTP connection (port 80).

The first connection will be like

client:portC ↔ server:80

While creating the second connection on the client, it must not get portC as its local port. It can get any valid port that is not yet used to communicate between client and server:80. It's technically possible for the second instance of the browser to request portC specifically; the request will then be denied. In practice browsers don't ask for specific ports and they attach to whatever ephemeral ports the OS offers. In our case it's the OS's job to offer a port different than portC.

But if the other end for the second connection was server2:80 or server:8080, it could use portC because any of these

client:portC ↔ server2:80
client:portC ↔ server:8080

is different than client:portC ↔ server:80 that is already taken.

OK, so the second connection will look like this:

client:portD ↔ server:80

where portD is different than portC.

If the client is globally routable, no translation is needed and the server gets packets that "say":

client:portC → server:80
client:portD → server:80

and this is enough to tell they belong to two separate connections. Responses will "say"

server:80 → client:portC
server:80 → client:portD

respectively and after they get to the client, the OS there will be able to deliver each to the proper process because it knows which instance of the browser is associated with which connection.
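The same thing can be observed from the server's side. This is a sketch under the same assumptions as before (Python, loopback, a made-up function name `four_tuples`): one listener accepts two connections and reports the (local, remote) address pair the OS associates with each accepted socket.

```python
import socket


def four_tuples():
    """Accept two connections on one listener.

    Returns a list of (local_addr, remote_addr) pairs as seen by the server.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port, stands in for server:80
    listener.listen()
    addr = listener.getsockname()

    clients = [socket.create_connection(addr) for _ in range(2)]
    accepted = [listener.accept()[0] for _ in range(2)]
    try:
        # Each accepted socket is one established connection.
        return [(s.getsockname(), s.getpeername()) for s in accepted]
    finally:
        for s in clients + accepted + [listener]:
            s.close()


if __name__ == "__main__":
    for local, remote in four_tuples():
        print(local, "<->", remote)
```

Both lines show the same local address and port; only the remote port differs, and that difference is all the OS needs to keep the two connections apart.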


Consider two clients in a LAN that connect to server:80. It may happen (by chance) that they both use the same local port portE. On the first client this connection gets established:

lan_ip1:portE ↔ server:80

On the second:

lan_ip2:portE ↔ server:80

They both reach the Internet via a router that implements source NAT, i.e. substitutes lan_ip* with its globally routable wan_ip. To make the two connections distinguishable on the WAN side, the router cannot use portE for both. On the WAN side these connections will look like

wan_ip:portF ↔ server:80
wan_ip:portG ↔ server:80

respectively, where portF is not portG (relation to portE is irrelevant).

In general it's the router's job to make sure connections that were different on the LAN side (because of a different local IP address or port) remain distinguishable on the WAN side. Since wan_ip is fixed and the other end (here server:80) must not be changed, the router must assign different local ports.

Now any packet that comes from the WAN and "says" server:80 → wan_ip:portG will be translated to a packet in the LAN that "says" server:80 → lan_ip2:portE.
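The router's bookkeeping can be modelled as a pair of lookup tables. This is a toy sketch, not how any particular router implements it: the class `SourceNat`, its methods, and the starting port 40000 are all invented for illustration, and it always allocates a fresh WAN port, whereas real NAT implementations often try to keep the original port when it is free.

```python
import itertools


class SourceNat:
    """Toy source-NAT table for one wan_ip."""

    def __init__(self, wan_ip, first_port=40000):
        self.wan_ip = wan_ip
        self._next_port = itertools.count(first_port)
        self._out = {}   # (lan_ip, lan_port, dst_ip, dst_port) -> wan_port
        self._back = {}  # (wan_port, dst_ip, dst_port) -> (lan_ip, lan_port)

    def outbound(self, lan_ip, lan_port, dst_ip, dst_port):
        """Translate a LAN-side tuple to the tuple seen on the WAN side."""
        key = (lan_ip, lan_port, dst_ip, dst_port)
        if key not in self._out:
            wan_port = next(self._next_port)  # never reused in this toy
            self._out[key] = wan_port
            self._back[(wan_port, dst_ip, dst_port)] = (lan_ip, lan_port)
        return (self.wan_ip, self._out[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, wan_port):
        """Map a reply arriving at wan_ip:wan_port back to its LAN host."""
        return self._back.get((wan_port, src_ip, src_port))


if __name__ == "__main__":
    nat = SourceNat("203.0.113.1")
    # Two LAN hosts that happen to pick the same local port (portE).
    f = nat.outbound("192.168.0.10", 50000, "198.51.100.5", 80)
    g = nat.outbound("192.168.0.11", 50000, "198.51.100.5", 80)
    print(f, g)
    print(nat.inbound("198.51.100.5", 80, g[1]))
```

The addresses are documentation and private-range placeholders. The two outbound tuples come out with different WAN ports (portF and portG in the text), and a reply addressed to the second one maps back to the second LAN host.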


Now imagine two different machines in two different LANs, yet each with the same lan_ip (the two machines obviously cannot reach each other directly). Then it's possible to establish this on each:

lan_ip:portE ↔ server:80

But

  • either they reach the Internet by two different global IP addresses, so two different routers will do their job and the server will see

    wan_ip1:portH ↔ server:80
    wan_ip2:portI ↔ server:80
    

    and these are different even if (by chance) portH is the same as portI;

  • or they reach the Internet (possibly via multiple NATs) by the same global IP address, so one final router will do its job and the server will see

    wan_ip:portF ↔ server:80
    wan_ip:portG ↔ server:80
    

    where portF is not portG, because the router takes care of this.

In each case the server can tell the connections apart.


In addition, server may itself be a router that implements DNAT, with the actual server behind it. Any device that performs translation should make sure two connections that are considered different in one part of the network remain different in the other part. No matter how many NATs there are between the two ends, connections to server:80 that started as separate will appear to the actual server as two different ones.


What happens if two different processes use the same port number? […] Let us assume they both use the same IP and port, as before.

Above we considered established connections. Now, because I'm not sure what exactly you mean, let's explain what happens earlier:

  1. On the server side there is a process (or processes, we'll get to this) that listens on some specific server_ip and server_port.
  2. On the client side a process initiates a connection to server_ip:server_port. The connection binds to client_ip:client_port, where client_port may have been specifically requested or just randomly granted.
  3. After handshaking the connection is established. The server process may:

    • stop listening and just serve the connection,
    • keep listening and serve the connection,
    • fork, so one process listens and the other serves the connection.
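The "keep listening and hand off the connection" pattern from step 3 can be sketched as follows. Note the swap: this uses one thread per connection instead of fork, purely to keep the example portable and self-contained; the division of labour is the same. All function names are hypothetical, and each connection is assumed to carry one short message.

```python
import socket
import threading


def serve_connection(conn):
    """Serve one established connection: read once, reply in upper case."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def accept_n(listener, n):
    """Keep listening; hand each accepted connection to its own worker."""
    workers = []
    for _ in range(n):
        conn, _peer = listener.accept()
        worker = threading.Thread(target=serve_connection, args=(conn,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()


def serve_two_clients():
    """Run the listener and two clients in one process; return the replies."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port as server_port
    listener.listen()
    addr = listener.getsockname()

    acceptor = threading.Thread(target=accept_n, args=(listener, 2))
    acceptor.start()

    replies = []
    for msg in (b"first", b"second"):
        with socket.create_connection(addr) as client:
            client.sendall(msg)
            replies.append(client.recv(1024))

    acceptor.join()
    listener.close()
    return replies


if __name__ == "__main__":
    print(serve_two_clients())
```

The listening socket is never used to carry data; every `accept()` returns a new socket for one established connection, which is what lets the listener go straight back to waiting.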

If two different client processes try to use the same port number while connecting to the same server_ip:server_port, one of them will be denied the port, as elaborated above.

If two different server processes try to use the same server_ip:server_port to listen on it, normally one of them will be denied the port; but:

  • if the listening process serves an incoming connection and stops listening, the other process may start listening;
  • if connections keep coming in, you may end up with many processes using the same port at the same time to serve different established connections;
  • on demand, different processes can actually listen on the same port to process multiple incoming connections in parallel (see this answer of mine); using this with different programs seems possible, but then clients cannot know which program will serve any particular connection attempt, so it's hardly useful; using this with a few instances of one program is useful.
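One mechanism for the last case is the SO_REUSEPORT socket option. That is an assumption on my part for this sketch; the text above doesn't name the option. It exists on Linux (3.9 and later) and the BSDs/macOS but not on every platform, so the example checks for it first. The function name is made up, and both sockets live in one process here only to keep the example short.

```python
import socket


def listen_twice_with_reuseport():
    """Try to put two listening sockets on the same address and port.

    Returns None when SO_REUSEPORT is unavailable on this platform,
    otherwise True if both sockets ended up on the same address and port.
    """
    if not hasattr(socket, "SO_REUSEPORT"):
        return None

    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Every socket that wants to share the port must opt in before bind.
        for s in (first, second):
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)

        first.bind(("127.0.0.1", 0))  # any free port
        first.listen()
        port = first.getsockname()[1]

        second.bind(("127.0.0.1", port))  # same port, allowed by the option
        second.listen()
        return first.getsockname() == second.getsockname()
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(listen_twice_with_reuseport())
```

With two listeners on one port, the OS decides which of them gets each incoming connection, which is why this only makes sense for interchangeable instances of the same program.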

There is also broadcasting: you can send packets to many programs that all use the same src-ip:src-port → dst-ip:dst-port tuple (example).
