Each IP packet carries information about its source address and its destination address. If it encapsulates TCP, the TCP part carries information about its source TCP port and its destination TCP port. It's similar for UDP. Let's denote this as:
src-ip:src-port → dst-ip:dst-port
Note: in this answer everything immediately before the colon (:) should be treated as a numeric IP address, even if it sometimes looks like a host name; everything immediately after the colon should be treated as a port number.
Packets that go the opposite way are considered to belong to the same connection. This means packets that "say" ipA:portA → ipB:portB travel from A to B, packets that "say" ipB:portB → ipA:portA travel from B to A; and all these belong to a connection we may denote
ipA:portA ↔ ipB:portB
or equivalently
ipB:portB ↔ ipA:portA
The equivalence is valid because after the connection is established the situation is symmetrical and it doesn't matter which end listened and which initiated the connection.
The most important thing to understand: any single connection is (locally) identified by two addresses and two ports. If a packet reports a different address and/or port, it belongs to another connection.
Where packets are created, the OS makes sure new connections don't "impersonate" old ones.
I said "locally identified" because sometimes IP addresses and ports are translated along the way. In this case the connection is identified by one tuple of addresses and ports on one part of the route and by another tuple on the other part, so there is no global tuple that identifies the connection everywhere. But at any given node (including both ends) such a local tuple exists, so separate connections can always be told apart.
At each end the respective OS keeps track of which process serves which established connection.
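To make the "two addresses and two ports" idea concrete, here is a minimal sketch of a connection table. It is only an illustration, not how any real OS stores its state; the names Endpoint, connection_key and connections are hypothetical, and the IP addresses are documentation-range placeholders.

```python
from typing import Dict, FrozenSet, NamedTuple


class Endpoint(NamedTuple):
    ip: str
    port: int


def connection_key(a: Endpoint, b: Endpoint) -> FrozenSet[Endpoint]:
    """Order-independent key: a <-> b is the same connection as b <-> a."""
    return frozenset((a, b))


# Hypothetical table: connection -> name of the process serving it.
connections: Dict[FrozenSet[Endpoint], str] = {}

client = Endpoint("192.0.2.10", 50000)   # placeholder address and port
server = Endpoint("198.51.100.5", 80)    # placeholder address

connections[connection_key(client, server)] = "browser instance 1"

# A packet travelling the opposite way maps to the same connection.
same = connection_key(server, client) in connections

# Changing any single element (here the client port) gives another connection.
other = connection_key(Endpoint("192.0.2.10", 50001), server) in connections

print(same, other)
```

The frozenset is just a convenient way to express that the direction of a packet does not matter once the connection exists.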
I hope this will become clearer after I answer your specific questions.
2 instances of a web browser[…] are connected to the same IP, with a HTTP connection (port 80).
The first connection will be like
client:portC ↔ server:80
While the second connection is being created on the client, it must not get portC as its local port. It can get any valid port that is not yet used to communicate between client and server:80. It's technically possible for the second instance of the browser to request portC specifically; the request will then be denied. In practice browsers don't ask for specific ports; they attach to whatever ephemeral ports the OS offers. In our case it's the OS's job to offer a port different from portC.
But if the other end of the second connection were server2:80 or server:8080, it could use portC, because each of these
client:portC ↔ server2:80
client:portC ↔ server:8080
is different from client:portC ↔ server:80, which is already taken.
OK, so the second connection will look like this:
client:portD ↔ server:80
where portD is different from portC.
If client is globally routable, no translation is needed and the server gets packets that "say":
client:portC → server:80
client:portD → server:80
and this is enough to tell they belong to two separate connections. Responses will "say"
server:80 → client:portC
server:80 → client:portD
respectively. After they get to the client, the OS there will be able to deliver each to the proper process because it knows which instance of the browser is associated with which connection.
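This can be observed with a small experiment on the loopback interface. The sketch below is an assumption-laden stand-in for the browser scenario: a listener on 127.0.0.1 with an OS-chosen port plays the role of server:80, and two plain client sockets play the two browser instances.

```python
import socket

# A listening socket on loopback stands in for server:80.
# Port 0 asks the OS for any free port, so no privileges are needed.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
server_addr = listener.getsockname()

# Two client sockets stand in for the two browser instances.
# Neither asks for a specific local port; the OS picks ephemeral ones.
client1 = socket.create_connection(server_addr)
client2 = socket.create_connection(server_addr)

# The server accepts both; each accepted socket is a separate connection.
conn1, peer1 = listener.accept()
conn2, peer2 = listener.accept()

local1 = client1.getsockname()
local2 = client2.getsockname()

print("client side:", local1, "and", local2, "->", server_addr)
print("server sees:", peer1, "and", peer2)

for s in (client1, client2, conn1, conn2, listener):
    s.close()
```

Both connections share the same remote end, so the two local ports printed on the client side should differ, and the server sees exactly those two source ports.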
Consider two clients in a LAN that connect to server:80. It may happen (by chance) that they both use the same local port portE. On the first client this connection gets established:
lan_ip1:portE ↔ server:80
On the second:
lan_ip2:portE ↔ server:80
They both reach the Internet via a router that implements source NAT, i.e. substitutes lan_ip* with its globally routable wan_ip. To make the two connections distinguishable on the WAN side, the router cannot use portE for both. On the WAN side these connections will look like
wan_ip:portF ↔ server:80
wan_ip:portG ↔ server:80
respectively, where portF is not portG (the relation to portE is irrelevant).
In general it's the router's job to make sure connections that are different on the LAN side (because of a different local IP address or port) remain distinguishable on the WAN side. Since wan_ip is fixed and the other end (here server:80) must not be changed, the router must assign different WAN-side source ports.
Now any packet that comes from the WAN and "says" server:80 → wan_ip:portG will be translated to a packet in the LAN that "says" server:80 → lan_ip2:portE.
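The translation described above can be sketched as a toy source NAT table. This is a simplified model under stated assumptions, not a real router implementation: the class SourceNat, its methods, the sequential port allocation starting at 40000, and all addresses are hypothetical.

```python
from typing import Dict, Tuple

Addr = Tuple[str, int]  # (ip, port)


class SourceNat:
    """Toy source NAT: one WAN address, one mapping per LAN-side connection."""

    def __init__(self, wan_ip: str, first_port: int = 40000) -> None:
        self.wan_ip = wan_ip
        self.next_port = first_port
        # (lan_src, remote) -> wan_port, and (wan_port, remote) -> lan_src
        self.out: Dict[Tuple[Addr, Addr], int] = {}
        self.back: Dict[Tuple[int, Addr], Addr] = {}

    def outbound(self, src: Addr, dst: Addr) -> Tuple[Addr, Addr]:
        """Rewrite the source of a LAN -> WAN packet."""
        key = (src, dst)
        if key not in self.out:
            wan_port = self.next_port  # a fresh port for every new connection
            self.next_port += 1
            self.out[key] = wan_port
            self.back[(wan_port, dst)] = src
        return (self.wan_ip, self.out[key]), dst

    def inbound(self, src: Addr, dst: Addr) -> Tuple[Addr, Addr]:
        """Rewrite the destination of a WAN -> LAN reply."""
        return src, self.back[(dst[1], src)]


nat = SourceNat("203.0.113.1")
server = ("198.51.100.5", 80)

# Two LAN hosts that happen to use the same local port (portE).
wan1 = nat.outbound(("192.168.0.2", 50000), server)
wan2 = nat.outbound(("192.168.0.3", 50000), server)
print(wan1, wan2)

# A reply addressed to the second WAN-side port goes back to the second host.
reply = nat.inbound(server, wan2[0])
print(reply)
```

Real NAT implementations often try to keep the original port when it is free and track protocol state and timeouts; the sketch ignores all of that and only shows why the WAN-side ports must differ.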
Now imagine two different machines in two different LANs, yet each with the same lan_ip (the two machines obviously cannot reach each other directly). Then it's possible to establish this on each:
lan_ip:portE ↔ server:80
But:
- either they reach the Internet by two different global IP addresses, so two different routers will do their job and the server will see
wan_ip1:portH ↔ server:80
wan_ip2:portI ↔ server:80
and these are different even if (by chance) portH is the same as portI;
- or they reach the Internet (possibly via multiple NATs) by the same global IP address, so one final router will do its job and the server will see
wan_ip:portF ↔ server:80
wan_ip:portG ↔ server:80
where portF is not portG, because the router takes care of this.
In each case the server can tell the connections apart.
In addition, server may itself be a router that implements DNAT (destination NAT), with the actual server behind it. Any device that performs translation should make sure two connections that are considered different in one part of the network remain different in the other part. No matter how many NATs there are between the two ends, connections to server:80 that started as separate will appear to the actual server as two different ones.
What happens if two different processes use the same port number? […] Let us assume they both use the same IP and port, as before.
Above we considered established connections. Now, because I'm not sure what exactly you mean, let's explain what happens earlier:
- On the server side there is a process (or processes, we'll get to this) that listens on some specific server_ip and server_port.
- On the client side a process initiates a connection to server_ip:server_port. The connection binds to client_ip:client_port, where client_port may have been specifically requested or just randomly granted.
After handshaking the connection is established. The server process may:
- stop listening and just serve the connection,
- keep listening and serve the connection,
- fork, so one process listens and the other serves the connection.
If two different client processes try to use the same port number while connecting to the same server_ip:server_port, one of them will be denied the port, as elaborated above.
If two different server processes try to use the same server_ip:server_port to listen on it, normally one of them will be denied the port; but:
- if the listening process serves an incoming connection and stops listening, the other process may start listening;
- if connections keep coming in, you may end up with many processes using the same port at the same time to serve different established connections;
- on demand, different processes can actually listen on the same port to process multiple incoming connections in parallel (see this answer of mine); using this with different programs seems possible, but then clients cannot know which program will serve any particular connection attempt, so it's hardly useful; using this with a few instances of one program is useful.
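The "normal" case, where the second listener is denied the port, can be reproduced with a short sketch. It assumes default socket options (no address or port reuse options set), and for brevity uses two sockets in one process instead of two separate processes; the denial works the same way.

```python
import socket

# First listener: the OS picks a free port on loopback.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen(1)
addr = first.getsockname()

# Second listener tries the very same address and port, with default options.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(addr)
    second.listen(1)
    denied = False
except OSError as exc:
    # Typically "Address already in use"; the exact message is OS-specific.
    denied = True
    print("second listener denied:", exc)
finally:
    second.close()

first.close()
```

Opting in to shared listening requires setting extra socket options on every participating socket before binding, and the details depend on the operating system.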
There is also broadcasting: you can send packets to many programs that use the same src-ip:src-port → dst-ip:dst-port tuple (example).