My ISP sometimes drops IP fragments. This manifests itself in SSH thus: I can log in fine and run commands that output short strings, but running something like "ps fax" causes the link to lock up.

The PMTU is 1500. So it seems that SSH is trying to transmit packets bigger than this that are then getting fragmented and therefore lost. I can't easily reduce the remote MTU, and I am reluctant to do so, since the MTU is correct (1500).

What is going on here? I thought TCP sets its MSS to the correct value to avoid fragmentation. How am I getting fragments?

EDIT: The remote machine is CentOS 6.4. By "sometimes" I meant that the ISP has bad days where their network breaks and it really does drop just fragments. Sniffing both ends of the link, I see the packet and fragments from a large ping leave my system, but only the "main" packet arrives at the target.

The problem manifested itself as above and also a black screen when logging into an RDP session. The problem is fixed now, so I can't do any more tests until it happens again.

A bit more testing on another system shows that SSH sets the DF bit. So now I really don't understand what is going on.

I wasn't able to test whether "fragmentation needed" ICMP messages appear (they shouldn't), as the network accepts packets up to Ethernet size (1500), so my own router would be the one to issue the "frag needed" message.

I complained to the ISP, but they tried to convince me that this was by design and that their network would never allow a ping like "ping -s 5000". Not true of course. Especially given they fixed it.
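For reference, here is a rough sketch of how a ping like the one above gets split up. It assumes a 20-byte IPv4 header with no options, an 8-byte ICMP header, and a 1500-byte MTU; the function name `fragment_sizes` is just made up for illustration.

```python
def fragment_sizes(ip_payload_len, mtu=1500, ip_header=20):
    """Return the data bytes carried by each IPv4 fragment.

    Assumes a 20-byte IP header without options. Every fragment except
    the last must carry a multiple of 8 bytes of data.
    """
    max_data = (mtu - ip_header) // 8 * 8  # 1480 for a 1500-byte MTU
    sizes = []
    remaining = ip_payload_len
    while remaining > 0:
        chunk = min(max_data, remaining)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


# "ping -s 5000" sends 5000 data bytes plus an 8-byte ICMP header.
frags = fragment_sizes(5000 + 8)
print(frags)
```

Only the first fragment carries the ICMP header, which fits with seeing the "main" packet arrive while the rest go missing.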

  • What OS are you using? Some kind of Unix like I assume but please specify.
    – terdon
    Commented Aug 22, 2013 at 17:03
  • Might be worth getting a pcap of both a small successful transfer and a large failed transfer and see what differs.
    – MaQleod
    Commented Aug 22, 2013 at 17:10
  • Wouldn't the TCP protocol compensate for dropped packets by re-sending the packets? Also, "fragmented" does not correlate with "lost". If your ISP is dropping packets, then a large packet is just as likely to be lost as a small packet fragment. I don't know what's wrong, but you might be barking up the wrong tree. Have you tried running ssh in full debug mode? ssh -vvv <host>
    Commented Aug 22, 2013 at 17:20
  • @DarthAndroid - It would. If I were the author I would use a proxy or VPN to reduce the filtering the ISP is doing. If the ISP is really dropping packets to a point where the problem can be reduced I would also COMPLAIN to the service provider.
    – Ramhound
    Commented Aug 22, 2013 at 17:23
  • Dropped packets are something that ISPs will allow on all lines to some degree. Most residential lines can have up to 6% loss before an ISP will even consider investigating. A business line usually shouldn't be higher than 3%. Unfortunately, some applications are sensitive enough to dropped packets that even at those levels, it can cause significant problems.
    – MaQleod
    Commented Aug 22, 2013 at 17:35

1 Answer

In this case I overlooked the fact that the SSH and RDP sessions were going through an IPsec VPN. The large SSH packets at or near the PMTU were causing the IPsec packets to be bigger than the MTU and therefore to fragment. These fragments were lost and the session hung.

I presume that the session would eventually time out and recover, but humans time out more quickly.
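To illustrate the arithmetic, here is a rough sketch of how much a packet grows inside the tunnel. The numbers are assumptions, not taken from my actual setup: ESP in tunnel mode with AES-CBC (16-byte IV and block size) and a 12-byte HMAC-SHA1-96 ICV, no NAT-T, IPv4 without options. The function names are made up for illustration; other ciphers or UDP encapsulation give different overheads.

```python
def esp_tunnel_size(inner_len, block=16, iv=16, icv=12,
                    outer_ip=20, esp_hdr=8):
    """Size of the outer packet for an inner IP packet of inner_len bytes.

    Assumed layout: outer IP header, ESP header (SPI + sequence number),
    IV, then the inner packet plus a 2-byte ESP trailer padded up to the
    cipher block size, then the ICV.
    """
    trailer = 2  # pad-length byte + next-header byte
    padded = -(-(inner_len + trailer) // block) * block  # round up
    return outer_ip + esp_hdr + iv + padded + icv


def largest_inner(mtu=1500, **kw):
    """Largest inner packet that still fits in one outer packet."""
    n = mtu
    while n > 0 and esp_tunnel_size(n, **kw) > mtu:
        n -= 1
    return n


full = esp_tunnel_size(1500)   # a full-size inner packet
short = esp_tunnel_size(100)   # a short-output packet
fits = largest_inner(1500)
print(full, short, fits, fits - 40)  # last value: matching TCP MSS
```

Under these assumptions a full 1500-byte inner packet becomes 1560 bytes on the wire and has to be fragmented, while short packets pass untouched, which matches the symptoms. Clamping the TCP MSS for traffic through the tunnel to about `fits - 40` (20 bytes IP plus 20 bytes TCP) or lower would keep the outer packets under the MTU.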