I run a personal VPN network with WireGuard for my family and me. It started some years ago out of curiosity, but then became quite useful and started to grow. I created an overview of it, linked here: Network overview
This was working absolutely fine until recently, when my ISP switched me from a traditional DSL connection to coax cable (DOCSIS). As part of this, the device labeled 'ISP Shitbox A' in the overview diagram was replaced. While at it, I decided to buy and add another router (the "Fritz!Box 4060"), because the new "ISP Shitbox A" doesn't offer sufficient configuration options for my purposes.
So in the "zrh" part of the network, the devices were assigned new local IP addresses and I had to change some routes to make this work properly, but at first glance it seemed fine. Then I unfortunately noticed that the performance of my VPN was heavily degraded. I'm not talking about a few percent here, but a drop from 20+ MByte/s to just a couple of kilobytes per second (often causing HTTP connections to time out). To be more specific:
In the zrh net, I have a Raspberry Pi ("zrh-02", running Raspberry Pi OS Lite 11) acting as a VPN gateway, connecting the zrh net to the ams net (more precisely, to a VPS named "ams-01", Ubuntu Server 22.04). I also have another Raspberry Pi ("fra-01", Raspberry Pi OS Lite 11) in the fra net, which acts as its counterpart and connects the fra net to the ams net. (The yellow and blue parts of the network on the left side of the diagram shouldn't be relevant to the problem, but I included them for completeness.)
I did some research and found that, due to the changed underlying technology of the zrh ISP connection, I should review the MTU of my zrh-02 gateway. For testing purposes, I connected client A.1 directly to ams-01 (using the same physical connection, but the WireGuard client/app on the device instead of the gateway zrh-02) and fiddled with the MTU there. Indeed, reducing it (to 1384 in this case) seemed to improve the connection quality: speed around ~75% of what it was before the migration, which is in an acceptable range. (I think the connection quality was generally a bit worse after the migration anyway, as I notice slightly higher latency there independently of the VPN.)
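For context, the arithmetic I used while probing (a sketch; header sizes assume a plain IPv4 outer packet with no extra encapsulation layers such as PPPoE):

```python
# Header sizes for MTU arithmetic (IPv4 assumed throughout).
IPV4_HEADER = 20   # bytes
ICMP_HEADER = 8    # bytes
UDP_HEADER = 8     # bytes
WG_OVERHEAD = 32   # WireGuard data-packet overhead: 4 type/reserved + 4 index + 8 counter + 16 tag

def ping_payload(packet_size: int) -> int:
    """Payload to pass to `ping -s` so the resulting IPv4 packet is `packet_size` bytes."""
    return packet_size - IPV4_HEADER - ICMP_HEADER

def wg_mtu(path_mtu: int) -> int:
    """Largest WireGuard tunnel MTU whose encrypted packets fit the underlying path MTU."""
    return path_mtu - IPV4_HEADER - UDP_HEADER - WG_OVERHEAD

print(ping_payload(1384))  # 1356 -> `ping -M do -s 1356 <target>` probes a 1384-byte path
print(wg_mtu(1444))        # 1384 -> a 1444-byte underlying path allows tunnel MTU 1384
```

So a tunnel MTU of 1384 corresponds to an underlying IPv4 path MTU of 1444, which is plausibly what the DOCSIS link provides.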
Unfortunately, setting the MTU to 1384 on zrh-02 as well doesn't resolve the issue there. But: when speed testing traffic originating from zrh-02 itself, it's as fast as from client A.1 (when directly connected to ams-01). So I did some more testing (using iperf3). By "fast" in the following I mean 10+ MByte/s and by "slow" 5-100 KByte/s (I don't think the exact numbers matter, as the gap is so huge).
zrh-02->fra-01: fast
fra-01->zrh-02: fast
zrh-03->fra-01: slow
zrh-03->zrh-02: fast
fra-01->zrh-03: fast
I conclude from this that the MTU was not the (only) problem; my zrh-02 gateway RasPi seems to be misconfigured in some way. Apparently only traffic from zrh routed via zrh-02 to ams/fra is slow (even traffic in just the opposite direction is not affected). (Additional remarks: the zrh-* devices are on the same switch, with 1 GBit/s ports each. Pings work in all cases, with at most ~60 ms in any case. Traceroutes didn't show anything unexpected. In the slow case zrh-03->fra-01: 1. 192.168.177.1, 2. zrh-02, 3. 10.49.0.1, 4. 192.168.188.78; nothing more.)
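The iperf3 runs above were plain server/client invocations; a small helper sketching what I ran (the fra-01 address is the one visible in the traceroute above, the duration is the iperf3 default I used):

```python
def iperf3_client_cmd(server_ip: str, seconds: int = 10) -> list[str]:
    """Build the client-side iperf3 invocation; the other end runs `iperf3 -s`."""
    return ["iperf3", "-c", server_ip, "-t", str(seconds)]

# e.g. for the slow case zrh-03 -> fra-01: run this on zrh-03
# while fra-01 runs `iperf3 -s`:
print(" ".join(iperf3_client_cmd("192.168.188.78")))
# iperf3 -c 192.168.188.78 -t 10
```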
My first guess was that the "Fritz!Box 4060" I added to the zrh net might have different defaults regarding IPv6, and that some network participants were trying to use IPv6 first and only then falling back to IPv4. But to my understanding this should only affect the time needed to establish a TCP connection, not the throughput afterwards. I nevertheless tried disabling IPv6 entirely on zrh-02, zrh-03 and the Fritz!Box 4060, but it had no effect.
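In case it matters how: on the Pis this can be done via sysctl (the Fritz!Box side is toggled in its web UI), roughly:

```
# /etc/sysctl.d/99-disable-ipv6.conf  (applied with `sysctl --system`)
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
```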
This (finally!) leads to my question: any ideas what else could possibly be misconfigured?
For more reference:
iptables -L from zrh-02:
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT icmp -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
DOCKER-USER all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain DOCKER (2 references)
target prot opt source destination
ACCEPT tcp -- anywhere 172.18.0.3 tcp dpt:domain
ACCEPT udp -- anywhere 172.18.0.3 udp dpt:domain
ACCEPT tcp -- anywhere 172.18.0.2 tcp dpt:https
ACCEPT tcp -- anywhere 172.18.0.2 tcp dpt:http
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
DOCKER-ISOLATION-STAGE-2 all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-ISOLATION-STAGE-2 (2 references)
target prot opt source destination
DROP all -- anywhere anywhere
DROP all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
ip route from zrh-02:
default via 192.168.177.1 dev eth0 proto dhcp src 192.168.177.11 metric 202 mtu 1500
10.49.0.0/16 dev wg0 scope link
(public IP of ams-01) dev wg0 scope link
169.254.0.0/16 dev veth080a326 scope link src 169.254.119.196 metric 209
169.254.0.0/16 dev vetha933199 scope link src 169.254.229.232 metric 211
172.16.253.86 dev wg0 scope link
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.0.0/16 dev br-bbdeb14a7bce proto kernel scope link src 172.18.0.1
192.168.177.0/24 dev eth0 proto dhcp scope link src 192.168.177.11 metric 202 mtu 1500
192.168.188.0/24 dev wg0 scope link
cat /etc/wireguard/wg0.conf from zrh-02:
[Interface]
PrivateKey = (redacted)
Address = 10.49.0.11
DNS = 172.16.253.86
PostUp = iptables -A FORWARD -i %i -j ACCEPT; iptables -A FORWARD -o %i -j ACCEPT; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE; iptables -t nat -A POSTROUTING -o wg0 -j MASQUERADE; iptables -A INPUT -p icmp -j ACCEPT
PostDown = iptables -D FORWARD -i %i -j ACCEPT; iptables -D FORWARD -o %i -j ACCEPT; iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE; iptables -t nat -D POSTROUTING -o wg0 -j MASQUERADE; iptables -D INPUT -p icmp -j ACCEPT
MTU = 1384
[Peer]
PublicKey = (redacted)
PresharedKey = (redacted)
AllowedIPs = 192.168.188.0/24, 10.49.0.0/16, 172.16.253.86/32, (public IP of ams-02)/32
Endpoint = (public IP of ams-01):51820
PersistentKeepalive = 25
ifconfig from zrh-02:
br-bbdeb14a7bce: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.18.0.1 netmask 255.255.0.0 broadcast 172.18.255.255
inet6 fe80::42:4aff:febf:5a71 prefixlen 64 scopeid 0x20<link>
ether 02:42:4a:bf:5a:71 txqueuelen 0 (Ethernet)
RX packets 109066 bytes 12276335 (11.7 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 97743 bytes 17850048 (17.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
ether 02:42:37:59:e5:b5 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.177.11 netmask 255.255.255.0 broadcast 192.168.177.255
inet6 fe80::815b:d516:d548:42d3 prefixlen 64 scopeid 0x20<link>
inet6 2a02:aa12:a87e:3b6:6b2d:cfa9:5b20:1cfb prefixlen 64 scopeid 0x0<global>
ether dc:a6:32:48:2d:74 txqueuelen 1000 (Ethernet)
RX packets 345104 bytes 60403645 (57.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 174448 bytes 39141821 (37.3 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 11 bytes 1722 (1.6 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11 bytes 1722 (1.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
veth080a326: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 169.254.119.196 netmask 255.255.0.0 broadcast 169.254.255.255
inet6 fe80::84fc:4d80:c0ff:87c1 prefixlen 64 scopeid 0x20<link>
ether e6:fc:45:96:c1:13 txqueuelen 0 (Ethernet)
RX packets 108227 bytes 13566490 (12.9 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 98381 bytes 18072087 (17.2 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
vetha933199: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 169.254.229.232 netmask 255.255.0.0 broadcast 169.254.255.255
inet6 fe80::609c:a3ff:fe93:71d4 prefixlen 64 scopeid 0x20<link>
inet6 fe80::61fa:14fb:96c6:ec72 prefixlen 64 scopeid 0x20<link>
ether 62:9c:a3:93:71:d4 txqueuelen 0 (Ethernet)
RX packets 6 bytes 701 (701.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 753 bytes 250040 (244.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
wg0: flags=209<UP,POINTOPOINT,RUNNING,NOARP> mtu 1384
inet 10.49.0.11 netmask 255.255.255.255 destination 10.49.0.11
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 1000 (UNSPEC)
RX packets 418 bytes 46076 (44.9 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 425 bytes 244724 (238.9 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0