I'm testing a setup that I would reproduce in a production environment if I can get it to work. What I have:
- A wireless network 192.168.88.0/24
- An IoT device X in the wireless network
- A Teltonika router (its RutOS firmware is an OpenWrt fork) that connects to the wireless network, masquerades its own LAN (192.168.2.0/24) behind it, and provides 4G failover for internet connectivity because the wireless network is sometimes unreliable.
- Another IoT device Y in the natted network behind the Teltonika
These IoT devices need to communicate with each other. They discover each other by means of a UDP broadcast from Y, to which X responds, revealing its IP address. Very simple. By default the broadcast goes to the local broadcast address 255.255.255.255, but the device can be configured to use a directed broadcast address instead, i.e. 192.168.88.255, which is what I have done.
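For anyone who wants to reproduce this without the actual devices, here is a minimal sketch of the discovery exchange as I understand it. The payloads (DISCOVER, HERE) and both function names are made up for illustration; the port is simply the destination port seen in the conntrack entry quoted further down, and the real protocol may well differ.

```python
import socket

DISCOVERY_PORT = 50607   # dport from the conntrack entry; assumed to be the discovery port
PROBE = b"DISCOVER"      # hypothetical payload, not the real protocol


def discover(broadcast_addr, port=DISCOVERY_PORT, timeout=2.0, payload=PROBE):
    """Role of Y: send one probe and return the responder's IP, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Required before the kernel lets us send to a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        sock.sendto(payload, (broadcast_addr, port))
        try:
            _data, (peer_ip, _peer_port) = sock.recvfrom(1024)
        except socket.timeout:
            return None
        return peer_ip


def respond_once(sock, reply=b"HERE"):
    """Role of X: answer a single probe with a unicast reply to whoever sent it."""
    _data, sender = sock.recvfrom(1024)
    sock.sendto(reply, sender)
```

On Y's side this would be called as discover("192.168.88.255"). The reply is unicast and comes from X's own address, not from the broadcast address the probe was sent to.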
Since the devices are in different subnets I have enabled bc_forwarding in the Teltonika and of course configured the firewall to allow the return traffic to come through.
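To double-check that the kernel setting is really active, something like the sketch below can read it back. It assumes the standard Linux sysctl layout under /proc/sys and, as far as I understand the kernel documentation, that the flag has to be set both under "all" and on the interface involved. The function name and the interface name in the usage note are hypothetical, and RutOS may expose the option through its own configuration layer.

```python
from pathlib import Path


def bc_forwarding_enabled(iface, sysctl_root="/proc/sys"):
    """Return True if bc_forwarding reads 1 for both 'all' and the given interface."""
    conf = Path(sysctl_root) / "net" / "ipv4" / "conf"

    def flag(name):
        try:
            return (conf / name / "bc_forwarding").read_text().strip() == "1"
        except OSError:
            # Missing file: old kernel or unknown interface, treat as disabled.
            return False

    return flag("all") and flag(iface)
```

For example, bc_forwarding_enabled("wlan0") on the router, where wlan0 stands in for whatever the interface is actually called.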
Using this configuration I can get the UDP broadcast from Y to X to be successfully forwarded with NAT in place. So far, so good.
Unfortunately, the return packet never comes back to Y!
I have verified that X does send the response to the Teltonika's address (as expected, due to NAT). I have also verified that the devices find each other without broadcasts when I configure the "broadcast" address to 192.168.88.whatever-X-got-from-DHCP-this-time. And I have temporarily placed them in the same network to confirm that discovery using the local broadcast address works as expected. So the problem seems to be that a response to a natted directed broadcast isn't getting routed back to Y.
Looking at /proc/net/nf_conntrack in the Teltonika I can see:
ipv4 2 udp 17 18 src=192.168.2.236 dst=192.168.88.255 sport=55879 dport=50607 packets=1 bytes=48 [UNREPLIED] src=192.168.88.255 dst=192.168.88.205 sport=50607 dport=55879 packets=0 bytes=0 mark=0 zone=0 use=2
192.168.88.205 is the address of the Teltonika router in the 192.168.88.0 subnet.
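To make my reading of that entry concrete, here is a small sketch that splits the line into its original and reply tuples and checks a packet against the reply tuple. It assumes conntrack matches replies on the exact tuple, which is precisely the assumption I'm asking about, so treat it as an illustration of my interpretation and not as confirmed kernel behaviour. The function names are mine, and 192.168.88.50 in the usage note is a made-up address for X.

```python
import re

LINE = (
    "ipv4 2 udp 17 18 src=192.168.2.236 dst=192.168.88.255 sport=55879 dport=50607 "
    "packets=1 bytes=48 [UNREPLIED] src=192.168.88.255 dst=192.168.88.205 "
    "sport=50607 dport=55879 packets=0 bytes=0 mark=0 zone=0 use=2"
)


def parse_conntrack_tuples(line):
    """Return (original, reply) dicts with the keys src, dst, sport and dport."""
    pairs = re.findall(r"\b(src|dst|sport|dport)=(\S+)", line)
    # The first four fields describe the original direction, the next four the expected reply.
    return dict(pairs[:4]), dict(pairs[4:8])


def matches_reply(reply, src, dst, sport, dport):
    """Simplified model: a packet counts as the reply only if all four fields match exactly."""
    return (reply["src"], reply["dst"], reply["sport"], reply["dport"]) == (
        src, dst, str(sport), str(dport))
```

With this model, matches_reply(reply, "192.168.88.50", "192.168.88.205", 50607, 55879) is False, because the entry expects the source to be 192.168.88.255.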
Questions:
1. I interpret that nf_conntrack line to mean that conntrack is expecting a response from the same broadcast address the original packet was sent to (192.168.88.255). Is that correct?
2. If the answer to 1 is no, what does that line then mean?
3. If the answer to 1 is yes, shouldn't it be expecting a response from any address in 192.168.88.0/24? After all, the router knows that 192.168.88.255 is a broadcast address, since it is in that subnet itself, and the packet only got routed because I enabled bc_forwarding. Nobody should be sending packets with the broadcast address as the source address, right?