tl;dr: my firewall has stopped getting IP addresses via DHCP for some reason which seems to be related to the MAC address.
Network topology
   [Fibre]
      |
   [Router] - Melbye Raycore RC-CP9
      |
   [Router/Firewall] - HP T620+, opnSense
      |      |
      |   [WiFi router] - Netgear R7800
      |      . )
      |      . ) )
      |      . ) ) )   {various devices - Linux, Android and iOS}
      |
      |
   [Switch] - 16-port Cisco
    |  |  |
    |  |  |
   {various devices - Windows, Linux}
The firewall is acting as local DHCP server on the LAN port. The WiFi 'router' is configured in AP mode. There is also a DMZ coming off a third port of the firewall but it has been removed from the equation for now.
Up until a couple days ago everything was working fine.
Initial Troubleshooting
1. I disconnected the firewall to mount it on a wall closer to the fibre router and to route the cabling more tidily. In the meantime I had the WiFi router plugged in directly so laptops and mobile devices still had internet connectivity.
2. I set the firewall up again but now there was no WAN connectivity. The WAN IP address is reported as 0.0.0.0. I tried moving it back to the old location to rule out any weird issues with the ethernet outlet or patching, but no improvement.
3. I backed up my config, then followed a few dead ends with opnSense configuration around things like gateway and DHCP client settings before deciding to just reinstall it. No change with the default config. I tried booting IPFire in case it was some BSD-specific quirk but got the same behaviour.
4. I double-checked all cabling; everything is fine. I tried connecting the WiFi router directly to the fibre router again. It still works, but since the WiFi router has its own DHCP server disabled, every device that connects to it gets a public IP address (37.?.?.?).
5. I restored the previously working opnSense configuration, reset the fibre router and tried again. Still no connectivity on the WAN side. I tried setting a static IP and gateway based on the ones the WiFi router received, but still no connectivity. Since the LAN side works, I tried reassigning the WAN/LAN ports in case it was some kind of hardware issue with the NIC, but the behaviour is the same.
6. I tried connecting a desktop directly to the fibre router, which has worked previously, but now this too cannot get an IP address. When it connects via the WiFi router it works, but with a public IP address as mentioned above. I tried a laptop with an ethernet dongle and it too is unable to get an IP (though I've only connected it via WiFi before). I then connected a small server, which was previously in the DMZ, directly to the router and it got an IP address fine, albeit a completely different IP and gateway to what the router was getting.
7. I've double-checked all cables and replaced a couple which might have been suspect (they were working previously, so I'm just clutching at straws). It's CAT6 and CAT7 S-FTP all the way from the fibre router to each node, all working fine.
Minor breakthrough
After a suggestion from @user1686 I looked at the DHCP logs in opnSense and noticed this loop going on (sorted with newest logs first):
Nov 20 22:42:26 dhclient FAIL
Nov 20 22:42:26 dhclient 47074 No working leases in persistent database - sleeping.
Nov 20 22:42:26 dhclient 47074 No DHCPOFFERS received.
Nov 20 22:42:24 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 2
Nov 20 22:42:17 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 7
Nov 20 22:42:10 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 7
Nov 20 22:41:58 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 12
Nov 20 22:41:50 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 8
Nov 20 22:41:37 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 13
It happens over and over again.
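To make the loop easier to spot in a longer log export, I put together a rough Python sketch that tallies the DHCP message types per run. It assumes the line layout shown above (timestamp, `dhclient`, optional PID, message) and that an offer, if one ever arrived, would be logged with a leading `DHCPOFFER`, by analogy with the `DHCPACK` line; the function names `summarise` and `looks_unanswered` are my own.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpt above:
#   Nov 20 22:42:24 dhclient 47074 DHCPDISCOVER on igb0 to ... interval 2
# The PID is optional because the "FAIL" line has none.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+dhclient\s+(?:(?P<pid>\d+)\s+)?(?P<msg>.*)$"
)

DHCP_TYPES = {"DHCPDISCOVER", "DHCPOFFER", "DHCPREQUEST", "DHCPACK", "DHCPNAK"}


def summarise(lines):
    """Count DHCP message types (plus 'No DHCPOFFERS' notices) in log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a dhclient line, skip it
        msg = match.group("msg")
        words = msg.split()
        if words and words[0] in DHCP_TYPES:
            counts[words[0]] += 1
        elif msg.startswith("No DHCPOFFERS"):
            counts["NO_OFFERS"] += 1
    return counts


def looks_unanswered(counts):
    """True when discovers were sent but nothing came back from a server."""
    return (
        counts["DHCPDISCOVER"] > 0
        and counts["DHCPOFFER"] == 0
        and counts["DHCPACK"] == 0
    )


if __name__ == "__main__":
    sample = [
        "Nov 20 22:42:26 dhclient FAIL",
        "Nov 20 22:42:26 dhclient 47074 No DHCPOFFERS received.",
        "Nov 20 22:42:24 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 2",
        "Nov 20 22:42:17 dhclient 47074 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 7",
    ]
    counts = summarise(sample)
    print(dict(counts), "unanswered:", looks_unanswered(counts))
```

This only confirms what the log already suggests: the client is broadcasting and nothing is answering. It says nothing about why.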
On a whim I tried setting a static IP address again, taken from another device, but this time also spoofing that device's MAC address. Now it works pretty much immediately:
Nov 20 22:43:30 dhclient 20508 DHCPACK from 5.###.###.###
Nov 20 22:43:29 dhclient 20508 DHCPREQUEST on igb0 to 255.255.255.255 port 67
Nov 20 22:43:29 dhclient 20508 DHCPREQUEST on igb0 to 255.255.255.255 port 67
This made me think maybe the particular NIC MAC is being blocked or ignored for some reason, but when I change the spoofed MAC address to something random-ish (e.g. change the last AA to BB) it goes back to No DHCPOFFERS. Also, I would have expected swapping the WAN/LAN port assignments in point [5] above to have worked if that were the case.
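To get a clearer picture of where the MAC address actually appears in a DHCP request, here is a minimal Python sketch that builds (but does not send) a DHCPDISCOVER payload following the BOOTP/DHCP layout in RFC 2131 and RFC 2132. The MAC addresses are made-up placeholders, not my real ones, and `build_discover` is a hypothetical helper. I don't know whether the ISP's server keys on the `chaddr` field, the client identifier (option 61) or the Ethernet source address; the sketch only shows that changing the last octet changes the first two.

```python
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of DHCP options


def parse_mac(text):
    """Turn 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated octets: %r" % text)
    return bytes(int(p, 16) for p in parts)


def build_discover(mac, xid):
    """Build the UDP payload of a DHCPDISCOVER for the given MAC (not sent)."""
    hw = parse_mac(mac)
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,          # op: BOOTREQUEST
        1,          # htype: Ethernet
        6,          # hlen: MAC length
        0,          # hops
        xid,        # transaction id
        0,          # secs
        0x8000,     # flags: ask the server to reply by broadcast
        b"", b"", b"", b"",  # ciaddr, yiaddr, siaddr, giaddr (zero-padded)
        hw,         # chaddr: client hardware address, padded to 16 bytes
        b"",        # sname
        b"",        # file
    )
    options = (
        bytes([53, 1, 1])        # option 53: message type = DISCOVER
        + bytes([61, 7, 1]) + hw  # option 61: client id = type 1 + MAC
        + bytes([255])           # end of options
    )
    return header + MAGIC_COOKIE + options


if __name__ == "__main__":
    # Placeholder addresses differing only in the last octet (AA vs BB).
    a = build_discover("02:00:00:00:00:aa", xid=0x12345678)
    b = build_discover("02:00:00:00:00:bb", xid=0x12345678)
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    print("payload length:", len(a), "differing byte offsets:", diff)
```

If the upstream DHCP server only answers MAC addresses it has already seen, either of those fields would be enough for it to tell the two requests apart, which would be consistent with the behaviour above, though that is a guess on my part.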
I can remove the static IP address now and it gets an IP address fine as long as the MAC is spoofed but I don't know if that's actually getting an IP address assigned properly or it's only getting an existing lease while it's still valid.
I reset the fibre router again and go through the same steps above in case I skipped over something. I leave the other devices except the firewall disconnected from the fibre router but it doesn't help. This is now nearly 24 hours later so I would expect anything relating to DHCP lease expiry would be a non-issue now.
From point [1] above, I've heard suggestions that there are ISPs which only allow a single device to get an IP on a given network. That doesn't seem to explain this, since I've had multiple devices connected directly to the fibre router without issue prior to my current setup.
I'm at a loss for how to most effectively troubleshoot further from here.