I have a network consisting of the following:
A pfSense router with an untagged network 10.1.0.0/23 and a tagged VLAN 40 network 10.1.40.0/24.
A wireless computer connected to the 10.1.0.0/23 network with IP address 10.1.0.11.
A Linux server connected to the router through two network interfaces with addresses 10.1.0.200 and 10.1.40.200. Both addresses are assigned through DHCP by the router (more on this in a bit).
Problem
When I establish an SSH connection from the wireless computer to 10.1.0.200, the connection never drops, which makes sense since client and server are on the same subnet.
When I establish an SSH connection from the wireless computer (10.1.0.11) to 10.1.40.200, the connection is established initially, but within about 20-40 seconds it drops. Examining the pfSense logs, I see the following:
Default allow LAN to any rule (100000101) 10.1.1.11:60991 10.1.40.200:22 TCP:S
...
...
Default deny rule IPv4 (1000000103)| 10.1.1.11:60742| 10.1.40.200:22|TCP:PA
Default deny rule IPv4 (1000000103)| 10.1.1.11:60742| 10.1.40.200:22|TCP:A
After consulting the pfSense documentation (https://docs.netgate.com/pfsense/en/latest/troubleshooting/asymmetric-routing.html), this appears to be a case of asymmetric routing. I'm wondering how to fix this issue.
Server routing table
As stated above, the Linux server has two network interfaces attached to two separate networks, with IP addresses handed out by DHCP.
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth1.40: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether d2:14:29:b2:d8:05 brd ff:ff:ff:ff:ff:ff
inet 10.1.40.200/24 metric 1000 brd 10.1.40.255 scope global dynamic eth1.40
valid_lft 6690sec preferred_lft 6690sec
3: eth1.0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 62:d0:5d:0c:3f:24 brd ff:ff:ff:ff:ff:ff
inet 10.1.0.200/23 metric 1000 brd 10.1.1.255 scope global dynamic eth1.0
valid_lft 2496sec preferred_lft 2496sec
Because of this, the routing table looks like the following (which I think is the source of the problem):
default via 10.1.40.1 dev eth1.40 proto dhcp src 10.1.40.200 metric 1000
default via 10.1.0.1 dev eth1.0 proto dhcp src 10.1.0.200 metric 1000
10.1.0.0/23 dev eth1.0 proto kernel scope link src 10.1.0.200 metric 1000
10.1.0.1 dev eth1.0 proto dhcp scope link src 10.1.0.200 metric 1000
10.1.40.0/24 dev eth1.40 proto kernel scope link src 10.1.40.200 metric 1000
10.1.40.1 dev eth1.40 proto dhcp scope link src 10.1.40.200 metric 1000
What I think is happening is this: packets from the client (10.1.0.11) pass through the router, are routed onto the 10.1.40.0/24 network, and are received by the server on interface eth1.40. The server sends its replies to the client out of the eth1.0 interface, because the client's address matches the directly connected 10.1.0.0/23 route, so I'm guessing the replies never pass back through the router. The client's subsequent packets to the server (the TCP:PA and TCP:A packets) do pass through the router, which has only seen one direction of the connection and appears to drop them.
Thinking about the problem a bit more, I'm not sure of the best way to design things to prevent this, other than making the server reachable on one interface rather than two. I've contemplated changing the server's routing table, but I'm not sure what changes would prevent the issue.
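To check my reasoning, here is a rough sketch of how I understand the server picks an outgoing interface. It is a simplified model, not the kernel's actual lookup: it assumes selection is by destination address only (longest prefix first, then lowest metric) and ignores policy rules and the packet's source address. The routes are copied from the table above; the function name pick_route is just something I made up for illustration.

```python
import ipaddress

# Routes copied from the server's routing table above:
# (destination, outgoing device, metric)
ROUTES = [
    ("0.0.0.0/0", "eth1.40", 1000),
    ("0.0.0.0/0", "eth1.0", 1000),
    ("10.1.0.0/23", "eth1.0", 1000),
    ("10.1.0.1/32", "eth1.0", 1000),
    ("10.1.40.0/24", "eth1.40", 1000),
    ("10.1.40.1/32", "eth1.40", 1000),
]


def pick_route(dst):
    """Return the device a packet to dst would leave on (simplified model).

    Only the destination is considered: the most specific matching
    prefix wins, and a lower metric breaks ties between equal prefixes.
    """
    addr = ipaddress.ip_address(dst)
    candidates = []
    for net_str, dev, metric in ROUTES:
        net = ipaddress.ip_network(net_str)
        if addr in net:
            candidates.append((net.prefixlen, -metric, dev))
    # Longest prefix first, then lowest metric.
    return max(candidates)[2] if candidates else None


if __name__ == "__main__":
    # Reply from 10.1.40.200 back to the wireless client.
    print(pick_route("10.1.0.11"))
    # Traffic to the VLAN 40 gateway.
    print(pick_route("10.1.40.1"))
```

If this model is right, a reply to 10.1.0.11 matches the connected 10.1.0.0/23 route and leaves on eth1.0 regardless of which address the client originally connected to, which would explain why the router only sees one half of the conversation.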
Thanks for any input.