NSA 2700 drops ARP requests for IP address it's NAT'ing. Firmware bug?
We have a pair of SonicWall 2700s in a very simple HA configuration. Here’s the problem. The WAN interface (X1) has IP address 10.5.1.2. Our internal router’s interface is 10.5.1.1. We have a single NAT policy for the WAN interface that maps the source IP of all egress traffic on that interface to 126.96.36.199. To the outside world, everyone on our LAN appears to be at 188.8.131.52. That’s great. Should work fine. Has for years with Cisco ASAs. However, via packet capture on the SonicWall I notice that the router at 10.5.1.1 sends an ARP request to the SonicWall for 184.108.40.206. The SonicWall drops the ARP request. The router can’t get a MAC address for 220.127.116.11, so it can’t send traffic to the WAN interface on the SonicWall and our connectivity goes away. There is an internal settings page (/settings/diag.html) on the SonicWall that has a button to send gratuitous ARPs. Hit that button and the router is happy…until its cache times out…about 4 hrs. Then the router sends another ARP request for 18.104.22.168 and it’s dropped by the SonicWall. There is another setting in the internal settings page to repeatedly send gratuitous ARP responses every 60 minutes. That works for days…until the thread on the SonicWall’s OS dies or an HA failover happens. So that workaround is no good. I configured a Static ARP entry for 22.214.171.124 on the SonicWall and set it to publish. Didn’t help. SonicWall support has pretty much given up.
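For readers unfamiliar with what the gratuitous ARP button puts on the wire, here is a minimal Python sketch of such a frame. It is an illustration only, not SonicOS code: the function name build_gratuitous_arp, the MAC address, and the address 203.0.113.10 (a documentation-range placeholder for the NAT'ed IP) are all assumptions.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Return a 42-byte Ethernet frame carrying a gratuitous ARP request.

    Hypothetical helper for illustration; mac and ip are placeholders.
    """
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    broadcast = b"\xff" * 6
    # Ethernet header: broadcast destination, our source MAC, EtherType ARP.
    eth = broadcast + hw + struct.pack("!H", 0x0806)
    # ARP header: hardware type Ethernet (1), protocol IPv4 (0x0800),
    # address lengths 6 and 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # In a gratuitous ARP the sender IP and target IP are the same address,
    # so every listener refreshes its cache entry for that IP.
    arp += hw + addr + b"\x00" * 6 + addr
    return eth + arp


frame = build_gratuitous_arp("00:11:22:33:44:55", "203.0.113.10")
print(len(frame))  # 42
```

Because the announcement only refreshes the router's cache, the entry still expires later, which matches the roughly 4 hour recurrence described above.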
It’s probably as simple as the SonicWall having a (hard-coded) security policy that rejects ARP requests for hosts (126.96.36.199 in our case) that it thinks are not on its subnet. It doesn’t seem to consider NAT policies! Anyway, perhaps some sort of static route or some overriding security policy would work around the problem. Not sure.
@ltenny I'm not sure how your router expects to find 188.8.131.52, because it's not part of the router's LAN-facing interface subnet, which is 10.5.1.0/24. Can you route 184.108.40.206 via 10.5.1.2 on the router itself? That should do the trick.
The router routes 220.127.116.11 out a particular interface via a simple static route. This interface on the router has 10.5.1.1 as its IP address. This is very common in router configurations. The firewall's WAN interface is 10.5.1.2. They are on the same subnet. Same broadcast domain. The router has this packet with a destination IP of 18.104.22.168. It knows it needs to go out a certain interface (whose IP is 10.5.1.1). It needs to know the MAC address of 22.214.171.124, so it sends out an ARP request expecting that someone in the broadcast domain of its interface knows the MAC address. The firewall should respond with the MAC address of the firewall's WAN interface because this WAN interface is NAT'ing egress traffic to 126.96.36.199.
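The difference between the two routing styles discussed here (a static route that names only the outgoing interface versus one that names 10.5.1.2 as next hop) comes down to which address the router has to resolve with ARP. A small Python sketch of that decision, assuming a hypothetical helper arp_target and using 203.0.113.10 as a placeholder for the NAT'ed address:

```python
import ipaddress


def arp_target(dest, next_hop=None):
    """Return the IP address the router must resolve via ARP.

    Hypothetical model: with an interface-only static route the destination
    is treated as directly connected, so the router ARPs for the destination
    itself. With an explicit next hop it ARPs for the next hop instead.
    """
    return ipaddress.ip_address(next_hop if next_hop else dest)


NAT_IP = "203.0.113.10"  # placeholder for the NAT'ed address

# Interface-only route: the firewall must answer ARP for the NAT'ed IP.
print(arp_target(NAT_IP))              # 203.0.113.10
# Next-hop route via the firewall's WAN IP: ordinary ARP for 10.5.1.2.
print(arp_target(NAT_IP, "10.5.1.2"))  # 10.5.1.2
```

With the next-hop form, the firewall only ever has to answer ARP for its own interface address.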
Remember, the firewall is replacing the source IP address of all egress packets with 188.8.131.52, so the whole world past the firewall's WAN interface believes this packet came from 184.108.40.206...that's the purpose of NAT. OK, when some host replies to this packet, it knows only to send it to destination 220.127.116.11. The routing in the internet gets it to our perimeter router. Our router has a static route that sends it out a particular interface. But, of course, in order for the router to send it out at the data link layer it needs to know the MAC address to send it to...which the firewall has failed to provide. That's the problem. The firewall drops the ARP request because it says that 18.104.22.168 is not on its network, but it has failed to consider the NAT policy that made all egress traffic from the interface look like it came from 22.214.171.124.
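The expected versus observed behavior can be modeled in a few lines of Python. This is a sketch of the behavior as described in this thread, not of SonicOS internals; the function answers_arp, its proxy_for_nat flag, and the placeholder address 203.0.113.10 are assumptions made for illustration.

```python
import ipaddress


def answers_arp(target, interface_ip, nat_ips=(), proxy_for_nat=True):
    """Return True if the interface should reply to an ARP request for target."""
    t = ipaddress.ip_address(target)
    if t == ipaddress.ip_address(interface_ip):
        return True  # every host answers for its own interface address
    if proxy_for_nat:
        # Expected behavior: also answer for addresses the interface NATs to.
        return any(t == ipaddress.ip_address(ip) for ip in nat_ips)
    return False  # observed behavior: the request is dropped


WAN_IP = "10.5.1.2"
NAT_IP = "203.0.113.10"  # placeholder for the NAT'ed address

print(answers_arp(NAT_IP, WAN_IP, [NAT_IP]))                       # True
print(answers_arp(NAT_IP, WAN_IP, [NAT_IP], proxy_for_nat=False))  # False
```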
Routing networks are commonly completely different than the traffic they route. The interfaces on the routers typically don't have IPs that are in the subnets of the traffic they route. They can, but it's limiting.
Have you read the following?
Yes. I've done this. Doesn't work. Rebooted everything and tried again. Still doesn't work. It's been set for a couple of weeks now, just in case. When support was engaged and still interested, they double-checked to make sure I had it set to publish and configured correctly.
Yup...had a lot of hope on that one.
Sorry, I missed that you tried that in the first post. Can you share more details about the interface configs on the SonicWall? Are the default NATs in place or did you enable routing mode? Do you still have a copy of the ASA config you can sanitize and post?
Turns out it's either a) a bug in the firmware, or b) so poorly documented that it took tech support 3 weeks to find it. This may affect any NAT configuration. TKWITS was on to something. The second part of the referenced article mentions a route policy. Now the route policy is a little strange because it needs to be for inbound traffic to the NAT'ed IP. Packet capture now confirms that the 2700 correctly responds to ARP requests. The crazy thing is that any reasonable implementation of NAT would always imply this policy (we need to respond to ARP request for what we NAT)...however, it needs to be explicit for SonicWall via this route policy.
The route policy you added is incorrect. Please add the route policy as below:
KB - https://www.sonicwall.com/support/knowledge-base/configuring-multiple-wan-subnets-using-static-arp-with-sonicos-enhanced/170503911164326/
"The crazy thing is that any reasonable implementation of NAT would always imply this policy (we need to respond to ARP request for what we NAT)...however, it needs to be explicit for SonicWall via this route policy."
Implied policies would also cause problems... Your configuration situation of NATing an address that is not in a used address range is not common in my world. Glad it was worked out.
What's the purpose of the 10.5.1 network here? Surely it would be easier just to give the firewall the IP 126.96.36.199 and be done with it.
In small, home-office type networks you would be correct. Most people simply use the IP given to them by their ISP. In larger networks, the routing network is different from the routed network. What I mean here is that the IP addresses of the interfaces on the routers and other devices that participate in moving packets around are different from the IP addresses of the packets they are moving. So, for example, let's say you have a /24 IP allocation, 188.8.131.52 - 184.108.40.206, which gives you 254 usable IP addresses. If you are using this space for router interface IPs, then you might say 220.127.116.11 is your ISP's (next hop) interface. Your perimeter router interface IP would be, say, 18.104.22.168, and it would have, say, 8 interfaces, 22.214.171.124 - 126.96.36.199. Then you need a firewall and it has an interface...but wait, I want this firewall to expose 30 web servers to the internet from a DMZ, each with its own IP. So, mmm, OK, I'll need 30 interfaces on my firewall? What a mess. Even medium-sized businesses don't do this. You would have consumed a ton of valuable IP addresses for internal traffic management. These IPs should be internal. That's one reason we use NAT. And, of course, part of NAT is that the interface must respond to ARP requests for addresses it's NAT'ing. Which, it turns out, takes a few hoops to jump through for SonicWall. Other vendors implement this in a more streamlined fashion :-)
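The address arithmetic above can be checked with Python's standard ipaddress module. This is only a sketch: 203.0.113.0/24 is a documentation-range placeholder standing in for the real public /24, and 10.5.1.0/24 is the transit subnet mentioned earlier in the thread.

```python
import ipaddress

# Placeholder public allocation (documentation range, not the real block).
block = ipaddress.ip_network("203.0.113.0/24")
usable = list(block.hosts())  # excludes the network and broadcast addresses

print(block.num_addresses)  # 256
print(len(usable))          # 254

# The routing (transit) network is separate from the routed public block.
transit = ipaddress.ip_network("10.5.1.0/24")
print(ipaddress.ip_address("10.5.1.2") in transit)  # True
print(block.overlaps(transit))                      # False
```

Keeping router and firewall interfaces in the private transit subnet leaves all 254 public addresses available for NAT.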
Yes, no one does what you are describing with using multiple interfaces to get devices behind a firewall 'exposed' to the internet. Maybe during the earliest years people did that...
Your issue was specific to a route policy, not the way the Sonicwall handles NAT...
"Now the route policy is a little strange because it needs to be for inbound traffic to the NAT'ed IP." That's not really strange considering you're not using an IP on the actual interface that is in the block of IPs you're using for the NAT...
SonicWalls used to be (pre-Gen7) first and foremost a router with NAT and firewall capabilities (upon which they put UTM). You could remove or edit default routes, NAT statements, and firewall rules to your heart's desire and make it function (more or less) how you want. Now with Gen7 the defaults aren't editable, so you have to bend to their design. This was likely a decision based upon the most frequent deployment scenarios.