NSv200 Azure VPN to NSa/TZ: traffic randomly stops passing over tunnel interface
Just wondering if anyone else in the community has experienced this (I have a ticket logged with SonicWall :) )
I now have this issue on 4 totally different sites, each with an NSv200 in Azure and an NSa/TZ on prem.
NSv200 (azure) - NSa2650
NSv200 (azure) - NSa2600
NSv200 (azure) - TZ500
NSv200 (azure) - TZ300 (18.104.22.168-44n)
All other firewalls above are on latest firmware, NSv_200__Azure_-6_5_4_4-44v-21-987 and 22.214.171.124-83n on all NSa/TZ
Though the issue was also present on NSv200 nsv_azure_126.96.36.199-44v-21-987 and NSv_200__Azure_-6_5_0_2-8v-37-628 with NSa_2650-6_5_4_5-53n.
The setup is a route-based (tunnel interface) IKEv2 VPN between the Azure NSv200 and the on-prem NSa/TZ.
The problem is that out of nowhere, maybe once a week, maybe every 2 weeks, maybe once a month, the tunnel stays up/green on both sides but traffic does not pass over it. I can see from the traffic logs that the NSv200 is not trying to push traffic into the tunnel: nothing appears in the packet capture for the interesting traffic on the tunnel, even though I know for sure there is a lot of traffic trying to get across. I also cannot see traffic inbound from the other site on the tunnel. The NSa/TZ side, however, is trying to push traffic into the tunnel (I can see this in its packet captures).
Once I bounce the tunnel it all works again.
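Because the tunnel stays green while traffic is dead, the tunnel status alone cannot be used to detect this. A rough sketch of an external check that could catch the stall is below. It assumes a host behind the far-side firewall that accepts TCP connections (the host and port are placeholders, not from my setup) and that the tunnel status is obtained separately, for example by checking the management UI. It is only an illustration, not something SonicWall provides.

```python
import socket
from typing import Sequence


def probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    host/port are hypothetical: any service reachable only via the tunnel.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def is_stalled(tunnel_reported_up: bool,
               recent_probes: Sequence[bool],
               min_failures: int = 3) -> bool:
    """True when the tunnel is reported up but the last min_failures
    probes across it have all failed (the 'green but dead' symptom)."""
    if not tunnel_reported_up or len(recent_probes) < min_failures:
        return False
    return not any(recent_probes[-min_failures:])
```

Run from a scheduler every minute or so, appending each probe() result to a list and alerting when is_stalled() returns True, this would at least timestamp each stall so it can be lined up against the firewall logs.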
The errors in the packet captures are:
[in:X1*(interface),out:--,DROPPED, Drop Code: 425(Octeon Decrypyion Failed Pad check), Module Id: 20(ipSec), (Ref.Id: _775_jqtfdPdufpoJoqvu),1:1)]
[in:X1*(interface),out:--,DROPPED, Drop Code: 680(Packet dropped - fails to handle IPSec pkt), Module Id: 20(ipSec), (Ref.Id: _2926_txGsIboemfJqQlu),1:1)]
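To see which drop codes dominate when a stall happens, exported capture lines in the format above can be tallied. The sketch below assumes the lines look exactly like the two samples shown here; the regular expression is my own guess at the layout, not a documented SonicWall format.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Pattern inferred from the two sample lines only (an assumption).
DROP_RE = re.compile(
    r"in:(?P<in_if>[^,]+),out:(?P<out_if>[^,]+),DROPPED, "
    r"Drop Code: (?P<code>\d+)\((?P<reason>.*?)\), "
    r"Module Id: (?P<module_id>\d+)\((?P<module>[^)]*)\)"
)

SAMPLES = [
    "[in:X1*(interface),out:--,DROPPED, Drop Code: 425(Octeon Decrypyion Failed Pad check), "
    "Module Id: 20(ipSec), (Ref.Id: _775_jqtfdPdufpoJoqvu),1:1)]",
    "[in:X1*(interface),out:--,DROPPED, Drop Code: 680(Packet dropped - fails to handle IPSec pkt), "
    "Module Id: 20(ipSec), (Ref.Id: _2926_txGsIboemfJqQlu),1:1)]",
]


def parse_drop(line: str) -> Optional[dict]:
    """Extract interface, drop code, reason and module from one drop line."""
    m = DROP_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    info["code"] = int(info["code"])
    info["module_id"] = int(info["module_id"])
    return info


def count_drop_codes(lines: Iterable[str]) -> Counter:
    """Count how often each drop code appears, ignoring unparsable lines."""
    parsed = (parse_drop(line) for line in lines)
    return Counter(p["code"] for p in parsed if p is not None)


if __name__ == "__main__":
    for code, n in count_drop_codes(SAMPLES).items():
        print(code, n)
```

Both codes come from the ipSec module on inbound X1 traffic, which fits packets arriving from the other side and failing decryption rather than being dropped by an access rule.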
On my first ticket, SonicWall suggested disabling keep-alives on the NSa/TZ side and leaving them enabled on the NSv side - this did not work.
I have also changed the key lifetimes (though the issue is not triggered at the point when the key lifetimes expire anyway).
I have also tried different combinations of DPD settings.
Regardless of the changes above, there is no reason why this should happen: this is one SonicWall appliance talking to another. The tunnels work fine for weeks, sometimes months, and then fail at random (sometimes at 03:00 when all is quiet and only Active Directory traffic is passing with very low bandwidth consumption - sometimes at 15:00 - sometimes at 10:00). There is no common trigger.
As I said, this is on all 4 sites, for different customers, where I have a tunnel interface IKEv2 VPN between an NSv200 in Azure and an on-prem NSa/TZ.
I am currently testing a different change at each of three sites:
One site by changing the VPN to policy based (site to site) with IKEv2.
Another site by changing the tunnel interface (route based) VPN to IKEv1.
Another site by disabling IPsec anti-replay and keeping the same tunnel interface (route based) VPN with IKEv2.
I also have an NSv200 with a VPN to another vendor's firewall (route based/tunnel interface and IKEv2) and absolutely no issue there.
The latest suggestion from SonicWall support, after I asked them to escalate, was to rebuild the VPN and recreate the network objects....
My guess on this one is that the NSv is not activating the route for the tunnel though it is active in the config.