I had a failing unit that needed replacing. I opened a case and let it sit, as it wasn’t that urgent and I wouldn’t be able to replace the HA primary unit until a maintenance period anyway.
As usual, it took about four days for support to get in contact, and the case needed “active troubleshooting” even though it was a clear case of hardware failure. The case was put on “pending closed”, as expected.
So I called and had to be on the phone for about an hour before the tech said the unit would have to be factory defaulted to confirm the issue. I had already done that and wasn’t onsite, but I was told I couldn’t get the RMA until it was confirmed to be faulty during a remote session. I demanded to speak to a manager and he was luckily much better at customer service and we got the RMA process started.
In Europe, RMA deliveries seem to be a mess. Devices are delivered from the UK, which because of Brexit requires customs handling, and it took almost a week to get the appliance to northern Europe. So all in all it took almost a week and a half to get the replacement unit. If this had been the only device protecting the network, a new product from whichever vendor had hardware in stock would’ve been purchased and the SonicWall sold as surplus.
So on to the work of replacing a primary unit in an HA pair. I registered the replacement unit to the same MySonicWall account and, as I expected, the licenses didn’t transfer automatically. So I had to spend FlexSpend credits to get the security services back online. If I had to guess, I will probably lose a month’s worth of credits from the old appliance. I also have to get by without a stateful synchronization license for the time being, as the only channel seems to be to send a message to customer service via a web form and wait for a response.
SonicWall has two KB articles about replacing the primary HA unit:
The articles differ a little bit and both seem to be wrong. After activating HA on the restored primary unit, it goes into a permanent “ELECTION state”, as seen on the serial console, and never recovers on its own. In this state it’s completely inaccessible. The same happens on the secondary appliance. After factory defaulting and restoring both units a couple of times and trying again, I finally got them to see each other and form the HA pair by manually cold restarting them one after the other. It took a lot of trial and error and about two hours to get it working, where it should’ve taken a few minutes according to the article. And of course there was much longer downtime when a unit was either hanging or being defaulted and restored.
The articles look straightforward but give different advice and don’t say what to do when the units get stuck in election state and don’t recover. One of the articles should be corrected and one removed altogether. Why even have two articles when they don’t agree on how to do it?