Hosts need an alternate access method if they are to be recovered. ssh over the internet requires everything to be working normally: network access, a firewall allow rule, a running and correctly configured sshd, and secured key files. If any one of these breaks, you cannot get in. The most reliable out-of-band access does not rely on IP networking in the host working.
If it is a VM, perhaps shut it down, and attach the disk to some other working instance to repair it. Not ideal, but only requires you have access to the disk.
An off-host backup of the data allows creating a new replacement host. May seem silly to destroy and rebuild just because ssh access was lost, but it remains a recovery option.
As to preventing the problem in the first place, a syntax check of sshd configuration (sshd -T -f for OpenSSH) will not catch everything. End-to-end testing could be done by starting another sshd on a different port, with everything the same but the port number. Connect to this remotely to test that things are working. Unfortunately, even taking such care is not going to catch unintentional changes, like a permissions change on home directories that accidentally makes ssh files unsecured, or a change in IP that could alter the effective ssh_config due to Match keywords.
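A rough sketch of those checks as shell functions, not a definitive procedure. The binary path, config path, port 2222, and all function names are assumptions for illustration. The permission check assumes GNU stat, and only looks at the group/other write bits that StrictModes cares about, not at ownership.

```shell
#!/bin/sh
# Sketch only: paths, port and function names are illustrative assumptions.

SSHD_BIN="${SSHD_BIN:-/usr/sbin/sshd}"              # sshd wants to be started by absolute path
SSHD_CONFIG="${SSHD_CONFIG:-/etc/ssh/sshd_config}"
TEST_PORT="${TEST_PORT:-2222}"                      # hypothetical spare port

# Parse the configuration (run as root so host keys are readable).
# A non-zero exit status means the file did not validate.
check_config() {
    "$SSHD_BIN" -T -f "$SSHD_CONFIG" >/dev/null
}

# Run a second daemon in the foreground (-D), logging to stderr (-e).
# -p on the command line overrides Port lines in the file, so everything
# else is shared with the production configuration.
start_test_sshd() {
    "$SSHD_BIN" -D -e -p "$TEST_PORT" -f "$SSHD_CONFIG"
}

# Succeeds when an octal mode such as 775 grants write to group or other.
mode_is_loose() {
    [ $(( 0$1 & 022 )) -ne 0 ]
}

# Report group/other-writable paths under one home directory.
# Returns non-zero if anything loose was found.
check_ssh_perms() {
    home=$1
    status=0
    for path in "$home" "$home/.ssh" "$home/.ssh/authorized_keys"; do
        [ -e "$path" ] || continue
        mode=$(stat -c '%a' "$path")    # GNU stat assumed
        if mode_is_loose "$mode"; then
            echo "too permissive: $path ($mode)"
            status=1
        fi
    done
    return "$status"
}
```

With the test daemon running, connect from another machine with something like ssh -p 2222 user@host (user and host being placeholders) before touching the production sshd. To see what Match blocks do to the server side for a given client, sshd -T also accepts -C with a connection spec such as addr=, user= and host=.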
Passwords remain a terrible authentication mechanism.