Score:1

Access denied error when bringing a Windows failover cluster role back online after long downtime


I have a two-node failover cluster (nodes DB1 and DB2) running SQL Server 2016 (Standard) with Availability Groups. As a cost-saving measure during COVID I turned DB2 off: I switched all the roles to DB1, removed all the databases from the Availability Groups on DB1, and then shut DB2 down.

Three years later I brought DB2 back online, and that seems to have caused DB1 to fail. I managed to bring it back online by evicting it and re-adding it, but some roles are still not working and I can't get them back online. For each of these roles the IP address resource is online but the Network Name resource is in the failed state.

If I try to bring the Network Name resource online, or use the repair function to fix it, I get the error message "The user name or password is incorrect". The cluster events show an error saying that it couldn't locate a writeable domain controller, with the reason given as "The user name or password is incorrect".

I did find an article saying that this requires a hotfix, but that was for Windows Server 2012 and this is Windows Server 2016. Other articles suggested flushing the DNS cache, but that hasn't worked.

I also need to set up another availability group, and that fails to come online with the same error message about finding a writeable domain controller. It fails to create the computer object in Active Directory and the name record in DNS.

If I connect to DB1 and start Failover Cluster Manager it isn't connected to the failover cluster and I get the error message "Access is denied" when I try.

These servers are not domain controllers, and I am logged into them using a domain admin account.

Score:1

It's possible that DB2 lost its trust relationship with the domain while it was offline. This isn't always immediately obvious, particularly if you log on with an account that has previously been used on the machine: if the machine cannot contact a domain controller to validate the credentials, it falls back to its local credential cache.

This might explain the symptoms you're seeing with its inability to write/alter resources such as DNS entries.

I would suggest validating that DB2 still has a trust relationship with the domain. To be honest, given the circumstances you have outlined, the most straightforward action might be to remove DB2 from the domain and go through the domain join procedure again.
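As a rough sketch, the trust relationship can be checked from an elevated PowerShell prompt on DB2 before resorting to a full rejoin. The domain name `example.local` below is a placeholder, not something from your environment, so substitute your own.

```powershell
# Run in an elevated PowerShell session on DB2.

# Returns True if the machine's secure channel to the domain is healthy.
Test-ComputerSecureChannel -Verbose

# If it returns False, try repairing the channel with domain credentials.
Test-ComputerSecureChannel -Repair -Credential (Get-Credential)

# Check that a writable domain controller can be located.
# "example.local" is a placeholder domain name.
nltest /dsgetdc:example.local /writable
```

If the repair does not help, removing the machine from the domain and rejoining it remains the fallback.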

Once this is done, you can try re-adding it to the cluster.

Steve Kaye
I have already removed DB2 from the domain and re-added it as part of my troubleshooting. I can't remove it from the cluster because I wouldn't be able to add it back: it's the only server that can manage the cluster at the moment, since DB1 can't for some reason (access denied when trying to connect).
Steve Kaye
If I reboot DB2, DB1 is able to connect to the cluster using Failover Cluster Manager until DB2 comes back up. If I disconnect and reconnect Failover Cluster Manager while DB2 is running, it won't connect, but if I leave it connected I seem to be able to control things.
Steve Kaye
Sorry to spam you. I removed DB2 from the cluster and the domain, then added it back to the domain. Re-adding it to the cluster from DB1 failed with an error message saying that I am not an administrator on DB2. Adding it from DB2 failed with a message saying that it was already part of a cluster. I called Clear-ClusterNode and tried again from DB1, with the same message about not being an administrator. I then joined DB2 to the cluster from DB2, but I can't start the Network Name resource: it fails with the incorrect password error message. I'm logged in as the same domain admin user on both servers.
OK - the situation is more nuanced then. If I were you I'd be trying to have DB1 host the cluster and clustered roles if possible (including moving the cluster key resources to that node). Reason being, it's stayed online throughout and isn't seemingly the one which is presenting the primary problems. It does sound to me like there is a credential desync going on here somewhere. Is the account name you're using literally "administrator" by any chance? I've seen issues with name collisions between domain/local accounts before (i.e. if the local admin on DB2 is also "administrator")
In other words, it may be worth retrying the actions you've already tried using a different set of domain admin credentials, if you have access to some (or can create some). For what it's worth, everything you've described doing is basically exactly what I'd do in the same scenario, including `Clear-ClusterNode`, which I've found to do the trick in a similar scenario in the recent past.
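For reference, the clean-and-re-add sequence discussed here looks roughly like this in PowerShell. This is only a sketch: `MyCluster` is a placeholder cluster name, and the commands assume an elevated session under the alternative domain admin account.

```powershell
# On DB2: remove any stale cluster configuration left on the node.
Clear-ClusterNode -Force

# On DB1: add DB2 back into the cluster.
# "MyCluster" is a placeholder; use your cluster's name.
Add-ClusterNode -Name DB2 -Cluster MyCluster

# Confirm that both nodes are up and see which resources are still failed.
Get-ClusterNode -Cluster MyCluster
Get-ClusterResource -Cluster MyCluster
```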
Steve Kaye
It wasn't "administrator", because these are Azure VMs and Microsoft doesn't allow that, but it was the same username as the local server admins. I gave my account the relevant access rights, waited a few hours to make sure that they had taken effect, and tried the same steps. It went the same way: I was told that I'm not an admin on DB2 when I did it from DB1, and it worked from DB2, but the Network Name resource won't repair or come online and gives the incorrect user name or password error.
Have you checked that DB2 is listed, with Write privileges, on the Security tab of the network name resource, both in DNS and in Active Directory?
Steve Kaye
DB1 and DB2 aren't on there but the failover cluster object is. We have another failover cluster on the same network that is working and I've compared the access rights and they seem to be the same.