We have an existing two-node Windows Server 2012 cluster with a file share witness, and we are replacing both nodes with Windows Server 2019 servers. So that we can reuse the existing cluster object, we're adding the 2019 servers to the cluster and will drop the 2012 ones once the clustered databases sync over. The validation test reveals only a few warnings: the "only one pair of network interfaces" warning on Validate Network Communication; the "The password does not meet the password policy requirements" warning on Validate CSV Settings; the new servers not being in the same OU as the old servers (they're grouped by server OS version, 2012 vs. 2019); "The cluster property "ClusterLogSize" is set with a value less than the default value of 1536"; and the obvious "different, but compatible, operating system versions" warning. So there is nothing that should prevent us from adding the nodes to the cluster.
The Add Node process fails at the step "Waiting for notification that node...is a fully functional member of the cluster." The System event log shows errors 1653 and 5398, which point to a communication issue. The Cluster service communicates over port 3343, so troubleshooting has focused on that. I turned the firewall and AV off completely, but the issue persisted.
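For anyone wanting to repeat the basic reachability check, a rough sketch of a TCP probe against port 3343 is below. The function name `tcp_port_open` is made up for illustration, and the check only covers TCP; it says nothing about whether UDP 3343 traffic gets through.

```python
import socket


def tcp_port_open(host: str, port: int = 3343, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout.

    Hypothetical helper for illustration only. A True result means the
    TCP handshake succeeded; it does not prove the Cluster service is
    healthy, and it does not test UDP at all.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from a new 2019 server, `tcp_port_open("OLDNODE1")` (the node name is a placeholder) would return True if the existing node accepts the connection on TCP 3343.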
The only difference I'm noticing between the new and old servers is that when the Cluster service is started on the new server during the Add Node Wizard, the server isn't listening on UDP 3343. I saw this by running TCPView on both the 2012 servers currently in the cluster and the 2019 servers I'm trying to add. The 2012 servers show listeners on 3343 for both TCP and UDP, but the 2019 servers only show listening and sending over TCP 3343. There are 4 Time Wait connections over TCP 3343 to the two 2012 servers, but these eventually time out, causing the Add Node process to fail with a timeout error. I'm not sure whether not listening on UDP 3343 at this point is normal behavior before a node is fully added, or whether it indicates the service isn't correctly listening on both TCP and UDP 3343 on the new servers.
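The same TCP vs. UDP comparison can be made without TCPView by parsing `netstat -ano` output on each server. The sketch below assumes the English-language Windows `netstat` column layout (Proto, Local Address, Foreign Address, State, PID); the function names `read_netstat` and `listeners_on_port` are made up for illustration.

```python
import subprocess


def read_netstat() -> str:
    """Return the raw text of `netstat -ano` (run this on the server itself)."""
    result = subprocess.run(
        ["netstat", "-ano"], capture_output=True, text=True, check=True
    )
    return result.stdout


def listeners_on_port(netstat_text: str, port: int = 3343) -> dict:
    """Report whether something is bound to the port for TCP and for UDP.

    Assumes English Windows netstat output. TCP counts only if the local
    port matches and the state is LISTENING; UDP has no state column, so
    any UDP row bound to the local port counts.
    """
    found = {"TCP": False, "UDP": False}
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] not in found:
            continue  # skip headers and blank lines
        proto, local_address = fields[0], fields[1]
        # rsplit handles IPv6 forms such as [::]:3343 as well as IPv4.
        local_port = local_address.rsplit(":", 1)[-1]
        if local_port != str(port):
            continue
        if proto == "UDP":
            found["UDP"] = True
        elif len(fields) >= 4 and fields[3] == "LISTENING":
            found["TCP"] = True
    return found
```

Calling `listeners_on_port(read_netstat())` on a 2012 node and on a 2019 node while the wizard is running would show the difference described above as `{"TCP": True, "UDP": True}` versus `{"TCP": True, "UDP": False}`.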
Nothing seems to be actively blocking the communication; it just seems like the new servers aren't properly listening for communication back from the other nodes. Or am I just barking up the wrong tree?