This Windows Server VM runs on an ESXi 7.0 Update 3 host. It's a DB server, hence quite critical in production.
Some points that I need to make:
The event occurs randomly (no specific days or times), about 2-3 times per week. The machine shuts down and does not boot back up on its own. Someone has to power it on manually.
Known errors are logged by Windows during start-up, but apparently have nothing to do with the cause of the shutdown itself. These are errors 307 and 304.
Moving on to what the host (ESXi) logs, I quote below what appears repeatedly in the VMware logs:
2023-01-18T19:52:14.697Z In(05) vmx - TOOLS autoupgrade protocol version 0
2023-01-18T19:52:14.735Z In(05) vmx - ToolsGetAppInfoEnabledFromConfigStore: Returning the cached value: '1'.
2023-01-18T19:52:14.758Z In(05) vmx - GuestRpc: Got error for channel 0 connection 4: Remote disconnected
2023-01-18T19:52:14.758Z In(05) vmx - GuestRpc: GuestRpcResetVsockChannel: channel 0
2023-01-18T19:52:14.758Z In(05) vmx - GuestRpc: Closing channel 0 connection 4
2023-01-18T19:52:14.758Z In(05) vcpu-1 - GuestRpc: Reinitializing Channel 0(toolbox)
2023-01-18T19:52:14.758Z In(05) vcpu-1 - Tools: [AppStatus] Last heartbeat value 111591 (last received 0s ago)
2023-01-18T19:52:14.758Z In(05) vcpu-1 - TOOLS: appName=toolbox, oldStatus=1, status=0, guestInitiated=1.
2023-01-18T19:52:23.626Z In(05) vcpu-0 - CDROM: Unknown command 0x35.
2023-01-18T19:52:23.626Z In(05) vcpu-0 - CDROM sata0:0: CMD 0x35 (SYNC CACHE) FAILED (key 0x5 asc 0x20 ascq 0)
2023-01-18T19:52:23.980Z In(05) vcpu-1 - E1000: e1000e-- tx queue 1 is enabled.
2023-01-18T19:52:23.981Z In(05) vcpu-2 - CDROM-IMG: Ignoring a Unit Start or Stop
2023-01-18T19:52:26.045Z In(05) vcpu-3 - PCIXHCI: Interrupt type changed from MSIX to INTX
2023-01-18T19:52:26.724Z In(05) vcpu-0 - PIIX4: PM Soft Off. Good-bye.
2023-01-18T19:52:26.724Z In(05) vcpu-0 - Chipset: The guest has requested that the virtual machine be powered off.
2023-01-18T19:52:26.724Z No(00) vcpu-0 - ConfigDB: Setting softPowerOff = "TRUE"
2023-01-18T19:52:26.728Z In(05) vcpu-0 - VMX: Issuing power-off request...
2023-01-18T19:52:26.728Z In(05) vmx - Stopping VCPU threads...
2023-01-18T19:52:26.728Z In(05) vcpu-0 - VMMon_WaitForExit: vcpu-0: worldID=1638402
2023-01-18T19:52:26.728Z In(05) vcpu-2 - VMMon_WaitForExit: vcpu-2: worldID=1638407
2023-01-18T19:52:26.728Z In(05) vcpu-1 - VMMon_WaitForExit: vcpu-1: worldID=1638406
2023-01-18T19:52:26.728Z In(05) vcpu-3 - VMMon_WaitForExit: vcpu-3: worldID=1638408
This line leads us to suspect a Windows-related problem:
2023-01-18T19:52:26.724Z In(05) vcpu-0 - Chipset: The guest has requested that the virtual machine be powered off
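To check whether every shutdown shows this same guest-initiated message, the log can be scanned for it. Below is a minimal Python sketch, assuming the vmware.log line layout quoted above (timestamp, level, thread, " - ", message). The function name find_guest_power_offs and the sample list are only illustrative and not part of any VMware tooling.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt above:
# <ISO timestamp>Z <level> <thread> - <message>
LINE_RE = re.compile(r"^(\S+Z)\s+\S+\s+(\S+)\s+-\s+(.*)$")
POWER_OFF_MARKER = "The guest has requested that the virtual machine be powered off"


def find_guest_power_offs(lines):
    """Return the timestamps (UTC) of guest-initiated power-off messages."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and POWER_OFF_MARKER in match.group(3):
            # The trailing 'Z' in the log means the time is in UTC.
            hits.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%fZ"))
    return hits


# Two lines copied from the excerpt above; only the second one should match.
sample = [
    "2023-01-18T19:52:26.724Z In(05) vcpu-0 - PIIX4: PM Soft Off. Good-bye.",
    "2023-01-18T19:52:26.724Z In(05) vcpu-0 - Chipset: The guest has requested that the virtual machine be powered off.",
]
for ts in find_guest_power_offs(sample):
    print(ts.isoformat())
```

The same function accepts an open file object, so it can be run over a whole vmware.log. The resulting timestamps are in UTC and can then be compared against the Windows event log around the same moments.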
The licensing status is activated. Moreover, we have checked that the power management settings are OK (the machine doesn't enter sleep mode).
Could some update have caused this?
Any ideas on what to look for or how to troubleshoot this, please?