I have an HP DL380 Gen10 server running Windows Server 2016. In the last few days, HP's Agentless Management Service has started initiating system shutdowns with the following message:
Description of Event ID 521, IML Class Code 2, Event Code 19: System
Overheating (Temperature Sensor 30, Location I/O Board, Temperature
127) Check fans, processor heat sink and air baffles installation.
At the same time, the system fans run at maximum speed for 5-10 seconds, very noisily, until shutdown. These overheating events also happen while I am in the BIOS, so I don't believe the cause is related to the operating system or its load. While in the BIOS, the fans max out every five minutes or so, run at maximum for about 10 seconds, and then return to their normal level. The system does not shut down while in the BIOS. When I monitor all temperatures via iLO, I see no spike in the readings during these events:
30-PCI 1 I/O Board 13 13 OK 40C Caution: 100C; Critical: N/A
31-PCI 1 Zone I/O Board 13 13 OK 35C Caution: 75C; Critical: 80C
iLO does detect that something is wrong: the health icon briefly turns red when the fans start maxing out, yet these readouts do not change. The readings of all other system temperature sensors do not change substantially either. I suspect a faulty sensor 30 that briefly reads its maximum value (127), triggering emergency cooling and shutdown. Is there a way to monitor these temperatures at a higher frequency, close to real time?
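
One approach I am considering is polling iLO's Redfish interface from another machine and logging only the readings that change. Below is a minimal sketch, not a tested solution. It assumes iLO 5 exposes the standard Redfish thermal resource at /redfish/v1/Chassis/1/Thermal, that basic authentication is enabled, and that sensor names match the iLO web page (for example "30-PCI 1"). The host name, account, password, and one-second interval are placeholders I made up.

```python
import base64
import json
import ssl
import time
import urllib.request

# Placeholders (assumptions, not real values): adjust for your iLO.
ILO_HOST = "ilo.example.local"
ILO_USER = "monitor"
ILO_PASS = "changeme"
THERMAL_URL = f"https://{ILO_HOST}/redfish/v1/Chassis/1/Thermal"


def extract_readings(thermal):
    """Map sensor name -> ReadingCelsius from a Redfish Thermal payload."""
    readings = {}
    for sensor in thermal.get("Temperatures", []):
        name = sensor.get("Name")
        value = sensor.get("ReadingCelsius")
        # Skip sensors that report no name or no reading (e.g. absent devices).
        if name is not None and value is not None:
            readings[name] = value
    return readings


def fetch_thermal(url=THERMAL_URL, user=ILO_USER, password=ILO_PASS):
    """GET the Thermal resource using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    # iLO typically uses a self-signed certificate, so verification is
    # disabled here. Only do this on a trusted management network.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=5) as resp:
        return json.load(resp)


def changed(previous, current):
    """Return the sensors whose reading differs from the previous poll."""
    return {n: v for n, v in current.items() if previous.get(n) != v}


def main(interval_seconds=1.0):
    """Poll forever, printing a timestamped line for each changed reading."""
    previous = {}
    while True:
        current = extract_readings(fetch_thermal())
        for name, value in changed(previous, current).items():
            print(f"{time.strftime('%H:%M:%S')} {name}: {value} C")
        previous = current
        time.sleep(interval_seconds)

# To start polling, call main() after setting the placeholders above.
```

A caveat: iLO refreshes its sensor values on its own schedule, so even fast polling may miss a spike that lasts less than one refresh cycle. If sensor 30 never shows 127 in this log while the IML still records the event, that would at least be consistent with a very brief glitch in the sensor reading.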