Score:0

Random kernel panics on Ubuntu Server 22.04.2 LTS

Last year I built a custom Ubuntu Server that hosts multiple Docker containers and an 8TB ZFS RAID setup. Since March, however, I have been getting seemingly random kernel panics at least once a day. So far I have tried reinstalling the OS, disabling certain containers, booting the kernel from before the update, running MemTest (which passed), and checking the hardware, but none of this has helped.

I have also taken pictures of the kernel panics, but I am not sure how to interpret them. The message is always the same: "end kernel panic - not syncing: Fatal exception in interrupt". I currently suspect an issue with the CPU, but I am open to other suggestions. Can anyone offer some insight into what might be causing these kernel panics? Thank you for your help!

OS:

  • Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-60-generic x86_64)

Hardware:

  • ASRock B550M Steel Legend AMD B550 So.AM4 Dual Channel DDR mATX Retail
  • AMD Ryzen 5 4600G 6x 3.70GHz So.AM4 BOX
  • 300 Watt be quiet! SFX Power 3
  • 16GB (1x 16GB) G.Skill Aegis DDR4-2666 DIMM CL19-19-19-43 Single
  • 2x 8TB Seagate Barracuda Compute ST8000DM004
Marco:
The RAM is not on the board's supported memory list for this CPU. Maybe a memtest shows something?
user535733:
Run the LiveUSB installer's "Try Ubuntu" environment for a day or two. If you still get random kernel panics, then it's hardware-caused. If the panics go away, then it's software-caused.
Terrance:
I think your suspicion of the CPU may be correct. Every panic is on the same CPU core, core 3. One thing to check is the heatsink compound: replace it and make sure it adequately covers the top of the CPU. Pull the CPU out and examine it for damaged pins or compound on any of the pins, which could keep a pin from connecting properly to the socket. I would also check the power supply. There are power supply testers that you can plug in to check for bad power rails. One other thought: check the capacitors on the motherboard for any that look like they might burst.
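If you want to double-check the "same core every time" observation rather than eyeball the photos, here is a minimal Python sketch. It assumes you type each panic screen into a string and that the trace contains the usual `CPU: <n> PID: <n>` header line the kernel prints. The sample traces and the helper name `count_panic_cpus` are made up for illustration; they are not taken from the original screenshots.

```python
import re
from collections import Counter

# Matches the "CPU: <n> PID: <n>" header line found in kernel oops/panic traces.
CPU_LINE = re.compile(r"\bCPU:\s*(\d+)\s+PID:\s*\d+")


def count_panic_cpus(traces):
    """Return a Counter mapping CPU number -> number of traces that name it."""
    counts = Counter()
    for text in traces:
        match = CPU_LINE.search(text)
        if match:  # traces without a CPU header line are skipped
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical transcriptions of panic screens (illustrative only).
sample_traces = [
    "CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.15.0-60-generic\n"
    "Kernel panic - not syncing: Fatal exception in interrupt",
    "CPU: 3 PID: 1412 Comm: dockerd Not tainted 5.15.0-60-generic\n"
    "Kernel panic - not syncing: Fatal exception in interrupt",
    "CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-60-generic\n"
    "Kernel panic - not syncing: Fatal exception in interrupt",
    "Kernel panic - not syncing: Fatal exception in interrupt",  # header cut off
]

cpu_counts = count_panic_cpus(sample_traces)
for cpu, hits in cpu_counts.most_common():
    print(f"CPU {cpu}: {hits} panic(s)")
```

If nearly all real traces name the same core, that points toward the CPU (or its power delivery and cooling) rather than RAM or software, though it is not proof on its own.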
Kryptolyser:
Alright guys, thanks for the advice! I'll try a few things and report back.
Hannu:
Verify whether the 300W PSU is enough for two 8TB disks plus the rest of the system. You might need one with a higher wattage, at least during startup.
Score:0

After ordering a replacement of the same CPU and installing it, I have not experienced any further kernel panics, which indicates that the issue was most likely the CPU. Thank you all for the ideas and support!
