Score:0

How to find the cause of a server kernel panic?


Our high-load server is stuck in a kernel panic. How can we identify the cause and correct it?

It is a Dell PowerEdge R940 with 256 GB of RAM and a 20 TB disk array, with the root filesystem on an SSD. It runs CentOS 7. When I try to reboot, all I get are the following two kernel panic messages: kernel panic 1, kernel panic 2

I would appreciate any help.

A.B
"Unable to mount root fs": now figure out where your disk is, or which disk was configured to be used, etc.
us flag
The root fs is on the SSD.
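As a rough illustration of the check suggested in the comment above: one common reason for "Unable to mount root fs" is that the `root=` parameter on the kernel command line names a device the kernel cannot find. The sketch below is only that, a sketch: it assumes you have booted a rescue environment and copied down the kernel command line from the boot loader entry and the device list from `blkid`. The function names, the dictionary keys, and all sample values are hypothetical and do not come from the original post.

```python
import shlex


def root_spec(cmdline):
    """Return the value of the root= parameter from a kernel command line, or None."""
    for token in shlex.split(cmdline):
        if token.startswith("root="):
            return token[len("root="):]
    return None


def root_is_present(spec, devices):
    """Check whether a root= value matches any device visible to the rescue system.

    devices is a list of dicts with the hypothetical keys 'path', 'uuid'
    and 'label', for example transcribed by hand from blkid output.
    """
    if spec is None:
        return False
    if spec.startswith("UUID="):
        return any(d.get("uuid") == spec[len("UUID="):] for d in devices)
    if spec.startswith("LABEL="):
        return any(d.get("label") == spec[len("LABEL="):] for d in devices)
    # Otherwise treat the value as a device path such as /dev/sda2.
    return any(d.get("path") == spec for d in devices)


# Made-up sample values, for illustration only.
cmdline = "BOOT_IMAGE=/vmlinuz-3.10.0 root=UUID=1234-abcd ro crashkernel=auto quiet"
devices = [{"path": "/dev/sda2", "uuid": "9999-ffff", "label": "root"}]

spec = root_spec(cmdline)
print("root= value:", spec)
print("matching device found:", root_is_present(spec, devices))
```

If no device matches, the boot loader entry is a plausible place to look first. If a device does match, the problem may lie elsewhere, for example in an initramfs that lacks the storage driver for that disk.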
Score:0
mx flag

Legally obligatory message: I work for Dell.

Is it possible? Yes. Is it worth it? If you have a Linux nerd on hand… maybe. Sometimes these things are easy kills, sometimes they are rat holes.

Operationally, if only the drive with the OS is affected and the RAID is fine, just reinstall the OS. It is unlikely to be worth spending a lot of time troubleshooting a kernel panic. Moreover, it’s worth getting off CentOS 7, which is long dead as far as effective updates go. Red Hat is already on RHEL 9.1. CentOS 7 is nearly a decade old at this point.

us flag
Yes, but I need it to stay compatible with a Lustre system we installed in the cluster 5 years ago. OTOH, the PowerEdge R940 has two M.2 ports; should I try fitting a second M.2 drive and installing a second operating system on it?
Grant Curell
Ah yes, HPC and its never-ending quest to go un-updated. My overly forthright answer: if you're desperate enough to be posting to Server Fault, then yes, whichever direction you go will require finding a way to reinstall the OS. It's unlikely you'll be able to troubleshoot a kernel panic over a forum. That said, I would also strongly recommend, even if it's an HPC system, that you consider paying the technical debt sooner rather than later: CentOS 7 goes completely EOL in a year and it's already effectively dead. The only way it makes sense to me to stay on CentOS 7 is if you are *extremely* invested in very specific optimization for the given configuration. From a maintenance perspective, though, it isn't going to be fun if you have any package dependencies, particularly given that many HPC workloads depend on libraries that require newer kernels.