Score:0

My Nvidia drivers died suddenly after a reboot

After I rebooted my PC, Ubuntu was suddenly stuck at a very low resolution, and I can't play any games. During boot, Linux displayed error messages mentioning Nvidia. What's going on?

Score:1

I experienced this problem once and was very confused but fixed it temporarily. A couple of weeks later it happened again and I looked into how to fix the problem permanently. In my case the fix (both temporary and permanent) was very simple. These instructions were written for Ubuntu 22.04 but may apply to other versions.

Which specific problem do these instructions fix?

Ubuntu will sometimes update your kernel (the core of the operating system) with a new patch. You might not even notice, because the kernel update can be buried in a long list of other updates that you approve with "Ok" without suspecting that it might completely break your system. The problem is that Nvidia drivers (kernel modules) are built for a specific kernel version. When the kernel updates, they need to be installed again for the new version.

How can I tell if this is specifically what's causing my problem?

The fixes I'm proposing are simple and non-destructive so you could just give them a go anyway. But here's how I figured it out originally. Check out the contents of the folder /lib/modules. The first time this happened to me, there were two folders in there: 5.15.0-43-generic and 5.15.0-56-generic, corresponding to two different kernel versions - the old one (43) and the new one (56) which had just been installed. Look around in the folder for the older version, specifically in the kernel subfolder (so /lib/modules/5.15.0-43-generic/kernel in my case). You should see some files with nvidia in the name (see my linked question for an example). Now check the latest version folder. If you can't find any nvidia stuff, there's a good chance that this is what happened to you.
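The folder check described above can be sketched as a small shell function. This is a rough sketch under stated assumptions, not a definitive diagnostic: the function name list_nvidia_modules is hypothetical, and it assumes the main proprietary module is a file named nvidia.ko (possibly with a compression suffix) somewhere under each kernel's folder.

```shell
#!/bin/sh
# Hypothetical helper: for every kernel folder under a modules root
# (default: /lib/modules), report whether a file named nvidia.ko*
# exists anywhere inside it.
list_nvidia_modules() {
    root="${1:-/lib/modules}"
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        version=$(basename "$dir")
        # Assumption: the proprietary driver's main module is nvidia.ko,
        # optionally compressed (for example nvidia.ko.zst).
        if [ -n "$(find "$dir" -type f -name 'nvidia.ko*' 2>/dev/null | head -n 1)" ]; then
            echo "$version: present"
        else
            echo "$version: missing"
        fi
    done
}

# Check the real modules folder and show which kernel is running now.
list_nvidia_modules
echo "running kernel: $(uname -r)"
```

If the running kernel shows "missing" while an older kernel shows "present", that matches the situation described in this answer.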

How can I fix the problem temporarily?

Re-install your Nvidia drivers. You can sometimes do this on the command line by running sudo ubuntu-drivers install, but it often doesn't work (it will tell you that you already have them and won't reinstall). Alternatively, you can open the "Additional drivers" program, which looks like this:

[Screenshot: the "Additional drivers" program]

Unselect the currently selected driver and then reselect it. Hit "Apply Changes". After a reboot, your graphics should work again.

Note that at the time of writing (January 2023) there is (and has been since 2020!) an extremely embarrassing parsing bug in the ubuntu-drivers program. The bug renders the program useless the moment it encounters any Nvidia driver whose name ends with -open, -server, or anything else that isn't a number. There are several of those, including the recommended "tested" one, as you can see in the screenshot above. One symptom of the bug is that the "Additional drivers" program may become inoperable: all your drivers will be greyed out and the window will say something like "This computer is using a proprietary driver" (I can't remember the exact message). I worked around the bug by modifying the ubuntu-drivers program itself (it is written in Python). It also means you should install only the proprietary versions of the drivers whose names end with a number, not the -open variants. As you can see in the screenshot, that's what I did. If you think this bug is preventing you from getting past this step and you can't figure out how to work around it, please open a new question.
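The "names ending with a number" rule can be sketched as a filter over a list of package names. This is only an illustration: the function name numeric_nvidia_drivers is hypothetical, and the nvidia-driver-NNN naming pattern is assumed from the driver names mentioned in this answer.

```shell
#!/bin/sh
# Hypothetical helper: reads package names on stdin (one per line) and
# keeps only those of the form nvidia-driver-<number>, dropping variants
# such as -open or -server.
numeric_nvidia_drivers() {
    grep -E '^nvidia-driver-[0-9]+$' || true
}

# Possible usage on a real system (assumes apt is available):
#   apt-cache search --names-only '^nvidia-driver-' | cut -d' ' -f1 | numeric_nvidia_drivers
```

The filter only narrows down the candidates. Which driver number suits your card is still something to check in "Additional drivers".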

How can I fix the problem permanently?

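DKMS (Dynamic Kernel Module Support) is a Linux framework whose job is to solve exactly this problem: it rebuilds out-of-tree kernel modules, such as drivers, whenever a new kernel is installed, so they keep working after a kernel update. I have found that all that's necessary to fix this issue is to install dkms: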

sudo apt install dkms

and then re-install your Nvidia drivers following the instructions in the "temporary fix" section, above. The Nvidia driver installation will see DKMS support and automatically take advantage of it. You may see some information online suggesting that after you do this, running dkms status should print information about your Nvidia drivers. In my case, running dkms status returns no output, yet this fix still appears to have worked.
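For anyone who wants to script the dkms status check, here is a minimal sketch. The function name dkms_has_nvidia is hypothetical, and the sketch assumes that a registered driver appears as a line beginning with "nvidia". As noted above, an empty result is not conclusive on its own.

```shell
#!/bin/sh
# Hypothetical helper: reads `dkms status` output on stdin and reports
# whether any line starts with "nvidia" (case-insensitive).
dkms_has_nvidia() {
    if grep -qi '^nvidia'; then
        echo "nvidia is registered with DKMS"
    else
        echo "no nvidia entry in DKMS"
    fi
}

# Possible usage on a real system (requires the dkms package):
#   dkms status | dkms_has_nvidia
```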

Just to explain why I believe this worked: in my older linked question I mentioned that this problem knocked two of my drivers offline, the Nvidia one and a Wi-fi USB adapter one, killing my internet connection. After the second time this happened and I applied my "permanent fix" above, I deliberately did not enable DKMS for the Wi-fi driver. Just today a kernel patch happened, and sure enough after a reboot, my Wi-fi was gone - but my graphics are fine. So it seems DKMS did indeed do its job.


Update: 9th Feb 2023, a kernel update knocked my Nvidia drivers offline again. The problem may be related to my wi-fi drivers since when I tried to fix the problem by running sudo ubuntu-drivers install, I got hit with network errors (it couldn't download what it needed), since of course my wi-fi had been knocked out by the update as well. I fixed my wi-fi and then ran ubuntu-drivers as normal and it fixed the graphics. Will continue to monitor.

Update: 17th Feb 2023, 5.19.0-32 knocked my wi-fi offline but not my graphics. I had made no configuration changes since the 9th.

Update: 4th March 2023, 5.19.0-35 killed both wi-fi and graphics. I had made no configuration changes since the 17th of February. Also worth noting: when I manually rebooted afterwards, I got the "(Ctrl+C) Cancel all Filesystem checks in progress" prompt during boot, which I thought only appeared after an improper shutdown.

I thought that doing my fix with the "Additional drivers" program had worked, but it did not. My resolution was fine, but when I went to play a game it didn't launch, and the Ubuntu "About" tab in the settings listed the graphics as "llvmpipe". I tried again to reinstall my drivers as before but no luck. So I ended up trying the proprietary driver, nvidia-driver-525, since at some point in the past I must have switched to nvidia-driver-525-open. This worked, after a reboot. "About" tab now lists my graphics card's name for "Graphics". So it seems like the -open variant just doesn't work?
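The "llvmpipe" symptom can also be checked from a terminal. The sketch below assumes glxinfo is installed (it ships in the mesa-utils package) and uses a hypothetical function name, check_renderer. It only looks for the string llvmpipe and makes no claim about which driver is active otherwise.

```shell
#!/bin/sh
# Hypothetical helper: takes a renderer description string and reports
# whether it mentions llvmpipe, Mesa's software renderer.
check_renderer() {
    case "$1" in
        *llvmpipe*) echo "software rendering (llvmpipe)" ;;
        *)          echo "renderer is not llvmpipe" ;;
    esac
}

# Possible usage on a real system (assumes glxinfo from mesa-utils):
#   check_renderer "$(glxinfo -B | grep 'OpenGL renderer string')"
```

A software-rendering result is consistent with the case above, where the resolution looked fine but games would not launch.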
