Score:1

Trouble removing NVIDIA drivers

I recently installed graphics drivers for an NVIDIA GeForce RTX 2070. After that, Ubuntu (version 21.04) wouldn't boot anymore. (Lots of people seem to have this issue, see e.g. [1], [2], [3].) I installed the driver via the built-in app (I forgot the name and can't look since I can't boot; I think "Something & Packages"). I then managed to remove the drivers by running some variant of `sudo apt purge nvidia-.*` in the root shell prompt accessed via recovery mode, which allowed me to boot again.

Next, I tried installing drivers by running the file downloaded here from the terminal. I figured that, if it didn't work, I could remove the drivers again with the above command. However, this time, the same command returned a message saying that no installed packages start with 'nvidia'. As a result, I'm now unable to boot Ubuntu.

I've also tried

  • Navigating to the file (also in the recovery mode root shell) and launching it with the `--uninstall` option. It threw some error messages during the process but told me the drivers were successfully removed. However, I still can't boot.
  • Reinstalling and uninstalling with the file.
  • Navigating into `/etc/` and removing all files with xorg in their name. (I have no idea what those files are, but it was among the suggested fixes I found.)

Still can't boot. Any ideas other than the three things above or reinstalling Ubuntu?

Aside from being unable to boot, I still really need to get the drivers going. I only have Ubuntu to run ML stuff, which also requires a functioning GPU. Is there anything better than trying another of the suggested drivers and hoping for a different result?

EDIT: I believe this is (also) a hardware issue; the behavior changed after I changed BIOS settings (the action, only in the opposite direction, is described here.)

EDIT2: I've been told I need another power supply, I'll try that next.

oldfred avatar
With Ubuntu, you should never install the .run file directly from NVIDIA: in effect, you have to reinstall it with every kernel update. The correct version from the Ubuntu repository should work, though. Can you boot an older kernel from the GRUB recovery menu? Then uninstall the .run NVIDIA driver: https://askubuntu.com/questions/219942/how-to-uninstall-manually-installed-nvidia-drivers
silver avatar
I'll try booting with an older version. However, as for uninstalling the .run driver, the site you linked suggests the command `sudo ./NVIDIA-Linux-x86-310.19.run --uninstall` which I've already tried (first item in the list)
silver avatar
Booting with an older version has worked like a charm. (That is, if I understand correctly that it just means choosing the third item from [this list](https://i.ibb.co/hVxG55m/mde.jpg).) Thanks for that -- but it doesn't solve my main problem; the nvidia drivers are still there in the newer version.
cc flag
Lots of NVIDIA packages don't have a name starting with "nvidia-". Look at the output of `dpkg -l | grep nvidia` and clean out any leftovers. You should always be able to boot in recovery mode, using the nouveau driver, unless some leftover config item in /etc/modprobe.d has blacklisted nouveau. Once the system is clean, install the 460 or 465 driver from the standard repos, and that should work.
silver avatar
I'll try that (but I'm just about to go to bed, so I'll only report back in a few hours). Could you give me an ELI5 (explain like I'm five) version of how to install the 460 driver from the standard repos?
oldfred avatar
https://ubuntuforums.org/showthread.php?t=2383560&p=13735336#post13735336 You can add a PPA, but you should not need to anymore; Ubuntu maintains the current versions. You may need a PPA only for an extremely new NVIDIA card/chip. Examples of adding a PPA: https://askubuntu.com/questions/1026179/how-to-install-a-gtx-1060 & https://askubuntu.com/questions/61396/how-do-i-install-the-nvidia-drivers
Score:2

List all NVIDIA-related packages:

dpkg -l | grep nvidia

Purge every NVIDIA-related package you see in the list.

Once that has completed successfully, run:

sudo ubuntu-drivers autoinstall
sudo prime-select nvidia

Reboot, and your system should now work fine.
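
The purge step above can be sketched as a small helper. This is only an illustration, not part of the original answer: the function name `nvidia_pkgs` is made up here, and it assumes the usual `dpkg -l` column layout (status in the first column, package name in the second).

```shell
# Print the names of packages whose name mentions "nvidia".
# Reads `dpkg -l`-style output on stdin, so it can be tried out
# without touching the system.
# "ii" = installed, "rc" = removed but config files remain.
nvidia_pkgs() {
    awk '$1 ~ /^(ii|rc)$/ && $2 ~ /nvidia/ { print $2 }'
}

# Review the list first:
#   dpkg -l | nvidia_pkgs
# Then purge what it printed (apt will still ask for confirmation):
#   sudo apt-get purge $(dpkg -l | nvidia_pkgs)
```

Matching on the package name rather than the whole line avoids picking up unrelated packages that merely mention NVIDIA in their description.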

silver avatar
This allowed me to boot again, but didn't successfully install the driver; if I plug a monitor into the card, it's still not recognized. Running torch.cuda.is_available() now returns [this new error](https://i.ibb.co/hWWgmFC/error.png).
Utkarsh Chandra Srivastava avatar
Are you able to run nvidia-smi, and can you post its output here? Have you installed a CUDA toolkit version that torch supports (11.1 / 10.2)? Can you also post the output of nvcc -V?
Utkarsh Chandra Srivastava avatar
Note that after you install your driver, you have to go into your BIOS, make sure Secure Boot is disabled, and change "Primary Display" to GPU.
silver avatar
It turns out I needed a new Power Supply, and once I installed that, not only did the monitor connected to the card work, but the drivers I installed using your commands seem to already do the job. This basically means that your reply has solved all of the software-related issues, even though I didn't know that at the time, so I've marked it as the accepted answer now. Thanks a bunch!
silver avatar
(I did have to do the BIOS thing, but have already done so before your comment once I suspected a hardware issue. The fact that the monitor connected to the card remained completely dark rather than working at a crappy resolution should have probably given it away sooner.)
Score:2

I had a similar problem. The exact steps might be different, but you should get an idea from what I did.

  1. Uninstall the NVIDIA drivers as mentioned above.

  2. Reboot and, instead of logging in to the UI, log in to the console.

  3. Make sure the NVIDIA modules are not loaded. You can do that by running:

sudo lsmod | grep nvidia

This will list any NVIDIA modules loaded by the kernel. If modules are listed, your NVIDIA uninstall was not clean and you will need to remove the kernel modules manually:

sudo rmmod <name of nvidia modules>

  4. Reinstall the nouveau X.Org driver:

sudo apt-get install --reinstall xserver-xorg-video-nouveau

This will set your system to use nouveau.

  5. Reboot and connect your monitor. It should be detected now.

  6. Use "Additional Drivers" from "Show Applications" to install the NVIDIA drivers. (I think the latest right now is "nvidia driver metapackage from nvidia-driver-470".) Then reboot.

  7. This assumes you don't have Secure Boot enabled. If Secure Boot is enabled, you need to ensure that the NVIDIA kernel modules are signed and loaded.

  8. Run sudo lsmod | grep nvidia and nvidia-smi to check whether the NVIDIA drivers are loaded.

  9. Now run torch.cuda.is_available() and see whether it is able to use the GPU.
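
The module checks in these steps can be wrapped in two small helpers. This is a rough sketch, not something from the original answer: the function names are invented for illustration, and the default directory `/etc/modprobe.d` is an assumption about where a leftover nouveau blacklist would live.

```shell
# Print the names of loaded kernel modules that start with "nvidia".
# Reads `lsmod`-style output on stdin; the first line is the column header.
loaded_nvidia_modules() {
    awk 'NR > 1 && $1 ~ /^nvidia/ { print $1 }'
}

# Succeed if any *.conf file in the given directory blacklists nouveau.
# The directory defaults to /etc/modprobe.d (an assumption, see above).
nouveau_blacklisted() {
    grep -Eqs '^[[:space:]]*blacklist[[:space:]]+nouveau' "${1:-/etc/modprobe.d}"/*.conf
}

# Typical use from the console:
#   lsmod | loaded_nvidia_modules      # empty output: no NVIDIA module loaded
#   nouveau_blacklisted && echo "nouveau is blacklisted"
```

If `nouveau_blacklisted` succeeds after the NVIDIA driver has been removed, a leftover config file is probably what keeps the system from falling back to nouveau.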
