Score:0

Can't use nvidia-smi with the NVIDIA 495 driver on Ubuntu 20.04 LTS

I encountered a problem while trying to install my GPU drivers.

I installed the NVIDIA 495 driver, which was the recommended driver for Ubuntu.

Somehow, nvidia-smi can't communicate with the driver I installed, yet `dkms status` reports it as installed:

user@server:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

user@server:~$ dkms status
nvidia, 495.29.05, 5.11.0-41-generic, x86_64: installed

user@server:~$ nvidia-debugdump -l
Error: nvmlInit(): Driver Not Loaded

user@server:~$ lspci -v | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2060 Rev. A] (rev a1) (prog-if 00 [VGA controller])

I also get this:

user@server:~$ systemctl status nvidia-persistenced.service
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2021-12-06 14:50:17 EST; 38min ago
    Process: 1109 ExecStart=/usr/bin/nvidia-persistenced --verbose (code=exited, status=1/FAILURE)
    Process: 1116 ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced/* (code=exited, status=0/SUCCESS)

Dec 06 14:50:17 server systemd[1]: nvidia-persistenced.service: Scheduled restart job, restart counter is at 5.
Dec 06 14:50:17 server systemd[1]: Stopped NVIDIA Persistence Daemon.
Dec 06 14:50:17 server systemd[1]: nvidia-persistenced.service: Start request repeated too quickly.
Dec 06 14:50:17 server systemd[1]: nvidia-persistenced.service: Failed with result 'exit-code'.
Dec 06 14:50:17 server systemd[1]: Failed to start NVIDIA Persistence Daemon.

My GPU is detected. Am I missing a step required to link nvidia-smi to the Ubuntu NVIDIA driver?

I have an xorg.conf file, with which I am able to set the screen resolution, but I can't get nvidia-smi to work.

Let me know if you need more info on this issue.

Thank you in advance.

(Edit):

Here is the output of `sudo lshw -c video`:

user@server:~$ sudo lshw -c video
  *-display UNCLAIMED
       description: VGA compatible controller
       product: TU106 [GeForce RTX 2060 Rev. A]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list
       configuration: latency=0
       resources: memory:f6000000-f6ffffff memory:e0000000-efffffff memory:f0000000-f1ffffff ioport:e000(size=128) memory:c0000-dffff
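
The `UNCLAIMED` marker in the `lshw` output above indicates that no kernel driver is bound to the GPU, which is consistent with the "Driver Not Loaded" error. A minimal sketch of how that check could be scripted follows; the function name `display_is_unclaimed` is hypothetical and not part of `lshw`.

```shell
#!/bin/sh
# Hypothetical helper: reads `lshw -c video` output on stdin and succeeds
# (exit status 0) if a display device is marked UNCLAIMED, i.e. no kernel
# driver is bound to it.
display_is_unclaimed() {
    grep -q '\*-display.*UNCLAIMED'
}

# Usage (needs lshw and root privileges):
#   sudo lshw -c video | display_is_unclaimed && echo "no driver bound to the GPU"
```

Once the driver loads correctly, the `*-display` line should no longer carry `UNCLAIMED`, so the helper would exit non-zero.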
vLev:
I used `sudo apt install nvidia-driver-495` after seeing it in `ubuntu-drivers devices`. I also purged my drivers multiple times, which might mean some blacklists are hidden somewhere. I found this similar issue: https://forums.developer.nvidia.com/t/nvidia-driver-is-not-loaded-ubuntu-18-10/70495
You can reply in your original posting; comments may be deleted. Definitely check for anything other than nvidiafb (`fgrep nvidia /etc/modprobe.d/*`); there shouldn't be any. What driver does `lshw -c video` show? Add its text output to your original posting with code tags. Is Secure Boot off? A Windows update might turn it back on.
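
The blacklist check suggested in this comment can be sketched as a small shell function. This is only an illustration: the function name `list_unexpected_nvidia_entries` and its optional directory argument are assumptions, added so the check can be pointed at a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print modprobe configuration lines that mention
# "nvidia" but not "nvidiafb" (the nvidiafb blacklist is expected).
# The directory defaults to /etc/modprobe.d and can be overridden.
list_unexpected_nvidia_entries() {
    dir="${1:-/etc/modprobe.d}"
    # -r: recurse into the directory, -n: show line numbers, -F: fixed string
    grep -rnF nvidia "$dir" 2>/dev/null | grep -vF nvidiafb
}

# Usage:
#   list_unexpected_nvidia_entries || echo "no unexpected nvidia entries found"
```

Any line it prints, such as a leftover `blacklist nvidia` from an earlier purge, would be worth inspecting before reinstalling the driver.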
vLev:
Thanks! I just updated the post with the `lshw` output.
heynnema:
NVIDIA 495 is beta software. Purge it, and install 470.86... or whatever the most recent version is that ends in a zero. Use the Additional Drivers tab in `Software & Updates`, or see https://www.nvidia.com/en-us/geforce/drivers/. You may also have to disable Secure Boot in your BIOS.
Score:0

As heynnema suggested, I disabled Secure Boot in my UEFI BIOS, and nvidia-smi now works perfectly fine!

user@server:~$ nvidia-smi
Mon Dec  6 18:14:06 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   49C    P8    17W / 190W |    179MiB /  5931MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1152      G   /usr/lib/xorg/Xorg                 35MiB |
|    0   N/A  N/A      1889      G   /usr/lib/xorg/Xorg                 48MiB |
|    0   N/A  N/A      2016      G   /usr/bin/gnome-shell               84MiB |
+-----------------------------------------------------------------------------+

Thank you for all your help!

NovHak:
You mean you disabled Secure Boot, right? Because deleting all keys will indeed put Secure Boot in _Setup Mode_ and hence stop SB enforcement, but that's a bit harsh and requires repopulating the key databases later, should you decide to re-enable it. IMHO, the best way to disable Secure Boot checks on Ubuntu is to execute `mokutil --disable-validation`. But anyway, it should work with Secure Boot; you probably lack a module signature key: `update-secureboot-policy --new-key`
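
Before changing any Secure Boot settings, it may help to confirm the current state. The sketch below assumes that `mokutil --sb-state` prints a line beginning with `SecureBoot enabled` or `SecureBoot disabled`; the function name `secure_boot_enabled` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `mokutil --sb-state` output on stdin and
# succeeds (exit status 0) if Secure Boot is reported as enabled.
# Assumption: the state line starts with "SecureBoot enabled".
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}

# Usage (needs mokutil installed):
#   mokutil --sb-state | secure_boot_enabled \
#       && echo "Secure Boot is on; an unsigned nvidia module may be rejected"
#
# To see whether the module carries a signature (empty output suggests not):
#   modinfo -F signer nvidia
```

If Secure Boot is on and the module turns out to be unsigned, that would match the symptom in the question: DKMS reports the module as installed, but the kernel refuses to load it.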