Premise
I tried pretty much everything I could find; the attempts are listed at the end.
This question is structured as follows: Premise, Objective, Problem, Question & Recap, Attempts.
My PC:
- Gigabyte X570 I AORUS PRO WIFI Mini ITX AM4
- AMD Ryzen 5 5600x (Zen3)
- Palit Dual GeForce RTX 3060Ti 8GB
- MX500 1TB (Ubuntu)
- Sabrent Rocket 4.0 500GB (Windows)
- MX500 2TB (storage)
Also:
- Secure Boot is disabled (and so is CSM)
Objective
Ubuntu 22.04 LTS Installation for Machine Learning development, mainly GPU accelerated:
- CUDA is needed
- Nouveau drivers are not optimal, hence they are currently blacklisted
- NVIDIA drivers are needed
blacklist nouveau
options nouveau modeset=0
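A quick way to confirm that the blacklist actually took effect is to check that nouveau is absent from the loaded modules. This is only a minimal sketch: the helper name is_module_loaded is hypothetical, and it simply parses lsmod-style output from standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named module appears in
# lsmod-style output read from standard input.
is_module_loaded() {
    # Skip the header line, then compare the first column exactly.
    awk -v mod="$1" 'NR > 1 && $1 == mod { found = 1 } END { exit !found }'
}

# Usage on the real system:
#   lsmod | is_module_loaded nouveau && echo "nouveau is still loaded"
```

If nouveau still shows up, the blacklist file may not have been picked up by the initramfs yet.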
Problem
The driver appears to be recognized:
- It is shown as installed in Software & Updates
Software & Updates Screenshot link.
But it is clearly not in use, as I'm writing this at 1024x768 (4:3).
Checking which driver and hardware are recognized:
teniu@AW-PC:~$ sudo lshw -c video
[sudo] password for teniu:
*-display
description: VGA compatible controller
product: GA104 [GeForce RTX 3060 Ti]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:09:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
configuration: driver=nvidia latency=0
resources: irq:108 memory:fb000000-fbffffff memory:d0000000-dfffffff memory:e0000000-e1ffffff ioport:e000(size=128) memory:fc000000-fc07ffff
*-graphics
product: EFI VGA
physical id: 1
logical name: /dev/fb0
capabilities: fb
configuration: depth=32 resolution=1024,768
teniu@AW-PC:~$ modinfo $(modprobe --resolve-alias nvidia)
modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.19.0-41-generic
modinfo: ERROR: missing module or filename.
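Since modprobe reports that the module is missing for 5.19.0-41-generic, one way to double-check is to search that kernel's module tree for NVIDIA module files directly. This is a minimal sketch, assuming the modules live somewhere under /lib/modules/<kernel>; the function name find_nvidia_modules and its second (root directory) parameter are hypothetical and exist only to make the check easy to try out.

```shell
#!/bin/sh
# Hypothetical helper: list NVIDIA kernel module files for a given kernel.
# $1 = kernel release (defaults to the running kernel)
# $2 = modules root (defaults to /lib/modules)
find_nvidia_modules() {
    kernel="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    # Module files may be compressed (.ko.zst, .ko.xz), hence the trailing *.
    find "$root/$kernel" -type f -name 'nvidia*.ko*' 2>/dev/null
}

# Usage on the real system:
#   [ -n "$(find_nvidia_modules)" ] || echo "no NVIDIA module files for $(uname -r)"
```

If this prints nothing, the output of dkms status may show whether the module was built for a different kernel than the one currently running.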
Trying to install the drivers with ubuntu-drivers autoinstall fails with this output:
Traceback (most recent call last):
File "/usr/bin/ubuntu-drivers", line 513, in <module>
greet()
File "/usr/lib/python3/dist-packages/click/core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "/usr/lib/python3/dist-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/usr/lib/python3/dist-packages/click/core.py", line 1659, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/lib/python3/dist-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/usr/lib/python3/dist-packages/click/decorators.py", line 84, in new_func
return ctx.invoke(f, obj, *args, **kwargs)
File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/usr/bin/ubuntu-drivers", line 432, in autoinstall
command_install(config)
File "/usr/bin/ubuntu-drivers", line 187, in command_install
UbuntuDrivers.detect.nvidia_desktop_pre_installation_hook(to_install)
File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 839, in nvidia_desktop_pre_installation_hook
with_nvidia_kms = version >= 470
UnboundLocalError: local variable 'version' referenced before assignment
To work around this, I installed the driver by specifying the version explicitly:
sudo apt install nvidia-driver-530-open
The installation completes, but running nvidia-smi afterwards outputs:
No devices were found
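When nvidia-smi reports no devices, the kernel log sometimes carries a more specific message. The NVIDIA kernel module tags its log lines with NVRM, so a filter like the following may help narrow things down. This is only a sketch, and the helper name nvidia_log_lines is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that mention the
# NVIDIA driver (NVRM prefix or the word "nvidia"), case-insensitively.
nvidia_log_lines() {
    grep -i -E 'nvrm|nvidia'
}

# Usage on the real system (reading the kernel log usually needs root):
#   sudo dmesg | nvidia_log_lines
```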
Is the kernel module loaded? Seems like it is:
teniu@AW-PC:~$ lsmod | grep nvidia
nvidia_uvm 1441792 0
nvidia_drm 77824 1
nvidia_modeset 1404928 1 nvidia_drm
nvidia 6234112 5 nvidia_uvm,nvidia_modeset
drm_kms_helper 200704 1 nvidia_drm
drm 581632 5 drm_kms_helper,nvidia,nvidia_drm
Checking again whether the NVIDIA card is detected:
teniu@AW-PC:~$ lspci | grep -i nvidia
09:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3060 Ti] (rev a1)
09:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)
Seems like it.
Question & Recap
nvidia-smi doesn't work, yet the driver appears to be detected. I am clueless now.
The Ubuntu 22.04 LTS used for this is a fresh install.
I think the following output may be a major hint, since the NVIDIA module is not found in the kernel modules directory:
teniu@AW-PC:~$ modinfo $(modprobe --resolve-alias nvidia)
modprobe: FATAL: Module nvidia not found in directory /lib/modules/5.19.0-41-generic
modinfo: ERROR: missing module or filename.
- What am I supposed to do now?
Attempts (all failed)
- Purge and reinstall (multiple)
- Changing to 515 drivers instead of 530 (twice)
- (did I install the wrong version...?)
- Disabling Secure Boot (and trying everything again)
- Fresh Ubuntu install
- Test on Windows (it works there, no hardware problems)
- Updating the kernel headers (nothing to update)
sudo apt install linux-headers-$(uname -r)
- (I would like to avoid manual installation through download)
- Give up (not an option)
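For the headers step in the list above, a quick sanity check is whether a headers directory exists for the running kernel at all. This is a minimal sketch, assuming the usual Ubuntu location /usr/src/linux-headers-<kernel>; the function name has_kernel_headers and its second (root directory) parameter are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a headers directory exists for the kernel.
# $1 = kernel release (defaults to the running kernel)
# $2 = source root (defaults to /usr/src)
has_kernel_headers() {
    kernel="${1:-$(uname -r)}"
    root="${2:-/usr/src}"
    [ -d "$root/linux-headers-$kernel" ]
}

# Usage on the real system:
#   has_kernel_headers || echo "headers missing for $(uname -r)"
```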
Useful but unrelated information
When I built the PC, I had to update the BIOS because the motherboard did not support the Zen 3 architecture of the Ryzen 5 5600X out of the box.
I mention this only because at startup I see fast-scrolling terminal output, and at the top there is a message about something AMD-related not being recognized. I highly doubt it is related or important, though, since everything works afterwards.