Score:0

RTX 2060 not recognized after installing nvidia-driver-510, while an RTX 3060 installed alongside it is recognized, on Ubuntu 20.04


The RTX 2060 is not recognized after installing nvidia-driver-510, while the RTX 3060 installed alongside it is recognized, on Ubuntu 20.04. (The machine has 2 Nvidia GPUs and 1 Intel GPU.)

This is a fresh installation after I replaced my HDD. nvidia-driver-510 worked well with both the 2060 and the 3060 on the previous installation.

$ lspci|grep -i vga
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2504 (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1f03 (rev a1)
$ uname -a
Linux CMPLTRTOK-U20 5.15.0-72-generic #79~20.04.1-Ubuntu SMP Thu Apr 20 22:12:07 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ apt list linux-headers-$(uname -r)
Listing... Done
linux-headers-5.15.0-72-generic/focal-updates,focal-security,now 5.15.0-72.79~20.04.1 amd64 [installed,automatic]
$ apt list nvidia-driver-*
Listing... Done
nvidia-driver-390/focal-updates,focal-security 390.157-0ubuntu0.20.04.1 amd64
nvidia-driver-390/focal-updates,focal-security 390.157-0ubuntu0.20.04.1 i386
nvidia-driver-418-server/focal-updates,focal-security 418.226.00-0ubuntu0.20.04.2 amd64
nvidia-driver-418/focal 430.50-0ubuntu3 amd64
nvidia-driver-430/focal-updates,focal-security 440.100-0ubuntu0.20.04.1 amd64
nvidia-driver-435/focal-updates 455.45.01-0ubuntu0.20.04.1 amd64
nvidia-driver-440-server/focal-updates,focal-security 450.236.01-0ubuntu0.20.04.1 amd64
nvidia-driver-440/focal-updates,focal-security 450.119.03-0ubuntu0.20.04.1 amd64
nvidia-driver-450-server/focal-updates,focal-security 450.236.01-0ubuntu0.20.04.1 amd64
nvidia-driver-450/focal-updates,focal-security 460.91.03-0ubuntu0.20.04.1 amd64
nvidia-driver-455/focal-updates,focal-security 460.91.03-0ubuntu0.20.04.1 amd64
nvidia-driver-460-server/focal-updates,focal-security 470.182.03-0ubuntu0.20.04.1 amd64
nvidia-driver-460/focal-updates,focal-security 470.182.03-0ubuntu0.20.04.1 amd64
nvidia-driver-465/focal-updates,focal-security 470.182.03-0ubuntu0.20.04.1 amd64
nvidia-driver-470-server/focal-updates,focal-security 470.182.03-0ubuntu0.20.04.1 amd64
nvidia-driver-470/focal-updates,focal-security 470.182.03-0ubuntu0.20.04.1 amd64
nvidia-driver-495/focal-updates,focal-security 510.108.03-0ubuntu0.20.04.1 amd64
nvidia-driver-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
nvidia-driver-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed]
nvidia-driver-515-open/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
nvidia-driver-515-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
nvidia-driver-515/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
nvidia-driver-520-open/focal-updates,focal-security 525.116.04-0ubuntu0.20.04.1 amd64
nvidia-driver-520/focal-updates,focal-security 525.116.04-0ubuntu0.20.04.1 amd64
nvidia-driver-525-open/focal-updates,focal-security 525.116.04-0ubuntu0.20.04.1 amd64
nvidia-driver-525-server/focal-updates,focal-security 525.105.17-0ubuntu0.20.04.1 amd64
nvidia-driver-525/focal-updates,focal-security 525.116.04-0ubuntu0.20.04.1 amd64
nvidia-driver-530-open/focal-updates,focal-security 530.41.03-0ubuntu0.20.04.2 amd64
nvidia-driver-530/focal-updates,focal-security 530.41.03-0ubuntu0.20.04.2 amd64

$ apt list libnvidia*-510*
Listing... Done
libnvidia-cfg1-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
libnvidia-cfg1-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed,automatic]
libnvidia-common-510-server/focal-updates,focal-updates,focal-security,focal-security 515.105.01-0ubuntu0.20.04.1 all
libnvidia-common-510/focal-updates,focal-updates,focal-security,focal-security,now 510.108.03-0ubuntu0.20.04.1 all [installed,automatic]
libnvidia-compute-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
libnvidia-compute-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 i386
libnvidia-compute-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed,automatic]
libnvidia-compute-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 i386 [installed,automatic]
libnvidia-decode-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
libnvidia-decode-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 i386
libnvidia-decode-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed,automatic]
libnvidia-decode-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 i386 [installed,automatic]
libnvidia-encode-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
libnvidia-encode-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 i386
libnvidia-encode-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed,automatic]
libnvidia-encode-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 i386 [installed,automatic]
libnvidia-extra-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
libnvidia-extra-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 i386
libnvidia-extra-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed,automatic]
libnvidia-extra-510/focal-updates,focal-security 510.108.03-0ubuntu0.20.04.1 i386
libnvidia-fbc1-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
libnvidia-fbc1-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 i386
libnvidia-fbc1-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed,automatic]
libnvidia-fbc1-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 i386 [installed,automatic]
libnvidia-gl-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
libnvidia-gl-510-server/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 i386
libnvidia-gl-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 amd64 [installed,automatic]
libnvidia-gl-510/focal-updates,focal-security,now 510.108.03-0ubuntu0.20.04.1 i386 [installed,automatic]
libnvidia-nscq-510/focal-updates,focal-security 515.105.01-0ubuntu0.20.04.1 amd64
$ nvidia-smi -q

==============NVSMI LOG==============

Timestamp                                 : Thu Jun 22 17:05:49 2023
Driver Version                            : 510.108.03
CUDA Version                              : 11.6

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Product Name                          : NVIDIA GeForce RTX 3060
    Product Brand                         : GeForce
    Product Architecture                  : Ampere
    Display Mode                          : Disabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-5f52334c-826a-7900-978b-7fcb937de6ea
    Minor Number                          : 0
    VBIOS Version                         : 94.06.2F.00.9A
    MultiGPU Board                        : No
    Board ID                              : 0x100
    GPU Part Number                       : N/A
    Module ID                             : 0
    Inforom Version
        Image Version                     : G001.0000.03.03
        OEM Object                        : 2.0
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GSP Firmware Version                  : N/A
    GPU Virtualization Mode
        Virtualization Mode               : None
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x01
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x250410DE
        Bus Id                            : 00000000:01:00.0
        Sub System Id                     : 0x397D1462
        GPU Link Info
            PCIe Generation
                Max                       : 3
                Current                   : 1
            Link Width
                Max                       : 16x
                Current                   : 16x
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : 0 KB/s
        Rx Throughput                     : 0 KB/s
    Fan Speed                             : 0 %
    Performance State                     : P8
    Clocks Throttle Reasons
        Idle                              : Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 12288 MiB
        Reserved                          : 235 MiB
        Used                              : 3 MiB
        Free                              : 12049 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 3 MiB
        Free                              : 253 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 0 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    FBC Stats
        Active Sessions                   : 0
        Average FPS                       : 0
        Average Latency                   : 0
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 43 C
        GPU Shutdown Temp                 : 98 C
        GPU Slowdown Temp                 : 95 C
        GPU Max Operating Temp            : 93 C
        GPU Target Temperature            : 83 C
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 15.30 W
        Power Limit                       : 170.00 W
        Default Power Limit               : 170.00 W
        Enforced Power Limit              : 170.00 W
        Min Power Limit                   : 100.00 W
        Max Power Limit                   : 170.00 W
    Clocks
        Graphics                          : 210 MHz
        SM                                : 210 MHz
        Memory                            : 405 MHz
        Video                             : 555 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 2130 MHz
        SM                                : 2130 MHz
        Memory                            : 7501 MHz
        Video                             : 1950 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : 662.500 mV
    Processes
        GPU instance ID                   : N/A
        Compute instance ID               : N/A
        Process ID                        : 1151
            Type                          : G
            Name                          : /usr/bin/gnome-shell
            Used GPU Memory               : 2 MiB
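
The mismatch shown above (lspci lists two NVIDIA devices, nvidia-smi attaches only one) can be checked mechanically. Below is a minimal sketch in Python, assuming the output formats are exactly as pasted in this post; the function names are made up for illustration and are not part of any NVIDIA tool.

```python
import re
import subprocess

# lspci prints "BB:DD.F VGA compatible controller: NVIDIA ..." for each GPU.
# "3D controller" is included because some NVIDIA GPUs are listed that way.
LSPCI_RE = re.compile(
    r"^([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F])\s+"
    r"(?:VGA compatible controller|3D controller):\s+NVIDIA",
    re.MULTILINE,
)

# nvidia-smi -q prints one header line "GPU 00000000:BB:DD.F" per attached GPU.
SMI_RE = re.compile(
    r"^GPU\s+[0-9a-fA-F]{4,8}:"
    r"([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F])\s*$",
    re.MULTILINE,
)


def nvidia_devices_on_bus(lspci_text):
    """PCI addresses of NVIDIA GPUs that lspci can see."""
    return {addr.lower() for addr in LSPCI_RE.findall(lspci_text)}


def devices_attached_to_driver(smi_text):
    """PCI addresses of GPUs that nvidia-smi -q reports as attached."""
    return {addr.lower() for addr in SMI_RE.findall(smi_text)}


def missing_from_driver(lspci_text, smi_text):
    """GPUs present on the PCI bus but not attached to the NVIDIA driver."""
    return sorted(
        nvidia_devices_on_bus(lspci_text) - devices_attached_to_driver(smi_text)
    )


if __name__ == "__main__":
    # Requires lspci and nvidia-smi to be installed and on PATH.
    lspci_out = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    ).stdout
    smi_out = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True
    ).stdout
    print("On the bus but not attached:", missing_from_driver(lspci_out, smi_out))
```

With the output pasted above, this would report 02:00.0, which is the device lspci shows as "Device 1f03".
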
Score:0

I do not know why, but it started working after I rebooted the system twice following the posting of this question.

The details are below. I wrote them down hoping they give other people some clues, and hoping that experts can explain the root cause. (I included some details about installing nvtop; I did not think it mattered before, but I guess it does now.)

Details:

(1) I have an Ubuntu 20.04 system with an RTX 3060 (PCI-E 3.0 x16) and Intel integrated graphics. Later I added an RTX 2060 to the system on a PCI-E 2.0 x4 link.

(2) I installed nvidia-driver-510 via apt about one week ago and rebooted immediately afterwards.

(3) I installed nvtop yesterday to monitor GPU status before verifying the GPUs. I got the nvtop source from git and built it. To resolve the build errors reported by nvtop's cmake, I installed the following packages via apt:

libsystemd-dev

libdrm-dev

libgtest-dev

libudev-dev

(4) I verified the GPUs by running PyTorch distributed data parallel (torch-1.10.1+cu113-p38-linux), but PyTorch could not find any GPU. At this point nvtop found only the Intel GPU and the RTX 3060, nvidia-smi found only the RTX 3060, but lspci listed the Intel GPU and both Nvidia GPUs. I then posted this question, powered off the system, and went to sleep.

(5) (The 1st reboot.) I powered the system on after I woke up. nvtop found the Intel GPU and both Nvidia GPUs, while nvidia-smi reported the error below:

$ nvidia-smi -q
Unable to determine the device handle for GPU 0000:02:00.0: Unknown Error

(6) I tested with PyTorch as before, and it reported a CUDA error. The process then hung: I could not abort it or even kill it with -9, and it used 100% of one CPU core. The system stayed on even after I ran "sudo poweroff", so I physically powered it off.

(7) (The 2nd reboot.) After this second reboot since installing nvtop, nvtop, nvidia-smi, and the PyTorch code all work. I verified this by running the PyTorch code while monitoring with nvtop. It works!
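
For anyone who wants to repeat the PyTorch side of the check in steps (4) and (7) without a full distributed job, here is a minimal sketch. It is not the code I ran; `list_cuda_devices` is a hypothetical helper name, and it only uses the standard `torch.cuda` query calls. Note that if the driver is in the state described in step (6), even these calls might hang.

```python
def list_cuda_devices():
    """Return the names of the CUDA devices PyTorch can see ([] if none)."""
    try:
        import torch  # assumes PyTorch is installed; otherwise report no devices
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_name(index)
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    names = list_cuda_devices()
    if not names:
        print("PyTorch sees no CUDA devices")
    for index, name in enumerate(names):
        print(f"cuda:{index} -> {name}")
```

On a healthy system with both cards attached, this should list two devices; in step (4) it would have listed none.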
