Score:1

nvidia-smi reports PCIE link generation to be 1


On Ubuntu Server 20.04, I am getting a very concerning output when running an nvidia-smi query:

$ nvidia-smi --query-gpu=index,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current --format=csv
index, pcie.link.gen.current, pcie.link.gen.max, pcie.link.width.current
0, 1, 4, 8
1, 1, 4, 4

The reported pcie.link.gen.current is 1 despite the maximum of the cards being 4. If my understanding is correct, this could drastically reduce the speed of memory copy operations between CPU and GPU and could affect the speed of my deep learning training and inference (on PyTorch).
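To put rough numbers on that concern, here is a minimal sketch of the theoretical per-direction PCIe bandwidth for a given generation and lane count. The function name `pcie_bandwidth_gb_s` is made up for illustration. The figures are the nominal transfer rates and line encodings of PCIe gen 1 to 4, so real copies will come in somewhat lower because of protocol overhead.

```python
# Nominal transfer rate per lane in GT/s for each PCIe generation.
GT_PER_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0}
# Line encoding efficiency: gen 1/2 use 8b/10b, gen 3/4 use 128b/130b.
ENCODING = {1: 8 / 10, 2: 8 / 10, 3: 128 / 130, 4: 128 / 130}


def pcie_bandwidth_gb_s(gen: int, width: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    # GT/s * encoding efficiency = usable Gbit/s per lane; divide by 8 for GB/s.
    return GT_PER_S[gen] * ENCODING[gen] / 8 * width


# Compare the link states from the query above: gen 1 vs. gen 4 at x8 and x4.
for gen, width in [(1, 8), (4, 8), (1, 4), (4, 4)]:
    print(f"Gen {gen} x{width}: {pcie_bandwidth_gb_s(gen, width):.2f} GB/s")
```

If the link really stayed at gen 1 under load, the x8 card would be limited to about 2 GB/s instead of roughly 15.75 GB/s, so the difference would be easy to see in a transfer benchmark.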

In terms of hardware, this is my setup:

  • CPU: Intel Core i9-11900
  • Motherboard: Asus ROG Strix Z590-A Gaming WiFi ATX
  • GPUs: 2x ASUS RTX3090 Strix OC
  • SSD: All M.2 slots are occupied (I understand this can limit the available PCIE lanes, so I won't complain about the second GPU's lanes being only 4 compared to the first one's 8)

I have specifically set the PCIE generation setting for those slots to 4 in the motherboard BIOS, but that does not change what nvidia-smi reports.

How may I:

  1. Test if this situation actually has an impact on the memory transfer, or if it is just a matter of nvidia-smi reporting erroneous information?
  2. Correct this problematic setup?
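For the first point, one way to check the real transfer speed is to time pinned host-to-device copies in PyTorch and compare the result with the theoretical figure for the link. This is only a sketch under assumptions: the function name `measure_h2d_gb_s`, the 256 MiB buffer size, and the repeat count are arbitrary choices, and it returns `None` when PyTorch or a CUDA device is not available.

```python
import time


def measure_h2d_gb_s(size_mib: int = 256, repeats: int = 10, device: str = "cuda:0"):
    """Return measured host-to-device bandwidth in GB/s, or None without CUDA."""
    try:
        import torch  # imported lazily so the script still loads without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    target = torch.device(device)
    n_bytes = size_mib * 1024 * 1024
    # Pinned (page-locked) host memory allows direct DMA transfers to the GPU.
    host = torch.empty(n_bytes, dtype=torch.uint8).pin_memory()
    dev = torch.empty(n_bytes, dtype=torch.uint8, device=target)

    # Warm-up copy, so that CUDA initialisation is not included in the timing.
    dev.copy_(host, non_blocking=True)
    torch.cuda.synchronize(target)

    start = time.perf_counter()
    for _ in range(repeats):
        dev.copy_(host, non_blocking=True)
    torch.cuda.synchronize(target)  # wait until all queued copies have finished
    elapsed = time.perf_counter() - start
    return n_bytes * repeats / elapsed / 1e9


result = measure_h2d_gb_s()
if result is None:
    print("PyTorch with CUDA is not available; nothing measured.")
else:
    print(f"Host-to-device bandwidth: {result:.2f} GB/s")
```

A result well above 2 GB/s on the x8 card would indicate that the link is not actually running at gen 1 during the copy. Passing `device="cuda:1"` measures the second card.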
Terrance:
If you read https://enterprise-support.nvidia.com/s/article/Useful-nvidia-smi-Queries-2 it states that "The current PCI-E link generation. These may be reduced when the GPU is not in use." So I am guessing that your GPUs are not in use right now, but the Max is showing 4 so you should be fine.
Ah, that is a relief. Indeed, nvidia-smi reports gen 4 when the GPUs are in use. I'll accept your answer if you post it as such.
Score:1

According to https://enterprise-support.nvidia.com/s/article/Useful-nvidia-smi-Queries-2, pcie.link.gen.current is "The current PCI-E link generation. These may be reduced when the GPU is not in use."

So, as a test, I ran the following on a server I have here with an NVIDIA GTX 1650 in it.

With the GPU idle:

terrance@Intrepid:~$ nvidia-smi
Wed Mar  1 07:13:34 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 35%   37C    P8    11W /  75W |      3MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

terrance@Intrepid:~$ nvidia-smi --query-gpu=index,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current --format=csv
index, pcie.link.gen.current, pcie.link.gen.max, pcie.link.width.current
0, 1, 2, 16

Now with a GPU in use:

terrance@Intrepid:~$ nvidia-smi
Wed Mar  1 07:40:16 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 35%   40C    P0    17W /  75W |     30MiB /  4096MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    310633      G   /usr/lib/xorg/Xorg                 23MiB |
|    0   N/A  N/A    310787      G   xfwm4                               1MiB |
+-----------------------------------------------------------------------------+

terrance@Intrepid:~$ nvidia-smi --query-gpu=index,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current --format=csv
index, pcie.link.gen.current, pcie.link.gen.max, pcie.link.width.current
0, 2, 2, 16
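If you want to check this from a script rather than by eye, here is a small sketch that parses the CSV output of the query above and flags any GPU whose current link generation is below its maximum. The function name `parse_link_state` and the `downshifted` key are my own; the two sample strings are the outputs shown above.

```python
import csv
import io


def parse_link_state(csv_text: str) -> list:
    """Parse `nvidia-smi --query-gpu=... --format=csv` output into dicts."""
    # skipinitialspace drops the blank that nvidia-smi puts after each comma.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    rows = []
    for row in reader:
        current = int(row["pcie.link.gen.current"])
        maximum = int(row["pcie.link.gen.max"])
        rows.append({
            "index": int(row["index"]),
            "current": current,
            "max": maximum,
            "width": int(row["pcie.link.width.current"]),
            "downshifted": current < maximum,  # True while the link is below its maximum
        })
    return rows


HEADER = "index, pcie.link.gen.current, pcie.link.gen.max, pcie.link.width.current\n"
idle = parse_link_state(HEADER + "0, 1, 2, 16\n")
busy = parse_link_state(HEADER + "0, 2, 2, 16\n")
print("idle:", idle)
print("busy:", busy)
```

To use it on live data, pass it the standard output of the same nvidia-smi command, for example captured with `subprocess.run(..., capture_output=True, text=True).stdout`, once while the GPU is idle and once while a job is running.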

Hope this helps!
