Each time I run nvidia-smi on our new compute system I get errors like these in syslog, often several of them in a group:
Feb 25 13:35:02 xxxx kernel: [77419.656602] ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20210331/dsfield-184)
Feb 25 13:35:02 xxxx kernel: [77419.656612] ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20210331/dswload2-477)
Feb 25 13:35:02 xxxx kernel: [77419.656616]
Feb 25 13:35:02 xxxx kernel: [77419.656618] No Local Variables are initialized for Method [_DSM]
Feb 25 13:35:02 xxxx kernel: [77419.656618]
Feb 25 13:35:02 xxxx kernel: [77419.656619] Initialized Arguments for Method [_DSM]: (4 arguments defined for method invocation)
Feb 25 13:35:02 xxxx kernel: [77419.656620] Arg0: 000000007cd03195 <Obj> Buffer(16) 75 0B A5 D4 C7 65 F7 46
Feb 25 13:35:02 xxxx kernel: [77419.656628] Arg1: 0000000012ece7a2 <Obj> Integer 0000000000000102
Feb 25 13:35:02 xxxx kernel: [77419.656632] Arg2: 000000009179cfcc <Obj> Integer 0000000000000010
Feb 25 13:35:02 xxxx kernel: [77419.656635] Arg3: 000000002ecdce5a <Obj> Buffer(4) 00 10 52 44
Feb 25 13:35:02 xxxx kernel: [77419.656639]
Feb 25 13:35:02 xxxx kernel: [77419.656641] ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20210331/psparse-529)
The same happens when an snmpd process periodically queries the GPU parameters.
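To check that the error groups really line up one-to-one with each nvidia-smi run or snmpd query, the log can be summarized with something like the following. This is only a minimal sketch: the function name acpi_error_bursts and the 1-second gap used to separate groups are arbitrary choices for illustration, and it assumes the syslog line format shown above (a "kernel: [seconds.micros]" prefix).

```python
import re

# Matches the kernel timestamp and message in a syslog line such as
# "Feb 25 13:35:02 host kernel: [77419.656602] ACPI Error: ..."
KERNEL_LINE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s?(.*)")


def acpi_error_bursts(lines, gap=1.0):
    """Group ACPI error lines into bursts.

    A new burst starts whenever an ACPI error line is more than `gap`
    seconds (kernel uptime) after the previous ACPI error line.
    Returns a list of bursts, each a list of (timestamp, message) tuples.
    """
    bursts = []
    last_ts = None
    for line in lines:
        match = KERNEL_LINE.search(line)
        if not match:
            continue  # not a kernel line in the expected format
        ts, msg = float(match.group(1)), match.group(2)
        # Keep only the headline ACPI error lines, not the argument dump.
        if not (msg.startswith("ACPI") and "Error" in msg):
            continue
        if last_ts is None or ts - last_ts > gap:
            bursts.append([])
        bursts[-1].append((ts, msg))
        last_ts = ts
    return bursts
```

Since the function only iterates over lines, an open file object for the syslog can be passed in directly. The number of bursts can then be compared with the number of nvidia-smi invocations or snmpd polling intervals over the same period.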
Any ideas why this would be?
The output of nvidia-smi seems to be correct, but I'm unsure whether those syslog errors matter. I have updated the BIOS to the latest version, which is only a few days old. Here is the information about the system in question:
$ inxi -Fxz
System: Kernel: 5.13.0-30-generic x86_64 bits: 64 compiler: N/A Console: tty 0 Distro: Ubuntu 20.04.4 LTS (Focal Fossa)
Machine: Type: Desktop System: Alienware product: Alienware Aurora R13 v: N/A serial: <filter>
Mobo: Alienware model: 0C92D0 v: A00 serial: <filter> UEFI: Alienware v: 1.0.12 date: 01/25/2022
CPU: Topology: 10-Core model: 12th Gen Intel Core i7-12700KF bits: 64 type: MT MCP arch: N/A L2 cache: 25.0 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 144383
Speed: 893 MHz min/max: 800/6300 MHz Core speeds (MHz): 1: 890 2: 900 3: 843 4: 891 5: 800 6: 818 7: 873 8: 894
9: 958 10: 925 11: 909 12: 900 13: 891 14: 901 15: 881 16: 909 17: 891 18: 1182 19: 884 20: 913
Graphics: Device-1: NVIDIA vendor: Dell driver: nvidia v: 510.47.03 bus ID: 01:00.0
Display: server: X.org 1.20.13 driver: fbdev,nouveau unloaded: modesetting,vesa tty: 136x50
Message: Advanced graphics data unavailable in console. Try -G --display
Audio: Device-1: Intel vendor: Dell driver: snd_hda_intel v: kernel bus ID: 00:1f.3
Device-2: NVIDIA vendor: Dell driver: snd_hda_intel v: kernel bus ID: 01:00.1
Sound Server: ALSA v: k5.13.0-30-generic
Network: Device-1: Realtek vendor: Bigfoot Networks driver: r8169 v: kernel port: 3000 bus ID: 03:00.0
IF: enp3s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Device-2: Intel vendor: Bigfoot Networks driver: iwlwifi v: kernel port: 3000 bus ID: 04:00.0
IF: wlp4s0 state: down mac: <filter>
IF-ID-1: docker0 state: up speed: 10000 Mbps duplex: unknown mac: <filter>
IF-ID-2: veth4f6068a state: up speed: 10000 Mbps duplex: full mac: <filter>
Drives: Local Storage: total: 1.84 TiB used: 131.29 GiB (7.0%)
ID-1: /dev/nvme0n1 model: KXG70ZNV1T02 NVMe KIOXIA 1024GB size: 953.87 GiB
ID-2: /dev/sda vendor: Toshiba model: DT01ACA100 size: 931.51 GiB temp: 35 C
Partition: ID-1: / size: 904.82 GiB used: 131.20 GiB (14.5%) fs: ext4 dev: /dev/nvme0n1p2
ID-2: swap-1 size: 11.00 GiB used: 65.2 MiB (0.6%) fs: swap dev: /dev/nvme0n1p3
Sensors: System Temperatures: cpu: 32.0 C mobo: N/A
Fan Speeds (RPM): N/A
Info: Processes: 456 Uptime: 21h 41m Memory: 62.60 GiB used: 2.92 GiB (4.7%) Init: systemd runlevel: 5 Compilers:
gcc: 9.3.0 Shell: bash v: 5.0.17 inxi: 3.0.38
The GPU is an NVIDIA RTX 3080 10GB. The system is deployed in a server room with no monitor, mouse, or keyboard attached. The messages appear in the same way even if I connect a monitor, mouse, and keyboard.
I tried to find more information about this problem but had no luck. I'm not even sure whether it is important to fix this, or who I should report it to in case it is a genuine bug.
--
Bogdan