Score:3

Ubuntu 22.04 freezes and shows amdgpu errors in logs


I have been running Ubuntu 22.04 for a while now. Everything worked perfectly until today. I am using an AMD Ryzen Lenovo ThinkPad (T14 Gen 3). My system froze twice today.

The last messages in the Important tab of my logs before the freeze were:

09:02:10 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process msedge pid 3949 thread msedge:cs0 pid 3971
09:02:10 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process msedge pid 3949 thread msedge:cs0 pid 3971
09:02:10 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=172936, emitted seq=172937
09:01:46 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
09:01:46 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled 

The System tab shows the following logs:

09:02:10 kernel: amdgpu 0000:04:00.0: amdgpu: GPU recovery disabled.
09:02:10 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process msedge pid 3949 thread msedge:cs0 pid 3971
09:01:46 kernel: amdgpu 0000:04:00.0: amdgpu: GPU recovery disabled.
09:01:46 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
09:01:46 kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!

I cannot figure out why this problem appears, because I didn't install any updates today. I also did not install any GPU-related drivers or any other drivers. It is just the "default" Ubuntu installation.

Thanks for any help.

Antonin: Same here: ``` kernel: [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out! *ERROR* ring sdma0 timeout, signaled seq=27689, emitted seq=27691 *ERROR* Process information: process pid 0 thread pid 0 kernel: amdgpu 0000:33:00.0: amdgpu: GPU recovery disabled. *ERROR* Waiting for fences timed out! *ERROR* ring gfx_0.0.0 timeout, signaled seq=227887, emitted seq=227889 ] *ERROR* Process information: process firefox pid 5155 thread firefox:cs0 pid 5330 kernel: amdgpu 0000:33:00.0: amdgpu: GPU recovery disabled. *ERROR* Waiting for fences timed out! ```
Antonin: Same hardware, by the way.
Philipp: @Antonin Since I stopped putting my notebook into `suspend` mode, the problem has disappeared. But I am not sure whether that is actually related.
d.lime: Same laptop, same OS, same error. It has to be something with the kernel version, but I'm too lazy to keep downgrading until I find a working one... We should submit this to AMD / Ubuntu support.
Score:0
P09

Should be solved in Ubuntu 23.04 (kernel 6.2+) with libdrm-amdgpu1 2.4.114-1; see https://gitlab.freedesktop.org/drm/amd/-/issues/2282#note_1901512

More info https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1980831

I have the same problem: Lenovo ThinkPad T14 Gen 3 AMD (Ryzen 7 PRO 6850U, 21CGS1ES00, BIOS R23ET65W - 1.35), Ubuntu 22.04.2 (5.19.0-42-generic).

libdrm-amdgpu (apt search amdgpu*):

libdrm-amdgpu1/jammy-updates,now 2.4.113-2~ubuntu0.22.04.1 amd64
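
To see whether a machine is still below the 6.2 kernel mentioned above, the running release can be compared in a few lines of Python. This is only a minimal sketch: `kernel_at_least` is a hypothetical helper name, and it looks only at the major and minor numbers of the release string, so it cannot tell whether a distribution kernel carries a backported fix.

```python
import platform
import re


def kernel_at_least(release, minimum=(6, 2)):
    """Return True if a kernel release string is at or above `minimum` (major, minor).

    Hypothetical helper for illustration; ignores patch level and distro suffixes.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints on Linux
    status = "6.2 or newer" if kernel_at_least(release) else "older than 6.2"
    print(f"{release}: {status}")
```

With the release string from my machine, `kernel_at_least("5.19.0-42-generic")` is false, which matches the fact that I still see the problem.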

syslog:

May 22 23:12:16 P09-ThinkPad-T14-Gen-3 kernel: [ 4089.681396] [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
May 22 23:12:21 P09-ThinkPad-T14-Gen-3 kernel: [ 4089.685379] [drm:amdgpu_dm_commit_planes [amdgpu]] *ERROR* Waiting for fences timed out!
May 22 23:12:21 P09-ThinkPad-T14-Gen-3 kernel: [ 4094.801723] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=19205, emitted seq=19206
May 22 23:12:21 P09-ThinkPad-T14-Gen-3 kernel: [ 4094.802071] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
May 22 23:12:21 P09-ThinkPad-T14-Gen-3 kernel: [ 4094.802351] amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4095.955280] amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4095.955394] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.150614] [drm:gfx_v10_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.163949] [drm] free PSP TMR buffer
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196421] CPU: 6 PID: 24115 Comm: kworker/u32:3 Tainted: G           OE     5.19.0-41-generic #42~22.04.1-Ubuntu
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196425] Hardware name: LENOVO 21CGS1ES00/21CGS1ES00, BIOS R23ET65W (1.35 ) 03/21/2023
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196427] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196435] Call Trace:
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196437]  <TASK>
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196440]  show_stack+0x52/0x69
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196445]  dump_stack_lvl+0x49/0x6d
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196450]  dump_stack+0x10/0x18
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196453]  amdgpu_do_asic_reset+0x2b/0x441 [amdgpu]
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196678]  amdgpu_device_gpu_recover_imp.cold+0x4f6/0x805 [amdgpu]
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.196879]  amdgpu_job_timedout+0x15e/0x190 [amdgpu]
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197059]  ? finish_task_switch.isra.0+0x84/0x290
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197064]  drm_sched_job_timedout+0x6d/0x120 [gpu_sched]
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197068]  process_one_work+0x21f/0x400
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197072]  worker_thread+0x50/0x3f0
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197074]  ? rescuer_thread+0x3a0/0x3a0
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197076]  kthread+0xee/0x120
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197078]  ? kthread_complete_and_exit+0x20/0x20
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197081]  ret_from_fork+0x22/0x30
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197086]  </TASK>
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.197088] amdgpu 0000:04:00.0: amdgpu: MODE2 reset
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.206082] amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.206251] [drm] PCIE GART of 512M enabled (table at 0x000000F4008C9000).
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.206264] [drm] PSP is resuming...
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.228241] [drm] reserve 0xa00000 from 0xf43f400000 for PSP TMR
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.564430] amdgpu 0000:04:00.0: amdgpu: RAS: optional ras ta ucode is not available
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.576531] amdgpu 0000:04:00.0: amdgpu: RAP: optional rap ta ucode is not available
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.576534] amdgpu 0000:04:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.576542] amdgpu 0000:04:00.0: amdgpu: SMU is resuming...
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.576933] amdgpu 0000:04:00.0: amdgpu: SMU is resumed successfully!
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.578769] [drm] DMUB hardware initialized: version=0x04000022
May 22 23:12:23 P09-ThinkPad-T14-Gen-3 kernel: [ 4096.748970] [drm:check_syncd_pipes_for_disabled_master_pipe [amdgpu]] *ERROR* DC: Failure: pipe_idx[2] syncd with disabled master pipe_idx[1]
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.370891] [drm] kiq ring mec 2 pipe 1 q 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.376593] [drm] VCN decode and encode initialized successfully(under DPG Mode).
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377031] [drm] JPEG decode initialized successfully.
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377039] amdgpu 0000:04:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377045] amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377047] amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377049] amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377050] amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377052] amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377053] amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377054] amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377056] amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377058] amdgpu 0000:04:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377059] amdgpu 0000:04:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377061] amdgpu 0000:04:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377063] amdgpu 0000:04:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377064] amdgpu 0000:04:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.377065] amdgpu 0000:04:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.384067] amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow start
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.384073] amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow done
May 22 23:12:24 P09-ThinkPad-T14-Gen-3 kernel: [ 4097.384166] amdgpu 0000:04:00.0: amdgpu: GPU reset(1) succeeded!
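
For anyone who wants to compare their own logs with the ones above, the ring timeout lines can be pulled out with a small script. This is a rough sketch, not an official tool: the regular expression is my assumption based only on the message format seen in this thread, and `find_ring_timeouts` is a hypothetical helper name.

```python
import re

# Pattern assumed from the "ring ... timeout" lines quoted in this thread.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(lines):
    """Return a (ring, signaled_seq, emitted_seq) tuple for each timeout line."""
    hits = []
    for line in lines:
        match = RING_TIMEOUT.search(line)
        if match:
            hits.append(
                (
                    match.group("ring"),
                    int(match.group("signaled")),
                    int(match.group("emitted")),
                )
            )
    return hits


# Two lines copied from the syslog excerpt above.
sample = [
    "May 22 23:12:21 P09-ThinkPad-T14-Gen-3 kernel: [ 4094.801723] "
    "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, "
    "signaled seq=19205, emitted seq=19206",
    "May 22 23:12:21 P09-ThinkPad-T14-Gen-3 kernel: [ 4094.802351] "
    "amdgpu 0000:04:00.0: amdgpu: GPU reset begin!",
]

for ring, signaled, emitted in find_ring_timeouts(sample):
    print(f"{ring}: signaled={signaled} emitted={emitted}")
```

The same function can be fed the lines of a saved kernel log, for example the output of `journalctl -k` written to a file.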

I'm using Lenovo ThinkPad Universal USB-C Dock (40AY) with two monitors via DP + internal display.
