Score:1

Ubuntu 22.04.2 | Random Freeze Problem

fxh

I haven't been able to fix this problem for a month.

At random moments, when I switch tabs or do other simple tasks, the screen freezes and I can't move the cursor. The monitor then goes black for about 2 seconds and comes back, but the display stays frozen and I have to restart my laptop.

Log:

Feb 27 11:16:50 fichony kernel: CPU: 7 PID: 61710 Comm: kworker/u32:1 Not tainted 5.19.0-32-generic #33~22.04.1-Ubuntu
Feb 27 11:16:50 fichony kernel: Hardware name: LENOVO 20QJS0GG00/20QJS0GG00, BIOS R13ET53W(1.27 ) 07/28/2022
Feb 27 11:16:50 fichony kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
Feb 27 11:16:50 fichony kernel: Call Trace:
Feb 27 11:16:50 fichony kernel:  <TASK>
Feb 27 11:16:50 fichony kernel:  show_stack+0x52/0x69
Feb 27 11:16:50 fichony kernel:  dump_stack_lvl+0x49/0x6d
Feb 27 11:16:50 fichony kernel:  dump_stack+0x10/0x18
Feb 27 11:16:50 fichony kernel:  amdgpu_do_asic_reset+0x2b/0x441 [amdgpu]
Feb 27 11:16:50 fichony kernel:  amdgpu_device_gpu_recover_imp.cold+0x4e4/0x7e1 [amdgpu]
Feb 27 11:16:50 fichony kernel:  amdgpu_job_timedout+0x15e/0x190 [amdgpu]
Feb 27 11:16:50 fichony kernel:  ? finish_task_switch.isra.0+0x84/0x290
Feb 27 11:16:50 fichony kernel:  drm_sched_job_timedout+0x6d/0x120 [gpu_sched]
Feb 27 11:16:50 fichony kernel:  process_one_work+0x21f/0x400
Feb 27 11:16:50 fichony kernel:  worker_thread+0x50/0x3f0
Feb 27 11:16:50 fichony kernel:  ? rescuer_thread+0x3a0/0x3a0
Feb 27 11:16:50 fichony kernel:  kthread+0xee/0x120
Feb 27 11:16:50 fichony kernel:  ? kthread_complete_and_exit+0x20/0x20
Feb 27 11:16:50 fichony kernel:  ret_from_fork+0x22/0x30
Feb 27 11:16:50 fichony kernel:  </TASK>
Feb 27 11:16:50 fichony kernel: amdgpu 0000:05:00.0: amdgpu: MODE2 reset
Feb 27 11:16:50 fichony kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset succeeded, trying to resume
Feb 27 11:16:50 fichony kernel: [drm] PCIE GART of 1024M enabled.
Feb 27 11:16:50 fichony kernel: [drm] PTB located at 0x000000F400900000
Feb 27 11:16:50 fichony kernel: [drm] PSP is resuming...
Feb 27 11:16:50 fichony kernel: [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
Feb 27 11:16:51 fichony kernel: amdgpu 0000:05:00.0: amdgpu: RAS: optional ras ta ucode is not available
Feb 27 11:16:51 fichony kernel: amdgpu 0000:05:00.0: amdgpu: RAP: optional rap ta ucode is not available
Feb 27 11:16:51 fichony kernel: [drm] kiq ring mec 2 pipe 1 q 0
Feb 27 11:16:52 fichony kernel: amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Feb 27 11:16:52 fichony kernel: [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
Feb 27 11:16:52 fichony kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
Feb 27 11:16:52 fichony kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset(12) failed
Feb 27 11:16:52 fichony kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset end with ret = -110
Feb 27 11:16:52 fichony kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Feb 27 11:17:01 fichony CRON[67852]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Feb 27 11:17:01 fichony CRON[67853]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Feb 27 11:17:01 fichony CRON[67852]: pam_unix(cron:session): session closed for user root
Feb 27 11:17:02 fichony kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=312097, emitted seq=312097
Feb 27 11:17:02 fichony kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process opera pid 50728 thread opera:cs0 pid 50732
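
For anyone comparing their own logs against the one above, here is a minimal Python sketch that pulls the ring name and the offending process out of the `amdgpu_job_timedout` messages. The function name `summarize_gpu_timeouts` and both regular expressions are illustrative assumptions modeled only on the two timeout lines in this log; other kernel versions may word these messages differently.

```python
import re

# Patterns modeled on the two amdgpu_job_timedout lines in the log above
# (assumption: the message wording matches this kernel's output).
RING_TIMEOUT = re.compile(r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)")
PROCESS_INFO = re.compile(r"Process information: process (\S+) pid (\d+)")


def summarize_gpu_timeouts(log_text):
    """Return (rings, processes) named in amdgpu job-timeout messages."""
    rings = []
    processes = []
    for line in log_text.splitlines():
        # Only look at lines emitted by the job-timeout handler.
        if "amdgpu_job_timedout" not in line:
            continue
        ring = RING_TIMEOUT.search(line)
        if ring:
            rings.append(ring.group(1))
        proc = PROCESS_INFO.search(line)
        if proc:
            processes.append((proc.group(1), int(proc.group(2))))
    return rings, processes


# Two lines copied from the log in the question.
SAMPLE = """\
Feb 27 11:17:02 fichony kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=312097, emitted seq=312097
Feb 27 11:17:02 fichony kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process opera pid 50728 thread opera:cs0 pid 50732
"""

if __name__ == "__main__":
    print(summarize_gpu_timeouts(SAMPLE))
```

Feeding it the text of a saved kernel log would show which ring timed out and which process had submitted the job, which can help when deciding whether two reports describe the same freeze.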

System information:

My System Info Screenshot

I have tried reinstalling Ubuntu, changing the "WaylandEnable" setting, and installing custom GPU drivers, but nothing worked.

How can I fix this?

Score:1

I hit this problem recently too, when my kernel went from 5.15.0-60-generic to 5.19.0-32-generic. Downgrading back to the 5.15 line fixed the exact same freezes you are hitting. I believe that if you look at your syslog you'll see VM_L2_PROTECTION_FAULT_STATUS just as your system freezes. Looking around, I saw some posts saying that 5.19.19 (I believe that was the version) was supposed to include the GPU fix, but the -32 version I had still hit it. Hope this helps!
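
The two checks described in this answer can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`kernel_series`, `is_reported_affected`, `mentions_vm_fault`) are hypothetical, and the "affected" rule encodes nothing more than this thread's report that 5.19 froze while 5.15 did not. It is not an authoritative list of affected kernels.

```python
import platform
import re


def kernel_series(release):
    """Parse a release string such as '5.19.0-32-generic' into (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_reported_affected(release):
    """True if the release is in the 5.19 series that this answer reports as freezing.

    Assumption: based only on this thread, not on any upstream changelog.
    """
    return kernel_series(release) == (5, 19)


def mentions_vm_fault(log_text):
    """True if the log text contains the fault marker named in this answer."""
    return "VM_L2_PROTECTION_FAULT_STATUS" in log_text


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    print(running, "reported affected:", is_reported_affected(running))
```

Running the script on the machine in question prints the running kernel release and whether it falls in the series this answer reports as affected; `mentions_vm_fault` can be applied to the text of a saved syslog.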

fxh
Thanks for the answer. It has been 2 days already and I have had no more crashes. :)
Daum
FYI the 6.0 kernel appears to no longer have this issue.