Score:0

random crash running 20.04.5 LTS

jo flag

My ubuntu installation randomly crashes and I'm very perplexed as to why.

The crashes only seem to happen when I run the gnome shell from boot. When I boot into "text" mode via grub, I have yet to reproduce it (but honestly I ran the gnome shell for a day with no issue).

The restart/crash pattern seems so random, sometimes nothing is happening for hours on end, then seemingly just restarts every 5 or so minutes with no rhyme or reason.

I've watched sensors and tailed the following logs and don't see anything that stick. I do consistently see the gnome shell assertion warnings (which seem like they always happen, but its only consistent thing I've noticed)

I also watched the cpu/ram usage (and recorded it using prometheus node exporter) and I barely am using any cpu, however half of ram is use/cached/available in buffer.

Also checked Xorg log file for my user (but unclear if this would be same user that the main boot gui display would use) and don't see any crashes or errors in there

tail -F /var/log/{auth.log,syslog,kern.log,dmesg}

I also installed kdump, but during last crash around 1am, nothing was there. I do have a crash from 4 days ago in gnome shell in /var/crash, but ubuntu has crashed many times since then.

I've run memtest86 on the cpu and ram and it works fine. I've run a stress test tool and could not reliably reproduce any crash. Since I installed kdump and modified some grub settings, it no longer auto reboots on crash, so I don't think my power supply is the issue (I'm also connected to a UPS which isn't having any issues)

The other thing I tried was to verify health of ssd running via nvme. It does have an error count reported by nvme, but I don't know if thats valid. If I use nvme client it show throws some namespace warnings when trying to check the self test log (which I assume is because maybe the ssd doesn't conform to some nvme standards).

Any ideas what I should check next? Or if I'm missing something

Jan 24 00:58:42 GB-BRi7-H-8550 sshd[3436]: debug3: Copy environment: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
Jan 24 00:58:42 GB-BRi7-H-8550 sshd[3436]: debug3: Copy environment: LANG=en_US.UTF-8

==> /var/log/syslog <==
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: g_dbus_connection_emit_signal: assertion 'G_IS_DBUS_CONNECTION (connection)' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Jan 24 01:00:18 GB-BRi7-H-8550 gnome-shell[2608]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Jan 24 01:00:23 GB-BRi7-H-8550 PackageKit: daemon quit
Jan 24 01:00:23 GB-BRi7-H-8550 systemd[1]: packagekit.service: Succeeded.

Every 2.0s: sensors                                  GB-BRi7-H-8550: Tue Jan 24 01:00:32 2023

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +32.0°C  (high = +100.0°C, crit = +100.0°C)

pch_cannonlake-virtual-0
Adapter: Virtual device
temp1:        +32.0°C

acpitz-acpi-0
Adapter: ACPI interface
temp1:       -263.2°C
temp2:        +27.8°C  (crit = +119.0°C)

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:            N/A

nvme-pci-3b00
Adapter: PCI adapter
Composite:    +23.9°C  (low  =  -0.1°C, high = +69.8°C)
                       (crit = +84.8°C)
SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        23 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    1%
Data Units Read:                    47,266,923 [24.2 TB]
Data Units Written:                 18,678,056 [9.56 TB]
Host Read Commands:                 4,590,365,019
Host Write Commands:                2,075,041,190
Controller Busy Time:               7,186
Power Cycles:                       95
Power On Hours:                     11,932
Unsafe Shutdowns:                   49
Media and Data Integrity Errors:    0
Error Information Log Entries:      118
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0

Error Information (NVMe Log 0x01, max 16 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0        118     0  0x100c  0x4005  0x028            0     0     -


I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.