
Finding out why a disk goes read-only once a week


Once per week it seems this SSD goes into read-only mode. It seems like it's always Sunday, but I can't 100% confirm that at this point. Is there any place I should look to see what's kicking it into read-only mode?

Running Ubuntu 20.04.3 LTS

Here is the output from `dmesg | grep -i 'mount'`:

[    0.127427] Mount-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.127451] Mountpoint-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    2.923738] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
[    3.305534] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[    3.306844] systemd[1]: Mounting Huge Pages File System...
[    3.307557] systemd[1]: Mounting POSIX Message Queue File System...
[    3.308408] systemd[1]: Mounting Kernel Debug File System...
[    3.309511] systemd[1]: Mounting Kernel Trace File System...
[    3.316394] systemd[1]: Starting Remount Root and Kernel File Systems...
[    3.319664] systemd[1]: Mounted Huge Pages File System.
[    3.319789] systemd[1]: Mounted POSIX Message Queue File System.
[    3.319898] systemd[1]: Mounted Kernel Debug File System.
[    3.320106] systemd[1]: Mounted Kernel Trace File System.
[    3.331875] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: none.
[    3.333993] systemd[1]: Finished Remount Root and Kernel File Systems.
[    3.362611] systemd[1]: Mounting FUSE Control File System...
[    3.363770] systemd[1]: Mounting Kernel Configuration File System...
[    3.365921] systemd[1]: Mounted FUSE Control File System.
[    3.369316] systemd[1]: Mounted Kernel Configuration File System.
[    3.397687] systemd[1]: Mounting Mount unit for bare, revision 5...
[    3.399068] systemd[1]: Mounting Mount unit for core18, revision 2128...
[    3.400368] systemd[1]: Mounting Mount unit for core18, revision 2253...
[    3.402429] systemd[1]: Mounting Mount unit for core20, revision 1270...
[    3.403942] systemd[1]: Mounting Mount unit for gnome-3-34-1804, revision 72...
[    3.405876] systemd[1]: Mounting Mount unit for gnome-3-34-1804, revision 77...
[    3.408791] systemd[1]: Mounting Mount unit for gnome-3-38-2004, revision 87...
[    3.410177] systemd[1]: Mounting Mount unit for gtk-common-themes, revision 1515...
[    3.411536] systemd[1]: Mounting Mount unit for gtk-common-themes, revision 1519...
[    3.415038] systemd[1]: Mounting Mount unit for snap-store, revision 547...
[    3.417231] systemd[1]: Mounting Mount unit for snap-store, revision 558...
[    3.420108] systemd[1]: Mounting Mount unit for snapd, revision 12704...
[    3.422536] systemd[1]: Mounting Mount unit for snapd, revision 14295...

Here is the output from `smartctl`:

=== START OF INFORMATION SECTION ===
Model Number:                       SPCC M.2 PCIe SSD
Serial Number:                      11111111111111111111
Firmware Version:                   EDFM20.0
PCI Vendor/Subsystem ID:            0x1987
IEEE OUI Identifier:                0x6479a7
Total NVM Capacity:                 256,060,514,304 [256 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256,060,514,304 [256 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            6479a7 5580500da2
Local Time is:                      Tue Jan 11 11:09:08 2022 EST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     4.50W       -        -    0  0  0  0        0       0
 1 +     2.70W       -        -    1  1  1  1        0       0
 2 +     2.16W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     1000    1000
 4 -   0.0050W       -        -    4  4  4  4     5000  100000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        23 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    0%
Data Units Read:                    26,898 [13.7 GB]
Data Units Written:                 77,333 [39.5 GB]
Host Read Commands:                 358,393
Host Write Commands:                1,697,160
Controller Busy Time:               33
Power Cycles:                       18
Power On Hours:                     207
Unsafe Shutdowns:                   13
Media and Data Integrity Errors:    0
Error Information Log Entries:      12
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               35 Celsius

Error Information (NVMe Log 0x01, max 16 entries)
No Errors Logged

Here is the output from `journalctl`:

Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:00:1d.0: [8086:9d18] type 01 class 0x060400
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:00:1d.0: PME# supported from D0 D3hot D3cold
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:03:00.0: [1987:5013] type 00 class 0x010802
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:03:00.0: reg 0x10: [mem 0xef000000-0xef003fff 64bit]
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:03:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0000:00:1d.0 (capable of 31.504 Gb/s with 8.0 GT/s P>
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:00:1d.0: PCI bridge to [bus 03]
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:00:1d.0:   bridge window [mem 0xef000000-0xef0fffff]
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:00:1d.0: PCI bridge to [bus 03]
Jan 13 10:44:14 hcker2000-s1 kernel: pci 0000:00:1d.0:   bridge window [mem 0xef000000-0xef0fffff]
Jan 13 10:44:14 hcker2000-s1 kernel: pcieport 0000:00:1d.0: PME: Signaling with IRQ 124
Jan 13 10:44:14 hcker2000-s1 kernel: pcieport 0000:00:1d.0: AER: enabled with IRQ 124
Jan 13 10:44:14 hcker2000-s1 kernel: nvme nvme0: pci function 0000:03:00.0
Jan 13 10:44:14 hcker2000-s1 kernel: nvme nvme0: missing or invalid SUBNQN field.
Jan 13 10:44:14 hcker2000-s1 kernel: nvme nvme0: allocated 128 MiB host memory buffer.
in flag
There may be something in `/var/log/syslog`, but only if the error could be written before the device was remounted as read-only.
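For example, something along these lines could pull the relevant messages out of the current and rotated logs (the keyword list is only a starting point, not exhaustive):

# search the current and previous syslog for remount / read-only related messages
grep -iE 'remount|read-only|ext4' /var/log/syslog /var/log/syslog.1
# older rotated logs are gzip-compressed, so use zgrep for those
zgrep -iE 'remount|read-only|ext4' /var/log/syslog.*.gz
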
cn flag
I grepped that for "error" but didn't find anything.
guiverc avatar
cn flag
I wouldn't only `grep` for the word "error" (unless you also scanned the lines around each match); keywords can be read-only, mount, etc., but I usually find the issues using dates/times, because I'm looking on the following or a subsequent day with an approximate day & time already known. Issues tend to stand out anyway, but it's more than just the single word 'error' that I look for.
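A rough sketch of that approach, assuming you know roughly which day the remount happened (the date and log file below are only placeholders):

# everything syslog recorded on the approximate day it happened
grep '^Jan  9' /var/log/syslog.1 | less
# then cast a wider keyword net than just 'error'
grep -iE 'remount|read-only|i/o error|ext4' /var/log/syslog.1
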
waltinator avatar
it flag
Search all the logs: `sudo journalctl -b 0 /dev/nvme0n1p2`. Read `man journalctl`.
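A few variations on that, assuming persistent journal storage is enabled (i.e. /var/log/journal exists):

# list the boots the journal knows about
sudo journalctl --list-boots
# kernel messages from the previous boot, filtered for remount / read-only
sudo journalctl -b -1 -k | grep -iE 'remount|read-only|nvme'
# or just the error-and-worse entries from that boot
sudo journalctl -b -1 -p err
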
cn flag
Thanks, I have added the output from `journalctl` from right after a reboot when it had the error. I tried to find a way to run the command while the disk is read-only, but no commands will work with it in read-only mode.
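Since a root filesystem that has gone read-only may also prevent the error from being written to the logs at the time, the ext4 superblock itself is another place worth checking; when ext4 trips the errors=remount-ro behaviour it records an error count and the first/last error time there. A quick check against the same partition (offered as a sketch, not mentioned elsewhere in this thread):

# dump the superblock; the state and error-count fields show whether ext4 has recorded problems
sudo tune2fs -l /dev/nvme0n1p2 | grep -iE 'state|error'
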