The filesystem is on an LVM RAID5. It appears to be working correctly:
$ sudo pvs
[sudo] password for jrwren:
PV VG Fmt Attr PSize PFree
/dev/sda2 datavg lvm2 a
/dev/sdb2 datavg lvm2 a
/dev/sdc2 datavg lvm2 a
/dev/sdd2 datavg lvm2 a
/dev/sde2 datavg lvm2 a
/dev/sdf1 datavg lvm2 a
/dev/sdg2 datavg lvm2 a
/dev/sdh2 datavg lvm2 a
/dev/sdi2 datavg lvm2 a
$ sudo lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lxd2 datavg -wi-ao
mirrored datavg -wi-ao
m datavg Rwi-aor
m3 datavg Rwi-aor
mu datavg Rwi-aor
nomirror datavg -wi-ao
photos datavg Rwi-aor
stor datavg Rwi-aor
storj datavg -wi-ao
t datavg Rwi-aor
t2 datavg Rwi-aor
I have a process doing many reads on logical volume named m. This is device dm-12. Eventually, it just dies with the following kernel messages.
Jun 30 16:02:33 delays kernel: [393661.035286] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: com[68/1946]t main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.039726] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.044175] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.048584] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.054717] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.060977] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.063736] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.066283] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.068773] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:02:33 delays kernel: [393661.071232] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365192: comm rtorrent main: pblk 765519712 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
I unmount the filesystem and run e2fsck:
$ sudo e2fsck -p /dev/datavg/m
movies contains a file system with errors, check forced.
movies: Inode 118751237 has an invalid extent node (blk 475078659, lblk 0)
movies: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
$ sudo e2fsck -y /dev/datavg/movies
e2fsck 1.45.7 (28-Jan-2021)
movies contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 177471496 has an invalid extent node (blk 709943175, lblk 0)
Clear? yes
...
Pass 1E: Optimizing extent trees
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(709943175--709943176) -(868210688--868212735) -(868214784--868216831) -(868253696--868255743) -(868257792--868259839) -(868886528--868888575) -(868892672--868894719) -(868896768--868898815) -(868900864--868902911) -(868904960--868907007) -(868909056--868911103) -(868913152--868917247) -(868921344--868923391) -(868925440--868927487) -(868929536--868931583) -(868933632--868935679) -(868937728--868939775) -(868941824--868943871) -(868945920--868947967) -(868950016--868954111) -(868958208--868960013) -(869894144--869922573)
Fix? yes
Free blocks count wrong for group
Fix? yes
Free blocks count wrong for group
Fix? yes
Free blocks count wrong for group
Fix? yes
Free blocks count wrong for group
Fix? yes
Free blocks count wrong for group
Fix? yes
Free blocks count wrong for group
Fix? yes
Free blocks count wrong for group
Fix? yes
Free blocks count wrong (366951912, counted=367025158).
Fix? yes
movies: ***** FILE SYSTEM WAS MODIFIED *****
movies: 6896/236224512 files (20.8% non-contiguous), 577868794/944893952 blocks
$ sudo e2fsck -p /dev/datavg/movies
movies: clean, 6896/236224512 files, 577868794/944893952 blocks
It says it is clean, so I remount it and rerun the reading software.
And a few minutes later:
Jun 30 16:34:49 delays kernel: [395595.309814] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365190: comm rtorrent main: pblk 765517692 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:34:49 delays kernel: [395595.317838] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365190: comm rtorrent main: pblk 765517692 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:34:49 delays kernel: [395595.320836] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365190: comm rtorrent main: pblk 765517692 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:34:49 delays kernel: [395595.323418] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365190: comm rtorrent main: pblk 765517692 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:35:14 delays kernel: [395619.785771] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365190: comm rtorrent main: pblk 765517692 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Jun 30 16:35:14 delays kernel: [395619.793135] EXT4-fs error (device dm-12): ext4_find_extent:885: inode #191365190: comm rtorrent main: pblk 765517692 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
What is going on here? Is the LVM corrupt and lying to me? Is there a command I can run to check? Should I run a badblocks (e2fsck -c) or something?
There are no corresponding LVM messages from the kernel. I'd expect LVM errors if the underlying disks had problems. What is going on?
update: someone asked for dmesg output. That is exactly what is above with the EXT4-fs messages. The only other messages in dmesg output other than standard boot messages is this repeated:
[527724.593062] rptaddrs[3948921]: segfault at 7ffc7a7a50b5 ip 00007fd9f0f86820 sp 00007ffc7a7a3fc8 error 4 in libc-2.28.so[7fd9f0e4c000+148000] [527724.593075] Code: 7f 07 c5 fe 7f 4f 20 c5 fe 7f 54 17 e0 c5 fe 7f 5c 17 c0 c5 f8 77 c3 48 39 f7 0f 87 ab 00 00 00 0f 84 e5 fe ff ff c5 fe 6f 26 <c5> fe 6f 6c 16 e0 c5 fe 6f 74 16 c0 c5 fe 6f 7c 16 a0 c5 7e 6f 44