I'm running a raidz1-0 (RAID5) setup with four 2TB data SSDs.
Around midnight, two of my data disks somehow hit I/O errors (according to /var/log/messages).
When I investigated in the morning, zpool status showed the following:
state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://zfsonlinux.org/msg/ZFS-8000-HC
scan: resilvered 1.36T in 0 days 04:23:23 with 0 errors on Thu Apr 20 21:40:48 2023
config:
    NAME          STATE     READ WRITE CKSUM
    zfs51         UNAVAIL      0     0     0  insufficient replicas
      raidz1-0    UNAVAIL     36     0     0  insufficient replicas
        sdc       FAULTED     57     0     0  too many errors
        sdd       ONLINE       0     0     0
        sde       UNAVAIL      0     0     0
        sdf       ONLINE       0     0     0
errors: List of errors unavailable: pool I/O is currently suspended
I tried running zpool clear, but I keep getting the error message: cannot clear errors for zfs51: I/O error
Next, I tried rebooting to see if that would resolve it, but the system had trouble shutting down.
As a result, I had to do a hard reset. When the system booted back up, the pool was not imported.
Running zpool import zfs51 now returns:
Destroy and re-create the pool from
a backup source.
Even with -f or -F, I get the same error. Strangely, when I run zpool import -F, it shows the pool and all the disks online:
pool: zfs51
id: 12204763083768531851
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
    zfs51         ONLINE
      raidz1-0    ONLINE
        sdc       ONLINE
        sdd       ONLINE
        sde       ONLINE
        sdf       ONLINE
However, when I import by pool name, the same error appears.
I even tried -fF, which doesn't work either.
After trawling through Google and reading up on various ZFS issues, I stumbled upon the -X flag (which has resolved similar issues for other users).
I went ahead and ran zpool import -fFX zfs51, and the command seems to be taking a long time. I noticed high read activity on all 4 data disks, which I assume is due to ZFS reading the entire pool. But after 7 hours, all read activity on the disks stopped.
I also noticed a ZFS kernel panic message:
kernel:PANIC: zfs: allocating allocated segment(offset=6859281825792 size=49152) of (offset=6859281825792 size=49152)
Currently, the zpool import -fFX zfs51 command still appears to be running (the terminal has not returned a prompt). However, there doesn't seem to be any activity on the disks. Running zpool status in another terminal also hangs.
I'm not sure what to do at the moment. Should I continue waiting (it has been almost 14 hours since I started the import command), or should I do another hard reset/reboot?
Also, I read that I could potentially import the pool read-only (zpool import -o readonly=on -f POOLNAME) and salvage the data. Can anyone advise on that?
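For reference, the read-only attempt I have in mind would look roughly like the sketch below. I have not run it yet, so treat it as my assumption of the steps rather than a known-good procedure. The /mnt/rescue alternate root is a placeholder I made up, and the script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set RUN=1 to actually execute them.
POOL="zfs51"            # pool name from this post
ALTROOT="/mnt/rescue"   # hypothetical alternate root, not part of my current setup

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Import read-only so nothing is written to the pool;
# -R mounts the datasets under ALTROOT instead of their usual mount points.
run zpool import -o readonly=on -f -R "$ALTROOT" "$POOL"

# If the import succeeds, check the pool state and list the datasets
# before copying anything off.
run zpool status "$POOL"
run zfs list -r "$POOL"
```

If the import works, my plan would be to copy the data off to separate storage before trying anything else on the pool.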
I'm guessing two of my data disks failed (somehow at the same time). How likely is that, or could this be a ZFS issue?