Score:1

ZFS pool status unstable

I've been running a ZFS pool on Ubuntu problem-free for years. I'm currently on 20.04.

Since around the beginning of this year I've had to replace 2 of the 4 disks, and even the brand-new disks started showing errors.

I started scrubbing the pool weekly and things were fairly stable: 20-50 read and/or write errors would appear on some disks and the scrub would fix them.

A few days ago, however, one disk was faulted for too many errors, then a second one went degraded. Running a scrub made things worse.

I triggered a scrub today, then realized the disks might be too hot, so I shut down the PC to adjust the fans. After starting it again, zpool status shows this:

 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jun 19 18:44:07 2021
    1.51T scanned at 2.74G/s, 1.29T issued at 2.35G/s, 3.04T total
    2.76G resilvered, 42.42% done, 0 days 00:12:44 to go
config:

    NAME                                           STATE     READ WRITE CKSUM
    ztank                                          DEGRADED     0     0     0
      mirror-0                                     DEGRADED     0     0     0
        ata-ST2000LM003_HN-M201RAD_S34RJ9AFB25570  DEGRADED     0     0     0  too many errors
        ata-ST2000LM003_HN-M201RAD_S362J9EGB75740  ONLINE       0     0     0  (resilvering)
      mirror-1                                     ONLINE       0     0     0
        ata-ST2000DM008-2FR102_ZFL3P2SZ            ONLINE       0     0     0
        ata-TOSHIBA_HDWL120_807APRBUT              ONLINE       0     0     0  (resilvering)
    logs
      zfs_slog                                     ONLINE       0     0     0
    cache
      zfs_l2arc                                    ONLINE       0     0     0

errors: No known data errors

I'm really shocked by what's going on.

Andrew Henle: Have you been using the same power supply?
Asker: Yes, I haven't changed it in more than a year.
Michael Hampton: **Wait for the resilver to complete.**
Score:1

Well, it looks like you answered your own question: the disks were too hot, so they started failing. See if you can recover from that degraded state.

Also, check your RAM by running a full memtest. If it is OK, check the SATA cables too. Check the SMART stats of every disk and run a long self-test (smartctl --test=long) on each of them. And never let your HDDs overheat.
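To keep track of which disks still look unhealthy while you work through those checks, something like the following Python sketch might help. It parses the config table of a zpool status dump (the sample is copied from the question), lists the disks that are not ONLINE, have non-zero counters, or carry a note, and builds the smartctl --test=long command for each ata-* disk. The function names and the assumption that ata-* names live under /dev/disk/by-id are mine, not something ZFS provides.

```python
import re

# Rows that name a vdev group rather than a disk
# (assumption: this covers the common layouts, not every possible one).
GROUP_RE = re.compile(r"^(mirror|raidz[123]?|spare|replacing)-\d+$")

# Config table copied from the question, used here only as sample input.
SAMPLE = """\
config:

    NAME                                           STATE     READ WRITE CKSUM
    ztank                                          DEGRADED     0     0     0
      mirror-0                                     DEGRADED     0     0     0
        ata-ST2000LM003_HN-M201RAD_S34RJ9AFB25570  DEGRADED     0     0     0  too many errors
        ata-ST2000LM003_HN-M201RAD_S362J9EGB75740  ONLINE       0     0     0  (resilvering)
      mirror-1                                     ONLINE       0     0     0
        ata-ST2000DM008-2FR102_ZFL3P2SZ            ONLINE       0     0     0
        ata-TOSHIBA_HDWL120_807APRBUT              ONLINE       0     0     0  (resilvering)
    logs
      zfs_slog                                     ONLINE       0     0     0
    cache
      zfs_l2arc                                    ONLINE       0     0     0

errors: No known data errors
"""


def parse_config(status_text):
    """Return one dict per row of the NAME/STATE/READ/WRITE/CKSUM table."""
    rows = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split(None, 5)
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True
            continue
        if not in_table:
            continue
        if fields and fields[0] == "errors:":
            break
        if len(fields) < 5:
            continue  # blank lines and section labels such as "logs"
        rows.append({
            "name": fields[0], "state": fields[1],
            # Counters stay strings because zpool may abbreviate large values.
            "read": fields[2], "write": fields[3], "cksum": fields[4],
            "note": fields[5].strip() if len(fields) > 5 else "",
        })
    return rows


def suspect_disks(rows):
    """Disks that are not ONLINE, have non-zero counters, or carry a note."""
    suspects = []
    for index, row in enumerate(rows):
        if index == 0 or GROUP_RE.match(row["name"]):
            continue  # skip the pool row and vdev group rows
        counters_clean = row["read"] == row["write"] == row["cksum"] == "0"
        if row["state"] != "ONLINE" or not counters_clean or row["note"]:
            suspects.append(row)
    return suspects


def smartctl_long_tests(rows, by_id_dir="/dev/disk/by-id"):
    """Build (but do not run) a long self-test command for each ata-* disk."""
    return [["smartctl", "--test=long", f"{by_id_dir}/{row['name']}"]
            for row in rows if row["name"].startswith("ata-")]


if __name__ == "__main__":
    table = parse_config(SAMPLE)
    for disk in suspect_disks(table):
        print(disk["name"], disk["state"], disk["note"])
    for command in smartctl_long_tests(table):
        print(" ".join(command))
```

The commands are only printed, not executed, so you can review them before running anything as root. In practice you would feed the function the live output of zpool status instead of the sample.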

Score:0

It turns out the problem was the way I powered my drives. Without noticing, I had put too many drives on a single power rail. Once I distributed them evenly across the power rails, everything went back to normal.
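For anyone who wants to sanity-check their own wiring, here is a rough Python sketch of the arithmetic. Every number in it is a hypothetical placeholder rather than a measurement from my system; take the real per-drive peak current from the drive datasheets and the rail limit from the PSU label.

```python
def rail_load(drives_per_rail, amps_per_drive, rail_limit_amps):
    """Return {rail: (total_amps, within_limit)} for a simple worst-case sum."""
    report = {}
    for rail, drive_count in drives_per_rail.items():
        total_amps = drive_count * amps_per_drive
        report[rail] = (total_amps, total_amps <= rail_limit_amps)
    return report


# Hypothetical values for illustration only.
AMPS_PER_DRIVE = 2.0   # assumed peak draw per drive
RAIL_LIMIT_AMPS = 5.0  # assumed budget per rail

before = rail_load({"rail_a": 4, "rail_b": 0}, AMPS_PER_DRIVE, RAIL_LIMIT_AMPS)
after = rail_load({"rail_a": 2, "rail_b": 2}, AMPS_PER_DRIVE, RAIL_LIMIT_AMPS)

if __name__ == "__main__":
    print("before:", before)
    print("after: ", after)
```

With these made-up figures, four drives on one rail exceed the budget while two per rail stay under it, which is the kind of imbalance I had.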
