Score:0

Drive to fail in 24 hours - Possible Hard Drive Failure Soon

Here's what smartctl is saying. How accurate is the 24-hour failure prediction? Are there any successful fixes for the Reallocated_Sector_Ct failure?

Here's what I got from running the attributes command:
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   076   063   044    Pre-fail  Always       -       46668818
  3 Spin_Up_Time            0x0003   095   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       53
  5 Reallocated_Sector_Ct   0x0033   022   022   036    Pre-fail  Always   FAILING_NOW 3214
  7 Seek_Error_Rate         0x000f   089   060   030    Pre-fail  Always       -       10217282021
  9 Power_On_Hours          0x0032   061   061   000    Old_age   Always       -       34447
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       52
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       14
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   063   059   045    Old_age   Always       -       37 (Min/Max 30/41)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       48
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       53
194 Temperature_Celsius     0x0022   037   041   000    Old_age   Always       -       37 (0 20 0 0 0)
195 Hardware_ECC_Recovered  0x001a   112   099   000    Old_age   Always       -       46668818
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
What does it matter how accurate it is? Replace the disk.
Score:3

Your disk has probably been dying for quite some time. Maybe you'll get 48 hours. Maybe a week or a month. Maybe it'll die before I finish typing this out.

This error means that your drive is developing bad sectors, and the disk is running out of spare sectors to replace them with. This is eligible for warranty replacement if the disk is still in warranty.

In any case, as soon as the drive does run out of spare sectors (when the normalized value, currently 022, reaches 000), you will start silently losing data. The time to replace it is before that happens.
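
To make the VALUE/THRESH comparison concrete, here is a minimal Python sketch that parses rows copied from the `smartctl -A` table in the question and flags any attribute whose normalized VALUE is at or below its THRESH. The function names (`parse_smart_attributes`, `failing_attributes`) and the `SAMPLE` string are made up for this illustration and are not part of smartmontools; the column layout is assumed to match the output shown above.

```python
# Rows copied from the smartctl output in the question (a subset).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   076   063   044    Pre-fail  Always       -       46668818
  5 Reallocated_Sector_Ct   0x0033   022   022   036    Pre-fail  Always   FAILING_NOW 3214
  9 Power_On_Hours          0x0032   061   061   000    Old_age   Always       -       34447
190 Airflow_Temperature_Cel 0x0022   063   059   045    Old_age   Always       -       37 (Min/Max 30/41)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text):
    """Parse attribute rows (assumed 10-column layout) into dicts."""
    rows = []
    for line in text.splitlines():
        # Split into at most 10 fields so RAW_VALUE keeps any embedded spaces.
        parts = line.split(None, 9)
        # Skip the header and anything that is not an attribute row.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),    # normalized value, counts down
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "when_failed": parts[8],
            "raw": parts[9].strip(),
        })
    return rows


def failing_attributes(rows):
    """Return rows whose normalized value is at or below the threshold."""
    return [r for r in rows if r["value"] <= r["thresh"]]


if __name__ == "__main__":
    for attr in failing_attributes(parse_smart_attributes(SAMPLE)):
        print(attr["name"], "value", attr["value"],
              "thresh", attr["thresh"], "raw", attr["raw"])
```

On the sample rows only Reallocated_Sector_Ct is flagged: its normalized value of 22 is already below the threshold of 36, and the raw value shows 3214 reallocated sectors.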

Nikita Kipriyanov
In addition to that, I suggest you stop using the faulty disk completely, right now. The next time you power it on should be to copy the data to the replacement media.
larry888
Michael - thanks - I was wondering what the numbers meant. So when spare sectors hit 36 - that's the threshold for the warning? We're still at 22. This is a server drive. I've been trying to get my server guys to set up a new server and migrate the data over. I would love to stop using it right away. Hopefully I'll get this rolling today. I have the sites backed up - but if there's a complete failure then we may have a gap in data - like email. I was hoping to migrate and then rsync it.
Michael Hampton
@larry888 Yes, the threshold (36 of 100) is the point at which FAILING_NOW appears and you get an alarm. The normalized value counts down, so at 22 you are already below it. The drive is actually considered bad and can be RMA'd as soon as even _one_ reallocated sector (raw value) appears.