
XFS metadata corruption after reboot


I had a problem on a RAID1 array with 4 disks. We replaced the faulty disk and restarted the server. After the rebuild finished, two CentOS 7 Linux machines did not come up and reported XFS corruption errors. The other machines came up normally. I tried to mount the partition:

# mount /dev/mapper/cs_mbox_opt /mnt
returned: XFS metadata corruption detected at xfs_dir3_leaf_check_init.....

I ran xfs_repair, which said it could not fix the filesystem and told me to use -L. I then ran xfs_repair -L; after many error messages it failed with: Metadata CRC error detected at 0x559d9f7ac1e9. xfs_dir3_block 0x41df0c80/0x1000 corrupt block 0 in directory inode 807368306: junking block Segmentation fault (core dumped)

I dumped the metadata, restored it to an image file, and ran xfs_repair on that image, but I got an error:

Commands:
# xfs_metadump -gwa /dev/mapper/[volume] /tmp/xfsmetadata.img
# xfs_mdrestore -g /tmp/xfsmetadata.img /tmp/xfs_file
# xfs_repair -vf /tmp/xfs_file

Sorry, could not find valid secondary superblock.
See attached images.

[Attached images: xfs_repair -L output; xfs_repair on the restored metadata; mount and xfs_repair error]

At the moment I don't know what else to do. Any tips?

I mentioned the steps above.

shodanshok
Did you really have a 4-way RAID1 (i.e., four copies of your data)? Are you using hardware or software RAID?
Christovam
Hi. It is a RAID1 with 4 disks. We swapped the disks, and HP's iLO reported that the rebuild had completed.
shodanshok
The log you provided shows real data and metadata errors, so I don't think you can recover cleanly. I suggest restoring from backups. Without detailed hardware and setup information, it is not possible to help further.
Christovam
I already made a backup, but I would like to retrieve the data from the day the problem occurred. What information do you need in order to help?
Christovam
To add: the RAID 1 is configured on an HP Smart Array P440 controller. I have 4 disks of 600 GB each.
shodanshok
OK, the HP P440 is a real hardware RAID controller, so it should have had no issue replacing a failed disk. Are you sure it was a RAID1 array? If so, please check the server RAM with memtest86 (you can live-boot it).
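
A hedged sketch of a more cautious way to continue, which nobody in the thread actually ran: before any further repair attempt, copy the raw device to an image file and check only the copy in no-modify mode. The device name is taken from the mount command in the question. The image path /tmp/xfs_copy.img, the DRY_RUN switch, and the run and check_readonly helpers are assumptions made up for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: inspect a damaged XFS volume without writing to it.
# DEV comes from the question; IMG is a hypothetical path and needs
# at least as much free space as the device is large.
DEV="${DEV:-/dev/mapper/cs_mbox_opt}"
IMG="${IMG:-/tmp/xfs_copy.img}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

check_readonly() {
    # 1. Copy the raw device to an image file, continuing past read errors.
    run dd if="$DEV" of="$IMG" bs=1M conv=noerror,sync
    # 2. Check the copy as a regular file (-f) in no-modify mode (-n).
    run xfs_repair -n -f "$IMG"
}

check_readonly
```

Set DRY_RUN=0 only after reviewing the printed commands. Working on a copy leaves the original volume untouched, so a failed repair attempt cannot make the situation worse.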