Score:0

Recovering a nested PV


On my Proxmox 6.4 host, I had an LVM Thin pool that was 250GB large. I created an Ubuntu VM (which used LVM for the root partition as well) on it, but accidentally oversubscribed it, so the PV inside the VM was set to 500GB.

Everything ran great for a while, until I went over the hidden 250GB limit: the VM crashed with an I/O error and now refuses to boot. So now I'm trying to recover the disk. The partition table of the disk appears to be intact:

$ fdisk -l /dev/vm-disks/vm-101-disk-0
Disk /dev/vm-disks/vm-101-disk-0: 500 GiB, 536870912000 bytes, 1048576000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: 30874BBC-0B29-4083-B5BF-E973C665D87F

Device                          Start        End    Sectors  Size Type
/dev/vm-disks/vm-101-disk-0p1    2048       4095       2048    1M BIOS boot
/dev/vm-disks/vm-101-disk-0p2    4096    2101247    2097152    1G Linux filesystem
/dev/vm-disks/vm-101-disk-0p3 2101248 1048573951 1046472704  499G Linux filesystem

I've run

$ kpartx -a /dev/vm-disks/vm-101-disk-0

to create /dev/mapper entries for the 3 partitions inside vm-101-disk-0, and that works. If I run:

$ file -sL /dev/mapper/vm--disks-vm--101--disk--0p3
/dev/mapper/vm--disks-vm--101--disk--0p3: LVM2 PV (Linux Logical Volume Manager), UUID: fdOzWR-sPcy-hyYo-Lj2H-YEnZ-wK3c-J6biES, size: 535794024448

Then I can see the PV inside the 3rd partition of the disk. But how can I mount it somewhere on the host to start recovering data? Obviously pvscan from the host system doesn't see it, since it's inside another LV. Do I have any options at all here for recovery, or has the fact that the VM thought it had 500GB when it actually didn't damaged this beyond repair?

Boot the VM from a recovery CD image.
Nikita Kipriyanov:
First you must restore thin LVM operation on the host, i.e. enlarge the thin pool with a simple `lvextend`. Only once this step is done and you have some free space in the thin pool can you proceed with restoring the VM, which shouldn't be very difficult: just boot the VM into a recovery environment and run the appropriate fsck. I repeat: don't try to proceed until the thin LVM on the host is fixed!
Just run `vgchange -ay` to activate the volumes... the host should see them just fine.
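For anyone finding this later, the activation sequence on the host looks roughly like this (device paths follow the question; the guest VG/LV names are assumptions — a default Ubuntu install uses ubuntu-vg/ubuntu-lv, check with `vgs`/`lvs`):

```shell
# Map the partitions inside the VM disk (as in the question)
kpartx -a /dev/vm-disks/vm-101-disk-0

# Rescan so LVM notices the PV inside partition 3
pvscan --cache

# Activate the guest's volume group on the host
vgchange -ay

# The guest's LVs now appear under /dev/mapper; mount read-only
# for recovery (ubuntu-vg/ubuntu-lv is the assumed guest naming)
mount -o ro /dev/ubuntu-vg/ubuntu-lv /mnt/recovery
```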
Thank you so much, @NikitaKipriyanov! You've really saved my day :) Did you want to convert your comment to an answer so I can accept it and you can get some rep? If not, I'll just write out the steps I took as a self-answer.
Score:2

The VM just got write I/O errors when the space in the thin pool was exhausted. To the VM, this looks like a hard disk that unexpectedly denied all writes. If this were bare hardware, the first action would be to find a new hard disk and clone the bad one onto it. Only after the hardware is fixed would you repair the logical structures.

In the case of virtual machines, you don't have any broken hardware; you can "fix" the "hard disk" by restoring thin volume operation. Just enlarge the thin pool: use lvextend on the thin pool LV to add some space.
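For example (the VG name vm-disks matches the question, but the thin pool LV name data is an assumption — check with `lvs` first):

```shell
# Inspect current pool usage; Data% near 100 confirms exhaustion
lvs -a -o lv_name,vg_name,lv_size,data_percent,metadata_percent vm-disks

# Add 100 GiB to the thin pool (assumes the pool LV is vm-disks/data;
# the VG must have that much free space, check with vgs)
lvextend -L +100G vm-disks/data
```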

And, when that is done, boot the VM from some recovery (virtual) media and do a standard file system recovery. Remember, there shouldn't be much difficulty; modern filesystems are generally designed to withstand this kind of failure.
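Inside the recovery environment, the guest's own LVM volumes must be activated before running fsck. A sketch, assuming the Ubuntu default VG/LV names and a virtio disk (adjust both to what `lvs` and `lsblk` actually show):

```shell
# From the recovery ISO booted inside the VM:
vgchange -ay                         # activate the guest's volume group
fsck -f /dev/ubuntu-vg/ubuntu-lv     # check/repair the root filesystem
fsck -f /dev/vda2                    # and the separate /boot partition
```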


Monitor the thin LVM. While data space exhaustion is not such a big problem, metadata exhaustion can have a much bigger impact. Don't allow that to happen.
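A minimal way to keep an eye on it, again assuming the pool is vm-disks/data:

```shell
# Data% and Meta% columns show the fill level of the thin pool
lvs -o lv_name,data_percent,metadata_percent vm-disks/data

# Optionally let LVM auto-extend the pool when it crosses 80% full
# (needs free space in the VG): in /etc/lvm/lvm.conf set
#   thin_pool_autoextend_threshold = 80
#   thin_pool_autoextend_percent = 20
```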
