Score:0

QEMU-KVM, DRBD and corosync - VMs don't work after reboot


I'm running QEMU-KVM virtualization on Debian 9.6. After a power failure this machine simply shut down. After switching it on again I can't start any VM because of this error:

error: internal error: process exited while connecting to monitor: 2022-02-03T12:01:58.403986Z qemu-system-x86_64: -drive file=/dev/drbd6,format=raw,if=none,id=drive-virtio-disk0,cache=none: Could not open '/dev/drbd6': Read-only file system
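
A first, non-destructive check for such an error (blockdev comes from util-linux; the resource name r6 for /dev/drbd6 is an assumption based on the r0-r10 naming further down):

# blockdev --getro /dev/drbd6   # prints 1 if the kernel marks the device read-only
# drbdadm role r6               # prints local/peer role, e.g. Secondary/Secondary

A DRBD device that is Secondary, or that has no attached backing disk, is exposed read-only, which would produce exactly this QEMU error.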

This happens with every one of the 4 VMs. fdisk sees only this:

Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00037a37

Device     Boot    Start        End    Sectors  Size Id Type
/dev/sda1  *        2048   19531775   19529728  9.3G fd Linux raid autodetect
/dev/sda2       19531776   35155967   15624192  7.5G fd Linux raid autodetect
/dev/sda3       35155968 1939451903 1904295936  908G fd Linux raid autodetect


Disk /dev/sdb: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000e1911

Device     Boot    Start        End    Sectors  Size Id Type
/dev/sdb1  *        2048   19531775   19529728  9.3G fd Linux raid autodetect
/dev/sdb2       19531776   35155967   15624192  7.5G fd Linux raid autodetect
/dev/sdb3       35155968 1939451903 1904295936  908G fd Linux raid autodetect


Disk /dev/md0: 9.3 GiB, 9998098432 bytes, 19527536 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/md1: 7.5 GiB, 7998525440 bytes, 15622120 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/md2: 908 GiB, 974998331392 bytes, 1904293616 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

I realized that it's a DRBD problem (or possibly corosync as well), which I didn't even know was set up there, or that this is how it's done. Below is some information, which is the same on both machines:

# service drbd status
● drbd.service - LSB: Control DRBD resources.
   Loaded: loaded (/etc/init.d/drbd; generated; vendor preset: enabled)
   Active: active (exited) since Tue 2022-02-08 11:34:48 CET; 6min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 12711 ExecStop=/etc/init.d/drbd stop (code=exited, status=0/SUCCESS)
  Process: 12793 ExecStart=/etc/init.d/drbd start (code=exited, status=0/SUCCESS)

Feb 08 11:34:47 brain systemd[1]: Starting LSB: Control DRBD resources....
Feb 08 11:34:47 brain drbd[12793]: Starting DRBD resources:[
Feb 08 11:34:47 brain drbd[12793]:      create res: r0 r1 r10 r2 r3 r4 r5 r6 r7 r8 r9
Feb 08 11:34:47 brain drbd[12793]:    prepare disk: r0 r1 r10 r2 r3 r4 r5 r6 r7 r8 r9
Feb 08 11:34:47 brain drbd[12793]:     adjust disk: r0:failed(apply-al:20) r1:failed(apply-al:20) r10:failed(apply-al:20) r2:failed(apply-al:20) r3:failed(apply-al:20) r4:failed(apply-al:20) r5:failed(apply
Feb 08 11:34:47 brain drbd[12793]:      adjust net: r0 r1 r10 r2 r3 r4 r5 r6 r7 r8 r9
Feb 08 11:34:47 brain drbd[12793]: ]
Feb 08 11:34:48 brain drbd[12793]: WARN: stdin/stdout is not a TTY; using /dev/console.
Feb 08 11:34:48 brain systemd[1]: Started LSB: Control DRBD resources..



# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: F7D2F0C9036CD0E796D5958
 0: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 2: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 3: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 4: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 5: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 6: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 7: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 8: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 9: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
10: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
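
All eleven resources report ds:Diskless/Diskless, i.e. DRBD itself is running but has no backing storage attached on either node, which is also why promotion fails below. Once the backing devices exist again, re-attaching a resource would look like this (a sketch, using the standard drbdadm subcommand):

# drbdadm attach r0   # re-attach the backing disk configured for resource r0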

When I try to make one of the disks primary on VE1, it gives me an error:

# drbdadm primary r0
0: State change failed: (-2) Need access to UpToDate data
Command 'drbdsetup-84 primary 0' terminated with exit code 17

On VE2 (secondary) drbdadm secondary r0 works.

# drbdadm up r0
open(/dev/vg0/lv-sheep) failed: No such file or directory
Command 'drbdmeta 0 v08 /dev/vg0/lv-sheep internal apply-al' terminated with exit code 20

And I cannot find /dev/vg0 anywhere. Everything is under /dev/drbd/vg0/by-disk/lv-sheep.
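
A non-destructive way to check whether the volume group is merely not activated (assuming a VG named vg0, as the disk paths suggest):

# pvscan              # list the physical volumes LVM can find
# lvscan              # list each LV and whether it is ACTIVE or inactive
# vgchange -a y vg0   # activate all LVs in vg0, recreating the /dev/vg0/* nodes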

Since these VMs already existed, I don't know whether I should run a command sequence like this (which, as far as I understand, would wipe the existing data):

# drbdadm create-md r0
# drbdadm up r0
# drbdadm primary r0 --force
# mkfs.ext4 /dev/drbd0
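
(For reference, before running anything destructive like the above, the current state can be inspected without touching data; these are standard drbdadm subcommands:)

# drbdadm cstate r0    # connection state to the peer, e.g. Connected
# drbdadm dstate r0    # disk state of local/peer backing device, e.g. Diskless/Diskless
# drbdadm dump-md r0   # print the existing on-disk metadata, if any is reachable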

Has anyone got any thoughts?

EDIT: Additional data

# vgdisplay
  --- Volume group ---
  VG Name               vg0
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  26
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                11
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               908.04 GiB
  PE Size               4.00 MiB
  Total PE              232457
  Alloc PE / Size       177664 / 694.00 GiB
  Free  PE / Size       54793 / 214.04 GiB
  VG UUID               cHjzTE-lZxc-J6Qs-35jD-3kRn-csJx-g5MgNy

# cat /etc/drbd.conf
# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";


# cat /etc/drbd.d/r1.res
resource r1 {
        device          /dev/drbd1;
        disk            /dev/vg0/lv-viewcenter;
        meta-disk       internal;

        startup {
#               become-primary-on both;
        }

        net {
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                cram-hmac-alg sha1;
                shared-secret "T/L0zE/i9eiPI";
        }

        syncer {
                rate 200M;
        }

        on brain {
                address         10.0.0.1:7789;
        }

        on pinky {
                address         10.0.0.2:7789;
        }
}
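
(A quick sanity check after editing a resource file like this is to let drbdadm parse it back:)

# drbdadm dump r1   # parse the configuration and print it in canonical form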
Comment from Matt Kereczman:
Could you add your DRBD configuration(s) to your question? Usually, these are in either `/etc/drbd.d/*.res` or `/etc/drbd.conf`. Also, what does `vgdisplay` output?
@MattKereczman I've added the info you asked for.
Score:1

Everything works now thanks to Matt Kereczman's comment. After running "vgdisplay" I saw the vg0 volume group. The next command I used was "lvdisplay", which listed the logical volumes of all my VMs.

The next step was to run this sequence of commands:

# vgscan --mknodes
File descriptor 8 (pipe:[270576]) leaked on vgscan invocation. Parent PID 15357: bash
Reading volume groups from cache.
Found volume group "vg0" using metadata type lvm2

# vgchange -a y
File descriptor 8 (pipe:[270576]) leaked on vgchange invocation. Parent PID 15357: bash
11 logical volume(s) in volume group "vg0" now active

And all the logical volumes appeared. The next steps were to bring the DRBD resource up, make it primary, and start the VM:

# drbdadm up r6
# drbdadm primary r6
# virsh start VM

And everything started to work fine.
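
For completeness, a sketch of how the recovered state can be verified (r6 and the placeholder domain name VM as used above):

# lvs vg0             # every LV should carry the 'a' (active) attribute
# drbdadm role r6     # should now report Primary on this node
# virsh list --all    # the domain should show as running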
