Mount Azure virtual hard disks on filesystem during Azure Linux VM creation using Terraform

I am using Terraform to automate VM creation on Microsoft Azure. The Terraform configuration should accept a data disk specification as below (see data_disk_size_gb), create the virtual hard disks, and subsequently mount them at the given filesystem paths.

module "jumphost" {
    count = 1
    source = "../modules/services/jumphost"
    prefix = "${module.global-vars.project}-${var.environment}"
    rg = azurerm_resource_group.rg
    vm_size = "Standard_D2s_v4"
    subnet = azurerm_subnet.web-subnet
    private_ip_address = "10.0.1.250"
    data_disk_size_gb = [ 
        ["/data", 100],
        ["/data2" , 200]
    ]
    admin_username = "zaidwaqi"
    admin_public_key_path = "../id_rsa.pub"
    nsg_allow_tcp_ports = [22]
    public_ip_address = true
}
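
For reference, a sketch of the module input that drives the disk handling (the tuple type is inferred from how data_disk_size_gb is indexed below, so treat it as an assumption):

# ../modules/services/jumphost/variables.tf (sketch)
variable "data_disk_size_gb" {
    description = "List of [mount point, size in GB] pairs"
    type        = list(tuple([string, number]))
    default     = []
}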

The creation of the virtual hard disks and their attachment to the VM are done as below; I believe this part works correctly and is not the cause of the issue.

resource "azurerm_managed_disk" "data-disk" {
    count                = length(var.data_disk_size_gb)
    name                 = "${var.prefix}-${var.service_name}-data-disk-${count.index}"
    location             = var.rg.location
    resource_group_name  = var.rg.name
    storage_account_type = "Standard_LRS"
    create_option        = "Empty"
    disk_size_gb         = var.data_disk_size_gb[count.index][1]
}

resource "azurerm_virtual_machine_data_disk_attachment" "external" {
    count       = length(azurerm_managed_disk.data-disk)
    managed_disk_id  = "${azurerm_managed_disk.data-disk[count.index].id}"  
    virtual_machine_id = azurerm_linux_virtual_machine.vm.id  
    lun        = "${count.index + 10}"  
    caching      = "ReadWrite"  
}

To make use of the attached data disks, a cloud-init configuration is provided to handle partitioning, filesystem creation, and mounting. It is assembled in Terraform via a template_cloudinit_config data source, whose rendered output is passed to the VM's custom_data attribute.

data "template_cloudinit_config" "config" {
  gzip = true
  base64_encode = true
  part {
      filename = "init-cloud-config"
      content_type = "text/cloud-config"
      content = file("../modules/services/${var.service_name}/init.yaml")
  }
  part {
      filename = "init-shellscript"
      content_type = "text/x-shellscript"
      content = templatefile("../modules/services/${var.service_name}/init.sh",
        { 
          hostname = "${var.prefix}-${var.service_name}"
          data_disk_size_gb = var.data_disk_size_gb
        }
      )
  }
}
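
For completeness, the rendered payload is wired into the VM roughly as below (a sketch; the resource's other required arguments are omitted). custom_data must be base64-encoded, which base64_encode = true above already handles:

resource "azurerm_linux_virtual_machine" "vm" {
    name        = "${var.prefix}-${var.service_name}"
    # ... size, image, network interface, admin SSH key, etc. ...
    custom_data = data.template_cloudinit_config.config.rendered
}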

The cloud-init shell script init.sh that accepts these parameters is shown below.

#!/bin/bash

hostnamectl set-hostname ${hostname}

%{ for index, disk in data_disk_size_gb ~}
parted /dev/sd${ split("","bcdef")[index] } --script mklabel gpt mkpart xfspart xfs 0% 100%
mkfs.xfs /dev/sd${ split("","bcdef")[index] }1
partprobe /dev/sd${ split("","bcdef")[index] }1
mkdir -p ${ disk[0] }
mount /dev/sd${ split("","bcdef")[index] }1 ${ disk[0] }
echo UUID=\"`(blkid /dev/sd${ split("","bcdef")[index] }1 -s UUID -o value)`\" ${ disk[0] }        xfs     defaults,nofail         1       2 >> /etc/fstab
%{ endfor ~}
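
Not the root cause here, but worth noting: the script assumes the data disks enumerate as /dev/sdb, /dev/sdc, ... in attachment order, which Azure does not guarantee. On images that ship Azure's udev rules, a sketch that addresses each disk by its LUN instead (matching the lun = count.index + 10 assignment above) would look like this; treat the symlink path as an assumption to verify on your image:

%{ for index, disk in data_disk_size_gb ~}
# stable symlink from Azure's udev rules; LUN matches "count.index + 10" above
dev="/dev/disk/azure/scsi1/lun${index + 10}"
parted "$dev" --script mklabel gpt mkpart xfspart xfs 0% 100%
partprobe "$dev"
udevadm settle
mkfs.xfs "$dev-part1"
mkdir -p ${disk[0]}
mount "$dev-part1" ${disk[0]}
echo "UUID=$(blkid "$dev-part1" -s UUID -o value) ${disk[0]} xfs defaults,nofail 1 2" >> /etc/fstab
%{ endfor ~}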

Upon completion of terraform apply, /data and /data2 are not visible in the df output. I expect to see entries for /dev/sdb1 and /dev/sdc1 with mount points /data and /data2, respectively.

[zaidwaqi@starter-stage-jumphost ~]$ ls /
bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
[zaidwaqi@starter-stage-jumphost ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  8.6M  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda2        30G  1.8G   28G   7% /
/dev/sda1       496M   73M  424M  15% /boot
/dev/sda15      495M  6.9M  488M   2% /boot/efi
tmpfs           798M     0  798M   0% /run/user/1000

Diagnostic information

/var/lib/cloud/instance/scripts/

#!/bin/bash

hostnamectl set-hostname starter-stage-jumphost

parted /dev/sdb --script mklabel gpt mkpart xfspart xfs 0% 100%
mkfs.xfs /dev/sdb1
partprobe /dev/sdb1
mkdir -p /data
mount /dev/sdb1 /data
echo UUID=\"`(blkid /dev/sdb1 -s UUID -o value)`\" /data        xfs     defaults,nofail         1       2 >> /etc/fstab 
parted /dev/sdc --script mklabel gpt mkpart xfspart xfs 0% 100%
mkfs.xfs /dev/sdc1
partprobe /dev/sdc1
mkdir -p /data2
mount /dev/sdc1 /data2
echo UUID=\"`(blkid /dev/sdc1 -s UUID -o value)`\" /data2        xfs     defaults,nofail         1       2 >> /etc/fstab
[zaidwaqi@starter-stage-jumphost scripts]$ 

/var/log/cloud-init.log

Partial content of the log; hopefully the relevant part is shown below.

2021-07-03 05:42:43,635 - cc_disk_setup.py[DEBUG]: Creating new partition table/disk
2021-07-03 05:42:43,635 - util.py[DEBUG]: Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=True)
2021-07-03 05:42:43,651 - util.py[DEBUG]: Creating partition on /dev/disk/cloud/azure_resource took 0.016 seconds
2021-07-03 05:42:43,651 - util.py[WARNING]: Failed partitioning operation
Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle.
2021-07-03 05:42:43,651 - util.py[DEBUG]: Failed partitioning operation
Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle.
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 140, in handle
    func=mkpart, args=(disk, definition))
  File "/usr/lib/python3.6/site-packages/cloudinit/util.py", line 2539, in log_time
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 769, in mkpart
    assert_and_settle_device(device)
  File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 746, in assert_and_settle_device
    "with a udevadm settle." % device)
RuntimeError: Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle.
2021-07-03 05:42:43,672 - cc_disk_setup.py[DEBUG]: setting up filesystems: [{'filesystem': 'ext4', 'device': 'ephemeral0.1'}]
2021-07-03 05:42:43,672 - cc_disk_setup.py[DEBUG]: ephemeral0.1 is mapped to disk=/dev/disk/cloud/azure_resource part=1
2021-07-03 05:42:43,672 - cc_disk_setup.py[DEBUG]: Creating new filesystem.
2021-07-03 05:42:43,672 - util.py[DEBUG]: Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=True)
2021-07-03 05:42:43,684 - util.py[DEBUG]: Creating fs for /dev/disk/cloud/azure_resource took 0.012 seconds
2021-07-03 05:42:43,684 - util.py[WARNING]: Failed during filesystem operation
Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle.
2021-07-03 05:42:43,684 - util.py[DEBUG]: Failed during filesystem operation
Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle.
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 158, in handle
    func=mkfs, args=(definition,))
  File "/usr/lib/python3.6/site-packages/cloudinit/util.py", line 2539, in log_time
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 871, in mkfs
    assert_and_settle_device(device)
  File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 746, in assert_and_settle_device
    "with a udevadm settle." % device)
RuntimeError: Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle.
2021-07-03 05:42:43,684 - handlers.py[DEBUG]: finish: init-network/config-disk_setup: SUCCESS: config-disk_setup ran successfully
2021-07-03 05:42:43,685 - stages.py[DEBUG]: Running module mounts (<module 'cloudinit.config.cc_mounts' from '/usr/lib/python3.6/site-packages/cloudinit/config/cc_mounts.py'>) with frequency once-per-instance
2021-07-03 05:42:43,685 - handlers.py[DEBUG]: start: init-network/config-mounts: running config-mounts with frequency once-per-instance
2021-07-03 05:42:43,685 - util.py[DEBUG]: Writing to /var/lib/cloud/instances/b7e003ce-7ad3-4840-a4f7-06faefed9cb0/sem/config_mounts - wb: [644] 24 bytes

/var/log/cloud-init-output.log

Partial content of the log; hopefully the relevant part is shown below.

Complete!
Cloud-init v. 19.4 running 'modules:config' at Sat, 03 Jul 2021 05:42:46 +0000. Up 2268.33 seconds.
meta-data=/dev/sdb1              isize=512    agcount=4, agsize=6553472 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=26213888, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=12799, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
meta-data=/dev/sdc1              isize=512    agcount=4, agsize=13107072 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=52428288, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=25599, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Cloud-init v. 19.4 running 'modules:final' at Sat, 03 Jul 2021 05:45:28 +0000. Up 2430.85 seconds.
Cloud-init v. 19.4 finished at Sat, 03 Jul 2021 05:45:34 +0000. Datasource DataSourceAzure [seed=/dev/sr0].  Up 2436.88 seconds
My subsequent reading suggests that this is because data disks cannot be attached to an azurerm_linux_virtual_machine at creation time: https://github.com/terraform-providers/terraform-provider-azurerm/issues/6117
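
If that is the cause, the disks would only show up some time after first boot, while the custom_data script runs early in boot. A possible workaround (a sketch, untested against this setup) is to make init.sh wait for each device before partitioning it:

# hypothetical guard to prepend to init.sh: wait up to 5 minutes
# for an attached data disk to appear before touching it
wait_for_disk() {
    local dev=$1
    for _ in $(seq 1 60); do
        [ -b "$dev" ] && return 0
        sleep 5
    done
    echo "timed out waiting for $dev" >&2
    return 1
}

wait_for_disk /dev/sdb && parted /dev/sdb --script mklabel gpt mkpart xfspart xfs 0% 100%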

With parted, I had to reboot the VM to see the disks. Also, the UUID values did not necessarily match what I had initially put in /etc/fstab. After the reboot, I checked with lsblk and blkid to make sure the right information was in /etc/fstab. That is of course not ideal for automation. LVM does not seem to require the reboot. I guess partprobe did not work either.
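
Possibly related: the script in the question runs partprobe against the partition (/dev/sdb1) rather than the whole disk. Re-reading the partition table on the disk itself and letting udev settle before mkfs might avoid the reboot; a sketch of that ordering (an assumption, not something I have verified on Azure):

parted /dev/sdb --script mklabel gpt mkpart xfspart xfs 0% 100%
partprobe /dev/sdb   # re-read the partition table on the whole disk
udevadm settle       # wait for the /dev/sdb1 node to be created
mkfs.xfs /dev/sdb1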

mangohost
