Bug 1925213
| Summary: | [Azure][RHEL-8] cloud-init.service costs 2 minutes in DSv4 VM | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Yuxin Sun <yuxisun> |
| Component: | cloud-init | Assignee: | Emanuele Giuseppe Esposito <eesposit> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Huijuan Zhao <huzhao> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 8.4 | CC: | eterrell, huzhao, jgreguske, litian, ribarry, xiachen, yacao |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | --- | Flags: | pm-rhel:
mirror+
|
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-06-08 09:59:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Assigned to Rick for initial triage per bz process and age of bug created or assigned to virt-maint without triage. (In reply to Yuxin Sun from comment #0) > Description of problem: > The DSv4 sizes don't have a temporary(ephemeral) disk. The cloud-init > doesn't handle it properly and costs 2 minutes to wait for the ephemeral > disk. > * This happened in both first boot and reboot. > > Version-Release number of selected components (if applicable): > > RHEL Version: > RHEL8.4 > cloud-init-20.3-10.el8.noarch > > How reproducible: > 100% > > Steps to Reproduce: > 1. Deploy a RHEL-8.4 VM on Azure in Standard_D2s_v4 size. > 2. Login the instance, check cloud-init log > > Actual results: > # systemd-analyze blame > 2min 5.049s cloud-init.service > > cloud-init.log: > 2021-02-04 08:37:53,221 - handlers.py[DEBUG]: start: > azure-ds/address_ephemeral_resize: address_ephemeral_resize > 2021-02-04 08:37:53,221 - handlers.py[DEBUG]: start: > azure-ds/wait-for-ephemeral-disk: wait for ephemeral disk > 2021-02-04 08:37:53,221 - util.py[DEBUG]: Azure ephemeral disk: Waiting up > to 120 seconds for the following files: ['/dev/disk/cloud/azure_resource'] > 2021-02-04 08:39:53,242 - util.py[DEBUG]: Azure ephemeral disk: Still > missing files after 120 seconds: {'/dev/disk/cloud/azure_resource'} > 2021-02-04 08:39:53,242 - DataSourceAzure.py[WARNING]: ephemeral device > '/dev/disk/cloud/azure_resource' did not appear after 120 seconds. > ... > 2021-02-04 08:39:53,551 - util.py[WARNING]: Failed partitioning operation > Device /dev/disk/cloud/azure_resource did not exist and was not created with > a udevadm settle. > 2021-02-04 08:39:53,551 - util.py[DEBUG]: Failed partitioning operation > Device /dev/disk/cloud/azure_resource did not exist and was not created with > a udevadm settle. > Traceback (most recent call last): > File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", > line 141, in handle > func=mkpart, args=(disk, definition)) > File "/usr/lib/python3.6/site-packages/cloudinit/util.py", line 2292, in > log_time > ret = func(*args, **kwargs) > File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", > line 780, in mkpart > assert_and_settle_device(device) > File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", > line 757, in assert_and_settle_device > "with a udevadm settle." % device) > RuntimeError: Device /dev/disk/cloud/azure_resource did not exist and was > not created with a udevadm settle. > 2021-02-04 08:39:53,652 - cc_disk_setup.py[DEBUG]: setting up filesystems: > [{'filesystem': 'ext4', 'device': 'ephemeral0.1'}] > 2021-02-04 08:39:53,653 - cc_disk_setup.py[DEBUG]: ephemeral0.1 is mapped to > disk=/dev/disk/cloud/azure_resource part=1 > 2021-02-04 08:39:53,653 - cc_disk_setup.py[DEBUG]: Creating new filesystem. > 2021-02-04 08:39:53,653 - subp.py[DEBUG]: Running command ['udevadm', > 'settle'] with allowed return codes [0] (shell=False, capture=True) > 2021-02-04 08:39:53,666 - util.py[DEBUG]: Creating fs for > /dev/disk/cloud/azure_resource took 0.014 seconds > 2021-02-04 08:39:53,666 - util.py[WARNING]: Failed during filesystem > operation > Device /dev/disk/cloud/azure_resource did not exist and was not created with > a udevadm settle. > 2021-02-04 08:39:53,667 - util.py[DEBUG]: Failed during filesystem operation > Device /dev/disk/cloud/azure_resource did not exist and was not created with > a udevadm settle. > Traceback (most recent call last): > File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", > line 159, in handle > func=mkfs, args=(definition,)) > File "/usr/lib/python3.6/site-packages/cloudinit/util.py", line 2292, in > log_time > ret = func(*args, **kwargs) > File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", > line 883, in mkfs > assert_and_settle_device(device) > File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", > line 757, in assert_and_settle_device > "with a udevadm settle." % device) > RuntimeError: Device /dev/disk/cloud/azure_resource did not exist and was > not created with a udevadm settle. > > Expected results: > cloud-init.service costs ~10s > > Additional info: @yuxisun can you confirm this happens only on instances that does not have ephemeral disk? Also, can you test for regression on rhel-8.4 as well? It does look like this problem is the same as in: https://bugs.launchpad.net/cloud-init/+bug/1901011 There's already a PR opened for this issue, reviews are positive but no merge yet. I'll ping the maintainers so we can speed up a little bit on this fix. https://github.com/canonical/cloud-init/pull/800 @yuxisun the PR is already merged but it a little late for 8.4. If you agree we could deffer it to 8.5. What do you think? @yuxisun the PR is already merged but it a little late for 8.4. If you agree we could deffer it to 8.5. What do you think? (In reply to Eduardo Otubo from comment #7) > @yuxisun the PR is already merged but it a little late for 8.4. > If you agree we could deffer it to 8.5. What do you think? I don't know what caused the message to duplicate. But we're good. Postponing to rhel-8.5.0 :-) Thanks! |
Description of problem: The DSv4 sizes don't have a temporary(ephemeral) disk. The cloud-init doesn't handle it properly and costs 2 minutes to wait for the ephemeral disk. * This happened in both first boot and reboot. Version-Release number of selected components (if applicable): RHEL Version: RHEL8.4 cloud-init-20.3-10.el8.noarch How reproducible: 100% Steps to Reproduce: 1. Deploy a RHEL-8.4 VM on Azure in Standard_D2s_v4 size. 2. Login the instance, check cloud-init log Actual results: # systemd-analyze blame 2min 5.049s cloud-init.service cloud-init.log: 2021-02-04 08:37:53,221 - handlers.py[DEBUG]: start: azure-ds/address_ephemeral_resize: address_ephemeral_resize 2021-02-04 08:37:53,221 - handlers.py[DEBUG]: start: azure-ds/wait-for-ephemeral-disk: wait for ephemeral disk 2021-02-04 08:37:53,221 - util.py[DEBUG]: Azure ephemeral disk: Waiting up to 120 seconds for the following files: ['/dev/disk/cloud/azure_resource'] 2021-02-04 08:39:53,242 - util.py[DEBUG]: Azure ephemeral disk: Still missing files after 120 seconds: {'/dev/disk/cloud/azure_resource'} 2021-02-04 08:39:53,242 - DataSourceAzure.py[WARNING]: ephemeral device '/dev/disk/cloud/azure_resource' did not appear after 120 seconds. ... 2021-02-04 08:39:53,551 - util.py[WARNING]: Failed partitioning operation Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle. 2021-02-04 08:39:53,551 - util.py[DEBUG]: Failed partitioning operation Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle. Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 141, in handle func=mkpart, args=(disk, definition)) File "/usr/lib/python3.6/site-packages/cloudinit/util.py", line 2292, in log_time ret = func(*args, **kwargs) File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 780, in mkpart assert_and_settle_device(device) File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 757, in assert_and_settle_device "with a udevadm settle." % device) RuntimeError: Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle. 2021-02-04 08:39:53,652 - cc_disk_setup.py[DEBUG]: setting up filesystems: [{'filesystem': 'ext4', 'device': 'ephemeral0.1'}] 2021-02-04 08:39:53,653 - cc_disk_setup.py[DEBUG]: ephemeral0.1 is mapped to disk=/dev/disk/cloud/azure_resource part=1 2021-02-04 08:39:53,653 - cc_disk_setup.py[DEBUG]: Creating new filesystem. 2021-02-04 08:39:53,653 - subp.py[DEBUG]: Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=True) 2021-02-04 08:39:53,666 - util.py[DEBUG]: Creating fs for /dev/disk/cloud/azure_resource took 0.014 seconds 2021-02-04 08:39:53,666 - util.py[WARNING]: Failed during filesystem operation Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle. 2021-02-04 08:39:53,667 - util.py[DEBUG]: Failed during filesystem operation Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle. Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 159, in handle func=mkfs, args=(definition,)) File "/usr/lib/python3.6/site-packages/cloudinit/util.py", line 2292, in log_time ret = func(*args, **kwargs) File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 883, in mkfs assert_and_settle_device(device) File "/usr/lib/python3.6/site-packages/cloudinit/config/cc_disk_setup.py", line 757, in assert_and_settle_device "with a udevadm settle." % device) RuntimeError: Device /dev/disk/cloud/azure_resource did not exist and was not created with a udevadm settle. Expected results: cloud-init.service costs ~10s Additional info: