Description of problem:
During provisioning on the Azure platform, cloud-init cannot find the Azure endpoint server, so it falls back to "DataSourceNone" as its datasource.
cloud-init looks for the Azure endpoint server under "/run/cloud-init/dhclient.hooks". This directory and the files in it are generated by "/etc/NetworkManager/dispatcher.d/cloud-init-azure-hook".
That file is a hook script which NetworkManager invokes during boot. Before doing anything else, the hook script checks for a marker file named "enabled" under /run/cloud-init.
If the marker file does not exist, the hook script does not generate the "/run/cloud-init/dhclient.hooks" directory or the files in it. The marker file is created by the systemd generator "/usr/lib/systemd/system-generators/cloud-init-generator".
Starting with 0.7.9-5, the generator was removed, so the marker file is never created, which causes this issue.
In short:
"/usr/lib/systemd/system-generators/cloud-init-generator" was removed
==> the marker file "/run/cloud-init/enabled" is not generated
==> the NetworkManager hook script does not run the DHCP hook
==> "/run/cloud-init/dhclient.hooks" is not generated
==> cloud-init cannot find the Azure endpoint server
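The chain above can be checked link by link on an affected VM. The following is only a minimal sketch: the function name check_chain and its optional root-prefix argument are hypothetical, added so the check can be tried against a scratch directory, and the three paths are the ones named in this report.

```shell
#!/bin/sh
# Walk the failure chain from this report and stop at the first broken link.
# $1 is a hypothetical root prefix (empty on a real system).
check_chain() {
    root="${1:-}"
    generator="$root/usr/lib/systemd/system-generators/cloud-init-generator"
    marker="$root/run/cloud-init/enabled"
    hooks="$root/run/cloud-init/dhclient.hooks"

    # A missing generator is only reported, because the marker file is
    # what the hook script actually tests.
    if [ ! -e "$generator" ]; then
        echo "generator missing: $generator"
    fi
    if [ ! -e "$marker" ]; then
        echo "marker missing: $marker"
        return 1
    fi
    if [ ! -e "$hooks" ]; then
        echo "hook output missing: $hooks"
        return 2
    fi
    echo "chain intact"
    return 0
}
```

Called without an argument, the function inspects the live system. On a VM hit by this bug it would be expected to stop at the missing marker file.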
The content of "/etc/NetworkManager/dispatcher.d/cloud-init-azure-hook" is below; the key point is the check for /run/cloud-init/enabled in is_enabled():
#!/bin/sh
# This file is part of cloud-init. See LICENSE file for license information.
# This script hooks into NetworkManager(8) via its scripts
# arguments are 'interface-name' and 'action'
#
is_azure() {
    local dmi_path="/sys/class/dmi/id/board_vendor" vendor=""
    if [ -e "$dmi_path" ] && read vendor < "$dmi_path"; then
        [ "$vendor" = "Microsoft Corporation" ] && return 0
    fi
    return 1
}
is_enabled() {
    # only execute hooks if cloud-init is enabled and on azure
    [ -e /run/cloud-init/enabled ] || return 1
    is_azure
}
if is_enabled; then
    case "$1:$2" in
        *:up) exec cloud-init dhclient-hook up "$1";;
        *:down) exec cloud-init dhclient-hook down "$1";;
    esac
fi
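To see which of the two conditions in the hook blocks it on a given machine, the gating logic can be replayed outside NetworkManager. This is a sketch rather than part of cloud-init: explain_hook_gate and its two optional path arguments are hypothetical, and the defaults are the paths the hook script itself uses.

```shell
#!/bin/sh
# Report why the dispatcher hook would or would not run
# "cloud-init dhclient-hook". Both paths can be overridden for dry runs.
explain_hook_gate() {
    marker="${1:-/run/cloud-init/enabled}"
    dmi_path="${2:-/sys/class/dmi/id/board_vendor}"
    vendor=""

    # Same first test as is_enabled(): no marker file, no hook.
    if [ ! -e "$marker" ]; then
        echo "skip: marker $marker is missing"
        return 1
    fi
    # Same vendor test as is_azure().
    if [ -e "$dmi_path" ] && read -r vendor < "$dmi_path" \
        && [ "$vendor" = "Microsoft Corporation" ]; then
        echo "run: marker present and board vendor is Microsoft Corporation"
        return 0
    fi
    echo "skip: board vendor is '$vendor', not Microsoft Corporation"
    return 1
}
```

Keeping the two tests separate makes the output say whether the marker file or the DMI vendor check stopped the hook, which the original script does not report.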
### Why was the generator removed?
See https://bugzilla.redhat.com/show_bug.cgi?id=1440831
Version-Release number of selected component (if applicable):
cloud-init-0.7.9-5.el7.x86_64.rpm (the latest build is 0.7.9-8)
RHEL Version:
RHEL-7.4
How reproducible:
100%
Steps to Reproduce:
1. Prepare a running VM in Azure and install cloud-init-0.7.9-8.el7.x86_64.rpm.
2. Add a new user with sudo privileges who authenticates with a key pair (not a password).
(This step ensures you can still log in to the VM even if provisioning fails.)
3. systemctl enable cloud-{init,init-local,config,final}
4. Edit /etc/waagent.conf and set Provisioning.Enabled=n and Provisioning.UseCloudInit=y.
5. Deprovision this VM with WALA, then use it as a template to create a new VM.
6. After provisioning finishes, log in to the new VM.
7. Check whether cloud-init provisioned the VM successfully.
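Step 4 can be scripted. The helper below is a hedged sketch: set_waagent_option is a hypothetical name, and it assumes that /etc/waagent.conf uses plain key=value lines and that the value contains no "/" characters.

```shell
#!/bin/sh
# Set key=value in a waagent.conf-style file: replace an existing line for
# the key, or append one if the key is absent.
set_waagent_option() {
    conf="$1"; key="$2"; value="$3"
    # Escape dots so the key is matched literally by grep and sed.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}=" "$conf"; then
        tmp=$(mktemp) || return 1
        # Rewrite through a temporary file, then copy back so the
        # original file keeps its ownership and permissions.
        sed "s/^${pattern}=.*/${key}=${value}/" "$conf" > "$tmp" &&
            cat "$tmp" > "$conf"
        rc=$?
        rm -f "$tmp"
        return "$rc"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$conf"
}

# Example for steps 3 and 4 (run as root on the template VM):
#   systemctl enable cloud-init cloud-init-local cloud-config cloud-final
#   set_waagent_option /etc/waagent.conf Provisioning.Enabled n
#   set_waagent_option /etc/waagent.conf Provisioning.UseCloudInit y
```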
Actual results:
cloud-init does not provision the VM successfully (it cannot find the Azure endpoint server and falls back to "DataSourceNone").
......
2017-06-09 11:09:38,303 - azure.py[INFO]: Registering with Azure...
2017-06-09 11:09:38,303 - azure.py[DEBUG]: Finding Azure endpoint...
2017-06-09 11:09:38,304 - util.py[DEBUG]: Reading from /etc/cloud/cloud.cfg (quiet=False)
2017-06-09 11:09:38,304 - util.py[DEBUG]: Read 1150 bytes from /etc/cloud/cloud.cfg
2017-06-09 11:09:38,304 - util.py[DEBUG]: Attempting to load yaml from string of length 1150 with allowed root types (<type 'dict'>,)
2017-06-09 11:09:38,325 - util.py[DEBUG]: Reading from /etc/cloud/cloud.cfg.d/05_logging.cfg (quiet=False)
2017-06-09 11:09:38,325 - util.py[DEBUG]: Read 1821 bytes from /etc/cloud/cloud.cfg.d/05_logging.cfg
2017-06-09 11:09:38,325 - util.py[DEBUG]: Attempting to load yaml from string of length 1821 with allowed root types (<type 'dict'>,)
2017-06-09 11:09:38,334 - util.py[DEBUG]: Attempting to load yaml from string of length 0 with allowed root types (<type 'dict'>,)
2017-06-09 11:09:38,334 - util.py[DEBUG]: load_yaml given empty string, returning default
2017-06-09 11:09:38,335 - azure.py[DEBUG]: /run/cloud-init/dhclient.hooks not found.
2017-06-09 11:09:38,335 - azure.py[DEBUG]: Unable to find endpoint in dhclient logs. Falling back to check lease files
2017-06-09 11:09:38,336 - azure.py[DEBUG]: Looking for endpoint in lease file /var/lib/dhcp/dhclient.eth0.leases
2017-06-09 11:09:38,336 - util.py[DEBUG]: Reading from /var/lib/dhcp/dhclient.eth0.leases (quiet=False)
......
2017-06-09 18:05:38,819 - DataSourceAzure.py[INFO]: Error communicating with Azure fabric; assume we aren't on Azure.
......
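The log excerpt above contains two distinctive messages that can be searched for on a freshly provisioned VM. This is a small sketch, not a definitive check: has_endpoint_failure is a hypothetical helper, and the default log path /var/log/cloud-init.log is the file checked during verification in this report.

```shell
#!/bin/sh
# Return 0 if the log contains both messages quoted in this report:
# the missing dhclient.hooks path and the failed Azure fabric contact.
has_endpoint_failure() {
    log="${1:-/var/log/cloud-init.log}"
    grep -qF '/run/cloud-init/dhclient.hooks not found' "$log" &&
        grep -qF 'Error communicating with Azure fabric' "$log"
}

# Example:
#   if has_endpoint_failure; then echo "endpoint lookup failed"; fi
```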
Expected results:
cloud-init provisions the VM successfully.
Additional info:
I set the wrong status earlier by mistake; my apologies. I have rolled the status back.
I used the new scratch build (https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13443396) to verify on RHEL-7.4-20170616.3.
My steps were:
1. Install cloud-init-0.7.9-9.el7.x86_64.rpm on RHEL 7.4 on the Azure platform,
and install WALA-2.2.12 in the same VM.
2. Add a new user with sudo privileges who authenticates with a key pair.
3. systemctl enable cloud-{init,init-local,config,final}
4. Edit /etc/waagent.conf and set Provisioning.Enabled=n and Provisioning.UseCloudInit=y.
5. Deprovision this VM with WALA, then use it as a template to create a new VM.
6. After provisioning finishes, log in to the new VM.
7. Check /var/log/cloud-init.log.
cloud-init now completes provisioning successfully.
Thanks!
I'm not entirely sure about this sentence in the previous comment:
"And now cloud-init successfully do provision process."
Does this mean that you were able to verify the issue?
Thank you for clarification.
Hi Vratislav,
Yes, I have verified it. The new scratch build (0.7.9-9) resolves the issue.
Sorry that my previous comment was confusing.
Thanks!
Verified: this bug passes on RHEL-7.4-20170621.0 with cloud-init-0.7.9-9.el7.x86_64.rpm on the Azure platform.
cloud-init can successfully find the endpoint on the Azure platform.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2017:2275