Bug 2032524 - [RHEL9] [Azure] cloud-init fails to configure the system
Summary: [RHEL9] [Azure] cloud-init fails to configure the system
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: cloud-init
Version: CentOS Stream
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Emanuele Giuseppe Esposito
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks: 2039697
Reported: 2021-12-14 16:05 UTC by Neal Gompa
Modified: 2024-11-20 07:49 UTC

Fixed In Version: cloud-init-21.1-15.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1974262
Clones: 2039697
Environment:
Last Closed: 2022-05-17 12:26:18 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+



Links
System                  | ID                                                     | Private | Priority | Status | Summary | Last Updated
Gitlab                  | redhat/centos-stream/rpms cloud-init merge_requests 22 | 0       | None     | None   | None    | 2021-12-14 16:13:42 UTC
Gitlab                  | redhat/centos-stream/src cloud-init merge_requests 16  | 0       | None     | None   | None    | 2022-01-12 10:52:08 UTC
Red Hat Issue Tracker   | RHELPLAN-105784                                        | 0       | None     | None   | None    | 2021-12-14 16:17:43 UTC
Red Hat Product Errata  | RHBA-2022:2308                                         | 0       | None     | None   | None    | 2022-05-17 12:26:40 UTC

Description Neal Gompa 2021-12-14 16:05:07 UTC
+++ This bug was initially created as a clone of Bug #1974262 +++

Description of problem:
cloud-init service fails when trying to provision on Azure with errors like the following:

2021-04-20 03:33:11,917 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceAzure.DataSourceAzure'> failed

Version-Release number of selected components (if applicable):
21.1-14.el9

How reproducible:
100%

Steps to Reproduce:

(Note: this is with Fedora 34 and 35, as there is no Azure CentOS image yet, but custom-built EL9 images demonstrate the same issue.)

1. Create a Fedora 34 VM ("urn": "tunnelbiz:fedora:fedoraupdate:34.0.1") on Azure
2. Log in and check cloud-init service status

Actual results:
[root@walafedora ~]# systemctl status cloud-init
× cloud-init.service - Initial cloud-init job (metadata service crawler)
     Loaded: loaded (/usr/lib/systemd/system/cloud-init.service; enabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Mon 2021-06-21 14:53:56 +08; 1h 27min ago
    Process: 709 ExecStart=/usr/bin/cloud-init init (code=exited, status=1/FAILURE)
   Main PID: 709 (code=exited, status=1/FAILURE)
        CPU: 535ms

Jun 21 14:53:56 walafedora cloud-init[806]: ci-info: +-------+-------------+---------+-----------+-------+
Jun 21 14:53:56 walafedora cloud-init[806]: ci-info: | Route | Destination | Gateway | Interface | Flags |
Jun 21 14:53:56 walafedora cloud-init[806]: ci-info: +-------+-------------+---------+-----------+-------+
Jun 21 14:53:56 walafedora cloud-init[806]: ci-info: |   2   |  multicast  |    ::   |    eth0   |   U   |
Jun 21 14:53:56 walafedora cloud-init[806]: ci-info: +-------+-------------+---------+-----------+-------+
Jun 21 14:53:56 walafedora cloud-init[806]: 2021-06-21 06:53:56,163 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceNone.DataSourceNone'> failed
Jun 21 14:53:56 walafedora cloud-init[806]: 2021-06-21 06:53:56,174 - util.py[WARNING]: No instance datasource found! Likely bad things to come!
Jun 21 14:53:56 walafedora systemd[1]: cloud-init.service: Main process exited, code=exited, status=1/FAILURE
Jun 21 14:53:56 walafedora systemd[1]: cloud-init.service: Failed with result 'exit-code'.
Jun 21 14:53:56 walafedora systemd[1]: Failed to start Initial cloud-init job (metadata service crawler).

/var/log/cloud-init.log:
2021-04-20 03:33:11,916 - handlers.py[DEBUG]: start: init-local/search-Azure: searching for local data from DataSourceAzure
2021-04-20 03:33:11,916 - __init__.py[DEBUG]: Seeing if we can get any data from <class 'cloudinit.sources.DataSourceAzure.DataSourceAzure'>
2021-04-20 03:33:11,917 - handlers.py[DEBUG]: finish: init-local/search-Azure: FAIL: no local data found from DataSourceAzure
2021-04-20 03:33:11,917 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceAzure.DataSourceAzure'> failed
2021-04-20 03:33:11,917 - util.py[DEBUG]: Getting data from <class 'cloudinit.sources.DataSourceAzure.DataSourceAzure'> failed
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/cloudinit/sources/__init__.py", line 759, in find_source
    s = cls(sys_cfg, distro, paths)
  File "/usr/lib/python3.9/site-packages/cloudinit/sources/DataSourceAzure.py", line 292, in __init__
    sources.DataSource.__init__(self, sys_cfg, distro, paths)
  File "/usr/lib/python3.9/site-packages/cloudinit/sources/__init__.py", line 211, in __init__
    self.ds_cfg = util.get_cfg_by_path(
  File "/usr/lib/python3.9/site-packages/cloudinit/util.py", line 735, in get_cfg_by_path
    if tok not in cur:
TypeError: argument of type 'NoneType' is not iterable
2021-04-20 03:33:11,922 - main.py[DEBUG]: No local datasource found
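
The traceback above ends at `if tok not in cur:`, which raises TypeError whenever `cur` is None. The following is a minimal sketch of that failure mode, not cloud-init's code: `lookup_by_path`, `safe_lookup_by_path`, and the sample config are hypothetical stand-ins, and an empty `datasource` value is only one assumed way to end up with None at that point.

```python
def lookup_by_path(cfg, path, default=None):
    """Walk nested dicts along `path` (hypothetical stand-in for the helper in the traceback)."""
    cur = cfg
    for tok in path:
        if tok not in cur:  # raises TypeError when cur is None
            return default
        cur = cur[tok]
    return cur


def safe_lookup_by_path(cfg, path, default=None):
    """Same walk, but treat any non-dict level as 'not found'."""
    cur = cfg
    for tok in path:
        if not isinstance(cur, dict) or tok not in cur:
            return default
        cur = cur[tok]
    return cur


# Assumed sample: the key exists but its value is empty (None).
broken_cfg = {"datasource": None}

try:
    lookup_by_path(broken_cfg, ("datasource", "Azure"), {})
except TypeError as exc:
    error_message = str(exc)

# The guarded variant falls back to the default instead of raising.
fallback = safe_lookup_by_path(broken_cfg, ("datasource", "Azure"), {})
```

The guarded variant only illustrates how such a lookup can degrade gracefully; it is not the change that resolved this bug.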

Expected results: 
No error in cloud-init

Additional info:

(From the cloned bug 1974262...)

It also fails on AzureStack with Fedora-Cloud-Base-35-1.2.x86_64.qcow2 (same with 34, so not a regression).

[   38.793452] cloud-init[651]: 2021-11-04 15:13:43,593 - azure.py[WARNING]: Error communicating with Azure fabric; You may experience connectivity issues: Unexpected error while running command.
[   38.842623] cloud-init[651]: Command: ['openssl', 'req', '-x509', '-nodes', '-subj', '/CN=LinuxTransport', '-days', '32768', '-newkey', 'rsa:2048', '-keyout', 'TransportPrivate.pem', '-out', 'TransportCert.pem']
[   38.896366] cloud-init[651]: Exit code: -
[   38.906487] cloud-init[651]: Reason: [Errno 2] No such file or directory: b'openssl'
[   38.928738] cloud-init[651]: Stdout: -
[   38.938581] cloud-init[651]: Stderr: -
[   39.178793] cloud-init[651]: 2021-11-04 15:13:44,646 - util.py[WARNING]: Failed partitioning operation
[   39.209486] cloud-init[651]: Error running partition command on /dev/sdb
[   39.231286] cloud-init[651]: 'NoneType' object has no attribute 'encode'
[   43.187537] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.

No SSH key is installed and the system is unusable.


cloud-init's azure.py depends on openssl, but the dependency is missing.
Workaround: virt-customize --install openssl -a Fedora-Cloud-Base-35-1.2.x86_64.qcow2
makes it work.
(There is still a broken service:
Nov 04 15:48:15 fedora systemd[1]: Starting Rebuild Dynamic Linker Cache...
Nov 04 15:48:17 fedora ldconfig[618]: /sbin/ldconfig: Renaming of /etc/ld.so.cache~ to /etc/ld.so.cache failed: Permission denied
but at least ssh is working.)

Looks like the cloud image does not work on Azure :/

--- Additional comment from François Rigault on 2021-11-04 12:17:46 EDT ---

The gdisk package is also needed for the partitioning issue. With both packages installed, cloud-init seems to work as expected.


virt-customize --install gdisk --install openssl -a Fedora-Cloud-Base-35-1.2.x86_64.qcow2

--- Additional comment from Neal Gompa on 2021-12-14 10:52:09 EST ---

The fix here would be to add "gdisk" and "openssl" as required runtime dependencies for cloud-init.
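
Until the package declares those dependencies, an image can be checked for the two command-line tools this report names. This is a rough sketch under assumptions: `REQUIRED_TOOLS` and `missing_tools` are illustrative names, and the check only looks at PATH, not at RPM metadata.

```python
import shutil

# Tools named in this report; the tuple itself is an assumption for illustration.
REQUIRED_TOOLS = ("openssl", "sgdisk")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all required tools found")
```

The `which` parameter exists only so the lookup can be replaced in a test; in normal use the default `shutil.which` is enough.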

--- Additional comment from Neal Gompa on 2021-12-14 10:59:24 EST ---

PR proposed: https://src.fedoraproject.org/rpms/cloud-init/pull-request/23

Comment 2 Huijuan Zhao 2021-12-15 02:35:23 UTC
Tried it with RHEL-9 (cloud-init-21.1-14.el9) on Azure and did not hit this issue, as openssl and gdisk are pre-installed in the image.

Tried to remove openssl and gdisk from RHEL-9. Removing openssl failed because several other packages depend on it, and it is included in rhel-guest-image by default. So there may be no need to add openssl as a cloud-init dependency.

But after removing gdisk, there is an error during the partitioning operation, which should be called by cloud-utils-growpart:
$ cat /var/log/cloud-init.log
---------------------------------
2021-12-15 02:05:09,765 - util.py[WARNING]: Failed partitioning operation
Error running partition command on /dev/sda
'NoneType' object has no attribute 'encode'
2021-12-15 02:05:09,773 - util.py[DEBUG]: Failed partitioning operation
Error running partition command on /dev/sda
'NoneType' object has no attribute 'encode'
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/cloudinit/config/cc_disk_setup.py", line 491, in check_partition_gpt_layout
    out, _err = subp.subp(prt_cmd, update_env=LANG_C_ENV)
  File "/usr/lib/python3.9/site-packages/cloudinit/subp.py", line 253, in subp
    bytes_args = [
  File "/usr/lib/python3.9/site-packages/cloudinit/subp.py", line 254, in <listcomp>
    x if isinstance(x, bytes) else x.encode("utf-8")
AttributeError: 'NoneType' object has no attribute 'encode'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/cloudinit/config/cc_disk_setup.py", line 139, in handle
    util.log_time(logfunc=LOG.debug,
  File "/usr/lib/python3.9/site-packages/cloudinit/util.py", line 2409, in log_time
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/cloudinit/config/cc_disk_setup.py", line 808, in mkpart
    if check_partition_layout(table_type, device, layout):
  File "/usr/lib/python3.9/site-packages/cloudinit/config/cc_disk_setup.py", line 537, in check_partition_layout
    found_layout = get_dyn_func(
  File "/usr/lib/python3.9/site-packages/cloudinit/config/cc_disk_setup.py", line 431, in get_dyn_func
    return globals()[func_name](*func_args)
  File "/usr/lib/python3.9/site-packages/cloudinit/config/cc_disk_setup.py", line 493, in check_partition_gpt_layout
    raise Exception(
Exception: Error running partition command on /dev/sda
'NoneType' object has no attribute 'encode'
---------------------------------
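
The AttributeError in the log above is consistent with a None ending up in the command's argument list. A minimal sketch of how that can happen, assuming the sgdisk path comes from a PATH lookup that returns None when the binary is absent (the traceback does not show how the command is built, so `build_partition_cmd` is hypothetical; `encode_args` mirrors the list comprehension shown in the traceback):

```python
import shutil


def build_partition_cmd(device, which=shutil.which):
    """Build an sgdisk print command; the first element is None if sgdisk is absent."""
    return [which("sgdisk"), "-p", device]


def encode_args(args):
    """Encode every argument to bytes, as the list comprehension in the traceback does."""
    return [x if isinstance(x, bytes) else x.encode("utf-8") for x in args]


# Simulate an image without gdisk by forcing the lookup to fail.
cmd = build_partition_cmd("/dev/sda", which=lambda name: None)

try:
    encode_args(cmd)
except AttributeError as exc:
    error_message = str(exc)
```

Under that assumption the failure surfaces while encoding arguments, before any process is started, which would explain why the log shows a Python error rather than a "command not found" message.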


Neal, did you use Fedora-Cloud-Base-35-1.2.x86_64.qcow2 to hit the issue? Is that the Fedora image that does not have openssl and gdisk pre-installed by default?

According to the tests in RHEL-9 on Azure, IMO we could add gdisk as a cloud-utils-growpart dependency in RHEL. Please correct me if anything is wrong. Thanks!

Comment 3 Neal Gompa 2021-12-15 08:49:50 UTC
(In reply to Huijuan Zhao from comment #2)
> Neal, did you use Fedora-Cloud-Base-35-1.2.x86_64.qcow2 to hit the issue?
> Is that the Fedora image that does not have openssl and gdisk pre-installed
> by default?
> 

That is the case, yes. I also tested with a custom CentOS Stream 9 image that I built, where it's possible to not have the openssl CLI tools installed.

Since cloud-init itself needs openssl and gdisk, it makes sense to guarantee they are always installed alongside it.

> According to the tests in RHEL-9 on Azure, IMO we could add gdisk as a
> cloud-utils-growpart dependency in RHEL. Please correct me if anything is
> wrong. Thanks!

I think the dependency needs to be at the cloud-init level because cloud-init's own code runs sgdisk before handing over to cloud-utils-growpart. If cloud-utils-growpart *also* calls sgdisk, then it needs a gdisk dependency too, but cloud-init *definitely* needs the gdisk dependency.
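
One way to check whether an installed cloud-init package declares both dependencies is to inspect the output of `rpm -q --requires cloud-init`. The sketch below only parses such output; `missing_requires` is an illustrative name and the sample text is made up for the example, not taken from a real package.

```python
def missing_requires(requires_output, wanted=("gdisk", "openssl")):
    """Return wanted package names absent from `rpm -q --requires` style output."""
    declared = set()
    for line in requires_output.splitlines():
        fields = line.split()
        if fields:
            declared.add(fields[0])  # keep the name, drop any version constraint
    return [name for name in wanted if name not in declared]


# Made-up sample output for illustration only.
sample = """\
/usr/bin/python3
gdisk
python3-jinja2
rpmlib(CompressedFileNames) <= 3.0.4-1
"""

result = missing_requires(sample)
```

To run it against a real system, the text could be obtained with `subprocess.run(["rpm", "-q", "--requires", "cloud-init"], capture_output=True, text=True, check=True).stdout`.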

Comment 4 Huijuan Zhao 2021-12-20 04:24:28 UTC
Neal, thanks for the explanation and updates.
It is OK from the QE side to add openssl and gdisk as dependencies at the cloud-init level, thanks!

Comment 18 errata-xmlrpc 2022-05-17 12:26:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: cloud-init), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2308

