Bug 2036060 - [cloud-init][ESXi][RHEL-9] Failed to config static IP according to VMware Customization Config File
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: cloud-init
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Eduardo Otubo
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-12-29 15:25 UTC by Huijuan Zhao
Modified: 2022-05-17 12:34 UTC
CC List: 16 users

Fixed In Version: cloud-init-21.1-19.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-17 12:26:18 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
cloud-init.log (97.55 KB, text/plain)
2021-12-29 15:25 UTC, Huijuan Zhao


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | canonical cloud-init pull 1180 | 0 | None | open | Fix NetworkManagerActivator error | 2022-01-28 09:05:19 UTC
Red Hat Issue Tracker | RHELPLAN-106668 | 0 | None | None | None | 2021-12-29 15:25:49 UTC
Red Hat Product Errata | RHBA-2022:2308 | 0 | None | None | None | 2022-05-17 12:26:40 UTC

Description Huijuan Zhao 2021-12-29 15:25:07 UTC
Created attachment 1848217 [details]
cloud-init.log

Description of problem:
Configure a static IP in the Customization Config File and clone a VM with this customization file. The VM does not get the static IP, but gets an IP from the DHCP server instead.

Version-Release number of selected components (if applicable):
kernel-5.14.0-39.el9.x86_64
cloud-init-21.1-14.el9.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create VM_template with cloud-init installed
2. Config static IP in customization file
3. Clone VM_new with above customization file
4. Boot the VM_new and check the IP


Actual results:
The VM_new gets IP 10.73.197.6 from DHCP server

# ip a s
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:a7:25:98 brd ff:ff:ff:ff:ff:ff
    altname enp11s0
    inet 10.73.197.6/22 brd 10.73.199.255 scope global dynamic noprefixroute ens192
       valid_lft 84846sec preferred_lft 84846sec
    inet6 2620:52:0:49c4:250:56ff:fea7:2598/64 scope global dynamic noprefixroute
       valid_lft 2591996sec preferred_lft 604796sec
    inet6 fe80::250:56ff:fea7:2598/64 scope link noprefixroute
       valid_lft forever preferred_lft forever


# cat /etc/sysconfig/network-scripts/ifcfg-ens192
# Created by cloud-init on instance boot automatically, do not edit.
#
BOOTPROTO=none
DEVICE=ens192
HWADDR=00:50:56:a7:25:98
IPADDR=192.168.10.23
IPV6ADDR=2001::1/63
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6_FORCE_ACCEPT_RA=no
NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
USERCTL=no



# cat /var/log/cloud-init.log

-----------------------------------

2021-12-29 08:01:52,949 - config_file.py[INFO]: Parsing the config file /var/run/vmware-imc/cust.cfg.
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: FOUND CATEGORY = 'NETWORK'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|NETWORKING' = 'yes'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|BOOTPROTO' = 'dhcp'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|HOSTNAME' = 'huzhao-90-static-IP'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|DOMAINNAME' = 'redhat.com'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: FOUND CATEGORY = 'NIC-CONFIG'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC-CONFIG|NICS' = 'NIC1'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: FOUND CATEGORY = 'NIC1'
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|MACADDR' = '00:50:56:a7:25:98'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|ONBOOT' = 'yes'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv4_MODE' = 'BACKWARDS_COMPATIBLE'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|BOOTPROTO' = 'static'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPADDR' = '192.168.10.23'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|NETMASK' = '255.255.255.0'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6ADDR|1' = '2001::1'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6NETMASK|1' = '63'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: FOUND CATEGORY = 'DNS'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|DNSFROMDHCP' = 'no'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|NAMESERVER|1' = '1.2.3.4'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: FOUND CATEGORY = 'DATETIME'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DATETIME|TIMEZONE' = 'Africa/Abidjan'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DATETIME|UTC' = 'yes'
2021-12-29 08:01:52,951 - DataSourceOVF.py[DEBUG]: Found VMware Customization Config File at /var/run/vmware-imc/cust.cfg

-----------------------------------
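
To compare what the customization file asked for with what ended up on the interface, the KEY-VAL lines in the log above can be collected into a dictionary. This is only a rough sketch: the regular expression and the helper name `parse_keyvals` are my own, written against the log format shown above, and are not part of cloud-init.

```python
import re

# Matches the "ADDED KEY-VAL" debug lines exactly as they appear in the log above.
KEYVAL = re.compile(r"ADDED KEY-VAL :: '([^']*)' = '([^']*)'")

def parse_keyvals(log_text):
    """Return a dict of 'CATEGORY|KEY' -> value found in the log text."""
    return dict(KEYVAL.findall(log_text))

# A few lines copied from the log above.
sample = """\
2021-12-29 08:01:52,950 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|BOOTPROTO' = 'dhcp'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|BOOTPROTO' = 'static'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPADDR' = '192.168.10.23'
2021-12-29 08:01:52,951 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|NETMASK' = '255.255.255.0'
"""

config = parse_keyvals(sample)
print(config["NIC1|BOOTPROTO"], config["NIC1|IPADDR"])
```

The NIC1 values match the ifcfg-ens192 file shown earlier, which suggests the customization file was parsed and written out correctly and the problem lies in which profile gets activated.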


Expected results:
VM_new should configure the static IP 192.168.10.23 according to the customization file

Additional info:
No such issue in RHEL-8.6(cloud-init-21.1-11.el8)

Comment 1 Huijuan Zhao 2021-12-30 03:41:12 UTC
Quick update:

This seems to be related to the network configuration change in RHEL-9: RHEL-9 uses NetworkManager by default instead of the ifcfg-xx files, so cloud-init fails to configure the network through ifcfg-xx. ifcfg-xx will be deprecated in RHEL-9, so cloud-init may need to change the way it configures the network.

Other tests:
1. /etc/NetworkManager/system-connections/ens192.nmconnection takes effect by default (the network can also be configured with nmcli)
2. The ifcfg-xx file takes effect (the static IP is configured successfully) after running: # nmcli con down ens192

Comment 2 Pengpeng Sun 2022-01-04 08:34:55 UTC
Cloud-init enables ifcfg-rh plugin and writes network configuration to network-script file ifcfg-xx, but this fails to load network configuration correctly in rhel-9. See cloud-init related code here: https://github.com/canonical/cloud-init/blob/main/cloudinit/net/sysconfig.py#L68

Comment 12 Pengpeng Sun 2022-01-24 07:51:22 UTC
There is a solution that I verified locally: adding 'AUTOCONNECT_PRIORITY: 999' to the network-scripts ifcfg-xx files. NetworkManager will then activate network-scripts (ifcfg-rh) instead of the keyfile, since network-scripts has the higher autoconnect priority.

Modified cloud-init file /usr/lib/python3.9/site-packages/cloudinit/net/sysconfig.py
L293 - 'BOOTPROTO': 'none'
     + 'BOOTPROTO': 'none', 'AUTOCONNECT_PRIORITY': 999


Could the Red Hat team please review and confirm whether this is a workable solution?
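
For illustration only, here is roughly what the change means for the generated file. This is a simplified sketch, not the actual code in cloudinit/net/sysconfig.py: `render_ifcfg` and `iface_defaults` are hypothetical names, and the interface values are taken from the ifcfg-ens192 file in the description.

```python
def render_ifcfg(settings):
    """Render a dict of ifcfg keys as KEY=value lines, sorted by key."""
    return "".join(f"{key}={value}\n" for key, value in sorted(settings.items()))

# Defaults as in the proposed change: BOOTPROTO plus the new priority key.
iface_defaults = {"BOOTPROTO": "none", "AUTOCONNECT_PRIORITY": 999}

# Per-interface values copied from the ifcfg-ens192 file in the description.
iface = dict(
    iface_defaults,
    DEVICE="ens192",
    IPADDR="192.168.10.23",
    NETMASK="255.255.255.0",
    ONBOOT="yes",
)

ifcfg_text = render_ifcfg(iface)
print(ifcfg_text)
```

With the extra default, every ifcfg file rendered this way carries an AUTOCONNECT_PRIORITY=999 line next to BOOTPROTO=none.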

Comment 13 Huijuan Zhao 2022-01-25 08:36:49 UTC
(In reply to Pengpeng Sun from comment #12)

Thanks Pengpeng, this solution works well in my testing.

Comment 14 Eduardo Otubo 2022-01-25 14:20:14 UTC
(In reply to Pengpeng Sun from comment #12)

There's another solution pending rework on github: https://github.com/canonical/cloud-init/pull/1180
But perhaps this one could also be a fix for the problem. Are you sending a pull request for this?

Comment 15 Pengpeng Sun 2022-01-26 10:28:16 UTC
(In reply to Eduardo Otubo from comment #14)

Thanks for the info, I've just sent a pull request on the 'AUTOCONNECT_PRIORITY' solution: https://github.com/canonical/cloud-init/pull/1212

Comment 16 Eduardo Otubo 2022-01-31 09:16:12 UTC
@huzhao since you're the reporter, do you think you can set this BZ to public? Cloud-init maintainers would like to take a look at least at the public comments on this BZ.

Thanks!

Comment 17 Eduardo Otubo 2022-02-04 13:31:20 UTC
This issue is being discussed in 3 different PRs:

 - Set the highest autoconnect priority for network-scripts #1212
   https://github.com/canonical/cloud-init/pull/1212

 - Add native NetworkManager support #1224
   https://github.com/canonical/cloud-init/pull/1224

 - Fix NetworkManagerActivator error #1180
   https://github.com/canonical/cloud-init/pull/1180    

I'll be working with Lubomir to make sure we merge the NetworkManager support as it seems to be the best fit for the case.

Comment 18 Huijuan Zhao 2022-02-07 01:06:25 UTC
(In reply to Eduardo Otubo from comment #16)

Yes, I think so, setting this BZ as public

Comment 19 Lubomir Rintel 2022-02-07 12:51:22 UTC
(In reply to Eduardo Otubo from comment #17)
> This issue is being discussed in 3 different PR: 
> 
>  - Set the highest autoconnect priority for network-scripts #1212
>    https://github.com/canonical/cloud-init/pull/1212

What is being done in this pull request seems incorrect to me. If the system operator creates /etc/NetworkManager/system-connections/ens192.nmconnection, the cloud-init shouldn't just beat it with a higher autoconnect priority. Quite the contrary, it makes every sense for it to extend/override configuration applied on cloud instance bootstrap.

The question here is, why does ens192.nmconnection exist if it is not to be activated?

Does the installer create it? Is there some tooling specific to ESXi involved?

Comment 20 Pengpeng Sun 2022-02-07 16:10:00 UTC
(In reply to Lubomir Rintel from comment #19)

No, ESXi is not involved here. In my experiments with the RHEL9 Beta image, this issue does not reproduce on a RHEL9 VM installed from the RHEL9 Beta ISO by following the installation wizard step by step, but it does reproduce on a RHEL9 VM installed by kickstart. The reason is that 'autoconnect-priority=-999' is set in the keyfile on the VM installed from the ISO, while the keyfile on the VM installed by kickstart has no such setting, so 'autoconnect-priority' defaults to 0.
/etc/NetworkManager/system-connections/ens192.nmconnection exists in both cases; I think it is created by RHEL9 itself. The 'autoconnect-priority=-999' looks like a backward-compatibility setting to me: once a network-scripts file is created, it gets the default 'autoconnect-priority' of 0, and network-scripts is then activated since 0 > -999. The idea for PR #1212 came from this.

For cloud-init, I agree that PR #1224 is the best choice for RHEL9 or Fedora33 which prefer keyfile by default.
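
The priority comparison described above can be modeled in a few lines. This is a toy sketch, not NetworkManager code: the two keyfile bodies are hypothetical minimal examples (real keyfiles carry more settings), and `keyfile_priority` and `preferred` are names made up for the illustration. Ties are deliberately left unresolved here; in the kickstart case in this thread, the keyfile was the profile observed to be active.

```python
import configparser

# Hypothetical minimal keyfiles for the two installation methods.
KEYFILE_FROM_ISO = """\
[connection]
id=ens192
type=ethernet
autoconnect-priority=-999
"""

KEYFILE_FROM_KICKSTART = """\
[connection]
id=ens192
type=ethernet
"""

def keyfile_priority(text):
    """Read autoconnect-priority from keyfile text, defaulting to 0."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getint("connection", "autoconnect-priority", fallback=0)

def preferred(profiles):
    """Return the names of the profiles that share the highest priority."""
    top = max(profiles.values())
    return sorted(name for name, prio in profiles.items() if prio == top)

# ifcfg file without AUTOCONNECT_PRIORITY: priority defaults to 0.
iso_case = preferred({"keyfile": keyfile_priority(KEYFILE_FROM_ISO), "ifcfg": 0})
kickstart_case = preferred({"keyfile": keyfile_priority(KEYFILE_FROM_KICKSTART), "ifcfg": 0})
# ifcfg file with AUTOCONNECT_PRIORITY=999, as proposed in PR #1212.
with_pr_1212 = preferred({"keyfile": keyfile_priority(KEYFILE_FROM_KICKSTART), "ifcfg": 999})

print(iso_case, kickstart_case, with_pr_1212)
```

In this model the ISO install and the PR #1212 case both select only the ifcfg profile, while the kickstart install leaves the two profiles tied at 0.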

Comment 21 Pengpeng Sun 2022-02-07 16:41:35 UTC
(In reply to Pengpeng Sun from comment #20)

One concern about PR #1224: will RHEL8.x cloud-init be aligned with upstream? If so, will PR #1224 create a problem on RHEL8, since keyfile is not the default choice there?

Comment 22 Lubomir Rintel 2022-02-07 17:50:03 UTC
So the real problem is that installation with kickstart creates /etc/NetworkManager/system-connections/ens192.nmconnection when it shouldn't, right?

I'm wondering if you could share the .ks file you're using?

Comment 23 Pengpeng Sun 2022-02-08 04:24:41 UTC
As I understand it, starting with RHEL9, /etc/NetworkManager/system-connections/ens192.nmconnection rather than /etc/sysconfig/network-scripts/ifcfg-ens192 will be created, no matter how the OS is installed.

For my testing, Ansible helps to deploy the OS; here are the network-related commands:
# Network information
network --bootproto=dhcp --ipv6=auto
network --hostname=localhost.localdomain

The whole ks.cfg file is https://github.com/vmware/ansible-vsphere-gos-validation/blob/main/autoinstall/RHEL/8/server_with_GUI/ks.cfg

Comment 24 Lubomir Rintel 2022-02-08 12:23:26 UTC
Thanks for the response. Quick check of the installer source suggests that you're indeed correct about the installer always copying the configuration used for installation into the target system.

I suggest you wipe /etc/NetworkManager/system-connections/ clean in a %post scriptlet.
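
A minimal sketch of such a cleanup, in case the scriptlet is written in Python (kickstart's %post section accepts an --interpreter option). The helper name `wipe_keyfiles` is hypothetical, and unlike a full wipe of the directory it only removes *.nmconnection files. The demonstration runs against a temporary directory so that nothing on a real system is touched.

```python
import tempfile
from pathlib import Path

def wipe_keyfiles(directory):
    """Delete every *.nmconnection file in directory and return the removed names."""
    removed = []
    for path in sorted(Path(directory).glob("*.nmconnection")):
        path.unlink()
        removed.append(path.name)
    return removed

# Demonstration on a throwaway directory instead of
# /etc/NetworkManager/system-connections.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ens192.nmconnection").write_text("[connection]\nid=ens192\n")
    (Path(tmp) / "README").write_text("not a profile\n")
    removed = wipe_keyfiles(tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(removed, remaining)
```

In a real scriptlet the function would be called with /etc/NetworkManager/system-connections as the directory.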

Comment 25 Pengpeng Sun 2022-02-09 03:47:02 UTC
Thanks Lubomir for the confirmation. We will face an issue with customers who want to customize a RHEL9 VM in vSphere. Let me explain with the points below:
1. When customizing a Linux VM, there are 2 engines that can be used. One is cloud-init, the other is perl scripts.
2. Currently both cloud-init and the perl scripts write the network configuration to network-scripts (e.g. /etc/sysconfig/network-scripts/ifcfg-ens192). If a keyfile exists on the customer's RHEL9 VM, network-scripts won't be activated, and the customer will find that the expected network settings do NOT take effect after customization.
3. With PR #1224, cloud-init will write the network configuration to a keyfile directly, which fixes the problem on the cloud-init side. One concern is for customers who use RHEL8.x: if RHEL8.x's cloud-init picks up PR #1224 and cloud-init also writes the network configuration to a keyfile there, will it take effect? Is there any backward compatibility issue with PR #1224?
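
To make point 3 concrete, here is a rough sketch of what writing the same static configuration as a keyfile could look like. It is not the implementation from PR #1224: `render_keyfile` is a hypothetical helper, the set of keys is a minimal assumption, and the values come from the customization file in the description.

```python
import configparser
import io
import ipaddress

def render_keyfile(device, ipaddr, netmask):
    """Render a minimal static-IPv4 keyfile body for one interface."""
    # Keyfiles express the address with a prefix length, so convert the netmask.
    iface = ipaddress.ip_interface(f"{ipaddr}/{netmask}")
    parser = configparser.ConfigParser()
    parser["connection"] = {
        "id": device,
        "type": "ethernet",
        "interface-name": device,
    }
    parser["ipv4"] = {"method": "manual", "address1": iface.with_prefixlen}
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()

keyfile_text = render_keyfile("ens192", "192.168.10.23", "255.255.255.0")
print(keyfile_text)
```

The output would be stored as /etc/NetworkManager/system-connections/ens192.nmconnection, replacing the profile left behind by the installer instead of competing with it.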

Comment 26 Huijuan Zhao 2022-02-10 08:20:17 UTC
Thanks Pengpeng for the quick response. 
QE's test results are the same as in Pengpeng's comment 20 and comment 23 (although the kickstart is different).

Comment 27 Eduardo Otubo 2022-02-16 13:24:44 UTC
(In reply to Huijuan Zhao from comment #26)
> Thanks Pengpeng for the quick response. 
> QE's test results are same with Pengpeng's comment 20 and comment
> 23(although the kickstart is different).

Did you guys run tests on other platforms as well? I'm asking because we might want to apply this fix downstream-only (Pengpeng's fix) - since it's a major blocker - and leave the NetworkManager support (Lubomir's fix) for the next release and we'll have more time to have a stable fix.

I believe it is safe to have this fix for the other platforms as well.
As soon as I have a green from QA I'll send a Merge Request.

Thanks!

Comment 28 Huijuan Zhao 2022-02-17 01:17:39 UTC
(In reply to Eduardo Otubo from comment #27)

So far our tests have covered AWS/Azure/OpenStack, where rhel-9 is installed from an image. There is only the ifcfg-rh file after rhel-9 installation there, so we did not hit this issue.
Per the tests above, it seems the issue only occurs when rhel-9 is installed via ISO+kickstart.

Thanks!

Comment 29 Lubomir Rintel 2022-02-21 09:53:38 UTC
(In reply to Eduardo Otubo from comment #27)
> (In reply to Huijuan Zhao from comment #26)
> > Thanks Pengpeng for the quick response. 
> > QE's test results are same with Pengpeng's comment 20 and comment
> > 23(although the kickstart is different).
> 
> Did you guys run tests on other platforms as well? I'm asking because we
> might want to apply this fix downstream-only (Pengpeng's fix) - since it's a
> major blocker - and leave the NetworkManager support (Lubomir's fix) for the
> next release and we'll have more time to have a stable fix.

This makes sense. The proper way to address this is for the images to *never* contain network configuration and only create it when necessary. That way there will be no conflict. I don't understand why this is not done already.

Nevertheless, if Pengpeng's fix addresses this for now, it's perfectly okay to apply it.

Comment 39 Huijuan Zhao 2022-02-26 09:33:30 UTC
Move to VERIFIED per comment 34

Comment 42 errata-xmlrpc 2022-05-17 12:26:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: cloud-init), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2308

