Bug 1713618

Summary: [ESXi][RHEL 8.0]Cloud-init fails to set the default gateway on RHEL 8
Product: Red Hat Enterprise Linux 8 Reporter: Jaroslav Spanko <jspanko>
Component: cloud-init Assignee: Eduardo Otubo <eterrell>
Status: CLOSED CURRENTRELEASE QA Contact: ldu <ldu>
Severity: high Docs Contact:
Priority: unspecified    
Version: 8.0 CC: boyang, jgreguske, ldu, leiwang, pengpengs, ribarry, vmware-gos-qa, yacao, yujiang, yuxisun
Target Milestone: rc Keywords: TestOnly
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-07-19 03:04:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jaroslav Spanko 2019-05-24 09:24:52 UTC
Description of problem:
This is the same problem previously reported in https://bugzilla.redhat.com/show_bug.cgi?id=1492726
cloud-init does not apply the default gateway.

nmcli
nmcli con show "System ens222" | grep -i GATEWAY
connection.gateway-ping-timeout:        0
ipv4.gateway:                           --
ipv6.gateway:                           --
IP4.GATEWAY:                            --

messages
Cloud-init v. 18.2 running 'init' at Thu, 23 May 2019 09:46:54 +0000. Up 9.90 seconds.
ci-info: ++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++
ci-info: +---------+------+-----------------+---------------+-------+-------------------+
ci-info: |  Device |  Up  |     Address     |      Mask     | Scope |     Hw-Address    |
ci-info: +---------+------+-----------------+---------------+-------+-------------------+
ci-info: | ens222: | True | 192.168.160.111 | 255.255.255.0 |   .   | 00:50:56:a2:61:41 |
ci-info: | ens222: | True |        .        |       .       |   d   | 00:50:56:a2:61:41 |
ci-info: |   lo:   | True |    127.0.0.1    |   255.0.0.0   |   .   |         .         |
ci-info: |   lo:   | True |        .        |       .       |   d   |         .         |
ci-info: +---------+------+-----------------+---------------+-------+-------------------+
ci-info: ++++++++++++++++++++++++++++Route IPv4 info++++++++++++++++++++++++++++
ci-info: +-------+---------------+---------+---------------+-----------+-------+
ci-info: | Route |  Destination  | Gateway |    Genmask    | Interface | Flags |
ci-info: +-------+---------------+---------+---------------+-----------+-------+
ci-info: |   0   | 192.168.160.0 | 0.0.0.0 | 255.255.255.0 |   ens222  |   U   |
ci-info: +-------+---------------+---------+---------------+-----------+-------+

cloud-init.log
2019-05-24 06:40:22,637 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'ens224|GATEWAY' = '192.168.160.1'
2019-05-21 06:57:08,474 - stages.py[DEBUG]: applying net config names for {'version': 1, 'config': [{'type': 'physical', 'name': 'ens222', 'mac_address': '00:50:56:a2:92:36', 'subnets': [{'control': 'auto', 'type': 'static', 'address': '192.168.160.111', 'netmask': '255.255.255.0'}]}, {'destination': '192.168.160.0/22', 'type': 'route', 'gateway': '192.168.160.1', 'metric': 10000}, {'type': 'nameserver', 'address': ['10.248.176.61', '10.248.177.61'], 'search': ['xxxxxxxxxx']}]}

Version-Release number of selected component (if applicable):
cloud-init-18.2-6.el8.noarch

How reproducible:
100%

Steps to Reproduce:
1. configure static networking with gateway 


Actual results:
Gateway is not correctly applied 

Expected results:
correctly applied gateway

Additional info:
It seems that the GATEWAY attribute is not used as the default gateway; instead, cloud-init creates a static route to 192.168.160.0/22 via 192.168.160.1.
I also tried cloud-init-18.5-3.el8.noarch from brew. There is some improvement, in that the gateway now appears in the Route IPv4 info table, but GATEWAY is still not applied correctly.
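
The symptom can be checked directly against the network config logged by stages.py. The following is a minimal Python sketch, not cloud-init code: find_default_gateway is a hypothetical helper, and it assumes that a default gateway appears either as a 'gateway' key on a subnet or as a route entry whose destination is 0.0.0.0/0. The dictionary is the version 1 config copied from the cloud-init.log line above.

```python
# Version 1 network config as logged by stages.py in this report.
LOGGED_CONFIG = {
    'version': 1,
    'config': [
        {'type': 'physical', 'name': 'ens222',
         'mac_address': '00:50:56:a2:92:36',
         'subnets': [{'control': 'auto', 'type': 'static',
                      'address': '192.168.160.111',
                      'netmask': '255.255.255.0'}]},
        {'destination': '192.168.160.0/22', 'type': 'route',
         'gateway': '192.168.160.1', 'metric': 10000},
        {'type': 'nameserver',
         'address': ['10.248.176.61', '10.248.177.61'],
         'search': ['xxxxxxxxxx']},
    ],
}


def find_default_gateway(net_config):
    """Hypothetical helper: return the default gateway in a v1 config, or None."""
    for item in net_config.get('config', []):
        # Assumption: a route to 0.0.0.0/0 counts as a default route.
        if item.get('type') == 'route' and item.get('destination') == '0.0.0.0/0':
            return item.get('gateway')
        # Assumption: a 'gateway' key on a subnet is the default gateway.
        for subnet in item.get('subnets', []):
            if 'gateway' in subnet:
                return subnet['gateway']
    return None


print(find_default_gateway(LOGGED_CONFIG))  # prints None
```

For the logged config the helper finds nothing, because the only route entry targets 192.168.160.0/22 and the subnet carries no 'gateway' key.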

Thank you

Comment 3 Jaroslav Spanko 2019-05-24 11:03:51 UTC
More information from customer tests
-----------------------
The default gateway is not set because the "PRIMARY" parameter is missing:

/usr/lib/python3.6/site-packages/cloudinit/sources/helpers/vmware/imc/nic.py   --> checks whether the NIC is PRIMARY (line 37)
/usr/lib/python3.6/site-packages/cloudinit/sources/helpers/vmware/imc/config_nic.py --> adds the default gateway only if the NIC is primary (line 160)

To test this, I simply added "PRIMARY = true" to the NIC section:

<...>
[ens224]
PRIMARY = true              <---------------------------
MACADDR = 00:50:56:a2:61:41
<...>

Running the net_convert.py utility with the modified cust.cfg then creates the correct file with the gateway:

# Created by cloud-init on instance boot automatically, do not edit.
#
BOOTPROTO=none
DEFROUTE=yes
DEVICE=ens224
GATEWAY=192.168.160.1                 <--------------
HWADDR=00:50:56:a2:47:c1
IPADDR=192.168.160.242
NETMASK=255.255.252.0
ONBOOT=yes
TYPE=Ethernet
USERCTL=no

The following change is only a test; it does not work for a multi-NIC setup:
 
diff -Naur /usr/lib/python3.6/site-packages/cloudinit/sources/helpers/vmware/imc/config_nic.py.orig /usr/lib/python3.6/site-packages/cloudinit/sources/helpers/vmware/imc/config_nic.py
@@ -158,7 +158,7 @@
             subnet.update({'netmask': v4.netmask})

         # Add the primary gateway
-        if nic.primary and v4.gateways:
+        if v4.gateways:
             self.ipv4PrimaryGateway = v4.gateways[0]
             subnet.update({'gateway': self.ipv4PrimaryGateway})
             return ([subnet], route_list)

With this change, I can provision a RHEL 8 VM from a VMware template with all parameters set correctly (gateway included).
--------------------------------------
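
The missing-PRIMARY condition described in this comment can be checked with a short script before deploying. This is a minimal sketch, assuming cust.cfg is INI-style with one section per NIC as in the excerpt above. primary_nics is a hypothetical helper, not part of cloud-init, and the sample text uses only keys that appear in this report (PRIMARY, MACADDR, GATEWAY).

```python
import configparser

# Sample NIC sections built only from keys shown in this report.
SAMPLE_WITHOUT_PRIMARY = """
[ens224]
MACADDR = 00:50:56:a2:61:41
GATEWAY = 192.168.160.1
"""

SAMPLE_WITH_PRIMARY = """
[ens224]
PRIMARY = true
MACADDR = 00:50:56:a2:61:41
GATEWAY = 192.168.160.1
"""


def primary_nics(cfg_text):
    """Hypothetical helper: list the sections that set PRIMARY to a true value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return [
        section
        for section in parser.sections()
        # A missing PRIMARY key is treated the same as PRIMARY = false.
        if parser.getboolean(section, "PRIMARY", fallback=False)
    ]


print(primary_nics(SAMPLE_WITHOUT_PRIMARY))  # prints []
print(primary_nics(SAMPLE_WITH_PRIMARY))     # prints ['ens224']
```

An empty list corresponds to the failing case in this bug: no NIC is marked primary, so no default gateway is written.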

Comment 4 Jaroslav Spanko 2019-05-27 09:01:43 UTC
Adding more info from the latest customer test
----------------------
I confirm that the issue is related to the lack of compatibility between cloud-init 18.2 and the vSphere 6.5 "OS Guest customization" format (because of the missing PRIMARY parameter).
You can't use "OS Guest customization" (triggered by VMware Tools) *and* cloud-init at the same time to post-configure a system.

After removing cloud-init from my templates (I tried with both RHEL 7 and 8), I can successfully deploy VMs with a correct network setup (gateway included).
--------------------------

Thank you

Comment 5 ldu 2019-05-28 05:44:59 UTC
I could reproduce this in our test environment.

Comment 6 Pengpeng Sun 2019-05-28 07:23:26 UTC
(In reply to Jaroslav Spanko from comment #4)
> Adding more info from the latest customer test
> ----------------------
> I confirm that the issue is related to the lack of compatibility between
> cloud-init 18.2 and vSphere 6.5 "OS Guest customization" format (because of
> the lack of PRIMARY parameter).
> You can't use "OS Guest customization" (triggered by vmware tools) *and*
> cloud-init at the same time to post-configure a system.
> 
> Removing cloud-init from my templates (I tried with both RHEL 7 and 8) I can
> successfully deploy VMs with a correct network setup (gateway included).
> --------------------------
> 
> Thank you

Yes, this is due to the lack of a PRIMARY NIC parameter in the current vSphere 6.5 Guest OS Customization specification.
Without a primary NIC, a static route is set instead of a default gateway.

config_nic.py
        # Add routes if there is no primary nic
        if not self._primaryNic and v4.gateways:
            subnet.update(
                {'routes': self.gen_ipv4_route(nic, v4.gateways, v4.netmask)})

A fix is available in the next vSphere vCenter Server 6.5 update release and in the vSphere vCenter Server 6.7 Update 2 release. With the fix, PRIMARY is set on a NIC when it is the first NIC and has both a static IPv4 address and a gateway configured. For details, see "You cannot set a primary virtual NIC" in https://docs.vmware.com/en/VMware-vSphere/6.7/rn/vsphere-vcenter-server-67u2-release-notes.html
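
The two config_nic.py branches quoted in this bug (the primary-gateway check and the no-primary-NIC route fallback) can be approximated as follows. This is a rough Python sketch, not the cloud-init implementation: plan_ipv4_gateway and its parameters are hypothetical names, the route destination is assumed to be the gateway's own subnet, and the metric of 10000 is taken from the log line in the description.

```python
import ipaddress


def plan_ipv4_gateway(nic_is_primary, any_primary_nic, gateways, netmask):
    """Hypothetical helper approximating the branches quoted from config_nic.py."""
    if not gateways:
        return {}
    if nic_is_primary:
        # Primary NIC: the first gateway becomes the default gateway.
        return {"gateway": gateways[0]}
    if not any_primary_nic:
        # No primary NIC at all: only a route to the gateway's subnet is added.
        routes = []
        for gateway in gateways:
            network = ipaddress.ip_network(f"{gateway}/{netmask}", strict=False)
            routes.append({
                "destination": str(network),
                "type": "route",
                "gateway": gateway,
                "metric": 10000,  # value seen in the log in the description
            })
        return {"routes": routes}
    return {}


# No PRIMARY parameter anywhere (the failing case in this bug).
print(plan_ipv4_gateway(False, False, ["192.168.160.1"], "255.255.252.0"))
# PRIMARY set on this NIC (the case after the vCenter fix).
print(plan_ipv4_gateway(True, True, ["192.168.160.1"], "255.255.252.0"))
```

The first call yields a route to 192.168.160.0/22 via 192.168.160.1, which matches the route entry logged in the description; the second yields only a 'gateway' key.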

Thanks,
Pengpeng

Comment 8 ldu 2019-07-19 03:04:25 UTC
After upgrading to vCenter 6.7 Update 2, nmcli con show "System ens222" | grep -i GATEWAY shows that the gateway is set.
Closing this bug as CURRENTRELEASE.