Bug 1943320

Summary: Baremetal node loses connectivity with bonded interface and OVNKubernetes
Product: OpenShift Container Platform
Reporter: Andrew Austin <aaustin>
Component: Networking
Assignee: Mohamed Mahmoud <mmahmoud>
Networking sub component: ovn-kubernetes
QA Contact: Anurag saxena <anusaxen>
Status: CLOSED CURRENTRELEASE
Docs Contact:
Severity: high
Priority: high
CC: anbhat, astoycos, aygarg, memodi, mifiedle, mmahmoud, rbrattai, sbelmasg, thaller, trozet, vkochuku, vlaad, william.caban, zzhao
Version: 4.7
Keywords: Reopened, UpcomingSprint
Target Milestone: ---
Flags: trozet: needinfo-, mmahmoud: needinfo-
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-08-26 14:13:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1945429
Bug Blocks: 1951028
Attachments:
  system journal 30-03-2021 (flags: none)
  ovs-configuration service log 30-03-2021 (flags: none)
  NetworkManager log 30-03-2021 (flags: none)

Description Andrew Austin 2021-03-25 19:23:49 UTC
Description of problem:
When deploying a baremetal UPI node whose default gateway is on a bonded interface with no VLAN tag (i.e. bond0), network connectivity to the node is lost when ovs-configuration.service runs on the first post-install boot.

On each reboot, connectivity is up for several pings until ovs-configuration runs and breaks it again. From the console, it can be seen that the former bond member connections still reference the UUID of the old bond connection rather than the one created by ovs-configuration for the br-ex bridge.

Running the ovs-configuration.sh script with OpenShiftSDN from the console restores connectivity (but breaks br-ex), and deploying the node with OpenShiftSDN works fine.
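
For reference, a quick way to see the stale references from the console is to compare the bond connection's UUID with the master= values in the member keyfiles (a sketch; the keyfile path may be /etc/NetworkManager/system-connections or /run/NetworkManager/system-connections depending on how the profiles were generated):

  # UUID of the currently active bond connection
  nmcli -g connection.uuid connection show bond0
  # master= UUIDs still referenced by the member profiles
  grep -H '^master=' /etc/NetworkManager/system-connections/*.nmconnection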

Version-Release number of selected component (if applicable):
Tested on 4.7.0, 4.7.2, and 4.7.3

How reproducible:
Deploy a baremetal UPI node with a bonded interface as the default gateway and no VLAN tag. Observe that connectivity is lost after the node is configured for OVN. 

Steps to Reproduce:
1. Deploy a baremetal UPI node with a bonded interface configured via kernel arguments
2. Wait for the node to reboot twice (installer and machine-config)
3. On the second reboot, observe that network connectivity is lost when br-ex appears

Actual results:
The node has no connectivity from the new br-ex interface. The former bond members are not members of any bond, and /proc/net/bonding/bond0 shows no member interfaces on the new bond0.
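
A quick way to confirm this state from the console (illustrative commands, assuming the bond is named bond0):

  cat /proc/net/bonding/bond0        # no "Slave Interface:" entries on the new bond
  ip -o link show master bond0       # empty output: the former members have no master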

Expected results:
The new bond0 should inherit the member interfaces from the old one.

Additional info:
This is a lab cluster and I can provide any logs or access needed for debugging.

Comment 1 Tim Rozet 2021-03-29 15:04:35 UTC
Can you please provide logs for ovs-configuration.service and NM or full system journal logs?

Comment 2 Andrew Austin 2021-03-30 19:33:10 UTC
Created attachment 1767773 [details]
system journal 30-03-2021

Comment 3 Andrew Austin 2021-03-30 19:33:39 UTC
Created attachment 1767774 [details]
ovs-configuration service log 30-03-2021

Comment 4 Andrew Austin 2021-03-30 19:34:01 UTC
Created attachment 1767775 [details]
NetworkManager log 30-03-2021

Comment 5 Andrew Austin 2021-03-30 19:35:10 UTC
I ended up in a different failure mode when attempting to reproduce the issue again for logs. This time, bond0 is still present, but there is no active physical port attached to br-ex, so the node remains in NotReady status. I have also provided Mohamed with access to the lab cluster in question.

Comment 6 Andrew Austin 2021-03-30 19:42:10 UTC
For reference, here is the isolinux config used to build the failing node including networking kernel args:

label linux
  menu label ^Install RHEL CoreOS on ocp2-worker-3.lab.signal9.gg
  kernel /images/vmlinuz
  initrd /images/initramfs.img,/images/tls-ca.initrd
  append nomodeset rd.neednet=1 console=tty0 ignition.firstboot ignition.platform.id=metal coreos.inst=yes coreos.inst.install_dev=sda coreos.live.rootfs_url=https://172.18.0.59:443/ocp2/rhcos-rootfs.img coreos.inst.ignition_url=https://172.18.0.59:443/ocp2/worker.ign bootdev=bond0  bond=bond0:eno1,eno2:mode=active-backup ip=172.18.0.72::172.18.0.1:255.255.255.0:ocp2-worker-3.lab.signal9.gg:bond0:none:172.18.42.10 ip=eno3:none ip=eno3d1:none nameserver=172.18.42.10  nameserver=172.18.42.11
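
For readability, here are the networking-related arguments from that append line broken out and annotated (same values; field order as documented in dracut.cmdline(7)):

  bond=bond0:eno1,eno2:mode=active-backup
  #    <name>:<members>:<options>  -> create bond0 from eno1 and eno2 in active-backup mode

  ip=172.18.0.72::172.18.0.1:255.255.255.0:ocp2-worker-3.lab.signal9.gg:bond0:none:172.18.42.10
  #  <client-IP>::<gateway>:<netmask>:<hostname>:<interface>:<autoconf=none, i.e. static>:<dns1>

  ip=eno3:none
  ip=eno3d1:none
  #  leave eno3 and eno3d1 without automatic configuration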

Comment 10 Tim Rozet 2021-03-31 22:45:02 UTC
This looks like a bug in NetworkManager to me. Auto-activation of eno2 (a member of bond0) brings up bond0 and takes the interface away from ovs-if-phys0; this results in the "bond0" connection being up with the same IP as ovs-if-br-ex. Created https://bugzilla.redhat.com/show_bug.cgi?id=1945429
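
A hedged illustration of the interaction described above (not the shipped fix, which re-parents the members instead): because the old bond0 profile is still autoconnect-enabled, activating a member pulls bond0 back up over ovs-if-phys0; disabling autoconnect on the stale profile would avoid that race:

  # illustrative only -- check which profiles are fighting over the device,
  # then keep the stale bond profile from auto-activating
  nmcli -f NAME,UUID,DEVICE connection show --active | grep -E 'bond0|ovs-if-phys0'
  nmcli connection modify bond0 connection.autoconnect no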

Comment 11 Andrew Austin 2021-04-01 02:55:37 UTC
For testing, I applied this patch to MCO and confirmed that I was able to successfully deploy a node that had been failing with 4.7.3. I have not added logic to undo this change at the end of the configure script when it is run with OpenShiftSDN.

diff --git a/templates/common/_base/files/configure-ovs-network.yaml b/templates/common/_base/files/configure-ovs-network.yaml
index f2b79b98..79645f9a 100644
--- a/templates/common/_base/files/configure-ovs-network.yaml
+++ b/templates/common/_base/files/configure-ovs-network.yaml
@@ -173,6 +173,13 @@ contents:
           connection.autoconnect-priority 100 802-3-ethernet.mtu ${iface_mtu} ${extra_phys_args}
       fi
 
+      # Move any bond member interfaces to the new ovs-if-phys0 connection
+      if [ "$(nmcli --get-values connection.type conn show ${old_conn})" == "bond" ]; then
+        new_conn=$(grep uuid= $NM_CONN_PATH/ovs-if-phys0.nmconnection | sed s/uuid=//)
+        sed -i s/master=${old_conn}/master=${new_conn}/ $NM_CONN_PATH/*.nmconnection
+        nmcli conn reload
+      fi
+
       nmcli conn up ovs-if-phys0
 
       if ! nmcli connection show ovs-if-br-ex &> /dev/null; then

Comment 12 Andrew Austin 2021-04-06 14:36:44 UTC
Here is a more complete patch if you ultimately choose to handle the conflict on the MCO side. I don't think the reversion is perfect, but it results in a reachable system for debugging most of the time.

https://github.com/marbindrakon/machine-config-operator/commit/192441aabd51cc77f57b7f7f060d3da7eb369891

Comment 13 Tim Rozet 2021-04-08 21:58:13 UTC
*** Bug 1937914 has been marked as a duplicate of this bug. ***

Comment 14 Tim Rozet 2021-04-08 22:06:53 UTC
Thanks Andrew. I think it is a good start. Mohamed, can you please create a PR for this? I'm not sure we even need to revert setting the device as the master for the slaves; I think it is probably OK to leave it, but I'll leave that up to you.

Comment 15 Mohamed Mahmoud 2021-04-08 23:45:30 UTC
So we won't wait for the NM team's assessment of bug 1945429 before closing this?

Comment 16 Tim Rozet 2021-04-09 02:33:11 UTC
From Beniamino's response it seems like they do not think it is a bug in NM. Even if it is, we would have to get NM fixed and the fix carried all the way back to 4.6, which would take a considerable amount of time. I think it's safe for us to go ahead and proceed with a workaround to get this fixed ASAP.

Comment 17 Simon Belmas-Gauderic 2021-04-12 16:22:51 UTC
*** Bug 1948440 has been marked as a duplicate of this bug. ***

Comment 19 Mike Fiedler 2021-04-16 11:53:04 UTC
@aaustin Any chance you can test the fix in your environment? OCP QE doesn't have an environment immediately available.

Comment 20 Andrew Austin 2021-04-16 19:59:40 UTC
Sure thing. I will post results once the cluster build completes.

Comment 21 Andrew Austin 2021-04-16 21:28:52 UTC
Looking good to me. Tested using the 4.8-2021-04-16-184424 image stream; no intervention was required to bring up a worker using bonded interfaces defined via kernel arguments.

[root@ocp2-worker-4 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:50:56:b7:ff:45 brd ff:ff:ff:ff:ff:ff
3: ens224: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:50:56:b7:ff:45 brd ff:ff:ff:ff:ff:ff permaddr 00:50:56:b7:9f:70
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 00:50:56:b7:ff:45 brd ff:ff:ff:ff:ff:ff
6: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 22:6d:fa:c8:ff:0a brd ff:ff:ff:ff:ff:ff
7: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 00:50:56:b7:ff:45 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.73/24 brd 172.18.0.255 scope global noprefixroute br-ex
       valid_lft forever preferred_lft forever

[root@ocp2-worker-4 ~]# grep -C 5 new_device /usr/local/bin/configure-ovs.sh 
    nmcli c add type ${iface_type} conn.interface ${iface} master ovs-port-phys0 con-name ovs-if-phys0 \
      connection.autoconnect-priority 100 802-3-ethernet.mtu ${iface_mtu} "${extra_phys_args[@]}"
  fi

  # Update connections with master property set to use the new device name
  new_device=$(nmcli --get-values connection.interface-name conn show ovs-if-phys0)
  for conn_uuid in $(nmcli -g UUID connection show) ; do
    if [ "$(nmcli -g connection.master connection show uuid "$conn_uuid")" != "$old_conn" ]; then
      continue
    fi
    nmcli conn mod uuid ${conn_uuid} connection.master ${new_device}
  done

  nmcli conn up ovs-if-phys0

  if ! nmcli connection show ovs-if-br-ex &> /dev/null; then
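
For completeness, a quick post-fix check that the members landed back under bond0 and that the bond is attached to the OVS bridge (device names as on this node; commands are illustrative):

  ip -o link show master bond0       # expect ens192 and ens224
  cat /proc/net/bonding/bond0        # both members listed
  ovs-vsctl list-ports br-ex         # expect bond0 among the ports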

Comment 22 zhaozhanqi 2021-04-19 02:34:42 UTC
Thanks, Andrew. Moving this bug to verified per comment 21.

Comment 25 errata-xmlrpc 2021-07-27 22:55:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

Comment 26 aygarg 2021-08-06 14:29:30 UTC
Hello All,

Will this be backported to 4.6 and 4.7? The customer is building clusters with the 4.6 and 4.7 versions.

Regards,
Ayush Garg

Comment 29 zhaozhanqi 2021-08-10 03:03:24 UTC
(In reply to aygarg from comment #26)
> Hello All,
> 
> Will it be backported to 4.6 and 4.7? As the customer is building clusters
> with 4.6 and 4.7 versions.
> 
> Regards,
> Ayush Garg

This bug is only for the 4.8 version. For 4.6 and 4.7, please refer to:

https://bugzilla.redhat.com/show_bug.cgi?id=1951089
https://bugzilla.redhat.com/show_bug.cgi?id=1951028