Bug 1967656

Summary: Installation of bare metal UPI nodes using OVN-Kubernetes and bonds fails on second boot.
Product: OpenShift Container Platform Reporter: Jonas Nordell <jnordell>
Component: Machine Config Operator Assignee: Tim Rozet <trozet>
Status: CLOSED DUPLICATE QA Contact: Michael Nguyen <mnguyen>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.7 CC: dmoessne, trozet
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-06-23 17:03:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jonas Nordell 2021-06-03 14:30:26 UTC
Description of problem:

When performing a bare metal cluster installation, nodes configured with OVN-Kubernetes and bonds fail on the second boot.

The IP address appears to be configured on the bond, but a few seconds after the second boot it vanishes.

It is not possible to log in on the console or to access the node, as there is no network.


Version-Release number of selected component (if applicable):

OCP 4.7.12
RHCOS 4.7.7


How reproducible:

Every time in the customer's environment.


Steps to Reproduce:
1. Create a cluster with OVN-Kubernetes, a bond, and a static IP.
2. Boot the node; it reboots after upgrading from CoreOS 47.83.202103251640-0 to 47.83.202105200135-0.
3. bond0 displays an IP address on the console for a few seconds, but then it vanishes.

Actual results:
Node has no IP and is not able to connect to the cluster.

Expected results:
Node should have an IP

Additional info:
This is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1931522, but the backport to 4.7 (https://bugzilla.redhat.com/show_bug.cgi?id=1931615) was included in 4.7.3, and this is 4.7.12.

It is also very similar to https://bugzilla.redhat.com/show_bug.cgi?id=1951028, but that was fixed in 4.7.9.

Comment 11 Tim Rozet 2021-06-23 17:03:42 UTC
This bug does look the same as 1971715. The fix there should ensure that the correct UUID is set for connection.master of the subinterfaces and that it persists across reboots.
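
As an illustration of the condition described above, the following is a minimal sketch (not the actual fix from 1971715) of how one might check a set of NetworkManager keyfile profiles for subinterfaces whose connection.master does not match the UUID of any known profile. The function names, profile ids, interface names and UUIDs are all hypothetical. It also assumes that master is always given as a UUID, although NetworkManager accepts an interface name there as well.

```python
import configparser


def parse_keyfile(text):
    """Return the [connection] section of a NetworkManager keyfile as a dict."""
    parser = configparser.ConfigParser(
        interpolation=None, delimiters=("=",), strict=False
    )
    parser.read_string(text)
    if not parser.has_section("connection"):
        return {}
    return dict(parser["connection"])


def find_dangling_slaves(keyfiles):
    """Return ids of profiles whose master matches no known profile UUID.

    Assumption: master is expressed as a UUID, so a master given as an
    interface name would also be reported here.
    """
    profiles = [parse_keyfile(text) for text in keyfiles]
    known_uuids = {p["uuid"] for p in profiles if "uuid" in p}
    dangling = []
    for profile in profiles:
        master = profile.get("master")
        if master is not None and master not in known_uuids:
            dangling.append(profile.get("id", "<unnamed>"))
    return dangling


# Hypothetical sample profiles; the UUIDs and names are made up.
BOND = """[connection]
id=bond0
uuid=11111111-1111-1111-1111-111111111111
type=bond
interface-name=bond0
"""

SLAVE_OK = """[connection]
id=bond0-slave-ens1
uuid=22222222-2222-2222-2222-222222222222
type=ethernet
interface-name=ens1
master=11111111-1111-1111-1111-111111111111
slave-type=bond
"""

SLAVE_STALE = """[connection]
id=bond0-slave-ens2
uuid=33333333-3333-3333-3333-333333333333
type=ethernet
interface-name=ens2
master=99999999-9999-9999-9999-999999999999
slave-type=bond
"""

if __name__ == "__main__":
    print(find_dangling_slaves([BOND, SLAVE_OK, SLAVE_STALE]))
```

In this sketch, a subinterface profile that still points at a UUID no longer present among the profiles is reported as dangling, which is one way the bond could lose its members, and with them its IP, after a reboot.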

*** This bug has been marked as a duplicate of bug 1971715 ***