Bug 2013438 - OCP 4.7 bond network fails to link properly in mode 1, defaults to round-robin, MC override fails despite successful ifconfig update on master nodes
Keywords:
Status: CLOSED DUPLICATE of bug 2018003
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Ben Nemec
QA Contact: Victor Voronkov
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-10-12 20:32 UTC by Will Russell
Modified: 2021-10-28 20:04 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-27 22:21:02 UTC
Target Upstream Version:
Embargoed:



Description Will Russell 2021-10-12 20:32:44 UTC
Description of problem:
OCP 4.7
Cluster deployment on bare metal hosts fails with a timeout error if the network bond is defined with mode 1 (active-backup).
The cluster will only spin up if the mode is left undefined, which defaults the bond to round-robin and does not engage the interface as intended.

When specifying mode 1 via an MC (as stipulated by the docs), the MC does deploy successfully to the master nodes, but the contents of bond0 are as follows:

cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: ens2f0
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: <redacted>
Slave queue ID: 0

Slave Interface: ens2f1
MII Status: up
Speed: 25000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: <redacted>
Slave queue ID: 0
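
For reference, the active bonding mode can also be queried directly from sysfs (a quick check, assuming the bond device is named bond0 as above):

# Expected output for mode 1 would be "active-backup 1"; on the affected
# masters this reports "balance-rr 0" instead, matching the output above.
cat /sys/class/net/bond0/bonding/mode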

network config for bond:
[root@master-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0 
DEVICE=bond0
TYPE=Bond
NAME=bond0
BONDING_MASTER=yes
BOOTPROTO=dhcp
ONBOOT=yes
MTU=9100
IPV4_DHCP_TIMEOUT=2147483647
IPV6INIT=no
DHCPV6C=no
IPV6INIT=no
IPV6_AUTOCONF=no
IPV6_DEFROUTE=no
IPV6_PEERDNS=no
IPV6_PEERROUTES=no
IPV6_FAILURE_FATAL=no
BONDING_OPTS="mode=1 miimon=100"
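
The BONDING_OPTS line above requests mode=1, yet the runtime state shows round-robin. One way to check whether NetworkManager actually picked up the option from the ifcfg file (a sketch; the connection profile name "bond0" is assumed and may differ on a given node):

# Show the bond options NetworkManager parsed from the profile.
nmcli -g bond.options connection show bond0

# After editing BONDING_OPTS, reload the ifcfg files and re-activate the bond.
nmcli connection reload
nmcli connection up bond0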

Version-Release number of selected component (if applicable):
4.7.11 and 4.7.24 (tested on both releases; the issue occurs on both).

How reproducible:
every time

[network is preconfigured for bond link]


Steps to Reproduce:
1. Install the cluster - observe the timeout if the bond mode is defined.
2. Install the cluster with the workaround of leaving the bond mode undefined, then adjust it via an MC deployment once the cluster is up and stable; observe no change in the bond network interface even after node reboots (see the verification commands below).
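
The MC rollout and the resulting node state can be checked with standard commands (an illustrative sketch; the pool and node names are assumptions based on this environment):

# Confirm the MachineConfig rendered into the master pool and finished updating.
oc get mcp master

# Inspect the runtime bond state on a master node.
oc debug node/master-0 -- chroot /host cat /proc/net/bonding/bond0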


Actual results:
The bond link is never activated in mode 1 as configured; it stays in round-robin and the interface remains disconnected.


Expected results:
The bond should come up in mode 1 (active-backup) as configured.

Additional info:
Case details are linked to this BZ; additional uploads (must-gather and sosreport) are available there.

Comment 3 Will Russell 2021-10-27 22:21:02 UTC
This issue is now being tracked in the following new bug, which addresses the heart of the problem. Closing this one as a duplicate. https://bugzilla.redhat.com/show_bug.cgi?id=2018003

*** This bug has been marked as a duplicate of bug 2018003 ***

