Bug 1899350

Summary: configure-ovs.sh doesn't configure bonding options
Product: OpenShift Container Platform
Component: Machine Config Operator
Version: 4.6.z
Target Release: 4.7.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Michael Zamot <mzamot>
Assignee: Tim Rozet <trozet>
QA Contact: Ross Brattain <rbrattai>
CC: achernet, bhershbe, cpaquin, djuran, jerzhang, manrodri, maydin, mcornea, mjahangi, mleonard, skanakal, trozet
Target Milestone: ---
Flags: trozet: needinfo-
Doc Type: Bug Fix
Doc Text:
Cause: Bond options were ignored on a pre-configured Linux bond interface when an OCP node was migrated over to OVS using OVN-Kubernetes as the SDN provider.
Consequence: After the bond interface was moved to OVS, the bond fell back to the default mode (round-robin); if it had originally been configured with a non-default mode and other bond-specific Linux options, it might no longer function.
Fix: When ovs-configuration.service runs, it now copies all of the previous bond options from the Linux bond over to the ovs-if-phys0 NetworkManager connection.
Result: The bond works as originally configured in Linux, and the copied bond options can be verified with "nmcli conn show ovs-if-phys0".
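The copy described in the fix can be sketched roughly as follows. This is illustrative only, not the actual configure-ovs.sh code: the connection name ovs-if-phys0 and the BONDING_OPTS string are taken from this bug report, and the real script logic may differ.

```shell
# Illustrative sketch, not the actual configure-ovs.sh implementation.
# Convert an ifcfg-style BONDING_OPTS string (space-separated) into the
# comma-separated form that NetworkManager's bond.options property uses,
# then print the nmcli command that would apply it to ovs-if-phys0.
ifcfg_opts="mode=active-backup miimon=100 fail_over_mac=follow"

# nmcli expects "opt1=val1,opt2=val2,..."
nm_opts=$(printf '%s' "$ifcfg_opts" | tr ' ' ',')

echo "nmcli connection modify ovs-if-phys0 bond.options $nm_opts"
```

On a live node the printed command would then be executed; afterwards, "nmcli conn show ovs-if-phys0" should list the copied options under bond.options.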
Story Points: ---
Clone Of: --- (cloned as bug 1899622, view as bug list)
Environment:
Last Closed: 2021-02-24 15:34:22 UTC
Type: Bug
Regression: ---
Bug Blocks: 1899622
Attachments:
output for *ip -a* to go along w the configuration failure log for 02814494 (flags: none)

Description Michael Zamot 2020-11-18 23:08:00 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Deploy cluster with bonding enabled (using MachineConfig).
2. Wait until the nodes boot from the hard drive and finish the first boot.
3. configure-ovs.sh creates a new bond connection, ovs-if-phys0, but it lacks the bonding options and the ipv4.dhcp-client-id setting.


Actual results:
Bonds are configured using round-robin instead of the specified mode. All bond options, such as lacp_rate, updelay, miimon, or xmit_hash_policy, are missing.
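A quick way to see which mode the kernel actually applied (as opposed to what was requested) is to read /proc/net/bonding/bond0. A minimal sketch, using sample content from this report in place of the live file:

```shell
# Illustrative: parse the active bonding mode out of the kernel's report.
# On a live node, replace the here-string with: cat /proc/net/bonding/bond0
sample='Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: up'

# Split each line on ": " and print the value of the "Bonding Mode" field.
printf '%s\n' "$sample" | awk -F': ' '/^Bonding Mode/ {print $2}'
```

On an affected node this prints "load balancing (round-robin)" even though a different mode was configured in BONDING_OPTS.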

Expected results:
All bond options and the ipv4.dhcp-client-id setting are copied over to the new bond connection.

Additional info:

Comment 1 Tim Rozet 2020-11-19 17:17:38 UTC
Michael, can you please provide the "nmcli conn show <bond iface>" output for the original bond interface?

Comment 2 Manuel Rodriguez 2020-11-26 20:51:42 UTC
I'm facing this issue too; if it helps, I can provide the information from a worker node in a fresh installation.

My environment uses OCP 4.6.4, installed via IPI baremetal with 3 masters and 3 workers.
I'm using OVNKubernetes as the network type in install-config.yaml.
I defined the following bonding options: mode=active-backup miimon=100 fail_over_mac=follow

NIC layout is as follows:

nic1: provisioning network (192.168.10.0/24)
nic2: bond0 member
nic3: bond0 member
bond0: baremetal network (192.168.100.0/24)



After the deployment, the bond is up but in round-robin mode, and the nmcli output shows that the original bond0 connection is not active:

[core@worker-1 ~]$ sudo ovs-vsctl list-ports br-ex
bond0
patch-br-ex_worker-1-to-br-int


[core@worker-1 ~]$ nmcli dev
DEVICE                                  TYPE           STATE      CONNECTION
br-ex                                   ovs-interface  connected  ovs-if-br-ex
enp1s0                                  ethernet       connected  Wired connection 1
bond0                                   bond           connected  ovs-if-phys0
enp2s0                                  ethernet       connected  System enp2s0
enp3s0                                  ethernet       connected  System enp3s0
br-ex                                   ovs-bridge     connected  br-ex
bond0                                   ovs-port       connected  ovs-port-phys0
br-ex                                   ovs-port       connected  ovs-port-br-ex


[core@worker-1 ~]$ nmcli con show
NAME                UUID                                  TYPE           DEVICE 
ovs-if-br-ex        42ca7154-762a-496d-9242-e2c17e7d19b4  ovs-interface  br-ex  
Wired connection 1  1505485f-448c-3ff9-aef8-676261939ec5  ethernet       enp1s0 
br-ex               3ede4036-72a2-4ec6-8ec3-644bbbe69f7d  ovs-bridge     br-ex  
ovs-if-phys0        44631883-6f39-410b-b30d-fcea6aeb41dc  bond           bond0  
ovs-port-br-ex      ce2007a1-9d24-4d01-b1e9-ed5d730c8503  ovs-port       br-ex  
ovs-port-phys0      09f8a62d-f547-4508-ab45-b177f8629725  ovs-port       bond0  
System enp2s0       8c6fd7b1-ab62-a383-5b96-46e083e04bb1  ethernet       enp2s0 
System enp3s0       63aa2036-8665-f54d-9a92-c3035bad03f7  ethernet       enp3s0 
bond0               ad33d8b0-1f7b-cab9-9447-ba07f855b143  bond           --     


[core@worker-1 ~]$ ip a s br-ex
8: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 52:54:00:99:00:31 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.31/24 brd 192.168.100.255 scope global dynamic noprefixroute br-ex
       valid_lft 597sec preferred_lft 597sec
    inet6 fe80::1ffa:8df8:53c6:eadb/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
[core@worker-1 ~]$ ip a s bond0
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP group default qlen 1000
    link/ether 52:54:00:99:00:31 brd ff:ff:ff:ff:ff:ff


[core@worker-1 ~]$ cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp2s0
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 52:54:00:99:00:31
Slave queue ID: 0

Slave Interface: enp3s0
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 52:54:00:98:00:31
Slave queue ID: 0


[core@worker-1 ~]$ nmcli con show bond0
connection.id:                          bond0
connection.uuid:                        ad33d8b0-1f7b-cab9-9447-ba07f855b143
connection.stable-id:                   --
connection.type:                        bond
connection.interface-name:              bond0
connection.autoconnect:                 yes
connection.autoconnect-priority:        0
connection.autoconnect-retries:         -1 (default)
connection.multi-connect:               0 (default)
connection.auth-retries:                -1
connection.timestamp:                   1606420039
connection.read-only:                   no
connection.permissions:                 --
connection.zone:                        --
connection.master:                      --
connection.slave-type:                  --
connection.autoconnect-slaves:          1 (yes)
connection.secondaries:                 --
connection.gateway-ping-timeout:        0
connection.metered:                     unknown
connection.lldp:                        default
connection.mdns:                        -1 (default)
connection.llmnr:                       -1 (default)
connection.wait-device-timeout:         -1
802-3-ethernet.port:                    --
802-3-ethernet.speed:                   0
802-3-ethernet.duplex:                  --
802-3-ethernet.auto-negotiate:          no
802-3-ethernet.mac-address:             --
802-3-ethernet.cloned-mac-address:      --
802-3-ethernet.generate-mac-address-mask:--
802-3-ethernet.mac-address-blacklist:   --
802-3-ethernet.mtu:                     1500
802-3-ethernet.s390-subchannels:        --
802-3-ethernet.s390-nettype:            --
802-3-ethernet.s390-options:            --
802-3-ethernet.wake-on-lan:             default
802-3-ethernet.wake-on-lan-password:    --
ipv4.method:                            auto
ipv4.dns:                               --
ipv4.dns-search:                        --
ipv4.dns-options:                       --
ipv4.dns-priority:                      0
ipv4.addresses:                         --
ipv4.gateway:                           --
ipv4.routes:                            --
ipv4.route-metric:                      -1
ipv4.route-table:                       0 (unspec)
ipv4.routing-rules:                     --
ipv4.ignore-auto-routes:                no
ipv4.ignore-auto-dns:                   no
ipv4.dhcp-client-id:                    --   
ipv4.dhcp-iaid:                         --                                  
ipv4.dhcp-timeout:                      2147483647 (infinity)
ipv4.dhcp-send-hostname:                yes 
ipv4.dhcp-hostname:                     --   
ipv4.dhcp-fqdn:                         -- 
ipv4.dhcp-hostname-flags:               0x0 (none)
ipv4.never-default:                     no          
ipv4.may-fail:                          yes        
ipv4.dad-timeout:                       -1 (default)
ipv6.method:                            ignore    
ipv6.dns:                               --
ipv6.dns-search:                        --
ipv6.dns-options:                       --
ipv6.dns-priority:                      0 
ipv6.addresses:                         --
ipv6.gateway:                           --     
ipv6.routes:                            --
ipv6.route-metric:                      -1
ipv6.route-table:                       0 (unspec)
ipv6.routing-rules:                     --     
ipv6.ignore-auto-routes:                yes         
ipv6.ignore-auto-dns:                   yes         
ipv6.never-default:                     yes
ipv6.may-fail:                          yes
ipv6.ip6-privacy:                       -1 (unknown)
ipv6.addr-gen-mode:                     stable-privacy
ipv6.ra-timeout:                        0 (default)
ipv6.dhcp-duid:                         --
ipv6.dhcp-iaid:                         --
ipv6.dhcp-timeout:                      0 (default)
ipv6.dhcp-send-hostname:                yes
ipv6.dhcp-hostname:                     --  
ipv6.dhcp-hostname-flags:               0x0 (none)
ipv6.token:                             --
bond.options:                           fail_over_mac=follow,miimon=100,mode=active-backup
proxy.method:                           none   
proxy.browser-only:                     no
proxy.pac-url:                          --  
proxy.pac-script:                       --


[core@worker-1 ~]$ cat /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
NAME=bond0
BONDING_MASTER=yes
BOOTPROTO=dhcp
ONBOOT=yes
MTU=1500
BONDING_OPTS="mode=active-backup miimon=100 fail_over_mac=follow"
AUTOCONNECT_SLAVES=yes
IPV6INIT=no
DHCPV6C=no
IPV6INIT=no
IPV6_AUTOCONF=no
IPV6_DEFROUTE=no
IPV6_PEERDNS=no
IPV6_PEERROUTES=no
IPV6_FAILURE_FATAL=no
IPV4_DHCP_TIMEOUT=2147483647


I dumped a full sosreport from the node, if you need it just let me know and I'll attach it.
Thanks.

Comment 3 Manuel Rodriguez 2020-11-27 02:14:07 UTC
Also, just to let you know: when using the same setup as above, with the only difference that I set OpenShiftSDN as the network type, I do not have this issue; the bonding settings displayed by nmcli match what the machine config manifests define in the ifcfg-bond0 file.

Comment 5 milti leonard 2020-12-02 16:46:24 UTC
Created attachment 1735680 [details]
output for *ip -a* to go along w the configuration failure log for 02814494

Comment 14 Yu Qi Zhang 2021-01-06 16:51:55 UTC
This should be documented as a bugfix. @Tim could you fill in the doc text in the bug field above?

Comment 18 errata-xmlrpc 2021-02-24 15:34:22 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633