Bug 1165185 - Bonding interfaces set by staypuft won't come up after reboot
Summary: Bonding interfaces set by staypuft won't come up after reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-puppet-modules
Version: Foreman (RHEL 6)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 5.0 (RHEL 7)
Assignee: Ivan Chavero
QA Contact: Asaf Hirshberg
URL:
Whiteboard:
Depends On:
Blocks: 1191232
 
Reported: 2014-11-18 14:16 UTC by Ramon Acedo
Modified: 2023-09-18 00:19 UTC (History)

Fixed In Version: openstack-puppet-modules-2014.2.7-2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1191232
Environment:
Last Closed: 2015-02-09 15:15:34 UTC
Target Upstream Version:
Embargoed:


Attachments
messages log file logging both modules.conf file loading bonding and not loading it (13.77 MB, text/plain)
2014-12-04 10:09 UTC, Ramon Acedo


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 139905 | 0 | None | MERGED | Adds filtering for BONDING (LACP) | 2020-06-02 16:07:54 UTC
Red Hat Issue Tracker | OSP-28771 | 0 | None | None | None | 2023-09-18 00:13:58 UTC
Red Hat Product Errata | RHBA-2015:0156 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise Linux OpenStack Platform Installer Bug Fix Advisory | 2015-02-09 20:13:39 UTC

Description Ramon Acedo 2014-11-18 14:16:37 UTC
After fixing bonding+VLANs as specified in https://bugzilla.redhat.com/show_bug.cgi?id=1162794 the bonding interface doesn't come up after reboot.

In my setup I have just bond0, and I worked around it by running this on the hosts:

echo "alias bond* bonding" > /etc/modprobe.d/bonding.conf

In my test I haven't tried multiple bonding interfaces, just bond0.
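For hosts with more than one bond, the same workaround could be wrapped in a small function. This is only a minimal sketch: the helper name `write_bonding_alias` and its directory argument are hypothetical, it uses the explicit "alias bond0 bonding" form rather than the wildcard, and it has not been tried with multiple bonds.

```shell
#!/bin/sh
# Hypothetical helper, not part of staypuft or the puppet modules.
# Writes one "alias <bond> bonding" line per bond device into bonding.conf.
# $1 = target directory (e.g. /etc/modprobe.d), remaining args = bond names.
write_bonding_alias() {
    conf_dir=$1
    shift
    : > "$conf_dir/bonding.conf"    # create the file, or empty it if it exists
    for bond in "$@"; do
        printf 'alias %s bonding\n' "$bond" >> "$conf_dir/bonding.conf"
    done
}

# Example (as root): write_bonding_alias /etc/modprobe.d bond0
```

Passing the directory as an argument makes it possible to try the function against a scratch directory before touching /etc/modprobe.d.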

Comment 3 Brad P. Crochet 2014-12-03 18:34:02 UTC
Is BONDING_OPTS set correctly in /etc/sysconfig/network-scripts/ifcfg-bond0? If so, according to the RHEL docs, the bonding module should be loaded automatically.

Comment 4 Ramon Acedo 2014-12-04 10:09:01 UTC
Created attachment 964539
messages log file logging both modules.conf file loading bonding and not loading it

Comment 5 Ramon Acedo 2014-12-04 10:11:30 UTC
I believe BONDING_OPTS was set correctly by the installer:

# cat ifcfg-br-trunk
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
DEFROUTE=no
BONDING_OPTS="miimon=100 mode=balance-tlb"
BONDING_MASTER=yes
NM_CONTROLLED=no
DEVICE=br-trunk
DEVICETYPE=ovs
OVSBOOTPROTO="none"
TYPE=OVSBridge

As a recap, this setup has multiple subnets associated with VLANs:

Network               Subnet             VLAN
Management            192.168.101.0/24   101
Cluster Management    192.168.102.0/24   102
Storage               192.168.103.0/24   103
Admin API             192.168.104.0/24   104
External Public API   192.168.105.0/24   105

They are all associated with bond0 (via bond0.101, bond0.102...). These are all the ifcfg files:
ifcfg-bond0      ifcfg-bond0.102  ifcfg-bond0.104  ifcfg-br-trunk     ifcfg-eno33557248  ifcfg-eno67115776  ifcfg-lo
ifcfg-bond0.101  ifcfg-bond0.103  ifcfg-bond0.105  ifcfg-eno16777984  ifcfg-eno50336512  ifcfg-eno83887104

And this is how they are configured:

# cat ifcfg-bond0
DEVICE=bond0
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-trunk
ONBOOT=yes
BOOTPROTO=none

# cat ifcfg-eno33557248
BOOTPROTO="none"
DEVICE="eno33557248"
HWADDR="00:0c:29:94:a2:40"
ONBOOT=yes
PEERROUTES=no
NM_CONTROLLED=no
MASTER=bond0
SLAVE=yes
DEFROUTE=no
PEERDNS=no

# cat ifcfg-bond0.101
BOOTPROTO="none"
IPADDR="192.168.101.12"
NETMASK="255.255.255.0"
GATEWAY=""
DEVICE="bond0.101"
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
VLAN=yes
NM_CONTROLLED=no
DEFROUTE=no

This is what I see in the logs with and without loading the module via modules.conf:

Without “alias bond0 bonding” in /etc/modprobe.d/bonding.conf:

Dec  4 09:34:28 controller1 kernel: device br-trunk entered promiscuous mode
Dec  4 09:34:28 controller1 kernel: bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Dec  4 09:34:28 controller1 kernel: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated).  Use CAP_NET_ADMIN and alias netdev-bond0 instead.
Dec  4 09:34:28 controller1 kernel: device bond0 entered promiscuous mode
Dec  4 09:34:28 controller1 ovs-ctl: Starting ovs-vswitchd [  OK  ]
Dec  4 09:34:28 controller1 ovs-ctl: Enabling remote OVSDB managers [  OK  ]
Dec  4 09:34:28 controller1 systemd: Started Open vSwitch Internal Unit.
Dec  4 09:34:28 controller1 systemd: Started Dynamic System Tuning Daemon.
Dec  4 09:34:28 controller1 systemd: Started PCS GUI and remote configuration interface.
Dec  4 09:34:28 controller1 network: Bringing up loopback interface:  [  OK  ]
Dec  4 09:34:28 controller1 kernel: IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
Dec  4 09:34:28 controller1 kernel: bonding: bond0: Adding slave eno33557248.
Dec  4 09:34:28 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: intr type 3, mode 0, 3 vectors allocated
Dec  4 09:34:28 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: NIC Link is Up 10000 Mbps
Dec  4 09:34:28 controller1 kernel: device eno33557248 entered promiscuous mode
Dec  4 09:34:28 controller1 kernel: bonding: bond0: enslaving eno33557248 as an active interface with an up link.
Dec  4 09:34:29 controller1 kernel: bonding: bond0: Adding slave eno50336512.

[…]

Dec  4 09:36:39 controller1 network: Bringing up interface bond0:  ERROR    : [/etc/sysconfig/network-scripts/ifup-eth] Device bond0 does not seem to be present, delaying initialization.
Dec  4 09:36:39 controller1 /etc/sysconfig/network-scripts/ifup-eth: Device bond0 does not seem to be present, delaying initialization.

Dec  4 09:36:41 controller1 network: Determining IP information for eno16777984... done.
Dec  4 09:36:42 controller1 network: [  OK  ]
Dec  4 09:36:42 controller1 kernel: 8021q: 802.1Q VLAN Support v1.8
Dec  4 09:36:42 controller1 kernel: 8021q: adding VLAN 0 to HW filter on device eno16777984
Dec  4 09:36:42 controller1 network: Bringing up interface bond0.101:  ERROR    : [./ifup]  device bond0.101 does not seem to be present, delaying initialization.
Dec  4 09:36:42 controller1 ./ifup: device bond0.101 does not seem to be present, delaying initialization.
Dec  4 09:36:42 controller1 network: [FAILED]
Dec  4 09:36:42 controller1 network: Bringing up interface bond0.102:  ERROR    : [./ifup]  device bond0.102 does not seem to be present, delaying initialization.
Dec  4 09:36:42 controller1 ./ifup: device bond0.102 does not seem to be present, delaying initialization.
Dec  4 09:36:42 controller1 network: [FAILED]
Dec  4 09:36:42 controller1 network: Bringing up interface bond0.103:  ERROR    : [./ifup]  device bond0.103 does not seem to be present, delaying initialization.
Dec  4 09:36:42 controller1 ./ifup: device bond0.103 does not seem to be present, delaying initialization.
Dec  4 09:36:42 controller1 network: [FAILED]
Dec  4 09:36:42 controller1 network: Bringing up interface bond0.104:  ERROR    : [./ifup]  device bond0.104 does not seem to be present, delaying initialization.

With “alias bond0 bonding” in /etc/modprobe.d/bonding.conf:

Dec  4 09:50:52 controller1 kernel: bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Dec  4 09:50:52 controller1 kernel: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated).  Use CAP_NET_ADMIN and alias netdev-bond0 instead.
Dec  4 09:50:52 controller1 kernel: device bond0 entered promiscuous mode
Dec  4 09:50:52 controller1 kernel: device br-trunk entered promiscuous mode
Dec  4 09:50:52 controller1 ovs-ctl: Starting ovs-vswitchd [  OK  ]
Dec  4 09:50:52 controller1 ovs-ctl: Enabling remote OVSDB managers [  OK  ]
Dec  4 09:50:52 controller1 systemd: Started Open vSwitch Internal Unit.
Dec  4 09:50:52 controller1 systemd: Started Dynamic System Tuning Daemon.
Dec  4 09:50:52 controller1 systemd: Started PCS GUI and remote configuration interface.
Dec  4 09:50:52 controller1 network: Bringing up loopback interface:  [  OK  ]
Dec  4 09:50:52 controller1 kernel: IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
Dec  4 09:50:52 controller1 kernel: bonding: bond0: Adding slave eno33557248.
Dec  4 09:50:52 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: intr type 3, mode 0, 3 vectors allocated
Dec  4 09:50:52 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: NIC Link is Up 10000 Mbps
Dec  4 09:50:52 controller1 kernel: device eno33557248 entered promiscuous mode
Dec  4 09:50:52 controller1 kernel: bonding: bond0: enslaving eno33557248 as an active interface with an up link.
Dec  4 09:50:52 controller1 kernel: bonding: bond0: Adding slave eno50336512.
Dec  4 09:50:52 controller1 kernel: vmxnet3 0000:13:00.0 eno50336512: intr type 3, mode 0, 3 vectors allocated
Dec  4 09:50:52 controller1 kernel: vmxnet3 0000:13:00.0 eno50336512: NIC Link is Up 10000 Mbps
Dec  4 09:50:52 controller1 kernel: device eno50336512 entered promiscuous mode
Dec  4 09:50:52 controller1 kernel: bonding: bond0: enslaving eno50336512 as an active interface with an up link.
Dec  4 09:50:52 controller1 kernel: bonding: bond0: Adding slave eno67115776.
[…]
Dec  4 09:50:52 controller1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-trunk bond0
Dec  4 09:50:52 controller1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-br br-trunk
Dec  4 09:50:52 controller1 network: Bringing up interface bond0:  [  OK  ]
Dec  4 09:50:53 controller1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-br br-trunk
Dec  4 09:50:53 controller1 network: Bringing up interface br-trunk:  [  OK  ]
Dec  4 09:50:53 controller1 network: Bringing up interface eno16777984:
Dec  4 09:50:53 controller1 kernel: vmxnet3 0000:03:00.0 eno16777984: intr type 3, mode 0, 3 vectors allocated
Dec  4 09:50:53 controller1 kernel: vmxnet3 0000:03:00.0 eno16777984: NIC Link is Up 10000 Mbps
Dec  4 09:50:53 controller1 dhclient[1322]: DHCPREQUEST on eno16777984 to 255.255.255.255 port 67 (xid=0x30f0265e)
Dec  4 09:50:53 controller1 dhclient[1322]: DHCPACK from 192.168.100.200 (xid=0x30f0265e)
[…]

Attached the full log.

Comment 6 Brad P. Crochet 2014-12-04 13:51:04 UTC
So, I believe there is a bug here, but it is not in the staypuft installer. It actually looks to be in how the ovs puppet module handles the switch over to using OVS. gildub will need to have a look at this.

The BONDING_OPTS and BONDING_MASTER options need to be kept in the ifcfg-bond0 file, and not moved into the ovs cfg.
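To see whether a host is affected, one could list the ifcfg files that carry BONDING_OPTS without being bond files. A rough sketch follows; the function name `misplaced_bonding_opts` is hypothetical, and it assumes bond files are named ifcfg-bond*, as they are in this report.

```shell
#!/bin/sh
# Hypothetical check, not part of any shipped tool.
# Prints the names of ifcfg files that set BONDING_OPTS but are not ifcfg-bond*.
# $1 = network-scripts directory (e.g. /etc/sysconfig/network-scripts).
misplaced_bonding_opts() {
    for f in "$1"/ifcfg-*; do
        [ -f "$f" ] || continue
        case "$(basename "$f")" in
            ifcfg-bond*) continue ;;    # bond files are where the options belong
        esac
        if grep -q '^BONDING_OPTS=' "$f"; then
            basename "$f"
        fi
    done
}

# Example: misplaced_bonding_opts /etc/sysconfig/network-scripts
```

With the files shown in comment 5, this would be expected to print ifcfg-br-trunk.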

Comment 7 Ivan Chavero 2014-12-05 02:07:59 UTC
Gilles, can you give me a hand with this? I think you have more experience with it.

Comment 8 Gilles Dubreuil 2014-12-07 10:57:48 UTC
Yes, BONDING_OPTS and BONDING_MASTER should stay in the bond file.
I'm doing the patch immediately.

Comment 9 Gilles Dubreuil 2014-12-08 05:16:18 UTC
= VLAN over BONDING =

Note about the #comment5 example:
The bonding patch is not needed there (when using the latest vswitch module from Openstack-Puppet-Modules), because only the VLAN interface is involved when attaching port bond0.101 to the bridge.

= BONDING =
Patch added upstream, waiting for peer review.

[root]# cat ifcfg-eth1
NM_CONTROLLED=no
BOOTPROTO=none
ONBOOT=yes
DEVICE=eth1
TYPE=Ethernet
MASTER=bond0
SLAVE=yes

[root]# cat ifcfg-bond0 
DEVICE=bond0
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ex
ONBOOT=yes
BOOTPROTO=none

[root]# cat ifcfg-br-ex 
ONBOOT=yes
DEFROUTE=no
BONDING_OPTS="miimon=100 mode=0"
BONDING_MASTER=yes
NM_CONTROLLED=no
PEERDNS=no
PEERROUTES=no
IPADDR=192.168.80.11
NETMASK=255.255.255.0
DEVICE=br-ex
DEVICETYPE=ovs
OVSBOOTPROTO=
TYPE=OVSBridge

[root]# ip a
...snip...
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP qlen 1000
    link/ether 52:54:00:44:ab:80 brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP 
    link/ether 52:54:00:44:ab:80 brd ff:ff:ff:ff:ff:ff
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN 
    link/ether 32:94:1a:99:15:73 brd ff:ff:ff:ff:ff:ff
7: br-ex: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether 52:e1:c6:3f:9a:40 brd ff:ff:ff:ff:ff:ff
    inet 192.168.80.11/24 brd 192.168.80.255 scope global br-ex
       valid_lft forever preferred_lft forever
    inet6 fe80::2cad:6cff:fec3:3dfa/64 scope link 
       valid_lft forever preferred_lft forever

[root]# ovs-vsctl  show
51fc7335-c403-4445-ab6b-0b00d9240cea
    Bridge br-ex
        Port "bond0"
            Interface "bond0"
        Port br-ex
            Interface br-ex
                type: internal
    ovs_version: "2.0.0"


[root]# ping -c 3 192.168.80.10
PING 192.168.80.10 (192.168.80.10) 56(84) bytes of data.
64 bytes from 192.168.80.10: icmp_seq=1 ttl=64 time=0.746 ms
64 bytes from 192.168.80.10: icmp_seq=2 ttl=64 time=0.623 ms
64 bytes from 192.168.80.10: icmp_seq=3 ttl=64 time=0.436 ms

--- 192.168.80.10 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.436/0.601/0.746/0.130 ms

Comment 10 Gilles Dubreuil 2014-12-08 10:30:36 UTC
Notes,

Tested on VMs (libvirt) using the "modprobe bonding" command and option mode=0 for the bonding type.

This mode allows me to test without a switch and with only one slave.
Depending on the bonding mode, the bond might not allow any traffic before it's in place, which can cause trouble at reboot.
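When comparing results across setups, it helps to confirm which mode a bond is actually running in. The bonding driver reports it on the "Bonding Mode:" line of /proc/net/bonding/<bond>. A minimal sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: print the active bonding mode from a bonding status file.
# $1 = path to the status file, e.g. /proc/net/bonding/bond0
bonding_mode() {
    # Print only the text that follows "Bonding Mode: " on the matching line.
    sed -n 's/^Bonding Mode: //p' "$1"
}

# Example: bonding_mode /proc/net/bonding/bond0
```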

> I’m using the latest versions in the CDN of rhel-osp-installer
> (rhel-osp-installer-0.4.7-1.el6ost.noarch) and of the puppet modules
> (openstack-puppet-modules-2014.1-24.el6ost.noarch).
> 

openstack-puppet-modules-2014.1-24.el6ost.noarch doesn't contain the patch.
Both the vlan and bonding patches must be backported to Havana branch.

Comment 11 Gilles Dubreuil 2014-12-08 23:45:50 UTC
(In reply to Gilles Dubreuil from comment #10)
> openstack-puppet-modules-2014.1-24.el6ost.noarch doesn't contain the patch.
> Both the vlan and bonding patches must be backported to Havana branch.

Sorry, I meant the backport is needed for the OPM Icehouse branch.

I don't think Havana would get it.

Comment 14 Asaf Hirshberg 2015-01-12 08:14:27 UTC
Verified on my setup, ha-neutron:controllers with bond 802.3ad. All controllers were rebooted and the bond interfaces came up properly.

Comment 16 Gilles Dubreuil 2015-02-05 11:59:17 UTC
Since this has been validated by QA, the puppet-vswitch version for OSP5 needs to be bumped in Openstack-Puppet-Modules, most likely to the commit corresponding to the patch.

Comment 17 Alvaro Lopez Ortega 2015-02-05 12:51:02 UTC
We'd need to duplicate this bug for RHEL-OSP6 and get the patch ported.

Comment 19 errata-xmlrpc 2015-02-09 15:15:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0156.html

Comment 20 Gilles Dubreuil 2015-02-18 00:21:08 UTC
(In reply to Alvaro Lopez Ortega from comment #17)
> We'd need to duplicate this bug for RHEL-OSP6 and get the patch ported.

I believe you meant OSP5 :/

I've created BZ#1193718

Comment 21 Red Hat Bugzilla 2023-09-18 00:10:56 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

