After fixing bonding+VLANs as specified in https://bugzilla.redhat.com/show_bug.cgi?id=1162794, the bonding interface doesn't come up after reboot. My setup has just bond0, and I solved it by running this on the hosts:

echo "alias bond* bonding" > /etc/modprobe.d/bonding.conf

I haven't tested multiple bonding interfaces, just bond0.
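A minimal sketch of the workaround as a reusable shell function, in case it has to be applied on several hosts. The function name `write_bonding_alias` and its directory argument are hypothetical; only the alias line and the file name come from the comment above. Unlike the one-liner, this version appends instead of overwriting, which is an assumption about what is safest if the file already has other content.

```shell
# Hypothetical helper: make sure <dir>/bonding.conf contains the bonding alias.
# Defaults to /etc/modprobe.d, which requires root.
write_bonding_alias() {
    conf_dir="${1:-/etc/modprobe.d}"
    conf_file="$conf_dir/bonding.conf"
    line='alias bond* bonding'
    # Do nothing if the exact line is already there (safe to run twice).
    if [ -f "$conf_file" ] && grep -qxF -- "$line" "$conf_file"; then
        return 0
    fi
    # Append rather than overwrite, so existing entries are preserved.
    printf '%s\n' "$line" >> "$conf_file"
}
```

Calling `write_bonding_alias` with no argument as root targets /etc/modprobe.d; passing a scratch directory lets you inspect the result first.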
Is BONDING_OPTS set correctly in /etc/sysconfig/network-scripts/ifcfg-bond0? If so, according to the RHEL docs, the bonding module should be loaded automatically.
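One quick way to answer that question is to pull the value straight out of the ifcfg file. This is only a sketch: the function name `bonding_opts` is made up for illustration, and it assumes the usual KEY=value layout of ifcfg files with optional double quotes.

```shell
# Hypothetical helper: print the BONDING_OPTS value from an ifcfg file.
# Returns non-zero if the key is missing or empty.
bonding_opts() {
    cfg="${1:-/etc/sysconfig/network-scripts/ifcfg-bond0}"
    # Keep only uncommented BONDING_OPTS= lines; the last assignment wins.
    value=$(sed -n 's/^[[:space:]]*BONDING_OPTS=//p' "$cfg" | tail -n 1)
    # Strip one pair of surrounding double quotes, if present.
    value=${value#\"}
    value=${value%\"}
    [ -n "$value" ] || return 1
    printf '%s\n' "$value"
}
```

Running it against both ifcfg-bond0 and the bridge file shows which of the two actually carries the bonding options.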
Created attachment 964539
messages log file covering both cases: with the modules.conf file loading bonding and without it
I believe BONDING_OPTS was set correctly by the installer:

# cat ifcfg-br-trunk
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
DEFROUTE=no
BONDING_OPTS="miimon=100 mode=balance-tlb"
BONDING_MASTER=yes
NM_CONTROLLED=no
DEVICE=br-trunk
DEVICETYPE=ovs
OVSBOOTPROTO="none"
TYPE=OVSBridge

As a recap, this setup has multiple subnets associated to VLANs:

Management             192.168.101.0/24   101
Cluster Management     192.168.102.0/24   102
Storage                192.168.103.0/24   103
Admin API              192.168.104.0/24   104
External Public API    192.168.105.0/24   105

They are all associated to bond0 (via bond0.101, bond0.102...). This is all of them:

ifcfg-bond0      ifcfg-bond0.102  ifcfg-bond0.104  ifcfg-br-trunk     ifcfg-eno33557248  ifcfg-eno67115776  ifcfg-lo
ifcfg-bond0.101  ifcfg-bond0.103  ifcfg-bond0.105  ifcfg-eno16777984  ifcfg-eno50336512  ifcfg-eno83887104

And this is how they are configured:

# cat ifcfg-bond0
DEVICE=bond0
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-trunk
ONBOOT=yes
BOOTPROTO=none

# cat ifcfg-eno33557248
BOOTPROTO="none"
DEVICE="eno33557248"
HWADDR="00:0c:29:94:a2:40"
ONBOOT=yes
PEERROUTES=no
NM_CONTROLLED=no
MASTER=bond0
SLAVE=yes
DEFROUTE=no
PEERDNS=no

# cat ifcfg-bond0.101
BOOTPROTO="none"
IPADDR="192.168.101.12"
NETMASK="255.255.255.0"
GATEWAY=""
DEVICE="bond0.101"
ONBOOT=yes
PEERDNS=no
PEERROUTES=no
VLAN=yes
NM_CONTROLLED=no
DEFROUTE=no

This is what I see in the logs with and without loading the module via modules.conf:

Without “alias bond0 bonding” in /etc/modprobe.d/bonding.conf:

Dec 4 09:34:28 controller1 kernel: device br-trunk entered promiscuous mode
Dec 4 09:34:28 controller1 kernel: bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Dec 4 09:34:28 controller1 kernel: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-bond0 instead.
Dec 4 09:34:28 controller1 kernel: device bond0 entered promiscuous mode
Dec 4 09:34:28 controller1 ovs-ctl: Starting ovs-vswitchd [ OK ]
Dec 4 09:34:28 controller1 ovs-ctl: Enabling remote OVSDB managers [ OK ]
Dec 4 09:34:28 controller1 systemd: Started Open vSwitch Internal Unit.
Dec 4 09:34:28 controller1 systemd: Started Dynamic System Tuning Daemon.
Dec 4 09:34:28 controller1 systemd: Started PCS GUI and remote configuration interface.
Dec 4 09:34:28 controller1 network: Bringing up loopback interface: [ OK ]
Dec 4 09:34:28 controller1 kernel: IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
Dec 4 09:34:28 controller1 kernel: bonding: bond0: Adding slave eno33557248.
Dec 4 09:34:28 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: intr type 3, mode 0, 3 vectors allocated
Dec 4 09:34:28 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: NIC Link is Up 10000 Mbps
Dec 4 09:34:28 controller1 kernel: device eno33557248 entered promiscuous mode
Dec 4 09:34:28 controller1 kernel: bonding: bond0: enslaving eno33557248 as an active interface with an up link.
Dec 4 09:34:29 controller1 kernel: bonding: bond0: Adding slave eno50336512.
[…]
Dec 4 09:36:39 controller1 network: Bringing up interface bond0: ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device bond0 does not seem to be present, delaying initialization.
Dec 4 09:36:39 controller1 /etc/sysconfig/network-scripts/ifup-eth: Device bond0 does not seem to be present, delaying initialization.
Dec 4 09:36:41 controller1 network: Determining IP information for eno16777984... done.
Dec 4 09:36:42 controller1 network: [ OK ]
Dec 4 09:36:42 controller1 kernel: 8021q: 802.1Q VLAN Support v1.8
Dec 4 09:36:42 controller1 kernel: 8021q: adding VLAN 0 to HW filter on device eno16777984
Dec 4 09:36:42 controller1 network: Bringing up interface bond0.101: ERROR : [./ifup] device bond0.101 does not seem to be present, delaying initialization.
Dec 4 09:36:42 controller1 ./ifup: device bond0.101 does not seem to be present, delaying initialization.
Dec 4 09:36:42 controller1 network: [FAILED]
Dec 4 09:36:42 controller1 network: Bringing up interface bond0.102: ERROR : [./ifup] device bond0.102 does not seem to be present, delaying initialization.
Dec 4 09:36:42 controller1 ./ifup: device bond0.102 does not seem to be present, delaying initialization.
Dec 4 09:36:42 controller1 network: [FAILED]
Dec 4 09:36:42 controller1 network: Bringing up interface bond0.103: ERROR : [./ifup] device bond0.103 does not seem to be present, delaying initialization.
Dec 4 09:36:42 controller1 ./ifup: device bond0.103 does not seem to be present, delaying initialization.
Dec 4 09:36:42 controller1 network: [FAILED]
Dec 4 09:36:42 controller1 network: Bringing up interface bond0.104: ERROR : [./ifup] device bond0.104 does not seem to be present, delaying initialization.

With “alias bond0 bonding” in /etc/modprobe.d/bonding.conf:

Dec 4 09:50:52 controller1 kernel: bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Dec 4 09:50:52 controller1 kernel: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-bond0 instead.
Dec 4 09:50:52 controller1 kernel: device bond0 entered promiscuous mode
Dec 4 09:50:52 controller1 kernel: device br-trunk entered promiscuous mode
Dec 4 09:50:52 controller1 ovs-ctl: Starting ovs-vswitchd [ OK ]
Dec 4 09:50:52 controller1 ovs-ctl: Enabling remote OVSDB managers [ OK ]
Dec 4 09:50:52 controller1 systemd: Started Open vSwitch Internal Unit.
Dec 4 09:50:52 controller1 systemd: Started Dynamic System Tuning Daemon.
Dec 4 09:50:52 controller1 systemd: Started PCS GUI and remote configuration interface.
Dec 4 09:50:52 controller1 network: Bringing up loopback interface: [ OK ]
Dec 4 09:50:52 controller1 kernel: IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
Dec 4 09:50:52 controller1 kernel: bonding: bond0: Adding slave eno33557248.
Dec 4 09:50:52 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: intr type 3, mode 0, 3 vectors allocated
Dec 4 09:50:52 controller1 kernel: vmxnet3 0000:0b:00.0 eno33557248: NIC Link is Up 10000 Mbps
Dec 4 09:50:52 controller1 kernel: device eno33557248 entered promiscuous mode
Dec 4 09:50:52 controller1 kernel: bonding: bond0: enslaving eno33557248 as an active interface with an up link.
Dec 4 09:50:52 controller1 kernel: bonding: bond0: Adding slave eno50336512.
Dec 4 09:50:52 controller1 kernel: vmxnet3 0000:13:00.0 eno50336512: intr type 3, mode 0, 3 vectors allocated
Dec 4 09:50:52 controller1 kernel: vmxnet3 0000:13:00.0 eno50336512: NIC Link is Up 10000 Mbps
Dec 4 09:50:52 controller1 kernel: device eno50336512 entered promiscuous mode
Dec 4 09:50:52 controller1 kernel: bonding: bond0: enslaving eno50336512 as an active interface with an up link.
Dec 4 09:50:52 controller1 kernel: bonding: bond0: Adding slave eno67115776.
[…]
Dec 4 09:50:52 controller1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-trunk bond0
Dec 4 09:50:52 controller1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-br br-trunk
Dec 4 09:50:52 controller1 network: Bringing up interface bond0: [ OK ]
Dec 4 09:50:53 controller1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-br br-trunk
Dec 4 09:50:53 controller1 network: Bringing up interface br-trunk: [ OK ]
Dec 4 09:50:53 controller1 network: Bringing up interface eno16777984:
Dec 4 09:50:53 controller1 kernel: vmxnet3 0000:03:00.0 eno16777984: intr type 3, mode 0, 3 vectors allocated
Dec 4 09:50:53 controller1 kernel: vmxnet3 0000:03:00.0 eno16777984: NIC Link is Up 10000 Mbps
Dec 4 09:50:53 controller1 dhclient[1322]: DHCPREQUEST on eno16777984 to 255.255.255.255 port 67 (xid=0x30f0265e)
Dec 4 09:50:53 controller1 dhclient[1322]: DHCPACK from 192.168.100.200 (xid=0x30f0265e)
[…]

Attached the full log.
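For anyone comparing the two boots on their own hosts, the failing interfaces can be pulled out of the messages file with a small filter. This is a sketch only: `missing_devices` is a made-up name, the default path /var/log/messages is an assumption, and the pattern relies on the "does not seem to be present" wording seen in the log excerpts above.

```shell
# Hypothetical helper: list devices the network scripts reported as missing.
missing_devices() {
    log="${1:-/var/log/messages}"
    # Capture the device name from both "Device X ..." and "device X ..." lines,
    # then collapse the repeated reports into one entry per device.
    sed -n 's/.*[Dd]evice \([^ ]*\) does not seem to be present.*/\1/p' "$log" \
        | LC_ALL=C sort -u
}
```

An empty result after a reboot would suggest the bond and its VLAN interfaces were all found by the ifup scripts.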
So, I believe there is a bug here, but it is not in the staypuft installer. It actually looks to be in how the ovs puppet module handles the switch over to using OVS. gildub will need to have a look at this. The BONDING_OPTS and BONDING_MASTER options need to be kept in the ifcfg-bond0 file, and not moved into the ovs cfg.
Gilles, can you give me a hand with this? I think you have more experience with it.
Yes, BONDING_OPTS and BONDING_MASTER should stay in the bond file. I'm doing the patch immediately.
= VLAN over BONDING =

Note about the #comment5 example: the bonding patch is not needed (when using the latest vswitch module from Openstack-Puppet-Modules), because only the VLAN interface is involved when attaching port bond0.101 to the bridge.

= BONDING =

Patch added upstream, waiting for peer review.

[root]# cat ifcfg-eth1
NM_CONTROLLED=no
BOOTPROTO=none
ONBOOT=yes
DEVICE=eth1
TYPE=Ethernet
MASTER=bond0
SLAVE=yes

[root]# cat ifcfg-bond0
DEVICE=bond0
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ex
ONBOOT=yes
BOOTPROTO=none

[root]# cat ifcfg-br-ex
ONBOOT=yes
DEFROUTE=no
BONDING_OPTS="miimon=100 mode=0"
BONDING_MASTER=yes
NM_CONTROLLED=no
PEERDNS=no
PEERROUTES=no
IPADDR=192.168.80.11
NETMASK=255.255.255.0
DEVICE=br-ex
DEVICETYPE=ovs
OVSBOOTPROTO=
TYPE=OVSBridge

[root]# ip a
...snip...
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP qlen 1000
    link/ether 52:54:00:44:ab:80 brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP
    link/ether 52:54:00:44:ab:80 brd ff:ff:ff:ff:ff:ff
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 32:94:1a:99:15:73 brd ff:ff:ff:ff:ff:ff
7: br-ex: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 52:e1:c6:3f:9a:40 brd ff:ff:ff:ff:ff:ff
    inet 192.168.80.11/24 brd 192.168.80.255 scope global br-ex
       valid_lft forever preferred_lft forever
    inet6 fe80::2cad:6cff:fec3:3dfa/64 scope link
       valid_lft forever preferred_lft forever

[root]# ovs-vsctl show
51fc7335-c403-4445-ab6b-0b00d9240cea
    Bridge br-ex
        Port "bond0"
            Interface "bond0"
        Port br-ex
            Interface br-ex
                type: internal
    ovs_version: "2.0.0"

[root]# ping -c 3 192.168.80.10
PING 192.168.80.10 (192.168.80.10) 56(84) bytes of data.
64 bytes from 192.168.80.10: icmp_seq=1 ttl=64 time=0.746 ms
64 bytes from 192.168.80.10: icmp_seq=2 ttl=64 time=0.623 ms
64 bytes from 192.168.80.10: icmp_seq=3 ttl=64 time=0.436 ms

--- 192.168.80.10 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.436/0.601/0.746/0.130 ms
Notes:
Tested on VMs (libvirt) using the "modprobe bonding" command and option mode=0 for the bonding type. This mode allows me to test without a switch and with only one slave.
Depending on the bonding mode, the bond might not allow any traffic before it's in place, which can cause trouble at reboot.

> I’m using the latest versions in the CDN of rhel-osp-installer
> (rhel-osp-installer-0.4.7-1.el6ost.noarch) and of the puppet modules
> (openstack-puppet-modules-2014.1-24.el6ost.noarch).

openstack-puppet-modules-2014.1-24.el6ost.noarch doesn't contain the patch.
Both the vlan and bonding patches must be backported to Havana branch.
(In reply to Gilles Dubreuil from comment #10)
> openstack-puppet-modules-2014.1-24.el6ost.noarch doesn't contain the patch.
> Both the vlan and bonding patches must be backported to Havana branch.

Sorry, I meant the backport is needed to the OPM Icehouse branch. I don't think Havana would get it.
Verified on my setup, ha-neutron:controllers with bond 802.3ad. All controllers have been rebooted and the bond interfaces came up properly.
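A possible way to repeat this check after each reboot is to read the bonding driver's status file. The sketch below assumes the usual /proc/net/bonding/bond0 layout, where each "Slave Interface:" line is followed by that slave's "MII Status:" line; the function name `bond_slaves` is hypothetical.

```shell
# Hypothetical helper: print "<slave> <mii-status>" for every slave of a bond.
bond_slaves() {
    status_file="${1:-/proc/net/bonding/bond0}"
    awk -F': ' '
        # Remember the slave name until its own MII Status line shows up.
        /^Slave Interface:/ { slave = $2 }
        # The first MII Status line belongs to the bond itself and is skipped.
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    ' "$status_file"
}
```

If the file does not exist at all, the bonding module did not create bond0, which matches the failure described at the top of this bug.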
Since this has been validated by QA, the puppet-vswitch version for OSP5 needs to be bumped in Openstack-Puppet-Modules, most likely to the commit corresponding to the patch.
We'd need to duplicate this bug for RHEL-OSP6 and get the patch ported.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0156.html
(In reply to Alvaro Lopez Ortega from comment #17)
> We'd need to duplicate this bug for RHEL-OSP6 and get the patch ported.

I believe you meant OSP5 :/
I've created BZ#1193718
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days