Bug 1366348

Summary: Recent Update to initscripts Causing OpenStack Bridge To Fail
Product: Red Hat Enterprise Linux 7 Reporter: Dan Sneddon <dsneddon>
Component: NetworkManager Assignee: Rashid Khan <rkhan>
Status: CLOSED DUPLICATE QA Contact: Desktop QE <desktop-qa-list>
Severity: high Docs Contact:
Priority: high    
Version: 7.2 CC: aloughla, atragler, bgalvani, deekej, initscripts-maint-list, lrintel, rkhan, sasha, thaller
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-17 18:23:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Dan Sneddon 2016-08-11 17:49:32 UTC
Description of problem:
After a recent update to initscripts, upgrades to Red Hat OpenStack Platform began failing. This was traced to a reproducible bug where the bridge named "br-ex" is deactivated shortly after activation. The cause appears to be the interaction between initscripts and NetworkManager, because everything works when we disable NetworkManager.

Version-Release number of selected component (if applicable):
I think the version number of the new initscripts (released in the last two weeks) is: 9.49.30-1.el7_2.3

How reproducible:
100%

Steps to Reproduce:
1. Deploy Red Hat OpenStack Platform 7 (succeeds)
2. Upgrade to Red Hat OpenStack Platform 8 (fails)

Actual results:
The deployment fails because the br-ex bridge is brought down shortly after activation. We see in the logs that NetworkManager reads the configuration file and sees that NM_CONTROLLED=no, but immediately after the bridge is marked as unmanaged it is shut down. The log message appears to indicate that NetworkManager is involved:
Aug  9 20:44:44 overcloud-controller-1 nm-dispatcher: Dispatching action 'down' for br-ex
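
To double-check what NetworkManager should be reading from the file, here is a rough Python sketch that parses ifcfg-style KEY=value text and reports whether the device is marked NM_CONTROLLED=no. The sample file contents and the helper names (parse_ifcfg, is_nm_controlled) are made up for illustration, and treating only a literal "no" as unmanaged is a simplifying assumption, not a description of NetworkManager's actual parser.

```python
import shlex


def parse_ifcfg(text):
    """Parse KEY=value lines from ifcfg-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex.split removes shell-style quoting around the value.
        parts = shlex.split(value)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def is_nm_controlled(settings):
    """Return False when the file opts out of NetworkManager control.

    Assumption: only a literal "no" (any case) opts out; a missing
    key is treated as NetworkManager-controlled.
    """
    return settings.get("NM_CONTROLLED", "yes").strip().lower() != "no"


# Hypothetical file contents, not copied from the affected system.
SAMPLE = """\
# ifcfg-br-ex (illustrative)
DEVICE=br-ex
ONBOOT=yes
NM_CONTROLLED=no
"""

if __name__ == "__main__":
    cfg = parse_ifcfg(SAMPLE)
    print(cfg["DEVICE"], "NM controlled:", is_nm_controlled(cfg))
```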

Expected results:
Prior to this latest initscripts update, the upgrade would succeed and the br-ex bridge would remain up.

Additional info:
We believe that this is the offending patch that is causing the failure:

http://pkgs.devel.redhat.com/cgit/rpms/initscripts/commit/?h=rhel-7.2&id=4aeb2f7ee2b31630ae5ff27e8046b5117b7f7a22

Here are the /var/log/messages logs from immediately after the bridge interface was brought up via "ifup br-ex". There is more info in the linked BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1364583

Aug  9 20:44:44 overcloud-controller-1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-ex em1 -- add-port br-ex em1
Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info>  (em1): enslaved to non-master-type device ovs-system; ignoring
Aug  9 20:44:44 overcloud-controller-1 kernel: device em1 entered promiscuous mode
Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info>  ifcfg-rh: new connection /etc/sysconfig/network-scripts/ifcfg-br-ex (f0123855-5f72-fb68-339e-ef4f1d038014,"System br-ex")
Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <warn>  ifcfg-rh: Ignoring connection /etc/sysconfig/network-scripts/ifcfg-br-ex (f0123855-5f72-fb68-339e-ef4f1d038014,"System br-ex") / device 'br-ex' due to NM_CONTROLLED=no.
Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info>  (br-ex): device state change: activated -> unmanaged (reason 'unmanaged') [100 10 3]
Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info>  NetworkManager state is now CONNECTED_LOCAL
Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info>  NetworkManager state is now DISCONNECTED
Aug  9 20:44:44 overcloud-controller-1 kernel: IPv6: ADDRCONF(NETDEV_UP): br-ex: link is not ready
Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info>  (br-ex): link disconnected
Aug  9 20:44:44 overcloud-controller-1 dbus-daemon: dbus[758]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug  9 20:44:44 overcloud-controller-1 dbus[758]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug  9 20:44:44 overcloud-controller-1 systemd: Starting Network Manager Script Dispatcher Service...
Aug  9 20:44:44 overcloud-controller-1 dbus[758]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug  9 20:44:44 overcloud-controller-1 systemd: Started Network Manager Script Dispatcher Service.
Aug  9 20:44:44 overcloud-controller-1 dbus-daemon: dbus[758]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug  9 20:44:44 overcloud-controller-1 nm-dispatcher: Dispatching action 'down' for br-ex
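
For anyone checking other nodes for the same pattern, the following is a minimal Python sketch that scans /var/log/messages-style lines for a device that NetworkManager moves to "unmanaged" and that later gets a dispatcher 'down' action. The regular expressions are written against the two message formats shown in the log above and are an assumption about how stable those formats are; the function name is made up for illustration.

```python
import re

# Matches e.g. "(br-ex): device state change: activated -> unmanaged".
UNMANAGED = re.compile(
    r"NetworkManager\[\d+\]: <info>\s+\((?P<dev>[^)]+)\): "
    r"device state change: \S+ -> unmanaged"
)
# Matches e.g. "nm-dispatcher: Dispatching action 'down' for br-ex".
DISPATCH_DOWN = re.compile(
    r"nm-dispatcher: Dispatching action 'down' for (?P<dev>\S+)"
)


def find_down_after_unmanaged(lines):
    """Return devices that got a 'down' dispatch after going unmanaged."""
    unmanaged = set()
    hits = []
    for line in lines:
        match = UNMANAGED.search(line)
        if match:
            unmanaged.add(match.group("dev"))
            continue
        match = DISPATCH_DOWN.search(line)
        if match and match.group("dev") in unmanaged:
            hits.append(match.group("dev"))
    return hits


# Two lines taken from the log in this report.
SAMPLE_LOG = [
    "Aug  9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info>  "
    "(br-ex): device state change: activated -> unmanaged "
    "(reason 'unmanaged') [100 10 3]",
    "Aug  9 20:44:44 overcloud-controller-1 nm-dispatcher: "
    "Dispatching action 'down' for br-ex",
]

if __name__ == "__main__":
    print(find_down_after_unmanaged(SAMPLE_LOG))
```

To scan a real file, pass an open file object instead of SAMPLE_LOG, since the function only iterates over lines.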

Comment 2 Lukáš Nykrýn 2016-08-12 07:34:56 UTC
That patch should be pretty harmless; on the initscripts side it only adds a call to nmcli con load "/etc/sysconfig/network-scripts/$CONFIG" for every device. We could work around it by skipping that call for bridges, but maybe there is a better way to fix it on the NM side.
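
The "skip it for bridges" workaround could look roughly like the Python sketch below. This only illustrates the decision logic and is not the initscripts code, which is shell. The function name and the BRIDGE_TYPES set are hypothetical, and identifying bridges by TYPE=Bridge or TYPE=OVSBridge is an assumption about how the ifcfg files are written. The function only builds the command and does not run it.

```python
# Assumption: bridges are identified by the TYPE key in the ifcfg file.
BRIDGE_TYPES = {"bridge", "ovsbridge"}

SCRIPTS_DIR = "/etc/sysconfig/network-scripts/"


def nmcli_load_command(config_name, settings):
    """Return the nmcli argv for this config, or None to skip it.

    config_name is the file name (e.g. "ifcfg-em1"); settings is a
    dict of the KEY=value pairs already read from that file.
    """
    if settings.get("TYPE", "").lower() in BRIDGE_TYPES:
        # Workaround: do not ask NetworkManager to load bridge configs.
        return None
    return ["nmcli", "con", "load", SCRIPTS_DIR + config_name]


if __name__ == "__main__":
    print(nmcli_load_command("ifcfg-br-ex", {"TYPE": "OVSBridge"}))
    print(nmcli_load_command("ifcfg-em1", {"TYPE": "Ethernet"}))
```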

Comment 3 Thomas Haller 2016-08-17 18:23:17 UTC

*** This bug has been marked as a duplicate of bug 1367580 ***