Description of problem:
A recent change to initscripts causes interfaces to be loaded with "nmcli con load <ifcfg>" even if NM_CONTROLLED=no. When this happens, NetworkManager takes the bridges used by OpenStack from up to down, causing outages.

Version-Release number of selected component (if applicable):
RHEL 7.2
initscripts 9.49.30-1.el7_2.3

How reproducible:
100%

Steps to Reproduce:
1. Run upgrade on Red Hat OpenStack Platform from version 7.3 to version 8.0

Actual results:
Part of the upgrade process involves running yum update, which installs the latest initscripts RPM. This causes NetworkManager to read every ifcfg file, even those with NM_CONTROLLED=no. NetworkManager appears to bring the main OpenStack bridge down when its ifcfg file is read.

Expected results:
The existing br-ex bridge, which is up and running when the upgrade process begins, is not brought down when changes to other network interfaces are made.

Additional info:
There is log information in the two attached BZs.
Red Hat OSP: https://bugzilla.redhat.com/show_bug.cgi?id=1364583
initscripts BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1366348
Relevant logs from https://bugzilla.redhat.com/show_bug.cgi?id=1366348

Here are the /var/log/messages logs from immediately after the bridge interface was brought up via "ifup br-ex". There is more info in the linked BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1364583

Aug 9 20:44:44 overcloud-controller-1 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-ex em1 -- add-port br-ex em1
Aug 9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info> (em1): enslaved to non-master-type device ovs-system; ignoring
Aug 9 20:44:44 overcloud-controller-1 kernel: device em1 entered promiscuous mode
Aug 9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info> ifcfg-rh: new connection /etc/sysconfig/network-scripts/ifcfg-br-ex (f0123855-5f72-fb68-339e-ef4f1d038014,"System br-ex")
Aug 9 20:44:44 overcloud-controller-1 NetworkManager[775]: <warn> ifcfg-rh: Ignoring connection /etc/sysconfig/network-scripts/ifcfg-br-ex (f0123855-5f72-fb68-339e-ef4f1d038014,"System br-ex") / device 'br-ex' due to NM_CONTROLLED=no.
Aug 9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info> (br-ex): device state change: activated -> unmanaged (reason 'unmanaged') [100 10 3]
Aug 9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info> NetworkManager state is now CONNECTED_LOCAL
Aug 9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info> NetworkManager state is now DISCONNECTED
Aug 9 20:44:44 overcloud-controller-1 kernel: IPv6: ADDRCONF(NETDEV_UP): br-ex: link is not ready
Aug 9 20:44:44 overcloud-controller-1 NetworkManager[775]: <info> (br-ex): link disconnected
Aug 9 20:44:44 overcloud-controller-1 dbus-daemon: dbus[758]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug 9 20:44:44 overcloud-controller-1 dbus[758]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug 9 20:44:44 overcloud-controller-1 systemd: Starting Network Manager Script Dispatcher Service...
Aug 9 20:44:44 overcloud-controller-1 dbus[758]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug 9 20:44:44 overcloud-controller-1 systemd: Started Network Manager Script Dispatcher Service.
Aug 9 20:44:44 overcloud-controller-1 dbus-daemon: dbus[758]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug 9 20:44:44 overcloud-controller-1 nm-dispatcher: Dispatching action 'down' for br-ex
Can you please enable debug logging for NetworkManager, reproduce the problem, and attach the full logfile?

You do that by editing /etc/NetworkManager/NetworkManager.conf to contain:

[logging]
level=TRACE

and restarting NM.

Thank you
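For anyone who needs to apply the logging change on several controllers, here is a minimal sketch of how it could be scripted. The helper name enable_nm_trace is hypothetical (it is not part of NetworkManager), and the sketch assumes it is safest to refuse to touch a file that already has a [logging] section with some other level, so that case is left for manual editing.

```shell
#!/bin/sh
# Sketch only: make sure a NetworkManager.conf-style file contains
#   [logging]
#   level=TRACE
# enable_nm_trace is a hypothetical helper name used for illustration.
enable_nm_trace() {
    conf="$1"
    # Classify the file: "trace" if [logging] already sets level=TRACE,
    # "section" if a [logging] section exists without it, "none" otherwise.
    state=$(awk '
        /^\[/ { in_logging = ($0 == "[logging]"); if (in_logging) seen = 1 }
        in_logging && /^level=TRACE[ \t]*$/ { trace = 1 }
        END { print (trace ? "trace" : (seen ? "section" : "none")) }
    ' "$conf" 2>/dev/null)

    case "$state" in
        trace)
            # Already configured; nothing to do.
            return 0
            ;;
        section)
            # Assumption: do not guess how to merge with an existing section.
            echo "$conf already has a [logging] section; edit it by hand" >&2
            return 1
            ;;
        *)
            # No [logging] section (or no file yet): append one.
            printf '\n[logging]\nlevel=TRACE\n' >> "$conf"
            ;;
    esac
}
```

It would be called as `enable_nm_trace /etc/NetworkManager/NetworkManager.conf`, followed by a restart of NM (for example `systemctl restart NetworkManager`) before reproducing the problem.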
(In reply to Thomas Haller from comment #3)
> can you please enable debug logging for NetworkManager, reproduce the
> problem and attach the full logfile?
>
> You do that, by editing /etc/NetworkManager/NetworkManager.conf to contain:
>
> [logging]
> level=TRACE
>
> and restart NM.
>
> Thank you

Omri, can you do this? ^^ 7->8 upgrade without the workaround to disable NM
*** Bug 1366348 has been marked as a duplicate of this bug. ***
related to bug 1363995
Some notes:

- When calling `ifup`, initscripts first runs `nmcli connection load` on the ifcfg file. That ensures that NetworkManager has the current version of the file loaded.

- A device that is currently up and managed by NetworkManager is taken down immediately when the device becomes unmanaged. That happens, for example, when the ifcfg-rh file is reloaded after being changed to NM_CONTROLLED=no. "Stopping managing the device" means bringing down the interface and cleaning it up. There is rh#1371433, which asks for a way to release the device while leaving it up. That may make sense for special cases, but in general, telling NM to "unmanage" a device should continue to bring the device down. Leaving it up does not work in general anyway (e.g. DHCP addresses will time out).

- The mentioned change in initscripts is http://pkgs.devel.redhat.com/cgit/rpms/initscripts/commit/?h=rhel-7.2&id=4aeb2f7ee2b31630ae5ff27e8046b5117b7f7a22 . That is, always call `nmcli connection load`, even if the ifcfg-rh file contains NM_CONTROLLED=no.

Note that this initscripts patch is correct. It only has an effect when the device is already managed by NetworkManager. When the user sets "NM_CONTROLLED=no" followed by ifup, it is correct that NetworkManager stops managing the device.

Why does upgrading the initscripts package result in an ifup call? That seems wrong. Note that upgrading the NetworkManager package involves neither restarting the daemon nor changing networking. It deliberately does not do that, though of course that has other issues. Maybe updating the initscripts RPM should likewise do nothing to the runtime configuration.

I think NM is behaving as intended. Reassigning to initscripts for evaluation from their side.
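To see which interfaces on a host are exposed to the behaviour described in these notes, one could list the ifcfg files that set NM_CONTROLLED=no. The following is a rough sketch, not the parsing that initscripts or NetworkManager actually perform: both function names are hypothetical, and it assumes that no, false and 0 (any case, optionally quoted) all mean "not controlled".

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given ifcfg file sets NM_CONTROLLED
# to a "no"-like value. Assumption: no/false/0, any case, optionally quoted.
ifcfg_nm_controlled_no() {
    # Take the last NM_CONTROLLED= assignment in the file, if any.
    value=$(sed -n 's/^[[:space:]]*NM_CONTROLLED=//p' "$1" 2>/dev/null | tail -n 1)
    # Strip quotes and whitespace, then lower-case the value.
    value=$(printf '%s' "$value" | tr -d "\"' \t" | tr '[:upper:]' '[:lower:]')
    case "$value" in
        no|false|0) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print every ifcfg-* file in a directory (default:
# /etc/sysconfig/network-scripts) that opts out of NetworkManager.
list_nm_unmanaged_ifcfg() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        if ifcfg_nm_controlled_no "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Comparing the output of `list_nm_unmanaged_ifcfg` with `nmcli device status` should show any device that NM still manages although its ifcfg file says NM_CONTROLLED=no; per the notes above, those are the devices that would be taken down on the next `ifup`.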
Hello folks,

I'm really sorry, but I don't see anything in the initscripts specfile that would do a network restart or ifdown/ifup during update:

==============================================================
%pre
/usr/sbin/groupadd -g 22 -r -f utmp

%post
touch /var/log/wtmp /var/run/utmp /var/log/btmp
chown root:utmp /var/log/wtmp /var/run/utmp /var/log/btmp
chmod 664 /var/log/wtmp /var/run/utmp
chmod 600 /var/log/btmp

/usr/sbin/chkconfig --add network
/usr/sbin/chkconfig --add netconsole

if [ $1 -eq 1 ]; then
  /usr/bin/systemctl daemon-reload > /dev/null 2>&1 || :
fi
===============================================================

https://github.com/fedora-sysv/initscripts/blob/rhel7-branch/initscripts.spec#L77

You will have to debug this by yourself, guys. I do not have any knowledge regarding OpenStack. Sorry. :-/

Dee'Kej
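The same check can be repeated against the scriptlets of an installed package rather than the specfile. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the patterns (ifup, ifdown, or a line mentioning both "network" and restart/reload) are only a guess at what a network-changing scriptlet would look like, so a clean result does not prove that nothing else in the transaction touches networking.

```shell
#!/bin/sh
# Hypothetical helper: read scriptlet text on stdin and print (with line
# numbers) any line that looks like it changes runtime networking.
# Exit status is 0 if something suspicious was found, 1 if not.
scan_scriptlets_for_network_changes() {
    grep -nE '(^|[^[:alnum:]_])(ifup|ifdown)([^[:alnum:]_]|$)|network.*(restart|reload)|(restart|reload).*network'
}
```

A possible use is `rpm -q --scripts initscripts | scan_scriptlets_for_network_changes`, and likewise with `rpm -q --triggers`. Fed the %pre/%post text quoted above, the filter prints nothing, which matches the observation that the specfile itself does not call ifup or restart the network service.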
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.