Bug 1123492 - NetworkManager must be disabled in staypuft deployments
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhel-osp-installer
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ga
Target Release: Installer
Assignee: Mike Burns
QA Contact: Toni Freger
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-25 20:01 UTC by Lars Kellogg-Stedman
Modified: 2014-08-22 03:10 UTC (History)

Fixed In Version: rhel-osp-installer-0.1.6-3.el6ost
Doc Type: Bug Fix
Doc Text:
In some deployment scenarios, Puppet configures a NIC as a port on a bridge. If NetworkManager was running and the NIC being changed was the one in use, this change caused the Puppet agent to be terminated. The installer now disables NetworkManager, so Puppet runs are no longer killed midway by a valid configuration change.
Clone Of:
Environment:
Last Closed: 2014-08-21 18:06:39 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2014:1090
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Enterprise Linux OpenStack Platform Enhancement Advisory
Last Updated: 2014-08-22 15:28:08 UTC

Description Lars Kellogg-Stedman 2014-07-25 20:01:36 UTC
I am deploying a non-HA configuration and having staypuft configure external network connectivity.  During the initial puppet run when the system boots, br-ex gets partially configured but is left without an address.  This leaves the system without any network connectivity, since I have configured staypuft to use eth0 (aka the provisioning interface) for external access.

This is the log from puppet:

Jul 25 19:33:12 mac52540036b16c.localdomain puppet-agent[1959]: (/Stage[main]/Neutron::Agents::Ovs/Neutron_plugin_ovs[OVS/bridge_mappings]/ensure) created
Jul 25 19:33:12 mac52540036b16c.localdomain ovs-vsctl[11004]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl add-br br-ex
Jul 25 19:33:12 mac52540036b16c.localdomain puppet-agent[1959]: (/Stage[main]/Neutron::Agents::Ovs/Neutron::Plugins::Ovs::Bridge[physnet-external:br-ex]/Vs_bridge[br-ex]/ensure) created
Jul 25 19:33:13 mac52540036b16c.localdomain ovs-vsctl[11021]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl add-port br-ex eth0
Jul 25 19:33:14 mac52540036b16c.localdomain systemd[1]: Stopping Puppet agent...
Jul 25 19:33:14 mac52540036b16c.localdomain puppet-agent[1755]: Caught TERM; calling stop
Jul 25 19:33:14 mac52540036b16c.localdomain puppet-agent[1959]: Caught TERM; calling stop
Jul 25 19:33:14 mac52540036b16c.localdomain systemd[1]: Starting Puppet agent...
Jul 25 19:33:14 mac52540036b16c.localdomain systemd[1]: Started Puppet agent.
Jul 25 19:33:16 mac52540036b16c.localdomain puppet-agent[11059]: Starting Puppet client version 3.6.2
Jul 25 19:33:17 mac52540036b16c.localdomain puppet-agent[11069]: Unable to fetch my node definition, but the agent run will continue:

As you can see, puppet-agent stops immediately after the call to 'add-port br-ex eth0'. The system logs implicate NetworkManager:

Jul 25 19:33:14 mac52540036b16c.localdomain NetworkManager[632]: <info> (eth0): deactivating device (reason 'connection-removed') [38]
Jul 25 19:33:14 mac52540036b16c.localdomain dbus[526]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jul 25 19:33:14 mac52540036b16c.localdomain dbus-daemon[526]: dbus[526]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jul 25 19:33:14 mac52540036b16c.localdomain systemd[1]: Starting Network Manager Script Dispatcher Service...
Jul 25 19:33:14 mac52540036b16c.localdomain NetworkManager[632]: <info> (eth0): canceled DHCP transaction, DHCP client pid 1373
Jul 25 19:33:14 mac52540036b16c.localdomain dbus-daemon[526]: dbus[526]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jul 25 19:33:14 mac52540036b16c.localdomain dbus[526]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jul 25 19:33:14 mac52540036b16c.localdomain systemd[1]: Started Network Manager Script Dispatcher Service.
Jul 25 19:33:14 mac52540036b16c.localdomain systemd[1]: Stopping Puppet agent...
Jul 25 19:33:14 mac52540036b16c.localdomain puppet-agent[1755]: Caught TERM; calling stop
Jul 25 19:33:14 mac52540036b16c.localdomain puppet-agent[1959]: Caught TERM; calling stop

Here you can see that NM starts the dispatcher service to handle the disconnect event for eth0, and immediately puppet-agent gets killed.
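
The correlation described above can be checked mechanically. The following is a minimal sketch, not part of the original report: it assumes syslog-style lines in the format shown in the excerpts, and the function names (parse, find_nm_puppet_stops), the assumed year, and the two-second window are hypothetical choices made for illustration.

```python
import re
from datetime import datetime

# Syslog-style prefix as seen in the excerpts above:
# "Jul 25 19:33:14 host process[pid]: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"(?P<proc>[\w.-]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def parse(line, year=2014):
    """Return (timestamp, process, message), or None if the line does not match.

    The log lines carry no year, so one is assumed (hypothetical default).
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["proc"], m["msg"]


def find_nm_puppet_stops(lines, window_seconds=2):
    """Pair each NetworkManager device deactivation with any
    'Stopping Puppet agent' message logged within window_seconds after it."""
    events = [e for e in map(parse, lines) if e is not None]
    deactivations = [
        e for e in events
        if e[1] == "NetworkManager" and "deactivating device" in e[2]
    ]
    stops = [
        e for e in events
        if e[1] == "systemd" and "Stopping Puppet agent" in e[2]
    ]
    return [
        (d, s)
        for d in deactivations
        for s in stops
        if 0 <= (s[0] - d[0]).total_seconds() <= window_seconds
    ]
```

Feeding it the lines from the excerpt above should pair the eth0 deactivation with the Puppet agent stop logged in the same second. A match only shows that the two events are close in time; it does not by itself prove that one caused the other.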

Comment 1 Lars Kellogg-Stedman 2014-07-25 20:18:44 UTC
And with NetworkManager disabled, br-ex gets configured correctly and the deploy continues successfully:

Jul 25 20:16:24 mac52540036b16c.localdomain ovs-vsctl[11132]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl add-br br-ex
Jul 25 20:16:24 mac52540036b16c.localdomain kernel: device br-ex entered promiscuous mode
Jul 25 20:16:24 mac52540036b16c.localdomain puppet-agent[1969]: (/Stage[main]/Neutron::Agents::Ovs/Neutron::Plugins::Ovs::Bridge[physnet-external:br-ex]/Vs_bridge[br-ex]/ensure) created
Jul 25 20:16:25 mac52540036b16c.localdomain ovs-vsctl[11149]: ovs|00001|vsctl|INFO|Called as /usr/bin/ovs-vsctl add-port br-ex eth0
Jul 25 20:16:25 mac52540036b16c.localdomain kernel: device eth0 entered promiscuous mode
Jul 25 20:16:26 mac52540036b16c.localdomain ovs-vsctl[11244]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-ex eth0
Jul 25 20:16:26 mac52540036b16c.localdomain kernel: device eth0 left promiscuous mode
Jul 25 20:16:27 mac52540036b16c.localdomain ovs-vsctl[11300]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-br br-ex
Jul 25 20:16:27 mac52540036b16c.localdomain kernel: device br-ex left promiscuous mode
Jul 25 20:16:27 mac52540036b16c.localdomain ovs-vsctl[11334]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-br br-ex -- set bridge br-ex other-config:hwaddr=52:54:00:36:b1:6c
Jul 25 20:16:27 mac52540036b16c.localdomain kernel: device br-ex entered promiscuous mode
Jul 25 20:16:28 mac52540036b16c.localdomain ovs-vsctl[11394]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-ex eth0
Jul 25 20:16:28 mac52540036b16c.localdomain kernel: device eth0 entered promiscuous mode
Jul 25 20:16:28 mac52540036b16c.localdomain dhclient[11419]: DHCPDISCOVER on br-ex to 255.255.255.255 port 67 interval 3 (xid=0x40c1bfaf)
Jul 25 20:16:28 mac52540036b16c.localdomain dhclient[11419]: DHCPREQUEST on br-ex to 255.255.255.255 port 67 (xid=0x40c1bfaf)
Jul 25 20:16:28 mac52540036b16c.localdomain dhclient[11419]: DHCPOFFER from 172.16.0.1
Jul 25 20:16:28 mac52540036b16c.localdomain dhclient[11419]: DHCPACK from 172.16.0.1 (xid=0x40c1bfaf)
Jul 25 20:16:30 mac52540036b16c.localdomain NET[11464]: /usr/sbin/dhclient-script : updated /etc/resolv.conf
Jul 25 20:16:30 mac52540036b16c.localdomain dhclient[11419]: bound to 172.16.0.6 -- renewal in 296 seconds.
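
For anyone reproducing this by hand, the workaround can be sketched as follows. This is an illustration only and not necessarily how the installer implements the fix: it assumes a systemd host (as the logs suggest) where the unit is named NetworkManager, and the helper names and the dry_run flag are hypothetical.

```python
import subprocess


def networkmanager_disable_commands(service="NetworkManager"):
    """Commands that stop the service now and keep it from starting at boot.

    The unit name is an assumption; adjust it if the host names it differently.
    """
    return [
        ["systemctl", "stop", service],
        ["systemctl", "disable", service],
    ]


def disable_networkmanager(run=subprocess.run, dry_run=True):
    """Return the commands, and execute them only when dry_run is False.

    Running them for real requires root and will change host networking.
    """
    commands = networkmanager_disable_commands()
    if not dry_run:
        for cmd in commands:
            run(cmd, check=True)
    return commands
```

By default the function only returns the commands so they can be reviewed first. The sketch does not cover what manages the interfaces once NetworkManager is off.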

Comment 8 Toni Freger 2014-08-07 12:17:30 UTC
NetworkManager is down as expected.

ruby193-rubygem-staypuft-0.1.22.el6ost

Comment 9 Assaf Muller 2014-08-12 13:51:09 UTC
This bug is only relevant when you use the same NIC for both provisioning and external access, correct?

Also, with NM off, eth0 has an IP address before the puppet run, and when the run finishes that IP is on br-ex? And this is not the case with NM on? If you ran with NM on and connected to the machine over SSH through a different NIC, would br-ex end up with an IP or not?
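
One way to answer the last question empirically is to look at the addresses on br-ex after the run. The sketch below is a suggestion rather than something taken from this report: it parses the output of `ip -4 -o addr show dev <interface>`, and the function names are hypothetical.

```python
import re
import subprocess

# Matches "inet 172.16.0.6/24" but not "inet6 ..." entries.
INET_RE = re.compile(r"\binet\s+(\d{1,3}(?:\.\d{1,3}){3})/(\d{1,2})\b")


def ipv4_addresses(ip_addr_output):
    """Extract 'address/prefix' strings from `ip -4 -o addr show` output."""
    return [f"{addr}/{prefix}" for addr, prefix in INET_RE.findall(ip_addr_output)]


def interface_ipv4(dev):
    """Run `ip` for one interface and return its IPv4 addresses.

    Raises CalledProcessError if the interface does not exist.
    """
    result = subprocess.run(
        ["ip", "-4", "-o", "addr", "show", "dev", dev],
        capture_output=True,
        text=True,
        check=True,
    )
    return ipv4_addresses(result.stdout)
```

Calling interface_ipv4("br-ex") and interface_ipv4("eth0") before and after the puppet run, once with NM on and once with NM off, would show whether the address moves to the bridge in each case.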

Comment 10 errata-xmlrpc 2014-08-21 18:06:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1090.html

