Bug 902372 - NetworkManager should ensure clean bridge state on startup with bridging support enabled
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: NetworkManager
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Dan Williams
QA Contact: Desktop QE
Keywords: ZStream
Duplicates: 902371
Blocks: 1021088
Reported: 2013-01-21 09:18 EST by David Jaša
Modified: 2017-02-06 10:16 EST
Fixed In Version: NetworkManager-0.8.1-53.el6
Doc Type: Bug Fix
Clones: 1021088
Last Closed: 2013-11-21 16:48:07 EST
Type: Bug


Attachments
Ensure clean bridge/bond interface state on startup (10.98 KB, patch)
2013-01-29 19:08 EST, Dan Williams

Description David Jaša 2013-01-21 09:18:23 EST
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to Reproduce:
0. stop all network-related services

1.
/etc/sysconfig/network-scripts/ifcfg-br0:
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=0
STP=off
NM_CONTROLLED=yes
HWADDR=f0:de:f1:04:c0:fa  # MAC address of eth0

/etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
BRIDGE=br0
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no   # or yes, it doesn't affect reproducer yet

user-defined connection using eth0 device:
$ gconftool-2 -R /system/networking/connections/1
 /system/networking/connections/1/ipv4:
  addresses = []
  name = ipv4
  dns = []
  routes = []
  method = auto
 /system/networking/connections/1/ipv6:
  addresses = []
  name = ipv6
  dns = []
  routes = []
  method = auto
  may-fail = true
 /system/networking/connections/1/connection:
  uuid = 9b887145-365a-447a-900e-4ad3a80bf157
  name = connection
  id = Auto Ethernet
  type = 802-3-ethernet
  timestamp = 1358772786
 /system/networking/connections/1/802-3-ethernet:
  name = 802-3-ethernet
  duplex = full

2. log in to a GNOME session as the user who owns the "Auto Ethernet" connection and make sure that nm-applet is running
3. service network start
4. service NetworkManager start
  
Actual results:
3. br0 is enabled and configured successfully by network-scripts (good)
4. nm-applet senses that it "can" reconnect "Auto Ethernet" connection
5a bridge is not recognized (br0 device is "unavailable" according to nmcli)
5b NM daemon happily configures "Auto Ethernet" connection on eth0 despite:
  * the device already being part of the bridge
  * NM_CONTROLLED=no in ifcfg-eth0 (that's for separate bug)

Expected results:
4. NM daemon should see that eth0 is part of a bridge and should not "offer" it to the clients at all
5. even if eth0 is visible to clients such as nm-applet that ask to enable the device, NM should refuse

Additional info:
Comment 1 David Jaša 2013-01-21 09:18:52 EST
NetworkManager-0.8.1-39.el6.x86_64
Comment 2 David Jaša 2013-01-21 09:35:57 EST
*** Bug 902371 has been marked as a duplicate of this bug. ***
Comment 3 Dan Williams 2013-01-22 11:18:24 EST
You need HWADDR=<eth0 mac address> in ifcfg-eth0, or some other minimal ifcfg file (say, ifcfg-eth0-not-controlled) that contains HWADDR and NM_CONTROLLED=no.  The same ifcfg file must contain NM_CONTROLLED=no and the HWADDR of the device that is not being managed.

Also, are you setting NM_BOND_BRIDGE_VLAN_ENABLED=yes in /etc/sysconfig/network or not?
Comment 4 David Jaša 2013-01-22 12:01:34 EST
(In reply to comment #3)
> You need HWADDR=<eth0 mac address> in ifcfg-eth0, or some other minimal
> ifcfg file (say, ifcfg-eth0-not-controlled) that contains HWADDR and
> NM_CONTROLLED=no. 

The resulting behaviour is plain wrong for any use case but brouter, so NM should prevent it implicitly (unless it configures a brouter of course, but that's not the case now).

> The same ifcfg file must contain NM_CONTROLLED=no and the
> HWADDR of the device that is not being managed.
> 
> Also, are you setting NM_BOND_BRIDGE_VLAN_ENABLED=yes in
> /etc/sysconfig/network or not?

yes


In addition, if I set the HWADDR in ifcfg-eth0, NM fails to see "Bridge br0" connection - I guess that it is caused by identical HWADDR for both devices. I expected NM to distinguish the devices though based on TYPE=(Ethernet|Bridge) that is specified correctly in both files (a separate bug?):

ifcfg-eth0:
DEVICE=eth0
TYPE=Ethernet
HWADDR=f0:de:f1:04:c0:fa
BRIDGE=br0
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no

ifcfg-br0:
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=0
STP=off
NM_CONTROLLED=yes
HWADDR=f0:de:f1:04:c0:fa
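
The HWADDR collision guessed at above can be spotted before NM ever reads the files. What follows is only a minimal sketch: `parse_ifcfg` and `shared_hwaddrs` are hypothetical helper names, the parser handles plain KEY=VALUE lines only, and nothing here reflects how NetworkManager itself matches connections to devices.

```python
from collections import defaultdict


def parse_ifcfg(text):
    """Parse simple KEY=VALUE lines; anything after '#' is treated as a comment."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def shared_hwaddrs(files):
    """Return {mac: [(DEVICE, TYPE), ...]} for every HWADDR used by more than one file.

    `files` maps an ifcfg file name to its text content.
    """
    by_mac = defaultdict(list)
    for name, text in files.items():
        cfg = parse_ifcfg(text)
        mac = cfg.get("HWADDR")
        if mac:
            user = (cfg.get("DEVICE", name), cfg.get("TYPE", "unknown"))
            by_mac[mac.lower()].append(user)
    return {mac: users for mac, users in by_mac.items() if len(users) > 1}


if __name__ == "__main__":
    # The two files quoted in this comment, shortened to the relevant keys.
    files = {
        "ifcfg-eth0": "DEVICE=eth0\nTYPE=Ethernet\nHWADDR=f0:de:f1:04:c0:fa\nBRIDGE=br0\n",
        "ifcfg-br0": "DEVICE=br0\nTYPE=Bridge\nHWADDR=f0:de:f1:04:c0:fa\n",
    }
    for mac, users in shared_hwaddrs(files).items():
        print(mac, "is claimed by", users)
```

The result keeps TYPE next to each DEVICE so that a reader can see at a glance whether the colliding entries are an Ethernet port and its bridge, which is the case described here.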
Comment 6 Dan Williams 2013-01-22 12:31:23 EST
Ok, so remove the HWADDR stuff in the ifcfg-br0.  And remove any NM_CONTROLLED=no from both ifcfg-eth0 and ifcfg-br0, since NetworkManager should be able to manage these devices.  If your intended configuration is to have br0 with one bridge port eth0, then this setup should be supported by NetworkManager.

Remember, in RHEL6 when NetworkManager starts, it does not matter what configuration the device had before.  The devices allowed to be managed by NM will be reconfigured when NM starts using the ifcfg files on the system.  Given the same ifcfg files, NetworkManager should produce the same configuration as the network scripts.

If there is an existing connection that applies to eth0 that is not what you want, then you can either delete it or set it to not autoconnect.  If that connection exists before you define the bridge port for eth0, then they will conflict.  It's up to the user/administrator to ensure that network connections do not conflict, just as it is when you have two ifcfg files for the same interface.
Comment 10 David Jaša 2013-01-23 04:52:50 EST
(In reply to comment #9)
> (In reply to comment #6)
> > Ok, so remove the HWADDR stuff in the ifcfg-br0.
> 
> If I do so, I wouldn't get the same IP from DHCP as before :] and more
> importantly, MAC address of br0 interface will sometimes change during br0
> lifetime, ...

bug 903134
Comment 11 David Jaša 2013-01-23 05:51:33 EST
(In reply to comment #10) 
> bug 903134

it looks more like initscripts bug: bug 903159
Comment 12 Dan Williams 2013-01-23 10:54:00 EST
You can actually have any number of ifcfg-* files for any single device.  It just used to be that people only ever connected to one ethernet network, and thus only had ifcfg-eth0.  But it's perfectly possible, expected, and reasonable to have ifcfg-work (static) and ifcfg-home (DHCP) that both have DEVICE=eth0 and are both perfectly valid network connections.

The initscripts only order ifcfg connections alphabetically.  So if you have ifcfg-eth0 (unbridged) and ifcfg-port-eth0 (bridged), and both are ONBOOT=yes, then the initscripts will start ifcfg-eth0 first.  There is no preference ordering in the initscripts either.
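
As a rough illustration of the alphabetical rule just described, the sketch below sorts a set of ifcfg file names. The function name `boot_order` is hypothetical, and treating a missing ONBOOT as "start at boot" is an assumption of this sketch; the real initscripts may differ in detail.

```python
def boot_order(connections):
    """Return the ifcfg names that would be started at boot, in alphabetical order.

    `connections` maps an ifcfg file name to a dict of its settings.
    Assumption of this sketch: only an explicit ONBOOT=no excludes a file.
    """
    enabled = [
        name
        for name, cfg in connections.items()
        if cfg.get("ONBOOT", "yes").lower() != "no"
    ]
    return sorted(enabled)


if __name__ == "__main__":
    connections = {
        "ifcfg-port-eth0": {"DEVICE": "eth0", "BRIDGE": "br0", "ONBOOT": "yes"},
        "ifcfg-eth0": {"DEVICE": "eth0", "BOOTPROTO": "dhcp", "ONBOOT": "yes"},
    }
    print(boot_order(connections))
```

With the two files from the example, the unbridged ifcfg-eth0 sorts ahead of ifcfg-port-eth0, so the name of a file alone decides which configuration is tried first.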

NM lets you pick one of these connections *at one time*, because obviously you can only have one network configuration applied to any single device at a time.  But you can certainly manually disconnect the wrong one and start the correct one if you like, or you can set one to autoconnect and the other to not autoconnect to achieve the right startup behavior that you desire.

So I'm not quite sure what you're asking about here with conflicting configurations...
Comment 13 David Jaša 2013-01-23 11:12:07 EST
> NM lets you pick one of these connections *at one time*,

The problem is that this is not true with bridge-with-eth0-as-port and other-connection-on-top-of-eth0. These two do conflict but NM will try to activate the second while the first is active. That's actually what I observed.

I believe that NM not checking whether a device is enslaved is a more generic bug, with this reproducer:
1. configure system or user connection using eth0 (using ifcfg or nm-(applet,connection-editor), that doesn't matter), keep that connection down
2. create a bridge manually, enslave the eth0 to the bridge, get dhcp configuration for the bridge
3. bring the connection from step 1. up

NM will try to bring the connection up and kernel will happily obey so there are high chances of having two conflicting actual network connections configured at the same time. Is it clearer now?
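
The missing "is this device enslaved?" check could look roughly like the sketch below. It assumes the kernel exposes a `brport/bridge` symlink under `/sys/class/net/<device>/` for bridge ports; `bridge_of` and `may_activate` are hypothetical names, and the refusal policy is the one requested in this report, not necessarily what the eventual patch implements.

```python
import os


def bridge_of(dev, sysfs_root="/sys/class/net"):
    """Return the name of the bridge that `dev` is a port of, or None.

    Assumes bridge ports carry a `brport/bridge` symlink pointing at the bridge device.
    """
    link = os.path.join(sysfs_root, dev, "brport", "bridge")
    if not os.path.exists(link):
        return None
    return os.path.basename(os.path.realpath(link))


def may_activate(dev, sysfs_root="/sys/class/net"):
    """Hypothetical policy: refuse a standalone connection on an enslaved device."""
    return bridge_of(dev, sysfs_root) is None


if __name__ == "__main__":
    for dev in ("eth0",):
        bridge = bridge_of(dev)
        if bridge is None:
            print(dev, "is not a bridge port")
        else:
            print(dev, "is a port of", bridge, "and should not be offered to clients")
```

The `sysfs_root` parameter exists only so the check can be exercised against a fake directory tree instead of a live system.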
Comment 14 Dan Williams 2013-01-23 13:09:51 EST
Things that happen before NM starts are not preserved, except for plain wired DHCP and static IP connections.  That's how it's been from day 1.

Is step (2) happening before NM starts, or underneath NM (using the command line) while NM is running?
Comment 15 Dan Williams 2013-01-23 13:15:35 EST
In any case, thanks for clarifying the issue.  The problem is clearly stated in comment 13.

The problem you describe is the way things have always worked in RHEL6 with NetworkManager.  There is only simple cooperation between NetworkManager and externally managed interfaces.  If some tool is expected to manage eth0 externally from NetworkManager, then the current solution is to put NM_CONTROLLED=no and HWADDR=xx into ifcfg-eth0 to allow that external tool to manage the interface without NM interfering.

If that is done, then eth0 is shown as "unmanaged" by NetworkManager, and the user is not allowed to change configuration of eth0 through NetworkManager.
Comment 16 Dan Williams 2013-01-23 13:19:13 EST
(In reply to comment #15)
> In any case, thanks for clarifying the issue.  The problem is clearly stated
> in comment 13.
> 
> The problem you describe is the way things have always worked in RHEL6 with
> NetworkManager.  There is only simple cooperation between NetworkManager and
> externally managed interfaces.  If some tool is expected to manage eth0
> externally from NetworkManager, then the current solution is to put
> NM_CONTROLLED=no and HWADDR=xx into ifcfg-eth0 to allow that external tool
> to manage the interface without NM interfering.
> 
> If that is done, then eth0 is shown as "unmanaged" by NetworkManager, and
> the user is not allowed to change configuration of eth0 through
> NetworkManager.

Oh, also, make sure you do not have NM_BRIDGE_BOND_VLAN_ENABLED=yes in /etc/sysconfig/network, otherwise NM would find and manage the bridge, but could not manage the port.  So to clarify:

1) ensure NM_BRIDGE_BOND_VLAN_ENABLED is not set in /etc/sysconfig/network
2) set HWADDR in ifcfg-eth0
3) set NM_CONTROLLED=no in ifcfg-eth0

and the bridge and port can be managed by external tools (eg, libvirt, etc) without interference from NetworkManager.  NM will still control the default route and DNS, however, and since it is not allowed to manage br0 or eth0, the default route will not be assigned to those interfaces.  If eth0/br0 is your primary network connection, then it may be best to disable NetworkManager entirely.
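
The three conditions listed above can be checked mechanically. The sketch below is a hedged illustration only: `parse_shell_vars` and `external_management_problems` are hypothetical helpers, the parser understands plain KEY=VALUE lines, and the name of the enable flag is passed in by the caller rather than hard-coded, since it should be taken from the system's own documentation.

```python
def parse_shell_vars(text):
    """Parse simple KEY=VALUE lines; anything after '#' is treated as a comment."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def external_management_problems(network_text, ifcfg_text, flag_name):
    """Return a list of unmet conditions; an empty list means all three hold.

    network_text: content of /etc/sysconfig/network
    ifcfg_text:   content of the port's ifcfg file
    flag_name:    name of the bridging/bonding/VLAN enable flag (caller-supplied)
    """
    network = parse_shell_vars(network_text)
    ifcfg = parse_shell_vars(ifcfg_text)
    problems = []
    if flag_name in network:
        problems.append(flag_name + " is set in /etc/sysconfig/network")
    if not ifcfg.get("HWADDR"):
        problems.append("HWADDR is missing from the ifcfg file")
    if ifcfg.get("NM_CONTROLLED", "yes").lower() != "no":
        problems.append("NM_CONTROLLED is not set to no")
    return problems


if __name__ == "__main__":
    network = "NETWORKING=yes\nNM_BOND_BRIDGE_VLAN_ENABLED=yes\n"  # sample content
    ifcfg_eth0 = "DEVICE=eth0\nHWADDR=f0:de:f1:04:c0:fa\nBRIDGE=br0\nNM_CONTROLLED=no\n"
    # Flag spelled here as in comment 3.
    for problem in external_management_problems(network, ifcfg_eth0, "NM_BOND_BRIDGE_VLAN_ENABLED"):
        print("problem:", problem)
```

Returning a list of problems rather than a single boolean makes it clear which of the three settings still needs attention.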
Comment 17 David Jaša 2013-01-23 17:23:49 EST
(In reply to comment #14)
> Things that happen before NM starts are not preserved, except for plain
> wired DHCP and static IP connections.  That's how it's been from day 1.

The equivalent of what I want to get fixed would be not to do dhcp release and IP deconfiguration before putting up the new automatic connection.

IOW if configuration is not to be preserved, then NM should:
* release dhcp lease on br0 (I'd keep static configuration as that may be used by other bridge ports)
* unslave eth0 from br0
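
The two cleanup steps above can be written out as the commands an administrator would run by hand. This is a sketch under assumptions: `cleanup_plan` is a hypothetical helper that only builds argument lists and executes nothing, it uses `dhclient -r` and `brctl delif` as the manual equivalents, and it says nothing about how NetworkManager would perform the same steps internally.

```python
def cleanup_plan(bridge, ports):
    """Return the manual cleanup commands as argument lists; nothing is executed."""
    # Release the DHCP lease held on the bridge interface.
    plan = [["dhclient", "-r", bridge]]
    # Detach every port from the bridge.
    for port in ports:
        plan.append(["brctl", "delif", bridge, port])
    return plan


if __name__ == "__main__":
    for argv in cleanup_plan("br0", ["eth0"]):
        print(" ".join(argv))
```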

I do understand your other points, but:
1) some of them do not apply to my scenario (I _want_ bridge to be managed by NM)
2) some of them are workarounds - descriptions of manual settings that prevent NM from creating invalid configuration
Comment 18 David Jaša 2013-01-24 09:42:32 EST
In -43, the concurrent connections managed by NM don't occur any more, so the current reproducer is just the one in comment 13.
Comment 19 Jirka Klimes 2013-01-29 09:55:13 EST
David, I'm trying to nail down what is your report about:
1) You are *not* talking about managing bridges outside NM
2) You suggest that when NM activates a bridge (br0) with enslaved device (eth0), then  NM should recognize that eth0 is part of the active bridge and ignore other possible configured connections for the eth0 device.
Right?

As Dan wrote, it's perfectly possible to have more connections for a device configured and it is up to the administrator to set up what connection will autoconnect, etc.
E.g. ifcfg-static, ifcfg-dhcp, ifcfg-yetanother (all for eth0)

So, administrator should ensure that no connection profile conflicts with the configured bridge.
However, I see your point and think it would be a good enhancement to temporarily disable ethernet profiles for devices that are enslaved in a bond.
Comment 20 Dan Williams 2013-01-29 14:21:25 EST
(In reply to comment #19)
> However, I see your point and think it would be a good enhancement to
> temporarily disable ethernet profiles for devices that are enslaved in a
> bond.

That's currently how the user would remove a slave from the bond/port from the bridge and use it for something else in one operation rather than having to "disconnect" the interface first.  However, it can be argued that having a slave in a bridge/bond is a more destructive operation and that it should be a two step process of "disconnect" and then "reconnect using some other connection".
Comment 21 Dan Williams 2013-01-29 14:34:00 EST
(In reply to comment #17)
> (In reply to comment #14)
> > Things that happen before NM starts are not preserved, except for plain
> > wired DHCP and static IP connections.  That's how it's been from day 1.
> 
> The equivalent of what I want to get fixed would be not to do dhcp release
> and IP deconfiguration before putting up the new automatic connection.
> 
> IOW if configuration is not to be preserved, then NM should:
> * release dhcp lease on br0 (I'd keep static configuration as that may be
> used by other bridge ports)
> * unslave eth0 from br0

Both these could be done, I suppose.  But again, the way these things have always worked with NM (and thus not a regression) is that whatever came before NM is not known to NM, and thus ignored.  Unfortunately there's not a good way to tell initscripts to *not* bring the bridge up before NM, but still have NM autoconnect the bridge.

Actually this could be done by 'chkconfig network off' if all your interfaces are managed by NetworkManager.  So there is a workaround in that case, but this workaround does not apply if you have interfaces that are not managed by NM but that you expect to start at boot time.
Comment 22 Dan Williams 2013-01-29 18:11:59 EST
Bug title changed to reflect the actual issue being discussed now.
Comment 23 Dan Williams 2013-01-29 19:08:12 EST
Created attachment 690081
Ensure clean bridge/bond interface state on startup
Comment 24 RHEL Product and Program Management 2013-01-29 19:09:11 EST
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.
Comment 26 David Jaša 2013-01-30 10:32:24 EST
(In reply to comment #19)
> David, I'm trying to nail down what is your report about:
> 1) You are *not* talking about managing bridges outside NM
> 2) You suggest that when NM activates a bridge (br0) with enslaved device
> (eth0), then  NM should recognize that eth0 is part of the active bridge and
> ignore other possible configured connections for the eth0 device.
> Right?

yes

> 
> As Dan wrote, it's perfectly possible to have more connections for a device
> configured and it is up to the administrator to set up what connection will
> autoconnect, etc.
> E.g. ifcfg-static, ifcfg-dhcp, ifcfg-yetanother (all for eth0)
> 
> So, administrator should ensure that no connection profile conflicts with
> the configured bridge.
> However, I see your point and think it would be a good enhancement to
> temporarily disable ethernet profiles for devices that are enslaved in a
> bond.

Yes

If I'm reading the patches correctly, they check both ways: when putting up ethernet connection, deconfiguring related bridge; and when putting up bridge connection, deconfiguring bridge? If so, I think that the bug should be fixed completely by the patch.
Comment 30 David Jaša 2013-01-31 10:18:52 EST
works fine for NM connections (current connection is deconfigured or not touched at all; both ways).
Comment 32 David Jaša 2013-01-31 10:45:02 EST
when there is manually configured bridge (test0) with eth0 as slave and IP configuration via dhclient; and then user activates Auto_Ethernet*, the IP is deconfigured (good) but the eth0 device is not unplugged from test0 bridge (bad).

The correct behavior with the "tear down existing connection and go on setting up new one" approach is to unplug eth0 from the test0 bridge.

* I had to rename "Auto Ethernet" to Auto_Ethernet halfway to make expression:
COMMAND='nmcli ...' ; logger -t "(root shell)" $COMMAND ; $COMMAND
work
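
The rename was needed because an unquoted `$COMMAND` is split on whitespace, with any quote characters inside the variable kept as literal text. The sketch below mimics both behaviours in Python. The nmcli arguments stay elided as in the comment above, and passing the connection name as a double-quoted argument is an assumption made for illustration.

```python
import shlex


def naive_words(command):
    """Split on whitespace only, as an unquoted shell variable expansion does."""
    return command.split()


def shell_words(command):
    """Split the way a shell parses a command line, honouring the quotes."""
    return shlex.split(command)


if __name__ == "__main__":
    command = 'nmcli ... "Auto Ethernet"'  # arguments elided as in the comment
    print(naive_words(command))  # the connection name ends up in two words
    print(shell_words(command))  # the connection name stays one word
```

Keeping the command as a list of words instead of one string avoids the problem without renaming the connection.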
Comment 34 David Jaša 2013-01-31 10:56:32 EST
if there is manual ip configuration on (unenslaved) eth0 (dhclient eth0 in my case), ip configuration is not removed at all when I activate "System eth0" connection and by extension, "Bridge br0" connection.

The IP configuration should be removed from future bridge port before enslaving.
Comment 35 David Jaša 2013-01-31 10:57:39 EST
All four tests above were conducted without any interference by network-scripts.
Comment 36 David Jaša 2013-01-31 11:24:37 EST
(In reply to comment #30)
> works fine for NM connections (current connection is deconfigured or not
> touched at all; both ways).

Clarifying this a bit:
if you transition from "Bridge br0" + "System eth0" to "Auto Ethernet", the bridge configuration is removed before activating "Auto Ethernet".

if you try to transition from "Auto Ethernet" to "System eth0" (and "Bridge br0" by extension), you get error message saying that "a connection is already activating on the device".

both are good.
Comment 45 errata-xmlrpc 2013-11-21 16:48:07 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1670.html
