Bug 905059 - NM can not take down bridge connection via CLI at all and awkwardly via applet
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: NetworkManager
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Dan Williams
QA Contact: Desktop QE
Keywords: Regression, ZStream
Depends On:
Blocks: 558983
Reported: 2013-01-28 08:46 EST by David Jaša
Modified: 2013-11-21 16:48 EST (History)

See Also:
Fixed In Version: NetworkManager-0.8.1-53.el6
Doc Type: Bug Fix
Last Closed: 2013-11-21 16:48:20 EST
Type: Bug


Attachments
Ensure disconnected master doesn't get reactivated by autoconnect slave (3.63 KB, patch)
2013-01-29 18:43 EST, Dan Williams
Description David Jaša 2013-01-28 08:46:35 EST
Description of problem:
NM can not take down bridge connection via CLI at all and awkwardly via applet

Version-Release number of selected component (if applicable):
NetworkManager-0.8.1-43.el6.x86_64
kernel-2.6.32-353.el6.x86_64
bridge-utils-1.2-10.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. set up a bridge br0 connection with eth0 slave via ifcfg files:
ifcfg-br0:
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=no
DELAY=0
STP=off
NM_CONTROLLED=yes

ifcfg-eth0:
DEVICE=eth0
BRIDGE=br0
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=yes

2. bring up "Bridge br0" connection
3. try to take down the bridge connection:
3a) via cli: nmcli con down id "Bridge br0"
3b) via applet: "Bridge br0" disconnect
  
Actual results:
* connection is immediately reconnected
* if you take down "System eth0" first and "Bridge br0" second (both via applet), the connection goes down but it takes a bit of magic to take it up again - "normal" actions such as clicking on the connection in applet doesn't do anything:
> Jan 24 11:33:20 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
> Jan 24 11:33:23 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
> Jan 24 11:33:26 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
> Jan 24 11:33:29 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
> Jan 24 11:33:31 dhcp-29-7 kernel: br0: port 1(eth0) entering forwarding state
> Jan 24 11:33:32 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
> Jan 24 11:33:35 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
> Jan 24 11:33:38 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
(repeats indefinitely)

Expected results:
connection is disconnected (and can be reconnected again with a single nmcli invocation or click in the applet)

Additional info:
This is regression from -39 (I didn't try builds between -39 and -43)

Not being able to take the connection down from the CLI hampers the actual usefulness of bridge support in environments without a desktop.
Comment 3 Jirka Klimes 2013-01-29 08:00:04 EST
(In reply to comment #0)
> 
> 2. bring up "Bridge br0" connection
> 3. try to take down the bridge connection:
> 3a) via cli: nmcli con down id "Bridge br0"
This actually calls DeactivateConnection() that deactivates the connection, but doesn't prevent the device from activating again (the same or other connection).
To disconnect the device (the same way the applet does) you should call:
nmcli dev disconnect iface br0

The behaviour doesn't change at all.
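
The intended difference between the two calls can be sketched as a toy model in C. This is not NetworkManager code: the `toy_device` struct and all `toy_*` functions are hypothetical names, and the model covers only what is stated above, namely that DeactivateConnection() leaves the device free to activate again while a device-level disconnect should not. The bug in this report is that the bridge came back in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in NetworkManager. */
struct toy_device {
    bool active;               /* a connection is currently active */
    bool autoconnect_blocked;  /* set only by a device-level disconnect */
};

/* Models DeactivateConnection(): the connection goes down, nothing else changes. */
static void toy_deactivate_connection(struct toy_device *dev)
{
    assert(dev != NULL);
    dev->active = false;
}

/* Models "nmcli dev disconnect": also blocks automatic re-activation. */
static void toy_disconnect_device(struct toy_device *dev)
{
    assert(dev != NULL);
    dev->active = false;
    dev->autoconnect_blocked = true;
}

/* Models one autoconnect pass: re-activates an idle device unless blocked. */
static void toy_autoconnect_pass(struct toy_device *dev)
{
    assert(dev != NULL);
    if (!dev->active && !dev->autoconnect_blocked)
        dev->active = true;
}
```

In this model a device taken down with `toy_deactivate_connection()` is active again after the next `toy_autoconnect_pass()`, while one taken down with `toy_disconnect_device()` stays down.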

> 3b) via applet: "Bridge br0" disconnect
> > Jan 24 11:33:38 dhcp-29-7 NetworkManager[12962]: <info> (br0): IPv4 config waiting until slaves are present and forwarding
> (repeats indefinitely)
> 
Hmm, if there's an endless loop that's a problem. Could you try again and attach the exact reproducer and /var/log/messages?
Comment 6 Dan Williams 2013-01-29 14:28:52 EST
We could change the behavior of nm-applet to call Disconnect() instead of DeactivateConnection(), but this would constitute a behavior change.

Disconnect() clears the interface's configuration and inhibits autoconnect until the interface is manually reconnected, until a carrier change, or until the machine is put to sleep.

This change would not be hard to make, but it's important to understand the implications and the change in behavior.
Comment 7 Dan Williams 2013-01-29 14:30:33 EST
(In reply to comment #0)
> Additional info:
> This is regression from -39 (I didn't try builds between -39 and -43)

Not really a regression from -39 since the applet only gained support for bridge/bond devices in -43.  So the functionality to disconnect a slave simply wasn't there before in -39, and it's hard to have something regress when it wasn't present in the first place.
Comment 8 Dan Williams 2013-01-29 14:34:01 EST
Just had a thought: we could do a more complicated fix and call Disconnect() only on bridge/bond masters and on slave interfaces, but when e.g. eth0 is disconnected and it's *not* a slave, call DeactivateConnection() like we always have.  That way you only get a behavior change when the interface is using the new functionality, and that new functionality has to be manually enabled by the user.
Comment 9 Dan Williams 2013-01-29 16:17:48 EST
(In reply to comment #6)
> We could change the behavior of nm-applet to call Disconnect() instead of
> DeactivateConnection(), but this would constitute a behavior change.
> 
> Disconnect() clears the interface's configuration and inhibits autoconnect
> until the interface is manually reconnected, until a carrier change, or
> until the machine is put to sleep.
> 
> This change would not be hard to make, but it's important to understand the
> implications and the change in behavior.

Scratch all that; jklimes has it correct; the applet calls Disconnect() already, which will not attempt to auto-restart the bridge/bond interface unless manually reactivated.
Comment 10 Dan Williams 2013-01-29 16:40:04 EST
For the "infinite looping" problem, we'll need a quick backport of this commit from git master:

http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?h=dcbw/bb-fixes&id=01a21feba21e00eb87a7a160afe199c6b4057ad8

From 01a21feba21e00eb87a7a160afe199c6b4057ad8 Mon Sep 17 00:00:00 2001
From: Dan Williams <dcbw@redhat.com>
Date: Mon, 28 Jan 2013 16:53:16 +0000
Subject: core: return success when port already attached to bridge

Instead of just not logging the error, don't return failure either.
---
diff --git a/src/nm-system.c b/src/nm-system.c
index 3bc55bd..2f95076 100644
--- a/src/nm-system.c
+++ b/src/nm-system.c
@@ -2516,9 +2516,14 @@ nm_system_bridge_attach (int master_ifindex,
 	                             mif ? mif : master_iface,
 	                             slave_ifindex,
 	                             sif ? sif : slave_iface);
-	if (err < 0 && err != -EBUSY) {
-		nm_log_err (LOGD_DEVICE, "(%s): failed to attach slave %s: %s",
-		            master_iface, slave_iface, strerror (-err));
+	if (err < 0) {
+		if (err == -EBUSY) {
+			/* Interface already attached to the given bridge */
+			err = 0;
+		} else {
+			nm_log_err (LOGD_DEVICE, "(%s): failed to attach slave %s: %s",
+			            master_iface, slave_iface, strerror (-err));
+		}
 	}
 
 out:
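
The effect of the patch can be illustrated in isolation. The sketch below is a standalone reduction, not the NetworkManager source: `toy_attach_result()` is a hypothetical helper that takes the negative-errno value an attach call returned and applies the same rule as the patched `nm_system_bridge_attach()`, so that -EBUSY is reported as success while other errors are logged and passed through. Logging to stderr stands in for `nm_log_err()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of NetworkManager. Takes a negative-errno
 * result and maps -EBUSY (port already attached) to success; any other
 * error is logged to stderr and returned unchanged.
 */
static int toy_attach_result(const char *master_iface,
                             const char *slave_iface,
                             int err)
{
    assert(master_iface != NULL && slave_iface != NULL);

    if (err < 0) {
        if (err == -EBUSY) {
            /* Interface already attached to the given bridge */
            err = 0;
        } else {
            fprintf(stderr, "(%s): failed to attach slave %s: %s\n",
                    master_iface, slave_iface, strerror(-err));
        }
    }
    return err;
}
```

Before the patch, the -EBUSY case skipped the log message but still returned a negative value to the caller, so an already-attached port was treated as a failed attach.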


I cannot reproduce the applet issue from comment #0: "if you take down "System eth0" first and "Bridge br0" second (both via applet), the connection goes down but it takes a bit of magic to take it up again - "normal" actions such as clicking on the connection in applet doesn't do anything".

If you can reproduce that issue, please include /var/log/messages and also add:

ls /sys/class/net/br0/brif/
Comment 12 Dan Williams 2013-01-29 18:41:59 EST
One other issue I've observed is that if you start a bridge/bond connection, and have one slave set autoconnect=yes (eg, ONBOOT=yes), then disconnecting the master will terminate the connection, but since the slave connection is autoconnect=yes, it will immediately restart activation of the slave which restarts activation of the just-disconnected master too.  That's a bug; if the master has been disconnected, it should not be retriggered automatically.

Patch for this attached.
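
A rough sketch of the rule described above, written as a toy model in C. The attached patch is not reproduced here, and every name below (`toy_master`, `toy_slave`, `user_disconnected`, and the functions) is hypothetical. The model also assumes that disconnecting the master takes its slave down with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; not taken from the attached patch. */
struct toy_master {
    bool active;
    bool user_disconnected;    /* set when the user disconnects the master */
};

struct toy_slave {
    bool active;
    bool autoconnect;          /* e.g. ONBOOT=yes in the slave's ifcfg file */
    struct toy_master *master;
};

/* User disconnects the master; assumed to take the slave down as well. */
static void toy_master_disconnect(struct toy_master *m, struct toy_slave *s)
{
    assert(m != NULL && s != NULL);
    m->active = false;
    m->user_disconnected = true;
    s->active = false;
}

/* One autoconnect pass over the slave. Returns true if it activated anything. */
static bool toy_slave_autoconnect(struct toy_slave *s)
{
    assert(s != NULL && s->master != NULL);
    if (s->active || !s->autoconnect)
        return false;
    /* The guard: a slave must not drag a user-disconnected master back up. */
    if (s->master->user_disconnected)
        return false;
    s->master->active = true;
    s->active = true;
    return true;
}
```

Without the `user_disconnected` check, the autoconnect pass would bring the slave and its master straight back up, which is the immediate reconnection described in this report.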
Comment 13 Dan Williams 2013-01-29 18:43:59 EST
Created attachment 690080 [details]
Ensure disconnected master doesn't get reactivated by autoconnect slave
Comment 14 Dan Williams 2013-01-29 18:46:37 EST
So we have two fixes; the first one (bridge attach error patch from comment 10) is clearly correct and should be applied anyway.  It fixes the infinite looping issue.

The second fixes the issue of immediate reconnection if disconnecting the bridge via 'nmcli dev disconnect' or via nm-applet.

I'm happy to commit both these patches if given the go-ahead.
Comment 15 RHEL Product and Program Management 2013-01-29 18:49:04 EST
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.
Comment 24 David Jaša 2013-01-31 10:06:57 EST
Works for me in both ways: connection can be taken down just fine and it is brought up again without any magic no matter what happens before it.

There are still gaps in bridging support, such as no indication of which port has to be active in order to obtain a DHCP lease (for BOOTPROTO=dhcp), but they are present in network-scripts as well. I think that this bug can go to VERIFIED if nobody else can reproduce either of the two issues.
Comment 30 errata-xmlrpc 2013-11-21 16:48:20 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1670.html
