Bug 670154

Summary: /sbin/ifdown fails for bridge devices when NetworkManager is running
Product: Red Hat Enterprise Linux 6
Reporter: Vivian Bian <vbian>
Component: initscripts
Assignee: initscripts Maintenance Team <initscripts-maint-list>
Status: CLOSED ERRATA
QA Contact: qe-baseos-daemons
Severity: medium
Docs Contact:
Priority: low
Version: 6.1
CC: azelinka, dallan, dcbw, eblake, jyang, laine, notting, plautrba, xen-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: initscripts-9.03.22-1.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 684322 (view as bug list) Environment:
Last Closed: 2011-05-19 13:52:01 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vivian Bian 2011-01-17 11:34:24 UTC
Description of problem:


Version-Release number of selected component (if applicable):
libvirt-0.8.7-1.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Define a new interface (e.g. # virsh iface-define bridge.xml)
2. Start the newly created interface
3. Try to destroy the interface
  
Actual results:
virsh iface-destroy br0
error: Failed to destroy interface br0
error: internal error failed to destroy (stop) interface br0 (netcf: failed to execute external program - Running 'ifdown br0' failed with exit code 1)

/var/log/messages output:
Jan 17 17:49:27 dhcp-66-92-51 libvirtd: 17:49:27.593: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 26590 from 127.0.0.1;0
Jan 17 17:49:34 dhcp-66-92-51 libvirtd: 17:49:34.820: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 26605 from 127.0.0.1;0
Jan 17 17:49:55 dhcp-66-92-51 libvirtd: 17:49:55.540: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 26627 from 127.0.0.1;0
Jan 17 17:54:19 dhcp-66-92-51 libvirtd: 17:54:19.061: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 26951 from 127.0.0.1;0
Jan 17 17:54:27 dhcp-66-92-51 libvirtd: 17:54:27.748: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 26962 from 127.0.0.1;0
Jan 17 17:54:35 dhcp-66-92-51 libvirtd: 17:54:35.627: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 26986 from 127.0.0.1;0
Jan 17 17:54:41 dhcp-66-92-51 libvirtd: 17:54:41.512: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 26992 from 127.0.0.1;0
Jan 17 17:56:43 dhcp-66-92-51 libvirtd: 17:56:43.508: 24717: info : qemudDispatchServer:1410 : Turn off polkit auth for privileged client pid 27135 from 127.0.0.1;0

Jan 17 17:58:22 dhcp-66-92-51 libvirtd: 17:58:22.139: 24722: warning : virEventUpdateHandleImpl:139 : Ignoring invalid update watch -1

[root@dhcp-66-92-51 ~]# virsh iface-list --all
Name                 State      MAC Address
--------------------------------------------
br0                  active     00:1b:21:39:8b:18
eth1                 active     00:1b:21:39:8b:19
eth2                 active     d8:d3:85:7e:61:9b
lo                   active     00:00:00:00:00:00


Expected results:
The interface can be stopped successfully.

Additional info:
For example, I created an interface with this XML:
<interface type='bridge' name='br0'>
  <start mode='onboot'/>
  <mtu size='1500'/>
  <protocol family='ipv4'>
    <dhcp/>
  </protocol>
  <bridge stp='off' delay='0.01'>
    <interface type='ethernet' name='eth0'>
    </interface>
  </bridge>
</interface>

Comment 2 Laine Stump 2011-03-09 18:59:24 UTC
Vivian:

1) Are you running NetworkManager? If so, you should probably disable it for your testing, as use of NetworkManager is not supported when bridge devices are connected to physical interfaces.

2) "virsh iface-destroy" ends up just calling "/sbin/ifdown br0". If you run /sbin/ifdown directly (thus removing both libvirt and netcf from the picture), you will see this message:

   # ifdown br0
   Error: Device 'br0' not found.

Past experience has shown that this has been caused by a disagreement between NetworkManager and /sbin/ifdown over who is managing br0, and indeed when I stop the NetworkManager service and again try ifdown br0, it is successful:

   # service NetworkManager stop
   Stopping NetworkManager daemon:                            [  OK  ]
   # ifdown br0
   # ifconfig br0
   br0       Link encap:Ethernet  HWaddr 00:27:13:53:DB:77  
             BROADCAST MULTICAST  MTU:1500  Metric:1
             RX packets:9 errors:0 dropped:0 overruns:0 frame:0
             TX packets:57 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:0 
             RX bytes:787 (787.0 b)  TX bytes:9925 (9.6 KiB)

(i.e., it was brought down successfully).

I've done further investigation, and found that the problem here is that the function is_nm_device_managed() in

  /etc/sysconfig/network-scripts/network-functions

(which is used by /sbin/ifdown) is trying to determine whether or not br0 is managed by NM with this command:

    nmcli -t --fields device,state dev status | grep -q "^${1}:unmanaged$"

(where ${1} in this case is 'br0'). This does not work correctly, because br0 doesn't even show up in the output of the nmcli command. Likewise, the check should also report that eth0 (which is enslaved to br0) is not managed by NM, but that fails as well, because the output for eth0 is "eth0:disconnected".
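The failure mode described above can be reproduced without NetworkManager by feeding the same grep a canned status line. This is only a sketch: the function name is_unmanaged_by_grep and the sample status text are made up for illustration, with the sample built from what this comment reports (br0 missing from the output, eth0 listed as "disconnected").

```shell
#!/bin/sh
# Sketch of the grep-based check quoted above. The real function reads
# live nmcli output; here the status text arrives on stdin so the
# behaviour can be observed on any machine.
is_unmanaged_by_grep() {
    # $1: device name; stdin: lines of the form "device:state"
    grep -q "^${1}:unmanaged$"
}

# Hypothetical status text: br0 is absent and eth0 is "disconnected",
# as described in this comment.
sample_status='eth0:disconnected'

for dev in br0 eth0; do
    if printf '%s\n' "$sample_status" | is_unmanaged_by_grep "$dev"; then
        echo "$dev: treated as unmanaged"
    else
        echo "$dev: treated as managed by NetworkManager"
    fi
done
```

Neither device matches the pattern, so both fall into the "managed" branch, which is why ifdown goes on to ask NetworkManager about a device it does not know.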

Since network-functions is owned by initscripts, I'm changing the component accordingly.

(To the initscripts people: note that it's not necessary to use libvirt or netcf for bridge configuration to see this problem. You could also set up the ifcfg-br0 and ifcfg-eth0 files by hand and experience the same behavior. The only extra requirement is that NetworkManager be running.)
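For anyone who wants to try the hand-written variant, a minimal pair of ifcfg files might look like the sketch below. The values are illustrative and only loosely mirror the XML from the description, not a configuration taken from the affected machine. The helper name write_bridge_ifcfg is hypothetical, and it writes into a scratch directory of your choice rather than into /etc/sysconfig/network-scripts.

```shell
#!/bin/sh
# Writes an example ifcfg-br0 / ifcfg-eth0 pair into the directory
# given as $1. Nothing is written to the live network-scripts
# directory; copy the files there by hand if you want to test.
write_bridge_ifcfg() {
    dir=$1
    mkdir -p "$dir" || return 1

    # The bridge itself: gets its address via DHCP, STP off.
    cat > "$dir/ifcfg-br0" <<'EOF'
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=dhcp
STP=off
DELAY=0
EOF

    # The physical interface enslaved to the bridge.
    cat > "$dir/ifcfg-eth0" <<'EOF'
DEVICE=eth0
ONBOOT=yes
BRIDGE=br0
EOF
}
```

Calling write_bridge_ifcfg ./ifcfg-sketch produces the two files for inspection before anything is copied into place.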

Comment 3 Vivian Bian 2011-03-10 06:44:52 UTC
(In reply to comment #2)
Laine, 

Re 1) Are you running NetworkManager? 
Yes, NetworkManager was running when I tried to destroy the bridge device.

Re 2) Following your explanation, I ran "ifdown br0" directly. The command hung, and the bridge could not be shut down while NetworkManager was running. After stopping NetworkManager, "ifdown br0" works fine.

Thanks for the detailed explanation. I learned a lot from comment #2.

Comment 4 Bill Nottingham 2011-03-10 20:44:23 UTC
Should be fixed with:
 http://git.fedorahosted.org/git/?p=initscripts.git;a=commitdiff;h=bec37b082a490e101799810fe210aa273ad26de3

Please verify.

Comment 5 Laine Stump 2011-03-11 13:06:16 UTC
Yes, applying that patch fixes the problem with br0.

However, the ethernet device that's connected to the bridge still can't be brought down with ifdown. This may be an NM problem, though. I had thought that an ethN device connected to a bridge was supposed to be marked as unmanaged by NM, but it shows up as "disconnected", so ifdown tries to get NM to bring the interface down, and NM is unable to do it (here's the tail end of ifdown, with "set -x" turned on):

+ is_nm_device_unmanaged eth0
+ LANG=C
+ nmcli -t --fields GENERAL dev list iface eth0
+ awk -F : '/GENERAL.STATE/ { if ($2 == "unmanaged") exit 0 ; else exit 1; }'
+ echo Using nmcli to disconnect device ''\''eth0'\'''
Using nmcli to disconnect device 'eth0'
+ nmcli dev disconnect iface eth0
Error: Device 'eth0' (/org/freedesktop/NetworkManager/Devices/1) disconnecting failed: This device is not active
+ exit 6
[root@stinkstation network-scripts]# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:27:13:53:DB:77  
          inet6 addr: fe80::227:13ff:fe53:db77/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17 errors:0 dropped:0 overruns:0 frame:0
          TX packets:113 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:3118 (3.0 KiB)  TX bytes:19503 (19.0 KiB)
          Interrupt:16 


Here is the output of the nmcli command used in is_nm_device_unmanaged():

# nmcli -t --fields GENERAL dev list iface "eth0"  2>/dev/null
GENERAL.DEVICE:eth0
GENERAL.TYPE:802-3-ethernet
GENERAL.DRIVER:tg3
GENERAL.HWADDR:00:27:13:53:DB:77
GENERAL.STATE:disconnected
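
The awk test from the trace above can be exercised on canned text to see how it classifies a device. This is a sketch only: the function names state_is_unmanaged and report are hypothetical, and the empty-input case assumes that nmcli prints nothing on stdout for a device it does not know about, which this report does not confirm.

```shell
#!/bin/sh
# stdin: text in the format printed by
#   nmcli -t --fields GENERAL dev list iface <dev>
# The awk program is copied from the "set -x" trace above.
state_is_unmanaged() {
    awk -F : '/GENERAL.STATE/ { if ($2 == "unmanaged") exit 0 ; else exit 1; }'
}

report() {
    # $1: label to print; stdin: nmcli-style text
    if state_is_unmanaged; then
        echo "$1: treated as unmanaged"
    else
        echo "$1: treated as managed by NetworkManager"
    fi
}

# eth0 as shown above: state is "disconnected", so the check fails.
printf 'GENERAL.DEVICE:eth0\nGENERAL.STATE:disconnected\n' | report eth0

# Assumed case: no output at all. awk never reaches an exit statement
# and returns 0, so the device counts as unmanaged.
printf '' | report "device with no nmcli output"
```

With no GENERAL.STATE line to match, awk exits with status 0, so a device that produces no output passes the check, while eth0 in state "disconnected" does not.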

Comment 6 Bill Nottingham 2011-03-11 16:20:44 UTC
Yes, that would have to be fixed in NM. Acking for the patch mentioned in comment #4, though.

Comment 10 errata-xmlrpc 2011-05-19 13:52:01 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0647.html