Bug 844578 - Starting interface without cable failed with error but actually started.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: netcf
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Laine Stump
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2012-07-31 02:15 EDT by Yuan Shiyao
Modified: 2013-11-21 16:30 EST

See Also:
Fixed In Version: netcf-0.1.9-4.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 855574
Environment:
Last Closed: 2013-11-21 16:30:48 EST
Type: Bug


Description Yuan Shiyao 2012-07-31 02:15:06 EDT
Description of problem:
Starting an interface with no cable attached fails with an error, but the interface is actually started.

Version-Release number of selected component (if applicable):
# rpm -qa |grep libvirt
libvirt-0.9.13-3.el6.x86_64
libvirt-client-0.9.13-3.el6.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Prepare a machine that has a NIC with no cable attached.
# ifconfig
eth0      Link encap:Ethernet  HWaddr B8:AC:6F:3E:63:87
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:21 Memory:fdfe0000-fe000000

eth1      Link encap:Ethernet  HWaddr 00:0E:0C:B6:7F:8D
          inet addr:10.66.5.1  Bcast:10.66.7.255  Mask:255.255.252.0
          inet6 addr: fe80::20e:cff:feb6:7f8d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:14837 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1759 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1769752 (1.6 MiB)  TX bytes:1598

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:12 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:720 (720.0 b)  TX bytes:720 (720.0 b)
2. # service NetworkManager stop
3. # virsh iface-list --all
Name                 State      MAC Address
--------------------------------------------
eth0                 active     b8:ac:6f:3e:63:87
eth1                 active     00:0e:0c:b6:7f:8d
lo                   active     00:00:00:00:00:00

4. # virsh iface-destroy eth0
Interface eth0 destroyed.

# virsh iface-list --all
Name                 State      MAC Address
--------------------------------------------
eth1                 active     00:0e:0c:b6:7f:8d
lo                   active     00:00:00:00:00:00
eth0                 inactive   b8:ac:6f:3e:63:87

5. # virsh iface-start eth0

6. # virsh iface-list --all
  
Actual results:
After step 5, results are
# virsh iface-start eth0
error: Failed to start interface eth0
error: internal error failed to create (start) interface eth0: failed to execute external program - Running 'ifup eth0' failed with exit code 1:
Determining IP information for eth0... failed; no link present.  Check cable?

After step 6, the interface is nevertheless listed as active:
# virsh iface-list --all
Name                 State      MAC Address
--------------------------------------------
eth0                 active     b8:ac:6f:3e:63:87
eth1                 active     00:0e:0c:b6:7f:8d
lo                   active     00:00:00:00:00:00



Expected results:
The interface should be listed as inactive in the output of iface-list, since starting it failed with an error. Alternatively, the error message should be clearer.

Additional info:
# cat /var/log/libvirt/libvirtd.log
...
 error : interfaceCreate:493 : internal error failed to create (start) interface eth0: failed to execute external program - Running 'ifup eth0' failed with exit code 1:
Determining IP information for eth0... failed; no link present.  Check cable?
Comment 2 Alex Jia 2012-07-31 05:57:44 EDT
(In reply to comment #0)
> ...
>  error : interfaceCreate:493 : internal error failed to create (start)
> interface eth0: failed to execute external program - Running 'ifup eth0'
> failed with exit code 1:
> Determining IP information for eth0... failed; no link present.  Check cable?

I guess your NetworkManager service is running; if you stop it, you may get the expected result?
Comment 3 Yuan Shiyao 2012-07-31 06:20:00 EDT
(In reply to comment #2)
> (In reply to comment #0)
> > ...
> >  error : interfaceCreate:493 : internal error failed to create (start)
> > interface eth0: failed to execute external program - Running 'ifup eth0'
> > failed with exit code 1:
> > Determining IP information for eth0... failed; no link present.  Check cable?
> 
> I guess your NetworkManager service is running, if you stop it then you may
> get a expected result?


I have stopped it at step 2. NetworkManager service isn't running.
Comment 4 Alex Jia 2012-07-31 06:38:57 EDT
(In reply to comment #3)

> > I guess your NetworkManager service is running, if you stop it then you may
> > get a expected result?
> 
> 
> I have stopped it at step 2. NetworkManager service isn't running.

Yeah, I missed that. IMHO, it may be a NetworkManager bug:

# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:23:AE:6F:F1:D7  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:3183137 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1439293 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:757486621 (722.3 MiB)  TX bytes:289269951 (275.8 MiB)
          Interrupt:21 Memory:febe0000-fec00000 

# service NetworkManager status
NetworkManager is stopped

# ifup eht0
/sbin/ifup: configuration for eht0 not found.
Usage: ifup <device name>
[root@201 ajia]# ifup eth0
connect: Invalid argument
RTNETLINK answers: No such device
Error adding default gateway 0.0.0.0 for eth0.
RTNETLINK answers: File exists

# echo $?
0

Note: if ncf_if_up(nif) == 0, libvirt treats the interface as successfully brought up. Here, 'ifup eth0' should have returned a non-zero value, but it returned 0.
Comment 5 Laine Stump 2012-07-31 20:05:21 EDT
The problem is that when an interface is configured for dhcp, /sbin/ifup (which netcf calls to bring up the interface) will only return success if dhclient is successful in acquiring an IP address. Even when that fails, however, ifup still sets the IFF_UP flag on the interface, and that flag is all netcf checks when determining the ACTIVE/INACTIVE state of an interface.

Additionally, if NetworkManager is running, /sbin/ifup will fail when the interface's cable is unplugged, even if the IP address is configured statically (so no dhcp is required).

In order to make the results of "virsh iface-start" (aka ncf_if_up()) consistent with the reported status of the interface, the following patch has been posted to the upstream netcf mailing list:

https://lists.fedorahosted.org/pipermail/netcf-devel/2012-July/000781.html

It modifies netcf's if_is_active() function to require both IFF_UP and IFF_RUNNING before considering the interface to be active, and checks the result of this function when bringing up an interface, even after /sbin/ifup has returned success.
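
As a rough illustration of the stricter check described above, here is a minimal Python sketch. It is not netcf's code (netcf is written in C); the function names is_active() and read_flags() are chosen for this example only. The flag values are the ones defined in <linux/if.h>, and reading the flags from /sys/class/net/<name>/flags assumes a Linux host with sysfs mounted.

```python
# Flag values as defined in <linux/if.h>.
IFF_UP = 0x1        # interface is administratively up
IFF_RUNNING = 0x40  # interface is operational (for Ethernet: link present)


def is_active(flags: int) -> bool:
    """Return True only when both IFF_UP and IFF_RUNNING are set.

    Illustrative stand-in for the behaviour the patch gives netcf's
    if_is_active(); checking IFF_UP alone is what caused the mismatch.
    """
    wanted = IFF_UP | IFF_RUNNING
    return (flags & wanted) == wanted


def read_flags(ifname: str) -> int:
    """Read the interface flags from sysfs (Linux only).

    The file contains a hexadecimal value such as 0x1003.
    """
    with open(f"/sys/class/net/{ifname}/flags") as f:
        return int(f.read().strip(), 16)
```

With these values, the "UP BROADCAST MULTICAST" state shown by ifconfig for the unplugged NIC corresponds to flags 0x1003, which is_active() rejects, while "UP BROADCAST RUNNING MULTICAST" corresponds to 0x1043, which it accepts.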
Comment 6 Laine Stump 2012-07-31 23:14:02 EDT
The fix has been pushed upstream:

commit 012e2169dfd904520ecac65553ccdd265537351f
Author: Laine Stump <laine@redhat.com>
Date:   Tue Jul 31 19:57:59 2012 -0400

    check IFF_RUNNING before considering an interface "active"
    
    If an interface's cable is unplugged and it is configured to use dhcp
    (or if NetworkManager is running), attempts to ifup will fail, but
    netcf will later report that the interface is active. This is because
    netcf only checks the IFF_UP flag in the interface status.
    
    It makes more sense for the interface to be counted as active only if
    ifup has been successful, so this patch changes the if_is_active()
    utility function to require both IFF_UP and IFF_RUNNING be set before
    counting the interface as active.
    
    However, if an interface is configured for a static IP address *and
    NetworkManager isn't running*, ifup will succeed even when the cable
    is unplugged. So again the active status of the interface is not
    consistent with the result of ifup. To resolve this inconsistency,
    this patch makes an additional check for if_is_active() after the
    system's ifup utility successfully completes.
    
    The result is consistency between the result of ifup and the
    interface's flags in all cases.
    
    Note that the 2nd change needed to be done separately in all three
    linux drivers, because if_is_active() is a linux-specific function, so
    it can't be called from the platform-agnostic netcf.c (yet each
    platform's drv_if_up() is different, so they can't all call a common
    util_if_up()).
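
The second change in the commit message (re-checking the interface state even after ifup reports success) could be sketched as follows. This is a hedged Python illustration, not the C code in netcf's drv_if_up(); start_interface(), InterfaceStartError, and the injectable run and is_active parameters are hypothetical names introduced for this example.

```python
import subprocess


class InterfaceStartError(Exception):
    """Raised when the interface cannot be confirmed as started (hypothetical name)."""


def start_interface(ifname, is_active, run=subprocess.run):
    """Run ifup, then confirm that the interface really is active.

    is_active: callable taking the interface name and returning True when
    the interface has both IFF_UP and IFF_RUNNING set.
    run: injectable so the logic can be exercised without root or real NICs.
    """
    result = run(["ifup", ifname], capture_output=True, text=True)
    if result.returncode != 0:
        raise InterfaceStartError(
            f"Running 'ifup {ifname}' failed with exit code "
            f"{result.returncode}: {result.stdout.strip()}"
        )
    # ifup can exit 0 with the cable unplugged (static IP, no NetworkManager),
    # so its exit status alone is not trusted.
    if not is_active(ifname):
        raise InterfaceStartError(
            f"ifup {ifname} returned success but the interface is not active"
        )
```

Either failure path raises the same exception type, so a caller sees a failed start whenever the interface would subsequently be listed as inactive.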
Comment 7 RHEL Product and Program Management 2012-09-07 00:58:09 EDT
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unable to address this
request at this time.

Red Hat invites you to ask your support representative to
propose this request, if appropriate, in the next release of
Red Hat Enterprise Linux.
Comment 11 Laine Stump 2013-07-31 16:06:48 EDT
No rebase for 6.5, so this will need to be updated via a backport of the patch.
Comment 12 Laine Stump 2013-08-04 17:18:17 EDT
Testing that this bug is fixed: the original description of the bug gives a very good step-by-step method of reproducing the bug - simply run "virsh iface-destroy eth0", then unplug the cable for ethernet 0, then run "virsh iface-start eth0". Before the patch that fixes the bug, the iface-start would erroneously report success. After the patch it will report failure.
Comment 13 Laine Stump 2013-08-04 17:28:44 EDT
Completely fixing this bug also requires the following patch from upstream (which addresses a regression caused by the first patch mentioned here):

commit 14af66fa2b119f47a23c9a4043ae8fe2441379fc
Author: Laine Stump <laine@laine.org>
Date:   Wed May 15 14:13:09 2013 -0400

    wait for IFF_UP and IFF_RUNNING after calling ifup
    
    This fixes https://bugzilla.redhat.com/show_bug.cgi?id=961184
    
    Apparently one or the other of IFF_UP and IFF_RUNNING are not always
    set by the time /sbin/ifup returns control to netcf, so the subsequent
    check to verify that the interface is up may fail. This patch adds a
    loop to re-check the status of the interface every 250msec for up to
    2.5 seconds (or until both flags are set). If timeout is reached, it
    still fails the operation.
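
The retry loop described in this commit message can be sketched like this. It is a Python approximation under stated assumptions, not the actual C patch: wait_until_active() and its parameters are hypothetical names, read_flags is any callable returning the current interface flags, and sleep is injectable so the timing can be tested without waiting. Only the 250 msec interval and the 2.5 second limit come from the commit message.

```python
import time

# Flag values as defined in <linux/if.h>.
IFF_UP = 0x1
IFF_RUNNING = 0x40


def wait_until_active(read_flags, interval=0.25, timeout=2.5, sleep=time.sleep):
    """Poll the interface flags until both IFF_UP and IFF_RUNNING are set.

    Returns True as soon as both flags are seen, or False once `timeout`
    seconds of waiting have elapsed without seeing them.
    """
    wanted = IFF_UP | IFF_RUNNING
    waited = 0.0
    while True:
        if (read_flags() & wanted) == wanted:
            return True
        if waited >= timeout:
            return False  # still not active after the full wait: fail
        sleep(interval)
        waited += interval
```

With the defaults, the flags are checked once immediately and then after each of up to ten 250 msec sleeps, so an interface whose flags settle shortly after ifup returns is reported as active, while an unplugged one still fails after 2.5 seconds.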
Comment 14 Laine Stump 2013-08-06 12:45:16 EDT
The fix for this problem has been included in a build for RHEL6:

https://brewweb.devel.redhat.com/buildinfo?buildID=285470
Comment 16 EricLee 2013-08-07 22:09:48 EDT
Verifying this bug with netcf-0.1.9-4.el6:

Tested on a host with two network cards, where eth1 has no cable attached:
# service NetworkManager status
NetworkManager is stopped
# service network restart
Shutting down interface eth0:                              [  OK  ]
Shutting down interface eth1:                              [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface eth0:  
Determining IP information for eth0... done.
                                                           [  OK  ]
Bringing up interface eth1:  
Determining IP information for eth1... failed; no link present.  Check cable?
                                                           [FAILED]

# ifconfig 
eth0      Link encap:Ethernet  HWaddr 10:60:4B:5C:B9:A6  
          inet addr:10.66.7.179  Bcast:10.66.7.255  Mask:255.255.252.0
          inet6 addr: fe80::1260:4bff:fe5c:b9a6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:102140 errors:0 dropped:0 overruns:0 frame:0
          TX packets:29951 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:22741761 (21.6 MiB)  TX bytes:23396860 (22.3 MiB)
          Interrupt:20 Memory:f7f00000-f7f20000 

eth1      Link encap:Ethernet  HWaddr 00:15:17:62:AE:E8  
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:146358 errors:0 dropped:0 overruns:0 frame:0
          TX packets:137195 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:17242456 (16.4 MiB)  TX bytes:10224847 (9.7 MiB)
          Interrupt:16 Memory:f7d40000-f7d60000 

# virsh iface-list --all
Name                 State      MAC Address
--------------------------------------------
eth0                 active     10:60:4b:5c:b9:a6
eth1                 inactive   00:15:17:62:ae:e8
lo                   active     00:00:00:00:00:00

# virsh iface-start eth1
error: Failed to start interface eth1
error: internal error failed to create (start) interface eth1: failed to execute external program - Running 'ifup eth1' failed with exit code 1: 
Determining IP information for eth1... failed; no link present.  Check cable?

# virsh iface-list --all
Name                 State      MAC Address
--------------------------------------------
eth0                 active     10:60:4b:5c:b9:a6
eth1                 inactive   00:15:17:62:ae:e8
lo                   active     00:00:00:00:00:00

Using ncftool to verify it:

ncftool> list --inactive
eth1

ncftool> ifup eth1
Interface eth1 bring-up failed!
error: failed to execute external program
error: Running 'ifup eth1' failed with exit code 1: 
Determining IP information for eth1... failed; no link present.  Check cable?

ncftool> list --inactive
eth1

Worked as expected, so moving to VERIFIED.
Comment 17 errata-xmlrpc 2013-11-21 16:30:48 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1660.html
