Bug 827544 - can't add new netdevs after most recently added netdev is detached
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Laine Stump
QA Contact: Virtualization Bugs
Duplicates: 844622
Depends On:
Blocks: 846869
Reported: 2012-06-01 14:07 EDT by Laine Stump
Modified: 2012-09-05 04:00 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 846869 (view as bug list)
Environment:
Last Closed: 2012-07-25 13:47:44 EDT
Type: Bug


Attachments
trace of json monitor commands sent to qemu by libvirt (1.84 KB, text/plain)
2012-06-01 14:07 EDT, Laine Stump

Description Laine Stump 2012-06-01 14:07:13 EDT
Created attachment 588548
trace of json monitor commands sent to qemu by libvirt

Description of problem: If the most recently added network device (or the last <interface> listed in the domain config when none have been hotplugged) is detached from a running domain, all further attempts to attach a new network device will fail.


Version-Release number of selected component (if applicable):

  libvirt-0.9.10-21
  qemu-kvm-0.12.1.2-2.295

How reproducible: 100%


Steps to Reproduce:
1. start a guest, any guest
2. virsh attach-interface $guest network no-ip --model virtio \
         --mac 52:54:00:12:34:56
Interface attached successfully

3. virsh detach-interface $guest network --mac 52:54:00:12:34:56
Interface detached successfully
  
4. virsh attach-interface $guest network no-ip --model virtio \
         --mac 52:54:00:12:34:56

Actual results:

error: Failed to attach interface
error: internal error unable to execute QEMU command 'device_add': Duplicate ID 'net1' for device

Expected results:

Interface attached successfully

Additional info:

It appears that qemu doesn't re-use device IDs (which libvirt calls "aliases"). According to the attached trace, libvirt detaches the old device with id "net1", then attempts to add a new device with id "net1". Either qemu needs to recycle device IDs, or libvirt must never reissue the same id within a particular domain. The latter would require storing the current "highest alias number" for each type of device in the active XML, so that it would survive restarts of libvirtd.
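
To make the second option concrete, here is a minimal sketch of a "never reissue" alias allocator. It is not libvirt code: struct alias_counter and alloc_net_alias() are hypothetical names invented for illustration, and persisting the counter in the active XML is only assumed, not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-domain state: the next unused index for "net" aliases.
 * The proposal above assumes this value is saved in the active XML so it
 * survives a libvirtd restart; that persistence is not modelled here. */
struct alias_counter {
    unsigned int next_net;
};

/* Write the next never-before-issued alias ("net0", "net1", ...) into buf.
 * Returns 0 on success, -1 if buf is too small (counter left unchanged). */
int alloc_net_alias(struct alias_counter *c, char *buf, size_t len)
{
    int n = snprintf(buf, len, "net%u", c->next_net);

    if (n < 0 || (size_t)n >= len)
        return -1;
    c->next_net++;  /* only ever grows, so a detached id is never reissued */
    return 0;
}
```

With such a counter, the re-attach in step 4 would ask for "net2" rather than "net1", sidestepping the duplicate-ID check at the cost of aliases that keep growing over the life of the domain.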
Comment 1 Laszlo Ersek 2012-06-09 04:19:35 EDT
According to the trace, the netdev_del/netdev_add pair works just fine; indeed do_netdev_del() in the qemu source removes the id:

  int do_netdev_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
  {
      const char *id = qdict_get_str(qdict, "id");
      VLANClientState *vc;

      vc = qemu_find_netdev(id);
      if (!vc) {
          qerror_report(QERR_DEVICE_NOT_FOUND, id);
          return -1;
      }
      qemu_del_vlan_client(vc);
      qemu_opts_del(qemu_opts_find(&qemu_netdev_opts, id)); /* HERE */
      return 0;
  }

do_device_add/do_device_del seem to work differently (device_add is the one failing in the trace).

do_device_del()
  qdev_unplug()
    dev->info->unplug(dev)

I didn't try to track the funcptr, but I guess it doesn't clean up the qemu options. However, do_device_add() checks for uniqueness:

do_device_add()
  qemu_opts_from_qdict()
    qemu_opts_create()
      qemu_opts_find()
      qerror_report(QERR_DUPLICATE_ID, ...) -- reports the cited error

Upstream qemu lacks do_device_del() completely; only an unused prototype exists in hw/qdev.h. Instead it has qmp_device_del() (and hmp_device_del(), which calls it), but the underlying qdev_unplug() doesn't seem to be very different. This could be a problem in upstream qemu as well (i.e. device_del not pruning the qemu_find_opts("device") QemuOptsList, that is, "qemu_device_opts").
Comment 2 Laine Stump 2012-06-09 04:28:52 EDT
The problem is narrower than I initially thought. The guest where I saw the problem was RHEL5. I tried the same operation on RHEL6 and WinXP guests and they had no problem re-using the id of a previously detached device.

I also noticed that this guest doesn't show any signs that the device has been added even the first time - nothing in dmesg, no new device showing up in ifconfig -a.

I had thought that adding and removing devices was happening at a low enough level that it didn't matter what OS was running on the guest, or whether or not the guest acknowledged it. Is this not the case? Beyond that, is there some known issue with RHEL5 and hotplugging of network devices, or is my guest somehow broken?
Comment 3 Laine Stump 2012-07-25 13:47:44 EDT
From discussion with qemu people, I've learned that qemu *does* recycle device alias names, but not until it receives confirmation from the guest that it really has released the hardware. I've also learned that PCI hotplug support is more or less non-existent in RHEL5. So my choice of test OS was unfortunate: qemu is waiting for the guest to release the hardware, but the guest doesn't even know that it had it to begin with (and wouldn't know how to release it if it did).

On the other hand, when I used a RHEL6 guest, the device alias *is* properly recycled and can be re-used.

Since the problem only manifests itself as an inability to hotplug any more devices, and it's only a problem when using a guest that doesn't support hotplug anyway, this is (mostly) a non-issue.

The one issue is that there appears to be a race: libvirt makes the alias available for re-use immediately, while qemu won't recycle it until the guest has notified qemu that it's completely finished. Apparently that notification happens asynchronously with respect to the detach command, and there is no way for libvirt to be notified of that event. Until that theoretical race is actually witnessed on a guest that properly supports hotplug, I think we can close this as NOTABUG.
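
The behaviour described above can be illustrated with a toy model of the ID bookkeeping. This is only a sketch under stated assumptions, not qemu's implementation: the id_table type and the model_device_add(), model_device_del() and model_guest_ack() functions are hypothetical names, and the "guest ack" stands in for whatever confirmation qemu actually waits for.

```c
#include <assert.h>
#include <string.h>

#define MAX_IDS 8

/* ID_FREE must be 0 so that a zeroed table starts out empty. */
enum id_state { ID_FREE = 0, ID_IN_USE, ID_UNPLUG_PENDING };

struct id_slot {
    char id[16];
    enum id_state state;
};

struct id_table {
    struct id_slot slot[MAX_IDS];
};

/* Find a slot whose id is still reserved (in use or awaiting guest ack). */
static struct id_slot *find_id(struct id_table *t, const char *id)
{
    for (int i = 0; i < MAX_IDS; i++)
        if (t->slot[i].state != ID_FREE && strcmp(t->slot[i].id, id) == 0)
            return &t->slot[i];
    return NULL;
}

/* Model of device_add: -1 on duplicate id, over-long id or full table. */
int model_device_add(struct id_table *t, const char *id)
{
    if (strlen(id) >= sizeof t->slot[0].id || find_id(t, id))
        return -1;
    for (int i = 0; i < MAX_IDS; i++) {
        if (t->slot[i].state == ID_FREE) {
            strcpy(t->slot[i].id, id);
            t->slot[i].state = ID_IN_USE;
            return 0;
        }
    }
    return -1;
}

/* Model of device_del: only *requests* the unplug; the id stays reserved. */
int model_device_del(struct id_table *t, const char *id)
{
    struct id_slot *s = find_id(t, id);

    if (!s)
        return -1;
    s->state = ID_UNPLUG_PENDING;
    return 0;
}

/* Model of the guest confirming release; only now is the id recycled. */
int model_guest_ack(struct id_table *t, const char *id)
{
    struct id_slot *s = find_id(t, id);

    if (!s || s->state != ID_UNPLUG_PENDING)
        return -1;
    s->state = ID_FREE;
    return 0;
}
```

In this model, starting from a zeroed table, a model_device_add() of "net1" issued between model_device_del() and model_guest_ack() fails as a duplicate, which is the window libvirt cannot currently see. A guest that never acknowledges, like the RHEL5 guest here, keeps the id reserved indefinitely.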
Comment 4 Dave Allan 2012-07-25 14:22:39 EDT
(In reply to comment #3)
> The one issue is that there appears to be a race, since libvirt makes the
> alias available for re-use immediately, while qemu won't recycle it until
> the guest has notified qemu that it's completely finished, and apparently
> that happens asynchronous to the detach command, and there is no way for
> libvirt to be notified of that event. Until that theoretical race is
> actually witnessed on a guest that properly supports hotplug, I think we can
> close this as NOTABUG.

Can you open a BZ against qemu for an event in this case?  There are other hotplug cases which behave similarly, and qemu will be providing an event so that libvirt can tell when the operation's complete.
Comment 5 Laine Stump 2012-08-05 19:34:52 EDT
*** Bug 844622 has been marked as a duplicate of this bug. ***
