Bug 1658198

Summary: libvirt sends VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event too soon (before it has finished tearing down the device's resources and objects)
Product: Red Hat Enterprise Linux 7 Reporter: Roni <reliezer>
Component: libvirtAssignee: Laine Stump <laine>
Status: CLOSED ERRATA QA Contact: yalzhang <yalzhang>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 7.6CC: ailan, jdenemar, laine, lmen, mburman, mprivozn, mzamazal, rbalakri, yafu, yalzhang
Target Milestone: rcKeywords: Automation, AutomationBlocker, Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-4.5.0-15.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1692926 (view as bug list) Environment:
Last Closed: 2019-08-06 13:14:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1655276    
Attachments:
Description Flags
libvirt logs none

Description Roni 2018-12-11 13:49:45 UTC
Created attachment 1513386 [details]
libvirt logs

Description of problem:
VFs are not released after unplug and plug vNIC with Passthrough

Version-Release number of selected component (if applicable):
4.3.0-0.2.master.20181121071050.gita8fcd23.el7

How reproducible:
100%

Steps to Reproduce:
1. Enable 2 VFs at the host
2. Create a logical network
3. Enable Passthrough at the assigned vNIC profile
4. Attached the vNIC profile to a VM
5. Start the VM
6. Unplug & Plug the vNIC
7. Delete the vNIC from the VM
8. Verify VFs at the source host are released as followed:
   - Go to Setup Host Networks | Show virtual functions
     and verify that 2 VFs are displayed
   - Try to reduce the number of VFs from 2 to 0

Actual results:
- One VF is displayed instead of 2
- Can't reduce the number of VFs to zero, got the error:
  "Cannot edit host NIC VFs configuration. The selected network interface 
  enp5s0f0 has VFs that are in use"

Expected results:
- Two VFs should be displayed
- Reduce the number of VFs from2 to 0 should success

Additional info:
- To release the VFs it is required to reboot the host.
- See attached error message
- This issue occurs also at the source host when migrating a VM
- See also https://bugzilla.redhat.com/show_bug.cgi?id=1655276

Comment 1 Jiri Denemark 2018-12-11 14:32:55 UTC
From the logs: libvirt version: 4.5.0, package: 10.el7_6.3
Moving to RHEL-7.

Comment 4 Laine Stump 2019-03-08 18:22:06 UTC
I recall a discussion about this somewhere in the last few months where the conclusion was that the DEVICE_DELETED event sent from qemu to libvirt is issued too soon (i.e. after the guest is finished with the device, but before qemu has fully given it back to the kernel, or maybe it's been given back to the kernel, but the kernel hasn't yet completely re-initialized it, or whatever it has to do to make it re-usable - or something like that anyway). So libvirt is only forwarding on the event that it received from qemu, and that is the only event sent - there isn't any later "DEVICE_REALLY_DELETED_IM_SERIOUS_THIS_TIME" event sent by qemu that libvirt can use as a cue to send its own event of similar name to vdsm.

Anyway, my takeaway from that discussion was that libvirt is doing as much as it can without getting help from qemu in the form of a later event (because qemu wants to keep the current DEVICE_DELETED event as it is). It seems like the only thing libvirt could do would be to add in a workaround similar to what vdsm has done for Bug 1655276, which I don't think is a proper fix.

Can a qemu person correct my misconceptions about the true meaning of the DEVICE_DELETED event?

Comment 5 John Ferlan 2019-03-13 11:20:12 UTC
Is it possible that this could be related to bz 1623389 for which Michal has patches posted upstream to resolve? See patch 5/5 in particular:

https://www.redhat.com/archives/libvir-list/2019-March/msg00753.html

Although Michal's series deals with an attempt to delete the same device twice, but the underlying problem seems to be that the event was missed which it seems that series addresses.

Trying to determine if this really is a release blocker.

Comment 6 Laine Stump 2019-03-14 02:04:38 UTC
(NB: I've been calling the event DEVICE_DELETED, which is the name in qemu's events. The libvirt name is VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED.)

I think it's unrelated to both that bug *and* to the problem I described in comment 4. Once I actually look into the code (rather than just relying on faulty memory and assumptions), and looked back to the original vdsm BZ (Bug 1655276) for the *libvirt* error message that was returned when this problem occurred (it can't be found in the libvirt logs attached to this bug, so they must have been collected at the wrong time), I found the following:

* The error message:

  libvirtError: Requested operation is not valid: PCI device 0000:08:10.0 is in use by driver QEMU, domain golden_env_mixed_virtio_6

is logged by virHostdevIsPCINodeDeviceUsed() when it finds that the device you want to operate on is still in the activePCIHostdevs list.

The VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED event is being queued by qemuDomainRemoveHostDevice() calling virDomainEventDeviceRemovedNewFromObj() at line qemu_hotplug.c:4572 (in current upstream sources), but the device isn't being taken off the activePCIHostdevs list until qemuDomainRemoveHostDevice() calls qemuDomainRemovePCIHostDevice() (which eventually calls virHostdevReAttachPCIDevices() - this is at qemu_hotplug.c:4607. And several lengthy things happen in between (logging device removal audit messages, cgroup ACL removal, deleting node from /dev)

I tried outputting log messages at the point the event is dispatched and the point the device is removed from the activePCIHostdevs list, and found that the event is dispatched prior to removing the device from the list. We need to queue the event later. I would say we should not queue it until *all* the teardown is finished, but hopefully there won't be existing code that somehow depends on some of the device teardown not being done yet.

NB: It looks like it is the case for *all* device types that we queue the event prior to removing the device from the virDomainDef. That seems inherently racy.

I'll work on a patch for this tomorrow.

Comment 7 Laine Stump 2019-03-23 03:57:44 UTC
I've ended up with a 21 patch series that fixes this bug in the final patch:

https://www.redhat.com/archives/libvir-list/2019-March/msg01455.html

Comment 8 Laine Stump 2019-03-25 16:40:34 UTC
The first 12 patches of that series are posted. This does *not* solve the issue (that requires the rest of the series) but these patches will be prerequisites for the fix if it is backported:

commit 143291698358f5a1e693c768893d89038420af25
Author: Laine Stump <laine>
Date:   Tue Mar 19 18:51:09 2019 -0400

    qemu_hotplug: remove erroneous call to qemuDomainDetachExtensionDevice()
    
commit e18e9b72a99f93cd4b14f39c60baa7c5ea35e5db
Author: Laine Stump <laine>
Date:   Tue Mar 19 18:55:15 2019 -0400

    qemu_hotplug: remove another erroneous qemuDomainDetachExtensionDevice() call
    
commit 155064e0ed53b13701ff176c3f92605b1a850a9d
Author: Laine Stump <laine>
Date:   Tue Mar 19 18:57:45 2019 -0400

    qemu_hotplug: remove unnecessary check for valid PCI address
    
commit 1c2866a1f6087837688c3c2beea08753dc6871d0
Author: Laine Stump <laine>
Date:   Mon Mar 18 14:02:55 2019 -0400

    qemu_hotplug: rename a virDomainDeviceInfoPtr to avoid confusion
    
commit 287415e219fa2e477ae011ece275ab15a4be1d73
Author: Laine Stump <laine>
Date:   Mon Mar 18 14:54:37 2019 -0400

    qemu_hotplug: eliminate multiple identical qemuDomainDetachHost*Device() functions
    
commit 1ed46f3a22fe8570b4237477de5d5adb5a05f455
Author: Laine Stump <laine>
Date:   Mon Mar 18 16:41:03 2019 -0400

    qemu_hotplug: eliminate unnecessary call to qemuDomainDetachNetDevice()
    
commit ac442713e6aa1b1087d095796f9c35fd372a0511
Author: Laine Stump <laine>
Date:   Mon Mar 18 18:14:11 2019 -0400

    qemu_hotplug: refactor qemuDomainDetachDiskLive and qemuDomainDetachDiskDevice
    
commit 48a2668151c9ba3f4c94c0a4c0412a5140885ad4
Author: Laine Stump <laine>
Date:   Tue Mar 19 11:43:14 2019 -0400

    qemu_hotplug: don't call DetachThisHostDevice for hostdev network devices
    
commit 6be2414820a23663f9e6b7b4ed510ebbf3126307
Author: Laine Stump <laine>
Date:   Tue Mar 19 13:33:10 2019 -0400

    qemu_hotplug: merge qemuDomainDetachThisHostDevice into qemuDomainDetachHostDevice
    
commit 036a4521f3c539c58bb5706b4710db0f1a16eec6
Author: Laine Stump <laine>
Date:   Tue Mar 19 13:35:31 2019 -0400

    qemu_hotplug: move qemuDomainChangeGraphicsPasswords()
    
commit 5a8ffaec768ce25ef74eb398968e0b84b878a249
Author: Laine Stump <laine>
Date:   Tue Mar 19 13:39:07 2019 -0400

    qemu_hotplug: move (almost) all qemuDomainDetach*() functions together
    
commit 015e71c54ddf8d133905a85514239b21bc7e552e
Author: Laine Stump <laine>
Date:   Tue Mar 19 13:42:55 2019 -0400

    qemu_hotplug: move (Attach|Detach)Lease functions with others of same type

Comment 9 Laine Stump 2019-03-26 15:14:19 UTC
And here are the upstream commit IDs for the remaining 14 patches. The final patch isn't required, but fits in nicely with this series.

commit 6a9c3fbade1d5479f4893a0f31293f13ad8bf229
Author: Laine Stump <laine>
Date:   Tue Mar 19 13:55:13 2019 -0400

    qemu_hotplug: move qemuDomainDetachDeviceLive() to qemu_hotplug.c

commit e4d96324b48b8aab864212382390a5c4a40970d2
Author: Laine Stump <laine>
Date:   Tue Mar 19 17:37:55 2019 -0400

    qemu_hotplug: remove extra function in middle of DetachController call chain
    
commit b20494186578fb779547b714fff77f07e2ca03ea
Author: Laine Stump <laine>
Date:   Tue Mar 19 14:06:05 2019 -0400

    qemu_hotplug: pull qemuDomainUpdateDeviceList out of qemuDomainDetachDeviceLive
    
commit d3aab99096bb8f81600437682398235c17084d22
Author: Laine Stump <laine>
Date:   Tue Mar 19 14:47:40 2019 -0400

    test: replace calls to individual detach functions with one call to main detach
    
commit 637d72f985e2700e88c3b3a4d4a83df9b8d6d35d
Author: Laine Stump <laine>
Date:   Sat Mar 23 12:29:23 2019 -0400

    qemu_hotplug: make Detach functions called only from qemu_hotplug.c static
    
commit c4d6a121a8e903dc8d012d6737fd308b3ddec31a
Author: Laine Stump <laine>
Date:   Tue Mar 19 17:15:00 2019 -0400

    qemu_hotplug: rename dev to match in qemuDomainDetachDeviceLive
    
commit 2ec6faea798b2a7d8986b7a958e781b54d8a8070
Author: Laine Stump <laine>
Date:   Sun Mar 24 20:59:21 2019 -0400

    qemu_hotplug: separate Chr|Lease from other devices in DetachDevice switch
    
commit b6a53bf9079bc9ef2dc3f8b85ff5c84da14b9a0a
Author: Laine Stump <laine>
Date:   Wed Mar 20 14:32:53 2019 -0400

    qemu_hotplug: standardize the names/args/calling of qemuDomainDetach*()
    
commit e1949c7045377e7b32f0f8ed7c76539b1b5d50da
Author: Laine Stump <laine>
Date:   Sun Mar 24 21:29:49 2019 -0400

    qemu_hotplug: rename Chr and Lease Detach functions
    
commit b914e0eca385b52ede39b1b046bc9bf7a4fbbc2a
Author: Laine Stump <laine>
Date:   Wed Mar 20 19:44:05 2019 -0400

    qemu_hotplug: new function qemuDomainRemoveAuditDevice()
    
commit 444c5e7c432961c7ecd59b2627dfb69ce787a22a
Author: Laine Stump <laine>
Date:   Mon Mar 25 10:23:51 2019 -0400

    qemu_hotplug: audit *all* auditable device types in qemuDomainRemoveAuditDevice
    
commit dd60bd62d3ad60b564168f56b05f4c9354af4bd3
Author: Laine Stump <laine>
Date:   Wed Mar 20 21:44:00 2019 -0400

    qemu_hotplug: consolidate all common detach code in qemuDomainDetachDeviceLive
    
commit 78b03a7770f1822458be3e0769538dfc92b34803
Author: Laine Stump <laine>
Date:   Thu Mar 21 12:54:10 2019 -0400

    qemu_hotplug: delay sending DEVICE_REMOVED event until after *all* teardown
    
commit 34086fc59e7c59148409d6780176e84d0f1dbfb4
Author: Laine Stump <laine>
Date:   Mon Mar 25 10:46:56 2019 -0400

    qemu_hotplug: don't shutdown net device until the guest has released it

Comment 17 yalzhang@redhat.com 2019-05-23 09:08:40 UTC
Reproduce the bug on libvirt-4.5.0-10.el7_6.9.x86_64 and verify it on libvirt-4.5.0-18.el7.x86_64, the result is as expected.

1. Edit the /usr/share/doc/libvirt-python-4.5.0/examples/event-test.py to be like:
...
 vc = libvirt.open(uri)
...
def myDomainEventDeviceRemovedCallback(conn, dom, dev, opaque):
    print("myDomainEventDeviceRemovedCallback: Domain %s(%s) device removed: %s" % (
            dom.name(), dom.ID(), dev))
    dev = conn.nodeDeviceLookupByName('pci_0000_82_10_3')
    dev.reAttach()
...
2. Reproduce on libvirt-4.5.0-10.el7_6.9.x86_64
1) Prepare vm with hostdev interaface with source device "pci_0000_82_10_3" and managed='no'(or ignore this item, it default to manged='no')
# virsh dumpxml rhel | grep /interface -B9
......
 <interface type='hostdev'>
      <mac address='52:54:00:42:6c:8a'/>
      <source>
        <address type='pci' domain='0x0000' bus='0x82' slot='0x10' function='0x3'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </interface>

 2) Enable the VF pci_0000_82_10_3, then detach it from the host
# virsh nodedev-detach pci_0000_82_10_3

3) start the vm, make sure the guest boot to os.
before the detach, run below command on host, and open another terminal to detach the device, 
[terminal 1]# python /usr/share/doc/libvirt-python-4.5.0/examples/event-test.py
Using uri 'qemu:///system' and event loop 'poll'

[terminal 2]# virsh detach-device rhel /tmp/test
Device detached successfully

check the output on the former terminal:
[terminal 1]# python /usr/share/doc/libvirt-python-4.5.0/examples/event-test.py
Using uri 'qemu:///system' and event loop 'poll'
myDomainEventDeviceRemovedCallback: Domain rhel(5) device removed: hostdev0
libvirt:  error : Requested operation is not valid: PCI device 0000:82:10.3 is in use by driver QEMU, domain rhel
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 4986, in _dispatchDomainEventDeviceRemovedCallback
    cb(self, virDomain(self, _obj=dom), devAlias, opaque)
  File "/usr/share/doc/libvirt-python-4.5.0/examples/event-test.py", line 552, in myDomainEventDeviceRemovedCallback
    dev.reAttach()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 5624, in reAttach
    if ret == -1: raise libvirtError ('virNodeDeviceReAttach() failed')
libvirt.libvirtError: Requested operation is not valid: PCI device 0000:82:10.3 is in use by driver QEMU, domain rhel

3. update libvirt to latest libvirt-4.5.0-18.el7.x86_64, do the detach, and check the python event output in another terminal:
# python /usr/share/doc/libvirt-python-4.5.0/examples/event-test.py
Using uri 'qemu:///system' and event loop 'poll'
myDomainEventDeviceRemovedCallback: Domain rhel(1) device removed: hostdev0
myNodeDeviceEventLifecycleCallback: Node device  net_p4p2_1_8e_df_37_31_7c_41 Created 0

Comment 20 errata-xmlrpc 2019-08-06 13:14:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2294