Bug 781985 - When detach PCI device from guest, unknown error occurs.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: x86_64 Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Osier Yang
QA Contact: Virtualization Bugs
Keywords: Regression
Depends On:
Blocks:
Reported: 2012-01-16 04:14 EST by hongming
Modified: 2012-06-20 02:46 EDT (History)

See Also:
Fixed In Version: libvirt-0.9.10-1.el6
Doc Type: Bug Fix
Doc Text:
No documentation needed.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-06-20 02:46:35 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
libvirt debug log (51.77 KB, text/plain)
2012-01-16 04:14 EST, hongming
libvirt debug log (660.54 KB, text/plain)
2012-02-21 05:52 EST, zhpeng


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2012:0748 | normal | SHIPPED_LIVE | Low: libvirt security, bug fix, and enhancement update | 2012-06-19 15:31:38 EDT

Description hongming 2012-01-16 04:14:19 EST
Created attachment 555455 [details]
libvirt debug log

Description of problem:

When running the "virsh detach-device domain xmlfile" command, the errors below occur, even though the PCI device has actually been detached from the guest. The bug is always reproducible with libvirt-0.9.9-1. It cannot be reproduced with libvirt-0.9.8-1.
-error:Failed to detach device from hostdev.xml
-error:An error occurred,but the cause is unknown.




Version-Release number of selected component (if applicable):

-kernel-2.6.32-220.el6.x86_64
-libvirt-0.9.9-1.el6.x86_64
-qemu-kvm-0.12.1.2-2.213.el6.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Enable the kernel IOMMU: edit grub.conf and add intel_iommu=on at the end of the kernel line.
2. On platforms that support only VT-d1, with a host kernel newer than the 171 kernel, do the following steps:
      modprobe -r kvm_intel
      modprobe -r kvm
      modprobe kvm allow_unsafe_assigned_interrupts=1
      modprobe kvm_intel  
3. Check the device list and prepare to hotplug a network device from the host to the guest.
computer
   |
     +- pci_0000_00_19_0
   |        |
   |       +- net_eth0_44_37_e6_67_11_a2
4. # virsh nodedev-dumpxml pci_0000_00_19_0 
5. # readlink /sys/bus/pci/devices/0000\:00\:19.0/driver/ -f
/sys/bus/pci/drivers/e1000e
6. # virsh nodedev-dettach pci_0000_00_19_0
7. # readlink /sys/bus/pci/devices/0000\:00\:19.0/driver/ -f
/sys/bus/pci/drivers/pci-stub
8. # virsh nodedev-reset pci_0000_00_19_0
Device pci_0000_00_19_0 reset
9. # virsh attach-device rhel6 hostdev.xml

hostdev.xml looks like the following:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
      </source>
    </hostdev> 
  
10. In the guest, use lspci and ping to check that the network device is working.
11. # virsh detach-device rhel6 hostdev.xml
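The driver checks in steps 5 and 7 can also be done programmatically. The following is only a minimal sketch, not libvirt code: the helper names pci_driver_link and driver_name are hypothetical, and it assumes the usual sysfs layout shown in the readlink output above. It builds the sysfs path from the four address fields used in hostdev.xml and returns the last component of the "driver" symlink target (for example e1000e or pci-stub).

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: format the sysfs "driver" link path for a PCI address,
 * e.g. domain 0x0000, bus 0x00, slot 0x19, function 0x0 gives
 * /sys/bus/pci/devices/0000:00:19.0/driver.
 * Returns 0 on success, -1 if the buffer is too small. */
static int pci_driver_link(unsigned domain, unsigned bus, unsigned slot,
                           unsigned function, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "/sys/bus/pci/devices/%04x:%02x:%02x.%x/driver",
                     domain, bus, slot, function);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Hypothetical helper: read the symlink at link_path and copy the last path
 * component of its target (the driver name) into name.
 * Returns 0 on success, -1 if the link cannot be read or name is too small. */
static int driver_name(const char *link_path, char *name, size_t namelen)
{
    char target[4096];
    ssize_t len = readlink(link_path, target, sizeof(target) - 1);
    if (len < 0)
        return -1;                 /* no driver bound, or path does not exist */
    target[len] = '\0';            /* readlink() does not NUL-terminate */

    const char *base = strrchr(target, '/');
    base = base ? base + 1 : target;
    if (strlen(base) >= namelen)
        return -1;
    strcpy(name, base);
    return 0;
}
```

Calling pci_driver_link with the address from hostdev.xml and passing the result to driver_name before and after step 6 should show the driver name change, assuming the device is bound as in the output above.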

Actual results:
-error:Failed to detach device from **.xml
-error:An error occurred,but the cause is unknown 
The PCI device has actually been detached from the guest. Unless the guest is destroyed, the nodedev-reattach command can't reattach the PCI device to the host.

Expected results:
Successfully detach device from guest.


Additional info:
Comment 4 Alex Jia 2012-01-16 05:25:10 EST
The issue was introduced in commit a0aec36. The following comment is from this patch:

This patch fixes two problems:
        1) The device will be reattached to host even if it's not
           managed, as there is a "pciDeviceSetManaged".

And in the code:

1960 static int
1961 qemuDomainDetachHostPciDevice(struct qemud_driver *driver,
1962                               virDomainObjPtr vm,
1963                               virDomainDeviceDefPtr dev,
1964                               virDomainHostdevDefPtr *detach_ret)
1965 {
......
2026     pci = pciGetDevice(detach->source.subsys.u.pci.domain,
2027                        detach->source.subsys.u.pci.bus,
2028                        detach->source.subsys.u.pci.slot,
2029                        detach->source.subsys.u.pci.function);
2030     if (pci) {
2031         activePci = pciDeviceListSteal(driver->activePciHostdevs, pci);
2032         if (pciResetDevice(activePci, driver->activePciHostdevs, NULL))
2033             qemuReattachPciDevice(activePci, driver);
2034         else
2035             ret = -1;
2036         pciFreeDevice(pci);
2037         pciFreeDevice(activePci);
2038     } else {
2039         ret = -1;
2040     }
......

In fact, the function qemuReattachPciDevice calls pciDeviceGetManaged; if the pci device isn't in managed mode, it returns immediately, so the pci device can't be reattached to the host either.

In addition, line 2032 should also check the return value of pciResetDevice: if the return value is < 0, then XXXX and set ret = -1, etc.
Comment 5 Osier Yang 2012-01-17 22:41:16 EST
Upstream commit 6be610bfaae08655eaf93f9638d4c6636c00343f fixed the problem incidentally.

diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index dc40d2f..4b60839 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -2029,7 +2029,8 @@ qemuDomainDetachHostPciDevice(struct qemud_driver *driver,
                        detach->source.subsys.u.pci.function);
     if (pci) {
         activePci = pciDeviceListSteal(driver->activePciHostdevs, pci);
-        if (pciResetDevice(activePci, driver->activePciHostdevs, NULL))
+        if (pciResetDevice(activePci, driver->activePciHostdevs,
+                           driver->inactivePciHostdevs) == 0)
             qemuReattachPciDevice(activePci, driver);
         else
             ret = -1;
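
The effect of the changed condition in the diff above can be illustrated in isolation. This is only a sketch under an assumption: that pciResetDevice returns 0 on success and a negative value on failure, which the "== 0" comparison in the fix suggests. The names reset_device, detach_buggy and detach_fixed are hypothetical stand-ins, not libvirt functions.

```c
/* Hypothetical stand-in for the reset helper.
 * Assumed convention: 0 on success, -1 on failure. */
static int reset_device(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Pre-fix shape: a bare truth test. A successful reset (0) takes the else
 * branch, so the caller sees -1 and the device is never reattached. */
static int detach_buggy(int reset_fails, int *reattached)
{
    int ret = 0;

    *reattached = 0;
    if (reset_device(reset_fails))
        *reattached = 1;        /* reached only when the reset FAILED */
    else
        ret = -1;               /* reached when the reset succeeded */
    return ret;
}

/* Post-fix shape: compare against 0 explicitly, so reattach happens only
 * after a successful reset and -1 is returned only on a real failure. */
static int detach_fixed(int reset_fails, int *reattached)
{
    int ret = 0;

    *reattached = 0;
    if (reset_device(reset_fails) == 0)
        *reattached = 1;
    else
        ret = -1;
    return ret;
}
```

Under that assumption, the pre-fix shape returns -1 on a successful reset without any error having been reported by the reset helper, which would be consistent with the "cause is unknown" message and with the device not being returned to the host.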
Comment 6 Alex Jia 2012-01-18 00:48:45 EST
Hi Osier,
Although the device is detached successfully, the pci device in managed mode can't be returned to the host. Is this an expected result? I don't think so. I remember that the original design is that a pci device in managed mode is automatically returned to the host when a hot-plugged pci device is detached from a running guest, or when a running guest with an attached pci device is shut down.

In addition, I tried to manually reattach the pci device to the host. Although virsh nodedev-reattach said "Device pci_0000_00_19_0 re-attached", in fact, the pci device is /sys/devices/pci0000:00/0000:00:19.0/driver not original e1000e, so I still can't use the NICs on the host.

Thanks,
Alex
Comment 7 Alex Jia 2012-01-18 00:50:20 EST
(In reply to comment #6)

> pci device is /sys/devices/pci0000:00/0000:00:19.0/driver not original e1000e,
s/device/driver/.
Comment 8 Alex Jia 2012-01-18 01:09:15 EST
Hi Osier, 
I saw that your v2 patch "qemu: Introduce inactive PCI device list" removes the 'if (!pciDeviceGetManaged(dev))' check from the 'qemuReattachPciDevice' function. I think the patch will fix some of the issues in Comment 6, but not all of them: if 'pciResetDevice != 0', we should also do some cleanup work, such as returning the pci device to the host, right?

Alex
Comment 11 zhpeng 2012-02-15 00:55:08 EST
Following the steps in comment 0 on libvirt-0.9.10-1.el6.x86_64, the results are:

[root@zhpeng ~]# virsh attach-device kvm1 nodedev.xml 
Device attached successfully

[root@zhpeng ~]# virsh detach-device kvm1 nodedev.xml 
Device detached successfully

So it's verified.
Comment 21 zhpeng 2012-02-21 22:23:29 EST
libvirt-0.9.4-23.el6_2.6 test passed.
Comment 22 Osier Yang 2012-05-04 06:14:56 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No documentation needed.
Comment 24 errata-xmlrpc 2012-06-20 02:46:35 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html
