Bug 559122 - Reattaching a PCI device that is in use by a guest to the host reports wrong info
Summary: Reattaching a PCI device that is in use by a guest to the host reports wrong info
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Chris Lalancette
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 603039
Reported: 2010-01-27 08:22 UTC by zhanghaiyan
Modified: 2011-01-13 22:54 UTC (History)

Fixed In Version: libvirt-0.8.2-1.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 603039 (view as bug list)
Environment:
Last Closed: 2011-01-13 22:54:42 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHEA-2011:0060
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2011-01-12 17:22:30 UTC

Description zhanghaiyan 2010-01-27 08:22:25 UTC
Description of problem:
When a PCI device is in use by a guest, an attempt to reattach it to the host should detect that the device is in use, report an error, and do nothing. Instead, no error is reported and the operation proceeds: the device stops working in the guest and is lost on the host.

Version-Release number of selected component (if applicable):
On rhel5.4-server-x86_64-kvm system
libvirt-0.6.3-30.el5
libvirt-python-0.6.3-30.el5
kmod-kvm-83-149.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-83-149.el5
kvm-qemu-img-83-149.el5

How reproducible:
Always

Steps to Reproduce:
1. Select a network PCI device on the host
   # virsh nodedev-dumpxml pci_8086_10c9_0
   <device>
     <name>pci_8086_10c9_0</name>
     <parent>pci_8086_340a</parent>
     <capability type='pci'>
        <domain>0</domain>
        <bus>66</bus>
        <slot>0</slot>
        <function>0</function>
        <product id='0x10c9'>82576 Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
     </capability>
   </device>
2. Detach the network PCI device from the host
   # virsh nodedev-dettach pci_8086_10c9_0
   Device pci_8086_10c9_0 dettached
   # virsh nodedev-reset pci_8086_10c9_0
   Device pci_8086_10c9_0 reset
3. Add the PCI device to the guest XML config file
    <hostdev mode='subsystem' type='pci'>
      <source>
      <address bus='66' slot='0' function='0'/>
      </source>
    </hostdev>
4. Run the guest
   # virsh define rhel5u4_x86_64_kvm.xml 
   Domain rhel5u4_x86_64_kvm defined from rhel5u4_x86_64_kvm.xml
   # virsh start rhel5u4_x86_64_kvm
   Domain rhel5u4_x86_64_kvm started
   The PCI device works well in the guest.
5. On the host, try to reattach the assigned PCI device
    # virsh nodedev-reattach pci_8086_10c9_0
    Device pci_8086_10c9_0 re-attached
6. # readlink /sys/bus/pci/devices/0000\:42\:00.0/driver
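
The sysfs path used in step 6 can be derived from the node device XML in step 1: bus 66 in decimal is 0x42, which gives 0000:42:00.0. The following Python sketch shows the conversion. It assumes the XML fields are decimal, as they are in this report, and the function name sysfs_pci_address is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the nodedev-dumpxml output from step 1.
NODEDEV_XML = """
<device>
  <name>pci_8086_10c9_0</name>
  <capability type='pci'>
    <domain>0</domain>
    <bus>66</bus>
    <slot>0</slot>
    <function>0</function>
  </capability>
</device>
"""


def sysfs_pci_address(nodedev_xml):
    """Return the domain:bus:slot.function address used under /sys/bus/pci/devices."""
    cap = ET.fromstring(nodedev_xml).find("capability[@type='pci']")
    if cap is None:
        raise ValueError("not a PCI node device")
    # Assumption: the XML values are decimal; sysfs names use hexadecimal.
    domain, bus, slot, function = (
        int(cap.findtext(tag)) for tag in ("domain", "bus", "slot", "function")
    )
    return f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}"


print(sysfs_pci_address(NODEDEV_XML))  # 0000:42:00.0
```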
  
Actual results:
After step 5, the PCI device no longer works in the guest.
After step 6, the output is empty. Is the PCI device lost on the host?

Expected results:
After step 5, a warning such as 'device is in use' is shown and no operation is performed.
After step 6, the output is ../../../../bus/pci/drivers/pci-stub

Additional info:
After step 4, if the following command is run first on the host:
# virsh nodedev-reset pci_8086_10c9_0
error: Failed to reset device pci_8086_10c9_0
error: this function is not supported by the hypervisor: Unable to reset PCI device 0000:42:00.0: device is in use
# readlink /sys/bus/pci/devices/0000\:42\:00.0/driver
../../../../bus/pci/drivers/pci-stub 

I think nodedev-reset gives the expected result.
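
The readlink check above can also be scripted. This is a minimal Python sketch, not part of libvirt: the helper name bound_driver and its sysfs_root parameter are hypothetical, and the parameter exists only so the demo can run against a temporary directory instead of the real /sys.

```python
import os
import tempfile


def bound_driver(address, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None."""
    link = os.path.join(sysfs_root, "bus/pci/devices", address, "driver")
    try:
        # The "driver" entry is a symlink; its last path component is the driver name.
        return os.path.basename(os.readlink(link))
    except OSError:
        # No "driver" symlink (or no such device): nothing is bound.
        return None


# Demo against a fake sysfs tree so it runs without real hardware.
with tempfile.TemporaryDirectory() as root:
    dev = os.path.join(root, "bus/pci/devices/0000:42:00.0")
    os.makedirs(dev)
    print(bound_driver("0000:42:00.0", root))  # None
    os.symlink("../../../../bus/pci/drivers/pci-stub", os.path.join(dev, "driver"))
    print(bound_driver("0000:42:00.0", root))  # pci-stub
```

A result of None corresponds to the empty readlink output seen after step 6, and "pci-stub" to the expected result.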

Comment 1 zhanghaiyan 2010-01-27 08:27:01 UTC
In http://libvirt.org/html/libvirt-libvirt.html#virNodeDeviceReAttach, we can find the following description:
virNodeDeviceReAttach

int	virNodeDeviceReAttach		(virNodeDevicePtr dev)

Re-attach a previously dettached node device to the node so that it may be used by the node again. Depending on the hypervisor, this may involve operations such as resetting the device, unbinding it from a dummy device driver and binding it to its appropriate driver. If the device is currently in use by a guest, this method may fail.
dev:	pointer to the node device
Returns:	0 in case of success, -1 in case of failure.

So if the device is currently in use by a guest, this method may fail.
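
For callers, the documented behaviour means a re-attach attempt should be treated as something that can fail. The sketch below assumes the libvirt Python bindings, where the call is exposed as reAttach() on a node device object and a failure raises libvirt.libvirtError instead of returning -1. The wrapper name try_reattach is made up for illustration, and the broad except clause is only there so the sketch does not need the libvirt module to be imported.

```python
def try_reattach(conn, name):
    """Attempt to re-attach a node device; return (ok, message)."""
    dev = conn.nodeDeviceLookupByName(name)
    try:
        dev.reAttach()
    except Exception as exc:  # libvirt.libvirtError with the real bindings
        return False, str(exc)
    return True, f"Device {name} re-attached"


# Usage, assuming a host with libvirt-python installed:
#   import libvirt
#   conn = libvirt.open("qemu:///system")
#   ok, message = try_reattach(conn, "pci_8086_10c9_0")
```

With the behaviour requested in this bug, a device assigned to a running guest would yield ok == False and an error message, and the device would stay bound to pci-stub.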

Comment 2 Jiri Denemark 2010-01-27 09:08:19 UTC
I believe this bug could also be fixed by test packages found at 

http://people.redhat.com/clalance/bz500217

Could you try to retest it with those packages?

Comment 3 zhanghaiyan 2010-01-27 09:34:03 UTC
This bug is not fixed by test packages in http://people.redhat.com/clalance/bz500217

Comment 4 Daniel Veillard 2010-02-03 08:51:24 UTC
Okay, the current status is that it's not critical and we don't have a fix yet,
so this is being retargeted for Update 6.

Daniel

Comment 7 Chris Lalancette 2010-06-14 19:18:23 UTC
OK, I actually see what is going on here now.  What is happening is that nodedev-dettach and nodedev-reattach don't take into account PCI devices that are already assigned to guests.  So if you run either of these commands against a device that is assigned to a guest, they will blindly disconnect them.  This causes problems and faults in the kernel DMAR code, and essentially causes the device to disappear.  I guess if a device is assigned to a guest, *and* that guest is running, these commands should just fail and do nothing.  This is still a problem in RHEL-6 and upstream, as well.

Chris Lalancette
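
The behaviour proposed above (fail and do nothing while the device is assigned to a running guest) can be modelled as follows. This is only an illustrative Python sketch of the guard, not the actual libvirt change; every class and method name in it is hypothetical.

```python
class DeviceInUseError(Exception):
    """Raised when a host operation targets a device used by a running guest."""


class HostDeviceTracker:
    def __init__(self):
        # PCI address -> name of the running guest that holds the device.
        self._active = {}

    def guest_started(self, guest, addresses):
        for address in addresses:
            self._active[address] = guest

    def guest_stopped(self, guest):
        self._active = {a: g for a, g in self._active.items() if g != guest}

    def reattach(self, address):
        guest = self._active.get(address)
        if guest is not None:
            # Refuse before touching any driver binding.
            raise DeviceInUseError(
                f"PCI device {address} is in use by guest {guest}"
            )
        # Placeholder for the real unbind-from-stub / rebind-to-driver work.
        return f"Device {address} re-attached"


tracker = HostDeviceTracker()
tracker.guest_started("rhel5u4_x86_64_kvm", ["0000:42:00.0"])
try:
    tracker.reattach("0000:42:00.0")
except DeviceInUseError as err:
    print(err)  # PCI device 0000:42:00.0 is in use by guest rhel5u4_x86_64_kvm
tracker.guest_stopped("rhel5u4_x86_64_kvm")
print(tracker.reattach("0000:42:00.0"))  # Device 0000:42:00.0 re-attached
```

The point of the design is that the in-use check happens before any state is changed, so a refused request leaves the device bound to pci-stub and usable by the guest.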

Comment 8 Chris Lalancette 2010-06-15 12:07:02 UTC
I've sent a couple of patches upstream to basically disallow nodedev-detach and nodedev-reattach while a device is assigned to a guest.  Once they are integrated, I'll do a backport for RHEL-5.

Chris Lalancette

Comment 9 Jiri Denemark 2010-09-02 11:56:30 UTC
Fixed in libvirt-0.8.2-1.el5

Comment 11 Min Zhan 2010-10-26 05:52:58 UTC
Verified as passed in the environment below:
RHEL5.6-Server-x86_64_KVM
kvm-qemu-img-83-205.el5
kernel-2.6.18-228.el5
libvirt-0.8.2-8.el5

With the Xen kernel, however, I have filed a new bug, 646749, to track the following: when the PCI device is working well in the guest and is re-attached on the host at the same time, the host reboots immediately.

Comment 13 errata-xmlrpc 2011-01-13 22:54:42 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0060.html

