Bug 733587

Summary:	Reattaching a PCI device to the host while it is in use by a guest sometimes reports the wrong result
Product:	Red Hat Enterprise Linux 6	Reporter:	weizhang <weizhan>
Component:	libvirt	Assignee:	Osier Yang <jyang>
Status:	CLOSED ERRATA	QA Contact:	Virtualization Bugs <virt-bugs>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	6.2	CC:	ajia, dallan, dyuan, eblake, jyang, mzhan, rwu, veillard, ydu
Target Milestone:	rc
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	libvirt-0.9.9-1.el6	Doc Type:	Bug Fix
Doc Text:	Cause: If a domain fails to start, the host device(s) for the domain are reattached to the host regardless of whether the device(s) are in use by another domain. Consequence: The device is reattached to the host even if it is still being used by another domain. Fix: The underlying code was improved so that it does not reattach a device that is being used by another domain. Result: More stable hotplug ecosphere	Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-06-20 06:30:16 UTC	Type:	---
Bug Blocks:	773650, 773651, 773677, 773696

Description weizhang 2011-08-26 06:30:46 UTC

Description of problem:
Reattaching a PCI device to the host while it is in use by a guest sometimes reports success, after an attempt to start another guest with the same assigned PCI device.

Version-Release number of selected component (if applicable):
kernel-2.6.32-191.el6.x86_64
libvirt-0.9.4-5.el6.x86_64
qemu-kvm-0.12.1.2-2.184.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. On a machine with an 82567 NIC, run:
rmmod kvm_intel
rmmod kvm
modprobe kvm allow_unsafe_assigned_interrupts=1
modprobe kvm_intel

2. # lspci |grep -i eth
00:19.0 Ethernet controller: Intel Corporation 82567LM-3 Gigabit Network Connection (rev 02)

3. Install 2 guests and shut both down, then add the following XML to both guests:
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
      </source>
    </hostdev>

4. Detach the PCI device from the host
# virsh nodedev-dettach pci_0000_00_19_0
Device pci_0000_00_19_0 dettached

5. Start one guest and then start the other.
Starting the second guest reports an error, as expected:
error: Failed to start domain guest2
error: internal error Not reattaching active device 0000:00:19.0

6. Reattach the PCI device to the host
# virsh nodedev-reattach pci_0000_00_19_0
Device pci_0000_00_19_0 re-attached

But checking the driver:
# readlink /sys/bus/pci/devices/0000\:00\:19.0/driver
prints nothing.

Checking with the following commands also shows that the PCI device is not back on the host:
# virsh nodedev-list --tree
# ifconfig -a
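
The driver check in step 6 can be wrapped in a small helper. This is only a sketch: the function name pci_driver_name and the SYSFS_ROOT override are made up for illustration and are not part of libvirt or virsh. It uses plain readlink rather than readlink -f, because GNU readlink -f prints a canonicalized path even when the final "driver" component does not exist, which can be mistaken for a bound driver.

```shell
# Print the name of the driver a PCI device is bound to, or "none".
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
pci_driver_name() {
    bdf=$1    # PCI address, e.g. 0000:00:19.0
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/pci-stub
        basename "$(readlink "$link")"
    else
        # No "driver" symlink means no driver is bound to the device.
        echo none
    fi
}
```

With the device detached as in step 4, "pci_driver_name 0000:00:19.0" would be expected to print pci-stub; in the failure described here it would print none.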
  
Actual results:
nodedev-reattach reports success but in fact fails.

Expected results:
nodedev-reattach reports an error like:
error: Failed to re-attach device pci_0000_00_19_0
error: internal error Not reattaching active device 0000:00:19.0

Additional info:

Comment 2 Osier Yang 2011-08-30 03:11:31 UTC

I think the device is unbound from the pci-stub driver successfully, but reprobing the driver for the device fails (or is not even attempted). Could you check whether
"remove_id" is available for the pci-stub driver? E.g.

# ls /sys/bus/pci/devices/0000\:00\:19.0/driver/remove_id 

If it exists, please test if the reprobing works fine.

# echo 0000\:00\:19.0 >  /sys/bus/pci/drivers_probe

I guess we have a problem with reprobing the driver for the device here.
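
The manual reprobe test above can be sketched as a helper that writes the address to drivers_probe and then looks at what actually got bound. The function name pci_reprobe and the SYSFS_ROOT override are hypothetical, and on a real host this needs root. The sketch does not trust the exit status of the write alone; it checks the driver symlink afterwards.

```shell
# Ask the kernel to probe a driver for a device, then report what got bound.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
pci_reprobe() {
    bdf=$1    # PCI address, e.g. 0000:00:19.0
    root=${SYSFS_ROOT:-/sys}
    # Writing the device address to drivers_probe requests a driver probe.
    echo "$bdf" > "$root/bus/pci/drivers_probe" || return 1
    link="$root/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ]; then
        echo "$bdf bound to $(basename "$(readlink "$link")")"
    else
        # The write went through, but no driver claimed the device.
        echo "$bdf has no driver after reprobe" >&2
        return 1
    fi
}
```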

Comment 3 weizhang 2011-08-30 03:37:04 UTC

(In reply to comment #2)
> I think the device is unbound from the pci-stub driver successfully, but
> reprobing the driver for the device fails (or is not even attempted). Could you
> check whether
> "remove_id" is available for the pci-stub driver? E.g.
> 
> # ls /sys/bus/pci/devices/0000\:00\:19.0/driver/remove_id 
> 

After nodedev-reattach, remove_id does not exist.

Comment 4 Osier Yang 2011-08-30 06:23:03 UTC

How about before?

Comment 5 weizhang 2011-08-30 07:02:49 UTC

(In reply to comment #4)
> How about before?

Before the reattach, remove_id exists, and running
# echo 0000\:00\:19.0 >  /sys/bus/pci/drivers_probe
reports no error.

Comment 6 Alex Jia 2011-08-30 07:27:47 UTC

The following is my debug information; it should be helpful for you:

# virsh nodedev-dettach pci_0000_00_19_0
Device pci_0000_00_19_0 dettached

# readlink /sys/bus/pci/devices/0000\:00\:19.0/driver -f
/sys/bus/pci/drivers/pci-stub

# virsh start vr-rhel6u1-x86_64-kvm
Domain vr-rhel6u1-x86_64-kvm started

# readlink /sys/bus/pci/devices/0000\:00\:19.0/driver -f
/sys/bus/pci/drivers/pci-stub

# virsh start vr-rhel6-x86_64-kvm
error: Failed to start domain vr-rhel6-x86_64-kvm
error: internal error Not reattaching active device 0000:00:19.0

# readlink /sys/bus/pci/devices/0000\:00\:19.0/driver -f
/sys/bus/pci/drivers/pci-stub

# virsh start vr-rhel6-x86_64-kvm
error: Failed to start domain vr-rhel6-x86_64-kvm
error: internal error Process exited while reading console log output: char device redirected to /dev/pts/2
Failed to assign device "hostdev0" : Device or resource busy
qemu-kvm: -device pci-assign,host=00:19.0,id=hostdev0,configfd=25,bus=pci.0,addr=0x7: Device 'pci-assign' could not be initialized

Note: this error, from trying to start the guest again, is different from the first one.


# virsh nodedev-reattach pci_0000_00_19_0
Device pci_0000_00_19_0 re-attached

Note: the PCI device is active, so this should fail with the same error seen when starting the second guest; instead it succeeds. It seems some state changes when the second guest is started, because if I start only one guest and then reattach the PCI device assigned to it, I do see the "...Not reattaching active device..." error.

In addition, dmesg displays the following:
...
e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff]
e1000e: probe of 0000:00:19.0 failed with error -16
...

Moreover, the messages log shows the same error:

# tail -f /var/log/messages
......
Aug 26 16:54:43 localhost kernel: e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff]
Aug 26 16:54:43 localhost kernel: e1000e: probe of 0000:00:19.0 failed with error -16

This looks like a kernel issue, right?
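
For picking these lines out of a longer log, a filter along these lines could be used. It is a sketch only; the function name pci_probe_failures is made up for illustration. Error -16 is -EBUSY on Linux, which matches the "Device or resource busy" message from qemu-kvm above.

```shell
# Print kernel log lines (read from stdin) that report a failed driver probe
# for the given PCI address. Returns non-zero if no such line is found.
pci_probe_failures() {
    bdf=$1    # PCI address, e.g. 0000:00:19.0
    # -F matches the address literally, so ':' and '.' are not regex syntax.
    grep -F "probe of $bdf failed"
}
```

Typical use would be "dmesg | pci_probe_failures 0000:00:19.0", or the same with /var/log/messages as input.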

# readlink /sys/bus/pci/devices/0000\:00\:19.0/driver -f
/sys/devices/pci0000:00/0000:00:19.0/driver

Note: the PCI driver link isn't right.

# ll /sys/devices/pci0000:00/0000:00:19.0
total 0
-rw-r--r--. 1 root root   4096 Aug 26 15:26 broken_parity_status
-r--r--r--. 1 root root   4096 Aug 26 15:07 class
-rw-r--r--. 1 root root    256 Aug 26 15:07 config
-r--r--r--. 1 root root   4096 Aug 26 15:07 device
-rw-------. 1 root root   4096 Aug 26 15:26 enable
-r--r--r--. 1 root root   4096 Aug 26 15:07 irq
-r--r--r--. 1 root root   4096 Aug 26 15:26 local_cpulist
-r--r--r--. 1 root root   4096 Aug 26 15:07 local_cpus
-r--r--r--. 1 root root   4096 Aug 26 15:26 modalias
-rw-r--r--. 1 root root   4096 Aug 26 15:26 msi_bus
-r--r--r--. 1 root root   4096 Aug 26 15:26 numa_node
drwxr-xr-x. 2 root root      0 Aug 26 15:26 power
--w--w----. 1 root root   4096 Aug 26 15:21 remove
--w--w----. 1 root root   4096 Aug 26 15:47 rescan
--w-------. 1 root root   4096 Aug 26 15:07 reset
-r--r--r--. 1 root root   4096 Aug 26 15:07 resource
-rw-------. 1 root root 131072 Aug 26 15:07 resource0
-rw-------. 1 root root   4096 Aug 26 15:07 resource1
-rw-------. 1 root root     32 Aug 26 15:07 resource2
lrwxrwxrwx. 1 root root      0 Aug 26 15:07 subsystem -> ../../../bus/pci
-r--r--r--. 1 root root   4096 Aug 26 15:07 subsystem_device
-r--r--r--. 1 root root   4096 Aug 26 15:07 subsystem_vendor
-rw-r--r--. 1 root root   4096 Aug 26 15:07 uevent
-r--r--r--. 1 root root   4096 Aug 26 15:07 vendor


I tried to trace the above issues; the problem may come from the following code fragment:

int
pciReAttachDevice(pciDevice *dev, pciDeviceList *activeDevs)
{
......
    if (activeDevs && pciDeviceListFind(activeDevs, dev)) {
        pciReportError(VIR_ERR_INTERNAL_ERROR,
                       _("Not reattaching active device %s"), dev->name);
        return -1;
    }
......
}

When the device is reattached after starting the second guest, pciDeviceListFind returns NULL, meaning the PCI device is not considered active, so the reattach succeeds. Furthermore, list->count is 0 inside pciDeviceListFind, which isn't right: it should be 1, not 0. This may be a counting issue. If I have time I will debug it further; I hope this is useful for you.


Alex

Comment 7 Osier Yang 2011-09-22 13:31:34 UTC

The problem here is that the hostdev is not managed, and we don't check whether a device is in the active list if it's not managed. So the code falls through and steals the device from the active PCI list.

Comment 8 Osier Yang 2011-09-27 06:23:03 UTC

Patch sent upstream:
https://www.redhat.com/archives/libvir-list/2011-September/msg01019.html

Comment 10 Eric Blake 2011-10-14 22:43:37 UTC

In POST:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-October/msg00564.html

Comment 13 yanbing du 2011-10-19 07:41:53 UTC

Test with:
libvirt-0.9.4-18.el6.x86_64
qemu-kvm-0.12.1.2-2.199.el6.x86_64
kernel-2.6.32-211.el6.x86_64

Following the reproduction steps in the bug description, the bug is still not fixed.
When reattaching the PCI device to the host while it is in use by a guest:
# virsh nodedev-reattach pci_0000_00_19_0
Device pci_0000_00_19_0 re-attached

In fact, the PCI device didn't come back; the command should report an error saying that the PCI device is in use by a guest and can't be reattached.

Comment 19 Osier Yang 2011-11-29 09:55:34 UTC

Patch posted upstream:

https://www.redhat.com/archives/libvir-list/2011-November/msg01590.html

Comment 20 Osier Yang 2011-12-15 02:20:16 UTC

Patch committed to upstream.

Comment 21 Daniel Veillard 2012-01-09 08:24:31 UTC

Upstream commit 3f29d6c91f56857719fc500f02d55cee72684f36

Daniel

Comment 22 weizhang 2012-01-10 10:39:58 UTC

Verification passed on
libvirt-0.9.9-1.el6.x86_64
kernel-2.6.32-225.el6.x86_64
qemu-kvm-0.12.1.2-2.213.el6.x86_64

After the second guest fails to start, reattaching the device reports an error:
# virsh nodedev-reattach pci_0000_00_19_0
error: Failed to re-attach device pci_0000_00_19_0
error: internal error Not reattaching active device 0000:00:19.0

and the driver is still bound to pci-stub
# readlink /sys/bus/pci/devices/0000\:00\:19.0/driver -f
/sys/bus/pci/drivers/pci-stub
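
The two checks above could be combined into one regression check. This is a sketch under assumptions: the function name verify_reattach_refused and the VIRSH and SYSFS_ROOT overrides are made up for illustration, and it has to be run while a guest is still using the device.

```shell
# Check that reattaching an in-use device is refused and that the device
# stays bound to pci-stub. VIRSH and SYSFS_ROOT are hypothetical overrides
# for testing; they default to the real virsh and /sys.
verify_reattach_refused() {
    nodedev=$1   # node device name, e.g. pci_0000_00_19_0
    bdf=$2       # PCI address, e.g. 0000:00:19.0
    # The reattach must fail while a running guest uses the device.
    if ${VIRSH:-virsh} nodedev-reattach "$nodedev" >/dev/null 2>&1; then
        echo "FAIL: nodedev-reattach succeeded for an active device"
        return 1
    fi
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ] && [ "$(basename "$(readlink "$link")")" = "pci-stub" ]; then
        echo "PASS: $bdf is still bound to pci-stub"
    else
        echo "FAIL: $bdf is no longer bound to pci-stub"
        return 1
    fi
}
```

It would be called as "verify_reattach_refused pci_0000_00_19_0 0000:00:19.0" after step 5 of the description.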

Comment 23 Osier Yang 2012-05-04 09:27:32 UTC

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: If a domain fails to start, the host device(s) for the domain are reattached to the host regardless of whether the device(s) are in use by another domain.
Consequence: The device is reattached to the host even if it is still being used by another domain.
Fix: The underlying code was improved so that it does not reattach a device that is being used by another domain.
Result: More stable hotplug ecosphere

Comment 25 errata-xmlrpc 2012-06-20 06:30:16 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html