Bug 746355

Summary: virt-manager detaches PCI devices from host, but fails to reattach if hotplug fails
Product: Red Hat Enterprise Linux 6
Reporter: Eric Blake <eblake>
Component: python-virtinst
Assignee: Cole Robinson <crobinso>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: high
Version: 6.2
CC: ajia, dallan, dyuan, hjiang, juzhang, jyang, mjenner, mvadkert, mzhan, rwu, syeghiay, veillard, weizhan, zpeng
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text: No description necessary
Story Points: ---
Clone Of: 736214
Environment:
Last Closed: 2011-12-06 16:17:33 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 736214
Bug Blocks: 748554
Attachments:
 Description: Drop all hostdev detach/reset sanity checks
 Flags: none

Comment 2 Eric Blake 2011-10-14 22:14:20 UTC
libvirt-0.9.4-18.el6.x86_64
virt-manager-0.9.0-7.el6.x86_64

To reproduce the bug, you need a scenario where hostdev hotplug will fail; one possibility is by using hardware that lacks secure VT-d, such as with:
$ echo 0 > /sys/module/kvm/parameters/allow_unsafe_assigned_interrupts

From there, choose a device to test; in my case:
$ readlink /sys/bus/pci/devices/0000:0a:0a.0/driver
../../../../bus/pci/drivers/firewire_ohci
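
The same check can be scripted. A minimal sketch, assuming the standard sysfs layout shown above; the helper name bound_driver and its sysfs_root parameter are made up for illustration and are not part of virt-manager or libvirt:

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None.

    pci_addr is the full address as it appears in sysfs, e.g. "0000:0a:0a.0".
    sysfs_root is a hypothetical parameter so the helper can be pointed at a
    fake tree when testing.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "driver")
    try:
        target = os.readlink(link)
    except OSError:
        # No such device, or no driver currently bound to it.
        return None
    # The link target ends in the driver name, e.g. ".../drivers/pci-stub".
    return os.path.basename(target)
```

Calling it before and after the failed hotplug should show whether the device was handed back to its original driver or left on pci-stub.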

$ virsh start dom
$ virt-manager
 view the details of dom
 click Add hardware
 select PCI Host Device
 scroll to the device probed above (in my case, 00:0a:0a)
 hit finish
The hotplug fails (expected)
operation failed: adding pci-assign,host=0a:0a.0,id=hostdev0,configfd=fd-hostdev0,bus=pci.0,addr=0xa device failed: Device 'pci-assign' could not be initialized
 hit no to adding the device

Check the device again:
$ readlink /sys/bus/pci/devices/0000:0a:0a.0/driver
../../../../bus/pci/drivers/pci-stub

Oops.  The problem is that virt-manager manually called virNodeDeviceDettach prior to trying the hot-plug operation, but did not follow up with a virNodeDeviceReAttach on failure.  Had it merely relied on managed=yes during the hotplug operation, then the virNodeDeviceDettach would not be necessary (but this would not be portable to pre-0.6.1 servers).  So I think the proper fix is a combination - try managed=yes first; it will fail with older servers in which case you fall back to the old method, but when falling back, hotplug has to make sure that any manual detach is reattached on failure.
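
To make the proposal concrete, here is a rough sketch of the "managed first, manual fallback with re-attach on failure" flow. It is not the patch that was applied and not virt-manager's code. The attach, detach and reattach arguments are placeholder callables; in the libvirt Python bindings they would presumably map to virDomain.attachDevice(), virNodeDevice.dettach() and virNodeDevice.reAttach(). The sketch also assumes that any failure of the managed attempt means "fall back", since it cannot tell an old server apart from a real hotplug error.

```python
def hostdev_xml(bus, slot, function, managed, domain=0):
    """Build <hostdev> XML for a PCI device.

    With managed=False the attribute is omitted entirely, on the assumption
    that this is what an older server expects.
    """
    managed_attr = " managed='yes'" if managed else ""
    return (
        "<hostdev mode='subsystem' type='pci'%s>"
        "<source><address domain='0x%04x' bus='0x%02x' slot='0x%02x' "
        "function='0x%x'/></source>"
        "</hostdev>" % (managed_attr, domain, bus, slot, function)
    )


def hotplug_hostdev(attach, detach, reattach, bus, slot, function):
    """Hotplug a PCI hostdev, never leaving the device detached on failure.

    attach(xml), detach() and reattach() are caller-supplied callables
    (hypothetical names). Returns "managed" or "manual" depending on which
    path succeeded; re-raises the error if both fail.
    """
    try:
        # Preferred path: let libvirt detach and re-attach the device itself.
        attach(hostdev_xml(bus, slot, function, managed=True))
        return "managed"
    except Exception:
        # Assumption: treat any failure as "server lacks managed support".
        pass

    # Fallback path: detach by hand, then undo the detach if hotplug fails.
    detach()
    try:
        attach(hostdev_xml(bus, slot, function, managed=False))
    except Exception:
        reattach()
        raise
    return "manual"
```

In the failure scenario above, both attach attempts fail, but the device is handed back to the host before the error reaches the user.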

Likewise, on detach, I think that if the xml has managed=yes, you know you are talking to a new server, but if the xml lacks that (and also lacks managed=no), then you should ask the user whether to re-attach the hostdev when they use the gui to remove the passthrough device from a guest.
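
The removal-side decision could look roughly like this. The function name reattach_policy and its return values are invented for illustration. Doing nothing for an explicit managed=no is an assumption, since the suggestion above only covers the managed=yes case and the case where the attribute is missing.

```python
import xml.etree.ElementTree as ET


def reattach_policy(xml_text):
    """Decide who re-attaches a hostdev to the host when it is removed.

    Returns "libvirt" when the XML says managed='yes', "ask" when the
    attribute is absent (prompt the user), and "leave" for an explicit
    managed='no' (assumption: the user manages the device by hand).
    """
    managed = ET.fromstring(xml_text).get("managed")
    if managed == "yes":
        return "libvirt"
    if managed == "no":
        return "leave"
    return "ask"
```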

Comment 4 Eric Blake 2011-10-14 23:05:32 UTC
Worse, virt-manager is explicitly calling virNodeDeviceDettach even when attaching a PCI device to an offline guest.  This prevents the host from using that device, even if there are no plans to start the guest right away.  Really, virt-manager should only be calling virNodeDeviceDettach at the same points that libvirt would be emulating it for managed devices (that is, when starting a guest), since it is perfectly valid to have multiple persistent guests all claiming the same <hostdev>, so long as only one of the guests is running at a time (see bug 733587 for more details on a just-fixed libvirt bug for having multiple offline guests sharing a <hostdev>).

Comment 6 Cole Robinson 2011-10-18 14:51:12 UTC
Created attachment 528827
Drop all hostdev detach/reset sanity checks

Comment 7 Cole Robinson 2011-10-18 15:50:00 UTC
Moving to virtinst since that's where the fix is

Comment 9 Cole Robinson 2011-10-18 17:10:34 UTC
Fixed in python-virtinst-0.600.0-5.el6

Comment 12 Huming Jiang 2011-10-20 02:30:53 UTC
Reproduced with the following package:
python-virtinst-0.600.0-3.el6.noarch

Verified with the following packages:
kernel-2.6.32-197.el6.x86_64
qemu-kvm-0.12.1.2-2.192.el6.x86_64
libvirt-0.9.4-19.el6.x86_64
python-virtinst-0.600.0-5.el6.noarch
virt-manager-0.9.0-7.el6.x86_64

Steps:
1. $ echo 0 > /sys/module/kvm/parameters/allow_unsafe_assigned_interrupts
2. # readlink /sys/bus/pci/devices/0000:00:19.0/driver
../../../bus/pci/drivers/e1000e
3. $ virsh start dom
4. $ virt-manager
 view the details of dom
 click Add hardware
 select PCI Host Device
 scroll to the listed PCI device (in my case, 00:19:0 interface eth0)
 hit finish
The hotplug fails (expected), and the network is disconnected:
internal error unable to execute QEMU command 'device_add': Device 'pci-assign' could not be initialized.
 hit no to adding the device

 Then the network is reconnected.
5. # readlink /sys/bus/pci/devices/0000:00:19.0/driver
../../../bus/pci/drivers/e1000e

The device is bound to its original driver again, so moving the status of this bug to 'verified'.

Comment 13 Cole Robinson 2011-11-07 17:18:48 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No description necessary

Comment 14 errata-xmlrpc 2011-12-06 16:17:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1643.html