Bug 773667

Summary: virsh attach-device fails with 'Unable to reset PCI device' for Broadcom NetXtreme II
Product: Red Hat Enterprise Linux 6 Reporter: Miroslav Vadkerti <mvadkert>
Component: libvirt Assignee: Osier Yang <jyang>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.2 CC: acathrow, ajia, berrange, dallan, dyuan, ebenes, eblake, iboverma, jpallich, jrieden, linda.knippers, mprivozn, msvoboda, mzhan, rwu, sgrubb, smueller, weizhan
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-0.9.10-1.el6 Doc Type: Bug Fix
Doc Text:
Cause: libvirt did not correctly check whether there was an active device on the same bus. Consequence: "virsh attach-device" failed even if the device on the same bus was already detached from the host. Fix: The check for active devices on the same bus has been corrected. Result: Users can attach a device to a guest successfully if the other device(s) on the same bus are detached from the host.
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-20 06:46:21 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 784785    

Description Miroslav Vadkerti 2012-01-12 14:27:59 UTC
Description of problem:
No function of the PCI device can be assigned to the guest. Trying to assign two functions at once fails in the same way.

The virsh attach-device fails with the following (example for 04:00.0):

# lspci | grep Ether
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
03:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
04:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)

# virsh attach-device guest1 pci_dev.xml
bnx2 0000:04:00.0: PCI INT A disabled
pci-stub 0000:04:00.0: claimed by stub
bnx2 0000:04:00.0: PCI INT A -> GSI 31 (level, low) -> IRQ 31
bnx2 0000:04:00.0: firmware: requesting bnx2/bnx2-mips-09-6.2.1a.fw
bnx2 0000:04:00.0: firmware: requesting bnx2/bnx2-rv2p-09-6.0.17.fw
bnx2 0000:04:00.0: eth2: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem f8000000, IRQ 31, node addr 3c:4a:92:f4:43:30
error: Failed to attach device from pci_dev.xml
error: internal error Unable to reset PCI device 0000:04:00.0: internal error Active 0000:04:00.1 devices on bus with 0000:04:00.0, not doing bus reset

Please note that function 0000:04:00.1 is not used by the host (ifdown) or by any other guest. I see the same behavior for all PCI functions.

This scenario works well for:
* NetXtreme BCM5720 Gigabit Ethernet PCIe (card) / tg3 (PCI device)
* Intel 82576 Gigabit Network Connection / igb

How reproducible:
100%

Steps to Reproduce:
# dd if=/dev/zero of=/var/lib/libvirt/images/guest1.img bs=1M count=1

# cat guest1-template.xml
<domain type='kvm'>
  <name>guest1</name>
  <memory>256000</memory>
  <currentMemory>256000</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='rhel6.0.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/guest1.img'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
  </devices>
  <seclabel type='static' model='selinux'>
    <label>system_u:system_r:svirt_t:s0:c50,c70</label>
  </seclabel>
</domain>

# virsh create guest1-template.xml

# cat pci_dev.xml
    <hostdev mode="subsystem" type="pci" managed="yes">
        <source>
            <address domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
        </source>
    </hostdev>

# virsh attach-device guest1 pci_dev.xml
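
The hostdev fragment used in the attach step can also be generated from a PCI address rather than written by hand. The following is a minimal Python sketch, not part of the original report: `hostdev_xml` is a hypothetical helper name, and it assumes addresses in the `domain:bus:slot.function` form that lspci and the error message use (for example `0000:04:00.0`).

```python
import xml.etree.ElementTree as ET


def hostdev_xml(pci_addr, managed=True):
    """Build a <hostdev> fragment for a PCI address such as '0000:04:00.0'.

    Hypothetical helper for illustration; it only formats XML and does
    not talk to libvirt.
    """
    domain, bus, slot_func = pci_addr.split(":")
    slot, function = slot_func.split(".")
    hostdev = ET.Element(
        "hostdev",
        mode="subsystem",
        type="pci",
        managed="yes" if managed else "no",
    )
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain="0x" + domain,
        bus="0x" + bus,
        slot="0x" + slot,
        function="0x" + function,
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    # Write the result to pci_dev.xml and pass it to 'virsh attach-device'.
    print(hostdev_xml("0000:04:00.0"))
```

The output is a single-line equivalent of the pci_dev.xml shown above, with the same attribute values.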
  
Actual results:
Attaching PCI function failed

Expected results:
Attaching PCI function successful

Additional info:
For testing machine contact mvadkert

Comment 2 Daniel Berrangé 2012-01-12 14:39:07 UTC
One option might be to detach the other conflicting, but unused, device from the host drivers, e.g. with 'virsh nodedev-dettach', which should then let hotplug work for that device.

Alternatively, you have to coldplug all the devices to the VM at the same time when first starting it up.

Comment 5 Dave Allan 2012-01-13 03:02:43 UTC
Don, do you have any thoughts on what might be conflicting here?

Comment 7 Osier Yang 2012-01-13 11:48:18 UTC
https://www.redhat.com/archives/libvir-list/2012-January/msg00535.html

Patch posted to upstream.

Comment 14 Eric Blake 2012-01-18 00:14:23 UTC
commit 6be610bfaae08655eaf93f9638d4c6636c00343f
Author: Osier Yang <jyang>
Date:   Wed Jan 18 04:02:05 2012 +0800

    qemu: Introduce inactive PCI device list
    
    pciTrySecondaryBusReset checks if there is active device on the
    same bus, however, qemu driver doesn't maintain an effective
    list for the inactive devices, and it passes meaningless argument
    for parameter "inactiveDevs". e.g. (qemuPrepareHostdevPCIDevices)
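
    (Editorial sketch, not part of the commit.) The check described in
    this commit message can be modelled roughly as below. This is a
    simplified Python illustration, not libvirt's C implementation: the
    function names and data shapes are assumptions, and only the idea of
    an "inactiveDevs" list comes from the commit message.

```python
def active_siblings(target, devices_on_bus, inactive_devs):
    """Return devices sharing the bus with `target` that are not known to be inactive.

    `devices_on_bus` and `inactive_devs` are collections of PCI address
    strings such as '0000:04:00.1' (an assumed representation).
    """
    return [d for d in devices_on_bus if d != target and d not in inactive_devs]


def can_secondary_bus_reset(target, devices_on_bus, inactive_devs):
    """A bus reset is only considered safe when no other device on the bus is active."""
    return not active_siblings(target, devices_on_bus, inactive_devs)


if __name__ == "__main__":
    bus = ["0000:04:00.0", "0000:04:00.1"]
    # Without a meaningful inactive list, the sibling always looks active.
    print(can_secondary_bus_reset("0000:04:00.0", bus, set()))
    # Once the sibling is tracked as inactive, the reset is allowed.
    print(can_secondary_bus_reset("0000:04:00.0", bus, {"0000:04:00.1"}))
```

    With an empty inactive list the sibling function is always treated
    as active, which matches the "not doing bus reset" error in the
    original report.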

Comment 19 Alex Jia 2012-01-19 10:16:23 UTC
It seems I still hit an error similar to the one in Comment 4. Miroslav, could you try again with the new build above? Thanks.

Comment 21 Dave Allan 2012-01-19 14:06:33 UTC
I've moved the BZ back to assigned.

Comment 22 Michal Privoznik 2012-01-19 16:33:05 UTC
In fact, I don't think this will ever work this way. This NetXtreme card does not play very nicely with virtualization. On one hand, a single PCI card adds 2 PCI devices onto the slot (each with a different PCI function number). But on the other hand, it completely lacks FLR (Function Level Reset). Therefore, if one wants to add the first PCI device, one has to 'virsh nodedev-dettach' the second one, and vice versa, in order to allow a secondary bus reset.

So Miroslav, what you need to do, is:

virsh nodedev-dettach pci_0000_04_00_0

prior to virsh attach-device guest1 pci_dev.xml
So I am suggesting to move this back to POST. Miroslav, can you confirm it is working for you?
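
(Editorial sketch, not part of the original comment.) The workaround sequence described above, detaching the other function from the host and then attaching the target, could be scripted roughly as follows. This Python sketch only builds the command lines and does not run them. The helper names are hypothetical; the `pci_0000_04_00_0` naming pattern and the `nodedev-dettach` spelling are taken from the command quoted in this thread.

```python
def nodedev_name(pci_addr):
    """Convert '0000:04:00.1' into the node device name 'pci_0000_04_00_1'."""
    return "pci_" + pci_addr.replace(":", "_").replace(".", "_")


def workaround_commands(guest, target, functions_on_bus, device_xml):
    """Return virsh command lines: detach every sibling function, then attach.

    `functions_on_bus` lists all PCI functions on the bus, including the
    target. The target is skipped because the hostdev XML in this report
    uses managed="yes".
    """
    cmds = [
        ["virsh", "nodedev-dettach", nodedev_name(dev)]
        for dev in functions_on_bus
        if dev != target
    ]
    cmds.append(["virsh", "attach-device", guest, device_xml])
    return cmds


if __name__ == "__main__":
    for cmd in workaround_commands(
        "guest1", "0000:04:00.0", ["0000:04:00.0", "0000:04:00.1"], "pci_dev.xml"
    ):
        # Each list could be passed to subprocess.run(cmd, check=True).
        print(" ".join(cmd))
```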

Comment 27 Osier Yang 2012-01-24 06:22:07 UTC
(In reply to comment #22)
> In fact, I don't think this will ever work this way. This NetXtreme card
> does not play very nicely with virtualization. On one hand, a single PCI card
> adds 2 PCI devices onto the slot (each with a different PCI function number).
> But on the other hand, it completely lacks FLR (Function Level Reset).
> Therefore, if one wants to add the first PCI device, one has to
> 'virsh nodedev-dettach' the second one, and vice versa, in order to allow a
> secondary bus reset.
> 
> So Miroslav, what you need to do, is:
> 
> virsh nodedev-dettach pci_0000_04_00_0
> 
> prior to virsh attach-device guest1 pci_dev.xml
> So I am suggesting to move this back to POST. Miroslav, can you confirm it is
> working for you?

Exactly. You first need to detach from the host the PCI device that shares the same bus. It was my fault for not clarifying this in the commit message.

Comment 28 Osier Yang 2012-01-24 06:24:12 UTC
(In reply to comment #27)
> (In reply to comment #22)
> > In fact, I don't think this will ever work this way. This NetXtreme card
> > does not play very nicely with virtualization. On one hand, a single PCI card
> > adds 2 PCI devices onto the slot (each with a different PCI function number).
> > But on the other hand, it completely lacks FLR (Function Level Reset).
> > Therefore, if one wants to add the first PCI device, one has to
> > 'virsh nodedev-dettach' the second one, and vice versa, in order to allow a
> > secondary bus reset.
> > 
> > So Miroslav, what you need to do, is:
> > 
> > virsh nodedev-dettach pci_0000_04_00_0
> > 
> > prior to virsh attach-device guest1 pci_dev.xml
> > So I am suggesting to move this back to POST. Miroslav, can you confirm it is
> > working for you?
> 
> Exactly. You first need to detach from the host the PCI device that shares the
> same bus. It was my fault for not clarifying this in the commit message.

And I will add Tech Notes for this bug once it's VERIFIED.

Comment 35 Alex Jia 2012-02-16 05:50:52 UTC
The bug has been verified on libvirt-0.9.10-1.el6.x86_64, following the steps in Comment 23.

Comment 36 Miroslav Svoboda 2012-02-27 15:30:38 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, when libvirt tried to attach certain SR-IOV (Single Root I/O Virtualization) devices to virtual guests, these attempts failed with the "Unable to reset PCI device" error message. This patch modifies the underlying code so that these PCI devices can now be successfully attached to guests.

Comment 37 Linda Knippers 2012-02-27 15:51:13 UTC
I thought this problem had to do with non-SR-IOV devices.  When I tested the fix for Miroslav Vadkerti, it was with a 4-port Broadcom NIC, not a SR-IOV device.

Comment 38 Miroslav Vadkerti 2012-02-27 17:55:47 UTC
Linda is right. Miroslav, we need to fix the documentation here. I would maybe just drop the SR-IOV mention in favor of "PCI device".

Comment 39 Osier Yang 2012-05-04 07:21:30 UTC
(In reply to comment #38)
> Linda is right. Miroslav, we need to fix the documentation here. I would maybe
> just drop the SR-IOV mention in favor of "PCI device".

Hi, Miroslav,

I updated the tech note with CCFR format (without mentioning SR-IOV)

Osier

Comment 40 Osier Yang 2012-05-04 07:21:30 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1,4 @@
-Previously, when libvirt tried to attach certain SR-IOV (Single Root I/O Virtualization) devices to virtual guests, this attempts failed with the "Unable to reset PCI device" error messages. This patch modifies the underlying code so that these PCI devices can now be successfully attached to guests.
+Cause: libvirt has problem to check if there is active device on the same bus.
+Consequence: "virsh attach-device" will fail even if the device on the same bus is already detached from host.
+Fix: Fix the checking of active device on the same bus.
+Result: User can attach device to guest successfully if the device(s) on the same bus are detached from host,

Comment 42 errata-xmlrpc 2012-06-20 06:46:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0748.html