Bug 738388

Summary: Unable to passthrough multifunction PCI devices to KVM guests
Product: Red Hat Enterprise Linux 6 Reporter: Nandini Chandra <nachandr>
Component: libvirt Assignee: Eric Blake <eblake>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.1 CC: acathrow, ajia, dallan, eblake, eng-i18n-bugs, juzhang, jwest, laine, mjenner, mzhan, rwu, weizhan
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-0.9.4-16.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 11:29:18 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 738430    
Bug Blocks: 728174, 747120    

Description Nandini Chandra 2011-09-14 17:13:39 UTC
Description of problem:
A customer would like to use a multifunction PCI device (a dual-port Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet NIC) in a KVM guest running on a RHEL 6.1 host.

lspci output from host
--------------------
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
-------------------

The PCI multifunction property was incorporated into qemu-kvm through the BZ linked below. That fix allows multiple functions of a PCI device to be assigned to KVM guests, and it was fixed in qemu-kvm-0.12.1.2-2.180.el6.
https://bugzilla.redhat.com/show_bug.cgi?id=729104 

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.189.el6.x86_64.rpm
libvirt-0.9.4-9.el6.x86_64 


How reproducible:
Always

Steps to Reproduce:
1. Detach the dual-port NIC from the host.
2. Edit the XML file of the KVM guest to assign the NIC to the guest.
3. Start the guest. Notice that the qemu-kvm command line is built without 'multifunction=on'.
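
The check in step 3 can be done mechanically. Below is a minimal Python sketch (not part of libvirt or qemu-kvm) that looks for 'multifunction=on' on the pci-assign devices of a qemu-kvm command line. The function names and the shortened example command line are hypothetical and only for illustration.

```python
import shlex


def pci_assign_devices(cmdline):
    """Return one dict of properties per '-device pci-assign,...' argument."""
    args = shlex.split(cmdline)
    devices = []
    # Walk over (flag, value) pairs so each '-device' is matched with its argument.
    for flag, value in zip(args, args[1:]):
        if flag != "-device":
            continue
        driver, _, rest = value.partition(",")
        if driver != "pci-assign":
            continue
        props = dict(item.split("=", 1) for item in rest.split(",") if "=" in item)
        devices.append(props)
    return devices


def has_multifunction(cmdline):
    """True if any assigned PCI device was started with multifunction=on."""
    return any(dev.get("multifunction") == "on" for dev in pci_assign_devices(cmdline))


# Hypothetical, shortened command line resembling the failing case.
example = (
    "/usr/libexec/qemu-kvm -enable-kvm "
    "-device pci-assign,host=02:00.0,id=hostdev0,configfd=24,bus=pci.0,addr=0x6 "
    "-device pci-assign,host=02:00.1,id=hostdev1,configfd=25,bus=pci.0,addr=0x7"
)
print(has_multifunction(example))
```

In practice the input would be the command line shown by 'ps -ef | grep qemu-kvm' for the running guest.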
  
Actual results:
Users are unable to pass through a multifunction PCI device to a KVM guest.


Expected results:
Users should be able to pass through a multifunction PCI device to a KVM guest.


Additional info:

Comment 3 Eric Blake 2011-09-14 17:24:34 UTC
Doing this would require a RHEL-specific patch, since upstream bases the use of multifunction pci support solely on qemu version.  Additionally, if we do this for RHEL 6.2, we'd also have to worry about bug 728174, where qxl does not work with multifunction support enabled.

Comment 5 Laine Stump 2011-09-30 06:09:55 UTC
In response to a suggestion by Dan Berrange on libvir-list, I just posted the
following patch, which removes the automatic setting of the multifunction bit,
in favor of an XML attribute to turn it on as desired:

https://www.redhat.com/archives/libvir-list/2011-September/msg01289.html

If this patch is added to the multifunction support that is already in libvirt-0.9.4, a RHEL-only patch to turn on QEMU_CAPS_PCI_MULTIFUNCTION for the stock rhel qemu binary should be sufficient to fix this bug without creating the situation described in Bug 728174. (Note that there is also a bug for the same QXL problem filed against libvirt in RHEL6, even though the problem currently doesn't occur when using stock RHEL6 binaries - Bug 727530)

Comment 6 Eric Blake 2011-10-05 21:58:21 UTC
Actually, it may be sufficient to solve this bug with the backport of a proposed upstream patch:

https://www.redhat.com/archives/libvir-list/2011-October/msg00145.html

qemu: enable multifunction for older qemu

Now that RHEL 6.2 Beta is out, it would be nice to test multifunction
devices on that platform.  This changes things so that the multifunction
cap bit can be set in two different ways: by version comparison (needed
for qemu 0.13 which lacked a -device query), and by -device query
(provided by qemu.git and backported to the RHEL beta build of
qemu-kvm which still claims to be a modified 0.12, and therefore needed
for RHEL).

* src/qemu/qemu_capabilities.c (qemuCapsParseDeviceStr): Allow
second method of setting multifunction cap bit.
* tests/qemuhelptest.c (mymain): Test it.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta-device: Likewise.
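
The two detection methods described in the commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the C code in qemu_capabilities.c. The 0.13.0 threshold is inferred from the message's mention of qemu 0.13, and the substring searched for in the -device query output, as well as the sample query text in the usage lines, are assumptions.

```python
def parse_version(text):
    """Turn a version string such as '0.12.1.2' into a 3-tuple of ints."""
    parts = [int(p) for p in text.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # pad short versions such as '0.13'
    return tuple(parts)


def supports_multifunction(version, device_help=""):
    """Decide whether the multifunction capability bit should be set.

    Method 1: version comparison (for qemu 0.13, which lacked a -device query).
    Method 2: -device query output (for builds that still report 0.12 but
    carry a backport of the feature).
    """
    if parse_version(version) >= (0, 13, 0):
        return True
    # Assumption: the query output mentions the property name somewhere.
    return "multifunction" in device_help


print(supports_multifunction("0.13.0"))
print(supports_multifunction("0.12.1.2"))
# Hypothetical query output line; the real format may differ.
print(supports_multifunction("0.12.1.2", "pci-assign.multifunction=on/off"))
```

The point of the second method is that a version check alone would reject the RHEL build, which reports 0.12 despite having the feature.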

Comment 9 Alex Jia 2011-10-10 08:13:43 UTC
Hi Eric,
It seems we still need to apply a new patch for this bug. At present, libvirt automatically adds the following line to the guest XML:

<address type='pci' domain='xxxx' bus='xx' slot='xx' function='x'/>

That means we can't specify the same slot in guest xml, for instance:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>

Libvirt will raise this error "error: XML error: Attempted double use of PCI Address '0:0:6.0'" when starting guest. In fact, libvirt automatically assigns 0x07 to the second slot, and qemu-kvm cmdline looks like this:

# ps -ef|grep qemu-kvm
qemu      3043     1  3 15:16 ?        00:00:38 /usr/libexec/qemu-kvm -S -M rhel6.2.0 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name vr-rhel6u2-x86_64-kvm -uuid 4b2bab0e-2b67-b743-49c8-bc2b9a551b0e -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vr-rhel6u2-x86_64-kvm.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -drive file=/var/lib/libvirt/images/vr-rhel6u2-x86_64-kvm,if=none,id=drive-ide0-0-0,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:80:4d:ac,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device pci-assign,host=02:00.0,id=hostdev0,configfd=24,bus=pci.0,multifunction=on,addr=0x6 -device pci-assign,host=02:00.1,id=hostdev1,configfd=25,bus=pci.0,addr=0x7 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Although we can see both NIC devices in the guest, they occupy 2 slots instead of 1, which is not what we expect. Nandini gave the expected result in Comment 1.


Alex

Comment 10 Alex Jia 2011-10-10 08:20:02 UTC
BTW, I have a dual-port Broadcom NIC:

# lspci
......
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)

Comment 11 Eric Blake 2011-10-10 15:12:57 UTC
(In reply to comment #9)
> Hi Eric,
> It seems we still need to apply new patch for this bug, at present, libvirt
> will automatically add the following line in guest xml:
> 
> <address type='pci' domain='xxxx' bus='xx' slot='xx' function='x'/>
> 

> Libvirt will raise this error "error: XML error: Attempted double use of PCI
> Address '0:0:6.0'" when starting guest.

This behavior is expected.  Per bug 727530, the only way to get two devices to use two functions is to manually mark the device on function 0 with the additional attribute multifunction='yes', as in:

<address type='pci' domain='xxxx' bus='xx' slot='xx' function='x' multifunction='yes'/>

Without a multifunction designation, then libvirt defaults to separate slots, and intentionally warns about attempts to reuse the same slot, since you can't technically use function 1 of a slot unless function 0 was marked multifunction.
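
The rule described above can be illustrated with a small Python sketch. It is not libvirt's implementation: the function name and data layout are hypothetical, the first error message copies the format quoted in comment 9, and the second check (and its message) is only an illustration of the "function 0 must be marked multifunction" rule.

```python
def check_guest_addresses(addresses):
    """Validate guest PCI addresses given as dicts with int fields
    domain, bus, slot, function and an optional 'multifunction' string."""
    seen = set()
    multifunction_slots = set()
    for addr in addresses:
        key = (addr["domain"], addr["bus"], addr["slot"], addr["function"])
        if key in seen:
            raise ValueError(
                "Attempted double use of PCI Address '%x:%x:%x.%x'" % key)
        seen.add(key)
        if addr["function"] == 0 and addr.get("multifunction") == "on":
            multifunction_slots.add(key[:3])
    # A non-zero function is only usable if function 0 of that slot is multifunction.
    for addr in addresses:
        slot_key = (addr["domain"], addr["bus"], addr["slot"])
        if addr["function"] != 0 and slot_key not in multifunction_slots:
            raise ValueError(
                "function %x of slot %x requires multifunction='on' on function 0"
                % (addr["function"], addr["slot"]))


# Two devices both placed on slot 6, function 0.
same_function = [
    {"domain": 0, "bus": 0, "slot": 6, "function": 0, "multifunction": "on"},
    {"domain": 0, "bus": 0, "slot": 6, "function": 0},
]
try:
    check_guest_addresses(same_function)
except ValueError as exc:
    print(exc)
```

Changing the second entry to function 1 makes the check pass, because function 0 of the same slot carries multifunction='on'.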

Comment 12 Alex Jia 2011-10-11 03:25:09 UTC
(In reply to comment #11)
> This behavior is expected.  Per bug 727530, the only way to get two devices to
> use two functions is to manually mark the device on function 0 with the
> additional attribute multifunction='yes', as in:
> 
> <address type='pci' domain='xxxx' bus='xx' slot='xx' function='x'
> multifunction='yes'/>

This should be 'on', not 'yes', but I know what you mean.
> 
> Without a multifunction designation, then libvirt defaults to separate slots,
> and intentionally warns about attempts to reuse the same slot, since you can't
> technically use function 1 of a slot unless function 0 was marked
> multifunction.

Eric, you're right. The root cause is that I forgot to change 'function' to '0x1' in the second '<address type='pci' domain='0x0000' ...>' line in Comment 9. The correct hostdev XML block looks like this:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x1'/>
    </hostdev>

and I can see the expected qemu-kvm cmdline:
# ps -ef|grep qemu-kvm
qemu     12186     1 31 10:42 ?        00:00:01 /usr/libexec/qemu-kvm -S -M rhel6.2.0 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name vr-rhel6u2-x86_64-kvm -uuid 4b2bab0e-2b67-b743-49c8-bc2b9a551b0e -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vr-rhel6u2-x86_64-kvm.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -drive file=/var/lib/libvirt/images/vr-rhel6u2-x86_64-kvm,if=none,id=drive-ide0-0-0,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=23,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:80:4d:ac,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device pci-assign,host=02:00.0,id=hostdev0,configfd=24,bus=pci.0,multifunction=on,addr=0x6 -device pci-assign,host=02:00.1,id=hostdev1,configfd=25,bus=pci.0,addr=0x6.0x1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Note that in the pci-assign section the first addr is '0x6' and the second addr is '0x6.0x1'.
In addition, both devices are visible in the guest:
# lspci
......
00:06.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
00:06.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)

Everything is fine, so move the bug to VERIFIED status.

Version-Release number of selected component:
# uname -r
2.6.32-206.el6.x86_64

# rpm -q libvirt
libvirt-0.9.4-16.el6.x86_64

# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.195.el6.x86_64
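
For reference, the verified configuration can be generated programmatically. The following Python sketch builds hostdev elements of the shape used in this comment with the standard library and formats the guest address the way it appears in the qemu-kvm command line above ('0x6' for function 0, '0x6.0x1' for function 1). The helper names are hypothetical, and the addr formatting simply mirrors the observed output rather than libvirt's actual code.

```python
import xml.etree.ElementTree as ET


def hostdev_element(host_bus, host_slot, host_function,
                    guest_slot, guest_function, multifunction=False):
    """Build a <hostdev> element that assigns one host PCI function to the guest."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(source, "address", domain="0x0000",
                  bus="0x%02x" % host_bus, slot="0x%02x" % host_slot,
                  function="0x%x" % host_function)
    guest = ET.SubElement(hostdev, "address", type="pci", domain="0x0000",
                          bus="0x00", slot="0x%02x" % guest_slot,
                          function="0x%x" % guest_function)
    if multifunction:
        # Only function 0 of the guest slot carries this attribute.
        guest.set("multifunction", "on")
    return hostdev


def qemu_addr(slot, function):
    """Format a guest slot/function like the addr= values seen in the command line."""
    if function == 0:
        return "0x%x" % slot
    return "0x%x.0x%x" % (slot, function)


# Both functions of host device 02:00 share guest slot 0x06.
for fn in (0, 1):
    element = hostdev_element(0x02, 0x00, fn, 0x06, fn, multifunction=(fn == 0))
    print(ET.tostring(element, encoding="unicode"))
    print(qemu_addr(0x06, fn))
```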

Comment 13 errata-xmlrpc 2011-12-06 11:29:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html