Bug 655702

Summary: intel 82576 VF assignment to a guest : the guest doesn't receive ARP reply
Product: [Fedora] Fedora
Reporter: mathieu <mathieu.rohon>
Component: qemu
Assignee: Justin M. Forbes <jforbes>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 14
CC: alex.williamson, amit.shah, berrange, chrisw, clalance, dwmw2, ehabkost, extras-orphan, gcosta, itamar, jaswinder, jforbes, knoel, markmc, notting, ondrejj, quintela, scottt.tw, virt-maint
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2010-12-09 17:07:00 UTC
Attachments:
the dmesg of the host (flags: none)

Description mathieu 2010-11-22 08:36:16 UTC
Created attachment 461943
the dmesg of the host

Description of problem: After assigning a VF to a guest, the guest can broadcast ARP requests through this interface, but the ARP replies never reach the guest.


Version-Release number of selected component (if applicable): Fedora 14 as host and guest


How reproducible: 5/5


Steps to Reproduce:
1. Follow the instructions to enable PCI passthrough on one VF of the Intel 82576 controller (http://docs.fedoraproject.org/en-US/Fedora/13/html-single/Virtualization_Guide/index.html#intel-prep), and assign the VF to the Fedora 14 guest.
2. Disable SELinux.
3. From the guest, send DHCP requests or arping any other host in the broadcast domain.
  
Actual results: The ARP reply is sent by the peer but is never received by the guest.


Expected results:


Additional info: It works fine when the VF is used directly in the host. When the VF is assigned to the guest, qemu reports an error.
Extract from /var/log/libvirt/qemu/vm1.log:

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.13 -enable-kvm -m 1057 -smp 1,sockets=1,cores=1,threads=1 -name vm1 -uuid f2f0faf5-f171-0158-3413-a416d61a8d17 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot dc -drive file=/var/lib/libvirt/images/vm1.img,if=none,id=drive-ide0-0-0,boot=on,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive if=none,media=cdrom,id=drive-ide0-0-1,readonly=on,format=raw -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -device rtl8139,vlan=0,id=net0,mac=52:54:00:cf:b1:28,bus=pci.0,addr=0x3 -net tap,fd=115,vlan=0,name=hostnet0 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -vnc 127.0.0.1:0 -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device pci-assign,host=1a:10.0,id=hostdev0,configfd=116,bus=pci.0,addr=0x5 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
char device redirected to /dev/pts/3
assigned_dev_enable_msix: assign irq: Operation not permitted
fail to set MSI-X entry number for MSIX! Invalid argument
assigned_dev_update_msix_mmio: Invalid argument
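
For anyone checking whether their own guest hits the same failure, the three error lines above are the thing to look for in the per-guest qemu log. A minimal sketch in Python, assuming the log format shown in this report; the function name find_msix_errors and the choice of marker substrings are illustrative and are not part of qemu or libvirt:

```python
# Marker substrings copied from the log extract in this report.
MSIX_ERROR_MARKERS = (
    "assigned_dev_enable_msix",
    "fail to set MSI-X entry number",
    "assigned_dev_update_msix_mmio",
)


def find_msix_errors(log_text):
    """Return the log lines that contain one of the MSI-X error markers."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in MSIX_ERROR_MARKERS)
    ]
```

It could be fed the contents of a file such as /var/log/libvirt/qemu/vm1.log; an empty result only means these particular messages were not logged, not that device assignment is healthy.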

Comment 1 Alex Williamson 2010-11-30 17:01:40 UTC
(In reply to comment #0)
> assigned_dev_enable_msix: assign irq: Operation not permitted
> fail to set MSI-X entry number for MSIX! Invalid argument
> assigned_dev_update_msix_mmio: Invalid argument

This makes it look like the device isn't working at all, not just ARP replies.  Does it work if you uncomment these lines in /etc/libvirt/qemu.conf?

user = "root"
group = "root"

(You'll need to restart libvirt or reboot after this)  There might be some lingering issues with de-privileged device assignment in F14.

Comment 2 Chris Wright 2010-11-30 17:02:57 UTC
This was fixed upstream in kernel commit:

48bb09eee4e102544808c00f43bc40a4a2e43e50

Comment 3 Chris Wright 2010-11-30 17:22:45 UTC
(In reply to comment #1)
> (In reply to comment #0)
> > assigned_dev_enable_msix: assign irq: Operation not permitted
> > fail to set MSI-X entry number for MSIX! Invalid argument
> > assigned_dev_update_msix_mmio: Invalid argument
> 
> This makes it look like the device isn't working at all, not just ARP replies. 
> Does it work if you uncomment these lines in /etc/libvirt/qemu.conf?
> 
> user = "root"
> group = "root"
> 
> (You'll need to restart libvirt or reboot after this)  There might be some
> lingering issues with de-privileged device assignment in F14.

Even as root, libvirt drops privileges (i.e. no CAP_SYS_RAWIO).
In libvirt 0.8.2 or newer you can override that by setting
"clear_emulator_capabilities" in /etc/libvirt/qemu.conf.  But that's just a
stopgap (useful for testing).  The proper solution is getting your fix that I mentioned in Comment 2 into the Fedora (and -stable) kernels.
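
One way to confirm whether a running qemu process actually kept CAP_SYS_RAWIO is to inspect the CapEff mask in its /proc/<pid>/status. A rough sketch in Python, assuming a kernel that exposes CapEff there; the helper names has_cap_sys_rawio and read_status are made up for illustration, while the bit number 17 comes from linux/capability.h:

```python
CAP_SYS_RAWIO = 17  # capability bit number from linux/capability.h


def has_cap_sys_rawio(status_text):
    """Return True if the CapEff mask in /proc/<pid>/status text has CAP_SYS_RAWIO set."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)  # the mask is printed as hex
            return bool(mask & (1 << CAP_SYS_RAWIO))
    raise ValueError("no CapEff line found")


def read_status(pid):
    """Read /proc/<pid>/status for the given process id (hypothetical helper)."""
    with open("/proc/%d/status" % pid) as f:
        return f.read()
```

Calling has_cap_sys_rawio(read_status(pid)) with the pid of the qemu-kvm process would be expected to return False with libvirt's default settings and True once capabilities are no longer cleared, if the explanation above holds.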

Comment 4 Alex Williamson 2010-11-30 17:34:50 UTC
(In reply to comment #3)
> Even as root, libvirt drops privileges (i.e. no CAP_SYS_RAWIO).
> In libvirt 0.8.2 or newer then you can override that by setting
> "clear_emulator_capabilities" in /etc/libvirt/qemu.conf.  But that's just a
> stop
> gap (useful to test).  The proper solution is getting your fix I mentioned in
> Comment 2 into the fedora (and -stable) kernels.

Yep, I forgot the clear_emulator_capabilities flag.  Definitely looks like the referenced commit, thanks for digging that up.

Comment 5 mathieu 2010-12-01 11:02:56 UTC
Thanks for the reply.
I set the flag and used "root", but it had no effect, even after a reboot. I still get the error.
I use libvirt 0.8.3 and kernel 2.6.35.6-48.fc14.i686.

(In reply to comment #1)
> (In reply to comment #0)
> > assigned_dev_enable_msix: assign irq: Operation not permitted
> > fail to set MSI-X entry number for MSIX! Invalid argument
> > assigned_dev_update_msix_mmio: Invalid argument
> 
> This makes it look like the device isn't working at all, not just ARP replies. 

It's very strange, because the VF network card works for sending ARP requests on the network, but the replies never reach the VM.

Comment 6 Alex Williamson 2010-12-01 16:41:55 UTC
(In reply to comment #5)
> thanks for the reply.
> I set the flag and use "root", but it has no effect, even after a reboot. I
> still have the error. 
> I use libvirt 0.8.3 and kernel 2.6.35.6-48.fc14.i686

As Chris noted in Comment 3, this was insufficient.  To fully test this, you need to do all of the following:

1) Uncomment the following lines in /etc/libvirt/qemu.conf:

user = "root"
group = "root"

2) Uncomment and change the following line in the same file:

clear_emulator_capabilities = 0

(Note this is = 1 by default)

3) Restart libvirt or reboot

This should work around the missing kernel patch.  Please test.
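
Since all three settings have to be active at once, a small check of the file can catch a line that is still commented out or still at its default. A minimal sketch in Python, assuming qemu.conf uses plain `key = value` lines with `#` comments as in the snippets above; the names REQUIRED, active_settings and missing_workaround_settings are illustrative only, and trailing comments on a setting line are not handled:

```python
import re

# Values required by the workaround described in the steps above.
REQUIRED = {
    "user": '"root"',
    "group": '"root"',
    "clear_emulator_capabilities": "0",
}


def active_settings(conf_text):
    """Collect uncommented `key = value` pairs; a later line overrides an earlier one."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out settings
        match = re.match(r"^(\w+)\s*=\s*(.+?)\s*$", line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def missing_workaround_settings(conf_text):
    """Return the sorted names of required settings that are absent or have another value."""
    active = active_settings(conf_text)
    return sorted(key for key, value in REQUIRED.items() if active.get(key) != value)
```

An empty list from missing_workaround_settings means the file matches the three steps; libvirt still has to be restarted for the change to take effect.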

Comment 7 mathieu 2010-12-03 10:29:26 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > thanks for the reply.
> > I set the flag and use "root", but it has no effect, even after a reboot. I
> > still have the error. 
> > I use libvirt 0.8.3 and kernel 2.6.35.6-48.fc14.i686
> 
> As Chris noted in Comment 3, this was insufficient.  To fully test this, you
> need to do all of the following:
> 
> 1) Uncomment the following lines in /etc/libvirt/qemu.conf:
> 
> user = "root"
> group = "root"
> 
> 2) Uncomment and change the following line in the same file:
> 
> clear_emulator_capabilities = 0
> 
> (Note this is = 1 by default)
> 
> 3) Restart libvirt or reboot
> 
> This should work around the missing kernel patch.  Please test.


great, it works!! thanks

Comment 8 Justin M. Forbes 2010-12-09 17:07:00 UTC
This should be resolved in kernel-2.6.35.9-64.fc14
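
To tell whether a given Fedora 14 host is already on a kernel at or past that build, the release strings can be compared numerically. A rough sketch in Python, assuming release strings shaped like the ones quoted in this bug; it compares only the leading numeric fields and is a simplification, not RPM's version comparison algorithm, and the names FIXED_IN, version_key and has_fix are hypothetical:

```python
import re

FIXED_IN = "2.6.35.9-64.fc14"  # kernel build named in Comment 8


def version_key(release):
    """Turn '2.6.35.6-48.fc14.i686' into the tuple (2, 6, 35, 6, 48)."""
    match = re.match(r"(\d+(?:\.\d+)*)-(\d+)", release)
    if not match:
        raise ValueError("unrecognised kernel release: %r" % release)
    version = tuple(int(part) for part in match.group(1).split("."))
    return version + (int(match.group(2)),)


def has_fix(release):
    """Return True if the release is at or after the build named in Comment 8."""
    return version_key(release) >= version_key(FIXED_IN)
```

On a live host the string could come from platform.release(); the kernel from Comment 5, 2.6.35.6-48.fc14.i686, compares as older than the fixed build.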