Bug 861352

Summary: qemu-dm removes wrong iomem range when unplugging emulated NIC
Product: Red Hat Enterprise Linux 5
Component: xen
Version: 5.8
Hardware: Unspecified
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Reporter: Laszlo Ersek <lersek>
Assignee: Laszlo Ersek <lersek>
QA Contact: Virtualization Bugs <virt-bugs>
CC: agospoda, ddutile, dzickus, leiwang, lersek, pasik, pbonzini, qguan, sassmann, tburke, xen-maint, yuzhou
Whiteboard: xen
Doc Type: Bug Fix
Type: Bug
Clone Of: 849223
Bug Depends On: 849223
Last Closed: 2012-09-28 10:10:40 UTC

Description Laszlo Ersek 2012-09-28 10:05:55 UTC
+++ This bug was initially created as a clone of Bug #849223 +++

**** Description of problem:
When qemu-dm unplugs an emulated NIC, as requested by the RHEL-6 guest kernel, it intends to squash the iomem region(s) belonging to the NIC being removed. However, a bug in the unregister_iomem() function may cause removal of another card's region, for example one belonging to a passthru VF.

With the erroneously removed range, the VF's MSI-X registers are impossible to program for the guest.

**** Version-Release number of selected component (if applicable):
All versions up to xen pkg build -141.

**** How reproducible:
Seems to be host config dependent -- 100% reproducible on some machines, 0% reproducible on others. May depend on the emulated NIC's model (rtl8139 didn't seem to trigger the bug, e1000 did).

**** Steps to Reproduce:
1. Set up ixgbevf passthru in dom0 like this:
- hypervisor command line: dom0_mem=2048M iommu=1
- vmlinuz command line:    pci_pt_e820_access=on
- /etc/modprobe.conf:      options ixgbe max_vfs=1
- blacklist the ixgbevf module
- make sure the pciback module hides/seizes the single VF of each PF (= 1 VF per port)
- bring up the PF(s) in the host

2. Install a RHEL-6.3 guest:
- make sure the xen_emul_unplug parameter is absent from the guest command line
- use the e1000 model emulated NIC
- pass through one VF in total to the guest
- bring up the VF in the guest
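
The "xen_emul_unplug is absent" precondition from step 2 can be checked inside the guest. This is a minimal sketch, assuming the guest's kernel command line is read from /proc/cmdline; the helper name has_kernel_param is hypothetical.

```python
def has_kernel_param(cmdline, name):
    """Return True if `name` appears on the command line, bare or as name=value."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def guest_has_param(name, path="/proc/cmdline"):
    """Check the running kernel's command line (Linux guests only)."""
    with open(path) as f:
        return has_kernel_param(f.read(), name)
```

For the reproducer, guest_has_param("xen_emul_unplug") should return False in the guest.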

**** Actual results:
- qemu-dm logs something like

  region type 0 at [f4000000,f4020000).
  squash iomem [f4024000, f4024030).

Those ranges should match -- the first line describes the region belonging to the emulated NIC being unplugged, but the region actually squashed belongs to the VF.
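
The mismatch can also be checked mechanically when scanning qemu-dm logs. This is a minimal sketch, assuming the two log lines have exactly the format quoted above; the function names are hypothetical and not part of qemu-dm.

```python
import re

# Matches a half-open hex range such as "[f4000000,f4020000)".
RANGE_RE = re.compile(r"\[\s*([0-9a-fA-F]+)\s*,\s*([0-9a-fA-F]+)\s*\)")


def parse_range(line):
    """Return (start, end) from the first "[start,end)" found in the line."""
    m = RANGE_RE.search(line)
    if m is None:
        raise ValueError("no range found in: %r" % line)
    return int(m.group(1), 16), int(m.group(2), 16)


def ranges_match(region_line, squash_line):
    """True if the squashed range equals the region reported for the NIC."""
    return parse_range(region_line) == parse_range(squash_line)


region = "  region type 0 at [f4000000,f4020000)."
squash = "  squash iomem [f4024000, f4024030)."
print(ranges_match(region, squash))
```

With the two lines from this report the ranges differ, which is the symptom described here.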

- the MSI-X interrupts are configured for the VF, but their counters stay 0 in /proc/interrupts, and there's no traffic.

 48:          0   PCI-MSI-edge      eth1-rx-0
 49:          0   PCI-MSI-edge      eth1-tx-0
 50:          0   PCI-MSI-edge      eth1:mbx
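
To spot this symptom without reading /proc/interrupts by eye, the idle vectors can be listed programmatically. This is a minimal sketch, assuming rows shaped like the excerpt above (IRQ number, one counter per CPU, interrupt type, vector name); the function name idle_vectors is hypothetical.

```python
def idle_vectors(interrupts_text, ifname):
    """Return the names of ifname's interrupt vectors whose counters are all zero."""
    idle = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Data rows start with "<irq>:"; skip the CPU header and blank lines.
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        name = fields[-1]
        # Vector names look like "eth1-rx-0" or "eth1:mbx" in the excerpt.
        if not (name.startswith(ifname + "-") or name.startswith(ifname + ":")):
            continue
        # Per-CPU counters are the leading all-digit fields after the IRQ number.
        counts = []
        for field in fields[1:-1]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if counts and sum(counts) == 0:
            idle.append(name)
    return idle
```

In the guest, idle_vectors(open("/proc/interrupts").read(), "eth1") would list every eth1 vector that has never fired.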

**** Expected results:
The VF should work.

**** Additional info:
- See the three comments starting at bug 849223 comment 146 for analysis and patches.

- This bug may be masked by RHEL-6 guest kernel bug 849223. Depending on the host, a working VF may require fixes for both bugs.

Comment 1 Laszlo Ersek 2012-09-28 10:10:40 UTC

*** This bug has been marked as a duplicate of bug 861349 ***