pci_unplug_netifs: x=32: ethernet controller
test_pci_slot: 1: slot=4
region type 0 at [f4000000,f4020000).
squash iomem [f4024000, f4024030).
region type 1 at [c200,c240).

This log segment is generated when qemu-dm unplugs the emulated NIC, 00:04.0.

pci_unplug_netifs: x=48: ethernet controller
test_pci_slot: 1: slot=6
test_pci_slot: 2: php_slot=0 valid=1

This log segment is generated when qemu-dm (pci_unplug_netifs()) investigates
and correctly skips (ie. does not unplug) the VF, 00:06.0.

The iomem required to set up MSI-X for the VF (00:06.0) is squashed when
qemu-dm (correctly) unplugs the emulated card (00:04.0).

"Region type 0 at [f4000000, f4020000)" is from

  #define PNPMMIO_SIZE 0x20000

in [tools/ioemu/hw/e1000.c] -- see "vif = [ '..., model=e1000' ]" in comment 0.

Note that these memory ranges don't overlap:

  region type 0 at [f4000000,f4020000) <--- e1000
  squash iomem [f4024000, f4024030)    <--- ixgbevf

The bug is in unregister_iomem() [tools/ioemu/target-i386-dm/exec-dm.c]. See
the following two references:

http://xenbits.xen.org/gitweb/?p=qemu-xen-unstable.git;a=commitdiff;h=8cc8a365
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1805

I'll build a xen package with that patch backported soon.

--- Additional comment from lersek on 2012-09-27 15:01:06 EDT ---

Created attachment 618240 [details]
[1/2] Backport a single hunk from qemu-xen-unstable commit 13669683

commit 13669683830d4508b6c8ed87de088785fa95ed3c
Author: Ian Jackson <ian.jackson.com>
Date: Mon Mar 16 13:47:18 2009 +0000

    Post-merge compilation fixes

    Signed-off-by: Ian Jackson <ian.jackson.com>

as a dependency for the next patch.
---
 tools/ioemu/target-i386-dm/exec-dm.c | 5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

--- Additional comment from lersek on 2012-09-27 15:01:19 EDT ---

Created attachment 618241 [details]
[2/2] qemu-dm: fix unregister_iomem()

Backport of qemu-xen-unstable...
commit 8cc8a3651c9c5bc2d0086d12f4b870fc525b9387
Author: Jan Beulich <JBeulich>
Date: Tue Feb 7 18:42:56 2012 +0000

    This function (introduced quite a long time ago in
    e7911109f4321e9ba0cc56a253b653600aa46bea - "disable qemu PCI
    devices in HVM domains") appears to be completely broken, causing
    the regression reported in
    http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1805 (due to
    the newly added caller of it in
    56d7747a3cf811910c4cf865e1ebcb8b82502005 - "qemu: clean up MSI-X
    table handling").

    It's unclear how the function can ever have fulfilled its purpose:
    the value returned by iomem_index() is *not* an index into mmio[].

    Additionally, fix two problems:
    - unregister_iomem() must not clear mmio[].start, otherwise
      cpu_register_physical_memory() won't be able to re-use the
      previous slot, thus causing a leak
    - cpu_unregister_io_memory() must not check mmio[].size, otherwise
      it won't properly clean up entries (temporarily) squashed through
      unregister_iomem()

    Signed-off-by: Jan Beulich <jbeulich>
    Tested-by: Stefano Stabellini <stefano.stabellini.com>
    Tested-by: Yongjie Ren <yongjie.ren>
---
 tools/ioemu/target-i386-dm/exec-dm.c | 12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

--- Additional comment from lersek on 2012-09-27 15:51:11 EDT ---

(In reply to comment #147)
> Created attachment 618240 [details]
> [1/2] Backport a single hunk from qemu-xen-unstable commit 13669683

(In reply to comment #148)
> Created attachment 618241 [details]
> [2/2] qemu-dm: fix unregister_iomem()

(a) Build with these two, plus the latest debug messages (comment 140):

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=4915127
http://people.redhat.com/~lersek/bz849223_fb871991071c4b3f/xen_task_4915127/

(b) Build with only these two:

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=4915130
http://people.redhat.com/~lersek/bz849223_fb871991071c4b3f/xen_task_4915130/

msgid: <5064AE6D.205>

--- Additional comment from lersek on 2012-09-27 15:57:09 EDT ---

Pasi, as I
wrote in my email, please

- pick a guest kernel with the PCI_D0 patch in comment 87 / comment 92,
  - optionally with the v2 debug patch in comment 123, and
- pick a xen userspace with the series in comment 147 - comment 148,
  - optionally with the v3 debug patch in comment 140.

Then please repeat the three 6.3 tests from comment 129:
- please reboot the host again between tests,
- do not specify the xen_emul_unplug cmdline param in the guest.

Thanks!

--- Additional comment from pasik on 2012-09-27 17:39:18 EDT ---

New tests with PCI_D0 patched and debug enabled el6.3 guest kernel
(87+92+123) and with patched + debug-enabled (140+147+148) xen/qemu-dm rpms:

+-------------------------+-------------+---------------------------+
| host (pciback in sync)  |  max_vfs=1  |         max_vfs=2         |
+-------------------------+-------------+-------------+-------------+
| # of passed-through VFs |      1      |      1      |      2      |
+-------------------------+------+------+------+------+------+------+
| guest                   | 5.8  | 6.3  | 5.8  | 6.3  | 5.8  | 6.3  |
+-------------------------+------+------+------+------+------+------+
| results of the test     | pass | pass | pass | pass | pass | pass |
+-------------------------+------+------+------+------+------+------+

Both rhel5.8 and rhel6.3 HVM guests work OK now!

(OK, almost: rhel5 still has the weird kernel crash the first time the VM is
started, but that's a separate issue, and I'll file a separate bug about
that.)

So it looks like solving this bug needs:

- rhel6 kernel patch for the PCI_D0 issue.
- rhel5 xen qemu-dm patch for the nic unplug / iomem issue.

Thanks a lot!

==============================================

Requesting exception.
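
For readers following the unregister_iomem() analysis earlier in this thread, here is a minimal sketch of the failure mode the commit message describes: a lookup that returns a handler id is used as if it were a position in mmio[]. This is a simplified model, not the exec-dm.c source. The struct layout, the function names (handle_for, squash_buggy, squash_fixed) and the handle values 1 and 2 are assumptions made for illustration; only the two address ranges come from the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical model of the mmio[] table (not the real source). */
struct mmio_slot {
    uint64_t start;  /* guest-physical base of the range */
    uint64_t size;   /* length in bytes; 0 means "squashed" */
    int handle;      /* stand-in for the I/O handler id (assumed values) */
};

#define NSLOTS 2
static struct mmio_slot mmio[NSLOTS];

/* Ranges are taken from the log; the handle values are assumptions. */
static void model_reset(void)
{
    mmio[0] = (struct mmio_slot){ 0xf4000000u, 0x20000u, 1 }; /* e1000 BAR */
    mmio[1] = (struct mmio_slot){ 0xf4024000u, 0x30u, 2 };    /* VF MSI-X */
}

/* Do the half-open intervals [a, a+alen) and [b, b+blen) overlap? */
static int ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Returns the handler id of the range containing addr, or 0 if none. */
static int handle_for(uint64_t addr)
{
    for (size_t i = 0; i < NSLOTS; i++)
        if (addr >= mmio[i].start && addr < mmio[i].start + mmio[i].size)
            return mmio[i].handle;
    return 0;
}

/* Buggy pattern: the handler id is treated as a position in mmio[]. */
static int squash_buggy(uint64_t start)
{
    int idx = handle_for(start);
    if (idx <= 0 || idx >= NSLOTS)
        return -1;
    mmio[idx].start = 0;
    mmio[idx].size = 0;
    return idx;
}

/* Fixed pattern: find the slot by its start address, clear only the size. */
static int squash_fixed(uint64_t start)
{
    for (size_t i = 0; i < NSLOTS; i++) {
        if (mmio[i].size != 0 && mmio[i].start == start) {
            mmio[i].size = 0; /* .start is kept so the slot can be re-used */
            return (int)i;
        }
    }
    return -1;
}

/* Usage: unplug the e1000 range with each variant and inspect the table. */
static void demo(void)
{
    model_reset();
    assert(!ranges_overlap(mmio[0].start, mmio[0].size,
                           mmio[1].start, mmio[1].size));

    assert(squash_buggy(0xf4000000u) == 1); /* wrong slot is hit */
    assert(mmio[1].size == 0);              /* MSI-X range is gone */
    assert(mmio[0].size == 0x20000u);       /* e1000 range is untouched */

    model_reset();
    assert(squash_fixed(0xf4000000u) == 0); /* e1000 slot is hit */
    assert(mmio[1].size == 0x30u);          /* MSI-X range survives */
}
```

In this model the buggy variant reproduces the shape of the log ("squash iomem [f4024000, f4024030)" while unplugging the device at f4000000), and the fixed variant follows the two points listed in the commit message: search by start address, and leave .start intact.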
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
*** Bug 861352 has been marked as a duplicate of this bug. ***
This bug was reproduced on the same machine; verified with:

Version:
Host (RHEL5.9):
- kernel version: 2.6.18-343.el5xen
- Xen version: xen-3.0.3-142.el5
- machine/CPU: dell-per510/Intel Xeon
Guest (RHEL6.4):
- Kernel version: 2.6.32-335

Steps:
1. enable VFs in host
2. assign VFs to guest
3. ping each VF of guest from host

Results:
[in guest]
[root@dhcp-8-202 ~]# lspci | grep 82599
00:03.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:04.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:05.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:06.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:07.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:08.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:09.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:0a.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:0b.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:0c.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:0d.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:0e.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:0f.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:10.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:11.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
00:12.0 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

[in host]
Each VF of the guest can be pinged successfully from the host.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0119.html