Description of problem:
Install ia64 RHEL5.4 alpha with the Xen kernel. Use modprobe -r to remove a PCI device driver. The system prints an error message and a call trace.

[root@maxcv ~]# modprobe -r e1000e
BUG: warning at drivers/xen/core/pci.c:41/pci_bus_remove_wrapper() (Tainted: G )

Call Trace:
 [<a00000010001d240>] show_stack+0x40/0xa0 sp=e00000019016fbf0 bsp=e000000190169298
 [<a00000010001d2d0>] dump_stack+0x30/0x60 sp=e00000019016fdc0 bsp=e000000190169280
 [<a000000100415540>] pci_bus_remove_wrapper+0x120/0x140 sp=e00000019016fdc0 bsp=e000000190169260
 [<a000000100400da0>] __device_release_driver+0x160/0x1c0 sp=e00000019016fdd0 bsp=e000000190169228
 [<a0000001004015f0>] driver_detach+0x170/0x200 sp=e00000019016fdd0 bsp=e0000001901691f0
 [<a0000001003ff5e0>] bus_remove_driver+0x120/0x180 sp=e00000019016fdd0 bsp=e0000001901691c0
 [<a000000100401700>] driver_unregister+0x20/0x60 sp=e00000019016fdd0 bsp=e0000001901691a0
 [<a0000001003058b0>] pci_unregister_driver+0x50/0x120 sp=e00000019016fdd0 bsp=e000000190169170
 [<a0000002015a5850>] e1000_exit_module+0x30/0x1530 [e1000e] sp=e00000019016fdd0 bsp=e000000190169158
 [<a0000001000da3c0>] sys_delete_module+0x3c0/0x460 sp=e00000019016fdd0 bsp=e0000001901690e8
 [<a00000010006ae00>] xen_trace_syscall+0x100/0x140 sp=e00000019016fe30 bsp=e0000001901690e8
 [<a000000000010620>] __start_ivt_text+0xffffffff00010620/0x400 sp=e000000190170000 bsp=e0000001901690e8

Version-Release number of selected component (if applicable):
xen-3.0.3-87.el5
kernel-xen-2.6.18-152.el5
Linux maxcv.rx3600-11.test 2.6.18-152.el5xen #1 SMP Wed Jun 3 19:21:01 EDT 2009 ia64 ia64 ia64 GNU/Linux

How reproducible:
always

Steps to Reproduce:
1. modprobe -r device-driver-name

Actual results:
An error message and a call trace are printed.

Expected results:
No error message.

Additional info:
If we do the following:
1) modprobe -r device-driver (e1000e)
2) as soon as modprobe -r has finished, run "modprobe e1000e" right away

we see:

[root@maxcv ~]# modprobe e1000e
map irq failed
0000:52:00.0: Failed to initialize MSI interrupts. Falling back to legacy interrupts.
map irq failed
0000:52:00.1: Failed to initialize MSI interrupts. Falling back to legacy interrupts.
map irq failed
0000:8b:00.0: Failed to initialize MSI interrupts. Falling back to legacy interrupts.
map irq failed
0000:8b:00.1: Failed to initialize MSI interrupts. Falling back to legacy interrupts.
Created attachment 349198 [details] sysreport
Can you try this same test on the 5.3 (that would be kernel-xen-2.6.18-128.el5) and report back if this is a regression or not? Thanks, Chris Lalancette
We can reproduce this on all of our IA64 hardware, and we don't see it happen on RHEL5.3. We also see it on RHEL5.4b1.
OK, thanks for confirming. I'll mark this as a regression then. Chris Lalancette
Gah. I see what the problem is now. When we added the VT-d stuff, we added some code in drivers/xen/core/pci.c (where the bug message is coming from) that looks like this:

    static int pci_bus_remove_wrapper(struct device *dev)
    {
            int r;
            struct pci_dev *pci_dev = to_pci_dev(dev);
            struct physdev_manage_pci manage_pci;

            manage_pci.bus = pci_dev->bus->number;
            manage_pci.devfn = pci_dev->devfn;

            r = pci_bus_remove(dev);
            /* dev and pci_dev are no longer valid!! */

            WARN_ON(HYPERVISOR_physdev_op(PHYSDEVOP_manage_pci_remove,
                    &manage_pci));
            return r;
    }

However, our ia64 currently doesn't implement PHYSDEVOP_manage_pci_remove, so that's what causes the error message. Upstream xen-unstable c/s 18686 does implement this, so we'll probably need to backport that.

Chris Lalancette
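To make the failure mode above concrete, here is a small userspace model of the wrapper, not kernel code. Everything in it is a hypothetical stand-in: `MODEL_WARN_ON` imitates `WARN_ON`, `model_physdev_op` imitates `HYPERVISOR_physdev_op`, and the sketch assumes that an unimplemented operation comes back as a nonzero error such as `-ENOSYS` (the thread only says the operation is not implemented on ia64).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's WARN_ON(): report a nonzero
 * condition and count how many times that happened. */
static int warn_count;
#define MODEL_WARN_ON(cond)                                          \
    do {                                                             \
        if (cond) {                                                  \
            warn_count++;                                            \
            fprintf(stderr, "BUG: warning at %s:%d/%s()\n",          \
                    __FILE__, __LINE__, __func__);                   \
        }                                                            \
    } while (0)

/* Hypothetical stand-in for HYPERVISOR_physdev_op(). ASSUMPTION: a
 * hypervisor that does not implement the operation returns -ENOSYS. */
static int model_physdev_op(int op_implemented)
{
    return op_implemented ? 0 : -ENOSYS;
}

/* Model of pci_bus_remove_wrapper(): the driver removal itself succeeds,
 * but any nonzero hypercall result trips the warning afterwards. */
static int model_remove_wrapper(int op_implemented)
{
    int r = 0; /* pretend pci_bus_remove() succeeded */

    MODEL_WARN_ON(model_physdev_op(op_implemented));
    return r;
}
```

Calling `model_remove_wrapper(1)` returns 0 silently, while `model_remove_wrapper(0)` also returns 0 but prints a warning line first. That matches the report: the module unload completes, yet every removed device produces a warning and call trace.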
Created attachment 351247 [details]
Skip calling PHYSDEVOP_manage_pci_remove on PCI teardown for ia64

(In reply to comment #8)
> However, our ia64 currently doesn't implement PHYSDEVOP_manage_pci_remove, so
> that's what causes the error message. Upstream xen-unstable c/s 18686 does
> implement this, so we'll probably need to backport that.

Nix this last part. That requires pulling in basically all of ia64 VT-d support, which is way too risky at this stage in 5.4. Instead, I've tested the attached patch, which seems to fix the issue for me.

Chris Lalancette
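The attached patch itself is not reproduced in this thread, so the following is only a sketch of the idea in its title: skip the remove hypercall on ia64 and keep it everywhere else. It is a userspace model with hypothetical names (`model_manage_pci_remove`, `model_remove_wrapper_patched`, `is_ia64`). In a real kernel the architecture choice would more likely be made at compile time than through a runtime argument; the argument is used here only so both paths can be exercised.

```c
#include <assert.h>
#include <stdio.h>

/* Counts how many times the modelled hypercall was issued. */
static int hypercalls_issued;

/* Hypothetical stand-in for the PHYSDEVOP_manage_pci_remove hypercall.
 * ASSUMPTION: it fails (nonzero) on ia64 and succeeds elsewhere. */
static int model_manage_pci_remove(int is_ia64)
{
    hypercalls_issued++;
    return is_ia64 ? -1 : 0;
}

/* Model of a patched wrapper: on ia64 the hypercall is never issued,
 * so there is nothing left to warn about on driver removal. */
static int model_remove_wrapper_patched(int is_ia64)
{
    int r = 0; /* pretend pci_bus_remove() succeeded */

    if (!is_ia64 && model_manage_pci_remove(is_ia64) != 0)
        fprintf(stderr, "warning: manage_pci_remove failed\n");
    return r;
}
```

With `is_ia64` set to 1 the wrapper returns without touching the hypercall; with 0 it issues the call exactly once, as before. The trade-off named in the comment above still applies: this avoids the warning without adding the missing ia64 operation.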
in kernel-2.6.18-159.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.
The new kernel looks good. Works for me.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html