The reason commenting out the SRIOV code in setup-res.c & setup-bus.c works is that it removes the resource gathering of VFs: when the generic pci_bus_alloc_resource() is done for all pci-dev resources, the SRIOV resources aren't allocated, so they can't be lost btwn unplug/plug.

So, the root of the problem is that the generic scan code does the allocate_resource() in setup-*.c, but pci_remove_bus_device(), which is done by the hotplug code at unplug time, does not do the release_resource() on the sriov resources.

[note: release_resource() is called by pci_free_resources(), which is called by pci_destroy_dev(), which is called by pci_remove_bus_device().]

This all works in upstream b/c this loop in pci_free_resources():

    for (i = 0; i < PCI_NUM_RESOURCES; i++) {
        struct resource *res = dev->resource + i;
        if (res->parent)
            release_resource(res);
    }

*includes* the sriov resources. In RHEL5, the sriov resources are put in a separate struct that is linked to the pci_dev{} struct, due to kabi compatibility reqs btwn 5.0GA and the sriov support backported to rhel5.

So, I believe that by adding this code segment after the one above in pci_free_resources(), the unplug should work:

    if (dev->is_physfn) {
        struct pci_sriov *iov = dev->sriov;
        struct resource *res;

        if (!iov)
            return;
        for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
            res = iov->res + i;
            if (res->parent)
                release_resource(res);
        }
    }

Warning: Only compile tested; have to try this out on Monday. Have 82576 in rhel5 system so I can dupe this case.
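To make the above concrete, here is a small user-space model of the situation, not kernel code. Every model_* name and the MODEL_* constants are made up for illustration; the real structures are struct pci_dev, struct pci_sriov and struct resource. It only shows that a loop over the dev's own resource array never visits resources kept in a side struct, and that the extra loop proposed above would (provided dev->sriov is still set when it runs):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_NUM_RESOURCES  4  /* stand-in for PCI_NUM_RESOURCES */
#define MODEL_SRIOV_NUM_BARS 6  /* stand-in for PCI_SRIOV_NUM_BARS */

/* Hypothetical, heavily reduced versions of the kernel structs. */
struct model_res { struct model_res *parent; };
struct model_iov { struct model_res res[MODEL_SRIOV_NUM_BARS]; };
struct model_dev {
    struct model_res resource[MODEL_NUM_RESOURCES];
    struct model_iov *sriov;   /* side struct, as in the rhel5 backport */
    int is_physfn;
};

/* Models release_resource(): detach the resource from its parent. */
static void model_release(struct model_res *res)
{
    assert(res->parent != NULL);
    res->parent = NULL;
}

/* Models pci_free_resources(), optionally with the proposed extra loop. */
static void model_free_resources(struct model_dev *dev, int with_sriov_loop)
{
    int i;

    for (i = 0; i < MODEL_NUM_RESOURCES; i++)
        if (dev->resource[i].parent)
            model_release(&dev->resource[i]);

    if (!with_sriov_loop || !dev->is_physfn || !dev->sriov)
        return;
    for (i = 0; i < MODEL_SRIOV_NUM_BARS; i++)
        if (dev->sriov->res[i].parent)
            model_release(&dev->sriov->res[i]);
}

/* Claims everything, frees, and returns how many resources stay claimed. */
int model_leak_count(int with_sriov_loop)
{
    struct model_res root = { NULL };
    struct model_iov iov;
    struct model_dev dev;
    int i, claimed = 0;

    for (i = 0; i < MODEL_NUM_RESOURCES; i++)
        dev.resource[i].parent = &root;
    for (i = 0; i < MODEL_SRIOV_NUM_BARS; i++)
        iov.res[i].parent = &root;
    dev.sriov = &iov;
    dev.is_physfn = 1;

    model_free_resources(&dev, with_sriov_loop);

    for (i = 0; i < MODEL_NUM_RESOURCES; i++)
        if (dev.resource[i].parent)
            claimed++;
    for (i = 0; i < MODEL_SRIOV_NUM_BARS; i++)
        if (iov.res[i].parent)
            claimed++;
    return claimed;
}

#ifdef MODEL_DEMO_MAIN  /* build with -DMODEL_DEMO_MAIN to run standalone */
int main(void)
{
    printf("generic loop only: %d still claimed\n", model_leak_count(0));
    printf("with sriov loop:   %d still claimed\n", model_leak_count(1));
    return 0;
}
#endif
```

With only the generic loop, all of the side-struct resources stay claimed; with the extra loop, none do.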
The patch in c#11 won't work b/c sriov_disable() is invoked by the driver before pci_remove_bus_device() is called, which will always make dev->sriov == NULL, and thus the patch in c#11 will never invoke release_resource(). So the release has to occur at sriov disable time, like this patch:

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index c182696..4e525b5 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -396,11 +396,20 @@ failed:
 static void sriov_release(struct pci_dev *dev)
 {
+	int i;
+
 	BUG_ON(dev->sriov->nr_virtfn);
 
 	if (dev != dev->sriov->dev)
 		pci_dev_put(dev->sriov->dev);
 
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+		struct resource *res = dev->sriov->res + i;
+		if (!res->parent)
+			continue;
+		release_resource(res);
+	}
+
 	mutex_destroy(&dev->sriov->lock);
 
 	kfree(dev->sriov);

Testing with fakephp (remove-only) shows the leak doesn't occur. It also solves another problem: w/o this patch, a cat /proc/iomem would eventually crash after hot-unplug, since the resource structures contained in the sriov struct stay linked into the iomem list after the struct is freed; once the kernel reuses that memory, iomem-list traversal is corrupted.
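The ordering issue above can be sketched in user space as follows. This is only a model under stated assumptions: all m_* names and M_BARS are hypothetical, the "free" of the sriov struct is modelled with a flag so the sketch never touches freed memory, and m_sriov_teardown() stands in for whatever the disable/release path does in the real kernel. It shows that a remove-time check on dev->sriov never fires once teardown has run, so the resources must be unlinked before the struct is freed:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define M_BARS 6  /* stand-in for PCI_SRIOV_NUM_BARS */

/* Hypothetical reduced resource node: parent plus a sibling-linked child list. */
struct m_res {
    struct m_res *parent, *sibling, *child;
    const int *owner_freed;   /* model-only: points at the owner's freed flag */
};
struct m_iov { int freed; struct m_res res[M_BARS]; };
struct m_dev { struct m_iov *sriov; };

/* Models claiming a resource: link it under root. */
static void m_request(struct m_res *root, struct m_res *r)
{
    r->parent = root;
    r->sibling = root->child;
    root->child = r;
}

/* Models release_resource(): unlink r from its parent's child list. */
static void m_release(struct m_res *r)
{
    struct m_res **p;

    assert(r->parent != NULL);
    p = &r->parent->child;
    while (*p && *p != r)
        p = &(*p)->sibling;
    if (*p)
        *p = r->sibling;
    r->parent = NULL;
    r->sibling = NULL;
}

/* Models the sriov teardown; release_first selects the patched behaviour. */
static void m_sriov_teardown(struct m_dev *dev, int release_first)
{
    int i;

    if (release_first)
        for (i = 0; i < M_BARS; i++)
            if (dev->sriov->res[i].parent)
                m_release(&dev->sriov->res[i]);
    dev->sriov->freed = 1;    /* models kfree(dev->sriov) */
    dev->sriov = NULL;
}

/* Models the remove-time loop from c#11: runs too late to help. */
static void m_remove_time_loop(struct m_dev *dev)
{
    struct m_iov *iov = dev->sriov;
    int i;

    if (!iov)
        return;               /* always taken once teardown has run */
    for (i = 0; i < M_BARS; i++)
        if (iov->res[i].parent)
            m_release(&iov->res[i]);
}

/* Returns how many nodes under root belong to an already "freed" struct. */
int m_dangling_after_unplug(int release_first)
{
    struct m_res root = { NULL, NULL, NULL, NULL };
    struct m_iov *iov = calloc(1, sizeof(*iov));
    struct m_dev dev;
    const struct m_res *r;
    int i, dangling = 0;

    if (!iov)
        return -1;
    dev.sriov = iov;
    for (i = 0; i < M_BARS; i++) {
        iov->res[i].owner_freed = &iov->freed;
        m_request(&root, &iov->res[i]);
    }

    m_sriov_teardown(&dev, release_first);
    m_remove_time_loop(&dev);

    for (r = root.child; r; r = r->sibling)
        if (*r->owner_freed)
            dangling++;
    free(iov);
    return dangling;
}

#ifdef M_DEMO_MAIN  /* build with -DM_DEMO_MAIN to run standalone */
int main(void)
{
    printf("release at remove time: %d dangling\n", m_dangling_after_unplug(0));
    printf("release at teardown:    %d dangling\n", m_dangling_after_unplug(1));
    return 0;
}
#endif
```

In the model, the dangling nodes left in the root list correspond to what a later /proc/iomem walk would trip over.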
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Created attachment 498551: Screenshot
Hello, the customer in the case 00456823 requests a z-stream errata for the bug. => Nominating for 5.6.z inclusion.
Verified on kernel-2.6.18-261.el5 using comment 23's steps; I tried 5 times. After step 4, the host still works well:

# cat /proc/iomem | grep igb
cf338000-cf33bfff : igb
cf33c000-cf33ffff : igb
cf340000-cf35ffff : igb
cf3a0000-cf3bffff : igb
cf400000-cf7fffff : igb
cf800000-cfbfffff : igb
Patch(es) available in kernel-2.6.18-261.el5 You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5 Detailed testing feedback is always welcomed.
According to comment 23 and comment 27, setting this issue as verified.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Hot removing a PCIe device and subsequently hot plugging it again caused a kernel panic. This was due to a PCI resource for the SR-IOV Virtual Function (VF) not being released after the hot removal, causing the memory area in the pci_dev struct to be used by another process. With this update, when a PCIe device is removed from a system, all resources are properly released; the kernel panic no longer occurs.
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-1065.html