Bug 698879 - The pci resource for vf is not released after hot-removing Intel 82576 NIC
Summary: The pci resource for vf is not released after hot-removing Intel 82576 NIC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
: 5.8
Assignee: Don Dutile (Red Hat)
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 684637 707606 707899
 
Reported: 2011-04-22 06:58 UTC by Mark Wu
Modified: 2018-11-14 13:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Hot-removing a PCIe device and subsequently hot-plugging it again caused a kernel panic. This was because a PCI resource for the SR-IOV Virtual Function (VF) was not released on hot removal, so the memory area in the pci_dev struct could be used by another process. With this update, when a PCIe device is removed from a system, all resources are properly released and the kernel panic no longer occurs.
Clone Of:
Environment:
Last Closed: 2011-07-21 10:05:40 UTC
Target Upstream Version:
Embargoed:


Attachments
release sriov resource in igb_remove (1.66 KB, patch)
2011-04-29 13:10 UTC, Mark Wu
Screenshot (182.66 KB, image/png)
2011-05-12 14:17 UTC, juzhang


Links
System: Red Hat Product Errata
ID: RHSA-2011:1065
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.7 kernel security and bug fix update
Last Updated: 2011-07-21 09:21:37 UTC

Comment 11 Don Dutile (Red Hat) 2011-05-06 22:04:13 UTC
The reason commenting out the SRIOV code in setup-res.c & setup-bus.c works is that it removes the resource gathering for VFs: when the generic pci_bus_alloc_resource() is done for all pci-dev resources, the SRIOV resources aren't allocated, so they can't be lost btwn unplug/plug.

So, the root of the problem is that the generic scan code does
the allocate_resource() in setup-*.c, but pci_remove_bus_device(),
which is done by the hotplug code at unplug time, does not do the release_resource() on the sriov resources.
[note: release_resource() is called by pci_free_resources(), which is called by pci_destroy_dev(), which is called by pci_remove_bus_device().]

This all works in upstream b/c this loop in pci_free_resources():
        for (i = 0; i < PCI_NUM_RESOURCES; i++) {
                struct resource *res = dev->resource + i;
                if (res->parent)
                        release_resource(res);
        }

*includes* the sriov resources. In RHEL5, the sriov resources are
put in a separate struct that is linked to the pci_dev{} struct, due to kabi
compatibility reqs btwn 5.0GA and the sriov support backported to rhel5.
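
To illustrate the layout difference described above, here is a minimal user-space model (not kernel code). All names prefixed with model_ and both BAR counts are assumptions made for this sketch; only the shape of the loop follows the pci_free_resources() excerpt quoted above. It shows that a loop which walks only dev->resource[] leaves every resource in the side structure claimed.

```c
/* User-space model only: simplified stand-ins, not the kernel's definitions. */
#include <stddef.h>

#define MODEL_STD_BARS 6  /* illustrative size of dev->resource[] (assumption) */
#define MODEL_IOV_BARS 6  /* illustrative size of the side structure (assumption) */

struct model_resource {
	struct model_resource *parent;  /* non-NULL while claimed in the resource tree */
};

struct model_sriov {
	struct model_resource res[MODEL_IOV_BARS];
};

struct model_dev {
	struct model_resource resource[MODEL_STD_BARS];
	struct model_sriov *sriov;      /* VF BARs live outside resource[] */
};

/* Same shape as the generic loop: it only ever walks dev->resource[]. */
static void model_free_resources(struct model_dev *dev)
{
	int i;

	for (i = 0; i < MODEL_STD_BARS; i++)
		if (dev->resource[i].parent)
			dev->resource[i].parent = NULL;  /* stands in for release_resource() */
}

/* Claim everything, run the generic loop, count side-structure entries still claimed. */
static int model_leaked_iov_resources(void)
{
	struct model_resource root = { NULL };
	struct model_sriov iov;
	struct model_dev dev;
	int i, leaked = 0;

	dev.sriov = &iov;
	for (i = 0; i < MODEL_STD_BARS; i++)
		dev.resource[i].parent = &root;
	for (i = 0; i < MODEL_IOV_BARS; i++)
		iov.res[i].parent = &root;

	model_free_resources(&dev);

	for (i = 0; i < MODEL_IOV_BARS; i++)
		if (iov.res[i].parent)
			leaked++;
	return leaked;
}
```

In this model, calling model_leaked_iov_resources() reports that all MODEL_IOV_BARS side-structure entries are still claimed after the generic loop has run.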

So, I believe by adding this code segment after the one
above in pci_free_resources, the unplug should work:
        if (dev->is_physfn) {
                struct pci_sriov *iov = dev->sriov;
                struct resource *res;
                if (!iov) 
                        return;
                for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
                        res = iov->res + i;
                        if (res->parent)
                                release_resource(res);
                }
        }

Warning: Only compile tested; have to try this out on Monday.
         Have 82576 in rhel5 system so I can dupe this case.

Comment 12 Don Dutile (Red Hat) 2011-05-09 20:35:19 UTC
The patch in c#11 won't work b/c sriov_disable() is invoked by the driver
before pci_remove_bus_device() is called, which will always make dev->sriov == NULL, and thus the patch in c#11 will never invoke release_resource().
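
The ordering issue can be sketched in user space as follows. This is a model of the sequence described in this comment, not kernel code: every model_ name, the BAR count, and the release_in_teardown switch are hypothetical, and the counter merely stands in for entries left in the resource tree.

```c
/* User-space model of the ordering only; not kernel code. */
#include <stddef.h>
#include <stdlib.h>

#define MODEL_IOV_BARS 6  /* illustrative count (assumption) */

struct model_resource { int claimed; };
struct model_sriov { struct model_resource res[MODEL_IOV_BARS]; };
struct model_dev { struct model_sriov *sriov; };

static int model_tree_entries;  /* stands in for entries still in the resource tree */

static void model_release_all(struct model_sriov *iov)
{
	int i;

	for (i = 0; i < MODEL_IOV_BARS; i++) {
		if (!iov->res[i].claimed)
			continue;
		iov->res[i].claimed = 0;  /* stands in for release_resource() */
		model_tree_entries--;
	}
}

/* Driver-initiated teardown: runs first and frees the side structure. */
static void model_sriov_teardown(struct model_dev *dev, int release_in_teardown)
{
	if (release_in_teardown)
		model_release_all(dev->sriov);
	free(dev->sriov);
	dev->sriov = NULL;
}

/* Later removal path with a c#11-style check. */
static void model_free_resources(struct model_dev *dev)
{
	if (!dev->sriov)
		return;  /* always taken here: teardown already ran */
	model_release_all(dev->sriov);
}

/* Returns the number of entries left behind, or -1 on allocation failure. */
static int model_hot_unplug(int release_in_teardown)
{
	struct model_dev dev;
	int i;

	dev.sriov = calloc(1, sizeof(*dev.sriov));
	if (!dev.sriov)
		return -1;

	model_tree_entries = 0;
	for (i = 0; i < MODEL_IOV_BARS; i++) {
		dev.sriov->res[i].claimed = 1;
		model_tree_entries++;
	}

	model_sriov_teardown(&dev, release_in_teardown);
	model_free_resources(&dev);
	return model_tree_entries;
}
```

In this model, model_hot_unplug(0) leaves all entries behind because the later check never sees a non-NULL side structure, while model_hot_unplug(1) leaves none.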

Thus, the release has to occur at sriov disable time, like this patch:

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index c182696..4e525b5 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -396,11 +396,20 @@ failed:
 
 static void sriov_release(struct pci_dev *dev)
 {
+	int i;
+
 	BUG_ON(dev->sriov->nr_virtfn);
 
 	if (dev != dev->sriov->dev)
 		pci_dev_put(dev->sriov->dev);
 
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
+		struct resource *res = dev->sriov->res + i;
+		if (!res->parent)
+			continue;
+		release_resource(res);
+	}
+
 	mutex_destroy(&dev->sriov->lock);
 
 	kfree(dev->sriov);


Testing with fakephp (remove-only) shows the leak doesn't occur.
This also solves another problem: w/o this patch, a cat /proc/iomem would eventually crash
after hot-unplug, since the resource structures contained in the sriov struct, once freed, would eventually get reused by the kernel and corrupt the iomem-list traversal.
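
The list-corruption effect can be illustrated with a small user-space model. It is a sketch under stated assumptions, not kernel code: a fixed pool of index-linked nodes stands in for the sibling list of resources, all model_ names are hypothetical, and the value written by the "new owner" is chosen arbitrarily so that the outcome is deterministic. On a real system the result of reusing the memory would be unpredictable.

```c
/* User-space model only: an index-linked pool stands in for the sibling list. */
#include <stddef.h>

#define MODEL_POOL_SIZE 4
#define MODEL_NIL (-1)

struct model_node {
	int next;  /* pool index of the next sibling, or MODEL_NIL */
};

static struct model_node model_pool[MODEL_POOL_SIZE];

/* Walk the list; return nodes visited, or -1 if the walk does not terminate. */
static int model_walk(int head)
{
	int steps = 0;
	int i;

	for (i = head; i != MODEL_NIL; i = model_pool[i].next)
		if (++steps > MODEL_POOL_SIZE)
			return -1;  /* more steps than nodes: list is corrupt */
	return steps;
}

/* Build 0 -> 1 -> 2, "free" slot 1 with or without unlinking it first, then walk. */
static int model_walk_after_free(int unlink_first)
{
	model_pool[0].next = 1;
	model_pool[1].next = 2;
	model_pool[2].next = MODEL_NIL;

	if (unlink_first)
		model_pool[0].next = model_pool[1].next;  /* release_resource() analogue */

	/* Slot 1 is reused by an unrelated owner; the value written is arbitrary. */
	model_pool[1].next = 0;

	return model_walk(0);
}
```

Here model_walk_after_free(0) reports a corrupt walk because the stale node is still linked, whereas model_walk_after_free(1) visits the two remaining nodes and terminates.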

Comment 19 RHEL Program Management 2011-05-11 19:09:38 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 24 juzhang 2011-05-12 14:17:05 UTC
Created attachment 498551
Screenshot

Comment 26 Tomas Smetana 2011-05-13 06:57:52 UTC
Hello,
  the customer in the case 00456823 requests a z-stream errata for the bug. => Nominating for 5.6.z inclusion.

Comment 27 juzhang 2011-05-13 07:04:19 UTC
Verified on kernel-2.6.18-261.el5 using the steps in comment 23; I tried 5 times.

After step 4, the host still works well.

#cat /proc/iomem | grep igb
    cf338000-cf33bfff : igb
  cf33c000-cf33ffff : igb
    cf340000-cf35ffff : igb
  cf3a0000-cf3bffff : igb
    cf400000-cf7fffff : igb
  cf800000-cfbfffff : igb
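
When comparing such output before and after an unplug/plug cycle, it can help to turn each line into a region size. The helper below is a hypothetical sketch (the function name is not from this report) that assumes the "start-end : name" line format shown above and uses only standard C.

```c
#include <stdio.h>

/*
 * Parse one "start-end : name" line in the format shown above and return the
 * size of the range in bytes, or 0 if the line does not parse.
 * Leading indentation is skipped by sscanf's %llx conversion.
 */
static unsigned long long iomem_line_size(const char *line)
{
	unsigned long long start, end;

	if (sscanf(line, "%llx-%llx", &start, &end) != 2)
		return 0;
	if (end < start)
		return 0;
	return end - start + 1;  /* ranges are inclusive on both ends */
}
```

For example, the line "cf338000-cf33bfff : igb" covers 0x4000 bytes and "cf400000-cf7fffff : igb" covers 0x400000 bytes.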

Comment 28 Jarod Wilson 2011-05-13 22:21:18 UTC
Patch(es) available in kernel-2.6.18-261.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Comment 30 juzhang 2011-05-18 05:43:48 UTC
According to comment 23 and comment 27, setting this issue as verified.

Comment 34 Martin Prpič 2011-07-12 11:53:14 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Hot-removing a PCIe device and subsequently hot-plugging it again caused a kernel panic. This was because a PCI resource for the SR-IOV Virtual Function (VF) was not released on hot removal, so the memory area in the pci_dev struct could be used by another process. With this update, when a PCIe device is removed from a system, all resources are properly released and the kernel panic no longer occurs.

Comment 35 errata-xmlrpc 2011-07-21 10:05:40 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html

