Description of problem: I've been developing an SGI Hotplug driver on RHEL4. I noticed this in final testing of hotplug code. If you cat /proc/iomem and check lspci after removing a device via hotplug: [root@cranberry3 linux-2.6.9-hotplug]# lspci 01:01.0 Co-processor: Silicon Graphics, Inc. IOC4 I/O controller (rev 4f) 01:03.0 SCSI storage controller: QLogic Corp. ISP12160 Dual Channel Ultra3 SCSI Processor (rev 06) 01:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15) 03:02.0 ATM network controller: FORE Systems Inc ForeRunnerHE ATM Adapter 04:01.0 SCSI storage controller: QLogic Corp. QLA2200 64-bit Fibre Channel Adapter (rev 05) [root@cranberry3 linux-2.6.9-hotplug]# cat /proc/iomem c00004080c200000-c00004080c2fffff : 0000:01:01.0 c00004080f400000-c00004080f4fffff : 0000:03:02.0 c00004080fe00000-c00004080fe00fff : 0000:04:01.0 c00004080fe01000-c00004080fe01fff : 0000:04:02.0 00220000-0023ffff : 0000:04:01.0 There are still allocations listed for device 04:02.0 even though it has been removed from the system. I tracked down the issue to the following: When pci_remove_bus_device is called both the pci_driver remove function for a device and the function pci_free_resources are called. In the case of the QLA2x00 driver, the drivers remove function calls pci_release_regions. This function releases the IO & MEM allocs for the pci device and kfree's them. Later, pci_free_resources is called. This function attempts to also free IO & MEM allocs for the pci device. This may (about 5% of the time) causes an oops. At a minimum, Hotplug support looks busted in 2.6.9 -- I have yet to check a 2.6.11 bk pull. Version-Release number of selected component (if applicable): How reproducible: see above
Bill, I can take this one. I've already posed a few questions in linux-hotplug-devel.
Created attachment 111856 [details] Reproducer of oops. Reproduce oops by memset'ing resource before kfree.
I've asked the following question in linux-hotplug-devel but haven't heard an answer yet. I'm worried that no one knows ... "... should a driver be cleaning up resources or should PCI?"
The low-level driver that requests the resources should also release them. Usually this is via pci_request_regions().
Jeff, Thanks -- I wanted someone who knew a bit about PCI to make a statement of fact about the resource allocations. I'm close to what I think an appropriate solution is. I could just remove the erroneous call to pci_free_resources, however, I believe that this may cause some drivers (that are dependent on the pci_free_resources call) to break. Since pci_free_resources is a legacy (deprecated?) call that we should modify the code such that release_resources sets the appropriate pci_dev->res pointers to null, and pci_free_resources only attempts to free a resource if the pointer is not NULL. At that time we would output a warning message about the deprecation of the call ... IMO, pci_free_resources should never be called. What do you think? Think the community would go for that solution? P.
If a driver is not properly releasing resources, that driver needs to be fixed. If resources are lingering, we shouldn't be messing with them, as that indicates something is still active. Am I missing something?
Jeff, You're absolutely 100% right that the broken drivers are the problem here, not PCI. I had a discussion with Greg Kroah (PCI & Hotplug maintainer) about this over the past 24-48 hours. At issue here is our worry that we will "break" some existing drivers by blindly removing the erroneous call to pci_frese_resources in the PCI subsystem. IMO this call just shouldn't be there. Everything that I've heard says that a driver is responsible for cleaning up resource allocations. I submitted a patch to Greg and he's going to put it into the tree for the time being. I'm at home and will properly submit this patch as an attachment on Monday. ===== drivers/pci/remove.c 1.6 vs edited ===== --- 1.6/drivers/pci/remove.c 2005-01-14 18:06:55 -05:00 +++ edited/drivers/pci/remove.c 2005-03-11 21:18:22 -05:00 @@ -19,8 +19,12 @@ pci_cleanup_rom(dev); for (i = 0; i < PCI_NUM_RESOURCES; i++) { struct resource *res = dev->resource + i; - if (res->parent) + if (res && res->parent) release_resource(res); } } ===== kernel/resource.c 1.26 vs edited ===== --- 1.26/kernel/resource.c 2005-01-08 00:44:13 -05:00 +++ edited/kernel/resource.c 2005-03-11 20:56:47 -05:00 @@ -505,6 +505,7 @@ *p = res->sibling; write_unlock(&resource_lock); kfree(res); + res = NULL; return; } p = &res->sibling;
Created attachment 111987 [details] Patch that "fixes" IO and MEM free'ing issues. Submitted to and accepted by Greg Kroah.
Prarit - if you have a tested fix which you consider viable.. please proceed to proposing it on rhkernel-list.
Please be aware that I can no longer see PRIVATE comments.