Bug 222460
| Summary: | qla4xxx/qla3xxx: co-existence issues during load/unload of either interface |
|---|---|
| Product: | Red Hat Enterprise Linux 4 |
| Component: | kernel |
| Version: | 4.0 |
| Hardware: | All |
| OS: | Linux |
| Status: | CLOSED DUPLICATE |
| Severity: | high |
| Priority: | high |
| Reporter: | Mike Christie <mchristi> |
| Assignee: | Mike Christie <mchristi> |
| QA Contact: | Brian Brock <bbrock> |
| CC: | andriusb, coughlan, dwm, karen.higgins, konradr, mbarrow, ravi.anand, rkenna |
| Doc Type: | Bug Fix |
| Last Closed: | 2007-02-26 21:08:18 UTC |
| Bug Blocks: | 209341, 216986 |

Description
Mike Christie
2007-01-12 16:59:40 UTC
Created attachment 145510
v5.01.00-d5
Bug fix to acquire drvr semaphore prior to resetting card during driver unload.
Reproduced bug and verified fix on kernel-2.6.9-42.40 dual proc x86_64 system.

(In reply to comment #1)
> Created an attachment (id=145510)
> v5.01.00-d5
>
> Bug fix to acquire drvr semaphore prior to resetting card during driver unload.
> Reproduced bug and verified fix on kernel-2.6.9-42.40 dual proc x86_64 system.

Thanks for the patch. It fixes the hard lock up found with just the last patch, but the soft lock is still there. It is harder to hit now though. I am in the middle of recompiling the kernel with some debugging to see if there is anything detectable.

Mike,

From the following steps:

1. With qla4xxx and qla3xxx modules loaded, unloading/loading or bringdown/up one of the qla3xxx interfaces can sometimes lock up the qla4xxx driver.
2. With qla4xxx and qla3xxx modules loaded, unloading the qla4xxx module on a 4052 (one with two ports) can lock up the qla4xxx driver.
3. Another simple test is to run traffic on the ISCSI side and simply unload the qla3xxx module. This would cause the iscsi traffic to stop.

Which one is causing the soft lockup to happen?

With a 4052, I load qla3xxx and qla4xxx. qla3xxx is setup as eth0 and eth1. qla4xxx is setup with a session in the db, but the target is not connected. There is no traffic on the iscsi or network interface. When I do ifdown on eth0 or eth1, the box locks up. I cannot move the mouse or type anything in the console. Then, maybe a minute or two later the box unfreezes and everything works again. I think this is your #1.

Created attachment 146040
Fix MII register access wait.
This patch fixes a condition where the network driver was busy waiting for the
MII register to become ready. It was looping without giving up the processor.
The loop duration is still 10ms, but a schedule_timeout is not called via
mdelay() instead of udelay().
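
The bounded wait described above can be illustrated with a small userspace C sketch. This is not the qla3xxx patch: `wait_ready_sleeping`, `fake_mii`, and `fake_mii_ready` are hypothetical names invented for illustration, and `nanosleep()` stands in for a kernel sleeping primitive such as `msleep()`. In the kernel, both `udelay()` and `mdelay()` busy-wait, so only a sleeping call actually gives up the processor between polls.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical readiness callback; stands in for reading an MII status bit. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll is_ready() up to max_polls times, sleeping interval_us microseconds
 * between polls so other tasks can run. Returns 0 once ready, -1 on timeout.
 */
static int wait_ready_sleeping(ready_fn is_ready, void *ctx,
                               int max_polls, long interval_us)
{
    struct timespec ts;

    assert(is_ready != NULL);
    ts.tv_sec = interval_us / 1000000L;
    ts.tv_nsec = (interval_us % 1000000L) * 1000L;

    for (int i = 0; i < max_polls; i++) {
        if (is_ready(ctx))
            return 0;
        nanosleep(&ts, NULL);   /* yields the CPU instead of spinning */
    }
    /* One last check after the final sleep before reporting a timeout. */
    return is_ready(ctx) ? 0 : -1;
}

/* Simulated device: becomes ready after a fixed number of status reads. */
struct fake_mii {
    int reads_until_ready;
};

static bool fake_mii_ready(void *ctx)
{
    struct fake_mii *dev = ctx;

    if (dev->reads_until_ready > 0) {
        dev->reads_until_ready--;
        return false;
    }
    return true;
}
```

Calling `wait_ready_sleeping(fake_mii_ready, &dev, 10, 1000)` bounds the wait at roughly 10 ms, matching the loop duration mentioned in the patch description, while letting the scheduler run other work between polls. A sleeping wait like this is only valid in a context that is allowed to sleep.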

I meant to say "but a schedule_timeout is "now" called via mdelay() instead of udelay()."

Seems like it should really be msleep() and not mdelay(). msleep() invokes schedule_timeout() while mdelay() is a busy-waiting function.

Ravi

If you are trying to give up the processor, use msleep; I agree with Ravi. Note, though, that in some of the places where you are using msleep and ssleep today, you should not be holding a spin lock with irqs off. There is a kernel compile time debug option to check for this, DEBUG_SPINLOCK_SLEEP. It is not a default option for RHEL, but you can run with it yourself if you compile your own kernel.

(In reply to comment #13)
> debug option to check for this, DEBUG_SPINLOCK_SLEEP. It is not a default option
> for RHEL, but you can run with it yourself if you compile your own kernel.

Actually, it should be compiled in your kernel already.

I am running with CONFIG_DEBUG_SPINLOCK_SLEEP already. I will look into the areas where sleep is called with irqs off. I haven't pushed the patch upstream so will change delay to sleep when I do.

*** This bug has been marked as a duplicate of 228416 ***