Bug 473454
Summary: | device-mapper multipath: kmpathd oops in process_queued_ios
---|---
Product: | Red Hat Enterprise Linux 4
Component: | kernel
Version: | 4.7
Hardware: | All
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | high
Priority: | high
Target Milestone: | rc
Reporter: | Bryn M. Reeves <bmr>
Assignee: | LVM and device-mapper development team <lvm-team>
QA Contact: | Martin Jenner <mjenner>
CC: | caijuanyang, coughlan, hklein, levy_jerome, michael.hagmann, tao
Doc Type: | Bug Fix
Last Closed: | 2010-01-20 20:11:46 UTC

Description
Bryn M. Reeves
2008-11-28 15:58:29 UTC
Created attachment 325030
disassembly of dm-multipath.ko
objdump -d --line-numbers of dm-multipath.ko
Probably not reproducible on CX4 with R26 or later FLARE code, because a lower-level redirector prevents this issue from occurring.

This problem has received low priority because it has only been seen when there is a hardware failure (a bad Clariion Link Control Card) while running with older firmware. Comment 7 indicates that the path-flipping behavior that caused the crash will not happen on CX4 with R26 or later FLARE code. Although it is possible that some other scenario could trigger this crash, we are not currently able to reproduce the problem, which prevents us from developing and thoroughly testing a fix. At this stage in the life of RHEL 4, we believe the risk associated with making a change outweighs the risk that this problem will occur. Re-open this BZ if the problem is seen again on current hw/fw.

I hit the same oops in my environment. The only difference from this bug is that I use a HUAWEI S2600 storage array instead of an EMC CX3. The kernel panic can be reproduced with the following steps:

1. The host runs Oracle Enterprise Linux 4 Update 5 with an Emulex LPe11000-M4 FC card. The card is linked to a switch, and the switch is linked to both controller A and controller B of the storage.
2. The storage allocates several LUNs to the host, and the multipathd daemon runs to create dm-0, dm-1, and so on.
3. Use "dd" to run IO against devices dm-0 and dm-1; iostat shows that the IO goes to controller A of the storage.
4. Reboot controller A of the storage; IO fails over to the link connected to controller B. Before failover, my private pg_init is called to send a mode select command, just like other hardware handlers.
5. Repeat the above several times and the kernel will panic. There are one or two kernel panics in 10 tests.

This BZ may be re-opened for a better fix.
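
As an illustration of step 3 in the reproduction steps above (running IO with "dd" against dm-0 and dm-1), a minimal sketch might look like the following. The device node paths, the read-only direction, the 1M block size, and the DRY_RUN switch are all assumptions made for this sketch; the report does not give the exact dd command that was used.

```shell
#!/bin/sh
# Sketch of reproduction step 3: generate read IO on each multipath device.
# ASSUMPTION: device node paths; the report names only dm-0 and dm-1.
DEVICES="${DEVICES:-/dev/dm-0 /dev/dm-1}"
# ASSUMPTION: dry-run by default so nothing touches real devices by accident.
DRY_RUN="${DRY_RUN:-1}"

# Print the dd command for one device. Reading into /dev/null and
# bs=1M are illustrative choices, not taken from the report.
dd_cmd() {
    echo "dd if=$1 of=/dev/null bs=1M"
}

# Start one background reader per device, or just print the commands
# when DRY_RUN is 1.
start_io() {
    for dev in $DEVICES; do
        if [ "$DRY_RUN" = "1" ]; then
            dd_cmd "$dev"
        else
            dd if="$dev" of=/dev/null bs=1M &
        fi
    done
}

start_io
```

With DRY_RUN=0 on a test host, the readers run in the background, and something like `iostat -x 5` can be used to see which path carries the IO before controller A is rebooted (step 4).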