Red Hat Bugzilla – Bug 159869
Diskdump fails through ipr driver
Last modified: 2007-11-30 17:07:18 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050302 Firefox/1.0.1 Fedora/1.0.1-1.3.2
Description of problem:
When a diskdump is triggered, the system hangs because the ipr
adapter is reset.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Add dump partition to /etc/sysconfig/diskdump
2. service diskdump initialformat
3. chkconfig diskdump on
4. service diskdump start
5. echo 1 >/proc/sys/kernel/sysrq
6. echo c >/proc/sysrq-trigger
Actual Results: When the dump is triggered, the system drops into xmon. When I exit xmon, the
following is printed:
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=64 NUMA PSERIES LPAR
NIP: C0000000001B1BFC XER: 0000000000000010 LR: C0000000001B1FA4
REGS: c0000000e0a33920 TRAP: 0300 Not tainted (2.6.9-6.37.EL)
MSR: 8000000000009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 0000000000000000, DSISR: 0000000042000000
TASK: c0000001b83844c0 'bash' THREAD: c0000000e0a30000 CPU: 0
GPR00: C0000000001B1BF8 C0000000E0A33BA0 C0000000004D0610 0000000000000063
GPR04: 0000000000000000 0000000000000000 0000000000000080 0000000000000001
GPR08: 0000000000000018 0000000000000000 C0000001BD247BD8 C0000000001B1C04
GPR12: 0000000044242428 C0000000003DA000 00000000100CCBB0 0000000000000000
GPR16: 00000000FFFFFFFF 0000000010060000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 00000000100C0000 0000000000000000
GPR24: 0000000000000000 0000000000000006 0000000000000000 0000000000000000
GPR28: 0000000000000063 C000000000418D88 C000000000453D38 C00000000043F2F0
NIP [c0000000001b1bfc] .sysrq_handle_crash+0x4/0xc
LR [c0000000001b1fa4] .__handle_sysrq+0xac/0x170
[c0000000e0a33ba0] [c0000000001b1f70] .__handle_sysrq+0x78/0x170 (unreliable)
[c0000000e0a33c50] [c000000000108ee0] .write_sysrq_trigger+0x84/0xb4
[c0000000e0a33cf0] [c0000000000b69b8] .vfs_write+0x148/0x1ac
[c0000000e0a33d90] [c0000000000b6af4] .sys_write+0x4c/0x8c
[c0000000e0a33e30] [c000000000011180] syscall_exit+0x0/0x18
CPU frozen: #1#2#3#4#5#6#7
CPU#0 is executing diskdump.
<3>ipr 0002:48:01.0: Adapter being reset as a result of error recovery.
Then nothing happens. The dump is not created and the system does not reboot.
I have to manually reboot the system.
Expected Results: System should dump memory to the dump partition then automatically reboot.
Attaching patch to fix problem.
Created attachment 115233
Patch to fix problem
Could you please explain how the patch fixes the problem?
The patch keeps the adapter from being reset before the dump is
created. The customer believes that re-enabling interrupts while resetting
the adapter leaves a window for other interrupts to come in, which causes
the system to hang.
With this patch, ipr_eh_host_reset() returns without calling ipr_reset_reload()
when crashdump_mode() is on. So I think the first part of the patch is not required.
Is that correct?
Correct. I think they were cleaning up ipr_reset_reload() by removing its check
for crashdump_mode(), since that check was moved to ipr_eh_host_reset(),
which, as you stated, does not call ipr_reset_reload() when crashdump_mode() is on.
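The behaviour agreed on here can be sketched in a few lines. This is a user-space model, not the actual ipr code: crashdump_mode(), ipr_reset_reload() and ipr_eh_host_reset() are named after the functions discussed in this bug, but their bodies and signatures, the in_crashdump flag, and the reset_calls counter are stand-ins invented for illustration. Returning SUCCESS on the early exit is also an assumption; the comments above only say that the handler returns without calling ipr_reset_reload().

```c
#include <stdbool.h>

/* Stand-ins for the SCSI midlayer return codes of the same names. */
enum { SUCCESS = 0x2002, FAILED = 0x2003 };

static bool in_crashdump;   /* hypothetical flag behind crashdump_mode() */
static int reset_calls;     /* counts how often the adapter reset is started */

static bool crashdump_mode(void)
{
    return in_crashdump;
}

/* Placeholder for the real multi-stage, interrupt-driven adapter reset. */
static int ipr_reset_reload(void)
{
    reset_calls++;
    return SUCCESS;
}

/* Host reset handler: a no-op while diskdump is running. */
static int ipr_eh_host_reset(void)
{
    if (crashdump_mode())
        return SUCCESS;     /* assumed return value; adapter is left alone */
    return ipr_reset_reload();
}
```

With in_crashdump set, the handler returns immediately and reset_calls stays at zero, so the reset path never gets a chance to re-enable interrupts.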
Created attachment 116128
I tried modifying the patch. The original patch does not wait for the reset to
complete, and I think waiting for completion is better if it is possible.
With this patch, ipr_eh_host_reset() calls ipr_reset_reload(), and
ipr_reset_reload() calls spin_unlock() instead of spin_unlock_irq() if
crashdump_mode() is on. The new code can therefore wait for the reset to complete
without enabling interrupts. If the customer agrees with this approach, could you
please ask them to test the patch?
> The patch does not work. The ipr driver requires interrupts to be
> functioning in order for its host reset processing to work. The patch would
> result in a hang as well.
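Why the modified patch hangs can be illustrated with a small user-space model. Nothing here is kernel code: irqs_enabled, model_unlock(), model_unlock_irq(), device_tick() and wait_for_reset() are all hypothetical, and the bounded tick count exists only so that the model terminates. The single property taken from the discussion is that the reset is completed from an interrupt handler.

```c
#include <stdbool.h>

static bool irqs_enabled;    /* models the CPU's interrupt-enable state */
static bool reset_done;      /* set only by the modelled interrupt handler */

/* Models spin_unlock_irq(): drops the lock and re-enables interrupts. */
static void model_unlock_irq(void)
{
    irqs_enabled = true;
}

/* Models spin_unlock(): drops the lock, interrupt state is unchanged. */
static void model_unlock(void)
{
}

/* One step of the adapter: its completion interrupt is delivered only
 * if interrupts are enabled. */
static void device_tick(void)
{
    if (irqs_enabled)
        reset_done = true;   /* the interrupt handler ran */
}

/* Wait for the reset, giving up after max_ticks so the model terminates.
 * Returns true if the reset completed. */
static bool wait_for_reset(bool crashdump, int max_ticks)
{
    irqs_enabled = false;    /* lock was taken with interrupts disabled */
    reset_done = false;

    if (crashdump)
        model_unlock();      /* modified patch: interrupts stay off */
    else
        model_unlock_irq();  /* normal path */

    for (int i = 0; i < max_ticks && !reset_done; i++)
        device_tick();
    return reset_done;
}
```

In the model, wait_for_reset(true, n) never completes for any n, which corresponds to the hang described in the reply above; a real wait would have no tick limit.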
I'm afraid that allowing interrupts during the diskdump process increases the
chance of failure, because it also lets in other, unexpected interrupts. Is there
any way to detect completion without allowing interrupts?
Please make sure your next comment in this BZ is public. If the
comment is private, I cannot read it.
Nobuhiro, here is the customer's response to your question:
I agree that enabling interrupts during diskdump is not the right thing to do.
It would be possible to make ipr's adapter reset routine work without needing
interrupts, but to do so would be complex and, I think, unnecessary.
The problem is not so much just providing a way to detect the completion of the
adapter reset with interrupts disabled. The problem is that the process of
resetting an ipr adapter such that it can accept new commands is a multi-stage
process. At a high level, what ipr needs to do is to run BIST on the card, then
wait for 2 minutes. During this time, if ipr were to attempt to talk to the
adapter, it would result in an EEH error and the reset would fail. It must then
write to a register and wait for an interrupt to occur. This interrupt could
take as long as 5 minutes to occur (usually it takes around 30 seconds or less).
Then it needs to send several commands to the adapter, all of which are timed
and require interrupts in the current implementation. To implement what you are
proposing would require someone calling ipr_poll repeatedly to make forward
progress on the reset job and also would require a way for ipr to time events
which could take as long as 5 minutes.
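To make the cost of a polled reset concrete, here is a rough model of the stages listed above, driven by repeated polling and a caller-supplied clock. The stage names, struct reset_job, reset_poll() and the irq_pending argument are hypothetical and do not come from the ipr driver; only the 2-minute wait and the 5-minute upper bound are taken from the description. The final command phase is collapsed into a single step.

```c
#include <stdbool.h>

enum reset_stage {
    RUN_BIST, WAIT_AFTER_BIST, WAIT_FOR_IRQ, SEND_COMMANDS,
    RESET_DONE, RESET_FAILED
};

struct reset_job {
    enum reset_stage stage;
    long deadline;                  /* seconds, on the caller's clock */
};

#define BIST_WAIT_SECS   (2 * 60)   /* adapter must not be touched */
#define IRQ_TIMEOUT_SECS (5 * 60)   /* worst case for the interrupt */

/* Advance the job by at most one stage. The caller supplies the current
 * time and whether the adapter has signalled, because neither timer nor
 * device interrupts are available while dumping. */
static enum reset_stage reset_poll(struct reset_job *job, long now,
                                   bool irq_pending)
{
    switch (job->stage) {
    case RUN_BIST:
        job->deadline = now + BIST_WAIT_SECS;
        job->stage = WAIT_AFTER_BIST;
        break;
    case WAIT_AFTER_BIST:
        if (now >= job->deadline) {
            /* the register write would happen here */
            job->deadline = now + IRQ_TIMEOUT_SECS;
            job->stage = WAIT_FOR_IRQ;
        }
        break;
    case WAIT_FOR_IRQ:
        if (irq_pending)
            job->stage = SEND_COMMANDS;
        else if (now >= job->deadline)
            job->stage = RESET_FAILED;
        break;
    case SEND_COMMANDS:
        job->stage = RESET_DONE;    /* several timed commands in reality */
        break;
    case RESET_DONE:
    case RESET_FAILED:
        break;
    }
    return job->stage;
}
```

Even in this simplified form, something has to keep calling reset_poll() and provide a trustworthy clock for several minutes, which is the complexity the comment above argues against.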
There is no reason for diskdump to reset the ipr adapter before executing a
dump. There should be no need to quiesce the host before the dump as other
commands that may be completing should have no effect on the dump. If there is a
desire to prevent other scsi ops that are in progress at the time the dump is
started from having their done function called, it would be much
simpler to accomplish this. To make ipr's reset job work without interrupts and
without timer interrupts is complex and adds lots of potential for deadlock if
not properly implemented, which is why the initial patch was suggested, which
essentially no-ops the host reset when in diskdump.
In my tests, diskdump through ipr worked with the patch, but did not work
consistently without the patch. Brian says modifying the driver to do a reset
with interrupts disabled is not trivial. Besides, he also says it's unnecessary
to reset the adapter before the dump. Since he is the ipr driver maintainer, I
give his opinion added weight.
> There is no reason for diskdump to reset the ipr adapter before executing a
> dump. There should be no need to quiesce the host before the dump as other
> commands that may be completing should have no effect on the dump. If there is a
> desire to prevent other scsi ops that are in progress at the time the dump is
> started from having their done function called, it would be much
> simpler to accomplish this.
I'm not sure that the ipr adapter never needs a host reset. But assuming that is
correct, the ipr driver can simply return SUCCESS from its host reset
handler when crashdump_mode() is set, and it can do whatever is needed in its
quiesce handler to dispose of the scsi commands in progress. IMHO this is more
robust than allowing interrupts.
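One way to picture the quiesce idea is the sketch below. It is a user-space model under stated assumptions: struct cmd, the pending table, quiesce_pending() and complete_cmd() are hypothetical and are not the diskdump or ipr interfaces. It only shows how commands that were in flight when the dump started could be detached so that their done function is never called.

```c
#include <stddef.h>

#define MAX_CMDS 4

struct cmd {
    int in_flight;
    void (*done)(struct cmd *);   /* completion callback */
};

static struct cmd pending[MAX_CMDS];
static int done_calls;            /* how many callbacks actually ran */

static void count_done(struct cmd *c)
{
    (void)c;
    done_calls++;
}

/* Hypothetical quiesce step: detach the callbacks of everything that was
 * in flight when the dump started. Returns the number of commands dropped. */
static int quiesce_pending(void)
{
    int dropped = 0;

    for (size_t i = 0; i < MAX_CMDS; i++) {
        if (pending[i].in_flight) {
            pending[i].done = NULL;
            pending[i].in_flight = 0;
            dropped++;
        }
    }
    return dropped;
}

/* Completion path: a command whose callback was detached is ignored. */
static void complete_cmd(struct cmd *c)
{
    if (c->done)
        c->done(c);
    c->in_flight = 0;
}
```

Commands issued by the dump itself after quiesce_pending() still complete normally, because only the callbacks present at quiesce time are cleared.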
> Are you expecting a new patch from IBM that incorporates the suggestions in
Yes, I'm expecting a new patch. If a host reset before diskdump is useless for ipr,
as you said, I recommend skipping the host reset. I think that makes diskdump
on ipr more robust because it does not open a window for interrupts.
A revised patch, based on the patch in comment #1, was posted. It got one ACK.
Committed in -22.25.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.