Red Hat Bugzilla – Bug 368981
kexec/kdump doesn't work on NFS root on QS21
Last modified: 2010-03-14 17:28:49 EDT
Description of problem:
When trying to test the fix in bug 313731, I tried to execute kexec/kdump on
a QS21 configured with NFS root (QS21 has no local storage).
- when running mkdumprd, the handle_netdev function does not exist (bug 368941)
- fixing above produces a working 'service kdump start', but dump is
- also with 'enforcing=0' kernel parameter
$ cat /proc/cmdline:
$ touch /etc/kdump.conf
$ service kdump restart
$ echo c > /proc/sysrq-trigger
- kernel starts to boot, then SOL console drops
- last thing you see is 'md: bitmap version 4.39'
- my assumption is that the booted kdump kernel somehow dies after that
- soon after (<1 minute), the tftp server sees a request for a boot
image from the QS21 and the QS21 "comes back" but is now booting the
- there is no crash in /var/crash
- after reconnecting with 'console', the system eventually comes back, but SOL is
messed up, and generally unusable. I have to power off the system,
detach from the bladecenter and power it back on to get the SOL back.
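Before triggering the crash, it can help to confirm that the running kernel was actually booted with a crashkernel= reservation, since kdump cannot load a capture kernel without one. The following is only a minimal sketch, assuming a POSIX shell; the helper name has_crashkernel is hypothetical and is not part of kexec-tools.

```shell
# Hypothetical helper (not part of kexec-tools): succeed if a kernel
# command line contains a crashkernel= parameter, and print its value.
has_crashkernel() {
    # $1: a kernel command line string, e.g. the contents of /proc/cmdline
    case " $1 " in
        *" crashkernel="*)
            rest=${1##*crashkernel=}      # text after the last "crashkernel="
            printf '%s\n' "${rest%% *}"   # cut at the next space
            return 0
            ;;
    esac
    return 1
}
```

On a live system this could be called as has_crashkernel "$(cat /proc/cmdline)"; a non-zero exit status would mean no reservation was requested at boot.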
contents of /etc/kdump.conf?
Can you provide a log of what the serial console did manage to capture before it
I expect what happened is, due to bz 313731 mkdumprd got rather confused and
produced a bad initramfs for kdump. Given the error message above, it likely
thinks that you are using a software RAID setup of some sort (the md utility),
at which point it fails setup and reboots the system back to the original
kernel. My guess is this is a duplicate of bz 368941. I'll leave it open until
No idea why the SOL would drop during reboot. Isn't the SOL run independent of
the system in question? I thought crashes/reboots weren't supposed to affect
the management interfaces.
Created attachment 289959 [details]
qs21-kdump-53.el5.log: log from 2.6.18-53.el5 with modified mkdumprd
The above is a console log of a kdump attempt on qs21 running 2.6.18-53.el5.
The suggested patch from 368941 (attachment 289946 [details]) is applied to mkdumprd.
Created attachment 289961 [details]
qs21-kdump-58.el5.rhel5u2.sm12.log : same as above with sm12 kernel
this is the log running the sm12 kernel (my development kernel) from
http://people.redhat.com/smoser/rhel5u2/sm12 . It contains all Cell
related fixes for RHEL5u2 (among other things). There is no real
difference in the log other than the kernel used.
hmm, ok, this may not be a dupe after all. Judging by those logs, we either:
1) may not be getting into the initrd at all (i.e. hanging prior to
loading/running /init in the initramfs)
2) Somehow not getting messages to the console properly, even though we are
functioning properly otherwise.
Scott, did you say I could get access to this machine to test on? It would
probably be easiest if I could just have direct access to tinker for a bit, if
that's possible. Thanks!
(In reply to comment #5)
> Scott, did you say I could get access to this machine to test on? It would
> probabaly be easiest if I could just have direct access to tinker for a bit, if
> thats possible. Thanks!
I've forwarded you info.
That's right, I remember now. Thanks!
FWIW, it looks from my tinkering like we're not getting into the initramfs at all
yet on this system. Depending on the iteration, we either jump back to the BIOS
halfway through kernel init, or we try to access the initramfs but seem to fail
the sys_access call in init().
Blocking this on 313731.
as per conversation with scott, I'm moving this bug to be dependent on the
correct kexec/cell bug.
thanks Neil, adding to RHEL5.2 release notes under "known issues":
Executing kdump on a QS21 configured with NFS root will fail. To avoid this,
specify an NFS dump target in /etc/kdump.conf.
please advise if any revisions are required.
Created attachment 297165 [details]
netdump-log for RHEL5.2-Beta1 on QS21
I have tried to verify kdump support for RHEL5.2-Beta1(2.6.18-84.el5) on QS21.
I found that the secondary kernel has the same problem booting on QS21 diskless
I performed the following steps:
- install RHEL5.2-Beta1 on QS21(2.6.18-84.el5)
- Reboot with crashkernel to kernel command line (boot net
- set up kdump to dump to nfs mount point:
echo "net your.host.here:/your/exported/dir" >> /etc/kdump.conf
- service kdump restart
- echo 'c' > /proc/sysrq-trigger
The secondary kernel is loaded and starts booting, then the system reboots.
I found /var/crash is empty.
*Attaching the log.
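Since the steps above depend on /etc/kdump.conf actually containing a "net" line, a quick check before restarting the service may save a wasted crash cycle. This is a rough sketch, assuming a POSIX shell with awk available; the helper name kdump_net_target is hypothetical, and it only looks at the first uncommented "net" directive.

```shell
# Hypothetical helper: print the first "net" dump target found in a
# kdump.conf-style file, or fail if there is none.
kdump_net_target() {
    # $1: path to the config file (normally /etc/kdump.conf)
    # Lines starting with "#" never match, because their first field
    # is "#..." rather than the bare word "net".
    awk '$1 == "net" { print $2; found = 1; exit }
         END { exit !found }' "$1"
}
```

For example, kdump_net_target /etc/kdump.conf would print your.host.here:/your/exported/dir for the configuration shown in the steps above, and return a non-zero status if no network target is configured.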
I'm closing this as a dupe of bz 368941, as they're both tracking the same
issue, and the other bz has an additional patch in it already to clean some
other cruft up.
*** This bug has been marked as a duplicate of 368941 ***
minor release note revision as per BZ#438030:
Executing kdump on an IBM Bladecenter QS21 or QS22 configured with NFS root will
fail. To avoid this, specify an NFS dump target in /etc/kdump.conf.
please advise if any further revisions are required. thanks!
But it didn't work for me, as I said in comment #19.
What you encountered in comment 19 was a different problem, one which IBM is
investigating. The RHEL5 kernel was booting as of kernel release -65.el5, but
stopped again sometime between -65.el5 and -84.el5. IIRC IBM is bisecting to
determine the release in which it initially (re)-broke. If you try to boot with
kernel -65.el5 and use the config suggested by Don's release note, then all
should work quite well.
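To see where a given test kernel falls relative to the builds mentioned above, the build number can be pulled out of the release string. The sketch below assumes release strings of the form 2.6.18-NN.el5 (as printed by uname -r on these systems); the helper names el5_build and classify_build are hypothetical, and the labels only restate what this thread reports, not any verified bisect result.

```shell
# Hypothetical helper: print the build number from a RHEL5-style kernel
# release string such as "2.6.18-65.el5".
el5_build() {
    rel=${1#*-}                  # drop everything up to the first "-"
    printf '%s\n' "${rel%%.*}"   # drop everything from the first "."
}

# Hypothetical helper: label a build relative to the -65.el5 and -84.el5
# builds discussed in this bug.
classify_build() {
    n=$(el5_build "$1")
    if [ "$n" -eq 65 ]; then
        echo "reported-good"
    elif [ "$n" -eq 84 ]; then
        echo "reported-bad"
    elif [ "$n" -gt 65 ] && [ "$n" -lt 84 ]; then
        echo "bisect-candidate"
    else
        echo "outside-range"
    fi
}
```

On a running system, classify_build "$(uname -r)" would give a quick indication of whether the kernel is one of the two builds named here or somewhere in the range still being bisected.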
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.
a mockup of the RHEL5.2 release notes can be viewed at the following link:
please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
Tracking this bug for the Red Hat Enterprise Linux 5.3 Release Notes.
This Release Note is currently located in the Known Issues section.
Release note added. If any revisions are required, please set the
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.