Description of problem:
The /etc/init.d/kdump script will silently fail if /sbin/kexec cannot load the
crash kernel because the command line string passed to the crash kernel is too long.
For example, the crash kernel string
"ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200n8 irqpoll
maxcpus=1 lpj=2999715 earlyprintk=serial,ttyS0,115200n8 memmap=exactmap
memmap=640K@0K memmap=5452K@16384K memmap=518180K@22476K elfcorehdr=540656K"
is too long, but the following is not
"ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 irqpoll
maxcpus=1 lpj=2999715 earlyprintk=ttyS0,115200 memmap=exactmap memmap=640K@0K
memmap=5452K@16384K memmap=518180K@22476K elfcorehdr=540656K memmap=412K#915904K"
It took executing the kexec command by hand without the "2> /dev/null" in the
script to see what the real problem was.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. You need a system like an ES7000 that will have a long string for the exact
memmap provided to the crash kernel.
2. Add the line "console=tty0 console=ttyS0,115200n8 selinux=0
earlyprintk=serial,ttyS0,115200n8" to /etc/sysconfig/kdump (or the GRUB boot
line which is what I was doing)
3. If you have an ES7000, you will need to use the parameter lpj=SOME_VALUE
where you can get SOME_VALUE from "dmesg | grep lpj | head -n1"
4. run /etc/init.d/kdump
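For step 3, a small sketch of pulling just the number out of the dmesg line. The sample line below is illustrative (its format is an assumption, not output captured from an ES7000), and extract_lpj is a hypothetical helper name. On a real system you would pipe dmesg into it instead of the sample.

```shell
#!/bin/sh
# extract_lpj: hypothetical helper; prints the first lpj=<number> value read on stdin.
extract_lpj() {
    sed -n 's/.*lpj=\([0-9][0-9]*\).*/\1/p' | head -n1
}

# Illustrative sample line (assumed format); use `dmesg | extract_lpj` on a real system.
sample='Calibrating delay using timer specific routine.. 5999.43 BogoMIPS (lpj=2999715)'
lpj=$(printf '%s\n' "$sample" | extract_lpj)
echo "lpj=$lpj"
```

The resulting value is what goes after "lpj=" on the crash kernel command line.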
/etc/init.d/kdump fails to load kdump and all I see in /var/log/messages is that
it failed to load the crash kernel.
Ideally, I would like to know why kexec choked. I can modify the script myself
to find the answer, but having a debug option in /etc/sysconfig/kdump that I can
tell a customer to change is preferable.
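A minimal sketch of what such a switch might look like. KDUMP_DEBUG is a hypothetical variable name chosen for illustration, load_crash_kernel is a hypothetical wrapper, and fake_kexec stands in for /sbin/kexec so the sketch runs on any machine. None of these names are taken from the attachments.

```shell
#!/bin/sh
# Hypothetical switch; in practice it would be set in /etc/sysconfig/kdump.
KDUMP_DEBUG=${KDUMP_DEBUG:-no}

# Stand-in for /sbin/kexec: fails and explains why on stderr.
fake_kexec() {
    echo "Command line overflow" >&2
    return 1
}

# Run the loader, discarding stderr unless debugging is requested.
load_crash_kernel() {
    if [ "$KDUMP_DEBUG" = "yes" ]; then
        "$@"
    else
        "$@" 2> /dev/null
    fi
}

KDUMP_DEBUG=no
quiet=$(load_crash_kernel fake_kexec 2>&1)
KDUMP_DEBUG=yes
verbose=$(load_crash_kernel fake_kexec 2>&1)

echo "debug off: '$quiet'"
echo "debug on:  '$verbose'"
```

With the switch off the reason for the failure is lost; with it on, the loader's own message reaches the caller.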
Created attachment 144491
This would be an example update to the sysconfig kdump config file
Created attachment 144492
update the init script for kdump to handle the added configuration variable
I tested passing such a long parameter list to kdump, but kdump throws an error
saying "Command line overflow" and exits. The kexec-tools level is
So can this bug be closed?
This bug still exists with kexec-tools-1.101-163.el5, so no, you may not close it.
Please re-read the bug. To make our lives simpler and to keep us on the same
page, I am attaching my /etc/sysconfig/kdump file for you to use. You may only
test this feature with /etc/init.d/kdump. You need to look at /var/log/messages
to see the error messages for the kdump script failure, not the command line.
DO NOT execute /sbin/kexec by hand.
The kexec tools work flawlessly. This problem pokes a hole in an
otherwise excellent integration of the kdump feature.
In the series of commands below, nowhere do I see a report of why the kdump
kernel failed to load. The customer needs to see why the failure occurred.
[root@localhost ~]# grep APPEND /etc/sysconfig/kdump
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 lpj=3001000i console=tty0
console=ttyS0,115200n8 earlyprintk=serial,ttyS0,115200n8,keep debug acpi=debug"
[root@localhost ~]# /etc/init.d/kdump start
Starting kdump: [FAILED]
[root@localhost ~]# tail -n 2 /var/log/messages
Jan 23 14:24:14 localhost kdump: kexec: failed to load kdump kernel
Jan 23 14:24:14 localhost kdump: failed to start up
Created attachment 146330
Please use this file for your /etc/sysconfig/kdump. The parameters are
arbitrary and there just to make sure that your system's command line will
overflow. The ES7000's long exactmap helps me see this problem more easily.
Okay, when I run 'init 3' to load the kdump kernel using the init scripts with a
very long parameter list, I get
Starting portmap: [ OK ]
Starting kdump: Command line overflow
Then I tried kexec-tools-1.101-163 on a POWER machine with a long parameter list.
The kdump init script does not report the reason for the kdump load failure, and
I need to check /var/log/messages.
So the problem still exists in the 163 level of kexec-tools.
Mohan, mind sharing your /etc/init.d/kdump or at least doing a diff of it to see
what went wrong between now and then?
If the scripts are the same, then maybe /sbin/kexec was printing the error
message on STDOUT and not STDERR. In the original bug report, I noted that the
script gets rid of any /sbin/kexec error messages by redirecting STDERR to
/dev/null ("2> /dev/null").
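The STDOUT versus STDERR distinction can be checked with a short sketch. The two functions below are hypothetical stand-ins for a failing loader, not /sbin/kexec itself; they differ only in which stream carries the message.

```shell
#!/bin/sh
# Two stand-ins for a failing loader: one reports on stderr, one on stdout.
fail_on_stderr() { echo "Command line overflow" >&2; return 1; }
fail_on_stdout() { echo "Command line overflow"; return 1; }

# Same redirection as in the init script: only stderr is discarded.
seen_stderr=$(fail_on_stderr 2> /dev/null)
seen_stdout=$(fail_on_stdout 2> /dev/null)

echo "message on stderr -> visible: '$seen_stderr'"
echo "message on stdout -> visible: '$seen_stdout'"
```

A message written to STDOUT survives "2> /dev/null", while one written to STDERR does not, which would explain seeing the error with one build and not another.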
Ok, it does appear the addition of the /dev/null redirection on kexec regressed
this. I don't think we need a whole separate log facility to catch this though.
The following patch should work just fine. Please test and confirm.
Created attachment 146345
patch to log output of kexec on error
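The attached patch is not reproduced here. As a rough sketch of the general idea only, the loader's output can be captured instead of discarded and reported when the load fails. run_and_report and fake_kexec are hypothetical names, and fake_kexec stands in for /sbin/kexec so the sketch runs anywhere.

```shell
#!/bin/sh
# Stand-in for /sbin/kexec so the sketch runs anywhere.
fake_kexec() {
    echo "Command line overflow" >&2
    return 1
}

# Run a command, keep its combined stdout/stderr, and print it only on failure.
# The exit status of the command is preserved.
run_and_report() {
    output=$("$@" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        printf 'kexec: %s\n' "$output"
    fi
    return "$status"
}

report=$(run_and_report fake_kexec)
rc=$?
echo "exit status: $rc"
echo "$report"
```

In an init script the printf could be replaced by a call to logger(1), for example "logger -t kdump", so the reason lands in /var/log/messages next to the existing failure line.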
Neil, you're right, we don't need the added logging complexity. Thanks for the
simpler fix. The change works; I can see the error in /var/log/messages.
I assume this change will make it into 5.1, correct?
Retargeting for 5.1. Also throwing back into Assigned, as this patch hasn't
been incorporated into a package build.
fixed in -164.el5. Thanks!
kexec-tools-1.101-164.el5 included in 20070208.0 trees.