Red Hat Bugzilla – Bug 457904
Confusion about Default Action
Last modified: 2009-01-20 15:59:36 EST
Description of problem:
There are two problems with kdump's default option. The first is that if I have a single-line kdump configuration file like this,
it gives a warning when starting the kdump service,
Detected change(s) the following file(s):
Warning! Lack of dump target specification means default option is ignored!
Starting kdump:[ OK ]
However, the above warning was not true, because it still halted the box after the vmcore was captured,
+ /usr/bin/logger -p info -t kdump 'saved a vmcore to /var/crash/2008-08-05-04:49'
+ return 0
+ run_kdump_post 0
++ grep '^kdump_post' /etc/kdump.conf
++ cut '-d ' -f2
+ '[' -x '' ']'
++ grep default /etc/kdump.conf
++ grep -vm1 '^#'
++ cut '-d ' -f2
+ [[ halt != \h\a\l\t ]]
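The lookup visible in the trace above can be reproduced in isolation. This is a minimal sketch: the pipeline is copied from the xtrace output, while the helper name get_default_action and the throwaway demo file are assumptions made for illustration and are not part of kdump.init.

```shell
#!/bin/sh
# Sketch: extract the configured default action the way the trace does.
# get_default_action is a hypothetical helper name; only the pipeline
# (grep | grep -vm1 | cut) comes from the xtrace output.
get_default_action() {
    conf="${1:-/etc/kdump.conf}"
    grep default "$conf" | grep -vm1 '^#' | cut -d' ' -f2
}

# Demo against a throwaway file instead of the real /etc/kdump.conf.
demo_conf=$(mktemp)
printf '%s\n' '# default <reboot | halt | shell>' 'default halt' > "$demo_conf"
action=$(get_default_action "$demo_conf")
echo "default action: $action"
rm -f "$demo_conf"
```

Note that `grep default` matches any uncommented line containing the word "default", not only lines that begin with the directive, so this lookup is looser than the `'^kdump_post'` pattern used a few lines earlier in the same trace.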
The second problem is that the explanation in kdump.conf about the default action is somewhat misleading,
# default <reboot | halt | shell>
# - Action to preform instead of mounting root fs and
# running init process
# reboot: If the default action is reboot simply reboot
# the system and loose the core that you are
# trying to retrieve.
# halt: If the default action is halt, then simply
# halt the system after attempting to capture
# a vmcore, regardless of success or failure.
# shell: If the default action is shell, then drop to
# an msh session inside the initramfs from
# where you can try to record the core manually.
# Exiting this shell reboots the system.
# NOTE: If no default action is specified, the initramfs
# will mount the root file system and run init.
"Action to preform instead of mounting root fs and running init process"
This is not right, because with that configuration file it still mounted the root fs and ran init to capture the vmcore.
"loose the core that you are trying to retrieve."
Probably, it means 'lose'.
"NOTE: If no default action is specified, the initramfs will mount the root file system and run init."
I don't think this is correct, because if I have a single-line configuration file like 'ext3 /dev/sda1', it reboots after the vmcore is captured and does not run init.
Version-Release number of selected component (if applicable):
It's not misleading, it's just not working as described. The default option should have been ignored, because kdump.conf defines how the initramfs is laid out; instead, the root file system should be mounted and /sbin/init run (since no dump target was specified).
Btw, what version of kdump are you using? Your kdump output matches neither what's in CVS nor kexec-tools-1.102pre-21.el5 (the version shipping with RHEL 5.2).
It was kexec-tools-1.102pre-21.el5.
Dang it. Looks like someone started snooping in /etc/kdump.conf from kdump.init. We shouldn't be doing that; we should just be rebooting. I'll fix it shortly.
Created attachment 313442
patch to ignore kdump.conf settings in kdump.init
Here, this should bring the docs and the behavior into alignment.
Give that a try if you would. Thanks!
The patch fixed the first problem. What do you think of the second problem regarding documentation?
The documentation should now be correct, in that if you configure a dump target, the default action (if configured) will be honored in place of blindly mounting the root fs and running init. I'll try to clarify that, though, when I check this other fix in.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
Fixed in -31.el5. Thanks!
Neil, it looks like the default action has never worked as expected since RHEL 5.2.
- config file:
- run SysRq-C:
Making device-mapper control node
Scanning logical volumes
Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
Activating logical volumes
2 logical volume(s) in volume group "VolGroup00" now active
hwclock: Could not access RTC: No such file or directory
Saving to the local filesystem /dev/mapper/VolGroup00-LogVol00
e2fsck 1.38 (30-Jun-2005)
KDUMP-TEST: recovering journal
kjournald starting. Commit interval 5 seconds
KDUMP-TEST: clean, 67542/17382816 files, 2009048/17375232 blocks
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Saving core complete
md: stopping all md devices.
Completed flushing cache on controller 0
- init from the initramfs snip:
mount -t ext3 $DUMPDEV /mnt
if [ $? == 0 ]
mkdir -p /mnt//var/crash/127.0.0.1-$DATE
cp /proc/vmcore $VMCORE-incomplete >/dev/null
if [ $exitcode == 0 ]
mv $VMCORE-incomplete $VMCORE
echo "Saving core complete"
[ $exitcode == 0 ] && reboot -f
echo dropping to initramfs shell
echo exiting this shell will reboot your system
When the vmcore has been saved successfully, "exitcode" becomes 0 and the system reboots. So the default action only comes into play when saving the vmcore fails. Is that expected?
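The "copy to an -incomplete name, rename only on success" step in the snippet above can be tried safely outside the initramfs. This is a rough sketch under stated assumptions: save_core, its two arguments and the temporary directory are hypothetical stand-ins for /proc/vmcore and $VMCORE, and nothing here reboots the machine.

```shell
#!/bin/sh
# Sketch of the pattern from the initramfs snippet: the core only gets its
# final name if the copy succeeded. save_core is a hypothetical helper.
save_core() {
    src="$1"      # stands in for /proc/vmcore
    dest="$2"     # stands in for $VMCORE
    cp "$src" "$dest-incomplete" >/dev/null
    exitcode=$?
    if [ "$exitcode" -eq 0 ]; then
        mv "$dest-incomplete" "$dest"
        echo "Saving core complete"
    fi
    # The caller decides what to do next based on this status.
    return "$exitcode"
}

# Demo with throwaway files.
workdir=$(mktemp -d)
echo "fake core" > "$workdir/src"
save_core "$workdir/src" "$workdir/vmcore"
echo "exitcode: $?"
rm -rf "$workdir"
```

A non-zero return from save_core corresponds to the case in which, per the snippet, the script does not reboot and instead falls through to the shell messages.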
Yes, that is exactly what is expected, and it should be how it has always worked. It was definitely the intention for default_action to be invoked when core saving fails. We always reboot once we have the core so the system gets back to a running state. If you want to unconditionally drop to a shell after saving the core, use the kdump_post script directive. Sorry for the confusion.
Neil, this will cause further confusion: "default: shell" only takes effect when saving the vmcore fails, but "default: halt" takes effect whatever happens.
[ $exitcode == 0 ] && halt -f
Do you think that is expected? If so, I'll file an enhancement request to clarify this in the documentation.
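The behavior described in the last few comments can be summarized as a small decision function. This is only a sketch of what the thread reports, not the real initramfs init script: choose_action is a hypothetical name, it prints the action instead of performing it, and the "unset" branch follows the NOTE quoted from kdump.conf earlier in this report.

```shell
#!/bin/sh
# Sketch only: prints the action instead of performing it.
# choose_action is a hypothetical helper, not part of kexec-tools.
#   $1 = exit code of the vmcore save
#   $2 = configured default action ("" if unset)
choose_action() {
    exitcode="$1"
    default_action="$2"
    # As observed above, 'halt' applies whether or not the save succeeded.
    if [ "$default_action" = "halt" ]; then
        echo halt
        return
    fi
    # For any other setting, a successful save ends in a reboot.
    if [ "$exitcode" -eq 0 ]; then
        echo reboot
        return
    fi
    # The save failed: fall back to the configured default action.
    case "$default_action" in
        reboot) echo reboot ;;
        shell)  echo shell ;;
        *)      echo "mount root fs and run init" ;;
    esac
}

choose_action 0 shell   # prints: reboot
choose_action 1 shell   # prints: shell
choose_action 0 halt    # prints: halt
```

Written this way, the asymmetry is easy to see: halt is the only setting checked before the exit code.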
It looks to me like a bug. From kexec-kdump-howto.txt,
By default, if a configured dump method fails, the kdump initrd falls back
to trying to dump to the local file system (i.e., into the file system(s)
you would have mounted under normal system operation). The system always
reboots following an attempted dump to your local file system, regardless
of success or failure.
However, for any of the advanced methods, if the dump fails, you can configure
the kdump initrd to skip trying to dump to the local file system, instead
immediately rebooting ('default reboot'), halting the system ('default halt')
or dropping you to a shell within the initrd ('default shell'), from which you
could try to capture the vmcore manually. Again, if the 'default' parameter is
unset, a local file system dump will be attempted, then the system will reboot.
I assume the output in comment 13 is the contents of the init script in the initramfs? My guess there is that you specified your default action to be halt. If so, that would be expected behavior.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.