Bug 457904

Summary: Confusion about Default Action
Product: Red Hat Enterprise Linux 5
Reporter: Qian Cai <qcai>
Component: kexec-tools
Assignee: Neil Horman <nhorman>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 5.2
CC: mgahagan
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-01-20 20:59:36 UTC
Attachments:
patch to ignore kdump.conf settings in kdump.init (flags: none)

Description Qian Cai 2008-08-05 10:06:27 UTC
Description of problem:
There are two problems with Kdump's default option. The first is that if I have a single-line Kdump configuration file like this,

 default halt

it gives a warning when starting the Kdump service,

Detected change(s) the following file(s):
  
  /etc/kdump.conf
Rebuilding /boot/initrd-2.6.18-92.el5kdump.img
Warning!  Lack of dump target specification means default option is ignored!
Starting kdump:[  OK  ]

However, the above warning is not accurate, because the box still halted after the vmcore was captured,

+ /usr/bin/logger -p info -t kdump 'saved a vmcore to /var/crash/2008-08-05-04:49'
+ return 0
+ run_kdump_post 0
+ kdumpsuccess=0
++ grep '^kdump_post' /etc/kdump.conf
++ cut '-d ' -f2
+ KDUMP_POST=
+ '[' -x '' ']'
+ do_final_action
++ grep default /etc/kdump.conf
++ grep -vm1 '^#'
++ cut '-d ' -f2
+ FINAL_ACTION=halt
+ [[ halt != \h\a\l\t ]]
+ halt
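
The lookup visible in the trace can be reproduced outside the init script. The following is a minimal sketch, assuming GNU grep; final_action is a hypothetical helper name and the temporary file stands in for /etc/kdump.conf, while the pipeline itself is copied from the trace above.

```shell
#!/bin/bash
# Sketch: reproduce the 'default' lookup shown in the trace.
# final_action is a hypothetical helper name, not part of kexec-tools.
final_action() {
    # $1: path to a kdump.conf-style file
    grep default "$1" | grep -vm1 '^#' | cut -d ' ' -f2
}

# Stand-in for /etc/kdump.conf with the single-line configuration.
conf=$(mktemp)
printf '%s\n' 'default halt' > "$conf"
echo "FINAL_ACTION=$(final_action "$conf")"   # prints FINAL_ACTION=halt
rm -f "$conf"
```

The lookup never checks whether a dump target is configured, which is why the halt still happens despite the warning. Note also that `grep default` is unanchored, so any uncommented line containing the word "default" would be picked up.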

The second problem is that the explanation in kdump.conf about the default action is somewhat misleading,

# default <reboot | halt | shell>
#			- Action to preform instead of mounting root fs and
#			  running init process
#			  reboot: If the default action is reboot simply reboot
#				  the system and loose the core that you are
#				  trying to retrieve.
#			  halt:   If the default action is halt, then simply
#				  halt the system after attempting to capture
#				  a vmcore, regardless of success or failure.
#			  shell:  If the default action is shell, then drop to
#				  an msh session inside the initramfs from
#				  where you can try to record the core manually.
#				  Exiting this shell reboots the system.
#			  NOTE: If no default action is specified, the initramfs
#				will mount the root file system and run init.


"Action to preform instead of mounting root fs and running init process"
This is not right, because with the configuration file above, it still mounted the root fs and ran init to capture the vmcore.

"loose the core that you are trying to retrieve."
This probably means 'lose'.

"NOTE: If no default action is specified, the initramfs will mount the root file system and run init."
I don't think this is correct, because if I have a single-line configuration file like 'ext3 /dev/sda1', it will reboot after the vmcore is captured, and won't run init.

Version-Release number of selected component (if applicable):
kexec-tools-1.102pre-21.el5

How reproducible:
always

Comment 1 Neil Horman 2008-08-05 11:05:42 UTC
It's not misleading, it's just not working as described.  The default setting should have been ignored, because kdump.conf defines how the initramfs is laid out; instead, the root file system should be mounted and /sbin/init run (since no dump target was specified).

btw, what version of kdump are you using?  Your kdump output matches neither what's in cvs nor kexec-tools-1.102pre-21.el5 (the version shipping with RHEL5.2).

Comment 2 Qian Cai 2008-08-05 11:13:51 UTC
It was kexec-tools-1.102pre-21.el5.

Comment 3 Neil Horman 2008-08-05 11:28:22 UTC
Dang it.  Looks like someone started snooping in /etc/kdump.conf from kdump.init. We shouldn't be doing that; we should just be rebooting.  I'll fix it shortly.

Comment 4 Neil Horman 2008-08-05 11:36:17 UTC
Created attachment 313442 [details]
patch to ignore kdump.conf settings in kdump.init

Here, this should bring the docs & function into alignment.

Comment 5 Neil Horman 2008-08-05 11:43:10 UTC
Give that a try if you would.  Thanks!

Comment 6 Qian Cai 2008-08-06 04:25:17 UTC
The patch fixed the first problem. What do you think of the second problem regarding documentation?

Comment 7 Neil Horman 2008-08-06 11:42:24 UTC
The documentation should now be correct, in that if you configure a dump target, the default action (if configured) will be honored in place of blindly mounting the root fs and running init.  I'll try to clarify that though, when I check this other fix in.

Comment 8 RHEL Program Management 2008-08-06 11:50:44 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 9 Neil Horman 2008-08-07 12:29:33 UTC
Fixed in -31.el5.  Thanks!

Comment 11 Qian Cai 2008-09-26 08:18:08 UTC
Neil, it looks like the default action has never worked as expected since RHEL 5.2.

- config file:
ext3 /dev/mapper/VolGroup00-LogVol00
default shell

- run SysRq-C:
...
Making device-mapper control node
Scanning logical volumes
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
Activating logical volumes
  2 logical volume(s) in volume group "VolGroup00" now active
hwclock: Could not access RTC: No such file or directory
Saving to the local filesystem /dev/mapper/VolGroup00-LogVol00
e2fsck 1.38 (30-Jun-2005)
KDUMP-TEST: recovering journal
kjournald starting.  Commit interval 5 seconds
KDUMP-TEST: clean, 67542/17382816 files, 2009048/17375232 blocks
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Saving core complete
md: stopping all md devices.
Completed flushing cache on controller 0
Restarting system.
...

- snippet of init from the initramfs:
...
mount -t ext3 $DUMPDEV /mnt
if [ $? == 0 ]
then
  mkdir -p /mnt//var/crash/127.0.0.1-$DATE
  VMCORE=/mnt//var/crash/127.0.0.1-$DATE/vmcore
  export VMCORE
  cp /proc/vmcore $VMCORE-incomplete >/dev/null
  exitcode=$?
  if [ $exitcode == 0 ]
  then
      mv $VMCORE-incomplete $VMCORE
      echo "Saving core complete"
  fi
fi
umount /mnt
[ $exitcode == 0 ] && reboot -f
echo dropping to initramfs shell
echo exiting this shell will reboot your system
/bin/msh
reboot -f
...

When the vmcore has been saved successfully, "exitcode" becomes 0 and the system reboots. So the default action is only invoked when we fail to save the vmcore. Is that expected?

Comment 12 Neil Horman 2008-09-26 13:09:04 UTC
Yes, that is exactly what is expected, and it should be how it has always worked.  It was definitely the intention for default_action to be invoked when core saving fails.  We always reboot once we have the core so the system gets back to a running state.  If you want to unconditionally drop to a shell after saving the core, use the kdump_post script directive.  Sorry for the confusion.

Comment 13 Qian Cai 2008-10-10 07:11:35 UTC
Neil, this causes further confusion: "default shell" is invoked only when saving the vmcore fails, but "default halt" takes place whatever happens.

...
[ $exitcode == 0 ] && halt -f
halt -f
...

Do you think this is expected? If so, I'll file an enhancement request to clarify this in the documentation.
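
The difference between the two generated script tails quoted in comments 11 and 13 can be laid out side by side. This is a minimal sketch, not the actual mkdumprd output: script_tail is a hypothetical name, and the real reboot, halt and msh commands are replaced by echo so that nothing is rebooted or halted.

```shell
#!/bin/bash
# Sketch only: models the control flow of the two init tails quoted in this
# thread. script_tail is a hypothetical name; real commands are replaced by
# echo so the sketch is safe to run.
script_tail() {
    # $1: configured default action (shell or halt)
    # $2: exit code of the attempt to copy /proc/vmcore
    local action=$1 exitcode=$2
    case "$action" in
        shell)
            # Tail: [ $exitcode == 0 ] && reboot -f ; then /bin/msh ; reboot -f
            if [ "$exitcode" -eq 0 ]; then echo "reboot"; else echo "shell"; fi
            ;;
        halt)
            # Tail: [ $exitcode == 0 ] && halt -f ; halt -f
            echo "halt"
            ;;
        *)
            echo "unknown"
            ;;
    esac
}

for action in shell halt; do
    for code in 0 1; do
        echo "default $action, exitcode $code -> $(script_tail "$action" "$code")"
    done
done
```

With "default shell" the outcome depends on whether the copy succeeded, whereas with "default halt" both branches end in a halt.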

Comment 14 Qian Cai 2008-10-10 08:10:07 UTC
It looks to me like a bug. From kexec-kdump-howto.txt,

Default action

By default, if a configured dump method fails, the kdump initrd falls back
to trying to dump to the local file system (i.e., into the file system(s)
you would have mounted under normal system operation). The system always
reboots following an attempted dump to your local file system, regardless
of success or failure.

However, for any of the advanced methods, if the dump fails, you can configure
the kdump initrd to skip trying to dump to the local file system, instead
immediately rebooting ('default reboot'), halting the system ('default halt')
or dropping you to a shell within the initrd ('default shell'), from which you
could try to capture the vmcore manually. Again, if the 'default' parameter is
unset, a local file system dump will be attempted, then the system will reboot.
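
The documented behavior in the howto excerpt above can be restated as a small decision function. This is only a sketch of what the text says should happen when an advanced dump method fails; on_dump_failure is a hypothetical name and is not part of kexec-tools.

```shell
#!/bin/bash
# Sketch only: restates the howto text as a decision function.
# on_dump_failure is a hypothetical name, not part of kexec-tools.
on_dump_failure() {
    # $1: value of the 'default' parameter, empty if unset
    case "${1:-}" in
        reboot) echo "reboot" ;;
        halt)   echo "halt" ;;
        shell)  echo "shell" ;;
        "")     echo "dump to local file system, then reboot" ;;
        *)      echo "unrecognized default action: $1" >&2; return 1 ;;
    esac
}

for action in reboot halt shell ""; do
    echo "default='${action}' -> $(on_dump_failure "$action")"
done
```

According to this text the default action applies only after a failed dump, which matches the "default shell" script tail in comment 11 but not the "default halt" tail in comment 13.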

Comment 15 Neil Horman 2008-10-10 12:43:23 UTC
I assume the output in comment 13 is the contents of the init script in the initramfs?  My guess there is that you specified your default action to be halt.  If so, that would be expected behavior.

Comment 18 errata-xmlrpc 2009-01-20 20:59:36 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0105.html