Bug 461000
Summary: | two dumps are captured when default action is set as halt.
---|---
Product: | Red Hat Enterprise Linux 5
Component: | kexec-tools
Version: | 5.2
Status: | CLOSED DUPLICATE
Severity: | medium
Priority: | medium
Target Milestone: | beta
Hardware: | All
OS: | Linux
Reporter: | Hiromitsu KIKUCHI <kikuchi.hiromitsu>
Assignee: | Neil Horman <nhorman>
QA Contact: | Martin Jenner <mjenner>
CC: | benl, hfuchi, lwang, mgahagan, nhorman, qcai, varekova
Doc Type: | Bug Fix
Last Closed: | 2008-10-13 10:42:03 UTC

Description
Hiromitsu KIKUCHI
2008-09-03 09:00:49 UTC
Can you please provide a capture of the serial console on this system taken during the dump capture process? I'd like to examine the error and subsequent behavior of kdump in these conditions. Thank you.

Created attachment 315720 [details]
the console log of the 2nd kernel.

Thank you for your reply. I attached the console log of the 2nd kernel.

BTW,
> (7) dumped to the non-specified partition (= /dev/sdaXX:/crash, /dev/sdaXX is root filesystem.)
/dev/sdaXX:/crash should have been /dev/sdaXX:/var/crash.
(the dumpdir is specified with coredir="/var/crash/<date>" in the kdump service script.)

This time, I tried to reproduce the problem with the following config:
-------------------
core_collector makedumpfile -c -d 1
ext3 /dev/sda12 # dump partition
path /crash
default halt
-------------------

I think the system attempted to halt with the following scripts, but it could not halt.

the init script in initrd.kdump:
--------------------------------
... snip ...
[ $exitcode == 0 ] && halt -f
halt -f
echo Creating root device.
...
... snip ...
--------------------------------

After dump and system reboot, I made sure that the two dumps were captured as follows.

[root@kikutiplex755]~# mount -t ext3 /dev/sda12 /mnt/test
[root@kikutiplex755]~#
[root@kikutiplex755]~# ls -l /mnt/test/crash/
drwxr-xr-x 2 root root 4096 Sep 4 11:09 127.0.0.1-2008-09-04-11:08:51
[root@kikutiplex755]~# ls -l /mnt/test/crash/127.0.0.1-2008-09-04-11:08:51
-rw------- 1 root root 282634982 Sep 4 11:09 vmcore
[root@kikutiplex755]~#
[root@kikutiplex755]~# ls -l /var/crash
drwxr-xr-x 2 root root 4096 Sep 4 11:10 2008-09-04-11:10
[root@kikutiplex755]~# ls -l /var/crash/2008-09-04-11:10
-r-------- 1 root root 4018638936 Sep 4 11:10 vmcore

regards,

Looks like I may need to update the shutdown utility in busybox. Out of curiosity, does the system reboot properly with one core if you specify the default action as nothing instead of halt?

Created attachment 315746 [details]
the 2nd kernel was not dropped to a shell.
Yes, it rebooted properly and the dump was captured once.
Should I report this as a busybox problem?
BTW:
When I specified the default action as nothing and the dump failed
(e.g. the size of the dump partition was not enough),
the 2nd kernel mounted the root filesystem, then it rebooted as <FINAL_ACTION>
in the kdump script.
(Thus, the 2nd kernel was not dropped to a shell. I attached this console log.)
This does not seem to be the documented behavior according to kexec-kdump-howto.txt.
best regards,
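
The failure case mentioned above (a dump partition that is too small) can be checked for before a capture is attempted. The following is only a minimal sketch, assuming a POSIX `df` and `awk` are available; the function names `free_kib` and `has_room` are hypothetical and are not part of kexec-tools or the kdump scripts.

```shell
#!/bin/sh
# Hypothetical helper names; not part of kexec-tools.

# Print the free space, in KiB, of the filesystem holding directory $1.
# "df -P" guarantees one record per filesystem; column 4 is "Available".
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if the filesystem holding $1 has at least $2 KiB free.
has_room() {
    [ "$(free_kib "$1")" -ge "$2" ]
}
```

For example, `has_room /var/crash 4194304 || echo "not enough space for a dump"` would warn when less than 4 GiB is free; the threshold here is an illustrative value, not one taken from this report.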
If you didn't erase your previous dumps, then you are likely out of space on your capture partition. You need to remove the previous captures. As for the default action, rebooting is the default; you need to specify shell as the default action if you want to drop to a shell. I'll re-assign this to busybox to have the halt command looked at.

It looks like the busybox halt command won't work from a Kdump initramfs. I confirmed it works from a shell. I added the following debug output in the initramfs,

ls -l `which halt`
echo "halt"
halt
echo "halt -n"
halt -n

The output is,

lrwxrwxrwx 1 root 0 7 Oct 10 03:31 /sbin/halt -> busybox
halt
halt -n

None of the above works. It can also be reproduced with the following Kdump configuration file,

ext3 LABEL=/boot
default halt

We use a small boot partition to simulate a failure to save the vmcore,

Scanning logical volumes
Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
Activating logical volumes
2 logical volume(s) in volume group "VolGroup00" now active
Saving to the local filesystem LABEL=/boot
e2fsck 1.38 (30-Jun-2005)
/boot: recovering journal
/boot: clean, 36/26104 files, 18134/104388 blocks
cp: Write Error: No space left on device
md: stopping all md devices.
System halted.
Creating root device.
Checking root filesystem.
fsck 1.38 (30-Jun-2005)
...
enter INIT
...

oops. fixing the needinfo flag. sorry for the confusion.

I doubt this is a busybox bug. It is a known issue,
https://bugzilla.redhat.com/show_bug.cgi?id=413921

Ivan said it is not a kernel bug either, so is it working as expected? If so, we'll probably need to modify kexec-tools to use the right command to halt the system.

If I modify /sbin/mkdumprd to replace "halt -f" with "poweroff -f", it can halt the system.

...
Saving to the local filesystem /dev/sda1
e2fsck 1.38 (30-Jun-2005)
/boot: recovering journal
/boot: clean, 42/26104 files, 27988/104388 blockkjournald starting. Commit interval 5 seconds
s
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
cp: Write Error: No space left on device
md: stopping all md devices.
Synchronizing SCSI cache for disk sda:
sd 0:0:0:0: [sda] Stopping disk
Power down.
acpi_power_off called

Ivan's comments simply don't make sense to me. Halt halts the system. It does not say it is halting the system and then return to the console prompt without actually halting the system. To do so is broken. I've re-opened Ivan's bug on the subject and am closing this as a dupe of that. Quite simply, halt has to do what halt says it will do.

*** This bug has been marked as a duplicate of bug 413921 ***
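
The second dump in this report happened because `halt -f` returned and the init script carried on to "Creating root device." A minimal sketch of a more defensive final step is shown below. It is not the fix that was applied to kexec-tools or busybox; the names `stop_system`, `sleep_forever` and `LAST_RESORT` are hypothetical, and the only assumption taken from the thread is that a stop command may return instead of stopping the machine.

```shell
#!/bin/sh
# Hypothetical sketch: the names below are illustrative, not taken from mkdumprd.

# Command run when every stop attempt has returned. By default it blocks
# forever; it is overridable so the sketch can be exercised safely.
: "${LAST_RESORT:=sleep_forever}"

sleep_forever() {
    while :; do sleep 60; done
}

# Run each stop command in turn. A stop command that works never returns,
# so reaching the second echo means it failed to stop the machine.
stop_system() {
    for stop_cmd in "$@"; do
        echo "trying: $stop_cmd"
        # Exit status is irrelevant: returning at all means we are still running.
        $stop_cmd || true
        echo "returned without stopping: $stop_cmd"
    done
    # Never fall through to the rest of the init script.
    $LAST_RESORT
}
```

In an initramfs this might be called as `stop_system "halt -f" "poweroff -f"`, so that a returning `halt -f` is followed by the `poweroff -f` that worked in the test above, and the script blocks rather than mounting the root filesystem and capturing a second dump.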