Bug 466436 - [5.3][RFE] Makedumpfile Error Messages
Summary: [5.3][RFE] Makedumpfile Error Messages
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kexec-tools
Version: 5.2
Hardware: All
OS: Linux
Target Milestone: rc
Assignee: Neil Horman
QA Contact: Red Hat Kernel QE team
Keywords: Reopened
Depends On:
Reported: 2008-10-10 09:48 UTC by Qian Cai
Modified: 2009-09-02 09:13 UTC (History)

Clone Of:
: 600585 (view as bug list)
Last Closed: 2009-09-02 09:13:36 UTC

Attachments
patch to allow users to specify makdumpfile level (1.50 KB, patch)
2009-07-02 10:42 UTC, Neil Horman

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:1258 normal SHIPPED_LIVE kexec-tools bug fix and enhancement update 2009-09-01 09:09:40 UTC

Description Qian Cai 2008-10-10 09:48:30 UTC
Description of problem:
When makedumpfile fails on a vmcore, we need to know why. Otherwise, it is a pain to debug the failure afterwards from the serial console log. For example,

EXT3-fs: mounted filesystem with ordered data mode.
[  0 %][ 14 %][ 23 %][ 33 %][ 42 %][ 51 %][ 55 %][ 57 %][ 59 %][ 69 %]dropping to initramfs shell
exiting this shell will reboot your system

In fact, makedumpfile failed because there was not enough disk space.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Configure kdump with the following options:

  ext3 <small partition without enough disk space to save vmcore>
  core_collector makedumpfile -E

2. Trigger a crash with SysRq-C.
3. Check the serial console log for the reason for the failure.
Actual results:
No error message.

Expected results:
Some error messages. For example,

[  0 %]write_buffer: Can't write the dump file(vmcore). Success

makedumpfile Failed.

Comment 1 Neil Horman 2008-10-10 14:05:14 UTC
Cai, can you please elaborate a bit on what you're looking for?  It seems like in the above case, setting the default_action to shell would allow you to recreate and debug the issue (by manually re-running the makedumpfile command with a higher log level).  Is there something more you're looking for?

Comment 2 Qian Cai 2008-10-22 12:35:28 UTC
Yes, that makes sense. I'll close this. Thanks.

Comment 3 Qian Cai 2009-07-02 06:19:43 UTC
I am afraid I need to reopen this. It is such a pain to debug makedumpfile failures afterwards. For example, the configuration file contains,

core_collector makedumpfile --dump-dmesg /proc/vmcore /tmp/dmesg

From the serial console logs I can only see,

Saving to the local filesystem /dev/mapper/VolGroup00-LogVol00
e2fsck 1.38 (30-Jun-2005)
/dev/mapper/VolGroup00-LogVol00: recovering journal
/dev/mapper/VolGroup00-LogVol00: clean, 86240/16204320 files, 1176172/16203776 blocks
kjournald starting.  Commit interval 5 seconds
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
mv: unable to rename `/mnt//var/crasmd: stopping all md devices.
h/': No such file or directory
[0JSaving core complete
megaraid: flushing adapter 0...<6>usb 3-1: new full speed USB device using uhci_hcd and address 2
usb 3-1: not running at top speed; connect to a high speed hub
usb 3-1: configuration #1 chosen from 1 choice
hub 3-1:1.0: USB hub found
hub 3-1:1.0: 2 ports detected
Restarting system.

I cannot tell from this whether the above makedumpfile command failed or not.

Note, the "mv: unable to rename" and "No such file or directory" are expected, since I only want to capture dmesg in this case.

I think there are valid reasons not to use "default_action shell" in many situations. For example, users might set up kdump to enter init and capture the vmcore in a second attempt when makedumpfile fails.

I think it is a trivial fix that saves a lot of time when debugging makedumpfile failures. The only downside I can think of is that more bug reports might come in, since all the error and warning messages would be exposed to users!
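
To illustrate the kind of feedback being asked for, here is a minimal sketch of a wrapper that reports whether a collector command succeeded. run_collector is a hypothetical helper, not code from mkdumprd, and it assumes the collector returns a non-zero exit status on failure.

```shell
# Hypothetical helper (not part of kexec-tools): run a collector command
# and say on the console whether it succeeded.
# Assumption: the collector exits non-zero when it fails.
run_collector() {
    if "$@"; then
        echo "core_collector succeeded: $*"
    else
        rc=$?
        echo "core_collector FAILED (exit status $rc): $*" >&2
        return "$rc"
    fi
}

# Stand-in commands are used here instead of a real makedumpfile call.
run_collector true
run_collector sh -c 'exit 3' || echo "collector returned $?"
```

With a message like this in the serial console log, the outcome of the collector would be visible even when the system reboots straight afterwards.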

Comment 4 Qian Cai 2009-07-02 07:46:56 UTC
I had to work around this problem manually by changing --message-level to 15 in the following lines,

            if [ -x /sbin/makedumpfile ]; then
                if [ -e $SYS_VMCOREINFO ]; then
                    grep -q control_d /proc/xen/capabilities 2>/dev/null
                    if [ $? -eq 0 ]; then
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile -X --message-level 1/'`
                    else
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile --message-level 1/'`
                    fi
                else
                    grep -q control_d /proc/xen/capabilities 2>/dev/null
                    if [ $? -eq 0 ]; then
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile --xen-vmcoreinfo \/etc\/makedumpfile.config --message-level 1/'`
                    else
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile -i \/etc\/makedumpfile.config --message-level 1/'`
                    fi
                fi
            fi

Now, it is all clear what is wrong there,

open_dump_memory: Can't open the dump memory((null)). Bad address

makedumpfile Completed.

Comment 5 Qian Cai 2009-07-02 08:25:02 UTC
This does not mean that we always need --message-level 15 here, which is quite verbose. To ease debugging efficiently, I think both common and error messages are needed, so 7 sounds like a good combination.

      Message | progress    common    error     debug
      Level   | indicator   message   message   message
            0 |
            1 |     X
            2 |                X
            4 |                          X
          * 7 |     X          X         X
            8 |                                    X
           15 |     X          X         X         X
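
The table above reads as a bitmask: 1 = progress indicator, 2 = common message, 4 = error message, 8 = debug message. As a minimal sketch under that reading, the following decodes a level into its message classes. decode_message_level is a hypothetical helper for illustration only and is not part of makedumpfile or mkdumprd.

```shell
# Hypothetical helper: list the message classes enabled by a
# --message-level value, using the bit values from the table above.
decode_message_level() {
    level=$1
    out=""
    if [ $((level & 1)) -ne 0 ]; then out="$out progress"; fi
    if [ $((level & 2)) -ne 0 ]; then out="$out common"; fi
    if [ $((level & 4)) -ne 0 ]; then out="$out error"; fi
    if [ $((level & 8)) -ne 0 ]; then out="$out debug"; fi
    out=${out# }            # drop the leading space
    echo "${out:-none}"     # level 0 enables nothing
}

decode_message_level 7    # prints: progress common error
decode_message_level 15   # prints: progress common error debug
```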

Comment 6 Neil Horman 2009-07-02 10:40:24 UTC
I'm a bit hesitant to do this, as makedumpfile gets pretty verbose pretty quickly, and the extra messages will destroy the progress counter that we added.  Also, there was a bug a few years ago that explicitly requested that makedumpfile be silent, although I never really agreed with that.  Maybe what I can do is not specify the message level at all in mkdumprd, and just let the user set it in kdump.conf.  Then we can change the example core_collector configuration to specify --message-level 1 by default.

Comment 7 Neil Horman 2009-07-02 10:42:55 UTC
Created attachment 350259
patch to allow users to specify makdumpfile level

Cai, could you please give this patch a try?  It should allow you to specify --message-level in the core_collector line in /etc/kdump.conf.  Thanks!
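
As a rough illustration of a user-specified level, the sketch below writes a sample kdump.conf-style file and reads back the --message-level given on its core_collector line. The sample file contents, the device path reused from the log above, and the get_message_level helper are assumptions for illustration; none of this is taken from the attached patch.

```shell
# Hypothetical helper: print the --message-level value found on the
# core_collector line of a kdump.conf-style file, or nothing if absent.
get_message_level() {
    sed -n 's/^core_collector.*--message-level[ ]*\([0-9][0-9]*\).*/\1/p' "$1"
}

# Sample configuration (assumed contents, written to a temporary file).
conf=$(mktemp)
cat > "$conf" <<'EOF'
ext3 /dev/mapper/VolGroup00-LogVol00
core_collector makedumpfile -c --message-level 7
EOF

get_message_level "$conf"   # prints: 7
rm -f "$conf"
```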

Comment 8 Qian Cai 2009-07-02 11:21:44 UTC
Thanks Neil, I agree with your proposal, and I have also tested the patch on an ia64 machine without seeing any problems. Once the patch has been integrated into the packages, I will do more testing on it.

Comment 13 errata-xmlrpc 2009-09-02 09:13:36 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

