Bug 466436 - [5.3][RFE] Makedumpfile Error Messages
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kexec-tools
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Neil Horman
QA Contact: Red Hat Kernel QE team
Keywords: Reopened
Depends On:
Reported: 2008-10-10 05:48 EDT by CAI Qian
Modified: 2009-09-02 05:13 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 600585
Last Closed: 2009-09-02 05:13:36 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
patch to allow users to specify makedumpfile level (1.50 KB, patch)
2009-07-02 06:42 EDT, Neil Horman
Description CAI Qian 2008-10-10 05:48:30 EDT
Description of problem:
When makedumpfile fails on a vmcore, we need to know why. Otherwise, it is painful to debug the failure afterwards from the serial console log. For example,

EXT3-fs: mounted filesystem with ordered data mode.
[  0 %][ 14 %][ 23 %][ 33 %][ 42 %][ 51 %][ 55 %][ 57 %][ 59 %][ 69 %]dropping to initramfs shell
exiting this shell will reboot your system

In fact, makedumpfile failed because there was not enough disk space.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Configure kdump with the following options:

  ext3 <small partition without enough disk space to save vmcore>
  core_collector makedumpfile -E

2. SysRq-C
3. Check the serial console log for the reason for the failure.
Actual results:
No error message.

Expected results:
Some error messages. For example,

[  0 %]write_buffer: Can't write the dump file(vmcore). Success

makedumpfile Failed.
Comment 1 Neil Horman 2008-10-10 10:05:14 EDT
Cai, can you please elaborate a bit on what you're looking for?  It seems like in the above case, setting the default_action to shell would allow you to recreate and debug the issue (by manually re-running the makedumpfile command with a higher log level).  Is there something more you're looking for?
Comment 2 CAI Qian 2008-10-22 08:35:28 EDT
Yes, that makes sense. I'll close this. Thanks.
Comment 3 CAI Qian 2009-07-02 02:19:43 EDT
I am afraid I'll need to re-open it. It is such a pain to debug makedumpfile failures afterwards. For example, the configuration file contains,

core_collector makedumpfile --dump-dmesg /proc/vmcore /tmp/dmesg

From the serial console logs I can only see,

Saving to the local filesystem /dev/mapper/VolGroup00-LogVol00
e2fsck 1.38 (30-Jun-2005)
/dev/mapper/VolGroup00-LogVol00: recovering journal
/dev/mapper/VolGroup00-LogVol00: clean, 86240/16204320 files, 1176172/16203776 blocks
kjournald starting.  Commit interval 5 seconds
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
mv: unable to rename `/mnt//var/crasmd: stopping all md devices.
h/': No such file or directory
[0JSaving core complete
megaraid: flushing adapter 0...<6>usb 3-1: new full speed USB device using uhci_hcd and address 2
usb 3-1: not running at top speed; connect to a high speed hub
usb 3-1: configuration #1 chosen from 1 choice
hub 3-1:1.0: USB hub found
hub 3-1:1.0: 2 ports detected
Restarting system.

I have no idea whether the above makedumpfile command failed or not.

Note, the "mv: unable to rename" and "No such file or directory" messages are expected, since I only want to capture dmesg in this case.

I think there are valid reasons not to use "default_action shell" in many situations. For example, users might set up kdump to enter init and capture the vmcore in a second attempt when makedumpfile has failed.

I think it is a trivial fix that saves a lot of effort when debugging makedumpfile failures. The only downside I can think of is that more bug reports might come in, since all the error and warning messages would be exposed to users!
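
To illustrate the kind of feedback being asked for, here is a minimal sketch of a wrapper that prints whether a collector command succeeded. The function name run_collector is hypothetical and is not part of kexec-tools or mkdumprd; it only shows one way an exit status could be made visible on the console.

```shell
# Hypothetical helper (not part of kexec-tools): run a collector command
# and report its exit status so that a failure is visible in the console log.
run_collector() {
    # Capture the exit status without tripping "set -e" in the caller.
    if "$@"; then
        rc=0
    else
        rc=$?
    fi

    if [ "$rc" -eq 0 ]; then
        echo "core_collector succeeded: $*"
    else
        echo "core_collector FAILED (exit status $rc): $*" >&2
    fi
    return "$rc"
}
```

It could be called as "run_collector makedumpfile --dump-dmesg /proc/vmcore /tmp/dmesg", using the command line from the configuration above.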
Comment 4 CAI Qian 2009-07-02 03:46:56 EDT
I have to manually work around this problem by changing --message-level to 15 in the following lines,

            if [ -x /sbin/makedumpfile ]; then
                if [ -e $SYS_VMCOREINFO ]
                then
                    grep -q control_d /proc/xen/capabilities 2>/dev/null
                    if [ $? -eq 0 ]
                    then
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile -X --message-level 1/'`
                    else
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile --message-level 1/'`
                    fi
                else
                    grep -q control_d /proc/xen/capabilities 2>/dev/null
                    if [ $? -eq 0 ]
                    then
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile --xen-vmcoreinfo \/etc\/makedumpfile.config --message-level 1/'`
                    else
                        CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e's/makedumpfile/makedumpfile -i \/etc\/makedumpfile.config --message-level 1/'`
                    fi
                fi
            fi

Now, it is all clear what is wrong there,

open_dump_memory: Can't open the dump memory((null)). Bad address

makedumpfile Completed.
Comment 5 CAI Qian 2009-07-02 04:25:02 EDT
It does not mean that we always need --message-level 15 here, which is quite verbose. I think that, to ease the debugging pain efficiently, both common and error messages are needed, so 7 sounds like a good combination.

      Message | progress    common    error     debug
      Level   | indicator   message   message   message
            0 |
            1 |     X
            2 |                X
            4 |                          X
          * 7 |     X          X         X
            8 |                                    X
           15 |     X          X         X         X
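
The table above reads as a bitmask: each message class has its own bit, and levels such as 7 and 15 are sums of those bits. A minimal sketch that decodes a level this way follows. The function name decode_message_level is made up for illustration, and the bit values are taken only from the table above.

```shell
# Hypothetical helper: list the message classes enabled by a
# --message-level value, assuming the bit values shown in the table
# (1 = progress, 2 = common, 4 = error, 8 = debug).
decode_message_level() {
    level=$1
    classes=""
    if [ $((level & 1)) -ne 0 ]; then classes="$classes progress"; fi
    if [ $((level & 2)) -ne 0 ]; then classes="$classes common"; fi
    if [ $((level & 4)) -ne 0 ]; then classes="$classes error"; fi
    if [ $((level & 8)) -ne 0 ]; then classes="$classes debug"; fi
    # Drop the leading space; print "none" when no bit is set.
    classes=${classes# }
    echo "${classes:-none}"
}

decode_message_level 7     # prints "progress common error"
decode_message_level 15    # prints "progress common error debug"
```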
Comment 6 Neil Horman 2009-07-02 06:40:24 EDT
I'm a bit hesitant to do this, as makedumpfile gets pretty verbose pretty quickly, and the extra messages will destroy the progress counter that we added.  Also, there was a bug a few years ago that explicitly requested that makedumpfile be silent, although I never really agreed with that too much.  Maybe what I can do is not specify the message level at all in mkdumprd, and just let the user set it in kdump.conf.  Then we can change the example core_collector configuration to specify message-level 1 by default.
Comment 7 Neil Horman 2009-07-02 06:42:55 EDT
Created attachment 350259
patch to allow users to specify makedumpfile level

Cai, could you please give this patch a try?  It should allow you to specify --message-level in the core_collector line in /etc/kdump.conf.  Thanks!
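
As an illustration only, a kdump.conf fragment using this might look like the sketch below. It assumes the patched mkdumprd passes the core_collector line through unchanged; the target device and the -E option are borrowed from earlier comments in this report, and level 7 is the value suggested in comment 5.

```
# /etc/kdump.conf (illustrative fragment, not taken from the patch)
ext3 /dev/mapper/VolGroup00-LogVol00
core_collector makedumpfile -E --message-level 7
```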
Comment 8 CAI Qian 2009-07-02 07:21:44 EDT
Thanks Neil, I agree with your proposal, and I have also tested the patch on an ia64 machine without seeing any problems. Once the patch has been integrated into the packages, I will do more testing of it.
Comment 13 errata-xmlrpc 2009-09-02 05:13:36 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

