Bug 788678 - mkdumprd doesn't detect /var mounted on separate partition and kdump kernel fails
Summary: mkdumprd doesn't detect /var mounted on separate partition and kdump kernel fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kexec-tools
Version: 5.5
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Cong Wang
QA Contact: Xu Wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-02-08 18:46 UTC by Matthew Whitehead
Modified: 2018-12-02 16:11 UTC (History)

Fixed In Version: kexec-tools-1.102pre-157.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-01-08 04:09:08 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch for bug 809983 (1.00 KB, patch)
2012-06-22 04:02 UTC, Cong Wang
Proposed Patch (4.96 KB, patch)
2012-06-22 10:27 UTC, Cong Wang
Proposed Patch v2 (15.67 KB, patch)
2012-06-25 06:36 UTC, Cong Wang
Proposed Patch v3 (15.53 KB, patch)
2012-06-25 09:44 UTC, Cong Wang


Links
System: Red Hat Product Errata
ID: RHBA-2013:0012
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: kexec-tools bug fix and enhancement update
Last Updated: 2013-01-08 08:38:45 UTC

Description Matthew Whitehead 2012-02-08 18:46:37 UTC
Description of problem: 

On our system /var is a separate partition. Under RHEL6, the mkdumprd utility correctly detects this on the live system (without needing entries in /etc/kdump.conf) and builds a kdump initrd that mounts the /var partition and dumps to /crash on that partition.

RHEL5 does not do this. The kdump initrd fails to save the core file because it is incorrectly looking for /var/crash on the root filesystem, and it isn't there. It eventually panics the kdump kernel.

Version-Release number of selected component (if applicable): RHEL5.5


How reproducible: 100%


Steps to Reproduce:
1. mkdumprd
2. sysrq-c
3. observe the kdump fail
  
Actual results: crash dump is not obtained.


Expected results: crash dump should be obtained.


Additional info:

Comment 1 Cong Wang 2012-06-21 06:39:06 UTC
Hi, Matthew,

This could be a dup of Bug 809983. Which filesystem is your /var using? Could you show me the output of 'mount'?

Thanks!

Comment 2 Eric Hagberg 2012-06-21 11:32:29 UTC
Can't read that BZ, but the fs is ext3.

np30c3n1.one-nyp.ms.com /var/user/hagberg 2$ mount|grep "/var "
/dev/mapper/v1-varvol on /var type ext3 (rw)
np30c3n1.one-nyp.ms.com /var/user/hagberg 3$

Comment 3 Cong Wang 2012-06-22 04:00:23 UTC
Hi, Eric,

Bug 809983 is basically a bug where we missed the ext* modules when a non-root partition is used as a dump target, so it could be a dup of this one.

So, your /var is ext3, but rootfs is a different ext* filesystem?

Thanks.

Comment 4 Cong Wang 2012-06-22 04:02:15 UTC
Created attachment 593644 [details]
Patch for bug 809983

This is the patch that fixes bug 809983. Could you test whether it fixes this problem as well?

Comment 5 Eric Hagberg 2012-06-22 04:05:47 UTC
/var and / are both ext3. The problem is that mkdumprd doesn't bother to figure out that /var could be on a separate partition, so the init script just blindly mounts /, and tries to copy the core into /var/crash/... and since /var/crash doesn't exist, it fails.

The patch won't fix this.
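
The missing detection step can be sketched as a longest-prefix match of the dump directory against the mount table. This is only an illustration, not the code from any of the attached patches: the function name find_mount is hypothetical, and it assumes a mount table in /proc/mounts format (device, mount point and filesystem type as the first three fields).

```shell
#!/bin/sh
# find_mount PATH [MOUNT_TABLE]   (hypothetical helper, not part of kexec-tools)
# Prints "device mountpoint fstype" for the filesystem that holds PATH,
# chosen as the longest mount point that is a path prefix of PATH.
# MOUNT_TABLE defaults to /proc/mounts.
find_mount() {
    target=$1
    table=${2:-/proc/mounts}
    awk -v target="$target" '
        {
            mp = $2
            # A mount point matches if it is "/", equals the target,
            # or is a leading directory of the target.
            if (mp == "/" || target == mp || index(target, mp "/") == 1) {
                # ">=" lets a later entry for the same mount point win,
                # since later mounts shadow earlier ones.
                if (length(mp) >= length(best)) {
                    best = mp; dev = $1; fs = $3
                }
            }
        }
        END { if (best != "") print dev, best, fs }
    ' "$table"
}
```

With the mount output shown in comment 2, `find_mount /var/crash` would be expected to report /dev/mapper/v1-varvol on /var rather than the root device, which is the information the initrd needs in order to mount the right filesystem before copying the vmcore.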

Comment 6 Cong Wang 2012-06-22 08:38:46 UTC
Yeah, I see. Then the problem is harder and more serious than Bug 809983.

Comment 7 Cong Wang 2012-06-22 10:04:20 UTC
Hmm, on second thought, did you put the block device mounted on /var into your /etc/kdump.conf? Something like:

ext3 /dev/sdbX  #the device mounted on /var
path crash  #relative path inside /var

? Please share your kdump.conf if possible.

Thanks!

Comment 8 Cong Wang 2012-06-22 10:06:27 UTC
Errr, in your case it should be:

ext3 /dev/mapper/v1-varvol
path crash

in /etc/kdump.conf.

Comment 9 Eric Hagberg 2012-06-22 10:08:15 UTC
The point is to _not_ touch the default kdump.conf, and mkdumprd should just work, like it does in RHEL6.

If I do put the ext3 and path directives into kdump.conf, then of course things work fine, but it shouldn't be needed for the stock case where you just want to dump to /var/crash on your local filesystem.
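
For reference, the manual workaround can be derived mechanically from the device, mount point and filesystem type that hold the dump directory. The sketch below is hypothetical (kdump_directives is not part of kexec-tools) and assumes the dump directory lies strictly below the mount point. It only prints the two directives from comments 7 and 8 and does not edit /etc/kdump.conf.

```shell
#!/bin/sh
# kdump_directives DEVICE MOUNTPOINT FSTYPE DUMPDIR   (hypothetical helper)
# Prints the filesystem-target and path directives for kdump.conf.
# The path is made relative to MOUNTPOINT, as in comment 7.
# Assumption: DUMPDIR is strictly below MOUNTPOINT; when MOUNTPOINT
# is "/", DUMPDIR is printed unchanged.
kdump_directives() {
    dev=$1 mp=$2 fs=$3 dir=$4
    # Strip the leading "MOUNTPOINT/" from the dump directory.
    rel=${dir#"$mp"/}
    printf '%s %s\n' "$fs" "$dev"
    printf 'path %s\n' "$rel"
}

# Example with the values reported in comment 2:
kdump_directives /dev/mapper/v1-varvol /var ext3 /var/crash
```

The point of this bug is that none of this should be necessary for the stock configuration; the sketch only makes explicit what mkdumprd would have to work out on its own.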

Comment 10 Cong Wang 2012-06-22 10:18:13 UTC
Yeah... I saw how RHEL6 handles this, will try to backport it to RHEL5.
Thanks!

Comment 11 Cong Wang 2012-06-22 10:27:58 UTC
Created attachment 593683 [details]
Proposed Patch

This is an untested patch. Could you help test it? I can build an rpm for you if you need one.

Thanks!

Comment 12 Eric Hagberg 2012-06-22 12:21:07 UTC
Still fails the same way. I don't see anything in the kdump initrd's generated init script that makes any attempt to mount /var (which is on an lvm partition, by the way).

Comment 13 Cong Wang 2012-06-25 05:53:07 UTC
I see, I still missed one part... :( Will update the patch.

Comment 14 Cong Wang 2012-06-25 06:36:36 UTC
Created attachment 594107 [details]
Proposed Patch v2

Updated patch

Comment 15 Eric Hagberg 2012-06-25 09:12:05 UTC
Still failing:

Creating block device ram9
Saving to the local filesystem UUID=5068996b-9a6b-4160-966b-7f94011d05dd
findfs: Unable to resolve 'UUID=5068996b-9a6b-4160-966b-7f94011d05dd'
BusyBox v1.2.0 (2009.07.02-14:09+0000) multi-call binary

No help available.

mount: Can't find /mnt in /etc/fstab
Attempting to enter user-space to capture vmcore
Creating root device.
Checking root filesystem.
fsck 1.38 (30-Jun-2005)
e2fsck 1.38 (30-Jun-2005)
/: recovering journal
/: clean, 102501/750720 files, 693901/1501440 blocks
Mounting root filesystem.
Trying mount -t ext4 /dev/cciss/c0d0p2 /sysroot
kjournald starting.  Commit interval 5 secondst

EXT3 FS on cciss/c0d0p2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Using ext3 on root filesystem
Switching to new root and running init.
SELinux:  Disabled at runtime.
type=1404 audit(1340615410.722:2): selinux=0 auid=4294967295 ses=4294967295
INIT: version 2.86 booting
                Welcome to Red Hat Enterprise Linux Server
                Press 'I' to enter interactive startup.
Setting clock  (utc): Mon Jun 25 05:10:13 EDT 2012 [  OK  ]
Starting udev: Kernel panic - not syncing: Out of memory and no killable processes...

Comment 16 Cong Wang 2012-06-25 09:44:49 UTC
Created attachment 594144 [details]
Proposed Patch v3

Ok, let's just remove the UUID conversion code.

Comment 17 Eric Hagberg 2012-06-25 14:07:49 UTC
Yep - it works now!

Comment 18 Eric Hagberg 2012-06-25 14:32:48 UTC
... almost. I'm pretty sure that the RHEL6 default mkdumprd uses makedumpfile by default so it isn't just using "cp" to create the vmcore file.

The currently-patched version appears to just use "cp" instead.

Comment 19 Cong Wang 2012-06-26 02:25:07 UTC
(In reply to comment #18)
> ... almost. I'm pretty sure that the RHEL6 default mkdumprd uses
> makedumpfile by default so it isn't just using "cp" to create the vmcore
> file.
> 
> The currently-patched version appears to just use "cp" instead.

Yeah, this is expected, because we don't have a chance to change the default core_collector to makedumpfile on RHEL5, so "cp" is still the default one. :)

Thanks for testing!

Comment 20 RHEL Program Management 2012-06-26 11:28:58 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 27 errata-xmlrpc 2013-01-08 04:09:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0012.html

