Bug 788678

Summary: mkdumprd doesn't detect /var mounted on separate partition and kdump kernel fails
Product: Red Hat Enterprise Linux 5 Reporter: Matthew Whitehead <mwhitehe>
Component: kexec-tools Assignee: Cong Wang <amwang>
Status: CLOSED ERRATA QA Contact: Xu Wang <xuwang>
Severity: medium Docs Contact:
Priority: high    
Version: 5.5 CC: amwang, eric.hagberg, fhirtz, hagberg, qcai, rkhan, ruyang, xuwang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kexec-tools-1.102pre-157.el5 Doc Type: Bug Fix
Last Closed: 2013-01-08 04:09:08 UTC Type: ---
Attachments:
Description Flags
Patch for bug 809983 none
Proposed Patch none
Proposed Patch v2 none
Proposed Patch v3 none

Description Matthew Whitehead 2012-02-08 18:46:37 UTC
Description of problem: 

On our system /var is a separate partition. Under RHEL6, the mkdumprd utility correctly detects this on the live system (without needing entries in /etc/kdump.conf) and builds a kdump initrd that mounts the /var partition and dumps to /crash on that partition.

RHEL5 does not do this. The kdump initrd fails to save the core file because it is incorrectly looking for /var/crash on the root filesystem, and it isn't there. It eventually panics the kdump kernel.
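
One quick way to confirm this layout on the running system is to compare the device backing / with the device backing /var/crash. The following is a minimal sketch that assumes POSIX `df -P`; the helper names `fs_device` and `same_fs` are hypothetical and are not part of kexec-tools.

```shell
# Print the device (first column of `df -P`) that backs the given path.
fs_device() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Succeed when both paths report the same backing device.
same_fs() {
    [ "$(fs_device "$1")" = "$(fs_device "$2")" ]
}

# Report the mismatch that trips up the RHEL5 kdump initrd.
if [ -d /var/crash ] && ! same_fs / /var/crash; then
    echo "/var/crash is on $(fs_device /var/crash), not on the root filesystem"
fi
```

If the message is printed, the default dump directory does not live on the root filesystem, which is the case described above.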

Version-Release number of selected component (if applicable): RHEL5.5


How reproducible: 100%


Steps to Reproduce:
1. mkdumprd
2. sysrq-c
3. observe the kdump fail
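
The steps above can be sketched as shell commands. This is an illustration only: the `run` wrapper and `reproduce_kdump_failure` are hypothetical names, the initrd path follows the usual RHEL5 naming and is an assumption, and the sketch presumes the kdump service has already loaded the capture kernel. Commands are only printed unless `DRY_RUN=0` is set, because the last step deliberately crashes the running kernel.

```shell
# Hypothetical wrapper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce_kdump_failure() {
    kver=$(uname -r)
    # 1. Rebuild the kdump initrd (image path is an assumption).
    run /sbin/mkdumprd -f "/boot/initrd-${kver}kdump.img" "$kver"
    # 2. Enable SysRq and trigger a crash, the equivalent of sysrq-c.
    run sh -c 'echo 1 > /proc/sys/kernel/sysrq'
    run sh -c 'echo c > /proc/sysrq-trigger'
    # 3. After the reboot, check whether /var/crash holds a new vmcore.
}

reproduce_kdump_failure
```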
  
Actual results: crash dump is not obtained.


Expected results: crash dump should be obtained.


Additional info:

Comment 1 Cong Wang 2012-06-21 06:39:06 UTC
Hi, Matthew,

This could be a dup of Bug 809983. Which filesystem is your /var using? Please show me the output of 'mount'.

Thanks!

Comment 2 Eric Hagberg 2012-06-21 11:32:29 UTC
Can't read that BZ, but the fs is ext3.

np30c3n1.one-nyp.ms.com /var/user/hagberg 2$ mount|grep "/var "
/dev/mapper/v1-varvol on /var type ext3 (rw)
np30c3n1.one-nyp.ms.com /var/user/hagberg 3$

Comment 3 Cong Wang 2012-06-22 04:00:23 UTC
Hi, Eric,

Bug 809983 is basically a bug where we missed the ext* modules when a non-root partition is used as a dump target, so it could be a dup of this one.

So, your /var is ext3, but the rootfs is a different ext* filesystem?

Thanks.

Comment 4 Cong Wang 2012-06-22 04:02:15 UTC
Created attachment 593644
Patch for bug 809983

This is the patch to fix bug 809983. Could you please test whether it fixes this problem as well?

Comment 5 Eric Hagberg 2012-06-22 04:05:47 UTC
/var and / are both ext3. The problem is that mkdumprd doesn't bother to figure out that /var could be on a separate partition, so the init script just blindly mounts /, and tries to copy the core into /var/crash/... and since /var/crash doesn't exist, it fails.

The patch won't fix this.
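
The missing detection step amounts to finding which mounted filesystem actually holds /var/crash, that is, the mount point that is the longest leading prefix of the path. The sketch below illustrates that idea against `mount`-style output. It is not the code from mkdumprd or from any of the patches; `mount_for_path` is a hypothetical name, and the sample input is assembled from devices mentioned elsewhere in this report.

```shell
# Read `mount`-style lines ("DEV on MNT type FS (opts)") on stdin and print
# "DEV MNT FS" for the mount point that is the longest prefix of the path.
mount_for_path() {
    awk -v path="$1" '
        $2 == "on" && $4 == "type" {
            mnt = $3
            # A mount point matches if it is "/", equals the path,
            # or is a leading directory of the path.
            if (mnt == "/" || path == mnt || index(path, mnt "/") == 1) {
                if (length(mnt) > best_len) {
                    best_len = length(mnt)
                    best = $1 " " mnt " " $5
                }
            }
        }
        END { if (best != "") print best }
    '
}

# Sample mount table (illustrative).
mount_for_path /var/crash <<'EOF'
/dev/cciss/c0d0p2 on / type ext3 (rw)
/dev/mapper/v1-varvol on /var type ext3 (rw)
EOF
```

On a live system the same function could be fed with `mount | mount_for_path /var/crash`. Mount points containing spaces are not handled by this sketch.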

Comment 6 Cong Wang 2012-06-22 08:38:46 UTC
Yeah, I see. Then the problem is harder and more serious than Bug 809983.

Comment 7 Cong Wang 2012-06-22 10:04:20 UTC
Hmm, on second thought, did you put the block device mounted on /var into your /etc/kdump.conf? Something like:

ext3 /dev/sdbX  #the device mounted on /var
path crash  #relative path inside /var

? Please share your kdump.conf if possible.

Thanks!

Comment 8 Cong Wang 2012-06-22 10:06:27 UTC
Errr, in your case it should be:

ext3 /dev/mapper/v1-varvol
path crash

in /etc/kdump.conf.

Comment 9 Eric Hagberg 2012-06-22 10:08:15 UTC
The point is to _not_ touch the default kdump.conf, and mkdumprd should just work, like it does in RHEL6.

If I do put the ext3 and path directives into kdump.conf, then of course things work fine, but it shouldn't be needed for the stock case where you just want to dump to /var/crash on your local filesystem.
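
For anyone who needs the workaround in the meantime, the two directives can be derived mechanically from the filesystem type, device, and mount point that hold the dump directory. A minimal sketch follows; `kdump_directives` is a hypothetical helper and not part of kexec-tools.

```shell
# Print the two kdump.conf directives for a dump directory, given the
# filesystem type, device, and mount point that hold it.
kdump_directives() {
    fstype=$1 device=$2 mnt=$3 dumpdir=$4
    rel=${dumpdir#"$mnt"}   # strip the mount point prefix
    rel=${rel#/}            # "path" is relative to the mounted filesystem
    printf '%s %s\npath %s\n' "$fstype" "$device" "$rel"
}

# Values taken from the mount output earlier in this report.
kdump_directives ext3 /dev/mapper/v1-varvol /var /var/crash
```

The output can be appended to /etc/kdump.conf by hand. The point of this report remains that the stock configuration should not need it.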

Comment 10 Cong Wang 2012-06-22 10:18:13 UTC
Yeah... I saw how RHEL6 handles this, will try to backport it to RHEL5.
Thanks!

Comment 11 Cong Wang 2012-06-22 10:27:58 UTC
Created attachment 593683
Proposed Patch

This is an untested patch. Could you please help test it? I can build an rpm for you if you need one.

Thanks!

Comment 12 Eric Hagberg 2012-06-22 12:21:07 UTC
Still fails the same way. I don't see anything in the kdump initrd's generated init script that makes any attempt to mount /var (which is on an lvm partition, by the way).
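
A check like this can be repeated by unpacking the kdump initrd and listing the mount calls in its init script. The sketch below assumes the image is a gzip-compressed cpio archive with the script at `init` in its top level. `unpack_initrd` and `mount_lines` are hypothetical helper names, and the paths in the usage comment are illustrative.

```shell
# Unpack a gzip-compressed cpio initrd into a scratch directory (assumed format).
unpack_initrd() {
    img=$1 dest=$2
    mkdir -p "$dest" && zcat "$img" | ( cd "$dest" && cpio -idm --quiet )
}

# List the lines of an init script that call mount, with line numbers.
mount_lines() {
    grep -nE '(^|[[:space:];])mount([[:space:]]|$)' "$1"
}

# Usage (paths are illustrative):
#   unpack_initrd "/boot/initrd-$(uname -r)kdump.img" /tmp/kdump-initrd
#   mount_lines /tmp/kdump-initrd/init
```

If no listed line refers to the device that holds /var, the initrd will try to write the vmcore to the root filesystem only.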

Comment 13 Cong Wang 2012-06-25 05:53:07 UTC
I see, I still missed one part... :( Will update the patch.

Comment 14 Cong Wang 2012-06-25 06:36:36 UTC
Created attachment 594107
Proposed Patch v2

Updated patch

Comment 15 Eric Hagberg 2012-06-25 09:12:05 UTC
Still failing:

Creating block device ram9
Saving to the local filesystem UUID=5068996b-9a6b-4160-966b-7f94011d05dd
findfs: Unable to resolve 'UUID=5068996b-9a6b-4160-966b-7f94011d05dd'
BusyBox v1.2.0 (2009.07.02-14:09+0000) multi-call binary

No help available.

mount: Can't find /mnt in /etc/fstab
Attempting to enter user-space to capture vmcore
Creating root device.
Checking root filesystem.
fsck 1.38 (30-Jun-2005)
e2fsck 1.38 (30-Jun-2005)
/: recovering journal
/: clean, 102501/750720 files, 693901/1501440 blocks
Mounting root filesystem.
Trying mount -t ext4 /dev/cciss/c0d0p2 /sysroot
kjournald starting.  Commit interval 5 seconds

EXT3 FS on cciss/c0d0p2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Using ext3 on root filesystem
Switching to new root and running init.
SELinux:  Disabled at runtime.
type=1404 audit(1340615410.722:2): selinux=0 auid=4294967295 ses=4294967295
INIT: version 2.86 booting
                Welcome to Red Hat Enterprise Linux Server
                Press 'I' to enter interactive startup.
Setting clock  (utc): Mon Jun 25 05:10:13 EDT 2012 [  OK  ]
Starting udev: Kernel panic - not syncing: Out of memory and no killable processes...

Comment 16 Cong Wang 2012-06-25 09:44:49 UTC
Created attachment 594144
Proposed Patch v3

OK, let's just remove the UUID conversion code.

Comment 17 Eric Hagberg 2012-06-25 14:07:49 UTC
Yep - it works now!

Comment 18 Eric Hagberg 2012-06-25 14:32:48 UTC
... almost. I'm pretty sure that the RHEL6 default mkdumprd uses makedumpfile by default so it isn't just using "cp" to create the vmcore file.

The currently-patched version appears to just use "cp" instead.

Comment 19 Cong Wang 2012-06-26 02:25:07 UTC
(In reply to comment #18)
> ... almost. I'm pretty sure that the RHEL6 default mkdumprd uses
> makedumpfile by default so it isn't just using "cp" to create the vmcore
> file.
> 
> The currently-patched version appears to just use "cp" instead.

Yeah, this is expected, because we don't have a chance to change the default core_collector to makedumpfile on RHEL5, so "cp" is still the default one. :)

Thanks for testing!
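
Users who want makedumpfile on RHEL5 can presumably opt in through the core_collector directive in /etc/kdump.conf. The sketch below appends such a line only if none is set. `set_core_collector` is a hypothetical helper, and the `-c` (compress) option is just one possible choice, not a recommendation from this report.

```shell
# Append a core_collector line to the given kdump.conf unless one is already set.
set_core_collector() {
    conf=$1
    if grep -q '^[[:space:]]*core_collector[[:space:]]' "$conf"; then
        return 0   # respect an existing setting
    fi
    echo 'core_collector makedumpfile -c' >> "$conf"
}
```

Typical use would be `set_core_collector /etc/kdump.conf`, followed by rebuilding the kdump initrd so that the change takes effect.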

Comment 20 RHEL Program Management 2012-06-26 11:28:58 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 27 errata-xmlrpc 2013-01-08 04:09:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0012.html