Bug 788678
Summary: | mkdumprd doesn't detect /var mounted on separate partition and kdump kernel fails
---|---
Product: | Red Hat Enterprise Linux 5
Reporter: | Matthew Whitehead <mwhitehe>
Component: | kexec-tools
Assignee: | Cong Wang <amwang>
Status: | CLOSED ERRATA
QA Contact: | Xu Wang <xuwang>
Severity: | medium
Priority: | high
Version: | 5.5
CC: | amwang, eric.hagberg, fhirtz, hagberg, qcai, rkhan, ruyang, xuwang
Target Milestone: | rc
Hardware: | x86_64
OS: | Linux
Fixed In Version: | kexec-tools-1.102pre-157.el5
Doc Type: | Bug Fix
Last Closed: | 2013-01-08 04:09:08 UTC
Description
Matthew Whitehead
2012-02-08 18:46:37 UTC
Hi, Matthew,

This could be a dup of Bug 809983. Which filesystem is your /var using? Please show me the output of 'mount'.

Thanks!

Can't read that BZ, but the fs is ext3.

    np30c3n1.one-nyp.ms.com /var/user/hagberg 2$ mount|grep "/var "
    /dev/mapper/v1-varvol on /var type ext3 (rw)
    np30c3n1.one-nyp.ms.com /var/user/hagberg 3$

Hi, Eric,

Bug 809983 is basically a bug where we missed the ext* modules when a non-root partition is used as a dump target, so it could be a dup of this one. So, your /var is ext3, but rootfs is a different ext* filesystem?

Thanks.

Created attachment 593644 [details]
Patch for bug 809983

This is the patch to fix bug 809983. Please test whether it fixes this problem as well.

/var and / are both ext3. The problem is that mkdumprd doesn't bother to figure out that /var could be on a separate partition, so the init script just blindly mounts /, and tries to copy the core into /var/crash/... and since /var/crash doesn't exist, it fails. The patch won't fix this.

Yeah, I see. Then the problem is harder and more serious than Bug 809983.

Hmm, on second thought, did you put the block device mounted on /var into your /etc/kdump.conf? Something like:

    ext3 /dev/sdbX    #the device mounted on /var
    path crash        #relative path inside /var

? Please share your kdump.conf if possible.

Thanks!

Errr, in your case it should be:

    ext3 /dev/mapper/v1-varvol
    path crash

in /etc/kdump.conf.

The point is to _not_ touch the default kdump.conf, and mkdumprd should just work, like it does in RHEL6. If I do put the ext3 and path directives into kdump.conf, then of course things work fine, but it shouldn't be needed for the stock case where you just want to dump to /var/crash on your local filesystem.

Yeah... I saw how RHEL6 handles this, will try to backport it to RHEL5.

Thanks!

Created attachment 593683 [details]
Proposed Patch
This is an untested patch, please help to test it? I can build an rpm for you if you need.
Thanks!
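
For readers following the thread, the detection step discussed above (working out which filesystem actually holds /var/crash instead of assuming it lives on /) can be sketched roughly as follows. This is only an illustration in Python, not the code from the attached patches or from mkdumprd itself, which is a shell script. The function names `parse_mounts` and `resolve_dump_target` and the sample mount table are hypothetical; the sample reuses the device names quoted in this bug. The idea is a longest-prefix match of the dump directory against the mount points listed in /proc/mounts-style text.

```python
import os


def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mount_point, fstype) tuples.

    Simplification: escaped characters such as \\040 (space) are not decoded.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def resolve_dump_target(mounts, dump_dir="/var/crash"):
    """Return (fstype, device, relative_path) for the filesystem holding dump_dir.

    The mount point with the longest matching prefix wins; on a tie the later
    entry wins, because a later mount shadows an earlier one.
    """
    dump_dir = os.path.normpath(dump_dir)
    best = None
    for device, mount_point, fstype in mounts:
        if fstype == "rootfs":
            continue  # skip the kernel's initial rootfs pseudo-entry
        mount_point = os.path.normpath(mount_point)
        prefix = mount_point if mount_point == "/" else mount_point + "/"
        if dump_dir == mount_point or dump_dir.startswith(prefix):
            if best is None or len(mount_point) >= len(best[1]):
                best = (device, mount_point, fstype)
    if best is None:
        raise ValueError("no mounted filesystem contains %s" % dump_dir)
    device, mount_point, fstype = best
    return fstype, device, os.path.relpath(dump_dir, mount_point)


# Hypothetical mount table built from the device names quoted in this bug.
SAMPLE = """\
rootfs / rootfs rw 0 0
/dev/cciss/c0d0p2 / ext3 rw 0 0
/dev/mapper/v1-varvol /var ext3 rw 0 0
"""

if __name__ == "__main__":
    fstype, device, rel_path = resolve_dump_target(parse_mounts(SAMPLE))
    print("%s %s" % (fstype, device))  # ext3 /dev/mapper/v1-varvol
    print("path %s" % rel_path)        # path crash
```

With the sample table this yields the same two kdump.conf lines suggested earlier in the thread. If /var were not a separate mount, the same lookup would fall back to the root device with the relative path var/crash.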
Still fails the same way. I don't see anything in the kdump initrd's generated init script that makes any attempt to mount /var (which is on an lvm partition, by the way).

I see, I still missed one part... :( Will update the patch.

Created attachment 594107 [details]
Proposed Patch v2
Updated patch
Still failing:

    Creating block device ram9
    Saving to the local filesystem UUID=5068996b-9a6b-4160-966b-7f94011d05dd
    findfs: Unable to resolve 'UUID=5068996b-9a6b-4160-966b-7f94011d05dd'
    BusyBox v1.2.0 (2009.07.02-14:09+0000) multi-call binary
    No help available.
    mount: Can't find /mnt in /etc/fstab
    Attempting to enter user-space to capture vmcore
    Creating root device.
    Checking root filesystem.
    fsck 1.38 (30-Jun-2005)
    e2fsck 1.38 (30-Jun-2005)
    /: recovering journal
    /: clean, 102501/750720 files, 693901/1501440 blocks
    Mounting root filesystem.
    Trying mount -t ext4 /dev/cciss/c0d0p2 /sysroot
    kjournald starting. Commit interval 5 secondst
    EXT3 FS on cciss/c0d0p2, internal journal
    EXT3-fs: mounted filesystem with ordered data mode.
    Using ext3 on root filesystem
    Switching to new root and running init.
    SELinux: Disabled at runtime.
    type=1404 audit(1340615410.722:2): selinux=0 auid=4294967295 ses=4294967295
    INIT: version 2.86 booting
    Welcome to Red Hat Enterprise Linux Server
    Press 'I' to enter interactive startup.
    Setting clock (utc): Mon Jun 25 05:10:13 EDT 2012 [ OK ]
    Starting udev: Kernel panic - not syncing: Out of memory and no killable processes...

Created attachment 594144 [details]
Proposed Patch v3
Ok, let's just remove the UUID converting code.
Yep - it works now!

... almost. I'm pretty sure that the RHEL6 default mkdumprd uses makedumpfile by default so it isn't just using "cp" to create the vmcore file.

The currently-patched version appears to just use "cp" instead.

(In reply to comment #18)
> ... almost. I'm pretty sure that the RHEL6 default mkdumprd uses
> makedumpfile by default so it isn't just using "cp" to create the vmcore
> file.
>
> The currently-patched version appears to just use "cp" instead.

Yeah, this is expected, because we don't have a chance to change the default core_collector to makedumpfile on RHEL5, so "cp" is still the default one. :)

Thanks for testing!

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0012.html