Bug 605411
Summary:          check raw partition for core dump when starting kdump service
Product:          Red Hat Enterprise Linux 6
Component:        kexec-tools
Version:          6.1
Status:           CLOSED ERRATA
Severity:         medium
Priority:         medium
Reporter:         Dave Maley <dmaley>
Assignee:         Cong Wang <amwang>
QA Contact:       Chao Ye <cye>
CC:               akarlsso, amwang, cye, jwest, martin.wilck, nhorman, qcai, rkhan, syeghiay
Target Milestone: rc
Target Release:   ---
Hardware:         All
OS:               Linux
Fixed In Version: kexec-tools-2.0.0-155.el6
Doc Type:         Bug Fix
Last Closed:      2011-05-19 14:15:15 UTC
Bug Blocks:       561978
Attachments:      partner provided patch (attachment 424944)
It looks like RHEL5 needs this as well; however, I'll hold off on cloning until we have some feedback on the patch from FJ.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux major release. This request is not yet committed for inclusion.

This looks strange. If you really want to collect the core from a raw partition, why not just dump to a vmcore file instead of a raw partition? Neil, is this what 'raw XXX' is used for?

raw XXX causes the kdump initramfs to dump /proc/vmcore to a raw disk without the aid of any file system. The patch above appears to recover such a dump, although I'm hesitant to take it in this form, as there is no guarantee that:
1) there will be space in /var/crash for such a dump
2) we will know that makedumpfile -R needs to be used on the dump
That's part of the reason we require by-hand dump recovery. It might be better to just indicate to the user on startup that a dump is waiting for recovery.

This issue was proposed at a time when we were only considering blocker issues for the current Red Hat Enterprise Linux release. It has been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current release, ask your support representative to file it as a blocker on your behalf. Otherwise, ask that it be considered for the next Red Hat Enterprise Linux release. **

Dave, still waiting on feedback from FJ, as per comment 4. Thanks!

(In reply to comment #4)
> raw XXX causes the kdump initramfs to dump /proc/vmcore to a raw disk w/o the
> aid of any file system.
>
> The patch above appears to recover such a dump, although I'm hesitant to take
> it in this form, as there is no guarantee that:
> 1) there will be space in /var/crash for such a dump

You never know that; the same situation exists if you dump straight into a partition in the first place. But AFAICT many customers use DUMPLEVEL 31 anyway, so the space requirements are moderate. mkdumprd issues a warning if the dump is configured to write to a file-system partition that it considers too small to capture the dump (although, IMO, mkdumprd's required-size estimate is pretty rough). The same could be done for raw partitions.

> 2) there is no guarantee that we know to need to use makedumpfile -R on the
> dump.

I don't understand what you mean here. To my understanding, makedumpfile -R will be required because mkdumprd uses either dd or makedumpfile -F, both of which produce the "flat" format.

> That's part of the reason we require by-hand dump recovery.
>
> It might be better to just indicate to the user on startup that a dump is
> waiting for recovery.

That should be very clearly documented and indicated; otherwise the risk of losing a dump in a critical situation is very high. Raw dumping is desirable in many environments because the chance of failure is lower than in all other cases. Besides, it behaves like diskdump in RHEL3/RHEL4, and many users like that behavior.

Fine. Lowering priority/severity, since this has not been accepted for 6.0 and isn't causing any real failures currently. Changing this into a 6.1 feature request.

So, besides this patch, we also need to:
1) warn if the partition is too small
2) document that, in this case, a dump is waiting for recovery on startup
Am I missing anything?

With dump compression, how do you know if the partition is too small?
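To make the proposed size warning concrete, here is a minimal sketch of such a check in shell. This is an illustration only, not mkdumprd's actual check; parsing the device from the "raw" line of /etc/kdump.conf and comparing against MemTotal are assumptions, and, as the compression question above points out, any such estimate is necessarily rough.

    #!/bin/sh
    # Sketch only -- not the shipped mkdumprd check. Warn if the raw
    # dump partition is smaller than physical memory. With dump level 31
    # and compression the real requirement is usually far lower, so this
    # is a deliberately conservative heuristic.

    rawdev=$(awk '$1 == "raw" {print $2}' /etc/kdump.conf)
    [ -b "$rawdev" ] || exit 0

    # Partition size in bytes, straight from the block device.
    part_bytes=$(blockdev --getsize64 "$rawdev")

    # Physical memory (MemTotal is reported in kB).
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)

    if [ "$part_bytes" -lt $((mem_kb * 1024)) ]; then
        echo "warning: $rawdev may be too small to hold an" \
             "uncompressed dump of ${mem_kb} kB of RAM" >&2
    fi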
With the terabyte memory era approaching, requiring the dump partition to be the same size as physical memory is getting unrealistic.

Amerigo, I think all we honestly need to do here is just check whether a core exists and try to recover it, as the attached patch does. Let's not try to get too fancy with it.

Tested with the latest build:
=====================================================
[root@ibm-js22-07 ~]# rpm -q kernel kexec-tools
kernel-2.6.32-122.el6.ppc64
kexec-tools-2.0.0-171.el6.ppc64
[root@ibm-js22-07 ~]# tail /etc/kdump.conf
#kdump_post /var/crash/scripts/kdump-post.sh
#extra_bins /usr/bin/lftp
#disk_timeout 30
#extra_modules gfs2
#options modulename options
#default shell
raw /dev/sda3
core_collector makedumpfile -c --message-level 1 -d 31
default shell
[root@ibm-js22-07 ~]# touch /etc/kdump.conf
[root@ibm-js22-07 ~]# service kdump restart
Stopping kdump: [  OK  ]
Detected change(s) the following file(s): /etc/kdump.conf
Rebuilding /boot/initrd-2.6.32-122.el6.ppc64kdump.img
Starting kdump: [  OK  ]
[root@ibm-js22-07 ~]# echo c > /proc/sysrq-trigger
--------------------------------------------------------------------------------------
mdadm: No arrays found in config file or automatically
Free memory/Total memory (free %): 150208 / 231296 ( 64.9419 )
Scanning logical volumes
  Reading all physical volumes. This may take a while...
  Found volume group "vg_ibmjs2207" using metadata type lvm2
Activating logical volumes
  2 logical volume(s) in volume group "vg_ibmjs2207" now active
Free memory/Total memory (free %): 149312 / 231296 ( 64.5545 )
Saving to partition /dev/sda3
Free memory/Total memory (free %): 149312 / 231296 ( 64.5545 )
Copying data                       : [100 %]
Saving core complete
Restarting system.
......
Starting RPC idmapd: [  OK  ]
Dump saved to /var/crash/2011-03-18-05:46/vmcore
Starting kdump: [  OK  ]
......
[root@ibm-js22-07 ~]# ls -lsh /var/crash/2011-03-18-05\:46/
total 28M
28M -rw-------. 1 root root 28M Mar 18 05:46 vmcore

Change status to VERIFIED.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0736.html
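For context on the "just check and recover" behavior verified above: a dump written through the dd / makedumpfile -F path is in makedumpfile's flattened format, whose header begins with the literal signature string "makedumpfile", so the service can probe the raw device at startup. Below is a minimal sketch of such a check under those assumptions; it is an illustration of the approach, not the code shipped in kexec-tools-2.0.0-155.el6.

    #!/bin/sh
    # Sketch only -- not the shipped fix. On startup, look for a dump
    # left on the configured raw partition and recover it to /var/crash.
    # Assumes the partition holds makedumpfile's flattened format, whose
    # 16-byte signature field starts with the string "makedumpfile".

    rawdev=$(awk '$1 == "raw" {print $2}' /etc/kdump.conf)
    [ -b "$rawdev" ] || exit 0

    sig=$(dd if="$rawdev" bs=16 count=1 2>/dev/null | tr -d '\0')
    if [ "$sig" = "makedumpfile" ]; then
        dir="/var/crash/$(date +%Y-%m-%d-%H:%M)"
        mkdir -p "$dir"
        # makedumpfile -R reassembles a flattened dump read from stdin
        # into a normal vmcore file; it stops at the end-of-data marker,
        # so the dump size need not be known in advance.
        if makedumpfile -R "$dir/vmcore" < "$rawdev"; then
            echo "Dump saved to $dir/vmcore"
            # Zero the first block so the same dump is not recovered
            # again on every subsequent boot.
            dd if=/dev/zero of="$rawdev" bs=512 count=1 2>/dev/null
        fi
    fi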
Created attachment 424944 [details]
partner provided patch

Description of problem:
When a "raw" partition is entered in /etc/kdump.conf, kdump does write a dump to the partition, but the dump is not automatically recovered at the next reboot. Judging from the comment in /etc/init.d/kdump:

    function start()
    {
            #TODO check raw partition for core dump image

this seems to be a missing feature. However, it makes dumping to a raw partition (the most robust setting, IMO) impractical for anybody except gurus who can recover the dump manually (guessing the size of the dump would be the problem). I am not such a guru and don't feel like figuring out how the size is encoded in the makedumpfile header.

Version-Release number of selected component (if applicable):
kexec-tools-2.0.0-72.el6

How reproducible:
always

Steps to Reproduce:
1. configure a "raw" device in /etc/kdump.conf
2. crash the system
3. observe that the vmcore is not retrieved from the raw partition upon reboot

Actual results:
dump is written but not recovered

Expected results:
dump is recovered, as it was with diskdump in the old days

Additional info:
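For reference, a dump left on the raw partition can be recovered by hand without knowing its size, assuming it was written in the flattened format that mkdumprd's dd / makedumpfile -F path produces: makedumpfile's rearrange mode reads from standard input until it reaches the flat-format end marker. A minimal sketch, with /dev/sda3 standing in for whatever device the "raw" line in /etc/kdump.conf names:

    # /dev/sda3 is a stand-in for the configured raw device.
    # makedumpfile -R stops at the flat-format end-of-data marker,
    # so the size of the dump inside the partition need not be known.
    mkdir -p /var/crash/manual-recovery
    makedumpfile -R /var/crash/manual-recovery/vmcore < /dev/sda3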