Description of problem:
After booting into the capture kernel using the latest kexec-tools 1.102pre-10.el5 and the RHEL5.2-Server-20080224.nightly tree, the vmcore has zero size. However, after downgrading only the kexec-tools package to 1.101-194.4.el5, kdump works properly.

Version-Release number of selected component (if applicable):
kexec-tools-1.102pre-10.el5
kernel-2.6.18-83.el5
RHEL5.2-Server-20080224.nightly

How reproducible:
Reserved hp-lp1.rhts.boston.redhat.com from RHTS.

Steps to Reproduce:
1. Configure kdump with crashkernel=512M@256M.
2. echo c >/proc/sysrq-trigger

Actual results:
In the capture kernel, the kdump service started with:
+ '[' -s /proc/vmcore ']'
+ start
+ status

Expected results:
+ '[' -s /proc/vmcore ']'
+ save_core

Additional info:
I have attached the kdump service startup and capture kernel boot logs.
Created attachment 295894 [details] kdump service starting log for version 1.102pre-10.el5
Created attachment 295895 [details] kdump service starting log for version 1.101-194.4.el5
Created attachment 295896 [details] capture kernel booting log for version 1.102pre-10.el5
Created attachment 295897 [details] capture kernel booting log for version 1.101-194.4.el5
This is a kernel bug, not a kexec-tools bug; updating the component accordingly. I'm reserving an RHTS machine to reproduce.
OK, I'm able to reproduce it in RHTS. Tracking it down now.
*** Bug 435020 has been marked as a duplicate of this bug. ***
Looks like the elf header is never actually getting set up (or we're reading it from the wrong location). That header setup is handled by kexec in userspace during service startup, so I'll have to start looking there.
Removing the kernel vmcoreinfo patch from the kernel does not correct this, so this may well be a kexec-tools bug after all.
Back to thinking it's a kernel problem. I have verified that we are inserting a properly formatted elf core header into a crash segment on kexec load, but the oldmem read routine, reading from the same address when booting the kdump kernel, returns only zeros. I need to understand why that is. A kernel bisection may be required here to track this down quickly.
Not sure if this helps, but I have tried downgrading only the kernel package to -53.el5, yet the problem is still there. So, if the bug is in the kernel, it is probably not a regression.
The bug description seems to indicate that this is a kexec-tools regression. I'm just wondering why not try bisecting kexec-tools from 1.101-194.4.el5 to 1.102pre-10.el5 to figure out the cause.
Because my current tests don't suggest that it is a kexec-tools regression (at least not wholly). The problem is being caused by the fact that the elf core header which was installed at kexec time is not getting found during pivot kernel boot. I've verified that it is properly stored in kernel memory at kexec load, but it reads as all zeros during kdump kernel boot. This suggests that either something is overwriting the memory in question or the kernel's read_oldmem API isn't functioning properly on ia64. That's why I'm looking there. Besides, if there is any kexec regression, it almost certainly occurred between those two versions, which makes bisection pretty well useless there.
OK, more testing, and I'm back to thinking this is in fact a kexec-tools problem. While the buffer that gets passed down to the kernel containing the elf core header is good and properly formatted, something about the metadata that we're passing down (either for that segment or the segment immediately prior) is somehow bogus. If I replace the crash_create_elf64_headers() function in 102pre with the prepare_crash_memory_elf64_headers() function from kexec-tools-1.101, things work again. That may be my short-term solution, unless I can figure out pretty quickly where the discrepancy lies.
Further update: the problem seems related to the size of the buffer being registered for the crash dump header. It's odd, because the memory segment it's being loaded into appears to have sufficient space for a larger header, so I can't imagine what the problem would be. I need to try incrementally increasing the size of the buffer to see where it fails, and then understand why.
Not sure why yet, but something about specifying the alignment value when creating the elf headers may be affecting this.
Created attachment 296408 [details] patch to adjust alignment for ia64 systems

I think I've found the problem. It appears the elf core header alignment for kexec no longer needs to be 4k but rather 1k. I'm attaching a patch. Cai, can you test it out and confirm that it works for you? I just tested it here and received a successful vmcore on my test bed. Thanks!
Yes, I successfully got a vmcore on several IA64 machines with the patch applied to 1.102pre-11.el5.
OK, as soon as the ACKs come in, I'll check this in. Thanks!
Hang on! I have seen some strange endless messages here just after booting into the capture kernel on either hp-lp1.rhts.boston.redhat.com or hp-rx1620-01.rhts.boston.redhat.com:

Total of 1 processors activated (2392.06 BogoMIPS).
checking if image is initramfs... it is
Freeing initrd memory: 5664kB freed
Bad page state in process 'swapper'
page:e0000000110eeed8 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0 (Not tainted)
Trying to fix it up, but a reboot is needed
Backtrace:
Call Trace:
 [<a000000100013ae0>] show_stack+0x40/0xa0 sp=e000000015697b40 bsp=e0000000156912a8
 [<a000000100013b70>] dump_stack+0x30/0x60 sp=e000000015697d10 bsp=e000000015691290
 [<a00000010010a260>] bad_page+0xe0/0x160 sp=e000000015697d10 bsp=e000000015691248
 [<a00000010010aa30>] free_hot_cold_page+0x110/0x320 sp=e000000015697d20 bsp=e000000015691200
 [<a00000010010ad70>] free_hot_page+0x30/0x60 sp=e000000015697d20 bsp=e0000000156911d8
 [<a00000010010d010>] __free_pages+0xb0/0x100 sp=e000000015697d20 bsp=e0000000156911b0
 [<a00000010010d1e0>] free_pages+0x180/0x1a0 sp=e000000015697d20 bsp=e000000015691188
 [<a000000100760dc0>] free_initrd_mem+0x1e0/0x2e0 sp=e000000015697d20 bsp=e000000015691160
 [<a000000100753410>] free_initrd+0x130/0x180 sp=e000000015697d30 bsp=e000000015691128
 [<a000000100756460>] populate_rootfs+0x1e0/0x200 sp=e000000015697d30 bsp=e0000000156910f8
 [<a0000001007487d0>] init+0x3d0/0x780 sp=e000000015697d30 bsp=e0000000156910c8
 [<a0000001000121b0>] kernel_thread_helper+0x30/0x60 sp=e000000015697e30 bsp=e0000000156910a0
 [<a0000001000090c0>] start_kernel_thread+0x20/0x40 sp=e000000015697e30 bsp=e0000000156910a0

Logs have been attached. In addition, there is one machine that failed to get a vmcore: hp-sapphire-01.rhts.boston.redhat.com.
Created attachment 296597 [details] strange endless messages just after booting in capture kernel
Created attachment 296598 [details] hp-sapphire-01 failed to get a vmcore
Can you please move this to a new bug? The freeing of initrd memory shouldn't have anything to do with where we place the elfcoreheader for /proc/vmcore. I'm looking at your log from comment 22, and I note that it is indicative of what happens when you boot to the rootfs when /proc/vmcore is of zero length. Are you sure you're booting these systems with a kernel loaded using a patched version of the kexec-tools package?
Dang, scratch that last comment. I was just referred to an old thread about this same problem when I posted this upstream. Apparently what I am changing is part of an old fix for which we do not have the kernel component pulled in. I'll retest and let you know shortly what the new fix will be.
https://lists.linux-foundation.org/pipermail/fastboot/2007-February/012819.html
Regarding comment 22, I reserved hp-sapphire-01 today, with the patch applied to kexec-tools. One in ten attempts failed to get a vmcore. Not sure why yet, but I suppose I can do the same testing again when the new patch is ready.
OK, I've read over the above thread, and I think I understand what's going on. The original upstream problem occurred when the elfcorehdr was stored in a segment of memory that fell outside of the saved_max_pfn range configured when the system booted up. The saved_max_pfn value in that situation was computed incorrectly, as it stored the max page frame of the running kernel, not the overall system. Increasing that value by computing the max page frame of physical RAM solved that problem.

That's not what's happening in RHEL5. We are successfully attempting to read old memory in the kernel, and our saved_max_pfn is correct for the system. But when we go to read that area of memory, we get back zeros. Since the elf header was successfully stored by the first kernel, this strongly suggests that the startup code for the kernel is not mapping the page of memory that the elf core header is stored in. This may be intentional, if we have a discontiguous memory system and we are at the end of a memory chunk in the segment that we store the elf header in.

I think we are far too late in beta to go messing with the kernel page table setup code. That being said, the easiest solution for ia64, I think, is going to be more or less what I proposed before: we can just assume that the last page of any available hole is suspect and not use it. It's a lousy solution, as it potentially wastes memory, but it will fix the regression until we can properly solve this issue after release. I'll work up a patch shortly.
Created attachment 296798 [details] new patch to fix ia64

New patch to adjust alignment and find a safe location for the elfcorehdr on ia64. Here it is, Cai. Can you test it out please? It's worked for 10+ kexecs for me.
Test results: Overall, apart from the strange messages mentioned in comment 21 (I'll raise a separate BZ for them), a vmcore has successfully been generated on several IA64 machines.

Individual failures:

- intel-s6e5231-01.rhts.boston.redhat.com
The first time, it looks like the capture kernel reset just after getting into the INIT stage. Full log (capture kernel boot starts from "SysRq : Trigger a crashdump"):
http://rhts.redhat.com/testlogs/16955/59714/503863/2127538-test_log--kernel-distribution-ltp-kdump-crasher-EXTERNALWATCHDOG
Further manual tests suggested that after booting into the capture kernel, the remote serial console was garbled. However, there was a vmcore afterwards if I manually rebooted remotely after more than an hour.

- hp-rx2660-03.rhts.boston.redhat.com
The capture kernel stalled at:
cciss: MSI-X init failed -22
cciss0: <0x3230> at PCI 0000:05:00.0 IRQ 53 using DAC
blocks= 860051760 block_size= 512
heads= 255, sectors= 32, cylinders= 105399
Then, if I log in via serial and hit some keys, it can proceed further, but it eventually failed here:
qla2xxx 0000:04:00.1: LIP reset occured (f700).
qla2xxx 0000:04:00.1: LOOP UP detected (4 Gbps).
Full log (I hit "ENTER help ENTER" after the first hang):
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=2129298
Does it sound like a new bug?

- hp-olympia1.rhts.boston.redhat.com
Looks like the capture kernel was reset immediately, and then ELILO failed to find the kernel file. Full log (capture kernel boot starts from "SysRq : Trigger a crashdump"):
http://rhts.redhat.com/testlogs/16959/59718/503873/2128481-test_log--kernel-distribution-ltp-kdump-crasher-EXTERNALWATCHDOG
Sounds like a new bug too?
"First time, looks like capture kernel reset just after getting into INIT stage."
I have a suspicion that, because it looks like we panicked due to a supposedly botched kernel command line, this is a result of bz 428310, which will be fixed as soon as I get clearance after the beta is released.

"cciss: MSI-X init failed -22 / cciss0: <0x3230> at PCI 0000:05:00.0 IRQ 53 using DAC"
This is almost certainly bz 230717.

"Looks like capture kernel has been reset immediately"
This looks like that SAL processing bug that we worked on a few weeks ago. I wonder if ia64 needs something similar done to it. Does the failure happen consistently on this system? If so, we probably need a new bug for it.

Also, I notice that all of these logs are booting with kernel 2.6.18-84.el5. Did you add the patch we're testing in and just not modify the release number? Or is something else going on here?

I think your testing shows that we have a reasonably good fix. Two of the other problems you've hit are most likely other bugs that are being tracked. As for the third, let's investigate a bit more. If it consistently fails on that system, we can open a new bug to track it. I'll check this in as soon as the ACKs come through. Thanks!
Q: Does the failure happen consistently on this system? If so, we probably need a new bug for it.
A: Yes, I have seen it at least 2 times in a row. I shall file it shortly.

Q: Did you add the patch we're testing in?
A: Which kernel patch are you talking about? I have only added the patch in comment 28 to kexec-tools.
Sorry, when I mentioned the kernel patch I was thinking about the patch from the SAL processing bug; it's irrelevant here, you're right.
I'm going to request the exception flag be set on this, since kdump is pretty well DOA on ia64 without it.
Tested kdump with the RHEL 5.2 beta ISO on an Intel Tiger platform and got the same failure as in this bug's title. After downgrading kexec-tools to kexec-tools-1.101-164.el5.ia64.rpm and retesting, the size of /var/crash/2008-03-28../vmcore now looks reasonable after kdump.
There's no need to keep testing here at the moment. I have a fix ready; I'm just waiting for the exception to be granted so that I can check it in.
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
This is ready to be checked in, I just need all the acks set on it.
I get an oops when booting up the kexec'ed kernel when using the latest kexec-tools + this patch:

checking if image is initramfs... it is
Freeing initrd memory: 4544kB freed
Bad page state in process 'swapper'
page:e00000001112b8a8 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0 (Not tainted)
Trying to fix it up, but a reboot is needed
Backtrace:
Call Trace:
 [<a000000100013ae0>] show_stack+0x40/0xa0 sp=e0000000156efb40 bsp=e0000000156e92a8
 [<a000000100013b70>] dump_stack+0x30/0x60 sp=e0000000156efd10 bsp=e0000000156e9290
 [<a00000010010a260>] bad_page+0xe0/0x160 sp=e0000000156efd10 bsp=e0000000156e9248
 [<a00000010010aa30>] free_hot_cold_page+0x110/0x320 sp=e0000000156efd20 bsp=e0000000156e9200
 [<a00000010010ad70>] free_hot_page+0x30/0x60 sp=e0000000156efd20 bsp=e0000000156e91d8
 [<a00000010010d010>] __free_pages+0xb0/0x100 sp=e0000000156efd20 bsp=e0000000156e91b0
 [<a00000010010d1e0>] free_pages+0x180/0x1a0 sp=e0000000156efd20 bsp=e0000000156e9188
 [<a000000100760e20>] free_initrd_mem+0x1e0/0x2e0 sp=e0000000156efd20 bsp=e0000000156e9160
 [<a000000100753470>] free_initrd+0x130/0x180 sp=e0000000156efd30 bsp=e0000000156e9128
 [<a0000001007564c0>] populate_rootfs+0x1e0/0x200 sp=e0000000156efd30 bsp=e0000000156e90f8
 [<a0000001007487d0>] init+0x3d0/0x780 sp=e0000000156efd30 bsp=e0000000156e90c8
 [<a0000001000121b0>] kernel_thread_helper+0x30/0x60 sp=e0000000156efe30 bsp=e0000000156e90a0
 [<a0000001000090c0>] start_kernel_thread+0x20/0x40 sp=e0000000156efe30 bsp=e0000000156e90a0

Then I get many, many more stack traces to follow. All appear to be complaining about:
Bad page state in process 'swapper'
Reading through the full bug report now, I see the same errors I am seeing in the previous comment were also mentioned in comment #20. I am _pretty_ sure I grabbed the right patch (the one named "new patch to fix ia64"), but I will re-confirm. Also, I applied the patch against kexec-tools-1.102pre-15.el5.src.rpm, and I see this was originally reported against -10. It might be that the patch doesn't work properly with -15 (just a wild guess).
Tried the patch against kexec-tools-debuginfo-1.102pre-10 and saw the same oops.
The messages are benign, since the kernel is able to fix up these page table errors. After they are fixed, you should still get a valid vmcore (with the patch). We're tracking this subsequent error in bz 436475. I'm taking the patch from this bz, as it fixes the regression in 5.2, which is the immediate concern. If you have any thoughts as to what's causing the page table errors, your input is welcome over on bz 436475.
Created attachment 299065 [details] revised patch to restrict location of elfcorehdr to crash dump memory region revised version of the patch
dchapman has requested to test this before I check it in; please let me know when it has your thumbs up.
Neil, trying _very_ hard not to be a pain here :) I am not quite ready to admit that this issue and the bad page state one are not the same (or closely related). It seems we are still not calculating the elfcorehdr location properly. My reason for thinking this is that if I start with your previous patch (the one titled "new patch to fix ia64") and then add in the elfcorehdr /= 1024 hack that we discussed on IRC, it works _perfectly_ (no bad page state, no tons of garbage dumped to the screen, AND it collects the core dump).
OK, I think I am starting to see imaginary things now... I was mistaken when I thought that hack actually allowed things to work perfectly. What really happened is it allowed the kexec kernel to boot, and it _appeared_ to save the dump, but the dump was 0 bytes, so essentially it was back to the original behavior. So, I agree, this patch does allow kdump to save the dump. But it also _introduces_ the issue we are seeing and tracking as BZ 436475 (we got past that point in the boot cleanly before, even if the vmcore was 0 bytes). I guess I can give this a thumbs up since we can now get dumps; however, I think we need to seriously continue looking at the BZ 436475 issue.
Copy that, I'll check it in in the AM. And we'll keep on the other issue (I think I've already cc'd you on it). thanks for testing!
Neil, wait!!! Don't check in that patch just yet! I found the real issue. Upstream there is this commit:

commit bcd72df212636eee645276a2409b0eef8c250dee
Author: Magnus Damm <magnus.jp>
Date: Thu Feb 15 22:42:35 2007 +0900

    kexec-tools: Use EFI_LOADER_DATA for ELF core header (ia64)

There is a long description in the commit, but the key bit is: "This strategy requires changes in the secondary kernel as well, I'll post the kernel patches in a little while."

I have not looked at the kernel side yet, but what I did do is revert just this one patch (it is very small) from the latest RHEL version of kexec-tools (kexec-tools-1.102pre-15), and it is once again working cleanly.
Created attachment 299094 [details] patch to revert upstream EFI_LOADER_DATA commit Here is the patch I mentioned in the previous comment. This fixes the zero size vmcore issue as well as the "bad page state" issue. Tested on an HP rx6600.
That's odd; I could have sworn I reverted this patch as part of my previous testing, although I was doing it piecemeal. I should have used git-bisect. Thanks for the find; I'll check this in right away.

FWIW, this is the corresponding kernel commit: cee87af2a5f75713b98d3e65e43872e547122cd5

I would expect the better thing to do would be to take the kernel change (since it seems safe), but I would rather not slam it into the kernel at the last minute. I'll revert the change from kexec-tools for 5.2, and we can look at taking both changes in 5.3.
*** Bug 436475 has been marked as a duplicate of this bug. ***
*** Bug 439208 has been marked as a duplicate of this bug. ***
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0313.html