Description of problem:
The kexec-tools update released for bz682085 causes a kdump regression. Two symptoms are observed:
1. Data below 640kB is invalid because it is stored at an incorrect address. This makes the vmcore unreadable by the crash command.
2. A partial dump may fail when free pages are excluded.

Version-Release number of selected component (if applicable):
Applying RHBA-2011-0382 (kexec-tools-1.102pre-126.el5_6.5) causes this issue.

How reproducible:
Always

Steps to Reproduce:
For 1:
1. Specify in /etc/kdump.conf:
   ext3 /dev/sda6
   core_collector makedumpfile -d 0
   /etc/sysconfig/kdump is left at its default.
2. Invoke kdump: echo c > /proc/sysrq-trigger
3. Open the vmcore with crash.
For 2:
1. Specify in /etc/kdump.conf:
   ext3 /dev/sda6
   core_collector makedumpfile -d 31
   /etc/sysconfig/kdump is left at its default.
2. echo c > /proc/sysrq-trigger

Actual results:
For 1: zone_table, which is stored below 640kB, shows invalid data (all zero).
For 2: the dump fails with the following error:
page_to_pfn: Can't convert the address of page descriptor (21eb6fefa7dfded5) to pfn.

Expected results:
The vmcore should be collected and must contain valid data.

Additional info:
The 0-640k range is used by the 2nd OS, so its original contents are copied first. When this happens, because of the erratum, the 2nd kernel omits the reserved areas. For example, with this layout:
0-64k: reserved
64k-638k: usable
638k-640k: reserved
only the 64k-638k range is copied, but the copied area is treated as if it had been copied from 0k, not 64k, which corrupts the vmcore.
The vendor has proposed an initial patch.
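To make the address mismatch described under "Additional info" concrete, here is a minimal toy model in Python. It is not kexec-tools code: the function names (`backup_packed`, `backup_preserving`, `read_phys`) and the fill pattern are hypothetical, and only the region layout comes from the example above. It assumes the reader of the backup expects offset N in the copy to hold physical address N, and it does not claim to show how either attached patch fixes the problem.

```python
KB = 1024

# Region layout of the first 640 kB, taken from the example in this report.
REGIONS = [
    (0 * KB, 64 * KB, "reserved"),
    (64 * KB, 638 * KB, "usable"),
    (638 * KB, 640 * KB, "reserved"),
]


def make_low_memory():
    """Build fake low memory: the first byte of each kB encodes its kB index."""
    mem = bytearray(640 * KB)
    for addr in range(0, 640 * KB, KB):
        mem[addr] = (addr // KB) % 251 + 1  # never zero
    return mem


def backup_packed(mem, regions):
    """Model of the regression: usable data is stored starting at offset 0."""
    backup = bytearray(len(mem))
    offset = 0
    for start, end, kind in regions:
        if kind != "usable":
            continue
        backup[offset:offset + (end - start)] = mem[start:end]
        offset += end - start
    return backup


def backup_preserving(mem, regions):
    """Model of the expected behaviour: data keeps its physical offset."""
    backup = bytearray(len(mem))
    for start, end, kind in regions:
        if kind != "usable":
            continue
        backup[start:end] = mem[start:end]
    return backup


def read_phys(backup, addr):
    """Reader side: assumes backup offset equals physical address."""
    return backup[addr]


if __name__ == "__main__":
    mem = make_low_memory()
    addr = 100 * KB
    print("original  :", mem[addr])
    print("packed    :", read_phys(backup_packed(mem, REGIONS), addr))
    print("preserving:", read_phys(backup_preserving(mem, REGIONS), addr))
```

In this model, a lookup at 100k in the packed copy returns the data that originally lived at 164k, which is the kind of shift that would make structures below 640kB read back as wrong or empty.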
Created attachment 491685 [details] initial patch
Takuma, please elaborate on the reproducer and the environment for this issue. We have not been able to reproduce it in-house so far. CAI Qian
bz678308 says it has been reproduced. I am not sure why you are not hitting this. The key here is that the first memory region must be marked as reserved in the 1st kernel for this to happen. If you can use the box that was used on bz678303, that should reproduce the issue.
Takuma, could you try the patch here, https://bugzilla.redhat.com/attachment.cgi?id=493217&action=diff ? I think this is a duplicate of bug 696547. Thanks.
Created attachment 494095 [details] Proposed patch
(In reply to comment #11)
Takuma, please try the attached one.
*** This bug has been marked as a duplicate of bug 696547 ***