Bug 449111

Summary: [RHEL5.2] makedumpfile corrupts vmcore on ia64: crash's bt fails to unwind
Product: Red Hat Enterprise Linux 5
Component: kexec-tools
Version: 5.2
Hardware: ia64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: urgent
Reporter: Kiyoshi Ueda <kueda>
Assignee: Neil Horman <nhorman>
CC: anderson, coughlan, cward, ishida-sxc, junichi.nomura, kueda, m-ikeda, mikeda, nhorman, oomichi, qcai, tachibana, tao, tatsu-ab1
Target Milestone: rc
Target Release: ---
Keywords: ZStream
Doc Type: Bug Fix
Last Closed: 2009-01-20 20:58:21 UTC
Bug Blocks: 457233
Attachments:
    o Command results of filesize comparison and crash 'bt -a' (flags: none)
    o Add the check of overlapping load segment. (flags: none)
    o results of 'objdump -x' against the original vmcore before filtering by makedumpfile (flags: none)

Description Kiyoshi Ueda 2008-05-30 14:03:42 UTC
Description of problem:
makedumpfile corrupts vmcore on ia64.
As a result, crash's bt sub-command fails to unwind.

This problem doesn't happen in kexec-tools-1.101-194.4 of RHEL5.1.
So this problem is a regression from RHEL5.1.

This problem doesn't happen when the -E option is specified to makedumpfile.
All other option combinations, including no options at all, cause the
problem.  For example, each of the following /etc/kdump.conf lines triggers it:
    o core_collector makedumpfile -c -d 15
    o core_collector makedumpfile -c
    o core_collector makedumpfile -d 15
    o core_collector makedumpfile


Version-Release number of selected component:
kexec-tools-1.102pre-21 (and also kexec-tools-1.102pre-23)
kernel-2.6.18-92.el5
crash-4.0-5.0.3


How reproducible:
Always


Steps to Reproduce:
 1. Specify makedumpfile as a core_collector in /etc/kdump.conf
    (NOTE: Don't use -E option)
       core_collector makedumpfile
 2. Collect dumpfile
 3. Run crash command against the collected dumpfile
 4. Run bt sub-command of crash


Actual results:
bt sub-command fails.
-----------------------------------------------------------------------
crash> bt
PID: 9513   TASK: e0000001254a8000  CPU: 1   COMMAND: "bash"
 #0 [BSP:e0000001254a93e8] machine_kexec at a000000100059760
 #1 [BSP:e0000001254a93c8] crash_kexec at a0000001000ca690
 #2 [BSP:e0000001254a93a0] sysrq_handle_crashdump at a0000001003b3900
 #3 [BSP:e0000001254a9350] __handle_sysrq at a0000001003b3140
 #4 [BSP:e0000001254a9320] write_sysrq_trigger at a0000001001f2850
 #5 [BSP:e0000001254a92d0] vfs_write at a0000001001644e0
 #6 [BSP:e0000001254a9258] sys_write at a000000100165030
 #7 [BSP:e0000001254a9258] __ia64_trace_syscall at a00000010000bdb0
bt: unwind: failed to locate return link (ip=0xa00000010000bdb0)!
crash>
-----------------------------------------------------------------------


Expected results:
bt shows register information and full backtrace.
-----------------------------------------------------------------------
crash> bt
PID: 11383  TASK: e0000001301e8000  CPU: 15  COMMAND: "bash"
 #0 [BSP:e0000001301e93e8] machine_kexec at a000000100059760
 #1 [BSP:e0000001301e93c8] crash_kexec at a0000001000ca690
 #2 [BSP:e0000001301e93a0] sysrq_handle_crashdump at a0000001003b3900
 #3 [BSP:e0000001301e9350] __handle_sysrq at a0000001003b3140
 #4 [BSP:e0000001301e9320] write_sysrq_trigger at a0000001001f2850
 #5 [BSP:e0000001301e92d0] vfs_write at a0000001001644e0
 #6 [BSP:e0000001301e9258] sys_write at a000000100165030
 #7 [BSP:e0000001301e9258] __ia64_trace_syscall at a00000010000bdb0
  EFRAME: e0000001301efe40
      B0: 20000000001464a0      CR_IIP: a000000000010620
 CR_IPSR: 0000121308526010      CR_IFS: 0000000000000008
  AR_PFS: c000000000000008      AR_RSC: 000000000000000f
 AR_UNAT: 0000000000000000     AR_RNAT: 0000000000000000
  AR_CCV: 0000000000000000     AR_FPSR: 0009804c8a70033f
  LOADRS: 0000000001b80000 AR_BSPSTORE: 600007ffffda42d0
      B6: 200000000020e780          B7: a000000000010640
      PR: 0000000000590a41          R1: 2000000000280238
      R2: e0000001301efee0          R3: e0000001301efef8
      R8: 0000000000000001          R9: 0000000000000004
     R10: 0000000000000000         R11: c000000000000512
     R12: 60000fffffd9f9a0         R13: 2000000000304e00
     R14: 2000000003c18000         R15: 0000000000000403
     R16: 00000000fbad2a84         R17: 20000000002fdda0
     R18: 2000000003c18000         R19: 2000000003c1c000
     R20: 0009804c8a70033f         R21: 200000000012a8a0
     R22: 0000000000000000         R23: 600007ffffda4430
     R24: 0000000000000000         R25: 0000000000000000
     R26: c000000000000006         R27: 000000000000000f
     R28: a000000000010620         R29: 0000121308526010
     R30: 0000000000000006         R31: 00000000005a0a41
      F6: 000000000000000000000     F7: 000000000000000000000
      F8: 0ffff8000000000000000     F9: 1003effffffffffffc000
     F10: 1003e0000000000000001    F11: 0fff0fffffffff0000000
 #8 [BSP:e0000001301e9258] __kernel_syscall_via_break at a000000000010620
-----------------------------------------------------------------------


Additional info:

Comment 1 Neil Horman 2008-05-30 15:52:11 UTC
Have you tried this without makedumpfile?  I'd like to be sure that it only
happens with makedumpfile, and is not a more systemic problem with kexec.  Thanks!

Comment 2 Kiyoshi Ueda 2008-05-30 15:59:39 UTC
No problem without makedumpfile.  So it's a makedumpfile problem, I think.

Comment 3 Dave Anderson 2008-05-30 16:10:49 UTC
Has it been ascertained that it's due to corruption -- or could it be
caused by a missing page?


Comment 4 Neil Horman 2008-05-30 16:25:32 UTC
+1 to Dave's comment.  The use of makedumpfile without options is particularly
interesting, given that that should make makedumpfile act effectively as a
no-op.  Can you take the vmcore you created in comment #2 and run it through
makedumpfile by hand without any -c, -d or -E option?  Compare the sizes of the
two; I would expect them to be identical.  If not, that would seem to suggest
that makedumpfile is removing something that it shouldn't.

Comment 5 Dave Anderson 2008-05-30 16:38:14 UTC
I forget (or rather never knew) -- what is the difference between:

  (1) makedumpfile
  (2) makedumpfile -E

(i.e., without any other arguments in both cases)?

And in the case of the "corrupt" vmcore, does the "bt" fail to show
the register set for *all* tasks?  For just the active tasks?  Or just
the panic task?



Comment 6 Kiyoshi Ueda 2008-05-30 21:07:14 UTC
Created attachment 307237 [details]
Command results of filesize comparison and crash 'bt -a'

Re: Comment#3
Probably it's due to corruption, but not sure yet.
It's still under investigation in NEC.


Re: Comment#4 and Comment#5
  o The sizes of the following vmcores are different.
      1) vmcore without using makedumpfile       : 7965380204 bytes
      2) vmcore (1) filtered by 'makedumpfile'   : 7970694360 bytes
      3) vmcore (1) filtered by 'makedumpfile -E': 7965392492 bytes

  o "bt" fails to show the register set for *all* tasks,
    when the vmcore is corrupted.

Please see the attached file for the actual results.
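
The size differences between the three files listed above can be worked out with a short script. This is only an arithmetic sketch over the byte counts reported in this comment; the dictionary labels are chosen here for illustration and are not tool output.

```python
# Byte counts copied from the list in this comment; the labels are illustrative.
sizes = {
    "no makedumpfile": 7965380204,
    "makedumpfile": 7970694360,
    "makedumpfile -E": 7965392492,
}

baseline = sizes["no makedumpfile"]

# Difference of each file relative to the unfiltered vmcore.
deltas = {name: size - baseline for name, size in sizes.items()}

for name, delta in deltas.items():
    print(f"{name:>16}: {sizes[name]} bytes ({delta:+d} vs. unfiltered)")
```

Both filtered files come out larger than the unfiltered one, so neither run is a byte-for-byte copy of the input.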

Comment 7 Ken'ichi Ohmichi 2008-06-03 02:20:31 UTC
Created attachment 308179 [details]
Add the check of overlapping load segment.

Comment 8 Ken'ichi Ohmichi 2008-06-03 02:21:55 UTC
Hi,

I investigated this problem, and I found that makedumpfile cannot output
valid data around an overlapping PT_LOAD area.

The following data is from my problematic /proc/vmcore.
Paddr [0x4000000 - 0x4638ce0] is overlapping.
PT_LOAD(1): Paddr [0x4000000 - 0x4638ce0]
PT_LOAD(2): Paddr [0x4000000 - 0x4db3000]

The crash utility (4.0-6.3) gets an invalid UNW_LENGTH(hdr) (== 0) in
build_script(), so unw_decode() is never run.
In my test environment, the invalid UNW_LENGTH(hdr) is obtained when reading
the physical address 0x463b058. This address is contained in PT_LOAD(2), but
makedumpfile outputs the dump data around PT_LOAD(1) to the dumpfile.
makedumpfile outputs dump data page by page without checking whether
consecutive pages belong to the same PT_LOAD.
The attached patch adds the check logic and fixes this problem.
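
The overlap described above can be illustrated with a minimal sketch. This is not the attached patch and not makedumpfile code; `Segment`, `overlap` and `owners` are hypothetical helpers, and the end addresses are treated as exclusive, which is an assumption about how the ranges above are written.

```python
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    name: str
    start: int  # first physical address of the segment
    end: int    # end physical address (ASSUMPTION: treated as exclusive)


def overlap(a: Segment, b: Segment) -> Optional[Tuple[int, int]]:
    """Return the physical range shared by both segments, or None."""
    lo = max(a.start, b.start)
    hi = min(a.end, b.end)
    return (lo, hi) if lo < hi else None


def owners(paddr: int, segments: List[Segment]) -> List[str]:
    """Return the names of all segments that contain paddr."""
    return [s.name for s in segments if s.start <= paddr < s.end]


# Ranges taken from this comment.
segments = [
    Segment("PT_LOAD(1)", 0x4000000, 0x4638ce0),
    Segment("PT_LOAD(2)", 0x4000000, 0x4db3000),
]

shared = overlap(segments[0], segments[1])
print("overlap:", [hex(x) for x in shared])
print("0x463b058 is in:", owners(0x463b058, segments))
```

The address 0x463b058 lies just past the end of PT_LOAD(1), so only PT_LOAD(2) can supply valid data for it.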



Comment 9 Dave Anderson 2008-06-03 14:07:00 UTC
> The following data is my problematic /proc/vmcore.
> Paddr [0x4000000 - 0x4638ce0] is overlapping.
> PT_LOAD(1): Paddr [0x4000000 - 0x4638ce0]
> PT_LOAD(2): Paddr [0x4000000 - 0x4db3000]

Just to clarify, when you say, "problematic /proc/vmcore" you mean
the problematic vmcore that makedumpfile created, correct?

In other words, the original /proc/vmcore did not have any
overlapping PT_LOAD segments, correct?

 

Comment 10 Neil Horman 2008-06-03 15:35:51 UTC
The patch looks good to me.  Dave judging by the patch, the answer to your
question is 'yes'.  The patch makes changes to the segments that are written to
disk, rather than the segments that are read in from /proc/vmcore itself.  

Comment 11 RHEL Program Management 2008-06-03 15:51:29 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 12 Dave Anderson 2008-06-03 15:53:48 UTC
> The patch looks good to me.  Dave judging by the patch, the answer to your
> question is 'yes'.  The patch makes changes to the segments that are written
> to disk, rather than the segments that are read in from /proc/vmcore itself.  

Right, I understand, and I'm not disputing the patch.

But given that the "makedumpfile with no options" should pretty much emulate
the same PT_LOAD segments that are registered in the original /proc/vmcore,
I'm wondering how the overlap/confusion would arise to begin with?

Could it have something to do with a p_memsz/p_filesz in the original
vmcore's PT_LOAD segments not being page-aligned values?  Where does the
0x4638ce0 come from?  (more specifically, the "ce0")


Comment 13 Kiyoshi Ueda 2008-06-03 16:51:28 UTC
Created attachment 308264 [details]
results of 'objdump -x' against the original vmcore before filtering by makedumpfile

Re: Comment#9
Although Ken'ichi may give a better answer, I think the answer is 'no'.
The original vmcore before filtering by makedumpfile has the overlapping
segments.

See the attached file for details.

Comment 14 Neil Horman 2008-06-03 17:20:30 UTC
I'm sorry, but I don't see any overlapping segments in that output (but I may
just be glossing over them).  Can you point it out to me?


Comment 15 Dave Anderson 2008-06-03 17:51:18 UTC
Thanks Kiyoshi -- now I understand.  The ia64 vmcore contains a separate
PT_LOAD segment for the mapped kernel text and static data in region 5:
 
 vaddr: 0xa000000100000000 paddr: 0x0000000004000000 memsz: 0x0000000000638ce0

and there's also this overlapping unity-mapped segment here in region 7:
   
 vaddr: 0xe000000004000000 paddr: 0x0000000004000000 memsz: 0x0000000000db3000

I forgot about that.  And that's also where the "ce0" comes into play, as
it is associated with the "end" symbol in the mapped kernel segment.  It
seems like that region 5 section should probably be rounded up to a page
boundary, but there may be some compelling reason that it's not.
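
As a rough sketch of the arithmetic behind the "ce0", the segment end addresses follow from paddr + memsz, and the region 5 end can be rounded up to a page boundary. The page size is an ASSUMPTION (16 KiB); it is not stated anywhere in this report, and the helper names are hypothetical.

```python
# ASSUMPTION: 16 KiB pages. The page size is not given in this report.
PAGE_SIZE = 16 * 1024


def seg_end(paddr: int, memsz: int) -> int:
    """End physical address of a segment (one past its last byte)."""
    return paddr + memsz


def round_up(addr: int, align: int) -> int:
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def page_base(addr: int, align: int) -> int:
    """Start address of the page that contains addr."""
    return addr & ~(align - 1)


# paddr/memsz values from the two PT_LOAD segments quoted above.
region5_end = seg_end(0x4000000, 0x638ce0)
region7_end = seg_end(0x4000000, 0xdb3000)

print("region 5 end:", hex(region5_end))
print("region 7 end:", hex(region7_end))
print("offset into page:", hex(region5_end % PAGE_SIZE))
print("rounded up:", hex(round_up(region5_end, PAGE_SIZE)))

# Address from Comment 8 where crash read the invalid unwind header.
bad_paddr = 0x463b058
same_page = page_base(bad_paddr, PAGE_SIZE) == page_base(region5_end, PAGE_SIZE)
print("bad address shares a page with the region 5 end:", same_page)
```

Under the assumed page size, the address from Comment 8 falls in the same page in which the region 5 segment ends. With a different page size the result may differ, so treat this as an illustration only.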
 

Comment 16 Kiyoshi Ueda 2008-06-03 18:37:23 UTC
Dave pointed out the overlapping segments in Comment#15
for the question in Comment#14.
Back to ASSIGNED.

Comment 17 Neil Horman 2008-06-03 18:42:17 UTC
yep, I see them now, thanks for that!  I'll check this in as soon as pm acks it.

Comment 19 Neil Horman 2008-06-23 13:58:47 UTC
fixed in 1.102pre-25.el5.

Comment 28 Ken'ichi Ohmichi 2008-07-31 02:21:25 UTC
I tested 1.102pre-29.el5 and confirmed this problem is solved.

On ia64 RHEL5.2GA, subcommand 'bt' works fine like the following:

crash> bt
PID: 5744   TASK: e00000017b080000  CPU: 0   COMMAND: "bash"
 #0 [BSP:e00000017b0813e8] machine_kexec at a000000100059760
 #1 [BSP:e00000017b0813c8] crash_kexec at a0000001000ca690
 #2 [BSP:e00000017b0813a0] sysrq_handle_crashdump at a0000001003b3900
 #3 [BSP:e00000017b081350] __handle_sysrq at a0000001003b3140
 #4 [BSP:e00000017b081320] write_sysrq_trigger at a0000001001f2850
 #5 [BSP:e00000017b0812d0] vfs_write at a0000001001644e0
 #6 [BSP:e00000017b081258] sys_write at a000000100165030
 #7 [BSP:e00000017b081258] __ia64_trace_syscall at a00000010000bdb0
  EFRAME: e00000017b087e40
      B0: 20000000001564a0      CR_IIP: a000000000010620
 CR_IPSR: 00001213085a6010      CR_IFS: 0000000000000008
  AR_PFS: c000000000000008      AR_RSC: 000000000000000f
 AR_UNAT: 0000000000000000     AR_RNAT: 0000000000000000
  AR_CCV: 0000000000000000     AR_FPSR: 0009804c8a70033f
  LOADRS: 0000000001b80000 AR_BSPSTORE: 600007ffffa542d0
      B6: 200000000021e780          B7: a000000000010640
      PR: 0000000000590a41          R1: 2000000000290238
      R2: 60000fffffa4f9a0          R3: 60000fffffa4f9b0
      R8: 0000000000000001          R9: 0000000000000004
     R10: 0000000000000000         R11: c000000000000512
     R12: 60000fffffa4f9b0         R13: 2000000000314e00
     R14: 00000000000001f5         R15: 0000000000000403
     R16: 60000fffffa4f890         R17: 400000000000ec76
     R18: 400000000000ebb0         R19: 20000000003103f8
     R20: 6000000000021260         R21: 0000000000000030
     R22: 2000000000310410         R23: 6000000000021238
     R24: 0000000000000000         R25: 0000000000000000
     R26: c000000000000004         R27: 000000000000000f
     R28: a000000000010620         R29: 00001213085a6010
     R30: 0000000000000004         R31: 00000000005a0a41
      F6: 000000000000000000000     F7: 000000000000000000000
      F8: 000000000000000000000     F9: 000000000000000000000
     F10: 000000000000000000000    F11: 000000000000000000000
 #8 [BSP:e00000017b081258] __kernel_syscall_via_break at a000000000010620
crash>

Thank you for merging the patch.


Comment 33 Chris Ward 2008-11-14 14:04:11 UTC
~~~ Attention Partners! ~~~

Please test this URGENT / HIGH priority bug at your earliest convenience to ensure it makes it into the upcoming RHEL 5.3 release. The fix should be present in the Partner Snapshot #2 (kernel*-122), available NOW at ftp://partners.redhat.com. As we are approaching the end of the RHEL 5.3 test cycle, it is critical that you report back testing results as soon as possible. 

If you have VERIFIED the fix, please add PartnerVerified to the Bugzilla Keywords field to indicate this. If you find that this issue has not been properly fixed, set the bug status to ASSIGNED with a comment describing the issues you encountered.

All NEW issues encountered (not part of this bug fix) should have a new bug created with the proper keywords and flags set to trigger a review for their inclusion in the upcoming RHEL 5.3 or other future release. Post a link in this bugzilla pointing to the new issue to ensure it is not overlooked.

For any additional questions, speak with your Partner Manager.

Comment 34 Chris Ward 2008-11-18 18:12:53 UTC
~~ Snapshot 3 is now available ~~ 

Snapshot 3 is now available for Partner Testing, which should contain a fix that resolves this bug. ISO's available as usual at ftp://partners.redhat.com. Your testing feedback is vital! Please let us know if you encounter any NEW issues (file a new bug) or if you have VERIFIED the fix is present and functioning as expected (add PartnerVerified Keyword).

Ping your Partner Manager with any additional questions. Thanks!

Comment 37 Shinichi Ishida 2008-11-28 08:12:54 UTC
NEC confirmed that this problem was resolved on RHEL5.3 Snapshot2.

Comment 39 errata-xmlrpc 2009-01-20 20:58:21 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0105.html