Bug 653288 - [REG][5.6]backtrace subcommand of crash command fails.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: crash
Version: 5.6
Hardware: i686
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Dave Anderson
QA Contact: Kernel Dump QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-11-15 08:22 UTC by Masayoshi Yamazaki
Modified: 2018-11-14 16:25 UTC

Fixed In Version: crash-4.1.2-7.el5
Doc Type: Bug Fix
Doc Text:
On an x86 architecture, the "bt" command may have occasionally failed to produce the backtrace of an NMI-interrupted idle task. This error has been fixed, and the correct output is now displayed as expected.
Clone Of:
Environment:
Last Closed: 2011-01-13 22:50:53 UTC
Target Upstream Version:
Embargoed:


Attachments
x86_cpu_idle.patch (480 bytes, patch)
2010-11-17 21:46 UTC, Dave Anderson


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0059 0 normal SHIPPED_LIVE crash bug fix update 2011-01-12 17:15:15 UTC

Comment 1 Dave Anderson 2010-11-15 13:34:56 UTC
Please provide a pointer to the vmlinux/vmcore pair.

Thanks,
  Dave

Comment 2 Dave Anderson 2010-11-15 14:45:15 UTC
> Description of problem:
> > This is regression from RHEL5.5. 

Also, how did you determine that it is a regression from RHEL5.5?  

Did you run crash-4.1.2-4.el5 on the same vmlinux/vmcore pair?

Comment 4 Han Pingtian 2010-11-16 09:43:58 UTC
I also encountered this bug.

Description of problem:
crash 4.1.2-6.el5
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
NOTE: stdin: not a tty

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...


please wait... (gathering kmem slab cache data)
                                                

please wait... (gathering module symbol data)
                                                

please wait... (gathering task table data)
                                                

please wait... (determining panic task)
                                                
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-231.el5/vmlinux
    DUMPFILE: /var/crash/2010-11-14-22:55:44/vmcore  [PARTIAL DUMP]
        CPUS: 2
        DATE: Sun Nov 14 22:49:07 2010
      UPTIME: 00:06:02
LOAD AVERAGE: 0.93, 0.73, 0.36
       TASKS: 124
    NODENAME: dell-pe700-01.rhts.eng.bos.redhat.com
     RELEASE: 2.6.18-231.el5
     VERSION: #1 SMP Mon Nov 8 18:38:37 EST 2010
     MACHINE: i686  (3391 Mhz)
      MEMORY: 3.7 GB
       PANIC: "SysRq : Trigger a crashdump"
         PID: 8197
     COMMAND: "runtest.sh"
        TASK: f4b46550  [THREAD_INFO: f533f000]
         CPU: 0
       STATE: TASK_RUNNING (SYSRQ)

crash> foreach bt
PID: 0      TASK: c068e3c0  CPU: 0   COMMAND: "swapper"
 #0 [c0708f5c] schedule at c061f15a
 #1 [c0708fd4] cpu_idle at c0403d25

PID: 0      TASK: f7c8d550  CPU: 1   COMMAND: "swapper"
bt: cannot resolve stack trace:
 #0 [f7c91f0c] crash_nmi_callback at c041a3f0
 #1 [f7c91f58] do_nmi at c040683f
 #2 [f7c91f80] nmi at c0405b5d
    EAX: 00000000  EBX: 00000001  ECX: f7c91000  EDX: cae198c4  EBP: 00000000 
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    CS:  0060      EIP: c0403d12  ERR: 00000000  EFLAGS: 00000246 
 #3 [f7c91fb4] cpu_idle at c0403d12
...
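
For anyone triaging long `foreach bt` logs like the one above, the following is a minimal sketch of how the failing tasks could be picked out automatically. It relies only on the two line shapes visible in this output (the `PID: ... TASK: ... CPU: ... COMMAND: ...` header and the `bt: cannot resolve stack trace:` message). The function name `find_unresolved_tasks` and the `SAMPLE` string are hypothetical helpers for illustration and are not part of crash.

```python
import re

# Header line that crash prints at the start of each task's backtrace, e.g.
# 'PID: 0      TASK: f7c8d550  CPU: 1   COMMAND: "swapper"'
TASK_HEADER = re.compile(
    r'^PID:\s*(?P<pid>\d+)\s+TASK:\s*(?P<task>[0-9a-f]+)\s+'
    r'CPU:\s*(?P<cpu>\d+)\s+COMMAND:\s*"(?P<command>[^"]*)"'
)
FAILURE_MARKER = "bt: cannot resolve stack trace:"


def find_unresolved_tasks(bt_output):
    """Return one dict per task whose backtrace could not be resolved.

    Hypothetical helper: it only scans saved text output and does not
    talk to crash itself.
    """
    failed = []
    current = None
    for line in bt_output.splitlines():
        stripped = line.strip()
        match = TASK_HEADER.match(stripped)
        if match:
            current = match.groupdict()
        elif stripped == FAILURE_MARKER and current is not None:
            failed.append(current)
            current = None  # report each task at most once
    return failed


# Excerpt of the session output shown in this comment.
SAMPLE = '''PID: 0      TASK: c068e3c0  CPU: 0   COMMAND: "swapper"
 #0 [c0708f5c] schedule at c061f15a
 #1 [c0708fd4] cpu_idle at c0403d25

PID: 0      TASK: f7c8d550  CPU: 1   COMMAND: "swapper"
bt: cannot resolve stack trace:
 #0 [f7c91f0c] crash_nmi_callback at c041a3f0
 #1 [f7c91f58] do_nmi at c040683f
'''

if __name__ == "__main__":
    for task in find_unresolved_tasks(SAMPLE):
        print("unresolved: task %(task)s on CPU %(cpu)s (%(command)s)" % task)
```

A scan like this only locates the affected tasks in a log. It does not replace the vmlinux/vmcore pair, which is still needed to inspect the actual stack contents.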


Version-Release number of selected component (if applicable):
crash 4.1.2-6.el5
kexec-tools 1.102pre-118.el5.i386
kernel-2.6.18-231.el5.i686
RHEL5.6-Server-20101110.0


How reproducible:
Not sure. I'll try to reproduce it and save the vmcore somewhere
if possible.

The whole crash log can be found here:
https://beaker.engineering.redhat.com/logs/2010/11/310/31073/61516/686612///crash.vmcore.log

Comment 5 Dave Anderson 2010-11-16 13:57:44 UTC
> The whole crash log can be found here:
> https://beaker.engineering.redhat.com/logs/2010/11/310/31073/61516/686612///crash.vmcore.log

Those "log" files are useless with respect to debugging backtrace issues.

Please -- *please* -- do not file backtrace-related bugzillas without saving
the vmlinux/vmcore pairs that were used to generate the faulty backtrace output.
Backtrace errors cannot be debugged without being able to investigate
the actual stack contents in the vmcore, and any proposed fix cannot
be verified without being able to test it with the original vmcore.

Comment 9 Dave Anderson 2010-11-17 21:46:05 UTC
Created attachment 461167
x86_cpu_idle.patch


Patch to fix this "bt" resolution error:

crash 4.1.2-6.el5:

crash> bt
PID: 0      TASK: cd97faa0  CPU: 6   COMMAND: "swapper"
bt: cannot resolve stack trace:
 #0 [cd980f0c] crash_nmi_callback at c041a420
 #1 [cd980f58] do_nmi at c040682c
 #2 [cd980f80] nmi at c0405b5d
    EAX: ffffffff  EBX: 00000006  ECX: c04031fb  EDX: cd83b098  EBP: 00000000 
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    CS:  0060      EIP: c0403cf0  ERR: ffffffff  EFLAGS: 00000286 
 #3 [cd980fb4] cpu_idle at c0403cf0
bt: text symbols on stack:
    [cd980f10] mwait_idle at c04031fb
    [cd980f34] cpu_idle at c0403cf0
    [cd980f58] do_nmi at c0406832
    [cd980f80] nmi_stack_correct at c0405b62
    [cd980f88] mwait_idle at c04031fb
    [cd980fac] cpu_idle at c0403cf0
bt: possible exception frames:
  KERNEL-MODE EXCEPTION FRAME AT cd980f0c:
    EAX: ffffffff  EBX: 00000006  ECX: c04031fb  EDX: cd83b098  EBP: 00000000 
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    CS:  0060      EIP: c0403cf0  ERR: ffffffff  EFLAGS: 00000286 
  KERNEL-MODE EXCEPTION FRAME AT cd980f84:
    EAX: ffffffff  EBX: 00000006  ECX: c04031fb  EDX: cd83b098  EBP: 00000000 
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    CS:  0060      EIP: c0403cf0  ERR: ffffffff  EFLAGS: 00000286 
crash>

With the patch applied:

crash> bt
PID: 0      TASK: cd97faa0  CPU: 6   COMMAND: "swapper"
 #0 [cd980f0c] crash_nmi_callback at c041a420
 #1 [cd980f58] do_nmi at c040682c
 #2 [cd980f80] nmi at c0405b5d
    EAX: ffffffff  EBX: 00000006  ECX: c04031fb  EDX: cd83b098  EBP: 00000000 
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    CS:  0060      EIP: c0403cf0  ERR: ffffffff  EFLAGS: 00000286 
 #3 [cd980fb4] cpu_idle at c0403cf0
crash>
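
The two outputs above list the same four frames; what changes with the patch is that the "cannot resolve stack trace" error and the fallback listings are no longer printed. As a rough sketch of how that could be checked mechanically against saved output, the snippet below parses the `#N [stack address] symbol at text address` lines and compares the two lists. The names `parse_frames`, `BEFORE`, and `AFTER` are hypothetical and introduced only for this illustration.

```python
import re

# A frame line looks like ' #3 [cd980fb4] cpu_idle at c0403cf0'.
# Register dumps and the '[addr] symbol at addr' lines under
# 'bt: text symbols on stack:' carry no '#N' prefix, so they do not match.
FRAME = re.compile(r'^\s*#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)')


def parse_frames(bt_output):
    """Return (level, stack_address, symbol, text_address) per frame line."""
    frames = []
    for line in bt_output.splitlines():
        match = FRAME.match(line)
        if match:
            level, stack_addr, symbol, text_addr = match.groups()
            frames.append((int(level), stack_addr, symbol, text_addr))
    return frames


# Excerpts of the two sessions shown in this comment.
BEFORE = '''PID: 0      TASK: cd97faa0  CPU: 6   COMMAND: "swapper"
bt: cannot resolve stack trace:
 #0 [cd980f0c] crash_nmi_callback at c041a420
 #1 [cd980f58] do_nmi at c040682c
 #2 [cd980f80] nmi at c0405b5d
    CS:  0060      EIP: c0403cf0  ERR: ffffffff  EFLAGS: 00000286
 #3 [cd980fb4] cpu_idle at c0403cf0
bt: text symbols on stack:
    [cd980f10] mwait_idle at c04031fb
    [cd980f34] cpu_idle at c0403cf0
'''

AFTER = '''PID: 0      TASK: cd97faa0  CPU: 6   COMMAND: "swapper"
 #0 [cd980f0c] crash_nmi_callback at c041a420
 #1 [cd980f58] do_nmi at c040682c
 #2 [cd980f80] nmi at c0405b5d
    CS:  0060      EIP: c0403cf0  ERR: ffffffff  EFLAGS: 00000286
 #3 [cd980fb4] cpu_idle at c0403cf0
'''

if __name__ == "__main__":
    same_frames = parse_frames(BEFORE) == parse_frames(AFTER)
    error_gone = "cannot resolve stack trace" not in AFTER
    print("frames identical:", same_frames)
    print("error message gone:", error_gone)
```

This compares text output only and makes no claim about how the patch itself works; verifying the fix still requires running the patched crash against the original vmcore.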

Comment 13 Jaromir Hradilek 2010-12-01 19:06:17 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
On an x86 architecture, the "bt" command may have occasionally failed to produce the backtrace of an NMI-interrupted idle task. This error has been fixed, and the correct output is now displayed as expected.

Comment 15 errata-xmlrpc 2011-01-13 22:50:53 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0059.html

