Please provide a pointer to the vmlinux/vmcore pair. Thanks, Dave
> Description of problem:
>
> This is regression from RHEL5.5.

Also, how did you determine that it is a regression from RHEL5.5? Did you run crash-4.1.2-4.el5 on the same vmlinux/vmcore pair?
I also encountered this bug.

Description of problem:

crash 4.1.2-6.el5
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

NOTE: stdin: not a tty

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

please wait... (gathering kmem slab cache data)
please wait... (gathering module symbol data)
please wait... (gathering task table data)
please wait... (determining panic task)

      KERNEL: /usr/lib/debug/lib/modules/2.6.18-231.el5/vmlinux
    DUMPFILE: /var/crash/2010-11-14-22:55:44/vmcore [PARTIAL DUMP]
        CPUS: 2
        DATE: Sun Nov 14 22:49:07 2010
      UPTIME: 00:06:02
LOAD AVERAGE: 0.93, 0.73, 0.36
       TASKS: 124
    NODENAME: dell-pe700-01.rhts.eng.bos.redhat.com
     RELEASE: 2.6.18-231.el5
     VERSION: #1 SMP Mon Nov 8 18:38:37 EST 2010
     MACHINE: i686 (3391 Mhz)
      MEMORY: 3.7 GB
       PANIC: "SysRq : Trigger a crashdump"
         PID: 8197
     COMMAND: "runtest.sh"
        TASK: f4b46550 [THREAD_INFO: f533f000]
         CPU: 0
       STATE: TASK_RUNNING (SYSRQ)

crash> foreach bt
PID: 0 TASK: c068e3c0 CPU: 0 COMMAND: "swapper"
 #0 [c0708f5c] schedule at c061f15a
 #1 [c0708fd4] cpu_idle at c0403d25

PID: 0 TASK: f7c8d550 CPU: 1 COMMAND: "swapper"
bt: cannot resolve stack trace:
 #0 [f7c91f0c] crash_nmi_callback at c041a3f0
 #1 [f7c91f58] do_nmi at c040683f
 #2 [f7c91f80] nmi at c0405b5d
    EAX: 00000000 EBX: 00000001 ECX: f7c91000 EDX: cae198c4 EBP: 00000000
    DS: 007b ESI: 00000000 ES: 007b EDI: 00000000
    CS: 0060 EIP: c0403d12 ERR: 00000000 EFLAGS: 00000246
 #3 [f7c91fb4] cpu_idle at c0403d12
...

Version-Release number of selected component (if applicable):

crash 4.1.2-6.el5
kexec-tools 1.102pre-118.el5.i386
kernel-2.6.18-231.el5.i686
RHEL5.6-Server-20101110.0

How reproducible:

Not sure. I'll try to reproduce it and save vmcore somewhere if possible.

The whole crash log can be found here:
https://beaker.engineering.redhat.com/logs/2010/11/310/31073/61516/686612///crash.vmcore.log
> The whole crash log can be found here:
> https://beaker.engineering.redhat.com/logs/2010/11/310/31073/61516/686612///crash.vmcore.log

Those "log" files are useless with respect to debugging backtrace issues.

Please -- *please* -- do not file backtrace-related bugzillas without saving the vmlinux/vmcore pairs that were used to generate the faulty backtrace output. Backtrace errors cannot be debugged without being able to investigate the actual stack contents in the vmcore, and any proposed fix cannot be verified without being able to test it with the original vmcore.
Created attachment 461167 [details]
x86_cpu_idle.patch

Patch to fix this "bt" resolution error:

crash 4.1.2-6.el5:

crash> bt
PID: 0 TASK: cd97faa0 CPU: 6 COMMAND: "swapper"
bt: cannot resolve stack trace:
 #0 [cd980f0c] crash_nmi_callback at c041a420
 #1 [cd980f58] do_nmi at c040682c
 #2 [cd980f80] nmi at c0405b5d
    EAX: ffffffff EBX: 00000006 ECX: c04031fb EDX: cd83b098 EBP: 00000000
    DS: 007b ESI: 00000000 ES: 007b EDI: 00000000
    CS: 0060 EIP: c0403cf0 ERR: ffffffff EFLAGS: 00000286
 #3 [cd980fb4] cpu_idle at c0403cf0
bt: text symbols on stack:
    [cd980f10] mwait_idle at c04031fb
    [cd980f34] cpu_idle at c0403cf0
    [cd980f58] do_nmi at c0406832
    [cd980f80] nmi_stack_correct at c0405b62
    [cd980f88] mwait_idle at c04031fb
    [cd980fac] cpu_idle at c0403cf0
bt: possible exception frames:
  KERNEL-MODE EXCEPTION FRAME AT cd980f0c:
    EAX: ffffffff EBX: 00000006 ECX: c04031fb EDX: cd83b098 EBP: 00000000
    DS: 007b ESI: 00000000 ES: 007b EDI: 00000000
    CS: 0060 EIP: c0403cf0 ERR: ffffffff EFLAGS: 00000286
  KERNEL-MODE EXCEPTION FRAME AT cd980f84:
    EAX: ffffffff EBX: 00000006 ECX: c04031fb EDX: cd83b098 EBP: 00000000
    DS: 007b ESI: 00000000 ES: 007b EDI: 00000000
    CS: 0060 EIP: c0403cf0 ERR: ffffffff EFLAGS: 00000286
crash>

With the patch applied:

crash> bt
PID: 0 TASK: cd97faa0 CPU: 6 COMMAND: "swapper"
 #0 [cd980f0c] crash_nmi_callback at c041a420
 #1 [cd980f58] do_nmi at c040682c
 #2 [cd980f80] nmi at c0405b5d
    EAX: ffffffff EBX: 00000006 ECX: c04031fb EDX: cd83b098 EBP: 00000000
    DS: 007b ESI: 00000000 ES: 007b EDI: 00000000
    CS: 0060 EIP: c0403cf0 ERR: ffffffff EFLAGS: 00000286
 #3 [cd980fb4] cpu_idle at c0403cf0
crash>
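When re-running "bt" or "foreach bt" against a saved vmlinux/vmcore pair to check a fix like this, it may help to scan the captured output for tasks that still fail to resolve. The following is only a rough sketch, not part of crash itself: it assumes the output has been saved as plain text and that the task header and error message look like the ones quoted in this report. The helper name `unresolved_tasks` is hypothetical.

```python
import re

# Task header as printed by "bt" in this report, e.g.
#   PID: 0 TASK: cd97faa0 CPU: 6 COMMAND: "swapper"
# (assumption: other crash versions may format this line differently)
TASK_HEADER = re.compile(
    r'PID:\s*(\d+)\s+TASK:\s*([0-9a-fA-F]+)\s+CPU:\s*(\d+)\s+COMMAND:\s*"([^"]*)"'
)

# Error text quoted in this report.
FAILURE = "bt: cannot resolve stack trace"


def unresolved_tasks(bt_output):
    """Return (pid, task, cpu, command) for every task whose header is
    followed by the failure message before the next task header."""
    failed = []
    headers = list(TASK_HEADER.finditer(bt_output))
    for i, match in enumerate(headers):
        # The text belonging to this task runs up to the next header.
        if i + 1 < len(headers):
            end = headers[i + 1].start()
        else:
            end = len(bt_output)
        if FAILURE in bt_output[match.end():end]:
            failed.append(
                (int(match.group(1)), match.group(2),
                 int(match.group(3)), match.group(4))
            )
    return failed
```

Run on the "foreach bt" excerpt quoted earlier in this report, this would flag only the swapper task on CPU 1 (TASK f7c8d550). An empty result on output from a patched crash suggests that no task hit this particular error. It says nothing about whether the frames themselves are correct, which still requires inspecting the stack in the vmcore.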
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: On an x86 architecture, the "bt" command may have occasionally failed to produce the backtrace of an NMI-interrupted idle task. This error has been fixed, and the correct output is now displayed as expected.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-0059.html