Bug 572605 - [5.5] backtrace subcommand of crash command fails.
Summary: [5.5] backtrace subcommand of crash command fails.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: crash
Version: 5.5
Hardware: i386
OS: Linux
urgent
urgent
Target Milestone: rc
: ---
Assignee: Dave Anderson
QA Contact: Han Pingtian
URL:
Whiteboard:
Depends On:
Blocks: 588337 601089
TreeView+ depends on / blocked
 
Reported: 2010-03-11 16:54 UTC by Issue Tracker
Modified: 2018-10-27 14:19 UTC (History)
4 users (show)

Fixed In Version: crash-4.1.2-5.el5
Doc Type: Bug Fix
Doc Text:
Prior to this update, running the "bt" command on an x86 architecture may have failed with the following error message: bt: cannot resolve stack trace This occurred when a task that received a non-maskable interrupt (NMI) entered the kernel from user space, but had not called the requested system call function yet, or when the copy_thread() function set the Extended Instruction Pointer (EIP) value of a newly forked task to the "ret_from_fork" entry point. This has been fixed, and the "bt" command now works as expected.
Clone Of:
: 588337 (view as bug list)
Environment:
Last Closed: 2011-01-13 22:50:09 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:0059 0 normal SHIPPED_LIVE crash bug fix update 2011-01-12 17:15:15 UTC

Comment 2 Issue Tracker 2010-03-11 16:54:57 UTC
Event posted on 03-11-2010 10:53am CST by streeter

I have confirmed that the backtrace problem still occurs in the upstream
crash version.

crash> set 3240
    PID: 3240
COMMAND: "udevd"
   TASK: f7f56000  [THREAD_INFO: f0de8000]
    CPU: 1
  STATE: TASK_RUNNING (ACTIVE)
crash> bt
PID: 3240   TASK: f7f56000  CPU: 1   COMMAND: "udevd"
bt: cannot resolve stack trace:
 #0 [f0de8f34] crash_nmi_callback at c041a420
 #1 [f0de8f80] do_nmi at c04067f8
 #2 [f0de8fa8] nmi at c0405b29
    EAX: 0013bfb0  EBX: 00bc4fc0  ECX: 004e6000  EDX: 00870188  EBP:
bfb0d200 
    DS:  007b      ESI: b7f2f548  ES:  007b      EDI: b7f2f000
    CS:  0060      EIP: c061d0e4  ERR: 0013bfb0  EFLAGS: 00000046 
 #3 [f0de8fdc] page_fault at c061d0e4
bt: text symbols on stack:
    [f0de8f5c] page_fault at c061d0e4
    [f0de8f80] do_nmi at c04067fe
    [f0de8fa8] nmi_stack_correct at c0405b2e
    [f0de8fd4] page_fault at c061d0e4
bt: possible exception frames:
  KERNEL-MODE EXCEPTION FRAME AT f0de8f34:
    EAX: 0013bfb0  EBX: 00bc4fc0  ECX: 004e6000  EDX: 00870188  EBP:
bfb0d200 
    DS:  007b      ESI: b7f2f548  ES:  007b      EDI: b7f2f000
    CS:  0060      EIP: c061d0e4  ERR: 0013bfb0  EFLAGS: 00000046 
  KERNEL-MODE EXCEPTION FRAME AT f0de8fac:
    EAX: 0013bfb0  EBX: 00bc4fc0  ECX: 004e6000  EDX: 00870188  EBP:
bfb0d200 
    DS:  007b      ESI: b7f2f548  ES:  007b      EDI: b7f2f000
    CS:  0060      EIP: c061d0e4  ERR: 0013bfb0  EFLAGS: 00000046 



This event sent from IssueTracker by streeter  [SEG - Kernel]
 issue 597893

Comment 4 Dave Anderson 2010-03-11 20:25:53 UTC
Guy,

These are kind of interesting -- in that I've never seen an NMI "hit" just
after the task entered kernel space.  In the case above, the user task had
just taken a user-space virtual address page fault, and entered the kernel at
the "page_fault" interrupt gate entry point.  There's not much to it, as
it just pushes the handler function address of do_page_fault(() and then
jumps to "error_code":

  crash> dis page_fault
  0xc061d0e4 <page_fault>:        push   $0xc061dd74
  0xc061d0e9 <page_fault+5>:      jmp    0xc0405a50
  crash> sym 0xc061dd74 0xc0405a50
  c061dd74 (T) do_page_fault  
  c0405a50 (t) error_code  ../arch/i386/kernel/entry.S
  crash>

and in the "error_code" assembly code, the exception frame pt_regs
structure gets laid down at the very top of the kernel stack.
Note that the kernel stack address is f0de8fdc, which is at
the very top of the stack, and so the exception frame is not
in place yet.  That being the case, the "possible exception
frames" listed do not include a "USER-MODE EXCEPTION FRAME".
 
So there's nothing to display, i.e., the "correct" backtrace
would be essentially:

PID: 3240   TASK: f7f56000  CPU: 1   COMMAND: "udevd"
 #0 [f0de8f34] crash_nmi_callback at c041a420
 #1 [f0de8f80] do_nmi at c04067f8
 #2 [f0de8fa8] nmi at c0405b29
    EAX: 0013bfb0  EBX: 00bc4fc0  ECX: 004e6000  EDX: 00870188  EBP: bfb0d200 
    DS:  007b      ESI: b7f2f548  ES:  007b      EDI: b7f2f000
    CS:  0060      EIP: c061d0e4  ERR: 0013bfb0  EFLAGS: 00000046 
 #3 [f0de8fdc] page_fault at c061d0e4

Now, the other example shown in the IT is similar-but-different,
but it did get a litter farther along in kernel space, and the
system_call() function was able to get the exception frame laid
down.  The stack address is at f6c26fb8, and there is a "USER-MODE
EXCEPTION FRAME" found just above it at f6c26fbc:

...
#3 [f6c26fb8] system_call at c0404f3b
...
USER-MODE EXCEPTION FRAME AT f6c26fbc:
   EAX: 007c0a00  EBX: 0089bff4  ECX: bfc080a0  EDX: 0000002f
   DS:  007b      ESI: 0089e044  ES:  007b      EDI: 00693000
   SS:  007b      ESP: bfc0808c  EBP: bfc080c8
   CS:  0073      EIP: 007c0a00  ERR: ffffffff  EFLAGS: 00010206 

So in that case, the "correct" backtrace would be:

PID: 3186   TASK: f6cb4550  CPU: 0   COMMAND: "udevd"
#0 [f6c26f10] crash_nmi_callback at c041a420
#1 [f6c26f5c] do_nmi at c04067f8
#2 [f6c26f84] nmi at c0405b29
   EAX: 00000003  EBX: 0089bff4  ECX: 00000000  EDX: 00000003  EBP: f6c26000
   DS:  007b      ESI: 0089e044  ES:  007b      EDI: c061dd74
   CS:  0060      EIP: c0404f3b  ERR: 00000003  EFLAGS: 00000006
#3 [f6c26fb8] system_call at c0404f3b
   EAX: 007c0a00  EBX: 0089bff4  ECX: bfc080a0  EDX: 0000002f
   DS:  007b      ESI: 0089e044  ES:  007b      EDI: 00693000
   SS:  007b      ESP: bfc0808c  EBP: bfc080c8
   CS:  0073      EIP: 007c0a00  ERR: ffffffff  EFLAGS: 00010206 

In any case, the data is all there, it's just the x86 backtrace code
was written to never expect a backtrace to *begin* at a kernel entry
point -- it failed trying to find out "who called" page_fault() or
system_call().     

Anyway, Guy, can you ask Fujitsu if they have saved the example
"system_call" vmcore that was shown in the IT?  Or any other vmcore
that finds a USER-MODE "possible exception frame"?

Thanks,
  Dave

Comment 6 Dave Anderson 2010-03-11 22:13:51 UTC
Thanks, Guy -- I've got a fix for the "page_fault" case where there is no
user-mode exception frame at all.  

But it's not clear to me why their "system_call" example -- where there actually
was a user-mode exception frame laid down -- would fail?  

So I really need one of those failure-type vmcores -- so if they don't have
one available, I'd really appreciate it if they could reproduce it again!

Comment 7 Dave Anderson 2010-03-12 18:58:31 UTC
> Thanks, Guy -- I've got a fix for the "page_fault" case where there is no
> user-mode exception frame at all.

I've created a patched RHEL5 version of crash here:

  http://people.redhat.com/anderson/BZ_572605

  crash-4.1.2-4.btfix1.i386.rpm
  crash-4.1.2-4.btfix1.src.rpm
  crash-debuginfo-4.1.2-4.btfix1.i386.rpm

The patched version above *only* addresses the scenario seen in the
"page_fault" vmcore, where the NMI was taken before the full user-mode
pt_regs exception frame was laid down on the kernel stack.

However, the patched version does *not* address the other scenario
that was shown in the vmcore output in the Issue Tracker description,
where the NMI took place in "system_call", but there was a USER-MODE
exception frame on the kernel stack.

If Fujitsu does not have that original vmcore available (or one 
similar to it), then they should re-test with the patched version
until the second type of failure re-occurs.

Comment 9 Dave Anderson 2010-03-29 13:02:46 UTC
> I confirmed this problem was fixed with the test package.

No, you have not confirmed that the problem was fixed.

The example that you have shown is similar to the "page_fault"
vmcore, where the NMI interrupt was taken before the full
pt_regs backtrace was written to the kernel stack.

Note that the stack frame address of the highest kernel
function (system_call) is at f5b36fd0:

> The following are the test-package(crash-4.1.2-4.btfix1.i386.rpm) log:
> ------------------------
> crash> bt -a
> PID: 16151  TASK: f5eedaa0  CPU: 0   COMMAND: "dd"
> #0 [f5b36f28] crash_nmi_callback at c041a420
> #1 [f5b36f74] do_nmi at c04067f8
> #2 [f5b36f9c] nmi at c0405b29
>    EAX: 00000003  EBX: 00b74fc0  ECX: 00000003  EDX: 00b7102b  EBP:
> bfa1b370
>    DS:  007b      ESI: 00000008  ES:  007b      EDI: 08048034
>    CS:  0060      EIP: c0404f48  ERR: 00000003  EFLAGS: 00000087
> #3 [f5b36fd0] system_call at c0404f48
> <snip>
> ------------------------

and that the "possible" user-mode exception frame would start at 
address f5b36fbc:

>  USER-MODE EXCEPTION FRAME AT f5b36fbc:
>    EAX: 00b7102b  EBX: 00b7007b  ECX: 0000007b  EDX: 00000003
>    DS:  007b      ESI: c0404f48  ES:  007b      EDI: 00000060
>    SS:  007b      ESP: bfa1b32c  EBP: 00000087
>    CS:  0073      EIP: 00b6f97b  ERR: ffffffff  EFLAGS: 00010206
> <snip>

Note that *all* user-mode exception frames would start at the same 
offset into the kernel stack, or xxxxxfbc.  For example: 

crash> foreach bt -e | grep USER
  USER-MODE EXCEPTION FRAME AT cb53afbc:
  USER-MODE EXCEPTION FRAME AT f7946fbc:
  USER-MODE EXCEPTION FRAME AT f7675fbc:
  USER-MODE EXCEPTION FRAME AT f75b8fbc:
  USER-MODE EXCEPTION FRAME AT f76e3fbc:
  USER-MODE EXCEPTION FRAME AT f7f75fbc:
  USER-MODE EXCEPTION FRAME AT f7f22fbc:
  USER-MODE EXCEPTION FRAME AT f7826fbc:
  USER-MODE EXCEPTION FRAME AT f7f3dfbc:
  ...

When the NMI arrived in your test example, the kernel stack
was at f5b36fd0:  

> #3 [f5b36fd0] system_call at c0404f48

So then it's impossible for it to have created the full pt_regs
exception frame.  That is what happened with the original "page_fault"
vmcore sample.  And, yes, the crash utility is handling that
case as expected.

However, in the case of the original "system_call" vmcore
example shown in the Issue Tracker description, the stack
address when the NMI was received was *below* the location
where the pt_regs exception frame would have been.  So the
pt_regs should be in place, and should be displayed.  Note
that the NMI stack address was f6c26fb8, which is *below*
the user-mode exception frame address of f6c26fbc:

> This problem occurred once every about 40 times in one of our tests.
> 
> The following are the log:
> 
> ===
> PID: 3186   TASK: f6cb4550  CPU: 0   COMMAND: "udevd"
> bt: cannot resolve stack trace:
> #0 [f6c26f10] crash_nmi_callback at c041a420
> #1 [f6c26f5c] do_nmi at c04067f8
> #2 [f6c26f84] nmi at c0405b29
>    EAX: 00000003  EBX: 0089bff4  ECX: 00000000  EDX: 00000003  EBP: f6c26000
>    DS:  007b      ESI: 0089e044  ES:  007b      EDI: c061dd74
>    CS:  0060      EIP: c0404f3b  ERR: 00000003  EFLAGS: 00000006
> #3 [f6c26fb8] system_call at c0404f3b
> bt: text symbols on stack:
>    [f6c26f20] do_page_fault at c061dd74
>    [f6c26f38] restore_all at c0404f3b
>    [f6c26f54] do_page_fault at c061dd74
>    [f6c26f5c] do_nmi at c04067fe
>    [f6c26f64] do_page_fault at c061e04d
>    [f6c26f68] do_page_fault at c061e0c6
>    [f6c26f7c] do_page_fault at c061dd74
>    [f6c26f84] nmi_stack_correct at c0405b2e
>    [f6c26f98] do_page_fault at c061dd74
>    [f6c26fb0] restore_all at c0404f3b
>    [f6c26fdc] startup_32 at c040007b
> bt: possible exception frames:
>  KERNEL-MODE EXCEPTION FRAME AT f6c26f10:
>    EAX: 00000003  EBX: 0089bff4  ECX: 00000000  EDX: 00000003  EBP: f6c26000
>    DS:  007b      ESI: 0089e044  ES:  007b      EDI: c061dd74
>    CS:  0060      EIP: c0404f3b  ERR: 00000003  EFLAGS: 00000006
>  KERNEL-MODE EXCEPTION FRAME AT f6c26f88:
>    EAX: 00000003  EBX: 0089bff4  ECX: 00000000  EDX: 00000003  EBP: f6c26000
>    DS:  007b      ESI: 0089e044  ES:  007b      EDI: c061dd74
>    CS:  0060      EIP: c0404f3b  ERR: 00000003  EFLAGS: 00000006
>  USER-MODE EXCEPTION FRAME AT f6c26fbc:
>    EAX: 007c0a00  EBX: 0089bff4  ECX: bfc080a0  EDX: 0000002f
>    DS:  007b      ESI: 0089e044  ES:  007b      EDI: 00693000
>    SS:  007b      ESP: bfc0808c  EBP: bfc080c8
>    CS:  0073      EIP: 007c0a00  ERR: ffffffff  EFLAGS: 00010206 
 
And the patch in crash-4.1.2-4.btfix1.i386.rpm does *not* fix
the situation shown above.  That patch *only* fixes situations
where the kernel stack address of the highest function is *above*
the fixed location of the base of the exception frame (xxxxxfbc).

Please continue testing until you see a situation where the
highest kernel stack frame address shown is *below* the start
of the user-mode exception frame.  That would only be true
if the "system_call" (or other kernel entry point) stack address
is at xxxxxfb8 or lower.

Comment 10 Dave Anderson 2010-03-29 13:14:23 UTC
So, again, just to clarify.  The 4k kernel stack starts at address xxxxx000
and ends at xxxxxfff will always have a user-mode pt_regs exception frame 
that starts at xxxxxfbc:

  xxxxx000  base of kernel stack

< topmost kernel function must have a stack address below xxxxxfbc >

  xxxxxfbc  start of pt_regs user exception frame
              |
              V
  xxxxxfff  top of kernel stack

What is required is a vmcore like the one shown in the Issue Tracker, where
the topmost kernel function's stack address is *below* the start of the
pt_regs exception frame, i.e., below *below* xxxxxfbc.

Comment 12 Dave Anderson 2010-04-12 12:45:54 UTC
> I uploaded vmcore whose "system_call" stack address is at xxxxxfb8
> to dropbox.redhat.com.
> The file name is fj-597893-reproduction.tar.bz2.

Can you either copy this vmcore to megatron, or give me instructions on
how to copy it from "dropbox.redhat.com"?

If I log in as anonymous/anonymous, I get this

KERBEROS_V4 rejected as an authentication type
Name (dropbox.redhat.com:anderson): anonymous
331 Please specify the password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> get fj-597893-reproduction.tar.bz2
local: fj-597893-reproduction.tar.bz2 remote: fj-597893-reproduction.tar.bz2
227 Entering Passive Mode (209,132,183,100,24,215)
550 Permission denied.
ftp>

Comment 14 Dave Anderson 2010-04-19 15:35:29 UTC
Thank you the second vmcore, which contains an active
task that was running within the syscall entry point code,
had laid down a full pt_regs exception frame, but had not 
yet called the requested system call function:
  
  crash> bt 3330
  PID: 3330   TASK: f6a33000  CPU: 0   COMMAND: "w_dd.sh"
  bt: cannot resolve stack trace:
   #0 [f5832f10] crash_nmi_callback at c041a420
   #1 [f5832f5c] do_nmi at c04067f8
   #2 [f5832f84] nmi at c0405b29
      EAX: 00000000  EBX: 090a1318  ECX: 00000080  EDX: fffffeff  EBP: f5832000 
      DS:  007b      ESI: 090a1318  ES:  007b      EDI: 090a45f8
      CS:  0060      EIP: c0404fec  ERR: 00000000  EFLAGS: 00000002 
   #3 [f5832fb8] system_call at c0404fec
  bt: text symbols on stack:
      [f5832f30] mmu_notifier_unregister at c047007b
      [f5832f38] syscall_exit_work at c0404fec
      [f5832f5c] do_nmi at c04067fe
      [f5832f6c] audit_syscall_entry at c0449c56
      [f5832f84] nmi_stack_correct at c0405b2e
      [f5832fa8] mmu_notifier_unregister at c047007b
      [f5832fb0] syscall_exit_work at c0404fec
      [f5832fdc] startup_32 at c040007b
  bt: possible exception frames:
    KERNEL-MODE EXCEPTION FRAME AT f5832f10:
      EAX: 00000000  EBX: 090a1318  ECX: 00000080  EDX: fffffeff  EBP: f5832000 
      DS:  007b      ESI: 090a1318  ES:  007b      EDI: 090a45f8
      CS:  0060      EIP: c0404fec  ERR: 00000000  EFLAGS: 00000002 
    KERNEL-MODE EXCEPTION FRAME AT f5832f88:
      EAX: 00000000  EBX: 090a1318  ECX: 00000080  EDX: fffffeff  EBP: f5832000 
      DS:  007b      ESI: 090a1318  ES:  007b      EDI: 090a45f8
      CS:  0060      EIP: c0404fec  ERR: 00000000  EFLAGS: 00000002 
    USER-MODE EXCEPTION FRAME AT f5832fbc:
      EAX: 00000000  EBX: 090a1318  ECX: 00000004  EDX: ffffffff 
      DS:  007b      ESI: 090a1318  ES:  007b      EDI: 090a45f8
      SS:  007b      ESP: bf967528  EBP: bf9675a8
      CS:  0073      EIP: 003d8402  ERR: 00000021  EFLAGS: 00000297
  crash>
  
I've created an updated patched RHEL5 version of crash here:

  http://people.redhat.com/anderson/BZ_572605

  crash-4.1.2-4.btfix2.i386.rpm
  crash-4.1.2-4.btfix2.src.rpm
  crash-debuginfo-4.1.2-4.btfix2.i386.rpm
  
The "btfix2" version fixes the system_call issue shown above like so:

  crash> bt 3330
  PID: 3330   TASK: f6a33000  CPU: 0   COMMAND: "w_dd.sh"
   #0 [f5832f10] crash_nmi_callback at c041a420
   #1 [f5832f5c] do_nmi at c04067f8
   #2 [f5832f84] nmi at c0405b29
      EAX: 00000000  EBX: 090a1318  ECX: 00000080  EDX: fffffeff  EBP: f5832000 
      DS:  007b      ESI: 090a1318  ES:  007b      EDI: 090a45f8
      CS:  0060      EIP: c0404fec  ERR: 00000000  EFLAGS: 00000002 
   #3 [f5832fb8] system_call at c0404fec
      EAX: 00000000  EBX: 090a1318  ECX: 00000004  EDX: ffffffff 
      DS:  007b      ESI: 090a1318  ES:  007b      EDI: 090a45f8
      SS:  007b      ESP: bf967528  EBP: bf9675a8
      CS:  0073      EIP: 003d8402  ERR: 00000021  EFLAGS: 00000297 
  crash> 

Also, while testing with the second vmcore, I noticed that there was 
another backtrace bug that was indicated by 3 other non-active tasks, 
all of which had just been forked, but had not been resumed:
  
  crash> bt 1109 1386 26487
  PID: 1109   TASK: f5eba550  CPU: 0   COMMAND: "r_dd.sh"
  bt: cannot resolve stack trace:
   #0 [f5869fbc] ret_from_fork at c0404e00
  bt: text symbols on stack:
      [f5869fdc] startup_32 at c040007b
  bt: possible exception frame:
    USER-MODE EXCEPTION FRAME AT f5869fbc:
      EAX: 00000000  EBX: 01200011  ECX: 00000000  EDX: 00000000 
      DS:  007b      ESI: 00000000  ES:  007b      EDI: b7fb6708
      SS:  007b      ESP: bf85a53c  EBP: bf85a578
      CS:  0073      EIP: 00621402  ERR: 00000078  EFLAGS: 00000282 
  
  PID: 1386   TASK: f4eb6000  CPU: 0   COMMAND: "w_dd.sh"
  bt: cannot resolve stack trace:
   #0 [f5f5afbc] ret_from_fork at c0404e00
  bt: text symbols on stack:
      [f5f5afdc] startup_32 at c040007b
  bt: possible exception frame:
    USER-MODE EXCEPTION FRAME AT f5f5afbc:
      EAX: 00000000  EBX: 01200011  ECX: 00000000  EDX: 00000000 
      DS:  007b      ESI: 00000000  ES:  007b      EDI: b7f6a708
      SS:  007b      ESP: bfaf5ebc  EBP: bfaf5ef8
      CS:  0073      EIP: 0058e402  ERR: 00000078  EFLAGS: 00000282 
  
  PID: 26487  TASK: f7cf4000  CPU: 1   COMMAND: "w_dd.sh"
  bt: cannot resolve stack trace:
   #0 [f5efefbc] ret_from_fork at c0404e00
  bt: text symbols on stack:
      [f5efefdc] startup_32 at c040007b
  bt: possible exception frame:
    USER-MODE EXCEPTION FRAME AT f5efefbc:
      EAX: 00000000  EBX: 01200011  ECX: 00000000  EDX: 00000000 
      DS:  007b      ESI: 00000000  ES:  007b      EDI: b7fc6708
      SS:  007b      ESP: bfcaaa3c  EBP: bfcaaa78
      CS:  0073      EIP: 00dbf402  ERR: 00000078  EFLAGS: 00000282 
  crash> 
  
The error messages above were generated because the backtrace code
had presumed that non-active tasks are always blocked in schedule().
But in the case of forks, the resumption EIP is set to ret_from_fork()
by the kernel's copy_thread() function.  With the "btfix2" patch,
the backtraces look like this:

  crash> bt 1109 1386 26487
  PID: 1109   TASK: f5eba550  CPU: 0   COMMAND: "r_dd.sh"
   #0 [f5869fbc] ret_from_fork at c0404e00
      EAX: 00000000  EBX: 01200011  ECX: 00000000  EDX: 00000000 
      DS:  007b      ESI: 00000000  ES:  007b      EDI: b7fb6708
      SS:  007b      ESP: bf85a53c  EBP: bf85a578
      CS:  0073      EIP: 00621402  ERR: 00000078  EFLAGS: 00000282 
  
  PID: 1386   TASK: f4eb6000  CPU: 0   COMMAND: "w_dd.sh"
   #0 [f5f5afbc] ret_from_fork at c0404e00
      EAX: 00000000  EBX: 01200011  ECX: 00000000  EDX: 00000000 
      DS:  007b      ESI: 00000000  ES:  007b      EDI: b7f6a708
      SS:  007b      ESP: bfaf5ebc  EBP: bfaf5ef8
      CS:  0073      EIP: 0058e402  ERR: 00000078  EFLAGS: 00000282 
  
  PID: 26487  TASK: f7cf4000  CPU: 1   COMMAND: "w_dd.sh"
   #0 [f5efefbc] ret_from_fork at c0404e00
      EAX: 00000000  EBX: 01200011  ECX: 00000000  EDX: 00000000 
      DS:  007b      ESI: 00000000  ES:  007b      EDI: b7fc6708
      SS:  007b      ESP: bfcaaa3c  EBP: bfcaaa78
      CS:  0073      EIP: 00dbf402  ERR: 00000078  EFLAGS: 00000282 
  crash> 

Because of the unusual nature of Fujitsu's test program, I would
appreciate it if they could re-test with this second "btfix2" 
crash utility.

Thanks,
  Dave

Comment 31 Jaromir Hradilek 2010-12-01 19:04:48 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Prior to this update, running the "bt" command on an x86 architecture may have failed with the following error message:

  bt: cannot resolve stack trace

This occurred when a task that received a non-maskable interrupt (NMI) entered the kernel from user space, but had not called the requested system call function yet, or when the copy_thread() function set the Extended Instruction Pointer (EIP) value of a newly forked task to the "ret_from_fork" entry point. This has been fixed, and the "bt" command now works as expected.

Comment 33 errata-xmlrpc 2011-01-13 22:50:09 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0059.html


Note You need to log in before you can comment on or make changes to this bug.