Bug 1567738 - [rhel-6.10] crash: `bt` reports 'bt: cannot transition from exception stack to current process stack'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: crash
Version: 6.10
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Dave Anderson
QA Contact: Yuming Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-16 06:44 UTC by Yuming Liu
Modified: 2018-06-19 05:25 UTC (History)

Fixed In Version: crash-7.1.0-8.el6
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-19 05:24:56 UTC
Target Upstream Version:




Links
System                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2018:1930  None      None    None     2018-06-19 05:25:02 UTC

Comment 6 Dave Anderson 2018-04-16 15:14:50 UTC
There is a crash-7.2.1 fix that addresses this:

  commit d833432f1ed2d7f507c05d3b6c3e6aa732c49e56
  Author: Dave Anderson <anderson@redhat.com>
  Date:   Fri Jan 19 14:17:53 2018 -0500

    Initial pass for support of kernel page table isolation.  The x86_64
    "bt" command may indicate "bt: cannot transition from exception stack
    to current process stack" if the crash callback NMI occurred while an
    active task was running on the new entry trampoline stack.  This has
    only been tested on the RHEL7 backport of the upstream patch because
    as of this commit, crash does not run on 4.15-rc kernels.  Further
    changes may be required for upstream kernels, and distributions that
    implement the kernel changes differently than upstream.
    (anderson@redhat.com)

Hopefully the patch can be successfully applied to the older 7.1.0-based
crash utility sources.
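
As a rough illustration of the failure mode the commit message describes (a minimal sketch, not crash's actual code): when a backtracer leaves an exception stack, it has to decide which stack the saved RSP belongs to. If the only candidate it knows about is the process stack, an RSP that sits on the entry trampoline stack matches nothing, and the transition fails. The names `StackRange` and `classify_rsp`, and all addresses and sizes below, are hypothetical and chosen only for the example.

```python
from typing import List, NamedTuple, Optional


class StackRange(NamedTuple):
    """A contiguous kernel stack region (hypothetical model)."""
    name: str
    base: int  # lowest address of the stack
    size: int  # size in bytes

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def classify_rsp(rsp: int, stacks: List[StackRange]) -> Optional[str]:
    """Return the name of the stack that holds rsp, or None if unknown."""
    for stack in stacks:
        if stack.contains(rsp):
            return stack.name
    return None


# Hypothetical address ranges, for illustration only.
PROCESS = StackRange("process stack", 0xFFFF880100000000, 0x2000)
TRAMPOLINE = StackRange("entry trampoline stack", 0xFFFF880200000000, 0x1000)

# An RSP saved in the exception frame that lies on the trampoline stack.
rsp = 0xFFFF880200000860

# Without knowledge of the trampoline stack, no candidate matches.
before = classify_rsp(rsp, [PROCESS])

# Once the trampoline stack is a known candidate, the RSP is recognized.
after = classify_rsp(rsp, [PROCESS, TRAMPOLINE])

print(before, "->", after)
```

In this model, `before` being `None` corresponds to the "cannot transition" case, and adding the trampoline range corresponds to what the fix is described as doing.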

Comment 7 Dave Anderson 2018-04-16 15:57:07 UTC
(In reply to Dave Anderson from comment #6)
> There is a crash-7.2.1 fix that addresses this:
> ...
> Hopefully the patch can be successfully applied to the older 7.1.0-based
> crash utility sources.

The crash-7.2.1 patch can be applied to the RHEL6 crash sources with
a bit of manual intervention, but with the patch backported, it still
does not recognize the KPTI stack, and the backtrace fails in the
same way.  

Oddly enough, the upstream crash utility does work OK.  I'd prefer to
avoid a rebase, so I'm investigating further.

Comment 8 Dave Anderson 2018-04-16 18:17:13 UTC
> The crash-7.2.1 patch can be applied to the RHEL6 crash sources with
> a bit of manual intervention, but with the patch backported, it still
> does not recognize the KPTI stack, and the backtrace fails in the
> same way.

Sorry, it was my mistake doing the backport to crash-7.1.0.  Here is
crash-7.1.0 with the patch correctly applied: 

$ ./crash /root/vm*

crash 7.1.0-8.el6
Copyright (C) 2002-2014  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /root/vmlinux                     
    DUMPFILE: /root/vmcore  [PARTIAL DUMP]
        CPUS: 24
        DATE: Fri Apr 13 07:21:54 2018
      UPTIME: 00:14:03
LOAD AVERAGE: 26.30, 22.22, 11.85
       TASKS: 681
    NODENAME: hp-dl380pg8-02.rhts.eng.pek2.redhat.com
     RELEASE: 2.6.32-746.el6.x86_64
     VERSION: #1 SMP Thu Mar 29 18:05:09 EDT 2018
     MACHINE: x86_64  (1994 Mhz)
      MEMORY: 24 GB
       PANIC: "SysRq : Trigger a crash"
         PID: 10375
     COMMAND: "runtest.sh"
        TASK: ffff8803350bc040  [THREAD_INFO: ffff880339074000]
         CPU: 12
       STATE: TASK_RUNNING (SYSRQ)

crash> bt 13174
PID: 13174  TASK: ffff8806386d2040  CPU: 1   COMMAND: "stress-ng-fork"
 #0 [ffff880032e49e90] crash_nmi_callback at ffffffff81035d8c
 #1 [ffff880032e49ea0] notifier_call_chain at ffffffff8155b350
 #2 [ffff880032e49ee0] atomic_notifier_call_chain at ffffffff8155b3ba
 #3 [ffff880032e49ef0] notify_die at ffffffff810af2ae
 #4 [ffff880032e49f20] do_nmi at ffffffff81558ea9
 #5 [ffff880032e49f50] nmi at ffffffff815587a1
    [exception RIP: page_fault]
    RIP: ffffffff81558260  RSP: ffff880032e46860  RFLAGS: 00000002
    RAX: 00007ffd052b8f50  RBX: 0000000000000001  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 00007ffd052c6fd0
    RBP: 0000000000000001   R8: 0000000000001900   R9: 0000000000140000
    R10: 0000000000000000  R11: 00007f4d8daaddea  R12: 00007ffd052c8b10
    R13: 00007ffd052c895c  R14: 0000000000408828  R15: 00007ffd052b8f50
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
--- <NMI exception stack> ---
 #6 [ffff880032e46860] page_fault at ffffffff81558260
    RIP: 00007f4d8daae2f9  RSP: 00007ffd052b8f38  RFLAGS: 00010202
    RAX: 00007ffd052b8f50  RBX: 0000000000000001  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 00007ffd052c5fd0
    RBP: 0000000000000001   R8: 0000000000002900   R9: 0000000000140000
    R10: 0000000000000000  R11: 00007f4d8daaddea  R12: 00007ffd052c8b10
    R13: 00007ffd052c895c  R14: 0000000000408828  R15: 00007ffd052b8f50
    ORIG_RAX: 0000000000000007  CS: 0033  SS: 002b
--- <entry trampoline stack> ---
crash> 
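
For anyone verifying the fix across many tasks, a small script can check saved `bt` output for the stack-transition markers and for the error message. This is a minimal sketch that assumes the output has already been captured as text; the helper names `stack_markers` and `transition_failed` are hypothetical, and the sample is abbreviated from the transcript above.

```python
import re
from typing import List

# Matches marker lines such as "--- <NMI exception stack> ---".
MARKER = re.compile(r"^--- <(.+)> ---$")

# Error text quoted in this bug's summary.
FAILURE_TEXT = "cannot transition from exception stack"


def stack_markers(bt_output: str) -> List[str]:
    """Return the stack names from the marker lines, in order of appearance."""
    names = []
    for line in bt_output.splitlines():
        match = MARKER.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def transition_failed(bt_output: str) -> bool:
    """True if the output contains the stack-transition error message."""
    return FAILURE_TEXT in bt_output


# Abbreviated sample taken from the backtrace in this comment.
SAMPLE = """\
 #5 [ffff880032e49f50] nmi at ffffffff815587a1
    [exception RIP: page_fault]
--- <NMI exception stack> ---
 #6 [ffff880032e46860] page_fault at ffffffff81558260
--- <entry trampoline stack> ---
"""

markers = stack_markers(SAMPLE)
failed = transition_failed(SAMPLE)
print(markers, failed)
```

With the patched build, the output for this task ends at the entry trampoline stack marker and does not contain the error message.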

There is an existing RHEL6.10 errata that was filed today:

  RHBA-2018:33493-01 crash bug fix and enhancement update
  https://errata.devel.redhat.com/advisory/33493

Once this BZ gets pm_ack+ and rhel-6.10.0+, I will add it to the
errata.

Comment 18 errata-xmlrpc 2018-06-19 05:24:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1930

