Bug 1530887
| Summary: | crash: bt: cannot transition from exception stack to IRQ stack to current process stack | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Emma Wu <xiawu> |
| Component: | crash | Assignee: | Dave Anderson <anderson> |
| Status: | CLOSED ERRATA | QA Contact: | Emma Wu <xiawu> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.5 | CC: | qzhao, salmy, xiawu |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | crash-7.2.0-3.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-10 17:54:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Comment 3
Dave Anderson
2018-01-05 18:56:12 UTC
(In reply to Dave Anderson from comment #3)
> Is it possible that you can tell me which kernel version this started
> happening on? There's a change in the format of the top of each per-cpu
> IRQ stack, and I'm trying to figure out what kernel patch caused the
> change.

Bisected it down to 3.10.0-778.el7:

3.10.0-715.el7 64 bytes of zeroes
3.10.0-760.el7 64 bytes of zeroes
3.10.0-773.el7 64 bytes of zeroes
3.10.0-775.el7 64 bytes of zeroes
3.10.0-776.el7 64 bytes of zeroes
3.10.0-777.el7 64 bytes of zeroes
3.10.0-778.el7 full usage of stack BINGO!
3.10.0-784.el7 full usage of stack
3.10.0-807.el7 full usage of stack
3.10.0-808.el7 full usage of stack
3.10.0-823.el7 full usage of stack

Kernel changelog:

* Thu Nov 09 2017 Rafael Aquini <aquini> [3.10.0-778.el7]
- [kernel] livepatch: __klp_disable_patch() should never be called for disabled patches (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Correctly call klp_post_unpatch_callback() in error paths (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add transition notices (Josh Poimboeuf) [1430637]
- [kernel] livepatch: move transition "complete" notice into klp_complete_transition() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add (un)patch callbacks (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Add more checks for FTRACE_FL_DISABLED in processing ip records (Josh Poimboeuf) [1430637]
- [x86] stacktrace: Avoid recording save_stack_trace() wrappers (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack: Remove raw stack dump (Josh Poimboeuf) [1430637]
- [x86] unwind: Fix oprofile module link error (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Fix show_stack() task pointer regression (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Remove dump_trace() and related callbacks (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Convert show_trace_log_lvl() to use the new unwinder (Josh Poimboeuf) [1430637]
- [x86] oprofile/x86: Convert x86_backtrace() to use the new unwinder (Josh Poimboeuf) [1430637]
- [x86] stacktrace: Convert save_stack_trace_*() to use the new unwinder (Josh Poimboeuf) [1430637]
- [x86] perf/x86: Convert perf_callchain_kernel() to use the new unwinder (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Remove NULL task pointer convention (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Remove unnecessary stack pointer arguments (Josh Poimboeuf) [1430637]
- [x86] oprofile/x86: Add regs->ip to oprofile trace (Josh Poimboeuf) [1430637]
- [x86] perf/x86: Check perf_callchain_store() error (Josh Poimboeuf) [1430637]
- [kernel] livepatch: unpatch all klp_objects if klp_module_coming fails (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Small shadow variable documentation fixes (Josh Poimboeuf) [1430637]
- [kernel] livepatch: __klp_shadow_get_or_alloc() is local to shadow.c (Josh Poimboeuf) [1430637]
- [kernel] livepatch: introduce shadow variable API (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack: Fix interrupt and exception stack boundary checks (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Fix stacking of patches with respect to RCU (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Make livepatch dependent on !TRIM_UNUSED_KSYMS (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Reduce the time of finding module symbols (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add missing printk newlines (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Cancel transition a safe way for immediate patches (Josh Poimboeuf) [1430637]
- [kernel] livepatch: make klp_mutex proper part of API (Josh Poimboeuf) [1430637]
- [kernel] livepatch: allow removal of a disabled patch (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add /proc/<pid>/patch_state (Josh Poimboeuf) [1430637]
- [kernel] livepatch: change to a per-task consistency model (Josh Poimboeuf) [1430637]
- [kernel] livepatch: store function sizes (Josh Poimboeuf) [1430637]
- [kernel] livepatch: use kstrtobool() in enabled_store() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: move patching functions into patch.c (Josh Poimboeuf) [1430637]
- [kernel] livepatch: remove unnecessary object loaded check (Josh Poimboeuf) [1430637]
- [kernel] livepatch: separate enabled and patched states (Josh Poimboeuf) [1430637]
- [kernel] livepatch/x86: add TIF_PATCH_PENDING thread flag (Josh Poimboeuf) [1430637]
- [kernel] livepatch: create temporary klp_update_patch_state() stub (Josh Poimboeuf) [1430637]
- [x86] x86/entry: define _TIF_ALLWORK_MASK flags explicitly (Josh Poimboeuf) [1430637]
- [kernel] stacktrace/x86: add function for detecting reliable stack traces (Josh Poimboeuf) [1430637]
- [x86] x86/unwind: update unwinder for livepatch (Josh Poimboeuf) [1430637]
- [kernel] x86/entry: annotate entry code call locations for livepatch unwinder (Josh Poimboeuf) [1430637]
- [kernel] livepatch: doc: remove the limitation for schedule() patching (Josh Poimboeuf) [1430637]
- [kernel] documentation/livepatch: Fix stale link to gmame (Josh Poimboeuf) [1430637]
- [x86] x86/boot: Move the _stext marker to before the boot code (Josh Poimboeuf) [1430637]
- [x86] x86/boot/64: Put a real return address on the idle task stack (Josh Poimboeuf) [1430637]
- [x86] x86/boot/64: Use a common function for starting CPUs (Josh Poimboeuf) [1430637]
- [x86] x86/unwind: Add new unwind interface and implementations (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack: Add recursion checking for all stacks (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack: Add support for unwinding empty IRQ stacks (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Add get_stack_info() interface (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Simplify in_exception_stack() (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Allow preemption in show_stack_log_lvl() and dump_trace() (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Add get_stack_pointer() and get_frame_pointer() (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack: Make printk_stack_address() more generally useful (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack/ftrace: Don't print unreliable addresses in print_context_stack_bp() (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack/ftrace: Mark function graph handler function as unreliable (Josh Poimboeuf) [1430637]
- [x86] ftrace/x86: Implement HAVE_FUNCTION_GRAPH_RET_ADDR_PTR (Josh Poimboeuf) [1430637]
- [x86] x86/dumpstack/ftrace: Convert dump_trace() callbacks to use ftrace_graph_ret_addr() (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Add ftrace_graph_ret_addr() stack unwinding helpers (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Add return address pointer to ftrace_ret_stack (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Remove CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST from config (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Only allocate the ret_stack 'fp' field when needed (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Remove 64-byte gap at end of irq stack (Josh Poimboeuf) [1430637]
- [kernel] x86/dumpstack: Remove extra brackets around "<EOE>" (Josh Poimboeuf) [1430637]
- [kernel] x86/asm/head: Rename 'stack_start' -> 'initial_stack' (Josh Poimboeuf) [1430637]
- [kernel] x86/dumpstack: Remove show_trace() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: use arch_klp_init_object_loaded() to finish arch-specific tasks (Josh Poimboeuf) [1430637]
- [kernel] x86/dumpstack: Try harder to get a call trace on stack overflow (Josh Poimboeuf) [1430637]
- [kernel] x86/dumpstack: Honor supplied @regs arg (Josh Poimboeuf) [1430637]
- [kernel] x86: avoid avoid passing around 'thread_info' in stack dumping code (Josh Poimboeuf) [1430637]
- [kernel] livepatch: make object/func-walking helpers more robust (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Add some basic livepatch documentation (Josh Poimboeuf) [1430637]
- [kernel] livepatch: robustify klp_register_patch() API error checking (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Allow architectures to specify an alternate ftrace location (Josh Poimboeuf) [1430637]
- [kernel] livepatch: reuse module loader code to write relocations (Josh Poimboeuf) [1430637]
- [kernel] module: preserve Elf information for livepatch modules (Josh Poimboeuf) [1430637]
- [kernel] elf: add livepatch-specific Elf constants (Josh Poimboeuf) [1430637]
- [kernel] sscanf: implement basic character sets (Josh Poimboeuf) [1430637]
- [kernel] livepatch/module: remove livepatch module notifier (Josh Poimboeuf) [1430637]
- [kernel] modules: split part of complete_formation() into prepare_coming_module() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Fix the error message about unresolvable ambiguity (Josh Poimboeuf) [1430637]
- [kernel] klp: remove CONFIG_LIVEPATCH dependency from klp headers (Josh Poimboeuf) [1430637]
- [kernel] klp: remove superfluous errors in asm/livepatch.h (Josh Poimboeuf) [1430637]
- [kernel] perf: generalize perf_callchain (Josh Poimboeuf) [1430637]
- [kernel] ftrace/module: remove ftrace module notifier (Josh Poimboeuf) [1430637]
- [kernel] ftrace/module: Call clean up function when module init fails early (Josh Poimboeuf) [1430637]
- [kernel] livepatch: change the error message in asm/livepatch.h header files (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Fix the race between ftrace and insmod (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Add infrastructure for delayed enabling of module functions (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Cleanup of global variables ftrace_new_pgs and ftrace_update_cnt (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Cleanup module page permission changes (Josh Poimboeuf) [1430637]
- [kernel] livepatch: function, sympos scheme in livepatch sysfs directory (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add sympos as disambiguator field to klp_reloc (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add old_sympos as disambiguator field to klp_func (Josh Poimboeuf) [1430637]
- [kernel] module: Add module_{enable,disable}_ro() (Josh Poimboeuf) [1430637]
- [kernel] module: Use the same logic for setting and unsetting RO/NX (Josh Poimboeuf) [1430637]
- [kernel] livepatch: x86: fix relocation computation with kASLR (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Fix crash with !CONFIG_DEBUG_SET_MODULE_RONX (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Improve error handling in klp_disable_func() (Josh Poimboeuf) [1430637]
- [kernel] ftrace: Format MCOUNT_ADDR address as type unsigned long (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add module locking around kallsyms calls (Josh Poimboeuf) [1430637]
- [kernel] livepatch: annotate klp_init() with __init (Josh Poimboeuf) [1430637]
- [kernel] livepatch: introduce patch/func-walking helpers (Josh Poimboeuf) [1430637]
- [kernel] livepatch: make kobject in klp_object statically allocated (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Prevent patch inconsistencies if the coming module notifier fails (Josh Poimboeuf) [1430637]
- [kernel] livepatch: match return value to function signature (Josh Poimboeuf) [1430637]
- [kernel] livepatch: x86: make kASLR logic more accurate (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add support on s390 (Josh Poimboeuf) [1430637]
- [kernel] livepatch: Fix subtle race with coming and going modules (Josh Poimboeuf) [1430637]
- [kernel] livepatch: remove unnecessary call to klp_find_object_module() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: fix RCU usage in klp_find_external_symbol() (Josh Poimboeuf) [1430637]
- [kernel] x86/kernel: Fix output of show_stack_log_lvl() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: RCU protect struct klp_func all the time when used in klp_ftrace_handler() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: remove extern specifier from header files (Josh Poimboeuf) [1430637]
- [kernel] livepatch: fix format string in kobject_init_and_add() (Josh Poimboeuf) [1430637]
- [kernel] livepatch: simplify disable error path (Josh Poimboeuf) [1430637]
- [kernel] livepatch: add missing newline to error message (Josh Poimboeuf) [1430637]
- [kernel] livepatch: rename config to CONFIG_LIVEPATCH (Josh Poimboeuf) [1430637]
- [kernel] livepatch: fix uninitialized return value (Josh Poimboeuf) [1430637]
- [kernel] livepatch: change ARCH_HAVE_LIVE_PATCHING to HAVE_LIVE_PATCHING (Josh Poimboeuf) [1430637]
- [kernel] livepatch: support for repatching a function (Josh Poimboeuf) [1430637]
- [kernel] livepatch: enforce patch stacking semantics (Josh Poimboeuf) [1430637]
- [kernel] livepatch: fix deferred module patching order (Josh Poimboeuf) [1430637]
- [kernel] livepatch: handle ancient compilers with more grace (Josh Poimboeuf) [1430637]
- [kernel] livepatch: kconfig: use bool instead of boolean (Josh Poimboeuf) [1430637]
- [kernel] livepatch: samples: fix usage example comments (Josh Poimboeuf) [1430637]
- [kernel] livepatch: use FTRACE_OPS_FL_IPMODIFY (Josh Poimboeuf) [1430637]
- [kernel] livepatch: move x86 specific ftrace handler code to arch/x86 (Josh Poimboeuf) [1430637]
- [kernel] livepatch: samples: add sample live patching module (Josh Poimboeuf) [1430637]
- [kernel] livepatch: kernel: add support for live patching (Josh Poimboeuf) [1430637]
- [kernel] powerpc/ftrace: simplify prepare_ftrace_return (Josh Poimboeuf) [1430637]
- [kernel] x86: Fix dumpstack_64 irq stack handling (Josh Poimboeuf) [1430637]
- [kernel] x86: Fix dumpstack_64 to keep state of "used" variable in loop (Josh Poimboeuf) [1430637]
- [kernel] x86: Clean up dumpstack_64.c code (Josh Poimboeuf) [1430637]
- [x86] dumpstack: Fix printk_address for direct addresses (Josh Poimboeuf) [1430637]
- [kernel] s390/ftrace: prepare_ftrace_return() function call order (Josh Poimboeuf) [1430637]
- [x86] revert "dumpstack: Remove raw stack dump" (Josh Poimboeuf) [1430637]

This is the 3.10.0-778.el7 kernel patch that causes the problem:

- [x86] dumpstack: Remove 64-byte gap at end of irq stack (Josh Poimboeuf)

A crash utility fix is underway.
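The difference between the two layouts seen in the bisection ("64 bytes of zeroes" versus "full usage of stack") can be illustrated with a small sketch. This is not the crash utility's code. It is a Python illustration under stated assumptions: the function names are hypothetical, the zero check is only a heuristic (an IRQ stack that has never been used would also read as zeroes), and the location of the saved previous stack pointer, taken here to be the word just below the effective end of the stack, is an assumption about the kernel entry code rather than something stated in this report. The example stack top 0xffff9e3eef904000 is inferred from the IRQ stack frame addresses in this report's backtraces.

```python
IRQ_STACK_GAP = 64  # bytes; size of the gap removed by the kernel patch named above
WORD_SIZE = 8       # bytes per stack slot on x86_64


def irq_stack_gap(top_bytes: bytes) -> int:
    """Guess the gap size from the 64 bytes just below the top of an IRQ stack.

    Heuristic only: all zeroes suggests the older layout with a reserved gap;
    any non-zero byte suggests the stack is used right up to its top.
    """
    if len(top_bytes) != IRQ_STACK_GAP:
        raise ValueError("expected exactly 64 bytes from the top of the stack")
    return 0 if any(top_bytes) else IRQ_STACK_GAP


def irq_stack_end(stack_top: int, top_bytes: bytes) -> int:
    """Return the effective end of the usable IRQ stack."""
    return stack_top - irq_stack_gap(top_bytes)


def link_slot(stack_top: int, top_bytes: bytes) -> int:
    """Return the address assumed to hold the saved previous stack pointer.

    Assumption (not from this report): it is the word just below the
    effective end of the IRQ stack.
    """
    return irq_stack_end(stack_top, top_bytes) - WORD_SIZE


if __name__ == "__main__":
    top = 0xFFFF9E3EEF904000                   # hypothetical IRQ stack top
    old_layout = bytes(IRQ_STACK_GAP)          # "64 bytes of zeroes"
    new_layout = b"\xff" * IRQ_STACK_GAP       # "full usage of stack"
    print(hex(link_slot(top, old_layout)))
    print(hex(link_slot(top, new_layout)))
```

A backtrace tool that always assumes the gap would look for the link to the process stack 64 bytes too low on the newer kernels, which is consistent with the failed IRQ-to-process-stack transition described in this bug.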
Patch posted upstream:

https://github.com/crash-utility/crash/commit/63419fb9a535732082ae7b542ebb2399e6a3ccc9

Fix for the "bt" command in x86_64 kernels that contain, or have backports of, kernel commit 4950d6d48a0c43cc61d0bbb76fb10e0214b79c66, titled "x86/dumpstack: Remove 64-byte gap at end of irq stack". Without the patch, backtraces fail to transition from the IRQ stack back to the process stack, showing an error message such as "bt: cannot transition exception stack to IRQ stack to current process stack". (anderson)

(In reply to Emma Wu from comment #0)
> Created attachment 1376656 [details]
> crash.vmcore.log
>
> Description of problem:
>
> 'bt' reported "cannot transition from exception stack to IRQ stack to
> current process stack" when analyzing a vmcore using crash utility
>
> crash> bt -c 20
> PID: 0 TASK: ffff9e3c39caaf70 CPU: 20 COMMAND: "swapper/20"
> #0 [ffff9e3eef905e48] crash_nmi_callback at ffffffffbb652ae1
> #1 [ffff9e3eef905e58] nmi_handle at ffffffffbbcfd057
> #2 [ffff9e3eef905eb0] do_nmi at ffffffffbbcfd26d
> #3 [ffff9e3eef905ef0] end_repeat_nmi at ffffffffbbcfc513
> [exception RIP: ktime_get_update_offsets_now+181]
> RIP: ffffffffbb6f5e35 RSP: ffff9e3eef903f30 RFLAGS: 00000087
> RAX: 000000309548b88f RBX: 0000000000000000 RCX: 0000000000000017
> RDX: 0000000000000006 RSI: ffff9e3eef90fa58 RDI: ffffffffbc22a440
> RBP: ffff9e3eef903f70 R8: 0000000000000000 R9: ffff9e3eef913ab0
> R10: 7fffffffffffffff R11: 0000000000000000 R12: 000000308559e40d
> R13: ffffffffbc22a440 R14: ffff9e3eef90f9a8 R15: 0000000000035480
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> --- <NMI exception stack> ---
> #4 [ffff9e3eef903f30] ktime_get_update_offsets_now at ffffffffbb6f5e35
> #5 [ffff9e3eef903f78] hrtimer_interrupt at ffffffffbb6bbf65
> --- <IRQ stack> ---
> bt: cannot transition from exception stack to IRQ stack to current process
> stack:
> exception stack pointer: ffff9e3eef905e48
> IRQ stack pointer: ffff9e3eef903fa8
> process stack pointer: ffff9e3eef903fa8
> current stack base: ffff9e3c39cdc000
> crash>

With the patch applied, here is the correct backtrace:

crash> bt ffff9e3c39caaf70
PID: 0 TASK: ffff9e3c39caaf70 CPU: 20 COMMAND: "swapper/20"
 #0 [ffff9e3eef905e48] crash_nmi_callback at ffffffffbb652ae1
 #1 [ffff9e3eef905e58] nmi_handle at ffffffffbbcfd057
 #2 [ffff9e3eef905eb0] do_nmi at ffffffffbbcfd26d
 #3 [ffff9e3eef905ef0] end_repeat_nmi at ffffffffbbcfc513
    [exception RIP: ktime_get_update_offsets_now+181]
    RIP: ffffffffbb6f5e35 RSP: ffff9e3eef903f30 RFLAGS: 00000087
    RAX: 000000309548b88f RBX: 0000000000000000 RCX: 0000000000000017
    RDX: 0000000000000006 RSI: ffff9e3eef90fa58 RDI: ffffffffbc22a440
    RBP: ffff9e3eef903f70 R8: 0000000000000000 R9: ffff9e3eef913ab0
    R10: 7fffffffffffffff R11: 0000000000000000 R12: 000000308559e40d
    R13: ffffffffbc22a440 R14: ffff9e3eef90f9a8 R15: 0000000000035480
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
 #4 [ffff9e3eef903f30] ktime_get_update_offsets_now at ffffffffbb6f5e35
 #5 [ffff9e3eef903f78] hrtimer_interrupt at ffffffffbb6bbf65
 #6 [ffff9e3eef903fc0] local_apic_timer_interrupt at ffffffffbb656bb5
 #7 [ffff9e3eef903fd8] smp_apic_timer_interrupt at ffffffffbbd0747d
 #8 [ffff9e3eef903ff0] apic_timer_interrupt at ffffffffbbd0595d
--- <IRQ stack> ---
 #9 [ffff9e3c39cdfdb8] apic_timer_interrupt at ffffffffbbd0595d
    [exception RIP: cpuidle_enter_state+82]
    RIP: ffffffffbbb60792 RSP: ffff9e3c39cdfe60 RFLAGS: 00000206
    RAX: 000000309548b6bd RBX: ffff9e3eef90d180 RCX: 0000000000000017
    RDX: 0000000225c17d03 RSI: ffff9e3c39cdffd8 RDI: 000000309548b6bd
    RBP: ffff9e3c39cdfe88 R8: 00000000000003c6 R9: ffff9e3eef913ab0
    R10: 7fffffffffffffff R11: 0000000000000000 R12: ffff9e3c39cdfe00
    R13: ffff9e3eef90f9e0 R14: ffffffffbb6bb065 R15: ffff9e3c39cdfde0
    ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
#10 [ffff9e3c39cdfe90] cpuidle_idle_call at ffffffffbbb608d8
#11 [ffff9e3c39cdfed0] arch_cpu_idle at ffffffffbb63520e
#12 [ffff9e3c39cdfee0] cpu_startup_entry at ffffffffbb6ef0aa
#13 [ffff9e3c39cdff28] start_secondary at ffffffffbb6548f6
#14 [ffff9e3c39cdff50] start_cpu at ffffffffbb6000d5
crash>

Response from ticket filed with secalert:

[engineering.redhat.com #465000] Waiver request for a new errata rpmdiff issue

On Fri Jan 12 05:18:57 2018, anderson wrote:
>
> For the first time, rpmdiff failed the Execshield for the crash
> utility package,
> here for just the ppc64/ppc64le architecture:
>
> RHBA-2017:31254-01 crash bug fix and enhancement update
> https://errata.devel.redhat.com/rpmdiff/show/188497?result_id=5244164
>
> I don't understand how this could be a problem. Would it be possible
> for you to waive this result?
>
> Thanks,
> Dave Anderson
>
>

Hi Dave,

Unfortunately this is because of a change made in the latest version of binutils which switched all ppc builds to not be compiled with relro by default. We have worked with the binutils maintainer and decided to revert the change. We'll have to wait for the latest binutils to be released and then rebuild the package which should then have the correct hardening set. I do not have a timeframe for release but it should be fixed soon.

Thank you for your understanding.

Regards,
--
Sam Fowler, Red Hat Product Security

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0955
|