Red Hat Bugzilla – Bug 1000440
[abrt] crash-6.1.4-1.fc19: __schedule_frame_adjust: Process /usr/bin/crash was killed by signal 11 (SIGSEGV)
Last modified: 2013-12-16 14:07:54 EST
Description of problem:
I was trying to debug a crash of kernel 3.10.9-200.fc19.x86_64 using the command:
sudo crash /var/crash/127.0.0.1-2013.08.23-15\:17\:07/vmcore /usr/lib/debug/lib/modules/3.10.9-200.fc19.x86_64/vmlinux
I ran the "bt" command to get the stack trace, which made the debugger crash.
The output of the session:
crash: cannot determine thread return address
DUMPFILE: /var/crash/127.0.0.1-2013.08.23-15:17:07/vmcore [PARTIAL DUMP]
DATE: Fri Aug 23 15:16:52 2013
LOAD AVERAGE: 4.78, 1.41, 0.51
VERSION: #1 SMP Wed Aug 21 19:27:58 UTC 2013
MACHINE: x86_64 (1596 Mhz)
MEMORY: 3.9 GB
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
TASK: ffff880135a9cc40 (1 of 2) [THREAD_INFO: ffff880135b42000]
STATE: TASK_RUNNING (PANIC)
PID: 0 TASK: ffff880135a9cc40 CPU: 1 COMMAND: "swapper/1"
#0 [ffff8801223e02d0] __schedule at ffffffff8163d631
The program crashed after printing the line above.
Version-Release number of selected component:
cmdline: crash /var/crash/127.0.0.1-2013.08.23-15:17:07/vmcore /usr/lib/debug/lib/modules/3.10.9-200.fc19.x86_64/vmlinux
runlevel: N 5
Thread no. 1 (10 frames)
#0 __schedule_frame_adjust at x86_64.c:7446
#1 x86_64_low_budget_back_trace_cmd at x86_64.c:3237
#2 back_trace at kernel.c:2509
#3 cmd_bt at kernel.c:2114
#4 exec_command at main.c:771
#5 main_loop at main.c:719
#6 captured_command_loop at ./main.c:228
#7 catch_errors at exceptions.c:531
#8 captured_main at ./main.c:958
#9 catch_errors at exceptions.c:531
Created attachments 789586 through 789596
What I really would like is the vmcore. Do you still have it?
Sure, the vmcore file can be found here:
(too big to upload as an attachment to bugzilla)
> Sure, the vmcore file can be found here:
OK, thanks I've got it.
I haven't figured out why exactly, but it has something to do with the
memory corruption caused by the stack overflow that you can see
in the "log" command. Or you can do a "set 9821", and then a "bt".
The crash utility is not finding the real panic task because
the kexec/kdump work in the kernel is being done from the page
just underneath the overrun stack of pid 9821. (i.e., instead
of on a legitimate stack page)
And when the panic task is not found, crash just defaults to setting
the initial task to pid 0 on cpu 0, which at least is guaranteed
to exist.
But for some reason, a stack address from the pid 9821 is mistakenly
being used by pid 0, and when "bt" tries to unwind pid 0's stack,
it creates a bogus offset value when mathematically using the
"real" pid 0 base stack address in conjunction with the unrelated
stack address from pid 9821. (and I'm currently trying to
figure out why that's happening...)
Anyway, it goes without saying that the crash utility shouldn't core
dump, regardless of the contents of the vmcore. You can ditch
the vmcore -- thanks again.
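The bogus offset described above can be illustrated with a small bounds check. This is only a sketch, not the code in crash's x86_64.c: the function name stack_offset and the 8 KiB stack size are assumptions made for illustration, and it assumes the THREAD_INFO address printed in the session marks the base of the task's stack. The idea is to refuse to compute an offset when the address does not belong to the stack being unwound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stack size for illustration only (hypothetical constant). */
#define SKETCH_STACK_SIZE 8192u

/*
 * Hypothetical helper: compute the byte offset of addr within the stack
 * that starts at stackbase. Returns false and leaves *offset untouched
 * when addr lies outside [stackbase, stackbase + SKETCH_STACK_SIZE).
 */
static bool stack_offset(uint64_t stackbase, uint64_t addr, uint64_t *offset)
{
    /* Check the lower bound first so the subtraction cannot wrap around. */
    if (addr < stackbase || addr - stackbase >= SKETCH_STACK_SIZE)
        return false;

    *offset = addr - stackbase;
    return true;
}
```

With the values from the session, the frame address ffff8801223e02d0 is below the THREAD_INFO address ffff880135b42000, so a check of this kind would reject it instead of producing a wrapped-around offset.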
> And when the panic task is not found, crash just defaults to setting
> the initial task to pid 0 on cpu 0, which at least is guaranteed
> to exist.
Except that this one is defaulting to the idle/swapper task 0 on cpu 1
instead of cpu 0, which presumably is related to the fact that the
real panic task 9821 was also running on cpu 1.
This is a bizarre (probably a one-time-only) dumpfile. The investigation
The dump is related to bug 994824 which turned out to likely be caused by bug 917081.
I haven't tried to produce the dump again (at least not yet) because I found a workaround that solves my original (apparently mei kernel module and suspend/resume related) problem.
I see that the system did a resume, but I'm wondering whether the crash
happened immediately upon the resume?
The system didn't crash immediately. It sort of came back up, at least partially. I could see the desktop, but the system wouldn't react to keyboard and mouse input. After a couple of seconds or so, the system started saving the crash dump.
The problem is that the runqueue for cpu 1 shows the swapper task as currently active, in conflict with the per-cpu "current_task" variable, which shows pid 9821 as the active task. This is all complicated by the fact that there was no evidence of the crash occurring on pid 9821's stack, because it overflowed and used the page below it.
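The conflict between the two sources can be sketched as a simple consistency check. This is a hypothetical illustration, not how crash is implemented: the struct, the function names, and the choice to list the per-cpu value first are all assumptions. It only shows the idea of comparing the runqueue's view with the per-cpu "current_task" view and keeping both tasks as candidates when they disagree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what two kernel data sources report for one cpu. */
struct cpu_active_view {
    uint64_t rq_curr_task;     /* task address taken from the cpu's runqueue */
    uint64_t percpu_curr_task; /* task address taken from per-cpu current_task */
};

/* Return true when both sources name the same task. */
static bool active_task_agrees(const struct cpu_active_view *v)
{
    return v->rq_curr_task == v->percpu_curr_task;
}

/*
 * Fill out[] with the tasks worth examining as the active task on this cpu.
 * Returns 1 when the sources agree and 2 when they conflict. Listing the
 * per-cpu value first is an arbitrary choice made for this sketch.
 */
static int active_task_candidates(const struct cpu_active_view *v, uint64_t out[2])
{
    out[0] = v->percpu_curr_task;
    if (active_task_agrees(v))
        return 1;

    out[1] = v->rq_curr_task;
    return 2;
}
```

In this dump the runqueue side would hold the swapper task (ffff880135a9cc40) and the per-cpu side would hold the task for pid 9821, so a check like this would report two candidates rather than silently trusting one source.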
Information for build crash-7.0.2-1.fc21: