Bug 1031362 - kernel softlockup while executing the command ppc64_cpu --smt=on [NEEDINFO]
Summary: kernel softlockup while executing the command ppc64_cpu --smt=on
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 20
Hardware: ppc64
OS: All
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-11-17 08:50 UTC by IBM Bug Proxy
Modified: 2014-03-17 20:50 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-17 18:44:48 UTC
Type: ---
jforbes: needinfo?
bugproxy: needinfo?
bugproxy: needinfo?


Attachments
dmesg output (49.35 KB, text/plain)
2013-11-17 08:50 UTC, IBM Bug Proxy


Links
System: IBM Linux Technology Center
ID: 99146
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: Never

Description IBM Bug Proxy 2013-11-17 08:50:39 UTC
== Comment: #0 - IRANNA D. ANKAD <iranna.ankad.com> - 2013-10-28 10:46:01 ==
I issued the three commands below. The last one hung, and the kernel logged many softlockup traces.

[root@jupiterioc-lp3 ~]# ppc64_cpu --smt
SMT is on
[root@jupiterioc-lp3 ~]# ppc64_cpu --smt=off
[root@jupiterioc-lp3 ~]# ppc64_cpu --smt=on

<...... command hangs.....>
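
The sequence above can be wrapped in a small script for repeated testing. The following is a minimal sketch, not part of the original report: the function name toggle_smt, the SMT_CMD variable, and the default cycle count and time limit are hypothetical. Only the ppc64_cpu --smt=off and --smt=on invocations come from the report. It assumes the coreutils timeout command is available. A process spinning inside the kernel, as in this report, may not react to the signal that timeout sends, so the time limit is best effort only.

```shell
#!/bin/sh
# Sketch: toggle SMT off and on repeatedly, bounding each call with timeout.
# SMT_CMD, toggle_smt, and the defaults below are illustrative assumptions.
SMT_CMD="${SMT_CMD:-ppc64_cpu}"

toggle_smt() {
    # $1 = number of off/on cycles, $2 = per-command time limit in seconds
    cycles="${1:-3}"
    limit="${2:-120}"
    i=1
    while [ "$i" -le "$cycles" ]; do
        for state in off on; do
            # timeout exits non-zero if the command fails or exceeds the limit
            if ! timeout "$limit" "$SMT_CMD" "--smt=$state"; then
                echo "cycle $i: --smt=$state failed or timed out" >&2
                return 1
            fi
        done
        i=$((i + 1))
    done
    echo "completed $cycles cycles"
}
```

For example, running toggle_smt 5 60 as root would attempt five off/on cycles with a 60-second limit per command.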

FYI, here is a snippet of the call traces. For more details, I am attaching fresh dmesg output.


Fedora release 20 (Heisenbug)
Kernel 3.11.0-300.fc20.ppc64p7 on an ppc64 (hvc0)

jupiterioc-lp3 login: [ 4548.388596] INFO: rcu_sched self-detected stall on CPU { 1}  (t=384082 jiffies g=2224 c=2223 q=13748)
[ 4548.388653] CPU: 1 PID: 2103 Comm: ppc64_cpu Not tainted 3.11.0-300.fc20.ppc64p7 #1
[ 4548.388660] Call Trace:
[ 4548.388669] [c000000bbdb82a00] [c000000000014ba0] .show_stack+0x130/0x200 (unreliable)
[ 4548.388680] [c000000bbdb82ad0] [c00000000083e19c] .dump_stack+0x88/0xb4
[ 4548.388689] [c000000bbdb82b50] [c000000000168a38] .rcu_check_callbacks+0x418/0x8d0
[ 4548.388698] [c000000bbdb82c90] [c0000000000abea8] .update_process_times+0x58/0xb0
[ 4548.388706] [c000000bbdb82d20] [c000000000114ab0] .tick_sched_handle.isra.16+0x40/0xd0
[ 4548.388714] [c000000bbdb82db0] [c000000000114ba4] .tick_sched_timer+0x64/0xa0
[ 4548.388722] [c000000bbdb82e50] [c0000000000cd094] .__run_hrtimer+0xb4/0x2a0
[ 4548.388730] [c000000bbdb82ef0] [c0000000000ce048] .hrtimer_interrupt+0x148/0x330
[ 4548.388738] [c000000bbdb83000] [c00000000001e8a0] .timer_interrupt+0x120/0x2e0
[ 4548.388746] [c000000bbdb830b0] [c000000000002554] decrementer_common+0x154/0x180
[ 4548.388757] --- Exception: 901 at .__bitmap_weight+0x44/0x100
[ 4548.388757]     LR = .build_sched_domains+0xc3c/0xdb0
[ 4548.388775] [c000000bbdb833a0] [c000000bbdb83450] 0xc000000bbdb83450 (unreliable)
[ 4548.388783] [c000000bbdb83450] [c0000000000e196c] .build_sched_domains+0xc3c/0xdb0
[ 4548.388791] [c000000bbdb835a0] [c0000000000e1dc0] .partition_sched_domains+0x260/0x3f0
[ 4548.388799] [c000000bbdb83680] [c000000000139864] .cpuset_update_active_cpus+0x24/0x60
[ 4548.388807] [c000000bbdb836f0] [c0000000000e1ff8] .cpuset_cpu_active+0xa8/0xd0
[ 4548.388815] [c000000bbdb83770] [c000000000833dac] .notifier_call_chain+0x8c/0x100
[ 4548.388823] [c000000bbdb83810] [c0000000000983f0] .cpu_notify+0x40/0xa0
[ 4548.388830] [c000000bbdb83890] [c000000000098694] ._cpu_up+0x204/0x210
[ 4548.388837] [c000000bbdb83950] [c0000000000987ec] .cpu_up+0x14c/0x1d0
[ 4548.388846] [c000000bbdb839e0] [c0000000006bbb74] .cpu_subsys_online+0x54/0xc0
[ 4548.388854] [c000000bbdb83a80] [c0000000004f99d8] .device_online+0xb8/0x120
[ 4548.388861] [c000000bbdb83b10] [c0000000004f9af4] .store_online+0xb4/0xf0
[ 4548.388868] [c000000bbdb83bb0] [c0000000004f57c4] .dev_attr_store+0x64/0xa0
[ 4548.388876] [c000000bbdb83c40] [c0000000002e2404] .sysfs_write_file+0xf4/0x1d0
[ 4548.388885] [c000000bbdb83cf0] [c000000000242b58] .vfs_write+0xe8/0x260
[ 4548.388892] [c000000bbdb83d90] [c000000000243854] .SyS_write+0x64/0xe0
[ 4548.388900] [c000000bbdb83e30] [c000000000009dd4] syscall_exit+0x0/0x98
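
To check a live or saved kernel log for messages like the one quoted above, a filter along these lines may help. This is a sketch under assumptions: the function name count_stalls is hypothetical, the "self-detected stall" pattern is taken from the trace above, and the "soft lockup" pattern assumes the usual wording of the kernel's soft-lockup watchdog message, which is not quoted in this report.

```shell
#!/bin/sh
# Sketch: count RCU stall and soft-lockup lines in kernel log text.
# count_stalls is a hypothetical helper name.
count_stalls() {
    # Reads log text on stdin and prints the number of matching lines.
    # grep -c exits non-zero when the count is 0, so normalize the status.
    grep -c -E 'self-detected stall|soft lockup' || true
}
```

It can be fed from the running system (dmesg | count_stalls) or from a saved attachment (count_stalls < dmesg.txt, where dmesg.txt is a placeholder file name).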

Comment 1 IBM Bug Proxy 2013-11-17 08:50:56 UTC
Created attachment 825102 [details]
dmesg output

Comment 2 IBM Bug Proxy 2013-11-27 06:30:37 UTC
------- Comment From iranna.ankad.com 2013-11-27 06:27 EDT-------
(In reply to comment #9)
> Iranna,
>
> Can you please provide machine access?
>
> -Bharani

Hello Bharani,
The original system (P7+ Jupiter) is busy running priority tests for the next week, so I tried to recreate this issue on another P7 Jupiter system with the latest F20 Beta kernel. I could not recreate it. I can also confirm that this scenario works fine on P8 with F20 Beta. So, for now, I am OK with closing this bug. I will reopen it if I notice the problem again. Thanks!

FYI
[root@als0153 ~]# ppc64_cpu --smt=off
[root@als0153 ~]# ppc64_cpu --smt=on
[root@als0153 ~]# ppc64_cpu --smt=off
[root@als0153 ~]# ppc64_cpu --smt=on
[root@als0153 ~]# uname -a
Linux als0153.austin.ibm.com 3.11.6-301.fc20.ppc64p7 #1 SMP Mon Oct 21 18:49:17 MST 2013 ppc64 ppc64 ppc64 GNU/Linux
[root@als0153 ~]#

Comment 3 Justin M. Forbes 2014-02-24 14:02:47 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.13.4-200.fc20.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.
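
Before retesting, it may be worth confirming that the running kernel is at least the rebased version mentioned above. The following is a rough sketch, not an official Fedora tool: the function name kernel_at_least is hypothetical, and it assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Sketch: succeed if a kernel version is at least the required one.
# kernel_at_least is a hypothetical helper; sort -V is assumed to be available.
kernel_at_least() {
    # $1 = required version, $2 = version to check (defaults to running kernel)
    required="$1"
    current="${2:-$(uname -r)}"
    # The required version must sort first (or be equal) in version order.
    lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}
```

For example, kernel_at_least 3.13.4-200.fc20 && echo "kernel is new enough to retest" checks the running kernel against the version named in this update.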

Comment 4 Justin M. Forbes 2014-03-17 18:44:48 UTC
*********** MASS BUG UPDATE **************

This bug has been in a needinfo state for several weeks and is being closed with insufficient data due to inactivity. If this is still an issue with Fedora 20, please feel free to reopen the bug and provide the additional information requested.

