Description of problem:
beaker test /kernel/Biscayne/ltp-lite causes
INFO: rcu_sched self-detected stall on CPU[11405.392812] INFO: rcu_sched self-detected stall on CPU

Version-Release number of selected component (if applicable):
kernel-4.4.0-0.rc5.22.el7

How reproducible:
Every time on AMD Seattle systems. Does not happen on Mustang or HP McDivitt.

Actual results:
[11225.895492] Task dump for CPU 1:
[11225.898709] float_power R running task 0 9919 9846 0x00000202
[11225.905751] Call Trace:
[11225.908188] [<fffffe0000091930>] ret_from_fork+0x0/0x50
[11405.391812] INFO: rcu_sched self-detected stall on CPU[11405.392812] INFO: rcu_sched self-detected stall on CPU
[11405.392815] 1-...: (10486090 ticks this GP) idle=5a1/140000000000001/0 softirq=145365/145365 fqs=3423851
[11405.392816] (t=10500232 jiffies g=91545 c=91544 q=229437)
[11405.392818] Task dump for CPU 0:
[11405.392819] float_power R running task 0 9915 9846 0x00000202
[11405.392822] Call Trace:
[11405.392825] [<fffffe0000091930>] ret_from_fork+0x0/0x50
[11405.392826] Task dump for CPU 1:
[11405.392826] float_power R running task 0 9919 9846 0x00000202
[11405.392828] Call Trace:
[11405.392830] [<fffffe0000096ed4>] dump_backtrace+0x0/0x17c
[11405.392833] [<fffffe0000097074>] show_stack+0x24/0x2c
[11405.392835] [<fffffe00000f19c0>] sched_show_task+0xa0/0xf4
[11405.392837] [<fffffe00000f3dac>] dump_cpu_task+0x48/0x54
[11405.392838] [<fffffe000011bf74>] rcu_dump_cpu_stacks+0xa4/0xf4
[11405.392840] [<fffffe000011feb4>] rcu_check_callbacks+0x4fc/0x8f4
[11405.392842] [<fffffe000012527c>] update_process_times+0x44/0x74
[11405.392844] [<fffffe0000134e38>] tick_sched_handle.isra.15+0x3c/0x7c
[11405.392846] [<fffffe0000134ec4>] tick_sched_timer+0x4c/0x84
[11405.392848] [<fffffe00001259d4>] __hrtimer_run_queues+0x13c/0x248
[11405.392850] [<fffffe00001262e0>] hrtimer_interrupt+0xa0/0x1d4
[11405.392853] [<fffffe00005c7a90>] arch_timer_handler_phys+0x3c/0x48
[11405.392855] [<fffffe0000115978>] handle_percpu_devid_irq+0x94/0x124
[11405.392857] [<fffffe0000110dd0>] generic_handle_irq+0x34/0x4c
[11405.392859] [<fffffe0000111158>] __handle_domain_irq+0x6c/0xc4
[11405.392860] [<fffffe00000904a4>] gic_handle_irq+0x64/0xb8
[11405.392862] Exception stack(0xfffffe035d893bb0 to 0xfffffe035d893cd0)
[11405.392863] 3ba0: fffffe035471bd7c 0000000000000000
[11405.392865] 3bc0: fffffe035d893d00 fffffe00001089f8 00000000a0000145 fffffe035471bd7c
[11405.392867] 3be0: 0000000000000000 0000000000000000 fffffe03fe0e5a80 fffffe03fe0e5a90
[11405.392868] 3c00: fffffe03fe0c5a80 0000000000000000 0000000000000000 0000000000000000
[11405.392870] 3c20: 00000000000000de 0000000000000028 000003ffb13c7af4 0000000032ab0ec0
[11405.392871] 3c40: 000000000000000c 000000000088e922 0000000021287924 0000000807cd621a
[11405.392873] 3c60: fffffe000009699c 000003ffb11fd028 000003ffe652c9f0 fffffe035471bd7c
[11405.392875] 3c80: 0000000000000000 0000000000004022 0000000000000000 0000000000000000
[11405.392876] 3ca0: fffffe035471bd7c 0000000000000000 fffffe035471bd60 fffffe0000782000
[11405.392877] 3cc0: fffffe035d890000 fffffe035d893d00
[11405.392879] [<fffffe00000914e8>] el1_irq+0x68/0xc0
[11405.392882] [<fffffe0000753cd4>] rwsem_down_write_failed+0x98/0x2f8
[11405.392884] [<fffffe0000753490>] down_write+0x60/0x64
[11405.392886] [<fffffe00001d8c94>] vm_mmap_pgoff+0x88/0xe8
[11405.392889] [<fffffe00001f0a50>] SyS_mmap_pgoff+0x190/0x214
[11405.392891] [<fffffe00000969f0>] sys_mmap+0x54/0x68
[11405.392893] [<fffffe0000091a0c>] __sys_trace_return+0x0/0x4
[11405.658723]
[11405.660383] 0-...: (10483669 ticks this GP) idle=2a7/140000000000001/0 softirq=144874/144874 fqs=3423936
[11405.670027] (t=10500510 jiffies g=91545 c=91544 q=229437)

Additional info:
beaker links in follow up comment
passes on: amd-seattle-05.lab.eng.rdu.redhat.com
repeatedly fails on: amd-seattle-06.khw.lab.eng.bos.redhat.com
This appears to be fixed in 4.4.0-0.23.el7.
Per comment #4, I am moving this to ON_QA so it can be closed.
ltp-lite ran on amd-seattle-05.khw.lab.eng.bos.redhat.com with the 4.5.0-0.rc3.27.el7 kernel with no rcu_sched stalls. I'll mark this verified and close it. (We can re-open it if the problem comes back.)
https://beaker.engineering.redhat.com/jobs/1219825
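If this needs re-checking on other hosts, the "no rcu_sched stalls" check can be scripted against a saved console log. The following is only a minimal sketch, not something used in the beaker job: the function name find_rcu_stalls is hypothetical, and the pattern assumes the banner text has the same form as the "INFO: rcu_sched self-detected stall on CPU" lines in the original report. It searches the whole text rather than line by line, because two CPUs can print the banner onto the same console line, as happened here.

```python
import re

# Assumption: stall banners look like the ones quoted in this report, e.g.
# "[11405.391812] INFO: rcu_sched self-detected stall on CPU".
STALL_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s*INFO: (rcu_\w+) self-detected stall on CPU"
)


def find_rcu_stalls(log_text):
    """Return (timestamp_seconds, rcu_flavor) for every stall banner found.

    The search runs over the whole text, so two banners that were
    interleaved onto a single console line are both reported.
    """
    return [(float(ts), flavor) for ts, flavor in STALL_RE.findall(log_text)]


# Sample taken from the console output in the original report.
SAMPLE = (
    "[11225.908188] [<fffffe0000091930>] ret_from_fork+0x0/0x50\n"
    "[11405.391812] INFO: rcu_sched self-detected stall on CPU"
    "[11405.392812] INFO: rcu_sched self-detected stall on CPU\n"
    "[11405.392818] Task dump for CPU 0:\n"
)

stalls = find_rcu_stalls(SAMPLE)
```

An empty result for a complete console log would correspond to the "no rcu_sched stalls" outcome above; any entries give the kernel timestamps to look at first.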