Bug 1837816

Summary:	NMI backtrace for cpu 0 when run oslat
Product:	Red Hat Enterprise Linux 8	Reporter:	Qiao Zhao <qzhao>
Component:	kernel-rt	Assignee:	Juri Lelli <jlelli>
kernel-rt sub component:	Core-Kernel	QA Contact:	Qiao Zhao <qzhao>
Status:	CLOSED NOTABUG	Docs Contact:
Severity:	unspecified
Priority:	unspecified	CC:	bhu, lgoncalv, peterx, qzhao, rt-qe, williams
Version:	8.2	Flags:	pm-rhel: mirror+
Target Milestone:	rc
Target Release:	8.3
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2020-05-28 01:09:24 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Qiao Zhao 2020-05-20 04:03:23 UTC

Description of problem:


[ 1658.641062] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 1658.641066] rcu: 	1-....: (62659 ticks this GP) idle=542/1/0x4000000000000002 softirq=0/0 fqs=14956 
[ 1658.641067] rcu: 	 (t=60000 jiffies g=24841 q=23)
[ 1658.641070] Sending NMI from CPU 1 to CPUs 0:
[ 1658.642073] NMI backtrace for cpu 0
[ 1658.642074] CPU: 0 PID: 3462 Comm: oslat Kdump: loaded Not tainted 4.18.0-193.rt13.51.el8.x86_64 #1
[ 1658.642074] Hardware name: NEC Express5800/R120f-2M [N8100-2226F]/MS-S0901, BIOS 5.0.8022 06/22/2015
[ 1658.642074] RIP: 0033:0x402245
[ 1658.642075] Code: 2f 20 00 75 f0 0f ae e8 0f 31 48 c1 e2 20 89 c0 48 09 c2 48 89 55 20 0f ae e8 0f 31 48 c1 e2 20 89 c3 48 09 d3 0f ae e8 0f 31 <48> c1 e2 20 41 89 c4 48 89 ef 49 09 d4 4c 89 e6 48 29 de e8 83 fc
[ 1658.642076] RSP: 002b:00007f38c4486ad0 EFLAGS: 00000246
[ 1658.642076] RAX: 000000001349b312 RBX: 005c729c1349b2c8 RCX: 00000000000008fa
[ 1658.642077] RDX: 00000000005c729c RSI: 0000000000000040 RDI: 000000000218d930
[ 1658.642077] RBP: 000000000218d930 R08: 00007f38b4000b60 R09: 0000000000000003
[ 1658.642078] R10: 0000000000000000 R11: 000000000000000f R12: 005c729c1349b288
[ 1658.642078] R13: 00007fff2356cc2f R14: 0000000000000000 R15: 00007f38c4486bc0
[ 1658.642078] FS:  00007f38c4487700 GS:  0000000000000000
[ 1658.642095] NMI backtrace for cpu 1
[ 1658.642096] CPU: 1 PID: 3463 Comm: oslat Kdump: loaded Not tainted 4.18.0-193.rt13.51.el8.x86_64 #1
[ 1658.642097] Hardware name: NEC Express5800/R120f-2M [N8100-2226F]/MS-S0901, BIOS 5.0.8022 06/22/2015
[ 1658.642097] Call Trace:
[ 1658.642100]  <IRQ>
[ 1658.642107]  dump_stack+0x5c/0x80
[ 1658.642110]  nmi_cpu_backtrace.cold.4+0x13/0x4e
[ 1658.642112]  nmi_trigger_cpumask_backtrace+0x10a/0x11d
[ 1658.642116]  rcu_dump_cpu_stacks+0x9c/0xca
[ 1658.642118]  rcu_sched_clock_irq.cold.63+0x190/0x45e
[ 1658.642122]  ? tick_sched_do_timer+0x70/0x70
[ 1658.642125]  update_process_times+0x28/0x60
[ 1658.642127]  tick_sched_handle+0x22/0x60
[ 1658.642128]  tick_sched_timer+0x37/0x70
[ 1658.642130]  __hrtimer_run_queues+0x11f/0x390
[ 1658.642135]  ? activate_task+0x75/0x130
[ 1658.642138]  hrtimer_interrupt+0x10a/0x220
[ 1658.642140]  ? _raw_spin_lock+0x13/0x40
[ 1658.642143]  smp_apic_timer_interrupt+0x82/0x1f0
[ 1658.642145]  apic_timer_interrupt+0xf/0x20
[ 1658.642146]  </IRQ>
[ 1658.642147] RIP: 0033:0x401efd
[ 1658.642149] Code: c0 e9 75 ff ff ff 0f 1f 80 00 00 00 00 41 55 48 89 f0 31 d2 41 54 55 53 48 83 ec 08 8b 4f 10 48 f7 f1 85 c0 0f 88 b6 00 00 00 <8b> 15 fd 32 20 00 8d 68 01 49 89 fc 48 89 c3 41 89 c5 39 ea 7f 04
[ 1658.642150] RSP: 002b:00007f38c3c85aa0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 1658.642151] RAX: 0000000000000000 RBX: 005c729c134929de RCX: 0000000000000900
[ 1658.642152] RDX: 0000000000000047 RSI: 0000000000000047 RDI: 000000000218d978
[ 1658.642152] RBP: 000000000218d978 R08: 00007f38bc000b60 R09: 0000000000000003
[ 1658.642153] R10: 0000000000000000 R11: 000000000000000f R12: 005c729c13492997
[ 1658.642153] R13: 00007fff2356cc2f R14: 0000000000000000 R15: 00007f38c3c85bc0


Version-Release number of selected component (if applicable):
kernel-rt-4.18.0-193.rt13.51.el8.x86_64

Current active profile: latency-performance

# lscpu 
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              48
On-line CPU(s) list: 0-47
Thread(s) per core:  2
Core(s) per socket:  12
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
Stepping:            2
CPU MHz:             2599.970
CPU max MHz:         3100.0000
CPU min MHz:         1200.0000
BogoMIPS:            4599.84
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0-11,24-35
NUMA node1 CPU(s):   12-23,36-47
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts md_clear flush_l1d

How reproducible:
always

Steps to Reproduce:
1. Clone oslat https://github.com/xzpeter/oslat
2. Compile and run oslat with "./oslat --cpu-list 0,1 --rtprio 1 --runtime 10" or any cpu-list on CPU0
3. Check console log/dmesg can saw call trace.

Actual results:


Expected results:


Additional info:

Comment 4 Juri Lelli 2020-05-22 07:38:13 UTC

> Steps to Reproduce:
> 1. Clone oslat https://github.com/xzpeter/oslat
> 2. Compile and run oslat with "./oslat --cpu-list 0,1 --rtprio 1 --runtime
> 10" or any cpu-list on CPU0
> 3. Check console log/dmesg can saw call trace.

Oh, so this is Peter's new sysjitter like tool, isn't it?

I've got the impression that running it on non-isolated CPUs
(is this the case?) might simply hog system threads (e.g., RCU
cb, etc.).

Reading github README it seems that the tool is intended to be
run on fully isolated CPUs. Peter, is my impression correct?
Would you expect something like what's described in this BZ to
happen if oslat threads are run on housekeeping/non-isolated
CPUs?

Thanks!

Comment 5 Qiao Zhao 2020-05-22 08:05:42 UTC

(In reply to Juri Lelli from comment #4)
> > Steps to Reproduce:
> > 1. Clone oslat https://github.com/xzpeter/oslat
> > 2. Compile and run oslat with "./oslat --cpu-list 0,1 --rtprio 1 --runtime
> > 10" or any cpu-list on CPU0
> > 3. Check console log/dmesg can saw call trace.
> 
> Oh, so this is Peter's new sysjitter like tool, isn't it?

Yes, it's Peter's new tool oslat (old name sysjitter2)

> 
> I've got the impression that running it on non-isolated CPUs
> (is this the case?) might simply hog system threads (e.g., RCU
> cb, etc.).
> 
> Reading github README it seems that the tool is intended to be
> run on fully isolated CPUs. Peter, is my impression correct?
> Would you expect something like what's described in this BZ to
> happen if oslat threads are run on housekeeping/non-isolated
> CPUs?
> 
> Thanks!

Comment 6 Peter Xu 2020-05-22 17:55:31 UTC

Qiao,

Firstly I don't think we can run oslat (or any program) using FIFO:99, e.g., the ksoftirqd threads will be using FIFO:2 on most kernels, so FIFO:99 will mean none of the ksoftirqd (and other kernel threads that we still want) to be scheduled properly.  So when running these tests, I'd always be using FIFO:1.

Meanwhile, there's indeed a tricky point in oslat that we can't run FIFO priority (even fifo:1) on cpu0 because cpu0 is by default used by the main thread of oslat to receive stop signals.  So "--rtprio N --cpu-list 0,..." used together means it'll probably hang forever because the stopping signal is not gonna received correctly.  I've also added a warning into oslat tool (v0.1.1 pushed) to show that it's expected.  Let's just run without cpu0 for simplicity, or you can lift the main thread priority too to make it work again:

sudo chrt -f 2 ./oslat --rtprio 1 --cpu-list 0 --runtime 1

Then the main thread will be lifted to fifo:2 then the SIGARLM will still get its way.

Comment 7 Qiao Zhao 2020-05-25 03:08:08 UTC

(In reply to Peter Xu from comment #6)
> Qiao,
> 
> Firstly I don't think we can run oslat (or any program) using FIFO:99, e.g.,
> the ksoftirqd threads will be using FIFO:2 on most kernels, so FIFO:99 will
> mean none of the ksoftirqd (and other kernel threads that we still want) to
> be scheduled properly.  So when running these tests, I'd always be using
> FIFO:1.
> 
> Meanwhile, there's indeed a tricky point in oslat that we can't run FIFO
> priority (even fifo:1) on cpu0 because cpu0 is by default used by the main
> thread of oslat to receive stop signals.  So "--rtprio N --cpu-list 0,..."
> used together means it'll probably hang forever because the stopping signal
> is not gonna received correctly.  I've also added a warning into oslat tool
> (v0.1.1 pushed) to show that it's expected.  Let's just run without cpu0 for
> simplicity, or you can lift the main thread priority too to make it work
> again:
> 
> sudo chrt -f 2 ./oslat --rtprio 1 --cpu-list 0 --runtime 1
> 
> Then the main thread will be lifted to fifo:2 then the SIGARLM will still
> get its way.

Hi Peter,

Thanks for your explanation and fix/update code, now the results/format looks good. (v0.1.3) :)

- Qiao

Comment 8 Juri Lelli 2020-05-27 13:41:39 UTC

(In reply to Qiao Zhao from comment #7)
> (In reply to Peter Xu from comment #6)
> > Qiao,
> > 
> > Firstly I don't think we can run oslat (or any program) using FIFO:99, e.g.,
> > the ksoftirqd threads will be using FIFO:2 on most kernels, so FIFO:99 will
> > mean none of the ksoftirqd (and other kernel threads that we still want) to
> > be scheduled properly.  So when running these tests, I'd always be using
> > FIFO:1.
> > 
> > Meanwhile, there's indeed a tricky point in oslat that we can't run FIFO
> > priority (even fifo:1) on cpu0 because cpu0 is by default used by the main
> > thread of oslat to receive stop signals.  So "--rtprio N --cpu-list 0,..."
> > used together means it'll probably hang forever because the stopping signal
> > is not gonna received correctly.  I've also added a warning into oslat tool
> > (v0.1.1 pushed) to show that it's expected.  Let's just run without cpu0 for
> > simplicity, or you can lift the main thread priority too to make it work
> > again:
> > 
> > sudo chrt -f 2 ./oslat --rtprio 1 --cpu-list 0 --runtime 1
> > 
> > Then the main thread will be lifted to fifo:2 then the SIGARLM will still
> > get its way.
> 
> Hi Peter,
> 
> Thanks for your explanation and fix/update code, now the results/format
> looks good. (v0.1.3) :)

Good. Should we close this then?

Thanks!

Comment 9 Qiao Zhao 2020-05-28 01:09:24 UTC

(In reply to Juri Lelli from comment #8)
> (In reply to Qiao Zhao from comment #7)
> > (In reply to Peter Xu from comment #6)
> > > Qiao,
> > > 
> > > Firstly I don't think we can run oslat (or any program) using FIFO:99, e.g.,
> > > the ksoftirqd threads will be using FIFO:2 on most kernels, so FIFO:99 will
> > > mean none of the ksoftirqd (and other kernel threads that we still want) to
> > > be scheduled properly.  So when running these tests, I'd always be using
> > > FIFO:1.
> > > 
> > > Meanwhile, there's indeed a tricky point in oslat that we can't run FIFO
> > > priority (even fifo:1) on cpu0 because cpu0 is by default used by the main
> > > thread of oslat to receive stop signals.  So "--rtprio N --cpu-list 0,..."
> > > used together means it'll probably hang forever because the stopping signal
> > > is not gonna received correctly.  I've also added a warning into oslat tool
> > > (v0.1.1 pushed) to show that it's expected.  Let's just run without cpu0 for
> > > simplicity, or you can lift the main thread priority too to make it work
> > > again:
> > > 
> > > sudo chrt -f 2 ./oslat --rtprio 1 --cpu-list 0 --runtime 1
> > > 
> > > Then the main thread will be lifted to fifo:2 then the SIGARLM will still
> > > get its way.
> > 
> > Hi Peter,
> > 
> > Thanks for your explanation and fix/update code, now the results/format
> > looks good. (v0.1.3) :)
> 
> Good. Should we close this then?
> 
> Thanks!

Sure, we can close this bug as not a bug.

- Qiao