Bug 2139901 - High cyclictest latency on SNO DU node with 8-6 rt-kernel on icelake CPU
Summary: High cyclictest latency on SNO DU node with 8-6 rt-kernel on icelake CPU
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel-rt
Version: 8.6
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: rc
: ---
Assignee: Juri Lelli
QA Contact: Qiao Zhao
URL:
Whiteboard:
: 2166701 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-11-03 18:52 UTC by Dwaine Gonyier
Modified: 2023-08-14 11:27 UTC (History)
26 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:
dosman: needinfo-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-138733 0 None None None 2022-11-09 04:48:26 UTC

Internal Links: 2179366

Comment 10 Dwaine Gonyier 2022-11-15 19:30:45 UTC
I'll take a look at adding this pre-test delay to cyclic test runs

Comment 14 Bart Wensley 2022-11-21 14:25:52 UTC
Thanks Dwaine - change looks good to me.

Comment 15 Juri Lelli 2022-11-30 14:20:28 UTC
Me and Daniel Bristot de Oliveira continue investigation; still unclear
if this is a regression in software, tuning/hw issue, or combination
thereof. Takes 12-24h to reproduce this issue, which is making debugging
challenging. Issue is present on both 8.4z and 8.6z.

Since no evident culprit can be seen from traces, we decided to try to
do some bisect with respect to older kernels. Brent and Bart are also
helping with this approach.

Brent, Bart, maybe you want to share status of your findings so far?

Thanks!

Comment 31 Daniel Bristot de Oliveira 2023-02-09 20:15:58 UTC
Hi Bart,

The problem we tracked down using timerlat was a kworker dealing with DRM driver.

The summary of the debugged evidence are:

  - high timer interrupt latency
  - current CPU running on idle
  - no IRQs in the current CPU before the timer

Then you look at what is running on other CPUs you will see an agetty waking up a kworker, and the kworker runs a DRM driver work.

I am adding two features to rtla timerlat to make the debugging of these cases obvious:

  - The tool prints an auto analysis at the end of the execution if the trace hits the threshold
  - The tool can also (optionally) print the tasks running on other CPUs, and if the task is a kworker, the last work it ran.

The patch is here:
  https://lore.kernel.org/lkml/3d45f40e630317f51ac6d678e2d96d310e495729.1675179318.git.bristot@kernel.org/

Comment 32 Daniel Bristot de Oliveira 2023-02-09 20:42:22 UTC
Here is one example of the output, also showing the tasks on other CPUs.... it is not the same problem because I do not have a reproducer, but it is a good example of how the tool output would look like.

In the "IRQ handler delay:" would be the major factor (highest %), the current thread is swapper (like in the example), but the stack trace was the poll idle.

Then a CPU would print a line like the one of CPU 6, a kworker, running a work of a DRM driver.

----- output sample ---------

# timerlat -a 20 --dump-task -c 11-23 -q
                                     Timer Latency                                              
  0 00:00:01   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
 11 #299       |        0         0         0         1 |        3         2         3         8
 12 #299       |        0         0         0         2 |        3         2         4         8
 13 #299       |        0         0         0         2 |        3         2         3         5
 14 #299       |        0         0         0         1 |        3         2         3         4
 15 #299       |        0         0         0         1 |        3         2         2         5
 16 #233       |        0         0         0         2 |       22         2         3        22
 17 #299       |        0         0         0         1 |        3         2         3         4
 18 #298       |        0         0         0         1 |        4         2         3         5
 19 #298       |        0         0         0         1 |        3         1         3         5
 20 #298       |        0         0         0         1 |        3         2         3         5
 21 #298       |        0         0         0         1 |        3         2         3         4
 22 #298       |        0         0         0         1 |        3         2         3         6
 23 #298       |        0         0         0         1 |        3         2         3         3
rtla timerlat hit stop tracing
## CPU 16 hit stop tracing, analyzing it ##
  IRQ handler delay:		                	     0.67 us (3.02 %)
  IRQ latency:						     0.89 us
  Timerlat IRQ duration:				     1.63 us (7.40 %)
  Blocking thread:					     3.30 us (14.94 %)
	              swapper/16:0        		     3.30 us
    Blocking thread stack trace
		-> timerlat_irq
		-> __hrtimer_run_queues
		-> hrtimer_interrupt
		-> __sysvec_apic_timer_interrupt
		-> sysvec_apic_timer_interrupt
		-> asm_sysvec_apic_timer_interrupt
		-> cpuidle_enter_state
		-> cpuidle_enter
		-> do_idle
		-> cpu_startup_entry
		-> start_secondary
		-> secondary_startup_64_no_verify
  IRQ interference					     2.18 us (9.90 %)
	                    eno1:64 			     2.18 us
  Softirq interference					    13.76 us (62.36 %)
	                  NET_RX:3  			    13.76 us
------------------------------------------------------------------------
  Thread latency:					    22.06 us (100%)

Max timerlat IRQ latency from idle: 2.36 us in cpu 12

Printing CPU tasks:
    [000]                swapper/0:0
    [001]                swapper/1:0
    [002]                swapper/2:0
    [003]                swapper/3:0
    [004]                         :0
    [005]                swapper/5:0
    [006]            kworker/u64:3:3430 kworker:<...>:flush_to_ldisc
    [007]                swapper/7:0
    [008]                swapper/8:0
    [009]                swapper/9:0
    [010]                     bash:2554
    [011]               swapper/11:0
    [012]               swapper/12:0
    [013]                     sshd:2476
    [014]               swapper/14:0
    [015]               swapper/15:0
    [016]              timerlat/16:3518
    [017]               swapper/17:0
    [018]               swapper/18:0
    [019]               swapper/19:0
    [020]               swapper/20:0
    [021]               swapper/21:0
    [022]               swapper/22:0
    [023]               swapper/23:0
  Saving trace to timerlat_trace.txt


---- 

(for curiosity, in this case, the problem was a softirq caused by the network IRQ. See how easy it is to find the source of a latency? (it is a non-rt kernel, so we have softirq).

Comment 36 Nahian 2023-03-06 14:17:15 UTC
Seeing the same issue in SPR (ocp 4.12.x). Was able to get a better result once we removed the console modules.

Comment 64 Eder Zulian 2023-03-28 18:20:45 UTC
*** Bug 2166701 has been marked as a duplicate of this bug. ***

Comment 69 Bart Wensley 2023-04-12 12:02:02 UTC
It sounds like this could be related to BZ2179366 - one of the recommendations there is to remove console=tty0 from the kernel arguments.

Comment 70 Juri Lelli 2023-04-12 13:34:02 UTC
(In reply to Bart Wensley from comment #69)
> It sounds like this could be related to BZ2179366 - one of the
> recommendations there is to remove console=tty0 from the kernel arguments.

Indeed! Could we maybe try to do a run after removing that argument?

Comment 71 Dahir Osman 2023-04-12 13:43:33 UTC
(In reply to Juri Lelli from comment #70)
> (In reply to Bart Wensley from comment #69)
> > It sounds like this could be related to BZ2179366 - one of the
> > recommendations there is to remove console=tty0 from the kernel arguments.
> 
> Indeed! Could we maybe try to do a run after removing that argument?

I will try it out.

Comment 82 Jocelyn Falempe 2023-05-16 10:48:53 UTC
There was too much conflicts to solve, so I've pushed a branch on top of my DRM backport for rhel8.9
But you can still build it and install it on rhel8.6:

https://gitlab.com/kdj0c/rhel-8/-/tree/mgag200_dma_rhel8.9

I've made a brew build here:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=52633516

Comment 83 Juri Lelli 2023-05-16 15:06:01 UTC
(In reply to Jocelyn Falempe from comment #82)
> There was too much conflicts to solve, so I've pushed a branch on top of my
> DRM backport for rhel8.9
> But you can still build it and install it on rhel8.6:
> 
> https://gitlab.com/kdj0c/rhel-8/-/tree/mgag200_dma_rhel8.9
> 
> I've made a brew build here:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=52633516

I took your branch and merged it with rhel8/main-rt (as testing for this
BZ has been using the RT kernel). Had to fix a couple of conflicts which
hopefully I did get right. Could we please re-test with the following?

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=52639012

Thanks!

Comment 84 Dahir Osman 2023-05-19 12:35:45 UTC
Here are the result for this new kernel build.

############# dumping env ###########
delay=30            
HOSTNAME=cyclictest
DURATION=24h  
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_ADDR=172.30.0.1                          
container=oci                     
KUBERNETES_PORT=tcp://172.30.0.1:443
PWD=/root                            
HOME=/root
INTERVAL=100                         
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP_PORT=443                                                                                                                                                                                                                                                                                           KUBERNETES_PORT_443_TCP=tcp://172.30.0.1:443                                                                                                                                                                                                                                                                               tool=cyclictest                                                                                                                                                                                                                                                                                                            
TERM=xterm                           
NSS_SDB_USE_CACHE=no
SHLVL=1                    
rt_priority=95                                              
KUBERNETES_SERVICE_PORT=443                                                                                                                                  
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
KUBERNETES_SERVICE_HOST=172.30.0.1                                                                                                                           
_=/usr/bin/env      
#####################################
           
########## container info ###########                                                                                                                        
/proc/cmdline:                                                                                                                                               
BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-bed5cd4096e872eed8365a4fa9c134670b4d5e6833ff393045f2514acef37b00/vmlinuz-4.18.0-492.rt7.281.el8.bz2139901.dma.g7eae.x86_64 ignition.platform.id=metal console=tty0 console=ttyS0,115200n8 ostree=/ostree/boot.1/rhcos/bed5cd4096e872eed8365a4fa9c134670b4d5e6833ff393045f2514acef37b00/0
 root=UUID=dfcac331-e86f-4b5f-a33b-d75b6f8a2b38 rw rootflags=prjquota boot=UUID=00b97106-52e5-4838-84d3-df42b2cd8f99 crashkernel=512M skew_tick=1 nohz=on rcu_nocbs=2-31,34-63 tuned.non_isolcpus=00000003,00000003 systemd.cpu_affinity=0,1,32,33 intel_iommu=on iommu=pt isolcpus=managed_irq,2-31,34-63 nohz_full=2-31,3
4-63 tsc=nowatchdog nosoftlockup nmi_watchdog=0 mce=off rcutree.kthread_prio=11 default_hugepagesz=1G hugepagesz=1G hugepages=32 rcupdate.rcu_normal_after_boot=0 efi=runtime module_blacklist=irdma intel_pstate=disable
#####################################                                                                                                                        
**** uid: 0 ****                                                                                                                                             
allowed cpu list: 2-9,34-41                                                                                                                                  
cyclictest 4.18.0-492.rt7.281.el8.bz2139901.dma.g7eae.x86_64                                                                                                 
removing cpu34 from the cpu list because it is a sibling of cpu2 which will be the mainaffinity                       
new cpu list: 3,4,5,6,7,8,9,35,36,37,38,39,40,41                                                                                                             
running cmd: cyclictest -q -D 24h -p 95 -t 14 -a 3,4,5,6,7,8,9,35,36,37,38,39,40,41 -h 30 -i 100 --mainaffinity 2 -m  
sleep 30 before test                                                                                                                                         
# /dev/cpu_dma_latency set to 0us                                                                                                                            
# Histogram                                                                                                                                                  
000000 000000   000000  000000  000000  000000  000000  000000  000000  000000  000000  000000  000000  000000  000000
000001 000000   000000  000000  000000  000000  000000  000000  000000  000000  000000  000000  000000  000000  000000
000002 855696427        89988731        859494159       859632645       860384157       820132009       857866838       862010338       058065  862829963       791073587       854493837       767166401       862092844
000003 8270677  773958874       4474448 4336505 3582563 43827272        6098070 1962116 863903476       1140366 72656437        9485119 96727181        1879550
000004 025208   040754  016588  024386  026482  032061  027803  021769  028777  020327  262420  015469  097785  022797
000005 004985   006862  011183  005140  004078  005856  004069  003750  006725  006392  004757  003018  004930  004053
000006 001962   002540  002384  000966  002245  001760  002912  001738  001721  001962  001009  001816  002035  000566
000007 000649   001634  001080  000240  000192  000385  000210  000222  000327  000880  000376  000371  000299  000074
000008 000013   000325  000030  000050  000062  000273  000016  000004  000336  000015  000112  000152  000115  000004
000009 000001   000156  000034  000023  000074  000165  000006  000006  000232  000006  000131  000066  000070  000008
000010 000000   000073  000019  000009  000055  000132  000007  000006  000169  000004  000051  000048  000053  000005
000011 000001   000036  000010  000005  000035  000041  000003  000000  000083  000002  000037  000017  000035  000001
000012 000001   000004  000009  000001  000005  000002  000002  000000  000010  000000  000020  000000  000021  000003
000013 000001   000002  000009  000002  000001  000001  000003  000002  000000  000003  000008  000001  000007  000001
000014 000001   000000  000004  000001  000001  000002  000006  000002  000001  000000  000094  000001  000123  000000
000015 000001   000001  000002  000001  000001  000000  000003  000000  000000  000000  000100  000001  000076  000000                              
000016 000003   000000  000001  000002  000001  000000  000000  000000  000000  000001  000034  000000  000034  000000
000017 000000   000000  000001  000000  000003  000000  000002  000000  000000  000000  000111  000000  000128  000000
000018 000002   000000  000001  000000  000001  000000  000000  000000  000000  000000  000107  000000  000096  000000
000019 000007   000000  000001  000000  000001  000000  000002  000000  000000  000000  000069  000000  000065  000000
000020 000003   000000  000000  000000  000002  000000  000001  000000  000000  000000  000071  000000  000074  000000
000021 000002   000000  000000  000000  000000  000000  000001  000000  000000  000000  000068  000000  000068  000000
000022 000000   000000  000000  000000  000000  000000  000000  000000  000000  000000  000066  000000  000054  000000
000023 000000   000000  000000  000000  000000  000000  000001  000000  000000  000000  000036  000000  000037  000000                                                                                                                                                                                                     
000024 000001   000000  000000  000000  000001  000000  000000  000000  000000  000000  000020  000000  000021  000000
000025 000000   000000  000001  000000  000000  000000  000000  000000  000000  000000  000011  000001  000009  000000
000026 000001   000000  000000  000000  000000  000000  000000  000000  000000  000000  000004  000000  000004  000000
000027 000001   000000  000000  000001  000000  000000  000000  000000  000000  000000  000004  000000  000005  000000
000028 000001   000000  000002  000000  000000  000001  000001  000000  000001  000000  000003  000000  000002  000000
000029 000000   000001  000005  000000  000000  000000  000000  000000  000000  000000  000001  000001  000002  000000
# Total: 863999948 863999993 863999971 863999977 863999960 863999960 863999956 863999953 863999923 863999921 863999744 863999918 863999730 863999906
# Min Latencies: 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002                                                                                                                                                                                                                       
# Avg Latencies: 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002 00002
# Max Latencies: 00107 00088 00117 00058 00098 00088 00101 00063 00087 00065 00131 00088 00131 00061                                                                                                                                                                                                                       
# Histogram Overflows: 00009 00007 00017 00009 00008 00007 00008 00008 00007 00008 00179 00007 00178 00008
# Histogram Overflow at cycle number:
# Thread 0: 24028322 167458592 278296832 292348542 518296853 641775101 723826932 839486941 857092800
# Thread 1: 167458634 278296874 292348584 518296895 641775144 723826975 839486985
# Thread 2: 24028353 167458623 236092463 278296863 292348573 431447013 482932643 482932653 517867343 518296884 546646973 575917432 641775132 723826963 791036891 834257481 857092831
# Thread 3: 00026 36946880 167458620 278296860 292348570 518296881 536596970 641775130 723826961
# Thread 4: 167458601 278296841 292348551 518296862 641775111 723826942 839486952 857092811
# Thread 5: 167458601 278296841 292348551 518296862 641775111 723826942 839486952
# Thread 6: 167458598 278296838 292348548 518296859 641775107 723826938 839486948 857092807
# Thread 7: 36946855 167458595 278296835 292348545 518296856 536596945 641775105 723826936
# Thread 8: 167458564 278296804 292348514 518296825 641775074 723826905 839486915
# Thread 9: 36946823 167458563 278296803 292348513 518296824 536596913 641775073 723826904
# Thread 10: 3360730 3796810 3807260 4707210 8617760 24028280 24028290 32601750 36946821 47756210 50829670 56617560 61867590 65903980 75796870 76716100 81057300 81198710 81528120 88293090 92229370 95781740 102548240 108476880 109407470 110746760 114097740 114097780 114097790 114097820 # 00149 others
# Thread 11: 167458559 278296799 292348509 518296820 641775069 723826900 839486910
# Thread 12: 3360715 3796795 3807245 4707195 8617745 24028265 24028275 32601735 36946806 47756195 50829655 56617545 61867575 65903965 75796855 76716085 81057285 81198695 81528105 88293075 92229355 95781725 102548225 108476865 109407455 110746745 114097725 114097765 114097775 114097805 # 00148 others
# Thread 13: 36946808 167458548 278296788 292348498 518296809 536596898 641775058 723826889

Comment 85 Juri Lelli 2023-06-21 08:28:54 UTC
Hi Jocelyn,

Do you think we have other options to try out?

Thanks!


Note You need to log in before you can comment on or make changes to this bug.