Bug 1820626

Summary: Change tuned default to "isolcpus=domain,managed_irq,X-Y"
Product: Red Hat Enterprise Linux 8
Reporter: Pei Zhang <pezhang>
Component: tuned
Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED ERRATA
QA Contact: Robin Hack <rhack>
Severity: low
Priority: low
Version: 8.3
CC: chayang, fbaudin, jeder, jinzhao, jskarvad, juzhang, lcapitulino, mtosatti, peterx, ppandit, rhack
Target Milestone: rc
Keywords: Patch, TestCaseNeeded, TestCaseProvided, Triaged, Upstream
Target Release: 8.0
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Fixed In Version: tuned-2.16.0-0.1.rc1.el8
Last Closed: 2021-11-09 19:58:24 UTC
Type: Bug
Bug Depends On: 1799014
Bug Blocks: 1932086

Description Pei Zhang 2020-04-03 13:30:55 UTC
Description of problem:

With kernel >= kernel-4.18.0-181.el8 (which contains the fix for Bug 1783026), we need "isolcpus=domain,managed_irq,X-Y" (the fix for Bug 1797025) on the kernel command line to achieve the expected latency performance for KVM-RT guests.

Currently we manually append "isolate_managed_irq=Y" to /etc/tuned/realtime-virtual-*-variables.conf to get "isolcpus=domain,managed_irq,X-Y". It would be better for the system to apply this setting automatically than to require users to do it manually.
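
As a minimal sketch of the mapping described above: the function below shows how the two variables from the variables.conf file could turn into the isolcpus= argument. The function name is hypothetical, and the behaviour without the flag (a bare core list) is an assumption; only the "managed_irq,domain," prefix form is taken from the /proc/cmdline output in this report. This is not Tuned's actual implementation.

```python
def build_isolcpus(isolated_cores: str, isolate_managed_irq: bool) -> str:
    """Return an isolcpus= kernel argument for the given core list.

    Hypothetical helper for illustration only; not part of Tuned.
    """
    # Flag order matches the /proc/cmdline output shown in this report.
    # Assumption: with the option off, only the core list is emitted.
    flags = "managed_irq,domain," if isolate_managed_irq else ""
    return f"isolcpus={flags}{isolated_cores}"


if __name__ == "__main__":
    print(build_isolcpus("1", True))
    print(build_isolcpus("2-19", False))
```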


Version-Release number of selected component (if applicable):
tuned-2.13.0-6.el8.noarch
kernel-rt-4.18.0-192.rt13.50.el8.x86_64


How reproducible:
100%


Steps to Reproduce:
1. Manually append "isolate_managed_irq=Y" to /etc/tuned/realtime-virtual-host-variables.conf; it works as expected on the RT host.

# cat /etc/tuned/realtime-virtual-host-variables.conf 
isolated_cores=1,3,5,7,9,11,13,15,17,19,12,14,16,18
isolate_managed_irq=Y

# tuned-adm profile realtime-virtual-host

# reboot

# cat /proc/cmdline 
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-192.rt13.50.el8.x86_64 root=/dev/mapper/rhel_dell--per430--09-root ro crashkernel=auto resume=/dev/mapper/rhel_dell--per430--09-swap rd.lvm.lv=rhel_dell-per430-09/root rd.lvm.lv=rhel_dell-per430-09/swap console=ttyS0,115200n81 skew_tick=1 isolcpus=managed_irq,domain,1,3,5,7,9,11,13,15,17,19,12,14,16,18 intel_pstate=disable nosoftlockup nohz=on nohz_full=1,3,5,7,9,11,13,15,17,19,12,14,16,18 rcu_nocbs=1,3,5,7,9,11,13,15,17,19,12,14,16,18 default_hugepagesz=1G iommu=pt intel_iommu=on tsc=nowatchdog mitigations=off


2. Manually append "isolate_managed_irq=Y" to /etc/tuned/realtime-virtual-guest-variables.conf; it works as expected in the RT guest.

# cat /etc/tuned/realtime-virtual-guest-variables.conf 
isolated_cores=1
isolate_managed_irq=Y

# tuned-adm profile realtime-virtual-guest

# cat /proc/cmdline 
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-192.rt13.50.el8.x86_64 root=/dev/mapper/rhel_vm--73--232-root ro console=tty0 console=ttyS0,115200n8 biosdevname=0 crashkernel=auto resume=/dev/mapper/rhel_vm--73--232-swap rd.lvm.lv=rhel_vm-73-232/root rd.lvm.lv=rhel_vm-73-232/swap skew_tick=1 isolcpus=managed_irq,domain,1 intel_pstate=disable nosoftlockup nohz=on nohz_full=1 rcu_nocbs=1 default_hugepagesz=1G iommu=pt intel_iommu=on tsc=nowatchdog mitigations=off

Actual results:
Manual setup is needed to get "isolcpus=domain,managed_irq,X-Y" on the kernel command line.

Expected results:
We request that "isolcpus=domain,managed_irq,X-Y" be set on the kernel command line by default.

Additional info:

Comment 1 Jaroslav Škarvada 2020-04-03 14:09:20 UTC
It's quite tricky on systems with multiple kernels.

Are you OK with the following simplification: if a kernel >= 4.18.0-181 is installed and it is the default kernel, always boot with "isolcpus=domain,managed_irq,X-Y"?

With this simplification, a user could still select an older kernel (< 4.18.0-181), and it would also be booted with "isolcpus=domain,managed_irq,X-Y". Implementing this to work differently for every installed kernel (and also supporting older kernels that could be manually installed later) would be quite tricky.
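
A rough sketch of the version check this simplification implies is shown below. The helper names and the regular expression are assumptions for illustration, not Tuned code. The threshold 4.18.0-181 comes from this bug and is only meaningful for RHEL 8 kernel release strings; other distributions number their kernels differently, so the comparison says nothing about them.

```python
import re

# Threshold from this bug: RHEL 8 kernels >= 4.18.0-181 accept managed_irq.
MIN_MANAGED_IRQ = (4, 18, 0, 181)


def release_tuple(release: str) -> tuple:
    """Parse 'X.Y.Z-N...' (as printed by `uname -r`) into (X, Y, Z, N)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def supports_managed_irq(release: str) -> bool:
    """Hypothetical check, valid only for RHEL-style release strings."""
    return release_tuple(release) >= MIN_MANAGED_IRQ


if __name__ == "__main__":
    print(supports_managed_irq("4.18.0-192.rt13.50.el8.x86_64"))
```

Note that such a check only inspects one release string; it does not address the case raised here, where the user later boots a different installed kernel.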

Comment 2 Luiz Capitulino 2020-04-03 18:39:06 UTC
Marcelo, Peter, is Jaroslav's solution from comment 1 good enough?

Comment 3 Peter Xu 2020-04-03 21:43:07 UTC
TBH I still think doing this in the kernel would be easier, but it seems I'm the only one who thinks so. :) So I'll leave this question to Marcelo.

Comment 4 Pei Zhang 2020-04-04 02:13:13 UTC
I'm also OK with making this change in the kernel.

If we finally decide to do this in the kernel, feel free to change the component from tuned to kernel. Thanks.

Comment 5 Marcelo Tosatti 2020-04-07 14:17:00 UTC
(In reply to Jaroslav Škarvada from comment #1)
> It's quite tricky on systems with multiple kernels.
> 
> Are you OK with the simplification: if there is installed kernel >=
> 4.18.0-181 and such kernel is the default, boot always with the
> "isolcpus=domain,managed_irq,X-Y"?
> 
> With such simplification the user could select some older kernel <
> 4.18.0-181 and it will be also booted with the
> "isolcpus=domain,managed_irq,X-Y", because  implementing this to work
> differently per every installed kernel (and also support it to work with
> older kernels which could be manually installed later) would be quite tricky.

[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-1123.rt56.1087noshat2.el7.x86_64 root=/dev/mapper/rhel_virtlab504-root ro crashkernel=auto rd.lvm.lv=rhel_virtlab504/root rd.lvm.lv=rhel_virtlab504/swap console=ttyS1,115200 LANG=en_US.UTF-8 default_hugepagesz=1G spectre_v2=off nopti skew_tick=1 isolcpus=managed_irq,domain,1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31 intel_pstate=disable nosoftlockup nohz=on nohz_full=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31 rcu_nocbs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
[    0.000000] sched: Error, all isolcpus= values must be between 0 and 32

Comment 6 Jaroslav Škarvada 2020-04-07 15:59:28 UTC
(In reply to Marcelo Tosatti from comment #5)
> (In reply to Jaroslav Škarvada from comment #1)
> > It's quite tricky on systems with multiple kernels.
> > 
> > Are you OK with the simplification: if there is installed kernel >=
> > 4.18.0-181 and such kernel is the default, boot always with the
> > "isolcpus=domain,managed_irq,X-Y"?
> > 
> > With such simplification the user could select some older kernel <
> > 4.18.0-181 and it will be also booted with the
> > "isolcpus=domain,managed_irq,X-Y", because  implementing this to work
> > differently per every installed kernel (and also support it to work with
> > older kernels which could be manually installed later) would be quite tricky.
> 
> [    0.000000] Kernel command line:
> BOOT_IMAGE=/vmlinuz-3.10.0-1123.rt56.1087noshat2.el7.x86_64
> root=/dev/mapper/rhel_virtlab504-root ro crashkernel=auto
> rd.lvm.lv=rhel_virtlab504/root rd.lvm.lv=rhel_virtlab504/swap
> console=ttyS1,115200 LANG=en_US.UTF-8 default_hugepagesz=1G spectre_v2=off
> nopti skew_tick=1
> isolcpus=managed_irq,domain,1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
> intel_pstate=disable nosoftlockup nohz=on
> nohz_full=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
> rcu_nocbs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
> [    0.000000] sched: Error, all isolcpus= values must be between 0 and 32

This is an error from the kernel. It probably doesn't like 'isolcpus=managed_irq,domain,' because the kernel is 3.10.0-1123.rt56.1087noshat2, which is older than 4.18.0-181 and probably doesn't support this feature. Also, this is a RHEL-8 bugzilla, but the kernel in the log is a RHEL-7 kernel.

Automatic handling of the "managed_irq,domain" options is not yet implemented in Tuned; it needs to be set manually with the "isolate_managed_irq" option in /etc/tuned/realtime-virtual-host-variables.conf.

I agree with comment 3 that this will be easier to handle on the kernel side. There will be more problems on the Tuned side. For example, Tuned would check the kernel and update the grub entries, but the machine would then have to be rebooted for the changes to be applied to the kernel. This would be triggered, for example, when upgrading from a pre-4.18.0-181 kernel to kernel-4.18.0-181 (or later).

Comment 7 Marcelo Tosatti 2020-04-08 12:01:56 UTC
(In reply to Jaroslav Škarvada from comment #6)
> (In reply to Marcelo Tosatti from comment #5)
> > (In reply to Jaroslav Škarvada from comment #1)
> > > It's quite tricky on systems with multiple kernels.
> > > 
> > > Are you OK with the simplification: if there is installed kernel >=
> > > 4.18.0-181 and such kernel is the default, boot always with the
> > > "isolcpus=domain,managed_irq,X-Y"?
> > > 
> > > With such simplification the user could select some older kernel <
> > > 4.18.0-181 and it will be also booted with the
> > > "isolcpus=domain,managed_irq,X-Y", because  implementing this to work
> > > differently per every installed kernel (and also support it to work with
> > > older kernels which could be manually installed later) would be quite tricky.
> > 
> > [    0.000000] Kernel command line:
> > BOOT_IMAGE=/vmlinuz-3.10.0-1123.rt56.1087noshat2.el7.x86_64
> > root=/dev/mapper/rhel_virtlab504-root ro crashkernel=auto
> > rd.lvm.lv=rhel_virtlab504/root rd.lvm.lv=rhel_virtlab504/swap
> > console=ttyS1,115200 LANG=en_US.UTF-8 default_hugepagesz=1G spectre_v2=off
> > nopti skew_tick=1
> > isolcpus=managed_irq,domain,1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
> > intel_pstate=disable nosoftlockup nohz=on
> > nohz_full=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
> > rcu_nocbs=1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
> > [    0.000000] sched: Error, all isolcpus= values must be between 0 and 32
> 
> This is error from the kernel - it probably doesn't like the
> 'isolcpus=managed_irq,domain,' because it's 3.10.0-1123.rt56.1087noshat2
> which is lower than the 4.18.0-181 and probably it doesn't support this
> feature. Also this is RHEL-8 bugzilla, but the kernel from the log is RHEL-7
> kernel.
> 
> Automatic handling of the "managed_irq,domain" options is not yet
> implemented in the Tuned, it needs to be set manually by the
> "isolate_managed_irq" option in the
> /etc/tuned/realtime-virtual-host-variables.conf.
> 
> I agree with the comment 3 that handling of this will be easier on the
> kernel side. There will be more problems on the Tuned side - e.g. Tuned will
> check the kernel, update the grub entries, but the machine will have to be
> rebooted for the changes to be applied to the kernel - this will trigger
> e.g. when upgrading from the pre kernel-4.18.0-181 to the kernel-4.18.0-181
> (or later).

Jaroslav,

This will break the Tuned profile on older systems, silently.

So better have "isolcpus=managed_irq,domain" as a default on new Tuned versions.

Can't we make this new tuned package depend on the kernel version that contains
"isolcpus=managed_irq,domain" ? Seems like a decent temporary solution.

Comment 8 Jaroslav Škarvada 2020-04-08 15:54:38 UTC
(In reply to Marcelo Tosatti from comment #7)
> This will break the Tuned profile on older systems, silently.
> 
> So better have "isolcpus=managed_irq,domain" as a default on new Tuned
> versions.
>
I.e. 'isolate_managed_irq=Y' would be the new default.

> Can't we make this new tuned package depend on the kernel version that
> contains
> "isolcpus=managed_irq,domain" ? Seems like a decent temporary solution.

Not on the Tuned package (because there may be customers who, for some reason, stick with an older kernel version), but such a requirement on the tuned-profiles-realtime package could work as a temporary solution. It would force the user to have the new kernel installed, but it would not prevent her or him from downgrading or from changing the default kernel to an older version.

Comment 12 Prasad Pandit 2021-06-07 11:39:05 UTC
'isolcpus=managed_irq' parameter was introduced upstream via
  -> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=11ea68f553e244851d15793a7fa33a97c46d8271
     genirq, sched/isolation: Isolate from handling managed interrupts
     ...
     Add a new sub-parameter 'managed_irq' for 'isolcpus' and the corresponding
     logic in the interrupt affinity selection code.

$ git tag --contains 11ea68f553e244851d15793a7fa33a97c46d8271 |less
v5.6
v5.7
v5.8
v5.9
...
v5.12
v5.13-rc5
===

root@ibm-p8-kvm-03-guest-02:~# cat /etc/debian_version 
10.0
root@ibm-p8-kvm-03-guest-02:~# 
root@ibm-p8-kvm-03-guest-02:~# uname -r
4.19.0-5-amd64
root@ibm-p8-kvm-03-guest-02:~# 
root@ibm-p8-kvm-03-guest-02:~# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-4.19.0-5-amd64 root=UUID=aaae86d5-7ddd-486a-8e79-9aeba5d60623 ro console=ttyS0,115200n8 isolcpus=managed_irq,domain,1
root@ibm-p8-kvm-03-guest-02:~#
root@ibm-p8-kvm-03-guest-02:~# journalctl | grep -i 'isolcpus: '
Jun 07 07:02:34 ibm-p8-kvm-03-guest-02 kernel: isolcpus: Error, unknown flag
root@ibm-p8-kvm-03-guest-02:~#
---

[root@localhost ~]# cat /etc/centos-release
CentOS Linux release 8.2.2004 (Core)
[root@localhost ~]#
[root@localhost ~]# uname -r
4.18.0-193.6.3.el8_2.x86_64
[root@localhost ~]# 
[root@localhost ~]# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-193.6.3.el8_2.x86_64 root=UUID=4fd120e4-1f6d-46b3-a404-5569ef6af1f9 ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=40f14688-2619-4046-a9eb-b7333fff1b84 console=ttyS0,115200 isolcpus=managed_irq,domain,1
[root@localhost ~]# 
[root@localhost ~]# journalctl | grep -i managed_irq
Jun 07 07:13:22 localhost.localdomain kernel: Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-193.6.3.el8_2.x86_64 root=UUID=4fd120e4-1f6d-46b3-a404-5569ef6af1f9 ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=40f14688-2619-4046-a9eb-b7333fff1b84 console=ttyS0,115200 isolcpus=managed_irq,domain,1
[root@localhost ~]# 
[root@localhost ~]# journalctl | grep -i 'isolcpus: '
[root@localhost ~]#
---

[root@localhost ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.2 (Ootpa)
[root@localhost ~]# 
[root@localhost ~]# uname -r
4.18.0-193.19.1.el8_2.x86_64
[root@localhost ~]# 
[root@localhost ~]# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-193.19.1.el8_2.x86_64 root=UUID=9a1216f6-f049-48b9-8a89-2f4bb3e081df ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=72a5810d-589c-460d-9187-9289b86dac11 console=ttyS0,115200 skew_tick=1 isolcpus=managed_irq,domain,1 intel_pstate=disable nosoftlockup tsc=nowatchdog
[root@localhost ~]# 
[root@localhost ~]# journalctl | grep -i managed_irq
Jun 07 07:07:50 localhost.localdomain kernel: Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-193.19.1.el8_2.x86_64 root=UUID=9a1216f6-f049-48b9-8a89-2f4bb3e081df ro console=tty0 rd_NO_PLYMOUTH crashkernel=auto resume=UUID=72a5810d-589c-460d-9187-9289b86dac11 console=ttyS0,115200 skew_tick=1 isolcpus=managed_irq,domain,1 intel_pstate=disable nosoftlockup tsc=nowatchdog
[root@localhost ~]# journalctl | grep -i 'isolcpus: '
[root@localhost ~]#
---

[root@localhost ~]# cat /etc/fedora-release 
Fedora release 31 (Thirty One)
[root@localhost ~]# 
[root@localhost ~]# uname -r
5.3.8-300.fc31.x86_64
[root@localhost ~]# 
[root@localhost ~]# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.3.8-300.fc31.x86_64 root=UUID=8f105aa0-f486-42b6-866e-b7f53e89effa ro console=tty0 rd_NO_PLYMOUTH resume=UUID=f98a1ef3-02fe-4634-b83e-393bafbbf48a console=ttyS0,115200 isolcpus=managed_irq,domain,1
[root@localhost ~]# 
[root@localhost ~]# journalctl | grep -i managed_irq
Jun 07 06:45:14 localhost.localdomain kernel: Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.3.8-300.fc31.x86_64 root=UUID=8f105aa0-f486-42b6-866e-b7f53e89effa ro console=tty0 rd_NO_PLYMOUTH resume=UUID=f98a1ef3-02fe-4634-b83e-393bafbbf48a console=ttyS0,115200 isolcpus=managed_irq,domain,1
[root@localhost ~]# 
[root@localhost ~]# journalctl | grep -i 'isolcpus:'
Jun 07 06:45:14 localhost.localdomain kernel: isolcpus: Error, unknown flag
[root@localhost ~]# 
---


* The kernel logs a warning (via pr_warn()) if an isolcpus sub-parameter is not recognised; on the Debian 4.19 and Fedora 5.3 kernels above, the system still booted.

* Enabling 'isolate_managed_irq=Y' by default should not cause a fatal kernel boot error.
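
The journalctl checks above can be sketched as a small filter over kernel log lines. The function name is hypothetical; the message text is the one that appears in the Debian 4.19 and Fedora 5.3 logs in this comment. The sketch assumes the lines have already been collected (for example from `journalctl` output) and does not call journalctl itself.

```python
# Message observed in this comment on kernels without managed_irq support.
UNKNOWN_FLAG_MSG = "isolcpus: Error, unknown flag"


def has_unknown_isolcpus_flag(log_lines) -> bool:
    """Return True if any log line reports an unrecognised isolcpus flag."""
    return any(UNKNOWN_FLAG_MSG in line for line in log_lines)


if __name__ == "__main__":
    sample = [
        "Jun 07 06:45:14 localhost.localdomain kernel: isolcpus: Error, unknown flag",
    ]
    print(has_unknown_isolcpus_flag(sample))
```

An empty result, as on the CentOS 8.2 and RHEL 8.2 kernels above, means the kernel did not report an unrecognised flag.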

Comment 13 Prasad Pandit 2021-06-07 13:20:20 UTC
Raised PR#356
  -> https://github.com/redhat-performance/tuned/pull/356

Comment 18 Pei Zhang 2021-07-09 09:05:26 UTC
Testing update:

In RHEL 8.5 KVM-RT testing, this issue is gone with tuned-2.16.0-0.1.rc1.el8.noarch. Thank you all.

I think RHEL 9 also requires this fix. I've created a new bz to track it: Bug 1980680 - [RHEL9]Change tuned default to "isolcpus=domain,managed_irq,X-Y"


Testing details:

$ cat /etc/tuned/realtime-virtual-host-variables.conf 
#
# Variable settings below override the definitions from the
# /etc/tuned/realtime-variables.conf file.
#
# Examples:
# isolated_cores=2,4-7
isolated_cores=2-19
#
#
# Uncomment the 'isolate_managed_irq=Y' bellow if you want to move kernel
# managed IRQs out of isolated cores. Note that this requires kernel
# support. Please only specify this parameter if you are sure that the
# kernel supports it.
#
isolate_managed_irq=Y

#
# Set the desired combined queue count value using the parameter provided
# below. Ideally this should be set to the number of housekeeping CPUs i.e.,
# in the example given below it is assumed that the system has 4 housekeeping
# (non-isolated) CPUs.
#
# netdev_queue_count=4


$ cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-321.rt7.102.el8.x86_64 root=/dev/mapper/rhel_dell--per430--11-root ro crashkernel=auto resume=/dev/mapper/rhel_dell--per430--11-swap rd.lvm.lv=rhel_dell-per430-11/root rd.lvm.lv=rhel_dell-per430-11/swap console=ttyS0,115200n81 skew_tick=1 isolcpus=managed_irq,domain,2-19 intel_pstate=disable nosoftlockup tsc=nowatchdog nohz=on nohz_full=2-19 rcu_nocbs=2-19 irqaffinity=0,1
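
A verification like the one above can be sketched as a small parser for the isolcpus= token of a kernel command line. The function name is hypothetical. The set of recognised flag names (nohz, domain, managed_irq) follows the upstream kernel parameter documentation and should be treated as an assumption for any particular kernel build.

```python
# Flag names accepted by isolcpus= per upstream kernel documentation (assumed).
KNOWN_FLAGS = {"nohz", "domain", "managed_irq"}


def parse_isolcpus(cmdline: str):
    """Return (flags, cpu_list) from the isolcpus= token, or None if absent."""
    for token in cmdline.split():
        if token.startswith("isolcpus="):
            parts = token[len("isolcpus="):].split(",")
            flags = []
            # Leading entries that are known flag names are flags;
            # everything after them is the CPU list.
            while parts and parts[0] in KNOWN_FLAGS:
                flags.append(parts.pop(0))
            return flags, ",".join(parts)
    return None


if __name__ == "__main__":
    line = "ro skew_tick=1 isolcpus=managed_irq,domain,2-19 nohz_full=2-19"
    print(parse_isolcpus(line))
```

In practice the input would be the contents of /proc/cmdline; the check passes when both "managed_irq" and "domain" appear in the returned flags.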

Comment 23 errata-xmlrpc 2021-11-09 19:58:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (tuned bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4476