Bug 1797025 - Support "managed_irq" in "isolcpus=" parameter
Summary: Support "managed_irq" in "isolcpus=" parameter
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: tuned
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Jaroslav Škarvada
QA Contact: Robin Hack
URL:
Whiteboard:
Depends On: 1783026
Blocks: 1640832
Reported: 2020-01-31 18:47 UTC by Peter Xu
Modified: 2021-09-03 15:16 UTC

Fixed In Version: tuned-2.13.0-6.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 16:59:38 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | redhat-performance tuned pull 255 | 0 | None | closed | realtime: added support for managed_irq | 2021-02-11 12:07:22 UTC
Red Hat Issue Tracker | RHELPLAN-33965 | 0 | None | None | None | 2021-09-03 15:16:09 UTC
Red Hat Issue Tracker | RHELPLAN-38811 | 0 | None | None | None | 2021-09-03 15:16:01 UTC
Red Hat Product Errata | RHBA-2020:1883 | 0 | None | None | None | 2020-04-28 16:59:55 UTC

Description Peter Xu 2020-01-31 18:47:29 UTC
Kernel commit:

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=11ea68f553e244851d15793a7fa33a97c46d8271

Tuned needs to understand this new sub-parameter and apply it wherever isolated_cores is used.  In other words, we should switch kernel parameter usage from:

  isolcpus=X-Y

to:

  isolcpus=managed_irq,X-Y

This way the isolated cores (X-Y) won't be affected by kernel-managed IRQs, which can cause extra latency spikes.

Comment 1 Peter Xu 2020-01-31 21:21:08 UTC
I've raised a question here which could be a challenge...

https://bugzilla.redhat.com/show_bug.cgi?id=1783026#c29

I think one solution could be to offer another parameter in tuned, for example in realtime/realtime-variables.conf:

# Examples:
#
# isolated_cores=2,4-7
# isolated_cores=2-23
#
# Set this when we want to move kernel managed IRQs out of isolated
# cores.  Note that this requires kernel support.  Please only specify
# this parameter if you are sure that the kernel supports it.
#
# isolate_managed_irq=Y

Then we let the user choose whether to enable this.  Not sure whether this is the best way, though.

Comment 2 Peter Xu 2020-03-12 16:55:59 UTC
I just found a side effect of the managed_irq sub-parameter.  When the user specifies isolate_managed_irq=Y, instead of using:

  isolcpus=managed_irq,X-Y

I think we additionally need "domain":

  isolcpus=domain,managed_irq,X-Y

to make sure we keep HK_FLAG_DOMAIN in the kernel.

Let me explain...

The kernel plays a trick with the isolcpus= parameter: if there's no sub-parameter at all, it applies the default one, which is "domain" (HK_FLAG_DOMAIN).  If some sub-parameter is specified (in our case, managed_irq), it does not apply the default sub-parameter but uses only what is specified.

Before the managed_irq change, we were using "isolcpus=X-Y", which implies "isolcpus=domain,X-Y".

So, after the managed_irq change, if we want to keep the same behavior as before but also apply the managed IRQ logic, what we really need is "isolcpus=domain,managed_irq,X-Y".
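
As a rough illustration of the default-flag behavior described above, here is a simplified Python model.  It is not the kernel's actual parser; the function name split_isolcpus and the flag list (nohz, domain, managed_irq) are assumptions made for this sketch.

```python
# Simplified model of how isolcpus= sub-parameters are interpreted,
# following the description in this comment.  NOT the kernel code.
KNOWN_FLAGS = ("nohz", "domain", "managed_irq")  # assumed flag set


def split_isolcpus(value):
    """Split "flag,flag,cpu-list" into (flags, cpu_list).

    If no flag is given, the default flag "domain" is applied.
    If any flag is given, only the listed flags are used.
    """
    parts = value.split(",")
    flags = []
    # Leading items that are known flag names are sub-parameters;
    # everything after them is the CPU list.
    while parts and parts[0] in KNOWN_FLAGS:
        flags.append(parts.pop(0))
    if not flags:
        flags = ["domain"]  # implicit default when nothing is specified
    return flags, ",".join(parts)


if __name__ == "__main__":
    for value in ("2-9", "managed_irq,2-9", "domain,managed_irq,2-9"):
        print(value, "->", split_isolcpus(value))
```

In this model, "managed_irq,2-9" yields only the managed_irq flag, which is why "domain" has to be spelled out explicitly to keep the old behavior.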

Verifying this is easy: HK_FLAG_DOMAIN governs the scheduling domains.  So with only "isolcpus=managed_irq,X-Y", we should observe that even our bash shell may be scheduled on the isolated cores.  Just log in to any shell and try:

  $ taskset -pc $$

The correct result should not contain any isolated cores.
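
The same check could be scripted.  The following is only a sketch: the helper names parse_cpu_list and leaked_isolated_cpus are hypothetical, and the isolated range "2-9" is an example value that would need to match the machine under test.

```python
import os


def parse_cpu_list(spec):
    """Expand a CPU list such as "2,4-7" into a set of integers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def leaked_isolated_cpus(affinity, isolated_spec):
    """Return the isolated CPUs that appear in the given affinity set."""
    return sorted(set(affinity) & parse_cpu_list(isolated_spec))


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # Example value only; use the isolated_cores of the system being tested.
    isolated = "2-9"
    leaked = leaked_isolated_cpus(os.sched_getaffinity(0), isolated)
    if leaked:
        print("affinity includes isolated cores:", leaked)
    else:
        print("affinity contains no isolated cores")
```

An empty result corresponds to the "correct" taskset output, where the shell's affinity list contains none of the isolated cores.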

Pei, could you take a look at your test machines to see whether we have this problem after using the managed_irq sub-parameter?

Comment 3 Pei Zhang 2020-03-13 05:13:00 UTC
(In reply to Peter Xu from comment #2)

Peter,

After using "isolcpus=managed_irq,X-Y", it seems to show the correct result.

In guest:

# lscpu
...
NUMA node0 CPU(s):   0-9
...

# cat /proc/cmdline 
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-187.rt13.45.el8bz1779046.x86_64 root=/dev/mapper/rhel_vm--74--105-root ro console=tty0 console=ttyS0,115200n8 biosdevname=0 crashkernel=auto resume=/dev/mapper/rhel_vm--74--105-swap rd.lvm.lv=rhel_vm-74-105/root rd.lvm.lv=rhel_vm-74-105/swap skew_tick=1 isolcpus=2,3,4,5,6,7,8,9 intel_pstate=disable nosoftlockup nohz=on nohz_full=2,3,4,5,6,7,8,9 rcu_nocbs=2,3,4,5,6,7,8,9 default_hugepagesz=1G iommu=pt intel_iommu=on tsc=nowatchdog skew_tick=1 isolcpus=managed_irq,2,3,4,5,6,7,8,9 intel_pstate=disable nosoftlockup tsc=nowatchdog nohz=on nohz_full=2,3,4,5,6,7,8,9 rcu_nocbs=2,3,4,5,6,7,8,9


# taskset -pc $$
pid 9691's current affinity list: 0,1

Comment 4 Peter Xu 2020-03-13 20:15:51 UTC
Good to know it's not a problem downstream. (Time to dig into the reason, but maybe next week :)

Though we probably need that for upstream, or I must have missed something...  Maybe this can be confirmed when we work on this bz.  After all, it shouldn't hurt to add "domain" too: it has been the first sub-parameter and the default one from the very beginning, so specifying it explicitly shouldn't break anyone and always keeps the same behavior.

Comment 5 Luiz Capitulino 2020-03-17 16:50:55 UTC
Requesting exception, since a complete solution for bug 1783026
depends on this.

Comment 8 Jaroslav Škarvada 2020-03-17 17:13:26 UTC
I am OK with the 8.2 respin.

Comment 9 Jaroslav Škarvada 2020-03-17 17:16:51 UTC
Is this request about conditional or unconditional addition of the managed_irq? I.e. comment 1?

Comment 11 Peter Xu 2020-03-17 17:58:24 UTC
(In reply to Jaroslav Škarvada from comment #9)
> Is this request about conditional or unconditional addition of the
> managed_irq? I.e. comment 1?

Conditional.  Meanwhile, please have a look at comments 2-4, which I think we'd better still follow for upstream kernels (again, I haven't dug into why downstream isn't affected, but I think it should affect upstream kernels; it would be good if someone else could verify this too).

So I think this is the summary:

- If "isolate_managed_irq=Y" is specified, then add the sub-parameters "managed_irq,domain", as:

  isolcpus=managed_irq,domain,X-Y

  The "domain" is mainly for keeping the old behavior, as "isolcpus=X-Y" implicitly means "isolcpus=domain,X-Y".

- If "isolate_managed_irq=N" (default) is specified, then keeping the isolcpus= parameter as is would be fine, as:

  isolcpus=X-Y
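
The two cases above could be sketched as follows.  This is an illustration only, not the code from the tuned change; the function name build_isolcpus and its arguments are hypothetical.

```python
def build_isolcpus(isolated_cores, isolate_managed_irq=False):
    """Build the isolcpus= kernel parameter for the given core list.

    isolated_cores      -- CPU list string, e.g. "2-9" or "2,4-7"
    isolate_managed_irq -- hypothetical boolean mirroring isolate_managed_irq=Y/N
    """
    if isolate_managed_irq:
        # Spell out "domain" explicitly so the old default behavior is kept
        # once another sub-parameter is present.
        return "isolcpus=managed_irq,domain," + isolated_cores
    # Default: leave the parameter unchanged; "domain" is implied.
    return "isolcpus=" + isolated_cores


if __name__ == "__main__":
    print(build_isolcpus("2-9"))
    print(build_isolcpus("2-9", isolate_managed_irq=True))
```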

Thanks,

Comment 13 Jaroslav Škarvada 2020-03-20 17:06:35 UTC
https://github.com/redhat-performance/tuned/pull/255

Comment 25 errata-xmlrpc 2020-04-28 16:59:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1883

