Bug 1784645
| Summary: | Need a new tuned plug-in, irqbalance | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Andrew Theurer <atheurer> |
| Component: | tuned | Assignee: | Jaroslav Škarvada <jskarvad> |
| Status: | CLOSED ERRATA | QA Contact: | Robin Hack <rhack> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 8.3 | CC: | fsimonce, jeder, jmencak, jskarvad, lcapitulino, mtosatti, peterx, rhack, yquinn |
| Target Milestone: | rc | Keywords: | Patch, TestCaseProvided, Triaged, Upstream |
| Target Release: | 8.0 | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | tuned-2.14.0-0.1.rc1.el8 | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-11-04 02:03:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1817044, 1825061 | ||
Description
Andrew Theurer
2019-12-17 22:01:06 UTC
When applying or re-applying this profile, tuned would update the /etc/sysconfig/irqbalance config file and restart the irqbalance service.

Andrew, why exactly a new irqbalance plugin for tuned is necessary? What is necessary is that the Tuned operator performs the configurations necessary, in /sys/, to add or remove certain CPUs from the interrupt mask of irqs. Can you please explain?

Marcelo, I am not sure I understand your question, but the request here is based on avoiding the use of 'script' plug-in, and changing the irqbalance configuration currently happens with 'script' plug-in. There is no configuration that happens in /sys for irqs. And as long as irqbalance daemon is active, it is necessary to tell it to not use certain CPUs. There is no plan for tuned to alter /proc/irq/N/smp_affinity masks directly (while having irqbalance disabled).

(In reply to Andrew Theurer from comment #3)
> Marcelo, I am not sure I understand your question, but the request here is
> based on avoiding the use of 'script' plug-in, and changing the irqbalance
> configuration currently happens with 'script' plug-in. There is no
> configuration that happens in /sys for irqs. And as long as irqbalance
> daemon is active, it is necessary to tell it to not use certain CPUs. There
> is no plan for tuned to alter /proc/irq/N/smp_affinity masks directly (while
> having irqbalance disabled).

OK!

Would https://github.com/redhat-performance/tuned/pull/243 work until this is implemented properly as a tuned plugin?

(In reply to jmencak from comment #5)
> Would https://github.com/redhat-performance/tuned/pull/243 work until this
> is implemented properly as a tuned plugin?

It depends, what calls irqbalance_banned_cpus_setup() and is this something you can do in NTO?

(In reply to Andrew Theurer from comment #6)
> (In reply to jmencak from comment #5)
> > Would https://github.com/redhat-performance/tuned/pull/243 work until this
> > is implemented properly as a tuned plugin?
>
> It depends, what calls irqbalance_banned_cpus_setup() and is this something
> you can do in NTO?

Both the cpu-partitioning and realtime profiles call this via the script plugin. Yes, I can use/do this in NTO.

(In reply to Andrew Theurer from comment #1)
> When applying or re-applying this profile, tuned would update the
> /etc/sysconfig/irqbalance config file and restart the irqbalance service.

I think 'systemctl try-restart' is more appropriate instead of a hard restart, isn't it?

(In reply to Andrew Theurer from comment #3)
> There
> is no plan for tuned to alter /proc/irq/N/smp_affinity masks directly (while
> having irqbalance disabled).

I'm not sure what you're trying to say here, but note that the scheduler plugin already sets these affinities. To fixed values, that is. It doesn't do any balancing.

[1] https://github.com/redhat-performance/tuned/blob/041333f9c5daf37f96340f418f4168347551f52a/tuned/plugins/plugin_scheduler.py#L572

I've been looking into this, and it's unclear to me whether we need to set IRQBALANCE_BANNED_CPUS at all. At least on RHEL-8, irqbalance parses /sys/devices/system/cpu/isolated and /sys/devices/system/cpu/nohz_full and uses that as the default value for the list of banned CPUs. So the list of banned CPUs seems to be automatically populated just like we want it. I've tested it (with modified cpu-partitioning and realtime* profiles; on regular, non-RT kernel) and it seems to work.

The only case this would not work would be if someone explicitly set IRQBALANCE_BANNED_CPUS in /etc/sysconfig/irqbalance, as that would override the autodetection. Question is, do we care? If the admin explicitly set the variable, they likely have a good reason for it and Tuned shouldn't touch it.

Am I missing something? Can someone provide more context? Andrew, Luiz?

Thanks.

I'll go check how it behaves on RHEL-7.

So I've done some testing with different RHEL releases and all related Tuned profiles (cpu-partitioning, realtime*).
I've modified the profiles so that they don't touch the irqbalance sysconfig file. irqbalance (almost) always automatically populates the list of banned CPUs with correct values. Here's a summary:

- RHEL-8.3, RHEL-8.3 with RT kernel, RHEL-7.9, RHEL-7.4, RHEL-7.2: list of banned CPUs is correctly set
- RHEL-7.3: there seems to be a bug - the list of banned CPUs contains one more CPU than is necessary

For the record, the way I got the effective list of banned CPUs on RHEL-8 is this:

    python3 -c "import socket; s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM); s.connect(b'\\0irqbalance$(pidof irqbalance).sock'); s.send(b'setup'); print(s.recv(1024)); s.close()"

On RHEL-7, the following was necessary:

    gdb -p $(pidof irqbalance) -ex 'print banned_cpus' -ex detach -ex quit

It looks to me like we don't really need an irqbalance plugin, because irqbalance sets the list of banned CPUs automatically. I think we should simply drop the irqbalance config handling from Tuned profiles entirely. (Please see my previous comments). What do you guys think?

(In reply to Ondřej Lysoněk from comment #11)
> I've been looking into this, and it's unclear to me whether we need to set
> IRQBALANCE_BANNED_CPUS at all. At least on RHEL-8, irqbalance parses
> /sys/devices/system/cpu/isolated and /sys/devices/system/cpu/nohz_full and
> uses that as the default value for the list of banned CPUs. So the list of
> banned CPUs seems to be automatically populated just like we want it. I've
> tested it (with modified cpu-partitioning and realtime* profiles; on
> regular, non-RT kernel) and it seems to work.
>
> The only case this would not work would be if someone explicitly set
> IRQBALANCE_BANNED_CPUS in /etc/sysconfig/irqbalance, as that would override
> the autodetection. Question is, do we care? If the admin explicitly set the
> variable, they likely have a good reason for it and Tuned shouldn't touch it.
>
> Am I missing something? Can someone provide more context? Andrew, Luiz?
Looking at the code of irqbalance 1.4.0/cputree.c/setup_banned_cpus(), I think the problem is the assumption the profiles (cpu-partitioning/realtime) will always use isolcpus and/or nohz_full. This may not be the case (especially for other profiles and possibly for profiles inheriting these and perhaps removing isolcpus/nohz_full), therefore the plugin is still needed AFAICS.

>
> Thanks.
>
> I'll go check how it behaves on RHEL-7.

(In reply to jmencak from comment #14)
> Looking at the code of irqbalance 1.4.0/cputree.c/setup_banned_cpus(),
> I think the problem is the assumption the profiles
> (cpu-partitioning/realtime)
> will always use isolcpus and/or nohz_full. This may not be the case (especially
> for other profiles and possibly for profiles inheriting these and perhaps
> removing isolcpus/nohz_full)

You say "especially". Can you elaborate on when it's not the case with our profiles?

Is there a use case for removing isolcpus/nohz_full while still tuning irqbalance? Let's talk real use cases, not hypothetical ones.

(In reply to Ondřej Lysoněk from comment #15)
> (In reply to jmencak from comment #14)
> > Looking at the code of irqbalance 1.4.0/cputree.c/setup_banned_cpus(),
> > I think the problem is the assumption the profiles
> > (cpu-partitioning/realtime)
> > will always use isolcpus and/or nohz_full. This may not be the case (especially
> > for other profiles and possibly for profiles inheriting these and perhaps
> > removing isolcpus/nohz_full)
>
> You say "especially". Can you elaborate on when it's not the case with our
> profiles?
>
> Is there a use case for removing isolcpus/nohz_full while still tuning
> irqbalance? Let's talk real use cases, not hypothetical ones.

Perhaps I misunderstood what you were getting at. Are you saying that someone might already be doing what you described? And that we shouldn't remove irqbalance handling from our profiles because it might break someone's setup?
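
For readers following the autodetection argument above, here is a minimal Python sketch of the idea, not irqbalance's actual code. It assumes the two sysfs files named in the thread hold plain kernel CPU lists such as "2-3,6", and that the default banned set is the union of both; the function names are hypothetical and the union is an assumption made for illustration.

```python
def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "2-3,6" into a set of ints.

    Assumption: only plain numbers and "first-last" ranges appear.
    """
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue  # an empty file means no CPUs
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_sysfs(path):
    """Return the file contents, or an empty string if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


def default_banned_cpus(isolated_text, nohz_full_text):
    """Hypothetical helper: union of the isolated and nohz_full CPU lists."""
    return parse_cpulist(isolated_text) | parse_cpulist(nohz_full_text)


if __name__ == "__main__":
    banned = default_banned_cpus(
        read_sysfs("/sys/devices/system/cpu/isolated"),
        read_sysfs("/sys/devices/system/cpu/nohz_full"),
    )
    print(sorted(banned))
```

If a profile sets neither isolcpus nor nohz_full, both inputs are empty and the sketch returns an empty set, which is the gap the comment above points at.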
(In reply to Ondřej Lysoněk from comment #15)
> (In reply to jmencak from comment #14)
> > Looking at the code of irqbalance 1.4.0/cputree.c/setup_banned_cpus(),
> > I think the problem is the assumption the profiles
> > (cpu-partitioning/realtime)
> > will always use isolcpus and/or nohz_full. This may not be the case (especially
> > for other profiles and possibly for profiles inheriting these and perhaps
> > removing isolcpus/nohz_full)
>
> You say "especially". Can you elaborate on when it's not the case with our
> profiles?
>
> Is there a use case for removing isolcpus/nohz_full while still tuning
> irqbalance? Let's talk real use cases, not hypothetical ones.

We already have the situation where we can't use nohz_full because of problems with CFS-quota, but we still need to control where interrupts go. This is for Openshift.

(In reply to Andrew Theurer from comment #17)
> (In reply to Ondřej Lysoněk from comment #15)
> > (In reply to jmencak from comment #14)
> > > Looking at the code of irqbalance 1.4.0/cputree.c/setup_banned_cpus(),
> > > I think the problem is the assumption the profiles
> > > (cpu-partitioning/realtime)
> > > will always use isolcpus and/or nohz_full. This may not be the case (especially
> > > for other profiles and possibly for profiles inheriting these and perhaps
> > > removing isolcpus/nohz_full)
> >
> > You say "especially". Can you elaborate on when it's not the case with our
> > profiles?
> >
> > Is there a use case for removing isolcpus/nohz_full while still tuning
> > irqbalance? Let's talk real use cases, not hypothetical ones.
>
> We already have the situation where we can't use nohz_full because of
> problems with CFS-quota, but we still need to control where interrupts go.
> This is for Openshift.

Hi Andrew, thank you very much for the information! So to expand on that, I suppose an example of a use case would be using the cpu-partitioning profile with the nohz_full setting removed.
Given that cpu-partitioning doesn't use isolcpus, neither /sys/devices/system/cpu/isolated nor /sys/devices/system/cpu/nohz_full will be set in such a case, so irqbalance has no way of knowing what the banned CPUs should be.

Yes, that is what we are doing on OCP, but we are not quite to the point where we actually use cpu-partitioning today (tuned is not used yet, but NTO/tuned will be for this work very soon), but that is the plan for OCP 4.6: using a cpu-part profile, but with nohz_full removed, but still need the irqs moved away.

Upstream PR: https://github.com/redhat-performance/tuned/pull/274

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (tuned bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4559
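
For the use case described above (CPUs to keep interrupt-free without isolcpus or nohz_full), the thread's original request amounts to writing IRQBALANCE_BANNED_CPUS into /etc/sysconfig/irqbalance. The following Python sketch illustrates that step only and is not the code from the upstream PR. The helper names are hypothetical, and it assumes the variable accepts the comma-separated 32-bit hexadecimal groups used by /proc/irq/N/smp_affinity.

```python
import re

VAR = "IRQBALANCE_BANNED_CPUS"


def cpus_to_hex_mask(cpus):
    """Turn CPU numbers into a hex bitmask, 32 bits per comma-separated group.

    Assumption: most significant group first, as in /proc/irq/N/smp_affinity.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    groups = []
    while True:
        groups.append("%08x" % (mask & 0xFFFFFFFF))
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))


def set_banned_cpus(config_text, cpus):
    """Return config_text with the banned-CPUs assignment replaced or appended.

    Hypothetical helper: works on the file contents as a string and leaves
    writing the file and restarting the service to the caller.
    """
    line = "%s=%s" % (VAR, cpus_to_hex_mask(cpus))
    pattern = re.compile(r"^[ \t]*%s=.*$" % VAR, re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(line, config_text)
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + line + "\n"


if __name__ == "__main__":
    print(set_banned_cpus("# irqbalance settings\n", {2, 3}), end="")
```

After writing the updated text back, the service would still need to pick it up; the thread suggests 'systemctl try-restart' rather than a hard restart for that.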