Bug 2016540
| Summary: | RHEL9 traceback in fv_cpu_pinning test on some aarch64 systems | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | John Kacur <jkacur> | |
| Component: | tuna | Assignee: | John Kacur <jkacur> | |
| Status: | CLOSED ERRATA | QA Contact: | Qiao Zhao <qzhao> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 9.0 | CC: | bhu, mstowell, qzhao, rt-maint | |
| Target Milestone: | rc | Keywords: | Triaged | |
| Target Release: | --- | Flags: | pm-rhel: mirror+ | |
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | tuna-0.16-3.el9 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2018285 | Environment: | ||
| Last Closed: | 2022-05-17 15:55:01 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2018285, 2020013 | |||
I did some more investigation, and added a line to print the pid in tuna
./tuna-cmd.py -c 31 -i
pid = 991
Traceback (most recent call last):
File "/root/src/tuna/./tuna-cmd.py", line 762, in <module>
main()
File "/root/src/tuna/./tuna-cmd.py", line 601, in main
tuna.isolate_cpus(cpu_list, get_nr_cpus())
File "/root/src/tuna/tuna/tuna.py", line 371, in isolate_cpus
raise err
File "/root/src/tuna/tuna/tuna.py", line 363, in isolate_cpus
os.sched_setaffinity(pid, affinity)
OSError: [Errno 16] Device or resource busy
and I looked at pid 991
cat /proc/991/stat
991 (cppc_fie) S 2 0 0 0 -1 2129984 0 0 0 0 1635 43557 0 0 -101 0 1 0 1595 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 1 0 0 17 22 0 6 0 0 0 0 0 0 0 0 0 0 0
The flags field is 2129984
python
Python 3.9.7 (default, Sep 9 2021, 00:00:00)
[GCC 11.2.1 20210728 (Red Hat 11.2.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 2129984 & 0x04000000 and True or False
False
The PF_NO_SETAFFINITY bit (0x04000000) is not set, which means setaffinity is allowed
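The manual check above can be sketched as a small script. This is only an illustration, not part of tuna: the helper name `parse_stat` is made up here, and the field positions (flags is field 9, policy is field 41) are taken from proc(5). It reads the same stat line quoted in this comment and also looks at the scheduling policy field, on the assumption that policy number 6 is SCHED_DEADLINE as defined in the kernel headers.

```python
PF_NO_SETAFFINITY = 0x04000000  # per-task flag from the kernel's include/linux/sched.h
SCHED_DEADLINE = 6              # policy number from the kernel's uapi sched.h


def parse_stat(stat_line):
    """Return (flags, policy) from one /proc/<pid>/stat line (hypothetical helper)."""
    # comm (field 2) may contain spaces or parentheses, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N sits at index N - 3.
    flags = int(rest[9 - 3])
    policy = int(rest[41 - 3])
    return flags, policy


# The stat line for pid 991 exactly as quoted in this bug.
STAT_991 = (
    "991 (cppc_fie) S 2 0 0 0 -1 2129984 0 0 0 0 1635 43557 0 0 -101 0 1 0 "
    "1595 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 1 0 0 17 22 0 "
    "6 0 0 0 0 0 0 0 0 0 0 0"
)

flags, policy = parse_stat(STAT_991)
print("PF_NO_SETAFFINITY set:", bool(flags & PF_NO_SETAFFINITY))
print("policy is SCHED_DEADLINE:", policy == SCHED_DEADLINE)
```

On a live system the same helper could be fed the contents of /proc/<pid>/stat. The flag test only rules out one reason for a failed setaffinity; the policy field is the other thing worth checking.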
ps ax | grep 991
991 ? S 7:37 [cppc_fie]
I wonder whether something about cppc_fie in the kernel prevents setting its affinity.

cppc_fie uses SCHED_DEADLINE. If admission control is on, you cannot restrict its CPUs to a smaller set. Try the following:
echo -1 > /proc/sys/kernel/sched_rt_runtime_us
If you are able to run tuna --cpus 31 --isolate successfully after that, then we know what the problem is. However, shutting off admission control is probably not a good workaround for you.

Note this probably manifests itself in rhel-9 because of the way the environment is set up. However, it could potentially happen in rhel-8.x too, so any changes should be backported there as well.

The fix I added prints a warning if setaffinity triggers an EBUSY error and continues. This can occur if a pid is attached to a device using SCHED_DEADLINE and admission control is on. The user can then do one of the following:
1. Simply ignore the one pid when isolating the CPU, or, if the user is worried it might be impacting performance (such as realtime latency),
2. Reboot using isolcpus to isolate the CPU, or
3. Turn off admission control, rerun tuna isolate, and then turn admission control back on.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (new packages: tuna), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:3955
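
The warn-and-continue behaviour described in the fix comment above might look roughly like the following. This is a minimal sketch and not the actual tuna patch: the function name `set_affinity_or_warn`, its return value, and the injectable `setaffinity` parameter are all assumptions made for illustration. Only `os.sched_setaffinity` and `errno.EBUSY` are real.

```python
import errno
import os
import sys


def set_affinity_or_warn(pid, affinity, setaffinity=None):
    """Set the affinity of pid; on EBUSY warn and return False instead of failing.

    Hypothetical helper, not tuna's API. Any other OSError is re-raised.
    """
    if setaffinity is None:
        setaffinity = os.sched_setaffinity  # real call, Linux only
    try:
        setaffinity(pid, affinity)
    except OSError as err:
        if err.errno != errno.EBUSY:
            raise
        # EBUSY here is what a SCHED_DEADLINE task produces with admission control on.
        print(f"Warning: could not change affinity of pid {pid} (EBUSY), skipping",
              file=sys.stderr)
        return False
    return True
```

A caller that walks every pid while isolating a CPU could call this once per pid and carry on when it returns False, so that a single busy task such as cppc_fie no longer aborts the whole operation.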

Description of problem:
Traceback in tuna on some aarch64 systems when trying to isolate CPUs on RHEL 9.
[root@ampere-hr330a-09 ~]# tuna --cpus 31 --isolate
Traceback (most recent call last):
File "/usr/bin/tuna", line 763, in <module>
main()
File "/usr/bin/tuna", line 601, in main
tuna.isolate_cpus(cpu_list, get_nr_cpus())
File "/usr/lib/python3.9/site-packages/tuna/tuna.py", line 370, in isolate_cpus
raise err
File "/usr/lib/python3.9/site-packages/tuna/tuna.py", line 363, in isolate_cpus
os.sched_setaffinity(pid, affinity)
OSError: [Errno 16] Device or resource busy

How reproducible:
Not on every machine, but on machines where it happens, every time.