Bug 1474961

Summary: Executing taskset error: taskset: failed to set pid 30's affinity: Invalid argument
Product: Red Hat Enterprise Linux 7 Reporter: Luiz Capitulino <lcapitulino>
Component: tuned Assignee: Ondřej Lysoněk <olysonek>
Status: CLOSED ERRATA QA Contact: Tereza Cerna <tcerna>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.5 CC: jeder, jskarvad, olysonek, tcerna
Target Milestone: rc Keywords: Patch, Upstream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: tuned-2.9.0-0.1.rc1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 16:04:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1240765, 1467576, 1485946    

Description Luiz Capitulino 2017-07-25 17:14:47 UTC
Description of problem:

Sometimes, when activating the realtime-virtual-host profile, I see several instances of the following error message in /var/log/tuned/tuned.log:

2017-07-19 09:17:10,343 ERROR    tuned.utils.commands: Executing taskset error: taskset: failed to set pid 30's affinity: Invalid argument

I think this error may come from tuna when doing mass task migration between cores. There are two cases where this error is expected:

1. The task vanished before the taskset command was able to change its cpumask
2. The task doesn't allow for cpumask change

I think tuned or tuna should check for the two conditions above and ignore the error if they happen.
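
The two expected cases above can be told apart by errno. The following is a minimal sketch, not tuned's or tuna's actual code: `try_set_affinity` is a hypothetical helper, and it assumes the affinity is set through Python's `os.sched_setaffinity` (Linux only) rather than by shelling out to taskset. ESRCH means the task vanished, and EINVAL matches the "Invalid argument" seen in the log.

```python
import errno
import os


def try_set_affinity(pid, cpus):
    """Hypothetical helper: return 'ok', 'vanished', or 'unchangeable'.

    Any other failure (for example EPERM) is re-raised to the caller.
    """
    try:
        os.sched_setaffinity(pid, cpus)
        return "ok"
    except ProcessLookupError:
        # ESRCH: the task exited before its cpumask could be changed (case 1).
        return "vanished"
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # EINVAL: the kernel rejected the new mask, as it does for
            # tasks whose affinity is fixed (case 2).
            return "unchangeable"
        raise
```

A caller could then ignore "vanished", decide how loudly to report "unchangeable", and treat everything else as a real error.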


Version-Release number of selected component (if applicable): tuned-2.8.0-5.el7.noarch

Comment 2 Ondřej Lysoněk 2017-08-22 07:44:40 UTC
I agree that in the first case, when the task vanishes, the error should not be logged (maybe we could log an informative message with debug level though).

But in the second case, I think the error should still be logged, because Tuned in fact failed to do something you told it to. You can always choose which tasks the tuning should be applied to using a regex in the profile file.

Comment 3 Luiz Capitulino 2017-08-22 13:13:03 UTC
If the new task isolation code in tuned can do that, then I agree that's the best solution. However, I guess the current code can't do that; it just tries to move as many tasks as possible off a CPU on a best-effort basis. This works fine in practice. So, both ways would do it for us.

Comment 4 Ondřej Lysoněk 2017-08-22 14:26:20 UTC
You are right, it looks like Tuned doesn't report errors when the affinity can't be changed:
https://github.com/redhat-performance/tuned/blob/master/tuned/plugins/plugin_scheduler.py#L332-L333

I didn't know that. But I think it should (maybe the log level could be WARN).

Comment 5 Luiz Capitulino 2017-08-22 20:37:45 UTC
You could change the log level to WARN and/or add special checks for the few threads that are per-cpu (migration, ksoftirqd, timersoftirqd, posixcputmr, rcuc).

The new CPU isolation engine shouldn't have this problem I guess.

Comment 6 Ondřej Lysoněk 2017-08-28 12:53:43 UTC
I created https://github.com/redhat-performance/tuned/pull/63. However, I think there's some room for improvement.

With commit 66d4843 you get tons of warnings in the logs when the isolated_cores parameter is set, such as the following:
tuned.plugins.plugin_scheduler: Affinity of PID 9534 is not changeable.
tuned.plugins.plugin_scheduler: Affinity of PID 522 is not changeable.

It seems all the PIDs it's complaining about are kernel threads. Maybe we could change the behaviour so that:
* Tuned doesn't complain about not being able to change affinity of kernel threads, and/or
* Only one message is logged after the migration is completed saying that some tasks could not be migrated
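
The second option above (one message after the migration) could look roughly like this. It is only a sketch under assumptions: `migrate_tasks`, the logger name, and the injected `set_affinity` callable are all hypothetical and do not come from the tuned source.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
log = logging.getLogger("affinity_sketch")


def migrate_tasks(pids, cpus, set_affinity):
    """Try to move every PID in the list to `cpus`.

    Per-task failures are logged at DEBUG; a single WARNING summarises
    them once the whole migration pass has finished.
    """
    failed = []
    for pid in pids:
        try:
            set_affinity(pid, cpus)
        except OSError as exc:
            log.debug("Affinity of PID %d is not changeable: %s", pid, exc)
            failed.append(pid)
    if failed:
        log.warning("Affinity of %d out of %d tasks could not be changed",
                    len(failed), len(pids))
    return failed
```

Passing the setter in as a parameter keeps the sketch testable without touching real processes; in practice it would be something like `os.sched_setaffinity`.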

Comment 7 Ondřej Lysoněk 2017-08-28 12:57:22 UTC
(In reply to Luiz Capitulino from comment #5)
> You could change the log level to WARN and/or add a special checks for the
> few threads that are per-cpu (migration, ksoftirqd, timersoftirqd,
> posixcputmr, rcuc.

It seems that these are not the only ones that have fixed affinity (e.g. kworker's affinity can't be changed).

Comment 8 Ondřej Lysoněk 2017-08-30 12:37:54 UTC
Fixed upstream:
https://github.com/redhat-performance/tuned/commit/f9f5b073a599cc22f4b3f27ce134c97e45e3a0e4

(In reply to Ondřej Lysoněk from comment #6)
> With commit 66d4843 you get tons of warnings in the logs when the
> isolated_cores parameter is set, such as the following:
> tuned.plugins.plugin_scheduler: Affinity of PID 9534 is not changeable.
> tuned.plugins.plugin_scheduler: Affinity of PID 522 is not changeable.

We decided to solve the problem by logging failures to set the affinity of kernel threads and zombie processes (empty /proc/[pid]/cmdline) as DEBUG. The rationale is that kernel threads often seem to be pinned to specific CPUs and there's nothing to be done about it, so complaining about it in the logs is annoying and pointless. For zombie processes, if we fail to set their affinity, we don't really care either, as those processes won't run on the CPU ever again.
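
As an illustration of the rule described above, the log level can be chosen from /proc/[pid]/cmdline. This is a sketch, not the code from the upstream commit: `has_empty_cmdline`, `failure_log_level`, and the `proc_root` parameter are hypothetical names, and treating an unreadable cmdline (task already gone) like an empty one is an assumption made here.

```python
import logging
import os


def has_empty_cmdline(pid, proc_root="/proc"):
    """Return True if /proc/<pid>/cmdline is empty (kernel thread or zombie)."""
    path = os.path.join(proc_root, str(pid), "cmdline")
    try:
        with open(path, "rb") as f:
            return f.read() == b""
    except OSError:
        # Assumption of this sketch: a task that vanished while we were
        # looking is treated like a zombie, i.e. not worth a warning.
        return True


def failure_log_level(pid, proc_root="/proc"):
    """Pick the level for an 'affinity not changeable' message."""
    if has_empty_cmdline(pid, proc_root):
        return logging.DEBUG
    return logging.WARNING
```

A caller would then log with something like `log.log(failure_log_level(pid), "Affinity of PID %d is not changeable.", pid)`, so ordinary user processes still produce a visible warning.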

Comment 9 Luiz Capitulino 2017-08-30 13:14:03 UTC
FWIW, this looks good to me.

Comment 10 Fedora Update System 2017-10-13 14:21:15 UTC
tuned-2.9.0-0.1.rc1.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-d9c6b990df

Comment 12 Fedora Update System 2017-10-13 22:25:31 UTC
tuned-2.9.0-0.1.rc1.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-d9c6b990df

Comment 13 Fedora Update System 2017-10-13 23:25:22 UTC
tuned-2.9.0-0.1.rc1.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-5f0849d207

Comment 14 Fedora Update System 2017-10-29 21:07:28 UTC
tuned-2.9.0-1.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-0e45ce4685

Comment 15 Fedora Update System 2017-10-29 21:13:34 UTC
tuned-2.9.0-1.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-c30e9bd1ea

Comment 20 errata-xmlrpc 2018-04-10 16:04:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0879