Bug 1472840

Summary: File "/usr/lib/python2.7/site-packages/tuna/tuna.py", line 366, in isolate_cpus
Product: Red Hat Enterprise Linux 7 Reporter: Luiz Capitulino <lcapitulino>
Component: tuna Assignee: John Kacur <jkacur>
Status: CLOSED ERRATA QA Contact: Jiri Kastner <jkastner>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.5 CC: acme, bhu, jkacur, jskarvad, pezhang, tcerna
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 18:13:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1240765, 1442258    
Attachments:
Description Flags
Use errno codes instead of raw numbers none
exit isolate_cpus with an error msg instead of a traceback none

Description Luiz Capitulino 2017-07-19 13:53:27 UTC
Description of problem:

We call tuna from tuned's realtime-virtual-host profile like this:

tuna -c "$TUNED_isolated_cores" -i

From time to time when booting the system, I'm getting the following error (from tuned logs):

2017-07-19 09:16:56,898 ERROR    tuned.plugins.plugin_script: script '/usr/lib/tuned/realtime/script.sh' error: 1, 'Traceback (most recent call last):
  File "/usr/bin/tuna", line 710, in <module>
    main()
  File "/usr/bin/tuna", line 546, in main
    tuna.isolate_cpus(cpu_list, get_nr_cpus())
  File "/usr/lib/python2.7/site-packages/tuna/tuna.py", line 366, in isolate_cpus
    raise e
OSError: [Errno 22] Invalid argument'

When this happens the profile is only half applied, which leaves KVM-RT nonfunctional.

NOTE: This is the first time I have seen this, so either this is a recent issue or this system makes it more likely to happen.

Version-Release number of selected component (if applicable): tuna-0.13-5.el7.noarch, tuned-2.8.0-5.el7.noarch


How reproducible:


Steps to Reproduce:
1. Setup the realtime-virtual-host profile
2. Boot the system a few times

Comment 2 Luiz Capitulino 2017-07-19 14:00:26 UTC
Jaroslav,

While debugging this, I realized that it's the realtime profile that's calling tuna. However, the realtime-virtual-host profile also calls tuna with the same command-line. While I guess that calling tuna twice is not the cause of this issue, should we remove tuna invocation from the realtime-virtual-{host,guest} profiles?

Comment 3 John Kacur 2017-07-19 14:32:56 UTC
(In reply to Luiz Capitulino from comment #2)
> Jaroslav,
> 
> While debugging this, I realized that it's the realtime profile that's
> calling tuna. However, the realtime-virtual-host profile also calls tuna
> with the same command-line. While I guess that calling tuna twice is not the
> cause of this issue, should we remove tuna invocation from the
> realtime-virtual-{host,guest} profiles?

You mentioned that when this occurs the profile is only partially applied. This is occurring in the guest I assume, not the host? If we can't detect when the profile is fully applied, then I would think removing the call from the guest would solve the problem?

Comment 4 Luiz Capitulino 2017-07-19 14:50:38 UTC
Both calls to tuna happen in the host; the guest is not involved. I guess the profile is half applied because tuna returned an error and tuned may have stopped partway through applying the profile. I think the best action for tuned would be to revert what was applied and fail, but that's an entirely different issue/BZ.

For this BZ we probably have to fix tuna.

PS: I'll add the needinfo back because I also need input from Jaroslav regarding comment 2.
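
To illustrate the "revert what was applied and fail" behaviour suggested above, here is a minimal Python sketch. It is not tuned's code: the function name apply_all_or_revert and the (apply, revert) pair layout are hypothetical, chosen only for this example.

```python
def apply_all_or_revert(steps):
    """Run each (apply, revert) pair in order.

    Hypothetical helper, not part of tuned. If any apply step raises,
    the revert callables of the steps that already succeeded are run
    in reverse order and the original exception is re-raised.
    """
    reverts = []
    try:
        for apply_step, revert_step in steps:
            apply_step()
            reverts.append(revert_step)
    except Exception:
        for revert_step in reversed(reverts):
            revert_step()
        raise
```

With this shape, a failing step (such as the tuna call in this report) would leave the system in its previous state rather than with a half-applied profile, and the caller would still see the failure.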

Comment 5 Jaroslav Škarvada 2017-07-19 15:59:04 UTC
(In reply to Luiz Capitulino from comment #2)
> Jaroslav,
> 
> While debugging this, I realized that it's the realtime profile that's
> calling tuna. However, the realtime-virtual-host profile also calls tuna
> with the same command-line. While I guess that calling tuna twice is not the
> cause of this issue, should we remove tuna invocation from the
> realtime-virtual-{host,guest} profiles?

Yes, we can remove it. But it seems there is more duplication. E.g. as a next step we could probably also drop defirqaffinity.py, because it seems its functionality is already implemented in Tuna. And we could also move to the plugin scheduler, which has all the functionality built in and can also skip processes / associated threads specified by regex. We switched cpu-partitioning to the plugin scheduler upstream. Other profiles could also be switched.
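
The idea of skipping processes or threads by regex can be sketched in Python as follows. This is only an illustration of the filtering step, not the scheduler plugin's implementation or configuration syntax. The function name threads_to_move and the use of re.search (rather than a full match) are assumptions made for the example.

```python
import re


def threads_to_move(thread_names, skip_patterns):
    """Return the names that match none of the skip patterns.

    Hypothetical helper: a name is skipped if any pattern matches
    anywhere in it (re.search semantics, an assumption of this sketch).
    """
    compiled = [re.compile(pattern) for pattern in skip_patterns]
    return [
        name
        for name in thread_names
        if not any(rx.search(name) for rx in compiled)
    ]
```

For example, passing the patterns r"^ksoftirqd/" and r"^rcuc/" would leave per-CPU kernel threads with those name prefixes untouched while other threads remain candidates for moving.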

Comment 6 Jaroslav Škarvada 2017-07-19 16:36:48 UTC
(In reply to Jaroslav Škarvada from comment #5)
> (In reply to Luiz Capitulino from comment #2)
> > Jaroslav,
> > 
> > While debugging this, I realized that it's the realtime profile that's
> > calling tuna. However, the realtime-virtual-host profile also calls tuna
> > with the same command-line. While I guess that calling tuna twice is not the
> > cause of this issue, should we remove tuna invocation from the
> > realtime-virtual-{host,guest} profiles?
> 
> Yes, we can remove it. But it seems there is more duplication. E.g. as a
> next step we could probably also drop defirqaffinity.py, because it seems
> its functionality is already implemented in Tuna. And we could also move to
> the plugin scheduler, which has all the functionality built in and can also
> skip processes / associated threads specified by regex. We switched
> cpu-partitioning to the plugin scheduler upstream. Other profiles could also
> be switched.

Upstream commit removing the second invocation of Tuna:
https://github.com/redhat-performance/tuned/commit/962bae3d719d81a7cf52843270d3eb4bd762efce

Comment 7 Luiz Capitulino 2017-07-20 13:09:35 UTC
Thanks Jaroslav!

Comment 8 John Kacur 2017-09-13 10:45:57 UTC
*** Bug 1455478 has been marked as a duplicate of this bug. ***

Comment 10 John Kacur 2017-09-13 10:49:08 UTC
Created attachment 1325309 [details]
Use errno codes instead of raw numbers

Comment 11 John Kacur 2017-09-13 10:50:16 UTC
Created attachment 1325310 [details]
exit isolate_cpus with an error msg instead of a traceback
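
The attached patches are not reproduced in this report. Going only by their titles, the two changes could look roughly like the Python sketch below: compare against names from the errno module instead of raw numbers, and turn EINVAL into an error message and a non-zero exit instead of a traceback. Everything here is an assumption for illustration: the helper name set_affinity_or_report, the exit code 2, and the use of os.sched_setaffinity (Python 3.3+ on Linux) as a stand-in for whatever affinity call tuna makes on Python 2.7.

```python
import errno
import os
import sys


def set_affinity_or_report(pid, cpus, setter=None):
    """Set the CPU affinity of pid, handling the expected failures.

    Hypothetical helper, not tuna's code. Returns True on success and
    False if the process no longer exists. On EINVAL it prints a short
    message to stderr and exits with status 2 (an arbitrary choice).
    """
    if setter is None:
        # Stand-in for the affinity call; assumption of this sketch.
        setter = os.sched_setaffinity
    try:
        setter(pid, cpus)
    except OSError as e:
        if e.errno == errno.ESRCH:
            # The process exited between listing and setting; skip it.
            return False
        if e.errno == errno.EINVAL:
            sys.stderr.write("isolate_cpus: pid %d: %s\n" % (pid, e.strerror))
            sys.exit(2)
        raise
    return True
```

The setter parameter exists only so the error paths can be exercised without touching real processes.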

Comment 19 errata-xmlrpc 2018-04-10 18:13:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0967