Bug 1461509 - realtime-virtual-host: error doesn't prevent profile from getting applied [NEEDINFO]
Status: ON_QA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: tuned
Version: 7.4
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Luiz Capitulino
QA Contact: qe-baseos-daemons
Keywords: Patch, Upstream
Depends On:
Blocks: kvm-rt-tuned
Reported: 2017-06-14 11:56 EDT by Luiz Capitulino
Modified: 2018-06-21 21:28 EDT (History)

See Also:
Fixed In Version: tuned-2.10.0-0.1.rc1.el7
Type: Bug
lcapitulino: needinfo? (jskarvad)


Attachments
patch1 (589 bytes, patch)
2018-02-22 13:14 EST, Luiz Capitulino
patch2 (1.24 KB, patch)
2018-02-22 13:14 EST, Luiz Capitulino
patch3 (3.52 KB, patch)
2018-02-22 13:15 EST, Luiz Capitulino
patch4 (3.61 KB, patch)
2018-02-22 13:15 EST, Luiz Capitulino

Description Luiz Capitulino 2017-06-14 11:56:43 EDT
Description of problem:

The realtime-virtual-host profile seems to be missing some error checks. This causes the profile to get applied even though the configuration has failed.

Version-Release number of selected component (if applicable): tuned-2.8.0-5.el7.noarch


How reproducible:


Steps to Reproduce:
1. Make sure the realtime-virtual-host profile is not applied and clear its cache files

# tuned-adm profile desktop (or any other profile)
# rm -f /usr/lib/tuned/realtime-virtual-host/{lapic_timer_adv_ns,lapic_timer_adv_ns.cpumodel}

2. Install the kernel-rt package without installing kernel-rt-kvm OR temporarily unload kvm modules and rename them

3. Activate the realtime-virtual-host profile and check that it reports success

# tuned-adm profile realtime-virtual-host
# echo $?
0

4. The profile hasn't been correctly applied, because a VM is run during the activation process and that cannot work without the kvm modules. This can be confirmed by checking that the lapic_timer files from step 1 weren't created

NOTE: Maybe things will get automatically fixed if the modules are loaded and tuned restarted or the machine reboots. But this is still something to be fixed.
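
The check in step 4 can be scripted. The following is only a minimal sketch and is not part of tuned: the function name lapic_cache_present is hypothetical, and the default directory and file names are the ones removed in step 1.

```shell
#!/bin/bash
# Hypothetical helper (not part of tuned): succeed only if both cache files
# from step 1 exist in the given profile directory.
lapic_cache_present() {
    local dir="${1:-/usr/lib/tuned/realtime-virtual-host}"
    local f
    for f in lapic_timer_adv_ns lapic_timer_adv_ns.cpumodel; do
        if [ ! -f "$dir/$f" ]; then
            # Report which file is missing so the failure is not silent.
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    return 0
}
```

Calling lapic_cache_present right after step 3 returns non-zero when the files were not created, even though tuned-adm itself returned 0.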
Comment 2 Luiz Capitulino 2017-06-26 13:47:22 EDT
Note to self: it's probably a good idea to check all profiles for this problem.
Comment 3 Luiz Capitulino 2017-10-13 13:06:56 EDT
I'm not sure I'll be able to fix this for 7.5, since this missed the tuned release boat. So, I think it's important that we're aware of this issue's impact:

1. This BZ is really about the realtime-virtual-host profile missing almost all error checking. Actually, this is common among all tuned profiles, but this profile has code that can fail in a reproducible way (see comment 4)

2. I also found that the realtime-virtual-host profile has broken bash code, which fails at every single run. As it fails silently, we never knew it. Luckily (or not), it's just a sanity check that's being skipped

3. The very worst case scenario for this bug is a fundamental step failing (such as running tuna to isolate a CPU) without us noticing, since it fails silently. But this has never happened in practice

4. What seems to be a reproducible way to trigger this bug (which is what I got originally) is:

A. The kernel-rt package is installed but kernel-rt-kvm is not
B. The realtime-virtual-host profile is activated
C. Finding the best latency for the advance timer feature will silently fail, since the kvm module is not loaded
D. The profile is applied
E. kernel-rt-kvm is installed
F. A realtime guest is started
G. The realtime guest will get spikes, since /sys/module/kvm/parameters/lapic_timer_advance_ns=0

However, rebooting or reloading tuned after step E should fix things up.
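
Whether a host is in this state can be checked by reading the module parameter. This is a sketch under assumptions: lapic_advance_set is a hypothetical name, the default path is the one quoted above, and treating a value of 0 as "not configured" follows the scenario described in this comment.

```shell
#!/bin/bash
# Hypothetical helper: report whether a non-zero advance value is configured.
# An alternative path can be passed as the first argument (useful for testing).
lapic_advance_set() {
    local param="${1:-/sys/module/kvm/parameters/lapic_timer_advance_ns}"
    local value
    if [ ! -r "$param" ]; then
        # Typically means the kvm module is not loaded.
        echo "parameter not readable: $param" >&2
        return 2
    fi
    value=$(cat "$param")
    if [ "$value" = "0" ]; then
        echo "lapic_timer_advance_ns is 0; realtime guests may see spikes" >&2
        return 1
    fi
    echo "lapic_timer_advance_ns=$value"
}
```

A return code of 1 after step E suggests that tuned should be reloaded (or the machine rebooted) before starting a realtime guest.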
Comment 4 Luiz Capitulino 2017-10-18 14:41:49 EDT
This issue doesn't reproduce very easily and it has a workaround. Let's move to 7.6, as this missed the tuned release.
Comment 5 Luiz Capitulino 2018-02-22 13:14 EST
Created attachment 1399499
patch1
Comment 6 Luiz Capitulino 2018-02-22 13:14 EST
Created attachment 1399500
patch2
Comment 7 Luiz Capitulino 2018-02-22 13:15 EST
Created attachment 1399501
patch3
Comment 8 Luiz Capitulino 2018-02-22 13:15 EST
Created attachment 1399502
patch4
Comment 9 Luiz Capitulino 2018-02-22 13:18:35 EST
Posted series to maintainer and here. Note that this series is only half the battle: it detects errors and logs them, but we also need bug 1385838 so that tuned notifies systemd that an error happened.
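
For readers without access to the attachments, the general "detect and log" pattern for a profile script looks roughly like the sketch below. It is illustrative only and is not the code from the posted patches; run_step and start_sketch are hypothetical names.

```shell
#!/bin/bash
# Illustrative only: not the code from the attached patches.
# run_step runs one command, logs a message to stderr if it fails, and
# passes the command's exit status back to the caller.
run_step() {
    local desc="$1"
    shift
    local rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "$desc failed (exit code $rc)" >&2
    fi
    return "$rc"
}

# Hypothetical start hook taking two commands: stop at the first failing
# step instead of silently carrying on with the rest.
start_sketch() {
    run_step "first step" "$1" || return 1
    run_step "second step" "$2" || return 1
    return 0
}
```

With this shape, a failing step makes the whole script exit non-zero, which the caller can then log or act on.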
Comment 10 Jaroslav Škarvada 2018-02-22 15:07:36 EST
(In reply to Luiz Capitulino from comment #9)
> Posted series to maintainer and here. Note that this series is only half the
> battle: it detects the error and logs them, but we also need bug 1385838 so
> that tuned notifies systemd an error happened.

Thanks. Upstream commits:
https://github.com/redhat-performance/tuned/commit/c823e3c5f2a003717a4a0b73dde4c4003bbbe567
https://github.com/redhat-performance/tuned/commit/c989a8bd7cfa13d95e31875c564fd03630e54b6f
https://github.com/redhat-performance/tuned/commit/685e16640dc5c9c25037eb97cec41f9df308db46
https://github.com/redhat-performance/tuned/commit/c614ad03ff668c753ddf99772c3b47c80412533d
Comment 12 Luiz Capitulino 2018-06-13 13:27:57 EDT
So, an error message is now printed. However, tuned-adm returns zero and reports the realtime-virtual-host profile has been applied:

[root@virtlab500 realtime-virtual-host]# tuned-adm profile realtime-virtual-host
ERROR    tuned.utils.commands: Executing sysctl error: sysctl: cannot stat /proc/sys/kernel/numa_balancing: No such file or directory
ERROR    tuned.plugins.plugin_script: script '/usr/lib/tuned/realtime-virtual-host/script.sh' error output: 'Failed to set smp_affinity for IRQ 33: [Errno 5] Input/output error
Failed to set smp_affinity for IRQ 34: [Errno 5] Input/output error
Failed to set smp_affinity for IRQ 35: [Errno 5] Input/output error
Failed to set smp_affinity for IRQ 36: [Errno 5] Input/output error
defirqaffinity.py remove failed'
ERROR    tuned.plugins.plugin_script: script '/usr/lib/tuned/realtime-virtual-host/script.sh' returned error code: 1 <----------- New error message reporting the activation script has failed
[root@virtlab500 realtime-virtual-host]# echo $?
0
[root@virtlab500 realtime-virtual-host]# tuned-adm active
Current active profile: realtime-virtual-host
[root@virtlab500 realtime-virtual-host]# 

(Please ignore the IRQ error messages; those are bug 1590937.)

While printing the error message is helpful, it is not enough to solve this issue. IMO, in case of an error tuned should:

1. Go back to the previous profile
2. Return an error code
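
As a rough sketch of these two points (not tuned's implementation): switch_profile is a hypothetical function, and apply_cmd stands in for whatever actually applies a profile and is assumed to return non-zero on failure. As shown above, tuned-adm currently returns 0 even when the script fails, so it cannot be used as apply_cmd as-is.

```shell
#!/bin/bash
# Sketch of the proposed behaviour, not tuned's implementation.
# Arguments: command that applies a profile, previous profile, new profile.
switch_profile() {
    local apply_cmd="$1" previous="$2" new="$3"
    if "$apply_cmd" "$new"; then
        return 0
    fi
    echo "applying '$new' failed, going back to '$previous'" >&2
    # 1. Go back to the previous profile.
    "$apply_cmd" "$previous" || echo "rollback to '$previous' also failed" >&2
    # 2. Return an error code.
    return 1
}
```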

Now, I understand this might be an impactful change for 7.6 at this point. So, I'd be fine with moving this BZ to 7.7 or even RHEL8, as long as we agree this has to be done.

What do you think, Jaroslav?
