Bug 1461509 - realtime-virtual-host: error doesn't prevent profile from getting applied
Summary: realtime-virtual-host: error doesn't prevent profile from getting applied
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: tuned
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Luiz Capitulino
QA Contact: Tereza Cerna
URL:
Whiteboard:
Depends On:
Blocks: kvm-rt-tuned
Reported: 2017-06-14 15:56 UTC by Luiz Capitulino
Modified: 2018-10-30 10:50 UTC

Fixed In Version: tuned-2.10.0-0.1.rc1.el7
Doc Text:
Clone Of:
: 1626082 (view as bug list)
Environment:
Last Closed: 2018-10-30 10:48:57 UTC
Target Upstream Version:
Embargoed:


Attachments
patch1 (589 bytes, patch), 2018-02-22 18:14 UTC, Luiz Capitulino
patch2 (1.24 KB, patch), 2018-02-22 18:14 UTC, Luiz Capitulino
patch3 (3.52 KB, patch), 2018-02-22 18:15 UTC, Luiz Capitulino
patch4 (3.61 KB, patch), 2018-02-22 18:15 UTC, Luiz Capitulino


Links
Red Hat Product Errata RHBA-2018:3172 (last updated 2018-10-30 10:50:31 UTC)

Description Luiz Capitulino 2017-06-14 15:56:43 UTC
Description of problem:

The realtime-virtual-host profile seems to be missing some error checks. This causes the profile to get applied even though the configuration has failed.

Version-Release number of selected component (if applicable): tuned-2.8.0-5.el7.noarch


How reproducible:


Steps to Reproduce:
1. Make sure the realtime-virtual-host profile is not applied and clear its cache files

# tuned-adm profile desktop (or any other profile)
# rm -f /usr/lib/tuned/realtime-virtual-host/{lapic_timer_adv_ns,lapic_timer_adv_ns.cpumodel}

2. Install the kernel-rt package without installing kernel-rt-kvm OR temporarily unload kvm modules and rename them

3. Activate the realtime-virtual-host profile and check that it reports success

# tuned-adm profile realtime-virtual-host
# echo $?
0

4. The profile hasn't been correctly applied, because a VM is run during the activation process and that cannot work without the kvm modules. This can be confirmed by checking that the lapic_timer files from step 1 weren't created

NOTE: Maybe things will get automatically fixed if the modules are loaded and tuned restarted or the machine reboots. But this is still something to be fixed.
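
The check in step 4 can be scripted. The following is only a minimal sketch: the function name lapic_cache_present is hypothetical, and the default directory and file names are the ones from step 1. The directory is a parameter so the check can be tried against a scratch directory first.

```shell
# Hypothetical helper (not part of tuned): succeed only if both cache
# files from step 1 exist in the given profile directory.
lapic_cache_present() {
    dir=${1:-/usr/lib/tuned/realtime-virtual-host}
    for f in lapic_timer_adv_ns lapic_timer_adv_ns.cpumodel; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    return 0
}
```

Running it right after "tuned-adm profile realtime-virtual-host" gives a non-zero status when the files are missing, even though tuned-adm itself returned 0.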

Comment 2 Luiz Capitulino 2017-06-26 17:47:22 UTC
Note to self: it's probably a good idea to check all profiles for this problem.

Comment 3 Luiz Capitulino 2017-10-13 17:06:56 UTC
I'm not sure I'll be able to fix this for 7.5, since this missed the tuned release boat. So, I think it's important that we're aware of this issue's impact:

1. This BZ is really about the realtime-virtual-host profile missing almost all error checking. Actually, this is common among all tuned profiles, but this profile has code that can fail in a reproducible way (see comment 4)

2. I also found that the realtime-virtual-host profile has broken bash code, which fails at every single run. As it fails silently, we never knew it. Luckily (or not), it's just a sanity check that's being skipped

3. The very worst case scenario for this bug is a fundamental step failing (such as running tuna to isolate a CPU) and us not seeing it, since it fails silently. But this has never happened in practice

4. What seems to be a reproducible way to trigger this bug (which is what I hit originally) is:

A. The kernel-rt package is installed but kernel-rt-kvm is not
B. The realtime-virtual-host profile is activated
C. Finding the best latency for the advance timer feature will silently fail, since the kvm module is not loaded
D. The profile is applied
E. kernel-rt-kvm is installed
F. A realtime guest is started
G. The realtime guest will get latency spikes, since /sys/module/kvm/parameters/lapic_timer_advance_ns=0

However, rebooting or reloading tuned after step E should fix things up.
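
Whether a host is in the state described in the last step can be checked before starting a guest. This is a sketch under the assumption that a value of 0 is the symptom to look for, as in the scenario above. The function name lapic_advance_ok is hypothetical, and the path is a parameter only so the check can be exercised against a plain file.

```shell
# Hypothetical helper: inspect the kvm module parameter named above.
# Returns 2 if the parameter cannot be read (kvm module not loaded?),
# 1 if it is 0, and 0 otherwise.
lapic_advance_ok() {
    param=${1:-/sys/module/kvm/parameters/lapic_timer_advance_ns}
    if [ ! -r "$param" ]; then
        echo "cannot read $param (is the kvm module loaded?)" >&2
        return 2
    fi
    value=$(cat "$param")
    if [ "$value" = "0" ]; then
        echo "lapic_timer_advance_ns is 0, realtime guests may see spikes" >&2
        return 1
    fi
    echo "lapic_timer_advance_ns=$value"
}
```

A non-zero status suggests reloading tuned or rebooting, as noted above, before starting the realtime guest.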

Comment 4 Luiz Capitulino 2017-10-18 18:41:49 UTC
This issue doesn't reproduce very easily and it has a workaround. Let's move to 7.6, as this missed the tuned release.

Comment 5 Luiz Capitulino 2018-02-22 18:14:35 UTC
Created attachment 1399499: patch1

Comment 6 Luiz Capitulino 2018-02-22 18:14:59 UTC
Created attachment 1399500: patch2

Comment 7 Luiz Capitulino 2018-02-22 18:15:21 UTC
Created attachment 1399501: patch3

Comment 8 Luiz Capitulino 2018-02-22 18:15:48 UTC
Created attachment 1399502: patch4

Comment 9 Luiz Capitulino 2018-02-22 18:18:35 UTC
Posted the series to the maintainer and here. Note that this series is only half the battle: it detects errors and logs them, but we also need bug 1385838 so that tuned notifies systemd that an error happened.
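
For readers without access to the attachments, the general "detect the error and log it" pattern in a profile script looks roughly like the sketch below. This is not the code from the posted series: the helper names die and run_step and the message format are assumptions made for illustration.

```shell
# Hypothetical helpers, not taken from the attached patches.
# Print a message on stderr and stop the script with a non-zero status.
die() {
    echo "$0: $*" >&2
    exit 1
}

# Run one step of the script; abort through die if the step fails.
run_step() {
    "$@" || die "step failed: $*"
}
```

A script would wrap each fallible command, for example "run_step modprobe kvm", so that the first failure is reported and ends the script instead of being silently skipped.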

Comment 10 Jaroslav Škarvada 2018-02-22 20:07:36 UTC
(In reply to Luiz Capitulino from comment #9)
> Posted series to maintainer and here. Note that this series is only half the
> battle: it detects the error and logs them, but we also need bug 1385838 so
> that tuned notifies systemd an error happened.

Thanks. Upstream commits:
https://github.com/redhat-performance/tuned/commit/c823e3c5f2a003717a4a0b73dde4c4003bbbe567
https://github.com/redhat-performance/tuned/commit/c989a8bd7cfa13d95e31875c564fd03630e54b6f
https://github.com/redhat-performance/tuned/commit/685e16640dc5c9c25037eb97cec41f9df308db46
https://github.com/redhat-performance/tuned/commit/c614ad03ff668c753ddf99772c3b47c80412533d

Comment 12 Luiz Capitulino 2018-06-13 17:27:57 UTC
So, an error message is now printed. However, tuned-adm returns zero and reports the realtime-virtual-host profile has been applied:

[root@virtlab500 realtime-virtual-host]# tuned-adm profile realtime-virtual-host
ERROR    tuned.utils.commands: Executing sysctl error: sysctl: cannot stat /proc/sys/kernel/numa_balancing: No such file or directory
ERROR    tuned.plugins.plugin_script: script '/usr/lib/tuned/realtime-virtual-host/script.sh' error output: 'Failed to set smp_affinity for IRQ 33: [Errno 5] Input/output error
Failed to set smp_affinity for IRQ 34: [Errno 5] Input/output error
Failed to set smp_affinity for IRQ 35: [Errno 5] Input/output error
Failed to set smp_affinity for IRQ 36: [Errno 5] Input/output error
defirqaffinity.py remove failed'
ERROR    tuned.plugins.plugin_script: script '/usr/lib/tuned/realtime-virtual-host/script.sh' returned error code: 1 <----------- New error message reporting the activation script has failed
[root@virtlab500 realtime-virtual-host]# echo $?
0
[root@virtlab500 realtime-virtual-host]# tuned-adm active
Current active profile: realtime-virtual-host
[root@virtlab500 realtime-virtual-host]# 

(Please ignore the IRQ error messages; those are bug 1590937.)

While printing the error message is helpful, it is not enough to solve this issue. IMO, in case of an error tuned should:

1. Go back to the previous profile
2. Return an error code

Now, I understand this might be an impactful change for 7.6 at this point. So, I'd be fine to move this BZ to 7.7 or even RHEL8 as long as we agree this has to be done.

What do you think, Jaroslav?
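
Until tuned does this itself, the two points above can be approximated from the outside. The wrapper below is a sketch, not tuned behavior: the function name apply_or_rollback and the TUNED_ADM variable are hypothetical, and treating any output line starting with "ERROR" as a failure is an assumption based on the log format shown above. It uses only "tuned-adm active" and "tuned-adm profile".

```shell
# Hypothetical wrapper. TUNED_ADM may point at a stub for dry runs.
TUNED_ADM=${TUNED_ADM:-tuned-adm}

apply_or_rollback() {
    new_profile=$1
    # "tuned-adm active" prints "Current active profile: <name>".
    previous=$($TUNED_ADM active | sed -n 's/^Current active profile: //p')
    if output=$($TUNED_ADM profile "$new_profile" 2>&1); then
        status=0
    else
        status=$?
    fi
    printf '%s\n' "$output"
    # The exit status alone is not reliable here, so also look for ERROR lines.
    if [ "$status" -ne 0 ] || printf '%s\n' "$output" | grep -q '^ERROR'; then
        echo "activation of $new_profile failed" >&2
        # Go back to the previous profile, if one was reported.
        [ -n "$previous" ] && $TUNED_ADM profile "$previous" >/dev/null 2>&1
        return 1
    fi
    return 0
}
```

Calling "apply_or_rollback realtime-virtual-host" then gives callers a non-zero status on failure and leaves the earlier profile active.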

Comment 14 Jaroslav Škarvada 2018-09-06 14:19:54 UTC
(In reply to Luiz Capitulino from comment #12)
> So, an error message is now printed. However, tuned-adm returns zero and
> reports the realtime-virtual-host profile has been applied:
> 
> [root@virtlab500 realtime-virtual-host]# tuned-adm profile
> realtime-virtual-host
> ERROR    tuned.utils.commands: Executing sysctl error: sysctl: cannot stat
> /proc/sys/kernel/numa_balancing: No such file or directory
> ERROR    tuned.plugins.plugin_script: script
> '/usr/lib/tuned/realtime-virtual-host/script.sh' error output: 'Failed to
> set smp_affinity for IRQ 33: [Errno 5] Input/output error
> Failed to set smp_affinity for IRQ 34: [Errno 5] Input/output error
> Failed to set smp_affinity for IRQ 35: [Errno 5] Input/output error
> Failed to set smp_affinity for IRQ 36: [Errno 5] Input/output error
> defirqaffinity.py remove failed'
> ERROR    tuned.plugins.plugin_script: script
> '/usr/lib/tuned/realtime-virtual-host/script.sh' returned error code: 1
> <----------- New error message reporting the activation script has failed
> [root@virtlab500 realtime-virtual-host]# echo $?
> 0
> [root@virtlab500 realtime-virtual-host]# tuned-adm active
> Current active profile: realtime-virtual-host
> [root@virtlab500 realtime-virtual-host]# 
> 
> (Please, ignore the IRQ error messages since this is bug 1590937).
> 
> While printing the error message is helpful, it is not enough to solve this
> issue. IMO, in case of an error tuned should:
> 
> 1. Go back to the previous profile
> 2. Return an error code
> 
> Now, I understand this might be an impactful change for 7.6 at this point.
> So, I'd be fine to move this BZ to 7.7 or even RHEL8 as long as we agree
> this has to be done.
> 
> What you think Jaroslav?

I agree. I will clone it and we will address it in the next release.

Comment 15 Jaroslav Škarvada 2018-09-06 14:22:25 UTC
Cloned as bug 1626082.

Comment 16 Tereza Cerna 2018-09-07 11:40:25 UTC
Package tuned-2.10.0-4.el7.noarch was tested.

I did a sanity check of the attached patches and the new package. All patches were applied.
Several situations were simulated and the corresponding behavior was observed (calls to the function die, error messages, calls to the function run_tsc_deadline_latency...).

Further testing will continue in BZ#1626082, which will implement rollback when a fatal error occurs. It is useful to deploy the current fix (rather than move the whole bug to 7.7), because it repairs the naming of the function run_tsc_deadline_latency and improves the error message when applying the realtime-virtual-host profile fails.

Comment 18 errata-xmlrpc 2018-10-30 10:48:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3172

