Bug 1346715

Summary: tuned profile configuration for "kernel.sched_rt_runtime_us" in package "tuned-profiles-realtime" causes an error when spawning an instance
Product: Red Hat Enterprise Linux 7 Reporter: Harald Jensås <hjensas>
Component: tuned Assignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED ERRATA QA Contact: Tereza Cerna <tcerna>
Severity: high Docs Contact: Jiri Herrmann <jherrman>
Priority: high    
Version: 7.2 CC: jeder, jherrman, jskarvad, lcapitulino, mtosatti, psklenar, tcerna, williams
Target Milestone: rc Keywords: Upstream, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: tuned-2.7.0-1.el7 Doc Type: Release Note
Doc Text:
The global limit on how much time realtime scheduling may use has been removed in the realtime Tuned profile

Prior to this update, the Tuned utility configuration for the `kernel.sched_rt_runtime_us` sysctl variable in the realtime profile included in the _tuned-profiles-realtime_ package was incorrect. As a consequence, creating a virtual machine instance caused an error due to an incompatible scheduling time. Now, the value of `kernel.sched_rt_runtime_us` is set to "-1" (no limit), and the described problem no longer occurs.
Story Points: ---
Clone Of:
: 1372190 Environment:
Last Closed: 2016-11-04 07:27:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1240765, 1372190    

Description Harald Jensås 2016-06-15 08:44:42 UTC
Description of problem:
The TuneD configuration for the realtime profile, "kernel.sched_rt_runtime_us = 1000000",
which comes from the package "tuned-profiles-realtime", sets the wrong
scheduling time.

Package Version:
Name        : tuned-profiles-realtime
Arch        : noarch
Version     : 2.5.1
Release     : 4.el7_2.3
Size        : 2.2 k
Repo        : installed
From repo   : rhel-7-server-nfv-rpms
Summary     : Additional tuned profile(s) targeted to realtime
URL         : https://fedorahosted.org/tuned/
License     : GPLv2+
Description : Additional tuned profile(s) targeted to realtime.

$ grep -R kernel.sched_rt_runtime_us /usr/lib/tuned/realtime/*
/usr/lib/tuned/realtime/tuned.conf:kernel.sched_rt_runtime_us = 1000000

The correct sysctl value should be "-1".
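
For reference, a minimal Python sketch of checking what a profile assigns to this key. It assumes the setting sits in a [sysctl] section of an INI-style tuned.conf. The function name and the sample text are made up for illustration and are not part of tuned.

```python
import configparser

def read_profile_sysctl(conf_text, key="kernel.sched_rt_runtime_us"):
    """Return the value assigned to `key` in the [sysctl] section, or None."""
    # interpolation=None keeps any ${...} or % text in a profile from being expanded.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("sysctl"):
        return None
    return parser.get("sysctl", key, fallback=None)

# Hypothetical excerpt (assumption); the real file holds more options.
SAMPLE = """
[main]
include = network-latency

[sysctl]
kernel.sched_rt_runtime_us = 1000000
"""

print(read_profile_sysctl(SAMPLE))  # prints 1000000
```

The value comes back as a string, so compare it against "-1" rather than the integer.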


If this configuration is not changed, spawning a Nova VM with the
following flavor will fail:

+----------------------------+------------------------------------------------+
| Property                   | Value                                          |
+----------------------------+------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                          |
| OS-FLV-EXT-DATA:ephemeral  | 0                                              |
| disk                       | 10                                             |
| extra_specs                | {                                              |
|                            |   "hw:cpu_realtime_mask": "^0",                |
|                            |   "hw:cpu_policy": "dedicated",                |
|                            |   "hw:cpu_threads_policy": "prefer",           |
|                            |   "hw:mem_page_size": "large",                 |
|                            |   "hw:cpu_realtime": "yes"                     |
|                            | }                                              |
| id                         | f7c9a94e-7e0a-4e5b-a7f6-aab8f0dd01ad           |
| name                       | nfv2.small.rt                                  |
| os-flavor-access:is_public | True                                           |
| ram                        | 2048                                           |
| rxtx_factor                | 1.0                                            |
| swap                       |                                                |
| vcpus                      | 2                                              |
+----------------------------+------------------------------------------------+


Actual results:
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [req-7a7031fd-2f93-4015-afee-dffbe19411b4 9d83a1b7cd8c49faa6ff8b926e30b068 d0df738c75604087811b3a8cb5776ecc - - -] [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49] Instance failed to spawn
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49] Traceback (most recent call last):
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2156, in _build_resources
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     yield resources
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2009, in _build_and_run_instance
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     block_device_info=block_device_info)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2585, in spawn
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     block_device_info=block_device_info)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4699, in _create_domain_and_network
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     xml, pause=pause, power_on=power_on)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4629, in _create_domain
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     guest.launch(pause=pause)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 142, in launch
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     self._encoded_xml, errors='ignore')
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 204, in __exit__
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     six.reraise(self.type_, self.value, self.tb)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 137, in launch
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     return self._domain.createWithFlags(flags)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     rv = execute(f, *args, **kwargs)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     six.reraise(c, e, tb)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     rv = meth(*args, **kwargs)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1059, in createWithFlags
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49] libvirtError: Cannot set scheduler parameters for pid 44662: Operation not permitted
2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]
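
The relevant error is the last line of the traceback. A small sketch for pulling it out of such a log, assuming every line carries the prefix format shown above. The regular expression and the function name are illustrative only and are not part of nova.

```python
import re

# Assumed prefix: date, time, pid, level, logger, then optional
# "[req-...]" and "[instance: ...]" fields, as in the lines above.
PREFIX = re.compile(
    r"^\S+ \S+ \d+ [A-Z]+ \S+ (?:\[req-[^\]]*\] )?(?:\[instance: [0-9a-f-]+\] ?)?"
)

def final_error(lines):
    """Return the last non-empty message once the log prefix is stripped."""
    last = None
    for line in lines:
        body = PREFIX.sub("", line.rstrip("\n"), count=1).strip()
        if body:
            last = body
    return last

# The last two lines of the traceback above.
LOG = [
    "2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49] libvirtError: Cannot set scheduler parameters for pid 44662: Operation not permitted",
    "2016-05-25 13:59:33.684 40122 ERROR nova.compute.manager [instance: e1c29a6e-f12a-4aab-8c9d-408587234a49]",
]
print(final_error(LOG))
```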

Expected results:
Spawning instance should not fail.

Additional info:

Comment 2 Jaroslav Škarvada 2016-06-15 11:48:10 UTC
A global limit on how much time realtime scheduling may use. A run time of -1 specifies runtime == period, ie. no limit.

I think that -1 is better for realtime, but I am afraid it could also lock up non-RT tasks. Could anybody dealing with RT elaborate on it? Jeremy?

I also think that Nova should react more sanely to this condition.
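
To make the quoted semantics concrete, a small sketch of the share of each period left to realtime tasks. It assumes the default kernel.sched_rt_period_us of 1000000, which this report does not state. rt_cpu_share is a hypothetical helper, not a kernel or tuned interface.

```python
def rt_cpu_share(runtime_us, period_us=1000000):
    """Fraction of each period that realtime tasks may consume.

    Returns None when runtime_us is -1, i.e. no limit is applied.
    The default period is an assumption, not taken from this report.
    """
    if runtime_us == -1:
        return None
    # Modelling assumption: a finite runtime must fit inside the period.
    if period_us <= 0 or runtime_us < 0 or runtime_us > period_us:
        raise ValueError("runtime must lie in [0, period]")
    return runtime_us / period_us

for value in (950000, 1000000, -1):
    print(value, rt_cpu_share(value))
```

In this model 950000 gives 0.95 and 1000000 gives 1.0, but both are still finite limits. Only -1 switches the limit off.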

Comment 4 Luiz Capitulino 2016-06-15 19:19:07 UTC
Yes, Harald is right. The correct value is -1.

Jaroslav, can you fix it (in upstream and downstream) or should I post a patch?

Comment 5 Jaroslav Škarvada 2016-06-15 19:41:20 UTC
Upstream commit containing the fix:
https://git.fedorahosted.org/cgit/tuned.git/commit/?id=099ae05b747783aa8f4b0a743070ca8a89962d57

Comment 7 Luiz Capitulino 2016-06-15 19:45:14 UTC
*** Bug 1346430 has been marked as a duplicate of this bug. ***

Comment 10 Tereza Cerna 2016-08-09 12:50:48 UTC
==============================================
Verified in:
    tuned-2.7.0-1.el7.noarch
    tuned-profiles-realtime-2.7.0-1.el7.noarch
    kernel-3.10.0-327.10.1.el7.x86_64
PASS
==============================================

# cat /usr/lib/tuned/realtime/tuned.conf | grep "kernel.sched_rt_runtime_us"
kernel.sched_rt_runtime_us = -1
# tuned-adm profile realtime
# sysctl kernel.sched_rt_runtime_us
kernel.sched_rt_runtime_us = -1

==============================================
Reproduced in:
    tuned-2.5.1-4.el7.noarch
    tuned-profiles-realtime-2.5.1-4.el7.noarch
    kernel-3.10.0-327.10.1.el7.x86_64
FAIL
==============================================

# cat /usr/lib/tuned/realtime/tuned.conf | grep "kernel.sched_rt_runtime_us"
kernel.sched_rt_runtime_us = 1000000
# tuned-adm profile realtime
# sysctl kernel.sched_rt_runtime_us
kernel.sched_rt_runtime_us = 1000000
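
The manual check above can be scripted. A minimal sketch, assuming the live value is readable from /proc/sys/kernel/sched_rt_runtime_us (the procfs file behind the sysctl). Both function names are hypothetical.

```python
import os

# procfs file backing the sysctl (assumed to be available on the host).
PROC_PATH = "/proc/sys/kernel/sched_rt_runtime_us"

def parse_sysctl_line(line):
    """Parse 'name = value' as printed by sysctl into (name, int value)."""
    name, _, value = line.partition("=")
    return name.strip(), int(value.strip())

def read_live_value(path=PROC_PATH):
    """Return the current setting as an int, or None if the file is absent."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return int(handle.read().strip())
```

For example, parse_sysctl_line("kernel.sched_rt_runtime_us = -1") returns the name together with the integer -1, which can then be compared with read_live_value().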

Comment 11 Tereza Cerna 2016-08-09 12:51:14 UTC
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

If you see the following:

# cat /usr/lib/tuned/realtime/tuned.conf | grep "kernel.sched_rt_runtime_us"
kernel.sched_rt_runtime_us = -1
# tuned-adm profile realtime
# sysctl kernel.sched_rt_runtime_us
kernel.sched_rt_runtime_us = 950000
                             ^^^^^^  
This is not a problem of tuned, but of the kernel. Use an older or a newer kernel version.

# rpm -q kernel
kernel-3.10.0-481.el7.x86_64

See bug BZ#1357928.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
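
The three situations seen in this report can be told apart mechanically. A rough sketch follows. The function and its labels are invented here, and the "kernel" verdict only restates the advice in this comment.

```python
def diagnose(configured, live, wanted=-1):
    """Classify a (tuned.conf value, live sysctl value) pair.

    Labels are illustrative only:
      "profile" - tuned.conf itself does not carry the wanted value
      "kernel"  - tuned.conf is right, but the live value differs
      "ok"      - both match the wanted value
    """
    if configured != wanted:
        return "profile"
    if live != wanted:
        # Per comment 11, suspect the kernel here; an unapplied
        # profile would look the same, so this is only a hint.
        return "kernel"
    return "ok"

print(diagnose(1000000, 1000000))  # prints profile
print(diagnose(-1, 950000))        # prints kernel
print(diagnose(-1, -1))            # prints ok
```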

Comment 13 Tereza Cerna 2016-09-01 13:04:30 UTC
An automated test case was created:
/CoreOS/tuned/Regression/realtime--check-kernel-sched_rt_runtime_us


NEW FUNCTIONALITY:

:: [   PASS   ] :: Command 'cat /usr/lib/tuned/realtime/tuned.conf | grep 'kernel.sched_rt_runtime_us' | grep '\-1'' (Expected 0, got 0)
:: [   PASS   ] :: Command 'tuned-adm profile realtime' (Expected 0, got 0)
:: [   PASS   ] :: Command 'sysctl kernel.sched_rt_runtime_us | grep '\-1'' (Expected 0, got 0)
:: [   LOG    ] :: Duration: 3s
:: [   LOG    ] :: Assertions: 3 good, 0 bad
:: [   PASS   ] :: RESULT: Check value of kernel.sched_rt_runtime_us


OLD FUNCTIONALITY

:: [   FAIL   ] :: Command 'cat /usr/lib/tuned/realtime/tuned.conf | grep 'kernel.sched_rt_runtime_us' | grep '\-1'' (Expected 0, got 1)
:: [   PASS   ] :: Command 'tuned-adm profile realtime' (Expected 0, got 0)
:: [   FAIL   ] :: Command 'sysctl kernel.sched_rt_runtime_us | grep '\-1'' (Expected 0, got 1)
:: [   LOG    ] :: Duration: 4s
:: [   LOG    ] :: Assertions: 1 good, 2 bad
:: [   FAIL   ] :: RESULT: Check value of kernel.sched_rt_runtime_us

Comment 15 errata-xmlrpc 2016-11-04 07:27:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2479.html