Description of problem:
The tuned pod throws the error below after creating the performance-patch Tuned profile and the PerformanceProfile shown in the reproduction steps:

2022-03-02 07:27:49,257 ERROR tuned.units.manager: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tuned/units/manager.py", line 119, in _try_call
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/instance/instance.py", line 78, in apply_tuning
    self._plugin.instance_apply_tuning(self)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 261, in instance_apply_tuning
    self._instance_apply_static(instance)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 317, in _instance_apply_static
    self._execute_all_non_device_commands(instance)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 433, in _execute_all_non_device_commands
    self._execute_non_device_command(instance, command, new_value)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 514, in _execute_non_device_command
    command["set"](new_value, sim = False)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_selinux.py", line 49, in _set_avc_cache_threshold
    threshold = int(value)
ValueError: invalid literal for int() with base 10: '8192 # Custom (atomic host)'

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy PAO from Operator Hub
2. Create the Performance profile and the Tuned profile as follows:

$ cat performance-patch.sh
oc create -f- <<EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: performance-patch
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Configuration changes profile inherited from performance created tuned
      include=openshift-node-performance-profile
      [bootloader]
      cmdline_crash=nohz_full=0,2-4
      [sysctl]
      kernel.timer_migration=1
      [service]
      service.stalld=start,enable
    name: performance-patch
  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: master
    priority: 19
    profile: performance-patch
EOF

[ocpadmin@ec2-18-217-45-133 nto]$ cat performance-profile.yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  finalizers:
  - foreground-deletion
  name: profile
spec:
  additionalKernelArgs:
  - idle=poll
  cpu:
    isolated: 0,3
    reserved: 1-2
  globallyDisableIrqLoadBalancing: true
  hugepages:
    defaultHugepagesSize: 1G
    pages:
    - count: 2
      size: 1G
  machineConfigPoolSelector:
    pools.operator.machineconfiguration.openshift.io/master: ""
  nodeSelector:
    node-role.kubernetes.io/master: ""
  numa:
    topologyPolicy: restricted
  realTimeKernel:
    enabled: false

Actual results:
The performance-patch profile isn't applied.

Expected results:
The performance-patch profile is applied, with no errors in the tuned pod logs.

Additional info:
oc logs tuned-nkhjt -n openshift-cluster-node-tuning-operator
I0302 07:27:41.721131 3689 controller.go:1221] starting openshift-tuned v4.10.0-202202241816.p0.g3c5760e.assembly.stream-0-gb855682-dirty
I0302 07:27:41.830129 3689 controller.go:323] disabling system tuned...
I0302 07:27:41.831650 3689 controller.go:1015] started events processors
I0302 07:27:41.831722 3689 controller.go:349] extracting TuneD profiles
I0302 07:27:41.837925 3689 controller.go:1053] started controller
I0302 07:27:45.469428 3689 controller.go:427] written "/etc/tuned/recommend.d/50-openshift.conf" to set TuneD profile performance-patch
I0302 07:27:45.804559 3689 controller.go:440] starting tuned...
2022-03-02 07:27:47,052 INFO tuned.daemon.application: TuneD: 2.18.0, kernel: 4.18.0-305.34.2.el8_4.x86_64
2022-03-02 07:27:47,053 INFO tuned.daemon.application: dynamic tuning is globally disabled
2022-03-02 07:27:47,182 INFO tuned.daemon.daemon: using sleep interval of 1 second(s)
2022-03-02 07:27:47,183 INFO tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2022-03-02 07:27:47,185 INFO tuned.daemon.daemon: Using 'performance-patch' profile
2022-03-02 07:27:47,204 INFO tuned.profiles.loader: loading profile: performance-patch
2022-03-02 07:27:47,955 INFO tuned.daemon.controller: starting controller
2022-03-02 07:27:47,963 INFO tuned.daemon.daemon: starting tuning
2022-03-02 07:27:48,281 INFO tuned.plugins.base: instance cpu: assigning devices cpu1, cpu0, cpu3, cpu2
2022-03-02 07:27:48,291 INFO tuned.plugins.plugin_cpu: We are running on an x86 GenuineIntel platform
2022-03-02 07:27:48,304 WARNING tuned.plugins.plugin_cpu: your CPU doesn't support MSR_IA32_ENERGY_PERF_BIAS, ignoring CPU energy performance bias
2022-03-02 07:27:48,341 INFO tuned.plugins.plugin_disk: Device 'nvme0n1' not supported by hdparm
2022-03-02 07:27:48,343 INFO tuned.plugins.base: instance disk: assigning devices nvme0n1
2022-03-02 07:27:48,351 INFO tuned.plugins.base: instance net: assigning devices ens5
2022-03-02 07:27:48,820 INFO tuned.plugins.plugin_bootloader: cannot read '/etc/default/grub'
2022-03-02 07:27:48,832 ERROR tuned.plugins.plugin_cpu: unable to evaluate latency value (probably wrong settings in the 'cpu' section of current profile), disabling PM QoS
2022-03-02 07:27:48,838 ERROR tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0 # cpu-partitioning #realtime': [Errno 524] Unknown error 524
2022-03-02 07:27:48,839 INFO tuned.plugins.plugin_sysctl: reapplying system sysctl
2022-03-02 07:27:49,257 ERROR tuned.units.manager: BUG: Unhandled exception in start_tuning: invalid literal for int() with base 10: '8192 # Custom (atomic host)'
2022-03-02 07:27:49,257 ERROR tuned.units.manager: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tuned/units/manager.py", line 119, in _try_call
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/instance/instance.py", line 78, in apply_tuning
    self._plugin.instance_apply_tuning(self)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 261, in instance_apply_tuning
    self._instance_apply_static(instance)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 317, in _instance_apply_static
    self._execute_all_non_device_commands(instance)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 433, in _execute_all_non_device_commands
    self._execute_non_device_command(instance, command, new_value)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/base.py", line 514, in _execute_non_device_command
    command["set"](new_value, sim = False)
  File "/usr/lib/python3.6/site-packages/tuned/plugins/plugin_selinux.py", line 49, in _set_avc_cache_threshold
    threshold = int(value)
ValueError: invalid literal for int() with base 10: '8192 # Custom (atomic host)'
2022-03-02 07:27:49,268 WARNING tuned.plugins.plugin_vm: Incorrect 'transparent_hugepages' value 'never # network-latency'.
2022-03-02 07:27:49,276 INFO tuned.plugins.plugin_systemd: setting 'CPUAffinity' to '1 2' in the '/etc/systemd/system.conf'
2022-03-02 07:27:51,183 INFO tuned.plugins.plugin_script: calling script '/usr/lib/tuned/cpu-partitioning/script.sh' with arguments '['start']'
2022-03-02 07:27:53,095 INFO tuned.plugins.plugin_bootloader: installing additional boot command line parameters to grub2
2022-03-02 07:27:53,095 INFO tuned.plugins.plugin_bootloader: cannot find grub.cfg to patch
2022-03-02 07:27:53,156 INFO tuned.daemon.daemon: static tuning from profile 'performance-patch' applied
E0302 07:27:53.157692 3689 controller.go:775] unable to sync(daemon/) requeued (0)
Please provide the versions of the components used, especially PAO. This is a known issue that was fixed half a year ago by https://github.com/openshift-kni/performance-addon-operators/commit/874da9e1adaabde490fd9ab58be3e8cd13c32b94
This is a potential 4.9 to 4.10 blocker.
Linking TuneD BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2060138
The PAO version is 4.9.7.
Martin, can you help us understand the impact of this? Is this in 4.10?

Regards,
Ken Y
Ken, please check the doc text; it is all there. This only affects PAO 4.9 when combined with OCP 4.10 (during a cluster upgrade), or a custom Tuned override with end-of-line comments. Neither a clean 4.10 install nor a clean 4.9 install is affected.
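To illustrate the failure mode: the traceback shows `int(value)` being called on a profile value that still carries its end-of-line comment. Below is a minimal sketch of the problem and of one possible way to tolerate such values. The helper name `parse_int_setting` is hypothetical, and this is not necessarily how TuneD or PAO actually fixed it.

```python
def parse_int_setting(raw):
    """Parse an integer profile value, ignoring a trailing '# comment'.

    Hypothetical helper for illustration only; not TuneD's actual code.
    """
    # Keep only the text before the first '#', then trim whitespace.
    value = raw.split("#", 1)[0].strip()
    return int(value)


raw = "8192 # Custom (atomic host)"  # value taken from the traceback

try:
    int(raw)  # what the traceback shows being attempted
except ValueError as err:
    print("unparsed value fails:", err)

print("comment stripped:", parse_int_setting(raw))
```

Splitting on the first `#` is the simplest approach, but it assumes `#` never appears inside a legitimate value.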
The errata https://access.redhat.com/errata/RHSA-2022:1162 ships cluster-node-tuning-operator-container-v4.10.0-202203282147.p0.g3c5760e.assembly.stream, which includes tuned-2.18.0-1.1.20220317gite1045f2d.el8fdp.noarch. That is the same tuned as released in https://access.redhat.com/errata/RHBA-2022:1084, which fixed https://bugzilla.redhat.com/show_bug.cgi?id=2064605.

Marking as fixed.