Bug 2037036

Summary:	The tuned profile goes into degraded status and ksm.service is displayed in the log.
Product:	OpenShift Container Platform	Reporter:	Jiří Mencák <jmencak>
Component:	Node Tuning Operator	Assignee:	Jiří Mencák <jmencak>
Status:	CLOSED ERRATA	QA Contact:	liqcui
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4.10	CC:	aapark, aos-bugs, dagray, liqcui
Target Milestone:	---
Target Release:	4.10.0
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	2036303	Environment:
Last Closed:	2022-03-12 04:40:12 UTC	Type:	---
Bug Blocks:	2036303

Description Jiří Mencák 2022-01-04 17:11:40 UTC

+++ This bug was initially created as a clone of Bug #2036303 +++

Description of problem:
When the Tuned profile is updated, the profile is applied to the node but remains DEGRADED.


Version-Release number of selected component (if applicable):
$ omg get clusterversion
NAME     VERSION  AVAILABLE  PROGRESSING  SINCE  STATUS
version  4.9.12   True       False        38m    Error while reconciling 4.9.12: the cluster operator insights is degraded

How reproducible:


Steps to Reproduce:
1. Install and set up the Performance Addon Operator
[root@bastion1 dk]# oc get performanceprofiles.performance.openshift.io performance -oyaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
 creationTimestamp: "2021-11-02T10:18:56Z"
 finalizers:
 - foreground-deletion
 generation: 1
 name: performance
 resourceVersion: "9172819"
 uid: 931a600a-7e9a-499d-9e08-f99abbdd90ed
spec:
 cpu:
   isolated: 4-39,44-79
   reserved: 0-3,40-43
 globallyDisableIrqLoadBalancing: true
 hugepages:
   defaultHugepagesSize: 1G
   pages:
   - count: 32
     node: 0
     size: 1G
   - count: 32
     node: 1
     size: 1G
 nodeSelector:
   node-role.kubernetes.io/sys: ""
 numa:
   topologyPolicy: restricted

2. Create a Tuned profile
[root@bastion1 smile]# cat tuned_sysctl_socket_buffer_profile.yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
 name: sysctl-socket-buffer
 namespace: openshift-cluster-node-tuning-operator
spec:
 profile:
 - data: |
     [main]
     summary=Set rmem_default,rmem_max,wmem_default,wmem_max
     include=openshift-node
     [sysctl]
     net.core.rmem_default = 2097152
     net.core.rmem_max = 2097152
     net.core.wmem_default = 2097152
     net.core.wmem_max = 2097152
   name: openshift-sysctl
 recommend:
 - machineConfigLabels:
     machineconfiguration.openshift.io/role: "sys"
   priority: 20
   profile: openshift-sysctl

3. The Tuned profile is reported as degraded on several nodes
[root@bastion1 dk]# oc get profile -A
NAMESPACE                                NAME                         TUNED                     APPLIED   DEGRADED   AGE
openshift-cluster-node-tuning-operator   master01.ss2.samsung.local   openshift-control-plane   True      False      65d
openshift-cluster-node-tuning-operator   master02.ss2.samsung.local   openshift-control-plane   True      False      64d
openshift-cluster-node-tuning-operator   master03.ss2.samsung.local   openshift-control-plane   True      False      65d
openshift-cluster-node-tuning-operator   worker01.ss2.samsung.local   openshift-sysctl-oam      True      True       61d
openshift-cluster-node-tuning-operator   worker02.ss2.samsung.local   openshift-sysctl-oam      True      False      61d
openshift-cluster-node-tuning-operator   worker03.ss2.samsung.local   openshift-sysctl-oam      True      True       61d
openshift-cluster-node-tuning-operator   worker04.ss2.samsung.local   openshift-sysctl-oam      True      False      61d
openshift-cluster-node-tuning-operator   worker05.ss2.samsung.local   openshift-sysctl-sys      True      False      61d
openshift-cluster-node-tuning-operator   worker06.ss2.samsung.local   openshift-sysctl-sys      True      True       61d
openshift-cluster-node-tuning-operator   worker07.ss2.samsung.local   openshift-sysctl-sys      True      False      61d
openshift-cluster-node-tuning-operator   worker08.ss2.samsung.local   openshift-sysctl-sys      True      False      61d
openshift-cluster-node-tuning-operator   worker09.ss2.samsung.local   openshift-sysctl-call     True      False      34d
openshift-cluster-node-tuning-operator   worker10.ss2.samsung.local   openshift-sysctl-call     True      True       34d
openshift-cluster-node-tuning-operator   worker11.ss2.samsung.local   openshift-sysctl-call2    True      False      6d20h
openshift-cluster-node-tuning-operator   worker12.ss2.samsung.local   openshift-sysctl-call2    True      False      6d20h
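
The affected nodes can be picked out of a listing like the one above by filtering on the DEGRADED column. The following is a minimal Python sketch, assuming the whitespace-separated column layout shown above. The function name `degraded_nodes` and the shortened sample are illustrative only and are not part of any OpenShift tooling.

```python
from typing import List


def degraded_nodes(listing: str) -> List[str]:
    """Return the NAME of every row whose DEGRADED column reads 'True'.

    Assumes the text output of `oc get profile -A`, where no column
    value contains whitespace.
    """
    lines = [ln for ln in listing.splitlines() if ln.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    degraded_idx = header.index("DEGRADED")

    result = []
    for row in lines[1:]:
        cols = row.split()
        if len(cols) > degraded_idx and cols[degraded_idx] == "True":
            result.append(cols[name_idx])
    return result


# Two rows copied from the listing above, used only as sample input.
SAMPLE = """\
NAMESPACE                                NAME                         TUNED                     APPLIED   DEGRADED   AGE
openshift-cluster-node-tuning-operator   worker09.ss2.samsung.local   openshift-sysctl-call     True      False      34d
openshift-cluster-node-tuning-operator   worker10.ss2.samsung.local   openshift-sysctl-call     True      True       34d
"""

if __name__ == "__main__":
    for node in degraded_nodes(SAMPLE):
        print(node)
```

The column positions are looked up from the header row rather than hard-coded, so the same sketch also works on the output without the NAMESPACE column.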

Actual results:
1) The Profile status reports an error
--
$ omg get profile worker10.ss2.samsung.local -o yaml
~
status:
  bootcmdline: skew_tick=1 nohz=on rcu_nocbs=4-27,32-55 tuned.non_isolcpus=f000000f
    intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt isolcpus=managed_irq,4-27,32-55
    systemd.cpu_affinity=0,1,2,3,28,29,30,31 default_hugepagesz=1G +
  conditions:
  - lastTransitionTime: '2021-12-29T03:30:22Z'
    message: Tuned profile applied.
    reason: AsExpected
    status: 'True'
    type: Applied
  - lastTransitionTime: '2021-12-29T03:30:22Z'
    message: Tuned daemon issued one or more error message(s) during profile application.
    reason: TunedError
    status: 'True'
    type: Degraded
  tunedProfile: openshift-sysctl-call
--

2) Error log in the tuned pod
--
$ omg logs tuned-zzgm5
~
2021-12-29T03:30:24.027172311Z 2021-12-29 03:30:24,027 INFO     tuned.plugins.plugin_cpu: setting new cpu latency 2
2021-12-29T03:30:24.033503757Z 2021-12-29 03:30:24,033 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
2021-12-29T03:30:24.528353891Z 2021-12-29 03:30:24,528 INFO     tuned.plugins.plugin_systemd: setting 'CPUAffinity' to '0 1 2 3 28 29 30 31' in the '/etc/systemd/system.conf'
2021-12-29T03:30:25.007818601Z 2021-12-29 03:30:25,007 INFO     tuned.plugins.plugin_script: calling script '/usr/lib/tuned/cpu-partitioning/script.sh' with arguments '['start']'
2021-12-29T03:30:25.535868718Z 2021-12-29 03:30:25,535 ERROR    tuned.plugins.plugin_script: script '/usr/lib/tuned/cpu-partitioning/script.sh' error output: 'Unit ksm.service does not exist, proceeding anyway.
2021-12-29T03:30:25.535868718Z Unit ksmtuned.service does not exist, proceeding anyway.'
2021-12-29T03:30:25.536893772Z 2021-12-29 03:30:25,536 INFO     tuned.plugins.plugin_bootloader: installing additional boot command line parameters to grub2
2021-12-29T03:30:25.537422292Z E1229 03:30:25.537398   16277 tuned.go:776] unable to sync(daemon/) requeued (6)
2021-12-29T03:30:25.537499978Z E1229 03:30:25.537479   16277 tuned.go:776] unable to sync(daemon/) requeued (7)
2021-12-29T03:30:25.537575410Z 2021-12-29 03:30:25,537 INFO     tuned.daemon.daemon: static tuning from profile 'openshift-sysctl-call' applied
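
The Degraded condition above has reason TunedError and says the daemon issued one or more error messages during profile application. In this excerpt the only ERROR record is the one from plugin_script, which relays the script's stderr about ksm.service. A minimal Python sketch for pulling ERROR records out of such a log follows. It assumes the TuneD line format seen in the excerpt (timestamp, level, logger, message). The helper name `tuned_errors` is hypothetical.

```python
import re
from typing import List, Tuple

# Matches the TuneD part of a line: "YYYY-MM-DD HH:MM:SS,mmm ERROR <logger>: <message>".
# Any container-runtime timestamp prefix before it is ignored by using search().
ERROR_RE = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ERROR\s+([\w.]+): (.*)"
)


def tuned_errors(log: str) -> List[Tuple[str, str]]:
    """Return (logger, message) pairs for every TuneD ERROR record."""
    found = []
    for line in log.splitlines():
        match = ERROR_RE.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Two lines copied from the log above, used only as sample input.
SAMPLE = """\
2021-12-29T03:30:25.007818601Z 2021-12-29 03:30:25,007 INFO     tuned.plugins.plugin_script: calling script '/usr/lib/tuned/cpu-partitioning/script.sh' with arguments '['start']'
2021-12-29T03:30:25.535868718Z 2021-12-29 03:30:25,535 ERROR    tuned.plugins.plugin_script: script '/usr/lib/tuned/cpu-partitioning/script.sh' error output: 'Unit ksm.service does not exist, proceeding anyway.
"""

if __name__ == "__main__":
    for logger, message in tuned_errors(SAMPLE):
        print(logger, "->", message)
```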

Expected results:
The DEGRADED status of the Tuned profile should be False.

Additional info:

Comment 1 Jiří Mencák 2022-01-04 17:14:26 UTC

This is fixed upstream by https://github.com/redhat-performance/tuned/pull/331

The latest TuneD shipped via FDP in 4.10 already has the fix.  Nevertheless, another fix is needed in 4.10 for the [bootloader] plugin.
PR to follow soon.

Comment 3 Jiří Mencák 2022-01-06 08:43:54 UTC

Fixed on 4.10.0-0.nightly-2022-01-05-181126 and above.

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-05-181126   True        False         11h     Cluster version is 4.10.0-0.nightly-2022-01-05-181126

$ oc get no
NAME                                                          STATUS   ROLES    AGE   VERSION
jmencak-jjhml-master-0.c.openshift-gce-devel.internal         Ready    master   12h   v1.22.1+6859754
jmencak-jjhml-master-1.c.openshift-gce-devel.internal         Ready    master   12h   v1.22.1+6859754
jmencak-jjhml-master-2.c.openshift-gce-devel.internal         Ready    master   12h   v1.22.1+6859754
jmencak-jjhml-worker-a-8kc2n.c.openshift-gce-devel.internal   Ready    worker   12h   v1.22.1+6859754
jmencak-jjhml-worker-b-k54sj.c.openshift-gce-devel.internal   Ready    worker   12h   v1.22.1+6859754

$ oc label no jmencak-jjhml-worker-a-8kc2n.c.openshift-gce-devel.internal node-role.kubernetes.io/worker-rt=
node/jmencak-jjhml-worker-a-8kc2n.c.openshift-gce-devel.internal labeled

$ oc create -f- <<EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: openshift-cpu-partitioning
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Custom OpenShift cpu-partitioning profile
      include=openshift-node,cpu-partitioning
      [variables]
      # {isolated,no_balance}_cores take a list of ranges; e.g. isolated_cores=2,4-7
      isolated_cores=1
      no_balance_cores=1
      [bootloader]
      # set empty values to disable RHEL initrd setting in cpu-partitioning
      initrd_remove_dir=
      initrd_dst_img=
      initrd_add_dir=
    name: openshift-cpu-partitioning

  recommend:
  - match:
    - label: node-role.kubernetes.io/worker-rt
    priority: 20
    profile: openshift-cpu-partitioning
EOF

$ oc get po -o wide|grep worker-a
tuned-hshhh                                     1/1     Running   0          12h   10.0.128.3    jmencak-jjhml-worker-a-8kc2n.c.openshift-gce-devel.internal   <none>           <none>

$ oc logs tuned-hshhh | grep ERROR
2022-01-06 08:37:25,761 ERROR    tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0': [Errno 524] Unknown error 524
2022-01-06 08:37:26,253 ERROR    tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0': [Errno 524] Unknown error 524
2022-01-06 08:37:26,312 ERROR    tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0': [Errno 524] Unknown error 524

$ oc get profile
NAME                                                          TUNED                        APPLIED   DEGRADED   AGE
jmencak-jjhml-master-0.c.openshift-gce-devel.internal         openshift-control-plane      True      False      12h
jmencak-jjhml-master-1.c.openshift-gce-devel.internal         openshift-control-plane      True      False      12h
jmencak-jjhml-master-2.c.openshift-gce-devel.internal         openshift-control-plane      True      False      12h
jmencak-jjhml-worker-a-8kc2n.c.openshift-gce-devel.internal   openshift-cpu-partitioning   True      True       12h
jmencak-jjhml-worker-b-k54sj.c.openshift-gce-devel.internal   openshift-node               True      False      12h

The profile `jmencak-jjhml-worker-a-8kc2n.c.openshift-gce-devel.internal` is still Degraded. However, that is expected on GCP/AWS/... and in VMs where the
kernel.nmi_watchdog sysctl cannot be set, so TuneD logs an ERROR.  On bare metal this error does not appear and the profile is not degraded.  The logs no longer show any issue with ksm.service:

$ oc logs tuned-hshhh | grep ksm.service
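
When checking a cluster where the nmi_watchdog errors are expected, it can help to separate them from any other ERROR line. A minimal Python sketch follows. The list of "expected" substrings is an assumption of this sketch and is not how the operator computes the Degraded condition. The names `unexpected_errors`, `EXPECTED_SUBSTRINGS` and `KSM_LINE` are hypothetical.

```python
from typing import Iterable, List

# Assumption of this sketch: ERROR lines mentioning these strings are
# considered expected on cloud VMs, per the explanation above.
EXPECTED_SUBSTRINGS = ("kernel.nmi_watchdog",)


def unexpected_errors(
    log: str, expected: Iterable[str] = EXPECTED_SUBSTRINGS
) -> List[str]:
    """Return ERROR lines that mention none of the expected substrings."""
    expected = tuple(expected)
    return [
        line
        for line in log.splitlines()
        if " ERROR " in line and not any(s in line for s in expected)
    ]


# Two lines copied from the `grep ERROR` output above, used as sample input.
SAMPLE = """\
2022-01-06 08:37:25,761 ERROR    tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0': [Errno 524] Unknown error 524
2022-01-06 08:37:26,253 ERROR    tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0': [Errno 524] Unknown error 524
"""

# The ERROR line from the original report, for comparison.
KSM_LINE = (
    "2021-12-29 03:30:25,535 ERROR    tuned.plugins.plugin_script: script "
    "'/usr/lib/tuned/cpu-partitioning/script.sh' error output: "
    "'Unit ksm.service does not exist, proceeding anyway.\n"
)

if __name__ == "__main__":
    print(len(unexpected_errors(SAMPLE)))
    print(len(unexpected_errors(SAMPLE + KSM_LINE)))
```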

Comment 4 liqcui 2022-01-06 14:59:26 UTC

Verified in my cluster as below:

[ocpadmin@ec2-18-217-45-133 sro]$ oc get no
NAME                                                          STATUS   ROLES    AGE   VERSION
liqcui-gcp4906-pmrrj-master-0.c.openshift-qe.internal         Ready    master   86m   v1.22.1+6859754
liqcui-gcp4906-pmrrj-master-1.c.openshift-qe.internal         Ready    master   86m   v1.22.1+6859754
liqcui-gcp4906-pmrrj-master-2.c.openshift-qe.internal         Ready    master   86m   v1.22.1+6859754
liqcui-gcp4906-pmrrj-worker-a-vh9d7.c.openshift-qe.internal   Ready    worker   72m   v1.22.1+6859754
liqcui-gcp4906-pmrrj-worker-b-7lz6j.c.openshift-qe.internal   Ready    worker   75m   v1.22.1+6859754
liqcui-gcp4906-pmrrj-worker-c-llvnm.c.openshift-qe.internal   Ready    worker   75m   v1.22.1+6859754
[ocpadmin@ec2-18-217-45-133 sro]$ oc label no liqcui-gcp4906-pmrrj-worker-a-vh9d7.c.openshift-qe.internal node-role.kubernetes.io/worker-rt=
node/liqcui-gcp4906-pmrrj-worker-a-vh9d7.c.openshift-qe.internal labeled
[ocpadmin@ec2-18-217-45-133 sro]$ oc create -f- <<EOF
> apiVersion: tuned.openshift.io/v1
> kind: Tuned
> metadata:
>   name: openshift-cpu-partitioning
>   namespace: openshift-cluster-node-tuning-operator
> spec:
>   profile:
>   - data: |
>       [main]
>       summary=Custom OpenShift cpu-partitioning profile
>       include=openshift-node,cpu-partitioning
>       [variables]
>       # {isolated,no_balance}_cores take a list of ranges; e.g. isolated_cores=2,4-7
>       isolated_cores=1
>       no_balance_cores=1
>       [bootloader]
>       # set empty values to disable RHEL initrd setting in cpu-partitioning
>       initrd_remove_dir=
>       initrd_dst_img=
>       initrd_add_dir=
>     name: openshift-cpu-partitioning
> 
>   recommend:
>   - match:
>     - label: node-role.kubernetes.io/worker-rt
>     priority: 20
>     profile: openshift-cpu-partitioning
> EOF
tuned.tuned.openshift.io/openshift-cpu-partitioning created

[ocpadmin@ec2-18-217-45-133 sro]$ oc get ns |grep tun
openshift-cluster-node-tuning-operator             Active   92m
[ocpadmin@ec2-18-217-45-133 sro]$ oc get po -n openshift-cluster-node-tuning-operator -o wide|grep liqcui-gcp4906-pmrrj-worker-a-vh9d7.c.openshift-qe.internal
tuned-fnxz8                                     1/1     Running   0          75m   10.0.128.2    liqcui-gcp4906-pmrrj-worker-a-vh9d7.c.openshift-qe.internal   <none>           <none>

[ocpadmin@ec2-18-217-45-133 sro]$ oc logs tuned-fnxz8  -n openshift-cluster-node-tuning-operator | tail -10
2022-01-06 14:54:25,388 INFO     tuned.plugins.plugin_cpu: setting new cpu latency 0
2022-01-06 14:54:25,390 ERROR    tuned.plugins.plugin_sysctl: Failed to set sysctl parameter 'kernel.nmi_watchdog' to '0': [Errno 524] Unknown error 524
2022-01-06 14:54:25,390 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
2022-01-06 14:54:25,489 INFO     tuned.plugins.plugin_systemd: setting 'CPUAffinity' to '0 2 3' in the '/etc/systemd/system.conf'
2022-01-06 14:54:25,508 INFO     tuned.plugins.plugin_script: calling script '/usr/lib/tuned/cpu-partitioning/script.sh' with arguments '['start']'
2022-01-06 14:54:25,642 INFO     tuned.plugins.plugin_bootloader: installing additional boot command line parameters to grub2
2022-01-06 14:54:25,643 INFO     tuned.plugins.plugin_bootloader: cannot find grub.cfg to patch
E0106 14:54:25.643783    3470 controller.go:775] unable to sync(daemon/) requeued (4)
E0106 14:54:25.643824    3470 controller.go:775] unable to sync(daemon/) requeued (5)
2022-01-06 14:54:25,643 INFO     tuned.daemon.daemon: static tuning from profile 'openshift-cpu-partitioning' applied
[ocpadmin@ec2-18-217-45-133 sro]$ oc get profile -n openshift-cluster-node-tuning-operator
NAME                                                          TUNED                        APPLIED   DEGRADED   AGE
liqcui-gcp4906-pmrrj-master-0.c.openshift-qe.internal         openshift-control-plane      True      False      86m
liqcui-gcp4906-pmrrj-master-1.c.openshift-qe.internal         openshift-control-plane      True      False      86m
liqcui-gcp4906-pmrrj-master-2.c.openshift-qe.internal         openshift-control-plane      True      False      86m
liqcui-gcp4906-pmrrj-worker-a-vh9d7.c.openshift-qe.internal   openshift-cpu-partitioning   True      True       76m
liqcui-gcp4906-pmrrj-worker-b-7lz6j.c.openshift-qe.internal   openshift-node               True      False      78m
liqcui-gcp4906-pmrrj-worker-c-llvnm.c.openshift-qe.internal   openshift-node               True      False      78m
[ocpadmin@ec2-18-217-45-133 sro]$ oc logs tuned-fnxz8  -n openshift-cluster-node-tuning-operator | grep ksm.service

Comment 7 errata-xmlrpc 2022-03-12 04:40:12 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056