Bug 1999608

Summary: Tuned configuration fails and does not recover when profile references a not yet existing performance profile configuration
Product: OpenShift Container Platform
Reporter: Jiří Mencák <jmencak>
Component: Node Tuning Operator
Assignee: Jiří Mencák <jmencak>
Status: CLOSED ERRATA
QA Contact: Simon <skordas>
Severity: high
Priority: high
Version: 4.8
CC: aos-bugs, dagray, imiller, mcornea, skordas
Target Milestone: ---
Keywords: AutomationBlocker
Target Release: 4.8.z
Hardware: Unspecified
OS: Unspecified
Clone Of: 1998247
Last Closed: 2021-09-21 08:02:27 UTC
Bug Depends On: 1998247

Comment 1 Jiří Mencák 2021-09-08 17:09:08 UTC
*** Bug 2000997 has been marked as a duplicate of this bug. ***

Comment 4 Simon 2021-09-16 17:28:11 UTC
$ oc get clusterversions.config.openshift.io 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-09-15-234929   True        False         168m    Cluster version is 4.8.0-0.nightly-2021-09-15-234929

$ oc get pods -n openshift-operators
NAME                                   READY   STATUS    RESTARTS   AGE
performance-operator-748ff74d9-b6f7n   1/1     Running   0          7m15s

$ oc create -f- <<EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: performance-patch
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Configuration changes profile inherited from performance created tuned
      include=openshift-node-performance-openshift-node-performance-profile
      [bootloader]
      cmdline_crash=nohz_full=2-23,26-47
      [sysctl]
      kernel.timer_migration=1
      [service]
      service.stalld=start,enable
    name: performance-patch
  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: master
    priority: 19
    profile: performance-patch
EOF
tuned.tuned.openshift.io/performance-patch created

$ oc get tuned -n openshift-cluster-node-tuning-operator 
NAME                AGE
default             3h19m
performance-patch   98s
rendered            3h19m

$ oc get profiles -n openshift-cluster-node-tuning-operator 
NAME                                         TUNED               APPLIED   DEGRADED   AGE
ip-10-0-138-112.us-east-2.compute.internal   openshift-node      True      False      27m
ip-10-0-145-212.us-east-2.compute.internal   openshift-node      True      False      27m
ip-10-0-156-93.us-east-2.compute.internal    performance-patch   False     True       3h16m
ip-10-0-186-187.us-east-2.compute.internal   performance-patch   False     True       3h16m
ip-10-0-220-241.us-east-2.compute.internal   performance-patch   False     True       3h19m
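
At this point the performance-patch profile includes openshift-node-performance-openshift-node-performance-profile, which does not appear in the `oc get tuned` listing above. A minimal sketch of how that check could be scripted, assuming the two-column NAME/AGE layout shown above; `tuned_exists` is a hypothetical helper name, not part of `oc`:

```shell
# Hypothetical helper: succeed only if the name given as $1 appears in the
# NAME column of `oc get tuned` output supplied on stdin.
tuned_exists() {
  # NR > 1 skips the header row; exit status is 0 when a match was seen.
  awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Usage (assumes a logged-in `oc` session; not executed here):
#   oc get tuned -n openshift-cluster-node-tuning-operator \
#     | tuned_exists openshift-node-performance-openshift-node-performance-profile \
#     || echo "included profile does not exist yet"
```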

$ oc create -f- <<EOF
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  finalizers:
  - foreground-deletion
  name: openshift-node-performance-profile
spec:
  additionalKernelArgs:
  - idle=poll
  cpu:
    isolated: 2-3
    reserved: 0-1
  globallyDisableIrqLoadBalancing: true
  hugepages:
    defaultHugepagesSize: 1G
    pages:
    - count: 32
      size: 1G
  machineConfigPoolSelector:
    pools.operator.machineconfiguration.openshift.io/master: ""
  nodeSelector:
    node-role.kubernetes.io/master: ""
  numa:
    topologyPolicy: restricted
  realTimeKernel:
    enabled: false
EOF

$ oc get tuned -n openshift-cluster-node-tuning-operator 
NAME                                                            AGE
default                                                         3h22m
openshift-node-performance-openshift-node-performance-profile   39s
performance-patch                                               4m2s
rendered                                                        3h22m

$ oc get profiles
NAME                                         TUNED               APPLIED   DEGRADED   AGE
ip-10-0-138-112.us-east-2.compute.internal   openshift-node      True      False      37m
ip-10-0-145-212.us-east-2.compute.internal   openshift-node      True      False      37m
ip-10-0-156-93.us-east-2.compute.internal    performance-patch   True      True       3h26m
ip-10-0-186-187.us-east-2.compute.internal   performance-patch   True      True       3h26m
ip-10-0-220-241.us-east-2.compute.internal   performance-patch   True      True       3h29m
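
To compare the two `oc get profiles` listings above without reading the tables by eye, the relevant columns can be filtered out. A minimal sketch, assuming the five-column layout shown above (NAME, TUNED, APPLIED, DEGRADED, AGE); `profile_status` is a hypothetical helper name, not part of `oc`:

```shell
# Hypothetical helper: for every node whose TUNED column equals $1, print the
# node name with its APPLIED and DEGRADED values. Reads `oc get profiles`
# output on stdin.
profile_status() {
  awk -v tuned="$1" 'NR > 1 && $2 == tuned { print $1, "applied=" $3, "degraded=" $4 }'
}

# Usage (assumes a logged-in `oc` session; not executed here):
#   oc get profiles -n openshift-cluster-node-tuning-operator \
#     | profile_status performance-patch
```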

Comment 6 errata-xmlrpc 2021-09-21 08:02:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.8.12 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3511