Bug 1998247
Summary: | Tuned configuration fails and does not recover when profile references a not yet existing performance profile configuration | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Marius Cornea <mcornea> | |
Component: | Node Tuning Operator | Assignee: | Jiří Mencák <jmencak> | |
Status: | CLOSED ERRATA | QA Contact: | Simon <skordas> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.8 | CC: | aos-bugs, dagray, imiller | |
Target Milestone: | --- | |||
Target Release: | 4.9.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1999608 | Environment: | |
Last Closed: | 2021-10-18 17:49:21 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1999608 |
Description
Marius Cornea
2021-08-26 17:03:33 UTC
Thank you for the report. Could you please provide either must-gather, or the output of:

$ oc get profile -n openshift-cluster-node-tuning-operator

and the logs from the Tuned container on the node that fails to apply the profile?

No need for must-gather or the output I asked for. I have a minimal reproducer for NTO.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-09-01-193941   True        False         3h45m   Cluster version is 4.9.0-0.nightly-2021-09-01-193941

$ node=$(oc get nodes | grep -m 1 worker | cut -f 1 -d ' ') && echo $node
pod=$(oc get pods -n openshift-cluster-node-tuning-operator -o wide | grep $node | cut -d ' ' -f 1) && echo $pod
ip-10-0-136-123.us-east-2.compute.internal
tuned-xsxrv

$ oc get routes -n openshift-console
NAME        HOST/PORT                                                                 PATH   SERVICES    PORT    TERMINATION          WILDCARD
console     console-openshift-console.apps.skordas92b.qe.devcluster.openshift.com            console     https   reencrypt/Redirect   None
downloads   downloads-openshift-console.apps.skordas92b.qe.devcluster.openshift.com          downloads   http    edge/Redirect        None

# Log in to the console
# Install Performance Addon Operator
# Operators -> Operator Hub -> Performance Addon Operator -> Install

$ oc get pods -n openshift-operators
NAME                                    READY   STATUS    RESTARTS   AGE
performance-operator-7fc5bcb7c9-4m67g   1/1     Running   0          91s

# Create tuned
oc create -f- <<EOF
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: performance-patch
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Configuration changes profile inherited from performance created tuned
      include=openshift-node-performance-profile
      [bootloader]
      cmdline_crash=nohz_full=2-23,26-47
      [sysctl]
      kernel.timer_migration=1
      [service]
      service.stalld=start,enable
    name: performance-patch
  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: master
    priority: 19
    profile: performance-patch
EOF

$ oc get tuned -n openshift-cluster-node-tuning-operator
NAME                AGE
default             4h31m
performance-patch   14s
rendered            4h31m

$ oc get profiles -n openshift-cluster-node-tuning-operator
NAME                                         TUNED               APPLIED   DEGRADED   AGE
ip-10-0-136-123.us-east-2.compute.internal   openshift-node      True      False      4h24m
ip-10-0-147-0.us-east-2.compute.internal     performance-patch   False     True       4h31m
ip-10-0-161-12.us-east-2.compute.internal    performance-patch   False     True       4h31m
ip-10-0-178-33.us-east-2.compute.internal    openshift-node      True      False      4h24m
ip-10-0-199-56.us-east-2.compute.internal    performance-patch   False     True       4h31m
ip-10-0-204-47.us-east-2.compute.internal    openshift-node      True      False      4h24m

# Create performance profile
oc create -f- <<EOF
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  finalizers:
  - foreground-deletion
  name: openshift-node-performance-profile
spec:
  additionalKernelArgs:
  - idle=poll
  cpu:
    isolated: 2-23,26-47
    reserved: 0-1,24-25
  globallyDisableIrqLoadBalancing: true
  hugepages:
    defaultHugepagesSize: 1G
    pages:
    - count: 32
      size: 1G
  machineConfigPoolSelector:
    pools.operator.machineconfiguration.openshift.io/master: ""
  nodeSelector:
    node-role.kubernetes.io/master: ""
  numa:
    topologyPolicy: restricted
  realTimeKernel:
    enabled: false
EOF

$ oc get performanceprofiles.performance.openshift.io -n openshift-operators -o yaml
apiVersion: v1
items:
- apiVersion: performance.openshift.io/v2
  kind: PerformanceProfile
  metadata:
    creationTimestamp: "2021-09-02T15:54:19Z"
    finalizers:
    - foreground-deletion
    generation: 1
    name: openshift-node-performance-profile
    resourceVersion: "105104"
    uid: a227e6c2-8480-49c9-b7d6-619292d2f8eb
  spec:
    additionalKernelArgs:
    - idle=poll
    cpu:
      isolated: 2-23,26-47
      reserved: 0-1,24-25
    globallyDisableIrqLoadBalancing: true
    hugepages:
      defaultHugepagesSize: 1G
      pages:
      - count: 32
        size: 1G
    machineConfigPoolSelector:
      pools.operator.machineconfiguration.openshift.io/master: ""
    nodeSelector:
      node-role.kubernetes.io/master: ""
    numa:
      topologyPolicy: restricted
    realTimeKernel:
      enabled: false
  status:
    conditions:
    - lastHeartbeatTime: "2021-09-02T15:54:20Z"
      lastTransitionTime: "2021-09-02T15:54:20Z"
      status: "True"
      type: Available
    - lastHeartbeatTime: "2021-09-02T15:54:20Z"
      lastTransitionTime: "2021-09-02T15:54:20Z"
      status: "True"
      type: Upgradeable
    - lastHeartbeatTime: "2021-09-02T15:54:20Z"
      lastTransitionTime: "2021-09-02T15:54:20Z"
      status: "False"
      type: Progressing
    - lastHeartbeatTime: "2021-09-02T15:54:20Z"
      lastTransitionTime: "2021-09-02T15:54:20Z"
      status: "False"
      type: Degraded
    runtimeClass: performance-openshift-node-performance-profile
    tuned: openshift-cluster-node-tuning-operator/openshift-node-performance-openshift-node-performance-profile
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

No errors when the performance profile is applied after the Tuned profile.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
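
The reproducer in this report relies on reading the DEGRADED column of `oc get profiles` by eye. As a rough sketch only, that check could be scripted with a small filter like the one below. The function name `degraded_nodes` is hypothetical and not part of NTO or `oc`, and the filter assumes the five-column layout (NAME TUNED APPLIED DEGRADED AGE) quoted in this report; other releases may print different columns.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or NTO): reads `oc get profiles`
# table output on stdin and prints "<node> <tuned-profile>" for every
# row whose DEGRADED column is "True".
# Assumes the column order NAME TUNED APPLIED DEGRADED AGE.
degraded_nodes() {
    # NR > 1 skips the header row; $4 is the DEGRADED column.
    awk 'NR > 1 && $4 == "True" { print $1 " " $2 }'
}
```

With cluster access it could be used as `oc get profiles -n openshift-cluster-node-tuning-operator | degraded_nodes`. Against the output quoted in this report, taken before the PerformanceProfile existed, it would list the three nodes running performance-patch. Empty output would be consistent with the profile having recovered after the PerformanceProfile was created.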