Bug 1719967
Summary: | During upgrade, node-tuning operator status rapidly alternates between new and old version | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Clayton Coleman <ccoleman> | |
Component: | Node Tuning Operator | Assignee: | Jiří Mencák <jmencak> | |
Status: | CLOSED ERRATA | QA Contact: | Simon <skordas> | |
Severity: | high | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 4.1.z | CC: | lmeyer, mifiedle, sejug, skordas, sponnaga, vlaad, wsun | |
Target Milestone: | --- | Keywords: | Reopened | |
Target Release: | 4.1.z | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | 4.1.6 | |||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause: The node-tuning-operator did not implement leader election.
Consequence: Two operators could run at the same time for a brief period, causing resource contention.
Fix: Implemented "leader election for life".
Result: Node-tuning operators can co-exist within the same namespace without causing resource contention.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1724274 1729273 | Environment: | ||
Last Closed: | 2019-07-11 18:10:49 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1724274, 1729273 |
Description
Clayton Coleman
2019-06-12 20:33:57 UTC
Suspect this may be the cause of reported secret leaks in the node-tuning namespace (if two versions run during an upgrade, they could be conflicting).

(In reply to Clayton Coleman from comment #2)
> Suspect this may be the cause of reported secret leaks in the node-tuning
> namespace (if two versions run during an upgrade, they could be conflicting).

No, that's not it. You were right in your original comment though: leader election is not (yet) implemented by the node-tuning operator.

Can no longer reproduce with upstream PR: https://github.com/openshift/cluster-node-tuning-operator/pull/67

The 4.1.x PR has not merged. Moving back to POST.
https://github.com/openshift/cluster-node-tuning-operator/pull/68

The build currently attached to the 4.1.4 errata is cluster-node-tuning-operator-container-v4.1.4-201906271212, which does not contain this fix. The fix will be picked up in 4.1.5, or in 4.1.4 if it is rebuilt.

Verification with builds:
[1] 4.1.0-0.nightly-2019-06-29-054428
[2] 4.1.0-0.nightly-2019-06-30-214248
[3] 4.1.0-0.nightly-2019-07-02-050624
[4] 4.1.0-0.nightly-2019-07-02-101812

Builds [3] and [4] contain the fix. I have tried different combinations of upgrade and downgrade. Last one:

    # oc get clusterversions.config.openshift.io
    NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.1.0-0.nightly-2019-07-02-050624   True        False         10m     Cluster version is 4.1.0-0.nightly-2019-07-02-050624

    # oc get configmaps -n openshift-cluster-node-tuning-operator
    NAME                        DATA   AGE
    node-tuning-operator-lock   0      21m
    tuned-profiles              1      4h16m
    tuned-recommend             1      4h16m

    # oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-07-02-101812 --force
    Updating to release image registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-07-02-101812

    # oc get configmaps -n openshift-cluster-node-tuning-operator
    NAME                        DATA   AGE
    node-tuning-operator-lock   0      42m
    tuned-profiles              1      5h5m
    tuned-recommend             1      5h5m

    # oc get clusterversions.config.openshift.io
    NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.1.0-0.nightly-2019-07-02-101812   True        False         31m     Cluster version is 4.1.0-0.nightly-2019-07-02-101812

Can't reproduce the issue any more!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:1635
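
For readers unfamiliar with the "leader election for life" fix named in the Doc Text, the general idea can be sketched with a toy in-memory model. This is not the operator's code: `FakeCluster`, `try_become_leader`, and the pod names are hypothetical, and the sketch assumes the common form of the pattern in which the leader creates a lock object that it owns, and the lock disappears only when the owning pod is deleted. The lock name is taken from the `node-tuning-operator-lock` ConfigMap in the verification output.

```python
"""Toy model of "leader election for life" with an in-memory lock store."""


class FakeCluster:
    """Hypothetical stand-in for the API server: lock objects owned by pods."""

    def __init__(self):
        self.pods = set()
        self.locks = {}  # lock name -> name of the owning pod

    def create_lock(self, lock_name, owner_pod):
        """Atomic create: succeeds only if no lock with that name exists."""
        if lock_name in self.locks:
            return False
        self.locks[lock_name] = owner_pod
        return True

    def delete_pod(self, pod):
        """Deleting a pod also removes every lock that pod owns."""
        self.pods.discard(pod)
        self.locks = {n: o for n, o in self.locks.items() if o != pod}


def try_become_leader(cluster, lock_name, pod):
    """Return True if `pod` holds the lock after this attempt."""
    if cluster.create_lock(lock_name, pod):
        return True
    # The lock already exists: we lead only if we are its owner.
    return cluster.locks.get(lock_name) == pod


LOCK = "node-tuning-operator-lock"
cluster = FakeCluster()
cluster.pods.update({"operator-old", "operator-new"})

# The old version starts first and takes the lock.
old_leads = try_become_leader(cluster, LOCK, "operator-old")
# During the upgrade the new pod starts while the old one is still alive.
new_leads_early = try_become_leader(cluster, LOCK, "operator-new")
# The old pod is deleted, so its lock is removed with it.
cluster.delete_pod("operator-old")
new_leads_late = try_become_leader(cluster, LOCK, "operator-new")

print(old_leads, new_leads_early, new_leads_late)  # True False True
```

In this model the new pod cannot act until the old pod is gone, so two versions never reconcile at the same time. That matches the symptom in the summary going away, but a real operator would also need to poll or watch for the lock to be released rather than retry once.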