Bug 2048563
Summary: | Leader election conventions for cluster topology | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Ehila <ehila> |
Component: | OLM | Assignee: | Ehila <ehila> |
OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | agreene, krizza, tyslaton |
Version: | 4.10 | ||
Target Milestone: | --- | ||
Target Release: | 4.11.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause: The package server was not topology aware when defining its leader election duration, renewal deadline, and retry periods.
Consequence: The package server created unnecessary strain on topologies with limited resources, such as single-node environments.
Fix: Introduced a leaderElection package that is topology aware, reducing strain on clusters with limited resources.
Result: The package server is topology aware and sets a reasonable lease duration, renewal deadline, and retry period for the topology.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2022-08-10 10:45:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
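The Doc Text above describes picking the lease duration, renewal deadline, and retry period according to cluster topology. A minimal sketch of that idea follows. It is written in Python for illustration only and is not the OLM code. The `SingleReplica` topology name and the 270-second lease duration come from the verification output in this report; every other number, and all class and function names, are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderElectionTimings:
    """Leader election timings, in seconds (hypothetical helper type)."""
    lease_duration_s: int
    renew_deadline_s: int
    retry_period_s: int

    def validate(self) -> None:
        # Sanity check assumed for this sketch: the lease must outlive the
        # renewal deadline, which must in turn exceed the retry period.
        if not (self.lease_duration_s > self.renew_deadline_s > self.retry_period_s > 0):
            raise ValueError(f"inconsistent timings: {self}")


# Assumed values for a highly available control plane (not from this report).
HIGHLY_AVAILABLE = LeaderElectionTimings(137, 107, 26)

# 270s lease duration is the value observed in this report on a single-node
# cluster; the renew deadline and retry period here are assumptions.
SINGLE_REPLICA = LeaderElectionTimings(270, 240, 60)


def timings_for_topology(control_plane_topology: str) -> LeaderElectionTimings:
    """Pick timings from the value of .status.controlPlaneTopology."""
    if control_plane_topology == "SingleReplica":
        timings = SINGLE_REPLICA
    else:
        timings = HIGHLY_AVAILABLE
    timings.validate()
    return timings
```

Longer intervals on a single-node cluster mean fewer lease renewals against the API server, which is the reduced strain the Doc Text refers to.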
Description
Ehila
2022-01-31 14:16:04 UTC
1. Create an SNO cluster that includes the fix PR.

    [cloud-user@preserve-olm-env jian]$ oc get clusterversion
    NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.11.0-0.nightly-2022-02-23-185405   True        False         4h4m    Cluster version is 4.11.0-0.nightly-2022-02-23-185405

    [cloud-user@preserve-olm-env jian]$ oc get infrastructure cluster -o=jsonpath={.status.controlPlaneTopology}
    SingleReplica

    [cloud-user@preserve-olm-env jian]$ oc get node
    NAME                                         STATUS   ROLES           AGE     VERSION
    ip-10-0-153-136.us-east-2.compute.internal   Ready    master,worker   4h24m   v1.23.3+fe7796f

    [cloud-user@preserve-olm-env jian]$ oc exec catalog-operator-7f65bd4697-7swnp -- olm --version
    OLM version: 0.19.0
    git commit: 6858269bdc4b31466ff5eca7d6287fe387077fa7

    126 2022-02-24T08:18:06.820Z INFO controllers.packageserver currently topology mode {"csv": "openshift-operator-lifecycle-manager/packageserver", "highly available": false}

2. Check whether `leaseDurationSeconds` changed to 270s.

    [cloud-user@preserve-olm-env jian]$ oc get cm packageserver-controller-lock -o yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"package-server-manager-587c7d499f-5v4f8_ebadc75b-bc25-49d9-9090-79c11135cf75","leaseDurationSeconds":270,"acquireTime":"2022-02-24T03:54:37Z","renewTime":"2022-02-24T08:14:33Z","leaderTransitions":0}'
      creationTimestamp: "2022-02-24T03:54:37Z"
      name: packageserver-controller-lock
      namespace: openshift-operator-lifecycle-manager
      resourceVersion: "75149"
      uid: 7870db39-b92a-4faa-93e7-c83b66d9f877

3. Check whether the package server works well.

    [cloud-user@preserve-olm-env jian]$ oc get packagemanifest
    NAME                           CATALOG               AGE
    ibm-security-verify-operator   Certified Operators   4h30m
    openshift-qiskit-operator      Community Operators   4h30m
    ...

LGTM, verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
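The check in step 2 reads the lease duration out of the `control-plane.alpha.kubernetes.io/leader` annotation on the `packageserver-controller-lock` ConfigMap. As a rough sketch of automating that check, the Python snippet below parses the annotation value copied from the output above. The function name is hypothetical, and the snippet assumes the annotations have already been fetched (for example from `oc get cm ... -o json`) into a dictionary.

```python
import json

LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def lease_duration_seconds(annotations: dict) -> int:
    """Return leaseDurationSeconds from the leader election annotation.

    Hypothetical helper: the annotation value is a JSON document stored
    as a string, so it has to be decoded a second time.
    """
    record = json.loads(annotations[LEADER_ANNOTATION])
    return int(record["leaseDurationSeconds"])


# Annotation value copied verbatim from the ConfigMap shown in this report.
annotations = {
    LEADER_ANNOTATION: (
        '{"holderIdentity":"package-server-manager-587c7d499f-5v4f8_'
        'ebadc75b-bc25-49d9-9090-79c11135cf75","leaseDurationSeconds":270,'
        '"acquireTime":"2022-02-24T03:54:37Z",'
        '"renewTime":"2022-02-24T08:14:33Z","leaderTransitions":0}'
    )
}

if __name__ == "__main__":
    print(lease_duration_seconds(annotations))
```

Comparing the returned value against the expected single-node setting (270 in this verification) gives a simple pass/fail check.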