Bug 2048563
| Summary: | Leader election conventions for cluster topology | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ehila <ehila> |
| Component: | OLM | Assignee: | Ehila <ehila> |
| OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | ||
| Priority: | high | CC: | agreene, krizza, tyslaton |
| Version: | 4.10 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause: The package server was not topology aware when defining its leader election duration, renewal deadline, and retry periods.
Consequence: The package server created unnecessary strain on topologies with limited resources, such as single node environments.
Fix: Introduced a leaderElection package that is topology aware, reducing strain on clusters with limited resources.
Result: The package server is topology aware and sets reasonable lease duration, renewal deadlines, and retry periods for the topology.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-08-10 10:45:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
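The Doc Text above describes choosing lease duration, renewal deadline, and retry period based on cluster topology. A minimal Go sketch of that idea follows. It is not the actual leaderElection package from the fix. The type and function names are hypothetical, and the only value confirmed by this bug report is the 270s lease duration on a `SingleReplica` control plane. Every other duration is an illustrative assumption.

```go
package main

import (
	"fmt"
	"time"
)

// leaderElectionTimings is a hypothetical container for the three settings
// named in the Doc Text.
type leaderElectionTimings struct {
	LeaseDuration time.Duration
	RenewDeadline time.Duration
	RetryPeriod   time.Duration
}

// timingsFor returns relaxed timings for a single-replica control plane and
// tighter ones otherwise. The argument is the value reported by
// `oc get infrastructure cluster -o=jsonpath={.status.controlPlaneTopology}`.
func timingsFor(controlPlaneTopology string) leaderElectionTimings {
	if controlPlaneTopology == "SingleReplica" {
		return leaderElectionTimings{
			LeaseDuration: 270 * time.Second, // confirmed by the lock annotation in this report
			RenewDeadline: 240 * time.Second, // assumption for illustration
			RetryPeriod:   60 * time.Second,  // assumption for illustration
		}
	}
	// Assumed values for a highly available control plane; not taken from this report.
	return leaderElectionTimings{
		LeaseDuration: 137 * time.Second,
		RenewDeadline: 107 * time.Second,
		RetryPeriod:   26 * time.Second,
	}
}

func main() {
	for _, topology := range []string{"SingleReplica", "HighlyAvailable"} {
		t := timingsFor(topology)
		fmt.Printf("%s: lease=%s renew=%s retry=%s\n",
			topology, t.LeaseDuration, t.RenewDeadline, t.RetryPeriod)
	}
}
```

Longer periods on a single node mean fewer lock renewals against the API server, at the cost of slower failover, which matters little when there is only one replica to fail over to.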
Description
Ehila
2022-01-31 14:16:04 UTC
1, Create an SNO cluster from a build that includes the fix PR.
[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-02-23-185405 True False 4h4m Cluster version is 4.11.0-0.nightly-2022-02-23-185405
[cloud-user@preserve-olm-env jian]$ oc get infrastructure cluster -o=jsonpath={.status.controlPlaneTopology}
SingleReplica
[cloud-user@preserve-olm-env jian]$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-153-136.us-east-2.compute.internal Ready master,worker 4h24m v1.23.3+fe7796f
[cloud-user@preserve-olm-env jian]$ oc exec catalog-operator-7f65bd4697-7swnp -- olm --version
OLM version: 0.19.0
git commit: 6858269bdc4b31466ff5eca7d6287fe387077fa7
126 2022-02-24T08:18:06.820Z INFO controllers.packageserver currently topology mode {"csv": "openshift-operator-lifecycle-manager/packageserver", "highly available": false}
2, Check if the `leaseDurationSeconds` changed to 270s.
[cloud-user@preserve-olm-env jian]$ oc get cm packageserver-controller-lock -o yaml
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"package-server-manager-587c7d499f-5v4f8_ebadc75b-bc25-49d9-9090-79c11135cf75","leaseDurationSeconds":270,"acquireTime":"2022-02-24T03:54:37Z","renewTime":"2022-02-24T08:14:33Z","leaderTransitions":0}'
  creationTimestamp: "2022-02-24T03:54:37Z"
  name: packageserver-controller-lock
  namespace: openshift-operator-lifecycle-manager
  resourceVersion: "75149"
  uid: 7870db39-b92a-4faa-93e7-c83b66d9f877
3, Check if the package server works well.
[cloud-user@preserve-olm-env jian]$ oc get packagemanifest
NAME CATALOG AGE
ibm-security-verify-operator Certified Operators 4h30m
openshift-qiskit-operator Community Operators 4h30m
...
LGTM, verifying it.
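The check in step 2 can also be scripted. The sketch below, in Go, decodes the JSON value of the `control-plane.alpha.kubernetes.io/leader` annotation exactly as it appears in the ConfigMap output above and reads back `leaseDurationSeconds`. The struct and function names are hypothetical, and fetching the ConfigMap from the cluster is left out, so the annotation is supplied as a string constant.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// leaderRecord mirrors the JSON fields visible in the lock annotation.
type leaderRecord struct {
	HolderIdentity       string    `json:"holderIdentity"`
	LeaseDurationSeconds int       `json:"leaseDurationSeconds"`
	AcquireTime          time.Time `json:"acquireTime"`
	RenewTime            time.Time `json:"renewTime"`
	LeaderTransitions    int       `json:"leaderTransitions"`
}

// parseLeaderRecord decodes the annotation value into a leaderRecord.
func parseLeaderRecord(annotation string) (leaderRecord, error) {
	var rec leaderRecord
	err := json.Unmarshal([]byte(annotation), &rec)
	return rec, err
}

// sampleAnnotation is copied from the packageserver-controller-lock output above.
const sampleAnnotation = `{"holderIdentity":"package-server-manager-587c7d499f-5v4f8_ebadc75b-bc25-49d9-9090-79c11135cf75","leaseDurationSeconds":270,"acquireTime":"2022-02-24T03:54:37Z","renewTime":"2022-02-24T08:14:33Z","leaderTransitions":0}`

func main() {
	rec, err := parseLeaderRecord(sampleAnnotation)
	if err != nil {
		panic(err)
	}
	fmt.Printf("holder: %s\n", rec.HolderIdentity)
	fmt.Printf("lease duration: %ds\n", rec.LeaseDurationSeconds)
	// Time between acquiring the lock and the most recent renewal.
	fmt.Printf("held for: %s, transitions: %d\n",
		rec.RenewTime.Sub(rec.AcquireTime), rec.LeaderTransitions)
}
```

Comparing `renewTime` with `acquireTime` and checking that `leaderTransitions` is still 0 gives a quick indication that the same pod held the lock for the whole interval.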
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069