Bug 2052599
| Summary: | kube-controller-manager should use configmap lease | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | ravig <rgudimet> | |
| Component: | kube-controller-manager | Assignee: | ravig <rgudimet> | |
| Status: | CLOSED ERRATA | QA Contact: | zhou ying <yinzhou> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 4.10 | CC: | aos-bugs, knarra, maszulik, mfojtik | |
| Target Milestone: | --- | Flags: | mfojtik: needinfo? | |
| Target Release: | 4.10.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | EmergencyRequest | |||
| Fixed In Version: | Doc Type: | No Doc Update | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | 2052598 | |||
| : | 2052700 | Environment: | ||
| Last Closed: | 2022-03-10 16:43:51 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 2052700 | |||
| Bug Blocks: | ||||
|
Description
ravig
2022-02-09 16:16:45 UTC
** A NOTE ABOUT USING URGENT **

This BZ has been set to urgent severity and priority. When a BZ is marked urgent priority, engineers are asked to stop whatever they are doing, putting everything else on hold. Please be prepared to have reasonable justification ready to discuss, and ensure your own and engineering management are aware and agree this BZ is urgent. Keep in mind, urgent bugs are very expensive and have maximal management visibility.

NOTE: This bug was automatically assigned to an engineering manager with the severity reset to *unspecified* until the emergency is vetted and confirmed. Please do not manually override the severity.

** INFORMATION REQUIRED **

Please answer these questions before escalation to engineering:

1. Has a link to must-gather output been provided in this BZ? We cannot work without it. If must-gather fails to run, attach all relevant logs and provide the error message of must-gather.
2. Give the output of "oc get clusteroperators -o yaml".
3. In case of degraded/unavailable operators, have all their logs and the logs of the operands been analyzed [yes/no]
4. List the top 5 relevant errors from the logs of the operators and operands in (3).
5. Order the list of degraded/unavailable operators according to which is likely the cause of the failure of the other, root-cause at the top.
6. Explain why (5) is likely the right order and list the information used for that assessment.
7. Explain why Engineering is necessary to make progress.

Checked with the latest payload and confirmed that both the ConfigMap lock and the Lease lock exist for KCM:

[root@localhost roottest]# oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-02-11-123954   True        False         47m     Cluster version is 4.10.0-0.nightly-2022-02-11-123954
[root@localhost roottest]# oc project openshift-kube-controller-manager-operator
Now using project "openshift-kube-controller-manager-operator" on server "https://api.yinzhou214.qe.devcluster.openshift.com:6443".
[root@localhost roottest]# oc get lease
NAME                                    HOLDER                                                                                    AGE
kube-controller-manager-operator-lock   kube-controller-manager-operator-668cf96f7b-9bwvl_6928e020-58ba-4223-a054-1f8b5e19e953   66m
[root@localhost roottest]# oc get lease kube-controller-manager-operator-lock -o yaml
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  creationTimestamp: "2022-02-14T02:13:29Z"
  name: kube-controller-manager-operator-lock
  namespace: openshift-kube-controller-manager-operator
  resourceVersion: "45913"
  uid: 7c875d77-f28f-4de9-96b6-b899ab46b797
spec:
  acquireTime: "2022-02-14T02:14:48.000000Z"
  holderIdentity: kube-controller-manager-operator-668cf96f7b-9bwvl_6928e020-58ba-4223-a054-1f8b5e19e953
  leaseDurationSeconds: 137
  leaseTransitions: 1
  renewTime: "2022-02-14T03:19:59.656523Z"
[root@localhost roottest]# oc get cm |grep lock
kube-controller-manager-operator-lock   0      67m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
|
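
For anyone repeating the verification, the Lease spec shown in the output above can be read as "held until renewTime plus leaseDurationSeconds". The following is a minimal Python sketch of that arithmetic, using the values from the verification output. The helper names (parse_k8s_time, lease_expiry, lease_is_held) are hypothetical and not part of any OpenShift or Kubernetes tooling, and the wall-clock comparison is a simplifying assumption: real leader-election clients may instead measure from the moment they locally observed the renewal, to tolerate clock skew.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value: str) -> datetime:
    """Parse a UTC timestamp with microseconds, e.g. 2022-02-14T03:19:59.656523Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def lease_expiry(spec: dict) -> datetime:
    """Return the instant after which the lease counts as expired (assumed rule)."""
    renewed = parse_k8s_time(spec["renewTime"])
    return renewed + timedelta(seconds=spec["leaseDurationSeconds"])


def lease_is_held(spec: dict, now: datetime) -> bool:
    """True if the lease names a holder and has not yet expired at `now`."""
    return bool(spec.get("holderIdentity")) and now < lease_expiry(spec)


# Values copied from the `oc get lease ... -o yaml` output in this report.
spec = {
    "acquireTime": "2022-02-14T02:14:48.000000Z",
    "holderIdentity": (
        "kube-controller-manager-operator-668cf96f7b-9bwvl_"
        "6928e020-58ba-4223-a054-1f8b5e19e953"
    ),
    "leaseDurationSeconds": 137,
    "leaseTransitions": 1,
    "renewTime": "2022-02-14T03:19:59.656523Z",
}

if __name__ == "__main__":
    print("lease expires at:", lease_expiry(spec).isoformat())
    probe = datetime(2022, 2, 14, 3, 20, 0, tzinfo=timezone.utc)
    print("held at", probe.isoformat(), "->", lease_is_held(spec, probe))
```

With leaseDurationSeconds at 137, a holder that stops renewing would be considered expired a little over two minutes after its last renewTime under this simplified rule.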