Bug 1654977
| Summary: | The event of the csv still is "RequirementsNotMet" even if all requirements met. | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jian Zhang <jiazha> |
| Component: | OLM | Assignee: | Vu Dinh <vdinh> |
| Status: | CLOSED ERRATA | QA Contact: | Jian Zhang <jiazha> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.1.0 | CC: | chezhang, dyan, ecordell, jfan, nhale, scuppett, vlaad, zitang |
| Target Milestone: | --- | ||
| Target Release: | 4.1.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause:
Excessive amount of Pending event requests are sent to event recorder as CSV is stuck in Pending state due to some requirements are no longer met.
Consequence: The event recorder rate limit is reached and subsequent events to be lost due to slow refill rate. As a result, Succeeded events are not recorded.
Fix: The Pending event is no longer sent if CSV is already in Pending state.
Result: The Succeeded event is now recorded correctly.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-06-04 10:41:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Jian, Could you post all the events for the CSV? Thanks. Post the whole info for your refence.
[core@jian-master-0 ~]$ oc describe csv
...
Status:
Certs Last Updated: <nil>
Certs Rotate At: <nil>
Conditions:
Last Transition Time: 2018-12-06T09:42:33Z
Last Update Time: 2018-12-06T09:42:33Z
Message: requirements not yet checked
Phase: Pending
Reason: RequirementsUnknown
Last Transition Time: 2018-12-06T09:42:33Z
Last Update Time: 2018-12-06T09:42:33Z
Message: one or more requirements couldn't be found
Phase: Pending
Reason: RequirementsNotMet
Last Transition Time: 2018-12-06T09:42:34Z
Last Update Time: 2018-12-06T09:42:34Z
Message: all requirements found, attempting install
Phase: InstallReady
Reason: AllRequirementsMet
Last Transition Time: 2018-12-06T09:42:34Z
Last Update Time: 2018-12-06T09:42:34Z
Message: waiting for install components to report healthy
Phase: Installing
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T09:42:34Z
Last Update Time: 2018-12-06T09:42:35Z
Message: installing: Waiting: waiting for deployment etcd-operator to become ready: Waiting for deployment spec update to be observed...
Phase: Installing
Reason: InstallWaiting
Last Transition Time: 2018-12-06T09:42:39Z
Last Update Time: 2018-12-06T09:42:39Z
Message: install strategy completed with no errors
Phase: Succeeded
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T09:52:44Z
Last Update Time: 2018-12-06T09:52:44Z
Message: requirements no longer met
Phase: Failed
Reason: RequirementsNotMet
Last Transition Time: 2018-12-06T09:52:44Z
Last Update Time: 2018-12-06T09:52:44Z
Message: install strategy completed with no errors
Phase: Succeeded
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T09:52:44Z
Last Update Time: 2018-12-06T09:52:44Z
Message: requirements not met
Phase: Pending
Reason: RequirementsNotMet
Last Transition Time: 2018-12-06T09:53:37Z
Last Update Time: 2018-12-06T09:53:37Z
Message: all requirements found, attempting install
Phase: InstallReady
Reason: AllRequirementsMet
Last Transition Time: 2018-12-06T09:53:37Z
Last Update Time: 2018-12-06T09:53:37Z
Message: waiting for install components to report healthy
Phase: Installing
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T09:53:37Z
Last Update Time: 2018-12-06T09:53:37Z
Message: installing: Waiting: waiting for deployment etcd-operator to become ready: Waiting for deployment spec update to be observed...
Phase: Installing
Reason: InstallWaiting
Last Transition Time: 2018-12-06T09:53:37Z
Last Update Time: 2018-12-06T09:53:37Z
Message: install strategy completed with no errors
Phase: Succeeded
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T09:58:17Z
Last Update Time: 2018-12-06T09:58:17Z
Message: requirements no longer met
Phase: Failed
Reason: RequirementsNotMet
Last Transition Time: 2018-12-06T09:58:17Z
Last Update Time: 2018-12-06T09:58:17Z
Message: install strategy completed with no errors
Phase: Succeeded
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T09:58:17Z
Last Update Time: 2018-12-06T09:58:17Z
Message: requirements not met
Phase: Pending
Reason: RequirementsNotMet
Last Transition Time: 2018-12-06T10:02:16Z
Last Update Time: 2018-12-06T10:02:16Z
Message: all requirements found, attempting install
Phase: InstallReady
Reason: AllRequirementsMet
Last Transition Time: 2018-12-06T10:02:16Z
Last Update Time: 2018-12-06T10:02:16Z
Message: waiting for install components to report healthy
Phase: Installing
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T10:02:16Z
Last Update Time: 2018-12-06T10:02:16Z
Message: installing: Waiting: waiting for deployment etcd-operator to become ready: Waiting for deployment spec update to be observed...
Phase: Installing
Reason: InstallWaiting
Last Transition Time: 2018-12-06T10:02:16Z
Last Update Time: 2018-12-06T10:02:16Z
Message: install strategy completed with no errors
Phase: Succeeded
Reason: InstallSucceeded
Last Transition Time: 2018-12-06T10:02:16Z
Last Update Time: 2018-12-06T10:02:16Z
Message: install strategy completed with no errors
Phase: Succeeded
Reason: InstallSucceeded
Requirement Status:
Group: apiextensions.k8s.io
Kind: CustomResourceDefinition
Name: etcdbackups.etcd.database.coreos.com
Status: Present
Uuid: b034b902-f92b-11e8-b8aa-be36c804dcbb
Version: v1beta1
Group: apiextensions.k8s.io
Kind: CustomResourceDefinition
Name: etcdclusters.etcd.database.coreos.com
Status: Present
Uuid: b05556c8-f92b-11e8-b8aa-be36c804dcbb
Version: v1beta1
Group: apiextensions.k8s.io
Kind: CustomResourceDefinition
Name: etcdrestores.etcd.database.coreos.com
Status: Present
Uuid: b0842499-f92b-11e8-b8aa-be36c804dcbb
Version: v1beta1
Dependents:
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":["etcd.database.coreos.com"],"resources":["etcdclusters","etcdbackups","etcdrestores"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":[""],"resources":["pods","services","endpoints","persistentvolumeclaims","events"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":["apps"],"resources":["deployments"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["get"],"apiGroups":[""],"resources":["secrets"]}
Status: Satisfied
Version: v1beta1
Group:
Kind: ServiceAccount
Name: etcd-operator
Status: Present
Version: v1
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal RequirementsUnknown 22m operator-lifecycle-manager requirements not yet checked
Normal AllRequirementsMet 11m (x2 over 22m) operator-lifecycle-manager all requirements found, attempting install
Normal InstallSucceeded 11m (x2 over 22m) operator-lifecycle-manager waiting for install components to report healthy
Normal RequirementsNotMet 3m (x23 over 22m) operator-lifecycle-manager one or more requirements couldn't be found
[core@jian-master-0 ~]$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
etcdoperator.v0.9.2 etcd 0.9.2 etcdoperator.v0.9.0 Succeeded
[core@jian-master-0 ~]$ oc get pods
NAME READY STATUS RESTARTS AGE
etcd-operator-68b4997899-mt5bg 3/3 Running 0 23m
The event is here:
Last Transition Time: 2018-12-06T10:02:16Z
Last Update Time: 2018-12-06T10:02:16Z
Message: all requirements found, attempting install
Phase: InstallReady
Reason: AllRequirementsMet
We're definitely emitting the AllRequirementsMet event here. It does not appear that there is a bug.
Evan, Yes, but I guess you listed above belongs to the "Conditions:" field. In the "Events" field, you can see the below sentence: Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal RequirementsUnknown 22m operator-lifecycle-manager requirements not yet checked Normal AllRequirementsMet 11m (x2 over 22m) operator-lifecycle-manager all requirements found, attempting install Normal InstallSucceeded 11m (x2 over 22m) operator-lifecycle-manager waiting for install components to report healthy Normal RequirementsNotMet 3m (x23 over 22m) operator-lifecycle-manager one or more requirements couldn't be found The last event still shows "RequirementsNotMet". The OLM version: io.openshift.build.commit.id=840d806a3b20e5ebb7229631d0168864b1cfed12 and the fixed PR has been merged in.
1, Install the etcd-operator on Web console.
2, Edit the etcd role, remove the below permissions:
$oc edit role etcdoperator.v0.9.2-89crk
...
- apiGroups:
- etcd.database.coreos.com
resources:
- etcdclusters
- etcdbackups
- etcdrestores
verbs:
- '*'
3, Check the etcd operator csv status. The events look good.
[jzhang@dhcp-140-18 ocp18]$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
etcdoperator.v0.9.2 etcd 0.9.2 etcdoperator.v0.9.0 Pending
[jzhang@dhcp-140-18 ocp18]$ oc describe csv etcdoperator.v0.9.2
...
Dependents:
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":["etcd.database.coreos.com"],"resources":["etcdclusters","etcdbackups","etcdrestores"]}
Status: NotSatisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":[""],"resources":["pods","services","endpoints","persistentvolumeclaims","events"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":["apps"],"resources":["deployments"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["get"],"apiGroups":[""],"resources":["secrets"]}
Status: Satisfied
Version: v1beta1
Group:
Kind: ServiceAccount
Message: Policy rule not satisfied for service account
Name: etcd-operator
Status: PresentNotSatisfied
Version: v1
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal RequirementsUnknown 21m operator-lifecycle-manager requirements not yet checked
Normal AllRequirementsMet 21m operator-lifecycle-manager all requirements found, attempting install
Normal InstallSucceeded 21m operator-lifecycle-manager waiting for install components to report healthy
Normal InstallWaiting 21m (x4 over 21m) operator-lifecycle-manager installing: Waiting: waiting for deployment etcd-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
Normal RequirementsNotMet 8s operator-lifecycle-manager requirements not met
Normal InstallSucceeded 8s (x2 over 20m) operator-lifecycle-manager install strategy completed with no errors
Warning RequirementsNotMet 8s operator-lifecycle-manager requirements no longer met
Normal RequirementsNotMet 4s (x18 over 21m) operator-lifecycle-manager one or more requirements couldn't be found
4, Add the above permissions back and check the etcd csv status. All dependents were satisfied, but, the event still shows "one or more requirements couldn't be found". Verify failed.
[jzhang@dhcp-140-18 ocp18]$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
etcdoperator.v0.9.2 etcd 0.9.2 etcdoperator.v0.9.0 Succeeded
[jzhang@dhcp-140-18 ocp18]$ oc describe csv etcdoperator.v0.9.2
...
Dependents:
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":["etcd.database.coreos.com"],"resources":["etcdclusters","etcdbackups","etcdrestores"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":[""],"resources":["pods","services","endpoints","persistentvolumeclaims","events"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["*"],"apiGroups":["apps"],"resources":["deployments"]}
Status: Satisfied
Version: v1beta1
Group: rbac.authorization.k8s.io
Kind: PolicyRule
Message: namespaced rule:{"verbs":["get"],"apiGroups":[""],"resources":["secrets"]}
Status: Satisfied
Version: v1beta1
Group:
Kind: ServiceAccount
Message:
Name: etcd-operator
Status: Present
Version: v1
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal RequirementsUnknown 29m operator-lifecycle-manager requirements not yet checked
Normal AllRequirementsMet 28m operator-lifecycle-manager all requirements found, attempting install
Normal InstallSucceeded 28m operator-lifecycle-manager waiting for install components to report healthy
Normal InstallWaiting 28m (x4 over 28m) operator-lifecycle-manager installing: Waiting: waiting for deployment etcd-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available...
Normal RequirementsNotMet 7m56s operator-lifecycle-manager requirements not met
Normal InstallSucceeded 7m56s (x2 over 28m) operator-lifecycle-manager install strategy completed with no errors
Warning RequirementsNotMet 7m56s operator-lifecycle-manager requirements no longer met
Normal RequirementsNotMet 7m52s (x18 over 29m) operator-lifecycle-manager one or more requirements couldn't be found
Fix has merged. LGTM, verify it, details as below: OLM version: io.openshift.build.commit.id=b2d1cd21368bc8cc10e4ca11a231f09077630c33 Cluster version is 4.1.0-0.nightly-2019-05-06-011159 The steps as Comment 8 shows, the event info: mac:~ jianzhang$ oc describe csv etcdoperator.v0.9.4-clusterwide ... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal RequirementsUnknown 8m49s operator-lifecycle-manager requirements not yet checked Normal InstallWaiting 8m45s (x2 over 8m45s) operator-lifecycle-manager installing: Waiting: waiting for deployment etcd-operator to become ready: Waiting for rollout to finish: 0 of 1 updated replicas are available... Warning RequirementsNotMet 4m22s operator-lifecycle-manager requirements no longer met Normal RequirementsNotMet 4m21s operator-lifecycle-manager requirements not met Normal RequirementsNotMet 4m20s (x3 over 8m49s) operator-lifecycle-manager one or more requirements couldn't be found Normal AllRequirementsMet 63s (x3 over 8m48s) operator-lifecycle-manager all requirements found, attempting install Normal InstallSucceeded 61s (x3 over 8m46s) operator-lifecycle-manager waiting for install components to report healthy Normal InstallSucceeded 60s (x4 over 8m29s) operator-lifecycle-manager install strategy completed with no errors mac:~ jianzhang$ oc get csv NAME DISPLAY VERSION REPLACES PHASE etcdoperator.v0.9.4-clusterwide etcd 0.9.4-clusterwide etcdoperator.v0.9.2-clusterwide Succeeded Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758 |
Description of problem: The event of the csv still is "RequirementsNotMet" even if all requirements met. Version-Release number of selected component (if applicable): [core@jian-master-0 ~]$ oc exec olm-operator-796dc97869-ddgq5 -- olm -version OLM version: 0.8.0 git commit: bb46d55 How reproducible: always Steps to Reproduce: 1. Create the etcd-operator via the subscription. For example: [core@jian-master-0 ~]$ cat etcd-operator.yaml apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: namespace: etcd-demo generateName: etcd- spec: source: rh-operators name: etcd startingCSV: etcdoperator.v0.9.2 channel: alpha 2. Check the etcd-operator, it works well. [core@jian-master-0 ~]$ oc get csv NAME DISPLAY VERSION REPLACES PHASE etcdoperator.v0.9.2 etcd 0.9.2 etcdoperator.v0.9.0 Succeeded [core@jian-master-0 ~]$ oc get pods NAME READY STATUS RESTARTS AGE etcd-operator-68b4997899-ctkfn 3/3 Running 0 16m 3. Remove a rule of role. The event and status both are as expected. [core@jian-master-0 ~]$ oc edit role role.rbac.authorization.k8s.io/etcdoperator.v0.9.2-0 edited [core@jian-master-0 ~]$ oc get csv NAME DISPLAY VERSION REPLACES PHASE etcdoperator.v0.9.2 etcd 0.9.2 etcdoperator.v0.9.0 Pending [core@jian-master-0 ~]$ oc describe csv ... Group: rbac.authorization.k8s.io Kind: PolicyRule Message: namespaced rule:{"verbs":["*"],"apiGroups":["etcd.database.coreos.com"],"resources":["etcdclusters","etcdbackups","etcdrestores"]} Status: NotSatisfied Version: v1beta1 ... Group: Kind: ServiceAccount Name: etcd-operator Status: PresentNotSatisfied ... Normal RequirementsNotMet 8s (x44 over 1h) operator-lifecycle-manager one or more requirements couldn't be found 4, Add the rules back and check the csv. Actual results: [core@jian-master-0 ~]$ oc get csv NAME DISPLAY VERSION REPLACES PHASE etcdoperator.v0.9.2 etcd 0.9.2 etcdoperator.v0.9.0 Succeeded [core@jian-master-0 ~]$ oc describe csv ... Group: rbac.authorization.k8s.io Kind: PolicyRule Message: namespaced rule:{"verbs":["*"],"apiGroups":["etcd.database.coreos.com"],"resources":["etcdclusters","etcdbackups","etcdrestores"]} Status: Satisfied Version: v1beta1 ... Kind: ServiceAccount Name: etcd-operator Status: Present Version: v1 ... Events: ... Normal RequirementsNotMet 16m (x44 over 2h) operator-lifecycle-manager one or more requirements couldn't be found The event still is "RequirementsNotMet" even if all requirements are met. Expected results: The event should be "AllRequirementsMet". Additional info: