Bug 2000004
| Summary: | [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Degraded: master nodes drained too quickly | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jan Chaloupka <jchaloup> |
| Component: | openshift-apiserver | Assignee: | Jan Chaloupka <jchaloup> |
| Status: | CLOSED WONTFIX | QA Contact: | Xingxing Xia <xxia> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.9 | CC: | aos-bugs, kewang, mfojtik, sttts, xxia |
| Target Milestone: | --- | Flags: | mfojtik: needinfo? |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | LifecycleStale | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1998031 | Environment: | jobs=periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-ovirt-upgrade=all |
| Last Closed: | 2023-01-16 11:36:58 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description Jan Chaloupka 2021-09-01 07:46:01 UTC
Cloning https://bugzilla.redhat.com/show_bug.cgi?id=1998031 to defer resolution of the "The operators allow 1 replica to be unavailable and change condition/Degraded to true only when there are at least 2 replicas unavailable (since both deployments set maxUnavailable to 1)" part.

I don't quite agree. I think 1 unavailable replica should change Degraded to True. If 2 replicas are unavailable, it is "Available" that should be changed (to False) instead. E.g. bug 1999946 and bug 2001456 both hit "1 of 3 requested instances are unavailable for apiserver.openshift-apiserver" and changed Degraded; they are indeed bugs. If we don't make this change, I'm not sure whether real bugs might escape detection.

> I think 1 unavailable replica should change Degraded to True

https://github.com/openshift/cluster-authentication-operator/blob/9efb3c1e5ac657aaa87f237d2c6aea586b7aad49/vendor/github.com/openshift/api/config/v1/types_cluster_operator.go#L161-L177

// Degraded indicates that the operator's current state does not match its
// desired state over a period of time resulting in a lower quality of service.
// The period of time may vary by component, but a Degraded state represents
// persistent observation of a condition.
...
// ... A service should not
// report Degraded during the course of a normal upgrade

Given the operator is going through an upgrade, reporting condition/Degraded=True is incorrect. The important piece of information here is that "a Degraded state represents persistent observation of a condition". The reported issue is not persistent, only temporary.

This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority.

If you have further information on the current state of the bug, please update it; otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen to the Whiteboard if you think this bug should never be marked as stale. Please consult with the bug assignee before you do that.

Will need more time to properly improve the logic for setting the operator Degraded condition.

Due to higher-priority tasks I haven't been able to resolve this issue in time. Moving to the next sprint.

Dear reporter, we greatly appreciate the bug you have reported here. Unfortunately, due to a migration to a new issue-tracking system (https://issues.redhat.com/), we cannot continue triaging bugs reported in Bugzilla. Since this bug has been stale for multiple days, we have therefore decided to close it. If you think this is a mistake, or this bug has a higher priority or severity than set today, please feel free to reopen it and tell us why. We are going to move every re-opened bug to https://issues.redhat.com. Thank you for your patience and understanding.
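The distinction debated above — expected, transient unavailability during a rolling upgrade versus a persistent problem — can be sketched as a small Go helper. This is a hypothetical illustration (`conditionFor` is an invented name, not the actual openshift-apiserver operator code): with `maxUnavailable: 1`, one unavailable replica during an upgrade is part of a normal rollout and should not flip Degraded, while Available only goes False once no replica is serving.

```go
package main

import "fmt"

// conditionFor is a hypothetical sketch of the condition logic discussed
// in this bug, NOT the real operator implementation. Degraded is meant to
// reflect a persistent problem, so the single unavailable replica permitted
// by maxUnavailable=1 during a normal upgrade does not flip it to True.
func conditionFor(unavailable, desired int, upgrading bool) (degraded, available bool) {
	// Available stays True as long as at least one replica serves traffic.
	available = unavailable < desired
	if upgrading && unavailable <= 1 {
		// Expected, transient unavailability during a normal rollout.
		return false, available
	}
	// Outside an upgrade (or beyond the rollout budget), any
	// unavailable replica is treated as a degraded condition.
	return unavailable > 0, available
}

func main() {
	d, a := conditionFor(1, 3, true)
	fmt.Printf("upgrading, 1/3 unavailable: degraded=%v available=%v\n", d, a)
	// upgrading, 1/3 unavailable: degraded=false available=true

	d, a = conditionFor(1, 3, false)
	fmt.Printf("steady state, 1/3 unavailable: degraded=%v available=%v\n", d, a)
	// steady state, 1/3 unavailable: degraded=true available=true
}
```

Under this sketch, the CI failure in this bug (Degraded flipping while masters drain during an upgrade) would not occur, which matches the assignee's reading of the `ClusterOperator` API godoc quoted above.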