Bug 1994443
Summary: | openshift-console operator incorrectly reports Available=false | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jan Chaloupka <jchaloup> |
Component: | Management Console | Assignee: | Jakub Hadvig <jhadvig> |
Status: | CLOSED ERRATA | QA Contact: | Yadan Pei <yapei> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.9 | CC: | aos-bugs, jhadvig, jokerman, spadgett, wking, yapei |
Target Milestone: | --- | ||
Target Release: | 4.9.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | job=periodic-ci-openshift-release-master-ci-4.9-e2e-gcp-upgrade=all job=periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-e2e-aws-upgrade=all | |
Last Closed: | 2021-10-18 17:46:55 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Bug Depends On: | |||
Bug Blocks: | 2001268 |
Description
Jan Chaloupka
2021-08-17 09:41:30 UTC
Other instances of the flake: https://search.ci.openshift.org/?search=%5C%5Bbz-Management+Console%5C%5D+clusteroperator%2Fconsole+should+not+change+condition%2FAvailable&maxAge=24h&context=1&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Checking if the operand is upgraded or in the process of upgrading belongs under condition/Progressing. From https://github.com/openshift/api/blob/a6156965faae5ce117e3cd3735981a3fc0e27e27/config/v1/types_cluster_operator.go#L152-L159:

```
// Progressing indicates that the operator is actively rolling out new code,
// propagating config changes, or otherwise moving from one steady state to
// another.
```

Did some regression testing on console-operator, such as the steps in https://bugzilla.redhat.com/show_bug.cgi?id=1989055#c7 and https://bugzilla.redhat.com/show_bug.cgi?id=1952405#c0. In all conditions console-operator reports the correct Available status. Also checked the console-operator logs:

$ oc logs -f console-operator-56ccbc8575-2tfs5 -n openshift-console-operator | grep '1 replica'
E0906 01:46:48.231736 1 status.go:78] SyncLoopRefreshProgressing InProgress Working toward version 4.9.0-0.nightly-2021-09-05-192114, 1 replicas available
I0906 01:46:48.419484 1 status_controller.go:211] clusteroperator/console diff {"status":{"conditions":[{"lastTransitionTime":"2021-09-05T23:58:11Z","message":"All is well","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2021-09-06T01:46:46Z","message":"SyncLoopRefreshProgressing: Working toward version 4.9.0-0.nightly-2021-09-05-192114, 1 replicas available","reason":"SyncLoopRefresh_InProgress","status":"True","type":"Progressing"},{"lastTransitionTime":"2021-09-06T00:00:38Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2021-09-05T23:52:44Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
I0906 01:46:48.430693 1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-console-operator", Name:"console-operator", UID:"3bd0ea6a-0039-43dd-a1d8-98b98a99e36d", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/console changed: Progressing message changed from "SyncLoopRefreshProgressing: Changes made during sync updates, additional sync expected." to "SyncLoopRefreshProgressing: Working toward version 4.9.0-0.nightly-2021-09-05-192114, 1 replicas available"
E0906 03:02:48.305656 1 status.go:78] SyncLoopRefreshProgressing InProgress Working toward version 4.9.0-0.nightly-2021-09-05-192114, 1 replicas available
I0906 03:02:48.493694 1 status_controller.go:211] clusteroperator/console diff {"status":{"conditions":[{"lastTransitionTime":"2021-09-05T23:58:11Z","message":"All is well","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2021-09-06T03:02:46Z","message":"SyncLoopRefreshProgressing: Working toward version 4.9.0-0.nightly-2021-09-05-192114, 1 replicas available","reason":"SyncLoopRefresh_InProgress","status":"True","type":"Progressing"},{"lastTransitionTime":"2021-09-06T00:00:38Z","message":"All is well","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2021-09-05T23:52:44Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
I0906 03:02:48.507008 1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-console-operator", Name:"console-operator", UID:"3bd0ea6a-0039-43dd-a1d8-98b98a99e36d", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/console changed: Progressing message changed from "SyncLoopRefreshProgressing: Changes made during sync updates, additional sync expected." to "SyncLoopRefreshProgressing: Working toward version 4.9.0-0.nightly-2021-09-05-192114, 1 replicas available"

We can see from the above logs that when only 1 replica is available, console reports Available: True, and Progressing reports True as expected as well.

However, when I set console to Removed, the console deployment, service and route are entirely removed, yet co/console still reports Available: True while the user cannot visit the console at all.

$ oc get console.operator cluster -o json | jq .spec
{
  "logLevel": "Normal",
  "managementState": "Removed",
  "operatorLogLevel": "Debug"
}

$ while true;do oc get all -n openshift-console | grep console; oc get co | grep console; sleep 5; done | tee -a watch-consoleoperator.log
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 27m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 27m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 27m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 27m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 28m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 29m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 29m
No resources found in openshift-console namespace.
console 4.9.0-0.nightly-2021-09-05-192114 True False False 29m

Check console-operator logs:

I0906 06:17:38.219270 1 controller.go:99] console is in a removed state: deleting ConsoleCliDownloads custom resources
I0906 06:17:38.219732 1 controller.go:97] console-operator is in a removed state: deleting "downloads" service
I0906 06:17:38.220376 1 controller.go:95] console is in an removed state: removing synced downloads deployment
I0906 06:17:38.221685 1 operator.go:243] console has been removed.
I0906 06:17:38.221701 1 operator.go:254] deleting console resources
I0906 06:17:38.224760 1 controller.go:97] console-operator is in a removed state: deleting "console" service
I0906 06:17:38.226293 1 controller.go:137] finished deleting ConsoleCliDownloads custom resources
E0906 06:17:38.226724 1 base_controller.go:251] "ConsoleDownloadsDeploymentSyncController" controller failed to sync "key", err: deployments.apps "downloads" not found
I0906 06:17:38.226884 1 reflector.go:381] github.com/openshift/client-go/oauth/informers/externalversions/factory.go:101: forcing resync
I0906 06:17:38.226906 1 reflector.go:381] k8s.io/client-go/informers/factory.go:134: forcing resync
I0906 06:17:38.226928 1 reflector.go:381] k8s.io/client-go/informers/factory.go:134: forcing resync
I0906 06:17:38.226956 1 reflector.go:381] github.com/openshift/client-go/config/informers/externalversions/factory.go:101: forcing resync
I0906 06:17:38.230116 1 controller.go:99] console is in a removed state: deleting ConsoleCliDownloads custom resources
I0906 06:17:38.230311 1 reflector.go:381] github.com/openshift/client-go/config/informers/externalversions/factory.go:101: forcing resync
I0906 06:17:38.237610 1 controller.go:137] finished deleting ConsoleCliDownloads custom resources
I0906 06:17:38.243785 1 reflector.go:381] k8s.io/client-go/informers/factory.go:134: forcing resync
I0906 06:17:38.254859 1 operator.go:276] finished deleting console resources
I0906 06:17:38.254875 1 operator.go:229] finished syncing operator "cluster" (22.519µs)
I0906 06:17:38.395944 1 request.go:600] Waited for 169.098818ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:38.399730 1 controller.go:95] console is in an removed state: removing synced downloads deployment
E0906 06:17:38.402286 1 base_controller.go:251] "ConsoleDownloadsDeploymentSyncController" controller failed to sync "key", err: deployments.apps "downloads" not found
I0906 06:17:38.596052 1 request.go:600] Waited for 366.520012ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:38.602627 1 controller.go:111] console-operator is in a removed state: deleting "console" route
I0906 06:17:38.795633 1 request.go:600] Waited for 565.236754ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:38.800517 1 controller.go:97] console-operator is in a removed state: deleting "downloads" service
I0906 06:17:38.995906 1 request.go:600] Waited for 763.809168ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:38.999645 1 controller.go:111] console-operator is in a removed state: deleting "downloads" route
I0906 06:17:39.195848 1 request.go:600] Waited for 958.181465ms due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:39.199715 1 controller.go:97] console-operator is in a removed state: deleting "console" service
I0906 06:17:39.395922 1 request.go:600] Waited for 1.140995178s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:39.395940 1 request.go:668] Waited for 1.140995178s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:39.407964 1 operator.go:181] started syncing operator "cluster" (2021-09-06 06:17:39.407955857 +0000 UTC m=+22870.770623255)
I0906 06:17:39.421543 1 operator.go:243] console has been removed.
I0906 06:17:39.421563 1 operator.go:254] deleting console resources
I0906 06:17:39.445637 1 operator.go:276] finished deleting console resources
I0906 06:17:39.445653 1 operator.go:229] finished syncing operator "cluster" (28.604µs)
I0906 06:17:39.595886 1 request.go:600] Waited for 1.010927579s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:39.600289 1 controller.go:97] console-operator is in a removed state: skipping health checks
I0906 06:17:39.795975 1 request.go:600] Waited for 1.176406105s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operator.openshift.io/v1/consoles/cluster
I0906 06:17:39.799882 1 controller.go:111] console-operator is in a removed state: deleting "console" route
I0906 06:17:44.097611 1 controller.go:95] console is in an removed state: removing synced downloads deployment
E0906 06:17:44.101325 1 base_controller.go:251] "ConsoleDownloadsDeploymentSyncController" controller failed to sync "key", err: deployments.apps "downloads" not found
I0906 06:17:48.107616 1 httplog.go:89] "HTTP" verb="GET" URI="/healthz" latency="136.121µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:43888" resp=200
I0906 06:17:48.111398 1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="4.380768ms" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:43890" resp=200
I0906 06:17:55.227930 1 httplog.go:89] "HTTP" verb="GET" URI="/metrics" latency="5.034854ms" userAgent="Prometheus/2.29.2" srcIP="10.128.2.18:33640" resp=200
I0906 06:17:58.110608 1 httplog.go:89] "HTTP" verb="GET" URI="/healthz" latency="145.334µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44070" resp=200
I0906 06:17:58.110868 1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="2.160809ms" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44068" resp=200
I0906 06:18:07.860377 1 httplog.go:89] "HTTP" verb="GET" URI="/metrics" latency="7.130971ms" userAgent="Prometheus/2.29.2" srcIP="10.129.2.11:50168" resp=200
I0906 06:18:08.106778 1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="430.825µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44270" resp=200
I0906 06:18:08.106966 1 httplog.go:89] "HTTP" verb="GET" URI="/healthz" latency="111.263µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44268" resp=200
I0906 06:18:18.110641 1 httplog.go:89] "HTTP" verb="GET" URI="/healthz" latency="129.717µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44462" resp=200
I0906 06:18:18.112941 1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="4.288098ms" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44464" resp=200
I0906 06:18:25.232542 1 httplog.go:89] "HTTP" verb="GET" URI="/metrics" latency="9.188576ms" userAgent="Prometheus/2.29.2" srcIP="10.128.2.18:33640" resp=200
I0906 06:18:28.107569 1 httplog.go:89] "HTTP" verb="GET" URI="/healthz" latency="157.81µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44642" resp=200
I0906 06:18:28.108827 1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="2.33758ms" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44644" resp=200
I0906 06:18:37.864605 1 httplog.go:89] "HTTP" verb="GET" URI="/metrics" latency="11.181203ms" userAgent="Prometheus/2.29.2" srcIP="10.129.2.11:50168" resp=200
I0906 06:18:38.107179 1 httplog.go:89] "HTTP" verb="GET" URI="/healthz" latency="325.339µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44830" resp=200
I0906 06:18:38.107448 1 httplog.go:89] "HTTP" verb="GET" URI="/readyz" latency="97.67µs" userAgent="kube-probe/1.22+" srcIP="10.130.0.1:44832" resp=200
I0906 06:18:38.589726 1 controller.go:97] console-operator is in a removed state: skipping health checks

Shall we update Available from True to False for this case?

Moving this bug to VERIFIED since the original issue reported is fixed. The Available status issue when console is removed will be tracked in a separate bug.

https://bugzilla.redhat.com/show_bug.cgi?id=2001523 is the bug tracking the wrong Available status when console is removed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
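For readers following the distinction made in the description (rollout state belongs under condition/Progressing, not condition/Available), the following is a minimal Go sketch of how the two conditions could be derived independently. It is not the console-operator's code: `operandState`, `availableCondition`, `progressingCondition`, and the desired replica count of 2 are all hypothetical names and values chosen for illustration.

```go
package main

import "fmt"

// ConditionStatus mirrors the "True"/"False" strings seen in the
// clusteroperator/console conditions quoted in the logs above.
type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

// operandState is a hypothetical, simplified view of the console deployment.
type operandState struct {
	DesiredReplicas   int // replicas the deployment asks for (assumed 2 here)
	AvailableReplicas int // replicas currently able to serve requests
	UpdatedReplicas   int // replicas already running the target version
}

// availableCondition only asks whether the operand can serve users right now.
// It deliberately ignores whether a rollout is in flight.
func availableCondition(s operandState) ConditionStatus {
	if s.AvailableReplicas > 0 {
		return ConditionTrue
	}
	return ConditionFalse
}

// progressingCondition reports whether the operand is still moving toward
// its steady state (not all replicas updated or not all replicas available).
func progressingCondition(s operandState) ConditionStatus {
	if s.UpdatedReplicas < s.DesiredReplicas || s.AvailableReplicas < s.DesiredReplicas {
		return ConditionTrue
	}
	return ConditionFalse
}

func main() {
	// Mid-upgrade state similar to the "1 replicas available" log lines.
	midUpgrade := operandState{DesiredReplicas: 2, AvailableReplicas: 1, UpdatedReplicas: 1}
	fmt.Printf("Available=%s Progressing=%s\n",
		availableCondition(midUpgrade), progressingCondition(midUpgrade))
}
```

With the mid-upgrade state, the sketch yields Available=True and Progressing=True, which matches the conditions in the verification logs. How a Removed managementState should map to Available is left open here, since that question is tracked in bug 2001523.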