Bug 1999314 - console-operator is slow to mark Degraded as False once console starts working
Summary: console-operator is slow to mark Degraded as False once console starts working
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Jakub Hadvig
QA Contact: Siva Reddy
URL:
Whiteboard:
Depends On:
Blocks: 2010681
 
Reported: 2021-08-30 21:06 UTC by Seth Jennings
Modified: 2022-03-12 04:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-12 04:37:58 UTC
Target Upstream Version:
Embargoed:


Attachments
console-operator Degraded messages (193.54 KB, text/plain)
2021-10-29 04:09 UTC, Yadan Pei


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift console-operator pull 590 | 0 | None | open | Bug 1999314: Resync all controllers periodically | 2021-09-14 10:33:22 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-12 04:38:16 UTC

Description Seth Jennings 2021-08-30 21:06:54 UTC
Description of problem:
https://bugzilla.redhat.com/show_bug.cgi?id=1945326 helped with this issue but it still exists.

The console can be working for on the order of minutes before the console ClusterOperator's Degraded condition is marked False.

This delays any process waiting on the CVO to report complete cluster installation.

Version-Release number of selected component (if applicable):
4.8.5

How reproducible:
Sometimes. It depends on whether ingress and DNS are ready before the console-operator runs its check for the first time.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:
Good question.  Faster?


Additional info:
It seems that a bad DNS lookup can be cached for some time and cause subsequent checks to fail, even if the ingress is now working.
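
To illustrate the suspected mechanism, here is a minimal Python sketch of a resolver that caches failed lookups. It is not the console-operator's code, and the report above only speculates that something like this happens. The class `NegativeCachingResolver`, its `negative_ttl` parameter, and the injected `lookup` and `clock` callables are all hypothetical names chosen for the example.

```python
import time
from typing import Callable, Dict


class NegativeCachingResolver:
    """Toy resolver (hypothetical) that remembers failed lookups for negative_ttl seconds."""

    def __init__(
        self,
        lookup: Callable[[str], bool],
        negative_ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup              # the real lookup; returns True on success
        self._negative_ttl = negative_ttl  # how long a failure stays cached
        self._clock = clock                # injectable so the behavior can be tested
        self._failed_at: Dict[str, float] = {}

    def resolves(self, host: str) -> bool:
        now = self._clock()
        failed_at = self._failed_at.get(host)
        if failed_at is not None and now - failed_at < self._negative_ttl:
            # The cached failure is served without asking upstream again,
            # so a recovered backend still looks broken.
            return False
        ok = self._lookup(host)
        if ok:
            self._failed_at.pop(host, None)
        else:
            self._failed_at[host] = now
        return ok
```

With a cache like this, a health check that first ran before DNS was ready keeps failing until the cached entry expires, regardless of how quickly ingress recovers.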

Comment 3 Siva Reddy 2021-09-29 14:12:06 UTC
Across multiple installations, Degraded is now marked False faster.

Cluster version:
 4.10.0-0.nightly-2021-09-28-220911

Steps to verify
 1. Install the cluster manually and monitor the operator status
    oc get co | grep console 
 2. Also note, at the same time, how the ingress and dns operators are coming up
 
  Degraded is now marked False faster
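
The manual check in step 1 can also be scripted. The following is a minimal Python sketch, assuming the JSON form of the same resource (for example from `oc get co console -o json`) is available as a string. The function names `condition_statuses` and `is_settled` are hypothetical helpers, not part of any OpenShift tooling.

```python
import json
from typing import Dict


def condition_statuses(cluster_operator_json: str) -> Dict[str, str]:
    """Map each condition type (e.g. "Degraded") to its status string."""
    obj = json.loads(cluster_operator_json)
    conditions = obj.get("status", {}).get("conditions", [])
    return {c["type"]: c["status"] for c in conditions}


def is_settled(statuses: Dict[str, str]) -> bool:
    """True once the operator reports Available=True and Degraded=False."""
    return statuses.get("Available") == "True" and statuses.get("Degraded") == "False"
```

Calling these in a loop and recording the time at which `is_settled` first returns True gives a rough measure of how long the operator takes to clear Degraded on a given install.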

Comment 4 Yadan Pei 2021-10-29 04:09:29 UTC
Created attachment 1838187 [details]
console-operator Degraded messages

Comment 6 Jakub Hadvig 2021-11-01 11:39:44 UTC
Yadan, when the operator is in the `Removed` state it should not matter if you create a dummy route called `console`, because the operator will immediately remove it.

I also tried to reproduce the issue on a 4.10.0-0.nightly-2021-10-31-210828 cluster, but a couple of seconds after I turned the operator to the `Managed` state the console got admitted and all seems well.

About your questions:

> 1. console operator reports `RouteHealthAvailable: console route is not admitted` and stays in `Available: False && Degraded: False` status for about 116s. Why has console-operator been kept in this state for about 2 minutes? Is this working as intended?

No, this is not intended to happen.

> after the console route is admitted but not yet available, console reports `Available: False && Degraded: False` status again (this is working as intended)

When the route is admitted and no other error occurs, the conditions should be `Available: True && Degraded: False`.

Comment 10 errata-xmlrpc 2022-03-12 04:37:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

