Bug 1835551 - console operator is reporting DEGRADED in http proxy cluster
Summary: console operator is reporting DEGRADED in http proxy cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Jakub Hadvig
QA Contact: Yadan Pei
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-05-14 03:06 UTC by Yadan Pei
Modified: 2020-07-13 17:38 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The `config.openshift.io/inject-proxy: console-operator` annotation was missing from the console-operator deployment. When the cluster-wide proxy is set, this annotation causes the proxy environment variables to be injected into the operator's pod. In addition, the console route health check did not use the injected proxy variables when creating the HTTP client for the check.
Consequence: The console route health check failed and the operator reported the RouteHealthDegraded condition.
Fix: Add the annotation so that the proxy variables are injected when set, and use them in the client that performs the health check (a sketch of a proxy-aware client follows the field list below).
Result: The console health check passes.
Clone Of:
Environment:
Last Closed: 2020-07-13 17:38:37 UTC
Target Upstream Version:
Embargoed:
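
For illustration, here is a minimal Go sketch of a proxy-aware health-check client along the lines the doc text describes. This is not the actual console-operator code (the real change landed in openshift/console-operator pull 429, linked below); the route URL is a placeholder, and the snippet only shows the key point: the client's transport must consult the injected HTTP_PROXY/HTTPS_PROXY/NO_PROXY variables via http.ProxyFromEnvironment.

package main

import (
	"fmt"
	"net/http"
	"time"
)

// newHealthCheckClient builds an HTTP client whose transport honors the
// HTTP_PROXY/HTTPS_PROXY/NO_PROXY environment variables that the
// config.openshift.io/inject-proxy annotation injects into the pod.
func newHealthCheckClient() *http.Client {
	return &http.Client{
		Timeout: 30 * time.Second,
		Transport: &http.Transport{
			// A custom transport with a nil Proxy field never consults the
			// injected variables; setting it is the crux of the fix.
			Proxy: http.ProxyFromEnvironment,
		},
	}
}

func main() {
	// Placeholder route host; a real check would use the console route's host.
	resp, err := newHealthCheckClient().Get("https://console.example.com/health")
	if err != nil {
		fmt.Println("health check failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("health check status:", resp.Status)
}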




Links
GitHub openshift/console-operator pull 429 (closed): Bug 1835551: Inject proxy envars to console-operator deployment (last updated 2021-02-16 00:49:42 UTC)
Red Hat Product Errata RHBA-2020:2409 (last updated 2020-07-13 17:38:52 UTC)

Description Yadan Pei 2020-05-14 03:06:38 UTC
Description of problem:
The console operator reports Degraded: True due to a RouteHealthDegraded error, even though the console itself appears to be fully functional and nothing is affected. The operator seems to throw this error because its pod does not have the cluster proxy settings applied.
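
As background, a Go HTTP client only picks up a proxy from the environment when its transport is configured to read the proxy variables. A short, hypothetical snippet (the host is a placeholder; this is not console-operator code) showing how the standard environment-based resolution behaves: with no HTTPS_PROXY set in the pod, as the rsh output below shows, resolution yields no proxy and the request goes direct, which then times out behind the proxy.

package main

import (
	"fmt"
	"net/http"
)

func main() {
	req, err := http.NewRequest("GET", "https://console.example.com/health", nil)
	if err != nil {
		panic(err)
	}
	// ProxyFromEnvironment reads HTTP_PROXY/HTTPS_PROXY/NO_PROXY. With none
	// of them set in the pod, it returns nil and the request is sent directly.
	proxyURL, err := http.ProxyFromEnvironment(req)
	if err != nil {
		panic(err)
	}
	if proxyURL == nil {
		fmt.Println("no proxy configured; the request would go direct")
	} else {
		fmt.Println("request would use proxy:", proxyURL)
	}
}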

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-05-13-221558

How reproducible:
Always

Steps to Reproduce:
1. Install a UPI cluster on AWS with a cluster-wide http_proxy configured
2. Check the cluster operator health status


Actual results:
2. Installation succeeds, but the console operator reports Degraded status due to the RouteHealthDegraded error

# oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.0-0.nightly-2020-05-13-221558   True        False         False      62m
cloud-credential                           4.5.0-0.nightly-2020-05-13-221558   True        False         False      81m
cluster-autoscaler                         4.5.0-0.nightly-2020-05-13-221558   True        False         False      71m
config-operator                            4.5.0-0.nightly-2020-05-13-221558   True        False         False      71m
console                                    4.5.0-0.nightly-2020-05-13-221558   True        False         True       46m
csi-snapshot-controller                    4.5.0-0.nightly-2020-05-13-221558   True        False         False      67m
dns                                        4.5.0-0.nightly-2020-05-13-221558   True        False         False      76m
etcd                                       4.5.0-0.nightly-2020-05-13-221558   True        False         False      76m
image-registry                             4.5.0-0.nightly-2020-05-13-221558   True        False         False      68m
ingress                                    4.5.0-0.nightly-2020-05-13-221558   True        False         False      67m
insights                                   4.5.0-0.nightly-2020-05-13-221558   True        False         False      72m
kube-apiserver                             4.5.0-0.nightly-2020-05-13-221558   True        False         False      75m
kube-controller-manager                    4.5.0-0.nightly-2020-05-13-221558   True        False         False      75m
kube-scheduler                             4.5.0-0.nightly-2020-05-13-221558   True        False         False      75m
kube-storage-version-migrator              4.5.0-0.nightly-2020-05-13-221558   True        False         False      67m
machine-api                                4.5.0-0.nightly-2020-05-13-221558   True        False         False      69m
machine-approver                           4.5.0-0.nightly-2020-05-13-221558   True        False         False      75m
machine-config                             4.5.0-0.nightly-2020-05-13-221558   True        False         False      39m
marketplace                                4.5.0-0.nightly-2020-05-13-221558   True        False         False      45m
monitoring                                 4.5.0-0.nightly-2020-05-13-221558   True        False         False      44m
network                                    4.5.0-0.nightly-2020-05-13-221558   True        False         False      77m
node-tuning                                4.5.0-0.nightly-2020-05-13-221558   True        False         False      77m
openshift-apiserver                        4.5.0-0.nightly-2020-05-13-221558   True        False         False      73m
openshift-controller-manager               4.5.0-0.nightly-2020-05-13-221558   True        False         False      72m
openshift-samples                          4.5.0-0.nightly-2020-05-13-221558   True        False         False      71m
operator-lifecycle-manager                 4.5.0-0.nightly-2020-05-13-221558   True        False         False      76m
operator-lifecycle-manager-catalog         4.5.0-0.nightly-2020-05-13-221558   True        False         False      76m
operator-lifecycle-manager-packageserver   4.5.0-0.nightly-2020-05-13-221558   True        False         False      46m
service-ca                                 4.5.0-0.nightly-2020-05-13-221558   True        False         False      77m
storage                                    4.5.0-0.nightly-2020-05-13-221558   True        False         False      72m

# oc describe co console 
Name:         console
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-05-14T01:22:23Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
      f:status:
        .:
        f:extension:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2020-05-14T01:22:23Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:relatedObjects:
        f:versions:
    Manager:         console
    Operation:       Update
    Time:            2020-05-14T01:58:06Z
  Resource Version:  30928
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/console
  UID:               43ffed44-5aee-4145-832e-3af8cf948630
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-05-14T01:38:29Z
    Message:               RouteHealthDegraded: failed to GET route (https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                RouteHealth_FailedGet
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-05-14T01:42:06Z
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2020-05-14T01:58:06Z
    Reason:                AsExpected
    Status:                True
    Type:                  Available
    Last Transition Time:  2020-05-14T01:32:22Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:               <nil>
  Related Objects:
    Group:      operator.openshift.io
    Name:       cluster
    Resource:   consoles
    Group:      config.openshift.io
    Name:       cluster
    Resource:   consoles
    Group:      config.openshift.io
    Name:       cluster
    Resource:   infrastructures
    Group:      config.openshift.io
    Name:       cluster
    Resource:   proxies
    Group:      oauth.openshift.io
    Name:       console
    Resource:   oauthclients
    Group:      
    Name:       openshift-console-operator
    Resource:   namespaces
    Group:      
    Name:       openshift-console
    Resource:   namespaces
    Group:      
    Name:       console-public
    Namespace:  openshift-config-managed
    Resource:   configmaps
  Versions:
    Name:     operator
    Version:  4.5.0-0.nightly-2020-05-13-221558
Events:       <none>

# oc get route console -n openshift-console -o json | jq '.status'
{
  "ingress": [
    {
      "conditions": [
        {
          "lastTransitionTime": "2020-05-14T01:36:24Z",
          "status": "True",
          "type": "Admitted"
        }
      ],
      "host": "console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com",
      "routerCanonicalHostname": "apps.qe-yapei.qe.devcluster.openshift.com",
      "routerName": "default",
      "wildcardPolicy": "None"
    }
  ]
}

# oc logs -f console-operator-584d7fd69-vfvr9 -n openshift-console-operator
....
E0514 02:47:27.566964       1 status.go:78] RouteHealthDegraded FailedGet failed to GET route (https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0514 02:47:27.567085       1 controller.go:354] console-route-sync--work-queue-key failed with : failed to GET route (https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0514 02:48:10.767143       1 status.go:78] RouteHealthDegraded FailedGet failed to GET route (https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0514 02:48:10.767283       1 controller.go:354] console-route-sync--work-queue-key failed with : failed to GET route (https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0514 02:48:16.567621       1 status.go:78] RouteHealthDegraded FailedGet failed to GET route (https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0514 02:48:16.567827       1 controller.go:354] console-route-sync--work-queue-key failed with : failed to GET route (https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health): Get https://console-openshift-console.apps.qe-yapei.qe.devcluster.openshift.com/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
.....

# oc get pods -n openshift-console-operator
NAME                               READY   STATUS    RESTARTS   AGE
console-operator-584d7fd69-vfvr9   1/1     Running   0          56m

# oc rsh -n openshift-console-operator console-operator-584d7fd69-vfvr9
sh-4.2$ env | grep -i proxy
sh-4.2$ 

# oc get proxies.config -o yaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: Proxy
  metadata:
    creationTimestamp: "2020-05-14T01:22:28Z"
    generation: 1
    managedFields:
    - apiVersion: config.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:httpProxy: {}
          f:httpsProxy: {}
          f:noProxy: {}
          f:trustedCA:
            .: {}
            f:name: {}
        f:status:
          .: {}
          f:httpProxy: {}
          f:httpsProxy: {}
          f:noProxy: {}
      manager: cluster-bootstrap
      operation: Update
      time: "2020-05-14T01:22:28Z"
    name: cluster
    resourceVersion: "778"
    selfLink: /apis/config.openshift.io/v1/proxies/cluster
    uid: ac5d7239-0bd6-4a30-8e38-1bb137227aeb
  spec:
    httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-16-79-11.us-east-2.compute.amazonaws.com:3128
    httpsProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-16-79-11.us-east-2.compute.amazonaws.com:3128
    noProxy: test.no-proxy.com
    trustedCA:
      name: ""
  status:
    httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-16-79-11.us-east-2.compute.amazonaws.com:3128
    httpsProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-16-79-11.us-east-2.compute.amazonaws.com:3128
    noProxy: .cluster.local,.svc,.us-east-2.compute.internal,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.qe-yapei.qe.devcluster.openshift.com,etcd-0.qe-yapei.qe.devcluster.openshift.com,etcd-1.qe-yapei.qe.devcluster.openshift.com,etcd-2.qe-yapei.qe.devcluster.openshift.com,localhost,test.no-proxy.com
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

# oc -n openshift-console get rs    // it looks like the console-68c899fdf-* pods were removed during this process

NAME                   DESIRED   CURRENT   READY   AGE
console-68c899fdf      0         0         0       81m
console-7f6bfc45cc     2         2         2       76m
downloads-7fc58bfbcc   2         2         2       88m

# oc -n openshift-console get event | grep console-68c899fdf
79m         Warning   Unhealthy           pod/console-68c899fdf-zj6dh       Readiness probe failed: Get https://10.130.0.26:8443/health: dial tcp 10.130.0.26:8443: connect: connection refused
79m         Warning   Unhealthy           pod/console-68c899fdf-zj6dh       Liveness probe failed: Get https://10.130.0.26:8443/health: dial tcp 10.130.0.26:8443: connect: connection refused
79m         Normal    Killing             pod/console-68c899fdf-zj6dh       Container console failed liveness probe, will be restarted

# oc get pods -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE
console-7f6bfc45cc-hs446     1/1     Running   0          76m
console-7f6bfc45cc-zcbvd     1/1     Running   0          68m
downloads-7fc58bfbcc-r5p2q   1/1     Running   0          68m
downloads-7fc58bfbcc-smjz8   1/1     Running   0          68m

Expected results:
The console operator should report the correct status on a successful installation: Available: True, Progressing: False, Degraded: False

Additional info:

Comment 3 XiaochuanWang 2020-05-18 03:25:06 UTC
The reported error "RouteHealthDegraded: failed to GET route" no longer appears on a cluster with a proxy configured.

Verified version: 4.5.0-0.nightly-2020-05-15-011814

Comment 4 errata-xmlrpc 2020-07-13 17:38:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

