Bug 1901057 - authentication operator health check failed when installing a cluster behind proxy
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Standa Laznicka
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-24 11:37 UTC by Johnny Liu
Modified: 2021-02-24 15:36 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:35:48 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-authentication-operator pull 387 | 0 | None | closed | Bug 1901057: proxyconfig controller: add router CA to the trusted pool | 2021-01-11 10:36:08 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:36:17 UTC

Description Johnny Liu 2020-11-24 11:37:04 UTC
Description of problem:


Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-11-22-204912

How reproducible:
Always

Steps to Reproduce:
1. Inject the proxy settings into install-config.yaml, for example:
---
apiVersion: v1
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    azure: {}
  replicas: 3
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    azure: {}
  replicas: 3
metadata:
  name: miyadav24azur
platform:
  azure:
    region: northcentralus
    baseDomainResourceGroupName: os4-common
    networkResourceGroupName: miyadav24azur-rg
    virtualNetwork: miyadav24azur-vnet
    controlPlaneSubnet: miyadav24azur-master-subnet
    computeSubnet: miyadav24azur-worker-subnet
pullSecret: HIDDEN
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
publish: Internal
proxy:
  httpProxy: http://user:password@proxy.example.com:3128
  httpsProxy: http://user:password@proxy.example.com:3128
  noProxy: test.no-proxy.com
baseDomain: qe.azure.devcluster.openshift.com
2. Trigger the installation.

Actual results:
The installation fails because the authentication operator goes into the Degraded state.

$ oc describe co authentication
Name:         authentication
Namespace:    
Labels:       <none>
Annotations:  exclude.release.openshift.io/internal-openshift-hosted: true
              include.release.openshift.io/self-managed-high-availability: true
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-11-24T05:10:38Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:exclude.release.openshift.io/internal-openshift-hosted:
          f:include.release.openshift.io/self-managed-high-availability:
      f:spec:
      f:status:
        .:
        f:extension:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2020-11-24T05:10:38Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:relatedObjects:
        f:versions:
    Manager:         authentication-operator
    Operation:       Update
    Time:            2020-11-24T11:14:37Z
  Resource Version:  125042
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/authentication
  UID:               399ee289-363f-4e10-a4bb-642d66c050b8
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-11-24T05:17:20Z
    Message:               ProxyConfigControllerDegraded: endpoint("https://oauth-openshift.apps.miyadav24azur.qe.azure.devcluster.openshift.com/healthz") is unreachable with proxy(Get "https://oauth-openshift.apps.miyadav24azur.qe.azure.devcluster.openshift.com/healthz": x509: certificate signed by unknown authority) and without proxy(Get "https://oauth-openshift.apps.miyadav24azur.qe.azure.devcluster.openshift.com/healthz": x509: certificate signed by unknown authority)
    Reason:                ProxyConfigController_SyncError
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-11-24T11:14:37Z
    Message:               All is well
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2020-11-24T07:30:16Z
    Message:               OAuthServerDeploymentAvailable: availableReplicas==2
    Reason:                AsExpected
    Status:                True
    Type:                  Available
    Last Transition Time:  2020-11-24T05:15:20Z
    Message:               All is well
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:               <nil>
  Related Objects:
    Group:      operator.openshift.io
    Name:       cluster
    Resource:   authentications
    Group:      config.openshift.io
    Name:       cluster
    Resource:   authentications
    Group:      config.openshift.io
    Name:       cluster
    Resource:   infrastructures
    Group:      config.openshift.io
    Name:       cluster
    Resource:   oauths
    Group:      route.openshift.io
    Name:       oauth-openshift
    Namespace:  openshift-authentication
    Resource:   routes
    Group:      
    Name:       oauth-openshift
    Namespace:  openshift-authentication
    Resource:   services
    Group:      
    Name:       openshift-config
    Resource:   namespaces
    Group:      
    Name:       openshift-config-managed
    Resource:   namespaces
    Group:      
    Name:       openshift-authentication
    Resource:   namespaces
    Group:      
    Name:       openshift-authentication-operator
    Resource:   namespaces
    Group:      
    Name:       openshift-ingress
    Resource:   namespaces
    Group:      
    Name:       openshift-oauth-apiserver
    Resource:   namespaces
  Versions:
    Name:     oauth-apiserver
    Version:  4.7.0-0.nightly-2020-11-22-204912
    Name:     oauth-openshift
    Version:  4.7.0-0.nightly-2020-11-22-204912_openshift
    Name:     operator
    Version:  4.7.0-0.nightly-2020-11-22-204912
Events:       <none>
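
The Degraded condition in the output above can also be picked out programmatically. A minimal sketch, assuming the ClusterOperator has been fetched as JSON (for example with `oc get co authentication -o json`). The helper name `degraded_conditions` is hypothetical, and the sample document is trimmed from the output above with the long message shortened:

```python
import json

# Trimmed sample shaped like a ClusterOperator's status.conditions.
# Values come from the describe output above; the Degraded message is shortened.
SAMPLE = """
{
  "status": {
    "conditions": [
      {"type": "Degraded", "status": "True",
       "reason": "ProxyConfigController_SyncError",
       "message": "ProxyConfigControllerDegraded: endpoint(...) is unreachable"},
      {"type": "Progressing", "status": "False",
       "reason": "AsExpected", "message": "All is well"},
      {"type": "Available", "status": "True",
       "reason": "AsExpected",
       "message": "OAuthServerDeploymentAvailable: availableReplicas==2"},
      {"type": "Upgradeable", "status": "True",
       "reason": "AsExpected", "message": "All is well"}
    ]
  }
}
"""


def degraded_conditions(cluster_operator):
    """Return the conditions of type Degraded whose status is "True"."""
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    return [
        c for c in conditions
        if c.get("type") == "Degraded" and c.get("status") == "True"
    ]


if __name__ == "__main__":
    for cond in degraded_conditions(json.loads(SAMPLE)):
        print(cond["reason"], "-", cond["message"])
```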

Logging in to one of the oauth pods and running curl against the healthz URL succeeds:
$ oc -n openshift-authentication rsh oauth-openshift-77bdfbcd8b-2pvsw
sh-4.4# env |grep -i proxy
HTTP_PROXY=http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@10.0.99.4:3128
NO_PROXY=.cluster.local,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.miyadav24azur.qe.azure.devcluster.openshift.com,etcd-0.,etcd-1.,etcd-2.,localhost,test.no-proxy.com
HTTPS_PROXY=http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@10.0.99.4:3128
sh-4.4# curl -k https://oauth-openshift.apps.miyadav24azur.qe.azure.devcluster.openshift.com/healthz
ok
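
Whether a request to the route host goes through the proxy depends on NO_PROXY. The sketch below approximates a common matching convention (exact host or domain-suffix match). Real clients differ in the details, so treat it as an illustration only; `bypasses_proxy` is a hypothetical helper, and the NO_PROXY value is copied from the pod environment above:

```python
NO_PROXY = (
    ".cluster.local,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,"
    "172.30.0.0/16,api-int.miyadav24azur.qe.azure.devcluster.openshift.com,"
    "etcd-0.,etcd-1.,etcd-2.,localhost,test.no-proxy.com"
)


def bypasses_proxy(host, no_proxy):
    """Approximate NO_PROXY matching: exact host or domain-suffix match.

    Assumption: entries containing "/" are CIDR ranges that only apply to
    IP addresses, so this sketch skips them for hostnames.
    """
    host = host.lower().rstrip(".")
    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry or "/" in entry:
            continue
        suffix = entry.strip(".")
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


if __name__ == "__main__":
    route = "oauth-openshift.apps.miyadav24azur.qe.azure.devcluster.openshift.com"
    print(route, "bypasses proxy:", bypasses_proxy(route, NO_PROXY))
```

Under this approximation the oauth route host does not match any NO_PROXY entry, so a client honoring these variables would send the request through the proxy.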

Expected results:
The authentication operator reaches a healthy state and the installation completes.


Additional info:
1. This issue did not occur on 4.7.0-0.nightly-2020-11-18-085225.
2. The issue was first seen on 4.7.0-0.nightly-2020-11-20-234717.

Comment 1 Michal Fojtik 2020-12-01 08:55:46 UTC
The proxy CA cert should be present in "openshift-authentication-operator/trusted-ca-bundle", but it is not. The curl command works because you passed `-k`, which suppresses TLS validation.
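
To illustrate the difference between a verifying client and `curl -k`, here is a minimal sketch using Python's standard `ssl` module. It is not the operator's code, and `make_context` is a hypothetical helper: a default context verifies the certificate chain and hostname, while the insecure variant accepts any certificate.

```python
import ssl


def make_context(insecure=False):
    """Build a TLS client context; insecure=True mimics `curl -k`."""
    ctx = ssl.create_default_context()  # verifies chain and hostname
    if insecure:
        # Order matters: hostname checking must be disabled before
        # verify_mode can be set to CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


strict = make_context()
lenient = make_context(insecure=True)
print("strict verifies:", strict.verify_mode == ssl.CERT_REQUIRED)
print("lenient verifies:", lenient.verify_mode == ssl.CERT_REQUIRED)
```

Passing the strict context to `urllib.request.urlopen(url, context=strict)` against an endpoint whose issuing CA is not trusted would fail certificate verification, which is the same class of failure as the x509 error in the Degraded message.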

Comment 2 Johnny Liu 2020-12-01 09:02:47 UTC
> The proxy CA cert should be present in "openshift-authentication-operator/trusted-ca-bundle" but it is not.
My proxy is an *http* proxy; there is no CA cert for it.

Comment 3 Johnny Liu 2020-12-01 09:05:12 UTC
More interestingly, 4.6 does not have this problem, and neither do some earlier nightly builds, such as 4.7.0-0.nightly-2020-11-18-085225.

Comment 4 Johnny Liu 2020-12-07 05:22:03 UTC
BTW, I am not sure whether this has something to do with Bug 1901034.

Comment 5 Standa Laznicka 2020-12-07 08:34:17 UTC
This is a new check, which is why you're seeing errors. I'm assuming there is a bug, but I have not had a chance to look for it yet.
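
The pull request linked to this bug is titled "proxyconfig controller: add router CA to the trusted pool", which suggests the check's trust pool was missing the router CA. The general idea of extending a trust pool can be sketched as follows. This is a Python illustration under that assumption, not the operator's Go implementation; `build_trust_context` and the path in the usage comment are hypothetical:

```python
import ssl


def build_trust_context(extra_ca_files=()):
    """Return a verifying TLS context that trusts the system roots plus extra CA bundles."""
    ctx = ssl.create_default_context()  # loads the system's default CA certificates
    for path in extra_ca_files:
        # Each file is expected to hold one or more PEM-encoded CA certificates.
        ctx.load_verify_locations(cafile=path)
    return ctx


# Hypothetical usage: build_trust_context(["/path/to/router-ca.crt"])
ctx = build_trust_context()
print("verification enabled:", ctx.verify_mode == ssl.CERT_REQUIRED)
```

Verification stays enabled throughout; only the set of trusted issuers grows, which is the opposite of the `-k` approach.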

Comment 10 Johnny Liu 2020-12-15 02:16:53 UTC
Verified this bug with 4.7.0-0.nightly-2020-12-12-195258; it passed.

Comment 13 errata-xmlrpc 2021-02-24 15:35:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

