Bug 1904231 - [GCP, UPI, Proxy] auth, mco, storage problems at end of installation
Summary: [GCP, UPI, Proxy] auth, mco, storage problems at end of installation
Keywords:
Status: CLOSED DUPLICATE of bug 1901034
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Antonio Murdaca
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-03 21:02 UTC by To Hung Sze
Modified: 2020-12-04 21:03 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-04 21:03:07 UTC
Target Upstream Version:
Embargoed:



Description To Hung Sze 2020-12-03 21:02:22 UTC
Description of problem:
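After installation, the machine-config operator (MCO) reports Available=False. The authentication operator is also Degraded and the storage operator is not Available (see the log under "Actual results").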

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-12-03-103850
Platform: GCP
Cluster: UPI, behind an HTTP proxy, OVN networking, etcd encryption enabled

How reproducible:
Use our automation to create a cluster with this template:
private-templates/functionality-testing/aos-4_7/upi-on-gcp/versioned-installer-http_proxy-remove_rhcos_worker-ovn-etcd_encryption-ci



Steps to Reproduce:
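1. Create a cluster from the template listed under "How reproducible".
2. Run ./openshift-install wait-for install-complete and wait for the cluster to initialize.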

Actual results:
Error:
+ ./openshift-install wait-for install-complete --dir '/home/jenkins/workspace/Launch Environment Flexy/workdir/install-dir'
level=info msg=Waiting up to 40m0s for the cluster at https://api.tsze-re11312.qe.gcp.devcluster.openshift.com:6443 to initialize...
level=error msg=Cluster operator authentication Degraded is True with ProxyConfigController_SyncError: ProxyConfigControllerDegraded: endpoint("https://oauth-openshift.apps.tsze-re11312.qe.gcp.devcluster.openshift.com/healthz") is unreachable with proxy(Get "https://oauth-openshift.apps.tsze-re11312.qe.gcp.devcluster.openshift.com/healthz": x509: certificate signed by unknown authority) and without proxy(Get "https://oauth-openshift.apps.tsze-re11312.qe.gcp.devcluster.openshift.com/healthz": x509: certificate signed by unknown authority)
level=info msg=Cluster operator baremetal Disabled is True with UnsupportedPlatform: Nothing to do on this Platform
level=info msg=Cluster operator insights Disabled is False with AsExpected: 
level=info msg=Cluster operator machine-config Progressing is True with : Working towards 4.7.0-0.nightly-2020-12-03-103850
level=error msg=Cluster operator machine-config Degraded is True with RequiredPoolsFailed: Unable to apply 4.7.0-0.nightly-2020-12-03-103850: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)
level=info msg=Cluster operator machine-config Available is False with : Cluster not available for 4.7.0-0.nightly-2020-12-03-103850
level=info msg=Cluster operator network ManagementStateDegraded is False with : 
level=info msg=Cluster operator storage Progressing is True with GCPPDCSIDriverOperatorCR_GCPPDDriverControllerServiceController_Deploying: GCPPDCSIDriverOperatorCRProgressing: GCPPDDriverControllerServiceControllerProgressing: Waiting for Deployment to deploy pods
level=info msg=Cluster operator storage Available is False with GCPPDCSIDriverOperatorCR_GCPPDDriverControllerServiceController_Deploying: GCPPDCSIDriverOperatorCRAvailable: GCPPDDriverControllerServiceControllerAvailable: Waiting for Deployment to deploy the CSI Controller Service
level=error msg=Cluster initialization failed because one or more operators are not functioning properly.
level=error msg=The cluster should be accessible for troubleshooting as detailed in the documentation linked below,
level=error msg=https://docs.openshift.com/container-platform/latest/support/troubleshooting/troubleshooting-installations.html
level=error msg=The 'wait-for install-complete' subcommand can then be used to continue the installation
level=fatal msg=failed to initialize the cluster: Cluster operator machine-config is still updating
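
For triage, the "Cluster operator ... is ... with ..." lines above can be reduced to a short list of failing operators. The following is a minimal Python sketch, not part of the installer. The helper names (parse_operator_conditions, failing_operators) are hypothetical, and the line format is assumed to match what this log shows; other installer versions may print it differently.

```python
import re

# Matches installer log lines of the form seen in this report, e.g.:
#   level=error msg=Cluster operator machine-config Degraded is True with Reason: message
# The reason may be empty ("with : message").
LINE_RE = re.compile(
    r"level=(?P<level>\w+) msg=Cluster operator (?P<name>\S+) "
    r"(?P<condition>\S+) is (?P<status>True|False|Unknown) with "
    r"(?P<reason>[^:]*): ?(?P<message>.*)"
)


def parse_operator_conditions(log_text):
    """Return one dict per 'Cluster operator' line found in log_text."""
    conditions = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            conditions.append(match.groupdict())
    return conditions


def failing_operators(conditions):
    """Return sorted names of operators with Degraded=True or Available=False."""
    failing = set()
    for cond in conditions:
        degraded = cond["condition"] == "Degraded" and cond["status"] == "True"
        unavailable = cond["condition"] == "Available" and cond["status"] == "False"
        if degraded or unavailable:
            failing.add(cond["name"])
    return sorted(failing)
```

Run against the log above, this reports authentication, machine-config and storage, which matches the bug title.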


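./oc get co shows these operators with problems (columns: NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE; VERSION is blank for machine-config):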
machine-config                                                                 False       True          True       126m

storage                                    4.7.0-0.nightly-2020-12-03-103850   False       True          False      35s
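
The two rows above can also be checked programmatically. This is a minimal Python sketch under two assumptions: the column order is NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE, and VERSION may be blank, as it is for machine-config here. The function names are hypothetical. Rows are parsed from the right so that a missing VERSION does not shift the status columns.

```python
def parse_co_row(row):
    """Split one data row of 'oc get co' output into named fields.

    VERSION can be empty, so the four trailing columns are taken
    from the right-hand side of the row.
    """
    tokens = row.split()
    if len(tokens) not in (5, 6):
        raise ValueError("unexpected number of columns: %r" % row)
    available, progressing, degraded, since = tokens[-4:]
    return {
        "name": tokens[0],
        "version": tokens[1] if len(tokens) == 6 else "",
        "available": available,
        "progressing": progressing,
        "degraded": degraded,
        "since": since,
    }


def is_unhealthy(op):
    """An operator is flagged if it is not Available or is Degraded."""
    return op["available"] != "True" or op["degraded"] == "True"
```

Both rows in this report are flagged: machine-config is neither Available nor healthy (Degraded=True), and storage is not yet Available.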


Expected results:
The cluster finishes installation successfully.

Additional info:

Comment 1 To Hung Sze 2020-12-03 21:04:40 UTC
I have the must-gather, but it is too big to attach here.
Please ping me and I can send it over or share it.

Comment 2 Kirsten Garrison 2020-12-03 23:55:52 UTC
Hi, can you please attach a must-gather from this cluster?

Comment 3 To Hung Sze 2020-12-04 15:33:22 UTC
Just shared the zip file with you.

