Bug 1814133 - Failed to install OCP 4.3 latest on the Bare Metals: Cluster operator authentication Degraded is True with RouteHealthDegradedFailedGet: RouteHealthDegraded: failed to GET route: dial tcp <ip>:443: connect: connection refused
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.5.0
Assignee: Dan Mace
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-17 07:02 UTC by spandura
Modified: 2022-08-04 22:27 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-04 18:05:40 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2020:2409  0        None      None    None     2020-08-04 18:05:43 UTC

Description spandura 2020-03-17 07:02:08 UTC
Description of problem:
==========================
Installing the latest OCP 4.3 build on bare-metal hosts failed.


The installation failed with the following error message:
=======================================================
fatal: [10.8.32.24]: FAILED! => {
    "changed": false,
    "cmd": [
        "openshift-install",
        "wait-for",
        "install-complete"
    ],
    "delta": "0:30:00.122576",
    "end": "2020-03-13 10:17:29.372341",
    "invocation": {
        "module_args": {
            "_raw_params": "openshift-install wait-for install-complete",
            "_uses_shell": false,
            "argv": null,
            "chdir": "/root/ocp-cluster-bm",
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2020-03-13 09:47:29.249765",
    "stderr": "level=info msg=\"Waiting up to 30m0s for the cluster at https://api.ocp-cluster-bm.ocpmig.css-qe.com:6443 to initialize...\"\nlevel=error msg=\"Cluster operator authentication Degraded is True with RouteHealthDegradedFailedGet: RouteHealthDegraded: failed to GET route: dial tcp 10.8.32.24:443: connect: connection refused\"\nlevel=info msg=\"Cluster operator authentication Progressing is Unknown with NoData: \"\nlevel=info msg=\"Cluster operator authentication Available is Unknown with NoData: \"\nlevel=info msg=\"Cluster operator console Progressing is True with SyncLoopRefreshProgressingInProgress: SyncLoopRefreshProgressing: Working toward version 4.3.3\"\nlevel=info msg=\"Cluster operator console Available is False with DeploymentAvailableInsufficientReplicas: DeploymentAvailable: 0 pods available for console deployment\"\nlevel=info msg=\"Cluster operator insights Disabled is False with : \"\nlevel=fatal msg=\"failed to initialize the cluster: Some cluster operators are still updating: authentication, console\"",
    "stderr_lines": [
        "level=info msg=\"Waiting up to 30m0s for the cluster at https://api.ocp-cluster-bm.ocpmig.css-qe.com:6443 to initialize...\"",
        "level=error msg=\"Cluster operator authentication Degraded is True with RouteHealthDegradedFailedGet: RouteHealthDegraded: failed to GET route: dial tcp 10.8.32.24:443: connect: connection refused\"",
        "level=info msg=\"Cluster operator authentication Progressing is Unknown with NoData: \"",
        "level=info msg=\"Cluster operator authentication Available is Unknown with NoData: \"",
        "level=info msg=\"Cluster operator console Progressing is True with SyncLoopRefreshProgressingInProgress: SyncLoopRefreshProgressing: Working toward version 4.3.3\"",
        "level=info msg=\"Cluster operator console Available is False with DeploymentAvailableInsufficientReplicas: DeploymentAvailable: 0 pods available for console deployment\"",
        "level=info msg=\"Cluster operator insights Disabled is False with : \"",
        "level=fatal msg=\"failed to initialize the cluster: Some cluster operators are still updating: authentication, console\""
    ],
    "stdout": "",
    "stdout_lines": []
}

PLAY RECAP ********************************************************************************************************************************************************************************************************
10.8.32.24                 : ok=89   changed=54   unreachable=0    failed=1    skipped=7    rescued=0    ignored=0   
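
The operator conditions in the stderr output above can be summarized programmatically. The following is a minimal Python sketch and is not part of the original report. The names parse_operator_conditions, unhealthy_operators, and SAMPLE_LINES are illustrative assumptions, and the regular expression is based only on the message format visible in this output (lines in .openshift_install.log itself may carry additional prefixes).

```python
import re
from collections import defaultdict

# Matches lines such as:
#   level=error msg="Cluster operator <name> <Condition> is <Status> with <Reason>: <message>"
# The format is inferred from the stderr_lines shown in this report.
CONDITION_RE = re.compile(
    r'level=(?P<level>\w+) msg="Cluster operator (?P<operator>\S+) '
    r'(?P<condition>\w+) is (?P<status>\w+) with (?P<reason>[^:]*): ?(?P<message>.*)"$'
)

# A few lines copied from the stderr_lines array above (JSON escaping removed).
SAMPLE_LINES = [
    'level=info msg="Waiting up to 30m0s for the cluster at '
    'https://api.ocp-cluster-bm.ocpmig.css-qe.com:6443 to initialize..."',
    'level=error msg="Cluster operator authentication Degraded is True with '
    'RouteHealthDegradedFailedGet: RouteHealthDegraded: failed to GET route: '
    'dial tcp 10.8.32.24:443: connect: connection refused"',
    'level=info msg="Cluster operator console Available is False with '
    'DeploymentAvailableInsufficientReplicas: DeploymentAvailable: '
    '0 pods available for console deployment"',
    'level=info msg="Cluster operator insights Disabled is False with : "',
]


def parse_operator_conditions(lines):
    """Group the reported conditions by cluster operator name."""
    by_operator = defaultdict(list)
    for line in lines:
        match = CONDITION_RE.search(line.strip())
        if match:
            by_operator[match["operator"]].append(
                {key: match[key] for key in ("level", "condition", "status", "reason", "message")}
            )
    return dict(by_operator)


def unhealthy_operators(conditions):
    """Return operators that report Degraded=True or Available=False."""
    bad = {("Degraded", "True"), ("Available", "False")}
    return sorted(
        name
        for name, conds in conditions.items()
        if any((c["condition"], c["status"]) in bad for c in conds)
    )


if __name__ == "__main__":
    print(unhealthy_operators(parse_operator_conditions(SAMPLE_LINES)))
```

Lines that do not describe an operator condition, such as the "Waiting up to 30m0s" line, are skipped rather than treated as errors.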


Version-Release number of the following components:
==========================================================
[core@master0 ~]$ rpm -qa | grep openshift
openshift-hyperkube-4.3.3-202002140552.git.0.e38059c.el8.x86_64
openshift-clients-4.3.3-202002140552.git.1.ff73b47.el8.x86_64

ocp_installers_index_url: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.3.3/

ocp_rhcos_index_url: https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.3/latest/


How reproducible:
====================
1/1


Steps to Reproduce:
========================
1. Run this playbook to install OCP 4.3 on bare metal: https://code.engineering.redhat.com/gerrit/gitweb?p=CSS_OCP_OCS_Migration.git;a=blob;f=css_ocp_ocs_migration/ansible/playbooks/setup_ocp_4_x.yml;h=43770efd7f969edd65d28d8f4e876c74fe4135ef;hb=9e861bef5f85c95854ab76351a35613e1192c850

Actual results:
=======================
The OCP 4.3 installation on bare metal failed.

Expected results:
====================
The OpenShift installation should succeed.

Additional info:
=====================
cat ocp-cluster-bm/.openshift_install.log:
http://pastebin.test.redhat.com/845283

cat /etc/haproxy/haproxy.cfg:
http://pastebin.test.redhat.com/845285

iptables -L:
http://pastebin.test.redhat.com/845286

cat ocp-cluster-bm/install-config.yaml:
http://pastebin.test.redhat.com/845289

Comment 1 spandura 2020-03-17 10:20:45 UTC
[root@dell-per730-09 ~]# export KUBECONFIG="ocp-cluster-bm/auth/kubeconfig" 
[root@dell-per730-09 ~]# 
[root@dell-per730-09 ~]# oc projects
You have access to the following projects and can switch between them with 'oc project <projectname>':

default
kube-node-lease
kube-public
kube-system
openshift
openshift-apiserver
openshift-apiserver-operator
openshift-authentication
openshift-authentication-operator
openshift-cloud-credential-operator
openshift-cluster-machine-approver
openshift-cluster-node-tuning-operator
openshift-cluster-samples-operator
openshift-cluster-storage-operator
openshift-cluster-version
openshift-config
openshift-config-managed
openshift-console
openshift-console-operator
openshift-controller-manager
openshift-controller-manager-operator
openshift-dns
openshift-dns-operator
openshift-etcd
openshift-image-registry
openshift-infra
openshift-ingress
openshift-ingress-operator
openshift-insights
openshift-kni-infra
openshift-kube-apiserver
openshift-kube-apiserver-operator
openshift-kube-controller-manager
openshift-kube-controller-manager-operator
openshift-kube-scheduler
openshift-kube-scheduler-operator
openshift-machine-api
openshift-machine-config-operator
openshift-marketplace
openshift-monitoring
openshift-multus
openshift-network-operator
openshift-node
openshift-openstack-infra
openshift-operator-lifecycle-manager
openshift-operators
openshift-sdn
openshift-service-ca
openshift-service-ca-operator
openshift-service-catalog-apiserver-operator
openshift-service-catalog-controller-manager-operator
[root@dell-per730-09 ~]# 
[root@dell-per730-09 ~]# 
[root@dell-per730-09 ~]# oc get pods -n openshift-console
NAME                        READY   STATUS             RESTARTS   AGE
console-5d49d56cd9-m8bdq    0/1     Running            15         66m
console-5f9c89ffc5-glwzf    0/1     Running            15         67m
console-5f9c89ffc5-rvbd7    0/1     CrashLoopBackOff   15         67m
downloads-f99ff6d4f-9ncj5   1/1     Running            0          69m
downloads-f99ff6d4f-jlndd   1/1     Running            0          69m
[root@dell-per730-09 ~]# 
[root@dell-per730-09 ~]# oc logs pod/console-5d49d56cd9-m8bdq -n openshift-console
2020/03/17 10:11:34 cmd/main: cookies are secure!
2020/03/17 10:11:34 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:11:44 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:11:54 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:12:04 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:12:14 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:12:24 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:12:34 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:12:44 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:12:54 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:13:04 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:13:14 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:13:24 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:13:34 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:13:44 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:13:54 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:14:04 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
2020/03/17 10:14:14 auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com/oauth/token failed: Head https://oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com: dial tcp 10.8.32.24:443: connect: connection refused
[root@dell-per730-09 ~]#
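
The console log above repeatedly reports "connection refused" for 10.8.32.24:443, the same address as the Ansible target host in the description. A refused connection generally means the host answered but nothing accepted the connection on that port, which differs from a timeout or a DNS failure. The following is a minimal Python sketch for telling these cases apart; probe_tcp is a hypothetical helper name, not a tool used in the original report.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Attempt a TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied, but no listener accepted the connection.
        return "refused"
    except socket.timeout:
        # No reply within the timeout, e.g. packets dropped by a firewall.
        return "timeout"
    except OSError as exc:
        # Covers name-resolution failures, unreachable networks, and similar.
        return f"error: {exc}"
```

For example, calling probe_tcp("oauth-openshift.apps.ocp-cluster-bm.ocpmig.css-qe.com", 443) from the machine that ran the installer would show whether the load balancer accepts connections on the ingress HTTPS port.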

Comment 2 Scott Dodson 2020-03-17 12:51:31 UTC
1) Please test with the latest version, 4.3.5.
2) The installer is complaining about the route for several components, which depends heavily on your provisioned load balancer. Please check the load balancer and the status of ingress.

This is most likely a problem in your infrastructure, but I am moving it over to Routing.
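
One way to act on the load balancer suggestion above is to confirm that the HAProxy configuration binds every port a user-provisioned OpenShift 4 load balancer is normally expected to expose: 6443 (API), 22623 (machine config server), and 443 and 80 (ingress). The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, only "bind" directives are inspected, port ranges are not expanded, and the sample configuration in the usage note is invented for illustration because the haproxy.cfg from this report is only available through the pastebin link.

```python
import re

# Ports normally required on a user-provisioned load balancer for OpenShift 4.
EXPECTED_PORTS = {6443, 22623, 443, 80}

# Matches "bind <address>:<port>" directives; commented-out lines do not match.
BIND_RE = re.compile(r"^\s*bind\s+\S*:(\d+)\b")


def bound_ports(config_text):
    """Collect the ports named in HAProxy bind directives."""
    ports = set()
    for line in config_text.splitlines():
        match = BIND_RE.match(line)
        if match:
            ports.add(int(match.group(1)))
    return ports


def missing_ports(config_text, expected=EXPECTED_PORTS):
    """Return the expected ports that no bind directive covers."""
    return sorted(set(expected) - bound_ports(config_text))
```

As a usage example, a hypothetical configuration containing "bind *:6443", "bind *:22623", and "bind *:80" but no bind for 443 would cause missing_ports to report port 443, which would be consistent with a "connection refused" error on that port.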

Comment 3 spandura 2020-03-17 13:07:50 UTC
[root@dell-per730-09 ~]# oc adm must-gather
http://pastebin.test.redhat.com/845417

Comment 4 spandura 2020-03-17 13:08:00 UTC
[root@dell-per730-09 ~]# oc adm must-gather
http://pastebin.test.redhat.com/845417

Comment 5 spandura 2020-03-17 16:20:32 UTC
I tried the same installation on the latest build and am still seeing the issue:

[core@master0 ~]$ rpm -qa | grep openshift
openshift-hyperkube-4.3.5-202002280657.git.0.b3bfb5a.el8.x86_64
openshift-clients-4.3.5-202002280657.git.1.55a9334.el8.x86_64

Comment 8 spandura 2020-03-17 19:51:29 UTC
Moving this bug to VERIFIED, as I made some changes to my environment and the installation worked.

Comment 11 errata-xmlrpc 2020-08-04 18:05:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5 image release advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

