Created attachment 1593815 [details]
log-bundle

Description of problem:
After using the openshift-installer to generate an install-config, I attempt to deploy a cluster to AWS. The deployment eventually fails with "waiting for Kubernetes API: context deadline exceeded". I was able to ssh to the bootstrap node in AWS and retrieve data from the bootkube.service log (see attachment).

openshift-installer log snippet:
time="2019-07-26T16:09:41-07:00" level=debug msg="Still waiting for the Kubernetes API: Get https://api.ocs-ci-clacroix.qe.rh-ocs.com:6443/version?timeout=32s: dial tcp 18.218.245.145:6443: connect: connection refused"
time="2019-07-26T16:09:54-07:00" level=debug msg="Fetching \"Install Config\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg="Loading \"Install Config\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg=" Loading \"SSH Key\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg=" Loading \"Base Domain\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg=" Loading \"Platform\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg=" Loading \"Cluster Name\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg=" Loading \"Base Domain\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg=" Loading \"Pull Secret\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg=" Loading \"Platform\"..."
time="2019-07-26T16:09:54-07:00" level=debug msg="Using \"Install Config\" loaded from state file"
time="2019-07-26T16:09:54-07:00" level=debug msg="Reusing previously-fetched \"Install Config\""
time="2019-07-26T16:09:54-07:00" level=info msg="Pulling debug logs from the bootstrap machine"
time="2019-07-26T16:09:55-07:00" level=debug msg="Gathering bootstrap journals ..."
time="2019-07-26T16:09:56-07:00" level=debug msg="Gathering bootstrap containers ..."
time="2019-07-26T16:09:59-07:00" level=debug msg="time=\"2019-07-26T23:09:59Z\" level=fatal msg=\"failed to connect: failed to connect: context deadline exceeded\""
time="2019-07-26T16:10:00-07:00" level=debug msg="Gathering rendered assets..."
time="2019-07-26T16:10:00-07:00" level=debug msg="Gathering cluster resources ..."
time="2019-07-26T16:10:01-07:00" level=debug msg="Waiting for logs ..."
time="2019-07-26T16:10:02-07:00" level=debug msg="The connection to the server api.ocs-ci-clacroix.qe.rh-ocs.com:6443 was refused - did you specify the right host or port?"
...
time="2019-07-26T16:10:04-07:00" level=debug msg="The connection to the server api.ocs-ci-clacroix.qe.rh-ocs.com:6443 was refused - did you specify the right host or port?"
time="2019-07-26T16:10:04-07:00" level=debug msg="Gather remote logs"
time="2019-07-26T16:10:04-07:00" level=debug msg="Collecting info from 10.0.132.64"
time="2019-07-26T16:10:04-07:00" level=debug msg="lost connection"
time="2019-07-26T16:10:04-07:00" level=debug msg="ssh: connect to host 10.0.132.64 port 22: Connection refused\r"
time="2019-07-26T16:10:04-07:00" level=debug msg="Collecting info from 10.0.147.0"
time="2019-07-26T16:10:04-07:00" level=debug msg="lost connection"
time="2019-07-26T16:10:04-07:00" level=debug msg="ssh: connect to host 10.0.147.0 port 22: Connection refused\r"
time="2019-07-26T16:10:04-07:00" level=debug msg="Collecting info from 10.0.170.19"
time="2019-07-26T16:10:04-07:00" level=debug msg="lost connection"
time="2019-07-26T16:10:04-07:00" level=debug msg="ssh: connect to host 10.0.170.19 port 22: Connection refused\r"
time="2019-07-26T16:10:04-07:00" level=debug msg="Log bundle written to ~/log-bundle.tar.gz"
time="2019-07-26T16:10:05-07:00" level=info msg="Bootstrap gather logs captured here \"/Users/clacroix/clusters/4.2.0-deploy-16/log-bundle-20190726161004.tar.gz\""
time="2019-07-26T16:10:05-07:00" level=fatal msg="waiting for Kubernetes API: context deadline exceeded"

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-07-26-152831
This is the latest version I have attempted. I have tried various other versions (both ci and nightly) that were accepted in the past few days, with exactly the same results.

How reproducible:
100% of attempts to deploy using recent versions of 4.2.

Steps to Reproduce:
1. Generate install-config using openshift-installer
2. Attempt deployment of cluster to AWS
3.

Actual results:
Deployment failure - waiting for Kubernetes API: context deadline exceeded.

Expected results:
Successful deployment

Additional info:
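For anyone triaging a similar installer log, here is a minimal sketch (not part of the installer) that filters the output down to the error and fatal entries. It assumes every entry follows the time="..." level=... msg="..." layout seen in the snippet above; the names LOG_LINE and entries are made up for illustration.

```python
import re

# Matches one entry of the form: time="..." level=... msg="..."
# The msg group tolerates backslash-escaped quotes inside the message.
LOG_LINE = re.compile(
    r'time="(?P<time>[^"]*)"\s+level=(?P<level>\w+)\s+'
    r'msg="(?P<msg>(?:[^"\\]|\\.)*)"'
)


def entries(text, levels=("error", "fatal")):
    """Yield (time, level, msg) for each entry whose level is in `levels`.

    Only the outer level counts: a debug entry that merely quotes a
    remote fatal message inside its msg is still reported as debug.
    """
    for match in LOG_LINE.finditer(text):
        if match.group("level") in levels:
            yield match.group("time"), match.group("level"), match.group("msg")
```

Calling list(entries(open(path).read())) on a saved copy of the installer log (path is whatever file you saved it to) should leave only the lines that actually ended the run.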
Created attachment 1593816 [details] bootkube-service-log
Created attachment 1593817 [details] openshift-installer log
Jul 26 22:39:47 ip-10-0-7-58 bootkube.sh[1539]: time="2019-07-26T22:39:47Z" level=error msg="Error pulling image ref //registry.svc.ci.openshift.org/ocp/release@sha256:6ccb990f8616a6efca05b411af56c79f5c4502f05fcd1f5cfce143858a8d0986: Error initializing source docker://registry.svc.ci.openshift.org/ocp/release@sha256:6ccb990f8616a6efca05b411af56c79f5c4502f05fcd1f5cfce143858a8d0986: Error reading manifest sha256:6ccb990f8616a6efca05b411af56c79f5c4502f05fcd1f5cfce143858a8d0986 in registry.svc.ci.openshift.org/ocp/release: unauthorized: authentication required"

The bootstrap node could not pull the release image because the registry rejected the request as unauthenticated. Please use the correct pull secret when using an installer and release image built by CI.
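A quick way to check a pull secret against the error above is to see whether it carries an entry for the registry host named in the image reference. The sketch below is illustrative only: it assumes the pull secret uses the usual {"auths": {"<registry host>": {...}}} JSON layout, and the helper names registry_host and has_credentials are hypothetical. It only checks that an entry exists, not that the credentials are valid.

```python
import json
import re


def registry_host(image_ref: str) -> str:
    """Return the registry host of a reference like //host/repo@sha256:..."""
    # Drop an optional transport prefix such as "docker://" or a bare "//".
    ref = re.sub(r"^(docker:)?//", "", image_ref)
    return ref.split("/", 1)[0]


def has_credentials(pull_secret_json: str, image_ref: str) -> bool:
    """True if the pull secret has an auths entry for the image's registry."""
    auths = json.loads(pull_secret_json).get("auths", {})
    return registry_host(image_ref) in auths
```

For the failing reference in the log, registry_host returns registry.svc.ci.openshift.org, so a pull secret without an entry for that host would be expected to produce the "authentication required" error shown.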
(In reply to Abhinav Dahiya from comment #3)
> Please use the correct pull secret when using installer+release-image built
> by CI.

Can you point me to where I can generate the correct pull secret for these builds, or how to construct it myself? I've been using the pull secret I downloaded from openshift.com's install steps.
Hi, Could you please help with comment #4?