Description of problem:
Unable to deploy a nightly build using the IPI method. The bootstrap node never becomes healthy in any of the three AWS NLBs, which causes the deployment to fail.

Version-Release number of the following components:
$ ~/bin/openshift-install version
/home/ccallega/bin/openshift-install v4.1.0-201905081711-dirty
built from commit 218340f12450cae2961abfaab5985c0677dd63b5
release image registry.svc.ci.openshift.org/ocp/release@sha256:1cb302f7f7508582c5150ee908279e4a52614e801b9785a11b74b1ae7834f501

Bootstrap EC2 AMI: mycluster-x9lxw-master (ami-0dd265e79a060be94)

How reproducible:
Always

Steps to Reproduce:
1. ~/bin/openshift-install --dir=/tmp/openshift/mycluster create install-config
2. ~/bin/openshift-install --dir=/tmp/openshift/mycluster create cluster

Actual results:
blah blah blah...
DEBUG Apply complete! Resources: 117 added, 0 changed, 0 destroyed.
DEBUG
DEBUG The state of your infrastructure has been saved to the path
DEBUG below. This state is required to modify and destroy your
DEBUG infrastructure, so keep it safe. To inspect the complete state
DEBUG use the `terraform show` command.
DEBUG
DEBUG State path: /tmp/openshift-install-269239872/terraform.tfstate
DEBUG OpenShift Installer v4.1.0-201905081711-dirty
DEBUG Built from commit 218340f12450cae2961abfaab5985c0677dd63b5
INFO Waiting up to 30m0s for the Kubernetes API at https://api.mycluster.ccallegar-aws.sysdeseng.com:6443...
DEBUG Still waiting for the Kubernetes API: Get https://api.mycluster.ccallegar-aws.sysdeseng.com:6443/version?timeout=32s: dial tcp 3.14.206.133:6443: connect: connection refused
DEBUG Still waiting for the Kubernetes API: Get https://api.mycluster.ccallegar-aws.sysdeseng.com:6443/version?timeout=32s: dial tcp 3.14.206.133:6443: connect: connection refused
DEBUG Still waiting for the Kubernetes API: Get https://api.mycluster.ccallegar-aws.sysdeseng.com:6443/version?timeout=32s: dial tcp 3.14.206.133:6443: connect: connection refused
DEBUG Still waiting for the Kubernetes API: Get https://api.mycluster.ccallegar-aws.sysdeseng.com:6443/version?timeout=32s: dial tcp 3.14.206.133:6443: connect: connection refused
DEBUG Still waiting for the Kubernetes API: Get https://api.mycluster.ccallegar-aws.sysdeseng.com:6443/version?timeout=32s: dial tcp 3.14.206.133:6443: connect: connection refused
....and so on until FATAL

Expected results:
A complete OpenShift installation

Additional info:
$ ssh -A core.107.194 '/usr/local/bin/installer-gather.sh 10.0.136.54 10.0.156.35 10.0.166.221'
The authenticity of host '18.223.107.194 (18.223.107.194)' can't be established.
ECDSA key fingerprint is SHA256:MMEg3BaYWgMQgGO618QPLM+ZKbvCFWfuHLYBW1whSsw.
ECDSA key fingerprint is MD5:45:f3:19:55:15:4f:56:93:d6:a5:51:d3:d4:94:54:47.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '18.223.107.194' (ECDSA) to the list of known hosts.
Gathering bootstrap journals ...
Gathering bootstrap containers ...
Gathering rendered assets...
cp: cannot open '/var/opt/openshift/auth/kubeconfig' for reading: Permission denied
cp: cannot open '/var/opt/openshift/auth/kubeconfig-kubelet' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/admin-kubeconfig-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/aggregator-ca.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/aggregator-ca.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/aggregator-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/aggregator-client.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/aggregator-client.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/aggregator-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/aggregator-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/apiserver-proxy.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/apiserver-proxy.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-metric-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-metric-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-metric-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-metric-signer-client.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-metric-signer-client.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-client.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/etcd-client.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-lb-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-lb-server.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-lb-server.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-internal-lb-server.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-internal-lb-server.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-lb-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-lb-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-localhost-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-localhost-server.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-localhost-server.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-localhost-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-localhost-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-service-network-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-service-network-server.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-service-network-server.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-service-network-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-service-network-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-complete-server-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-complete-client-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-to-kubelet-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-to-kubelet-client.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-to-kubelet-client.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-to-kubelet-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-apiserver-to-kubelet-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-control-plane-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-control-plane-kube-controller-manager-client.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-control-plane-kube-controller-manager-client.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-control-plane-kube-scheduler-client.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-control-plane-kube-scheduler-client.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kube-control-plane-signer.key' for reading: Permission denied
Gathering cluster resources ...
cp: cannot open '/var/opt/openshift/tls/kube-control-plane-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kubelet-bootstrap-kubeconfig-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kubelet-client-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kubelet-client.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kubelet-client.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kubelet-signer.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kubelet-signer.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/kubelet-serving-ca-bundle.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/machine-config-server.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/machine-config-server.crt' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/service-account.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/service-account.pub' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/journal-gatewayd.key' for reading: Permission denied
cp: cannot open '/var/opt/openshift/tls/journal-gatewayd.crt' for reading: Permission denied
rm: missing operand
Try 'rm --help' for more information.
rm: missing operand
Try 'rm --help' for more information.
Waiting for logs ...
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port? 
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
The connection to the server api.mycluster.ccallegar-aws.sysdeseng.com:6443 was refused - did you specify the right host or port?
Gather remote logs
Log bundle written to ~/log-bundle.tar.gz

log-bundle.tar.gz is attached
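For triaging gather output like the transcript above, a small script can list which files `cp` could not read. This is only a sketch: the regular expression is derived from the error lines quoted in this report, and `unreadable_paths` and `sample` are hypothetical names, not part of installer-gather.sh.

```python
import re
from typing import List

# Pattern taken from the cp error lines in the transcript, e.g.
#   cp: cannot open '/var/opt/openshift/auth/kubeconfig' for reading: Permission denied
DENIED = re.compile(r"cp: cannot open '([^']+)' for reading: Permission denied")


def unreadable_paths(gather_output: str) -> List[str]:
    """Return every path cp reported as unreadable, in order of appearance."""
    return DENIED.findall(gather_output)


# Two of the lines from the transcript, plus the surrounding progress messages.
sample = (
    "Gathering rendered assets...\n"
    "cp: cannot open '/var/opt/openshift/auth/kubeconfig' for reading: Permission denied\n"
    "cp: cannot open '/var/opt/openshift/tls/etcd-client.key' for reading: Permission denied\n"
    "Gathering cluster resources ...\n"
)

for path in unreadable_paths(sample):
    print(path)
```

Because the pattern does not depend on line breaks, it should also work on output that was pasted as a single run of text.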
Created attachment 1566187 [details] log-bundle.tar.gz
Created attachment 1566188 [details] openshift_install_dir.tar.gz
I can leave the mycluster infrastructure up and running for today (05/09/2019), but I will have to destroy it at the end of the business day.
I see the same failure when trying to deploy via the UPI method as well.
Looking at the gathered logs:

less bootstrap/journals/bootkube.log
```
May 09 14:45:10 ip-10-0-13-157 systemd[1]: Started Bootstrap a Kubernetes cluster.
May 09 14:45:13 ip-10-0-13-157 bootkube.sh[1381]: Pulling release image...
May 09 14:45:14 ip-10-0-13-157 bootkube.sh[1381]: error pulling image "registry.svc.ci.openshift.org/ocp/release@sha256:1cb302f7f7508582c5150ee908279e4a52614e801b9785a11b74b1ae7834f501": unable to pull registry.svc.ci.openshift.org/ocp/release@sha256:1cb302f7f7508582c5150ee908279e4a52614e801b9785a11b74b1ae7834f501: unable to pull image: Error determining manifest MIME type for docker://registry.svc.ci.openshift.org/ocp/release@sha256:1cb302f7f7508582c5150ee908279e4a52614e801b9785a11b74b1ae7834f501: Error reading manifest sha256:1cb302f7f7508582c5150ee908279e4a52614e801b9785a11b74b1ae7834f501 in registry.svc.ci.openshift.org/ocp/release: unauthorized: authentication required
```

Looks like your pull secret is not valid. Please use the correct pull secret.
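The same check can be scripted against a gathered bootkube.log. The sketch below assumes the error wording matches the line quoted above; `find_pull_auth_failures` is a hypothetical helper, and the sample line is an abbreviated copy of the quoted log entry (the middle of the message is cut).

```python
import re

# Assumption: the failure always appears on one line in the form
#   error pulling image "<image>": ... unauthorized: authentication required
PULL_AUTH_ERROR = re.compile(
    r'error pulling image "(?P<image>[^"]+)".*unauthorized: authentication required'
)


def find_pull_auth_failures(log_text):
    """Return the image references whose pull failed with an authentication error."""
    return [match.group("image") for match in PULL_AUTH_ERROR.finditer(log_text)]


# Abbreviated from the bootkube.log excerpt in this report.
sample_log = (
    "May 09 14:45:13 ip-10-0-13-157 bootkube.sh[1381]: Pulling release image...\n"
    'May 09 14:45:14 ip-10-0-13-157 bootkube.sh[1381]: error pulling image '
    '"registry.svc.ci.openshift.org/ocp/release@sha256:'
    '1cb302f7f7508582c5150ee908279e4a52614e801b9785a11b74b1ae7834f501": '
    "unable to pull image: unauthorized: authentication required\n"
)

for image in find_pull_auth_failures(sample_log):
    print("pull secret rejected for", image)
```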
ARG!
My pull secret has gone invalid more than once in the past 6 months. Is there a way to test it?
I'm reopening. The failure to install is irrelevant. I want this bug to track the gather script complaining about not having access to all of those files.
Jeremiah, can you look at the installer-gather.sh output above? I think you mentioned that changes went in after it was originally introduced that created problems. We need to fix those.
> My pull secret has gone invalid more than once in the past 6 months. Is there a way to test it?

The keys from [1] are good forever, although new authorities may be added to them. So a key from there will always successfully install a given release image, but you may need to re-fetch keys to install newer release images.

Nightlies and other CI images (e.g. from [2]) require an additional authority to fetch from the CI registry. That key expires each month, and requires you to be in the OpenShift GitHub org, etc. This going stale is your problem above.

There is an open ticket for preflighting these creds: bug 1662106

[1]: https://cloud.redhat.com/openshift/install
[2]: https://openshift-release.svc.ci.openshift.org/
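Until a preflight exists, a partial offline check is possible: confirm that the pull secret carries an entry for the registry that hosts the release image. The sketch below assumes the pull secret uses the standard Docker config layout (a top-level `auths` object keyed by registry host) and that the image reference starts with an explicit registry host. `registry_of`, `has_auth_for`, and the placeholder secret are hypothetical. It cannot detect an expired CI key, which is the failure in this bug; only an actual pull proves the credential is still accepted.

```python
import json


def registry_of(image: str) -> str:
    """Return the registry host of a reference such as host/namespace/name@sha256:...

    Assumption: the reference always begins with an explicit registry host.
    """
    return image.split("/", 1)[0]


def has_auth_for(pull_secret_json: str, image: str) -> bool:
    """Report whether the pull secret has any entry for the image's registry.

    This checks presence only; it says nothing about whether the key is still valid.
    """
    auths = json.loads(pull_secret_json).get("auths", {})
    return registry_of(image) in auths


# Placeholder secret without a CI registry authority; the auth value is not a real credential.
secret = json.dumps({"auths": {"quay.io": {"auth": "placeholder"}}})

release_image = (
    "registry.svc.ci.openshift.org/ocp/release@sha256:"
    "1cb302f7f7508582c5150ee908279e4a52614e801b9785a11b74b1ae7834f501"
)

if not has_auth_for(secret, release_image):
    print("no authority for", registry_of(release_image))
```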
https://github.com/openshift/installer/pull/1735
I dunno how the errata tool decides to push things into ON_QA, but this isn't in the most-recent nightly yet:

$ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-09-223828 | grep ' installer '
installer https://github.com/openshift/installer 403a93d1f683384800597ac38e9c2fc0180b3a5d
$ git log --first-parent --format='%ad %h %d %s' --date=iso 403a93d1f68..origin/master
2019-05-09 23:22:59 +0200 59e927d2b (HEAD -> master, origin/release-4.2, origin/release-4.1, origin/master, origin/HEAD) Merge pull request #1735 from jstuever/bz1708307
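The first half of that check, pulling one component's commit out of the release info, could be scripted roughly as follows. This assumes each relevant line of `oc adm release info --commits` output has three whitespace-separated fields (name, repository, commit), as in the line quoted above; `component_commit` is a hypothetical helper.

```python
from typing import Optional


def component_commit(release_info: str, component: str) -> Optional[str]:
    """Return the commit recorded for a component, or None if it is not listed.

    Assumption: matching lines look like "<name> <repo URL> <commit>".
    """
    for line in release_info.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == component:
            return fields[2]
    return None


# The installer line quoted in this comment.
sample_info = (
    "installer https://github.com/openshift/installer "
    "403a93d1f683384800597ac38e9c2fc0180b3a5d\n"
)

print(component_commit(sample_info, "installer"))
```

With the commit in hand, `git merge-base --is-ancestor <fix-commit> <nightly-commit>` run in a checkout of the installer repository exits 0 when the fix is already included in the nightly.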
*** Bug 1706750 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758