Description of problem:
When the installation fails, the installer tries to download the bootstrap logs by connecting to the IP address of the bootstrap node. That address belongs to an internal network that may not be reachable from the server where the installer is running. Any connection to the cluster has to go through the FIP (floating IP) address assigned to the API instance.

$ ./openshift-install create cluster
INFO Consuming "Install Config" from target directory
INFO Creating infrastructure resources...
INFO Waiting up to 30m0s for the Kubernetes API at https://api.morenod-ocp.qe.devcluster.openshift.com:6443...
INFO API v1.14.0+d406851 up
INFO Waiting up to 30m0s for bootstrapping to complete...
INFO Pulling debug logs from the bootstrap machine
ERROR failed to create SSH client: dial tcp 192.168.26.7:22: connect: connection timed out
FATAL failed to wait for bootstrapping to complete: timed out waiting for the condition

Version-Release number of the following components:
$ ./openshift-install version
./openshift-install unreleased-master-1110-g7cc42fa81b7a26c6ae180af3ff097f32d8c30c51-dirty
built from commit 7cc42fa81b7a26c6ae180af3ff097f32d8c30c51
release image registry.svc.ci.openshift.org/ocp/release@sha256:f221a7095d3b4e57ecc2a9280152e3d096d3c665209945b1651ba4292e9a17f9

How reproducible:

Steps to Reproduce:
1. Install OCP4 on OSP
2. Wait until the installation fails (or force it)
3. Check the installation log

Actual results:
Installation fails and the logs cannot be obtained.

Expected results:
Logs are downloaded using the API instance as a jump host to reach the rest of the nodes.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
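The jump-host approach named in the expected results can be sketched roughly as follows. This is only an illustration, not how the installer does it: the helper name `jump_ssh_command` is hypothetical, the `core` user and the `bootkube.service` unit are assumptions about the bootstrap machine, and the sketch relies on the OpenSSH `-J` (ProxyJump) option being available on the client.

```python
import shlex


def jump_ssh_command(jump_host, target_ip, user="core",
                     remote_cmd="journalctl --no-pager -u bootkube.service"):
    """Build an ssh argv that reaches target_ip through jump_host.

    Hypothetical helper: jump_host is the publicly reachable API address,
    target_ip is the internal address that timed out in the report.
    """
    return [
        "ssh",
        "-J", f"{user}@{jump_host}",   # hop through the reachable host first
        f"{user}@{target_ip}",         # then connect to the internal address
        remote_cmd,                    # assumed command to run on the node
    ]


if __name__ == "__main__":
    argv = jump_ssh_command(
        "api.morenod-ocp.qe.devcluster.openshift.com", "192.168.26.7")
    # Print a copy-pastable command line instead of executing it.
    print(" ".join(shlex.quote(part) for part in argv))
```

The function only builds the command so that it can be inspected before anything is run; whether the API instance accepts SSH for that user depends on the deployment.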
Reassign to Dan Prince. He is currently working on this.
I saw that bootstrap-fip was added for this issue, but I found that this floating IP is not destroyed even when I destroy the cluster.
(In reply to weiwei jiang from comment #2)
> I saw bootstrap-fip is in for this issue, but I found that
> this floating ip will not be destroyed even I destroy the cluster.

Created bug https://bugzilla.redhat.com/show_bug.cgi?id=1740543 for deleting this FIP.
Tried with:
$ ./openshift-install version
./openshift-install v4.2.0-201908181300-dirty
built from commit 4e204c5e509de1bd31113b0c0e73af1a35e52c0a
release image registry.svc.ci.openshift.org/ocp/release@sha256:bbf1c5a9b0ca47cae481bd9327bcde6869b826118e14fce61a6e3f99728f4c4c

It still uses the internal IP to pull the bootstrap logs:
...
DEBUG Loading "Platform"...
DEBUG Using "Install Config" loaded from state file
DEBUG Reusing previously-fetched "Install Config"
INFO Pulling debug logs from the bootstrap machine
ERROR failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: dial tcp 192.168.0.14:22: connect: connection timed out
FATAL waiting for Kubernetes API: context deadline exceeded

(openstack) server list --name wjosp0819
+--------------------------------------+-------------------------------------+--------+---------------------------------------------------------------+-------+----------+
| ID                                   | Name                                | Status | Networks                                                      | Image | Flavor   |
+--------------------------------------+-------------------------------------+--------+---------------------------------------------------------------+-------+----------+
| 280acfc1-94e8-458c-953d-6121c41cd2f6 | preserve-wjosp0819b-6rwjc-master-2  | ACTIVE | preserve-wjosp0819b-6rwjc-openshift=192.168.0.25              |       | m1.large |
| 3ecf217b-4efc-4478-b92e-b2076f7273d1 | preserve-wjosp0819b-6rwjc-master-1  | ACTIVE | preserve-wjosp0819b-6rwjc-openshift=192.168.0.13              |       | m1.large |
| 98442568-b5a9-4868-a399-3c6fea8000d0 | preserve-wjosp0819b-6rwjc-master-0  | ACTIVE | preserve-wjosp0819b-6rwjc-openshift=192.168.0.27              |       | m1.large |
| 42de807d-af1c-45af-b935-d4101d74decb | preserve-wjosp0819b-6rwjc-bootstrap | ACTIVE | preserve-wjosp0819b-6rwjc-openshift=192.168.0.14, 10.0.79.162 |       | m1.large |
+--------------------------------------+-------------------------------------+--------+---------------------------------------------------------------+-------+----------+
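To make the address choice in the listing above concrete, here is a minimal sketch that picks, from a `Networks` value, the address the installer host could actually reach. It is not the installer's logic: the helper names are hypothetical, and the machine network `192.168.0.0/16` is an assumption made for illustration (the report does not state the CIDR). The floating IP is taken to be the first address outside that network.

```python
import ipaddress


def parse_networks(field):
    """Split 'netname=addr1, addr2' into (network name, [addresses])."""
    name, _, addrs = field.partition("=")
    parsed = [ipaddress.ip_address(a.strip())
              for a in addrs.split(",") if a.strip()]
    return name.strip(), parsed


def pick_floating_ip(field, machine_cidr):
    """Return the first address outside machine_cidr as a string, or None."""
    network = ipaddress.ip_network(machine_cidr)
    _, addresses = parse_networks(field)
    for addr in addresses:
        if addr not in network:
            return str(addr)
    return None


if __name__ == "__main__":
    # Values copied from the server list; the CIDR is an assumed example.
    bootstrap = "preserve-wjosp0819b-6rwjc-openshift=192.168.0.14, 10.0.79.162"
    master = "preserve-wjosp0819b-6rwjc-openshift=192.168.0.25"
    print(pick_floating_ip(bootstrap, "192.168.0.0/16"))  # 10.0.79.162
    print(pick_floating_ip(master, "192.168.0.0/16"))     # None
```

With these inputs only the bootstrap machine yields an address, which matches the listing: it is the only server with a second address attached.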
Should be resolved by: https://github.com/openshift/installer/pull/2212
There is a regression caused by https://github.com/openshift/installer/pull/2128
The bootstrap_fip resource was moved from the topology module to the bootstrap module, so it can no longer be found.
The fix for the regression is under review: https://github.com/openshift/installer/pull/2256
Verified on 4.2.0-0.nightly-2019-08-25-233755:

$ ./openshift-install gather bootstrap --log-level debug
DEBUG OpenShift Installer v4.2.0-201908251340-dirty
DEBUG Built from commit c2e6b0afd7f33ae0125d1ac96f3948919748ffc5
DEBUG Fetching "Install Config"...
DEBUG Loading "Install Config"...
DEBUG Loading "SSH Key"...
DEBUG Loading "Base Domain"...
DEBUG Loading "Platform"...
DEBUG Loading "Cluster Name"...
DEBUG Loading "Base Domain"...
DEBUG Loading "Pull Secret"...
DEBUG Loading "Platform"...
DEBUG Using "Install Config" loaded from state file
DEBUG Reusing previously-fetched "Install Config"
INFO Pulling debug logs from the bootstrap machine
DEBUG Gathering bootstrap journals ...
DEBUG Gathering bootstrap containers ...
DEBUG Gathering rendered assets...
DEBUG Gathering cluster resources ...
DEBUG error: the server doesn't have a resource type "pods"
DEBUG error: the server doesn't have a resource type "nodes"
DEBUG error: the server doesn't have a resource type "nodes"
DEBUG error: the server doesn't have a resource type "apiservices"
DEBUG error: the server doesn't have a resource type "pods"
DEBUG error: the server doesn't have a resource type "clusterversion"
DEBUG error: the server doesn't have a resource type "clusteroperators"
DEBUG Waiting for logs ...
DEBUG error: the server doesn't have a resource type "csr"
DEBUG error: the server doesn't have a resource type "configmaps"
DEBUG error: the server doesn't have a resource type "kubecontrollermanager"
DEBUG error: the server doesn't have a resource type "kubeapiserver"
DEBUG error: the server doesn't have a resource type "events"
DEBUG error: the server doesn't have a resource type "endpoints"
DEBUG error: the server doesn't have a resource type "machineconfigpools"
DEBUG error: the server doesn't have a resource type "machineconfigs"
DEBUG error: the server doesn't have a resource type "namespaces"
DEBUG error: the server doesn't have a resource type "nodes"
DEBUG error: the server doesn't have a resource type "openshiftapiserver"
DEBUG error: the server doesn't have a resource type "pods"
DEBUG error: the server doesn't have a resource type "secrets"
DEBUG error: the server doesn't have a resource type "roles"
DEBUG error: the server doesn't have a resource type "services"
DEBUG error: the server doesn't have a resource type "rolebindings"
DEBUG error: the server doesn't have a resource type "secrets"
DEBUG Error from server (NotFound): the server could not find the requested resource
DEBUG Gather remote logs
DEBUG Collecting info from 192.168.0.16
DEBUG Warning: Permanently added '192.168.0.16' (ECDSA) to the list of known hosts.
DEBUG Gathering master journals ...
DEBUG Gathering master containers ...
DEBUG Waiting for logs ...
DEBUG Collecting info from 192.168.0.14
DEBUG Warning: Permanently added '192.168.0.14' (ECDSA) to the list of known hosts.
DEBUG Gathering master journals ...
DEBUG Gathering master containers ...
DEBUG Waiting for logs ...
DEBUG Collecting info from 192.168.0.24
DEBUG Warning: Permanently added '192.168.0.24' (ECDSA) to the list of known hosts.
DEBUG Gathering master journals ...
DEBUG Gathering master containers ...
DEBUG Waiting for logs ...
DEBUG Log bundle written to ~/log-bundle.tar.gz
INFO Bootstrap gather logs captured here "log-bundle-20190826110214.tar.gz"
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922