1. Ran an install that failed.
2. SSH gather from bootstrap failed with: ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: dial tcp 10.0.0.7:22: connect: operation timed out
3. I have an SSH key in combined .pem format (public + private key), but the installer appeared not to read it.
I believe the installer method https://github.com/openshift/installer/blob/master/pkg/gather/ssh/ssh.go#L116 should also attempt to read PEM-encoded private keys. PEM-encoded keys aren't uncommon, and most engineers and developers on the team use them, so supporting them would increase team efficiency.
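To illustrate the request above, here is a minimal sketch of how a combined .pem file (public material followed by a private key) could be scanned for its private key block using only the Go standard library. This is not the installer's code; the function name `findPrivateKeyBlock` is hypothetical, and matching on a block type ending in "PRIVATE KEY" is an assumption about how such files are laid out.

```go
package main

import (
	"encoding/pem"
	"errors"
	"strings"
)

// findPrivateKeyBlock (hypothetical helper) scans PEM data that may hold
// several blocks, for example a public key followed by a private key, and
// returns the first block whose type ends in "PRIVATE KEY".
func findPrivateKeyBlock(data []byte) (*pem.Block, error) {
	rest := data
	for {
		var block *pem.Block
		// pem.Decode returns the next block and the remaining input;
		// block is nil once no further PEM block can be found.
		block, rest = pem.Decode(rest)
		if block == nil {
			return nil, errors.New("no PEM private key block found")
		}
		if strings.HasSuffix(block.Type, "PRIVATE KEY") {
			return block, nil
		}
	}
}
```

A caller could re-encode the returned block with `pem.EncodeToMemory` and hand it to a key parser such as `ssh.ParsePrivateKey` from golang.org/x/crypto/ssh. Whether the installer should do this is exactly the open question in this report.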
This currently seems more like a feature request than a bug, so it is better tracked in JIRA, IMO.
It's not proper to blindly add keys to existing agents; we will only update the error output to be clearer.
The following is the error seen when the SSH key is neither specified with --key nor loaded into the keyring. It may be helpful to display the key-specific error only when authentication fails, as opposed to on a connection timeout as in the original error above. ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
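The suggestion above, showing the key hint only for authentication failures, could be sketched roughly as follows. This is an assumption-laden illustration, not the fix that shipped: `hintFor` and `sampleErrors` are hypothetical names, the timeout hint wording is invented, and matching on the "unable to authenticate" substring is a heuristic based on the error text quoted in this report.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"strings"
)

// hintFor (hypothetical) returns extra guidance for a failed SSH connection.
// Only authentication failures get the key hint; timeouts get a network hint,
// and anything else gets no hint at all.
func hintFor(err error) string {
	var netErr net.Error
	switch {
	case err == nil:
		return ""
	case errors.As(err, &netErr) && netErr.Timeout():
		// Invented wording, for illustration only.
		return "bootstrap host did not respond; check the address and access to port 22"
	case strings.Contains(err.Error(), "unable to authenticate"):
		// Heuristic: relies on the message text quoted in this report.
		return "ensure the proper ssh key is in your keyring or specify with --key"
	default:
		return ""
	}
}

// sampleErrors builds stand-ins for the two failures quoted in this report
// so the classifier can be exercised without touching the network.
func sampleErrors() (timeout, auth error) {
	timeout = fmt.Errorf("failed to create SSH client: %w",
		&net.OpError{Op: "dial", Net: "tcp", Err: os.ErrDeadlineExceeded})
	auth = errors.New("ssh: handshake failed: ssh: unable to authenticate, " +
		"attempted methods [none publickey], no supported methods remain")
	return timeout, auth
}
```

Calling `hintFor` on the two values returned by `sampleErrors` yields the network hint for the timeout and the key hint for the authentication failure, which is the separation proposed in the comment above.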
I'd like more information about how you reproduced the error "dial tcp 10.0.0.7:22: connect: operation timed out". Do I need to block port 22 in the ingress rules of some nodes? If so, which nodes? Or is there another way? Can you give me more detail?
It has been fixed. Test payload: 4.4.0-0.nightly-2020-02-18-211831. The log contains "failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key" only when there is no key; otherwise, it won't contain this message.
Looks like you found a way to reproduce this. You can replicate an SSH 'operation timed out' by using a bootstrap IP of a non-existing host such as 192.168.2.1 in the following example: openshift-install gather bootstrap --bootstrap 192.168.2.1 --master 192.168.2.1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581