Description of problem:
When master machines are created by the installer, it should be possible to SSH to them using `ssh core@<master_ip>`. However, this does not work for me with the current nightly build of OCP 4.6. At the same time, I can SSH to the bootstrap machine using the same method.

Version-Release number of the following components:
openshift-install: 4.6.0-0.nightly-2020-08-24-034934
RHV: 4.3.11.2-0.1.el7

How reproducible:
100 % (I tried it twice on different RHV environments)

Steps to Reproduce:
1. Prepare install-config.yaml so that it contains your public SSH key (see the sketch below for where the key goes). My install-config: http://pastebin.test.redhat.com/895896
2. Run the installer.
3. Once the master VMs finish their ignition stage, open the console of one of the masters in RHV and take note of its IP address.
4. Try to SSH to the master VM. I even tried explicitly providing my identity file to the ssh command: ssh -i ~/.ssh/id_rsa core@<master_ip_address>

This is what I got though: http://pastebin.test.redhat.com/895904

Additional info:
I've been using this install-config.yaml for the past several weeks and never before had a problem SSHing to master VMs. Also, it is possible for me to SSH to the bootstrap VM. If I had made some configuration mistake (e.g. a typo in the public SSH key in install-config.yaml), I shouldn't have been able to SSH to the bootstrap machine either.
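For reference, since the pastebin above may not be reachable: the public key is supplied through the top-level `sshKey` field of install-config.yaml. A minimal sketch with placeholder values (not the actual config from this report):

```
# install-config.yaml fragment; cluster name and key value are placeholders
apiVersion: v1
metadata:
  name: mycluster
sshKey: 'ssh-rsa AAAAB3NzaC1yc2E... user@workstation'
```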
Did the installation complete successfully?
(In reply to Evgeny Slutsky from comment #1)
> Did the installation complete successfully?

Actually, it did not. For some reason the image-registry operator failed to come up, but I believe this is not relevant. Public SSH key(s) should be copied to the master VMs during the ignition stage; one of the reasons for this is to allow debugging failed installations.
We can also see this issue in our CI env.
Looks like an RHCOS issue.
*** Bug 1872127 has been marked as a duplicate of this bug. ***
This is a consequence of the changes for bug 1868062 and will be fixed by a bootimage update.
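A quick way to check whether a node has hit this (a diagnostic sketch using only standard commands, run from the VM console or from an `oc debug` shell after `chroot /host`):

```
# on an affected boot image the Ignition-provided key exists only in the
# authorized_keys.d fragment directory, not in authorized_keys itself
ls -l /var/home/core/.ssh/ /var/home/core/.ssh/authorized_keys.d/
```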
We have the same problem with OCP 4.6 even when the deployment finished successfully. We cannot access either masters or workers via SSH.
Workaround for others hitting this:

```
oc debug node/<nodeName>
chroot /host
cd /var/home/core/.ssh
cp authorized_keys.d/ignition authorized_keys
chown core:core authorized_keys
```
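A sketch of the same workaround applied non-interactively across all nodes (assumes a logged-in `oc` client and that a debug pod can be scheduled on every node; untested, adjust as needed):

```
# copy the Ignition key fragment into authorized_keys on every node
for node in $(oc get nodes -o name); do
  oc debug "$node" -- chroot /host sh -c \
    'cp /var/home/core/.ssh/authorized_keys.d/ignition /var/home/core/.ssh/authorized_keys && chown core:core /var/home/core/.ssh/authorized_keys'
done
```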
*** Bug 1873014 has been marked as a duplicate of this bug. ***
*** Bug 1871789 has been marked as a duplicate of this bug. ***
Verified in 4.6.0-0.nightly-2020-09-19-060512

```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-19-060512   True        False         93m     Cluster version is 4.6.0-0.nightly-2020-09-19-060512

$ oc debug node/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7 -- chroot /host cat /var/home/core/.ssh/authorized_keys
Starting pod/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7-debug ...
To use host binaries, run `chroot /host`
ssh-rsa AAAAB3NzaC1y.....
Removing debug pod ...

$ cat update-ssh-worker.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-new-worker-sshkey
spec:
  config:
    ignition:
      version: 3.1.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
        - |
          ssh-ed25519 AAAAC.....

$ oc apply -f update-ssh-worker.yaml
machineconfig.machineconfiguration.openshift.io/99-new-worker-sshkey created

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
00-worker                                          c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
01-master-container-runtime                        c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
01-master-kubelet                                  c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
01-worker-container-runtime                        c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
01-worker-kubelet                                  c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
70-multi-kargs                                                                                                 25m
99-master-generated-registries                     c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
99-master-ssh                                                                                 3.1.0             117m
99-new-worker-sshkey                                                                          3.1.0             6s
99-worker-generated-registries                     c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
99-worker-ssh                                                                                 3.1.0             117m
rendered-master-9d39db7fc2ec3a03099836ae174057df   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m
rendered-worker-27733f7362bcf053ebffdd905ae1ccff   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             1s
rendered-worker-c357fae6e3fdfa250b30478995e1fb05   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             15m
rendered-worker-e6c79d53ce9a19fa5793a06663af7c76   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             25m
rendered-worker-ff560ececef24a9e8da6f01097187105   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m

$ oc debug node/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7 -- chroot /host cat /var/home/core/.ssh/authorized_keys
Starting pod/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7-debug ...
To use host binaries, run `chroot /host`
ssh-ed25519 AAAAC3....
ssh-rsa AAAAB3NzaC1yc....
Removing debug pod ...

$ oc get nodes -o wide
NAME                                       STATUS   ROLES    AGE    VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
ci-ln-j3tbpx2-f76d1-lrx8m-master-0         Ready    master   120m   v1.19.0+7f9e863   10.0.0.3      <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-master-1         Ready    master   119m   v1.19.0+7f9e863   10.0.0.5      <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-master-2         Ready    master   120m   v1.19.0+7f9e863   10.0.0.2      <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7   Ready    worker   109m   v1.19.0+7f9e863   10.0.32.2     <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-worker-c-pvg2x   Ready    worker   109m   v1.19.0+7f9e863   10.0.32.3     <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-worker-d-zs8jq   Ready    worker   109m   v1.19.0+7f9e863   10.0.32.4     <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8

$ oc debug node/ci-ln-j3tbpx2-f76d1-lrx8m-worker-c-pvg2x
Starting pod/ci-ln-j3tbpx2-f76d1-lrx8m-worker-c-pvg2x-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.32.3
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# cat .ssh/id_ed25519
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1r.....
-----END OPENSSH PRIVATE KEY-----
sh-4.4# ssh -l core -i /root/.ssh/id_ed25519 10.0.32.2
Red Hat Enterprise Linux CoreOS 46.82.202009182140-0
  Part of OpenShift 4.6, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.6/architecture/architecture-rhcos.html

---
[core@ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7 ~]$ exit
logout
Connection to 10.0.32.2 closed.
sh-4.4# exit
exit
sh-4.4# exit
exit
Removing debug pod ...
```
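One note on the verification flow above: the new key only lands on the node after the MachineConfigPool finishes rolling out the rendered config, so a sanity check between `oc apply` and the SSH test could look like this (standard `oc` commands; `worker` pool assumed):

```
$ oc get machineconfigpool worker
# wait until UPDATED=True and UPDATING=False before expecting the new key
# in /var/home/core/.ssh/authorized_keys
```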
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196