Bug 1871795 - Cannot SSH to master VMs using core user
Summary: Cannot SSH to master VMs using core user
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Benjamin Gilbert
QA Contact: Michael Nguyen
URL:
Whiteboard:
Duplicates: 1871789 1872127
Depends On: 1868062
Blocks:
 
Reported: 2020-08-24 10:23 UTC by Jan Zmeskal
Modified: 2020-10-27 16:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1874656
Environment:
Last Closed: 2020-10-27 16:30:45 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 4095 0 None closed Bug 1871795: bump RHCOS images to fix SSH authentication 2021-01-22 13:58:44 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:30:58 UTC

Description Jan Zmeskal 2020-08-24 10:23:52 UTC
Description of problem:
When master machines are created by the installer, it should be possible to SSH to them using `ssh core@<master_ip>`. However, this does not work for me with the current nightly build of OCP 4.6. At the same time, I can SSH to the bootstrap machine using the same method.

Version-Release number of the following components:
openshift-install: 4.6.0-0.nightly-2020-08-24-034934
RHV: 4.3.11.2-0.1.el7

How reproducible:
100 % (I tried it twice on different RHV environments)

Steps to Reproduce:
1. Prepare install-config.yaml so that it contains your public SSH key. My install-config: http://pastebin.test.redhat.com/895896
2. Run the installer
3. Once the master VMs finish their ignition stage, open the console of one of the masters in RHV and take note of its IP address.
4. Try to SSH to the master VM. I even tried explicitly providing my identity file to the ssh command:
ssh -i ~/.ssh/id_rsa core@<master_ip_address>
This is what I got though: http://pastebin.test.redhat.com/895904

Additional info:
I've been using this install-config.yaml for the past several weeks and never had a problem SSHing to master VMs before.
Also, it's possible for me to SSH to the bootstrap VM. If I made some configuration mistake (e.g. typo in public SSH key in install-config.yaml), I shouldn't be able to SSH to the bootstrap machine.

Comment 1 Evgeny Slutsky 2020-08-24 10:39:44 UTC
Did the installation complete successfully?

Comment 2 Jan Zmeskal 2020-08-24 10:45:12 UTC
(In reply to Evgeny Slutsky from comment #1)
> Did the installation complete successfully?

Actually, it did not. For some reason the image-registry operator failed to come up, but I believe this is not relevant. Public SSH key(s) should be copied to master VMs during the ignition stage. One of the reasons for this is to allow debugging of failed installations.

Comment 3 Evgeny Slutsky 2020-08-24 11:27:19 UTC
We can also see this issue in our CI environment.

Comment 4 Evgeny Slutsky 2020-08-25 07:05:11 UTC
Looks like an RHCOS issue.

Comment 5 Benjamin Gilbert 2020-08-25 11:49:18 UTC
*** Bug 1872127 has been marked as a duplicate of this bug. ***

Comment 6 Benjamin Gilbert 2020-08-25 11:50:34 UTC
This is a consequence of the changes for bug 1868062 and will be fixed by a bootimage update.

Comment 7 Lubov 2020-08-25 12:07:52 UTC
We have the same problem with OCP 4.6 even when the deployment finished successfully.
We cannot access either masters or workers via SSH.

Comment 8 Seth Jennings 2020-08-25 17:27:43 UTC
Workaround for others hitting this:

oc debug node/<nodeName>
chroot /host
cd /var/home/core/.ssh
cp authorized_keys.d/ignition authorized_keys
chown core:core authorized_keys

Comment 11 Sandro Bonazzola 2020-09-01 11:42:55 UTC
*** Bug 1873014 has been marked as a duplicate of this bug. ***

Comment 12 Sandro Bonazzola 2020-09-01 11:44:27 UTC
*** Bug 1871789 has been marked as a duplicate of this bug. ***

Comment 13 Micah Abbott 2020-09-19 19:44:39 UTC
Verified in 4.6.0-0.nightly-2020-09-19-060512

```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-09-19-060512   True        False         93m     Cluster version is 4.6.0-0.nightly-2020-09-19-060512

$ oc debug node/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7 -- chroot /host cat /var/home/core/.ssh/authorized_keys                                                                                                                                   
Starting pod/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7-debug ...                                                                                                                                                                                                                                                               
To use host binaries, run `chroot /host`                                                                                                                                                                                                                                                                                      
ssh-rsa AAAAB3NzaC1y.....
                                                                               
                                                                               
Removing debug pod ...                                
                                                                                                                   
$ cat update-ssh-worker.yaml                                                        
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig  
metadata:         
  labels:                       
    machineconfiguration.openshift.io/role: worker
  name: 99-new-worker-sshkey                  
spec:                  
  config:                                             
    ignition:                                                         
      version: 3.1.0                                                        
    passwd:                                                                                                                                                    
      users:                                                              
      - name: core                              
        sshAuthorizedKeys:                                                 
        - |                       
          ssh-ed25519 AAAAC.....

$ oc apply -f update-ssh-worker.yaml                                                                                                                                                                                                               
machineconfig.machineconfiguration.openshift.io/99-new-worker-sshkey created

$ oc get mc                                                                         
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE                                            
00-master                                          c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                           
00-worker                                          c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                           
01-master-container-runtime                        c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                                                                                                                                                                                          
01-master-kubelet                                  c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m    
01-worker-container-runtime                        c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                           
01-worker-kubelet                                  c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                           
70-multi-kargs                                                                                                  25m                                                                                                                                                                                                           
99-master-generated-registries                     c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                           
99-master-ssh                                                                                 3.1.0             117m                                           
99-new-worker-sshkey                                                                          3.1.0             6s                                             
99-worker-generated-registries                     c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                           
99-worker-ssh                                                                                 3.1.0             117m                                           
rendered-master-9d39db7fc2ec3a03099836ae174057df   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m                                                                                                                                                                                                          
rendered-worker-27733f7362bcf053ebffdd905ae1ccff   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             1s
rendered-worker-c357fae6e3fdfa250b30478995e1fb05   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             15m
rendered-worker-e6c79d53ce9a19fa5793a06663af7c76   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             25m 
rendered-worker-ff560ececef24a9e8da6f01097187105   c08c048584ef0bf18ab2dd88fdddd93279e1c6a1   3.1.0             110m

$ oc debug node/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7 -- chroot /host cat /var/home/core/.ssh/authorized_keys                                                                                                                                   
Starting pod/ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7-debug ...                                                                                                
To use host binaries, run `chroot /host`                                                                                                                       
ssh-ed25519 AAAAC3....                                   
                                                                                                                                                               
ssh-rsa AAAAB3NzaC1yc....
                                                                                                                                                               
                                                                                                                                                               
Removing debug pod ...                                                                          

$ oc get nodes -o wide
NAME                                       STATUS   ROLES    AGE    VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
ci-ln-j3tbpx2-f76d1-lrx8m-master-0         Ready    master   120m   v1.19.0+7f9e863   10.0.0.3      <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-master-1         Ready    master   119m   v1.19.0+7f9e863   10.0.0.5      <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-master-2         Ready    master   120m   v1.19.0+7f9e863   10.0.0.2      <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7   Ready    worker   109m   v1.19.0+7f9e863   10.0.32.2     <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-worker-c-pvg2x   Ready    worker   109m   v1.19.0+7f9e863   10.0.32.3     <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ci-ln-j3tbpx2-f76d1-lrx8m-worker-d-zs8jq   Ready    worker   109m   v1.19.0+7f9e863   10.0.32.4     <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8


$ oc debug node/ci-ln-j3tbpx2-f76d1-lrx8m-worker-c-pvg2x
Starting pod/ci-ln-j3tbpx2-f76d1-lrx8m-worker-c-pvg2x-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.32.3
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# cat .ssh/id_ed25519 
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1r.....
-----END OPENSSH PRIVATE KEY-----
sh-4.4# ssh -l core -i /root/.ssh/id_ed25519 10.0.32.2
Red Hat Enterprise Linux CoreOS 46.82.202009182140-0
  Part of OpenShift 4.6, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.6/architecture/architecture-rhcos.html

---
[core@ci-ln-j3tbpx2-f76d1-lrx8m-worker-b-25cc7 ~]$ exit
logout
Connection to 10.0.32.2 closed.
sh-4.4# exit
exit
sh-4.4# exit
exit

Removing debug pod ...
```
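
The verification above checks by eye that the expected keys show up in /var/home/core/.ssh/authorized_keys. A scripted version of that check could look like the sketch below. The function name `key_installed` is hypothetical, and the match is a plain fixed-string search on the key type and base64 blob, which is an assumption about what counts as "the same key" (the trailing comment is ignored).

```shell
#!/bin/sh
# key_installed PUBKEY_FILE
# Reads authorized_keys content on stdin and succeeds (exit 0) if the public
# key stored in PUBKEY_FILE is listed there. Exit 1 means not found, exit 2
# means PUBKEY_FILE could not be read or contained no key.
# Hypothetical helper, not part of oc or OpenSSH.
key_installed() {
    pubkey_file="$1"

    # Take the first two fields of the public key: key type and base64 blob.
    wanted=$(awk 'NF >= 2 { print $1 " " $2; exit }' "$pubkey_file") || return 2
    [ -n "$wanted" ] || return 2

    # Fixed-string search, so extra lines printed by `oc debug` are harmless.
    grep -qF -- "$wanted"
}
```

Assuming the local public key lives in ~/.ssh/id_ed25519.pub, it could be combined with the command used in the verification: `oc debug node/<nodeName> -- chroot /host cat /var/home/core/.ssh/authorized_keys | key_installed ~/.ssh/id_ed25519.pub && echo present`.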

Comment 15 errata-xmlrpc 2020-10-27 16:30:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

