Bug 1868062 - New key written to authorized_keys instead of ignition allowing user to continue to ssh into machines with old key
Summary: New key written to authorized_keys instead of ignition allowing user to conti...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.6
Hardware: Unspecified
OS: Unspecified
urgent
high
Target Milestone: ---
: 4.6.0
Assignee: Benjamin Gilbert
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 1871795
TreeView+ depends on / blocked
 
Reported: 2020-08-11 14:39 UTC by To Hung Sze
Modified: 2020-10-27 16:27 UTC (History)
13 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:27:31 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 4095 0 None closed Bug 1871795: bump RHCOS images to fix SSH authentication 2021-01-18 22:45:51 UTC
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:27:50 UTC

Description To Hung Sze 2020-08-11 14:39:35 UTC
Description of problem:
After replacing key in masters in an OpenShift cluster, user can still ssh into the masters with old key.

Version-Release number of selected component (if applicable):
4.6

How reproducible: Always

Steps to reproduce:
Create a cluster
Generate a new key pair

$oc get machineconfig 99-master-ssh -o yaml > update-ssh-master.yaml

edit the yaml and replace the key under sshAuthorizedKeys with new key

$oc apply -f update-ssh-master.yaml

ssh into the master with NEW key - success

Actual results:
ssh into the master with OLD key - still successful


Expected results:
ssh into the master with OLD key - should fail


Additional info:
On masters, .ssh/authorized_keys.d/ignition file contains the old key.

It was pointed out that 
https://github.com/openshift/machine-config-operator/blob/e0e622950e56/pkg/daemon/update.go#L1318 shows that the new key is written to authorized_keys instead of authorized_keys.d/ignition.

Comment 1 Benjamin Gilbert 2020-08-11 14:46:05 UTC
Ignition writes SSH keys to .ssh/authorized_keys.d/ignition, not .ssh/authorized_keys.  On RHCOS, sshd is configured with that path as an additional AuthorizedKeysFile.  But the MCO updates .ssh/authorized_keys instead.  The result is that SSH keys initially provisioned via Ignition cannot be removed with the MCO.

It might make sense for MCO to manage .ssh/authorized_keys.d/ignition instead of authorized_keys, but an automatic migration would be needed.

Comment 2 Benjamin Gilbert 2020-08-11 16:49:07 UTC
I completely forgot about the behavior change in Ignition 2.x (spec 3).  The problem is limited to OpenShift 4.6, and previous versions were fine.

In Ignition 0.x (spec 2), Ignition wrote keys to .ssh/authorized_keys.d/coreos-ignition and then synced them to ~/.ssh/authorized_keys.  The operating system didn't look at .ssh/authorized_keys.d/coreos-ignition, so it was harmless for the MCO to update authorized_keys directly.

Ignition 2.x no longer updates authorized_keys.  Instead, it relies on sshd to read .ssh/authorized_keys.d/ignition directly.

We'll need to deduplicate our use of these two files.  Assuming we don't want to change the historical practice of the MCO managing authorized_keys, I think it'd be fine for the MCO to delete .ssh/authorized_keys.d/ignition after merging its contents into authorized_keys.  (Or, equivalently, after first reconciling the MachineConfig?)

RHCOS currently special-cases .ssh/authorized_keys.d/ignition and doesn't read other files in authorized_keys.d, including coreos-ignition.  FCOS has tooling which automatically reads _every_ file in authorized_keys.d.  There are no current plans to include that tooling in RHCOS, but as a precaution, it might be safer for the MCO to delete .ssh/authorized_keys.d/coreos-ignition as well.

Comment 4 Benjamin Gilbert 2020-08-13 14:47:48 UTC
Filip, I think that's probably a separate issue.  RHCOS reads SSH public keys out of both ~/.ssh/authorized_keys and ~/.ssh/authorized_keys.d/ignition.

Comment 6 Colin Walters 2020-08-17 14:12:26 UTC
Why is this a private bug and why are people creating private comments?  I don't see anything confidental here.
I think this is something in Bugzilla defaulting to private bugs.  Does anyone object to me un-marking it with Group: redhat?

I do think this is an important bug, and actually a security issue in some cases so this should be a 4.6 blocker.

One option is to change Ignition on RHCOS to write to the old location for now; it seems simpler than changing the MCO to effectively undo what Ignition is doing, but that path also seems fine.

(We have messy upgrade issues here too because we need to handle the case where a cluster was installed with < 4.6, then upgraded and where the keys are still in the main file and not the ignition file, if we choose to keep it)

Comment 7 Antonio Murdaca 2020-08-17 14:18:47 UTC
(In reply to Colin Walters from comment #6)
> Why is this a private bug and why are people creating private comments?  I
> don't see anything confidental here.
> I think this is something in Bugzilla defaulting to private bugs.  Does
> anyone object to me un-marking it with Group: redhat?
> 
> I do think this is an important bug, and actually a security issue in some
> cases so this should be a 4.6 blocker.
> 
> One option is to change Ignition on RHCOS to write to the old location for
> now; it seems simpler than changing the MCO to effectively undo what
> Ignition is doing, but that path also seems fine.

I also think rolling back ignition till we implement a better migration strategy (which I concur is feasible) would be desiderable for the MCO

> 
> (We have messy upgrade issues here too because we need to handle the case
> where a cluster was installed with < 4.6, then upgraded and where the keys
> are still in the main file and not the ignition file, if we choose to keep
> it)

Comment 8 Colin Walters 2020-08-18 16:44:39 UTC
I'm unilaterally lifting the private flag.

Comment 9 Colin Walters 2020-08-18 17:41:31 UTC
Internal discussion resulted in this:

Option 1: Revert just RHCOS:
===

Revert back to writing `authorized_keys` for RHCOS for now.

+ Fixes the immediate problem
- Doesn't help OKD/FCOS

Option 2: MCO merge:
===

Patch the MCO to work with both cases and to handle upgrades:

MCO deletes `.ssh/authorized_keys.d/ignition` after merging its contents into `authorized_keys`.

+ Better fix
- More work and lots going on in MCO

Also MCO writes /etc/ssh/sshd_config.d/01-disable-authorized_keys.d.conf to disable Afterburn SSH keys for OKD.  RHCOS will ignore it.


Option 3: "Belt and Suspenders" (both option 1 and option 2)
===

RHCOS wouldn’t have to change, but in OKD+FCOS we could test the changes out.
Basically do both MCO work _and_ have Ignition build in RHCOS with the flag to disable authorized_keys.d.

+ Lets us try out the change in OKD first
- More work (but not much)

Option 4: Revert writing the `/ignition file` in FCOS *and* RHCOS
===

Luca points out this is documented in the spec: https://github.com/coreos/ignition/blob/58f67d61975c799cc9322aa07b044099b52d5837/doc/configuration-v3_0.md

(I hope no one really argues this)

Comment 10 Benjamin Gilbert 2020-08-20 02:07:01 UTC
Colin, re comment 6, I recommended to the original reporter that they file a private bug until the security consequences could be evaluated.  I hadn't realized at that time that the problem only affected 4.6.

Comment 11 Benjamin Gilbert 2020-08-20 19:11:48 UTC
We've decided on option 1 for now, with the intention to revisit after 4.6, and without precluding the issue from being addressed in OKD as time permits.  Until that happens, we should document for OKD that install-time SSH keys cannot be removed via a MachineConfig.

Comment 14 Benjamin Gilbert 2020-08-25 11:45:16 UTC
This needs a bootimage update.  Otherwise Ignition will write to ~/.ssh/authorized_keys.d/ignition, and then we'll immediately update into a new RHCOS release that won't read that file.

Comment 16 Micah Abbott 2020-09-08 17:35:49 UTC
Verified with 4.6.0-fc.4 

```
$ oc get clusterversion                                                                       
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS                                                                               
version   4.6.0-fc.4   True        False         163m    Cluster version is 4.6.0-fc.4                                                        

$ oc get nodes                                                                                
NAME                           STATUS   ROLES    AGE    VERSION                                                                               
ip-10-0-134-86.ec2.internal    Ready    master   3h3m   v1.19.0-rc.2+514f31a                                                                  
ip-10-0-148-100.ec2.internal   Ready    worker   173m   v1.19.0-rc.2+514f31a                                                                  
ip-10-0-150-160.ec2.internal   Ready    master   3h5m   v1.19.0-rc.2+514f31a                                                                  
ip-10-0-180-51.ec2.internal    Ready    worker   173m   v1.19.0-rc.2+514f31a                                                                                                                                                                                                                
ip-10-0-218-54.ec2.internal    Ready    worker   173m   v1.19.0-rc.2+514f31a                                                                                                                                                                                                                
ip-10-0-223-205.ec2.internal   Ready    master   3h4m   v1.19.0-rc.2+514f31a                                                                                                                                                                                                                

$ oc get mc                                                                                   
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE                           
00-master                                          130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                          
00-worker                                          130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                          
01-master-container-runtime                        130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                                                                                                                                                                        
01-master-kubelet                                  130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                                                                                                                                                                        
01-worker-container-runtime                        130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                                                                                                                                                                        
01-worker-kubelet                                  130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                                                                                                                                                                        
99-master-generated-registries                     130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                                                                                                                                                                        
99-master-ssh                                                                                 3.1.0             3h9m
99-worker-generated-registries                     130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m
99-worker-ssh                                                                                 3.1.0             3h9m                                                                                                                                                                        rendered-master-b0985b6c15f0f96c125e32b7a5b1a081   130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                                                                                                                                                                        rendered-worker-27ca4280a07e7e99a33a3347cfff1f6b   130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h3m                                                                                                                                                                        
rendered-worker-8daaa44d8998b182fd082a2f3dac2feb   130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             123m
zz-test                                                                                                         123m


$ oc debug node/ip-10-0-134-86.ec2.internal -- chroot /host cat /var/home/core/.ssh/authorized_keys
Starting pod/ip-10-0-134-86ec2internal-debug ...
To use host binaries, run `chroot /host`                                                                                                      
ssh-rsa AAAAB3Nza....
                                                                                                                                                                                                                                                                                            
Removing debug pod ...                          


$ cat update-ssh-master.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-new-master-sshkey
spec:
  config:
    ignition:
      version: 3.1.0
    passwd:
      users:
      - name: core
        sshAuthorizedKeys:
        - |
          ssh-ed25519 AAAAC3....

$ oc apply -f update-ssh-master.yaml 
machineconfig.machineconfiguration.openshift.io/99-new-master-sshkey created

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
00-worker                                          130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
01-master-container-runtime                        130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
01-master-kubelet                                  130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
01-worker-container-runtime                        130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
01-worker-kubelet                                  130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
99-master-generated-registries                     130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
99-master-ssh                                                                                 3.1.0             3h13m
99-new-master-sshkey                                                                          3.1.0             14s
99-worker-generated-registries                     130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
99-worker-ssh                                                                                 3.1.0             3h13m
rendered-master-14f4c9b8751656e373cce056401dbf98   130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             9s
rendered-master-b0985b6c15f0f96c125e32b7a5b1a081   130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
rendered-worker-27ca4280a07e7e99a33a3347cfff1f6b   130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             3h7m
rendered-worker-8daaa44d8998b182fd082a2f3dac2feb   130947243313dcfa8a4f0ef487f458f923df1128   3.1.0             128m
zz-test                                                                                                         128m

$ oc debug node/ip-10-0-134-86.ec2.internal -- chroot /host cat /var/home/core/.ssh/authorized_keys
Starting pod/ip-10-0-134-86ec2internal-debug ...
To use host binaries, run `chroot /host`
ssh-rsa AAAAB3NzaC1y....

ssh-ed25519 AAAAC3Nza....


Removing debug pod ...
```

Comment 17 Benjamin Gilbert 2020-09-24 22:15:45 UTC
https://github.com/openshift/machine-config-operator/pull/2087 proposes to modify the MCO to disable reading authorized_keys.d fragments on FCOS.  RHCOS and OKD behavior would then remain as in option 1 (reading and writing only authorized_keys) for the long term.  The tradeoffs with this approach are described in https://github.com/openshift/machine-config-operator/pull/2087#pullrequestreview-489234956.

Comment 19 errata-xmlrpc 2020-10-27 16:27:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196


Note You need to log in before you can comment on or make changes to this bug.