Bug 1639493 - old fernet tokens are not cleaned up on the controller
Summary: old fernet tokens are not cleaned up on the controller
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z1
Target Release: 14.0 (Rocky)
Assignee: Harry Rybacki
QA Contact: Jeremy Agee
URL:
Whiteboard:
Depends On:
Blocks: 1653970
Reported: 2018-10-15 21:23 UTC by Jeremy Agee
Modified: 2019-03-18 13:03 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-9.0.1-0.20181013060912.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1653970 (view as bug list)
Environment:
Last Closed: 2019-03-18 13:03:10 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 618604 0 'None' MERGED Update kolla_config to deal with keystone fernet key rotation 2021-01-29 19:36:39 UTC
OpenStack gerrit 626753 0 'None' MERGED Update kolla_config to deal with keystone fernet key rotation 2021-01-29 19:36:39 UTC
Red Hat Product Errata RHBA-2019:0446 0 None None None 2019-03-18 13:03:17 UTC

Description Jeremy Agee 2018-10-15 21:23:31 UTC
Description of problem:
Old fernet tokens are retained on the controller after mistral reaches max_active_keys=5 (the default).

Steps to Reproduce:
1. Deploy an undercloud, then deploy the overcloud, leaving the default setting ManageKeystoneFernetKeys: true

2. Rotate fernet tokens until the maximum number of keys is reached. At this point the original tokens 1 and 2 are no longer tracked by mistral.

(undercloud) [stack@undercloud-0 ~]$ openstack workflow execution create tripleo.fernet_keys.v1.rotate_fernet_keys '{"container": "overcloud"}'

(undercloud) [stack@undercloud-0 ~]$ mistral task-get-result d389ced4-4b55-4288-9f76-7a510fe30ac4

{
    "/etc/keystone/fernet-keys/0": {
        "content": "cVbVzTVtQLDzGCcfYR-58BhsqtbIy75g1GRZm969Uh8="
    }, 
    "/etc/keystone/fernet-keys/3": {
        "content": "zwkkJG5lr0e46RnXOqOz2xf7MFgrXARmhTsgWBlIZmI="
    }, 
    "/etc/keystone/fernet-keys/4": {
        "content": "1rriYXTcXcw_XnPEY8QlHjAib1JQxUoilYhHIBqv8qg="
    }, 
    "/etc/keystone/fernet-keys/5": {
        "content": "pKozY6zC1uYUormSP3RwqIknPD7R22crBjfmj0M_jvY="
    }, 
    "/etc/keystone/fernet-keys/6": {
        "content": "yH21T5dKCd08g1ZF_EcnElmMXPD3rfKkTzJoV1rj4Dc="
    }
}

Actual results:
The old untracked fernet tokens are retained on the controller and not cleaned up.
[root@controller-0 fernet-keys]# ls -la /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/
total 28
-rw-------. 1 42425 42425 44 Oct 15 16:09 0
-rw-------. 1 42425 42425 44 Oct 13 00:30 1
-rw-------. 1 42425 42425 44 Oct 13 02:18 2
-rw-------. 1 42425 42425 44 Oct 15 16:09 3
-rw-------. 1 42425 42425 44 Oct 15 16:09 4
-rw-------. 1 42425 42425 44 Oct 15 16:09 5
-rw-------. 1 42425 42425 44 Oct 15 16:09 6

Expected results:
The overcloud controller will only have the current fernet tokens managed by mistral from the undercloud.
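
The mismatch between the two listings above can be checked mechanically. The following is a minimal sketch, not part of any TripleO tooling: the function name untracked_keys is hypothetical, and the key contents are replaced with a placeholder. It takes the JSON printed by mistral task-get-result and the file names seen on the controller, and reports the files that mistral no longer tracks.

```python
import json


def untracked_keys(task_result_json, files_on_disk):
    """Return key file names present on disk but absent from the mistral result."""
    # The task result maps absolute paths such as /etc/keystone/fernet-keys/3
    # to their content; only the final path component (the key index) matters.
    tracked = {path.rsplit("/", 1)[-1] for path in json.loads(task_result_json)}
    return sorted(set(files_on_disk) - tracked, key=int)


# Key indexes taken from the report above; contents are placeholders.
task_result = json.dumps(
    {f"/etc/keystone/fernet-keys/{n}": {"content": "<redacted>"} for n in (0, 3, 4, 5, 6)}
)
on_disk = ["0", "1", "2", "3", "4", "5", "6"]

print(untracked_keys(task_result, on_disk))  # ['1', '2']
```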

Comment 1 Dougal Matthews 2018-10-17 10:25:36 UTC
I believe Fernet bugs are best handled by the Security DFG

Comment 2 Nathan Kinder 2018-11-14 22:21:20 UTC
(In reply to Jeremy Agee from comment #0)
>
> [root@controller-0 fernet-keys]# ls -la
> /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/
> total 28
> -rw-------. 1 42425 42425 44 Oct 15 16:09 0
> -rw-------. 1 42425 42425 44 Oct 13 00:30 1
> -rw-------. 1 42425 42425 44 Oct 13 02:18 2
> -rw-------. 1 42425 42425 44 Oct 15 16:09 3
> -rw-------. 1 42425 42425 44 Oct 15 16:09 4
> -rw-------. 1 42425 42425 44 Oct 15 16:09 5
> -rw-------. 1 42425 42425 44 Oct 15 16:09 6

There is a clue here with the timestamps.  On my OSP13 test deployment (where fernet rotation works fine), the timestamp for all of the keys is exactly the same.  The reason for this is that every time a rotation workflow is executed, all of the old keys are deleted on the controller before all of the current keys are copied down.  You can see this in the ansible playbook used for fernet rotation here:

  https://github.com/openstack/tripleo-common/blob/f7f2d33170a8d152bcde87211ff438fe4a16cad8/playbooks/rotate-keys.yaml#L17-L28

In your example above, keys 1 and 2 have a different date/time on them.  As you can see in the playbook, there should be an 'rm -rf' on the entire directory where fernet keys are kept during workflow execution.  It would be useful to see the output from the playbook.  You can get this by running a rotation task, waiting to let it complete, then running the following:

  openstack workflow execution output show <ID>
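
The timestamp clue can also be spotted automatically. This is a rough sketch under the assumption that the input is plain `ls -la` output in the format quoted above (nine whitespace-separated fields per regular file); the function name stale_keys is made up for illustration. It treats the most common timestamp as the time of the last rotation and flags every key file that carries a different one.

```python
from collections import Counter

LISTING = """\
total 28
-rw-------. 1 42425 42425 44 Oct 15 16:09 0
-rw-------. 1 42425 42425 44 Oct 13 00:30 1
-rw-------. 1 42425 42425 44 Oct 13 02:18 2
-rw-------. 1 42425 42425 44 Oct 15 16:09 3
-rw-------. 1 42425 42425 44 Oct 15 16:09 4
-rw-------. 1 42425 42425 44 Oct 15 16:09 5
-rw-------. 1 42425 42425 44 Oct 15 16:09 6
"""


def stale_keys(listing):
    """Return key names whose timestamp differs from the most common one."""
    stamps = {}
    for line in listing.splitlines():
        fields = line.split()
        # Keep only regular-file rows: mode, links, uid, gid, size,
        # month, day, time, name.
        if len(fields) != 9 or not fields[0].startswith("-"):
            continue
        stamps[fields[8]] = " ".join(fields[5:8])
    if not stamps:
        return []
    last_rotation, _ = Counter(stamps.values()).most_common(1)[0]
    return sorted((name for name, ts in stamps.items() if ts != last_rotation), key=int)


print(stale_keys(LISTING))  # ['1', '2']
```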

Comment 3 Nathan Kinder 2018-11-14 22:56:40 UTC
Additionally, it might be useful to increase the verbosity of the output that we get when executing the ansible playbook.  This can be done by updating the workbook definition used for fernet rotation.  This is done on the undercloud like so:

---------------------------------------------------------------------------
$ openstack workbook definition show tripleo.fernet_keys.v1 > /tmp/fernet-workbook
$ vi /tmp/fernet-workbook (change "verbosity" value for "deploy_keys" task to "5")
$ openstack workbook update /tmp/fernet-workbook
---------------------------------------------------------------------------

After the workbook is updated, run another fernet rotation workflow execution and wait for it to complete.  When it has finished, you can get the verbose output from the ansible playbook execution by running the following command:

  openstack workflow execution show <ID>

When troubleshooting is complete, you will want to reset the verbosity to the default of 0.

Comment 4 Nathan Kinder 2018-11-16 19:29:06 UTC
I have not got to the bottom of this just yet, but I have a few more clues about what is going on.

There are 2 areas used for fernet keys on the controller nodes:

  Initial keys - /var/lib/config-data/keystone/etc/keystone/fernet-keys
  Key repo     - /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys

If you look in the "initial keys" area, it will only have keys 0 and 1.  These are the initial keys that are generated when you first deploy the overcloud.  They will not change as a part of the fernet rotation workflow.

The key rotation workflow works in the "key repo" area.  If you trigger rotation enough to go past the configured number of max keys, you will find that this area works as you expect (you will have key 0, and max-1 other keys).  For example, my test environment (with the default max of 5 keys) after a few rotations has these keys:

  0, 7, 8, 9, 10.
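
For reference, the key numbering above can be reproduced with a small model of the rotation scheme.  This is only a sketch and not the workflow's actual code: it assumes the usual fernet rotation behaviour (the staged key 0 is promoted to the next highest index, a fresh key 0 is staged, and the lowest-numbered non-staged keys are purged once the maximum is exceeded), and the function name rotate is hypothetical.

```python
def rotate(keys, max_active_keys=5):
    """Model one rotation on a set of key indexes and return the new set."""
    promoted = max(keys) + 1
    # The staged key 0 becomes the new primary under the next index,
    # and a fresh staged key 0 takes its place.
    keys = (set(keys) - {0}) | {promoted, 0}
    # Purge the oldest non-staged keys until the limit is respected.
    while len(keys) > max_active_keys:
        keys.remove(min(keys - {0}))
    return keys


def rotate_n(times, max_active_keys=5):
    """Start from the initial deployment keys (0 and 1) and rotate repeatedly."""
    keys = {0, 1}
    for _ in range(times):
        keys = rotate(keys, max_active_keys)
    return sorted(keys)


print(rotate_n(5))  # [0, 3, 4, 5, 6]
print(rotate_n(9))  # [0, 7, 8, 9, 10]
```

Under these assumptions, five rotations give the key set reported in comment #0 and nine rotations give the set listed above.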

If you look inside of the keystone container on a controller, it seems to do some sort of merging of these two areas from what I can see.  On my test deployment, my container has these keys:

  0, 1, 6, 7, 8, 9, 10

All of the key numbers that are present in the "key repo" have matching date/timestamps with their counterparts in "/etc/keystone/fernet-keys" in the keystone container.  Key 1 has a date/timestamp that matches the key from the "initial keys" area.  This leads me to believe that we first take the "initial keys", then overlay the "key repo" keys on top (which overwrites key 0 with the newer copy since it exists in both areas).

The real mystery for me is why key 6 still exists within the container.  This could be a fluke from my environment, or it could be evidence of a bug.  More investigation is needed.

Comment 5 Nathan Kinder 2018-11-16 20:56:44 UTC
I think there was a red herring in my previous update caused by bug 1639495.  I think we can ignore the fact that keys 2-5 were cleaned up on my system, as the above-mentioned bug prevented those keys from ever being copied into the "key repo".

The key rotation workflow is doing its job correctly.  From what I can tell, the keystone container itself is responsible for copying the keys from the key repo into its active configuration area at container startup time.  This is related to how the keystone container is built, which seems to be handled by kolla.

If you look in the keystone container config on the overcloud controller node (in /var/lib/tripleo-config/docker-container-startup-config-step_3.json), you can see that the area where the "key repo" lives is not actually bind mounted into the active configuration area of the keystone container:

  "/var/lib/config-data/puppet-generated/keystone/:/var/lib/kolla/config_files/src:ro"

When the keystone container starts, the entrypoint will end up running various tasks before the service is actually started.  The handling of configuration appears to be controlled by /var/lib/kolla/config_files/config.json in the container itself.  If you look at this file in the keystone container, you will see this snippet:

    "config_files": [
        {
            "dest": "/",
            "merge": true,
            "preserve_properties": true,
            "source": "/var/lib/kolla/config_files/src/*"
        }
    ]

This shows that config files will be copied from /var/lib/kolla/config_files/src on the container into the root area (the directory structure in the "src" area contains everything needed to create an absolute path of the destination).

I think that the problem here is that "merge" is true.  This setting likely makes sense for the rest of keystone's config files, but it may be what is causing old fernet keys to be left on the container.
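
To illustrate the suspicion, here is a minimal simulation of a merging copy versus a replacing copy.  It is a sketch only: copy_config is a hypothetical stand-in and not kolla's implementation, and it assumes that "merge": true means files are copied over an existing destination without removing anything, while "merge": false means the destination is removed first.  The stale keys 1 and 6 are taken from the container listing in comment #4.

```python
import shutil
import tempfile
from pathlib import Path


def copy_config(source: Path, dest: Path, merge: bool) -> None:
    """Copy every file in source into dest, optionally wiping dest first."""
    if not merge and dest.exists():
        # Assumed behaviour for merge=false: start from an empty destination.
        shutil.rmtree(dest)
    dest.mkdir(parents=True, exist_ok=True)
    for item in source.iterdir():
        shutil.copy2(item, dest / item.name)


def keys_after_copy(merge: bool) -> list:
    """Return the key names left in the active area after one copy."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp, "src")        # stands in for the key repo
        active = Path(tmp, "active")   # stands in for /etc/keystone/fernet-keys
        repo.mkdir()
        active.mkdir()
        for n in (0, 1, 6):            # keys already present in the container
            (active / str(n)).write_text("old")
        for n in (0, 7, 8, 9, 10):     # keys currently in the key repo
            (repo / str(n)).write_text("new")
        copy_config(repo, active, merge)
        return sorted((p.name for p in active.iterdir()), key=int)


print(keys_after_copy(merge=True))   # ['0', '1', '6', '7', '8', '9', '10']
print(keys_after_copy(merge=False))  # ['0', '7', '8', '9', '10']
```

With the merging copy the old keys survive next to the current ones, which matches what I see in the container; with the replacing copy only the key repo contents remain.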

Comment 6 Nathan Kinder 2018-11-16 22:04:17 UTC
I have found a work-around for this issue, which we can use as the basis for a fix.  My theory in comment#5 was correct.  To solve it, we can edit the kolla config for keystone to first copy the fernet-keys directory with merge set to false, then we can merge any other config as usual.  This can be done manually by editing /var/lib/kolla/config_files/keystone.json on your overcloud controller nodes to add another item to the "config_files" list.  The resulting file that I used on my Rocky installation looks like this:

{
    "command": "/usr/sbin/httpd -DFOREGROUND", 
    "config_files": [
        {
            "dest": "/etc/keystone/fernet-keys",
            "merge": false,
            "preserve_properties": true,
            "source": "/var/lib/kolla/config_files/src/etc/keystone/fernet-keys"
        },
        {
            "dest": "/", 
            "merge": true, 
            "preserve_properties": true, 
            "source": "/var/lib/kolla/config_files/src/*"
        }
    ]
}

The result will be that the fernet keys in the container match our key repo area, which is what we want.  The next step is looking to see who is responsible for the creation and deployment of /var/lib/kolla/config_files/keystone.json on the overcloud nodes so we can modify the template.
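
If you would rather not edit the file by hand, the same change can be scripted.  The following is a sketch rather than a supported tool: the helper name add_fernet_entry is hypothetical, and it works on the parsed JSON in memory, so reading and writing /var/lib/kolla/config_files/keystone.json is left to the caller.  It adds the fernet-keys entry at the front of "config_files" and does nothing if an entry for that destination is already there.

```python
import copy
import json

# The extra entry from the work-around above.
FERNET_ENTRY = {
    "dest": "/etc/keystone/fernet-keys",
    "merge": False,
    "preserve_properties": True,
    "source": "/var/lib/kolla/config_files/src/etc/keystone/fernet-keys",
}


def add_fernet_entry(config):
    """Return a copy of config with the fernet-keys entry listed first."""
    updated = copy.deepcopy(config)
    entries = updated.setdefault("config_files", [])
    # Idempotent: skip the insert if the destination is already handled.
    if not any(entry.get("dest") == FERNET_ENTRY["dest"] for entry in entries):
        entries.insert(0, dict(FERNET_ENTRY))
    return updated


original = {
    "command": "/usr/sbin/httpd -DFOREGROUND",
    "config_files": [
        {
            "dest": "/",
            "merge": True,
            "preserve_properties": True,
            "source": "/var/lib/kolla/config_files/src/*",
        }
    ],
}

patched = add_fernet_entry(original)
print(json.dumps(patched, indent=4))
```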

Comment 7 Nathan Kinder 2018-11-16 23:02:36 UTC
A patch for this has been submitted upstream for tripleo-heat-templates:

  https://review.openstack.org/618604

Comment 9 Nathan Kinder 2018-12-20 21:34:35 UTC
An upstream backport patch for this has been submitted for stable/rocky:

  https://review.openstack.org/626753/

Comment 10 Harry Rybacki 2019-01-03 19:12:19 UTC
Upstream changes have merged.

Comment 11 Harry Rybacki 2019-01-24 19:44:58 UTC
Downstream build created. Moving bug to MODIFIED.

Comment 14 Mikey Ariel 2019-02-20 12:44:20 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Comment 16 errata-xmlrpc 2019-03-18 13:03:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0446

