Description of problem:
Keystone Fernet tokens have been the default provider since the Pike dev cycle [1]. As reported upstream [2] and [3], general API responsiveness has been impacted, in some cases with responses up to 70% slower. While in some environments such degradation is generally not an issue, for Telcos specifically the VNF life-cycle is impacted; to what extent depends on the complexity of the VNF we're dealing with. For instance, vEPC, one of the most common NFV use-cases, has many components: SGW and PGW are connected to dozens or even hundreds of networks. Backup and Restore, when relying on the OpenStack API, can get much slower, potentially affecting the SLA. Reverting the token driver to UUID is not a scalable approach.

Upstream has proposed a solution: in order to get back to UUID-comparable responsiveness, a shared Memcached backend for all the services is required so that as much Keystone data as possible is cached. Upstream has/had a proposal [4] (on hold for months) focused on the Keystone tokens, while a much wider caching scope has been proposed in the OpenStack community [5], including catalog, domain_config, revoke, roles, identity, and tokens even at issue time. The initial upstream approach is a starting point but it's not good enough; the community approach is far better.

Some people have expressed concern about responsiveness in the event a Memcached server is unavailable, while forgetting that when RabbitMQ is unavailable, OSP is far slower and totally unusable (and not only for Telcos).

Version-Release number of selected component (if applicable):
OSP13, OSP14, OSP15, OSP16

How reproducible:
Run standard tempest against an OSP13 with Fernet and then with UUID tokens and compare the results.

Actual results:
Multiple OpenStack APIs are slower than on OSP10.

Expected results:
OSP10-level API responsiveness or better, but certainly not worse.

Additional info:
[1] https://github.com/openstack/tripleo-heat-templates/commit/c737eea8c0594bce5c86cd712dab559fa5d1a385
[2] https://review.opendev.org/#/c/634505/ - see the comment by "Lukas Bezdicka" on "Feb 13 3:37 PM"
[3] https://docs.google.com/document/d/1zpi9OP_RrZv4ACy43HnqYNoTIRU6j10JJaQI9HB5ugY/
[4] https://review.opendev.org/#/c/634505/
[5] https://www.holdenthecloud.com/2018/05/10/keystone-optimization/
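For reference, a minimal sketch of the kind of keystone.conf caching setup the wider-scope proposal [5] points at, assuming a shared Memcached backend; the server addresses below are illustrative only and are not taken from any of the linked proposals:

  [cache]
  enabled = true
  backend = dogpile.cache.memcached
  # illustrative addresses; point these at the shared Memcached servers
  memcache_servers = 172.16.1.10:11211,172.16.1.11:11211,172.16.1.12:11211

  [token]
  caching = true

  [catalog]
  caching = true

  [revoke]
  caching = true

  [role]
  caching = true

  [identity]
  caching = true

  [domain_config]
  caching = true

With the [cache] section disabled or the memcache_servers list empty, Keystone falls back to hitting the database for the data listed above on every request, which is where the Fernet-era slowdown shows up most.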
It would be really nice if this could get merged. It is not intrusive at all: the j2 patch only defines a new memcached_servers value and doesn't change any prior values. Also, since the memcached_servers.yaml file is new and optional, it wouldn't break anything for existing deployments (see the illustration below). Could we please get the j2 patch merged for OSP13z12 and OSP16z1?
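Purely as an illustration of why it is opt-in, an optional environment file of this kind might look roughly like the following; the parameter name MemcachedServers and the addresses are assumptions for the sketch, not the actual contents of the j2 patch:

  # memcached_servers.yaml -- hypothetical example, parameter name assumed
  parameter_defaults:
    MemcachedServers: '172.16.1.10:11211,172.16.1.11:11211,172.16.1.12:11211'

An operator would only be affected if they explicitly passed the file with -e memcached_servers.yaml on the openstack overcloud deploy command line; deployments that omit it keep their current behaviour.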
Closing as EOL; OSP 15 has been retired as of Sept 19.
The problem described in this BZ still affects even the latest OSP16.1 release, hence re-opening it.
I removed https://review.opendev.org/c/openstack/tripleo-heat-templates/+/634505/, since this was already implemented in https://bugzilla.redhat.com/show_bug.cgi?id=1652558.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:6543