Description of problem:
Keystone Fernet tokens have been the default provider since the Pike dev cycle [1]. As reported upstream [2] and [3], general API responsiveness has been impacted, in some cases with responses up to 70% slower. While in some environments such degradation is generally not an issue, for Telcos specifically the VNF life-cycle is impacted; to what extent depends on the complexity of the VNF we're dealing with. For instance, vEPC, one of the most common NFV use-cases, has many components: SGW and PGW are connected to dozens or even hundreds of networks. Backup and Restore, when relying on the OpenStack API, can get much slower, potentially affecting the SLA. Reverting the token driver to UUID is not a scalable approach.

Upstream has proposed a solution: in order to get back to UUID-comparable responsiveness, a shared Memcached backend for all the services is required so that as much Keystone data as possible is cached. Upstream has/had a proposal [4] (on hold for months) focused on the Keystone tokens, while a much wider caching scope has been proposed in the OpenStack community [5], including catalog, domain_config, revoke, roles, identity, and tokens even at issue time. The initial upstream approach is a starting point but it's not good enough; the community approach is far better.

Some people have expressed concern about responsiveness in the event a Memcached server is unavailable, while forgetting that when RabbitMQ is unavailable, OSP is far slower and totally unusable (and not only for Telcos).

Version-Release number of selected component (if applicable):
OSP13, OSP14, OSP15, OSP16

How reproducible:
Run standard tempest against an OSP13 with Fernet and then with UUID tokens and compare the results.

Actual results:
Multiple OpenStack APIs are slower than on OSP10.

Expected results:
OSP10-level API responsiveness or better, but certainly not worse.

Additional info:
[1] https://github.com/openstack/tripleo-heat-templates/commit/c737eea8c0594bce5c86cd712dab559fa5d1a385
[2] https://review.opendev.org/#/c/634505/ - see the comment by "Lukas Bezdicka" on "Feb 13 3:37 PM"
[3] https://docs.google.com/document/d/1zpi9OP_RrZv4ACy43HnqYNoTIRU6j10JJaQI9HB5ugY/
[4] https://review.opendev.org/#/c/634505/
[5] https://www.holdenthecloud.com/2018/05/10/keystone-optimization/
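For reference, a minimal sketch of the kind of keystone.conf caching setup the wider-scope proposal [5] points at, assuming a shared Memcached backend; the server addresses below are illustrative only and are not taken from any of the linked proposals:

  [cache]
  enabled = true
  backend = dogpile.cache.memcached
  # illustrative addresses; point these at the shared Memcached servers
  memcache_servers = 172.16.1.10:11211,172.16.1.11:11211,172.16.1.12:11211

  [token]
  caching = true

  [catalog]
  caching = true

  [revoke]
  caching = true

  [role]
  caching = true

  [identity]
  caching = true

  [domain_config]
  caching = true

With the [cache] section disabled or the memcache_servers list empty, Keystone falls back to hitting the database for the data listed above on every request, which is where the Fernet-era slowdown shows up most.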
It would be really nice if this could get merged. It is not intrusive at all: the j2 patch only defines a new memcached_servers value and doesn't change any prior values. Also, since the memcached_servers.yaml file is new and optional, it wouldn't break anything for existing deployments (see the illustration below). Could we please get the j2 patch merged for OSP13z12 and OSP16z1?
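Purely as an illustration of why it is opt-in, an optional environment file of this kind might look roughly like the following; the parameter name MemcachedServers and the addresses are assumptions for the sketch, not the actual contents of the j2 patch:

  # memcached_servers.yaml -- hypothetical example, parameter name assumed
  parameter_defaults:
    MemcachedServers: '172.16.1.10:11211,172.16.1.11:11211,172.16.1.12:11211'

An operator would only be affected if they explicitly passed the file with -e memcached_servers.yaml on the openstack overcloud deploy command line; deployments that omit it keep their current behaviour.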
Closing as EOL; OSP 15 has been retired as of Sept 19.
The problem described in this BZ still affects even the latest OSP16.1 release, hence re-opening it.
I removed https://review.opendev.org/c/openstack/tripleo-heat-templates/+/634505/, since this was already implemented in https://bugzilla.redhat.com/show_bug.cgi?id=1652558.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:6543