Description of problem: When scaling out an overcloud from 244 to 251 nodes, we see the deployment fail due to keystone performance issue.
We have an undercloud with 64 logical cores, however only 12 keystone workers were deployed based on the ::os_workers fact here https://github.com/openstack/puppet-openstacklib/blob/master/lib/facter/os_workers.rb
In previous releases, 24 workers would be deployed (12 keystone-main and 12 keystone-admin), but since the actions have been consolidated within a single "keystone" worker since keystone v3, TripleO sets lower number of total "keystone" workers (compared to when it would set keystone-main and keystone-admin workers).
On bumping the number of keystone workers on my undercloud from 12 to 24, the heat stack update proceeded.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Deploy undercloud with defaults
2. Deploy a a large overcloud
3. Scale the overcloud
Scale out fails
Scale out should succeed
Keystone performance drop in overcloud which is related to this bug:
It is possible the fix for that will also help with this
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.