Description of problem:
Given the nature of the design, os-collect-config polling ends up synchronized across nodes. With the 30-second poll period, CPU utilization on the undercloud goes to 100% for several seconds in every poll period.
It is not clear that this will scale to larger deployments.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Start top in a window on the undercloud machine.
2. Start a small deployment (65 nodes).
3. Observe top behaviour over time.
Actual results:
Spikes in CPU utilization close to 100% every 30 seconds.
Expected results:
A smoother spread of CPU utilization over time.
Making the delay before the first poll a random number between 0 and $period would help de-synchronize the polling for many nodes. This would be a change to os-collect-config.
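A minimal sketch of the idea, for illustration only. This is not the actual os-collect-config code; the names initial_delay, poll_loop, collect and max_polls are hypothetical, and the 30-second default is taken from the period mentioned above. Each node sleeps a random amount between 0 and the period once, before its first poll, and then polls at the normal fixed interval:

```python
import random
import time


def initial_delay(period, rng=random):
    """Pick a random delay between 0 and `period` seconds (hypothetical helper)."""
    return rng.random() * period


def poll_loop(collect, period=30.0, sleep=time.sleep, rng=random, max_polls=None):
    """Call `collect` every `period` seconds after a random first delay.

    `collect` stands in for whatever one poll does. `max_polls` exists only
    so the loop can be stopped in tests; None means poll forever.
    """
    # Nodes started at the same moment no longer poll at the same moment,
    # because each one offsets its schedule by its own random delay.
    sleep(initial_delay(period, rng))
    polls = 0
    while True:
        collect()
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return
        sleep(period)
```

Because the random delay is applied only once, each node keeps the same poll period as before; only the phase differs from node to node, so the load on the undercloud should spread across the period instead of arriving in one burst.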
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Changes are in os-collect-config-7.0.1-0.20170612052603.5870ed6.el7ost and openstack-tripleo-heat-templates-7.0.0-0.20170616123155.el7ost.
Verified on build 2017-11-29.2
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.