Description of problem:
By design, os-collect-config polling is effectively synchronized across nodes. With the 30-second polling period, CPU utilization goes to 100% every 30 seconds and stays there for several seconds in each poll period.
It is not clear that this will scale to larger deployments.
Version-Release number of selected component (if applicable):
How reproducible:
Every time.
Steps to Reproduce:
1. Start top in a window on the undercloud machine.
2. Start a small deployment (65 nodes).
3. Observe the behaviour of top over time.
Actual results:
Spikes in CPU utilization close to 100% every 30 seconds.
Expected results:
A smoother spread of CPU load across the polling period.
Additional info:
Making the delay before the first poll a random number between 0 and $period would help de-synchronize the polling for many nodes. This would be a change to os-collect-config.
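A minimal sketch of the suggestion above, to illustrate the idea only. This is not the actual os-collect-config code: the names initial_delay and poll_forever, and the collect, sleep, rng and max_polls parameters, are hypothetical and exist only for this example. Each node waits a random time between 0 and the period before its first poll, then polls at the normal period, so nodes that started together no longer poll together.

```python
import random
import time


def initial_delay(period, rng=random):
    """Return a random delay, in seconds, between 0 and `period`."""
    return rng.random() * period


def poll_forever(collect, period=30.0, sleep=time.sleep, rng=random,
                 max_polls=None):
    """Call `collect` once per `period` seconds after a random first delay.

    `collect` is a hypothetical stand-in for one polling pass.
    `max_polls` is only here so the loop can be stopped in a test.
    """
    # De-synchronize nodes: wait a random fraction of the period first.
    sleep(initial_delay(period, rng))

    polls = 0
    while max_polls is None or polls < max_polls:
        collect()
        polls += 1
        # After the randomized start, keep the usual fixed period.
        sleep(period)
```

With 65 nodes and a 30-second period, the first polls would be spread over the whole 30-second window instead of landing at the same moment. The offsets persist afterwards because every node keeps the same fixed period.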
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2017:3462