Description of problem:
Sometime in the past couple of releases, watchman went from consuming a little under 10% of a CPU to somewhere in the 20-30% range. As I understand it from looking at our configs, we are using the new gear state plugin, but the metrics plugin is not enabled. I have not looked for a root cause yet, nor have I tried disabling individual plugins.

Version-Release number of selected component (if applicable):
openshift-origin-node-util-1.22.6-1.el6oso.noarch

How reproducible:
Always (at least, it appears pretty consistent across our nodes)

Steps to Reproduce:
1. Create a node with hundreds of gears (500 should be sufficient).
2. Run watchman for a while.
3. Check CPU usage using "ps auxww --cumulative | grep watchman". The third column shows the percentage of CPU used by watchman and its child processes.

Actual results:
CPU usage is over 20%

Expected results:
Less than that. :)
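For step 3, a small shell helper can total the %CPU column instead of reading it by eye. This is only a sketch: the function name sum_watchman_cpu is made up for illustration, and it assumes the usual "ps aux" layout in which %CPU is the third field.

```shell
# Hypothetical helper (not part of watchman): reads "ps aux"-style output on
# stdin and prints the summed %CPU (third field) of every line that mentions
# watchman. Lines containing "awk" are skipped so the filter does not count
# its own process, whose command line also contains the word "watchman".
sum_watchman_cpu() {
  awk '/watchman/ && !/awk/ { total += $3 } END { printf "%.1f\n", total }'
}
```

On a node it would be run as: ps auxww --cumulative | sum_watchman_cpu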
Added element STATE_CHECK_PERIOD to /etc/sysconfig/watchman to allow detuning of state checks. https://github.com/openshift/origin-server/pull/5383
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/c84642a6f0c03af10fad08c6064f686f74e2dedf
Bug 1091433 - Add setting to detune GearStatePlugin
* Add sysconfig/watchman element STATE_CHECK_PERIOD to control frequency of running GearStatePlugin
Tested on devenv_4769; STATE_CHECK_PERIOD takes effect for watchman.

Steps:
1. Set the following in /etc/sysconfig/watchman and restart watchman:
   STATE_CHANGE_DELAY=60
   STATE_CHECK_PERIOD=60
2. Change a gear's state and check the syslog. The gear state change info shows up in the syslog after about 2 minutes.
3. Check the CPU usage; it is lower than 20%.

Moving bug to verified.