Bug 1091433

Summary: watchman consumes too much CPU
Product: OpenShift Online
Reporter: Andy Grimm <agrimm>
Component: Containers
Assignee: Jhon Honce <jhonce>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Priority: medium
Version: 2.x
CC: bleanhar, bmeng, jgoulding, yadu
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-07-15 10:28:31 UTC
Bug Blocks: 1096863

Description Andy Grimm 2014-04-25 15:08:39 UTC
Description of problem:

Sometime in the past couple of releases, watchman went from consuming a little under 10% of a CPU to somewhere in the 20-30% range.  As I understand it from looking at our configs, we are using the new gear state plugin, but the metrics plugin is not enabled.  I have not looked for a root cause yet, nor have I tried disabling individual plugins.

Version-Release number of selected component (if applicable):

openshift-origin-node-util-1.22.6-1.el6oso.noarch

How reproducible:

Always (at least, it appears pretty consistent across our nodes)

Steps to Reproduce:
1. Create a node with hundreds of gears (500 should be sufficient)
2. Run watchman for a while
3. Check CPU usage using "ps auxww --cumulative | grep watchman".  The third column shows the percentage of CPU used by watchman and its child processes.
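
The check in step 3 can be scripted so that the grep process itself is not counted. The following is a minimal Python sketch, not part of the original report; the function names are hypothetical, and it assumes the standard 11-column "ps aux" layout with %CPU in the third column.

```python
import subprocess


def watchman_cpu_percent(ps_output, needle="watchman"):
    """Sum the %CPU column (third field) of every ps line whose command mentions `needle`."""
    total = 0.0
    for line in ps_output.splitlines():
        # "ps aux" prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header and malformed lines
        command = fields[10]
        if needle not in command or command.startswith("grep "):
            continue  # skip unrelated processes and the grep helper itself
        total += float(fields[2])
    return total


def current_watchman_cpu():
    """Run the ps command from step 3 and return the summed %CPU (assumes procps ps)."""
    result = subprocess.run(
        ["ps", "auxww", "--cumulative"],
        capture_output=True, text=True, check=True,
    )
    return watchman_cpu_percent(result.stdout)
```

Calling current_watchman_cpu() on a node returns a single number that can be compared against the 20% figure reported here.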

Actual results:

CPU usage is over 20%

Expected results:

Less than that, ideally back near the earlier level of a little under 10%.  :)

Comment 1 Jhon Honce 2014-05-06 19:58:32 UTC
Added element STATE_CHECK_PERIOD to /etc/sysconfig/watchman to allow detuning of state checks.

https://github.com/openshift/origin-server/pull/5383
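
For readers unfamiliar with this kind of setting, the general idea of detuning a periodic check can be sketched as follows. This is an illustrative Python sketch only, not the code from the pull request above; the class name PeriodicGate and the method ready are hypothetical, and it assumes the period is expressed in seconds.

```python
import time


class PeriodicGate:
    """Let an action run at most once per `period` seconds (hypothetical helper)."""

    def __init__(self, period, clock=time.monotonic):
        self.period = period      # e.g. the value read from STATE_CHECK_PERIOD (assumed seconds)
        self.clock = clock        # injectable clock, which makes the gate easy to test
        self.last_run = None      # time of the last allowed run; None means never

    def ready(self):
        """Return True and record the time if at least `period` seconds have elapsed."""
        now = self.clock()
        if self.last_run is None or now - self.last_run >= self.period:
            self.last_run = now
            return True
        return False
```

A main loop that wakes up frequently could call ready() on each pass and run the expensive state check only when it returns True, so a larger period trades detection latency for lower CPU usage.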

Comment 2 openshift-github-bot 2014-05-06 20:53:59 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/c84642a6f0c03af10fad08c6064f686f74e2dedf
Bug 1091433 - Add setting to detune GearStatePlugin

* Add sysconfig/watchman element STATE_CHECK_PERIOD to control
  frequency of running GearStatePlugin

Comment 3 Yan Du 2014-05-07 09:43:46 UTC
Tested on devenv_4769; STATE_CHECK_PERIOD takes effect for watchman.

Steps:
1. Set the following in /etc/sysconfig/watchman and restart watchman:
STATE_CHANGE_DELAY=60
STATE_CHECK_PERIOD=60
2. Change a gear's state and check the syslog. The gear state change is reported in the syslog after about 2 minutes.
3. Check the CPU usage. It is lower than 20%.
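
To confirm which values are actually set before restarting watchman, the file can be read programmatically. The following is a minimal Python sketch under the assumption that the file uses plain KEY=VALUE lines with "#" comments; the function name parse_sysconfig is hypothetical.

```python
def parse_sysconfig(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def read_watchman_settings(path="/etc/sysconfig/watchman"):
    """Load the settings from the path used in the steps above."""
    with open(path) as handle:
        return parse_sysconfig(handle.read())
```

Values are returned as strings, so a caller would convert STATE_CHECK_PERIOD with int() before comparing it.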

Moving bug to VERIFIED.