Bug 1096863
Summary: | watchman consumes too much CPU | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Brenton Leanhardt <bleanhar> |
Component: | Containers | Assignee: | Brenton Leanhardt <bleanhar> |
Status: | CLOSED ERRATA | QA Contact: | libra bugs <libra-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 2.1.0 | CC: | adellape, agrimm, anli, bleanhar, bmeng, gpei, libra-onpremise-devel, xjia, yadu |
Target Milestone: | --- | Keywords: | Upstream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openshift-origin-node-util-1.22.11.1-1.el6op | Doc Type: | Bug Fix |
Doc Text: |
Previously, Watchman's frequency for checking gear state was hard-coded in the tool, and it could consume too much CPU as a result. This bug fix adds many additional configuration parameters along with documentation to the /etc/sysconfig/watchman file, and administrators now have access to more tuning options when using Watchman.
|
Story Points: | --- |
Clone Of: | 1091433 | Environment: | |
Last Closed: | 2014-08-04 13:27:06 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1091433, 1097959 | ||
Bug Blocks: | 1105225 |
Description
Brenton Leanhardt
2014-05-12 14:33:43 UTC
We should pull in this upstream PR too:
https://github.com/openshift/origin-server/pull/5418/files

These are two additional pull requests that ship important updates for watchman:
https://github.com/openshift/origin-server/pull/5429
https://github.com/openshift/origin-server/pull/5437

When the OOM plugin is backported we should consider pulling in https://github.com/openshift/origin-server/pull/5494 as well.

Upstream commits:

commit c84642a6f0c03af10fad08c6064f686f74e2dedf
Author: Jhon Honce <jhonce>
Date:   Tue May 6 08:40:56 2014 -0700

    Bug 1091433 - Add setting to detune GearStatePlugin

    * Add sysconfig/watchman element STATE_CHECK_PERIOD to control
      frequency of running GearStatePlugin

commit dbc9cfadb7c82eba7b17638e7f79e2c0a01bdf8e
Author: Jhon Honce <jhonce>
Date:   Thu May 15 11:41:36 2014 -0700

    Bug 1097959 - Add THROTTLER_CHECK_PERIOD to detune Throttler

    * Add THROTTLER_CHECK_PERIOD element to /etc/sysconfig/watchman to
      allow Operator to set period for checking cgroup counters

commit 6188dd63856e048aa51071e059618141ce13fd04
Author: Andy Grimm <agrimm>
Date:   Mon May 12 16:05:30 2014 -0400

    Introduce oom plugin and disable syslog plugin

    The oom plugin improves handling of out-of-memory conditions in
    gears by dynamically adjusting a cgroup's memory limit while
    cleaning up its tasks.

commit efec8b5f07988f3e95de5b5c54aae380b0879b98
Author: Andy Grimm <agrimm>
Date:   Tue May 20 15:22:57 2014 -0400

    Remove an incorrect comment line in oom_plugin

commit a43a0d461974087568d3e7e60f61e890a1e9b0d1
Author: Andy Grimm <agrimm>
Date:   Tue May 20 15:25:30 2014 -0400

    Disable OOM kills for gear cgroups

commit ba9636528748d0cb24b455e102b9f3098072c7c6
Author: Andy Grimm <agrimm>
Date:   Tue May 20 15:31:20 2014 -0400

    Add OOM_CHECK_PERIOD to oo-watchman man page

commit 322cb2dacc7c8cc3c1cbbb35fc2e98248a8a5d61
Author: Jhon Honce <jhonce>
Date:   Wed May 21 16:00:11 2014 -0700

    WIP Node Platform - Skip syslog_plugin test if it has been disabled

Verified and passed on puddle-2-1-2014-07-15.

Watchman's CPU usage dropped after updating to puddle-2-1-2014-07-15, and the configured values also take effect.

1) On the OSE GA build, watchman consumes 42% CPU time.

[root@node ~]# ps auxww --cumulative | grep watchman
root 23276 42 0.1 13263832 184336 ? Sl 17:17 8:31 watchman
root 110942 0.0 0.0 103256 856 pts/1 S+ 17:25 0:00 grep watchman

2) On puddle-2-1-2014-07-15, watchman consumes only 11.5% CPU time.

[root@node ~]# ps auxww --cumulative | grep watchman
root 2683 11.5 0.3 13001500 163292 ? Sl 20:05 12:45 watchman
root 18410 0.0 0.0 103256 888 pts/1 S+ 21:55 0:00 grep watchman

3) After adding the following configuration, watchman consumes 10.8% CPU time.

STATE_CHANGE_DELAY=60
STATE_CHECK_PERIOD=60

[root@node ~]# ps auxww --cumulative | grep watchman
root 10021 10.8 0.1 12905248 82308 ? Sl 22:00 0:56 watchman
root 24596 0.0 0.0 103256 852 pts/2 S+ 22:08 0:00 grep watchman

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0999.html
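
The CPU comparison in the verification steps can be repeated without reading the ps output by eye. The following is only a sketch: the function name watchman_cpu is hypothetical and not part of any OpenShift tool, and it assumes the process appears in ps output with the command name "watchman", as it does in the listings in this report. It prints the %CPU column (third field) and skips the "grep watchman" line.

```shell
# Hypothetical helper, not shipped with openshift-origin-node-util.
# Reads `ps aux`-style lines on stdin and prints the %CPU value (field 3)
# of every line whose command is exactly "watchman". The line produced by
# `grep watchman` is skipped because its second-to-last field is "grep".
watchman_cpu() {
  awk '$NF == "watchman" && $(NF-1) != "grep" { print $3 }'
}
```

A possible use is `ps auxww --cumulative | watchman_cpu`, run once before and once after changing STATE_CHECK_PERIOD in /etc/sysconfig/watchman, so the two figures can be compared directly.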