Bug 1096863
| Summary: | watchman consumes too much CPU | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Brenton Leanhardt <bleanhar> |
| Component: | Containers | Assignee: | Brenton Leanhardt <bleanhar> |
| Status: | CLOSED ERRATA | QA Contact: | libra bugs <libra-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 2.1.0 | CC: | adellape, agrimm, anli, bleanhar, bmeng, gpei, libra-onpremise-devel, xjia, yadu |
| Target Milestone: | --- | Keywords: | Upstream |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openshift-origin-node-util-1.22.11.1-1.el6op | Doc Type: | Bug Fix |
| Doc Text: | Previously, Watchman's frequency for checking gear state was hard-coded in the tool, and it could consume too much CPU as a result. This bug fix adds many additional configuration parameters along with documentation to the /etc/sysconfig/watchman file, and administrators now have access to more tuning options when using Watchman. | Story Points: | --- |
| Clone Of: | 1091433 | Environment: | |
| Last Closed: | 2014-08-04 13:27:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1091433, 1097959 | ||
| Bug Blocks: | 1105225 | ||
|
Description
Brenton Leanhardt
2014-05-12 14:33:43 UTC
We should pull in this upstream PR too:
https://github.com/openshift/origin-server/pull/5418/files

These are two additional pull requests that ship important updates for watchman:
https://github.com/openshift/origin-server/pull/5429
https://github.com/openshift/origin-server/pull/5437

When the OOM plugin is backported we should consider pulling in https://github.com/openshift/origin-server/pull/5494 as well.

Upstream commits:
commit c84642a6f0c03af10fad08c6064f686f74e2dedf
Author: Jhon Honce <jhonce>
Date: Tue May 6 08:40:56 2014 -0700
Bug 1091433 - Add setting to detune GearStatePlugin
* Add sysconfig/watchman element STATE_CHECK_PERIOD to control
frequency of running GearStatePlugin
commit dbc9cfadb7c82eba7b17638e7f79e2c0a01bdf8e
Author: Jhon Honce <jhonce>
Date: Thu May 15 11:41:36 2014 -0700
Bug 1097959 - Add THROTTLER_CHECK_PERIOD to detune Throttler
* Add THROTTLER_CHECK_PERIOD element to /etc/sysconfig/watchman to
allow Operator to set period for checking cgroup counters
commit 6188dd63856e048aa51071e059618141ce13fd04
Author: Andy Grimm <agrimm>
Date: Mon May 12 16:05:30 2014 -0400
Introduce oom plugin and disable syslog plugin
The oom plugin improves handling of out-of-memory conditions
in gears by dynamically adjusting a cgroup's memory limit while
cleaning up its tasks.
commit efec8b5f07988f3e95de5b5c54aae380b0879b98
Author: Andy Grimm <agrimm>
Date: Tue May 20 15:22:57 2014 -0400
Remove an incorrect comment line in oom_plugin
commit a43a0d461974087568d3e7e60f61e890a1e9b0d1
Author: Andy Grimm <agrimm>
Date: Tue May 20 15:25:30 2014 -0400
Disable OOM kills for gear cgroups
commit ba9636528748d0cb24b455e102b9f3098072c7c6
Author: Andy Grimm <agrimm>
Date: Tue May 20 15:31:20 2014 -0400
Add OOM_CHECK_PERIOD to oo-watchman man page
commit 322cb2dacc7c8cc3c1cbbb35fc2e98248a8a5d61
Author: Jhon Honce <jhonce>
Date: Wed May 21 16:00:11 2014 -0700
WIP Node Platform - Skip syslog_plugin test if it has been disabled
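
The commit messages above name three period settings: STATE_CHECK_PERIOD, THROTTLER_CHECK_PERIOD and OOM_CHECK_PERIOD. As a rough sketch only, a tuned /etc/sysconfig/watchman fragment might look like the following. The STATE_CHANGE_DELAY and STATE_CHECK_PERIOD values are the ones tested during verification of this bug. The other two values are illustrative placeholders, not recommendations. It is also an assumption here that OOM_CHECK_PERIOD is read from the same file and that all values are in seconds; check the oo-watchman man page before relying on either.

```shell
# Sketch of /etc/sysconfig/watchman tuning (not a definitive configuration).

# Values used during verification of this bug:
STATE_CHANGE_DELAY=60
STATE_CHECK_PERIOD=60

# Placeholder values (assumption: seconds; not taken from the bug report):
THROTTLER_CHECK_PERIOD=60
OOM_CHECK_PERIOD=60
```

Watchman presumably needs to be restarted to pick up changes to this file; the bug report does not state the exact procedure.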
Verified; passes on puddle-2-1-2014-07-15. Watchman's CPU usage dropped after updating to puddle-2-1-2014-07-15, and the configured values also take effect.

1) On the OSE GA build, watchman consumes 42% CPU time.
[root@node ~]# ps auxww --cumulative | grep watchman
root 23276 42 0.1 13263832 184336 ? Sl 17:17 8:31 watchman
root 110942 0.0 0.0 103256 856 pts/1 S+ 17:25 0:00 grep watchman

2) On puddle-2-1-2014-07-15, only 11.5% CPU time.
[root@node ~]# ps auxww --cumulative | grep watchman
root 2683 11.5 0.3 13001500 163292 ? Sl 20:05 12:45 watchman
root 18410 0.0 0.0 103256 888 pts/1 S+ 21:55 0:00 grep watchman

3) After adding the following configuration:
STATE_CHANGE_DELAY=60
STATE_CHECK_PERIOD=60
[root@node ~]# ps auxww --cumulative | grep watchman
root 10021 10.8 0.1 12905248 82308 ? Sl 22:00 0:56 watchman
root 24596 0.0 0.0 103256 852 pts/2 S+ 22:08 0:00 grep watchman

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2014-0999.html
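
The `ps ... | grep watchman` checks in the verification comment also list the `grep` process itself. When repeating the measurement, a small filter like the one below can isolate the watchman row. This is only a sketch: `watchman_cpu` is a hypothetical helper name, and it assumes the standard `ps aux` column layout (PID in column 2, %CPU in column 3, command starting in column 11), as seen in the output quoted in this report.

```shell
# Hypothetical helper: read `ps aux`-style lines on stdin and print
# "PID %CPU" for rows whose command (column 11) is exactly "watchman".
# Rows such as "grep watchman" are skipped because column 11 is "grep".
watchman_cpu() {
  awk '$11 == "watchman" { print $2, $3 }'
}
```

A possible use is `ps auxww --cumulative | watchman_cpu`, run before and after changing /etc/sysconfig/watchman.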