Bug 837066 - watchman inefficient
Status: CLOSED CURRENTRELEASE
Product: OpenShift Origin
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified Unspecified
Priority: high Severity: high
Assigned To: Jhon Honce
QA Contact: libra bugs
Keywords: Triaged
Reported: 2012-07-02 12:16 EDT by Mike McGrath
Modified: 2015-05-14 18:56 EDT

Doc Type: Bug Fix
Last Closed: 2012-08-07 16:42:40 EDT
Type: Bug


Attachments: None
Description Mike McGrath 2012-07-02 12:16:17 EDT
On systems with many gears (in the 2,000-3,000 range), watchman uses a lot of CPU crunching through them all. We should either put a sleep in it to slow it down, or keep a list of the idle gears and check their health only every hour or so.
Comment 1 Jhon Honce 2012-07-02 19:31:28 EDT
 * Minimal processing of idled or stopped applications
 * Default delay between loops changed to 20 seconds
 * If more than 50% of the applications on a node are idled, the delay is multiplied by 3

Pull Request li#20
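
A minimal sketch of the delay rule listed above, written in Python for illustration only; it is not taken from the pull request. The function name `loop_delay`, the state strings, and the constant names are hypothetical. Only the 20-second default, the 50% threshold, and the 3x multiplier come from this comment.

```python
# Values stated in the comment above.
BASE_DELAY_SECS = 20
IDLE_MULTIPLIER = 3
IDLE_THRESHOLD = 0.5  # "more than 50%"


def loop_delay(app_states, base=BASE_DELAY_SECS):
    """Return the delay (in seconds) to wait before the next pass.

    app_states is a list of per-application state strings. The value
    "idle" is a hypothetical label for an idled application.
    """
    if not app_states:
        return base
    idled = sum(1 for state in app_states if state == "idle")
    # Always derive the result from the fixed base, never from the
    # previously returned delay, so the value cannot compound.
    if idled / len(app_states) > IDLE_THRESHOLD:
        return base * IDLE_MULTIPLIER
    return base
```

With two of three applications idled, `loop_delay(["idle", "idle", "started"])` yields 60; with exactly half idled, it stays at 20 because the threshold is strict.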
Comment 2 Johnny Liu 2012-07-12 08:08:11 EDT
Verified this bug on devenv-stage_223, PASS.


Got the following messages in syslog:

Jul 12 05:30:40 ip-10-195-169-69 rhc-watchman[17135]: Starting rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 12 06:08:42 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 12 06:10:23 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 60s, exception threshold: 10
<--snip-->
Jul 12 06:12:23 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 180s, exception threshold: 10
<--snip-->
Jul 12 06:15:23 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 540s, exception threshold: 10
<--snip-->
Comment 3 Jhon Honce 2012-07-12 10:42:20 EDT
Delay should not keep growing.
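
The sequence logged in comment 2 (20s, 60s, 180s, 540s) is what would appear if the 3x multiplier were applied to the previous delay on every pass rather than to the fixed base. That cause is an assumption drawn from the log values, not something this report confirms. A Python sketch contrasting the two behaviours, with hypothetical function names:

```python
BASE_DELAY_SECS = 20   # default delay from comment 1
IDLE_MULTIPLIER = 3    # multiplier from comment 1


def compounding_delays(passes):
    """Multiplier applied to the previous delay: grows without bound."""
    delay = BASE_DELAY_SECS
    delays = []
    for _ in range(passes):
        delays.append(delay)
        delay *= IDLE_MULTIPLIER  # assumed bug: carries over between passes
    return delays


def bounded_delays(passes, mostly_idle=True):
    """Multiplier applied to the fixed base on each pass: stays bounded."""
    delays = []
    for _ in range(passes):
        if mostly_idle:
            delays.append(BASE_DELAY_SECS * IDLE_MULTIPLIER)
        else:
            delays.append(BASE_DELAY_SECS)
    return delays
```

Here `compounding_delays(4)` reproduces the logged values, while `bounded_delays(3)` stays at 60 on every pass.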
Comment 4 Jhon Honce 2012-07-12 10:53:37 EDT
Waiting on https://github.com/openshift/li/pull/61
Comment 5 Jhon Honce 2012-07-12 11:25:54 EDT
(In reply to comment #4)
> Waiting on https://github.com/openshift/li/pull/61

Logging comment was not updated
https://github.com/openshift/li/pull/62
Comment 6 Johnny Liu 2012-07-16 08:27:59 EDT
Verified this bug with devenv_1894, PASS.


Got the following messages in syslog:
<--snip-->
Jul 16 08:24:39 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 16 08:24:59 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 16 08:26:19 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 60s, exception threshold: 10
<--snip-->
Jul 16 08:27:19 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 60s, exception threshold: 10
<--snip-->
