Bug 837066 - watchman inefficient
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Jhon Honce
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-07-02 16:16 UTC by Mike McGrath
Modified: 2015-05-14 22:56 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-07 20:42:40 UTC
Target Upstream Version:
Embargoed:



Description Mike McGrath 2012-07-02 16:16:17 UTC
On systems with lots of gears (think in the 2,000-3,000 range), watchman uses a lot of CPU crunching through them all. We could put a sleep in it to slow it down, or keep a list of the idle gears and only check their health every hour or so.

Comment 1 Jhon Honce 2012-07-02 23:31:28 UTC
 * Minimal processing of idled or stopped applications
 * Default delay between loops changed to 20 seconds
 * If more than 50% of the applications on a node are idled, the delay gets a 3x multiplier
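
The policy in the list above could be sketched roughly as follows. This is an illustrative Python sketch, not the code from the pull request: the names `loop_delay` and `gears_to_check`, the constants, and the state strings "idle", "stopped" and "started" are all assumptions made for the example, and "minimal processing" is simplified here to skipping the gear entirely.

```python
# Hypothetical constants mirroring the values named in this comment.
BASE_DELAY_SECS = 20   # default delay between loops
IDLE_MULTIPLIER = 3    # applied when most applications on the node are idled
IDLE_THRESHOLD = 0.5   # "> 50% of applications idled"


def loop_delay(states):
    """Return the number of seconds to sleep before the next pass.

    `states` is a list of per-application state strings
    (assumed labels: "started", "idle", "stopped").
    """
    if not states:
        return BASE_DELAY_SECS
    idle = sum(1 for s in states if s == "idle")
    # Strictly greater than the threshold, matching "> 50%".
    if idle / len(states) > IDLE_THRESHOLD:
        return BASE_DELAY_SECS * IDLE_MULTIPLIER
    return BASE_DELAY_SECS


def gears_to_check(states):
    """Return indexes of applications that still get a full check.

    Idled and stopped applications are skipped (a simplification of
    "minimal processing").
    """
    return [i for i, s in enumerate(states) if s not in ("idle", "stopped")]
```

Note that the delay is always derived from the base value, so it can only ever be 20 or 60 seconds under these assumptions.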

Pull Request li#20

Comment 2 Johnny Liu 2012-07-12 12:08:11 UTC
Verified this bug on devenv-stage_223, PASS.


Got the following messages in syslog:

Jul 12 05:30:40 ip-10-195-169-69 rhc-watchman[17135]: Starting rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 12 06:08:42 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 12 06:10:23 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 60s, exception threshold: 10
<--snip-->
Jul 12 06:12:23 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 180s, exception threshold: 10
<--snip-->
Jul 12 06:15:23 ip-10-195-169-69 rhc-watchman[17135]: Running rhc-watchman => delay: 540s, exception threshold: 10
<--snip-->

Comment 3 Jhon Honce 2012-07-12 14:42:20 UTC
Delay should not keep growing.
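
The syslog excerpt in comment 2 (20s, 60s, 180s, 540s) is consistent with the 3x multiplier being applied to the previous delay on every pass instead of to the base delay. That is only one plausible reading of the log; the actual cause is whatever the linked pull request changed. A minimal Python sketch of the difference, with hypothetical function names and constants:

```python
BASE_DELAY_SECS = 20   # assumed base delay, from comment 1
IDLE_MULTIPLIER = 3    # assumed multiplier, from comment 1


def compounding_delays(passes, mostly_idle=True):
    """Delay per pass when the multiplier is applied to the previous delay."""
    delay = BASE_DELAY_SECS
    seen = [delay]
    for _ in range(passes):
        if mostly_idle:
            delay *= IDLE_MULTIPLIER  # grows without bound
        seen.append(delay)
    return seen


def bounded_delays(passes, mostly_idle=True):
    """Delay per pass when each pass recomputes the delay from the base."""
    seen = [BASE_DELAY_SECS]
    for _ in range(passes):
        if mostly_idle:
            seen.append(BASE_DELAY_SECS * IDLE_MULTIPLIER)
        else:
            seen.append(BASE_DELAY_SECS)
    return seen


print(compounding_delays(3))  # [20, 60, 180, 540]
print(bounded_delays(3))      # [20, 60, 60, 60]
```

Recomputing from the base each pass keeps the delay capped at 60 seconds while the node stays mostly idle.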

Comment 4 Jhon Honce 2012-07-12 14:53:37 UTC
Waiting on https://github.com/openshift/li/pull/61

Comment 5 Jhon Honce 2012-07-12 15:25:54 UTC
(In reply to comment #4)
> Waiting on https://github.com/openshift/li/pull/61

Logging comment was not updated
https://github.com/openshift/li/pull/62

Comment 6 Johnny Liu 2012-07-16 12:27:59 UTC
Verified this bug with devenv_1894, and PASS.


Got the following messages in syslog:
<--snip-->
Jul 16 08:24:39 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 16 08:24:59 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 20s, exception threshold: 10
<--snip-->
Jul 16 08:26:19 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 60s, exception threshold: 10
<--snip-->
Jul 16 08:27:19 ip-10-194-26-207 rhc-watchman[1720]: Running rhc-watchman => delay: 60s, exception threshold: 10
<--snip-->

