Bug 999837 - Throttler will keep trying to restore the throttled gears even if they had been deleted from node
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified Unspecified
Priority: medium
Severity: medium
Assigned To: Fotios Lindiakos
QA Contact: libra bugs
Keywords: Regression
Depends On:
Blocks:
Reported: 2013-08-22 04:53 EDT by Meng Bo
Modified: 2015-05-14 19:26 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-08-29 08:54:36 EDT
Type: Bug
Description Meng Bo 2013-08-22 04:53:22 EDT
Description of problem:
Put a gear into the throttled cgroup state, then delete the app with the rhc client. Watching /var/log/messages shows that rhc-watchman keeps trying to restore the gear until a libra-watchman restart is done.


Version-Release number of selected component (if applicable):
devenv-stage_448

How reproducible:
always

Steps to Reproduce:
1. Create an app
2. Push the app into the throttled cgroup state with a shell command such as:
> dd if=/dev/zero of=/dev/null &
3. Delete the app with the rhc client
4. Check the rhc-watchman log


Actual results:
It keeps trying to restore the throttled gear even though the gear has already been deleted.

Expected results:
It should ignore a deleted gear after a few failed attempts.
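
One way to read this expectation, as a minimal sketch and not the actual rhc-watchman code: count consecutive restore attempts that have no utilization data and stop tracking the gear once a limit is reached. The class name RestoreTracker, the limit of 3, and the use of None for "unknown utilization" are all hypothetical.

```python
from typing import Dict, Optional


class RestoreTracker:
    """Hypothetical tracker: forget a throttled gear after repeated unknown readings."""

    def __init__(self, max_unknown: int = 3) -> None:
        self.max_unknown = max_unknown  # assumed limit, not taken from the bug report
        self.unknown_counts: Dict[str, int] = {}  # gear uuid -> consecutive unknown readings

    def track(self, uuid: str) -> None:
        """Start tracking a gear that has just been throttled."""
        self.unknown_counts[uuid] = 0

    def attempt_restore(self, uuid: str, utilization: Optional[float], threshold: float) -> str:
        """Return 'restored', 'refused', or 'dropped' for one restore attempt."""
        if uuid not in self.unknown_counts:
            return "dropped"
        if utilization is None:
            # No utilization data, e.g. because the gear no longer exists on the node.
            self.unknown_counts[uuid] += 1
            if self.unknown_counts[uuid] >= self.max_unknown:
                del self.unknown_counts[uuid]
                return "dropped"
            return "refused"
        # A real reading resets the unknown counter.
        self.unknown_counts[uuid] = 0
        if utilization > threshold:
            return "refused"
        del self.unknown_counts[uuid]
        return "restored"
```

With max_unknown=3, the third consecutive unknown reading drops the gear, so a deleted gear would stop producing REFUSED restore entries after about a minute at the 20s delay shown in the log.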

Additional info:
From the log, we can see that it kept trying to restore the nonexistent gear for over 10 minutes.

Aug 22 04:08:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:08:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:09:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:09:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:09:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:09:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:09:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:09:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:10:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:10:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:10:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:10:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:10:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:10:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:11:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:11:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:11:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:11:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:11:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:11:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:12:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:12:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:12:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:12:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:12:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:12:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:13:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:13:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:13:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:13:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:13:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:13:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:14:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:14:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:14:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:14:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:14:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:14:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:15:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:15:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:15:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:15:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:15:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:15:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:16:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:16:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:16:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:16:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:16:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:16:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:17:09 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:17:09 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:17:29 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:17:29 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
Aug 22 04:17:49 ip-10-196-51-239 rhc-watchman[27526]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 22 04:17:49 ip-10-196-51-239 rhc-watchman[27526]: Throttler: REFUSED restore => 25989dea0afc11e3bbd012313b083001 (unknown utilization)
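
To summarize a log like the one above without reading it line by line, a small script can count the REFUSED restore entries per gear. This is only a sketch: the function name count_refused_restores is hypothetical, and the regular expression assumes the exact message format shown in these log lines.

```python
import re
from collections import Counter
from typing import Iterable

# Matches lines such as:
# "... rhc-watchman[27526]: Throttler: REFUSED restore => <uuid> (unknown utilization)"
REFUSED_RE = re.compile(
    r"Throttler: REFUSED restore => (?P<uuid>[0-9a-f]+) \((?P<reason>.*)\)$"
)


def count_refused_restores(lines: Iterable[str]) -> Counter:
    """Count REFUSED restore entries per (gear uuid, reason) pair."""
    counts: Counter = Counter()
    for line in lines:
        match = REFUSED_RE.search(line.rstrip())
        if match:
            counts[(match.group("uuid"), match.group("reason"))] += 1
    return counts
```

For example, passing an open handle on /var/log/messages to count_refused_restores returns a Counter; a gear whose "unknown utilization" count keeps growing across runs is a candidate for the behavior reported here.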
Comment 1 Michal Fojtik 2013-08-22 09:15:23 EDT
It looks like a caching issue to me. Will check.
Comment 2 Fotios Lindiakos 2013-08-22 15:13:53 EDT
We were only removing values from the running_apps hash, but we needed to also remove them from the previously throttled hash.

Updated in this PR: https://github.com/openshift/origin-server/pull/3474
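
The idea behind the fix can be illustrated with a minimal sketch. This is not the code from the pull request (origin-server is a Ruby project, and Python is used here only for illustration); the class ThrottlerState, the attribute previously_throttled, and the method prune_deleted are hypothetical names, and only running_apps is taken from the comment above.

```python
from typing import Dict, Iterable, Set


class ThrottlerState:
    """Hypothetical model of the two hashes mentioned in the comment."""

    def __init__(self) -> None:
        self.running_apps: Dict[str, float] = {}  # gear uuid -> last utilization
        self.previously_throttled: Dict[str, float] = {}  # gear uuid -> utilization when throttled

    def prune_deleted(self, gears_on_node: Iterable[str]) -> Set[str]:
        """Drop every gear that is no longer on the node from BOTH hashes."""
        present = set(gears_on_node)
        stale = (set(self.running_apps) | set(self.previously_throttled)) - present
        for uuid in stale:
            self.running_apps.pop(uuid, None)
            # Omitting this second pop would mirror the reported behavior:
            # the deleted gear stays a restore candidate on every cycle.
            self.previously_throttled.pop(uuid, None)
        return stale
```

Calling prune_deleted at the start of each watchman cycle with the list of gears currently on the node would keep both hashes in step, so no restore is attempted for a gear that has been removed.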
Comment 3 openshift-github-bot 2013-08-22 20:16:44 EDT
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/b9d0e377ed03c4b686b9c285008923203a2a2c71
Merge pull request #3474 from fotioslindiakos/Bug999837

Merged by openshift-bot
Comment 4 Meng Bo 2013-08-23 05:26:00 EDT
Checked on devenv-stage_452; the issue has been fixed.

Aug 23 05:22:01 ip-10-118-14-174 rhc-watchman[1927]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 23 05:22:01 ip-10-118-14-174 rhc-watchman[1927]: Throttler: throttle => a323e9300bd411e3a65012313d080160 (127.325)
Aug 23 05:22:21 ip-10-118-14-174 rhc-watchman[1927]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 23 05:22:21 ip-10-118-14-174 rhc-watchman[1927]: Throttler: REFUSED restore => a323e9300bd411e3a65012313d080160 (still over threshold (378.562))
Aug 23 05:22:39 ip-10-118-14-174 CGRE[1016]: Reloading rules configuration
Aug 23 05:22:41 ip-10-118-14-174 rhc-watchman[1927]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 23 05:23:01 ip-10-118-14-174 rhc-watchman[1927]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 23 05:23:21 ip-10-118-14-174 rhc-watchman[1927]: Running rhc-watchman => delay: 20s, exception threshold: 10
Aug 23 05:23:41 ip-10-118-14-174 rhc-watchman[1927]: Running rhc-watchman => delay: 20s, exception threshold: 10


The cgroup rules configuration is reloaded after the app is deleted, and no further restore attempts are logged for the gear.

Moving the bug to verified.
