Bug 1084035
| Summary: | Scalable app is scaled down automatically when scaled up with a MIN setting of a large number | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Nan Wei <nwei> |
| Component: | Containers | Assignee: | Brenton Leanhardt <bleanhar> |
| Status: | CLOSED WORKSFORME | QA Contact: | libra bugs <libra-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 2.1.0 | CC: | erich, gpei, jialiu, libra-onpremise-devel, lmeyer, nwei, xtian |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-04-18 11:16:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description (Nan Wei, 2014-04-03 13:21:24 UTC)
If I had to guess, it might be that the update to the app's MIN value isn't committed until after the scale-up, and in the meantime haproxy sees no traffic and starts un-scaling itself? I would expect that there should be an app lock to block that from happening, though. This will require some digging...

(In reply to Luke Meyer from comment #2)
> If I had to guess, it might be that the update to the app's MIN value isn't
> committed until after the scale-up, and in the meantime haproxy sees no
> traffic and starts un-scaling itself?

+1

Still happening as described, and not on an online devenv. Hard to think of why they would be different...
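The suspected race above can be sketched in a few lines. This is a minimal illustration with hypothetical names (`App`, `scale_up`, `auto_scaler_tick` are not real OpenShift code): if the new minimum is committed only after gears are added, an auto-scaler tick that runs in between still sees the old `scale_min` and removes a gear.

```python
# Hypothetical sketch of the suspected race: the auto-scaler reads scale_min
# before the scale-up operation commits the new value. Names are illustrative.
class App:
    def __init__(self):
        self.scale_min = 1   # committed minimum
        self.gears = 1       # currently running gears

def auto_scaler_tick(app, traffic):
    # With no traffic, haproxy scales down toward the committed minimum.
    if traffic == 0 and app.gears > app.scale_min:
        app.gears -= 1

app = App()

# Interleaving that reproduces the symptom:
app.gears = 15            # step 1 of scale-up: gears are added
auto_scaler_tick(app, 0)  # auto-scaler runs, still sees scale_min=1
app.scale_min = 15        # step 2: the new minimum is committed too late

print(app.gears)  # 14 -- one gear was wrongly removed
```

An app-level lock around the whole scale-up (or committing `scale_min` before adding gears) would close this window.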
On the 2.1 devenv I'm seeing this in the gear haproxy log, which I don't seem to see on online:
```
[WARNING] 093/111220 (8135) : Server express/gear-533eca73037f758e96000029-demo is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 14 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 093/111220 (8135) : Server express/gear-533eca73037f758e9600002a-demo is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 13 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[... etc]
```
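For triage, the Layer4 failures can be counted directly from a captured log. This is a hedged sketch: the sample lines below stand in for the real gear haproxy log, and `/tmp/haproxy.log` is an assumed scratch path, not the gear's actual log location.

```shell
# Count Layer4 "Connection refused" DOWN events in a captured haproxy log.
# Sample data (placeholder server names) stands in for the real log.
cat > /tmp/haproxy.log <<'EOF'
[WARNING] 093/111220 (8135) : Server express/gear-a-demo is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms.
[WARNING] 093/111220 (8135) : Server express/gear-b-demo is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms.
EOF
grep -c 'reason: Layer4 connection problem' /tmp/haproxy.log
```

A burst of these events across many gears at once points at a configuration or network issue rather than individual gear failures.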
Maybe some config is keeping the scaled gears from being reachable?
One thing to note also is that the de-scaling only seems to happen after the http request by rhc times out (error 502).
Also interesting that this is the state in the 2.1 gear:
```
> less app-root/data/scale_limits.txt
scale_min=1
scale_max=-1
```
(online has scale_min=15, correctly)
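The discrepancy above can be checked mechanically. A minimal sketch, assuming only the `scale_limits.txt` format shown above; the `/tmp` path and the requested minimum of 15 are taken from this report, not from any OpenShift tooling:

```shell
# Recreate the gear state observed above and flag a scale_min that was
# not persisted to scale_limits.txt. Paths here are illustrative.
cat > /tmp/scale_limits.txt <<'EOF'
scale_min=1
scale_max=-1
EOF
requested_min=15
min=$(sed -n 's/^scale_min=//p' /tmp/scale_limits.txt)
if [ "$min" -lt "$requested_min" ]; then
  echo "scale_min=$min was not persisted (expected $requested_min)"
fi
```

On the 2.1 devenv this prints the mismatch; on online, where `scale_min=15` is written correctly, it would print nothing.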
The openshift-watchman service wasn't started, but with it running, no difference was observed.
Maybe because online is docker-ized? Long shot...
Can't immediately find anything to indicate what is going wrong.
Retested this bug with the 2.2/2016-03-29.1 puddle, and the issue no longer occurs. So, closing this bug.