Bug 1670695
| Summary: | [cloud-CA] After updating clusterAutoscaler maxNodesTotal value, this flag does not work | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | sunzhaohua <zhsun> |
| Component: | Cloud Compute | Assignee: | Andrew McDermott <amcdermo> |
| Status: | CLOSED ERRATA | QA Contact: | sunzhaohua <zhsun> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.1.0 | CC: | brad.ison, jhou, zhsun |
| Target Milestone: | --- | | |
| Target Release: | 4.1.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-06-04 10:42:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Trying to figure out what exactly is going on here. It looks like the cluster was scaling up, the maxNodesTotal value was increased while that was happening, and eventually the cluster exceeded the max size. Is that right? I think this may only happen if the autoscaler restarts before new nodes are ready. Can you confirm whether, at the time the autoscaler restarted to pick up the new maxNodesTotal value, any nodes were not yet in a "Ready" state?
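For reference, one way to correlate the autoscaler restart with node readiness at that moment (a sketch; the openshift-cluster-api namespace is the one used throughout this report):
$ oc get pods -n openshift-cluster-api   # AGE/RESTARTS show when the cluster-autoscaler pod was recreated
$ oc get nodes                           # any workers not yet Ready at that time would point to the suspected race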
Brad, sometimes this also happens after the new nodes are ready.
These are the steps I tested:
1. Set maxNodesTotal=7; the autoscaler works as expected.
2. After the new nodes are ready, update maxNodesTotal to 9; the autoscaler still works as expected.
3. After the new nodes are ready, update maxNodesTotal to 11; eventually the cluster exceeds the max size.
step 2:
$ oc logs -f cluster-autoscaler-default-6789dcfb79-wg42n
I0131 02:58:35.357410 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0131 02:58:52.795632 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0131 02:59:03.026393 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2b size to 4
E0131 02:59:13.794678 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:23.877088 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:33.950753 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:44.022705 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:54.107962 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:04.198743 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:14.279314 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:24.358894 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:34.434187 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:44.513815 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
rpc error: code = Unknown desc = container with ID starting with 96b95ae1e743401de54f9cf990482c9146b392a7272b2e71997b7ef6b2137ed0 not found: ID does not exist
$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-129-37.us-east-2.compute.internal Ready worker 40m v1.11.0+dde478551e
ip-10-0-148-154.us-east-2.compute.internal Ready worker 4m17s v1.11.0+dde478551e
ip-10-0-149-135.us-east-2.compute.internal Ready worker 40m v1.11.0+dde478551e
ip-10-0-153-105.us-east-2.compute.internal Ready worker 9m44s v1.11.0+dde478551e
ip-10-0-157-86.us-east-2.compute.internal Ready worker 4m17s v1.11.0+dde478551e
ip-10-0-165-193.us-east-2.compute.internal Ready worker 40m v1.11.0+dde478551e
ip-10-0-26-123.us-east-2.compute.internal Ready master 49m v1.11.0+dde478551e
ip-10-0-4-37.us-east-2.compute.internal Ready master 49m v1.11.0+dde478551e
ip-10-0-45-63.us-east-2.compute.internal Ready master 49m v1.11.0+dde478551e
step 3:
$ oc edit clusterautoscaler default
clusterautoscaler.autoscaling.openshift.io/default edited
apiVersion: autoscaling.openshift.io/v1alpha1
kind: ClusterAutoscaler
metadata:
  creationTimestamp: 2019-01-31T02:50:34Z
  generation: 1
  name: default
  resourceVersion: "36731"
  selfLink: /apis/autoscaling.openshift.io/v1alpha1/clusterautoscalers/default
  uid: fb192df3-2502-11e9-86b8-024f56e29114
spec:
  resourceLimits:
    maxNodesTotal: 11
  scaleDown:
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    enabled: true
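For what it's worth, the same change can also be made non-interactively; a sketch equivalent to the edit above:
$ oc patch clusterautoscaler default --type merge -p '{"spec":{"resourceLimits":{"maxNodesTotal":11}}}'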
$ oc logs -f cluster-autoscaler-default-8745d955d-lj6jb
I0131 03:05:48.073916 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0131 03:06:03.234709 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0131 03:06:13.833075 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2b size to 5
I0131 03:06:24.797856 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2a size to 2
E0131 03:06:34.898959 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:06:44.971573 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:06:55.047775 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:05.120855 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:15.207749 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:25.278675 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:35.360769 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
I0131 03:07:45.430281 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2a size to 4
I0131 03:08:45.993273 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2c size to 2
$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-129-37.us-east-2.compute.internal Ready worker 48m v1.11.0+dde478551e
ip-10-0-131-31.us-east-2.compute.internal Ready worker 4m29s v1.11.0+dde478551e
ip-10-0-132-184.us-east-2.compute.internal Ready worker 5m51s v1.11.0+dde478551e
ip-10-0-134-247.us-east-2.compute.internal Ready worker 4m28s v1.11.0+dde478551e
ip-10-0-148-154.us-east-2.compute.internal Ready worker 13m v1.11.0+dde478551e
ip-10-0-149-135.us-east-2.compute.internal Ready worker 48m v1.11.0+dde478551e
ip-10-0-153-105.us-east-2.compute.internal Ready worker 18m v1.11.0+dde478551e
ip-10-0-155-143.us-east-2.compute.internal Ready worker 5m39s v1.11.0+dde478551e
ip-10-0-157-86.us-east-2.compute.internal Ready worker 13m v1.11.0+dde478551e
ip-10-0-165-193.us-east-2.compute.internal Ready worker 49m v1.11.0+dde478551e
ip-10-0-173-245.us-east-2.compute.internal Ready worker 3m34s v1.11.0+dde478551e
ip-10-0-26-123.us-east-2.compute.internal Ready master 58m v1.11.0+dde478551e
ip-10-0-4-37.us-east-2.compute.internal Ready master 58m v1.11.0+dde478551e
ip-10-0-45-63.us-east-2.compute.internal Ready master 58m v1.11.0+dde478551e
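A quick way to compare the total node count against the configured cap at this point (a sketch; the jsonpath expression assumes the spec layout shown above):
$ oc get nodes --no-headers | wc -l   # 14 per the listing above
$ oc get clusterautoscaler default -o jsonpath='{.spec.resourceLimits.maxNodesTotal}{"\n"}'   # 11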
$ oc get deploy scale-up -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: 2019-01-31T02:53:13Z
  generation: 1
  labels:
    app: scale-up
  name: scale-up
  namespace: openshift-cluster-api
  resourceVersion: "41805"
  selfLink: /apis/extensions/v1beta1/namespaces/openshift-cluster-api/deployments/scale-up
  uid: 5a148ff9-2503-11e9-99ef-0aa1875edafe
spec:
  progressDeadlineSeconds: 2147483647
  replicas: 35
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: scale-up
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: scale-up
    spec:
      containers:
      - command:
        - /bin/sh
        - -c
        - echo 'this should be in the logs' && sleep 86400
        image: docker.io/library/busybox
        imagePullPolicy: Always
        name: busybox
        resources:
          requests:
            memory: 2Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 0
I spent a significant amount of time trying to reproduce this, both on AWS and using the kubemark [actuator], and was not able to reproduce it. If you run the steps again with the latest installer version, does this still happen for you?

It can still be reproduced.
Steps to reproduce:
1. Create a ClusterAutoscaler with maxNodesTotal set to 7. (Note: the YAML below was captured after the later edits in steps 4-5; its generation is 4 and maxNodesTotal already shows 11.)
apiVersion: autoscaling.openshift.io/v1alpha1
kind: ClusterAutoscaler
metadata:
  creationTimestamp: 2019-02-28T10:02:10Z
  generation: 4
  name: default
  resourceVersion: "33039"
  selfLink: /apis/autoscaling.openshift.io/v1alpha1/clusterautoscalers/default
  uid: e9dbc0ba-3b3f-11e9-8a15-0add85e6ca2e
spec:
  resourceLimits:
    maxNodesTotal: 11
  scaleDown:
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    enabled: true
    unneededTime: 10s
2. Create a MachineAutoscaler for each MachineSet (a manifest sketch follows the listing below).
$ oc get machineautoscaler
NAME REF KIND REF NAME MIN MAX AGE
autoscale-us-east-2a MachineSet zhsun5-pmx48-worker-us-east-2a 1 5 26m
autoscale-us-east-2b MachineSet zhsun5-pmx48-worker-us-east-2b 1 5 25m
autoscale-us-east-2c MachineSet zhsun5-pmx48-worker-us-east-2c 1 5 25m
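A manifest sketch that could produce a listing like the one above (assumptions: the MachineAutoscaler apiVersion and the scaleTargetRef apiVersion shown here may differ in this particular build; the names are taken from the listing):
apiVersion: autoscaling.openshift.io/v1beta1
kind: MachineAutoscaler
metadata:
  name: autoscale-us-east-2a
  namespace: openshift-cluster-api
spec:
  minReplicas: 1
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: zhsun5-pmx48-worker-us-east-2a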
3. Create a Deployment whose pending pods force the cluster to scale up (a usage sketch follows the manifest below).
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: scale-up
  labels:
    app: scale-up
spec:
  replicas: 35
  selector:
    matchLabels:
      app: scale-up
  template:
    metadata:
      labels:
        app: scale-up
    spec:
      containers:
      - name: busybox
        image: docker.io/library/busybox
        resources:
          requests:
            memory: 2Gi
        command:
        - /bin/sh
        - "-c"
        - "echo 'this should be in the logs' && sleep 86400"
      terminationGracePeriodSeconds: 0
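Usage sketch, assuming the manifest above is saved as scale-up.yaml and applied to the same namespace used elsewhere in this report:
$ oc apply -f scale-up.yaml -n openshift-cluster-api
$ oc get pods -n openshift-cluster-api -l app=scale-up --field-selector=status.phase=Pending   # the pending pods are what trigger the scale-up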
4. After the new nodes are ready, update maxNodesTotal to 9; the autoscaler works as expected.
5. After the new nodes are ready, update maxNodesTotal to 11; eventually the cluster exceeds the max size.
$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-132-2.us-east-2.compute.internal Ready worker 34m v1.12.4+4dd65df23d
ip-10-0-137-81.us-east-2.compute.internal Ready master 51m v1.12.4+4dd65df23d
ip-10-0-141-44.us-east-2.compute.internal Ready worker 102s v1.12.4+4dd65df23d
ip-10-0-151-141.us-east-2.compute.internal Ready worker 6m49s v1.12.4+4dd65df23d
ip-10-0-153-33.us-east-2.compute.internal Ready master 51m v1.12.4+4dd65df23d
ip-10-0-153-48.us-east-2.compute.internal Ready worker 3m15s v1.12.4+4dd65df23d
ip-10-0-154-206.us-east-2.compute.internal Ready worker 34m v1.12.4+4dd65df23d
ip-10-0-156-140.us-east-2.compute.internal Ready worker 6m49s v1.12.4+4dd65df23d
ip-10-0-157-6.us-east-2.compute.internal Ready worker 3m15s v1.12.4+4dd65df23d
ip-10-0-160-240.us-east-2.compute.internal Ready worker 22m v1.12.4+4dd65df23d
ip-10-0-167-174.us-east-2.compute.internal Ready worker 34m v1.12.4+4dd65df23d
ip-10-0-168-151.us-east-2.compute.internal Ready worker 14m v1.12.4+4dd65df23d
ip-10-0-172-109.us-east-2.compute.internal Ready master 51m v1.12.4+4dd65df23d
ip-10-0-174-103.us-east-2.compute.internal Ready worker 14m v1.12.4+4dd65df23d
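To catch the moment the cap is exceeded, a simple watch loop can help (a sketch; adjust the interval as needed):
$ watch -n 30 'oc get clusterautoscaler default -o jsonpath="{.spec.resourceLimits.maxNodesTotal}"; echo; oc get nodes --no-headers | wc -l'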
Created attachment 1539422 [details]
maxNodesTotal=7
Created attachment 1539423 [details]
maxNodesTotal=9
Created attachment 1539424 [details]
maxNodesTotal=11
(In reply to sunzhaohua from comment #4)

Can you try again, but this time using only:

scaleDown:
  enabled: true

in the CA config?

I have been able to reproduce this twice today. Will continue to investigate, as it doesn't happen every time.

Thanks for the logs. I'm not sure where to go with this; I simply cannot reproduce it on the two clusters you have shared with me, on my own cluster, or via kubemark. I have tried on and off for over a week now. I will look to extend our e2e tests so that they do something similar (if not identical), so that we validate this per commit.
Made additional progress here and will either update the existing PR, or close that and raise a new one, as the fix looks to be simpler than the one raised in https://github.com/openshift/kubernetes-autoscaler/pull/46.

Verified; it worked as expected. Thanks, Andrew McDermott.
clusterversion: 4.0.0-0.nightly-2019-03-06-074438

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0758
Description of problem:
After updating the ClusterAutoscaler maxNodesTotal value, the autoscaler can scale the cluster up to a node count greater than the configured limit.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.0.0-0.nightly-2019-01-29-025207 True False 2h Cluster version is 4.0.0-0.nightly-2019-01-29-025207

How reproducible:
Always

Steps to Reproduce:
1. Create a ClusterAutoscaler resource with maxNodesTotal=7.
2. Create pods to scale up the cluster; check the logs and the node count.
3. Edit the ClusterAutoscaler resource and set maxNodesTotal=9.
$ oc get clusterautoscaler default -o yaml
apiVersion: autoscaling.openshift.io/v1alpha1
kind: ClusterAutoscaler
metadata:
  generation: 1
  name: default
spec:
  resourceLimits:
    maxNodesTotal: 9
  scaleDown:
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    enabled: true
4. Check the logs and the node count.

Actual results:
After updating the ClusterAutoscaler maxNodesTotal value, the node count grows beyond the configured limit.

Before updating maxNodesTotal, autoscaler logs:
$ oc logs -f cluster-autoscaler-default-686c6d5459-h8dt7
I0130 07:22:05.935702 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0130 07:22:21.752204 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0130 07:23:52.088862 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2c size to 2
E0130 07:24:02.171334 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0130 07:24:12.232822 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached

After updating maxNodesTotal, autoscaler logs:
$ oc logs -f cluster-autoscaler-default-6765bb8dc7-dvgj7
I0130 07:26:43.101080 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0130 07:27:04.966954 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0130 07:27:15.204164 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2c size to 3
I0130 07:27:26.011054 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2b size to 3

$ oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-11-8.us-east-2.compute.internal Ready master 3h45m v1.11.0+dde478551e
ip-10-0-134-88.us-east-2.compute.internal Ready worker 3h37m v1.11.0+dde478551e
ip-10-0-151-165.us-east-2.compute.internal Ready worker 2m2s v1.11.0+dde478551e
ip-10-0-154-152.us-east-2.compute.internal Ready worker 3h37m v1.11.0+dde478551e
ip-10-0-157-185.us-east-2.compute.internal Ready worker 2m2s v1.11.0+dde478551e
ip-10-0-165-148.us-east-2.compute.internal Ready worker 5m17s v1.11.0+dde478551e
ip-10-0-166-152.us-east-2.compute.internal Ready worker 2m2s v1.11.0+dde478551e
ip-10-0-166-24.us-east-2.compute.internal Ready worker 3h37m v1.11.0+dde478551e
ip-10-0-26-144.us-east-2.compute.internal Ready master 3h45m v1.11.0+dde478551e
ip-10-0-46-25.us-east-2.compute.internal Ready master 3h45m v1.11.0+dde478551e

$ oc get machine
NAME INSTANCE STATE TYPE REGION ZONE AGE
zhsun-master-0 i-03bdc74b8dd712763 running m4.xlarge us-east-2 us-east-2a 3h
zhsun-master-1 i-040e67812dda04da4 running m4.xlarge us-east-2 us-east-2b 3h
zhsun-master-2 i-08932384bb572f448 running m4.xlarge us-east-2 us-east-2c 3h
zhsun-worker-us-east-2a-rzlfv i-01ab8e6d6007624fd running m4.large us-east-2 us-east-2a 3h
zhsun-worker-us-east-2b-7lqg7 i-0531bd2808d1ddbe2 running m4.large us-east-2 us-east-2b 3h
zhsun-worker-us-east-2b-rlng5 i-0608c403d0a98493f running m4.large us-east-2 us-east-2b 7m
zhsun-worker-us-east-2b-xr4l7 i-0eecf58bed7a134e7 running m4.large us-east-2 us-east-2b 7m
zhsun-worker-us-east-2c-7b4lh i-0690a898daee5f27f running m4.large us-east-2 us-east-2c 7m
zhsun-worker-us-east-2c-ct67b i-04bf06175e22da477 running m4.large us-east-2 us-east-2c 3h
zhsun-worker-us-east-2c-r7tfd i-0fb7c85636caf6b6b running m4.large us-east-2 us-east-2c 11m

Expected results:
After updating the ClusterAutoscaler maxNodesTotal value, the node count should remain at or below the configured limit.

Additional info:
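Another way to cross-check the "Scale-up: setting group ... size to N" log lines against actual state is to look at the MachineSets and Machines directly (a sketch; namespace as used throughout this report):
$ oc get machinesets -n openshift-cluster-api   # replica counts per worker MachineSet
$ oc get machines -n openshift-cluster-api      # should stay consistent with the node count above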