Description of problem:
After updating the ClusterAutoscaler maxNodesTotal value, the autoscaler can scale the cluster up to more nodes than the configured limit.

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-01-29-025207   True        False         2h      Cluster version is 4.0.0-0.nightly-2019-01-29-025207

How reproducible:
Always

Steps to Reproduce:
1. Create the ClusterAutoscaler resource with maxNodesTotal=7.
2. Create pods to scale up the cluster; check the autoscaler logs and the node count.
3. Edit the ClusterAutoscaler resource and set maxNodesTotal=9.

$ oc get clusterautoscaler default -o yaml
apiVersion: autoscaling.openshift.io/v1alpha1
kind: ClusterAutoscaler
metadata:
  generation: 1
  name: default
spec:
  resourceLimits:
    maxNodesTotal: 9
  scaleDown:
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    enabled: true

4. Check the autoscaler logs and the node count.

Actual results:
After updating the ClusterAutoscaler maxNodesTotal value, the node count grows beyond the configured limit.

Autoscaler logs before updating maxNodesTotal:
$ oc logs -f cluster-autoscaler-default-686c6d5459-h8dt7
I0130 07:22:05.935702 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0130 07:22:21.752204 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0130 07:23:52.088862 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2c size to 2
E0130 07:24:02.171334 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0130 07:24:12.232822 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached

Autoscaler logs after updating maxNodesTotal:
$ oc logs -f cluster-autoscaler-default-6765bb8dc7-dvgj7
I0130 07:26:43.101080 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0130 07:27:04.966954 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0130 07:27:15.204164 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2c size to 3
I0130 07:27:26.011054 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2b size to 3

$ oc get node
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-11-8.us-east-2.compute.internal      Ready    master   3h45m   v1.11.0+dde478551e
ip-10-0-134-88.us-east-2.compute.internal    Ready    worker   3h37m   v1.11.0+dde478551e
ip-10-0-151-165.us-east-2.compute.internal   Ready    worker   2m2s    v1.11.0+dde478551e
ip-10-0-154-152.us-east-2.compute.internal   Ready    worker   3h37m   v1.11.0+dde478551e
ip-10-0-157-185.us-east-2.compute.internal   Ready    worker   2m2s    v1.11.0+dde478551e
ip-10-0-165-148.us-east-2.compute.internal   Ready    worker   5m17s   v1.11.0+dde478551e
ip-10-0-166-152.us-east-2.compute.internal   Ready    worker   2m2s    v1.11.0+dde478551e
ip-10-0-166-24.us-east-2.compute.internal    Ready    worker   3h37m   v1.11.0+dde478551e
ip-10-0-26-144.us-east-2.compute.internal    Ready    master   3h45m   v1.11.0+dde478551e
ip-10-0-46-25.us-east-2.compute.internal     Ready    master   3h45m   v1.11.0+dde478551e

$ oc get machine
NAME                            INSTANCE              STATE     TYPE        REGION      ZONE         AGE
zhsun-master-0                  i-03bdc74b8dd712763   running   m4.xlarge   us-east-2   us-east-2a   3h
zhsun-master-1                  i-040e67812dda04da4   running   m4.xlarge   us-east-2   us-east-2b   3h
zhsun-master-2                  i-08932384bb572f448   running   m4.xlarge   us-east-2   us-east-2c   3h
zhsun-worker-us-east-2a-rzlfv   i-01ab8e6d6007624fd   running   m4.large    us-east-2   us-east-2a   3h
zhsun-worker-us-east-2b-7lqg7   i-0531bd2808d1ddbe2   running   m4.large    us-east-2   us-east-2b   3h
zhsun-worker-us-east-2b-rlng5   i-0608c403d0a98493f   running   m4.large    us-east-2   us-east-2b   7m
zhsun-worker-us-east-2b-xr4l7   i-0eecf58bed7a134e7   running   m4.large    us-east-2   us-east-2b   7m
zhsun-worker-us-east-2c-7b4lh   i-0690a898daee5f27f   running   m4.large    us-east-2   us-east-2c   7m
zhsun-worker-us-east-2c-ct67b   i-04bf06175e22da477   running   m4.large    us-east-2   us-east-2c   3h
zhsun-worker-us-east-2c-r7tfd   i-0fb7c85636caf6b6b   running   m4.large    us-east-2   us-east-2c   11m

Expected results:
After updating the ClusterAutoscaler maxNodesTotal value, the node count should not exceed the configured limit.

Additional info:
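A quick way to compare the configured limit with the live node count while reproducing this (a minimal sketch using standard oc commands; the jsonpath follows the ClusterAutoscaler spec shown above):

$ oc get clusterautoscaler default -o jsonpath='{.spec.resourceLimits.maxNodesTotal}'
$ oc get nodes --no-headers | wc -l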
Trying to figure out what exactly is going on here. It looks like the cluster was scaling up, the maxNodesTotal value was increased while that was happening, and eventually the cluster exceeded the maximum size. Is that right? I suspect this may only happen if the autoscaler restarts before the new nodes are ready. Can you confirm whether, at the time the autoscaler restarted to pick up the new maxNodesTotal value, there were any nodes that were not yet in a "Ready" state?
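For reference, a hedged sketch of how that could be checked (namespace and pod-name prefix taken from the logs above; the second command sorts nodes by creation time so the newest, possibly not-yet-Ready ones appear last):

$ oc get pods -n openshift-cluster-api -o wide | grep cluster-autoscaler-default
$ oc get nodes --sort-by=.metadata.creationTimestamp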
Brad, sometimes this also happens after the new nodes are ready. These are the steps I tested:

1. Set maxNodesTotal=7; the autoscaler works as expected.
2. After the new nodes are ready, update maxNodesTotal=9; the autoscaler works as expected.
3. After the new nodes are ready, update maxNodesTotal=11; eventually the cluster exceeds the max size.

Step 2:
$ oc logs -f cluster-autoscaler-default-6789dcfb79-wg42n
I0131 02:58:35.357410 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0131 02:58:52.795632 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0131 02:59:03.026393 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2b size to 4
E0131 02:59:13.794678 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:23.877088 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:33.950753 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:44.022705 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 02:59:54.107962 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:04.198743 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:14.279314 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:24.358894 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:34.434187 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:00:44.513815 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
rpc error: code = Unknown desc = container with ID starting with 96b95ae1e743401de54f9cf990482c9146b392a7272b2e71997b7ef6b2137ed0 not found: ID does not exist

$ oc get node
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-129-37.us-east-2.compute.internal    Ready    worker   40m     v1.11.0+dde478551e
ip-10-0-148-154.us-east-2.compute.internal   Ready    worker   4m17s   v1.11.0+dde478551e
ip-10-0-149-135.us-east-2.compute.internal   Ready    worker   40m     v1.11.0+dde478551e
ip-10-0-153-105.us-east-2.compute.internal   Ready    worker   9m44s   v1.11.0+dde478551e
ip-10-0-157-86.us-east-2.compute.internal    Ready    worker   4m17s   v1.11.0+dde478551e
ip-10-0-165-193.us-east-2.compute.internal   Ready    worker   40m     v1.11.0+dde478551e
ip-10-0-26-123.us-east-2.compute.internal    Ready    master   49m     v1.11.0+dde478551e
ip-10-0-4-37.us-east-2.compute.internal      Ready    master   49m     v1.11.0+dde478551e
ip-10-0-45-63.us-east-2.compute.internal     Ready    master   49m     v1.11.0+dde478551e

Step 3:
$ oc edit clusterautoscaler default
clusterautoscaler.autoscaling.openshift.io/default edited

apiVersion: autoscaling.openshift.io/v1alpha1
kind: ClusterAutoscaler
metadata:
  creationTimestamp: 2019-01-31T02:50:34Z
  generation: 1
  name: default
  resourceVersion: "36731"
  selfLink: /apis/autoscaling.openshift.io/v1alpha1/clusterautoscalers/default
  uid: fb192df3-2502-11e9-86b8-024f56e29114
spec:
  resourceLimits:
    maxNodesTotal: 11
  scaleDown:
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    enabled: true

$ oc logs -f cluster-autoscaler-default-8745d955d-lj6jb
I0131 03:05:48.073916 1 leaderelection.go:187] attempting to acquire leader lease openshift-cluster-api/cluster-autoscaler...
I0131 03:06:03.234709 1 leaderelection.go:196] successfully acquired lease openshift-cluster-api/cluster-autoscaler
I0131 03:06:13.833075 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2b size to 5
I0131 03:06:24.797856 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2a size to 2
E0131 03:06:34.898959 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:06:44.971573 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:06:55.047775 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:05.120855 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:15.207749 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:25.278675 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
E0131 03:07:35.360769 1 static_autoscaler.go:275] Failed to scale up: max node total count already reached
I0131 03:07:45.430281 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2a size to 4
I0131 03:08:45.993273 1 scale_up.go:584] Scale-up: setting group openshift-cluster-api/zhsun-worker-us-east-2c size to 2

$ oc get node
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-129-37.us-east-2.compute.internal    Ready    worker   48m     v1.11.0+dde478551e
ip-10-0-131-31.us-east-2.compute.internal    Ready    worker   4m29s   v1.11.0+dde478551e
ip-10-0-132-184.us-east-2.compute.internal   Ready    worker   5m51s   v1.11.0+dde478551e
ip-10-0-134-247.us-east-2.compute.internal   Ready    worker   4m28s   v1.11.0+dde478551e
ip-10-0-148-154.us-east-2.compute.internal   Ready    worker   13m     v1.11.0+dde478551e
ip-10-0-149-135.us-east-2.compute.internal   Ready    worker   48m     v1.11.0+dde478551e
ip-10-0-153-105.us-east-2.compute.internal   Ready    worker   18m     v1.11.0+dde478551e
ip-10-0-155-143.us-east-2.compute.internal   Ready    worker   5m39s   v1.11.0+dde478551e
ip-10-0-157-86.us-east-2.compute.internal    Ready    worker   13m     v1.11.0+dde478551e
ip-10-0-165-193.us-east-2.compute.internal   Ready    worker   49m     v1.11.0+dde478551e
ip-10-0-173-245.us-east-2.compute.internal   Ready    worker   3m34s   v1.11.0+dde478551e
ip-10-0-26-123.us-east-2.compute.internal    Ready    master   58m     v1.11.0+dde478551e
ip-10-0-4-37.us-east-2.compute.internal      Ready    master   58m     v1.11.0+dde478551e
ip-10-0-45-63.us-east-2.compute.internal     Ready    master   58m     v1.11.0+dde478551e

$ oc get deploy scale-up -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: 2019-01-31T02:53:13Z
  generation: 1
  labels:
    app: scale-up
  name: scale-up
  namespace: openshift-cluster-api
  resourceVersion: "41805"
  selfLink: /apis/extensions/v1beta1/namespaces/openshift-cluster-api/deployments/scale-up
  uid: 5a148ff9-2503-11e9-99ef-0aa1875edafe
spec:
  progressDeadlineSeconds: 2147483647
  replicas: 35
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: scale-up
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: scale-up
    spec:
      containers:
      - command:
        - /bin/sh
        - -c
        - echo 'this should be in the logs' && sleep 86400
        image: docker.io/library/busybox
        imagePullPolicy: Always
        name: busybox
        resources:
          requests:
            memory: 2Gi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 0
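One way to see the overshoot as it happens is to watch the MachineSet replica counts and the total machine count while the scale-up deployment is pending (a minimal sketch; the openshift-cluster-api namespace is taken from the deployment above and may be openshift-machine-api in later releases):

$ oc get machinesets -n openshift-cluster-api -w
$ oc get machines -n openshift-cluster-api --no-headers | wc -l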
I spent a significant amount of time trying to reproduce this, both on AWS and using the kubemark [actuator], but was not able to reproduce it. If you run the steps again with the latest installer version, does this still reproduce for you?
It can still be reproduced.

Reproduce steps:

1. Create the ClusterAutoscaler with maxNodesTotal set to 7. (The YAML below was captured after the later edits in steps 4 and 5, which is why it already shows maxNodesTotal: 11.)

apiVersion: autoscaling.openshift.io/v1alpha1
kind: ClusterAutoscaler
metadata:
  creationTimestamp: 2019-02-28T10:02:10Z
  generation: 4
  name: default
  resourceVersion: "33039"
  selfLink: /apis/autoscaling.openshift.io/v1alpha1/clusterautoscalers/default
  uid: e9dbc0ba-3b3f-11e9-8a15-0add85e6ca2e
spec:
  resourceLimits:
    maxNodesTotal: 11
  scaleDown:
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    enabled: true
    unneededTime: 10s

2. Create the MachineAutoscalers.

$ oc get machineautoscaler
NAME                   REF KIND     REF NAME                         MIN   MAX   AGE
autoscale-us-east-2a   MachineSet   zhsun5-pmx48-worker-us-east-2a   1     5     26m
autoscale-us-east-2b   MachineSet   zhsun5-pmx48-worker-us-east-2b   1     5     25m
autoscale-us-east-2c   MachineSet   zhsun5-pmx48-worker-us-east-2c   1     5     25m

3. Create pods to scale up the cluster.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: scale-up
  labels:
    app: scale-up
spec:
  replicas: 35
  selector:
    matchLabels:
      app: scale-up
  template:
    metadata:
      labels:
        app: scale-up
    spec:
      containers:
      - name: busybox
        image: docker.io/library/busybox
        resources:
          requests:
            memory: 2Gi
        command:
        - /bin/sh
        - "-c"
        - "echo 'this should be in the logs' && sleep 86400"
      terminationGracePeriodSeconds: 0

4. After the new nodes are ready, update maxNodesTotal=9; the autoscaler works as expected.
5. After the new nodes are ready, update maxNodesTotal=11; eventually the cluster exceeds the max size.

$ oc get node
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-132-2.us-east-2.compute.internal     Ready    worker   34m     v1.12.4+4dd65df23d
ip-10-0-137-81.us-east-2.compute.internal    Ready    master   51m     v1.12.4+4dd65df23d
ip-10-0-141-44.us-east-2.compute.internal    Ready    worker   102s    v1.12.4+4dd65df23d
ip-10-0-151-141.us-east-2.compute.internal   Ready    worker   6m49s   v1.12.4+4dd65df23d
ip-10-0-153-33.us-east-2.compute.internal    Ready    master   51m     v1.12.4+4dd65df23d
ip-10-0-153-48.us-east-2.compute.internal    Ready    worker   3m15s   v1.12.4+4dd65df23d
ip-10-0-154-206.us-east-2.compute.internal   Ready    worker   34m     v1.12.4+4dd65df23d
ip-10-0-156-140.us-east-2.compute.internal   Ready    worker   6m49s   v1.12.4+4dd65df23d
ip-10-0-157-6.us-east-2.compute.internal     Ready    worker   3m15s   v1.12.4+4dd65df23d
ip-10-0-160-240.us-east-2.compute.internal   Ready    worker   22m     v1.12.4+4dd65df23d
ip-10-0-167-174.us-east-2.compute.internal   Ready    worker   34m     v1.12.4+4dd65df23d
ip-10-0-168-151.us-east-2.compute.internal   Ready    worker   14m     v1.12.4+4dd65df23d
ip-10-0-172-109.us-east-2.compute.internal   Ready    master   51m     v1.12.4+4dd65df23d
ip-10-0-174-103.us-east-2.compute.internal   Ready    worker   14m     v1.12.4+4dd65df23d
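For completeness, a sketch of what one of the MachineAutoscaler manifests behind the `oc get machineautoscaler` output above might look like; the apiVersion values and the namespace are assumptions and may differ in this build:

apiVersion: autoscaling.openshift.io/v1beta1   # assumption; may be a different version in this build
kind: MachineAutoscaler
metadata:
  name: autoscale-us-east-2a
  namespace: openshift-cluster-api   # assumption; later releases use openshift-machine-api
spec:
  minReplicas: 1
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1   # assumption; match the MachineSet apiVersion in the cluster
    kind: MachineSet
    name: zhsun5-pmx48-worker-us-east-2a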
Created attachment 1539422 [details] maxNodesTotal=7
Created attachment 1539423 [details] maxNodesTotal=9
Created attachment 1539424 [details] maxNodesTotal=11
(In reply to sunzhaohua from comment #4)
> It can still be reproduced.
> [reproduction steps, manifests, and node listing snipped]

Can you try again, but this time using only scaleDown: enabled: true in the CA config?
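That is, a minimal ClusterAutoscaler along these lines (a sketch based on the resource shown above, with the delayAfter* and unneededTime fields dropped):

apiVersion: autoscaling.openshift.io/v1alpha1
kind: ClusterAutoscaler
metadata:
  name: default
spec:
  resourceLimits:
    maxNodesTotal: 11
  scaleDown:
    enabled: true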
I have been able to reproduce this twice today. Will continue to investigate as it doesn't happen every time. Thanks for the logs.
PR - https://github.com/openshift/kubernetes-autoscaler/pull/46
I'm not sure where to go with this; I simply cannot reproduce it on the two clusters you have shared with me, on my own cluster, or via kubemark. I have tried on and off for over a week now. I will look at extending our e2e tests to do something similar (if not identical) so that we validate this per commit.
Made additional progress here; I will either update the existing PR or close it and raise a new one, as the fix looks to be simpler than the one proposed in https://github.com/openshift/kubernetes-autoscaler/pull/46.
New PR: https://github.com/openshift/kubernetes-autoscaler/pull/47
Verified; it worked as expected. Thanks, Andrew McDermott. clusterversion: 4.0.0-0.nightly-2019-03-06-074438
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758