Bug 1803639 - balanceSimilarNodeGroups doesn't work
Summary: balanceSimilarNodeGroups doesn't work
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Alberto
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks: 1804826
Reported: 2020-02-17 05:31 UTC by sunzhaohua
Modified: 2020-08-27 22:35 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1804826
Environment:
Last Closed: 2020-08-27 22:35:22 UTC
Target Upstream Version:
Embargoed:




Links
System: Github
ID: openshift kubernetes-autoscaler pull 126
Private: 0
Priority: None
Status: closed
Summary: BUG 1803639: UPSTREAM: <carry>: openshift: Add topology.kubernetes.io labels to be ignored when comparing similar node g...
Last Updated: 2020-12-01 18:35:18 UTC

Description sunzhaohua 2020-02-17 05:31:13 UTC
Description of problem:
balanceSimilarNodeGroups doesn't work


Version-Release number of selected component (if applicable):
4.4.0-0.nightly-2020-02-16-221315

How reproducible:
Always

Steps to Reproduce:
1. Create a ClusterAutoscaler
apiVersion: autoscaling.openshift.io/v1
kind: ClusterAutoscaler
metadata:
  name: default
spec:
  balanceSimilarNodeGroups: true
  scaleDown:
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    enabled: true
    unneededTime: 10s

2. Create machineautoscalers
$ oc get machineautoscalers
NAME                  REF KIND     REF NAME                          MIN   MAX   AGE
machineautoscaler-a   MachineSet   zhsun44-c9lb2-worker-us-east-2a   1     12    122m
machineautoscaler-b   MachineSet   zhsun44-c9lb2-worker-us-east-2b   1     12    120m
machineautoscaler-c   MachineSet   zhsun44-c9lb2-worker-us-east-2c   1     12    32m

3. Add a workload to scale up the cluster


Actual results:
The scale-up lands in only 1 group. The "Splitting scale-up" message never appears in the cluster-autoscaler log.

I0217 04:07:42.367422       1 scale_up.go:431] Best option to resize: openshift-machine-api/zhsun44-c9lb2-worker-us-east-2b
I0217 04:07:42.367449       1 scale_up.go:435] Estimated 6 nodes needed in openshift-machine-api/zhsun44-c9lb2-worker-us-east-2b
I0217 04:07:42.367549       1 scale_up.go:540] Final scale-up plan: [{openshift-machine-api/zhsun44-c9lb2-worker-us-east-2b 1->7 (max: 12)}]
I0217 04:07:42.367576       1 scale_up.go:701] Scale-up: setting group openshift-machine-api/zhsun44-c9lb2-worker-us-east-2b size to 7


$ oc get machineset
NAME                              DESIRED   CURRENT   READY   AVAILABLE   AGE
zhsun44-c9lb2-worker-us-east-2a   1         1         1       1           3h12m
zhsun44-c9lb2-worker-us-east-2b   7         7         7       7           3h12m
zhsun44-c9lb2-worker-us-east-2c   1         1         1       1           3h12m
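
A quick way to confirm from the log alone whether a scale-up was split is to parse the "Final scale-up plan" line and count the groups in it. The following is a minimal sketch in Python; the regular expression is inferred only from the log lines quoted in this report, and the helper name parse_scale_up_plan is hypothetical, not part of any autoscaler tooling.

```python
import re

# Matches one plan entry such as:
#   {openshift-machine-api/<machineset> 1->7 (max: 12)}
# The pattern is an assumption derived from the log lines in this report.
PLAN_ENTRY = re.compile(
    r"\{(?P<group>\S+) (?P<old>\d+)->(?P<new>\d+) \(max: (?P<max>\d+)\)\}"
)


def parse_scale_up_plan(line):
    """Return {group: (old_size, new_size)} for a 'Final scale-up plan' line."""
    return {
        m["group"]: (int(m["old"]), int(m["new"]))
        for m in PLAN_ENTRY.finditer(line)
    }


line = (
    "I0217 04:07:42.367549       1 scale_up.go:540] Final scale-up plan: "
    "[{openshift-machine-api/zhsun44-c9lb2-worker-us-east-2b 1->7 (max: 12)}]"
)
plan = parse_scale_up_plan(line)
print(len(plan), plan)
```

For the failing run above, the plan contains a single group, which is the symptom described in this bug.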


Expected results:
The scale-up is balanced across all 3 groups.

Additional info:
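
The title of the linked pull request suggests the cause: node groups in different zones carry different topology.kubernetes.io labels, so they were not treated as similar. The sketch below illustrates the idea of ignoring such labels during comparison. It is a simplified Python illustration, not the autoscaler's actual code: the ignored-label list and the function names are assumptions, and the real comparison is assumed to look at more than labels (for example, node resources).

```python
# Label keys assumed to be ignored when comparing node groups.
# This list is illustrative; the authoritative list lives in the autoscaler.
IGNORED_LABELS = frozenset({
    "kubernetes.io/hostname",
    "failure-domain.beta.kubernetes.io/zone",
    "failure-domain.beta.kubernetes.io/region",
    "topology.kubernetes.io/zone",
    "topology.kubernetes.io/region",
})


def comparable_labels(labels, ignored=IGNORED_LABELS):
    """Drop the labels that are expected to differ between zones."""
    return {k: v for k, v in labels.items() if k not in ignored}


def similar_by_labels(a, b, ignored=IGNORED_LABELS):
    """Two groups are label-similar if they match after dropping ignored keys."""
    return comparable_labels(a, ignored) == comparable_labels(b, ignored)


# Hypothetical node labels for two worker groups in different zones.
group_a = {
    "node-role.kubernetes.io/worker": "",
    "topology.kubernetes.io/zone": "us-east-2a",
}
group_b = {
    "node-role.kubernetes.io/worker": "",
    "topology.kubernetes.io/zone": "us-east-2b",
}

print(similar_by_labels(group_a, group_b))                       # True
print(similar_by_labels(group_a, group_b, ignored=frozenset()))  # False
```

With an empty ignore list the zone label makes the two groups differ, which would keep the scale-up in a single group as seen in the actual results.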

Comment 3 sunzhaohua 2020-02-21 09:52:31 UTC
Verification failed.

clusterversion: 4.4.0-0.nightly-2020-02-21-045519

I0221 09:51:55.241296       1 scale_up.go:431] Best option to resize: openshift-machine-api/zhsun2-k9bts-w-b
I0221 09:51:55.241328       1 scale_up.go:435] Estimated 6 nodes needed in openshift-machine-api/zhsun2-k9bts-w-b
I0221 09:51:55.241474       1 scale_up.go:540] Final scale-up plan: [{openshift-machine-api/zhsun2-k9bts-w-b 1->7 (max: 12)}]
I0221 09:51:55.241497       1 scale_up.go:701] Scale-up: setting group openshift-machine-api/zhsun2-k9bts-w-b size to 7

Comment 4 Alberto 2020-02-21 10:00:01 UTC
>clusterversion: 4.4.0-0.nightly-2020-02-21-045519

Please verify against 4.5

Comment 5 sunzhaohua 2020-02-25 03:49:41 UTC
Sorry.
Verified.
clusterversion: 4.5.0-0.ci-2020-02-25-010652

I0225 03:41:01.976494       1 scale_up.go:431] Best option to resize: openshift-machine-api/zhsun45-tcth9-worker-us-east-2c
I0225 03:41:01.976518       1 scale_up.go:435] Estimated 23 nodes needed in openshift-machine-api/zhsun45-tcth9-worker-us-east-2c
I0225 03:41:01.976635       1 scale_up.go:532] Splitting scale-up between 3 similar node groups: {openshift-machine-api/zhsun45-tcth9-worker-us-east-2c, openshift-machine-api/zhsun45-tcth9-worker-us-east-2a, openshift-machine-api/zhsun45-tcth9-worker-us-east-2b}
I0225 03:41:01.976659       1 scale_up.go:540] Final scale-up plan: [{openshift-machine-api/zhsun45-tcth9-worker-us-east-2c 1->9 (max: 12)} {openshift-machine-api/zhsun45-tcth9-worker-us-east-2a 1->9 (max: 12)} {openshift-machine-api/zhsun45-tcth9-worker-us-east-2b 1->8 (max: 12)}]
I0225 03:41:01.976679       1 scale_up.go:701] Scale-up: setting group openshift-machine-api/zhsun45-tcth9-worker-us-east-2c size to 9
I0225 03:41:01.991680       1 scale_up.go:701] Scale-up: setting group openshift-machine-api/zhsun45-tcth9-worker-us-east-2a size to 9
I0225 03:41:02.006312       1 scale_up.go:701] Scale-up: setting group openshift-machine-api/zhsun45-tcth9-worker-us-east-2b size to 8

$ oc get machineset
NAME                              DESIRED   CURRENT   READY   AVAILABLE   AGE
zhsun45-tcth9-worker-us-east-2a   9         9         9       9           33m
zhsun45-tcth9-worker-us-east-2b   8         8         8       8           33m
zhsun45-tcth9-worker-us-east-2c   9         9         9       9           33m
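
The 9/9/8 result follows from spreading the 23 estimated nodes evenly over three groups that each start at size 1. A minimal Python sketch of that arithmetic is shown below. It is an assumption about the balancing behavior (always grow the currently smallest group, respecting each group's maximum), not the autoscaler's actual implementation, and balance_scale_up is a hypothetical name.

```python
def balance_scale_up(sizes, max_sizes, new_nodes):
    """Distribute new_nodes one at a time to the smallest group below its max.

    sizes and max_sizes map group name -> current / maximum size.
    Ties go to the group that appears first in sizes.
    """
    target = dict(sizes)
    for _ in range(new_nodes):
        candidates = [g for g in target if target[g] < max_sizes[g]]
        if not candidates:
            break  # every group is at its maximum; remaining nodes are dropped
        smallest = min(candidates, key=lambda g: target[g])
        target[smallest] += 1
    return target


# Group order taken from the "Splitting scale-up" log line above.
sizes = {"us-east-2c": 1, "us-east-2a": 1, "us-east-2b": 1}
max_sizes = {group: 12 for group in sizes}

plan = balance_scale_up(sizes, max_sizes, 23)
print(plan)  # {'us-east-2c': 9, 'us-east-2a': 9, 'us-east-2b': 8}
```

The sketch reproduces the final plan in the log: 23 new nodes plus the 3 existing ones give 26, which splits into 9, 9, and 8.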

Comment 6 Luke Meyer 2020-08-27 22:35:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

