Bug 2042265 - [IBM]"--scale-down-utilization-threshold" doesn't work on IBMCloud
Summary: [IBM]"--scale-down-utilization-threshold" doesn't work on IBMCloud
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Michael McCune
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-01-19 06:52 UTC by sunzhaohua
Modified: 2022-03-12 04:41 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-12 04:41:03 UTC
Target Upstream Version:
Embargoed:


Attachments
autoscaler logs (6.73 MB, text/plain)
2022-01-19 06:52 UTC, sunzhaohua


Links
Github openshift cluster-api-provider-ibmcloud pull 17: Bug 2042265: Fix machine providerID format (open, last updated 2022-01-21 16:07:43 UTC)
Red Hat Product Errata RHSA-2022:0056 (last updated 2022-03-12 04:41:20 UTC)

Description sunzhaohua 2022-01-19 06:52:50 UTC
Created attachment 1851797
autoscaler logs

Description of problem:
The autoscaler should only remove nodes whose utilization falls below the scale-down utilization threshold, but it removes nodes regardless of the threshold. For example, nodes are still removed with utilizationThreshold: "0.001", even though their actual memory utilization (about 0.088 in the logs below) is well above that value.

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-01-17-223655

How reproducible:
Always

Steps to Reproduce:
1. Create clusterautoscaler
apiVersion: "autoscaling.openshift.io/v1"
kind: "ClusterAutoscaler"
metadata:
  name: "default"
spec:
  scaleDown:
    enabled: true
    delayAfterAdd: 10s
    delayAfterDelete: 10s
    delayAfterFailure: 10s
    unneededTime: 10s
    utilizationThreshold: "0.001"

2. Create a MachineAutoscaler targeting the test MachineSet (a reconstructed manifest is sketched after these steps):
$ oc get machineautoscaler                                   [14:25:09]
NAME                REF KIND     REF NAME               MIN   MAX   AGE
machineautoscaler   MachineSet   huliu-033-tnd6f-test   1     3     170m

3. Scale down the operators that manage the autoscaler deployment, then raise the log verbosity:
$ oc scale deployment cluster-version-operator -n openshift-cluster-version --replicas=0
$ oc scale deployment cluster-autoscaler-operator --replicas=0
$ oc edit deploy cluster-autoscaler-default
        - --v=4        # add this flag to the container args
4. Create a workload to trigger a scale up.
5. Wait for the new nodes to join the cluster, then delete the workload.
6. Check whether the machines are scaled down, and check the autoscaler logs.
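
For reference, a MachineAutoscaler along the following lines reproduces the listing in step 2. The manifest is reconstructed from that output, so the namespace and API versions shown are the usual defaults rather than values captured from this cluster:

$ oc apply -f - <<'EOF'
apiVersion: autoscaling.openshift.io/v1beta1
kind: MachineAutoscaler
metadata:
  name: machineautoscaler
  namespace: openshift-machine-api
spec:
  minReplicas: 1
  maxReplicas: 3
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: huliu-033-tnd6f-test
EOF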

Actual results:
Nodes are removed even with --scale-down-utilization-threshold=0.001 set on the deployment:

    spec:
      containers:
      - args:
        - --logtostderr
        - --v=4
        - --cloud-provider=clusterapi
        - --namespace=openshift-machine-api
        - --scale-down-enabled=true
        - --scale-down-delay-after-add=10s
        - --scale-down-delay-after-delete=10s
        - --scale-down-delay-after-failure=10s
        - --scale-down-unneeded-time=10s
        - --scale-down-utilization-threshold=0.001
After adding the workload, machineset huliu-033-tnd6f-test scales up to 3 machines:
$ oc get machine                                             
NAME                             PHASE      TYPE        REGION   ZONE      AGE
huliu-033-tnd6f-master-0         Running    bx2d-4x16   eu-gb    eu-gb-1   24h
huliu-033-tnd6f-master-1         Running    bx2d-4x16   eu-gb    eu-gb-2   24h
huliu-033-tnd6f-master-2         Running    bx2d-4x16   eu-gb    eu-gb-3   24h
huliu-033-tnd6f-test-6ff57       Running    bx2d-4x16   eu-gb    eu-gb-3   13m
huliu-033-tnd6f-test-brvps       Running    bx2d-4x16   eu-gb    eu-gb-3   8m3s
huliu-033-tnd6f-test-xkrrf       Running    bx2d-4x16   eu-gb    eu-gb-3   9m34s
huliu-033-tnd6f-worker-1-7v4bl   Running    bx2d-4x16   eu-gb    eu-gb-1   22h

After removing the workload, machineset huliu-033-tnd6f-test scales down to 1 machine:
$ oc get node                                                
NAME                             STATUS                     ROLES    AGE   VERSION
huliu-033-tnd6f-master-0         Ready                      master   25h   v1.23.0+60f5a1c
huliu-033-tnd6f-master-1         Ready                      master   25h   v1.23.0+60f5a1c
huliu-033-tnd6f-master-2         Ready                      master   25h   v1.23.0+60f5a1c
huliu-033-tnd6f-test-brvps       Ready                      worker   19m   v1.23.0+60f5a1c
huliu-033-tnd6f-worker-1-7v4bl   Ready                      worker   22h   v1.23.0+60f5a1c

$ oc logs -f cluster-autoscaler-default-7c7bd99d87-cstm8 | grep utilization
…
I0119 06:00:08.735925       1 scale_down.go:444] Node huliu-033-tnd6f-test-brvps is not suitable for removal - memory utilization too big (0.087781)
I0119 06:00:20.565116       1 scale_down.go:444] Node huliu-033-tnd6f-test-xkrrf is not suitable for removal - memory utilization too big (0.087781)
I0119 06:00:20.565396       1 scale_down.go:444] Node huliu-033-tnd6f-test-brvps is not suitable for removal - memory utilization too big (0.087781)
I0119 06:00:32.398823       1 scale_down.go:444] Node huliu-033-tnd6f-test-xkrrf is not suitable for removal - memory utilization too big (0.087781)
I0119 06:00:32.399000       1 scale_down.go:444] Node huliu-033-tnd6f-test-brvps is not suitable for removal - memory utilization too big (0.087781)
I0119 06:00:44.246585       1 scale_down.go:444] Node huliu-033-tnd6f-test-xkrrf is not suitable for removal - memory utilization too big (0.087781)
I0119 06:00:44.246810       1 scale_down.go:444] Node huliu-033-tnd6f-test-brvps is not suitable for removal - memory utilization too big (0.087781)

Expected results:
The cluster autoscaler should use utilizationThreshold to decide whether a node is a candidate for scale down: only nodes whose utilization is below the threshold should be considered for removal. With a threshold of 0.001, nodes reporting roughly 0.088 memory utilization should never be scaled down.

Additional info:

Comment 1 Joel Speed 2022-01-19 09:37:47 UTC
Do you happen to have a must-gather for the cluster on which you replicated this bug? It would be good to see what the `cluster-autoscaler-default` deployment looked like.

Comment 2 Joel Speed 2022-01-19 09:47:50 UTC
Scratch that, the configuration is printed in the logs; I will review the logs.

Comment 3 Joel Speed 2022-01-19 10:04:49 UTC
I0119 05:55:53.919184       1 clusterapi_controller.go:556] node "huliu-033-tnd6f-test-6ff57" is in nodegroup "MachineSet/openshift-machine-api/huliu-033-tnd6f-test"
I0119 05:55:53.919231       1 scale_down.go:444] Node huliu-033-tnd6f-test-6ff57 is not suitable for removal - memory utilization too big (0.087781)

I0119 05:56:04.755697       1 static_autoscaler.go:335] 3 unregistered nodes present
I0119 05:56:04.755739       1 static_autoscaler.go:611] Removing unregistered node ibmvpc://huliu-033-tnd6f/eu-gb-3/huliu-033-tnd6f-test-6ff57

So it scaled down the machine because it decided the machine was unregistered, which is odd given that it had just noted the utilization of this same node.
I will need to refresh my memory on what an unregistered node is and work out why IBM is not registering its nodes.

Comment 4 Michael McCune 2022-01-19 22:37:53 UTC
I think we will need to see a must-gather for this, as well as the logs for the IBM machine controller. For some reason, the autoscaler is treating these new instances as never becoming nodes in the cluster. I have a feeling we will need to see the node, machine, and machineset objects from when this happens, as well as the logs for the machine controller.

In reference to the output above, these machines appear to exist:
huliu-033-tnd6f-test-6ff57       Running    bx2d-4x16   eu-gb    eu-gb-3   13m
huliu-033-tnd6f-test-xkrrf       Running    bx2d-4x16   eu-gb    eu-gb-3   9m34s

but have no equivalent in the node listing.

These are the first and last references I see in the logs to these nodes:

huliu-033-tnd6f-test-6ff57
W0119 05:40:48.912285       1 clusterapi_controller.go:455] Machine "huliu-033-tnd6f-test-6ff57" has no providerID
I0119 05:56:04.755739       1 static_autoscaler.go:611] Removing unregistered node ibmvpc://huliu-033-tnd6f/eu-gb-3/huliu-033-tnd6f-test-6ff57

huliu-033-tnd6f-test-xkrrf
W0119 05:44:28.274254       1 clusterapi_controller.go:455] Machine "huliu-033-tnd6f-test-xkrrf" has no providerID
I0119 06:03:30.986062       1 static_autoscaler.go:611] Removing unregistered node ibmvpc://huliu-033-tnd6f/eu-gb-3/huliu-033-tnd6f-test-xkrrf


It looks like these nodes were unregistered for more than 15 minutes, which means they should be reaped by the autoscaler, given the max-node-provision-time setting.

I0119 04:51:19.924415       1 flags.go:52] FLAG: --max-node-provision-time="15m0s"

So, for some reason these machines never became nodes, and the autoscaler properly deleted them as unregistered. To get to the bottom of this we will need the information mentioned at the top of this comment.

I am switching the component to Other Providers, as I believe this is not an issue with the autoscaler.
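
As an aside, one generic way to spot machines that the autoscaler will treat as unregistered (a sketch of a check, not output from this cluster) is to list each Machine with its providerID and linked node; machines whose NODE column stays empty past max-node-provision-time are the ones removed as unregistered:

$ oc get machines -n openshift-machine-api \
    -o custom-columns='NAME:.metadata.name,PROVIDERID:.spec.providerID,NODE:.status.nodeRef.name'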

Comment 6 Joel Speed 2022-01-20 10:34:54 UTC
Having looked at the must gather, I can see the issue. Taking a single instance as an example:

The providerID on the Machine: ibmvpc://zhsunibm-nf2zt/eu-gb-1/zhsunibm-nf2zt-worker-1-5vszq
The providerID on the Node: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm-nf2zt/0787_d3914005-41f1-4b81-83b7-e0db02df3aa7

These need to match; otherwise the autoscaler cannot relate the two objects. They should also simply match in general, since they refer to the same instance.

We need to fix this before we ship 4.10; otherwise it will be very hard to fix down the line.
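
A quick way to surface this kind of mismatch (again a generic sketch, not taken from the must-gather) is to diff the two sets of providerIDs; any line that appears on only one side is a Machine/Node pair the autoscaler cannot correlate:

$ oc get machines -n openshift-machine-api -o jsonpath='{range .items[*]}{.spec.providerID}{"\n"}{end}' | sort > /tmp/machine-provider-ids
$ oc get nodes -o jsonpath='{range .items[*]}{.spec.providerID}{"\n"}{end}' | sort > /tmp/node-provider-ids
$ diff /tmp/machine-provider-ids /tmp/node-provider-ids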

Comment 7 Michael McCune 2022-01-20 22:35:58 UTC
Just as a follow-up here: we have talked with IBM and they have an engineer looking into a patch for the Machine API actuator.

Comment 10 sunzhaohua 2022-01-24 04:48:55 UTC
Verified
clusterversion: 4.10.0-0.nightly-2022-01-22-102609

Tested with the steps above; "--scale-down-utilization-threshold" works as expected. With utilizationThreshold: "0.001" set, the nodes are not removed, and the providerIDs match.
$ oc get machine -o yaml | grep providerID                                                                           [12:44:57]
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0787_e78bb302-6a64-4d80-9014-7ae20d6198cf
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0797_393d5d52-d90f-4cf5-ad35-baa59ed0a345
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_aebab996-5132-4d2c-86e5-0dcfc0fa5bfd
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0787_3edd0f98-459c-4850-810a-adecdfd8ed18
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0797_4f9f4b1e-bb36-4d42-8f4e-49bc11447061
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_b4eaa9f2-bc7b-4d99-9590-bcd3976dd3a3
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_ac90b255-e169-4ad6-ad05-4c0061ca8b63
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_92b8b70d-2e97-4b80-b333-ec329db3f4f9
$ oc get node -o yaml | grep providerID                                                                              [12:45:21]
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0787_e78bb302-6a64-4d80-9014-7ae20d6198cf
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0797_393d5d52-d90f-4cf5-ad35-baa59ed0a345
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_aebab996-5132-4d2c-86e5-0dcfc0fa5bfd
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0787_3edd0f98-459c-4850-810a-adecdfd8ed18
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/0797_4f9f4b1e-bb36-4d42-8f4e-49bc11447061
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_b4eaa9f2-bc7b-4d99-9590-bcd3976dd3a3
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_ac90b255-e169-4ad6-ad05-4c0061ca8b63
    providerID: ibm://fdc2e14cf8bc4d53a67f972dc2e2c861///zhsunibm24-z2nc2/07a7_92b8b70d-2e97-4b80-b333-ec329db3f4f9

Comment 13 errata-xmlrpc 2022-03-12 04:41:03 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

