Bug 1857175
Summary: [AWS] Machineset creating infinite(large number) of machines , when an machineset with default values is used that has labels and selectors not related to cluster
Product: OpenShift Container Platform
Component: Cloud Compute
Cloud Compute sub component: Other Providers
Reporter: Milind Yadav <miyadav>
Assignee: Alberto <agarcial>
QA Contact: Milind Yadav <miyadav>
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Version: 4.6
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2020-10-27 16:14:38 UTC
Description
Milind Yadav
2020-07-15 10:41:04 UTC
>an machineset with default values is used that has labels and selectors not related to cluster
could you be more specific on what this means?
>an machineset with default values is used that has labels and selectors not related to cluster
I meant that the data below was not relevant to the existing cluster:
selector:
  matchLabels:
    machine.openshift.io/cluster-api-cluster: pmali1307-bls8p
    machine.openshift.io/cluster-api-machineset: pmali1307-bls8p-worker-us-east-2a
template:
  metadata:
    labels:
      machine.openshift.io/cluster-api-cluster: pmali1307-bls8p
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
      machine.openshift.io/cluster-api-machineset: pmali1307-bls8p-worker-us-east-2a
The labels we usually use are in line with the ones that come with the installation.
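A minimal sketch of how such a mismatch could be caught before the MachineSet is created. It is not part of the product: the function name `find_label_problems`, the dict layout (a parsed MachineSet YAML), and the cluster ID `example-cluster-x1y2z` are all assumptions made for illustration.

```python
CLUSTER_LABEL = "machine.openshift.io/cluster-api-cluster"


def find_label_problems(machineset, cluster_id):
    """Return a list of human-readable problems with a MachineSet's labels.

    `machineset` is the parsed YAML as a dict; `cluster_id` is the ID of the
    cluster the MachineSet is being created in (hypothetical inputs).
    """
    spec = machineset.get("spec", {})
    selector = spec.get("selector", {}).get("matchLabels", {})
    labels = spec.get("template", {}).get("metadata", {}).get("labels", {})
    problems = []

    # The selector must match the labels stamped onto the machines it creates.
    for key, value in selector.items():
        if labels.get(key) != value:
            problems.append(f"selector {key}={value} not present in template labels")

    # The cluster label should name the cluster the MachineSet lives in.
    for where, mapping in (("selector", selector), ("template", labels)):
        found = mapping.get(CLUSTER_LABEL)
        if found != cluster_id:
            problems.append(f"{where} {CLUSTER_LABEL}={found}, expected {cluster_id}")
    return problems


# Labels copied from the report; the target cluster ID is made up.
reported = {
    "spec": {
        "selector": {"matchLabels": {
            CLUSTER_LABEL: "pmali1307-bls8p",
            "machine.openshift.io/cluster-api-machineset": "pmali1307-bls8p-worker-us-east-2a",
        }},
        "template": {"metadata": {"labels": {
            CLUSTER_LABEL: "pmali1307-bls8p",
            "machine.openshift.io/cluster-api-machine-role": "worker",
            "machine.openshift.io/cluster-api-machine-type": "worker",
            "machine.openshift.io/cluster-api-machineset": "pmali1307-bls8p-worker-us-east-2a",
        }}},
    }
}

for problem in find_label_problems(reported, "example-cluster-x1y2z"):
    print(problem)
```

With the labels from this report, the selector and template agree with each other, so only the two cluster-label problems are printed.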
Thanks Milind. I had a quick look. I believe this happens because with https://github.com/openshift/machine-api-operator/pull/608/files we introduced an unfortunate discrepancy with the labels used by the machineSet to decide ownership over the machines https://github.com/openshift/machine-api-operator/blob/5688547505e7963783f04ad0737740cfac4b6457/pkg/controller/machineset/controller.go#L377 for the scenario where a bad machine.openshift.io/cluster-api-cluster is set by the user.
We might want to include the same logic in the machineSet controller and additionally maybe get back to enforcing it via webhooks as well https://github.com/openshift/machine-api-operator/pull/610/files

We need to revendor the changes in the actuator for this to pass. Moving back to assigned.

Validated for AWS on:
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-07-22-214212   True        False         26m     Cluster version is 4.6.0-0.nightly-2020-07-22-214212

Steps:
1. Create machineset using below yaml: http://pastebin.test.redhat.com/884420
2. oc create -f <filename>.yaml
machineset created successfully
3. Check machine, machineset and nodes

[miyadav@miyadav ~]$ oc get machineset
NAME                                   DESIRED   CURRENT   READY   AVAILABLE   AGE
miyadav-2307-zpd9c-new                 1         1         1       1           17m
miyadav-awsb-dxjkx-worker-us-east-2a   1         1         1       1           59m
miyadav-awsb-dxjkx-worker-us-east-2b   1         1         1       1           59m
miyadav-awsb-dxjkx-worker-us-east-2c   1         1         1       1           59m

[miyadav@miyadav ~]$ oc get machines -o wide
NAME                                         PHASE     TYPE        REGION      ZONE         AGE   NODE                                         PROVIDERID                              STATE
miyadav-2307-zpd9c-new-dhzvv                 Running   m4.large    us-east-2   us-east-2a   17m   ip-10-0-145-136.us-east-2.compute.internal   aws:///us-east-2a/i-06bc31b3f0e0fd5a2   running
miyadav-awsb-dxjkx-master-0                  Running   m5.xlarge   us-east-2   us-east-2a   59m   ip-10-0-133-14.us-east-2.compute.internal    aws:///us-east-2a/i-063970939f1f7e42d   running
miyadav-awsb-dxjkx-master-1                  Running   m5.xlarge   us-east-2   us-east-2b   59m   ip-10-0-180-32.us-east-2.compute.internal    aws:///us-east-2b/i-0d68493a62e2412e7   running
miyadav-awsb-dxjkx-master-2                  Running   m5.xlarge   us-east-2   us-east-2c   59m   ip-10-0-200-43.us-east-2.compute.internal    aws:///us-east-2c/i-0b02e6d92f49085d6   running
miyadav-awsb-dxjkx-worker-us-east-2a-6m6bf   Running   m5.large    us-east-2   us-east-2a   45m   ip-10-0-138-50.us-east-2.compute.internal    aws:///us-east-2a/i-0aaaf5c1335568126   running
miyadav-awsb-dxjkx-worker-us-east-2b-f2mrv   Running   m5.large    us-east-2   us-east-2b   45m   ip-10-0-191-230.us-east-2.compute.internal   aws:///us-east-2b/i-0fcd744e92eb26168   running
miyadav-awsb-dxjkx-worker-us-east-2c-rxck7   Running   m5.large    us-east-2   us-east-2c   45m   ip-10-0-202-75.us-east-2.compute.internal    aws:///us-east-2c/i-0532a17ad48f52b33   running

Expected and actual:
The machineset yaml was updated with the correct values and the replica count was honored.
.
.
.
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: miyadav-awsb-dxjkx
      machine.openshift.io/cluster-api-machineset: miyadav-2307-zpd9c-new
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: miyadav-awsb-dxjkx
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: miyadav-2307-zpd9c-new
.
.
.
Moving to VERIFIED

Additional info:
Will execute for GCP and Azure as well and update in case those fail.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:4196
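
For readers trying to picture the ownership discrepancy discussed in the comments, here is a toy model of one plausible reading. It is not the machine-api-operator code. It assumes that machines end up carrying the real cluster ID in machine.openshift.io/cluster-api-cluster while the machineSet selector still holds the user-supplied value. All function names and the ID `example-cluster` are hypothetical.

```python
CLUSTER_LABEL = "machine.openshift.io/cluster-api-cluster"
MACHINESET_LABEL = "machine.openshift.io/cluster-api-machineset"


def matches(selector, labels):
    """True when every selector key/value pair appears in the labels."""
    return all(labels.get(k) == v for k, v in selector.items())


def reconcile(machineset, machines, actual_cluster_id):
    """One toy reconcile pass: create machines until owned count reaches replicas."""
    owned = [m for m in machines if matches(machineset["selector"], m)]
    for _ in range(machineset["replicas"] - len(owned)):
        labels = dict(machineset["template_labels"])
        # Assumption: some other component rewrites the cluster label to the
        # real cluster ID, so it can differ from what the selector expects.
        labels[CLUSTER_LABEL] = actual_cluster_id
        machines.append(labels)


def simulate(cluster_label_value, actual_cluster_id, passes=5):
    """Run several reconcile passes and return how many machines exist."""
    labels = {CLUSTER_LABEL: cluster_label_value, MACHINESET_LABEL: "example-ms"}
    machineset = {"replicas": 1, "selector": dict(labels), "template_labels": labels}
    machines = []
    for _ in range(passes):
        reconcile(machineset, machines, actual_cluster_id)
    return len(machines)


# Bad cluster label: no machine ever matches, so every pass creates another one.
print(simulate("pmali1307-bls8p", "example-cluster"))  # 5
# Correct cluster label: the first machine is recognized and replicas is honored.
print(simulate("example-cluster", "example-cluster"))  # 1
```

In this model the machine count grows by one per pass for as long as the selector and the machine labels disagree. That matches the symptom in the summary, and the verified behavior above (the cluster label corrected to the real cluster ID) removes the disagreement.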