Description of problem:
Creating a machine with the invalid label "machine.openshift.io/cluster-api-cluster: zhsun2-7j5gk-invalid" succeeds, and the instance joins the cluster.

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-08-25-233755

How reproducible:
Always

Steps to Reproduce:
1. Create a machine with an invalid label:

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: zhsun2-7j5gk-invalid
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
    machine.openshift.io/instance-type: m1.large
    machine.openshift.io/region: regionOne
    machine.openshift.io/zone: nova
  name: zhsun2-7j5gk-worker-aaa
  namespace: openshift-machine-api
spec:
  metadata:
    creationTimestamp: null
  providerSpec:
    value:
      apiVersion: openstackproviderconfig.openshift.io/v1alpha1
      cloudName: openstack
      cloudsSecret:
        name: openstack-cloud-credentials
        namespace: openshift-machine-api
      flavor: m1.large
      image: rhcos-42.80.20190815.3
      kind: OpenstackProviderSpec
      metadata:
        creationTimestamp: null
      networks:
      - filter: {}
        subnets:
        - filter:
            name: zhsun2-7j5gk-nodes
            tags: openshiftClusterID=zhsun2-7j5gk
      securityGroups:
      - filter: {}
        name: zhsun2-7j5gk-worker
      serverMetadata:
        Name: zhsun2-7j5gk-worker
        openshiftClusterID: zhsun2-7j5gk
      tags:
      - openshiftClusterID=zhsun2-7j5gk
      trunk: true
      userDataSecret:
        name: worker-user-data

2. Check the machine, node and machine-controller logs.

Actual results:
The machine is created successfully and the instance joins the cluster.

$ oc get node
NAME                        STATUS   ROLES    AGE    VERSION
zhsun2-7j5gk-master-0       Ready    master   154m   v1.14.0+ceed07c42
zhsun2-7j5gk-master-1       Ready    master   155m   v1.14.0+ceed07c42
zhsun2-7j5gk-master-2       Ready    master   154m   v1.14.0+ceed07c42
zhsun2-7j5gk-worker-aaa     Ready    worker   42s    v1.14.0+ceed07c42
zhsun2-7j5gk-worker-c498z   Ready    worker   148m   v1.14.0+ceed07c42
zhsun2-7j5gk-worker-fbrmz   Ready    worker   147m   v1.14.0+ceed07c42
zhsun2-7j5gk-worker-kcfp6   Ready    worker   149m   v1.14.0+ceed07c42

$ oc get machine
NAME                        STATE    TYPE       REGION      ZONE   AGE
zhsun2-7j5gk-master-0       ACTIVE   m1.large   regionOne   nova   154m
zhsun2-7j5gk-master-1       ACTIVE   m1.large   regionOne   nova   154m
zhsun2-7j5gk-master-2       ACTIVE   m1.large   regionOne   nova   154m
zhsun2-7j5gk-worker         ACTIVE   m1.large   regionOne   nova   82m
zhsun2-7j5gk-worker-aaa     ACTIVE   m1.large   regionOne   nova   5m17s
zhsun2-7j5gk-worker-c498z   ACTIVE   m1.large   regionOne   nova   153m
zhsun2-7j5gk-worker-fbrmz   ACTIVE   m1.large   regionOne   nova   153m
zhsun2-7j5gk-worker-kcfp6   ACTIVE   m1.large   regionOne   nova   153m

I0829 08:03:08.571560 1 controller.go:129] Reconciling Machine "zhsun2-7j5gk-worker-aaa"
I0829 08:03:08.571705 1 controller.go:298] Machine "zhsun2-7j5gk-worker-aaa" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0829 08:03:08.587852 1 controller.go:129] Reconciling Machine "zhsun2-7j5gk-worker-aaa"
I0829 08:03:08.588073 1 controller.go:298] Machine "zhsun2-7j5gk-worker-aaa" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0829 08:03:08.831336 1 controller.go:247] Reconciling machine object zhsun2-7j5gk-worker-aaa triggers idempotent create.
I0829 08:04:01.402694 1 controller.go:129] Reconciling Machine "zhsun2-7j5gk-worker-aaa"
I0829 08:04:01.402813 1 controller.go:298] Machine "zhsun2-7j5gk-worker-aaa" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0829 08:04:02.288349 1 controller.go:238] Reconciling machine "zhsun2-7j5gk-worker-aaa" triggers idempotent update
I0829 08:04:02.288673 1 actuator.go:325] re-creating machine for update.
E0829 08:04:02.288729 1 actuator.go:328] delete machine for update failed: Failed to get Machine Spec from Provider Spec (clients/machineservice.go 138): no such providerSpec found in manifest

Expected results:
The machine-controller logs should report that the label is not correct.

Additional info:
In general, the machine.openshift.io/cluster-api-cluster label (here set to zhsun2-7j5gk-invalid) is supposed to be used by the actuator to tag the instances it creates in a cloud provider. It has no effect on whether a node joins the cluster. If there is a way to do it, I recommend tagging all OpenStack instances. In AWS we use:

{Key: "kubernetes.io/cluster/" + clusterID, Value: "owned"}

where clusterID is the value of the machine.openshift.io/cluster-api-cluster label.
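
For illustration only, the tag construction described in this comment could be sketched in Go roughly as follows. The Tag type and the ownedClusterTag helper are hypothetical names made up for this sketch; they are not taken from the AWS SDK or from any actuator code. Only the label name and the key/value format come from the comment itself.

```go
package main

import "fmt"

// clusterLabel is the Machine label discussed in this bug.
const clusterLabel = "machine.openshift.io/cluster-api-cluster"

// Tag mirrors the {Key, Value} shape quoted in the comment.
// It is a hypothetical stand-in, not an AWS SDK type.
type Tag struct {
	Key   string
	Value string
}

// ownedClusterTag (hypothetical helper) derives the ownership tag from a
// Machine's labels. It returns false when the label is missing or empty,
// so the caller can decide how to handle an untagged machine.
func ownedClusterTag(labels map[string]string) (Tag, bool) {
	clusterID, ok := labels[clusterLabel]
	if !ok || clusterID == "" {
		return Tag{}, false
	}
	return Tag{Key: "kubernetes.io/cluster/" + clusterID, Value: "owned"}, true
}

func main() {
	labels := map[string]string{clusterLabel: "zhsun2-7j5gk"}
	if tag, ok := ownedClusterTag(labels); ok {
		fmt.Printf("%s=%s\n", tag.Key, tag.Value)
	}
}
```

Note that this helper only copies the label value into the tag key; it does not check that the value names the real cluster, which is the gap this bug is about.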
@Jan: we use machine.openshift.io/cluster-api-cluster to tag various network resources for the machine. In this case, if we have an invalid tag, these resources can't be deleted by the installer. So I think the correct behavior here is to check if the cluster name is valid, and if not, return an error and prevent creation of the instance. My patch does exactly this: https://github.com/openshift/cluster-api-provider-openstack/pull/64 What do you think?
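
To make the proposal concrete, here is a minimal Go sketch of that kind of check. It is not the code from the linked pull request: validateClusterLabel is a hypothetical helper, and it assumes the caller already knows the real cluster ID (how the provider obtains it is not stated in this report). The wording of the mismatch error follows the machine-controller log in the verification comment.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterLabel is the Machine label discussed in this bug.
const clusterLabel = "machine.openshift.io/cluster-api-cluster"

// validateClusterLabel (hypothetical helper) returns an error when the
// Machine's cluster label is missing or does not match the expected
// cluster ID. A caller would run it before creating the instance and
// abort creation on error.
func validateClusterLabel(machineName string, labels map[string]string, clusterID string) error {
	got, ok := labels[clusterLabel]
	if !ok {
		return errors.New(clusterLabel + " label is missing on machine " + machineName)
	}
	if got != clusterID {
		return fmt.Errorf("%s label value is incorrect: %s, machine %s cannot join cluster %s",
			clusterLabel, got, machineName, clusterID)
	}
	return nil
}

func main() {
	labels := map[string]string{clusterLabel: "zhsun-8p6x5-invalid"}
	if err := validateClusterLabel("zhsun-8p6x5-worker-aaa", labels, "zhsun-8p6x5"); err != nil {
		// In a real actuator the error would be returned so that no
		// instance is created; here it is only printed.
		fmt.Println("refusing to create instance:", err)
	}
}
```

Failing before any cloud resource exists means nothing is left behind carrying a tag the installer cannot match at teardown.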
Verified.
clusterversion 4.2.0-0.nightly-2019-09-11-202233

$ ./oc get machine
NAME                       STATE    TYPE           REGION      ZONE   AGE
zhsun-8p6x5-master-0       ACTIVE   ci.m1.xlarge   regionOne   nova   89m
zhsun-8p6x5-master-1       ACTIVE   ci.m1.xlarge   regionOne   nova   89m
zhsun-8p6x5-master-2       ACTIVE   ci.m1.xlarge   regionOne   nova   89m
zhsun-8p6x5-worker-7dm4t   ACTIVE   ci.m1.xlarge   regionOne   nova   87m
zhsun-8p6x5-worker-aaa     ERROR    ci.m1.xlarge   regionOne   nova   16s
zhsun-8p6x5-worker-mgpt2   ACTIVE   ci.m1.xlarge   regionOne   nova   87m
zhsun-8p6x5-worker-wccll   ACTIVE   ci.m1.xlarge   regionOne   nova   87m

I0912 03:06:13.380580 1 controller.go:247] Reconciling machine object zhsun-8p6x5-worker-aaa triggers idempotent create.
E0912 03:06:13.386579 1 actuator.go:118] machine.openshift.io/cluster-api-cluster label value is incorrect: zhsun-8p6x5-invalid, machine zhsun-8p6x5-worker-aaa cannot join cluster zhsun-8p6x5
E0912 03:06:13.393301 1 actuator.go:470] Machine error zhsun-8p6x5-worker-aaa: machine.openshift.io/cluster-api-cluster label value is incorrect: zhsun-8p6x5-invalid, machine zhsun-8p6x5-worker-aaa cannot join cluster zhsun-8p6x5
W0912 03:06:13.393342 1 controller.go:249] Failed to create machine "zhsun-8p6x5-worker-aaa": machine.openshift.io/cluster-api-cluster label value is incorrect: zhsun-8p6x5-invalid, machine zhsun-8p6x5-worker-aaa cannot join cluster zhsun-8p6x5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922