Bug 1747245 - [osp] Machine could be created successfully when label is not correct
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Mike Fedosin
QA Contact: sunzhaohua
URL:
Whiteboard: osp
Depends On:
Blocks:
Reported: 2019-08-30 01:07 UTC by sunzhaohua
Modified: 2019-10-16 06:39 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:38:50 UTC
Target Upstream Version:


Links
System                   ID                                                 Private   Priority   Status   Summary   Last Updated
Github                   openshift/cluster-api-provider-openstack/pull/64   0         None       None     None      2019-09-10 12:03:56 UTC
Red Hat Product Errata   RHBA-2019:2922                                     0         None       None     None      2019-10-16 06:39:04 UTC

Description sunzhaohua 2019-08-30 01:07:33 UTC
Description of problem:
Creating a machine with the invalid label "machine.openshift.io/cluster-api-cluster: zhsun2-7j5gk-invalid" succeeds: the machine is created and the instance joins the cluster.

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-08-25-233755

How reproducible:
Always

Steps to Reproduce:
1. Create a machine with an invalid label:
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: zhsun2-7j5gk-invalid
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
    machine.openshift.io/instance-type: m1.large
    machine.openshift.io/region: regionOne
    machine.openshift.io/zone: nova
  name: zhsun2-7j5gk-worker-aaa
  namespace: openshift-machine-api
spec:
  metadata:
    creationTimestamp: null
  providerSpec:
    value:
      apiVersion: openstackproviderconfig.openshift.io/v1alpha1
      cloudName: openstack
      cloudsSecret:
        name: openstack-cloud-credentials
        namespace: openshift-machine-api
      flavor: m1.large
      image: rhcos-42.80.20190815.3
      kind: OpenstackProviderSpec
      metadata:
        creationTimestamp: null
      networks:
      - filter: {}
        subnets:
        - filter:
            name: zhsun2-7j5gk-nodes
            tags: openshiftClusterID=zhsun2-7j5gk
      securityGroups:
      - filter: {}
        name: zhsun2-7j5gk-worker
      serverMetadata:
        Name: zhsun2-7j5gk-worker
        openshiftClusterID: zhsun2-7j5gk
      tags:
      - openshiftClusterID=zhsun2-7j5gk
      trunk: true
      userDataSecret:
        name: worker-user-data
      
2. Check machine, node and machine-controller logs


Actual results:
The machine is created successfully and the instance joins the cluster.

$ oc get node
NAME                        STATUS   ROLES    AGE    VERSION
zhsun2-7j5gk-master-0       Ready    master   154m   v1.14.0+ceed07c42
zhsun2-7j5gk-master-1       Ready    master   155m   v1.14.0+ceed07c42
zhsun2-7j5gk-master-2       Ready    master   154m   v1.14.0+ceed07c42
zhsun2-7j5gk-worker-aaa     Ready    worker   42s    v1.14.0+ceed07c42
zhsun2-7j5gk-worker-c498z   Ready    worker   148m   v1.14.0+ceed07c42
zhsun2-7j5gk-worker-fbrmz   Ready    worker   147m   v1.14.0+ceed07c42
zhsun2-7j5gk-worker-kcfp6   Ready    worker   149m   v1.14.0+ceed07c42
$ oc get machine
NAME                        STATE    TYPE       REGION      ZONE   AGE
zhsun2-7j5gk-master-0       ACTIVE   m1.large   regionOne   nova   154m
zhsun2-7j5gk-master-1       ACTIVE   m1.large   regionOne   nova   154m
zhsun2-7j5gk-master-2       ACTIVE   m1.large   regionOne   nova   154m
zhsun2-7j5gk-worker         ACTIVE   m1.large   regionOne   nova   82m
zhsun2-7j5gk-worker-aaa     ACTIVE   m1.large   regionOne   nova   5m17s
zhsun2-7j5gk-worker-c498z   ACTIVE   m1.large   regionOne   nova   153m
zhsun2-7j5gk-worker-fbrmz   ACTIVE   m1.large   regionOne   nova   153m
zhsun2-7j5gk-worker-kcfp6   ACTIVE   m1.large   regionOne   nova   153m

I0829 08:03:08.571560       1 controller.go:129] Reconciling Machine "zhsun2-7j5gk-worker-aaa"
I0829 08:03:08.571705       1 controller.go:298] Machine "zhsun2-7j5gk-worker-aaa" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0829 08:03:08.587852       1 controller.go:129] Reconciling Machine "zhsun2-7j5gk-worker-aaa"
I0829 08:03:08.588073       1 controller.go:298] Machine "zhsun2-7j5gk-worker-aaa" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0829 08:03:08.831336       1 controller.go:247] Reconciling machine object zhsun2-7j5gk-worker-aaa triggers idempotent create.
I0829 08:04:01.402694       1 controller.go:129] Reconciling Machine "zhsun2-7j5gk-worker-aaa"
I0829 08:04:01.402813       1 controller.go:298] Machine "zhsun2-7j5gk-worker-aaa" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0829 08:04:02.288349       1 controller.go:238] Reconciling machine "zhsun2-7j5gk-worker-aaa" triggers idempotent update
I0829 08:04:02.288673       1 actuator.go:325] re-creating machine  for update.
E0829 08:04:02.288729       1 actuator.go:328] delete machine  for update failed: Failed to get Machine Spec from Provider Spec (clients/machineservice.go 138): no such providerSpec found in manifest

Expected results:
The machine-controller logs should report that the label is not correct.

Additional info:

Comment 1 Jan Chaloupka 2019-08-30 09:00:36 UTC
In general, machine.openshift.io/cluster-api-cluster: zhsun2-7j5gk-invalid is supposed to be used to tag the instance created in a cloud provider by the actuator. It has no effect on whether a node joins the cluster.

If there is a way, I recommend tagging all OpenStack instances. In AWS we use:

{Key: "kubernetes.io/cluster/" + clusterID, Value: "owned"}

clusterID is the value of the machine.openshift.io/cluster-api-cluster label.
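
As a rough illustration of the tag format quoted above, the following Python sketch builds the key/value pair from a cluster ID. The function name `cluster_ownership_tag` and the empty-ID check are assumptions made for illustration; this is not code from the AWS actuator.

```python
def cluster_ownership_tag(cluster_id: str) -> dict:
    """Return the ownership tag for cluster_id in the Key/Value form quoted above."""
    if not cluster_id:
        # Assumption: an empty ID is rejected rather than producing a bare prefix.
        raise ValueError("cluster ID must not be empty")
    return {"Key": "kubernetes.io/cluster/" + cluster_id, "Value": "owned"}


# cluster_id would be the value of the machine.openshift.io/cluster-api-cluster label.
tag = cluster_ownership_tag("zhsun2-7j5gk")
print(tag)
```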

Comment 2 Mike Fedosin 2019-09-10 14:23:00 UTC
@Jan: we use machine.openshift.io/cluster-api-cluster to tag various network resources for the machine. In this case, if we have an invalid tag, these resources can't be deleted by the installer.

So I think the correct behavior here is to check if the cluster name is valid, and if not, return an error and prevent creation of the instance. My patch does exactly this: https://github.com/openshift/cluster-api-provider-openstack/pull/64

What do you think?
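
The check proposed here can be sketched roughly as follows. This is a Python illustration only, not the code from the pull request (the provider itself is written in Go). The function name, the exception class, and passing the expected cluster ID in as a parameter are all assumptions; how the real patch obtains the cluster ID is not stated in this report. The error text mirrors the message that appears in the verification logs in Comment 4.

```python
CLUSTER_LABEL = "machine.openshift.io/cluster-api-cluster"


class InvalidClusterLabelError(Exception):
    """Hypothetical error raised when the cluster label does not match."""


def validate_cluster_label(machine_name: str, labels: dict, cluster_id: str) -> None:
    """Raise before any instance is created if the label does not name this cluster."""
    value = labels.get(CLUSTER_LABEL)
    if value != cluster_id:
        raise InvalidClusterLabelError(
            f"{CLUSTER_LABEL} label value is incorrect: {value}, "
            f"machine {machine_name} cannot join cluster {cluster_id}"
        )


# Labels taken from the reproduction manifest in the Description.
labels = {CLUSTER_LABEL: "zhsun2-7j5gk-invalid"}
try:
    validate_cluster_label("zhsun2-7j5gk-worker-aaa", labels, "zhsun2-7j5gk")
    result = "created"
except InvalidClusterLabelError as err:
    result = f"rejected: {err}"
print(result)
```

Running the check before the create call means a mislabelled machine never gets network resources that the installer would later be unable to find and delete.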

Comment 4 sunzhaohua 2019-09-12 03:07:42 UTC
Verified.

clusterversion 4.2.0-0.nightly-2019-09-11-202233

$ ./oc get machine
NAME                       STATE    TYPE           REGION      ZONE   AGE
zhsun-8p6x5-master-0       ACTIVE   ci.m1.xlarge   regionOne   nova   89m
zhsun-8p6x5-master-1       ACTIVE   ci.m1.xlarge   regionOne   nova   89m
zhsun-8p6x5-master-2       ACTIVE   ci.m1.xlarge   regionOne   nova   89m
zhsun-8p6x5-worker-7dm4t   ACTIVE   ci.m1.xlarge   regionOne   nova   87m
zhsun-8p6x5-worker-aaa     ERROR    ci.m1.xlarge   regionOne   nova   16s
zhsun-8p6x5-worker-mgpt2   ACTIVE   ci.m1.xlarge   regionOne   nova   87m
zhsun-8p6x5-worker-wccll   ACTIVE   ci.m1.xlarge   regionOne   nova   87m

I0912 03:06:13.380580       1 controller.go:247] Reconciling machine object zhsun-8p6x5-worker-aaa triggers idempotent create.
E0912 03:06:13.386579       1 actuator.go:118] machine.openshift.io/cluster-api-cluster label value is incorrect: zhsun-8p6x5-invalid, machine zhsun-8p6x5-worker-aaa cannot join cluster zhsun-8p6x5
E0912 03:06:13.393301       1 actuator.go:470] Machine error zhsun-8p6x5-worker-aaa: machine.openshift.io/cluster-api-cluster label value is incorrect: zhsun-8p6x5-invalid, machine zhsun-8p6x5-worker-aaa cannot join cluster zhsun-8p6x5
W0912 03:06:13.393342       1 controller.go:249] Failed to create machine "zhsun-8p6x5-worker-aaa": machine.openshift.io/cluster-api-cluster label value is incorrect: zhsun-8p6x5-invalid, machine zhsun-8p6x5-worker-aaa cannot join cluster zhsun-8p6x5
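
For scripted verification, the machine listing can be scanned for entries that are not ACTIVE. A minimal Python sketch, assuming the column layout shown in the `oc get machine` output above and that no column is ever empty; the helper name `machines_not_active` is hypothetical, and the sample rows are copied from that output.

```python
def machines_not_active(listing: str) -> list:
    """Return names of machines whose STATE column is anything other than ACTIVE."""
    lines = [line for line in listing.splitlines() if line.strip()]
    header = lines[0].split()
    name_idx, state_idx = header.index("NAME"), header.index("STATE")
    failed = []
    for line in lines[1:]:
        fields = line.split()
        if fields[state_idx] != "ACTIVE":
            failed.append(fields[name_idx])
    return failed


sample = """\
NAME                       STATE    TYPE           REGION      ZONE   AGE
zhsun-8p6x5-master-0       ACTIVE   ci.m1.xlarge   regionOne   nova   89m
zhsun-8p6x5-worker-aaa     ERROR    ci.m1.xlarge   regionOne   nova   16s
zhsun-8p6x5-worker-mgpt2   ACTIVE   ci.m1.xlarge   regionOne   nova   87m
"""
print(machines_not_active(sample))
```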

Comment 5 errata-xmlrpc 2019-10-16 06:38:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

