Description of problem:
The machine-healthcheck controller reports that a node has no machine annotation even though the node is annotated with a machine name.

Version-Release number of selected component (if applicable):
bin/openshift-install v0.9.0-master-9-g31662509d435d0e94415c3e9b0093a441a5e7563
4.0.0-0.alpha-2019-01-09-045210

How reproducible:
Always

Steps to Reproduce:
1. Create a MachineHealthCheck CR in the openshift-cluster-api namespace.

    apiVersion: healthchecking.openshift.io/v1alpha1
    kind: MachineHealthCheck
    metadata:
      name: example
    spec:
      selector:
        matchLabels:
          sigs.k8s.io/cluster-api-cluster: jhou
          sigs.k8s.io/cluster-api-machine-role: worker
          sigs.k8s.io/cluster-api-machine-type: worker
          sigs.k8s.io/cluster-api-machineset: jhou-worker-us-east-1b

2. Stop the kubelet on the node that carries the machine annotation for jhou-worker-us-east-1b-rnlln:

    $ oc get nodes|grep ip-10-0-154-187.ec2.internal
    ip-10-0-154-187.ec2.internal NotReady worker 2h v1.11.0+f67f40dbad

Verify that the node has the machine annotation:

    $ oc get node ip-10-0-154-187.ec2.internal -o yaml|grep 'cluster.k8s.io/machine'
    cluster.k8s.io/machine: openshift-cluster-api/jhou-worker-us-east-1b-rnlln

Verify that the machine's labels match the MachineHealthCheck's matchLabels:

    $ oc get machine jhou-worker-us-east-1b-rnlln -o yaml
    apiVersion: cluster.k8s.io/v1alpha1
    kind: Machine
    metadata:
      creationTimestamp: 2019-01-09T06:26:17Z
      finalizers:
      - machine.cluster.k8s.io
      generateName: jhou-worker-us-east-1b-
      generation: 1
      labels:
        sigs.k8s.io/cluster-api-cluster: jhou
        sigs.k8s.io/cluster-api-machine-role: worker
        sigs.k8s.io/cluster-api-machine-type: worker
        sigs.k8s.io/cluster-api-machineset: jhou-worker-us-east-1b
      name: jhou-worker-us-east-1b-rnlln
      namespace: openshift-cluster-api
      ownerReferences:
      - apiVersion: cluster.k8s.io/v1alpha1
        blockOwnerDeletion: true
        controller: true
        kind: MachineSet
        name: jhou-worker-us-east-1b
        uid: a39c4cb7-13d4-11e9-afab-0a5d67725934
      resourceVersion: "147179"
      selfLink: /apis/cluster.k8s.io/v1alpha1/namespaces/openshift-cluster-api/machines/jhou-worker-us-east-1b-rnlln
      uid: 7862cf0f-13d7-11e9-9f8f-0a609fedb69e
    spec:
      metadata:
        creationTimestamp: null
      providerConfig:
        value:
          ami:
            arn: null
            filters: null
            id: ami-0acd9649a24fe3a19
          apiVersion: awsproviderconfig.k8s.io/v1alpha1
          credentialsSecret: null
          deviceIndex: 0
          iamInstanceProfile:
            arn: null
            filters: null
            id: jhou-worker-profile
          instanceType: m4.large
          keyName: null
          kind: AWSMachineProviderConfig
          loadBalancers: null
          metadata:
            creationTimestamp: null
          placement:
            availabilityZone: us-east-1b
            region: us-east-1
          publicIp: null
          securityGroups:
          - arn: null
            filters:
            - name: tag:Name
              values:
              - jhou_worker_sg
            id: null
          subnet:
            arn: null
            filters:
            - name: tag:Name
              values:
              - jhou-worker-us-east-1b
            id: null
          tags:
          - name: openshiftClusterID
            value: c3223d2a-8b3c-43e5-9d07-e33ed7be6a6d
          - name: kubernetes.io/cluster/jhou
            value: owned
          userDataSecret:
            name: worker-user-data
      providerSpec: {}
      versions:
        kubelet: ""
    status:
      addresses:
      - address: 10.0.154.187
        type: InternalIP
      - address: ""
        type: ExternalDNS
      - address: ip-10-0-154-187.ec2.internal
        type: InternalDNS
      lastUpdated: 2019-01-09T07:56:54Z
      nodeRef:
        kind: Node
        name: ip-10-0-154-187.ec2.internal
        uid: a100e30c-13d7-11e9-8395-024c79168cf2
      providerStatus:
        apiVersion: awsproviderconfig.k8s.io/v1alpha1
        conditions:
        - lastProbeTime: 2019-01-09T06:26:20Z
          lastTransitionTime: 2019-01-09T06:26:20Z
          message: machine successfully created
          reason: MachineCreationSucceeded
          status: "True"
          type: MachineCreation
        instanceId: i-0457b5bdaa3d671df
        instanceState: running
        kind: AWSMachineProviderStatus

3. Monitor the machine-healthcheck container:

    $ oc logs -f clusterapi-manager-controllers-b9cc8df7-t9pjh -c machine-healthcheck|grep ip-10-0-154-187

Actual results:
```
I0109 07:53:14.104234 1 machinehealthcheck_controller.go:72] Reconciling MachineHealthCheck triggered by /ip-10-0-154-187.ec2.internal
W0109 07:53:14.104433 1 machinehealthcheck_controller.go:91] No machine annotation for node ip-10-0-154-187.ec2.internal
I0109 07:56:54.433932 1 machinehealthcheck_controller.go:72] Reconciling MachineHealthCheck triggered by /ip-10-0-154-187.ec2.internal
W0109 07:56:54.434053 1 machinehealthcheck_controller.go:91] No machine annotation for node ip-10-0-154-187.ec2.internal
```

Expected results:
The controller finds the machine annotation on node ip-10-0-154-187.ec2.internal.

Additional info:
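The two manual checks in the steps above (the node carries a `cluster.k8s.io/machine` annotation, and the machine's labels satisfy the MachineHealthCheck's matchLabels) can be sketched in a few lines of Python. This is only an illustration of the expected behavior, not the controller's code, which is written in Go. The function names `machine_for_node` and `selector_matches` are hypothetical, and the sketch assumes the annotation value always has the `namespace/name` form shown in the report.

```python
# Illustrative sketch only; not the machine-healthcheck controller's code.
MACHINE_ANNOTATION = "cluster.k8s.io/machine"  # annotation key shown in the report


def machine_for_node(annotations):
    """Return (namespace, name) from a node's machine annotation, or None.

    Assumes the value has the form "namespace/name", as in the report.
    """
    value = annotations.get(MACHINE_ANNOTATION)
    if not value or "/" not in value:
        return None
    namespace, name = value.split("/", 1)
    return namespace, name


def selector_matches(match_labels, machine_labels):
    """True when every matchLabels key/value pair is present on the machine."""
    return all(machine_labels.get(k) == v for k, v in match_labels.items())


# Values copied from the report.
node_annotations = {
    "cluster.k8s.io/machine": "openshift-cluster-api/jhou-worker-us-east-1b-rnlln",
}
match_labels = {
    "sigs.k8s.io/cluster-api-cluster": "jhou",
    "sigs.k8s.io/cluster-api-machine-role": "worker",
    "sigs.k8s.io/cluster-api-machine-type": "worker",
    "sigs.k8s.io/cluster-api-machineset": "jhou-worker-us-east-1b",
}
machine_labels = dict(match_labels)  # the machine carries the same four labels

print(machine_for_node(node_annotations))
print(selector_matches(match_labels, machine_labels))
```

With the values from the report, both checks pass, so the "No machine annotation" warning is not what the reported cluster state should produce.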
Also reproducible on 4.0.0-0.nightly-2019-01-10-005204
Upstream PR: https://github.com/openshift/machine-api-operator/pull/175
Verified in version 4.0.0-0.nightly-2019-01-25-034943

    $ oc logs -f clusterapi-manager-controllers-595cdd7745-2fdlj -c machine-healthcheck
    I0125 09:19:49.719644 1 machinehealthcheck_controller.go:135] Machine zhsun-worker-us-east-2c-8f8rt has no MachineHealthCheck associated
    I0125 09:19:52.231169 1 machinehealthcheck_controller.go:73] Reconciling MachineHealthCheck triggered by /ip-10-0-134-135.us-east-2.compute.internal
    I0125 09:19:52.231401 1 machinehealthcheck_controller.go:96] Node ip-10-0-134-135.us-east-2.compute.internal is annotated with machine openshift-cluster-api/zhsun-worker-us-east-2a-fw2wh
    I0125 09:19:52.231589 1 machinehealthcheck_controller.go:153] Initialising remediation logic for machine zhsun-worker-us-east-2a-fw2wh
    I0125 09:19:52.231710 1 machinehealthcheck_controller.go:190] No remediaton action was taken. Machine zhsun-worker-us-east-2a-fw2wh with node ip-10-0-134-135.us-east-2.compute.internal is healthy
    I0125 09:19:53.981858 1 machinehealthcheck_controller.go:73] Reconciling MachineHealthCheck triggered by /ip-10-0-28-232.us-east-2.compute.internal
    I0125 09:19:53.982016 1 machinehealthcheck_controller.go:96] Node ip-10-0-28-232.us-east-2.compute.internal is annotated with machine openshift-cluster-api/zhsun-master-1
    I0125 09:19:53.982317 1 machinehealthcheck_controller.go:135] Machine zhsun-master-1 has no MachineHealthCheck associated
    I0125 09:19:54.788446 1 machinehealthcheck_controller.go:73] Reconciling MachineHealthCheck triggered by /ip-10-0-34-201.us-east-2.compute.internal
    I0125 09:19:54.788507 1 machinehealthcheck_controller.go:96] Node ip-10-0-34-201.us-east-2.compute.internal is annotated with machine openshift-cluster-api/zhsun-master-2
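For anyone re-verifying on another build, a small script can sort nodes by which of the two messages the controller logged for them. The sketch below is a hypothetical helper, not part of any OpenShift tooling. It assumes the message wording stays exactly as quoted in this report ("No machine annotation for node ..." before the fix, "Node ... is annotated with machine ..." after it), and the function name `classify` is made up for illustration.

```python
import re

# Message patterns copied from the log lines quoted in this report.
MISSING = re.compile(r"No machine annotation for node (\S+)")
ANNOTATED = re.compile(r"Node (\S+) is annotated with machine (\S+)")


def classify(lines):
    """Split nodes into those logged as unannotated and those logged as annotated."""
    missing = set()
    annotated = {}
    for line in lines:
        m = MISSING.search(line)
        if m:
            missing.add(m.group(1))
            continue
        m = ANNOTATED.search(line)
        if m:
            annotated[m.group(1)] = m.group(2)
    return missing, annotated


# Sample lines taken from the report (one before the fix, two after).
sample_log = [
    "W0109 07:53:14.104433 1 machinehealthcheck_controller.go:91] "
    "No machine annotation for node ip-10-0-154-187.ec2.internal",
    "I0125 09:19:52.231169 1 machinehealthcheck_controller.go:73] "
    "Reconciling MachineHealthCheck triggered by /ip-10-0-134-135.us-east-2.compute.internal",
    "I0125 09:19:52.231401 1 machinehealthcheck_controller.go:96] "
    "Node ip-10-0-134-135.us-east-2.compute.internal is annotated with machine "
    "openshift-cluster-api/zhsun-worker-us-east-2a-fw2wh",
]

missing, annotated = classify(sample_log)
print("missing annotation:", sorted(missing))
print("annotated:", annotated)
```

On a fixed build, the set of nodes logged as missing the annotation should be empty for nodes that `oc get node ... -o yaml` shows as annotated.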
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758