Bug 1800423 - Remediation annotation should be applied to the Machine
Summary: Remediation annotation should be applied to the Machine
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.4.0
Assignee: Alberto
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On: 1802956
Blocks:
Reported: 2020-02-07 02:54 UTC by Andrew Beekhof
Modified: 2020-05-15 16:03 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-15 16:03:49 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-api-operator pull 475 | 0 | None | closed | Bug 1800423: Apply the reboot annotation to the machine | 2021-01-20 15:05:38 UTC

Description Andrew Beekhof 2020-02-07 02:54:15 UTC
Description of problem:

Currently the MHC uses a Node annotation to indicate that it requires external remediation.  

For the controller that implements the remediation, it is preferable that the annotation be applied to the Machine instead.  This is also consistent with the M in MHC.

See: https://github.com/openshift/machine-api-operator/pull/475

Version-Release number of selected component (if applicable): 4.4

Comment 2 Milind Yadav 2020-02-14 10:52:37 UTC
Description: The MHC used a Node annotation to indicate that external remediation is required. The desired behavior, which suits the controller that implements the remediation, is for the annotation to be applied to the Machine instead.

Cluster version tested on:
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-02-13-212616   True        False         7h19m   Cluster version is 4.4.0-0.nightly-2020-02-13-212616

Test Steps :

1. Create a MachineHealthCheck using the YAML below:

apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  # Server-populated fields (creationTimestamp, generation, resourceVersion,
  # selfLink, uid) are omitted; resourceVersion must not be set on create.
  name: <User defined Name>
  namespace: openshift-machine-api
spec:
  maxUnhealthy: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <Your Cluster Name>
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
      machine.openshift.io/cluster-api-machineset: <Your Machine Set>
  unhealthyConditions:
    - status: 'False'
      timeout: 300s
      type: Ready
    - status: Unknown
      timeout: 300s
      type: Ready

Result: MHC created successfully.

2. Annotate the MHC with the 'reboot' remediation strategy:
oc annotate mhc NAME healthchecking.openshift.io/strategy=reboot

Result: Annotation applied successfully.

3. In the cloud provider console, stop the instance backing a machine in the machineset.

Result: The machine should shut down/stop.

4. Monitor the machine-healthcheck-controller.

Result: Remediation should trigger, but it should not delete the machine.
Instead, it should add an annotation to the machine.

5. Inspect the annotations on the machine:
oc get machine <Machine name of the machineset added to healthcheck> -o=jsonpath="{.metadata.annotations}"

Result: Annotation added successfully to the machine:
healthchecking.openshift.io/machine-remediation-reboot: ""
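
The behavior checked in steps 4 and 5 can be sketched roughly as follows. This is a toy model in Python, not the machine-healthcheck-controller code: the function names (annotations, remediate, has_reboot_annotation) and the sample objects are hypothetical, and falling back to "delete" when no reboot strategy is set is an assumption drawn from the expected result in step 4. Only the two annotation keys come from the steps above.

```python
import json

# Annotation keys taken from the test steps above.
STRATEGY_ANNOTATION = "healthchecking.openshift.io/strategy"
REBOOT_ANNOTATION = "healthchecking.openshift.io/machine-remediation-reboot"


def annotations(obj):
    """Return the object's metadata.annotations, or an empty dict if unset."""
    return (obj.get("metadata") or {}).get("annotations") or {}


def remediate(mhc, machine):
    """Toy model of the expected remediation decision (not the controller code).

    With the 'reboot' strategy the Machine is annotated and left in place.
    Otherwise this model reports that the Machine would be deleted (assumption).
    """
    if annotations(mhc).get(STRATEGY_ANNOTATION) == "reboot":
        meta = machine.setdefault("metadata", {})
        annos = meta.get("annotations") or {}
        annos[REBOOT_ANNOTATION] = ""
        meta["annotations"] = annos
        return "annotate"
    return "delete"


def has_reboot_annotation(machine_json):
    """Check the JSON form of a Machine (e.g. from `oc get machine NAME -o json`)."""
    return REBOOT_ANNOTATION in annotations(json.loads(machine_json))


if __name__ == "__main__":
    # Hypothetical objects, for illustration only.
    mhc = {"metadata": {"annotations": {STRATEGY_ANNOTATION: "reboot"}}}
    machine = {"metadata": {"name": "example-worker"}}
    print(remediate(mhc, machine))
    print(has_reboot_annotation(json.dumps(machine)))
```

The check reads the annotation from the Machine object rather than the Node, which is the change this bug asks for.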

