Bug 1800425
| Summary: | Choose more appropriate annotation for external remediation | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Andrew Beekhof <abeekhof> |
| Component: | Cloud Compute | Assignee: | Steven Hardy <shardy> |
| Cloud Compute sub component: | BareMetal Provider | QA Contact: | Amit Ugol <augol> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | unspecified | CC: | stbenjam, vlaad |
| Version: | 4.3.0 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-15 16:04:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Andrew Beekhof
2020-02-07 03:01:27 UTC
For testing, the steps are: create an MHC -> annotate the strategy -> stop the instance from the provider console -> monitor the MHC.
Expected: if the MHC is annotated with machine.openshift.io/remediation-strategy=external-baremetal, the machine will not be deleted and remediated by the healthcheck controller.
More info is needed on whether the above steps suffice.
-- Expecting the below steps to cover the testing for the change --
version :
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.4.0-0.nightly-2020-03-08-213224 True False 6h45m Cluster version is 4.4.0-0.nightly-2020-03-08-213224
Steps :
1. Create an MHC using the YAML below:
---
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: mh1
  namespace: openshift-machine-api
spec:
  maxUnhealthy: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <Your cluster>
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
      machine.openshift.io/cluster-api-machineset: <Your machineset>
  unhealthyConditions:
  - status: "False"
    timeout: 300s
    type: Ready
  - status: Unknown
    timeout: 300s
    type: Ready
2. Annotate the MHC:
oc annotate mhc <mhc name> healthchecking.openshift.io/strategy=machine.openshift.io/remediation-strategy=external-baremetal
3. Terminate a machine of the machineset being monitored by the MHC using the IaaS console (AWS in this case).
Actual: Machine remediation did not happen and the machine stays in the Failed state.
Expected: No remediation should take place.
(In reply to Milind Yadav from comment #4)
> -- Expecting the below steps to cover the testing for the change --
>
> version :
> NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
> version 4.4.0-0.nightly-2020-03-08-213224 True False 6h45m Cluster version is 4.4.0-0.nightly-2020-03-08-213224
>
> Steps :
>
> 1.Create mhc use below yaml :
> ---
> apiVersion: machine.openshift.io/v1beta1
> kind: MachineHealthCheck
> metadata:
>   name: mh1
>   namespace: openshift-machine-api
> spec:
>   maxUnhealthy: 3
>   selector:
>     matchLabels:
>       machine.openshift.io/cluster-api-cluster: <Your cluster>
>       machine.openshift.io/cluster-api-machine-role: worker
>       machine.openshift.io/cluster-api-machine-type: worker
>       machine.openshift.io/cluster-api-machineset: <Your machineset>
>   unhealthyConditions:
>   - status: "False"
>     timeout: 300s
>     type: Ready
>   - status: Unknown
>     timeout: 300s
>     type: Ready
>
> 2.Annotate mhc :
> oc annotate mhc <mhc name> healthchecking.openshift.io/strategy=machine.openshift.io/remediation-strategy=external-baremetal

This looks wrong. I think you want:

oc annotate mhc <mhc name> machine.openshift.io/remediation-strategy=external-baremetal

> 3.Terminate the machine of the machineset being monitored by mhc using the
> IAAS console (AWS in this)
>
> Actual : Machine remediation did not happen and it stays in Failed state
> Expected : No remediation should take place

Was the 'host.metal3.io/external-remediation' annotation added to the machine associated with the failed node?

I cannot check the annotation on the node, as the node died after the instance that contained it was terminated.
Do you mean whether the annotation 'host.metal3.io/external-remediation' was added to the machine that is showing Failed status?
If so, then no, it wasn't. The annotations were:
annotations:
  machine.openshift.io/instance-state: running
(In reply to Milind Yadav from comment #6)
> I cannot check annotation at the node as , node died after the Instance
> that was containing it got terminated .

It should be on the Machine, not the node. If the Node got deleted, then you've tested the default remediation strategy (deletion), not the baremetal one.

> Do you mean the annotation 'host.metal3.io/external-remediation' was added or
> not on the machine that is showing failed status ?
>
> Then , no , it wasnt , the annotation was
>
> annotations:
>   machine.openshift.io/instance-state: running

I would recommend retesting with 'oc annotate mhc <mhc name> machine.openshift.io/remediation-strategy=external-baremetal'

@Andrew, I think this is what you expected and is correct. I will update the annotation value as you suggested, thanks. The case is still VERIFIED.

In the validation steps, updated:

2.Annotate mhc :
> oc annotate mhc <mhc name> healthchecking.openshift.io/strategy=machine.openshift.io/remediation-strategy=external-baremetal

to

'oc annotate mhc <mhc name> machine.openshift.io/remediation-strategy=external-baremetal'

[miyadav@miyadav bug1800425]$ oc describe machine aiyengar-1103-6nfzf-worker-us-east-2c-q8p6j
Name:         aiyengar-1103-6nfzf-worker-us-east-2c-q8p6j
Namespace:    openshift-machine-api
Labels:       machine.openshift.io/cluster-api-cluster=aiyengar-1103-6nfzf
              machine.openshift.io/cluster-api-machine-role=worker
              machine.openshift.io/cluster-api-machine-type=worker
              machine.openshift.io/cluster-api-machineset=aiyengar-1103-6nfzf-worker-us-east-2c
              machine.openshift.io/instance-type=m4.large
              machine.openshift.io/region=us-east-2
              machine.openshift.io/zone=us-east-2c
Annotations:  host.metal3.io/external-remediation:
              machine.openshift.io/instance-state: running
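
The discussion in this bug turns on which annotation key the MachineHealthCheck actually carries. The following is a minimal Python sketch of that reasoning, not the controller's or oc's real code. The annotation names are taken from this bug; the function names, the split-on-first-'=' parsing of the KEY=VALUE argument, and the decision logic are assumptions made for illustration only.

```python
# Annotation names come from this bug report. Everything else here
# (function names, parsing rule, decision logic) is a hypothetical sketch.
MHC_STRATEGY_KEY = "machine.openshift.io/remediation-strategy"
EXTERNAL_BAREMETAL = "external-baremetal"
MACHINE_EXTERNAL_KEY = "host.metal3.io/external-remediation"


def parse_annotation_arg(arg):
    """Split a KEY=VALUE argument on the first '=' (assumed behaviour)."""
    key, _, value = arg.partition("=")
    return key, value


def remediate(mhc_annotations, machine_annotations):
    """Return (action, machine_annotations) for an unhealthy machine.

    With the external-baremetal strategy the machine is only annotated;
    otherwise the default strategy (deletion) applies.
    """
    if mhc_annotations.get(MHC_STRATEGY_KEY) == EXTERNAL_BAREMETAL:
        updated = dict(machine_annotations)
        updated[MACHINE_EXTERNAL_KEY] = ""
        return "external", updated
    return "delete", dict(machine_annotations)


machine = {"machine.openshift.io/instance-state": "running"}

# Argument used in the first test run: the strategy ends up as a value
# under an unrelated key, so the strategy key itself is never set.
wrong = dict([parse_annotation_arg(
    "healthchecking.openshift.io/strategy="
    "machine.openshift.io/remediation-strategy=external-baremetal")])

# Argument recommended in the reply.
right = dict([parse_annotation_arg(
    "machine.openshift.io/remediation-strategy=external-baremetal")])

print(remediate(wrong, machine)[0])   # delete
print(remediate(right, machine)[0])   # external
print(sorted(remediate(right, machine)[1]))
```

Under these assumptions, only the second form leaves 'host.metal3.io/external-remediation' on the Machine, which matches the 'oc describe machine' output reported after the retest.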