Bug 1843597 - [IPI][OSP] Worker deleted on openstack is not recreated
Summary: [IPI][OSP] Worker deleted on openstack is not recreated
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: egarcia
QA Contact: David Sanz
URL:
Whiteboard:
Depends On:
Blocks: 1848755
 
Reported: 2020-06-03 15:51 UTC by David Sanz
Modified: 2020-10-27 16:05 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 16:04:47 UTC
Target Upstream Version:
Embargoed:




Links:
- GitHub: openshift/cluster-api-provider-openstack pull 101 (closed): "Bug 1843597: Revendor MAO and client-go" (last updated 2020-07-06 16:55:34 UTC)
- Red Hat Product Errata: RHBA-2020:4196 (last updated 2020-10-27 16:05:18 UTC)

Description David Sanz 2020-06-03 15:51:02 UTC
Description of problem:

With a MachineHealthCheck created for the workers machineset, if a worker machine's instance is deleted directly in OpenStack (openstack server delete <UUID>), the worker is not recreated.

$ oc get nodes,machines,machineset,MachineHealthCheck -A
NAME                               STATUS     ROLES    AGE   VERSION
node/mrnd-tst-6dlcb-master-0       Ready      master   43m   v1.18.3+a637491
node/mrnd-tst-6dlcb-master-1       Ready      master   43m   v1.18.3+a637491
node/mrnd-tst-6dlcb-master-2       Ready      master   43m   v1.18.3+a637491
node/mrnd-tst-6dlcb-worker-9wnc2   Ready      worker   30m   v1.18.3+a637491
node/mrnd-tst-6dlcb-worker-bgqgq   NotReady   worker   27m   v1.18.3+a637491

NAMESPACE               NAME                                                       PHASE     TYPE           REGION      ZONE   AGE
openshift-machine-api   machine.machine.openshift.io/mrnd-tst-6dlcb-master-0       Running   ci.m1.xlarge   regionOne   nova   43m
openshift-machine-api   machine.machine.openshift.io/mrnd-tst-6dlcb-master-1       Running   ci.m1.xlarge   regionOne   nova   43m
openshift-machine-api   machine.machine.openshift.io/mrnd-tst-6dlcb-master-2       Running   ci.m1.xlarge   regionOne   nova   43m
openshift-machine-api   machine.machine.openshift.io/mrnd-tst-6dlcb-worker-9wnc2   Running   ci.m1.xlarge   regionOne   nova   36m
openshift-machine-api   machine.machine.openshift.io/mrnd-tst-6dlcb-worker-bgqgq   Failed    ci.m1.xlarge   regionOne   nova   36m

NAMESPACE               NAME                                                    DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   machineset.machine.openshift.io/mrnd-tst-6dlcb-worker   2         2         1       1           43m

NAMESPACE               NAME                                                             MAXUNHEALTHY   EXPECTEDMACHINES   CURRENTHEALTHY
openshift-machine-api   machinehealthcheck.machine.openshift.io/openstack-health-check   40%            2                  1

Related log from machine-api-controllers:

I0603 15:26:45.455332       1 controller.go:165] mrnd-tst-6dlcb-worker-bgqgq: reconciling Machine
I0603 15:26:45.497169       1 utils.go:99] Cloud provider CA cert not provided, using system trust bundle
I0603 15:26:50.751532       1 controller.go:420] mrnd-tst-6dlcb-worker-bgqgq: going into phase "Failed"
I0603 15:26:50.771680       1 controller.go:165] mrnd-tst-6dlcb-worker-bgqgq: reconciling Machine
W0603 15:26:50.771710       1 controller.go:262] mrnd-tst-6dlcb-worker-bgqgq: machine has gone "Failed" phase. It won't reconcile
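
To see why the machine entered the "Failed" phase, its status can be inspected; the commands below are a generic sketch, not output captured from this cluster. Deleting the Failed Machine object by hand should also make the machineset provision a replacement in the meantime:

$ # Inspect the error recorded on the Failed machine (fields from machine.machine.openshift.io status)
$ oc -n openshift-machine-api get machine mrnd-tst-6dlcb-worker-bgqgq \
    -o jsonpath='{.status.errorReason}{"\n"}{.status.errorMessage}{"\n"}'
$ # Manual workaround: remove the Failed Machine so the machineset scales back up with a new one
$ oc -n openshift-machine-api delete machine mrnd-tst-6dlcb-worker-bgqgq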

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-06-03-105031

How reproducible:


Steps to Reproduce:
1. Install an IPI cluster on OSP
2. Create a MachineHealthCheck for the workers machineset (see the example manifest below)
3. Destroy the worker instance on OpenStack (openstack server delete <UUID>)
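
For reference, a MachineHealthCheck matching the one shown in the description (name openstack-health-check, maxUnhealthy 40%) can be created roughly as follows; the machineset label value and the unhealthy-condition timeouts below are illustrative, not taken from this cluster:

$ cat <<EOF | oc apply -f -
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: openstack-health-check
  namespace: openshift-machine-api
spec:
  # Match the worker machines created by the workers machineset
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machineset: mrnd-tst-6dlcb-worker
  maxUnhealthy: 40%
  unhealthyConditions:
  - type: Ready
    status: "False"
    timeout: 300s
  - type: Ready
    status: Unknown
    timeout: 300s
EOF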

Actual results:
Worker is not recreated

Expected results:
Worker is recreated by the machineset

Additional info:

Comment 2 Mike Fedosin 2020-06-22 08:36:34 UTC
The fix has been merged, so I'm moving this bug to ON_QA: https://github.com/openshift/cluster-api-provider-openstack/pull/101

Comment 3 David Sanz 2020-06-22 11:32:38 UTC
Verified on 4.6.0-0.nightly-2020-06-20-011219
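
A sketch of the verification flow, repeating the reproduction steps (the server UUID is a placeholder):

$ openstack server delete <worker-server-uuid>
$ oc -n openshift-machine-api get machines -w   # MHC removes the unhealthy machine, machineset creates a replacement
$ oc get nodes                                  # replacement worker eventually joins as Ready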

Comment 6 errata-xmlrpc 2020-10-27 16:04:47 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

