Bug 1740956 - Ironic nodes registered to the bmo Ironic are lost after rebooting the master node where bmo pod is running
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Kubernetes-native Infrastructure
Classification: Red Hat
Component: Deployment
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 1.0
Assignee: Angus Thomas
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-14 01:15 UTC by Marius Cornea
Modified: 2020-04-06 13:21 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-16 11:49:54 UTC
Target Upstream Version:


Description Marius Cornea 2019-08-14 01:15:43 UTC
Description of problem:

Ironic nodes registered to the bmo Ironic are lost after rebooting the master node where the bmo pod is running.

Steps to Reproduce:

[cloud-user@rhhi-node-worker-0 dev-scripts]$ export OS_URL=http://172.22.0.3:6385
[cloud-user@rhhi-node-worker-0 dev-scripts]$ export OS_TOKEN=fake-token
[cloud-user@rhhi-node-worker-0 dev-scripts]$ openstack baremetal node list
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name               | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| 56904f3a-88da-4ea9-b237-155216d12d9d | openshift-master-0 | None          | power on    | adopt failed       | False       |
| 21466e2a-a0c5-4a0e-ae4b-788e02c4fa0c | openshift-master-1 | None          | power on    | adopt failed       | False       |
| 1b48b778-d341-4e21-ad9e-a49ae9619ed1 | openshift-master-2 | None          | power on    | adopt failed       | False       |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+


[cloud-user@rhhi-node-worker-0 dev-scripts]$ oc -n openshift-machine-api get pods/metal3-baremetal-operator-7469cd5f89-5h2fs -o yaml | grep master-
  nodeName: rhhi-node-master-1


Power off the master node where the baremetal-operator pod is running:
openstack baremetal node power off openshift-master-1 # run against the provisionhost Ironic


Wait for the node to go offline:
[cloud-user@rhhi-node-worker-0 ~]$ oc get nodes
NAME                 STATUS     ROLES           AGE   VERSION
rhhi-node-master-0   Ready      master,worker   82m   v1.14.0+739670a83
rhhi-node-master-1   NotReady   master,worker   82m   v1.14.0+739670a83
rhhi-node-master-2   Ready      master,worker   82m   v1.14.0+739670a83


Power the node back on:
openstack baremetal node power on openshift-master-1 # run against the provisionhost Ironic

Wait for the node to go online:
[cloud-user@rhhi-node-worker-0 ~]$ oc get nodes
NAME                 STATUS   ROLES           AGE   VERSION
rhhi-node-master-0   Ready    master,worker   84m   v1.14.0+739670a83
rhhi-node-master-1   Ready    master,worker   84m   v1.14.0+739670a83
rhhi-node-master-2   Ready    master,worker   84m   v1.14.0+739670a83
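
The two "wait" steps above can be checked from a script instead of by eye. The following is a minimal sketch, not part of the original reproduction: `node_status` is a hypothetical helper that reads the tabular output of `oc get nodes` on stdin and prints the STATUS column for one node. It assumes the default column layout shown above (NAME first, STATUS second).

```shell
# Hypothetical helper: print the STATUS column for a single node.
# Reads the default `oc get nodes` table from stdin; the first line
# (the header row) is skipped.
node_status() {
    awk -v node="$1" 'NR > 1 && $1 == node { print $2 }'
}

# Possible use in a wait loop (requires a working `oc` login):
#   until [ "$(oc get nodes | node_status rhhi-node-master-1)" = "Ready" ]; do
#       sleep 10
#   done
```

The same helper covers both waits: compare against `NotReady` after powering the node off and against `Ready` after powering it back on.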

Check baremetal nodes list:

[cloud-user@rhhi-node-worker-0 ~]$ export OS_URL=http://172.22.0.3:6385
[cloud-user@rhhi-node-worker-0 ~]$ export OS_TOKEN=fake-token
[cloud-user@rhhi-node-worker-0 ~]$ openstack baremetal node list
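
To make the before/after comparison explicit, the node names can be saved to a file before the reboot and compared afterwards. This is a sketch under assumptions, not something from the original report: `missing_nodes` is a hypothetical helper, and the snapshot files are assumed to contain one node name per line, for example as written by `openstack baremetal node list -f value -c Name`.

```shell
# Hypothetical helper: print node names that appear in the first
# snapshot file but not in the second. Each file holds one node name
# per line.
missing_nodes() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    # comm requires sorted input, so sort copies of both snapshots.
    sort -u "$1" > "$before_sorted"
    sort -u "$2" > "$after_sorted"
    # -23 suppresses lines unique to the second file and lines common
    # to both, leaving only names unique to the first file.
    comm -23 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Possible use (requires OS_URL and OS_TOKEN to be exported as above):
#   openstack baremetal node list -f value -c Name > before.txt
#   ... reboot the master node and wait for it to become Ready ...
#   openstack baremetal node list -f value -c Name > after.txt
#   missing_nodes before.txt after.txt
```

Any name printed by `missing_nodes` was registered before the reboot and is absent afterwards.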

Version-Release number of selected component (if applicable):

How reproducible:
100%

Actual results:
Ironic nodes that were registered during the initial deployment are lost after rebooting the master node where the bmo pod is running.

Expected results:
Ironic node details persist across reboots.

Additional info:

Comment 1 Doug Hellmann 2019-08-23 22:54:59 UTC
https://github.com/metal3-io/baremetal-operator/pull/278 should resolve this

Comment 2 Nelly Credi 2019-08-25 13:01:37 UTC
If you feel that this bug is fixed, please set the 'Fixed In Version' field and let QE verify it.
Since I see you have a PR, I'm setting it to POST.

