Bug 1830350
| Summary: | ironic mac ports not deleted on node (worker) deletion | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Dave Wilson <dwilson> |
| Component: | Bare Metal Hardware Provisioning | Assignee: | Steven Hardy <shardy> |
| Bare Metal Hardware Provisioning sub component: | baremetal-operator | QA Contact: | Amit Ugol <augol> |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | unspecified | | |
| Priority: | unspecified | CC: | derekh, hpokorny |
| Version: | 4.4 | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-26 16:29:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Dave Wilson
2020-05-01 17:35:25 UTC
Please provide any logs you have. From the Ironic side, we need the logs from the ironic-inspector and dnsmasq containers. I'm moving this bug to the BMO component, since I suspect the first worker was not fully deleted by the time the new one was created; Ironic removes ports for removed nodes. We also need the exact steps to reproduce this: "Delete worker" isn't specific enough.
The logs from the baremetal-operator container would also be helpful. One possible reason is that the node deletion had not yet completed when you created the new BMH, and the logs should confirm that. You can also interact directly with the Ironic API on the master running the ironic-api container, e.g.:
$ oc get pods -n openshift-machine-api | grep metal3
metal3-796ddb8446-kbn7r 8/8 Running 0 3d
$ oc describe pod metal3-796ddb8446-kbn7r -n openshift-machine-api | grep IRONIC_ENDPOINT
IRONIC_ENDPOINT: http://[fd00:1101::3]:6385/v1/
$ curl http://[fd00:1101::3]:6385/v1/ports | jq .  # here you can grep for the conflicting MAC, and also check v1/nodes for an existing node
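The manual check above (look for the conflicting MAC in the port list, then see whether its node still exists) could also be scripted. The following is only a rough sketch, written in Python rather than shell so the filtering logic can be exercised offline. It assumes that the endpoint is reachable without authentication, that `ports/detail` returns a `ports` list whose entries carry `address` and `node_uuid`, and that `nodes` returns a `nodes` list whose entries carry `uuid`. The helper names (`ports_with_mac`, `orphaned_ports`, `fetch`) are hypothetical and not part of any Ironic tooling.

```python
import json
import urllib.request


def normalize_mac(mac):
    """Lower-case a MAC and use ':' separators so comparisons are consistent."""
    return mac.strip().lower().replace("-", ":")


def ports_with_mac(ports, mac):
    """Return the port records whose 'address' matches the given MAC."""
    wanted = normalize_mac(mac)
    return [p for p in ports if normalize_mac(p.get("address", "")) == wanted]


def orphaned_ports(ports, nodes):
    """Return ports whose 'node_uuid' does not match any listed node."""
    known = {n["uuid"] for n in nodes}
    return [p for p in ports if p.get("node_uuid") not in known]


def fetch(endpoint, resource):
    """GET <endpoint>/<resource> and return the list under its top-level key.

    Assumption: the response body is a JSON object keyed by the first path
    segment of the resource, e.g. 'ports/detail' -> {'ports': [...]}.
    """
    url = endpoint.rstrip("/") + "/" + resource
    with urllib.request.urlopen(url) as resp:
        body = json.load(resp)
    return body[resource.split("/")[0]]


if __name__ == "__main__":
    # Endpoint and MAC are the values quoted in this bug; adjust as needed.
    endpoint = "http://[fd00:1101::3]:6385/v1/"
    ports = fetch(endpoint, "ports/detail")
    nodes = fetch(endpoint, "nodes")
    matches = ports_with_mac(ports, "40:a6:b7:00:47:e0")
    for port in matches:
        print(port.get("uuid"), port.get("address"), port.get("node_uuid"))
    for port in orphaned_ports(matches, nodes):
        print("port without a matching node:", port.get("uuid"))
```

If a matching port is printed and its node still exists, that would be consistent with the old node not having been fully deleted before the new BMH was created.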
(In reply to Dave Wilson from comment #0)
> Description of problem: deletion of worker and then redeploy of that worker
> appears to cause a port conflict with " A port with MAC address
> 40:a6:b7:00:47:e0 already exists. (HTTP 409)."
Note: this message ("A port with....") is expected when a node powers on and boots the inspection image while inspector is not inspecting it. Can you check whether inspection succeeded? If so, the node may just be waiting to be deployed. Details on how the node was deleted and added, along with the logs requested above, should help in figuring this out.
I've not been able to reproduce or verify this issue. At the time it was reported, some switch configuration settings were erroneous, causing intermittent link issues on the prov/bm NICs. This resulted in having to delete and add the worker multiple times. Since then, the issues with the switch port configs have been rectified, and workers deploy consistently.