Bug 1801238
| Summary: | [Baremetal on IPI]: Master machines are not assigned to nodes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Constantin Vultur <cvultur> |
| Component: | Installer | Assignee: | Ian Main <imain> |
| Installer sub component: | OpenShift on Bare Metal IPI | QA Contact: | Constantin Vultur <cvultur> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | medium | CC: | augol, beth.white, dhellmann, imain, jhou, jtomasek, kni-bugs, mlammon, rbartal, rbryant, rhhi-next-mgmt-qe, sgordon, shardy, stbenjam, ukalifon, wsun, zhsun |
| Version: | 4.4 | Keywords: | TestBlocker, Triaged |
| Target Milestone: | --- | | |
| Target Release: | 4.5.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1801970 1809664 1840133 (view as bug list) | Environment: | |
| Last Closed: | 2020-07-13 17:14:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1771572, 1801970, 1809664, 1813800, 1813801, 1824241, 1825318, 1826505, 1840133 | | |
| Attachments: | | | |
Description (Constantin Vultur, 2020-02-10 13:45:00 UTC)

Created attachment 1662150 [details]
BMH showing no data
Related Github issue: https://github.com/openshift-metal3/dev-scripts/issues/917

The same issue is present on IPv6 environments deployed with the manual IPI installation process (no dev-scripts involved).

Moving this to 4.5. To get this change into 4.4 at this point, you'll need to fix it in 4.5 and clone this bug to 4.4.

Hi, per #comment4, if it needs to be fixed in 4.4, please clone one bug for 4.4.

This is expected behavior right now with bare metal IPI. It can be worked around with an external script that writes Addresses to the master Machines; dev-scripts used to have a script to do this. AFAIK, the functional impact of this issue is that it breaks some of the bare metal host management in the UI.

Created attachment 1666271 [details]
add-machine-ips.sh

Created attachment 1666272 [details]
link-machine-and-node.sh

Created attachment 1666273 [details]
utils.sh
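The actual workaround logic is only in the attached scripts. As a rough, hypothetical sketch of what a script that "writes Addresses to the master Machines" has to do, the snippet below uses jq on invented sample data to spot Machines whose `.status.addresses` is missing an InternalIP entry (the symptom in this bug) and to build the kind of status patch that would need to be applied. Every object name, IP, and field shape here is an assumption for illustration, not taken from add-machine-ips.sh.

```shell
# Hypothetical sketch (not the attached script). All names and IPs are
# invented sample data; on a live cluster the Machine list would come from
# `oc get machines -n openshift-machine-api -o json`.
machines='{"items":[
  {"metadata":{"name":"ostest-master-0"},"status":{}},
  {"metadata":{"name":"ostest-worker-0"},
   "status":{"addresses":[{"type":"InternalIP","address":"192.0.2.10"}]}}]}'

# List Machines that have no InternalIP in .status.addresses:
echo "$machines" | jq -r '.items[]
  | select((.status.addresses // []) | map(select(.type=="InternalIP")) | length == 0)
  | .metadata.name'

# Build a status patch writing an address onto such a Machine (the real IP
# would have to be discovered from the host, e.g. via introspection data):
jq -n '{status: {addresses: [{"type":"InternalIP","address":"192.0.2.5"}]}}'
```

The first jq filter prints only the master Machine above, since its `.status` carries no addresses at all.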
Workaround:

1. Download the add-machine-ips.sh, link-machine-and-node.sh, and utils.sh scripts into the same directory.
2. Install jq on the machine where the workaround is run.
3. Mark the helper script as executable: `chmod +x link-machine-and-node.sh`
4. Point the scripts at the cluster and run them:
   - `export KUBECONFIG=clusterconfigs/auth/kubeconfig`
   - `export CLUSTER_NAME=ocp-edge-cluster`
   - `bash add-machine-ips.sh`

*** Bug 1809664 has been marked as a duplicate of this bug. ***

Hi Ian, can you look into this bug and confirm whether it is fixed by your work on collecting introspection data, or whether there is a duplicate bug? Thanks, Beth

This bug severely affects the UX of the OpenShift console's bare metal host management operations on masters, such as power off and maintenance. A Bare Metal Host's node status is included when its state is determined, and since the relationship is not represented for master hosts, the calculated status ends up not reflecting reality. This introduces several UI bugs (adding these as blocked by this bug).

Yes, working on getting data into the masters from the installers. This isn't IPV6 specific though I believe - correct me if I'm wrong.

(In reply to Ian Main from comment #16)
> Yes, working on getting data into the masters from the installers. This
> isn't IPV6 specific though I believe - correct me if I'm wrong.

To be clear, I'm looking at doing introspection during terraform operations and loading that data into the BMH CRs.

> Yes, working on getting data into the masters from the installers. This isn't IPV6 specific though I believe - correct me if I'm wrong.
Yup, that's correct, I'll fix the title. It applies to all masters regardless of IPv4 or IPv6.
The node/machine/baremetalhost association is made based on node.InternalIP information, and we gather that through hardware inspection. We don't run inspection on masters, but Ian is working on running it once at install time and then bringing that information over to the BareMetalHost object via annotations. Once that's populated, all the relevant associations will be made.
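A minimal sketch of that matching rule, on invented sample objects: pull the InternalIP out of a Machine's `.status.addresses` and find the Node advertising the same InternalIP. The object shapes and all data below are assumptions for illustration; the real association logic lives in the cluster components, not in this snippet.

```shell
# Toy illustration of matching a Machine to a Node by InternalIP.
# Simplified, invented objects -- not real cluster output.
machine='{"metadata":{"name":"ostest-master-0"},
  "status":{"addresses":[{"type":"InternalIP","address":"192.0.2.5"}]}}'
nodes='{"items":[
  {"metadata":{"name":"master-0"},
   "status":{"addresses":[{"type":"InternalIP","address":"192.0.2.5"}]}},
  {"metadata":{"name":"worker-0"},
   "status":{"addresses":[{"type":"InternalIP","address":"192.0.2.10"}]}}]}'

# Take the Machine's InternalIP ...
ip=$(echo "$machine" | jq -r '.status.addresses[] | select(.type=="InternalIP") | .address')

# ... and find the Node with the same InternalIP. When the Machine has no
# addresses at all (this bug), $ip is empty and nothing matches.
echo "$nodes" | jq -r --arg ip "$ip" '.items[]
  | select(any(.status.addresses[]; .type=="InternalIP" and .address==$ip))
  | .metadata.name'
```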
*** Bug 1810430 has been marked as a duplicate of this bug. ***

*** Bug 1822054 has been marked as a duplicate of this bug. ***

I've just deployed a 4.4.4 cluster and the issue is still present. Given that the bootstrap VM was deleted, I was not able to check what happened. Also, the discussion on the Github PR dates from the same day as the 4.4.4 release, so I wonder whether this fix got into the 4.4.4 build. Looking for a 4.4.5 build after May 16th to recheck this. Also, on a 4.5.0 from May 22nd this seems to be fixed, though the masters still appear as "Host is powered off".

(In reply to Constantin Vultur from comment #24)
> I've just deployed a 4.4.4 cluster and the issue is still present. Given the
> fact the bootstrap VM was deleted I was not able to check what happened.
> Also the discussion from Github PR dates from the same date as the 4.4.4, so
> I wonder if this fix got into the 4.4.4 build.
>
> Looking for a 4.4.5 build after May 16th, to recheck this.
>
> Also on a 4.5.0 from May 22nd, this seems that is fixed. Still the Masters
> appear as "Host is powered off".

The target release for this fix is 4.5.0, so what you're seeing is expected, I think. Stephen/Ian can confirm, but I suspect this won't be backported to 4.4 unless there's a very strong justification, due to the complexity of the fix.

Deployed a cluster with 4.5.0-0.nightly-2020-05-26-063751 and now the information is being shown. Still, the data is not accurate: status1 is Ready and status2 is "Host is powered off". I filed https://bugzilla.redhat.com/show_bug.cgi?id=1840090 for this issue, since the wrong status might not be related to the fix implemented for this BZ.

Another side-effect of the fix: https://bugzilla.redhat.com/show_bug.cgi?id=1840105

> Stephen/Ian can confirm but I suspect this won't be backported to 4.4 unless there's a very strong justification, due to the complexity of the fix.
We ended up doing this in a simple way that could be backported; the 4.4 BZ is BZ1840106, and I've got a cherry-pick PR open.

> Still the data is not accurate, since the status1 is Ready and status2 : Host is powered off. I filled https://bugzilla.redhat.com/show_bug.cgi?id=1840090 for this issue, since the wrong status might not be related to the fix implemented for this BZ.

> Another side-effect of the fix: https://bugzilla.redhat.com/show_bug.cgi?id=1840105

Both of these bugs sound nearly identical to me, and I do not think they are side effects of this one or have anything to do with it. We'll take a look though.

*** Bug 1775494 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409