Description of problem:
After the cluster is deployed, the master Machines are not associated with Nodes: the master Nodes don't have the machine.openshift.io/machine annotation, and the Machines don't have status.addresses and status.nodeRef populated. Executing ./12_csr_hack.sh explicitly fixes the problem.

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2020-02-03-115336-ipv6.1

How reproducible:
Reproduced with the setup below (per the template: dev-scripts version (git commit SHA), local variable overrides or other customizations, and whether the deployment is on VMs or Baremetal).

Steps to Reproduce:
1. Use a recent dev-scripts master (34d334f) with the default config; only NUM_WORKERS and PULL_SECRET are set.
2. Deploy the cluster using the default 'make' command.
3. Check one of the master Machine resources: status.addresses and status.nodeRef are not populated.

Actual results:
The masters and nodes are not properly displaying information:
- the Nodes are not displaying the number of pods;
- the BMH does not display any information or graphs.

Expected results:
The master Machines and Nodes should reference each other correctly.

Additional info:
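One way to confirm the symptom from the CLI (assuming the default openshift-machine-api namespace; the Machine and Node names below are placeholders):

  $ oc get machines -n openshift-machine-api
  $ oc get machine <master-machine> -n openshift-machine-api \
      -o jsonpath='{.status.addresses}{"\n"}{.status.nodeRef}{"\n"}'
  $ oc get node <master-node> \
      -o jsonpath='{.metadata.annotations.machine\.openshift\.io/machine}{"\n"}'

On an affected cluster, the Machine fields and the Node annotation all come back empty for the masters.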
Created attachment 1662150 [details] BMH showing no data
Related Github issue: https://github.com/openshift-metal3/dev-scripts/issues/917
The same issue is present on IPv6 environments deployed with the manual IPI installation process (no dev-scripts involved).
Moving this to 4.5. To get this change in 4.4 at this point, you'll need to fix it in 4.5, and clone this bug to 4.4.
Hi, per comment #4, if this needs to be fixed in 4.4, please clone a bug for 4.4.
This is expected behavior right now with bare metal IPI. It can be worked around with an external script that writes Addresses to the master Machines. dev-scripts used to have a script to do this. AFAIK, the functional impact of this issue is that it breaks some of the bare metal host management in the UI.
Created attachment 1666271 [details] add-machine-ips.sh
Created attachment 1666272 [details] link-machine-and-node.sh
Created attachment 1666273 [details] utils.sh
Workaround:
1. Download the add-machine-ips.sh, link-machine-and-node.sh, and utils.sh scripts into the same directory.
2. export KUBECONFIG=clusterconfigs/auth/kubeconfig
3. export CLUSTER_NAME=ocp-edge-cluster
4. bash add-machine-ips.sh
Also jq needs to be installed on the machine where the workaround steps are run.
Also, link-machine-and-node.sh needs to be made executable: chmod +x link-machine-and-node.sh
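For reference, a minimal sketch of the kind of status patch these scripts apply — the attachments above are the authoritative workaround; the Node/Machine names below are hypothetical, and oc patch --subresource=status requires a newer oc than was current when this bug was filed:

  # Copy a master Node's InternalIP into the matching Machine's status.
  NODE=master-0.ocp-edge-cluster.example.com      # hypothetical Node name
  MACHINE=ocp-edge-cluster-master-0               # hypothetical Machine name
  IP=$(oc get node "$NODE" \
    -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
  oc patch machine "$MACHINE" -n openshift-machine-api \
    --subresource=status --type=merge \
    -p "{\"status\":{\"addresses\":[{\"type\":\"InternalIP\",\"address\":\"$IP\"}]}}"
  # link-machine-and-node.sh additionally fills in status.nodeRef so the
  # Machine points back at its Node.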
*** Bug 1809664 has been marked as a duplicate of this bug. ***
Hi Ian, can you look into this bug and confirm if it is fixed by your work on collecting introspection data or if there is a duplicate bug? Thanks, Beth
This bug severely affects the UX of the OpenShift console's bare metal host management operations on masters, such as power off and maintenance. The Bare Metal Host includes node status when determining its state, and since the relationship is not represented on master hosts, the status calculation ends up not resembling reality. This introduces several UI bugs (adding these as blocked by this bug).
Yes, working on getting data into the masters from the installers. This isn't IPV6 specific though I believe - correct me if I'm wrong.
(In reply to Ian Main from comment #16)
> Yes, working on getting data into the masters from the installers. This
> isn't IPV6 specific though I believe - correct me if I'm wrong.

To be clear, I'm looking at doing introspection during terraform operations and loading that data into the BMH CRs.
> Yes, working on getting data into the masters from the installers. This
> isn't IPV6 specific though I believe - correct me if I'm wrong.

Yup, that's correct; I'll fix the title. It applies to all masters regardless of IPv4 or IPv6. The node/machine/baremetalhost association is made based on the node's InternalIP, and we gather that through hardware inspection. We don't run inspection on masters, but Ian is working on running it once at install time and then bringing that information over to the BareMetalHost object via annotations. Once that's populated, all the relevant associations will be made.
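Once that's in, the association should be spot-checkable along these lines (the annotation name below is my assumption about the metal3 status annotation; the installer PR is the authoritative reference, and the resource names are placeholders):

  $ oc get bmh -n openshift-machine-api
  $ oc get bmh <master-bmh> -n openshift-machine-api \
      -o jsonpath='{.metadata.annotations.baremetalhost\.metal3\.io/status}'
  $ oc get machine <master-machine> -n openshift-machine-api \
      -o jsonpath='{.status.nodeRef.name}{"\n"}'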
*** Bug 1810430 has been marked as a duplicate of this bug. ***
https://github.com/openshift/installer/pull/3591
*** Bug 1822054 has been marked as a duplicate of this bug. ***
I've just deployed a 4.4.4 cluster and the issue is still present. Since the bootstrap VM was deleted, I was not able to check what happened. Also, the discussion on the GitHub PR is from the same date as 4.4.4, so I wonder whether this fix got into the 4.4.4 build.

Looking for a 4.4.5 build after May 16th to recheck this.

On a 4.5.0 build from May 22nd this seems to be fixed, though the masters still appear as "Host is powered off".
(In reply to Constantin Vultur from comment #24)
> I've just deployed a 4.4.4 cluster and the issue is still present. Since
> the bootstrap VM was deleted, I was not able to check what happened. Also,
> the discussion on the GitHub PR is from the same date as 4.4.4, so I wonder
> whether this fix got into the 4.4.4 build.
>
> Looking for a 4.4.5 build after May 16th to recheck this.
>
> On a 4.5.0 build from May 22nd this seems to be fixed, though the masters
> still appear as "Host is powered off".

The target release for this fix is 4.5.0, so what you're seeing on 4.4.4 is expected, I think. Stephen/Ian can confirm, but I suspect this won't be backported to 4.4 unless there's a very strong justification, due to the complexity of the fix.
Deployed a cluster with 4.5.0-0.nightly-2020-05-26-063751 and now the information is being shown. Still, the data is not accurate: one status shows "Ready" while the other shows "Host is powered off". I filed https://bugzilla.redhat.com/show_bug.cgi?id=1840090 for this, since the wrong status might not be related to the fix implemented for this BZ.
Another side-effect of the fix: https://bugzilla.redhat.com/show_bug.cgi?id=1840105
> Stephen/Ian can confirm, but I suspect this won't be backported to 4.4
> unless there's a very strong justification, due to the complexity of the
> fix.

We ended up doing this in a simple way that could be backported; the 4.4 BZ is BZ1840106, and I've got a cherry-pick PR open.

> Still, the data is not accurate: one status shows "Ready" while the other
> shows "Host is powered off". I filed
> https://bugzilla.redhat.com/show_bug.cgi?id=1840090 for this, since the
> wrong status might not be related to the fix implemented for this BZ.

> Another side-effect of the fix:
> https://bugzilla.redhat.com/show_bug.cgi?id=1840105

Both these bugs sound nearly identical to me, and I do not think they are side effects or have anything to do with this one. We'll take a look, though.
*** Bug 1775494 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409