Description of problem:
Restarting a node does not move the console pod(s) off the node before the restart (PowerOff, Start) is carried out. As a result, the UI console kicks the user out (see attachment).

Version-Release number of selected component (if applicable):
Cluster version is 4.6.0-fc.8

How reproducible:
100%

Steps to Reproduce:
1. Install OCP 4.6
2. Compute-BMH->master-0-2<3 button kebab>-Restart
3. Popover invoked - Restart Bare Metal Host
   The message: "The bare metal host <node> will be restarted gracefully after all managed workloads are moved."

Actual results:
We see "Restart pending".
Console pods are one example of pods which do not get moved prior to the node doing its PowerOff & Start.
This is not graceful, and I wonder how many other workloads are not being moved.

Expected results:
Expect the console pod to move prior to the restart.

Additional info:
master-0-2 is restarted in this example.

Before restart
# oc get pods -o wide -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE   READINESS GATES
console-676f59b7fb-67ggv     1/1     Running   0          3h46m   10.128.0.16   master-0-2   <none>           <none>
console-676f59b7fb-rqm57     1/1     Running   0          142m    10.129.0.34   master-0-1   <none>           <none>
downloads-6ddcb844f4-f7r9x   1/1     Running   0          3h46m   10.128.0.20   master-0-2   <none>           <none>
downloads-6ddcb844f4-whkdg   1/1     Running   0          4h      10.128.2.5    worker-0-0   <none>           <none>

At time of Poweroff/Restart
# oc get pods -o wide -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE   READINESS GATES
console-676f59b7fb-67ggv     1/1     Running   0          3h48m   10.128.0.16   master-0-2   <none>           <none>
console-676f59b7fb-rqm57     1/1     Running   0          144m    10.129.0.34   master-0-1   <none>           <none>
downloads-6ddcb844f4-f7r9x   1/1     Running   0          3h48m   10.128.0.20   master-0-2   <none>           <none>
downloads-6ddcb844f4-whkdg   1/1     Running   0          4h2m    10.128.2.5    worker-0-0   <none>           <none>
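To make the before/after comparison above easier to repeat, here is a minimal sketch that lists the pods still scheduled on a given node from captured `oc get pods -o wide` output. The helper name `pods_on_node` is hypothetical, and the parser assumes the nine-column layout shown in this report, with NODE as the seventh whitespace-separated field and a plain integer in RESTARTS.

```python
from typing import List

# Captured output from this report (at time of Poweroff/Restart).
SAMPLE = """\
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
console-676f59b7fb-67ggv 1/1 Running 0 3h48m 10.128.0.16 master-0-2 <none> <none>
console-676f59b7fb-rqm57 1/1 Running 0 144m 10.129.0.34 master-0-1 <none> <none>
downloads-6ddcb844f4-f7r9x 1/1 Running 0 3h48m 10.128.0.20 master-0-2 <none> <none>
downloads-6ddcb844f4-whkdg 1/1 Running 0 4h2m 10.128.2.5 worker-0-0 <none> <none>
"""


def pods_on_node(oc_output: str, node: str) -> List[str]:
    """Return the names of pods whose NODE column equals `node`.

    Assumption: each data row splits on whitespace into
    NAME, READY, STATUS, RESTARTS, AGE, IP, NODE, ... so that
    NODE is field index 6. Rows with fewer fields are skipped.
    """
    pods = []
    for line in oc_output.strip().splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "NAME":
            continue
        if len(fields) > 6 and fields[6] == node:
            pods.append(fields[0])
    return pods


if __name__ == "__main__":
    # Pods that were still on master-0-2 when it was powered off.
    print(pods_on_node(SAMPLE, "master-0-2"))
```

If the list for the restarted node is the same before and at the time of the power-off, as it is here, nothing was moved.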
Created attachment 1717592 [details] login screen
> 3. Popover invoked - Restart Bare Metal Host
> "The bare metal host <node> will be restarted gracefully after all managed workloads are moved."

After some discussion, it seems this message is misleading. A reboot of a BMH is *not* a graceful restart; it is a forced power off/on via the BMC, triggered via the 'reboot.metal3.io' annotation on the host.

That was implemented via https://github.com/metal3-io/baremetal-operator/pull/424 and it doesn't care about workloads at all.

We probably need to update the UI to make it clearer that this is a potentially disruptive action that should be approached with caution.
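For reference, a minimal sketch of what triggering that annotation by hand might look like. It only builds a JSON merge patch and the matching `oc patch` argument list; it does not run anything. The helper names, the empty annotation value, and the `openshift-machine-api` namespace are assumptions for illustration, not something confirmed in this bug.

```python
import json
from typing import List

# Annotation named in this comment; its presence requests the power cycle.
REBOOT_ANNOTATION = "reboot.metal3.io"


def reboot_patch() -> str:
    """Build a JSON merge patch that sets the reboot annotation.

    Assumption: an empty value is sufficient to request the reboot.
    """
    return json.dumps({"metadata": {"annotations": {REBOOT_ANNOTATION: ""}}})


def oc_patch_args(host: str, namespace: str = "openshift-machine-api") -> List[str]:
    """Return an `oc patch` argument list for the given BareMetalHost.

    Assumption: the BareMetalHost objects live in `namespace`.
    """
    return [
        "oc", "patch", "baremetalhost", host,
        "-n", namespace,
        "--type", "merge",
        "-p", reboot_patch(),
    ]


if __name__ == "__main__":
    # Print the command instead of executing it: the action is disruptive.
    print(" ".join(oc_patch_args("master-0-2")))
```

Note that nothing in this path cordons or drains the node first, which is consistent with the pods staying on master-0-2 in the original report.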
*** Bug 1883622 has been marked as a duplicate of this bug. ***
I have retested this today in a virtual environment (4.6.0-rc.0) as well as on real bare metal (4.6.0-fc.8), both of which are recent stable images. I am no longer seeing the issue and will close it at this time. I will re-open it if it is seen again.
I think this was closed in error. It should still be resolved as mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1883614#c2

Please advise whether we should still fix this in 4.7 with a new message.
PR: https://github.com/openshift/console/pull/7765
Now the message for the 'Restart' action is:

The host will be powered off and on again. Applications may be temporarily disrupted. Workloads currently running on this host will not be moved before restarting. This may cause service disruptions.

I think the message is now more readable and reasonable. Moving to VERIFIED; let me know if this is wrong.

This was checked against 4.7.0-0.nightly-2021-01-27-002938.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633