When an Ironic driver doesn't support soft power off, we are supposed to fall back to a hard power off. This was inadvertently broken in OpenShift 4.9: when the driver doesn't support soft power off, we now return a 'transient' error and retry in an infinite loop. The Fujitsu driver is known not to support soft power off when its agent is not available on the host.
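The intended fallback described above can be illustrated with a minimal sketch. This is not the baremetal-operator code (which is written in Go); `FakeDriver`, `SoftPowerOffUnsupported` and `power_off` are hypothetical names made up for this example, and the target-state strings are used here only as illustrative labels. It only shows the expected behaviour: try soft power off first, and if the driver reports that it is unsupported, go straight to a hard power off instead of treating the failure as transient and retrying.

```python
class SoftPowerOffUnsupported(Exception):
    """Hypothetical error: the driver cannot perform a soft power off."""


class FakeDriver:
    """Hypothetical stand-in for a power interface; records requested states."""

    def __init__(self, supports_soft):
        self.supports_soft = supports_soft
        self.calls = []

    def set_power_state(self, target):
        self.calls.append(target)
        if target == "soft power off" and not self.supports_soft:
            raise SoftPowerOffUnsupported(target)


def power_off(driver):
    """Try a soft power off; fall back to a hard power off if unsupported.

    Returns "soft" or "hard" to indicate which path was taken.
    """
    try:
        driver.set_power_state("soft power off")
        return "soft"
    except SoftPowerOffUnsupported:
        # Not a transient condition: retrying the soft request would loop
        # forever, so issue the hard power off immediately.
        driver.set_power_state("power off")
        return "hard"
```

With a driver that lacks soft power off support, `power_off(FakeDriver(supports_soft=False))` takes the hard path after a single soft attempt, rather than retrying the soft request.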
Is this specific to Fujitsu only?
(In reply to Lubov from comment #2)
> is it Fujitsu only specific?

Possibly not, but that is the only example I know of for sure (Fujitsu requires an agent running on the host to do soft power off, so if it is not present it fails in this way).
Assigning to @fj-lsoft-ofuku.fujitsu.com to clear our backlog
Hi Zane, Lubov,

For Fujitsu servers (iRMC driver), there are two power interfaces:
- ipmitool: soft power off is supported
- irmc: soft power off is supported only if the ServerView agent is installed on the system. (https://docs.openstack.org/ironic/latest/admin/drivers/irmc.html#supported-platforms)

For OpenShift, ipmitool is hard-coded in Metal3 and OpenShift:
- https://github.com/metal3-io/baremetal-operator/blob/master/pkg/bmc/irmc.go#L86
- https://github.com/openshift/baremetal-operator/blob/master/pkg/bmc/irmc.go#L86

So soft power off is always supported in OpenShift. In conclusion, I don't think Fujitsu servers are affected by this problem. However, I will verify this modification (https://github.com/metal3-io/baremetal-operator/pull/985) against a Fujitsu server just in case.

Best Regards,
Yasuhiro Futakawa
Hi Zane, Lubov,

Fujitsu verified the latest nightly build, which includes the following patch, and confirmed that soft power off worked correctly.
https://github.com/openshift/baremetal-operator/pull/180

Best Regards,
Yasuhiro Futakawa
(In reply to Fujitsu container team from comment #7)
> Hi Zane, Lubov,
>
> Fujitsu verified the latest nightly build which includes the following
> patches, and confirmed soft power off worked correctly.
> https://github.com/openshift/baremetal-operator/pull/180
>
> Best Regards,
> Yasuhiro Futakawa

Many thanks, closing as verified!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056