Bug 1844238

Summary: [4.5] [UI] BMH status "Restart pending, Powering on" is not correct and should be "Host is powered off" with phased reboot
Product: OpenShift Container Platform
Reporter: mlammon
Component: Console Metal3 Plugin
Assignee: Jiri Tomasek <jtomasek>
Status: CLOSED ERRATA
QA Contact: Yanping Zhang <yanpzhan>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.5
CC: abeekhof, achernet, aos-bugs, gharden, jtomasek, rawagner, scuppett, tjelinek, yapei
Target Milestone: ---
Keywords: Triaged
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2020-10-27 16:05:23 UTC
Type: Bug
Attachments:
Bare Metal Hosts shows powered off when phased reboot (flags: none)

Description mlammon 2020-06-04 20:20:42 UTC
Description of problem:
BMH status "Restart pending, Powering on" is not correct and should be "Host is powered off" with phased reboot


How reproducible:
100%

Steps to Reproduce:
1. Deploy cluster OCP 4.5
2. Log in to the UI console and select Compute -> Bare Metal Hosts
3. Click on worker node <openshift-worker-0-0>
4. Under Actions, click Edit Annotations
5. Click "Add more"; this lets you add a KEY and a VALUE
6. Add "reboot.metal3.io/anydata" in KEY only.  Note: "anydata" can be any string

7. This invokes the BMO "Phased Reboot". A phased reboot brings the host down,
and the host remains down until the annotation is deleted from it.
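
For reference, the same annotation could also be added or removed without the UI by sending a JSON Patch to the BareMetalHost resource. The following is a minimal TypeScript sketch, not taken from the console code: the helper names are hypothetical, and it only builds the patch body without making any API call. The detail worth noting is that the annotation key contains a "/", which JSON Pointer (RFC 6901) requires to be escaped as "~1".

```typescript
// Hypothetical helpers: they only build a JSON Patch body, no request is sent.
type AnnotationPatchOp =
  | { op: 'add'; path: string; value: string }
  | { op: 'remove'; path: string };

// Escape a key for use as a JSON Pointer segment (RFC 6901):
// '~' becomes '~0' first, then '/' becomes '~1'.
const escapeJsonPointer = (segment: string): string =>
  segment.replace(/~/g, '~0').replace(/\//g, '~1');

// Build the JSON Pointer path of a "reboot.metal3.io/<suffix>" annotation.
const rebootAnnotationPath = (suffix: string): string =>
  '/metadata/annotations/' + escapeJsonPointer('reboot.metal3.io/' + suffix);

// Patch that adds the annotation with an empty value (KEY only, as in step 6).
const addRebootAnnotation = (suffix: string): AnnotationPatchOp[] => [
  { op: 'add', path: rebootAnnotationPath(suffix), value: '' },
];

// Patch that deletes the annotation again, which ends the phased reboot.
const removeRebootAnnotation = (suffix: string): AnnotationPatchOp[] => [
  { op: 'remove', path: rebootAnnotationPath(suffix) },
];

console.log(JSON.stringify(addRebootAnnotation('anydata')));
```

Note that a JSON Patch "add" operation requires the parent object (here /metadata/annotations) to exist already, so a host with no annotations at all would need that object created first.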

In this use case the status under Compute -> Bare Metal Hosts is not correct; I would expect it to be "Host is powered off". If you look at Compute -> Nodes -> worker-0-0, it shows NotReady with "Host Powered Off".


Additionally, if you select the three-dot menu on the far right, Power On is offered as an option,
which arguably should not be available, since the power state is controlled by the annotation.

Actual results:
In this use case the status under Compute -> Bare Metal Hosts is not correct and shows "Restart pending, Powering on".

Expected results:
"Host is powered off"

Additional info:

[root@sealusa6 ~]# oc get bmh openshift-worker-0-0 -n openshift-machine-api -oyaml|grep poweredOn
      Xeon(R) CPU E5-2630 v4 @ 2.20GHz","clockMegahertz":2199.996,"flags":["3dnowprefetch","abm","adx","aes","apic","arat","arch_capabilities","arch_perfmon","avx","avx2","bmi1","bmi2","clflush","cmov","constant_tsc","cpuid","cpuid_fault","cx16","cx8","de","ept","erms","f16c","flexpriority","fma","fpu","fsgsbase","fxsr","hle","hypervisor","invpcid","invpcid_single","lahf_lm","lm","mca","mce","mmx","movbe","msr","mtrr","nopl","nx","pae","pat","pcid","pclmulqdq","pdpe1gb","pge","pni","popcnt","pse","pse36","pti","rdrand","rdseed","rdtscp","rep_good","rtm","sep","smap","smep","ss","sse","sse2","sse4_1","sse4_2","ssse3","syscall","tpr_shadow","tsc","tsc_adjust","tsc_deadline_timer","tsc_known_freq","umip","vme","vmx","vnmi","vpid","x2apic","xsave","xsaveopt","xtopology"],"count":8},"hostname":"worker-0-0"},"provisioning":{"state":"provisioned","ID":"67d16dfc-9417-4284-ae5c-b4f2c12b0d74","image":{"url":"http://172.22.0.3:6180/images/rhcos-45.81.202005200134-0-openstack.x86_64.qcow2/rhcos-45.81.202005200134-0-compressed.x86_64.qcow2","checksum":"http://172.22.0.3:6180/images/rhcos-45.81.202005200134-0-openstack.x86_64.qcow2/rhcos-45.81.202005200134-0-compressed.x86_64.qcow2.md5sum"}},"goodCredentials":{"credentials":{"name":"openshift-worker-0-0-bmc-secret","namespace":"openshift-machine-api"},"credentialsVersion":"24928"},"triedCredentials":{"credentials":{"name":"openshift-worker-0-0-bmc-secret","namespace":"openshift-machine-api"},"credentialsVersion":"24928"},"errorMessage":"","poweredOn":false,"operationHistory":{"register":{"start":"2020-06-02T16:28:18Z","end":"2020-06-02T16:28:43Z"},"inspect":{"start":"2020-06-02T16:28:43Z","end":"2020-06-02T16:30:25Z"},"provision":{"start":"2020-06-02T16:30:54Z","end":"2020-06-02T16:33:37Z"},"deprovision":{"start":null,"end":null}}}'
        f:poweredOn: {}
  poweredOn: false

Version:

4.5.0-0.nightly-2020-06-01-111748

Comment 1 Stephen Cuppett 2020-06-10 15:53:57 UTC
This is not a blocker for 4.5.0 GA. Setting target release to current development branch (4.6.0). For fixes (if any) requested/required on earlier versions, clones will be created for 4.5.z or earlier as appropriate.

Comment 3 Tomas Jelinek 2020-07-08 05:51:15 UTC
This has already been triaged, so just marking it as such.

Comment 4 Jiri Tomasek 2020-07-10 09:57:56 UTC
The reason for this behaviour is that the UI looks at spec.online on the BMH resource to determine whether the host is intended to be powered on. In this case spec.online is true, which in combination with status.poweredOn = false results in the 'Powering On' power status in the UI. The UI does not know about the phased reboot annotation and does not take it into account when determining the power status. This needs to be enhanced.
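
To illustrate the logic described in this comment, here is a minimal TypeScript sketch of how a power status could be derived so that the phased reboot annotation is taken into account. It is an illustration only, not the actual console change: the type and function names are hypothetical, the 'Powered on' and 'Powering off' labels are assumed, and only the case from this bug (host already powered off) is handled specially.

```typescript
// Minimal shape of a BareMetalHost; only the fields discussed in this bug.
interface BareMetalHostLike {
  metadata?: { annotations?: { [key: string]: string } };
  spec?: { online?: boolean };
  status?: { poweredOn?: boolean };
}

// 'Powering on' and 'Powered off' appear in this report; the other labels are assumed.
type PowerStatus = 'Powered on' | 'Powering on' | 'Powering off' | 'Powered off';

const PHASED_REBOOT_PREFIX = 'reboot.metal3.io/';

// True if any annotation key starts with the phased reboot prefix.
const hasPhasedRebootAnnotation = (host: BareMetalHostLike): boolean =>
  Object.keys(host.metadata?.annotations ?? {}).some(
    (key) => key.indexOf(PHASED_REBOOT_PREFIX) === 0,
  );

const getPowerStatus = (host: BareMetalHostLike): PowerStatus => {
  const poweredOn = host.status?.poweredOn ?? false;
  const online = host.spec?.online ?? false;
  if (poweredOn) {
    return online ? 'Powered on' : 'Powering off';
  }
  // The host is off. It only counts as "Powering on" when spec.online is true
  // and no phased reboot annotation is holding it down.
  return online && !hasPhasedRebootAnnotation(host) ? 'Powering on' : 'Powered off';
};

// The situation from this bug: spec.online = true, status.poweredOn = false,
// and a "reboot.metal3.io/anydata" annotation present.
const heldDown: BareMetalHostLike = {
  metadata: { annotations: { 'reboot.metal3.io/anydata': '' } },
  spec: { online: true },
  status: { poweredOn: false },
};
console.log(getPowerStatus(heldDown));
```

Without the annotation check, the same host would fall through to 'Powering on', which is the misleading status reported here.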

Comment 5 Andrew Beekhof 2020-07-22 12:36:16 UTC
Me again, any chance this will land in 4.6?

Comment 6 Rastislav Wagner 2020-09-17 08:13:59 UTC
@Andrew yes, PR is up

Comment 8 Yadan Pei 2020-09-29 02:07:31 UTC
1. Compute -> Bare Metal Hosts -> openshift-worker-0-1 -> Actions -> Edit Annotations -> Add 'reboot.metal3.io/anydata' as KEY only (no need to set VALUE) -> Save
2. Check BMH status via CLI
# oc get bmh openshift-worker-0-1 -n openshift-machine-api -o yaml
......
  poweredOn: false
  provisioning:
    ID: 5a2f3518-128e-4251-ae7c-a5b0134ea982
    bootMode: UEFI
    image:
      checksum: http://[fd00:1101::3]:6180/images/rhcos-46.82.202009222340-0-openstack.x86_64.qcow2/rhcos-46.82.202009222340-0-compressed.x86_64.qcow2.md5sum
      url: http://[fd00:1101::3]:6180/images/rhcos-46.82.202009222340-0-openstack.x86_64.qcow2/rhcos-46.82.202009222340-0-compressed.x86_64.qcow2
    rootDeviceHints:
      deviceName: /dev/sda
    state: provisioned
......
3. Check BMH status on the Compute -> Bare Metal Hosts list page: it shows 'Provisioned, Powered off', which is correct

Verified on 4.6.0-0.nightly-2020-09-27-075304

Comment 9 Yadan Pei 2020-09-29 02:10:28 UTC
Created attachment 1717388
Bare Metal Hosts shows powered off when phased reboot

$ oc get bmh openshift-worker-0-1 -n openshift-machine-api -o yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  annotations:
    reboot.metal3.io/anydata: ""
  creationTimestamp: "2020-09-28T07:15:59Z"
  finalizers:
  - baremetalhost.metal3.io
  - machine.machine.openshift.io
......
  poweredOn: false
  provisioning:
    ID: 5a2f3518-128e-4251-ae7c-a5b0134ea982
    bootMode: UEFI
    image:
      checksum: http://[fd00:1101::3]:6180/images/rhcos-46.82.202009222340-0-openstack.x86_64.qcow2/rhcos-46.82.202009222340-0-compressed.x86_64.qcow2.md5sum
      url: http://[fd00:1101::3]:6180/images/rhcos-46.82.202009222340-0-openstack.x86_64.qcow2/rhcos-46.82.202009222340-0-compressed.x86_64.qcow2
    rootDeviceHints:
      deviceName: /dev/sda
    state: provisioned

Comment 12 errata-xmlrpc 2020-10-27 16:05:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196