Description of problem
======================
Given I have a non-HE RHV environment
When I SSH into one of the hosts in the cluster (not SPM)
And I execute `# reboot -f`
Then RHV reports the host status 'up' while the host is going through a reboot

Version-Release number of selected component (if applicable)
============================================================
4.4.4.7-0.1.el8ev

How reproducible
================
100% on non-HE deployments.
* This can be worked around by restarting the ovirt-engine service.
* It seems that this reproduces on deployments that have been alive for some period of time; this wasn't empirically determined.

Steps to Reproduce
==================
1. SSH to the root user of one of the hosts in the cluster.
2. Execute `# reboot -f` on the host.

Actual results
==============
The host status remains 'up' until the host finishes rebooting, at which point the host transitions to the 'connecting' state for less than 2 seconds and then to 'up'.

Expected results
================
The host transitions to the 'connecting' state within 3 seconds of the reboot, and then to 'non-responsive'. Only when the host finishes rebooting is it reported as 'connecting' and then 'up'.

Additional info
===============
1. As written above, a possible workaround for this situation is restarting the ovirt-engine service.
2. This is possibly a 'family member' of bug 1846338, but this cannot be determined at the moment without a deeper investigation.
3. I wasn't able to measure the time it takes for my environment to become "faulty" and stop reporting the correct status for the rebooted host. Ideally, however, the environment used to reproduce this bug should have been live for more than a day or two.
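
The difference between the actual and expected results can be expressed as a check on the sequence of statuses observed for the host. The following is only a minimal sketch: the function names and the way the statuses are collected are assumptions made for illustration and are not part of RHV or ovirt-engine. It assumes the statuses were sampled in order (by any means) as plain strings, using the status names from this report.

```python
from typing import Iterable, List

# Expected order of host statuses after `reboot -f`, as described under
# "Expected results" in this report.
EXPECTED_TRANSITIONS = ["up", "connecting", "non-responsive", "connecting", "up"]


def collapse_repeats(statuses: Iterable[str]) -> List[str]:
    """Drop consecutive duplicates so only the transitions remain."""
    collapsed: List[str] = []
    for status in statuses:
        if not collapsed or collapsed[-1] != status:
            collapsed.append(status)
    return collapsed


def matches_expected(statuses: Iterable[str]) -> bool:
    """Return True if the sampled statuses follow the expected transitions."""
    return collapse_repeats(statuses) == EXPECTED_TRANSITIONS


# Hypothetical samples: the first mirrors the expected behaviour, the second
# mirrors the behaviour reported under "Actual results".
expected_run = ["up", "up", "connecting", "non-responsive", "non-responsive", "connecting", "up"]
actual_run = ["up", "up", "up", "up", "connecting", "up"]

print(matches_expected(expected_run))
print(matches_expected(actual_run))
```

A sequence like the second sample, where the host never passes through 'non-responsive', is what this bug looks like from the outside.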
We need to wait until we get more information from GC, as introduced in BZ1936897.
Closing for now, feel free to reopen if this is still reproducible on the latest version.
Reopening because the issue is currently easy to reproduce.
Unfortunately, we are again unable to reproduce the issue, so we need to close this bug.