Description of problem:

During initialization, the hosted-engine mechanism needs to acquire 2 locks:
* hosted-engine on the lockspace volume
* host_id on hosted_storage ids

These take some time to be acquired. If the user runs a hosted-engine command in the meantime, this message is displayed:

# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

The message implies there is a problem: storage unreachable or ovirt-ha-agent not running. In fact it is just waiting for sanlock; there are no problems, and this message may lead the user to restart the agent unnecessarily.

Please improve this. Check the actual status and print an appropriate message depending on whether it is just initializing, there is in fact a problem accessing storage, or the agent is not running.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.26-1.el7.noarch
ovirt-hosted-engine-ha-2.2.16-1.el7.noarch
vdsm-4.20.39.1-1.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Reboot Hosted-Engine host
2. Run 'hosted-engine' command
Let's try to use a less scary message.
This message appears for a very short period of time. No harm is done, and the customer can check either that ha-agent is running or that storage is reachable; both are legitimate checks. I see this as a corner case. I am not sure such a short time frame justifies a code change, especially since it happens only after a host restart.
Asaf has a fix that should also cover this corner case.
The first time I checked the status, right after the host booted up, I saw the initial issue, probably because ha-agent was still starting:

serval15 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage yet, please ensure that ovirt-ha-agent service is running.

On the second iteration the message changed to this:

serval15 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage yet, for more details please check sanlock status.

After ~2 minutes the status was retrieved OK.

Tested on:
ovirt-hosted-engine-setup-2.5.4-2.el8ev.noarch
ovirt-hosted-engine-ha-2.4.9-1.el8ev.noarch
ovirt-engine-4.4.9.2-0.6.el8ev.noarch
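For readers following along, the behaviour seen in the verification output above can be sketched roughly as below. This is only an illustration and not the actual fix in ovirt-hosted-engine-setup: the function names agent_is_active and pick_message are hypothetical, the two message strings are copied from the output above, and the sketch assumes that "systemctl is-active" on the ovirt-ha-agent unit is an acceptable way to tell whether the agent is running.

```python
import subprocess

# Message texts copied from the verification output in this bug.
MSG_AGENT_DOWN = (
    "The hosted engine configuration has not been retrieved from shared "
    "storage yet, please ensure that ovirt-ha-agent service is running."
)
MSG_WAIT_SANLOCK = (
    "The hosted engine configuration has not been retrieved from shared "
    "storage yet, for more details please check sanlock status."
)


def agent_is_active(service="ovirt-ha-agent"):
    """Hypothetical helper: True if systemd reports the unit as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", service], check=False
        )
    except OSError:
        # systemctl is not available on this machine; treat as not running.
        return False
    return result.returncode == 0


def pick_message(agent_active):
    """Hypothetical helper: choose the hint shown while config is missing.

    If the agent is running, the likely cause is that the sanlock locks
    are still being acquired, so point the user at sanlock instead of
    suggesting an agent or storage problem.
    """
    return MSG_WAIT_SANLOCK if agent_active else MSG_AGENT_DOWN


if __name__ == "__main__":
    print(pick_message(agent_is_active()))
```

The point of splitting the check from the message choice is that the user is only told to look at the agent when the agent is really down; otherwise the hint points at sanlock, which matches the ~2 minute wait observed above.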
This bugzilla is included in oVirt 4.4.9 release, published on October 20th 2021. Since the problem described in this bug report should be resolved in oVirt 4.4.9 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.