Description of problem:
Cannot get the correct HE-VM status information with the CLI or Cockpit. After a Cockpit-based otopi deployment of the HE failed, the HE was redeployed via the CLI with the noansible option.
--------------------------------------------------------------------------------------------
[root@dell-per515-02 ~]# hosted-engine --deploy --noansible
[ INFO ] Stage: Initializing
[ INFO ] Generating a temporary VNC password.
[ INFO ] Stage: Environment setup
During customization use CTRL-D to abort.
Continuing will configure this host for serving as hypervisor and create a VM where you have to install the engine afterwards.
Are you sure you want to continue? (Yes, No)[Yes]:
It has been detected that this program is executed through an SSH connection without using screen.
Continuing with the installation may lead to broken installation if the network connection fails.
It is highly recommended to abort the installation and run it inside a screen session using command "screen".
Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO ] Hardware supports virtualization
Configuration files: []
Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180126002300-z4f6rb.log
Version: otopi-1.7.6 (otopi-1.7.6-1.el7ev)
[ INFO ] Detecting available oVirt engine appliances
[ INFO ] Stage: Environment packages setup
[ INFO ] Stage: Programs detection
[ INFO ] Stage: Environment setup
[ ERROR ] The following VMs have been found: 63470a43-31ff-4330-977c-6716f38a1fc1
[ ERROR ] Failed to execute stage 'Environment setup': Cannot setup Hosted Engine with other VMs running
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180126002309.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180126002300-z4f6rb.log
--------------------------------------------------------------------------------------------
But when checking the HE-VM status, no VM status is reported:
--------------------------------------------------------------------------------------------
[root@dell-per515-02 ~]# hosted-engine --vm-status
You must run deploy first
--------------------------------------------------------------------------------------------

Version-Release number of selected component (if applicable):
cockpit-ws-157-1.el7.x86_64
cockpit-bridge-157-1.el7.x86_64
cockpit-storaged-157-1.el7.noarch
cockpit-dashboard-157-1.el7.x86_64
cockpit-157-1.el7.x86_64
cockpit-ovirt-dashboard-0.11.5-0.1.el7ev.noarch
cockpit-system-157-1.el7.noarch
ovirt-hosted-engine-setup-2.2.8-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.4-1.el7ev.noarch
rhvm-appliance-4.2-20171219.0.el7.noarch
rhvh-4.2.1.2-0.20180125.0+1

How reproducible:
100%

Steps to Reproduce:
1. Clean install RHVH 4.2.1 (rhvh-4.2.1.2-0.20180125.0+1) using kickstart (ks)
2. Deploy HE via Cockpit-based otopi
3. Redeploy HE via the CLI with the noansible option (hosted-engine --deploy --noansible)
4. Check the HE-VM status (hosted-engine --vm-status)

Actual results:
1. After step 2, the deployment fails with some issues.
2. After step 3, the deployment fails because other VMs are running:
[root@dell-per515-02 ~]# hosted-engine --deploy --noansible
[ INFO ] Stage: Initializing
[ INFO ] Generating a temporary VNC password.
[ INFO ] Stage: Environment setup
During customization use CTRL-D to abort.
Continuing will configure this host for serving as hypervisor and create a VM where you have to install the engine afterwards.
Are you sure you want to continue? (Yes, No)[Yes]:
It has been detected that this program is executed through an SSH connection without using screen.
Continuing with the installation may lead to broken installation if the network connection fails.
It is highly recommended to abort the installation and run it inside a screen session using command "screen".
Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO ] Hardware supports virtualization
Configuration files: []
Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180126002300-z4f6rb.log
Version: otopi-1.7.6 (otopi-1.7.6-1.el7ev)
[ INFO ] Detecting available oVirt engine appliances
[ INFO ] Stage: Environment packages setup
[ INFO ] Stage: Programs detection
[ INFO ] Stage: Environment setup
[ ERROR ] The following VMs have been found: 63470a43-31ff-4330-977c-6716f38a1fc1
[ ERROR ] Failed to execute stage 'Environment setup': Cannot setup Hosted Engine with other VMs running
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20180126002309.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180126002300-z4f6rb.log
3. After step 4, checking the HE-VM status appears to return incorrect HostedEngine VM status information:
[root@dell-per515-02 ~]# hosted-engine --vm-status
You must run deploy first

Expected results:
Get the HE-VM status information successfully.

Additional info:
In Cockpit, after step 4, the environment also appears to be clean, with no VM running.
This seems reasonable. If the engine is not deployed, the status cannot be checked. Please re-open if this persists after a successful deployment.
Yes. The issue is that the VM is running (although the engine is not OK).
1. From Cockpit, the user cannot tell whether the environment is clean or a VM is running.
2. If the first deployment fails and the user redeploys the HE with the noansible option, it raises the error "Cannot setup Hosted Engine with other VMs running".
So, how can the user find out the deployment status from Cockpit or the CLI?
Perhaps we need another status for --vm-status to report that there was a failed deployment.
This doesn't make sense to me, please open a new RFE on the use case, not the solution and we will consider how to best address it.
(In reply to Yaniv Lavi from comment #4)
> This doesn't make sense to me, please open a new RFE on the use case, not
> the solution and we will consider how to best address it.

Could you clarify what you mean by opening a new RFE? I am confused about that.
The use case here is very clear.

Attempt to deploy over ansible. A VM is created. Deployment fails for some reason. The system is now in an inconsistent state: --vm-status shows that it is clean, but trying to deploy HE fails because it is not clean.

--vm-status would, ideally, check whether a VM for Node Zero is running and return some other result if it's present but ha-agent does not think it's deployed. Without this, the UX in cockpit doesn't let users know until after a failure.

Yes, users should already know to clean up a failed deployment, but that's true of many bugs/RFEs...
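The check described above can be sketched as a small decision function. This is only an illustration, not the hosted-engine implementation: the function name, its two inputs, and the state labels are hypothetical, and it assumes the caller has already worked out whether ha-agent considers the host deployed and which VM IDs are running.

```python
from typing import Sequence


def describe_deploy_state(deployed: bool, running_vms: Sequence[str]) -> str:
    """Return a coarse, hypothetical deployment state label.

    deployed:    whether ha-agent considers hosted-engine deployed (assumed input)
    running_vms: IDs of VMs currently running on the host (assumed input)
    """
    if deployed:
        return "deployed"
    if running_vms:
        # Not deployed, yet a VM is present: a failed or still-running
        # deployment probably left it behind.
        return "failed-or-in-progress"
    return "clean"


if __name__ == "__main__":
    # VM ID taken from the deploy log in this report.
    print(describe_deploy_state(False, ["63470a43-31ff-4330-977c-6716f38a1fc1"]))
```

The point of the third state is that "not deployed" alone does not distinguish a clean host from one that needs cleanup before the next attempt.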
Do we want only a single true/false flag here? Would it be enough if it output 'It seems like a previous attempt to deploy hosted-engine failed. Please reinstall the OS before trying again'?

IMO the current behavior is reasonable. 'hosted-engine --vm-status' is not designed to analyze this state, and doing a really good job (checking what the status is, what's good, what's bad, what failed, how to fix it, etc.) is a very big project.

If you/we want something in between the two options above, please state what exactly. I do not think we want to repeat in '--vm-status' all the checks that '--deploy' does, and remember that the code in '--deploy --noansible' is going to be removed in 4.3, if all goes well.

Also, 'hosted-engine --deploy', in this state, fails very quickly after the start, before requiring much interaction from the user, so it does not waste much time or effort.
In my opinion, it would be enough to output that, yes. We don't really need it to know exactly what's good and what's bad, just "a previous attempt failed, please clean/redeploy before trying again". We can rely on `hosted-engine --cleanup` to handle the edge cases.
The solution is still under discussion. We will provide qa_ack if the fix is in the UI only; otherwise we will move it to the default QA contact to ack. Thanks.
Tested with ovirt-hosted-engine-setup-2.2.18-1.el7ev.

If the deployment fails, checking the VM status with 'hosted-engine --vm-status' now gives a hint like this:

# hosted-engine --vm-status
It seems like a previous attempt to deploy hosted-engine failed or it's still in progress. Please clean it up before trying again

So, moving to verified.
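A script that consumes the output of 'hosted-engine --vm-status' could tell apart the two messages quoted in this report with a simple substring check. The sketch below is an assumption-laden illustration: the function name and the returned labels are hypothetical, and only the two message strings come from this report. It does not call the CLI itself; it only classifies text that the caller has already captured.

```python
# Message printed before the fix (and on a host that was never deployed).
DEPLOY_FIRST = "You must run deploy first"
# Start of the hint printed by ovirt-hosted-engine-setup-2.2.18, per this report.
FAILED_HINT = "It seems like a previous attempt to deploy hosted-engine failed"


def classify_vm_status_output(output: str) -> str:
    """Map captured --vm-status text to a hypothetical state label."""
    if FAILED_HINT in output:
        return "failed-or-in-progress"
    if DEPLOY_FIRST in output:
        return "not-deployed"
    # Anything else (for example a real status report) is not interpreted here.
    return "other"
```

Matching on the start of the hint rather than the full sentence keeps the check tolerant of small wording changes at the end of the message.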
This bugzilla is included in oVirt 4.2.3 release, published on May 4th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.3 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.