Bug 1566622 - ha-host fails to retrieve vm.conf file and becomes not ha-host if vm.conf manually removed from host, while its not in global maintenance.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: General
Version: 2.2.16
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Ido Rosenzwig
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-12 15:57 UTC by Nikolai Sednev
Modified: 2022-02-25 11:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-16 09:34:34 UTC
oVirt Team: Integration
Embargoed:


Attachments
sosreport from alma04 (9.91 MB, application/x-xz), 2018-04-12 15:57 UTC, Nikolai Sednev
engine logs (9.20 MB, application/x-xz), 2018-04-12 16:02 UTC, Nikolai Sednev


Links
System: Red Hat Issue Tracker
ID: RHV-44938
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2022-02-25 11:12:08 UTC

Description Nikolai Sednev 2018-04-12 15:57:51 UTC
Created attachment 1420918: sosreport from alma04

Description of problem:
The ha-host fails to retrieve the vm.conf file from shared storage and stops being an ha-host if vm.conf is manually removed from the host while it is not in global maintenance.
[root@alma04 ~]# rm -f /var/run/ovirt-hosted-engine-ha/vm.conf
[root@alma04 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.


Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.1.4.2-1.el7ev.noarch
ovirt-hosted-engine-ha-2.1.11-1.el7ev.noarch
rhvm-appliance-4.1.20180125.0-1.el7.noarch
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
100%

Steps to Reproduce:
1. Deploy SHE on two ha-hosts over NFS.
2. Make sure the SHE-VM is running on the first host.
3. Run "rm -f /var/run/ovirt-hosted-engine-ha/vm.conf" on the second host.
4. Check that /var/run/ovirt-hosted-engine-ha/vm.conf is not copied back from shared storage and does not exist in the directory.
5. Run "hosted-engine --vm-status" on the second host. It prints:
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.
6. Check that the NFS storage is in fact accessible, contrary to the message printed in step 5.
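
The checks in steps 4 and 6 can be scripted. The following is only a minimal sketch: the vm.conf path is the one from this report, while the helper names vm_conf_state and nfs_mounts are hypothetical and not part of any oVirt tool. It assumes a Linux host where NFS mounts show up in /proc/mounts with a filesystem type starting with "nfs".

```shell
#!/bin/sh
# Path taken from the report; override VM_CONF to check another file.
VM_CONF="${VM_CONF:-/var/run/ovirt-hosted-engine-ha/vm.conf}"

# Step 4: print "present" or "missing" for the given file (default: $VM_CONF).
vm_conf_state() {
    if [ -f "${1:-$VM_CONF}" ]; then
        echo present
    else
        echo missing
    fi
}

# Step 6: print the mount point of every NFS mount the kernel knows about.
# /proc/mounts fields are: device mountpoint fstype options dump pass.
# Prints nothing if no NFS file system is mounted.
nfs_mounts() {
    awk '$3 ~ /^nfs/ { print $2 }' "${1:-/proc/mounts}"
}
```

Calling vm_conf_state and then nfs_mounts on the second host shows whether the file is gone while the NFS storage is still mounted. A listed mount point only shows that the mount exists, not that the storage server is responding.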

Actual results:
ha-agent fails to retrieve vm.conf and has to be manually restarted to work around the issue.

Expected results:
vm.conf should be copied from shared storage without issues and ha-agent should not fail the ha-host.

Additional info:
sosreport from host is attached.

Comment 1 Nikolai Sednev 2018-04-12 16:02:39 UTC
Created attachment 1420919: engine logs

Comment 2 Nikolai Sednev 2018-04-12 16:03:57 UTC
In the web UI, the engine recognizes the second host as an ha-host with an active score of 3400; in the CLI it appears as:
[root@alma04 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

Comment 3 Martin Sivák 2018-04-12 16:26:15 UTC
We do not refresh vm.conf unless it is necessary. Nobody is supposed to touch runtime files manually, so this is not really a bug. The file will eventually appear again once it is needed.

As long as the hosted engine is running and the engine sees the right score (and comment 2 says it does), there is nothing wrong, except perhaps a missing note about how to trigger the refresh (starting the VM, restarting the agent, and probably some other events).

We could provide a command to download the file on user's request.
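
Until such a command exists, the agent restart mentioned above can be wrapped in a small helper. This is a sketch under assumptions, not an official tool: refresh_vm_conf and the RESTART_CMD override are hypothetical names, and it assumes the agent runs as the systemd unit ovirt-ha-agent. Whether the restart writes vm.conf back immediately is taken from the report's workaround, not verified here.

```shell
#!/bin/sh
# Path taken from the report; override VM_CONF to point at another file.
VM_CONF="${VM_CONF:-/var/run/ovirt-hosted-engine-ha/vm.conf}"
# Hypothetical hook: set RESTART_CMD (e.g. to "echo dry-run") to avoid
# touching the real service. The default assumes a systemd-managed agent.
RESTART_CMD="${RESTART_CMD:-systemctl restart ovirt-ha-agent}"

# Restart the agent only when the runtime copy of vm.conf is missing.
refresh_vm_conf() {
    if [ -f "$VM_CONF" ]; then
        echo "vm.conf present, nothing to do"
        return 0
    fi
    echo "vm.conf missing, restarting ovirt-ha-agent"
    # Left unquoted on purpose so the command string splits into words.
    $RESTART_CMD
}
```

Run refresh_vm_conf as root on the affected host, then check "hosted-engine --vm-status" again.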

Comment 4 Martin Sivák 2018-04-12 16:29:12 UTC
Simone do you think we should change the detection logic here to show the status even when vm.conf is missing?

Comment 5 Simone Tiraboschi 2018-04-13 07:15:02 UTC
(In reply to Martin Sivák from comment #4)
> Simone do you think we should change the detection logic here to show the
> status even when vm.conf is missing?

Maybe just that fix would be worthwhile, although manually deleting a file is still not a recommended action.

Comment 6 Sandro Bonazzola 2019-01-16 09:34:34 UTC
Not going to fix this

