Bug 1008505 - ha broker service should check if the metadata file exists upon starting. If it does not exist, it should give a warning message and fail to start.
Summary: ha broker service should check if the metadata file exists upon starting. if not e...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assignee: Martin Sivák
QA Contact: Artyom
URL:
Whiteboard: sla
Depends On:
Blocks:
Reported: 2013-09-16 13:41 UTC by Leonid Natapov
Modified: 2022-05-16 06:37 UTC

Fixed In Version: ovirt-hosted-engine-ha-0.1.0-0.3.1.beta1.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-21 16:50:05 UTC
oVirt Team: SLA
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHV-46024 | 0 | None | None | None | 2022-05-16 06:37:55 UTC
Red Hat Product Errata | RHEA-2014:0080 | 0 | normal | SHIPPED_LIVE | new package: ovirt-hosted-engine-ha | 2014-01-21 21:00:07 UTC
oVirt gerrit | 19294 | 0 | None | MERGED | Kill the agent if three subsequent requests fail | 2020-09-09 08:15:18 UTC

Description Leonid Natapov 2013-09-16 13:41:54 UTC
The HA broker service should check whether the metadata file exists when it starts. If the file does not exist, the service should log a warning message and fail to start.
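A minimal sketch of the requested startup check. The helper names `check_metadata_file` and `start_service` and the logger name are hypothetical, and the sketch assumes the metadata path is already known at startup; it is not the actual ovirt-hosted-engine-ha code.

```python
import logging
import os
import sys

log = logging.getLogger("ha_broker_sketch")  # hypothetical logger name


def check_metadata_file(path):
    """Return True if the metadata file exists; otherwise log a warning and return False."""
    if os.path.isfile(path):
        return True
    log.warning("Metadata file not found: %s", path)
    return False


def start_service(path):
    """Hypothetical startup hook: refuse to start when the metadata file is missing."""
    if not check_metadata_file(path):
        sys.exit(1)  # fail to start with a non-zero exit status
    # ... normal startup would continue here ...
```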

Comment 1 Martin Sivák 2013-09-16 15:10:33 UTC
The HA agent can't check for the files directly. It uses the broker for that. And the broker does not know the service name (and thus the filename) until the agent asks for the metadata.

So after a discussion with Doron we decided to implement a 3-strike rule. If the agent tries to execute a call through the broker and sees three failures in a row then the agent will terminate.
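The 3-strike rule could be sketched roughly as follows. `StrikeCounter` and `TooManyFailures` are hypothetical names used only for illustration; the merged patch may be structured differently.

```python
class TooManyFailures(Exception):
    """Raised when the consecutive-failure limit is reached."""


class StrikeCounter:
    """Counts consecutive failed calls and gives up after max_failures in a row."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    def call(self, func, *args, **kwargs):
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Third strike in a row: the caller (the agent) should terminate.
                raise TooManyFailures(
                    "%d consecutive failures" % self.failures
                ) from exc
            return None
        # Any successful call resets the streak.
        self.failures = 0
        return result
```

In this sketch a single success resets the counter, so only failures in a row count toward termination.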

Comment 2 Martin Sivák 2013-09-16 15:15:07 UTC
Greg, do you think that is a reasonable behaviour? It will wait for a total of three minutes to recover. The other hosts won't see any updates from this one for that time.

Comment 3 Greg Padgett 2013-09-24 00:28:51 UTC
I think it's reasonable.  With the patch, the agent will exit after 3 successive failures of most anything.  I don't immediately see a problem, but we should consider possible temporary failures and whether it's okay for them to effectively kill the agent.

Long-term we could consider an environment-checking routine, but for now I don't really see a big advantage in implementing that.

Comment 4 Greg Padgett 2013-10-11 14:06:20 UTC
Merged Change-Id: Ia5ec6935673133af2e60f339b035dba3438b617f

Comment 6 Artyom 2013-10-16 15:50:31 UTC
Can you please describe the steps for verifying this bug more explicitly?

Comment 7 Leonid Natapov 2013-10-16 15:59:56 UTC
(In reply to Artyom from comment #6)
> Can you please describe the steps for verifying this bug more explicitly?

1. Check that ha-agent starts successfully after a regular installation.
2. Delete the metadata file and start the engine. It should be terminated after 3 minutes. Check the agent.log file.

Comment 8 Leonid Natapov 2013-10-16 16:00:30 UTC
Sorry, in step 2 start the HA agent (not the engine).

Comment 9 Artyom 2013-10-18 13:43:43 UTC
Verified on ovirt-hosted-engine-ha-0.1.0-0.3.1.beta1.el6ev.noarch
After deleting the metadata file, the engine stops working and the following message appears in the log:
Full response: failure failed to write metadata: [Errno 2] No such file or directory: '/rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_alukiano_host__deploy/db1206d1-060b-446f-ad9f-aca7ca23704b/ha_agent/hosted-engine.metadata'
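The `[Errno 2] No such file or directory` part of this message is the standard text Python produces when a path does not exist. A minimal sketch that reproduces a message of the same shape; `write_metadata` is a hypothetical helper, not the broker's actual function, and the prefix is copied from the log line above.

```python
def write_metadata(path, data):
    """Try to update an existing metadata file and report the outcome as a string."""
    try:
        # "r+b" opens an existing file for update and never creates it.
        with open(path, "r+b") as f:
            f.write(data)
    except OSError as exc:  # covers FileNotFoundError (errno 2, ENOENT)
        return "failure failed to write metadata: %s" % exc
    return "success"
```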

Comment 10 Charlie 2013-11-28 01:41:43 UTC
This bug is currently attached to errata RHEA-2013:15591. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 11 errata-xmlrpc 2014-01-21 16:50:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0080.html

