Bug 1013638 - when the HA agent detects that the vm died unexpectedly, it should call --vm-poweroff before --vm-start
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
unspecified
Unspecified Unspecified
unspecified Severity high
: ---
: 3.3.0
Assigned To: Greg Padgett
Artyom
sla
:
Depends On:
Blocks:
Reported: 2013-09-30 09:45 EDT by Leonid Natapov
Modified: 2016-06-12 19:16 EDT

See Also:
Fixed In Version: ovirt-hosted-engine-ha-0.1.0-0.3.1.beta1.el6ev
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-21 11:50:17 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 19874 None None None Never
Red Hat Product Errata RHEA-2014:0080 normal SHIPPED_LIVE new package: ovirt-hosted-engine-ha 2014-01-21 16:00:07 EST

Description Leonid Natapov 2013-09-30 09:45:47 EDT
When the HA agent detects that the VM died unexpectedly, it should call --vm-poweroff before --vm-start.

Currently, when the VM dies or fails to start, it stays in the "down" state.
vdsClient shows that the VM exists and is down. When the HA agent then calls --vm-start, VM creation fails because, according to VDSM, the VM already exists. The agent should therefore call --vm-poweroff, which destroys the VM, before calling --vm-start, which creates it.
----------
MainThread::ERROR::2013-09-30 16:31:31,965::hosted_engine::779::HostedEngine::(_handle_on) Engine vm died unexpectedly
MainThread::DEBUG::2013-09-30 16:31:31,966::hosted_engine::664::HostedEngine::(_perform_engine_actions) Processing engine state OFF
MainThread::ERROR::2013-09-30 16:31:31,966::hosted_engine::726::HostedEngine::(_handle_off) Engine down and local host has best score (2000), attempting to start engine VM
MainThread::DEBUG::2013-09-30 16:31:31,966::hosted_engine::664::HostedEngine::(_perform_engine_actions) Processing engine state START
MainThread::INFO::2013-09-30 16:31:31,966::hosted_engine::750::HostedEngine::(_start_engine_vm) Starting vm using `/usr/sbin/hosted-engine --vm-start`
MainThread::INFO::2013-09-30 16:31:32,216::hosted_engine::755::HostedEngine::(_start_engine_vm) stdout: Virtual machine already exists

MainThread::INFO::2013-09-30 16:31:32,216::hosted_engine::756::HostedEngine::(_start_engine_vm) stderr: 
MainThread::WARNING::2013-09-30 16:31:32,217::hosted_engine::762::HostedEngine::(_start_engine_vm) Failed to start engine VM, already running according to VDSM
---------------
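The ordering requested above can be sketched roughly as follows. This is only an illustration, not the agent's actual code: the helper names run_hosted_engine and restart_engine_vm are hypothetical, and the sketch assumes that a failing --vm-poweroff simply means there was no stale VM to destroy.

```python
import subprocess

HOSTED_ENGINE = "/usr/sbin/hosted-engine"  # path as it appears in the agent log


def run_hosted_engine(flag):
    """Run `hosted-engine <flag>` and return (returncode, stdout, stderr)."""
    proc = subprocess.run(
        [HOSTED_ENGINE, flag],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def restart_engine_vm(run=run_hosted_engine):
    """Hypothetical helper: power off any stale VM, then start a new one.

    `run` is injectable so the ordering can be exercised without a real host.
    """
    # Power off first so that VDSM drops the VM that is stuck in "down".
    # The result is ignored on the assumption (not confirmed by the report)
    # that a failure here only means there was nothing to destroy.
    run("--vm-poweroff")
    # Only now ask for the VM to be created again.
    return run("--vm-start")
```

Passing a fake `run` callable makes it possible to confirm that --vm-poweroff is always issued before --vm-start.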
Comment 1 Leonid Natapov 2013-10-02 04:17:47 EDT
How to reproduce:
1. Bring up a hosted-engine environment with two hosts.
2. Make sure the engine VM is running on one of the hosts.
3. Remove the sanlock lockspace on the host where the engine VM is running.
4. Check agent.log: the HA agent tries to bring up the VM but fails because there is no lockspace.
5. Add the sanlock lockspace back.
6. See that the VM is UP.
Comment 2 Greg Padgett 2013-10-10 07:21:11 EDT
Merged Change-Id: I52c8f41c28fbc1942a8b392b275359df57c1b5ef
Comment 4 Artyom 2013-10-20 03:55:47 EDT
Verified on ovirt-hosted-engine-ha-0.1.0-0.3.1.beta1.el6ev.noarch
Used a different scenario: the VM kernel was crashed manually with the command echo c > /proc/sysrq-trigger.
MainThread::ERROR::2013-10-20 10:49:03,201::hosted_engine::912::HostedEngine::(_handle_on) Engine vm died unexpectedly
MainThread::DEBUG::2013-10-20 10:49:03,202::hosted_engine::752::HostedEngine::(_perform_engine_actions) Processing engine state OFF
MainThread::ERROR::2013-10-20 10:49:03,202::hosted_engine::814::HostedEngine::(_handle_off) Engine down and local host has best score (2400), attempting to start engine VM
MainThread::DEBUG::2013-10-20 10:49:03,202::hosted_engine::752::HostedEngine::(_perform_engine_actions) Processing engine state START
MainThread::INFO::2013-10-20 10:49:03,202::hosted_engine::870::HostedEngine::(_clean_vdsm_state) Ensuring VDSM state is clear for engine VM
MainThread::DEBUG::2013-10-20 10:49:03,202::vds_client::38::SubmonitorUtil::(run_vds_client_cmd) Connecting to vdsClient at 0 with ssl=True
MainThread::DEBUG::2013-10-20 10:49:03,203::vds_client::59::SubmonitorUtil::(run_vds_client_cmd) Connected, running getVmStats, args ('a78c56fe-472a-4502-b25e-d04d8364f682',), kwargs {}
The log shows that hosted_engine checks the VM status and, if the VM is down, destroys it via vdsClient.
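A minimal sketch of that check-then-destroy behaviour, for illustration only. The function clean_vdsm_state and its two callable parameters are hypothetical stand-ins rather than the agent's real interface, and the "down" status string is an assumption about what the VM stats query reports.

```python
def clean_vdsm_state(vm_id, get_vm_status, destroy_vm):
    """Destroy the engine VM in VDSM if it is reported as down.

    get_vm_status(vm_id) -> status string, or None if VDSM has no such VM
    destroy_vm(vm_id)    -> asks VDSM to destroy the VM
    Both callables are hypothetical wrappers around vdsClient calls.
    Returns True if a destroy was issued, False otherwise.
    """
    status = get_vm_status(vm_id)
    # Assumption: a VM that died is reported with a status of "down"
    # (compared case-insensitively here).
    if status is not None and status.lower() == "down":
        destroy_vm(vm_id)
        return True
    # The VM is unknown to VDSM or in some other state: leave it alone.
    return False
```

With the state cleared this way, a later --vm-start no longer runs into the "Virtual machine already exists" response shown in the original report.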
Comment 5 Charlie 2013-11-27 20:41:52 EST
This bug is currently attached to errata RHEA-2013:15591. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.
Comment 6 Greg Padgett 2013-12-06 13:01:36 EST
ovirt-hosted-engine-ha is a new package; does not need errata for bugs during its development.
Comment 7 errata-xmlrpc 2014-01-21 11:50:17 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0080.html
