Bug 1026377 - HA agent dies after manual migration of engine vm.
Summary: HA agent dies after manual migration of engine vm.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.3.1
Assignee: Greg Padgett
QA Contact: Artyom
URL:
Whiteboard: sla
Depends On:
Blocks: rhev3.4beta 1142926
 
Reported: 2013-11-04 14:21 UTC by Leonid Natapov
Modified: 2016-06-12 23:16 UTC (History)

Fixed In Version: ovirt-hosted-engine-ha-0.1.0-0.8.rc.el6ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-21 16:51:19 UTC
oVirt Team: SLA
Target Upstream Version:
Embargoed:


Attachments
logs (971.96 KB, application/x-gzip)
2013-11-04 14:25 UTC, Leonid Natapov
agen log (3.82 MB, text/x-log)
2013-11-21 15:34 UTC, Artyom
vdsm and agent log (1.21 MB, application/zip)
2013-12-09 12:57 UTC, Artyom
libvirt logs (107.58 KB, application/zip)
2013-12-09 15:17 UTC, Artyom


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2014:0080 0 normal SHIPPED_LIVE new package: ovirt-hosted-engine-ha 2014-01-21 21:00:07 UTC
oVirt gerrit 21275 0 None None None Never
oVirt gerrit 21808 0 None None None Never

Description Leonid Natapov 2013-11-04 14:21:38 UTC
Description of problem:

I am manually migrating the engine VM from one host to another.
According to the UI, everything looks fine: the migration starts, the migration ends, and the engine VM runs on the second host.

The ha-agent log shows a different situation.
Here is the ha-agent log from the destination machine:
---------------------------------------------
MainThread::INFO::2013-11-04 15:29:17,254::brokerlink::108::BrokerLink::(put_stats_on_storage) Storing blocks on storage at /rhev/data-center/mnt/orion.qa.lab.tlv.redhat.com:_export_hosted__engine2/6177a07d-7789-49ba-96d2-1469700a1509/ha_agent
MainThread::ERROR::2013-11-04 15:29:17,363::hosted_engine::720::HostedEngine::(_collect_all_host_stats) Host green-vdsb.qa.lab.tlv.redhat.com (id 1) is no longer updating its metadata
MainThread::INFO::2013-11-04 15:29:17,367::hosted_engine::729::HostedEngine::(_collect_all_host_stats) Host green-vdsa.qa.lab.tlv.redhat.com (id 2) metadata updated
MainThread::INFO::2013-11-04 15:29:17,367::hosted_engine::734::HostedEngine::(_collect_all_host_stats) Host green-vdsa.qa.lab.tlv.redhat.com (id 2): {'last-update-host-ts': 1383571749, 'last-update-local-ts': 1383571757.36339, 'hostname': 'green-vdsa.qa.lab.tlv.redhat.com', 'alive': True, 'engine-status': 'vm-down', 'score': 0, 'first-update': False}
MainThread::INFO::2013-11-04 15:29:17,368::hosted_engine::729::HostedEngine::(_collect_all_host_stats) Host green-vdsc.qa.lab.tlv.redhat.com (id 3) metadata updated
MainThread::INFO::2013-11-04 15:29:17,368::hosted_engine::734::HostedEngine::(_collect_all_host_stats) Host green-vdsc.qa.lab.tlv.redhat.com (id 3): {'last-update-host-ts': 1383571757, 'last-update-local-ts': 1383571757.36339, 'hostname': 'green-vdsc.qa.lab.tlv.redhat.com', 'alive': True, 'engine-status': 'vm-up good-health-status', 'score': 2400, 'first-update': False}
MainThread::WARNING::2013-11-04 15:29:17,419::hosted_engine::262::HostedEngine::(start_monitoring) Error while monitoring engine: Error 12 from migrateStatus: Fatal error during migration
MainThread::WARNING::2013-11-04 15:29:17,419::hosted_engine::265::HostedEngine::(start_monitoring) Unexpected error
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 259, in start_monitoring
    self._perform_engine_actions()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 788, in _perform_engine_actions
    = self._vm_state_actions[self._rinfo['current-state']]()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 46, in cleanup_wrapper
    ret = f(self)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 1118, in _handle_migrate
    vm_id,
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/vds_client.py", line 85, in run_vds_client_cmd
    response['status']['message'])
DetailedError: Error 12 from migrateStatus: Fatal error during migration
MainThread::ERROR::2013-11-04 15:29:17,419::hosted_engine::279::HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
MainThread::INFO::2013-11-04 15:29:17,419::brokerlink::54::BrokerLink::(disconnect) Closing connection to ha-broker
MainThread::INFO::2013-11-04 15:29:17,420::agent::107::Broker::(run) Agent shutting down
----------------------------------------------------------------------------

Agent shutting down.
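
For readers following the log above, here is a minimal sketch of a "shut down after N failures in a row" loop. It is not the code from hosted_engine.py. The names run_monitoring, always_failing_step and MAX_FAILURES are hypothetical, DetailedError is redefined locally as a stand-in, and the limit of 3 is taken only from the log message. It illustrates why a single error that repeats on every cycle, such as the migrateStatus failure, is enough to stop the agent.

```python
class DetailedError(Exception):
    """Local stand-in for the error type named in the traceback (assumption)."""


MAX_FAILURES = 3  # assumption: read off "3 failures in a row" in the log


def run_monitoring(step, max_failures=MAX_FAILURES):
    """Call step() repeatedly.

    Returns "stopped" when step() returns False, or "shutdown" when
    step() raises max_failures times in a row.
    """
    failures = 0
    while True:
        try:
            keep_going = step()
        except Exception:
            failures += 1
            if failures >= max_failures:
                return "shutdown"
            continue
        failures = 0  # a successful cycle resets the streak
        if not keep_going:
            return "stopped"


def always_failing_step():
    """Hypothetical step that fails the same way on every cycle."""
    raise DetailedError(
        "Error 12 from migrateStatus: Fatal error during migration")
```

With a step that raises on every cycle, run_monitoring(always_failing_step) gives up on the third call, which matches the sequence of WARNING then ERROR lines in the log.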

On the source machine, the ha-agent tries to start a VM because it cannot see the first host updating its metadata.
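
As an illustration of how a host could be judged to have stopped updating its metadata, here is a small sketch based on the 'last-update-local-ts' field that appears in the per-host stats in the log. The function name is_stale and the 60-second threshold are assumptions made for this example. The actual rule and timeout used by the agent are not stated in this report.

```python
import time

STALE_AFTER_SECONDS = 60  # hypothetical threshold, not taken from the agent


def is_stale(stats, now=None, stale_after=STALE_AFTER_SECONDS):
    """Return True if the host's metadata was last seen too long ago.

    stats is a dict shaped like the per-host entries in the log; only
    'last-update-local-ts' (seconds since the epoch) is used here.
    """
    if now is None:
        now = time.time()
    return now - stats["last-update-local-ts"] > stale_after


# Values copied from the host id 3 entry in the log above.
host_stats = {
    "hostname": "green-vdsc.qa.lab.tlv.redhat.com",
    "last-update-local-ts": 1383571757.36339,
    "alive": True,
}
```

A host whose agent has shut down stops refreshing this timestamp, so after the threshold passes the other hosts would treat it as gone and may try to start the engine VM themselves.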

I am attaching the ha-agent and vdsm logs from both hosts.
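
For anyone going through the attached agent logs, the following sketch splits a log line into its parts so that WARNING and ERROR entries can be filtered. The layout (thread::level::timestamp::module::line::logger::(function) message) is inferred from the excerpt in this report, and the names LOG_RE and parse_line are hypothetical. Traceback lines do not match the pattern and return None.

```python
import re

# Layout inferred from the log excerpt in this report (assumption).
LOG_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>\w+)::"
    r"\((?P<func>\w+)\) (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for an agent log line, or None if no match."""
    match = LOG_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    fields["lineno"] = int(fields["lineno"])
    return fields


sample = (
    "MainThread::ERROR::2013-11-04 15:29:17,419::hosted_engine::279::"
    "HostedEngine::(start_monitoring) Shutting down the agent because "
    "of 3 failures in a row!"
)
parsed = parse_line(sample)
```

Filtering on parsed["level"] in ("WARNING", "ERROR") is usually enough to find the point where the agent starts failing.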

Comment 1 Leonid Natapov 2013-11-04 14:25:50 UTC
Created attachment 819182 [details]
logs

Comment 2 Greg Padgett 2013-11-18 15:20:41 UTC
Merged Change-Id: I9da40599098446e6915a2bd675096d00fa07516d

Comment 4 Artyom 2013-11-21 15:34:13 UTC
Created attachment 827293 [details]
agen log

Checked on ovirt-hosted-engine-ha-0.1.0-0.6.beta1.el6ev.noarch.
The migration itself looks OK, but after the migration the source host's score drops to 0.
There is also an error message in the source host's agent log.

Comment 5 Charlie 2013-11-28 01:41:58 UTC
This bug is currently attached to errata RHEA-2013:15591. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 6 Greg Padgett 2013-12-06 18:01:57 UTC
ovirt-hosted-engine-ha is a new package; does not need errata for bugs during its development.

Comment 7 Artyom 2013-12-09 12:57:13 UTC
Created attachment 834311 [details]
vdsm and agent log

Checked on ovirt-hosted-engine-ha-0.1.0-0.8.rc.el6ev.noarch: the migration still fails, with errors in the agent and vdsm logs, but the VM shuts down and starts on the other host.

Comment 8 Artyom 2013-12-09 15:17:19 UTC
Created attachment 834357 [details]
libvirt logs

Comment 11 Greg Padgett 2013-12-16 17:51:03 UTC
We identified a separate issue causing migration failures, moving back to ON_QA to verify the existing HA agent code submitted.

Comment 12 Artyom 2013-12-16 18:15:30 UTC
Verified on ovirt-hosted-engine-ha-0.1.0-0.9.rc.el6ev.noarch

Comment 13 errata-xmlrpc 2014-01-21 16:51:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0080.html

