Bug 1280380

Summary: Hosted Engine: after migration of engine vm, source host still has engine vm in state Down
Product: [oVirt] ovirt-hosted-engine-ha
Reporter: Artyom <alukiano>
Component: Agent
Assignee: Martin Sivák <msivak>
Status: CLOSED NOTABUG
QA Contact: Artyom <alukiano>
Severity: high
Docs Contact:
Priority: unspecified
Version: 1.3.1
CC: ahadas, alukiano, bugs, dfediuck, mavital, mgoldboi, michal.skrivanek, rgolan
Target Milestone: ovirt-3.6.6
Keywords: Regression, Triaged
Target Release: ---
Flags: rule-engine: ovirt-3.6.z+
rule-engine: blocker+
mgoldboi: planning_ack+
dfediuck: devel_ack+
mavital: testing_ack+
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-16 11:08:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
source host logs none

Description Artyom 2015-11-11 15:27:11 UTC
Created attachment 1092795 [details]
source host logs

Description of problem:
After putting the host into local maintenance, the agent migrates the engine VM, but the source host still has the engine VM in state Down.
Output from the source host:

# hosted-engine --vm-status
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : rose05.qa.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Score                              : 0
stopped                            : False
Local maintenance                  : True
crc32                              : 08d9a82e
Host timestamp                     : 14834


--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : cyan-vdsf.qa.lab.tlv.redhat.com
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 2cb9a39a
Host timestamp                     : 185334

# vdsClient -s 0 list table
489f97db-fd1d-4504-9ce0-f8732d6b57d1  20649  HostedEngine         Down


Version-Release number of selected component (if applicable):
ovirt-hosted-engine-ha-1.3.2.1-1.el7ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted-engine on two hosts
2. Put the host running the engine VM into local maintenance (hosted-engine --set-maintenance --mode=local)
3. 

Actual results:
The engine VM migrates to the second host, but the first host still has the engine VM in state Down.

Expected results:
The engine VM migrates to the second host, and no engine VM remains on the first host.

Additional info:
The problem is in the class EngineMigratingAway: because new_data.migration_result contains only the string "Done", the state machine moves to state ReinitializeFSM instead of EngineDown.
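The suspected transition logic can be sketched roughly as follows. This is a minimal, hypothetical model: only the names EngineMigratingAway, migration_result, ReinitializeFSM, and EngineDown come from this report; the method name, the dict-based new_data, and the matching rule are illustrative and not the actual agent code.

```python
class EngineMigratingAway:
    """Hypothetical model of the agent state that watches an outgoing
    migration of the engine VM (not the real ovirt-hosted-engine-ha code)."""

    # Values of migration_result treated as a successful migration.
    # The report suggests the plain string "Done" was not in the
    # agent's allowed set, which forced a fallback to ReinitializeFSM.
    SUCCESS_VALUES = ("Done",)

    def next_state(self, new_data):
        result = new_data.get("migration_result")
        if result in self.SUCCESS_VALUES:
            # Migration finished: the engine VM now runs on the peer
            # host, so the local state machine should return to EngineDown.
            return "EngineDown"
        if result is None:
            # No result yet: the migration is still in progress.
            return "EngineMigratingAway"
        # Unrecognized value: fall back to re-initializing the FSM.
        return "ReinitializeFSM"
```

Under this model, a missing "Done" entry in SUCCESS_VALUES would reproduce the reported behavior: the successful migration falls through to the ReinitializeFSM branch.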

Comment 1 Martin Sivák 2015-11-23 11:31:05 UTC
The Down state is expected, as the VM needs to be collected by the engine. The ReinitializeFSM state is not expected, though; we are probably missing an allowed value somewhere.

Comment 2 Red Hat Bugzilla Rules Engine 2015-11-27 05:40:01 UTC
This bug is not marked for z-stream, yet the milestone is for a z-stream version, therefore the milestone has been reset.
Please set the correct milestone or add the z-stream flag.

Comment 3 Red Hat Bugzilla Rules Engine 2015-11-27 05:40:01 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 4 Red Hat Bugzilla Rules Engine 2015-11-30 07:09:28 UTC
This bug is not marked for z-stream, yet the milestone is for a z-stream version, therefore the milestone has been reset.
Please set the correct milestone or add the z-stream flag.

Comment 5 Arik 2016-02-04 20:37:16 UTC
Artyom,
1. Can you estimate how much time after the migration ended you ran 'vdsClient -s 0 list table'?
2. If you wait more than 15 seconds after the engine VM is running on the destination, is the VM destroyed and gone from the source host?

Comment 6 Michal Skrivanek 2016-02-06 12:44:21 UTC
Please also attach the engine log.

Comment 7 Roy Golan 2016-02-10 09:39:58 UTC
ping

Comment 9 Artyom 2016-02-17 14:06:09 UTC
Checked on ovirt-hosted-engine-ha-1.3.4.1-1.el7ev.noarch
Everything now works fine: after migration, the VM is destroyed on the source host, so I believe we can close this bug. If I encounter it again, I will reopen it.