Bug 1455341 - [HE] running "hosted-engine --vm-start" shows the wrong message
Summary: [HE] running "hosted-engine --vm-start" shows the wrong message
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: General
Version: 2.1.0.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.1.5
Target Release: ---
Assignee: Andrej Krejcir
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1356425 1460982
Blocks:
 
Reported: 2017-05-24 20:38 UTC by Kobi Hakimi
Modified: 2017-08-23 08:06 UTC (History)
3 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-23 08:06:49 UTC
oVirt Team: SLA
Embargoed:
nsednev: needinfo-
dfediuck: ovirt-4.1?
dfediuck: planning_ack?
dfediuck: devel_ack+
rule-engine: testing_ack+


Attachments
screencast (10.91 MB, application/octet-stream)
2017-07-11 09:51 UTC, Nikolai Sednev
no flags
sosreport from host (9.76 MB, application/x-xz)
2017-07-11 09:54 UTC, Nikolai Sednev
no flags
engine's sosreport (9.33 MB, application/x-xz)
2017-07-11 09:55 UTC, Nikolai Sednev
no flags

Description Kobi Hakimi 2017-05-24 20:38:40 UTC
Description of problem:
[HE] running "hosted-engine --vm-start" shows the message "VM exists and is down, destroying it"

Version-Release number of selected component (if applicable):
Red Hat Virtualization Manager Version: 4.1.2.2-0.1.el7
rhvm-appliance-4.1.20170221.0-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
On the host that hosts the engine, run the following commands:
1. "hosted-engine --set-maintenance --mode=global"
2. "hosted-engine --vm-shutdown"
3. Wait a few minutes for the engine VM to shut down (a polling alternative is sketched after these steps)
4. "hosted-engine --vm-start"

Actual results:
got the message:
VM exists and is down, destroying it

Expected results:
to get the message:
VM exists and is down, starting it

Additional info:
# hosted-engine --vm-status   (after the shutdown and before the start)

!! Cluster is in GLOBAL MAINTENANCE mode !!



--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma23.scl.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 1f0c0bf2
local_conf_timestamp               : 25138
Host timestamp                     : 25122
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=25122 (Wed May 24 23:31:13 2017)
	host-id=1
	score=3400
	vm_conf_refresh_time=25138 (Wed May 24 23:31:29 2017)
	conf_on_shared_storage=True
	maintenance=False
	state=GlobalMaintenance
	stopped=False

--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma26.scl.lab.tlv.redhat.com
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 53beb59e
local_conf_timestamp               : 128917
Host timestamp                     : 128901
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=128901 (Wed May 24 23:31:32 2017)
	host-id=2
	score=3400
	vm_conf_refresh_time=128917 (Wed May 24 23:31:48 2017)
	conf_on_shared_storage=True
	maintenance=False
	state=GlobalMaintenance
	stopped=False


--== Host 3 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : puma27.scl.lab.tlv.redhat.com
Host ID                            : 3
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 1f6fdc8c
local_conf_timestamp               : 128898
Host timestamp                     : 128883
Extra metadata (valid at timestamp):
	metadata_parse_version=1
	metadata_feature_version=1
	timestamp=128883 (Wed May 24 23:31:12 2017)
	host-id=3
	score=3400
	vm_conf_refresh_time=128898 (Wed May 24 23:31:27 2017)
	conf_on_shared_storage=True
	maintenance=False
	state=GlobalMaintenance
	stopped=False

Comment 1 Andrej Krejcir 2017-07-10 12:27:47 UTC
The fix for master was done in Bug 1356425 and the cherry-pick to the 2.1 branch was done in Bug 1460982, even though that bug is not directly related to this one.

Comment 3 Nikolai Sednev 2017-07-11 09:50:31 UTC
I'm getting the following:
alma03 ~]# hosted-engine --vm-start
VM exists and is down, cleaning up and restarting
Exception in thread Client localhost:54321 (most likely raised during interpreter shutdown):[root@alma03 ~]# 

Screencast is attached together with sosreports from the engine and host.

What is that exception? Why is it being shown?

Comment 4 Nikolai Sednev 2017-07-11 09:51:20 UTC
Created attachment 1296149 [details]
screencast

Comment 5 Nikolai Sednev 2017-07-11 09:54:32 UTC
Created attachment 1296150 [details]
sosreport from host

Comment 6 Nikolai Sednev 2017-07-11 09:55:34 UTC
Created attachment 1296151 [details]
engine's sosreport

Comment 7 Nikolai Sednev 2017-07-11 09:58:17 UTC
Components on host:
ovirt-imageio-common-1.0.0-0.el7ev.noarch
mom-0.5.9-1.el7ev.noarch
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
sanlock-3.5.0-1.el7.x86_64
ovirt-setup-lib-1.1.3-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
qemu-kvm-rhev-2.9.0-16.el7.x86_64
ovirt-vmconsole-1.0.4-1.el7ev.noarch
vdsm-4.19.21-1.el7ev.x86_64
ovirt-hosted-engine-ha-2.1.4-1.el7ev.noarch
libvirt-client-3.2.0-14.el7.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.3.4-1.el7ev.noarch
ovirt-host-deploy-1.6.6-1.el7ev.noarch
Linux version 3.10.0-691.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jun 29 10:30:04 EDT 2017
Linux 3.10.0-691.el7.x86_64 #1 SMP Thu Jun 29 10:30:04 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)

On engine:
rhev-guest-tools-iso-4.1-5.el7ev.noarch
rhevm-doc-4.1.4-1.el7ev.noarch
rhevm-dependencies-4.1.1-1.el7ev.noarch
rhevm-4.1.4-0.2.el7.noarch
rhevm-branding-rhev-4.1.0-2.el7ev.noarch
rhevm-setup-plugins-4.1.2-1.el7ev.noarch
Linux version 3.10.0-693.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jul 6 19:56:57 EDT 2017
Linux 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)

Comment 8 Nikolai Sednev 2017-07-11 10:25:24 UTC
As the original bug is no longer reproduced, moving this bug to verified, per https://bugzilla.redhat.com/show_bug.cgi?id=1455341#c3.
Works for me with the correct message now: "VM exists and is down, cleaning up and restarting".

Comment 9 Nikolai Sednev 2017-07-11 11:30:01 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1438678 explains the exception from comment #3. I've added my findings directly to 1438678.

