Description of problem:
[HE] When running "hosted-engine --vm-start", the message "VM exists and is down, destroying it" is shown.

Version-Release number of selected component (if applicable):
Red Hat Virtualization Manager Version: 4.1.2.2-0.1.el7
rhvm-appliance-4.1.20170221.0-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
On the host which hosts the engine VM, run the following commands:
1. "hosted-engine --set-maintenance --mode=global"
2. "hosted-engine --vm-shutdown"
3. Wait a few minutes for the VM to shut down
4. "hosted-engine --vm-start"

Actual results:
The message "VM exists and is down, destroying it" is shown.

Expected results:
The message "VM exists and is down, starting it" is shown.

Additional info:
# hosted-engine --vm-status (after shutdown and before the start)

!! Cluster is in GLOBAL MAINTENANCE mode !!

--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date      : True
Hostname               : puma23.scl.lab.tlv.redhat.com
Host ID                : 1
Engine status          : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                  : 3400
stopped                : False
Local maintenance      : False
crc32                  : 1f0c0bf2
local_conf_timestamp   : 25138
Host timestamp         : 25122
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=25122 (Wed May 24 23:31:13 2017)
    host-id=1
    score=3400
    vm_conf_refresh_time=25138 (Wed May 24 23:31:29 2017)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

--== Host 2 status ==--

conf_on_shared_storage : True
Status up-to-date      : True
Hostname               : puma26.scl.lab.tlv.redhat.com
Host ID                : 2
Engine status          : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                  : 3400
stopped                : False
Local maintenance      : False
crc32                  : 53beb59e
local_conf_timestamp   : 128917
Host timestamp         : 128901
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=128901 (Wed May 24 23:31:32 2017)
    host-id=2
    score=3400
    vm_conf_refresh_time=128917 (Wed May 24 23:31:48 2017)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

--== Host 3 status ==--

conf_on_shared_storage : True
Status up-to-date      : True
Hostname               : puma27.scl.lab.tlv.redhat.com
Host ID                : 3
Engine status          : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Score                  : 3400
stopped                : False
Local maintenance      : False
crc32                  : 1f6fdc8c
local_conf_timestamp   : 128898
Host timestamp         : 128883
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=128883 (Wed May 24 23:31:12 2017)
    host-id=3
    score=3400
    vm_conf_refresh_time=128898 (Wed May 24 23:31:27 2017)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False
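The "Engine status" field in the output above is a JSON object, which makes it easy to consume when scripting around "hosted-engine --vm-status". A minimal sketch (the helper name is illustrative, not part of the tool):

```python
import json

def engine_vm_is_down(engine_status_field):
    """Parse the JSON 'Engine status' field from hosted-engine --vm-status
    and report whether the engine VM is down on that host."""
    status = json.loads(engine_status_field)
    return status.get("vm") == "down"

# Field copied verbatim from the Host 1 status above:
field = ('{"reason": "vm not running on this host", '
         '"health": "bad", "vm": "down", "detail": "unknown"}')
print(engine_vm_is_down(field))  # True
```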
The fix for master was done in Bug 1356425, and the cherry-pick to the 2.1 branch was done in Bug 1460982, even though the latter is not related to this bug.
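The corrected behavior can be sketched as follows. This is a hypothetical illustration of the decision the fix implements (an existing but down VM is cleaned up and restarted rather than reported as "destroying it"), not the actual ovirt-hosted-engine-setup code:

```python
def vm_start_action(vm_exists, vm_status):
    """Hypothetical sketch of the fixed --vm-start decision logic.

    vm_exists:  whether a VM object is already defined on the host.
    vm_status:  the reported VM state, e.g. "down" or "up".
    """
    if not vm_exists:
        return "starting VM"
    if vm_status == "down":
        # After the fix, the leftover down VM is cleaned up and restarted.
        return "VM exists and is down, cleaning up and restarting"
    return "VM exists and is up"

print(vm_start_action(True, "down"))
```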
I'm getting the following:

alma03 ~]# hosted-engine --vm-start
VM exists and is down, cleaning up and restarting
Exception in thread Client localhost:54321 (most likely raised during interpreter shutdown):
[root@alma03 ~]#

A screencast is attached together with sosreports from the engine and the host.
What is that exception? Why is it being shown?
Created attachment 1296149 [details] screencast
Created attachment 1296150 [details] sosreport from host
Created attachment 1296151 [details] engine's sosreport
Components on host:
ovirt-imageio-common-1.0.0-0.el7ev.noarch
mom-0.5.9-1.el7ev.noarch
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch
sanlock-3.5.0-1.el7.x86_64
ovirt-setup-lib-1.1.3-1.el7ev.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7ev.noarch
qemu-kvm-rhev-2.9.0-16.el7.x86_64
ovirt-vmconsole-1.0.4-1.el7ev.noarch
vdsm-4.19.21-1.el7ev.x86_64
ovirt-hosted-engine-ha-2.1.4-1.el7ev.noarch
libvirt-client-3.2.0-14.el7.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-hosted-engine-setup-2.1.3.4-1.el7ev.noarch
ovirt-host-deploy-1.6.6-1.el7ev.noarch
Linux version 3.10.0-691.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jun 29 10:30:04 EDT 2017
Linux 3.10.0-691.el7.x86_64 #1 SMP Thu Jun 29 10:30:04 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)

On engine:
rhev-guest-tools-iso-4.1-5.el7ev.noarch
rhevm-doc-4.1.4-1.el7ev.noarch
rhevm-dependencies-4.1.1-1.el7ev.noarch
rhevm-4.1.4-0.2.el7.noarch
rhevm-branding-rhev-4.1.0-2.el7ev.noarch
rhevm-setup-plugins-4.1.2-1.el7ev.noarch
Linux version 3.10.0-693.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jul 6 19:56:57 EDT 2017
Linux 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.4 (Maipo)
As the original issue is no longer reproduced, moving this bug to VERIFIED per https://bugzilla.redhat.com/show_bug.cgi?id=1455341#c3. Works for me with the correct message now: "VM exists and is down, cleaning up and restarting".
https://bugzilla.redhat.com/show_bug.cgi?id=1438678 explains the exception from comment #3. I've added my findings directly to 1438678.