Bug 1504150 - [downstream clone - 4.1.7] Engine-health monitor should expect new sanlock error message
Summary: [downstream clone - 4.1.7] Engine-health monitor should expect new sanlock er...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.1.7
Target Release: ---
Assignee: Andrej Krejcir
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1504032
Blocks: 1464002 1493547
Reported: 2017-10-19 14:48 UTC by rhev-integ
Modified: 2019-04-28 13:50 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1504032
Environment:
Last Closed: 2017-11-07 17:26:57 UTC
oVirt Team: SLA
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:3136 0 normal SHIPPED_LIVE ovirt-hosted-engine-ha bug fix update for 4.1.7 2017-11-07 22:22:11 UTC
oVirt gerrit 82940 0 v2.1.z MERGED Broker: Change expected error message when lock is held by another host 2017-10-19 14:52:52 UTC

Description rhev-integ 2017-10-19 14:48:52 UTC
+++ This bug is an upstream to downstream clone. The original bug is: +++
+++   bug 1504032 +++
======================================================================

Description of problem:

VDSM now uses a newer version of libvirt, which reports a different error message when it fails to acquire the sanlock. The engine-health submonitor has to be changed to react to this new error message.
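The kind of check involved can be sketched roughly as below. This is only an illustration, not the broker's actual code: the function name, the pattern list, and both message fragments are assumptions made up for the example, since the exact strings reported by libvirt are not quoted in this report. The two returned status values are the ones that appear in the `hosted-engine --vm-status` output in comment 2.

```python
import re

# Hypothetical message fragments. The real wording must be taken from the
# VDSM/libvirt logs of the affected versions; these are placeholders only.
LOCK_HELD_PATTERNS = [
    re.compile(r"failed to acquire lock", re.IGNORECASE),         # assumed older wording
    re.compile(r"lease is held by another host", re.IGNORECASE),  # assumed newer wording
]


def classify_start_failure(message):
    """Map a VM start failure message to an engine status dict.

    If the message matches any known "lock is held" pattern, report the VM
    as already locked; otherwise fall back to the generic "down" status.
    """
    for pattern in LOCK_HELD_PATTERNS:
        if pattern.search(message):
            return {
                "reason": "Storage of VM is locked. "
                          "Is another host already starting the VM?",
                "health": "bad",
                "vm": "already_locked",
                "detail": "down",
            }
    return {
        "reason": "vm not running on this host",
        "health": "bad",
        "vm": "down",
        "detail": "unknown",
    }
```

Keeping the patterns in a list means a future change in the message wording only needs one more entry, and older libvirt versions keep working.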

Version-Release number of selected component (if applicable):
VDSM: v4.19.35

How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted engine on 2 hosts
2. Set global maintenance mode and shut down the engine VM
3. Cancel global maintenance mode
4. Wait for engine VM to start

Actual results:
On the host that does not run the VM, the agent moves from EngineStarting state to EngineMaybeAway.

Expected results:
The agent moves from EngineStarting to EngineForceStop.
There is an INFO line in agent log: "Another host already took over.."

(Originally by Andrej Krejcir)

Comment 2 Nikolai Sednev 2017-10-30 15:06:01 UTC
Works for me on ovirt-hosted-engine-setup-2.1.4-1.el7ev.noarch, thus moving to verified.

Please see the details below:
--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : alma03
Host ID                            : 1
Engine status                      : {"reason": "Storage of VM is locked. Is another host already starting the VM?", "health": "bad", "vm": "already_locked", "detail": "down"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 6bfb3b60
local_conf_timestamp               : 7275
Host timestamp                     : 7273
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7273 (Mon Oct 30 17:03:25 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=7275 (Mon Oct 30 17:03:27 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineForceStop
        stopped=False


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : alma03
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 4cc56637
local_conf_timestamp               : 7316
Host timestamp                     : 7314
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7314 (Mon Oct 30 17:04:06 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=7316 (Mon Oct 30 17:04:08 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False
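
For anyone re-verifying this, the "Engine status" value in the output above can be checked programmatically. The following is a minimal sketch, assuming the status text has been captured as a string; the helper name `engine_status` and the `SAMPLE` text are made up for the example. Terminal output may wrap the JSON value across lines, so the helper removes embedded newlines before decoding.

```python
import json
import re


def engine_status(status_text):
    """Return the decoded 'Engine status' dict from one host's status block."""
    # Capture from the opening brace to the first closing brace; the value
    # is a flat JSON object, so a non-greedy match is sufficient here.
    match = re.search(r"Engine status\s*:\s*(\{.*?\})", status_text, re.DOTALL)
    if match is None:
        raise ValueError("no 'Engine status' line found")
    # Drop line breaks introduced by terminal wrapping before parsing.
    return json.loads(match.group(1).replace("\n", ""))


# Sample input modelled on the first status block above, including a wrap.
SAMPLE = (
    'Hostname                           : alma03\n'
    'Engine status                      : {"reason": "Storage of VM is locked. '
    'Is another host already starting the VM?", "\n'
    'health": "bad", "vm": "already_locked", "detail": "down"}\n'
    'Score                              : 3400\n'
)

status = engine_status(SAMPLE)
print(status["vm"])
```

On a host that lost the race for the lock, the "vm" field is expected to read "already_locked" while the agent is in the EngineForceStop state, as in the first block above.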

Comment 4 errata-xmlrpc 2017-11-07 17:26:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3136

