Bug 1571119 - [HE] - Engine complaining that the 'VM HostedEngine is down with error. Exit message: resource busy: Failed to acquire lock: Lease is held by another host.'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: Agent
Version: ---
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.2.4
Target Release: 2.2.12
Assignee: Andrej Krejcir
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks: ovirt-hosted-engine-ha-2.2.14
Reported: 2018-04-24 06:58 UTC by Michael Burman
Modified: 2018-06-26 08:35 UTC (History)

Fixed In Version: ovirt-hosted-engine-ha-2.2.12-1.el7ev
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-26 08:35:14 UTC
oVirt Team: Integration
rule-engine: ovirt-4.2+
ylavi: exception+


Attachments
HE logs (3.47 MB, application/x-gzip)
2018-04-24 06:58 UTC, Michael Burman


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 91426 'None' MERGED Fix migration failure check 2020-09-06 07:24:42 UTC
oVirt gerrit 91490 'None' MERGED Fix migration failure check 2020-09-06 07:24:42 UTC

Description Michael Burman 2018-04-24 06:58:19 UTC
Created attachment 1425826
HE logs

Description of problem:
[HE] - The engine reports 'VM HostedEngine is down with error. Exit message: resource busy: Failed to acquire lock: Lease is held by another host.' when the HE VM is migrated, although the migration succeeds.

2018-04-24 09:24:25,418+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-3) [] VM '82fe2ccb-ca42-4e19-8346-3ac7b80eb793'(HostedEngine) moved from 'WaitForLaunch' --> 'Down'
2018-04-24 09:24:25,499+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-3) [] EVENT_ID: VM_DOWN_ERROR(119), VM HostedEngine is down with error. Exit message: resource busy: Failed to acquire lock: Lease is held by another host.
2018-04-24 09:24:25,516+03 INFO  [org.ovirt.engine.core.bll.ProcessDownVmCommand] (EE-ManagedThreadFactory-engine-Thread-22244) [613ab8d5] Running command: ProcessDownVmCommand internal: true.


Version-Release number of selected component (if applicable):
4.2.3.2-0.1.el7
vdsm-4.20.26-1.el7ev.x86_64
ovirt-hosted-engine-ha-2.2.10-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.18-1.el7ev.noarch

How reproducible:
Almost every time the HE VM is migrated (close to 100%).

Steps to Reproduce:
1. On an HE setup, migrate the HE VM.

Actual results:
The engine reports that the HE VM is down, although the migration succeeded.
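
To check whether another engine.log shows the same pattern, the two log lines quoted in this report can be matched with a short script. This is only a sketch: the function name find_lease_errors and both regular expressions are made up for illustration and are derived solely from the excerpts in this report, and it assumes each log record is on a single line (not wrapped as in a pasted excerpt).

```python
import re

# State-transition line written by VmAnalyzer, as seen in the excerpts in this report.
TRANSITION = re.compile(
    r"VM '(?P<vm_id>[0-9a-f-]+)'\((?P<name>[^)]+)\) moved from "
    r"'(?P<old>\w+)' --> '(?P<new>\w+)'"
)
# Audit-log event that carries the lease error message.
LEASE_ERROR = re.compile(
    r"EVENT_ID: VM_DOWN_ERROR\(119\).*Lease is held by another host"
)


def find_lease_errors(lines):
    """Return (timestamp, vm_name) pairs for lease errors that follow a
    'WaitForLaunch' --> 'Down' transition of a VM."""
    hits = []
    pending = None  # name of the VM that most recently went WaitForLaunch -> Down
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            went_down = (match.group("old"), match.group("new")) == ("WaitForLaunch", "Down")
            pending = match.group("name") if went_down else None
            continue
        if pending and LEASE_ERROR.search(line):
            # The first 23 characters hold 'YYYY-MM-DD HH:MM:SS,mmm'.
            hits.append((line[:23], pending))
            pending = None
    return hits
```

It can be run against an engine log (typically /var/log/ovirt-engine/engine.log on the engine VM) with something like `with open(path) as f: print(find_lease_errors(f))`. A hit only shows that the event was logged; it does not tell whether the migration itself succeeded.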

Comment 1 Andrej Krejcir 2018-05-17 14:52:19 UTC
I couldn't reproduce this on a clean HE deployment.

Do you have more specific reproduction steps?
Or some more information about the environment?

Could you also attach debug level logs from vdsm, he-agent and he-broker?

Comment 2 Michael Burman 2018-05-21 10:34:14 UTC
(In reply to Andrej Krejcir from comment #1)
> I couldn't reproduce this on a clean HE deployment.
> 
> Do you have more specific reproduction steps?
> Or some more information about the environment?
> 
> Could you also attach debug level logs from vdsm, he-agent and he-broker?

It was an HE automation environment, which is no longer available.
I have just reproduced it locally on my setup. Please contact me offline and I will give you access to the setup; that will be faster. Thanks.

2018-05-21 13:30:37,436+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-9) [] FINISH, DestroyVDSCommand, log id: 7d5739f3
2018-05-21 13:30:37,436+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-9) [] VM 'ac715734-5df6-49b2-a9d4-8f86d3731aeb'(HostedEngine) moved from 'WaitForLaunch' --> 'Down'
2018-05-21 13:30:37,473+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-9) [] EVENT_ID: VM_DOWN_ERROR(119), VM HostedEngine is down with error. Exit message: resource busy: Failed to acquire lock: Lease is held by another host.

All I did was migrate the HE VM to a different host (it was running on the SPM host prior to the migration).

Comment 3 Nikolai Sednev 2018-06-05 15:06:51 UTC
I failed to reproduce this with the latest components:
ovirt-hosted-engine-ha-2.2.13-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.22-1.el7ev.noarch
rhvm-appliance-4.2-20180601.0.el7.noarch
Linux 3.10.0-862.3.2.el7.x86_64 #1 SMP Tue May 15 18:22:15 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Moving to verified.

Migration completed (VM: HostedEngine, Source: alma04.qa.lab.tlv.redhat.com, Destination: alma03.qa.lab.tlv.redhat.com, Duration: 42 seconds, Total: 53 seconds, Actual downtime: 304ms)
6/5/18 5:57:45 PM

Comment 4 Sandro Bonazzola 2018-06-26 08:35:14 UTC
This bug is included in the oVirt 4.2.4 release, published on June 26th, 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.4 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

