Bug 1019229 - ovirt-hosted-engine | Engine VM is not reported by vdsm although it's up and running.
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Greg Padgett
QA Contact: Ilanit Stein
Keywords: Triaged
Depends On:
Reported: 2013-10-15 06:36 EDT by Leonid Natapov
Modified: 2016-06-12 19:16 EDT (History)
CC: 12 users

Doc Type: Bug Fix
Last Closed: 2013-11-28 10:37:00 EST
Type: Bug
oVirt Team: SLA

Attachments
vdsm log file (143.79 KB, application/x-gzip), attached 2013-10-15 06:36 EDT by Leonid Natapov
vdsm.log.3.xz (537.21 KB, application/x-xz), attached 2013-10-15 11:53 EDT by Federico Simoncelli

Description Leonid Natapov 2013-10-15 06:36:53 EDT
Created attachment 812419 [details]
vdsm log file

Description of problem:

Engine VM is not reported by vdsm although it's up and running.

1. Hosted-engine environment with 2 hosts.
2. The engine VM runs on host 1; host 2 is unavailable.
3. Block the connection to storage from host 1. The VM should then start on host 2, but that fails because host 2 is unavailable. After restoring the connection on host 1, I can see that the engine VM is up and running (I can connect to it), but according to vdsm it does not exist.

Here is the sanlock status output: 
[root@purple-vds3 vdsm]# sanlock status
daemon 6c067f1a-1056-4314-b825-7f2bf73d3b41.purple-vds
p -1 helper
p -1 listener
p 28340 HostedEngine
p 19763 
p -1 status
s ec7d5406-2cb9-472c-b7e7-2ad6fedf9e3a:1:/dev/ec7d5406-2cb9-472c-b7e7-2ad6fedf9e3a/ids:0
s ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf:1:/rhev/data-center/mnt/orion.qa.lab.tlv.redhat.com\:_export_hosted__engine1/ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf/dom_md/ids:0
s hosted-engine:1:/rhev/data-center/mnt/orion.qa.lab.tlv.redhat.com\:_export_hosted__engine1/ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf/ha_agent/hosted-engine.lockspace:0
r ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf:586a0e62-5f38-4e93-848f-57588e128261:/rhev/data-center/mnt/orion.qa.lab.tlv.redhat.com\:_export_hosted__engine1/ef4a31eb-688f-4f19-af9f-6a3e0bf82ebf/images/b64e7d62-c3cb-44d1-b280-f4e1318720fe/586a0e62-5f38-4e93-848f-57588e128261.lease:0:20 p 28340
vdsm version: vdsm-4.13.0-13.gitc08fbb6.el6_4.x86_64
vdsm log attached.
Comment 1 Ayal Baron 2013-10-15 07:28:18 EDT
what does vdsClient getVmsList report?
Comment 2 Leonid Natapov 2013-10-15 07:36:23 EDT
vdsClient list shows nothing.
vdsClient getVmsList can only be used with a storage pool and must be run on the SPM; I have no pool.
Comment 3 Federico Simoncelli 2013-10-15 11:53:32 EDT
Created attachment 812604 [details]
vdsm.log.3.xz

Attaching some additional logs.
At first glance this looks like a race between vmCreate and vmDestroy:

10:33:44,290::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
10:33:47,529::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:07:51,461::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:07:54,173::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:20:00,162::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:20:11,626::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:20:12,928::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:20:23,653::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:20:24,944::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:20:36,199::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:20:37,551::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:20:48,235::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:20:49,522::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:21:00,222::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:21:01,533::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:21:12,199::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:21:13,480::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:21:24,739::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:21:26,022::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':

Thread-65601::ERROR::2013-10-15 11:21:22,551::vm::2132::vm.Vm::(_startUnderlyingVm) vmId=`ac687a24-05cd-43fb-9f9d-c53169fa8cd0`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 2092, in _startUnderlyingVm
  File "/usr/share/vdsm/vm.py", line 2959, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2665, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Requested operation is not valid: domain is already active as 'HostedEngine'

11:21:37,233::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:21:38,538::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:21:49,542::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:21:50,831::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:22:01,652::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:22:02,975::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:22:13,848::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:22:15,168::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:22:25,986::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:22:27,313::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:22:38,357::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:22:39,771::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:22:50,659::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:22:52,039::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:23:02,789::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:23:04,095::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:23:15,140::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:23:16,511::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:23:27,233::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:23:28,556::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:23:39,230::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:23:40,521::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
11:23:51,268::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-
11:23:52,646::BindingXMLRPC::984::vds::(wrapper) client []::call vmCreate with ({'emulatedMachine':
12:46:07,337::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy with ('ac687a24-05cd-

Besides fixing this issue in vdsm, we should ask ourselves why the ha_agent is issuing these commands.
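An interleaving like the one above can leave the client-visible VM table out of sync with what is actually running in the hypervisor. A minimal, hypothetical sketch (illustrative only, not vdsm's actual code) of serializing create/destroy for a VM behind one lock, so the table and the set of running domains can never diverge:

```python
import threading

class VmTable:
    """Hypothetical sketch of a vdsm-like VM registry where vmCreate and
    vmDestroy must not interleave (not vdsm's real code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._vms = {}         # vmId -> state visible to clients ("list")
        self._running = set()  # vmIds actually active in the hypervisor

    def vm_create(self, vm_id):
        with self._lock:
            if vm_id in self._running:
                # Mirrors libvirt's "domain is already active" error.
                raise RuntimeError("domain is already active")
            self._running.add(vm_id)  # start the underlying domain...
            self._vms[vm_id] = "Up"   # ...and only then expose it

    def vm_destroy(self, vm_id):
        with self._lock:
            self._running.discard(vm_id)  # stop the underlying domain
            self._vms.pop(vm_id, None)    # and drop it from the table

    def consistent(self):
        with self._lock:
            return set(self._vms) == self._running
```

Without the lock, a destroy can land between "start the domain" and "add it to the table", leaving the domain running but absent from the list, which is the symptom reported here.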
Comment 4 Leonid Natapov 2013-10-15 15:32:18 EDT
After discussion with Greg another bug was opened regarding these messages.
Comment 7 Michal Skrivanek 2013-11-13 10:12:00 EST
From the virt point of view, the VM was destroyed by
Thread-69404::DEBUG::2013-10-15 12:46:07,337::BindingXMLRPC::984::vds::(wrapper) client []::call vmDestroy …
and since then it doesn't exist. If something started it again without letting vdsm know, it must have been the ha_agent?
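If a domain was started behind the management daemon's back, one way to surface it is to compare the hypervisor's active domains against the daemon's own table. A tiny illustrative sketch over plain name lists (a hypothetical helper, not vdsm code):

```python
def find_orphan_domains(active_domains, tracked_vms):
    """Return names of domains active in the hypervisor but unknown to
    the management daemon, i.e. started without the daemon being told."""
    return sorted(set(active_domains) - set(tracked_vms))
```

In the scenario described here, find_orphan_domains(["HostedEngine"], []) would report "HostedEngine": the VM is running, but vdsm has no record of it.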
Comment 8 Michal Skrivanek 2013-11-13 10:13:16 EST
any other logs? from ha_agent perhaps?
only vdsm is not really helpful, I'm missing the time references and what was going on
Comment 9 Leonid Natapov 2013-11-13 10:58:28 EST
(In reply to Michal Skrivanek from comment #8)
> any other logs? from ha_agent perhaps?
> only vdsm is not really helpful, I'm missing the time references and what
> was going on

No :( the bug was opened a month ago. Federico asked for vdsm logs, so I attached them.
Comment 10 Doron Fediuck 2013-11-18 07:20:26 EST
Greg & Leonid,
let's revisit this issue:
see if it is still reproducible, and if so, who creates / destroys the VM.
Comment 11 Greg Padgett 2013-11-18 19:25:55 EST
I tried to reproduce it with the latest packages and everything looks good.  I hope Leonid can confirm.

(As an aside, I wonder if this is related to bug 1022924.  It seems to have many similarities.)
Comment 12 Leonid Natapov 2013-11-24 08:21:09 EST
Indeed. I was unable to reproduce it.
Comment 13 Doron Fediuck 2013-11-28 10:37:00 EST
Please re-open if you're able to reproduce.
